Rails: How to handle existing invalid dates in database? - mysql

First, this is directly related to my other question:
How to gracefully handle "Mysql2::Error: Invalid date" in ActiveRecord?
But I still do not want to jump through all the hoops of writing migrations that fix dates. This won't be the last table with invalid dates, and I need a more generic approach.
So here we go:
I'm using a legacy MySQL database which contains invalid dates, sometimes like 2010-01-00 or 0000-04-25... Rails does not load such records (older versions of Rails did).
I do not want to (and cannot) correct these dates, either manually or automatically. It should be up to the authors of those records to correct them. The old system was a PHP application which allowed such annoyances. The Rails application should/will just prevent the user from saving the record until the dates are valid.
The problem does not seem to be within Rails itself, but deeper within an .so library of the rails mysql gem.
So my question is not about how to validate the date or how to insert invalid dates. I don't want to do that and that's covered by numerous answers all over stackoverflow and the rest of the internet. My question is how to READ invalid dates from MySQL that already exist in the database without Rails exploding into 1000 little pieces...
The column type is DATETIME and I'm not sure if casting to string could help because Rails chokes before any ActiveRecord related parsing kicks in.
Here's the exact error and backtrace:
$ rails c
Loading development environment (Rails 3.2.13)
irb(main):001:0> Poll.first
Poll Load (0.5ms) SELECT `polls`.* FROM `polls` LIMIT 1
Mysql2::Error: Invalid date: 2003-00-01 00:00:00
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/connection_adapters/mysql2_adapter.rb:216:in `each'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/connection_adapters/mysql2_adapter.rb:216:in `to_a'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/connection_adapters/mysql2_adapter.rb:216:in `exec_query'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/connection_adapters/mysql2_adapter.rb:224:in `select'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/connection_adapters/abstract/database_statements.rb:18:in `select_all'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/connection_adapters/abstract/query_cache.rb:63:in `select_all'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/querying.rb:38:in `find_by_sql'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/explain.rb:41:in `logging_query_plan'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/querying.rb:37:in `find_by_sql'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/relation.rb:171:in `exec_queries'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/relation.rb:160:in `to_a'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/explain.rb:34:in `logging_query_plan'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/relation.rb:159:in `to_a'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/relation/finder_methods.rb:380:in `find_first'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/relation/finder_methods.rb:122:in `first'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/querying.rb:5:in `__send__'
from /home/kakra/.gem/ruby/1.8/gems/activerecord-3.2.13/lib/active_record/querying.rb:5:in `first'
from (irb):1
The backtrace remains the same even when I do Poll.first.title, so the date should never reach any output routine in IRB and thus should never need to be parsed. So suggestions to use a value before typecasting would not help.

I think the simplest solution that worked for me was to set cast: false in the database.yml file, e.g. for the development section:
development:
  <<: *default
  adapter: mysql2
  (... some other settings ...)
  cast: false
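With cast: false, the mysql2 gem skips its own type casting and hands column values back as strings, so a DATETIME arrives as text such as "2003-00-01 00:00:00". Below is a minimal sketch of how such a raw value could then be checked by hand. parse_legacy_datetime is a hypothetical helper name (not part of Rails or mysql2), and treating year 0000 as invalid and interpreting the value as UTC are both assumptions:

```ruby
require 'date'

# Hypothetical helper: converts a raw MySQL DATETIME string into a Time,
# or returns nil when the string is not a real calendar date.
def parse_legacy_datetime(raw)
  return nil if raw.nil?

  m = /\A(\d{4})-(\d{2})-(\d{2})(?: (\d{2}):(\d{2}):(\d{2}))?/.match(raw.to_s)
  return nil unless m

  year, month, day = m[1].to_i, m[2].to_i, m[3].to_i
  # Assumption: year 0000 counts as invalid, as do month 00 and day 00.
  return nil unless year > 0 && Date.valid_date?(year, month, day)

  # Assumption: the stored value is interpreted as UTC.
  Time.utc(year, month, day, m[4].to_i, m[5].to_i, m[6].to_i)
end

parse_legacy_datetime("2003-00-01 00:00:00") # => nil
parse_legacy_datetime("0000-04-25 00:00:00") # => nil
parse_legacy_datetime("2010-01-15 10:20:30") # => 2010-01-15 10:20:30 UTC
```

A nil result leaves the record loadable while still letting a validation force the author to enter a proper date before saving.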

Try this out:
ActiveRecord::AttributeMethods::BeforeTypeCast provides a way to read the value of the attributes before typecasting and deserialization.
http://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/BeforeTypeCast.html

Related

Ingesting Huge CSV on Rails/Heroku: MySQL Connection Closed - SignalException: SIGTERM

I've got a colossal CSV file (2.4 GB) hosted remotely (s3) which I'm trying to ingest into my rails app.
I've loaded it into temp and that seems to work fine, but the process keeps getting terminated with SIGTERM about ten minutes after I begin to ingest / iterate over the file.
I'm on Heroku running Rails 4.2 with mysql 0.3.20.
What am I missing? How do I get this done?
rake aborted!
SignalException: SIGTERM
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:80:in `_query'
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:80:in `block in query'
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:79:in `handle_interrupt'
/app/vendor/bundle/ruby/2.2.0/gems/mysql2-0.3.21/lib/mysql2/client.rb:79:in `query'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:299:in `block in execute'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_adapter.rb:466:in `block in log'
/app/vendor/bundle/ruby/2.2.0/gems/activesupport-4.2.0/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_adapter.rb:460:in `log'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:299:in `execute'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/mysql2_adapter.rb:231:in `execute'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/mysql2_adapter.rb:235:in `exec_query'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:336:in `select'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:32:in `select_all'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/query_cache.rb:70:in `select_all'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:38:in `select_one'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/database_statements.rb:43:in `select_value'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/relation/finder_methods.rb:314:in `exists?'
/app/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.0/lib/active_record/querying.rb:3:in `exists?'
You can do it two ways: Use the SmarterCSV gem and test it locally first to make sure it can handle the size. If the size isn't an issue, this would be your best bet, since it makes it very easy to process large CSVs before insertion. If that doesn't work, you can do this:
Use MySQL's import feature (discussed here: http://dev.mysql.com/doc/refman/5.7/en/mysqlimport.html ) to first load the data directly into a table in MySQL. Then you can iterate through the records and transfer the data to the appropriate table using Rails' find_each method, which works in batches so that not every record is held in memory at once. I'm not sure if the import feature works the same as Postgres' COPY, but if it does, make sure you create a table without a primary key in Rails to hold the initial data transfer if your CSV file doesn't have a primary key column.
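The batching idea can also be sketched with Ruby's standard CSV library alone. This is a swapped-in technique, not the SmarterCSV or mysqlimport route described above; each_csv_batch, the file path, and the batch size are hypothetical. It reads the file row by row and yields groups of rows, so the whole file never has to sit in memory:

```ruby
require 'csv'

# Hypothetical helper: streams a CSV file with a header row and yields
# arrays of row hashes, at most batch_size rows at a time.
def each_csv_batch(path, batch_size)
  # Without a block, CSV.foreach returns an enumerator that reads lazily.
  CSV.foreach(path, headers: true).each_slice(batch_size) do |rows|
    yield rows.map(&:to_h)
  end
end

# Usage sketch (path and per-row handling are placeholders):
# each_csv_batch('/tmp/import.csv', 1000) do |rows|
#   rows.each { |attrs| puts attrs.inspect }
# end
```

Working in small batches also makes it easier to split the job into several shorter runs if the platform terminates long-running processes.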

Converting Time or DateTime to MySQL compatible DATETIME

According to "Ruby datetime suitable for mysql comparison", I should be able to do:
Time.now.to_s(:db)
This doesn't appear to be valid anymore. I get:
irb(main):001:0> Time.now.to_s(:db)
ArgumentError: wrong number of arguments (1 for 0)
from (irb):1:in `to_s'
from (irb):1
from C:/Ruby22/bin/irb:11:in `<main>'
Does this functionality still exist or do I have to manually format the date and time to fit MySQL format?
I'm using ruby 2.2.2.
Time#to_s doesn't accept arguments in plain Ruby. If you're using Rails, ActiveSupport extends Time (and provides ActiveSupport::TimeWithZone) with the to_s(:db) form you were referring to; newer Rails versions call it to_fs(:db).
To get this format in Ruby without ActiveSupport you can use:
Time.now.strftime('%Y-%m-%d %H:%M:%S')
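Since the same strftime pattern works for both Time and DateTime, it can be wrapped in a small helper. This is only a sketch; to_mysql_datetime and MYSQL_DATETIME_FORMAT are hypothetical names, and no time zone conversion is attempted:

```ruby
require 'date'

# Format string matching MySQL's DATETIME literal, e.g. "2015-06-01 13:05:09".
MYSQL_DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'

# Hypothetical helper: accepts anything that responds to strftime
# (Time, DateTime) and returns a MySQL-compatible string.
def to_mysql_datetime(value)
  value.strftime(MYSQL_DATETIME_FORMAT)
end

to_mysql_datetime(Time.utc(2015, 6, 1, 13, 5, 9))     # => "2015-06-01 13:05:09"
to_mysql_datetime(DateTime.new(2015, 6, 1, 13, 5, 9)) # => "2015-06-01 13:05:09"
```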

Postgres not allowing ">=" but mysql does, how to overcome?

My local rails database is mysql but my server host (heroku) is Postgres.
Probably a fairly common combination.
I have an advanced search form that works locally in development mode but not in production, and it looks like it might be a Postgres-specific thing, as the Heroku log shows I am getting:
LINE 1: ...,18,19,17,4,32,23,24,16,6,13) and (version_number >= 0.0 or ...
2014-06-23T01:47:54.198026+00:00 app[web.1]: ^
2014-06-23T01:47:54.198022+00:00 app[web.1]: ActiveRecord::StatementInvalid (PG::UndefinedFunction: ERROR: operator does not exist: character varying >= numeric
2014-06-23T01:47:54.198028+00:00 app[web.1]: HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
in the log.
Is there another way to do >= in Postgres?
Locally I do see that the datatype is string in schema.rb, which is probably the problem. Is there a way I can cast it to an integer in Rails for pg?
PostgreSQL definitely does have the >= operator: http://www.postgresql.org/docs/current/static/functions-comparison.html
Your problem is that you seem to be comparing a string with a number.
"Is there a way I can cast it into integer for rails for pg?"
Probably - but we can't see your code. Did you write the SQL? Or did you rely on ActiveRecord? DataMapper? Sequel? Can't help without seeing what you did.

Rails - Model Validations doesn't apply on mysql insert/update command

For certain reasons, I've used raw MySQL commands (insert into table_name (....), update custom_reports ...), and hence I miss out on model validations:
validates_uniqueness_of :name
validates_presence_of :name, :description
How can I validate now the Rails way? Or how can I validate the MySQL way (I need help with that too)?
Rails validations and other ActiveRecord and ActiveModel magic don't work if you only execute custom SQL commands. None of your model classes is even instantiated then.
For MySQL (or any SQL database), you can add constraints to the column:
UNIQUE (this enforces uniqueness)
NOT NULL (this enforces presence)
I know that violating such constraints with OCI8 and Oracle results in exceptions, and I am guessing the same applies with ActiveRecord and MySQL, so you should handle those exceptions properly.
But as @Marek has said, you should be relying on ActiveRecord and doing things like
Model.create()
OR
model_instance.save()
If you want to find (and perhaps handle) the entries in your db that are not valid, try the following in the rails console:
ModelName.find_each do |item|
  unless item.valid?
    puts "Item ##{item.id} is invalid"
    # code to fix the problem...
  end
end
valid? runs the Validations again, but does not alter the data.

How does Rails build a MySQL statement?

I have the following code that runs on Heroku inside a controller and intermittently fails. To me it's a no-brainer that it should work, but I must be missing something.
@artist = Artist.find(params[:artist_id])
The parameters hash looks like this:
{"utf8"=>"✓",
"authenticity_token"=>"XXXXXXXXXXXXXXX",
"password"=>"[FILTERED]",
"commit"=>"Download",
"action"=>"show",
"controller"=>"albums",
"artist_id"=>"62",
"id"=>"157"}
The error I get looks like this:
ActiveRecord::StatementInvalid: Mysql::Error: : SELECT `artists`.* FROM `artists` WHERE `artists`.`id` = ? LIMIT 1
Notice the WHERE `artists`.`id` = ? part of the statement? It's trying to find an ID of QUESTION MARK, meaning Rails is not passing in the params[:artist_id], which is obviously in the params hash. I'm at a complete loss.
I get the same error on different pages trying to select the record in a similar fashion.
My environment: Cedar Stack on Heroku (this only happens on Heroku), Ruby 1.9.3, Rails 3.2.8, files being hosted on Amazon S3 (though I doubt it matters), using the mysql gem (not mysql2, which doesn't work at all), ClearDB MySQL database.
Here's the full trace.
Any help would be tremendously appreciated.
try sql?
If it's just this one statement, and it's causing production problems, can you omit the query generator just for now? In other words, for very short term, just write the SQL yourself. This will buy you a bit of time.
# All on one line:
Artist.find_by_sql("SELECT `artists`.* FROM `artists` WHERE `artists`.`id` = #{params[:artist_id].to_i} LIMIT 1")
ARel/MySQL explain?
Rails can help explain what MySQL is trying to do:
# explain is called on a relation, so use where rather than find
Artist.where(:id => params[:artist_id]).explain
http://weblog.rubyonrails.org/2011/12/6/what-s-new-in-edge-rails-explain/
Perhaps you can discover some kind of difference between the queries that are succeeding vs. failing, such as how the explain uses indexes or optimizations.
mysql2 gem?
Can you try changing from the mysql gem to the mysql2 gem? What failure do you get when you switch to the mysql2 gem?
volatility?
Perhaps there's something else changing the params hash on the fly, so you see it when you print it, but it's changed by the time the query runs?
Try assigning the variable as soon as you receive the params:
artist_id = params[:artist_id]
... whatever code here...
@artist = Artist.find(artist_id)
not the params hash?
You wrote "Meaning Rails is not passing in the params[:artist_id] which is obviously in the params hash." I don't think that's the problem-- I expect that you're seeing this because Rails is using the "?" as a placeholder for a prepared statement.
To find out, run the commands suggested by @Mori and compare them; they should be the same.
# to_sql is called on a relation, so use where rather than find
Artist.where(:id => 42).to_sql
Artist.where(:id => params[:artist_id]).to_sql
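The point that "?" is only a placeholder, with the actual value travelling separately as a bind, can be sketched in plain Ruby. This is a toy illustration and not how ActiveRecord or the mysql gem implement binding; render_with_binds is a hypothetical name:

```ruby
# Toy illustration only: fills each "?" placeholder with the next bind value.
# Integer() raises if a bind is missing or not numeric, so nothing unsafe
# ends up in the rendered string.
def render_with_binds(sql, binds)
  remaining = binds.dup
  sql.gsub('?') { Integer(remaining.shift).to_s }
end

template = "SELECT `artists`.* FROM `artists` WHERE `artists`.`id` = ? LIMIT 1"

puts template                             # the statement as it appears in the error
puts render_with_binds(template, ["62"])  # the statement with the bind filled in
```

Seeing the first form in an error message is therefore consistent with the parameter having been passed correctly.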
prepared statements?
Could be a prepared statement cache problem, when the query is actually executed.
Here's the code that is failing-- and there's a big fat warning.
begin
  stmt.execute(*binds.map { |col, val| type_cast(val, col) })
rescue Mysql::Error => e
  # Older versions of MySQL leave the prepared statement in a bad
  # place when an error occurs. To support older mysql versions, we
  # need to close the statement and delete the statement from the
  # cache.
  stmt.close
  @statements.delete sql
  raise e
end
Try configuring your database to turn off prepared statements, to see if that makes a difference.
In your ./config/database.yml file:
production:
  adapter: mysql
  prepared_statements: false
  ...
bugs with prepared statements?
There may be a problem with Rails ignoring this setting. If you want to know a lot more about it, see this discussion and bug fix by Jeremy Cole and Aaron: https://github.com/rails/rails/pull/7042
Heroku may ignore the setting. Here's a way you can try overriding Heroku by patching the prepared_statements setup: https://github.com/rails/rails/issues/5297
remove the query cache?
Try removing the ActiveRecord QueryCache to see if that makes a difference:
config.middleware.delete ActiveRecord::QueryCache
http://edgeguides.rubyonrails.org/configuring.html#configuring-middle
try postgres?
If you can try Postgres, that could clear it up too. That may not be a long term solution for you, but it would isolate the problem to MySQL.
The MySQL statement is obviously wrong, but the Ruby code you mentioned would not produce it. Something is wrong here: either you are running different Ruby code (maybe from a before_filter) or passing a different parameter (like params[:artist_id] = "?"). It looks like you use nested resources, something like Artist has_many :albums. Maybe the @artist variable is not initialized correctly in the previous action, so that params[:artist_id] does not have the right value?