Issue with ruby ActiveRecord gem - mysql

I am using Rails 4.2.4.
I have a database table 'upload' which has 10000 entries and the following columns:
file_name | file_path | parent_directory | created_at
I have a model, Upload, with the following query:
select(:parent_directory).distinct
This should provide me a list of distinct parent directories present in the table.
When I do select(:parent_directory).distinct.size,
it executes select distinct id from upload;
and gives me all 10000 entries, which is wrong.
But when I do select(:parent_directory).distinct.count,
it executes select distinct parent_directory from upload;
and gives me 3000 entries, which is correct.
Is this some kind of issue with the ActiveRecord gem, or am I doing something wrong here?

There is an open issue regarding this in the rails repo https://github.com/rails/rails/issues/16182.
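The problem is with size, which tries to make intelligent choices for you: it runs a COUNT query when the records are not loaded yet, while length loads the records and counts them in Ruby. If you use length instead of size, you will get the expected result.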

Rails 3: What is the best way to update a column in a very large table

I want to update all of a column in a table with over 2.2 million rows where the attribute is set to null. There is a Users table and a Posts table. Even though there is a column for num_posts in User, only about 70,000 users have that number populated; otherwise I have to query the db like so:
@num_posts = @user.posts.count
I want to use a migration to update the attributes and I'm not sure whether or not it's the best way to do it. Here is my migration file:
class UpdateNilPostCountInUsers < ActiveRecord::Migration
  def up
    nil_count = User.select(:id).where("num_posts IS NULL")
    nil_count.each do |user|
      user.update_attribute :num_posts, user.posts.count
    end
  end

  def down
  end
end
In my console, I ran a query on the first 10 rows where num_posts was null and called puts on each user.posts.count. The total time was 85.3ms for 10 rows, an average of 8.53ms. 8.53ms * 2.2 million rows is about 5.2 hours, and that's without updating any attributes. How do I know if my migration is running as expected? Is there a way to log % complete to the console? I really don't want to wait 5+ hours to find out it didn't do anything. Much appreciated.
EDIT:
Per Max's comment below, I abandoned the migration route and used find_each to solve the problem in batches. I solved the problem by writing the following code in the User model, which I successfully ran from the Rails console:
def self.update_post_count
  nil_count = User.select(:id).where("num_posts IS NULL")
  nil_count.find_each { |user|
    user.update_column(:num_posts, user.posts.count) if user.posts
  }
end
Thanks again for the help everyone!
desc 'Update User post cache counter'
task :update_cache_counter => :environment do
  # One grouped query returns each user id together with its post count.
  users = User.joins('LEFT OUTER JOIN posts ON posts.user_id = users.id')
              .select('users.id, COUNT(posts.id) AS p_count')
              .where('users.num_posts IS NULL')
              .group('users.id')
  puts "Updating user post counts:"
  users.find_each do |user|
    print '.'
    user.update_attribute(:num_posts, user.p_count)
  end
end
First off, don't use a migration for what is essentially a maintenance task. Migrations should mainly alter the schema of your database. That applies especially to a long-running job like this one, which may fail midway and leave you with a botched migration and an inconsistent database state.
Next, calling user.posts for every user causes an N+1 query; instead, join the posts table and select a count.
And without batches you are likely to exhaust the server's memory quickly.
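The batching idea, together with the "% complete" logging asked about in the question, can be sketched in plain Ruby. This is only an illustration under assumptions: each_batch_with_progress is a made-up helper, and a plain array of ids stands in for the table. In Rails, find_each or find_in_batches would do the slicing by primary key.

```ruby
# Hypothetical helper: walk a list of ids in fixed-size batches and
# report how far along the job is after each batch.
def each_batch_with_progress(ids, batch_size: 1000, out: $stdout)
  total = ids.size
  done = 0
  ids.each_slice(batch_size) do |batch|
    yield batch                      # the real work, e.g. one UPDATE per batch
    done += batch.size
    out.puts format('%d/%d (%.1f%%)', done, total, done * 100.0 / total)
  end
  done
end

# Stand-in for the real work: just collect the ids that were visited.
processed = []
each_batch_with_progress((1..2500).to_a, batch_size: 1000) do |batch|
  processed.concat(batch)
end
```

Only one batch is held in memory at a time, and the progress line shows early on whether the job is actually doing anything.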
You can use update_all with a subquery to do this.
sub_query = 'SELECT count(*) FROM `posts` WHERE `posts`.`user_id` = `users`.`id`'
User.where('num_posts IS NULL').update_all("num_posts = (#{sub_query})")
It should take only seconds instead of hours.
In that case, you may not need to log progress at all.

Query is dropping its WHERE clause with sinatra, mysql2, and activerecord

I am having a strange issue with queries dropping their WHERE clause. Here are the gems I'm using:
sinatra (1.4.3)
mysql2 (0.4.1)
activerecord (3.0.20)
activerecord-mysql2-adapter (0.0.3)
Here is an example of what is happening. First, I query for a record and then update one of its attributes. This executes successfully.
IRB :> subscription = Subscription.last
IRB :> subscription.times_billed = 7
IRB :> subscription.save
However, here is what the mysql general query log is reporting when I save the record:
UPDATE `subscriptions` SET `times_billed` = 7
It is missing the WHERE clause. The query should be something like:
UPDATE `subscriptions` SET `times_billed` = 7 WHERE `id` = [ID]
This is causing every row in the table to be updated, rather than just the desired record. Somewhere in the stack, the query is being malformed.
I cannot find any other reports of something like this happening. Has anyone encountered something like this before? Please let me know if I can provide additional information.

Zero padding : convert MD-1 to MD-001 with pure sql

Guys, I need help.
I am using MySQL / phpMyAdmin.
I have a db with a table which stores the name and code id of people.
+--------+---------+
| Name | code_id |
+--------+---------+
| Nazeer | MD-1 |
+--------+---------+
I had 10 contacts and ids. I am using a PHP program which generates the codes automatically.
Recently I imported more records into the db from an Excel file, and the record count increased to 5000+.
My PHP program stopped generating codes and gives me a syntax error on the code id.
I figured out that my Excel import had code ids like MD-1, MD-2, etc., while my program expects the number in 3 digits; now that my records are in the thousands, with 4 digits, it gives a syntax error.
I did some research on solving that, and the answer was to change all 1- and 2-digit numbers, e.g. "MD-1" ~ "MD-99" to "MD-001" ~ "MD-099", and my program will work again.
So the question is: how do I make that change with SQL in phpMyAdmin? I need to keep 'MD-', add the zeros, then add back the corresponding number.
Thanks, and I appreciate your help in advance.
Regards.
This SQL will update all your data, but as I said in the comments, you are better off fixing your PHP code instead.
WARNING: this SQL only works assuming all your data is in the format MD-xxx with 3 or fewer digits.
UPDATE your_table SET
  code_id = CASE LENGTH(SUBSTR(code_id, 4))
    WHEN 1 THEN CONCAT("MD-00", SUBSTR(code_id, 4))
    WHEN 2 THEN CONCAT("MD-0", SUBSTR(code_id, 4))
    ELSE code_id END;
I assume that you want to update MD-1 to MD-001 and MD-99 to MD-099. To do that you can write PHP code that retrieves the rows one by one, matches the pattern, and then updates each row.
HINT: you can check for a 5-character string and then add another 0 at position 3 (use explode to split by "-" and then concat with "-0"). There is no way to do the same using only MySQL, since it's not a programming language. Also, PHP is not a program; it's a programming language.
Run an UPDATE query and use the CONCAT function:
for ($x = 0; $x <= <upto>; $x++) {
  UPDATE <table_name>
  SET <columnname> = CONCAT('MD-', 0, $x)
  WHERE <columnname> = CONCAT('MD-', $x)
}
The simple UPDATE command below can help you.
UPDATE mytable
SET code_id = IF(LENGTH(code_id) = 4,
  CONCAT(SUBSTRING_INDEX(code_id, '-', 1), '-00', SUBSTRING_INDEX(code_id, '-', -1)),
  IF(LENGTH(code_id) = 5,
    CONCAT(SUBSTRING_INDEX(code_id, '-', 1), '-0', SUBSTRING_INDEX(code_id, '-', -1)),
    code_id));
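Before running an UPDATE like the ones above, it may help to check the padding rule on a few sample values. Here is a minimal Ruby sketch of that rule, not a replacement for the SQL. pad_code and its width parameter are hypothetical names, and it assumes codes have the form PREFIX-DIGITS.

```ruby
# Pad the numeric part of a code such as "MD-1" to at least `width` digits.
# Codes that do not look like PREFIX-DIGITS are returned unchanged, and
# numbers that already have `width` or more digits are left alone.
def pad_code(code, width: 3)
  match = /\A([A-Za-z]+)-(\d+)\z/.match(code)
  return code unless match

  "#{match[1]}-#{match[2].rjust(width, '0')}"
end

%w[MD-1 MD-42 MD-123 MD-1234].each do |code|
  puts "#{code} -> #{pad_code(code)}"
end
```

Unlike a fixed list of length cases, rjust never shortens a value, so 4-digit codes such as MD-1234 pass through untouched.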

Rails model not syncing properly with database

I have a rails model called Merchant with attributes name and id. I am having an issue where rails and my database disagree on a certain Merchant's name.
This is what's happening from rails:
1.9.2p320 :001 > Merchant.where(:id=>550).count
=> 1
1.9.2p320 :002 > Merchant.where(:id=>550).first.name
=> nil
And this is what's happening from mysql:
mysql> SELECT name FROM merchants WHERE id=550;
+----------+
| name |
+----------+
| Testname |
+----------+
1 row in set (0.00 sec)
According to FlyersAdmin::Application.config.database_configuration[::Rails.env] the database being used by rails in my first code window is the same as that being used by mysql in the second window. Why the merchant's name is nil instead of "Testname" is what I'm stumped on.
It is worth noting that I recently updated the database they're both using with new data. Is it possible that this is causing the discrepancy? Maybe Rails caches data and so hasn't looked at the updated database yet? I'm stumped; any help is appreciated.
EDIT:
Here's another clue to add to the mystery: Running Merchant.where(:name => nil) returns the empty list! Why isn't it picking the Merchant with id 550?
What should the output of Merchant.where(:id=>550).first.name be? It is possible you created a Merchant without defining a name, so it got the default value nil.
Why do you call .first when you are selecting a single record anyway?
Merchant.find(550).name
should give you your record.
Try
Merchant.first.name
to get the first record of the model.

How does Rails build a MySQL statement?

I have the following code that runs on Heroku inside a controller and intermittently fails. To me it's a no-brainer that it should work, but I must be missing something.
@artist = Artist.find(params[:artist_id])
The parameters hash looks like this:
{"utf8"=>"✓",
"authenticity_token"=>"XXXXXXXXXXXXXXX",
"password"=>"[FILTERED]",
"commit"=>"Download",
"action"=>"show",
"controller"=>"albums",
"artist_id"=>"62",
"id"=>"157"}
The error I get looks like this:
ActiveRecord::StatementInvalid: Mysql::Error: : SELECT `artists`.* FROM `artists` WHERE `artists`.`id` = ? LIMIT 1
Notice the WHERE `artists`.`id` = ? part of the statement? It's trying to find an ID of QUESTION MARK, meaning Rails is not passing in the params[:artist_id], which is obviously in the params hash. I'm at a complete loss.
I get the same error on different pages trying to select the record in a similar fashion.
My environment: Cedar Stack on Heroku (this only happens on Heroku), Ruby 1.9.3, Rails 3.2.8, files being hosted on Amazon S3 (though I doubt it matters), using the mysql gem (not mysql2, which doesn't work at all), ClearDB MySQL database.
Here's the full trace.
Any help would be tremendously appreciated.
try sql?
If it's just this one statement, and it's causing production problems, can you omit the query generator just for now? In other words, for very short term, just write the SQL yourself. This will buy you a bit of time.
# Written as a single call so Ruby parses it as one statement:
Artist.find_by_sql(
  "SELECT `artists`.* FROM `artists`
   WHERE `artists`.`id` = #{params[:artist_id].to_i} LIMIT 1"
)
ARel/MySQL explain?
Rails can help explain what MySQL is trying to do (explain is called on a relation, so use where rather than find):
Artist.where(:id => params[:artist_id]).explain
http://weblog.rubyonrails.org/2011/12/6/what-s-new-in-edge-rails-explain/
Perhaps you can discover some kind of difference between the queries that are succeeding vs. failing, such as how the explain uses indexes or optimizations.
mysql2 gem?
Can you try changing from the mysql gem to the mysql2 gem? What failure do you get when you switch to the mysql2 gem?
volatility?
Perhaps there's something else changing the params hash on the fly, so you see it when you print it, but it's changed by the time the query runs?
Try assigning the variable as soon as you receive the params:
artist_id = params[:artist_id]
... whatever code here...
@artist = Artist.find(artist_id)
not the params hash?
You wrote "Meaning Rails is not passing in the params[:artist_id] which is obviously in the params hash." I don't think that's the problem-- I expect that you're seeing this because Rails is using the "?" as a placeholder for a prepared statement.
To find out, run the commands suggested by @Mori and compare them; they should be the same (to_sql is called on a relation, so use where rather than find).
Artist.where(:id => 42).to_sql
Artist.where(:id => params[:artist_id]).to_sql
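To illustrate why a "?" in the logged statement does not by itself mean the id was lost, here is a toy model of a parameterized query in plain Ruby. BoundQuery and find_by_id_query are made-up names for illustration, not Rails or mysql gem APIs. The point is only that the SQL text and the bind values travel separately.

```ruby
# Toy model of a parameterized query: the SQL text keeps a "?" placeholder
# and the actual value is carried next to it in a list of binds.
BoundQuery = Struct.new(:sql, :binds)

def find_by_id_query(table, id)
  sql = "SELECT `#{table}`.* FROM `#{table}` WHERE `#{table}`.`id` = ? LIMIT 1"
  BoundQuery.new(sql, [Integer(id)]) # Integer() rejects non-numeric input
end

query = find_by_id_query('artists', '62')
puts query.sql            # the statement text, still containing "?"
puts query.binds.inspect  # the value sent alongside the statement
```

An error message that prints only the statement text would show the "?" even when the bind value was supplied correctly.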
prepared statements?
Could be a prepared statement cache problem, when the query is actually executed.
Here's the code that is failing-- and there's a big fat warning.
begin
  stmt.execute(*binds.map { |col, val| type_cast(val, col) })
rescue Mysql::Error => e
  # Older versions of MySQL leave the prepared statement in a bad
  # place when an error occurs. To support older mysql versions, we
  # need to close the statement and delete the statement from the
  # cache.
  stmt.close
  @statements.delete sql
  raise e
end
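The rescue branch above amounts to evicting a failed statement from a cache so that the next call prepares it again. A toy model in plain Ruby may make that easier to follow. ToyStatementCache and the prepare block are illustrative assumptions, not the adapter's actual code.

```ruby
# Toy statement cache: a statement that raises is dropped from the cache,
# so the next execution has to prepare it from scratch.
class ToyStatementCache
  def initialize(&prepare)
    @prepare = prepare   # builds a callable "statement" for a SQL string
    @statements = {}
  end

  def cached?(sql)
    @statements.key?(sql)
  end

  def execute(sql, binds)
    stmt = (@statements[sql] ||= @prepare.call(sql))
    stmt.call(binds)
  rescue StandardError
    @statements.delete(sql) # evict the possibly broken statement
    raise                   # re-raise the original error
  end
end

# A fake "prepare" step whose statements fail when the first bind is nil.
cache = ToyStatementCache.new do |sql|
  lambda do |binds|
    raise ArgumentError, 'missing bind' if binds.first.nil?

    "ran #{sql} with #{binds.inspect}"
  end
end
puts cache.execute('SELECT 1 WHERE id = ?', [62])
```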
Try configuring your database to turn off prepared statements, to see if that makes a difference.
In your ./config/database.yml file:
production:
  adapter: mysql
  prepared_statements: false
  ...
bugs with prepared statements?
There may be a problem with Rails ignoring this setting. If you want to know a lot more about it, see this discussion and bug fix by Jeremey Cole and Aaron: https://github.com/rails/rails/pull/7042
Heroku may ignore the setting. Here's a way you can try overriding Heroku by patching the prepared_statements setup: https://github.com/rails/rails/issues/5297
remove the query cache?
Try removing the ActiveRecord QueryCache to see if that makes a difference:
config.middleware.delete ActiveRecord::QueryCache
http://edgeguides.rubyonrails.org/configuring.html#configuring-middle
try postgres?
If you can try Postgres, that could clear it up too. That may not be a long term solution for you, but it would isolate the problem to MySQL.
The MySQL statement is obviously wrong, but the Ruby code you mentioned would not produce it. Something else is going on: either different Ruby code is running (maybe from a before_filter) or a different parameter is being passed (like params[:artist_id] = "?"). It looks like you use nested resources, something like Artist has_many :albums. Maybe the @artist variable is not initialized correctly in the previous action, so that params[:artist_id] does not have the right value?