Writing a migration to alter a specific feature within a specific column - mysql

I have a Rails app in which we were accidentally counting promotional codes twice instead of once. I was able to fix the problem in the codebase so that it won't happen again, but I'm having a hard time writing a migration to correct the old records. I don't have much experience writing migrations, let alone ones that alter data, so I decided to start by just adding one to each value as practice, and then move on to dividing everything in half.
Here is the migration I have written. promo_codes is the name of the table, and times_used is the column I eventually want to cut in half.
class PromoCodeTest1 < ActiveRecord::Migration
def change
change_column :promo_codes, update_attribute(:times_used + 1)
end
end
Going down this road, I end up getting an undefined method error for update_attribute. Would anybody know what I need to do to achieve this goal?

I have never seen the update_attribute helper used in migrations.
You just need to run a SQL query (note: please don't use ActiveRecord models here).
It can look like this:
def up
execute "UPDATE promo_codes SET times_used = times_used / 2;"
end
You may need to add some checks in case the times_used column can contain NULL values or odd values.
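One way to deal with the NULL and odd-value cases is to make the rounding rule explicit rather than leaving it to the database. This is only a sketch under assumptions: it assumes rounding down is acceptable for odd values, and the helper names halve_times_used_sql and odd_values_check_sql are made up for illustration. DIV is MySQL's integer-division operator.

```ruby
# Hypothetical helpers (not part of Rails): they only build SQL strings.

# Counts rows whose value is odd, so you can inspect them before
# deciding how to round.
def odd_values_check_sql(table = "promo_codes", column = "times_used")
  "SELECT COUNT(*) FROM #{table} WHERE #{column} % 2 = 1"
end

# Halves the column with integer division, so 3 becomes 1.
# NULL rows are skipped explicitly (NULL DIV 2 would stay NULL anyway).
def halve_times_used_sql(table = "promo_codes", column = "times_used")
  "UPDATE #{table} SET #{column} = #{column} DIV 2 " \
    "WHERE #{column} IS NOT NULL"
end

# Inside the migration this would be used as:
#   def up
#     execute halve_times_used_sql
#   end
puts odd_values_check_sql
puts halve_times_used_sql
```

If every code really was counted exactly twice, the check query should report zero odd rows; anything else is worth looking at by hand before running the update.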

Related

Rails 3: What is the best way to update a column in a very large table

I want to update all of a column in a table with over 2.2 million rows where the attribute is set to null. There is a Users table and a Posts table. Even though there is a column for num_posts in User, only about 70,000 users have that number populated; otherwise I have to query the db like so:
@num_posts = @user.posts.count
I want to use a migration to update the attributes and I'm not sure whether or not it's the best way to do it. Here is my migration file:
class UpdateNilPostCountInUsers < ActiveRecord::Migration
def up
nil_count = User.select(:id).where("num_posts IS NULL")
nil_count.each do |user|
user.update_attribute :num_posts, user.posts.count
end
end
def down
end
end
In my console, I ran a query on the first 10 rows where num_posts was null, and then used puts for each user.posts.count . The total time was 85.3ms for 10 rows, for an avg of 8.53ms. 8.53ms*2.2million rows is about 5.25 hours, and that's without updating any attributes. How do I know if my migration is running as expected? Is there a way to log to the console %complete? I really don't want to wait 5+ hours to find out it didn't do anything. Much appreciated.
EDIT:
Per Max's comment below, I abandoned the migration route and used find_each to solve the problem in batches. I solved the problem by writing the following code in the User model, which I successfully ran from the Rails console:
def self.update_post_count
nil_count = User.select(:id).where("num_posts IS NULL")
nil_count.find_each { |user|
user.update_column(:num_posts, user.posts.count) if user.posts
}
end
Thanks again for the help everyone!
desc 'Update User post cache counter'
task :update_cache_counter => :environment do
users = User.joins('LEFT OUTER JOIN posts ON posts.user_id = users.id')
.select('users.id, COUNT(posts.id) AS p_count')
.where('users.num_posts IS NULL')
.group('users.id')
puts "Updating user post counts:"
users.find_each do |user|
print '.'
user.update_attribute(:num_posts, user.p_count)
end
end
First off, don't use a migration for what is essentially a maintenance task. Migrations should mainly alter the schema of your database. That applies especially to a long-running task like this one, which may fail midway and leave you with a botched migration and an inconsistent database state.
Then you need to address the fact that calling user.posts for every user causes an N+1 query; you should instead join the posts table and select a count.
And without batches you are likely to exhaust the server's memory quickly.
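To illustrate the batching point and the percent-complete question, here is a rough plain-Ruby sketch. The helper name each_batch_with_progress is invented for this example, and the id list stands in for whatever the real query returns; in the app each batch would be loaded and updated through the model instead.

```ruby
# Hypothetical helper: walks a list of ids in fixed-size batches,
# yields each batch to the caller, and prints progress after each one.
# Returns the number of ids processed.
def each_batch_with_progress(ids, batch_size: 1000, out: $stdout)
  total = ids.size
  done = 0
  ids.each_slice(batch_size) do |batch|
    yield batch
    done += batch.size
    out.puts format("%d/%d (%.1f%%)", done, total, 100.0 * done / total)
  end
  done
end

# Stand-in for the real work: here we only collect the ids.
processed = []
each_batch_with_progress((1..2500).to_a, batch_size: 1000) do |batch|
  processed.concat(batch)
end
```

Only one batch is held in memory at a time, and a progress line after each batch shows early on whether the task is doing anything.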
You can use update_all with a subquery to do this.
sub_query = 'SELECT count(*) FROM `posts` WHERE `posts`.`user_id` = `users`.`id`'
User.where('num_posts IS NULL').update_all("num_posts = (#{sub_query})")
It should take seconds rather than hours, because the whole update runs as a single SQL statement.
In that case you may not need to log progress at all.

Changing FROM in all queries for an ActiveRecord model

I'm working on a rails project that is connected to a third-party MySQL database that I cannot change the schema for. So far, I've been able to shoe-horn everything into rails and make it play nice, but I've come across an interesting problem.
I have a table, we'll call it foos. I have an ActiveRecord model called Foo that uses this table. The problem is that this table represents two similar but distinct types of record. We'll call them Foo type A and Foo type B. To get around this, I've created two classes, FooTypeA and FooTypeB that inherit from Foo and have default scopes so that they only contain records of their respective types.
My code looks something like this:
class Foo < ActiveRecord::Base
# methods common to both types
end
class FooTypeA < Foo
default_scope -> { where is_type_a: true }
# methods for type A
end
class FooTypeB < Foo
default_scope -> { where is_type_a: false }
# methods for type B
end
For the most part, this works pretty well, except for the fact that sometimes an association chain joins over both of these models. Since they come from the same table, this causes ambiguity problems, and generates exploding SQL queries. I've been writing custom join queries to get around this, but it's quickly becoming cumbersome.
I know I can change the default table name for a model with the self.table_name value, but is there a way that I can tell rails to change the FROM portion of the SQL query for a model so that I can make all queries from FooTypeA read as: SELECT foo_as.* FROM foos AS foo_as ...
I'm open to other suggestions, but this seems like the easiest solution if it's possible.
Wouldn't the ActiveRecord .from method solve your problem?
You could also create two views (depending on your MySQL version) and use those as the table sources, but unless you only read from the tables you can run into writable-view issues, which I would try to avoid.

Rspec and Capybara: saving results to database?

I'm fairly new to RSpec and have been trying to create some tests for my website, on which a user can post a reservation to the website, which is then saved to our database. I've been trying, using Rspec and Capybara, to simulate a user posting a reservation to the website. We have an existing test database, and at the end of the Rspec test want the new reservation to be written to the database, and not removed at the end of the Rspec test.
One of two things happens when we run the code: either it "works" but the new reservation can't be found in the database, or we get this error:
Failure/Error: Unable to find matching line from backtrace
ActiveRecord::StatementInvalid:
Mysql2::Error: This connection is in use by: #<Thread:0x007fb421fd6218 sleep>: SELECT `users`.* FROM `users` WHERE `users`.`id` = 6 ORDER BY `users`.`id` ASC LIMIT 1
# ./app/controllers/application_controller.rb:95:in `pass_login_status_to_js'
# ./app/middleware/search_suggestions.rb:12:in `call'
Why would this be happening? I realize that Capybara isn't generally meant to be making permanent changes to a database; is there a different program/gem you recommend?
I currently have config.use_transactional_fixtures = false, and also have added the following on the recommendation of a few websites:
class ActiveRecord::Base
mattr_accessor :shared_connection
@@shared_connection = nil
def self.connection
@@shared_connection || retrieve_connection
end
end
ActiveRecord::Base.shared_connection = ActiveRecord::Base.connection
To reiterate, I do want Capybara to be writing to my database (we use SQL). What can I do differently? Does it have something to do with database cleaner?
Yes, it has everything to do with database_cleaner. If you have it set up properly, it will clean your database between scenarios to keep the tests isolated.
There are a few ways to do what you want:
1. You can explicitly tell database_cleaner not to clean certain tables between scenarios:
DatabaseCleaner.strategy = :transaction, {except: [:countries, :states]}
DatabaseCleaner.clean_with(:truncation, {except: [:countries, :states]})
2. You can add your code to a before(:each) or before(:all) block
3. You can add your data to one or many fixtures
There are only a few cases where you should share data between scenarios (for example countries and states tables, which are good candidates for #3).
In any other case, I advise against sharing data between scenarios.

Add a column to a table

How do I add a column to my Users table?
Because I have already run the migration, I have to do something like:
rails generate migration AddShowmsgColumnToUsers show_msg:boolean
and then:
rake db:migrate
But I'm not sure about "AddShowmsgColumnToUsers". How can I know what it is supposed to be? Why not AddShow_msgColumnToUsers? If the problem were pluralization and singularization, I could run the Rails console and check, but how can I know about the uppercase letters: ShowMsg/Show_msg/Show_Msg/Showmsg? Is there a command that helps me check it?
In answer to your first question, it doesn't matter, as long as the table name is correct - Rails uses the arguments you specify for the columns rather than the name of the migration.
Also, you should only really be asking one question at a time... ;-)
If you generate a migration to add a column, you should use either CamelCase or underscores. Besides, you don't have to put "Column" in the migration name; with Add...To... the generator already knows you are adding a column.
So either:
rails generate migration AddShowMsgToUsers show_msg:boolean
or:
rails generate migration add_show_msg_to_users show_msg:boolean
Either is the way to go. The migration generator will produce the following migration:
class AddShowMsgToUsers < ActiveRecord::Migration
def change
add_column :users, :show_msg, :boolean
end
end
Of course you could also do it all manually, but the whole point of generators is that you don't need to write everything yourself.
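To see why the CamelCase and underscore spellings end up equivalent, here is a simplified plain-Ruby sketch of the kind of name handling involved. It is not Rails's actual implementation; the function names and the regular expression are assumptions made for illustration.

```ruby
# Simplified stand-ins for illustration only; NOT Rails's own code.

# "add_show_msg_to_users" -> "AddShowMsgToUsers".
# A name that is already CamelCase passes through unchanged.
def migration_class_name(name)
  name.split("_").map { |part| part.sub(/\A[a-z]/) { |c| c.upcase } }.join
end

# "AddShowMsgToUsers" -> "add_show_msg_to_users".
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Picks out the table name from an Add...To... style name.
# Returns nil when the name does not follow that pattern.
def target_table(name)
  match = underscore(name).match(/\Aadd_.*_to_(.*)\z/)
  match && match[1]
end

puts migration_class_name("add_show_msg_to_users")
puts target_table("AddShowMsgToUsers")
puts target_table("AddShowmsgColumnToUsers")
```

Under these assumptions every spelling from the question resolves to the same table, which matches the point above: the part between Add and To is free-form, and the column itself comes from the show_msg:boolean argument.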

Rails best way to add huge amount of records

I've got to add about 25000 records to the database at once in Rails.
I have to validate them, too.
Here is what I have for now:
# controller create action
def create
emails = params[:emails][:list].split("\r\n")
@created_count = 0
@rejected_count = 0
inserts = []
emails.each do |email|
@email = Email.new(:email => email)
if @email.valid?
@created_count += 1
inserts.push "('#{email}', '#{Date.today}', '#{Date.today}')"
else
@rejected_count += 1
end
end
return if emails.empty?
sql = "INSERT INTO `emails` (`email`, `updated_at`, `created_at`) VALUES #{inserts.join(", ")}"
Email.connection.execute(sql) unless inserts.empty?
redirect_to new_email_path, :notice => "Successfully created #{@created_count} emails, rejected #{@rejected_count}"
end
It's VERY slow now; there is no way to add that many records because of the timeout.
Any ideas? I'm using mysql.
Three things come to mind:
You can help yourself with proper tools like zdennis/activerecord-import or jsuchal/activerecord-fast-import. The problem with your example is that you will also create 25000 objects. If you tell activerecord-import not to use validations, it will not create new objects (activerecord-import/wiki/Benchmarks).
Importing tens of thousands of rows into a relational database will never be super fast; it should be done asynchronously via a background process. There are also tools for that, like DelayedJob and more: https://www.ruby-toolbox.com/
Move the code that belongs to the model out of the controller(TM).
After that, you need to rethink the flow of this part of the application. If you're using background processing inside a controller action like create, you cannot simply return HTTP 201 or HTTP 200. What you need to do is return a "quick" HTTP 202 Accepted and provide a link to another representation where the user can check the status of their request (do we already have a success response? how many emails failed?), as it is now being processed in the background.
It can sound a bit complicated, and it is, which is a sign that you maybe shouldn't do it like that. Why do you have to add 25000 records in one request? What's the background?
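If the raw INSERT from the question is kept for now, two cheap improvements are to split it into smaller statements and to escape every value instead of interpolating it directly. The following is a rough sketch, not a drop-in fix: build_insert_statements is a made-up helper, and the quote argument is assumed to be any callable that escapes a value for SQL (in the app that would be the database connection's own quoting method).

```ruby
require "date"

# Hypothetical helper: turns a list of emails into multi-row INSERT
# statements of at most batch_size rows each. `quote` must escape a
# value for SQL and wrap it in quotes.
def build_insert_statements(emails, quote:, batch_size: 1000, today: Date.today)
  stamp = quote.call(today.to_s)
  emails.each_slice(batch_size).map do |batch|
    rows = batch.map { |email| "(#{quote.call(email)}, #{stamp}, #{stamp})" }
    "INSERT INTO `emails` (`email`, `updated_at`, `created_at`) " \
      "VALUES #{rows.join(', ')}"
  end
end

# Toy quoter for demonstration only; it doubles single quotes and is
# NOT a substitute for the database adapter's own quoting.
toy_quote = ->(value) { "'#{value.to_s.gsub("'", "''")}'" }

statements = build_insert_statements(
  ["a@example.com", "o'brien@example.com", "c@example.com"],
  quote: toy_quote, batch_size: 2, today: Date.new(2020, 1, 1)
)
statements.each { |sql| puts sql }
```

Each statement would then be executed one at a time, ideally from the background job rather than from the controller action.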
Why don't you create a rake task for the work? The following link explains it pretty well.
http://www.ultrasaurus.com/sarahblog/2009/12/creating-a-custom-rake-task/
In a nutshell, once you write your rake task, you can kick off the work by:
rake member:load_emails
If speed is your concern, I'd attack the problem from a different angle.
Create a table that copies the structure of your emails table; let it be emails_copy. Don't copy indexes and constraints.
Import the 25k records into it using your database's fast import tools. Consult your DB docs or see e.g. this answer for MySQL. You will have to prepare the input file, but it's way faster to do — I suppose you already have the data in some text or tabular form.
Create indexes and constraints for emails_copy to mimic emails table. Constraint violations, if any, will surface; fix them.
Validate the data inside the table. It may take a few raw SQL statements to check for severe errors. You don't have to validate emails for anything but very simple format anyway. Maybe all your validation could be done against the text you'll use for import.
Run insert into emails select * from emails_copy to put the emails into the production table. You might have to play with it a bit to get the autoincrement IDs right.
Once you're positive that the process succeeded, drop table emails_copy.