I'm trying to seed about 100,000 users using rake db:seed in my Rails 3 project and it is really slow!
Here's the code sample:
# ...
User.create!(
:display_name => "#{title} #{name} #{surname}",
:email => "#{name}.#{surname}_#{num}@localtinkers.com",
:password => '12341234'
)
It works, but it is really slow because for each user:
Devise issues a SELECT statement to find out if the email is already taken.
A separate INSERT statement is issued.
For other objects I use "ar-extensions" and "activerecord-import" gems as follows:
tags.each do |tag|
all_tags << Tag.new(:name => tag)
end
Tag.import(all_tags, :validate => false, :ignore => true)
The above creates just one INSERT statement for all the tags and runs very fast, much like restoring a MySQL database from an SQL dump.
But for users I cannot do this because I need Devise to generate the encrypted password, salt, etc. for each user. Is there a way to generate them on the SQL side, or are there other efficient ways of seeding users?
Thank you.
How about:
u = User.new(
:display_name => "#{title} #{name} #{surname}",
:email => "#{name}.#{surname}_#{num}@localtinkers.com",
:password => '12341234'
)
u.save!(:validate => false)
This should create and save the record without executing the validations, and therefore without checking for e-mail address uniqueness. Obviously the downside of this is that you're not being protected by any other validations on the user too, so make sure you check your data first!
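If every seeded user shares the same password, as in the question, the expensive part (hashing the password) only needs to happen once. The following is a minimal sketch of that idea, not Devise's own API: `build_user_rows` is a hypothetical helper, and the block you pass in stands for whatever produces the digest in your app (for example, reading `encrypted_password` from a single `User.new(:password => ...)`, assuming your Devise version stores everything it needs in that one column).

```ruby
# Hypothetical helper: builds attribute hashes for bulk insertion and
# computes the password digest only once, via the block it is given.
def build_user_rows(people, password)
  digest = yield(password) # the expensive hashing step runs exactly once
  people.each_with_index.map do |(title, name, surname), num|
    {
      :display_name       => "#{title} #{name} #{surname}",
      :email              => "#{name}.#{surname}_#{num}@localtinkers.com",
      :encrypted_password => digest
    }
  end
end
```

The resulting rows could then be handed to activerecord-import in a single call with `:validate => false`, as the question already does for tags. Whether your schema also needs a separate salt column depends on the Devise version, so check that before relying on this.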
All my previous projects used DatabaseCleaner, so I'm used to starting with an empty DB and creating test data within each test with FactoryGirl.
Currently, I'm working on a project that has a test database with many records. It is an sql file that all developers must import in their local test environments. The same DB is imported in the continuous integration server. I feel like having less control over the test data makes the testing process harder.
Some features allow their tests to focus on specific data, such as records that are associated to a certain user. In those cases, the preexisting data is irrelevant. Other features such as a report that displays projects of all clients do not allow me to "ignore" the preexisting data.
Is there any way to ignore the test DB contents in some tests (emulate an empty DB and create my own test data without actually deleting all rows in the test DB)? Maybe have two databases (both in the same MySQL server) and being able to switch between them (e.g., some tests use one DB, other tests use the other DB)?
Any other recommendations on how to deal with this scenario?
Thank you.
I would recommend preserving your test database and the 'test' environment as your 'clean' state. Then you could set up a separate database that you initially seed as your 'dirty' database. A before hook in your rails_helper file could also be set up with something like this:
RSpec.configure do |config|
config.before :each, type: :feature do |example|
if ENV['TEST_DIRTY'] || example.metadata[:test_dirty]
ActiveRecord::Base.establish_connection(
{
:adapter => 'mysql2',
:database => 'test_dirty',
:host => '127.0.0.1',
:username => 'root',
:password => 'password'
}
)
end
end
end
Your database.yml file will need configurations added for your 'dirty' database. But I think the key here is keeping your clean and dirty states separate. Cheers!
I have found that adding the following configuration to spec/rails_helper.rb will run all DB operations inside tests or before(:each) blocks as transactions, which are rolled back after each test is finished. That means we can do something like before(:each) { MyModel.delete_all }, create our own test data, run our assertions (which will only see the data we created) and after the end of the test, all preexisting data will still be in the DB because the deletion will be rolled back.
RSpec.configure do |config|
config.use_transactional_fixtures = true
end
I run a SQL query from a Ruby script, and the query should take around 2 hours.
How can I make sure the script exits only once the query has finished? Right now, when I run the script, it passes the query to the DB and exits immediately while the query is still running on the DB.
Most of the query consists of commands such as inserts, drop tables and create tables.
#!/usr/bin/env ruby
require 'mysql2'
client = Mysql2::Client.new(:host => ENV_YML['host'], :username => ENV_YML['username'], :password => ENV_YML['password'], :database => ENV_YML['dbtemp'], :flags => Mysql2::Client::MULTI_STATEMENTS)
client.query("
...
")
I want to run this query only after the first one finishes:
client.query("SELECT ;").each do |row|
....
end
Any idea how to wait for the query to finish? I want to add another query to the same script that checks the result of the first query after it finishes.
From the official documentation:
Multiple result sets
You can also retrieve multiple result sets. For this to work you need
to connect with flags Mysql2::Client::MULTI_STATEMENTS. Multiple
result sets can be used with stored procedures that return more than
one result set, and for bundling several SQL statements into a single
call to client.query.
client = Mysql2::Client.new(:host => "localhost", :username => "root", :flags => Mysql2::Client::MULTI_STATEMENTS)
result = client.query('...')
while client.next_result
result = client.store_result
# result now contains the next result set
end
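Applied to the question, the loop from the documentation can be wrapped in a small helper that reads every pending result before the script moves on to the follow-up query. This is only a sketch: `run_all_statements` is a hypothetical name, and `client` is assumed to be a `Mysql2::Client` opened with `Mysql2::Client::MULTI_STATEMENTS`. As far as I understand it, `client.query` returns as soon as the first statement's result is available, so the remaining results have to be consumed explicitly.

```ruby
# Hypothetical helper: runs a multi-statement SQL string and collects the
# result of every statement, so nothing is left pending on the connection.
def run_all_statements(client, sql)
  results = [client.query(sql)]      # result of the first statement
  while client.next_result           # true while another result is waiting
    results << client.store_result   # may be nil for statements without rows
  end
  results
end
```

After `run_all_statements(client, "...")` returns, the second `client.query` from the question can be issued on the same connection.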
OK, so let's get the basics out of the way.
I'm running ruby 1.8.7, I'm using the sequel gem version '2.6.0'.
I have a table called Users and a table called Teams
Right now a user can have one team, and as such its relation is:
belongs_to :npt_team
However, as part of a feature upgrade for teams, I have to make it so Users can be a part of multiple teams.
What I want to know:
I can change it to one of the following:
:has_and_belongs_to_many
:many_to_many
:many_to_many_by_ids
Which one is the best to use, and why (because I like to know)?
Second, what will happen to the tables in the DB when I change this?
Anything else I should be wary of or know about?
I'm using the following mysql version:
mysql Ver 14.14 Distrib 5.6.29, for osx10.11 (x86_64) using EditLine wrapper
EDIT:
Oops, I forgot to mention a rather pertinent point.
I'm not using Rails; I'm using an old framework called Ramaze.
The answer to my question is:
to create the relationship I need to add the following to the Users table:
has_and_belongs_to_many(:npt_teams,
:join_table => :users_teams,
:class => 'NptTeam',
:left_key => :user_id,
:right_key => :npt_team_id)
many_to_many_by_ids :npt_teams, 'UsersTeams'
Create a new join table like so:
class UsersTeams < Sequel::Model
clear_all
set_schema {
primary_key :id
integer :user_id, :null => false, :default => 0
integer :npt_team_id, :null => false, :default => 0
}
create_table unless table_exists?
belongs_to :user
belongs_to :npt_team
end
and the relationship is created along with the join table.
I don't know if this is the best way to do it, but it seems to work.
As for the second question, I don't know; the data currently in the DB seems to be unaffected.
Now I just need to move the current Team to the new table and that should be it.
As for what else I might need to know, I can't say, because those who do know have not responded, so I'm just going to have to wing it.
EDIT:
script to move data across:
User.all.each do |user|
join = UsersTeams.create(:user_id => user.id, :npt_team_id => user.npt_team_id)
puts join.inspect
join.save
puts user.npt_teams.to_a.map {|t|t.inspect}.to_s
end
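One thing to watch in the script above: it creates a join row for every user, including any whose `npt_team_id` is empty, while the join table declares that column `:null => false`. A minimal sketch of filtering those out first, using a plain Struct as a stand-in for the Sequel model (`UserRow` and `join_rows_for` are hypothetical names, and this assumes some users may have no team yet):

```ruby
# Stand-in for the User model; only the two fields the migration needs.
UserRow = Struct.new(:id, :npt_team_id)

# Hypothetical helper: returns the attribute hashes for the join table,
# skipping users that do not currently belong to a team.
def join_rows_for(users)
  users.reject { |user| user.npt_team_id.nil? }.map do |user|
    { :user_id => user.id, :npt_team_id => user.npt_team_id }
  end
end
```

Each returned hash can then be passed to `UsersTeams.create`, as in the loop above.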
I wrote a Ruby script that parses a lot of files into Ruby data structures, such as hashes.
I need to insert all this data in a MySQL database.
What I found:
mysql2
tmtm
dbi
Is there some native way to do this?
Thanks for any help
EDIT
Let's say that I have a hash with 100 entries like this:
hash = {"a" => 1, "b" => 2 ..., "c" => 100}
I would like to create a MySQL table with all of these as columns. I am afraid that doing this with Active Record is going to be hard.
PS: I'm not using Rails, just a simple Ruby script.
If I were you, I would prefer ActiveRecord, because then I don't have to clutter my code with lots of SQL statements. Besides, ActiveRecord makes life easier.
Set it up like this
require 'active_record'
ActiveRecord::Base.establish_connection(
:adapter => "mysql2",
:host => "host",
:username=>"user",
:password=>"user",
:database => "your_db"
)
Then use tables like this
class SingularTableName < ActiveRecord::Base
has_many :table_relationship
end
Then query like this
SingularTableName.all #=> all records
SingularTableName.first #=> first record
SingularTableName.where("query")
SingularTableName.create(...) #=> create a record/row
You can find more methods here => http://api.rubyonrails.org/classes/ActiveRecord/Base.html
Update:
To overcome plural table names and default primary key, you can use
class AnyName < ActiveRecord::Base
self.table_name = 'your table name'
self.primary_key = 'your primary key'
...
end
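For the hash from the question, if you would rather stay with the plain mysql2 gem, one option is to derive the column list and placeholders from the hash keys and let the driver bind the values. The sketch below only builds the SQL string and the value list; `insert_statement` is a hypothetical helper, and it assumes the table already exists with columns named after the keys.

```ruby
# Hypothetical helper: turns a hash into a parameterised INSERT statement
# plus the list of values to bind, in matching order.
def insert_statement(table, row)
  # Quote identifiers MySQL-style, doubling any embedded backticks.
  quote = lambda { |name| "`" + name.to_s.gsub("`", "``") + "`" }
  columns = row.keys.map { |key| quote.call(key) }.join(", ")
  placeholders = (["?"] * row.size).join(", ")
  sql = "INSERT INTO #{quote.call(table)} (#{columns}) VALUES (#{placeholders})"
  [sql, row.values]
end
```

With a newer mysql2 release (0.4 or later, as far as I know) the pair can be used as `sql, values = insert_statement("your_table", hash)` followed by `client.prepare(sql).execute(*values)`.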
I have this call in my controller:
@tournaments = Tournament.unignored.all(
:include => [:matches,:sport,:category],
:conditions=> ["matches.status in (0,4)
&& matches.date < ?",
Time.now.end_of_week + 1.day],
:order => "sports.sort ASC, categories.sort ASC, tournaments.sort ASC")
Everything works in production mode and in the development console as well. But when I try to browse to that page in development mode I get:
The error occurred while evaluating nil.each
When I paste the created SQL Query into MySQL Browser there are results.
It refers to mysql2 (0.2.11) lib/active_record/connection_adapters/mysql2_adapter.rb:587:in `select'
The query arrives correctly in this one.
Has anyone had similar problems? This error came out of nowhere. No updates, etc.
Thanks!
Rails 3.0.9, MySQL 5.5, Ruby 1.8.7 and the mysql2 0.2.11 gem
It looks like you need to use :joins instead of :include.
The :include option to all (and find, where, etc.) tells Rails to run a separate query to load all the necessary data for the given associated records.
The :joins option gets Rails to perform an SQL query that JOINs the associated models so you can query on their fields.
If you want to both query on the fields and preload them into the associations, you need to do both:
@tournaments = Tournament.unignored.all(
:include => [:matches,:sport,:category],
:joins => [:matches,:sport,:category],
:conditions=> ["matches.status in (0,4)
&& matches.date < ?",
Time.now.end_of_week + 1.day],
:order => "sports.sort ASC, categories.sort ASC, tournaments.sort ASC")