I'm trying to do a multi-table join that has a NOT IN component. Tables are
Post -> Term Relationship -> Term
Post
has_many :term_relationships
has_many :terms, :through => :term_relationships
TermRelationship
belongs_to :post
belongs_to :term
Term
has_many :term_relationships
has_many :posts, :through => :term_relationships
The goal is to get all posts except those in, let's say, "featured". My current query looks like:
WpPost.includes(:terms).where("terms.term NOT IN (?)", ["featured"])
This works great if the only term that it has attached is "featured". If the post belongs to "featured" and "awesome" it will still show because of "awesome".
Is there any way to exclude a row entirely? Will it require a subquery? And if it does, how would I go about doing that in Rails?
Thanks all!
Justin
You're misusing includes. It's for eager loading, not for joining!
But you're right about the approach: a subquery can be used in your case. Rails, however, won't issue a nested query for NOT IN (?) even where that would be logical. You'll get two queries instead (NOT IN (id1, id2, ...) rather than NOT IN (SELECT ...)).
So I would recommend the squeel gem:
Regular AR code (which can also be prettified with squeel):
featured_posts = WpPost.joins(:terms).where(terms:{term: ['featured']}).uniq
and then use squeel's power:
WpPost.where{id.not_in featured_posts}
(in and not_in are also aliased as >> and <<, but I didn't want to scare anybody)
Note the use of blocks and the absence of symbols.
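To make the difference between the two approaches concrete, here is a minimal sketch in plain Ruby. It uses no database or gem; the sample rows and the method names row_level_filter and exclude_posts are made up for illustration. It contrasts filtering individual joined rows (what the original where clause does) with first collecting the ids of featured posts and then excluding them (what the NOT IN subquery does).

```ruby
# In-memory stand-in for the posts/terms join (hypothetical sample data).
JOIN_ROWS = [
  { post_id: 1, term: "featured" },
  { post_id: 1, term: "awesome" },
  { post_id: 2, term: "awesome" },
  { post_id: 3, term: "featured" }
].freeze

# Row-level filter: mirrors WHERE terms.term NOT IN (...) applied to joined rows.
# A post survives as long as ANY of its rows has a non-excluded term.
def row_level_filter(rows, excluded)
  rows.reject { |row| excluded.include?(row[:term]) }
      .map { |row| row[:post_id] }
      .uniq
end

# Subquery-style exclusion: collect the ids of posts that have an excluded
# term, then remove those ids from the full list of post ids.
def exclude_posts(rows, excluded)
  banned_ids = rows.select { |row| excluded.include?(row[:term]) }
                   .map { |row| row[:post_id] }
                   .uniq
  rows.map { |row| row[:post_id] }.uniq - banned_ids
end

p row_level_filter(JOIN_ROWS, ["featured"]) # post 1 leaks through via "awesome"
p exclude_posts(JOIN_ROWS, ["featured"])    # post 1 is excluded entirely
```

Post 1 carries both "featured" and "awesome", so only the second method drops it.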
Some measurements based on Chinook Database under SQLite:
> Track.all
Track Load (35.0ms) SELECT "Track".* FROM "Track"
Relation with joins and like:
oldie = Track.joins{playlists}.where{playlists.name.like_any %w[%classic% %90%]}
Here's NOT IN:
> Track.where{trackId.not_in oldie}.all
Track Load (37.5ms) SELECT "Track".* FROM "Track" WHERE "Track"."trackId"
NOT IN (SELECT "Track"."TrackId" FROM "Track" INNER JOIN "PlaylistTrack" ON
"PlaylistTrack"."TrackId" = "Track"."TrackId" INNER JOIN "Playlist" ON
"Playlist"."PlaylistId" = "PlaylistTrack"."PlaylistId"
WHERE (("Playlist"."name" LIKE '%classic%' OR "Playlist"."name" LIKE '%90%')))
FYI:
Track.where{trackId.not_in oldie}.count # => 1971
Track.count # => 3503
# join table:
PlaylistTrack.count # => 8715
Conclusion: I don't see any overhead caused by NOT IN. 35.0 ms vs 37.5 ms isn't a noticeable difference. A few times 35.0 became 37.5 and vice versa.
One option is to do an OUTER JOIN and put the featured condition in the join itself. Then you select only the posts for which no featured term was joined. I don't know of a plain "Rails way" to do it, but with some extra SQL you could do it like this:
Post.joins("LEFT OUTER JOIN (term_relationships
INNER JOIN terms ON term_relationships.term_id = terms.id AND terms.term = 'featured')
ON posts.id = term_relationships.post_id").
where("term_relationships.post_id IS NULL")
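As a rough illustration of what the outer-join approach computes, here is an in-memory sketch in plain Ruby (no database; the sample data and the method names left_join_on_term and posts_without_term are hypothetical). It assumes the "featured" restriction is applied to the relationships before the outer join. If it is applied per relationship row instead, rows for a post's other terms also come back as NULL and the post slips through.

```ruby
# Hypothetical sample data: post 4 has no terms at all.
POSTS = [1, 2, 3, 4].freeze
RELATIONSHIPS = [
  { post_id: 1, term: "featured" },
  { post_id: 1, term: "awesome" },
  { post_id: 2, term: "awesome" },
  { post_id: 3, term: "featured" }
].freeze

# Emulates posts LEFT OUTER JOIN (relationships restricted to `term`).
# Returns [post_id, matched_row_or_nil] pairs; nil plays the role of NULL.
def left_join_on_term(posts, relationships, term)
  posts.flat_map do |post_id|
    matches = relationships.select do |rel|
      rel[:post_id] == post_id && rel[:term] == term
    end
    matches.empty? ? [[post_id, nil]] : matches.map { |rel| [post_id, rel] }
  end
end

# Emulates the WHERE ... IS NULL step: keep posts with no joined row.
def posts_without_term(posts, relationships, term)
  left_join_on_term(posts, relationships, term)
    .select { |_post_id, match| match.nil? }
    .map(&:first)
end

p posts_without_term(POSTS, RELATIONSHIPS, "featured")
```

Unlike an inner join, this also keeps posts that have no terms at all (post 4 in the sample data).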
Related
I'm trying to convert the following Rails where clause to use Arel, mostly to take advantage of the or method that Arel provides.
Post model
class Post
belongs_to :user
end
User model
class User
has_many :posts
end
I'm looking for posts posted by Mark.
This is the Rails Query:
Post.joins(:user).where(users: { first_name: 'Mark' })
I need to convert this query with Arel.
Thanks in advance!
This should do it.
# Generate Arel tables for both
posts = Arel::Table.new(:posts)
users = Arel::Table.new(:users)
# Make a join and add a where clause
posts.join(users).on(posts[:user_id].eq(users[:id])).where(users[:first_name].eq('Mark')).project(posts[Arel.star])
If you only need Arel for the where part (not for the join), I think this would be a better solution (it will yield ActiveRecord results):
Post.joins(:user).where(User.arel_table[:first_name].eq('Mark'))
I am new to Ruby on Rails and am working on performance issues in a Rails application. I am using New Relic RPM to find the bottlenecks in the code, and I found something that I cannot figure out. My application has three models, A, B and C, where model B has two properties, the primary key of A and the primary key of C, as follows:
class B
include DataMapper::Resource
belongs_to :A, :key=>true
belongs_to :C, :key=>true
end
Model of A is as follows:
class A
include DataMapper::Resource
property :prop1
...
has n, :bs
has n, :cs, :through => :bs
end
When I issue the statement a.find(:c.id=>10), it internally executes the following SQL query:
select a.prop1, a.prop2,... from a INNER JOIN b on a.id = b.a_id INNER JOIN c on b.c_id = c.id where (c.id=10) GROUP BY a.prop1, a.prop2,....
(The GROUP BY lists every property that appears in the select; I don't know why.)
This statement takes too much time during the web transaction. Interestingly, when I run the same auto-generated query at the mysql prompt in my terminal, it takes very little time. I think the slowness comes from having so many fields in the GROUP BY clause, but I cannot understand how the query is being formed. If anyone could help me figure this out and optimize it, I would be really grateful. Thank you.
I assume you have your model associations properly configured, something like this:
class A < ActiveRecord::Base
has_many :bs
has_many :cs, through: :bs
end
class B < ActiveRecord::Base
belongs_to :a
belongs_to :c
end
class C < ActiveRecord::Base
has_many :bs
has_many :as, through: :bs
end
then you could simply call:
a.cs.find(10) # mind the plural forms though
You will get better performance this way.
I'm using Rails 3.2 with ActiveRecord and MySQL and I have models with one to many association:
class Author
has_many :books
end
class Book
belongs_to :author
attr_accessible :review
end
I want to find the authors all of whose books are without a review. I tried:
Author.includes(:books).where('books.review IS NULL')
but it obviously didn't work, because it finds authors that have at least one book without a review. What query should I use?
SQL is quite simple:
SELECT authors.name, count(books.review)
FROM authors LEFT JOIN books ON (authors.id=books.author_id)
GROUP BY authors.name
HAVING count(books.review) = 0
Translating it to the AR query language may take me some time...
OK, so it seems to look like this:
Author.count('books.review', joins: :books, select: 'name',
group:'name', having: 'count_books_review=0')
To me, the SQL looks much less weird than this ;-)
Based on WRz's answer, I prepared my own query:
Author.joins(:books).group('authors.id').having("count(books.review)=0")
It's better suited for me, because it returns an AR Relation (and WRz's query returns a Hash).
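The GROUP BY / HAVING idea relies on COUNT(column) ignoring NULL values: an author qualifies only when the count of non-NULL reviews in their group is zero. A minimal in-memory sketch in plain Ruby, with made-up sample data and a hypothetical method name, looks like this:

```ruby
# Hypothetical sample data: author 1 has one reviewed book, author 2 has none.
BOOKS = [
  { author_id: 1, review: nil },
  { author_id: 1, review: "Great" },
  { author_id: 2, review: nil },
  { author_id: 2, review: nil }
].freeze

# Emulates GROUP BY author_id HAVING COUNT(review) = 0.
# Like SQL's COUNT(column), the block counts only non-nil reviews.
def authors_with_only_unreviewed_books(books)
  books.group_by { |book| book[:author_id] }
       .select { |_author_id, group| group.count { |book| !book[:review].nil? }.zero? }
       .keys
end

p authors_with_only_unreviewed_books(BOOKS)
```

Because the grouping starts from books, an author with no books at all never appears here. The same is true of a query built on joins(:books), which uses an inner join.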
Try this
Author.joins(:books).where('books.review is null')
Edit: This will fetch all the authors with at least one book with no review. I just realized your question is a bit different.
It would be something like this:
Author.joins(:books).select('authors.*, count(books.id) as
total_books, sum(case when books.review is null then 1 else 0 end)
as books_without_review').group('authors.id').having('count(books.id) =
sum(case when books.review is null then 1 else 0 end)')
P.S.: This is untested, so the exact syntax may need adjusting.
Try the following code.
class Author
has_many :books
end
class Book
belongs_to :author
attr_accessible :review
end
authors = Author.all.collect do |author|
if author.books.where(:review => nil).size == author.books.size
author
end
end
authors.compact!
After this code, authors will be an array containing all the authors having all the books unreviewed. Also note that I changed the author association in Book model to belongs_to instead of has_one. It is always a good practice to have has_many relation on one side and belongs_to association on the other side.
I have a model with a has_many relationship, and a scope to determine whether or not it has any children, such as:
scope :with_nomination, :include => [:nomination], :conditions => "nominations.service_id IS NOT NULL"
Using this, I can do something like Service.with_nomination and receive a list of all services with nomination children.
The problem is that when I do something like Service.select("id, firstName, lastName").with_nomination, ActiveRecord in essence does a SELECT * FROM services, which is very bad and does not use the indexes I so painstakingly set up.
How can I either rephrase my query or modify my scopes to work with the .select() command?
It turns out that with the syntax I was using, a select is not possible, so it does a SELECT * and any further selects are already overridden.
I re-wrote the scopes like so:
scope :no_nomination, joins("LEFT JOIN nominations ON nominations.service_id = services.id").where("nominations.service_id IS NULL")
# important distinction here, the left join allows you to find those records without children
scope :with_nomination, joins(:nomination).where("nominations.service_id IS NOT NULL")
Using this syntax allows me to do something like Service.select(:id,:user,:otherfield).with_nomination
8 years later...
This is ugly, but you could also convert the resulting ActiveRecord::Relation to SQL with to_sql and run the command manually with ActiveRecord::Base.connection.execute.
It might look like this:
query = Service.select("id, firstName, lastName").with_nomination.to_sql
records = ActiveRecord::Base.connection.execute(query)
records.first["firstName"] # => First Name
This doesn't eliminate the excess columns that the scope retrieves, and you have to access each field with string keys, but hey, at least you can still access them!
This query executes just fine:
p = PlayersToTeam.select("id").joins(:player).limit(10).order("players.FirstName")
This query causes my whole system to come to a screeching halt:
p = PlayersToTeam.select("id").includes(:player).limit(10).order("players.FirstName")
Here are the models:
class PlayersToTeam < ActiveRecord::Base
belongs_to :player
belongs_to :team
accepts_nested_attributes_for :player
end
class Player < ActiveRecord::Base
has_many :players_to_teams
has_many :teams, through: :players_to_teams
end
As far as I can tell, the includes does a LEFT JOIN and joins does an INNER JOIN. The query spit out (for joins) from Rails is:
SELECT players_to_teams.id FROM `players_to_teams` INNER JOIN `players` ON `players`.`id` = `players_to_teams`.`player_id` ORDER BY players.FirstName LIMIT 10
Which executes just fine on the command line.
SELECT players_to_teams.id FROM `players_to_teams` LEFT JOIN `players` ON `players`.`id` = `players_to_teams`.`player_id` ORDER BY players.FirstName LIMIT 10
also executes just fine, it just takes twice as long.
Is there an efficient way I can sort the players_to_teams records via players? I have an index on FirstName for players.
EDIT
It turns out the query required heavy optimization to run even half decently. Splitting the query was the best solution short of restructuring the data or customizing the query.
You might also consider splitting it into two (effectively three) queries. First, get the sorted ids with joins:
players_to_teams = PlayersToTeam.select("id").joins(:player).limit(10).order("players.FirstName")
Second (which internally runs two queries), get the PlayersToTeam records with their players preloaded:
players_to_teams = PlayersToTeam.includes(:player).where(:id => players_to_teams.map(&:id))
After that you will have fully initialized players_to_teams with their players loaded.
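One caveat with the two-step approach: a WHERE id IN (...) condition does not by itself promise that rows come back in the order of the id list, so the sort from the first query may need to be reapplied in Ruby. A minimal sketch in plain Ruby follows. The Row struct, the sample data and the restore_order helper are hypothetical stand-ins for the loaded records and are not part of ActiveRecord.

```ruby
# Hypothetical stand-in for a loaded PlayersToTeam record.
Row = Struct.new(:id, :player_name)

# Ids as returned, already sorted by player name, by the first query.
SORTED_IDS = [7, 3, 9].freeze

# Records as returned by the second query, in arbitrary order.
LOADED = [Row.new(3, "Bob"), Row.new(9, "Cid"), Row.new(7, "Al")].freeze

# Reorders records to follow the id list; ids with no record are skipped.
def restore_order(records, ids)
  by_id = records.each_with_object({}) { |record, index| index[record.id] = record }
  ids.map { |id| by_id[id] }.compact
end

ordered = restore_order(LOADED, SORTED_IDS)
p ordered.map(&:player_name)
```

Building the lookup hash once keeps the reordering linear in the number of records, which matters little for 10 rows but avoids repeated scans for larger pages.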
One thing to note is that includes will add a second database query to do the preloading. You should check what that one looks like (it should contain a big IN clause on the player_ids from players_to_teams).
As for how to avoid using includes: if you just need the name from players, you can do it like this:
PlayersToTeam.select("players_to_teams.id, players.FirstName AS player_name").joins(:player).limit(10).order("players.FirstName")