Is there a way in Active Record to construct a single query that will do a conditional join for multiple primary keys?
Say I have the following models:
class Athlete < ActiveRecord::Base
  has_many :workouts
end
class Workout < ActiveRecord::Base
  belongs_to :athlete
  named_scope :run, :conditions => {:type => "run"}
  named_scope :best, :order => "time", :limit => 1
end
With that, I could generate a query to get the best run time for an athlete:
Athlete.find(1).workouts.run.best
How can I get the best run time for each athlete in a group, using a single query?
The following does not work, because it applies the named scopes just once to the whole array, returning the single best time for all athletes:
Athlete.find([1,2,3]).workouts.run.best
The following works. However, it is not scalable for larger numbers of Athletes, since it generates a separate query for each Athlete:
[1,2,3].collect {|id| Athlete.find(id).workouts.run.best}
Is there a way to generate a single query using the Active Record query interface and associations?
If not, can anyone suggest a SQL query pattern that I can use with find_by_sql? I must confess I am not very strong at SQL, but if someone points me in the right direction I can probably figure it out.
To get the Workout objects with the best time:
athlete_ids = [1,2,3]
# Sanitize the SQL, since we need to substitute the bind variable.
# This query can return several workouts per athlete when best times tie.
join_sql = Workout.send(:sanitize_sql, [
"JOIN (
SELECT a.athlete_id, max(a.time) time
FROM workouts a
WHERE a.athlete_id IN (?)
GROUP BY a.athlete_id
) b ON b.athlete_id = workouts.athlete_id AND b.time = workouts.time",
athlete_ids])
Workout.all(:joins => join_sql, :conditions => {:athlete_id => athlete_ids})
If you require just the best workout time per athlete, then:
Athlete.maximum("workouts.time", :include => :workouts, :group => "athletes.id",
:conditions => {:id => [1,2,3]})
This will return an OrderedHash
{1 => 300, 2 => 60, 3 => 120}
Edit 1
The solution below avoids returning multiple workouts with same best time. This solution is very efficient if athlete_id and time columns are indexed.
Workout.all(:joins => "LEFT OUTER JOIN workouts b
ON workouts.athlete_id = b.athlete_id AND
(workouts.time < b.time OR
(workouts.time = b.time AND workouts.id < b.id))",
:conditions => ["workouts.athlete_id IN (?) AND b.id IS NULL", athlete_ids]
)
Read this article to understand how this query works. The tie-break on id in the JOIN (workouts.time = b.time AND workouts.id < b.id) ensures that only one row is returned when more than one workout matches the best time. In that case, the workout with the highest id is returned (i.e. the last workout).
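As a rough illustration of how the self-join picks one row per athlete, here is a plain-Ruby model of the same idea. It does not use Active Record; the sample rows and the helper names (beaten_by?, best_per_athlete) are made up for this sketch. A row survives when no other row for the same athlete has a larger time, with id used only to break ties.

```ruby
# Plain-Ruby model of the self-join; the rows below are made-up sample data.
WORKOUTS = [
  { :id => 1, :athlete_id => 1, :time => 300 },
  { :id => 2, :athlete_id => 1, :time => 250 },
  { :id => 3, :athlete_id => 2, :time => 60 },
  { :id => 4, :athlete_id => 2, :time => 60 }, # ties with id 3
  { :id => 5, :athlete_id => 3, :time => 120 }
]

# True when `other` outranks `row`: a larger time, or an equal time with a larger id.
def beaten_by?(row, other)
  other[:time] > row[:time] ||
    (other[:time] == row[:time] && other[:id] > row[:id])
end

# Keep the rows for which the "LEFT OUTER JOIN" finds no outranking partner,
# which is the role `b.id IS NULL` plays in SQL.
def best_per_athlete(rows, athlete_ids)
  rows.select do |row|
    next false unless athlete_ids.include?(row[:athlete_id])
    rows.none? do |other|
      other[:athlete_id] == row[:athlete_id] && beaten_by?(row, other)
    end
  end
end

best = best_per_athlete(WORKOUTS, [1, 2, 3])
best.each { |w| puts "athlete #{w[:athlete_id]}: workout #{w[:id]} (time #{w[:time]})" }
```

Athlete 2 has two workouts with the same time, and only the one with the higher id is kept.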
Certainly the following will not work:
Athlete.find([1,2,3]).workouts.run.best
because Athlete.find([1,2,3]) returns an Array, and you can't call workouts on an Array.
You can try something like this:
Workout.find(:first, :joins => [:athlete], :conditions => "athletes.id IN (1,2,3)", :order => 'workouts.time DESC')
You can edit the conditions according to your need.
Related
Good afternoon.
I have a model called 'Cliente' and another called 'Acct'. The relationship is one 'Cliente' to many 'Acct'. When I use a has_one relationship, it fetches all the millions of 'Acct' rows just to pick one of them.
Relation declarations in the 'Cliente' model:
'accts' => [
self::HAS_MANY,
'Acct',
'cliente_id',
],
'lastAcct' => [
self::HAS_ONE,
'Acct',
'cliente_id',
'order' => 'acct.id DESC',
],
In Yii (Yii1 as well as Yii2), creating a "Has One" relationship does not automatically apply a LIMIT 1 to the query. You can read more about the reasoning behind it here: https://github.com/yiisoft/yii/pull/2113
You should manually add the limit clause, like so:
'lastAcct' => [
self::HAS_ONE,
'Acct',
'cliente_id',
'order' => 'acct.id DESC',
'limit' => '1'
],
Thanks for your answer, sir.
This 'limit' is applied to the main query and not only to the relation. For example, if you fetch thousands of 'clientes', each with its last 'acct', it will not work: you will get only one 'cliente'.
To add to this:
To solve this I use a subquery, for example:
LEFT OUTER JOIN `radacct` `acct` ON ((acct.username = t.login) AND (acct.radacctid = (SELECT radacctid FROM `radacct` `acct_subquery` WHERE acct_subquery.username = t.login GROUP BY acct_subquery.username ORDER BY acct_subquery.radacctid DESC LIMIT 1)))
But what worries me is that this subquery may become too slow with millions of rows, since I have to produce ERP reports: weekly, monthly, yearly, and covering the whole period since the company started.
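To make the difference between the two kinds of limit concrete, here is a plain-Ruby sketch with made-up rows (no Yii or SQL involved; the names global_limit and last_by_cliente are invented for the example). A limit on the whole result keeps one row in total, while the correlated subquery amounts to taking the highest id inside each cliente's group.

```ruby
# Made-up sample rows standing in for the `acct` table.
ACCTS = [
  { :id => 10, :cliente_id => 1 },
  { :id => 11, :cliente_id => 1 },
  { :id => 12, :cliente_id => 2 },
  { :id => 13, :cliente_id => 3 },
  { :id => 14, :cliente_id => 3 }
]

# A limit on the whole result: sort everything by id DESC and keep one row.
global_limit = ACCTS.sort_by { |a| -a[:id] }.first(1)

# A per-cliente "limit": the row with the highest id inside each group,
# which is what the correlated subquery selects.
pairs = ACCTS.group_by { |a| a[:cliente_id] }.map do |cliente_id, rows|
  [cliente_id, rows.max_by { |a| a[:id] }[:id]]
end
last_by_cliente = Hash[pairs]

puts global_limit.inspect
puts last_by_cliente.inspect
```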
How do you use a conditional OR in Thinking Sphinx?
The situation is:
I have a Message model with a sender_id and recipient_id attribute. I would like to compose this query:
Message.where("sender_id = ? OR recipient_id = ?", business_id, business_id)
Right now, I'm searching twice: once for all the messages that have recipient_id = business_id and once for all the messages that have sender_id = business_id. Then I just merge the results.
I feel that there's a more efficient way to do this.
EDIT - Adding index file
ThinkingSphinx::Index.define :message, with: :active_record, delta: ThinkingSphinx::Deltas::DelayedDelta do
# fields
indexes body
# attributes
has job_id
has sender_id
has recipient_id
end
Sphinx doesn't allow for OR logic between attributes, only fields. However, a workaround would be to combine the two columns into a third attribute:
has [sender_id, recipient_id], :as => :business_ids, :multi => true
And then you can search on the combined values like so:
Message.search :with => {:business_ids => business_id}
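The effect of the combined attribute can be modelled in plain Ruby. This is only a sketch with made-up messages, not Thinking Sphinx code, and messages_for is a hypothetical helper: each message carries a list holding both ids, and a filter on that list matches when the business is either the sender or the recipient.

```ruby
# Made-up messages; :business_ids mimics the combined multi-value attribute.
MESSAGES = [
  { :id => 1, :sender_id => 7, :recipient_id => 9 },
  { :id => 2, :sender_id => 9, :recipient_id => 4 },
  { :id => 3, :sender_id => 5, :recipient_id => 6 }
].map { |m| m.merge(:business_ids => [m[:sender_id], m[:recipient_id]]) }

# Filtering on the combined attribute: a message matches when the id
# appears anywhere in its list, i.e. as sender OR recipient.
def messages_for(messages, business_id)
  messages.select { |m| m[:business_ids].include?(business_id) }
end

matching_ids = messages_for(MESSAGES, 9).map { |m| m[:id] }
puts matching_ids.inspect
```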
So I have a model FeaturedListing that has a field date, which is a MySQL date field. There will be multiple FeaturedListings that have the same date value.
I want to find all dates that have N or more FeaturedListings on them. I think I'd have to group by date, then find which groups have N or more rows and get the value that was grouped on. Could anyone give me any pointers to accomplish that? A raw SQL query may be required.
Edit: Thanks to the answers below, I got on the right track and finally have a solution I like. This has some extra conditions specific to my application, but I think it's pretty clear. This snippet finds all dates after today that have at least N featured listings on them.
$dates = $this->find('list', array(
'fields' => array('FeaturedListing.date'),
'conditions' => array('FeaturedListing.date >' => date('Y-m-d') ),
'group' => array("FeaturedListing.date HAVING COUNT(*) >= $N")
)
);
I then make a call to array_values() to remove the index from the returned list and flatten it to an array of date strings.
$dates = array_values($dates);
No need to go to raw SQL, you can achieve this easily in cake ($n is the variable that holds N):
$featuredListings = $this->FeaturedListing->find('all', array(
'fields' => array('FeaturedListing.date'),
'group' => array("FeaturedListing.date HAVING COUNT(*) >= $n"),
));
In "raw" SQL you would use group by and having:
select `date`
from FeaturedListings fl
group by `date`
having count(*) >= N;
If you want the listings on these dates, you need to join this back to the original data. Here is one method:
select fl.*
from FeaturedListings fl join
(select `date`
from FeaturedListings fl
group by `date`
having count(*) >= N
) fld
on fl.`date` = fld.`date`
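For readers less familiar with GROUP BY and HAVING, the two queries above can be mirrored in plain Ruby. This is a sketch over made-up rows, with the hypothetical helpers busy_dates and listings_on_busy_dates standing in for the two SQL statements; it is not CakePHP or SQL.

```ruby
# Made-up rows standing in for the FeaturedListings table.
LISTINGS = [
  { :id => 1, :date => "2013-05-01" },
  { :id => 2, :date => "2013-05-01" },
  { :id => 3, :date => "2013-05-02" },
  { :id => 4, :date => "2013-05-03" },
  { :id => 5, :date => "2013-05-03" },
  { :id => 6, :date => "2013-05-03" }
]

# GROUP BY date HAVING COUNT(*) >= n
def busy_dates(listings, n)
  groups = listings.group_by { |l| l[:date] }
  groups.select { |_date, rows| rows.size >= n }.keys
end

# Join the qualifying dates back to the rows to get the listings themselves.
def listings_on_busy_dates(listings, n)
  dates = busy_dates(listings, n)
  listings.select { |l| dates.include?(l[:date]) }
end

puts busy_dates(LISTINGS, 2).inspect
puts listings_on_busy_dates(LISTINGS, 2).map { |l| l[:id] }.inspect
```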
I am trying to do a basic report here
Show a list of users with details like names, age etc.
Also show other details like number of posts, last activity time etc.
All these other details are in a has_many relation from users
This is what I am doing right now:
User.find(
:all,
:select => "users.id, users.name, count(distinct(posts.id)) as posts_count",
:joins => "left outer join posts on posts.user_id = users.id",
:group => "users.id",
:limit => "100"
)
I have indexed the user_id column in the posts table.
My problem is that this takes a very long time and sometimes hangs when I try to join more tables along with posts, such as activities, comments, etc.
Is there a way to join only the count, or some other way to achieve this?
Thanks in advance
I think one of these might work, but as always when doing an outer join, the query will be slower.
User.count(:include=>[:posts])
User.count(:select => "users.id, users.name, count(distinct(posts.id)) as posts_count",
:joins => "left outer join posts on posts.user_id = users.id",
:group => "users.id")
User.find(:all, :include => [:posts])
Also you can put in an initializer file or irbrc file this:
ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Base.clear_active_connections!
And check the queries as you type them in console.
For post count
Post.count
As far as reporting each post count later you can do that with the User record you have already.
@users = User.find(:all, :include => [:posts])
Later ...
<% @users.each do |user| %>
Posts: <%= user.posts.size %>
<% end %>
Users sorted by post count
@sorted_users = @users.sort_by{|user| user.posts.size }
https://stackoverflow.com/a/5739222/1354978
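One way to read "join the count alone" is to compute the counts in a separate grouped pass and look them up per user, rather than loading every post. The plain-Ruby sketch below models that with made-up data; in Rails 2 the counts hash would presumably come from something like Post.count(:group => :user_id), which is an assumption here and not taken from the answers above.

```ruby
# Made-up sample data standing in for the users and posts tables.
USERS = [
  { :id => 1, :name => "Ann" },
  { :id => 2, :name => "Bob" },
  { :id => 3, :name => "Cy" }
]
POSTS = [
  { :id => 1, :user_id => 1 },
  { :id => 2, :user_id => 1 },
  { :id => 3, :user_id => 2 }
]

# One pass over posts, like a single GROUP BY user_id count query.
posts_count = Hash.new(0)
POSTS.each { |post| posts_count[post[:user_id]] += 1 }

# Attach the count to each user; users without posts get 0.
report = USERS.map { |u| u.merge(:posts_count => posts_count[u[:id]]) }
sorted = report.sort_by { |u| -u[:posts_count] }

sorted.each { |u| puts "#{u[:name]}: #{u[:posts_count]}" }
```

Each further has_many table (activities, comments) would get its own counts hash, which avoids multiplying rows across several outer joins.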
I was wondering if it's possible to use :include in named_scope but to specify only specific columns to :include?
Currently I use the following:
class ProductOverwrite < ActiveRecord::Base
belongs_to :product
named_scope :with_name, :include => :product
def name
product.name
end
end
But I'm wondering if I can select specific columns from the product table instead of selecting the entire set of columns, most of which I obviously don't need.
This isn't something rails does out of the box.
You could 'piggy back' the attribute
named_scope :with_product_name, :joins => :product, :select => 'product_overwrites.*, products.name as piggy_backed_name'
def product_name
read_attribute(:piggy_backed_name) || product.name
end
If it is possible for a ProductOverwrite to have no product, then you'd need a left join rather than the default inner join.
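The fallback in product_name can be seen in isolation with a couple of stand-in classes. FakeProduct and FakeOverwrite below are hypothetical and are not Rails classes; the sketch only shows that the associated product is touched when the piggy-backed column is absent.

```ruby
# Hypothetical stand-ins for the Active Record models; not Rails classes.
class FakeProduct
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class FakeOverwrite
  attr_reader :product_loads

  def initialize(attributes, product)
    @attributes = attributes
    @product = product
    @product_loads = 0 # counts how often the association is accessed
  end

  def read_attribute(key)
    @attributes[key]
  end

  def product
    @product_loads += 1
    @product
  end

  # Prefer the piggy-backed column; fall back to the association otherwise.
  def product_name
    read_attribute(:piggy_backed_name) || product.name
  end
end

with_join = FakeOverwrite.new({ :piggy_backed_name => "Widget" }, FakeProduct.new("Widget"))
without_join = FakeOverwrite.new({}, FakeProduct.new("Gadget"))
puts with_join.product_name
puts without_join.product_name
```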