ActiveRecord find identical set in many_to_many models - mysql

I have an anti-pattern in my Rails 3 code and I was wondering how to do this properly.
Let's say a customer orders french fries and a hamburger. I want to find out if such an order has been placed before. To keep it simple each item can only be ordered once per order (so no "two hamburgers please") and there are no duplicate orders either.
The models:
Order (attributes: id)
has_many :items_orders
has_many :items, :through => :items_orders
Item (attributes: id, name)
has_many :items_orders
has_many :orders,:through => :items_orders
ItemsOrder (attributes: id, item_id, order_id)
belongs_to :order
belongs_to :item
validates_uniqueness_of :item_id, :scope => :order_id
The way I do it now is to fetch all orders that include at least one of the line items. I then iterate over them to find the matching order. Needless to say that doesn't scale well, nor does it look pretty.
order = [1, 2]
1 and 2 correspond to the Item ids of fries and hamburgers.
candidates = Order.find(
:all,
:include => :items_orders,
:conditions => ["items_orders.item_id in (?)", order])
previous_order = nil
candidates.each do |candidate|
if candidate.items.collect{|i| i.id} == order
previous_order = candidate
break
end
end
I'm using MySQL and PostgreSQL, so I'm also open to a standard SQL solution that ignores most of ActiveRecord.

Assuming you only want to find identical orders, I'd be tempted to use a hash to achieve this. I'm not able to test the code I'm about to write, so please check it before you rely on it!
Something like this:
- Create a new attribute order_hash in your Order model using a migration.
- Add a before_save hook that updates this attribute using e.g. an MD5 hash of the order lines.
- Add a method for finding like orders which uses the hash to find other orders that match quickly.
The code would look a little like this:
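require 'digest/md5'

class Order < ActiveRecord::Base
  before_save :update_order_hash

  # Orders whose stored hash matches this order's items
  def find_similar
    Order.where(:order_hash => calculate_hash)
  end

  def calculate_hash
    # Sort the ids so the hash does not depend on load order, and join them
    # with a separator so that [1, 23] and [12, 3] hash differently.
    Digest::MD5.hexdigest(items.map(&:id).sort.join(","))
  end

  private

  def update_order_hash
    self.order_hash = calculate_hash
  end
end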
This code would allow you to create an order, and then calling order.find_similar should return a list of orders with the same items attached. You could apply exactly the same approach even if you had item quantities etc. The only constraint is that you have to always be looking for orders that are the same, not just vaguely alike.
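The important property of the hash is that it depends only on the set of item ids, not on the order in which they happen to be loaded. A minimal plain-Ruby sketch of that calculation, outside ActiveRecord (the method name `order_fingerprint` is made up for illustration):

```ruby
require 'digest/md5'

# Hypothetical helper: builds a digest from a list of item ids.
# Sorting makes the result independent of input order; the "," separator
# keeps [1, 23] and [12, 3] from producing the same input string.
def order_fingerprint(item_ids)
  Digest::MD5.hexdigest(item_ids.sort.join(","))
end

fries_and_burger = order_fingerprint([1, 2])
burger_and_fries = order_fingerprint([2, 1])

puts fries_and_burger == burger_and_fries                       # same set, same digest
puts order_fingerprint([1, 23]) == order_fingerprint([12, 3])   # different sets
```

Since two different sets could in principle share a digest, it is probably worth comparing the actual item ids of any match before treating it as identical, and indexing the order_hash column so the lookup stays fast.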
Apologies if the code is riddled with bugs - hopefully you can make sense of it!
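Since the question also asks about a plain SQL route: one common alternative (a different technique from the hash column above) is to group items_orders by order_id and keep only the groups whose items are exactly the wanted set. The sketch below builds such a statement and mirrors its logic in plain Ruby on sample rows. `identical_order_sql`, `identical_order_ids` and the sample data are made up for illustration, and the SQL has not been run against the asker's schema.

```ruby
# Hypothetical helper: SQL returning the ids of orders whose item set is
# exactly item_ids. Relies on each item appearing at most once per order
# (the validates_uniqueness_of rule in the question).
def identical_order_sql(item_ids)
  ids = item_ids.map { |id| Integer(id) }.uniq   # Integer() rejects non-numeric input
  in_list = ids.join(", ")
  <<-SQL
SELECT order_id
FROM items_orders
GROUP BY order_id
HAVING COUNT(*) = #{ids.size}
   AND SUM(CASE WHEN item_id IN (#{in_list}) THEN 1 ELSE 0 END) = #{ids.size}
  SQL
end

# The same logic in memory, on [order_id, item_id] rows.
def identical_order_ids(rows, item_ids)
  wanted = item_ids.uniq.sort
  rows.group_by { |order_id, _| order_id }
      .select { |_, group| group.map { |_, item_id| item_id }.sort == wanted }
      .keys
end

rows = [[10, 1], [10, 2], [11, 1], [12, 1], [12, 2], [12, 3]]
p identical_order_ids(rows, [2, 1])   # only order 10 has exactly items 1 and 2
puts identical_order_sql([1, 2])
```

The COUNT(*) condition rules out orders with extra items, and the SUM condition rules out orders that are missing one of the wanted items.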

Related

Rails 4 - how to generate statistics through "joins" and "includes" commands in Activerecord?

I'm kind of stuck trying to generate statistics for my application. The relevant part of the application has the following structure:
class CarRegistration < ActiveRecord::Base
belongs_to :ride
belongs_to :car
...
end
class Car < ActiveRecord::Base
has_many :car_registration
...
end
class Ride < ActiveRecord::Base
belongs_to :passenger
belongs_to :driver
has_many :car_registration
...
end
class Driver < ActiveRecord::Base
has_many :cars
...
end
class Passenger < ActiveRecord::Base
has_many :cars
...
end
I am trying to get a list of rides, top drivers and top passengers. I originally tried something like this:
@rides_finished = Ride.joins(:car_registration)
.select('rides.id')
.where("(car_registrations.ride_id = rides.id)
AND rides.status = 3
AND rides.driver_currency = ?
AND rides.passenger_currency = ?", currency, currency)
.distinct # against displaying one shipment multiple times
And then I tried:
@top_passengers = @rides_finished.joins(:passenger)
.select('passengers.id, passengers.name, count(rides.passenger_id) AS count_all')
.where('rides.passenger_id IS NOT NULL')
.group('passengers.id')
.order('count_all DESC')
.limit(10)
But when I run these queries, I get
Mysql2::Error: Unknown column 'count_all' in 'order clause': ...
Any help how to get the needed numbers?
Thank you very much
Your question is a little confusing because your query uses Ride but there is no Ride in the model definitions listed. I've focussed purely on the example queries you listed.
I think it would be easier to start with a single query chain for 'top passengers':
Passenger
.select('passengers.*')
.select('count(1) as ride_count')
.joins(:rides)
.where(rides: { status: 3,
driver_currency: currency,
passenger_currency: currency })
.group('passengers.id')
.order('ride_count desc')
.limit(10)
That will get you an ActiveRecord::Relation of Passenger models that also respond to a ride_count call, e.g. you could use it like:
results.each do |p|
puts "#{p.name}: #{p.ride_count}"
end
If all that works, you should be able to adjust the query to get the top drivers.
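To make clear what the grouped query is meant to return, here is the same calculation in plain Ruby on a few in-memory rows. The `Ride` struct, the sample data and `top_passengers` are made up for illustration; in the application the database does this work.

```ruby
Ride = Struct.new(:passenger_id, :status, :driver_currency, :passenger_currency)

# Counts finished rides (status 3) per passenger for one currency and
# returns [passenger_id, ride_count] pairs, busiest passengers first.
# Ties are broken by passenger id, which is an arbitrary choice made here.
def top_passengers(rides, currency, limit = 10)
  finished = rides.select do |r|
    r.status == 3 &&
      r.driver_currency == currency &&
      r.passenger_currency == currency &&
      !r.passenger_id.nil?
  end
  counts = finished.group_by(&:passenger_id).map { |id, rs| [id, rs.size] }
  counts.sort_by { |id, count| [-count, id] }.first(limit)
end

rides = [
  Ride.new(1, 3, "USD", "USD"),
  Ride.new(1, 3, "USD", "USD"),
  Ride.new(2, 3, "USD", "USD"),
  Ride.new(2, 1, "USD", "USD"),   # not finished
  Ride.new(3, 3, "EUR", "EUR")    # other currency
]
p top_passengers(rides, "USD")   # [[1, 2], [2, 1]]
```

Swapping passenger_id for a driver id in the grouping step gives the corresponding driver ranking.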
To get the list of finished rides, I suggest a separate, simple query:
Ride.where(status: 3,
driver_currency: currency,
passenger_currency: currency)
Let me know if any of that produces an error.

Rails 4: ActiveRecord or MySQL query where no related models have attribute

Having a tough time with this one. I have a Job model, and a JobStatus model. A job has many statuses, each with different names (slugs in this case). I need an 'active' method I can call to find all jobs where none of the associated statuses has a slug of 'dropped-off'.
class Job < ActiveRecord::Base
belongs_to :agent
has_many :statuses, :class_name => "JobStatus"
validates :agent_id,
:pickup_lat,
:pickup_lng,
:dropoff_lat,
:dropoff_lng,
:description,
presence: true
class << self
def by_agent agent_id
where(agent_id: agent_id)
end
def active
#
# this should select all items where no related job status
# has the slug 'dropped-off'
#
end
end
end
Job Status:
class JobStatus < ActiveRecord::Base
belongs_to :job
validates :job_id,
:slug,
presence: true
end
The closest I've gotten so far is:
def active
joins(:statuses).where.not('job_statuses.slug = ?', 'dropped-off')
end
But it's still selecting the Job that has a dropped-off status because there are previous statuses that are not 'dropped-off'. If I knew the raw SQL, I could probably work it into ActiveRecord, but I can't quite wrap my head around it.
Also not married to using activerecord, if the solution is raw SQL that's fine too.
Job.where.not(id: JobStatus.where(slug: 'dropped-off').select(:job_id))
will generate a nested subquery for you.
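The reason the join-based attempt in the question keeps returning finished jobs is that a join yields one row per status, so a job with an earlier status row still passes a slug != 'dropped-off' filter. The subquery excludes the job as a whole. A plain-Ruby sketch on made-up rows (the `Status` struct, the slugs other than 'dropped-off' and both helper names are illustrative):

```ruby
Status = Struct.new(:job_id, :slug)

statuses = [
  Status.new(1, "created"), Status.new(1, "picked-up"), Status.new(1, "dropped-off"),
  Status.new(2, "created"), Status.new(2, "picked-up"),
  Status.new(3, "created")
]
job_ids = [1, 2, 3, 4]   # job 4 has no statuses at all

# Row-level filter, like joins(:statuses) with slug != 'dropped-off'
def active_by_join(statuses)
  statuses.reject { |s| s.slug == "dropped-off" }.map(&:job_id).uniq
end

# Job-level filter, like WHERE id NOT IN (SELECT job_id ... 'dropped-off')
def active_by_subquery(job_ids, statuses)
  dropped = statuses.select { |s| s.slug == "dropped-off" }.map(&:job_id)
  job_ids - dropped
end

p active_by_join(statuses)               # [1, 2, 3]: job 1 is still included
p active_by_subquery(job_ids, statuses)  # [2, 3, 4]
```

Note that the inner join also drops jobs that have no status rows yet (job 4 here), whereas the subquery keeps them.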
Not the cleanest method, but you could use two queries.
# Getting the ID of all the Jobs which have 'dropped-off' JobStatuses
dropped_off_ids = JobStatus.where(slug: 'dropped-off').pluck(:job_id)
# Using the previous array to filter the Jobs
Job.where.not(id: dropped_off_ids)
Try this:
def active
Job.joins(:statuses).where.not('job_statuses.slug' => 'dropped-off')
end
or this:
def active
Job.joins(:statuses).where('job_statuses.slug != ?', 'dropped-off')
end
I think you may want to reevaluate your data model somewhat. If the problem is that you're turning up old statuses when asking about Job, you likely need to have a column identifying the current status for any job, i.e. job.statuses.where(current_status: true)
Then you can very easily grab only the rows which represent the current status for all jobs and are not "dropped-off".
Alternatively, if I'm misunderstanding your use case and you're just looking for any job that has ever had that status, you can just go backwards and search for the status slugs first, i.e.
JobStatus.where.not(slug: "dropped-off").map(&:job)

Query processing time is high: performance issue when retrieving data

I have a somewhat complicated model design with many associations between the models.
Here is the model design:
User Model
class User < ActiveRecord::Base
has_many :records
validates :email, presence: true
end
Record Model
class Record < ActiveRecord::Base
belongs_to :user
has_many :record_type_student
has_many :record_type_employee
has_many :record_type_other
end
RecordTypeStudent Model
class RecordTypeStudent < ActiveRecord::Base
belongs_to :record
belongs_to :user
belongs_to :source
end
Similar Model for other two RecordTypeOther and RecordTypeEmployee
I have also added indexes to RecordTypeStudent, RecordTypeOther and RecordTypeEmployee for fast retrieval from Record.
Currently, retrieving 1000 records including all three types takes 2.4 seconds, which I think is a lot.
Here is how I am querying my records
first Query
@records = Record.where(:user_id => 1)
@r = []
@records.each do |record|
  if !record.record_type_students.empty?
    @r += record.record_type_students
  end
  if !record.record_type_others.empty?
    @r += record.record_type_others
  end
  if !record.record_type_employees.empty?
    @r += record.record_type_employees
  end
end
The processing is very slow, and it is only 1000 records, so I am not sure whether the problem is bad queries on my part or something else.
The application uses MySQL as its database.
Completed 200 OK in 2338ms (Views: 0.5ms | ActiveRecord: 445.5ms)
It seems to me like you are creating a lot of unnecessary queries. Instead of pulling out the records and iterating over them (which should have been done with .find_each rather than .each), you can query the individual record types with the right records which will result in an IN clause and will be done on the database side. If I am understanding your schema correctly, you can get the same data as follows (AR 4.1.1):
record = Record.where(user_id: 1)
r = []
r += RecordTypeStudent.where(record: record).to_a
r += RecordTypeEmployee.where(record: record).to_a
r += RecordTypeOther.where(record: record).to_a
This will result in 3 queries total. You can make the code cleaner as follows:
r = [RecordTypeStudent, RecordTypeEmployee, RecordTypeOther].flat_map do |type|
type.where(record: Record.where(user_id: 1)).to_a
end
If you want to further drop the number of queries, you could get the data via UNION between those tables but I don't think that's necessary.
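The difference in query count can be illustrated without a database. The sketch below uses a made-up `FakeTable` class that only counts how often it is asked for rows; it is a stand-in for illustration, not ActiveRecord.

```ruby
# A stand-in for a database table: counts how many "queries" are issued.
class FakeTable
  attr_reader :queries

  def initialize(rows)
    @rows = rows          # each row is a hash with a :record_id key
    @queries = 0
  end

  # One query per call, like WHERE record_id IN (...)
  def where_record_ids(ids)
    @queries += 1
    @rows.select { |row| ids.include?(row[:record_id]) }
  end
end

record_ids = (1..1000).to_a
students  = FakeTable.new([{ :record_id => 1 }, { :record_id => 2 }])
employees = FakeTable.new([{ :record_id => 2 }])

# Per-record lookups: one query for every record.
per_record = record_ids.flat_map { |id| students.where_record_ids([id]) }
puts students.queries    # 1000

# Batched lookup: one query per table, whatever the number of records.
batched = [students, employees].flat_map { |t| t.where_record_ids(record_ids) }
puts employees.queries   # 1
puts batched.size        # 3
```

With three record types, the per-record loop therefore issues thousands of small queries for 1000 records, while the batched version issues one per type.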
I guess it's just here in the question's text but those has_many in the Record model have to be specified with plural arguments, otherwise it won't work.

Rails activerecord order by field in related table

I have a typical forum style app. There is a Topics model which has_many Posts.
What I want to do using Rails 2.3.x is query the topics table and sort by the most recent post in that topic.
@topics = Topic.paginate :page => params[:page], :per_page => 25,
:include => :posts, :order => 'HELP'
I'm sure this is a simple one but no joy with Google. Thanks.
Sorting on a joined column is probably a bad idea and will take an enormous amount of time to run in many situations. What would be better is to twiddle a special date field on the Topic model when a new Post is created:
class Post < ActiveRecord::Base
after_create :update_topic_activity_at
protected
def update_topic_activity_at
Topic.update_all({ :activity_at => Time.now }, { :id => self.topic_id})
end
end
Then you can easily sort on the activity_at column as required.
When adding this column you can always populate the initial activity_at with the highest posting time if you have existing data to migrate.
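For that backfill, the initial value per topic is simply the newest post time. In plain Ruby on made-up rows (the `Post` struct here is a bare stand-in, and the sample times and `latest_activity` are illustrative):

```ruby
Post = Struct.new(:topic_id, :created_at)

# Newest created_at per topic_id: the value to store in activity_at.
def latest_activity(posts)
  posts.group_by(&:topic_id).each_with_object({}) do |(topic_id, group), result|
    result[topic_id] = group.map(&:created_at).max
  end
end

posts = [
  Post.new(1, Time.utc(2010, 1, 1)),
  Post.new(1, Time.utc(2010, 3, 1)),
  Post.new(2, Time.utc(2010, 2, 1))
]
activity = latest_activity(posts)

# Topic ids ordered by most recent activity, newest first
p activity.sort_by { |_, time| time }.reverse.map(&:first)   # [1, 2]
```

Once the column is populated, sorting topics only touches the topics table.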

Sorting a Rails database table by a column in an associated model

I'm trying to implement Ryan Bates' sortable table columns code (Railscast #228) but I'd like to be able to sort on an associated column. In particular, I have the following models and associations:
class Project < ActiveRecord::Base
belongs_to :program_manager, :class_name => "User"
class User < ActiveRecord::Base
has_many :program_manager_projects, :class_name => "Project", :foreign_key => "program_manager_id"
The association between the Project model and the User model is mediated by the 'program_manager_id' foreign key, which the user sets in the new/edit views using a collection-select dropdown. Here's part of the annotation at the top of project.rb:
# Table name: projects
# program_manager_id :integer
I want to be able to sort my list of projects in the index view by the program manager's name, i.e., by project.program_manager.name.
Ideally, I'd be able to point :order to this name somehow, perhaps with something like this in the index method of my ProjectsController:
@projects = Project.find(:all, :order => project.program_manager.name)
But that obviously won't work (not to mention Ryan's routine implements this with a specific reference to table names from the model to be sorted.)
I've come across some intimidating approaches that use named_scope, such as:
named_scope :most_active, :select => "questions.*", :joins => "left join comments as comments_for_count on comments_for_count.question_id = questions.id", :group => "questions.id", :order => "count(questions.id) desc"
But given my lack of MySQL expertise, this is fairly impenetrable to me.
Can anyone help me either generalize the named_scope example above for my specific case, or point me to a more straightforward strategy?
Thanks very much,
Dean
Let's dissect that named scope you referenced above. Imagine a model Question which has many Comments.
named_scope :most_active, :select => "questions.*", :joins => "left join comments as comments_for_count on comments_for_count.question_id = questions.id", :group => "questions.id", :order => "count(questions.id) desc"
:most_active
The name of your scope. You call it on the model class: Question.most_active
:select => "questions.*"
By default a scope selects all columns from your table anyway; this makes explicit that the results come only from the questions table and not from the comments table. This is optional.
:joins => "left join comments as comments_for_count on comments_for_count.question_id = questions.id"
This says that for every question, we also want to get all comments associated with it. The comments table has a column 'question_id', which is what we use to match each comment to the appropriate question record. This is important: it gives us access to fields that are not on our model!
:group => "questions.id"
This is required for the count() function in the order clause to tell us that we want the count of comments based on question. We don't need the count function in our order clause, so we also don't need this group statement
:order => "count(questions.id) desc"
Return the results in order of number of comments, highest to lowest.
So for our example, discarding what we don't need, and applying to your needs, we end up with:
named_scope :by_program_manager_name, :joins => "left join users on projects.program_manager_id = users.id", :order => "users.name"
This named_scope is called on the model class:
Project.by_program_manager_name
Note this is basically equivalent to:
Project.find(:all, :joins => "left join users on projects.program_manager_id = users.id", :order => "users.name")
But, as cam referenced above, you should really know the underlying SQL. Your abilities will be severely hampered without this understanding.
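When wiring this into sortable column headers, the sort key typically arrives as a request parameter, so it is safer to map it to a known SQL fragment than to interpolate it into the query. A minimal sketch, assuming a hypothetical `SORTABLE` table and `order_clause` helper (neither comes from the Railscast, and the projects.name column is an assumption):

```ruby
# Hypothetical whitelist: request parameter value => SQL order fragment.
SORTABLE = {
  "name"            => "projects.name",   # assumed column on projects
  "program_manager" => "users.name"       # column from the joined users table
}.freeze

# Returns a safe ORDER BY fragment; unknown keys fall back to the default.
def order_clause(sort, direction)
  column = SORTABLE.fetch(sort.to_s, "projects.name")
  dir = direction.to_s.downcase == "desc" ? "DESC" : "ASC"
  "#{column} #{dir}"
end

puts order_clause("program_manager", "desc")       # users.name DESC
puts order_clause("id; DROP TABLE projects", nil)  # projects.name ASC
```

The returned string could then be passed as the :order option together with the left join on users, so that users.name is available to sort on.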