I have an Order model that has_many :order_operations. An OrderOperation is created every time the order's state changes. I want to show all OrderOperations' created_at values for my orders without running new queries. I am using MySQL.
class Order < ActiveRecord::Base
  has_many :order_operations

  def change_state(new_state)
    order_operations.create(to_state: new_state)
  end

  def date_for_state(state_name)
    order_operations.where(state: state_name).pluck(:created_at).last
  end
end
I know about the includes and joins methods, but calling date_for_state always runs a new query. Even if I remove where and pluck, a query is still performed.
The only idea I have is to create a service object for this.
When you do a join/include, Rails caches the results of one particular query: specifically, the query that fetches all of the order_operations associated with the order.
If you had loaded @order, eager-loading the associated order_operations, and did @order.order_operations, then Rails has cached the associated order_operations as part of the include and doesn't need to load them again.
However, if you do @order.order_operations.where(state: state_name).pluck(:created_at).last, this is a different query than the one used in the include, so Rails says "he's asking for some different stuff to the stuff I cached, so I can't use the cached stuff; I need to make another query". You might say "aha, but this will always just be a subset of the stuff you cached, so can't you just work out which of the cached records this applies to?", but Rails isn't that smart.
If you were to do
@order.order_operations.select { |oo| oo.state == state_name }.map(&:created_at).last
then you're just doing some array operations, and .order_operations will use the cached records, as it's the same query as the one you cached with includes, i.e. a straight join to all associated records. But if you call it on an instance variable @order that doesn't happen to have already eager-loaded the associated records, then it will be much less efficient, because it will do a much bigger query than the one you had originally.
In other words: if you want an efficiency gain from using includes, then the parameters to your includes call need to exactly match the association calls you will make subsequently on the object.
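The in-memory filtering described above can be sketched in plain Ruby, with Structs standing in for eager-loaded OrderOperation records (no database involved; the sample states and timestamps are made up for illustration):

```ruby
require 'time'

# Stand-in for OrderOperation records already loaded into memory.
OrderOperation = Struct.new(:state, :created_at)

cached_operations = [
  OrderOperation.new('pending', Time.parse('2015-01-01 10:00')),
  OrderOperation.new('shipped', Time.parse('2015-01-02 12:00')),
  OrderOperation.new('shipped', Time.parse('2015-01-03 09:00'))
]

# Pure array operations: nothing here issues a new query,
# it just walks the cached collection.
last_shipped_at = cached_operations
  .select { |op| op.state == 'shipped' }
  .map(&:created_at)
  .last

puts last_shipped_at
```

The same chain called through ActiveRecord's where would bypass the cache; the Enumerable versions of select and map are what keep everything in memory.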
Related
Is it possible to eager load a specific entry from the users table, as well as all associated tables attached via belongs_to?
For example, I have a users table, plus accounts and patients tables that both belong_to :user. If I wanted to grab a specific User entry and eager-load the associated tables, it would look something like this:
user = User.find_by_email("testemail@here.com").eager_load(:accounts, :patients)
How can I do it?
You were close. Try putting the associations in an array within the includes call, as shown below:
user = User.includes([:accounts, :patients]).find_by(email: "testemail@here.com")
Now that you have included the associated tables and found your user, you can access those associations on the user without firing another query against the database. For example, once that has run you can do:
all_user_accounts = user.accounts
This will not fire a query against the database; instead the records are loaded from memory.
If you use #eager_load you are doing one query, whereas #includes will do it in one or two depending on what it deems necessary. So what is #includes for? It decides for you which way it is going to be; you let Rails handle that decision.
Check this guide for great information on includes/eager_load/preload in Rails:
http://blog.arkency.com/2013/12/rails4-preloading/
I have a Ruby 2.2.2 app that uses ActiveRecord as the ORM (similar to Rails). I have two tables, "users" and "accounts", where "accounts" has belongs_to :user. For a given user, I simply want to eager load the contents of the "accounts" table into a Ruby object (named user_accounts) so I can perform operations like:
user_accounts.find_by_product_name("nameofproduct")
...without the find_by_product_name method performing a SQL query. I simply want to preload all entries from the "accounts" table (that belong to a given user) into a Ruby object so I can minimize the number of SQL queries performed.
No matter how much documentation I read, I cannot figure out how to do this properly. Is this possible? If so, how?
If you don't want the ORM to re-query the database, then I think you are better off using the select method added by the Enumerable mixin, because if you try to use the find_by_* methods, I think it will always send another query.
Here is an example of how it could be achieved.
# The accounts association will be loaded and cached the first time
@user.accounts.select { |account| account.name == "nameofproduct" }

# The next time, it is already loaded and will select from the previously loaded records
@user.accounts.select { |account| account.name == "nameofanotherproduct" }
I would store the accounts in a hash instead of an array, because lookups in a hash are much faster (O(1)) than in an array, which only allows O(n).
group_by helps you build that hash.
# assuming that @user.accounts returns all accounts
user_accounts = @user.accounts.group_by(&:name)

# querying the hash; use `first` to receive the first matching
# account (even if there is only one)
user_accounts['nameofproduct'].first
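A minimal, runnable sketch of this group_by pattern in plain Ruby (Structs stand in for loaded Account records; the `name` and `balance` attributes are assumptions for illustration):

```ruby
# Stand-in for Account records that were already loaded from the database.
Account = Struct.new(:name, :balance)

accounts = [
  Account.new('nameofproduct', 100),
  Account.new('nameofproduct', 250),
  Account.new('otherproduct',  5)
]

# Build the lookup hash once; each key maps to an array of matching accounts.
user_accounts = accounts.group_by(&:name)

# O(1) hash lookup instead of scanning the whole array on every call.
puts user_accounts['nameofproduct'].first.balance  # => 100
```

Note that each value in the hash is an array, which is why the answer calls `.first` even when only one account is expected per name.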
So I have the following model structure in my Django App:-
class SuperModel(models.Model):
    f1 = models.CharField()
    f2 = models.CharField()

class Model(SuperModel):
    f3 = models.CharField()

class OverrideModel(models.Model):
    fpk = models.OneToOneField(Model, primary_key=True)
    f1 = models.CharField()
    f2 = models.CharField()
Basically, in my application, the fields f1 and f2 in the Model table contain user information that I have entered. The user has the ability to override this information, and any changes he/she makes are stored in the OverrideModel table (because I do not want to lose the information that I had entered first). Think of it as me creating user profiles earlier, while now I want the user to be able to edit his/her own profile without losing the information that I had entered about them.
Now, since the rest of my application (views/templates etal) work with the field names in the Model class, what I want is to create a view of the data that fetches the field f1 from the override table if it exists, otherwise it should pickup f1 from the table it used to earlier without resorting to a raw queryset.
I will describe everything I have considered so far so that some of the other constraints I am working with become clear:-
1. Model.objects.annotate(f1=Case(When(overridemodel__f1__isnull=True, then=F('f1')), default=F('overridemodel__f1')))
This throws the error that the annotate alias conflicts with a field already in the table.
2. Model.objects.defer('f1').extra(select={'f1': 'CASE WHEN ... END'}, tables=..., where=...)
This approach cannot be applied because I could not figure out a way to apply an outer join using extra. The override model may not have a row corresponding to each model row. Specifying the override table in the tables clause performs a cross product operation which combined with where can be used to perform an inner join, not an outer join (although I'd be happy to be proved wrong).
EDIT: I have realized that select_related might be able to solve the above problem but if I filter the queryset generated by Model.objects.select_related('overridemodel').defer('f1').extra(select={'f1': 'CASE WHEN ... END'}, tables=..., where=...) on the field f1, say qs.filter(f1='Random stuff') the where clause for the filter query uses the Model.f1 field rather than the f1 field generated in extra. So this approach is also futile.
3. Using Model.objects.raw() to get a raw queryset.
This is a non-starter because the Django ORM becomes useless after using raw and I need to be able to filter / sort the model objects as part of the application.
4. Defining methods/properties on the Model class.
Again, I will not be able to use the same field names here which involves hunting through code for all usages and making changes.
5. Creating a view in the database that gives me what I want, and creating an unmanaged model that reads the data from that view.
This is probably the best solution for my problem but having never used an unmanaged model before, I'm not sure how to go about it or what pitfalls I might encounter. One problem that I can think of off the top of my head is that my view always has to be kept in sync with the models but that seems a small price to pay compared to hunting through the codebase and making changes and then testing to see if anything broke.
So, there you have it. As always, any help / pointers will be greatly appreciated. I have tried to provide as minimal an example as possible; so if any more information is required I'll be happy to provide it.
Also, I am using Django 1.8 with MySQL.
I realized that there is no easy canonical way to solve my problem. Even with option 5 (creating a database view manipulated through the ORM via an unmanaged Model), I would lose the related query names on the original model that are being used in my filtering/sorting.
So, for anyone else with a similar problem, I would recommend the approach I finally went with: instead of keeping an OverrideModel, keep an OverriddenModel that stores the values that are overridden whenever the user makes changes, and update the original Model with the override values so that the model always contains the values on which filtering/querying is going to occur.
As my first Rails app, I am building a homework management app which has these tables:
users (from Devise authentication)
schools
courses
assignments
Unlike most examples of course/grading apps I've found, this one is never concerned with all the grades for all students for a particular course, but has only a 1:many relationship between student and courses. So the examples don't really help me.
In order to calculate a user's current grade in any given course (which requires access to data in both course model and assignment model), I am following a suggestion from here on Stack Overflow and creating a PORO in the app/models directory called GradeCalculator.
But this is my first experience with building a PORO into a Rails app, and most of the documentation I'm finding online is for more sophisticated users. I'm assuming it doesn't need a controller (please correct me if I'm wrong), and I see that building it is as simple as:
app/models/gradecalculator.rb
class GradeCalculator
  def calculate_current_course_grade(course_id)
    @graded_course_assignments = Assignment.where(user_id: current_user.id, course_id: course_id, graded: true)
    # grab weights for each type of assignment in @graded_course_assignments from the courses table
    # do some calculations
    # return the array
  end

  def calculate_user_GPA(user_id)
    # more of the same
  end
end
My questions are:
1. Can a PORO access the database (to get data from the courses and assignments tables)? Or do I need to pass it all the relevant data from my other classes (like assignments) as params when calling it?
1a. If a simple class can access the database, does the syntax differ from that in the models? Would the above code be able to call Assignment.where?
1b. How would I call this PORO? For example, how would I call it from my views/assignments/index.html.erb?
2. Can it access Devise's current_user helper?
3. Tangentially, I just realized that I could store assignment weights in the assignments table. I was thinking chronologically (the user inputs the number of homework assignments, quizzes, etc. at the time of entering a new course, which determines the weight for each type of assignment), but I could programmatically have each new assignment populate its own weight field by referencing the number of like assignments from its parent course. This would mean, logically, I could do the grade calculation right in the Assignment model. But if extra credit or other changes were added to the course, all the assignments might then have to recalculate their weights. Which way is more correct?
Writing these questions makes me suspect that I should just pass my PORO the relevant data from my view, let it do calculations on that data, and return a grade. But I will post anyway, just in case I'm wrong.
The reason for breaking business logic out into POROs like this is usually to make your code easier to reason about and easier (and faster) to test. To that end, you do not want GradeCalculator to know or care how Assignment works. You should just pass GradeCalculator all of the data it needs (or a Relation, which quacks like an Enumerable). Having GradeCalculator call Assignment.where means that your tests will depend on ActiveRecord, and the database, which means they'll be slow. If GradeCalculator just expects an array, in your tests you'll just have to mock an array of objects that respond to whatever attribute methods GradeCalculator needs to know about, and you'll be able to run them without loading Rails at all. (It's common to have a separate spec_helper.rb and rails_helper.rb so that specs that don't need Rails can run without loading Rails, which makes them so much faster.)
Per your second question my advice is similar: Decouple your POROs as much as possible from Rails and from Devise.
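A minimal sketch of the decoupled PORO this answer describes. The attribute names `grade` and `weight` and the weighted-average formula are assumptions for illustration; the point is that the calculator accepts any enumerable of objects responding to those methods, so plain Structs work in tests:

```ruby
# A PORO that knows nothing about ActiveRecord or Devise:
# it just receives a collection of assignment-like objects.
class GradeCalculator
  def initialize(assignments)
    @assignments = assignments
  end

  # Weighted average of the graded assignments.
  def current_course_grade
    total_weight = @assignments.sum(&:weight)
    return 0.0 if total_weight.zero?

    @assignments.sum { |a| a.grade * a.weight } / total_weight
  end
end

# In tests (or here), plain Structs stand in for Assignment records,
# so nothing needs Rails or a database to run.
Assignment = Struct.new(:grade, :weight)
assignments = [Assignment.new(90.0, 0.25), Assignment.new(80.0, 0.75)]

puts GradeCalculator.new(assignments).current_course_grade  # => 82.5
```

In the app itself, a controller (where current_user is available) would do something like `GradeCalculator.new(current_user.assignments.where(course_id: id, graded: true))`, keeping the Devise and ActiveRecord knowledge at the boundary.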
On one of my PHP/MySQL sites, every user can block every other user on the site. These blocks are stored in a Blocked table with each row representing who did the blocking and who is the target of the block. The columns are indexed for faster retrieval of a user's entire "block list".
For each user, we must exclude from any search results any user that appears in their block list.
In order to do that, is it better to:
1) Generate the "block list" whenever the user logs in by querying the Blocked table once at login and saving it to the $_SESSION (re-querying and re-saving any time they make a change to their "block list"), and then querying as such:
NOT IN ($commaSeparatedListFromSession)
or
2) Exclude the blocked users in "real-time" directly in the query by using a sub-query for each user's search query as such:
NOT IN (SELECT userid FROM Blocked WHERE Blocked.from = $currentUserID) ?
If the website is PHP and the blocklist is less than, say, 100 entries per user, I would store it in a table and load it into $_SESSION when it is changed or at login. You could just as easily load it from SQL on each page load into a local variable, however.
What I would store in $_SESSION is a flag, 'has_blocklist_contents', that decides whether or not you should load or check the blocklist on page load.
Instead of using a NOT IN clause with the list in all of your queries, I think it might be smarter to filter the results out using PHP.
I have two reasons for wanting to implement this way:
Your database can re-use the same SQL for all users on the system, resulting in a performance boost for retrieving comments and such.
Your block list will most of the time be empty, so you're not adding any processing time for the majority of users.
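The application-side filtering suggested above amounts to one reject pass over the result set. A sketch of the idea (shown in Ruby for brevity; a PHP version would use array_filter and in_array; the Post structure and sample data are made up):

```ruby
# Stand-ins for rows returned by the shared, cacheable search query.
Post = Struct.new(:id, :author_id, :body)

posts = [
  Post.new(1, 10, 'hello'),
  Post.new(2, 20, 'spam'),
  Post.new(3, 30, 'hi')
]

# Block list loaded once at login (e.g. from the session).
blocked_author_ids = [20]

# Filter in application code, so every user's search can run
# the exact same SQL against the database.
visible_posts = posts.reject { |post| blocked_author_ids.include?(post.author_id) }

puts visible_posts.map(&:id).inspect  # => [1, 3]
```

For large block lists, converting `blocked_author_ids` to a Set (or a hash, as in PHP's associative arrays) makes each membership check O(1) instead of O(n).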
I think there is a third solution, and in my opinion it would be the better way to go.
If you can write this
NOT IN (SELECT userid FROM Blocked WHERE Blocked.from = $currentUserID)
Then you can surely write this.
....
SomeTable st
LEFT JOIN Blocked b
  ON (st.userid = b.userid AND b.from = $currentUserID)
WHERE b.primaryKey IS NULL;
I hope you understand what I mean by the above query.
This way you get the best of both worlds: you don't have to run two queries, and you don't have to save data in $_SESSION.
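The LEFT JOIN / IS NULL construction above is an anti-join: keep each row of SomeTable only if no matching Blocked row exists for the current user. The same logic can be mimicked in plain Ruby terms, with hashes standing in for table rows (table contents invented for illustration):

```ruby
# Hashes stand in for rows of SomeTable and Blocked.
some_table = [{ userid: 1 }, { userid: 2 }, { userid: 3 }]
blocked    = [{ userid: 2, from: 99 }, { userid: 3, from: 42 }]

current_user_id = 99

# Keep a row only when the left join would find no partner,
# i.e. when b.primaryKey IS NULL in the SQL version.
visible = some_table.reject do |row|
  blocked.any? { |b| b[:userid] == row[:userid] && b[:from] == current_user_id }
end

puts visible.map { |row| row[:userid] }.inspect  # => [1, 3]
```

User 2 is dropped because the current user blocked them; user 3 survives because that block row belongs to a different blocker, which is exactly why the `from` condition lives in the ON clause rather than the WHERE clause.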
Don't use the $_SESSION as a substitute for a proper caching system. The more junk you pile into $_SESSION, the more you'll have to load for each and every request.
Using a sub-select for exclusions can be brutally slow if you're not careful to keep your database tuned. Make sure your indexes cover all of your WHERE conditions.