I have the following query
@initial_matches = Listing.find_by_sql(["SELECT * FROM listings WHERE industry = ?", current_user.industry])
Is there a way I can run another SQL query on the selection from the above query using an each...do? I want to run Geokit calculations to eliminate certain listings that are outside of a specified distance...
Your question is slightly confusing. Do you want to use each..do (Ruby) to do the filtering, or do you want to use a SQL query? Here is how you can let the Ruby process do the filtering:
refined_list = @initial_matches.map { |listing|
  listing.out_of_bounds? ? nil : listing
}.compact
If you wanted to use SQL you could simply add the additional SQL (maybe a sub-select) into your Listing.find_by_sql call.
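For example, if Listing uses Geokit's acts_as_mappable, the distance filter can stay in SQL. A sketch (untested, assuming a recent geokit-rails where within is a scope, and origin is the point you are measuring from):

listings = Listing.where(industry: current_user.industry).
                   within(50, origin: origin, units: :miles)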
If you want to do as you say in your comment:
WHERE location1.distance_from(location2, :units => :miles)
you are mixing Ruby (location1.distance_from(location2, :units => :miles)) and SQL (WHERE x > 50). This is difficult, but not impossible.
However, if you have to do the distance calculation in Ruby anyway, why not do the filtering there as well? So, in the spirit of my first example:
listing2 = some_location_to_filter_by
@refined_list = @initial_matches.map { |listing|
  listing.distance_from(listing2) > 50 ? nil : listing
}.compact
This will iterate over all listings, keeping only those within 50 miles of some predetermined location.
EDIT: If this logic is done in the controller you need to assign to @refined_list instead of refined_list, since only controller instance variables (as opposed to local ones) are accessible to the view.
In short, no. After the initial query you are not left with a relational table or view; you are left with an array of ActiveRecord objects. So any processing done after the initial query has to be written in Ruby and ActiveRecord, not SQL.
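You can see why from what find_by_sql actually returns (a quick illustration):

listings = Listing.find_by_sql(["SELECT * FROM listings WHERE industry = ?", "tech"])
listings.class               # => Array, not ActiveRecord::Relation
listings.respond_to?(:where) # => false, so no further SQL can be chained on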
What is the best approach for searching? What will be the difference if I filter all the data in the controller and get the result, versus using a where query in the model and getting the result? Please suggest your opinion.
It depends on the complexity of your query.
You can measure the processing time by putting time flags in your code between each step.
Then you make one test of speed processing like:
print time_flag 1
var results_sql_processing = *complex query*
print time flag 2
var raw_results_script_processing = *dump query*
print time flag 3
var results_script_processing = *processing the data*
print time flag 4
and make sure results_script_processing == results_sql_processing.
You can also set different dataset sizes (limit 100, 500, 1000) and see how the difference between both solutions evolves.
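In Ruby, for instance, the time flags can be Benchmark.realtime calls. A minimal sketch, assuming the Listing model from the thread above and an industry filter as the example workload:

require 'benchmark'

[100, 500, 1000].each do |limit|
  sql_time = Benchmark.realtime do
    # filtering done inside the SQL query
    Listing.where(industry: 'tech').limit(limit).to_a
  end

  script_time = Benchmark.realtime do
    # dump query, then filtering done in the script
    Listing.limit(limit).to_a.select { |l| l.industry == 'tech' }
  end

  puts "limit=#{limit}: sql=#{sql_time.round(4)}s script=#{script_time.round(4)}s"
end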
Also, I would recommend taking a look at the query builders you can find in many frameworks (I use Laravel's query builder).
They are usually a very good compromise when the query isn't too complex (no data aggregation, complex concatenation, ...); you can still use joins, unions, and many filters with them.
But if you want to build a super complex pivot table, for instance, just write a strong SQL query by hand and then fire it from your code!
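In Rails, "firing it from your code" would look something like this sketch (the aggregate query itself is just a placeholder):

rows = ActiveRecord::Base.connection.select_all(<<-SQL)
  SELECT industry, COUNT(*) AS listings_count
  FROM listings
  GROUP BY industry
SQL
rows.each { |row| puts "#{row['industry']}: #{row['listings_count']}" }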
I have some reporting methods throughout my app and in some cases I want to return the count (for a dashboard), and in others, return the full result set for viewing the details of a report.
What I'm wondering, is there a way to dynamically choose to show the count (instead of what I'm doing here):
def get_query_results(reporting_parameters, count_only = true)
  # put together reporting details...
  if count_only
    MyModel.where(query).count
  else
    MyModel.where(query)
  end
end
I considered setting a local variable to the result of my query and then calling count on it, but that queries the database again (and even if it didn't, it could increase memory usage). Is there an effective way to do this in one query? This is one of several queries like this in my app, otherwise I wouldn't care. Also, I'd use a ternary, but the actual query conditions in my app are much longer than in my example here, and that makes it unreadable.
Suppose you are doing this:
@collection = get_query_results(...)
Then you can do this afterwards instead of inside of the action:
@collection.count
And if you'd like to call another method:
def total_number(collection)
  collection.count
end

@collection = get_query_results(...)
no_of_records = total_number(@collection)
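Another option that avoids the count_only flag entirely is to lean on the fact that where returns a lazy relation, so the caller decides between a count and the full rows. A sketch using the placeholder names from the question:

def get_query_results(reporting_parameters)
  # put together reporting details...
  MyModel.where(query) # no SQL has run yet
end

relation = get_query_results(reporting_parameters)
relation.count # runs only a SELECT COUNT(*) query
relation.to_a  # runs the full SELECT only where the details are needed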
TL;DR? See Edit 2
I've got a little Rails application with a few different sorts of games people can play: it's based around sports, so they can pick the winners of each game every week (model PickEm, boolean attribute correct, nil for unfinished games) and predict the outcome of a specific team's game (model Guess, integer attribute score, nil for unfinished games). Every User has_many PickEms and Guesses. And I'm trying to display standings (correct/total, total being all non-nil; score/total possible).
What I'm finding is that I can gather the users and their associated records, but in trying to display standings I'm discovering that every single User triggers another query - slow and not sustainable as the user base increases. That's because @user.pick_em_score is pick_ems.where(correct: true).size and @user.guess_score is guesses.where.not(score: nil).sum(:score). So I call user.pick_em_score and it runs that query. I feel like there should be a way to get every User, as well as these specific counts, at once, rather than buffering a whole bunch of needless extra stuff.
What I need:
User record
User.pick_em_score (calculated by counting correct records)
User.pick_ems count where NOT NULL
User.guesses_score (calculated by guesses.sum(:score))
User.guesses count where NOT NULL
Most of what I find on Rails's ActiveRecord helpers, especially the calculation helpers, is about retrieving only the calculation. It looks like I'll probably need to delve directly into select() etc., but I can't get it working. Can someone point me in the right direction?
Edit
For clarification: I'm aware that I can write this information to the User model, but this is overly restrictive: next season I'll need to add a new column to User for that year's results, and so on. In addition, this would be a third degree of callbacks updating related models – the Match model already updates related PickEms and Guesses on save. I'm looking for the simplest ActiveRecord query or queries to work with this information, as indicated by the title. Ideally one query that returns the above information, but if it needs to be a few, that's OK.
I used to work directly in MySQL with PHP, but those skills have rusted (in raw MySQL, I imagine, I'd have several sub-select statements to help pull these counts) and I'd also like to be able to use Rails's ActiveRecord helpers and such, and avoid constructing raw SQL as much as possible.
Second Edit:
I seem to have it down to one call that starts to work, but I'm writing a lot of SQL. It's also brittle, IMO, and trying to run with it has failed. It also looks like I'm just pushing the million singular SELECT queries from Rails right into SQL, but that may still be a step up.
User.unscoped.select('users.*',
  '(SELECT COUNT(*) FROM pick_ems WHERE pick_ems.user_id = users.id AND pick_ems.correct) AS correct_pick_ems',
  '(SELECT COUNT(*) FROM pick_ems WHERE pick_ems.user_id = users.id AND pick_ems.correct IS NOT NULL) AS total_pick_ems',
  '(SELECT SUM(guesses.score) FROM guesses WHERE guesses.user_id = users.id AND guesses.score IS NOT NULL) AS guesses_score',
  '(SELECT COUNT(*) FROM guesses WHERE guesses.user_id = users.id AND guesses.score IS NOT NULL) AS guesses_count')
The issue seems to be: is there a way to use Rails, and not raw SQL, to link up users.id that we see there with these subqueries? Or just … a better way to construct this, in general?
In addition, I'm running another set of SELECTs for the WHERE clause, which would hinge on total_pick_ems and guesses_count being > 0, but since I can't reference those aliased columns there, I have to repeat the SELECTs one more time.
Welcome to AR. It's really only good for simple CRUD-like queries. Once you actually want to query your data in anger, it just doesn't have the capabilities to do the queries you want without resorting to wholesale SQL strings, often abandoning the ability to chain as a result.
That's precisely why I moved to Sequel: it has the features to compose queries using a much fuller SQL feature set, including join conditions, window functions, recursive common table expressions, and advanced eager loading. The author is incredibly responsive and the documentation is excellent compared to AR and Arel.
I don't expect you will like this answer, but a time will come when you start to look outside the opinionated components that come with Rails, which I have to say are hardly best of breed. Sequel also sped my application up many times over what I was able to get with AR; it's not just developer happiness, it means fewer servers to run. Yes, there is a learning curve, but IMO it's better to learn tools that have your back covered.
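For flavor, the standings aggregation might look roughly like this in Sequel (an untested sketch; the connection URL and schema are assumptions from the question, and selecting users.* while grouping by users.id assumes a database such as PostgreSQL that permits grouping by primary key):

require 'sequel'

DB = Sequel.connect(ENV.fetch('DATABASE_URL'))

standings = DB[:users].
  left_join(:pick_ems, user_id: :id).
  group(Sequel[:users][:id]).
  select_all(:users).
  select_append(Sequel.function(:count, Sequel[:pick_ems][:id]).as(:total_pick_ems)).
  all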
Joins might work. Something like below:
User.unscoped.joins(:guesses).joins(:pick_ems).
  where("guesses.score IS NOT NULL").
  select("users.*,
          SUM(guesses.score) AS guesses_score,
          COUNT(guesses.id) AS guesses_count,
          COUNT(CASE WHEN pick_ems.correct = TRUE THEN 1 ELSE NULL END)
            AS correct_pick_ems,
          COUNT(CASE WHEN pick_ems.correct IS NOT NULL THEN 1 ELSE NULL END)
            AS total_pick_ems").
  group("users.id")
If you need this information for a limited number of users at a time, then the above query or eager loading (User.includes(:guesses, :pick_ems)) with instance methods like
def correct_pick_ems
  pick_ems.count(&:correct)
end
would work.
However, if you need this information for all the users most of the time, cached counters within the users table would be more efficient.
What you need is some sort of custom (smart) counter cache that counts only under certain conditions (e.g. correct is true).
You can achieve this using conditional after_save and after_destroy callbacks to build your own custom counter cache, which could look like this:
class PickEm < ActiveRecord::Base
  belongs_to :user

  after_save    :increment_finished_counter_cache, if: Proc.new { |pick_em| pick_em.correct }
  after_destroy :decrement_finished_counter_cache, if: Proc.new { |pick_em| pick_em.correct }

  private

  def increment_finished_counter_cache
    user.update_column(:finished_games_counter, user.finished_games_counter + 1) # update_column triggers no validations or callbacks
  end

  def decrement_finished_counter_cache
    user.update_column(:finished_games_counter, user.finished_games_counter - 1) # update_column triggers no validations or callbacks
  end
end
Notes:
Code not tested (it's only to show the idea).
Some people say it's better to avoid naming custom counters the way Rails names them (foo_counter_cache).
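The counter column itself also has to exist on users. A sketch of the migration (the column name and defaults are assumptions):

class AddFinishedGamesCounterToUsers < ActiveRecord::Migration
  def change
    add_column :users, :finished_games_counter, :integer, default: 0, null: false
  end
end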
You should benchmark it, but my hunch is that adding all of that data into a single SELECT isn't going to be much faster than breaking it up into separate SELECTs (I've actually had cases where the latter was faster). By breaking it up, you can also stick to more ActiveRecord and less raw SQL, e.g.:
user_ids_to_pick_em_score = User.joins(:pick_ems).where(pick_ems: {correct: true}).group(:user_id).count
user_ids_to_pick_ems_count = User.joins(:pick_ems).where.not(pick_ems: {correct: nil}).group(:user_id).count
user_ids_to_guesses_score = Hash[User.select("users.id, SUM(guesses.score) AS total_score").joins(:guesses).group(:user_id).map{|u| [u.id, u.total_score]}]
user_ids_to_guesses_count = User.joins(:guesses).where.not(guesses: {score: nil}).group(:user_id).count
Edit: To display them, you could do like so:
<%- User.select(:id, :name).find_each do |u| -%>
  Name: <%= u.name %>
  Picks Correct: <%= user_ids_to_pick_em_score[u.id] %>/<%= user_ids_to_pick_ems_count[u.id] %>
  Total Score: <%= user_ids_to_guesses_score[u.id] %>/<%= user_ids_to_guesses_count[u.id] %>
<%- end -%>
I'm working on an app that ties into a legacy database. The primary model is based on a stupidly large 100+ column table. I don't know too much about the inner workings of ActiveRecord, but it seems to me that any request on this model slows down because it's creating objects with 100+ attributes. Let's call this SlowModel.
Rendering pages with this model sometimes takes 17 seconds on my dev machine. Straight-up MySQL queries only take ~0.5-1 second.
I've managed to speed up one portion of the app by using a MySQL view that selects a subset of fields (20 or so). We'll call this QuickModel. Using views is OK but isn't the most portable solution.
I will likely continue to try to use this QuickModel in other parts of the site, but I was wondering if anyone had other ideas for speeding up the original object. For instance, is there a way to specify in the model which columns ActiveRecord should just ignore and avoid building? Maybe there are specific column types (:text??) that cause bloat in ActiveRecord objects.
Assume that columns have proper indices.
You can specify which columns are returned using the :select option of the ActiveRecord finder:
SlowModel.all(:select => 'id, col1, col2, col3')
...will load instances of SlowModel with only the specified columns populated.
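On newer Rails (3+) the chainable select method is the equivalent (column names here are placeholders):

SlowModel.select(:id, :col1, :col2, :col3) # same reduced-column load, still chainable with where etc.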
How about having a completely new QuickModel that has its own table... and a QuickModel has_one SlowModel?
You can use SQL to move the most-needed data into the QuickModel table and only refer to the SlowModel via my_quick_model.slow_model when necessary.
Alternatively, you can add a "select" to the default scope (you can google "rails default scope" for more). By default it'll only fetch the reduced set - but you can ask for all attributes by passing :select => "*" if necessary.
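A sketch of that default-scope approach in newer Rails syntax (column names are placeholders; note that accessing a column you didn't select raises ActiveModel::MissingAttributeError):

class SlowModel < ActiveRecord::Base
  default_scope { select(:id, :col1, :col2, :col3) }
end

SlowModel.first          # loads only the reduced column set
SlowModel.unscoped.first # loads all 100+ columns when you really need them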
Along the lines of what Winfield is saying, you may want to take a look at using an attribute tracker like SlimScrooge. The tracker attempts to fetch only the data that you're using, which reduces overhead. It attempts to automatically do what Winfield is suggesting.
Example from the Readme:
# 1st request, sql is unchanged but columns accesses are recorded
Brochure Load SlimScrooged 1st time (27.1ms) SELECT * FROM `brochures` WHERE (expires_at IS NULL)
# 2nd request, only fetch columns that were used the first time
Brochure Load SlimScrooged (4.5ms) SELECT `brochures`.expires_at,`brochures`.operator_id,`brochures`.id FROM `brochures` WHERE (expires_at IS NULL)
# 2nd request, later in code we need another column which causes a reload of all remaining columns
Brochure Reload SlimScrooged (0.6ms) `brochures`.name,`brochures`.comment,`brochures`.image_height,`brochures`.id, `brochures`.tel,`brochures`.long_comment,`brochures`.image_name,`brochures`.image_width FROM `brochures` WHERE `brochures`.id IN ('5646','5476','4562','3456','4567','7355')
# 3rd request
Brochure Load SlimScrooged (4.5ms) SELECT `brochures`.expires_at,`brochures`.operator_id,`brochures`.name, `brochures`.id FROM `brochures` WHERE (expires_at IS NULL)
I am getting an IQueryable from my database, and then I am getting another IQueryable from that first one - that is, I am filtering the first one.
My question is -does this affect performance? How many times will the code call the database? Thank you.
Code:
DataContext _dc = new DataContext();

IQueryable offers =
    (from o in _dc.Offers
     select o);

IQueryable filtered =
    (from o in offers
     select new { ... });

return View(filtered);
The code you have given will never call the database by itself, since you're never using the results of the query in any code.
IQueryable collections aren't filled until you iterate through them... and you're not iterating through anything in that code sample (ah, the beauty of lazy initialization).
It also means that when filtered is finally enumerated (in the view), the two statements compose into a single query against the database, so there is no performance cost compared with writing one combined query.
SO is not a replacement for developer tools. There are many good free tools able to tell you exactly what this code translates into and how it works. Use Reflector on this method and look at what code is generated and reason for yourself what is going on from there.