Indexing calculated field for search conditions in Thinking Sphinx - mysql

I have a products model set up that I am trying to search with Thinking Sphinx. The model has an attribute called status which can be Active, Not active or Active during specified dates.
I would like to restrict my search results to products that are active, i.e. products whose status is Active, or whose status is Active during specified dates and the current time falls between those dates.
I'm a beginner to Rails so I'm looking for suggestions as to how I could implement this. I thought about putting a boolean method in my model that calculates this logic, but I don't think that this is indexable by Sphinx.
I am using MySQL as the database server.
Does anyone have any bright ideas?

You're right, Ruby methods on your model are not accessible to Sphinx. However, you can re-create the method as a Sphinx attribute. These can easily be defined using SQL fragments, like so:
class Person < ActiveRecord::Base
  ...
  define_index do
    has "status = 'active' and now() > start_date and now() < end_date", :as => :active, :type => :boolean
  end
  ...
end
Using a string to specify a field or attribute like this is the same as specifying a custom column when building an SQL query.
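To cover both cases from the question (always active, or active only inside a date window), the fragment could be extended along these lines. This is only a sketch: the status strings 'active' and 'active_during_dates' and the helper product_active? are assumptions, since the question does not say how the statuses are stored. The Ruby predicate simply mirrors the SQL so the logic can be checked outside Sphinx. Keep in mind that Sphinx evaluates the fragment when the index is built, so the NOW() comparison is only as fresh as the last reindex.

```ruby
# Hypothetical status values; adjust to however the column is really stored.
ALWAYS_ACTIVE = "active"
ACTIVE_DURING_DATES = "active_during_dates"

# SQL fragment that could be handed to `has ..., :as => :active, :type => :boolean`.
ACTIVE_SQL = "status = '#{ALWAYS_ACTIVE}' OR " \
  "(status = '#{ACTIVE_DURING_DATES}' AND NOW() BETWEEN start_date AND end_date)"

# Plain-Ruby mirror of the same rule, useful for checking the logic.
def product_active?(status, start_date, end_date, now = Time.now)
  return true if status == ALWAYS_ACTIVE
  status == ACTIVE_DURING_DATES && now >= start_date && now <= end_date
end
```

With the attribute indexed as :active, the search itself could then be filtered with something like Product.search "term", :with => { :active => true }.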

Try the #indexes method (I haven't tried it myself, I just noticed it while googling around):
http://www.slideshare.net/sazwqa/using-thinking-sphinx-with-rails-presentation
slide 11
http://rdoc.info/rdoc/freelancing-god/thinking-sphinx/blob/04320b610b3a665ca1885cc2e6f29354d029e49a/ThinkingSphinx/Index/Builder.html#indexes-instance_method
Also:
http://www.mail-archive.com/rubyonrails-deployment#googlegroups.com/msg02046.html

Related

Copying Parent model attribute to all of its children with same attribute in mysql or Rails as single Query

I know we need to use the following pseudocode in the case of Rails:
Parent.all.each do |parent|
  parent.childrens.update_all(:price => parent.price)
end
But I have like 5 Million Parent records and I know this would take a lot of time.
Is there an easy way to do the above through Rails or MySQL in the fastest way possible (in a single query)?
Parent.includes(:childrens).find_in_batches do |group|
  sleep(50) # pause between batches to ease the load on the database
  group.each { |parent| parent.childrens.update_all(price: parent.price) }
end
This is about the best you can do in Rails, at least. The includes is meant to avoid N+1 queries, and since there are so many records, find_in_batches helps by loading them in groups; otherwise there is a possibility that your db/dyno gets locked.
I think you can use ActiveRecord callback functionality to achieve this.
Example code would look like this:
class Parent < ActiveRecord::Base
  after_update :denormalize
  has_many :children
  private
  def denormalize
    children.update_all(price: price)
  end
end
This will ensure that, whenever a parent object is modified the child will also be updated.
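For the one-off backfill the question asks about, MySQL's multi-table UPDATE can copy the column in a single statement. The sketch below only builds the SQL string. The helper name copy_parent_column_sql and the table and column names (childrens, parents, parent_id) are assumptions based on the code in the question, so adjust them to the real schema.

```ruby
# Builds a MySQL multi-table UPDATE that copies one column from each parent
# row to all of its child rows. Identifiers are interpolated as-is, so only
# pass trusted, hard-coded names (never user input).
def copy_parent_column_sql(column, child_table: "childrens", parent_table: "parents", foreign_key: "parent_id")
  <<~SQL.strip
    UPDATE #{child_table}
    INNER JOIN #{parent_table} ON #{child_table}.#{foreign_key} = #{parent_table}.id
    SET #{child_table}.#{column} = #{parent_table}.#{column}
  SQL
end

sql = copy_parent_column_sql("price")
# In a Rails console this could then be run with:
#   ActiveRecord::Base.connection.execute(sql)
```

On 5 million parents a single statement may still hold locks for a while, so it is worth trying on a copy of the data first.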

Find users created during a specific hour range (any day)

For fun, I'd like to see the set of users (in a Rails database) who were created during a specific hour range (2AM - 5AM to be specific), but on any day. They all have the typical created_at field. I think I know how to extract the hour from this field for one user, and then see if it falls in a range, but how do I do this for all of them? Should I just loop through them? (Even as I write it, that sounds naive.)
The first part of Sontya's answer is the easy solution in Rails.
I would, however, move that part to its own place inside your class to separate your code from the framework a bit more.
# app/models/user.rb
class User < ActiveRecord::Base
  # ...
  def self.get_users_created_between(start_time, end_time)
    User.where("TIME(created_at) BETWEEN TIME(?) AND TIME(?)", start_time, end_time)
  end
  # ...
end
And use it like this:
irb> User.get_users_created_between(Time.parse("2pm"), Time.parse("5pm"))
This provides you with a couple of benefits:
You can reuse it all over your code without ever having to worry about the syntax of the where or time ranges.
If for some weird reason rails decides to change the interface for this, you only need to edit one method and not code in a thousand places all over your project.
You can easily move this piece of code out of user.rb if you feel that user.rb gets too big. Maybe to some dedicated finder or query class. Or to something like a repository pattern.
PS: Time functions may vary between different DBMSs like MySQL, PostgreSQL, MSSQL etc. I don't know if there is a generic way to do this. This answer is for MySQL.
Try this,
User.where(created_at: Time.parse("2pm")..Time.parse("5pm"))
Or something like this
User.select { |user| user.created_at.hour.between?(2, 5) }
To return users that were created between two hours on any given day, use this:
User.where('HOUR(created_at) BETWEEN ? AND ?', 2, 5)
Please note that HOUR(created_at) only works for MySQL. The syntax is extract(hour from created_at) in PostgreSQL and strftime('%H', created_at) in SQLite.
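One detail worth checking is the upper bound. BETWEEN is inclusive, so HOUR(created_at) BETWEEN 2 AND 5 also matches anything from 5:00 to 5:59. The plain-Ruby sketch below shows the difference between the inclusive and exclusive hour range on a few made-up timestamps; the helper name created_in_hour_range? is hypothetical.

```ruby
# True when the hour of created_at falls inside the given range of hours.
def created_in_hour_range?(created_at, hours)
  hours.cover?(created_at.hour)
end

# Made-up sample timestamps on different days.
times = [
  Time.utc(2015, 3, 1, 1, 59),
  Time.utc(2015, 3, 1, 2, 0),
  Time.utc(2015, 3, 2, 4, 59),
  Time.utc(2015, 3, 3, 5, 30)
]

inclusive = times.select { |t| created_in_hour_range?(t, 2..5) }   # like BETWEEN 2 AND 5
exclusive = times.select { |t| created_in_hour_range?(t, 2...5) }  # 2:00 up to 4:59
```

If "2AM - 5AM" is meant to stop at 5:00 sharp, the SQL equivalent of the exclusive version would be HOUR(created_at) BETWEEN 2 AND 4.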

Query ActiveRecord for records and relation calculations at once

TL;DR? See Edit 2
I've got a little Rails application that has a few different sort of games people can play: it's based around sports, so they can pick the winners of each game every week (model PickEm, attribute correct boolean with nil for unfinished games), and predict the outcome of a specific team's game (model Guess, attribute score with integer, nil for unfinished games). Every User has_many PickEms and Guesses. And I'm trying to display standings (correct/total - total being all non-nil, score/total possible).
What I'm finding is that I can gather the users and their associated records, but in trying to display standings I'm discovering that every single User is triggering another query - slow and not sustainable as the user base increases. That's because @user.pick_em_score is pick_ems.where(correct: true).size and @user.guess_Score is guesses.where.not(score: nil).sum(:score). So I call user.pick_em_score and it runs that query. I feel like there should be a way to get every User, as well as these specific counts, at once, rather than buffering a whole bunch of needless extra stuff.
What I need:
User record
User.pick_em_score (calculated by counting correct records)
User.pick_ems count where NOT NULL
User.guesses_score (calculated by guesses.sum(:score))
User.guesses count where NOT NULL
Most of the stuff I find on Rails's ActiveRecord helpers, especially related to calculations, is for retrieving only the calculation. It looks like I'll probably need to delve directly into select() etc. But I can't get it working. Can someone point me in the right direction?
Edit
For clarification: I'm aware that I can write this information to the User model, but this is overly restrictive: next season, I'll need to add a new column to the User for that year's results, etc. In addition, this is a third degree of callback updating related models – the Match model already updates related PickEms and Guesses on save. I'm looking for the simplest ActiveRecord query or queries to be able to work with this information, as indicated by the title. Ideally one query that returns the above information, but if it needs to be a few, that's OK.
I used to work directly in MySQL with PHP, but those skills have rusted (in raw MySQL, I imagine, I'd have several sub-select statements to help pull these counts) and I'd also like to be able to use Rails's ActiveRecord helpers and such, and avoid constructing raw SQL as much as possible.
Second Edit:
I seem to have it down to one call that starts to work, but I'm writing a lot of SQL. It's also brittle, IMO, and trying to run with it has failed. It also looks like I'm just pushing the million singular SELECT queries from Rails right into SQL, but that may still be a step up.
User.unscoped.select('users.*',
'(SELECT COUNT(*) FROM pick_ems WHERE pick_ems.user_id = users.id AND pick_ems.correct) AS correct_pick_ems',
'(SELECT COUNT(*) FROM pick_ems WHERE pick_ems.user_id = users.id AND pick_ems.correct IS NOT NULL) AS total_pick_ems',
'(SELECT SUM(guesses.score) FROM guesses WHERE guesses.user_id = users.id AND guesses.score IS NOT NULL) AS guesses_score',
'(SELECT COUNT(*) FROM guesses WHERE guesses.user_id = users.id AND guesses.score IS NOT NULL) AS guesses_count' )
The issue seems to be: is there a way to use Rails, and not raw SQL, to link up users.id that we see there with these subqueries? Or just … a better way to construct this, in general?
In addition, I'm running another set of SELECTs for the WHERE, which would hinge on total_pick_ems and guesses_count being > 0 but since I can't use those aliased columns, I have to call the SELECT one more time.
Welcome to AR. It's really only good for simple CRUD-like queries. Once you actually want to query your data in anger, it just doesn't have the capabilities to do the queries you want without resorting to wholesale SQL strings, often abandoning the ability to chain as a result.
It's precisely why I moved to Sequel, as it does have the features to compose queries using a much fuller SQL feature set, including join conditions, window functions, recursive common table expressions, and advanced eager loading. The author is incredibly responsive and the documentation is excellent compared to AR and Arel.
I don't expect you will like this answer, but a time will come when you will start to look outside the opinionated components that come with Rails, which I have to say are hardly best of breed. Sequel also sped my application up many times over what I was able to get with AR. It's not just developer happiness; it means fewer servers to run. Yes, it will be a learning curve, but IMO it's better to learn tools that have your back covered.
Joins might work. Something like below:
# Caveat: joining both associations multiplies rows per user, so the
# aggregates are inflated when a user has several guesses and pick_ems.
User.unscoped.joins(:guesses).joins(:pick_ems).
  where("guesses.score IS NOT NULL").
  select("users.*,
    sum(guesses.score) as guesses_score,
    count(guesses.id) as guesses_count,
    count(case when pick_ems.correct = True then 1 else null end)
      as correct_pick_ems,
    count(case when pick_ems.correct IS NOT NULL then 1 else null end)
      as total_pick_ems
  ").
  group("users.id")
If you need this information for a limited number of users at a time, then the above query or eager loading (User.includes(:guesses, :pick_ems)) with instance methods like
def correct_pick_ems
  pick_ems.count(&:correct)
end
would work.
However, if you need this information for all the users most of the time, cached counters within the users table would be a better fit.
What you need is some sort of custom (smart) counter_cache that counts only under certain conditions (e.g. correct is true).
You can achieve this using conditional after_save and after_destroy callbacks to build your own custom counter_cache that looks like this:
class PickEm < ActiveRecord::Base
  belongs_to :user
  after_save :increment_finished_counter_cache, if: Proc.new { |pick_em| pick_em.correct }
  after_destroy :decrement_finished_counter_cache, if: Proc.new { |pick_em| pick_em.correct }
  private
  def increment_finished_counter_cache
    # update_column skips validations and callbacks
    self.user.update_column(:finished_games_counter, self.user.finished_games_counter + 1)
  end
  def decrement_finished_counter_cache
    # update_column skips validations and callbacks
    self.user.update_column(:finished_games_counter, self.user.finished_games_counter - 1)
  end
end
Notes:
Code not tested (only to show the idea)
Some guys said it's better to avoid naming custom counters as rails name them (foo_counter_cache)
You should benchmark it, but my hunch is that adding all of that data into a single SELECT isn't going to be much faster than breaking it up into separate SELECTs (I've actually had cases where the latter was faster). By breaking it up, you can also stick to more ActiveRecord and less raw SQL, e.g.:
user_ids_to_pick_em_score = User.joins(:pick_ems).where(pick_ems: {correct: true}).group(:user_id).count
user_ids_to_pick_ems_count = User.joins(:pick_ems).where.not(pick_ems: {correct: nil}).group(:user_id).count
user_ids_to_guesses_score = Hash[User.select("users.id, SUM(guesses.score) AS total_score").joins(:guesses).group(:user_id).map{|u| [u.id, u.total_score]}]
user_ids_to_guesses_count = User.joins(:guesses).where.not(guesses: {score: nil}).group(:user_id).count
Edit: To display them, you could do like so:
<%- User.select(:id, :name).find_each do |u| -%>
Name: <%= u.name %>
Picks Correct: <%= user_ids_to_pick_em_score[u.id] %>/<%= user_ids_to_pick_ems_count[u.id] %>
Total Score: <%= user_ids_to_guesses_score[u.id] %>/<%= user_ids_to_guesses_count[u.id] %>
<%- end -%>
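One thing to watch with the lookup hashes: a user with no matching rows is simply missing from them, so the view above would print blanks for that user. A minimal sketch of a formatting helper that falls back to 0 is shown here. The sample numbers and the helper name standings_line are made up for illustration.

```ruby
# Sample data shaped like the grouped query results (user id => value).
scores = {
  pick_em_score:  { 1 => 7, 2 => 3 },
  pick_ems_count: { 1 => 10, 2 => 10 },
  guesses_score:  { 1 => 42 },
  guesses_count:  { 1 => 4 }
}

# Formats one standings line, treating users absent from a hash as 0.
def standings_line(user_id, name, scores)
  picks = "#{scores[:pick_em_score].fetch(user_id, 0)}/#{scores[:pick_ems_count].fetch(user_id, 0)}"
  total = "#{scores[:guesses_score].fetch(user_id, 0)}/#{scores[:guesses_count].fetch(user_id, 0)}"
  "#{name}: picks #{picks}, score #{total}"
end

lines = [standings_line(1, "Ann", scores), standings_line(2, "Bob", scores)]
```

In the real view, the four hashes built by the queries above would take the place of the sample data.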

Access SQL computed columns through ActiveRecord

I have a Person model, which includes a property representing the date of birth (birth_date).
I also have a method called age(), which works out the current age of the person.
I now have need to run queries based on the person's age, so I have replicated the logic of age() as a computed column in MySQL.
I cannot work out how I would make this additional column part of the default select statement of the model.
I would like to be able to access the age as if it were a native property of the Person model, to perform queries against it and access the value in my views.
Is this possible, or am I barking up the wrong tree?
I thought I might be able to define additional fields through default_scope or scope, but these methods seem to only recognise existing fields. I also tried default_scope in tandem with attr_accessor.
Possible workarounds I've considered but would prefer not to do:
Create an actual property called age and populate it through the use of callbacks. The date is always changing, so this obviously would not be reliable.
Replicate the logic in ActiveRecord as age() and in a scope as a where clause. This would achieve what I need, but doesn't feel very DRY.
I am already caching the results of the age() method. It is the ability to use the field in where clauses that I am most interested in.
There must be a way to define dynamic fields through SQL that I can access through the model by default.
Any help would be appreciated.
Rich
UPDATE
An example of my failed attempt to utilise scopes:
default_scope :select => "*, 2 as age"
attr_accessor :age
age is blank, I assume because scopes only deal with limiting, not extending.
kim3er, the solution to your problem is simple. Follow these steps:
Lose the attr_accessor :age from your model. You simply don't need it.
Leave the default scope at the Person model: default_scope :select => "*, 2 as age"
Lastly open up a console and try
p = Person.first
p.age
=> 2
When you define a select using as, Rails will automagically add those methods on the instances for you! So the answer to your question:
There must be a way to define dynamic fields through SQL that I can access through the model by default.
is:
Rails
I'm not an expert by any stretch, but it seems you want:
class Person < ActiveRecord::Base
  scope :exact_age, lambda { |a| where('age = ?', a) }
  scope :age_gt, lambda { |a| where('age > ?', a) }
end
but that said, I've just started looking at Arel, seems pretty cool
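If no real age column exists, one way to keep the SQL and Ruby sides in step is to hold the SQL expression in a constant and reuse it in both the select and the scopes. The sketch below assumes MySQL (TIMESTAMPDIFF and CURDATE are MySQL functions) and a birth_date column. The constant AGE_SQL and the helper age_on are hypothetical names, and the Ruby method is only there to show the whole-years rule the SQL is meant to match.

```ruby
require "date"

# MySQL expression for age in whole years, e.g. for
#   select("people.*, #{AGE_SQL} AS age") or where("#{AGE_SQL} > ?", 30)
AGE_SQL = "TIMESTAMPDIFF(YEAR, birth_date, CURDATE())"

# Plain-Ruby equivalent: whole years elapsed between birth_date and today.
def age_on(birth_date, today = Date.today)
  years = today.year - birth_date.year
  # Compare [month, day] pairs to see whether this year's birthday has passed.
  had_birthday = ([today.month, today.day] <=> [birth_date.month, birth_date.day]) >= 0
  had_birthday ? years : years - 1
end
```

For example, age_on(Date.new(1990, 6, 15), Date.new(2020, 6, 14)) gives 29, and one day later it gives 30.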

Rails MySQL auto generated column

I am working on RoR 3.x with MySQL as backend.
Is there any way to modify the existing id (autogenerated with the migration) so that it follows a particular user-defined pattern?
For example: the "Products Table" should have values in the "id" field like "P01", "P02" and so on, where P can be specified by the user, and 01, 02 are autogenerated.
Thanks in advance!
The 'regular' IDs (1, 2, 3, ..., n) in this case aren't generated by rails but by MySQL (using AUTO_INCREMENT). So, if you want to go with auto-generated, auto-incrementing IDs, I would suggest not messing with this. What you could do, and what I would suggest, is creating an additional column and then populating that using a callback on your model.
Example:
class Product < ActiveRecord::Base
  attr_accessor :user_supplied_prefix
  after_create :generate_user_supplied_id
  private
  def generate_user_supplied_id
    update_attribute(:user_supplied_id, "#{self.user_supplied_prefix}#{self.id}")
  end
end
The downside of this approach is that Product.find(user_supplied_id) won't work. Fortunately, Product.find_by_user_supplied_id(user_supplied_id) will.
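Note that the plain interpolation above produces "P1" rather than the "P01" from the question. A small sketch of the zero-padded version follows. The helper name padded_user_supplied_id and the width parameter are hypothetical, and the result could be passed to update_attribute in the callback instead of the interpolated string.

```ruby
# Joins a user-chosen prefix with a zero-padded numeric id.
# Ids with more digits than `width` are left untruncated.
def padded_user_supplied_id(prefix, id, width = 2)
  format("%s%0#{width}d", prefix, id)
end

padded_user_supplied_id("P", 1)    # "P01"
padded_user_supplied_id("P", 12)   # "P12"
padded_user_supplied_id("P", 123)  # "P123"
```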