Wildcard search with Thinking Sphinx issue with indexes - thinking-sphinx

define_index do
indexes :first_name, :prefixes => true
indexes :last_name, :prefixes => true
indexes :email, :prefixes => true
set_property :enable_star => 1
set_property :min_prefix_len => 1
end
In this case, if I want to search only by email, the search still runs against all the indexes that are specified.
For example:
email = "*me*"
Contact.search email
This returns matches from first_name, last_name and email,
but it should return matches from email only.
What would be the solution for searching only one index from the specified indexes?

Just a quick correction - you want to search on a specific field, not a specific index.
And Thinking Sphinx can do this by using the :conditions option - so give the following a try:
Contact.search :conditions => {:email => '*me*'}
Thinking Sphinx can also automatically add wildcards to both ends of each word you give it:
Contact.search :conditions => {:email => 'me'}, :star => true
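To make the effect of :star => true a little more concrete, here is a rough plain-Ruby approximation of the wrapping step. The helper name star_words is hypothetical and the regular expression is a simplification; Thinking Sphinx's own handling copes with operators and characters that this sketch ignores.

```ruby
# Hypothetical helper, not part of Thinking Sphinx: wraps every word in a
# query with * on both sides, roughly what :star => true does for simple input.
def star_words(query)
  query.gsub(/\w+/) { |word| "*#{word}*" }
end

puts star_words("me")          # the single-word case from the question
puts star_words("john smith")  # each word is wrapped separately
```

So with :star => true you pass the bare term and let the library add the wildcards, rather than building '*me*' by hand.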

Related

ThinkingSphinx::OutOfBoundsError configuration confusion

Hitting an OutOfBoundsError as a consequence of misunderstanding the proper configuration syntax (which may also be a by-product of legacy syntax).
The manual suggests a Class search taking on WillPaginate styled parameters. Having many fields to draw from, the model is defined as
class AziendaSearch < BaseSearch
set_per_page 10000
accept :terms
end
I set set_per_page this high because when I set it to the target of 100, the will_paginate links do not show up.
The controller may be excessively convoluted in order to include the ordering parameter, which results in a two-step process:
@azienda_search = AziendaSearch.new params
@results = @azienda_search.search
@aziendas = Azienda.order('province_id ASC').where('id IN (?)', @results).paginate :page => params[:page], :per_page => 100
The view paginates on the basis of @aziendas:
<%= will_paginate @aziendas, :previous_label => "precedente ", :next_label => " successiva" %>
My suspicion is that the search model is not properly set up, but the syntax is not obvious to me given the manual's indications. page params[:page] certainly does not work...
Update
BaseSearch is a Sphinx method and was in fact inherited from an older version of this application (rails2.x...). So this may be hanging around creating all sorts of syntactic confusion.
In fact, following the manual, I am now fully uncertain as to how best to make these statements. Should a separate class be defined for AziendaSearch? If not, where should the Azienda.search block be invoked... in the controller, as such?
@azienda_search = Azienda.search(
:max_matches => 100_000,
:page => params[:page],
:per_page => 100,
:order => "province_id ASC"
)
@results = @azienda_search.search
I'm not sure what BaseSearch is doing with set_per_page (that's certainly not a Thinking Sphinx method), but it's worth noting that Sphinx defaults to a maximum of 1000 records. It is possible to configure Sphinx to return more, though - you need to set max_matches in your config/thinking_sphinx.yml to your preferred limit (per environment):
production:
max_matches: 100000
And also set the limit on the relevant search requests:
Azienda.search(
:max_matches => 100_000,
:page => params[:page],
:per_page => 100
)
As for the doubled queries… if you add province_id as an attribute in your index definition, you'll be able to order search queries by it.
# in your Azienda index definition:
has province_id
# And then when searching:
Azienda.search(
params[:azienda_search][:terms],
:max_matches => 100_000,
:page => params[:page],
:per_page => 100,
:order => "province_id ASC"
)
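To see why the max_matches limit matters for pagination, here is a small plain-Ruby sketch of the arithmetic. The helper reachable_pages and the figure of 25,000 matching records are made up for illustration; the only facts taken from above are the default limit of 1000 and the raised limit of 100,000.

```ruby
# Hypothetical helper: number of pages a paginated search can expose when
# Sphinx stops returning results at max_matches.
def reachable_pages(total_found, per_page, max_matches = 1000)
  reachable = [total_found, max_matches].min
  (reachable.to_f / per_page).ceil
end

puts reachable_pages(25_000, 100)           # capped by the default limit
puts reachable_pages(25_000, 100, 100_000)  # limit raised as in the answer
```

With the default limit, anything past the first 1000 results is out of reach no matter how many records match, so both the configuration file and the search call need the higher value.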

missing quotes in thinking sphinx scoped search

I'm wanting a sphinx_scope that will only search for records that are current. Each database record has a field, status, whose value is either CURRENT or ARCHIVED.
I have achieved this, but I have to use an odd construct to get there; there is probably a much better way to do it.
Here's what I have:
indices/letter_index.rb
ThinkingSphinx::Index.define :letter, :with => :real_time do
# fields
indexes title, :sortable => true
indexes content
# attributes
has status, :type => :string
has issue_date, :type => :timestamp
has created_at, :type => :timestamp
has updated_at, :type => :timestamp
end
models/letter.rb
class Letter < ActiveRecord::Base
include ThinkingSphinx::Scopes
after_save ThinkingSphinx::RealTime.callback_for(:letter)
.. snip ..
sphinx_scope(:archived) {
{:with => {:status => "'ARCHIVED'"}}
}
The problem that I ran into was that if I used :with => {:status => 'ARCHIVED'}, my query came out as
SELECT * FROM `letter_core` WHERE MATCH('search term') AND `status` = ARCHIVED AND `sphinx_deleted` = 0 LIMIT 0, 20
ThinkingSphinx::SyntaxError: sphinxql: syntax error, unexpected IDENT, expecting CONST_INT (or 4 other tokens) near 'ARCHIVED AND `sphinx_deleted` = 0 LIMIT 0, 20; SHOW META'
but, if I construct it as :with => {:status => "'ARCHIVED'"}, it then adds the single quotes and the query succeeds. :)
Is this the proper way to write the scope, or is there a better way?
Bonus question: where do I find the docs for what is allowed in the scopes, such as :order, :with, :conditions, etc.?
Firstly: the need for quotes was a bug - Sphinx has only relatively recently allowed for filtering on string attributes, which is why this wasn't handled a good while ago. I've patched Riddle (the gem that is a pure Ruby wrapper around Sphinx functionality and used by Thinking Sphinx); you can give the latest a spin with this in your Gemfile:
gem 'riddle', '~> 2.0',
:git => 'git://github.com/pat/riddle.git',
:branch => 'develop',
:ref => '8aec79fdf4'
As for what can go in a Sphinx scope: anything that can go in a normal search call.
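For anyone curious what the fix amounts to, here is a minimal sketch of quoting a filter value before it goes into the SphinxQL WHERE clause. This is not Riddle's actual code; the helper name sphinxql_value and the backslash escaping rule are assumptions made for illustration.

```ruby
# Hypothetical helper, not Riddle's real implementation: renders a filter
# value for a SphinxQL WHERE clause, quoting strings and leaving other
# values (such as integers) untouched.
def sphinxql_value(value)
  return value.to_s unless value.is_a?(String)
  # Escape single quotes and backslashes inside the string.
  escaped = value.gsub(/['\\]/) { |char| "\\#{char}" }
  "'#{escaped}'"
end

puts "`status` = #{sphinxql_value('ARCHIVED')}"
puts "`sphinx_deleted` = #{sphinxql_value(0)}"
```

Once the library quotes strings itself, the scope should be able to use :with => {:status => 'ARCHIVED'} without the extra inner quotes.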

active record query conflict between :joins and :includes

Can anyone suggest an easy fix for the query below?
Workout.all(:joins => [:person, :schedule],
:conditions => ["schedule.id = ? AND people.gym_id = ?", schedule.id, gym.id ],
:order => "time DESC",
:include => [:person])
Error message says Mysql::Error: Not unique table/alias: 'people': and then a very long sql query
When I remove :person from the :include options, the code works fine. However, then I get the "n+1 queries" problem: when I display the workouts, it generates a separate query for each person.
I am assuming that the 'people' tables from the join and include options conflict in the generated SQL. Is there a way to alias tables with active record using the find options? Or do I need to use a sql fragment for the :joins option so that I can alias the tables with AS?
I have not upgraded this app to 3.0 yet, so I can't use the "Class.joins().where()" syntax.
You can remove :person from the joins and keep it in the include. You can also add fields from the people table in the :select option, e.g.
Workout.all(:joins => [:schedule],
:conditions => ["schedule.id = ? AND people.gym_id = ?", schedule.id, gym.id ],
:order => "time DESC",
:select => "workouts.*,people.*",
:include => [:person])

Sorting a Rails database table by a column in an associated model

I'm trying to implement Ryan Bates' sortable table columns code (Railscast #228) but I'd like to be able to sort on an associated column. In particular, I have the following models and associations:
class Project < ActiveRecord::Base
belongs_to :program_manager, :class_name => "User"
class User < ActiveRecord::Base
has_many :program_manager_projects, :class_name => "Project", :foreign_key => "program_manager_id"
The association between the Project model and the User model is mediated by the 'program_manager_id' foreign key, which the user sets in the new/edit views using a collection-select dropdown. Here's part of the annotation at the top of project.rb:
# Table name: projects
# program_manager_id :integer
I want to be able to sort my list of projects in the index view by the program manager's name, i.e., by project.program_manager.name.
Ideally, I'd be able to point :order to this name somehow, perhaps with something like this in the index method of my ProjectsController:
@projects = Project.find(:all, :order => project.program_manager.name)
But that obviously won't work (not to mention Ryan's routine implements this with a specific reference to table names from the model to be sorted.)
I've come across some intimidating approaches that use named_scope, such as:
named_scope :most_active, :select => "questions.*", :joins => "left join comments as comments_for_count on comments_for_count.question_id = questions.id", :group => "questions.id", :order => "count(questions.id) desc"
But given my lack of MySQL expertise, this is fairly impenetrable to me.
Can anyone help me either generalize the named_scope example above for my specific case, or point me to a more straightforward strategy?
Thanks very much,
Dean
Let's dissect that named scope you referenced above. Imagine a model Question which has many Comments.
named_scope :most_active, :select => "questions.*", :joins => "left join comments as comments_for_count on comments_for_count.question_id = questions.id", :group => "questions.id", :order => "count(questions.id) desc"
:most_active
The name of your scope. You would call it like this: Question.most_active
:select => "questions.*"
By default a scope selects all columns from your table anyway; this makes explicit that the results contain only columns from the questions table, not the comments table. This is optional.
:joins => "left join comments as comments_for_count on comments_for_count.question_id = questions.id"
This says that for every question, I also want to get all comments associated with it. The comments table has a column 'question_id', which is what we'll be using to match them up to the appropriate question record. This is important. It allows us access to fields that are not on our model!
:group => "questions.id"
This is required for the count() function in the order clause, so that the comments are counted per question. Your case doesn't need the count function in the order clause, so it doesn't need this group statement either.
:order => "count(questions.id) desc"
Return the results in order of number of comments, highest to lowest.
So for our example, discarding what we don't need, and applying to your needs, we end up with:
named_scope :by_program_manager_name, :joins => "left join users on projects.program_manager_id = users.id", :order => "users.name"
This named_scope would be called like this:
Project.by_program_manager_name
Note this is basically equivalent to:
Project.find(:all, :joins => "left join users on projects.program_manager_id = users.id", :order => "users.name")
But, as cam referenced above, you should really know the underlying SQL. Your abilities will be severely hampered without this understanding.
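If the SQL is hard to picture, here is roughly what the join and the order clause do, sketched in plain Ruby on hashes instead of database tables. The helper sort_by_manager_name and the sample rows are invented for illustration; in the real app the database does this work.

```ruby
# Hypothetical helper mirroring the SQL above on plain hashes.
def sort_by_manager_name(projects, users)
  # left join users on projects.program_manager_id = users.id
  names_by_id = {}
  users.each { |user| names_by_id[user[:id]] = user[:name] }

  # order by users.name; projects without a manager get an empty name and
  # therefore sort first, as NULLs do in ascending order in MySQL
  projects.sort_by { |project| names_by_id[project[:program_manager_id]].to_s }
end

users = [{ :id => 1, :name => "Zoe" }, { :id => 2, :name => "Adam" }]
projects = [
  { :title => "Alpha", :program_manager_id => 1 },
  { :title => "Beta",  :program_manager_id => 2 },
  { :title => "Gamma", :program_manager_id => nil }
]

sort_by_manager_name(projects, users).each { |project| puts project[:title] }
```

The left join is what keeps a project with no program manager in the list; an inner join would drop it.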

Is it possible to use Sphinx search with dynamic conditions?

In my web app I need to perform 3 types of searching on items table with the following conditions:
items.is_public = 1 (use title field for indexing) - a lot of results can be retrieved(cardinality is much higher than in other cases)
items.category_id = {X} (use title + private_notes fields for indexing) - usually less than 100 results
items.user_id = {X} (use title + private_notes fields for indexing) - usually less than 100 results
I can't find a way to make Sphinx work in all these cases, but it works well in 1st case.
Should I use Sphinx just for the 1st case and use plain old "slow" FULLTEXT searching in MySQL(at least because of lower cardinality in 2-3 cases)?
Or is it just me and Sphinx can do pretty much everything?
Without full knowledge of your models I might be missing something, but how's this:
class Item < ActiveRecord::Base
define_index do
indexes :title
indexes :private_notes
has :is_public, :type => :boolean
has :category_id
has :user_id
end
end
1)
Item.search(:conditions => {:title => "blah"}, :with => {:is_public => true})
2)
Item.search("blah", :with => {:category_id => 1})
3)
Item.search("blah", :with => {:user_id => 196})
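Since the filters change per request, one option is to build the search arguments in a single place. This is only a sketch: the helper item_search_args is hypothetical, and it assumes the three attributes from the index definition above are the only filters you want to allow.

```ruby
# Hypothetical helper: builds the arguments for Item.search from whatever
# filters the request supplies. Unknown keys and nil values are dropped.
def item_search_args(term, filters = {})
  allowed = [:is_public, :category_id, :user_id]
  with = filters.select { |key, value| allowed.include?(key) && !value.nil? }
  [term, { :with => with }]
end

query, options = item_search_args("blah", :category_id => 1, :user_id => nil)
puts query
puts options.inspect
# In the app, the result would be passed on: Item.search(query, options)
```

The same index then serves all three cases, and only the :with hash differs between them.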