ThinkingSphinx::OutOfBoundsError configuration confusion - thinking-sphinx

I am hitting an OutOfBoundsError, apparently because I misunderstand the proper configuration syntax (which may also be a by-product of legacy syntax).
The manual suggests a Class search taking on WillPaginate styled parameters. Having many fields to draw from, the model is defined as
class AziendaSearch < BaseSearch
set_per_page 10000
accept :terms
end
set_per_page was set this high because, when I set it to the target of 100, the will_paginate links do not show up.
The controller may be excessively convoluted in order to include the ordering parameter, and so ends up as a two-step process:
@azienda_search = AziendaSearch.new params
@results = @azienda_search.search
@aziendas = Azienda.order('province_id ASC').where('id IN (?)', @results).paginate :page => params[:page], :per_page => 100
The view paginates on the basis of @aziendas:
<%= will_paginate @aziendas, :previous_label => "precedente ", :next_label => " successiva" %>
My suspicion is that the search model is not properly set, but the syntax is not obvious to me given the manual's indications. page params[:page] certainly does not work...
Update
BaseSearch is a Sphinx method and was in fact inherited from an older version of this application (rails2.x...). So this may be hanging around creating all sorts of syntactic confusion.
In fact, following the manual, I am now fully uncertain as to how best to make these statements. Should a separate class be defined for AziendaSearch? If not, where should the Azienda.search block be invoked... in the controller as such?
@azienda_search = Azienda.search(
:max_matches => 100_000,
:page => params[:page],
:per_page => 100,
:order => "province_id ASC"
)
@results = @azienda_search.search

I'm not sure what BaseSearch is doing with set_per_page (that's certainly not a Thinking Sphinx method), but it's worth noting that Sphinx defaults to a maximum of 1000 records. It is possible to configure Sphinx to return more, though - you need to set max_matches in your config/thinking_sphinx.yml to your preferred limit (per environment):
production:
max_matches: 100000
And also set the limit on the relevant search requests:
Azienda.search(
:max_matches => 100_000,
:page => params[:page],
:per_page => 100
)
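To see why the limit matters for pagination, here is a minimal plain-Ruby sketch. It assumes the out-of-bounds error is raised when a requested page starts beyond max_matches; the helper name last_reachable_page is hypothetical and not part of Thinking Sphinx.

```ruby
# Hypothetical helper: the highest page number that still starts within
# max_matches, assuming Sphinx refuses offsets beyond that limit.
def last_reachable_page(max_matches, per_page)
  (max_matches.to_f / per_page).ceil
end

default_limit = last_reachable_page(1000, 100)    # Sphinx default of 1000 records
raised_limit  = last_reachable_page(100_000, 100) # limit raised as shown above
puts "default: #{default_limit} pages, raised: #{raised_limit} pages"
```

Under that assumption, the default limit of 1000 with 100 results per page leaves only 10 reachable pages, while the raised limit leaves 1000.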
As for the doubled queries… if you add province_id as an attribute in your index definition, you'll be able to order search queries by it.
# in your Azienda index definition:
has province_id
# And then when searching:
Azienda.search(
params[:azienda_search][:terms],
:max_matches => 100_000,
:page => params[:page],
:per_page => 100,
:order => "province_id ASC"
)
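The arguments for that single search call could be collected in one place in the controller. This is a minimal plain-Ruby sketch: the helper name azienda_search_options and the two constants are hypothetical, and the params shape is taken from the call above. It only builds the arguments and does not call Thinking Sphinx itself.

```ruby
# Hypothetical helper (not part of Thinking Sphinx): builds the query string
# and the options hash for the Azienda.search call shown above.
MAX_MATCHES = 100_000 # assumed to match max_matches in config/thinking_sphinx.yml
PER_PAGE    = 100

def azienda_search_options(params)
  terms = (params[:azienda_search] || {})[:terms].to_s
  page  = [params[:page].to_i, 1].max # a missing or invalid page falls back to 1
  options = {
    :max_matches => MAX_MATCHES,
    :page        => page,
    :per_page    => PER_PAGE,
    :order       => "province_id ASC" # needs `has province_id` in the index
  }
  [terms, options]
end
```

In the controller this would be used as `terms, options = azienda_search_options(params)` followed by `@aziendas = Azienda.search(terms, options)`, which removes the second ActiveRecord query.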

Related

missing quotes in thinking sphinx scoped search

I'm wanting a sphinx_scope that will only search for records that are current. Each database record has a field, status, whose value is either CURRENT or ARCHIVED.
I have achieved this, but I have to use an odd construct to get there; there is probably a much better way to do it.
Here's what I have:
indices/letter_index.rb
ThinkingSphinx::Index.define :letter, :with => :real_time do
# fields
indexes title, :sortable => true
indexes content
# attributes
has status, :type => :string
has issue_date, :type => :timestamp
has created_at, :type => :timestamp
has updated_at, :type => :timestamp
end
models/letter.rb
class Letter < ActiveRecord::Base
include ThinkingSphinx::Scopes
after_save ThinkingSphinx::RealTime.callback_for(:letter)
.. snip ..
sphinx_scope(:archived) {
{:with => {:status => "'ARCHIVED'"}}
}
The problem that I ran into was that if I used :with => {:status => 'ARCHIVED'}, my query came out as
SELECT * FROM `letter_core` WHERE MATCH('search term') AND `status` = ARCHIVED AND `sphinx_deleted` = 0 LIMIT 0, 20
ThinkingSphinx::SyntaxError: sphinxql: syntax error, unexpected IDENT, expecting CONST_INT (or 4 other tokens) near 'ARCHIVED AND `sphinx_deleted` = 0 LIMIT 0, 20; SHOW META'
but, if I construct it as :with => {:status => "'ARCHIVED'"}, it then adds the single quotes and the query succeeds. :)
Is this the proper way to write the scope, or is there a better way?
Bonus question: where do I find the docs for what is allowed in the scopes, such as :order, :with, :conditions, etc.
Firstly: the need for quotes was a bug - Sphinx has only relatively recently allowed for filtering on string attributes, hence why this wasn't something in place a good while ago. I've patched Riddle (the gem that is a pure Ruby wrapper around Sphinx functionality and used by Thinking Sphinx), you can give the latest a spin with this in your Gemfile:
gem 'riddle', '~> 2.0',
:git => 'git://github.com/pat/riddle.git',
:branch => 'develop',
:ref => '8aec79fdf4'
As for what can go in a Sphinx scope: anything that can go in a normal search call.

Wildcard search with Thinking Sphinx issue with indexes

define_index do
indexes :first_name, :prefixes => true
indexes :last_name, :prefixes => true
indexes :email, :prefixes => true
set_property :enable_star => 1
set_property :min_prefix_len => 1
end
In this case, if I want to search only on email, the search still runs against all the indexes that are specified.
For example:
email = "*me*"
Contact.search email
This returns matches from first_name, last_name and email, but it should return matches from email only.
What is the solution for searching only one index from the specified indexes?
Just a quick correction - you want to search on a specific field, not a specific index.
And Thinking Sphinx can do this by using the :conditions option - so give the following a try:
Contact.search :conditions => {:email => '*me*'}
Thinking Sphinx can also automatically add wildcards to both ends of each word you give it as well:
Contact.search :conditions => {:email => 'me'}, :star => true
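To illustrate what wrapping each word in wildcards means, here is a simplified plain-Ruby approximation. It is not Thinking Sphinx's actual implementation, which has to deal with more cases (such as query operators); the method name star_words is hypothetical.

```ruby
# Simplified approximation of automatic wildcarding: wrap every word in
# asterisks. Hypothetical helper, not the library's own code.
def star_words(query)
  query.gsub(/\w+/) { |word| "*#{word}*" }
end

puts star_words("me")
puts star_words("john smith")
```

With this sketch, "me" becomes "*me*" and "john smith" becomes "*john* *smith*", which is the form written by hand in the first example.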

Rails activerecord order by field in related table

I have a typical forum style app. There is a Topics model which has_many Posts.
What I want to do using Rails 2.3.x is query the topics table and sort by the most recent post in that topic.
@topics = Topic.paginate :page => params[:page], :per_page => 25,
:include => :posts, :order => 'HELP'
I'm sure this is a simple one but no joy with Google. Thanks.
Sorting on a joined column is probably a bad idea and will take an enormous amount of time to run in many situations. What would be better is to twiddle a special date field on the Topic model when a new Post is created:
class Post < ActiveRecord::Base
after_create :update_topic_activity_at
protected
def update_topic_activity_at
Topic.update_all({ :activity_at => Time.now }, { :id => self.topic_id})
end
end
Then you can easily sort on the activity_at column as required.
When adding this column you can always populate the initial activity_at with the highest posting time if you have existing data to migrate.
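The backfill for existing data amounts to taking the latest post time per topic. A minimal plain-Ruby sketch of that calculation, using a hypothetical PostRow struct in place of the real Post model (in a real migration this would more likely be a single SQL statement):

```ruby
# Hypothetical stand-in for rows of the posts table.
PostRow = Struct.new(:topic_id, :created_at)

# Returns { topic_id => latest created_at } for the given posts.
def latest_activity_by_topic(posts)
  posts.each_with_object({}) do |post, latest|
    current = latest[post.topic_id]
    if current.nil? || post.created_at > current
      latest[post.topic_id] = post.created_at
    end
  end
end
```

Each resulting pair gives the topic to update and the value to store in its activity_at column.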

How can I get FOUND_ROWS() from an active record object in rails?

When querying the database with:
@robots = Robot.all(:conditions => {:a => 'b'}, :limit => 50, :offset => 0)
What is the best way to get the total number of rows without the :limit?
In raw MySQL you could do something like this:
SELECT SQL_CALC_FOUND_ROWS * FROM robots WHERE a=b LIMIT 0, 50
SELECT FOUND_ROWS();
Is there an active record way of doing this?
This works for me:
ps = Post.all(:limit => 10, :select => "SQL_CALC_FOUND_ROWS *")
Post.connection.execute("select found_rows()").fetch_hash
=> {"found_rows()"=>"2447"}
This will probably not work for joins or anything complex, but it works for the simple case.
Robot.count actually is the solution you want.
Reading one of the comments above, it looks like you may have a misunderstanding of how .count works. It returns a count of all the rows in the table only if there are no parameters.
But if you pass in the same conditions that you pass to all/find, e.g.:
Robot.count(:conditions => {:a => 'b'})
.count() will return the number of rows that match the given conditions.
Just to be obvious - you can even save the condition-hash as a variable to pass into both - to reduce duplication, so:
conds = {:a => 'b'}
@robots = Robot.all(:conditions => conds, :limit => 50)
@num_robots = Robot.count(:conditions => conds)
That being said - you can't do an after-the-fact count on the result-set (like in your example). ie you can't just run your query then ask it how many rows would have been found. You do actually have to call .count on purpose.
search = Robot.all(:conditions => ["a=b"], :offset => 0)
@robots = search[0..49]
@count = search.count
That should get what you want: it loads all the Robots for counting and then sets @robots to the first 50. It might be a bit expensive on the resource front if the Robots table is huge.
You can of course do:
@count = Robot.all(:conditions => ["a=b"], :offset => 0).count
@robots = Robot.all(:conditions => ["a=b"], :limit => 50, :offset => 0)
but that will hit the database twice on each request (although rails does have query caching).
Both solutions only use active record so are database independent.
What do you need the total returned by the query for? If it's for pagination, look into will_paginate (there is a Railscast on it), which can be extended with AJAX, etc.
Try find_by_sql; that may help.
Is @robots.size what you're looking for? Or Robot.count?
Otherwise, please clarify.
I think hakunin is right.
You can get the number of rows returned by the query just by checking the size of the resulting array.
@robots = Robot.find_by_sql("Your sql")
or
@robots = Robot.find(:all, :conditions => ["your conditions"])
@robots.size or @robots.count

Is it possible to use Sphinx search with dynamic conditions?

In my web app I need to perform 3 types of searching on items table with the following conditions:
items.is_public = 1 (use title field for indexing) - a lot of results can be retrieved (cardinality is much higher than in other cases)
items.category_id = {X} (use title + private_notes fields for indexing) - usually less than 100 results
items.user_id = {X} (use title + private_notes fields for indexing) - usually less than 100 results
I can't find a way to make Sphinx work in all these cases, but it works well in 1st case.
Should I use Sphinx just for the 1st case and use plain old "slow" FULLTEXT searching in MySQL (at least because of lower cardinality in cases 2 and 3)?
Or is it just me and Sphinx can do pretty much everything?
Without full knowledge of your models I might be missing something, but how's this:
class Item < ActiveRecord::Base
define_index do
indexes :title
indexes :private_notes
has :is_public, :type => :boolean
has :category_id
has :user_id
end
end
1)
Item.search(:conditions => {:title => "blah"}, :with => {:is_public => true})
2)
Item.search("blah", :with => {:category_id => 1})
3)
Item.search("blah", :with => {:user_id => 196})
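The three calls above could be chosen dynamically from whatever filter the request supplies. A minimal plain-Ruby sketch, assuming a hypothetical helper item_search_args that only builds the argument list and leaves the actual Item.search call to the caller:

```ruby
# Hypothetical helper: returns the argument list for Item.search, matching
# the three cases above. It does not call Thinking Sphinx itself.
def item_search_args(terms, scope = {})
  if scope[:category_id]
    [terms, { :with => { :category_id => scope[:category_id] } }]
  elsif scope[:user_id]
    [terms, { :with => { :user_id => scope[:user_id] } }]
  else
    # Public search: match on the title field only.
    [{ :conditions => { :title => terms }, :with => { :is_public => true } }]
  end
end
```

A caller would splat the result, for example `Item.search(*item_search_args("blah", :user_id => 196))`.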