ActiveRecord model column not updating (even though save! succeeds) - MySQL

I've got a really, really odd problem manifesting on a big Rails e-commerce app and thought I'd see if anyone has good insight. I have an "Order" model with many associations. If I create a new instance, then set one particular column value and call "save!", the "save!" succeeds without errors, but the change isn't actually persisted to the DB. I'll run through the scenario below:
@order = Order.create!(<some attributes>)
=> true
@order.shipping_method_id
=> 1
@order.shipping_method_id = 203
=> 203
@order.save!
=> true
@order.shipping_method_id
=> 1
To try to debug this, I prepended a before_save filter, and I can see that when this first filter is called after setting the value, it is correct ("203"), BUT by the very next before_save, after the six-or-so built-in "autosave_foo_bar_quux" filters (for nested associations), it is back to "1".
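Roughly what that prepended filter looks like - a minimal sketch (the filter name is mine; :prepend pushes it ahead of the built-in autosave callbacks):

class Order < ActiveRecord::Base
  # runs before every other before_save filter, including the autosave ones
  before_save :log_shipping_method_id, :prepend => true

  private

  def log_shipping_method_id
    Rails.logger.debug "shipping_method_id entering save: #{shipping_method_id.inspect}"
  end
end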
Oddly, if I just reload the order (@order.reload), change the column value and save!, the update does succeed.
In both cases, @order.changed shows that ActiveModel recognizes the change to shipping_method_id. In the first case, though, the SQL log shows that the order row is never updated.
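One way to see exactly which filters sit between mine and the actual write is to dump the save callback chain - a sketch against Rails' callback internals, so the exact API may vary by version (the association names in the output are illustrative):

Order._save_callbacks.select { |cb| cb.kind == :before }.map(&:filter)
# => [:log_shipping_method_id, :autosave_associated_records_for_line_items, ...]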
I feel like I'm going insane. Any ideas? Also, let me know if there's anything else I can post here for more context.

Related

Thinking Sphinx ranking and statistics

I'm trying to set up the ability to get some numbers from my Sphinx indexes, but I'm not sure how to get the info I want.
I have a MySQL DB with articles, a Sphinx index set up for that DB, and full-text search, all working. What I want is to get some numbers:
How many times a search text (keyword, or key phrase) appears over all articles for all time (more likely limited to "articles from the time interval from X to Y")
Same as the previous, but for how many times 2 keywords or key phrases (so "x AND y") appear in the same articles
I was doing something similar to the first one manually, using a bat file I made:
indexer ind_core -c c:\%SOME_PATH%\development.sphinx.conf --buildstops stats.txt 10000 --buildfreqs
This generated a txt file with all recurring keywords and how often they appear, which helped me form a list of keywords I'm interested in at an early development stage. Now I'm trying to do the same, but just for a finite list of predetermined keywords, and integrated into my Rails project so I can build charts in the future.
I tried running some queries like
@testing = Article.search 'Keyword1 AND Keyword2', :ranker => :wordcount
but I'm not sure how it works or how to process the result, or whether that's even what I'm looking for.
Another approach I tried was manual MySQL queries such as:
SELECT id, title, WEIGHT() AS w FROM ind_core WHERE MATCH('@title keyword1 | keyword2') OPTION ranker=expr('sum(hit_count)');
but I'm not sure how to process the results from here either (or how to actually implement this in my existing Rails project), and it's limited to 20 rows per query (which I think I can change somewhere in the settings?). But at least, looking at the MySQL results, what I'm interested in is hit_count over all articles (or all articles from a set timeframe).
Any ideas on how to do this?
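In case it's useful, this is roughly how that SphinxQL could be run from Ruby - a sketch assuming the mysql2 gem and searchd listening on its default SphinxQL port 9306 (the explicit LIMIT is what lifts the default 20-row cap):

require 'mysql2'

# connect to searchd's SphinxQL listener, not the MySQL server itself
client = Mysql2::Client.new(:host => "127.0.0.1", :port => 9306)
rows = client.query(
  "SELECT id, WEIGHT() AS w FROM ind_core " +
  "WHERE MATCH('@title keyword1 | keyword2') " +
  "LIMIT 0, 1000 OPTION ranker=expr('sum(hit_count)')"
)
rows.each { |row| puts "#{row['id']}: #{row['w']}" }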
UPDATE:
The current way I found was to add
@testing = Article.search params[:search], :without => {:is_active => false}, :ranker => :bm25
to the controller with some conditions (so it doesn't bug out on a nil search). :is_active is my soft-delete flag; I don't want to search deleted entries, so don't mind it. And in the view I simply displayed
<%= @testing.total_entries %>
which, if I understand it correctly, shows me the number of matches Sphinx found (so pretty much what I was looking for).
So, to figure out the number of hits per document: you're pretty much on the right track; it's just a matter of getting it into Ruby/Thinking Sphinx.
To get the raw Sphinx results (if you don't need the ActiveRecord objects):
search = Article.search "foo",
  :ranker => "expr('SUM(hit_count)')",
  :select => "*, weight()",
  :middleware => ThinkingSphinx::Middlewares::RAW_ONLY
… this will return an array of hashes, and you can use the weight() string key for the hit count, and the sphinx_internal_id string key for the model's primary key (id is Sphinx's own primary key, which isn't so useful).
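A quick sketch of consuming those hashes (key names as just described):

search.each do |row|
  puts "article ##{row['sphinx_internal_id']}: #{row['weight()']} hits"
end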
Or, if you want to use the ActiveRecord objects, Thinking Sphinx has the ability to wrap each search result in a helper object which passes appropriate methods through to the underlying model instances, but lets weight respond with the values from Sphinx:
search = Article.search "foo",
  :ranker => "expr('SUM(hit_count)')",
  :select => "*, weight()"; ""
search.context[:panes] << ThinkingSphinx::Panes::WeightPane
search.each do |article|
  puts article.weight
end
Keep in mind that panes must be added before the search is evaluated, so if you're testing this in a Rails console, you'll want to avoid letting the console inspect the search variable (which I usually do by adding ; "" at the end of the initial search call).
In both of these cases, as you've noted, the search results are paginated - you can use the :page option to determine which page of results you want, and :per_page to determine the number of records returned in each request. There is a standard limit of 1000 results overall, but that can be altered using the max_matches setting.
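For example (both options are part of Thinking Sphinx's standard pagination support):

# second page of results, 50 records per page
Article.search "foo", :page => 2, :per_page => 50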
Now, if you want the number of times the keywords appear across all Sphinx records, then the best way to do that while also taking advantage of Thinking Sphinx's search options is to get the raw results of an aggregate SUM - similar to the first option above.
search = Article.search "foo",
  :ranker => "expr('SUM(hit_count)')",
  :select => "SUM(weight()) AS count",
  :middleware => ThinkingSphinx::Middlewares::RAW_ONLY
search.first["count"]

Help me optimize an ActiveRecord object with too many attributes

I'm working on an app which ties into a legacy database. The primary model is based on a stupidly large 100+ column table. I don't know too much about the inner workings of ActiveRecord, but it seems to me that any request on this model slows down because it's creating objects with 100+ attributes. Let's call this SlowModel.
Rendering pages with this model sometimes takes 17 seconds on my dev computer. Straight-up MySQL queries only take ~0.5-1 second.
I've managed to speed up one portion of the app by using a MySQL view that selects a subset of the fields (20 or so). We'll call this QuickModel. Using views is OK, but it isn't the most portable solution.
I will likely continue to try to use this QuickModel in other parts of the site, but I was wondering if anyone had other ideas for speeding up the original object. For instance, is there a way to specify in the model which columns ActiveRecord should just ignore and avoid building? Maybe there are specific column types (:text??) that cause bloat in ActiveRecord objects.
Assume that columns have proper indices.
You can specify which columns are returned in the model lookup using the :select option of the ActiveRecord lookup:
SlowModel.all(:select => 'id, col1, col2, col3')
...will load instances of SlowModel with only the specified columns populated.
How about having a completely new QuickModel that sits on its own table... and a QuickModel has_one SlowModel?
You can use SQL to move the most-necessary data into the QuickModel table and only refer to the SlowModel using my_quick_model.slow_model when necessary.
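A sketch of that arrangement (model and column names are illustrative, and it assumes a quick_model_id foreign key on the wide legacy table):

class QuickModel < ActiveRecord::Base
  has_one :slow_model
end

class SlowModel < ActiveRecord::Base
  belongs_to :quick_model
end

my_quick_model = QuickModel.first
my_quick_model.slow_model.some_big_text_column # touches the wide table only when needed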
Alternatively, you can add a "select" to the default scope (you can google "rails default scope" for more). By default it'll only fetch the reduced set - but you can ask for all attributes by passing :select => "*" if necessary.
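A sketch of that default-scope variant (Rails 2.x/3.0-era hash options; column names are illustrative):

class SlowModel < ActiveRecord::Base
  # only fetch the frequently used columns unless told otherwise
  default_scope :select => "id, col1, col2, col3"
end

SlowModel.all                  # loads the reduced column set
SlowModel.all(:select => "*")  # explicitly asks for everything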
Along the lines of what Winfield is saying, you may want to take a look at an attribute tracker like SlimScrooge. The tracker attempts to fetch only the data that you're actually using, which reduces overhead - effectively automating what Winfield is suggesting.
Example from the Readme:
# 1st request, sql is unchanged but columns accesses are recorded
Brochure Load SlimScrooged 1st time (27.1ms) SELECT * FROM `brochures` WHERE (expires_at IS NULL)
# 2nd request, only fetch columns that were used the first time
Brochure Load SlimScrooged (4.5ms) SELECT `brochures`.expires_at,`brochures`.operator_id,`brochures`.id FROM `brochures` WHERE (expires_at IS NULL)
# 2nd request, later in code we need another column which causes a reload of all remaining columns
Brochure Reload SlimScrooged (0.6ms) `brochures`.name,`brochures`.comment,`brochures`.image_height,`brochures`.id, `brochures`.tel,`brochures`.long_comment,`brochures`.image_name,`brochures`.image_width FROM `brochures` WHERE `brochures`.id IN ('5646','5476','4562','3456','4567','7355')
# 3rd request
Brochure Load SlimScrooged (4.5ms) SELECT `brochures`.expires_at,`brochures`.operator_id,`brochures`.name, `brochures`.id FROM `brochures` WHERE (expires_at IS NULL)

Exposing table name and field names in request URL

I was tasked to create this Joomla component (yep, Joomla; but it's unrelated), and a professor told me that I should make my code as dynamic as possible (code that needs less maintenance) and avoid hard-coding. The approach we initially thought of is to take URL parameters, turn them into objects, and pass them to the query.
Let's say we want to read the hotel with id 1 in the table "hotels". Let's say the table has the fields "hotel_id", "hotel_name", and some other fields.
Now, the approach we took in building the SQL query string is to parse a URL request that looks like this:
index.php?task=view&table=hotels&hotel_id=1&param1=something&param2=somethingelse
and turn it into a PHP object like this (shown as its JSON equivalent, which is easier to read):
obj = {
  'table': 'hotels',
  'conditions': {
    'hotel_id': '1',
    'param1': 'something',
    'param2': 'somethingelse'
  }
}
and the SQL query will be something like this, where the conditions are looped over and appended to the string, with the field and value of the WHERE clause being the key and value of the object (still in JSON form for ease):
SELECT * FROM obj.table WHERE hotel_id=1 AND param1=something and so on...
The problem that bugged me was exposing the table name and field names in the request URL. I know it poses a security risk to expose items that should only be seen server-side. The current solution I'm thinking of is giving aliases to each and every table and field for the client side - but that would be hard-coding, which is against his policy. And besides, if I did that and had a thousand tables to alias, it would not be practical.
What is the proper method to do this that:
avoids hard-coding stuff
keeps the code dynamic and adaptable
EDIT:
Regarding the arbitrary queries (I forgot to include this): what currently stops them on the back end is a function that takes a reference from a hard-coded object (more like a config file, shown here) and parses the URL by picking out parameters and matching them against it.
The config looks like:
// 'hotels' here is the table name. instead of parsing the url for a table name
// php will just find the table from this config. if no match, return error.
// reduces risk of arbitrary tables.
'hotels' => array(
    // fields and their types, used to identify what filter to use
    'structure' => array(
        'hotel_id'    => 'int',
        'name'        => 'string',
        'description' => 'string',
        'featured'    => 'boolean',
        'published'   => 'boolean'
    ),
    // these are the list of 'tasks' and accepted parameters, based on the ones above
    // these are the actual parameter names which i said were the same as field names
    // the ones in 'values' are usually values for inserting and updating
    // the ones in 'conditions' are the ones used in the WHERE part of the query
    'operations' => array(
        'add' => array(
            'values'     => array('name', 'description', 'featured'),
            'conditions' => array()
        ),
        'view' => array(
            'values'     => array(),
            'conditions' => array('hotel_id')
        ),
        'edit' => array(
            'values'     => array('name', 'description', 'featured'),
            'conditions' => array('hotel_id')
        ),
        'remove' => array(
            'values'     => array(),
            'conditions' => array('hotel_id')
        )
    )
)
and so, from that config list:
if the parameters sent for a task are not complete, the server returns an error.
if a parameter from the URL is doubled, only the first one read is taken.
any other parameters not in the config are discarded.
if a task is not allowed, it won't be listed for that table.
if a task is not there, the server returns an error.
if a table is not there, the server returns an error.
I actually patterned this after a Joomla component I saw that uses this strategy. It reduces the model and controller to four dynamic functions, which would be CRUD, leaving the config file as the only file to edit later on (this is what I meant by dynamic code: I only add tables and tasks when further tables are needed), but I fear it may pose a security risk I haven't foreseen yet.
Any ideas for an alternative?
I have no problem with using the same (or very similar) names in the URL and the database — sure, you might be "exposing" implementation details, but if you're choosing radically different names in the URL and the DB, you're probably choosing bad names. I'm also a fan of consistent naming — communication with coders/testers/customers becomes much more difficult if everyone calls everything something slightly different.
What bugs me is that you're letting the user run arbitrary queries on your database. http://.../index.php?table=users&user_id=1, say? Or http://.../index.php?table=users&password=password (not that you should be storing passwords in plaintext)? Or http://.../index.php?table=users&age=11?
If the user connected to the DB has the same permissions as the user sitting in front of the web browser, it might make sense. Generally, that's not going to be the case, so you'll need some layer that knows what the user is and isn't allowed to see, and that layer is a lot easier to write correctly by whitelisting.
(If you've stuck enough logic into stored procedures, then it might work, but then your stored procedures will hard-code column names...)
Composing a SQL query with data from the input does present a security risk. But keep in mind that column values are always inserted by taking input from the user, analyzing it, and composing a SQL query with it (prepared statements aside). So when it's done properly, you have nothing to worry about - simply restrict the user to the permitted columns and tables. Open-source software's code and database schema are visible to all, and it doesn't harm those systems as much as one would think.
Your aliases could be a rot13() of the meta/name of your objects.
Although, if you escape the input accordingly when working with those names, I don't see any problem in exposing them.

Access SQL computed columns through ActiveRecord

I have a Person model, which includes a property representing the date of birth (birth_date).
I also have a method called age(), which works out the current age of the person.
I now have need to run queries based on the person's age, so I have replicated the logic of age() as a computed column in MySQL.
I cannot work out how I would make this additional column part of the default select statement of the model.
I would like to be able to access the age as if it were a native property of the Person model, to perform queries against it and access the value in my views.
Is this possible, or am I barking up the wrong tree?
I thought I might be able to define additional fields through default_scope or scope, but these methods seem to only recognise existing fields. I also tried default_scope in tandem with attr_accessor.
Possible workarounds I've considered but would prefer not to do:
Create an actual property called age and populate it through callbacks. The date is always changing, so this obviously wouldn't be reliable.
Replicate the logic in ActiveRecord as age() and in a scope as a where clause. This would achieve what I need, but doesn't feel very DRY.
I am already caching the results of the age() method. It is the ability to use the field in where clauses that I am most interested in.
There must be a way to define dynamic fields through SQL that I can access through the model by default.
Any help would be appreciated.
Rich
UPDATE
An example of my failed attempt to utilise scopes:
default_scope :select => "*, 2 as age"
attr_accessor :age
age is blank, I assume because scopes only deal with limiting, not extending.
kim3er, the solution to your problem is simple. Follow these steps:
Lose the attr_accessor :age from your model. You simply don't need it.
Leave the default scope on the Person model: default_scope :select => "*, 2 as age"
Lastly, open up a console and try:
p = Person.first
p.age
=> 2
When you define a select using "as", Rails will automagically add those methods to the instances for you! So the answer to your question:
There must be a way to define dynamic fields through SQL that I can access through the model by default.
is:
Rails
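To tie it back to the real goal: the placeholder "2 as age" would become the actual date arithmetic. A sketch using MySQL's TIMESTAMPDIFF (the expression is an assumption about how age() is defined):

class Person < ActiveRecord::Base
  # age in whole years, computed by MySQL at query time
  default_scope :select => "*, TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age"
end

Person.first.age # computed by the database, no Ruby-side calculation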
I'm not an expert by any stretch, but it seems you want:
class Person < ActiveRecord::Base
  scope :exact_age, lambda { |a| where('age = ?', a) }
  scope :age_gt, lambda { |a| where('age > ?', a) }
end
but that said, I've just started looking at Arel - it seems pretty cool.
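One caveat with those scopes: MySQL won't let a WHERE clause reference a SELECT alias, so if age exists only as the aliased computed column from the default scope, the scope needs the underlying expression instead (a sketch, reusing the hypothetical TIMESTAMPDIFF definition above):

scope :age_gt, lambda { |a|
  where('TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) > ?', a)
}

Person.age_gt(18) # people older than 18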

ActiveRecord caching and update_attributes

If a model changes an attribute locally, then changes it back, ActiveRecord doesn't send the change to the DB. This is great for performance, but if something else changes the database, and I want to revert it to the original value, the change doesn't take:
model = Model.find(1)
model.update_attribute(:a, 1) # start it off at 1
# some code here that changes model.a to 2
model.a = 2 # I know it changed, reflecting my local model of the change
model.update_attribute(:a, 1) # try change it back, DOESN'T WORK
The last line doesn't work because AR thinks the value in the DB is still 1, even though something else changed it to 2. How can I force an AR update, or update the cache directly if I know the new value?
Side note: the code that changes it is an update_all query that locks the record, but it has side effects that mess up the cache. Multiple machines read this table. If there's a better way to do this I'd love to know.
Model.update_all(["locked_by = ?", lock_name], ["id = ? AND locked_by IS NULL", id])
Use the reload method for this.
model.reload(:select => "a")
OR
You can try the will_change! method (it's not clear how your change happens, but you can try this).
model.update_attribute(:a, 1) # start it off at 1
model.a_will_change! # forewarn the model about the change
model.a = 2 #perform the change
model.update_attribute(:a, 1)
The answer by Harish Shetty is correct - you need to call reload on the reference. However, I found a better way to do that automatically.
In the model whose attribute you want reloaded, create an after_update callback and call reload directly there, like so:
after_update :reload_attr

def reload_attr
  reload select: "attr"
end