Selecting random in Rails: SQLite vs MySQL

Hey guys, I'm trying to select random data from the database in Ruby on Rails. Unfortunately, SQLite and MySQL use different names for the "random" function: MySQL uses rand(), SQLite uses random(). I've been pretty happy using SQLite in my development environments so far, and I don't want to give it up for just this.
So I have a solution for it, but I'm not very happy with it. First, is there a cleaner abstraction in RoR for getting the random function? And if not, is this the best way to get the "adapter"?
# FIXME: There has to be a better way...
adapter = Rails.configuration.database_configuration[Rails.configuration.environment]["adapter"]
if adapter == "sqlite3"
  # sqlite calls it random
  random = "random"
else
  # mysql calls it rand
  random = "rand"
end
query.push("SELECT *, (" + random + "() * (0.1 * value)) AS weighted_random_value...")
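One way to keep the adapter check in a single place is a small lookup keyed on the adapter name. This is a minimal sketch, assuming the name comes from something like `ActiveRecord::Base.connection.adapter_name`; the constant and helper names (`RANDOM_FUNCTIONS`, `random_function_for`) are made up for illustration, and the table only covers the adapters listed. Keep in mind that the functions are not drop-in equivalents: MySQL's rand() returns a float in [0, 1), while SQLite's random() returns a signed 64-bit integer, so a weighted expression may still need per-database scaling.

```ruby
# Hypothetical helper: map an adapter name to that database's random function.
RANDOM_FUNCTIONS = {
  "sqlite"     => "random()",
  "sqlite3"    => "random()",
  "postgresql" => "random()",
  "mysql"      => "rand()",
  "mysql2"     => "rand()"
}.freeze

def random_function_for(adapter_name)
  key = adapter_name.to_s.downcase
  RANDOM_FUNCTIONS.fetch(key) do
    raise ArgumentError, "no random function known for adapter #{adapter_name.inspect}"
  end
end

# In a Rails app the argument might be ActiveRecord::Base.connection.adapter_name.
random = random_function_for("SQLite")
expression = "(#{random} * (0.1 * value)) AS weighted_random_value"
puts expression
```

Raising on an unknown adapter is a deliberate choice here: a silent fallback would hide the problem until the query fails on the database.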

You can effectively alias MySQL's rand() to random() (the name SQLite and PostgreSQL use) by creating a function:
CREATE FUNCTION random() RETURNS FLOAT NO SQL SQL SECURITY INVOKER RETURN rand();

I wrote a small plugin that handles this problem:
http://github.com/norman/active_record_random

I ran into this problem when developing locally using SQLite. Unfortunately, this is not the only difference between the databases you're going to run into (booleans are also handled differently for instance).
Is it a requirement that you support both SQLite and MySQL? If not I recommend switching to a single database: the one you're deploying on in production.
This takes a bit more time to set up but IMHO in the long run it will save you time, and you will have confidence that your app works well on the database that you'll actually be deploying it with.

Related

Python3, MySQL, and SqlAlchemy -- does SqlAlchemy always require a DBAPI?

I am in the process of migrating databases from sqlite to mysql. Now that I've migrated the data to mysql, I'm not able to use my sqlalchemy code (in Python3) to access it in the new mysql db. I was under the impression that sqlalchemy syntax was database agnostic (i.e. the same syntax would work for accessing sqlite and mysql), but this appears not to be the case. So my question is: Is it absolutely required to use a DBAPI in addition to Sqlalchemy to read the data? Do I have to edit all of my sqlalchemy code to now read mysql?
The documentation says: "The MySQL dialect uses mysql-python as the default DBAPI. There are many MySQL DBAPIs available, including MySQL-connector-python and OurSQL", which I think means that I DO need a DBAPI.
My old code worked successfully with sqlite like this:
from sqlalchemy import create_engine, MetaData, Table

engine = create_engine('sqlite:///pmids_info.db')

def connection():
    conn = engine.connect()
    return conn

def load_tables():
    metadata = MetaData(bind=engine)  # init metadata. will be empty
    metadata.reflect(engine)  # retrieve db info for metadata (tables, columns, types)
    inputPapers = Table('inputPapers', metadata)
    return inputPapers

inputPapers = load_tables()

def db_inputPapers_retrieval(user_input):
    result = engine.execute("select title, author, journal, pubdate, url from inputPapers where pmid = :0", [user_input])
    for row in result:
        title = row['title']
        author = row['author']
        journal = row['journal']
        pubdate = row['pubdate']
        url = row['url']
        apa = str(author+' ('+pubdate+'). '+title+'. '+journal+'. Retrieved from '+url)
        return apa
This worked fine and dandy. So then I tried to update it to work with the mysql db like this:
engine = create_engine('mysql://snarkshark@localhost/pmids_info')
At first when I tried to run my sample code like this, it complained because I didn't have MySQLdb. Some googling around informed me that MySQLdb does NOT work for Python 3. So then I tried pip installing pymysql and changing my engine statement to
engine = create_engine('mysql+pymysql://snarkshark@localhost/pmids_info')
which also ends up giving me various syntax errors when I try to adjust things.
So what I want to know, is if there is any way I can get my current syntax to work with mysql? Since the syntax is from sqlalchemy, I thought it would work perfectly for the exact same data in mysql that was previously in sqlite. Will I have to go through and update ALL of my db functions to use the syntax of the DBAPI?
This will sound like a dumb answer, but you'll need to change all the places where you're using database-specific behavior. SQLAlchemy does not guarantee that anything you do with it is portable across all backends. It leaks some abstractions on purpose to allow you to do things that are only available on certain backends. What you're doing is like using Python because it's cross-platform, then doing a bunch of os.fork()s everywhere, and then being surprised that it doesn't work on Windows.
For your specific case, at a minimum, you need to wrap all your raw SQL in text() so that you're not affected by the supported paramstyle of the DBAPI. However, there are still subtle differences between different dialects of SQL, so you'll need to use the SQLAlchemy SQL expression language instead of raw SQL if you want portability. After all that, you'll still need to be careful not to use backend-specific features in the SQL expression language.

Drupal : How can I know if the db is mysql or postgres

I have a complicated query, and since my module needs to work on both MySQL and Postgres, I need to write two versions of it.
Unfortunately, I don't know how to check whether the database I'm using is MySQL or Postgres, so I can't tell which query to use. Do you know if a function can return this value?
As @kordirko says, one option is to query the server version: SELECT version(); will work on both MySQL and PostgreSQL, though not most other database engines.
Parsing version strings is always a bit fragile though, and MySQL returns just a version number like 5.5.32 whereas PostgreSQL returns something like PostgreSQL 9.4devel on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.7.2 20121109 (Red Hat 4.7.2-8), 64-bit. What do you do if you're connecting to a PostgreSQL-compatible database like EnterpriseDB Postgres Plus, or a MySQL-compatible database?
It's much safer to use the Drupal function for the purpose, DatabaseConnection::databaseType. This avoids a query round-trip to the DB, will work on databases that won't understand/accept SELECT version(), and will avoid the need to parse version strings.
You'll find this bug report useful; it suggests that the correct usage is Database::getConnection()->databaseType().
(I've never even used Drupal, I just searched for this).
Since the abstract DatabaseConnection class extends the PDO class, you can invoke PDO methods to find out the current database driver.
For instance:
$conn = Database::getConnection();
print $conn->getAttribute($conn::ATTR_DRIVER_NAME); #returns mysql, pgsql...
There is a second way to do it using DatabaseConnection::driver():
print $conn->driver();
or DatabaseConnection::databaseType();
print $conn->databaseType();
Note that DatabaseConnection::driver() and DatabaseConnection::databaseType() are similar but not equivalent!
The return value of DatabaseConnection::driver() depends on the driver implementation and other factors.
From the Drupal Database API page:
database.inc abstract public DatabaseConnection::driver()
This is not necessarily the same as the type of the database itself. For instance, there could be two MySQL drivers, mysql and mysql_mock. This function would return different values for each, but both would return "mysql" for databaseType().
In most cases you will just want to use
$conn->getAttribute($conn::ATTR_DRIVER_NAME)
or $conn->databaseType()
If you want more specific properties, you can take advantage of PHP's ReflectionClass:
$conn = Database::getConnection();
$ref = new ReflectionClass($conn);
# $ref->getProperties(), $ref->getConstants(), $ref->isAbstract(), ...
Reference:
PDO::getAttribute
PDO::ATTR_DRIVER_NAME
Drupal Database API
Drupal Base Database API class

How does Rails build a MySQL statement?

I have the following code that runs on Heroku inside a controller and intermittently fails. To me it's a no-brainer that it should work, but I must be missing something.
@artist = Artist.find(params[:artist_id])
The parameters hash looks like this:
{"utf8"=>"✓",
"authenticity_token"=>"XXXXXXXXXXXXXXX",
"password"=>"[FILTERED]",
"commit"=>"Download",
"action"=>"show",
"controller"=>"albums",
"artist_id"=>"62",
"id"=>"157"}
The error I get looks like this:
ActiveRecord::StatementInvalid: Mysql::Error: : SELECT `artists`.* FROM `artists` WHERE `artists`.`id` = ? LIMIT 1
Notice the WHERE `artists`.`id` = ? part of the statement? It's trying to find an ID of QUESTION MARK. Meaning Rails is not passing in the params[:artist_id] which is obviously in the params hash. I'm at a complete loss.
I get the same error on different pages trying to select the record in a similar fashion.
My environment: Cedar Stack on Heroku (this only happens on Heroku), Ruby 1.9.3, Rails 3.2.8, files being hosted on Amazon S3 (though I doubt it matters), using the mysql gem (not mysql2, which doesn't work at all), ClearDB MySQL database.
Here's the full trace.
Any help would be tremendously appreciated.
try sql?
If it's just this one statement, and it's causing production problems, can you omit the query generator just for now? In other words, for very short term, just write the SQL yourself. This will buy you a bit of time.
# to_i forces the id to an integer before it is interpolated into the SQL
Artist.find_by_sql(
  "SELECT `artists`.* FROM `artists` " \
  "WHERE `artists`.`id` = #{params[:artist_id].to_i} LIMIT 1"
)
ARel/MySQL explain?
Rails can help explain what MySQL is trying to do:
# explain is defined on relations, so build one with where instead of find
Artist.where(:id => params[:artist_id]).explain
http://weblog.rubyonrails.org/2011/12/6/what-s-new-in-edge-rails-explain/
Perhaps you can discover some kind of difference between the queries that are succeeding vs. failing, such as how the explain uses indexes or optimizations.
mysql2 gem?
Can you try changing from the mysql gem to the mysql2 gem? What failure do you get when you switch to the mysql2 gem?
volatility?
Perhaps there's something else changing the params hash on the fly, so you see it when you print it, but it's changed by the time the query runs?
Try assigning the variable as soon as you receive the params:
artist_id = params[:artist_id]
# ... whatever code here ...
@artist = Artist.find(artist_id)
not the params hash?
You wrote "Meaning Rails is not passing in the params[:artist_id] which is obviously in the params hash." I don't think that's the problem-- I expect that you're seeing this because Rails is using the "?" as a placeholder for a prepared statement.
To find out, run the commands suggested by @Mori and compare them; they should be the same. (to_sql is defined on relations, so use where rather than find.)
Artist.where(:id => 42).to_sql
Artist.where(:id => params[:artist_id]).to_sql
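The `?` in the error message is consistent with a prepared statement: the SQL text and the bound values travel separately, so the value never appears in the statement itself. Here is a rough sketch of that separation in plain Ruby, with no database involved. `interpolate_binds` is a hypothetical debugging helper, not part of Rails, and it only handles numbers and simple strings.

```ruby
# Hypothetical debugging helper: show what a statement with "?" placeholders
# would look like once its bind values are filled in. For logging only;
# real drivers send the SQL and the binds to the server separately.
def interpolate_binds(sql, binds)
  remaining = binds.dup
  filled = sql.gsub("?") do
    raise ArgumentError, "more placeholders than bind values" if remaining.empty?
    value = remaining.shift
    value.is_a?(Numeric) ? value.to_s : "'#{value.to_s.gsub("'", "''")}'"
  end
  raise ArgumentError, "more bind values than placeholders" unless remaining.empty?
  filled
end

template = "SELECT `artists`.* FROM `artists` WHERE `artists`.`id` = ? LIMIT 1"
puts template                           # the form that shows up in the error message
puts interpolate_binds(template, [62])  # the same statement with the bind filled in
```

If the logged statement always shows `?`, that alone does not mean the parameter was lost; the failure is more likely in how the statement was prepared or executed.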
prepared statements?
Could be a prepared statement cache problem, when the query is actually executed.
Here's the code that is failing-- and there's a big fat warning.
begin
  stmt.execute(*binds.map { |col, val| type_cast(val, col) })
rescue Mysql::Error => e
  # Older versions of MySQL leave the prepared statement in a bad
  # place when an error occurs. To support older mysql versions, we
  # need to close the statement and delete the statement from the
  # cache.
  stmt.close
  @statements.delete sql
  raise e
end
Try configuring your database to turn off prepared statements, to see if that makes a difference.
In your ./config/database.yml file:
production:
  adapter: mysql
  prepared_statements: false
  ...
bugs with prepared statements?
There may be a problem with Rails ignoring this setting. If you want to know a lot more about it, see this discussion and bug fix by Jeremy Cole and Aaron: https://github.com/rails/rails/pull/7042
Heroku may ignore the setting. Here's a way you can try overriding Heroku by patching the prepared_statements setup: https://github.com/rails/rails/issues/5297
remove the query cache?
Try removing the ActiveRecord QueryCache to see if that makes a difference:
config.middleware.delete ActiveRecord::QueryCache
http://edgeguides.rubyonrails.org/configuring.html#configuring-middle
try postgres?
If you can try Postgres, that could clear it up too. That may not be a long term solution for you, but it would isolate the problem to MySQL.
The MySQL statement is obviously wrong, but the Ruby code you mentioned would not produce it. Something is wrong here: either different Ruby code is running (maybe from a before_filter) or a different parameter is being passed (like params[:artist_id] = "?"). It looks like you use nested resources, something like Artist has_many :albums. Maybe the @artist variable is not initialized correctly in the previous action, so that params[:artist_id] does not have the right value?

Why are stored procedures still not supported in Rails (3+)?

I am familiar with the long standing love-hate relationship between Ruby on Rails, DB(MS)-drivers and Stored Procedures and I have been developing Rails applications since version 2.3.2.
However, every once in a while a situation arises where an SP is simply a better choice than combining data at the (much slower) application level. Specifically, running reports that combine data from multiple tables is usually better suited to an SP.
Why are stored procedures still so poorly integrated into Rails or the MySQL gem? I am currently working on a project with Rails 3.0.10 and MySQL2 gem 0.2.13 but as far as I can see, even the latest Edge Rails and MySQL gem 0.3+ still throw tantrums when you use SPs.
The long-standing problem is that the database connection is lost after an SP is called.
>> ActiveRecord::Base.connection.execute("CALL stored_proc")
=> #<Mysql::Result:0x103429c90>
>> ActiveRecord::Base.connection.execute("CALL stored_proc")
ActiveRecord::StatementInvalid: Mysql::Error: Commands out of sync;
[...]
>> ActiveRecord::Base.connection.active?
=> false
>> ActiveRecord::Base.connection.reconnect!
=> nil
>> ActiveRecord::Base.connection.execute("CALL proc01")
=> #<Mysql::Result:0x1034102e0>
>> ActiveRecord::Base.connection.active?
=> false
Is this a really difficult problem to tackle, technically, or is this a design choice by Rails?
Stored procedures are supported in Rails. The out-of-sync error you are getting occurs because the MULTI_STATEMENTS flag for MySQL is not enabled by default in Rails. This flag allows procedures to return more than one result set.
See here for a code sample on how to enable it: https://gist.github.com/wok/1367987
Stored procedures work out of the box with MS SQL Server.
I have been using stored procedures in almost all of my MySQL and SQL Server based Rails projects without any issues.
For Postgres, this executes a stored procedure that returns instances of MyClass.
sql = <<-SQL
  select * from my_cool_sp_with_3_parameters(?, ?, ?) as
  foo(
    column_1 <type1>,
    column_2 <type2>
  )
SQL
MyClass.find_by_sql([sql, param1, param2, param3])
Replace the column list inside of foo() with the columns from your model and the stored procedure results. I'm sure this could be made generic by inspecting the columns of the class.
Those who are getting sync errors may have procedures that generate multiple results. You will need to do something like this to handle them:
raise 'You updated Rails. Check this duck punch is still valid' unless Rails.version == "3.2.15"
module ActiveRecord
  module ConnectionAdapters
    class Mysql2Adapter
      def call_stored_procedure(sql)
        results = []
        results << select_all(sql)
        while @connection.more_results?
          results << @connection.next_result
        end
        results
      end
    end
  end
end
Call like this:
ActiveRecord::Base.connection.call_stored_procedure("CALL your_procedure('foo')")
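As I understand the mysql2 README, its own example for procedures that return several result sets uses a slightly different loop: `next_result` advances to the next set and `store_result` fetches it. Below is a minimal sketch of that loop, written against a stand-in client so it runs without a database. `FakeClient` and `collect_result_sets` are hypothetical names, and the assumption is that the real connection responds to `query`, `next_result`, and `store_result` in the same way.

```ruby
# Stand-in for a database client, used only so the sketch runs without MySQL.
# It is assumed to mimic mysql2's query / next_result / store_result methods.
class FakeClient
  def initialize(result_sets)
    @pending = result_sets.dup
    @current = nil
  end

  # Returns the first result set, like running the CALL statement.
  def query(_sql)
    @current = @pending.shift
  end

  # Advances to the next result set; false when none are left.
  def next_result
    return false if @pending.empty?
    @current = @pending.shift
    true
  end

  # Returns the result set that next_result advanced to.
  def store_result
    @current
  end
end

# Hypothetical helper: run a CALL and gather every result set it produces.
def collect_result_sets(client, sql)
  results = [client.query(sql)]
  results << client.store_result while client.next_result
  results
end

client = FakeClient.new([[{ "id" => 1 }], [{ "total" => 42 }]])
sets = collect_result_sets(client, "CALL your_procedure('foo')")
puts sets.length
```

Reading every pending result set before issuing the next statement is what keeps the connection from reporting "Commands out of sync".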

Renaming columns in a MySQL select statement with R package RJDBC

I am using the RJDBC package to connect to a MySQL (Maria DB) database in R on a Windows 7 machine and I am trying a statement like
select a as b
from table
but the column will always continue to be named "a" in the data frame.
This works normally with RODBC and RMySQL but doesn't work with RJDBC. Unfortunately, I have to use RJDBC as this is the only package that has no problem with the encoding of Chinese, Hebrew, and other non-Latin characters (set names and so on don't seem to work with RODBC and RMySQL).
Has anybody experienced this problem?
I have run into the same frustrating issue. Sometimes the AS keyword would have its intended effect, but other times it wouldn't. I was unable to identify the conditions to make it work correctly.
Short Answer: (Thanks to Simon Urbanek (package maintainer for RJDBC), Yev, and Sebastien! See the Long Answer.) One thing that you may try is to open your JDBC connection using ?useOldAliasMetadataBehavior=true in your connection string. Example:
drv <- JDBC("com.mysql.jdbc.Driver", "C:/JDBC/mysql-connector-java-5.1.18-bin.jar", identifier.quote="`")
conn <- dbConnect(drv, "jdbc:mysql://server/schema?useOldAliasMetadataBehavior=true", "username", "password")
query <- "SELECT `a` AS `b` FROM table"
result <- dbGetQuery(conn, query)
dbDisconnect(conn)
This ended up working for me! See more details, including caveats, in the Long Answer.
Long Answer: I tried all sorts of stuff, including making views, changing queries, using JOIN statements, NOT using JOIN statements, using ORDER BY and GROUP BY statements, etc. I was never able to figure out why some of my queries were able to rename columns and others weren't.
I contacted the package maintainer (Simon Urbanek.) Here is what he said:
In the vast majority of cases this is an issue in the JDBC driver, because there is really not much RJDBC can do other than to call the driver.
He then recommended that I make sure I had the most recent JDBC driver for MySQL. I did have the most recent version. However, it got me thinking "maybe it IS a bug with the JDBC driver." So, I searched Google for: mysql jdbc driver bug alias.
The top result for this query was an entry at bugs.mysql.com. Yev, using MySQL 5.1.22, says that when he upgraded from driver version 5.0.4 to 5.1.5, his column aliases stopped working. He asked whether it was a bug.
Sebastien replied, "No, it's not a bug! It's a documented change of behavior in all subsequent versions of the driver." and suggested using ?useOldAliasMetadataBehavior=true, citing documentation for the JDBC driver.
Caveat Lector: The documentation for the JDBC driver states that
useColumnNamesInFindColumn is preferred over useOldAliasMetadataBehavior unless you need the specific behavior that it provides with respect to ResultSetMetadata.
I haven't had the time to fully research what this means. In other words, I don't know what all of the ramifications of using useOldAliasMetadataBehavior=true are. Use at your own risk. Does someone else have more information?
I don't know RJDBC, but in some cases when it is necessary to give permanent aliases to columns without renaming them, you can use VIEWs
CREATE OR REPLACE VIEW v_table AS
SELECT a AS b
FROM table
... and then ...
SELECT b FROM v_table
There is a separate function in the ResultSetMetaData interface for retrieving the column label vs the column name:
String getColumnLabel(int column) throws SQLException;
Gets the designated column's suggested title for use in printouts and
displays. The suggested title is usually specified by the SQL AS
clause. If a SQL AS is not specified, the value returned
from getColumnLabel will be the same as the value returned by the
getColumnName method.
Using getColumnLabel should resolve this issue (if not, check that your JDBC driver is following this spec).
e.g.
ResultSetMetaData rsmd = rs.getMetaData();
int columnCount = rsmd.getColumnCount();
while (rs.next()) {
    for (int i = 1; i <= columnCount; i++) {
        String label = rsmd.getColumnLabel(i);
        System.out.println(rs.getString(label));
    }
}
This is the workaround we use for R and SAP HANA via RJDBC:
names(result)[1]<-"b"
It's not the nicest workaround, but since Aaron's solution does not work for us, we went with this "solution".