Python3, MySQL, and SqlAlchemy -- does SqlAlchemy always require a DBAPI? - mysql

I am in the process of migrating databases from sqlite to mysql. Now that I've migrated the data to mysql, I'm not able to use my sqlalchemy code (in Python3) to access it in the new mysql db. I was under the impression that sqlalchemy syntax was database agnostic (i.e. the same syntax would work for accessing sqlite and mysql), but this appears not to be the case. So my question is: Is it absolutely required to use a DBAPI in addition to Sqlalchemy to read the data? Do I have to edit all of my sqlalchemy code to now read mysql?
The documentation says: The MySQL dialect uses mysql-python as the default DBAPI. There are many MySQL DBAPIs available, including MySQL-connector-python and OurSQL, which I think means that I DO need a DBAPI.
My old code with sqlite successfully worked like this with sqlite:
engine = create_engine('sqlite:///pmids_info.db')
def connection():
conn = engine.connect()
return conn
def load_tables():
metadata = MetaData(bind=engine) #init metadata. will be empty
metadata.reflect(engine) #retrieve db info for metadata (tables, columns, types)
inputPapers = Table('inputPapers', metadata)
return inputPapers
inputPapers = load_tables()
def db_inputPapers_retrieval(user_input):
result = engine.execute("select title, author, journal, pubdate, url from inputPapers where pmid = :0", [user_input])
for row in result:
title = row['title']
author = row['author']
journal = row['journal']
pubdate = row['pubdate']
url = row['url']
apa = str(author+' ('+pubdate+'). '+title+'. '+journal+'. Retrieved from '+url)
return apa
This worked fine and dandy. So then I tried to update it to work with the mysql db like this:
engine = create_engine('mysql://snarkshark#localhost/pmids_info')
At first when I tried to run my sample code like this, it complained because I didn't have MySqlDB. Some googling around informed me that MySqlDB does NOT work for Python 3. So then I tried pip installing pymysql and changing my engine statement to
engine = create_engine('mysql+pymysql://snarkshark#localhost/pmids_info')
which also ends up giving me various syntax errors when I try to adjust things.
So what I want to know, is if there is any way I can get my current syntax to work with mysql? Since the syntax is from sqlalchemy, I thought it would work perfectly for the exact same data in mysql that was previously in sqlite. Will I have to go through and update ALL of my db functions to use the syntax of the DBAPI?

This will sound like a dumb answer, but you'll need to change all the places where you're using database-specific behavior. SQLAlchemy does not guarantee that anything you do with it is portable across all backends. It leaks some abstractions on purpose to allow you to do things that are only available on certain backends. What you're doing is like using Python because it's cross-platform, then doing a bunch of os.fork()s everywhere, and then being surprised that it doesn't work on Windows.
For your specific case, at a minimum, you need to wrap all your raw SQL in text() so that you're not affected by the supported paramstyle of the DBAPI. However, there are still subtle differences between different dialects of SQL, so you'll need to use the SQLAlchemy SQL expression language instead of raw SQL if you want portability. After all that, you'll still need to be careful not to use backend-specific features in the SQL expression language.

Related

SQLAlchemy - Unable to reflect SQL view

I'm getting the following error message when trying to reflect any of my SQL views:
sqlalchemy/dialects/mysql/reflection.py", line 306, in _describe_to_create
buffer.append(" ".join(line))
TypeError: sequence item 2: expected str instance, bytes found
I have tried using both the autoload_with and autoload=True options in my select query constructor to no avail.
I have the appropriate permissions on my view. My query is pretty simple:
company_country = Table('company_country', metadata, autoload_with=engine)
query = select(company_country.c.country)
return query
I've tried the inspect utility and it does not list my SQL view, nor does the reflecting all tables described below the views section on this page: https://docs.sqlalchemy.org/en/14/core/reflection.html#reflecting-views
I'm using version SQLAlchemy->1.4.32, Python 3.x and mySQL 8.0.28 on Mac if that's any help
I should add that I can query my SQL views using the text() constructor but it would be far more preferable to use select() if possible.
Any tips appreciated
I was using the mysql-connector client for interop with other code, but after switching to the mysqlclient, I was able to reflect the views.
https://docs.sqlalchemy.org/en/14/dialects/mysql.html#module-sqlalchemy.dialects.mysql.mysqldb

sqlalchemy.exc.ProgrammingError: Unexpected 'UNIQUE' error when using alembic with snowflake

I'm trying to use alembic with snowflake to version control the schema I use for both PostgreSQL and snowflake. I keep running into this Unexpected 'UNIQUE' error. I know this is being caused as it is trying to create an index, something that snowflake does not support. This is strange to me as I thought the purpose of the dialect system in SQLAlchemy was to manage differences between implementations and stop it trying to create this index when it's not supported.
I followed the guide on the snowflake website to add the dialect to alembic, calling the upgrade function like this:
engine = create_engine(SNOWFLAKE_DATABASE_URI)
with engine.begin() as con:
logger.info("Starting db upgrade.")
cfg = Config("migrations/alembic.ini")
cfg.attributes["connection"] = con
cfg.attributes["configure_logger"] = False
command.upgrade(cfg, "head")
Is the connector not working properly or am I not calling this in the correct way?

Entity Framework converts StartsWith to MySQL's Locate, MySQL's Locate doesn't use index

I'm using Entity Framework with MySQL, and my Linq Query:
db.Persons.Where(x => x.Surname.StartsWith("Zyw")).ToList();
..is producing the SQL:
SELECT PersonId, Forename, Surname
FROM Person
WHERE (LOCATE('Zyw', Surname)) = 1
...and it would seem that this doesn't make use of the index on Surname.
If LOCATE is replaced with the equivalent LIKE, the query speedily returns the required results. As it is it takes all afternoon.
Why is Entity Framework and its connecting drivers opting for this wierd LOCATE function / how can I make it use LIKE instead / why is MySQL making a poor index decision for the LOCATE function / how can I make it better?
Update:
I'm afraid I was guilty of over simplifying my code for this post, the Linq producing the error is in fact:
var target = "Zyw";
db.Persons.Where(x => x.Surname.StartsWith(target)).ToList();
If target term is hard coded, the SQL generated does indeed use LIKE, but with a variable term the SQL changes to use LOCATE.
This is all using the latest generally available MySQL for Windows as delivered by MySQL Installer 5.6.15.
Update:
A couple more notes to go with the bounty; am using:
Visual Studio 2010
EntityFramework 6.0.2
MySQL Installer 5.6.15,
which in turn gives:
MySql.Data 6.7.4
MySql.Data.Entities 6.7.4
The Entity Framework code is generated database first style.
I've also tried it with the latest connector from Nuget (MySql.Data 6.8.3) and the problem is still there.
It's likely your problem is caused by:
You are using an older connector with the bug.
You have a special case (using a variable to hold the .Contains search) described as a bug here
Does your case fall into any of those?
This looks like a regression of MySQL bug #64935 to me.
I can confirm that, using the same builds of EF6 and MySQL Connector, I'm getting the same SQL generated too:
context.stoppoints.Where(sp => sp.derivedName.StartsWith(stopName));
...logs as:
SELECT
`Extent1`.`primaryCode`,
...
`Extent1`.`stop_timezone`
FROM `stoppoints` AS `Extent1`
WHERE (LOCATE(#p__linq__0, `Extent1`.`derivedName`)) = 1
Entity Framework: 6.0.2
MySQL Connector.Net: 6.8.3
I have reported this as a MySQL bug regression.

Drupal : How can I know if the db is mysql or postgres

I have a complicated query and since I need that my module work on both mysql and postgres, I need to write two version of it.
Unfortunately, I don't know how I can check if the db I use is mysql or postgres, to know which query use. Do you know if a function can return this value?
As #kordirko says, one option is to query the server version: SELECT version(); will work on both MySQL and PostgreSQL, though not most other database engines.
Parsing version strings is always a bit fragile though, and MySQL returns just a version number like 5.5.32 wheras PostgreSQL returns something like PostgreSQL 9.4devel on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.7.2 20121109 (Red Hat 4.7.2-8), 64-bit. What do you do if you're connecting to a PostgreSQL-compatible database like EnterpriseDB Postgres Plus, or a MySQL-compatible database?
It's much safer to use the Drupal function for the purpose, DatabaseConnection::databaseType. This avoids a query round-trip to the DB, will work on databases that won't understand/accept SELECT version(), and will avoid the need to parse version strings.
You'll find this bug report useful; it suggests that the correct usage is Database::getConnection()->databaseType().
(I've never even used Drupal, I just searched for this).
As long as the abstract DatabaseConnection class extends PDO class, you can invoking pdo methods in order to know the current database driver.
For instance:
$conn = Database::getConnection();
print $conn->getAttribute($conn::ATTR_DRIVER_NAME); #returns mysql, pgsql...
There is a second way to do it using DatabaseConnection::driver():
print $conn->driver();
or DatabaseConnection::databaseType();
print $conn->databaseType();
Note that DatabaseConnection::driver() and DatabaseConnection::databaseType() are similar functions but not equals!
The return value from DatabaseConnection::driver() method depends on the implementation and other factors.
in the Drupal Database API page:
database.inc abstract public DatabaseConnection::driver()
This is not necessarily the same as the type of the database itself. For instance, there could be two MySQL drivers, mysql and mysql_mock. This function would return different values for each, but both would return "mysql" for databaseType().
In the most cases you just gonna want to use only
$conn->getAttribute($conn::ATTR_DRIVER_NAME)
or $conn->databaseType()
If you want get more specific properties, you should take advantage the PHP ReflectionClass features:
$conn = Database::getConnection();
$ref = new ReflectionClass($conn);
#ref->getProperties, ref->getConstants $ref->isAbstract...
Reference:
PDO::getAttribute
PDO::ATTR_DRIVER_NAME
Drupal Database API
Drupal Base Database API class

Renaming columns in a MySQL select statement with R package RJDBC

I am using the RJDBC package to connect to a MySQL (Maria DB) database in R on a Windows 7 machine and I am trying a statement like
select a as b
from table
but the column will always continue to be named "a" in the data frame.
This works normally with RODBC and RMySQL but doesn't work with RJDBC. Unfortunately, I have to use RJDBC as this is the only package that has no problem with the encoding of chinese, hebrew and so on letters (set names and so on don't seem to work with RODBC and RMySQL).
Has anybody experienced this problem?
I have run into the same frustrating issue. Sometimes the AS keyword would have its intended effect, but other times it wouldn't. I was unable to identify the conditions to make it work correctly.
Short Answer: (Thanks to Simon Urbanek (package maintainer for RJDBC), Yev, and Sebastien! See the Long Answer.) One thing that you may try is to open your JDBC connection using ?useOldAliasMetadataBehavior=true in your connection string. Example:
drv <- JDBC("com.mysql.jdbc.Driver", "C:/JDBC/mysql-connector-java-5.1.18-bin.jar", identifier.quote="`")
conn <- dbConnect(drv, "jdbc:mysql://server/schema?useOldAliasMetadataBehavior=true", "username", "password")
query <- "SELECT `a` AS `b` FROM table"
result <- dbGetQuery(conn, query)
dbDisconnect(conn)
This ended up working for me! See more details, including caveats, in the Long Answer.
Long Answer: I tried all sorts of stuff, including making views, changing queries, using JOIN statements, NOT using JOIN statements, using ORDER BY and GROUP BY statements, etc. I was never able to figure out why some of my queries were able to rename columns and others weren't.
I contacted the package maintainer (Simon Urbanek.) Here is what he said:
In the vast majority of cases this is an issue in the JBDC driver, because there is really not much RJDBC can do other than to call the driver.
He then recommended that I make sure I had the most recent JDBC driver for MySQL. I did have the most recent version. However, it got me thinking "maybe it IS a bug with the JDBC driver." So, I searched Google for: mysql jdbc driver bug alias.
The top result for this query was an entry at bugs.mysql.com. Yev, using MySQL 5.1.22, says that when he upgraded from driver version 5.0.4 to 5.1.5, his column aliases stopped working. Asked if it was a bug.
Sebastien replied, "No, it's not a bug! It's a documented change of behavior in all subsequent versions of the driver." and suggested using ?useOldAliasMetadataBehavior=true, citing documentation for the JDBC driver.
Caveat Lector: The documentation for the JDBC driver states that
useColumnNamesInFindColumn is preferred over useOldAliasMetadataBehavior unless you need the specific behavior that it provides with respect to ResultSetMetadata.
I haven't had the time to fully research what this means. In other words, I don't know what all of the ramifications are of using useOldAliasMetadataBehavior=true are. Use at your own risk. Does someone else have more information?
I don't know RJDBC, but in some cases when it is necessary to give permanent aliases to columns without renaming them, you can use VIEWs
CREATE OR REPLACE VIEW v_table AS
SELECT a AS b
FROM table
... and then ...
SELECT b FROM v_table
There is a separate function in the ResultSetMetaData interface for retrieving the column label vs the column name:
String getColumnLabel(int column) throws SQLException;
Gets the designated column's suggested title for use in printouts and
displays. The suggested title is usually specified by the SQL AS
clause. If a SQL AS is not specified, the value returned
fromgetColumnLabel will be the same as the value returned by the
getColumnName method.
Using getColumnLabel should resolve this issue (if not, check that your JDBC driver is following this spec).
e.g.
ResultSetMetaData rsmd = rs.getMetaData();
int columnCount = rsmd.getColumnCount();
while(rs.next()) {
for (int i = 1; i < columnCount + 1; i++) {
String label = rsmd.getColumnLabel(i);
System.out.println(rs.getString(label));
}
}
This is the work around we use for R and SAP HANA via RJDBC:
names(result)[1]<-"b"
It's not the nicest work around, but since Aaron's solution does work for us, we went with this "solution".