SQL check if list parameter is null in different RDBMS - mysql

I am creating an app which performs raw queries across different databases and I am struggling with list parameters (IN).
I use SQLAlchemy for performing these queries.
I want to perform a query that accepts a list parameter. The parameter might be NULL, in which case the query should not filter by that field.
from sqlalchemy import create_engine, text
SQL = """SELECT group, count(1) cnt
FROM some_table
WHERE group IN :groups OR :groups IS NULL
GROUP BY group
"""
params = {'groups': ('group1', 'group2')}
engine = create_engine(connection_string)
query = text(SQL).bindparams(**params)
cursor = engine.execute(query)
Currently I'm testing it on PostgreSQL, MySQL and SQLite, but in production mode it is also supposed to work with SQL Server and Oracle.
The code above works only on PostgreSQL. However, if I change params to None
params = {'groups': None}
the code doesn't work on any of the databases.
Is there a workaround for this problem?
I understand that the solution might be specific to each RDBMS.
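One portable workaround is to decide in Python whether the filter belongs in the statement at all, rather than asking the database to test a list parameter for NULL. The sketch below is only an illustration under stated assumptions: the helper name `build_group_query` is hypothetical, the column is renamed to `grp` to avoid the reserved word `group`, and the demo runs on the standard-library `sqlite3` driver rather than SQLAlchemy.

```python
import sqlite3


def build_group_query(groups):
    """Return (sql, params), leaving out the filter when groups is None."""
    sql = "SELECT grp, count(1) AS cnt FROM some_table"
    params = {}
    if groups is not None:
        if groups:
            # One named placeholder per value: :g0, :g1, ...
            names = [f"g{i}" for i in range(len(groups))]
            params = dict(zip(names, groups))
            sql += " WHERE grp IN (" + ", ".join(":" + n for n in names) + ")"
        else:
            # An empty IN () list is a syntax error on most engines,
            # so an empty list is treated as "match nothing".
            sql += " WHERE 1 = 0"
    return sql + " GROUP BY grp ORDER BY grp", params


# Demo against an in-memory SQLite table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (grp TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?)",
    [("group1",), ("group1",), ("group2",), ("group3",)],
)

filtered = conn.execute(*build_group_query(("group1", "group2"))).fetchall()
unfiltered = conn.execute(*build_group_query(None)).fetchall()
```

Because the placeholders use the `:name` style, the returned string and dict could presumably also be handed to SQLAlchemy as `text(sql)` together with the parameter dict.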

Related

Django mysql count distinct gives different result to postgres

I'm trying to count distinct string values for a filtered set of results in a Django query against a MySQL database versus the same data in a Postgres database. However, I'm getting really confusing results.
In the code below, NewOrder represents queries against the same data in a Postgres database, and OldOrder is the same data in a MySQL instance.
( In the old database, completed orders had status=1, in the new DB complete status = 'Complete'. In both the 'email' field is the same )
OldOrder.objects.filter(status=1).count()
6751
NewOrder.objects.filter(status='Complete').count()
6751
OldOrder.objects.filter(status=1).values('email').distinct().count()
3747
NewOrder.objects.filter(status='Complete').values('email').distinct().count()
3825
print NewOrder.objects.filter(status='Complete').values('email').distinct().query
SELECT DISTINCT "order_order"."email" FROM "order_order" WHERE "order_order"."status" = Complete
print OldOrder.objects.filter(status=1).values('email').distinct().query
SELECT DISTINCT "order_order"."email" FROM "order_order" WHERE "order_order"."status" = 1
And here is where it gets really bizarre
new_orders = NewOrder.objects.filter(status='Complete').values_list('email', flat=True)
len(set(new_orders))
3825
old_orders = OldOrder.objects.filter(status=1).values_list('email',flat=True)
len(set(old_orders))
3825
Can anyone explain this discrepancy? And possibly point me as to why results would be different between postgres and mysql? My only guess is a character encoding issue, but I'd expect the results of the python set() to also be different?
Sounds like you're probably using a case-insensitive collation in MySQL. There's no equivalent in PostgreSQL; the closest is the citext data type, but usually you just compare lower(...) of strings, or use ILIKE for pattern matching.
I don't know how to say it in Django, but I'd see if the count of the set of distinct lowercased email addresses is the same as the old DB.
According to the Django docs something like this might work:
from django.db.models.functions import Lower
NewOrder.objects.filter(status='Complete').values(Lower('email')).distinct()
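The effect described in this answer can be reproduced without a database. A Python `set` compares strings exactly, much like a case-sensitive DISTINCT, while lowercasing first mimics what a case-insensitive collation would do. The sample addresses below are made up for illustration.

```python
# Hypothetical sample: two addresses differ only by letter case.
emails = ["ann@example.com", "Ann@Example.com", "bob@example.com"]

# Exact comparison, as a Python set or a case-sensitive DISTINCT would do.
exact_distinct = len(set(emails))

# Case-folded comparison, roughly what a case-insensitive collation yields.
folded_distinct = len({e.lower() for e in emails})
```

Here `exact_distinct` is 3 and `folded_distinct` is 2, which would explain why `len(set(...))` agrees across both databases while the MySQL DISTINCT count is lower.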

Retrieving ultimate sql query sentence (with the values in place of any '?')

Since it may be efficient to paste a flawed SQL query directly into a database administration tool such as phpMyAdmin and work on it until it returns the expected result, is there any way to retrieve the final SQL statement that SQLAlchemy Core passes to the MySQL database, in a ready-to-execute form?
This typically means that you want the bound parameters to be rendered inline. There is limited support for this automatically (as of SQLA 0.9 this will work):
from sqlalchemy.sql import table, column, select
t = table('x', column('a'), column('b'))
stmt = select([t.c.a, t.c.b]).where(t.c.a > 5).where(t.c.b == 10)
print(stmt.compile(compile_kwargs={"literal_binds": True}))
also you'd probably want the query to be MySQL specific, so if you already have an engine lying around you can pass that in too:
from sqlalchemy import create_engine
engine = create_engine("mysql://")
print(stmt.compile(engine, compile_kwargs={"literal_binds": True}))
and it prints:
SELECT x.a, x.b
FROM x
WHERE x.a > 5 AND x.b = 10
now, if you have more elaborate values in the parameters, like dates, SQLAlchemy might throw an error; it only has "literal binds" renderers for a very limited number of types. An approach that bypasses that system and gives you a pretty direct shot at turning those parameters into strings is to do a "search and replace" on the statement object, replacing the bound parameters with literal strings:
from sqlalchemy.sql import visitors, literal_column
from sqlalchemy.sql.expression import BindParameter
def _replace(arg):
    if isinstance(arg, BindParameter):
        return literal_column(
            repr(arg.effective_value)  # <- do any fancier conversion here
        )
stmt = visitors.replacement_traverse(stmt, {}, _replace)
once you do that you can just print it:
print(stmt)
or the MySQL version:
print(stmt.compile(engine))
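When the statement is already a plain string with `?` placeholders, the same "search and replace" idea can be sketched by hand. This is a debugging aid only, not SQLAlchemy's renderer: the helper names `sql_literal` and `render_inline` and the column `x.c` are hypothetical, the quoting rules are simplified (MySQL backslash escapes are ignored), and a `?` inside a string literal in the SQL would be mishandled.

```python
import datetime


def sql_literal(value):
    """Render one Python value as a SQL literal (debugging aid only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, datetime.date):  # also covers datetime.datetime
        return "'" + str(value) + "'"
    # Strings: double any embedded single quote.
    return "'" + str(value).replace("'", "''") + "'"


def render_inline(sql, params):
    """Replace each '?' in sql with the matching literal from params."""
    parts = sql.split("?")
    if len(parts) - 1 != len(params):
        raise ValueError("placeholder count does not match parameter count")
    out = [parts[0]]
    for value, tail in zip(params, parts[1:]):
        out.append(sql_literal(value))
        out.append(tail)
    return "".join(out)


rendered = render_inline(
    "SELECT x.a, x.b FROM x WHERE x.a > ? AND x.b = ? AND x.c = ?",
    [5, "O'Brien", datetime.date(2020, 1, 2)],
)
```

The result is meant for pasting into an admin tool while debugging. Executing strings built this way with untrusted input would reintroduce the injection risk that bound parameters avoid.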

SQLAlchemy query to return only n results?

I have been googling and reading through the SQLAlchemy documentation but haven't found what I am looking for.
I am looking for a function in SQLAlchemy that limits the number of results returned by a query to a certain number, for example: 5? Something like first() or all().
for sqlalchemy >= 1.0.13
Use the limit method.
query(Model).filter(something).limit(5).all()
Alternative syntax (slicing already returns a list, so no .all() is needed)
query(Model).filter(something)[:5]
If you need it for pagination you can do like this:
query = db.session.query(Table1, Table2, ...).filter(...)
if page_size is not None:
    query = query.limit(page_size)
if page is not None:
    query = query.offset(page*page_size)
query = query.all()
Or, if you query a single table through a Flask-SQLAlchemy model, you can use paginate:
query = (Model.query.filter(...)
         .paginate(page=start, per_page=size))
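The `limit`/`offset` arithmetic used for pagination earlier in this answer can be checked in isolation. The sketch below assumes zero-based page numbers and uses the standard-library `sqlite3` driver with a hypothetical `items` table, so it shows only the SQL that such a query boils down to, not SQLAlchemy itself.

```python
import sqlite3


def fetch_page(conn, page, page_size):
    """Return one zero-based page of ids using LIMIT/OFFSET."""
    offset = page * page_size
    rows = conn.execute(
        "SELECT id FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [r[0] for r in rows]


# Hypothetical table with ids 1..11.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 12)])

first_page = fetch_page(conn, 0, 5)
last_page = fetch_page(conn, 2, 5)
```

The ORDER BY matters: without it, the database is free to return rows in any order, and pages may overlap or skip rows.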
Since v1.4, SQLAlchemy core's select function provides a fetch method for RDBMS that support FETCH clauses*. FETCH was defined in the SQL 2008 standard to provide a consistent way to request a partial result, as LIMIT/OFFSET is not standard.
Example:
# As with limit queries, it's usually sensible to order
# the results to ensure results are consistent.
q = select(tbl).order_by(tbl.c.id).fetch(10)
# Offset is supported, but it is inefficient for large resultsets.
q_with_offset = select(tbl).order_by(tbl.c.id).offset(10).fetch(10)
# A suitable where clause may be more efficient
q = (select(tbl)
     .where(tbl.c.id > max_id_from_previous_query)
     .order_by(tbl.c.id)
     .fetch(10)
)
The syntax is supported in the ORM layer since v1.4.38. It is only supported for 2.0-style select on models; the legacy session.query syntax does not support it.
q = select(Model).order_by(Model.id).fetch(10)
* Currently Oracle, PostgreSQL and MSSQL.
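The "suitable where clause" variant is often called keyset pagination: each request filters on the last id seen instead of skipping rows with an offset. A minimal sketch follows, assuming a hypothetical `items` table and the standard-library `sqlite3` driver. SQLite is used only so the example is runnable, and the query is written with LIMIT instead of FETCH.

```python
import sqlite3


def fetch_after(conn, last_id, page_size):
    """Return the next batch of ids strictly greater than last_id."""
    rows = conn.execute(
        "SELECT id FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()
    return [r[0] for r in rows]


# Hypothetical table with ids 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 11)])

# Walk the whole table in batches of four.
pages = []
last_id = 0
while True:
    batch = fetch_after(conn, last_id, 4)
    if not batch:
        break
    pages.append(batch)
    last_id = batch[-1]
```

This relies on `id` being unique and ordered. With a non-unique sort key, the filter would need a tie-breaker column.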
In my case it works like
def get_members():
    m = Member.query[:30]
    return m

MySQL Not Operator

I'm in the process of migrating a locally hosted MySQL database over to a cloud-based MySQL database using Xeround. I'm running a test script that uses a left join to form a table and then runs two select statements: one where the VAL and KVAL fields are equal, and one that returns the complement of this set (where the VAL and KVAL fields are not equal).
I'm having no problems getting the records where VAL and KVAL match using (VAL=KVAL) as a where statement; this works in both setups. I'm able to get the complement in my local setup using the where statement: VAL!=KVAL OR (KVAL IS NULL).
However, when I run this same Select/Where statement in my Xeround setup it returns a NULL set. The Xeround database uses phpMyAdmin, if that is helpful. I've also played around with <>, and with placing an exclamation mark or a NOT outside of the original where statement. This should be fairly straightforward. Can you help me out?
The complement condition of
WHERE ( val = kval )
is:
WHERE ( val <> kval OR val IS NULL OR kval IS NULL )
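The extra IS NULL terms are needed because a comparison with NULL evaluates to NULL, not true or false, so such rows satisfy neither `val = kval` nor `val <> kval`. The following sketch checks this on a small made-up table using the standard-library `sqlite3` driver. The table name `t` and the helper `count_where` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val INTEGER, kval INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (1, 2), (1, None), (None, 2), (None, None)],
)


def count_where(condition):
    """Count rows of t matching a trusted, hard-coded condition."""
    return conn.execute(f"SELECT count(*) FROM t WHERE {condition}").fetchone()[0]


total = count_where("1 = 1")
equal = count_where("val = kval")
naive = count_where("val <> kval")  # misses every row that contains a NULL
complement = count_where("val <> kval OR val IS NULL OR kval IS NULL")
```

Only with the full complement condition do the two result sets add up to the whole table (`equal + complement == total`).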

How best to retrieve result of SELECT COUNT(*) from SQL query in Java/JDBC - Long? BigInteger?

I'm using Hibernate but doing a simple SQLQuery, so I think this boils down to a basic JDBC question. My production app runs on MySQL but my test cases use an in memory HSQLDB. I find that a SELECT COUNT operation returns BigInteger from MySQL but Long from HSQLDB.
MySQL 5.5.22
HSQLDB 2.2.5
The code I've come up with is:
SQLQuery tq = session.createSQLQuery(
"SELECT COUNT(*) AS count FROM calendar_month WHERE date = :date");
tq.setDate("date", eachDate);
Object countobj = tq.list().get(0);
int count = (countobj instanceof BigInteger) ?
((BigInteger)countobj).intValue() : ((Long)countobj).intValue();
This problem of the return type negates answers to other SO questions such as getting count(*) using createSQLQuery in hibernate? where the advice is to use setResultTransformer to map the return value into a bean. The bean must have a type of either BigInteger or Long, and fails if the type is not correct.
I'm reluctant to use a cast operator on the 'COUNT(*) AS count' portion of my SQL for fear of database interoperability. I realise I'm already using createSQLQuery so I'm already stepping outside the bounds of Hibernate's attempts at database neutrality, but having had trouble before with the differences between MySQL and HSQLDB in terms of database constraints
Any advice?
I don't know a clear solution for this problem, but I suggest using the H2 database for your tests.
H2 has compatibility modes that emulate several other databases.
For example, to use MySQL mode, connect with the URL jdbc:h2:~/test;MODE=MySQL.
You can downcast to Number and then call the intValue() method. E.g.
SQLQuery tq = session.createSQLQuery("SELECT COUNT(*) AS count FROM calendar_month WHERE date = :date");
tq.setDate("date", eachDate);
Object countobj = tq.list().get(0);
int count = ((Number) countobj).intValue();
Two ideas:
You can get the result value as a String and then parse it to Long or BigInteger.
Do not use COUNT(*) AS count FROM ...; use something like COUNT(*) AS cnt ... instead. In your example code, though, you read the result column by its index rather than by name, so you can simply use COUNT(*) FROM ...