Get table columns in SQLAlchemy (1.0.4)

I've realized that in the newest version of SQLAlchemy (v1.0.4) I'm getting warnings when using table.c.keys() to select columns.
from sqlalchemy import MetaData, select
from sqlalchemy import (Column, Integer, Table, String, PrimaryKeyConstraint)

metadata = MetaData()
table = Table('test', metadata,
              Column('id', Integer, nullable=False),
              Column('name', String(20)),
              PrimaryKeyConstraint('id')
              )
stmt = select(table.c.keys()).select_from(table).where(table.c.id == 1)
In previous versions this used to work fine, but now it raises the following warnings:
sqlalchemy/sql/elements.py:3851: SAWarning: Textual column expression 'id' should be explicitly declared with text('id'), or use column('id') for more specificity.
sqlalchemy/sql/elements.py:3851: SAWarning: Textual column expression 'name' should be explicitly declared with text('name'), or use column('name') for more specificity.
Is there a function for retrieving all these table columns rather than using a list comprehension like the following? [text(x) for x in table.c.keys()]

No, but you can always roll your own.
from sqlalchemy import text

def all_columns(model_or_table, wrap=text):
    table = getattr(model_or_table, '__table__', model_or_table)
    return [wrap(col) for col in table.c.keys()]
then you would use it like
stmt = select(all_columns(table)).where(table.c.id == 1)
or
stmt = select(all_columns(Model)).where(Model.id == 1)
Note that in most cases you don't need select_from(), i.e. when you aren't actually joining to some other table.
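To see that in action, here's a self-contained sketch combining the question's table with the helper above (1.0-style select(), with only the text import added):
from sqlalchemy import (Column, Integer, MetaData, PrimaryKeyConstraint,
                        String, Table, select, text)

metadata = MetaData()
table = Table('test', metadata,
              Column('id', Integer, nullable=False),
              Column('name', String(20)),
              PrimaryKeyConstraint('id'))

def all_columns(model_or_table, wrap=text):
    table = getattr(model_or_table, '__table__', model_or_table)
    return [wrap(col) for col in table.c.keys()]

# No SAWarning here: each column name is wrapped in text().
# Renders roughly: SELECT id, name FROM test WHERE test.id = :id_1
# (the FROM is inferred from the where clause, so no select_from() is needed)
stmt = select(all_columns(table)).where(table.c.id == 1)
print(stmt)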


How to get the vendor type for a SQLAlchemy generic type without creating a table?

Using the code shown below I can obtain the vendor type that corresponds to the SQLAlchemy generic type. In this case it is "VARCHAR(10)". How can I get the vendor type without creating a table?
from sqlalchemy import create_engine, MetaData, Table, Column, types

engine = create_engine(DB_URL)
metadata_obj = MetaData()
table = Table('Table', metadata_obj,
              Column('Column', types.String(10))
              )
metadata_obj.create_all(bind=engine)

metadata_obj = MetaData()
metadata_obj.reflect(bind=engine)
print(metadata_obj.tables['Table'].columns[0].type)
You can't obtain the type directly, but you could use a mock_engine to generate the DDL as a string which can be parsed. A mock_engine must be coupled with a callable that will process the SQL expression object that it generates.
This snippet is based on the example code from the SQLAlchemy docs.
import sqlalchemy as sa

tbl = sa.Table('Table', sa.MetaData(), sa.Column('Column', sa.String(10)))

def dump(sql, *multiparams, **params):
    print(sql.compile(dialect=mock_engine.dialect))

mock_engine = sa.create_mock_engine('postgresql://', executor=dump)
tbl.create(mock_engine)
Outputs
CREATE TABLE "Table" (
    "Column" VARCHAR(10)
)
sqlalchemy.schema.CreateTable could also be used, but binding it to an engine is deprecated and will be removed in SQLAlchemy 2.0.
from sqlalchemy.schema import CreateTable

print(CreateTable(tbl, bind=some_engine))
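Alternatively, the CreateTable construct can be compiled against a dialect directly, which sidesteps the deprecated bind argument; a minimal sketch:
from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import CreateTable

# Compile the DDL construct against a dialect; no engine or bind needed.
print(CreateTable(tbl).compile(dialect=postgresql.dialect()))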

Django / PostgreSQL jsonb (JSONField) - convert select and update into one query

Versions: Django 1.10 and Postgres 9.6
I'm trying to modify a nested JSONField's key in place without a roundtrip to Python. The reason is to avoid race conditions and multiple queries overwriting the same field with different updates.
I tried to chain the methods in the hope that Django would make a single query, but it's being logged as two:
Original field value (demo only, real data is more complex):
from exampleapp.models import AdhocTask
record = AdhocTask.objects.get(id=1)
print(record.log)
> {'demo_key': 'original'}
Query:
from django.db.models import F
from django.db.models.expressions import RawSQL
(AdhocTask.objects.filter(id=25)
    .annotate(temp=RawSQL(
        # `jsonb_set` takes the current json value of the `log` field,
        # finds the nominated key ("demo_key" in this example)
        # and replaces its value with the json provided ("new value").
        # The raw SQL is wrapped in triple quotes to avoid escaping each quote.
        """jsonb_set(log, '{"demo_key"}','"new value"', false)""", []))
    # Finally, take the temp field and overwrite the original JSONField
    .update(log=F('temp'))
)
Query history (shows this as two separate queries):
from django.db import connection
print(connection.queries)
> {'sql': 'SELECT "exampleapp_adhoctask"."id", "exampleapp_adhoctask"."description", "exampleapp_adhoctask"."log" FROM "exampleapp_adhoctask" WHERE "exampleapp_adhoctask"."id" = 1', 'time': '0.001'},
> {'sql': 'UPDATE "exampleapp_adhoctask" SET "log" = (jsonb_set(log, \'{"demo_key"}\',\'"new value"\', false)) WHERE "exampleapp_adhoctask"."id" = 1', 'time': '0.001'}]
It would be much nicer without RawSQL.
Here's how to do it:
from django.db.models.expressions import Func

class ReplaceValue(Func):
    function = 'jsonb_set'
    template = "%(function)s(%(expressions)s, '{\"%(keyname)s\"}','\"%(new_value)s\"', %(create_missing)s)"
    arity = 1

    def __init__(
        self, expression: str, keyname: str, new_value: str,
        create_missing: bool=False, **extra,
    ):
        super().__init__(
            expression,
            keyname=keyname,
            new_value=new_value,
            create_missing='true' if create_missing else 'false',
            **extra,
        )
AdhocTask.objects.filter(id=25) \
    .update(log=ReplaceValue(
        'log',
        keyname='demo_key',
        new_value='another value',
        create_missing=False,
    ))
ReplaceValue.template is the same as your raw SQL statement, just parametrized.
(jsonb_set(log, \'{"demo_key"}\',\'"another value"\', false)) from your query is now jsonb_set("exampleapp_adhoctask"."log", \'{"demo_key"}\',\'"another value"\', false). The parentheses are gone (you can get them back by adding them to the template) and log is referenced in a different way.
Anyone interested in more details regarding jsonb_set should have a look at table 9-45 in postgres' documentation: https://www.postgresql.org/docs/9.6/static/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE
Rubber duck debugging at its best - in writing the question, I've realised the solution. Leaving the answer here in hope of helping someone in future:
Looking at the queries, I realised that the RawSQL was actually being deferred until query two, so all I was doing was storing the RawSQL as a subquery for later execution.
Solution:
Skip the annotate step altogether and pass the RawSQL expression straight into the .update() call. This allows you to dynamically update PostgreSQL jsonb sub-keys on the database server without overwriting the whole field:
(AdhocTask.objects.filter(id=25)
    .update(log=RawSQL(
        """jsonb_set(log, '{"demo_key"}','"another value"', false)""", [])
    )
)
> 1 # Success
print(connection.queries)
> {'sql': 'UPDATE "exampleapp_adhoctask" SET "log" = (jsonb_set(log, \'{"demo_key"}\',\'"another value"\', false)) WHERE "exampleapp_adhoctask"."id" = 1', 'time': '0.001'}]
print(AdhocTask.objects.get(id=1).log)
> {'demo_key': 'another value'}

Define a relationship() that is only true sometimes

I'm working with a database schema that has a relationship that isn't always true, and I'm not sure how to describe it with sqlalchemy's ORM.
All the primary keys in this database are stored as a blob type, and are 16 byte binary strings.
I have a table called attribute, and this table has a column called data_type. There are a number of built-in data_types that are not defined explicitly in the database. So maybe a data_type of 00 means it is a string, and 01 means it is a float, etc. (those are hex values). The highest value for the built-in data types is 12 (18 in decimal).
However, for some rows in attribute, the value of the attribute stored in the row must exist in a pre-defined list of values. In this case, data_type refers to lookup.lookup_id. The actual data type for the attribute can then be retrieved from lookup.data_type.
I'd like to be able to call just Attribute.data_type and get back 'string' or 'number'. Obviously I'd need to define the {0x00: 'string', 0x01: 'number'} mapping somewhere, but how can I tell sqlalchemy that I want lookup.data_type if the value of attribute.data_type is greater than 18?
There are a couple of ways to do this.
The simplest, by far, is to just put your predefined data types into the table lookup. You say that you "need to define the... mapping somewhere", and a table is as good a place as any.
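As a rough idea of what that could look like, here's a hedged sketch of a one-off seeding step, assuming a lookup Table object with the lookup_id and data_type columns described in the question, and assuming lookup_id can also hold the small built-in codes:
# Hypothetical one-off seeding of the built-in codes into the lookup table,
# so that every attribute.data_type value resolves through the same join.
builtin_types = {0x00: 'string', 0x01: 'number'}  # ... up to 0x12

with engine.begin() as conn:
    for code, name in builtin_types.items():
        conn.execute(lookup.insert().values(lookup_id=code, data_type=name))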
Assuming that you can't do that, the next simplest thing is to create a python property on class Attribute. The only problem will be that you can't query against it. You'll want to reassign the column data_type so that it maps to _data_type:
data_type_dict = {0x00: 'string',
                  0x01: 'number',
                  ...}
class Attribute(Base):
    __tablename__ = 'attribute'

    _data_type = Column('data_type')
    ...

    @property
    def data_type(self):
        dt = data_type_dict.get(self._data_type, None)
        if dt is None:
            s = Session.object_session(self)
            lookup = s.query(Lookup).filter_by(id=self._data_type).one()
            dt = lookup.data_type
        return dt
If you want this to be queryable, that is, if you want to be able to do session.query(Attribute).filter_by(data_type='string'), you need to map data_type to something the database can handle, i.e., an SQL statement. You could do this in raw SQL as a CASE expression:
from sqlalchemy.sql.expression import select, case

class Attribute(Base):
    ...
    data_type = column_property(select([attribute, lookup])
        .where(attribute.data_type == lookup.lookup_id)
        .where(case([(attribute.data_type == 0x00, 'string'),
                     (attribute.data_type == 0x01, 'number'),
                     ...],
                    else_=lookup.data_type)))
I'm not 100% certain that last part will work; you may need to explicitly join the tables attribute and lookup to specify that it's an outer join, though I think SQLAlchemy does that by default. The downside of this approach is that you are always going to try to join with the table lookup, though to query using SQL, you sort of have to do that.
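If you do end up needing the explicit join, here's a hedged sketch of a standalone query version, assuming attribute and lookup are Table objects (attribute_id is a purely illustrative primary key column name):
from sqlalchemy import case, select

# Standalone query with an explicit LEFT OUTER JOIN, useful for checking the
# CASE logic before wiring it into a column_property.
stmt = (
    select([attribute.c.attribute_id,
            case([(attribute.c.data_type == 0x00, 'string'),
                  (attribute.c.data_type == 0x01, 'number')],
                 else_=lookup.c.data_type).label('data_type')])
    .select_from(attribute.outerjoin(
        lookup, attribute.c.data_type == lookup.c.lookup_id))
)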
The final option is to use polymorphism, and map the two cases (data_type greater/less than 18) to two different subclasses:
class Attribute(Base):
    __tablename__ = 'attribute'

    _data_type = Column('data_type')
    _lookup = column_property(attribute.data_type > 18)

    __mapper_args__ = {'polymorphic_on': _lookup}

class FixedAttribute(Attribute):
    __mapper_args__ = {'polymorphic_identity': 0}

    data_type = column_property(select([attribute.data_type])
        .where(case([(attribute.data_type == 0x00, 'string'),
                     (attribute.data_type == 0x01, 'number'),
                     ...])))

class LookupAttribute(Attribute):
    __mapper_args__ = {'polymorphic_identity': 1}

    data_type = column_property(select([lookup.data_type],
        whereclause=attribute.data_type == lookup.lookup_id))
You might have to replace the 'polymorphic_on': _lookup with an explicit attribute.data_type > 18, depending on when that ColumnProperty gets bound.
As you can see, these are all really messy. Do #1 if it's at all possible.

Unique sequential number for a column

I need to create a sequence, but in a generic way, without using the Sequence class.
USN = Column(Integer, nullable=False, default=nextusn, server_onupdate=nextusn)
This nextusn function needs to generate the func.max(table.USN) value from the rows in the model.
I tried using this:
from sqlalchemy import Numeric, func, select
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql import expression

class nextusn(expression.FunctionElement):
    type = Numeric()
    name = 'nextusn'

@compiles(nextusn)
def default_nextusn(element, compiler, **kw):
    return select(func.max(element.table.c.USN)).first()[0] + 1
but in this context, element does not know element.table. Is there a way to resolve this?
This is a little tricky, for these reasons:
Your SELECT MAX() will return NULL if the table is empty; you should use COALESCE to produce a default "seed" value. See below.
The whole approach of inserting rows with SELECT MAX is entirely not safe for concurrent use - so you need to make sure only one INSERT statement at a time runs against the table, or you may get constraint violations (you should definitely have a constraint of some kind on this column).
From the SQLAlchemy perspective, you need your custom element to be aware of the actual Column element. We can achieve this either by assigning the "nextusn()" function to the Column after the fact, or, as I'll show below, with a more sophisticated approach using events.
I don't understand what you're going for with "server_onupdate=nextusn". "server_onupdate" in SQLAlchemy doesn't actually run any SQL for you; it's a placeholder, for example if you created a trigger. Also, the "SELECT MAX(id) FROM table" thing is an INSERT pattern, so I'm not sure that you mean for anything to be happening here on an UPDATE.
The @compiles extension needs to return a string, running the select() there through compiler.process(). See below.
example:
from sqlalchemy import Column, Integer, create_engine, select, func, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql.expression import ColumnElement
from sqlalchemy.schema import ColumnDefault
from sqlalchemy.ext.compiler import compiles
from sqlalchemy import event

class nextusn_default(ColumnDefault):
    "Container for a nextusn() element."

    def __init__(self):
        super(nextusn_default, self).__init__(None)

@event.listens_for(nextusn_default, "after_parent_attach")
def set_nextusn_parent(default_element, parent_column):
    """Listen for when nextusn_default() is associated with a Column,
    assign a nextusn().
    """
    assert isinstance(parent_column, Column)
    default_element.arg = nextusn(parent_column)

class nextusn(ColumnElement):
    """Represent "SELECT MAX(col) + 1 FROM TABLE".
    """

    def __init__(self, column):
        self.column = column

@compiles(nextusn)
def compile_nextusn(element, compiler, **kw):
    return compiler.process(
        select([
            func.coalesce(func.max(element.column), 0) + 1
        ]).as_scalar()
    )
Base = declarative_base()

class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, default=nextusn_default(), primary_key=True)
    data = Column(String)

e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)

# will normally pre-execute the default so that we know the PK value;
# result.inserted_primary_key will be available
e.execute(A.__table__.insert(), data='single row')

# will run the default expression inline within the INSERT
e.execute(A.__table__.insert(), [{"data": "multirow1"}, {"data": "multirow2"}])

# will also run the default expression inline within the INSERT;
# result.inserted_primary_key will not be available
e.execute(A.__table__.insert(inline=True), data='single inline row')

MySQL Dynamic Query Statement in Python with Dictionary

Very similar to this question MySQL Dynamic Query Statement in Python
However, what I am looking to do, instead of using two lists, is to use a dictionary.
Let's say I have this dictionary:
instance_insert = {
    # sql column      variable value
    'instance_id': 'instance.id',
    'customer_id': 'customer.id',
    'os': 'instance.platform',
}
And I want to populate a MySQL database with an INSERT statement, using each sql column entry as the column name and the corresponding variable name as the variable holding the value to be inserted into the table.
I'm kind of lost because I don't understand exactly what this statement does; it was pulled from the question linked above, where two lists were used to do what I want:
sql = "INSERT INTO instance_info_test VALUES (%s);" % ', '.join('?' for _ in instance_insert)
cur.execute(sql, instance_insert)
Also, I would like it to be dynamic in the sense that I can add/remove columns from the dictionary.
Before you post, you might want to try searching for something more specific to your question. For instance, when I Googled "python mysqldb insert dictionary", I found a good answer on the first page, at http://mail.python.org/pipermail/tutor/2010-December/080701.html. Relevant part:
Here's what I came up with when I tried to make a generalized version of the above:
import json
import sys

import MySQLdb

def add_row(cursor, tablename, rowdict):
    # XXX tablename not sanitized
    # XXX test for allowed keys is case-sensitive
    # filter out keys that are not column names
    cursor.execute("describe %s" % tablename)
    allowed_keys = set(row[0] for row in cursor.fetchall())
    keys = allowed_keys.intersection(rowdict)

    if len(rowdict) > len(keys):
        unknown_keys = set(rowdict) - allowed_keys
        print >> sys.stderr, "skipping keys:", ", ".join(unknown_keys)

    columns = ", ".join(keys)
    values_template = ", ".join(["%s"] * len(keys))

    sql = "insert into %s (%s) values (%s)" % (
        tablename, columns, values_template)
    values = tuple(rowdict[key] for key in keys)
    cursor.execute(sql, values)

filename = ...
tablename = ...
db = MySQLdb.connect(...)
cursor = db.cursor()
with open(filename) as instream:
    row = json.load(instream)
add_row(cursor, tablename, row)
Peter
If you know your inputs will always be valid (the table name is valid, the columns are present in the table), and you're not importing from a JSON file as the example does, you can simplify this function. But it'll accomplish what you want to accomplish. While it may initially seem like DictCursor would be helpful, it looks like DictCursor is useful for returning a dictionary of values, but it can't execute from a dict.
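For instance, a trimmed-down sketch under the assumption that the table name is trusted and every dictionary key is a real column (so the describe/intersection guard can be dropped):
def add_row(cursor, tablename, rowdict):
    # Assumes tablename is trusted, every key in rowdict is a real column,
    # and rowdict maps column names to the actual values to insert.
    columns = ", ".join(rowdict)
    values_template = ", ".join(["%s"] * len(rowdict))
    sql = "insert into %s (%s) values (%s)" % (tablename, columns, values_template)
    cursor.execute(sql, tuple(rowdict.values()))

add_row(cursor, "instance_info_test", instance_insert)  # once the values are filled in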