Preface:
My task is storing files on disk, where part of each file name is a timestamp. The paths to these files are stored in the DB. Multiple files may share a single owner entity (one message can contain multiple attachments).
To make things easier I want the file paths in the DB (where the timestamp column defaults to now()) and the files on disk to share the same timestamp.
Question:
So after an insert, I need to get back the default values that were generated (in most cases primary_key_id and created_datetime).
I tried:
db_session # Just for clarity
<sqlalchemy.orm.session.AsyncSession object at 0x7f836691db20>
str(statement) # Just for clarity. Don't know how to get back the original python (not an SQL) statement
'INSERT INTO users (phone_number, login, full_name, hashed_password, role) VALUES (:phone_number, :login, :full_name, :hashed_password, :role)'
query_result = await db_session.execute(statement=statement)
query_result.returned_defaults_rows # Primary_key, but no datetime
[(243,)]
query_result.returned_defaults # Primary_key, but no datetime
(243,)
query_result.fetchall()
[]
My tables:
Base = declarative_base() # Main class of ORM; Put in config by suggestion https://t.me/ru_python/1450665
claims = Table(  # TODO set constraints for status
    "claims",
    Base.metadata,
    Column("id", Integer, primary_key=True),
    # ... remaining columns, including created_datetime with a default of now(), omitted
)
My queries:
async def create_item(statement: Insert, db_session: AsyncSession, detail: str = '') -> dict:
    try:  # return default created values
        statement = statement.returning(statement.table.c.id, statement.table.c.created_datetime)
        return (await db_session.execute(statement=statement)).fetchone()._asdict()
    except sqlalchemy.exc.IntegrityError as error:
        # if psycopg2_errors.lookup(code=error.orig.pgcode) in (psycopg2_errors.UniqueViolation, psycopg2_errors.lookup):
        detail = error.orig.args[0].split('Key ')[-1].replace('(', '').replace(')', '').replace('"', '')
        raise HTTPException(status_code=422, detail=detail)
P.S. SQLAlchemy v1.4.
I was able to do this with session.bulk_save_objects(objects, return_defaults=True).
Docs on this method are here.
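A minimal sketch of how that call might look, assuming a User model mapped to the users table from the INSERT above and a regular (synchronous) Session for brevity; per the note above, return_defaults=True makes the generated values available on the objects afterwards:
# Hedged sketch; attribute names are taken from the INSERT statement shown earlier.
user = User(phone_number="+10000000000", login="demo", full_name="Demo User",
            hashed_password="...", role="user")
session.bulk_save_objects([user], return_defaults=True)
session.commit()
# Per the answer above, the generated defaults are now populated on the object:
print(user.id, user.created_datetime)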
Related
I'm considering porting my app to SQLAlchemy as it's much more extensive than my own ORM implementation, but all the examples I could find show how to set the schema name at class declaration rather than dynamically at runtime.
I need to map my objects to Postgres tables from multiple schemas. Moreover, the application creates new schemas at runtime, and I need to map new instances of the class to rows of the table from that new schema.
Currently, I use my own ORM module where I just provide the schema name as an argument when creating new instances of a class (I call a class method with the schema name as an argument and it returns an object, or objects, that hold the schema name). The class describes a table that can exist in many schemas. The class declaration doesn't contain schema information, but instances of that class do, and they include it when generating SQL statements.
This way, the application can work with many schemas simultaneously and can even create foreign keys from tables in "other" schemas to the "main" table in the public schema. That also makes it possible to cascade-delete data in other schemas when a row in the public schema is deleted.
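Purely for illustration, my current home-grown approach looks roughly like this (every name in this snippet is hypothetical and it is not SQLAlchemy):
# Hypothetical sketch of the home-grown ORM described above.
order = Order.in_schema("tenant_42").get(id=7)  # the instance carries its schema
print(order.schema)                             # "tenant_42"
order.save()  # renders "tenant_42"."orders" in the generated SQL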
The SQLAlchemy documentation gives this example for setting the schema on a table:
metadata_obj = MetaData(schema="remote_banks")

financial_info = Table(
    "financial_info",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("value", String(100), nullable=False),
)
But at the ORM level, when I declare the class, I have to pass an already constructed table (example from the documentation):
metadata = MetaData()

group_users = Table(
    "group_users",
    metadata,
    Column("user_id", String(40), nullable=False),
    Column("group_id", String(40), nullable=False),
    UniqueConstraint("user_id", "group_id"),
)


class Base(DeclarativeBase):
    pass


class GroupUsers(Base):
    __table__ = group_users
    __mapper_args__ = {"primary_key": [group_users.c.user_id, group_users.c.group_id]}
So, the question is: is it possible in SQLAlchemy to map class instances to tables/rows from database schemas that are created dynamically at runtime? Altering the connection to set the current schema is not acceptable to me; I want to work with all schemas simultaneously.
I'm free to use the newest SQLAlchemy 2.0 (currently in beta).
You can set the schema per table, so I think you have to make a table and class per schema. Here is a made-up example. I have no idea what the ramifications are of changing the mapper registry at runtime, especially mid-transaction as I have done below, or what would happen with thread safety. You could probably use a master schema-list table in public and lock it, or lock the same row across connections, to synchronize the schema list and provide thread safety when adding a schema. I'm surprised it works. Kind of cool.
import sys

from sqlalchemy import (
    create_engine,
    Integer,
    MetaData,
    Float,
    event,
)
from sqlalchemy.schema import (
    Column,
    CreateSchema,
    Table,
)
from sqlalchemy.orm import Session
from sqlalchemy.orm import registry

username, password, db = sys.argv[1:4]
engine = create_engine(f"postgresql+psycopg2://{username}:{password}@/{db}", echo=True)

metadata = MetaData()
mapper_registry = registry()


def map_class_to_some_table(cls, table, entity_name, **mapper_kwargs):
    newcls = type(entity_name, (cls,), {})
    mapper_registry.map_imperatively(newcls, table, **mapper_kwargs)
    return newcls


class Measurement(object):
    pass


units = []
cls_for_unit = {}
tbl_for_unit = {}


def add_unit(unit, create_bind=None):
    units.append(unit)
    schema_name = f"unit_{unit}"
    if create_bind:
        create_bind.execute(CreateSchema(schema_name))
    else:
        event.listen(metadata, "before_create", CreateSchema(schema_name))
    cols = [
        Column("id", Integer, primary_key=True),
        Column("value", Float, nullable=False),
    ]
    # One table per schema.
    tbl_for_unit[unit] = Table("measurement", metadata, *cols, schema=schema_name)
    if create_bind:
        tbl_for_unit[unit].create(create_bind)
    # One class per schema.
    cls_for_unit[unit] = map_class_to_some_table(
        Measurement, tbl_for_unit[unit], Measurement.__name__ + f"_{unit}"
    )


for unit in ["mm", "m"]:
    add_unit(unit)

metadata.create_all(engine)

with Session(engine) as session, session.begin():
    # Create a value for each unit (schema).
    session.add_all([cls(value=i) for i, cls in enumerate(cls_for_unit.values())])

with Session(engine) as session, session.begin():
    # Read back a value for each unit (schema).
    print(
        [
            (unit, cls.__name__, cls, session.query(cls).first().value)
            for (unit, cls) in cls_for_unit.items()
        ]
    )

with Session(engine) as session, session.begin():
    # Add another unit, add a value, flush and then read back.
    add_unit("km", create_bind=session.bind)
    session.add(cls_for_unit["km"](value=100.0))
    session.flush()
    print(session.query(cls_for_unit["km"]).first().value)
Output of the last add_unit():
2022-12-16 08:16:13,446 INFO sqlalchemy.engine.Engine CREATE SCHEMA unit_km
2022-12-16 08:16:13,446 INFO sqlalchemy.engine.Engine [no key 0.00015s] {}
2022-12-16 08:16:13,447 INFO sqlalchemy.engine.Engine COMMIT
2022-12-16 08:16:13,469 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-12-16 08:16:13,469 INFO sqlalchemy.engine.Engine
CREATE TABLE unit_km.measurement (
    id SERIAL NOT NULL,
    value FLOAT NOT NULL,
    PRIMARY KEY (id)
)
Ian Wilson posted a great answer to my question, which I'm going to use.
Around the same time I got an idea of how it can work, and I'd like to post it here as a very simple example. I think the same mechanism is behind Ian's answer.
This example only "reads" an object from a schema that can be referenced at runtime.
from sqlalchemy import create_engine, Column, Integer, String, MetaData
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import sessionmaker
import psycopg

engine = create_engine("postgresql+psycopg://user:password@localhost:5432/My_DB", echo=True)
Session = sessionmaker(bind=engine)
session = Session()


class Base(DeclarativeBase):
    pass


class A(object):
    __tablename__ = "my_table"

    id = Column("id", Integer, primary_key=True)
    name = Column("name", String)

    def __repr__(self):
        return f"A: {self.id}, {self.name}"


metadata_obj = MetaData(schema="my_schema")  # here we create a new mapping

A1 = type("A1", (A, Base), {"metadata": metadata_obj})  # here we make a new subclass with the desired mapping

data = session.query(A1).all()
print(data)
This info helped me to come to this solution:
https://github.com/sqlalchemy/sqlalchemy/wiki/EntityName
"... SQLAlchemy mapping makes modifications to the mapped class, so it's not really feasible to have many mappers against the exact same class ..."
This means a separate class must be created at runtime for each schema.
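Following the same mechanism, here is a small untested sketch of what "one class per schema" can look like in practice: cache the generated subclass per schema name so each mapping is only created once (the helper name is made up):
# Hypothetical helper built on the A/Base classes above; one mapped subclass
# per schema, created lazily and cached.
_cls_for_schema = {}

def class_for_schema(schema_name):
    if schema_name not in _cls_for_schema:
        md = MetaData(schema=schema_name)
        _cls_for_schema[schema_name] = type(f"A_{schema_name}", (A, Base), {"metadata": md})
    return _cls_for_schema[schema_name]

data = session.query(class_for_schema("my_schema")).all()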
I am trying to write a migration that updates the value of the has_bubble_in_countries column based on the value of the has_bubble_v1 column.
Before upgrade() I created the table:
subscription_old_table = sa.Table(
    'Subscription',
    sa.MetaData(),
    sa.Column('id', sa.Unicode(255), primary_key=True, unique=True, nullable=False),
    sa.Column('has_bubble_v1', sa.Boolean, nullable=False, default=False),
    sa.Column('has_bubble_in_countries', MutableList.as_mutable(ARRAY(sa.Enum(Country))), nullable=False, default=[], server_default='{}')
)
And then the upgrade() method looks like:
def upgrade():
    connection = op.get_bind()
    for subscription in connection.execute(subscription_old_table.select()):
        if subscription.has_bubble_v1:
            connection.execute(
                subscription_old_table.update().where(
                    subscription_old_table.c.id == subscription.id
                ).values(
                    has_bubble_in_countries=subscription.has_bubble_in_countries.append(Country.NL),
                )
            )

    # Then drop the column after the data has been migrated
    op.drop_column('Subscription', 'has_bubble_v1')
All the rows of the has_bubble_in_countries column have the value {} when I check the database through pgAdmin's interface.
When the upgrade() function gets to the update method it throws this error:
sqlalchemy.exc.IntegrityError: (psycopg2.errors.NotNullViolation) null value in column "has_bubble_in_countries" of relation "Subscription" violates not-null constraint
DETAIL: Failing row contains (keydsakwlkad, null, 2027-08-14 00:00:00+00,groot abonnement, big, {nl}, null, null, 2022-08-08 08:45:52.875931+00, 3482992, {}, f, null, null, null, t, 2011-05-23 08:55:20.538451+00, 2022-08-08 09:10:15.577283+00, ***null***).
[SQL: UPDATE "Subscription" SET has_bubble_in_countries=%(has_bubble_in_countries)s::country[] WHERE "Subscription".id = %(id_1)s]
[parameters: {'has_bubble_in_countries': None, 'id_1': '1pwohLmjftAZdIaJ'}]
The bolded value in the error (the null at the end of the failing row) is what is retrieved for the has_bubble_in_countries column, even though it has server_default='{}' and nullable=False.
Is there any way to configure Alembic to recognize the server default's value when it is retrieved from the database? Or how can this be fixed?
I think the problem is actually that you are passing in the result of .append(), which is None. Unlike some other languages where it is common to return the altered list, append changes the list in place and returns None. I'm not sure mutating a core query result here is a great idea, but it seems to work. Also, as far as I know, passing in NULL does not trigger the default; the default is only used when you pass no value at all, whether inserting or updating.
with Session(engine) as session, session.begin():
    for subscription in session.execute(subscription_old_table.select()):
        if subscription.has_bubble_v1:
            # Append here.
            subscription.has_bubble_in_countries.append(Country.NL)
            # Then set values:
            session.execute(
                subscription_old_table.update().where(
                    subscription_old_table.c.id == subscription.id
                ).values(
                    has_bubble_in_countries=subscription.has_bubble_in_countries,
                )
            )
Maybe cloning the list and then adding the element like this would be safer and clearer:
has_bubble_in_countries=subscription.has_bubble_in_countries[:] + [Country.NL]
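For reference, the behaviour behind the original error in isolation: list.append mutates the list in place and returns None, and that None is what ended up bound to the UPDATE parameter.
countries = ["DE"]
result = countries.append("NL")
print(result)     # None  <- this is what .values(has_bubble_in_countries=...) received
print(countries)  # ['DE', 'NL']  <- the list itself was mutated as expected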
I am using the SQLAlchemy ORM and have multiple users who are able to query my API. I keep all user engines stored separately and accessible only via JWT verification. I use their engines when booting up the API to build a dictionary of the following format:
{
    "user1": {
        "table1": {
            "column1": table1.c.column1
        }
    }
}
and repeat this for every user in my database who I want to be able to access the API. The code is as follows:
def build_translation_per_db(connections):
    metadata = MetaData()
    database_engines = {}
    translation_per_db = {}
    for connection in connections:
        db_tag = connection['db_tag']
        client = aws_connect_function()
        secret = function_get_secret()
        if type(secret) == list:
            for db in secret['aliases']:
                db_engine_object, db_connection_object = function_return_engine(secret=db, db_tag=db_tag)
                database_engines[db] = {
                    'engine': db_engine_object
                }
        else:
            db_engine_object, db_connection_object = function_return_engine(secret=secret, db_tag=db_tag)
            database_engines[connection['name']] = {
                'engine': db_engine_object
            }

    # at this point we should have all engines and connections
    for database in database_engines.keys():
        # name of the database
        table_translation, column_translation_from_table = build_table_translation(database_engines[database]['engine'], metadata)
        translation_per_db[database] = {
            'table_translation': table_translation,
            'column_translation_from_table': column_translation_from_table,
            'engine': database_engines[database]['engine']
        }
    return translation_per_db
def build_table_translation(db_engine_object, metadata):
    table_translation = {}
    tables = db_engine_object.table_names()
    # print(tables)
    for table in tables:
        table_translation[table] = Table(table, metadata, autoload=True, autoload_with=db_engine_object)
    column_translation_from_table = {}
    for table in table_translation.keys():
        column_translation_from_table[table] = {}
        for col in table_translation[table].c:
            column_translation_from_table[table][col.name] = col
    return table_translation, column_translation_from_table
where the metadata is built before all engines have been acquired. This was resulting in an error where all engines' Tables followed the first engine's schema (i.e. a column present in user2's table2 would not be picked up if user1's table2 did not have that column).
This problem was solved by building the metadata directly in build_table_translation rather than passing it in. While it is good that the bug is resolved, I don't understand why the bug was present in the first place - clearly I missed something in SQLAlchemy's docs on MetaData. I would appreciate an explanation!
From a comment to the question:
which user will the metadata act on as it is placed in my above code?
Since build_translation_per_db() does metadata = MetaData() and then passes that object to each invocation of build_table_translation(), all tables will share the same MetaData instance and that instance will contain table information for all users/engines:
from pprint import pprint
from sqlalchemy import Column, Integer, MetaData, Table


def build_translation_per_db(connections):
    # for demonstration purposes, connections is just a list of strings
    metadata = MetaData()
    return [build_table_translation(conn, metadata) for conn in connections]


def build_table_translation(db_engine_object, metadata):
    # for demonstration purposes, db_engine_object is just a string
    return Table(
        f"{db_engine_object}_table",
        metadata,
        Column("id", Integer, primary_key=True, autoincrement=False),
    )


conns = ["engine_1", "engine_2"]
table_1, table_2 = build_translation_per_db(conns)

# Do the tables share the same metadata object?
print(table_1.metadata == table_2.metadata)  # True

# What does it contain?
pprint(table_1.metadata.tables)
"""
{'engine_1_table': Table('engine_1_table', MetaData(), Column('id', Integer(), table=<engine_1_table>, primary_key=True, nullable=False), schema=None),
 'engine_2_table': Table('engine_2_table', MetaData(), Column('id', Integer(), table=<engine_2_table>, primary_key=True, nullable=False), schema=None)}
"""
If different users can have tables with the same name but different columns then those tables may represent the first user processed, or maybe the last user processed, or perhaps some crazy mish-mash of attributes, but in any case it's not something you want.
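For completeness, a minimal sketch of the fix the question already arrived at: create a fresh MetaData inside the helper so one database's reflected tables can never shadow another's. This keeps the 1.x-style table_names()/autoload calls from the question; treat it as a sketch, not a drop-in.
from sqlalchemy import MetaData, Table

def build_table_translation(db_engine_object):
    metadata = MetaData()  # fresh MetaData, private to this engine
    table_translation = {
        name: Table(name, metadata, autoload=True, autoload_with=db_engine_object)
        for name in db_engine_object.table_names()
    }
    column_translation_from_table = {
        name: {col.name: col for col in table.c}
        for name, table in table_translation.items()
    }
    return table_translation, column_translation_from_table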
This source details how to use association proxies to create views and objects with values of an ORM object.
However, when I append a value that matches an existing object in the database (and that value is either unique or a primary key), it creates a conflicting object, so I cannot commit.
So in my case, is this only useful as a view, and will I need to use ORM queries to retrieve the object to be appended?
Is this my only option, or can I use merge (I may only be able to do this if it's a primary key and not a unique constraint), OR set up the constructor so that it uses an existing object from the database if one exists instead of creating a new object?
For example from the docs:
user.keywords.append('cheese inspector')
# Is translated by the association proxy into the operation:
user.kw.append(Keyword('cheese inspector'))
But I'd like it to be translated to something more like this (of course the query could fail):
keyword = session.query(Keyword).filter(Keyword.keyword == 'cheese inspector').one()
user.kw.append(keyword)
OR ideally
user.kw.append(Keyword('cheese inspector'))
session.merge() # retrieves identical object from the database, or keeps new one
session.commit() # success!
I suppose this may not even be a good idea, but it could be in certain use cases :)
The example shown on the documentation page you link to is a composition type of relationship (in OOP terms) and as such represents an owns relationship rather than a uses one. Therefore each owner would have its own copy of the same (in terms of value) keyword.
In fact, you can use exactly the suggestion from the documentation you link to in your question: create a custom creator method and hack it to reuse an existing object for a given key instead of just creating a new one. In this case the sample code for the User class and the creator function would look like below:
def _keyword_find_or_create(kw):
    keyword = Keyword.query.filter_by(keyword=kw).first()
    if not keyword:
        keyword = Keyword(keyword=kw)
        # if autoflush=False is used in the session, then uncomment below
        # session.add(keyword)
        # session.flush()
    return keyword


class User(Base):
    __tablename__ = 'user'

    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    kw = relationship("Keyword", secondary=lambda: userkeywords_table)
    keywords = association_proxy('kw', 'keyword',
                                 creator=_keyword_find_or_create,  # note: this is the important part
                                 )
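With that creator in place, appending plain strings through the proxy should reuse a matching Keyword row when one exists; a hedged usage sketch, assuming the documented Keyword model and a session at hand:
user = User(name='log')
user.keywords.append('cheese inspector')   # reuses an existing Keyword row if present
user.keywords.append('brand new keyword')  # otherwise the creator makes a new one
session.add(user)
session.commit()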
I recently ran into the same problem. Mike Bayer, creator of SQLAlchemy, referred me to the "unique object" recipe but also showed me a variant that uses an event listener. The latter approach modifies the association proxy so that UserKeyword.keyword temporarily points to a plain string and only creates a new Keyword object if the keyword doesn't already exist.
from sqlalchemy import event, Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship, backref, Session

# Same User and Keyword classes from documentation


class UserKeyword(Base):
    __tablename__ = 'user_keywords'

    # Columns
    user_id = Column(Integer, ForeignKey(User.id), primary_key=True)
    keyword_id = Column(Integer, ForeignKey(Keyword.id), primary_key=True)
    special_key = Column(String(50))

    # Bidirectional attribute/collection of 'user'/'user_keywords'
    user = relationship(
        User,
        backref=backref(
            'user_keywords',
            cascade='all, delete-orphan'
        )
    )

    # Reference to the 'Keyword' object
    keyword = relationship(Keyword)

    def __init__(self, keyword=None, user=None, special_key=None):
        self._keyword_keyword = keyword  # temporary, will turn into a
                                         # Keyword when we attach to a
                                         # Session
        self.special_key = special_key

    @property
    def keyword_keyword(self):
        if self.keyword is not None:
            return self.keyword.keyword
        else:
            return self._keyword_keyword


@event.listens_for(Session, "after_attach")
def after_attach(session, instance):
    # when UserKeyword objects are attached to a Session, figure out what
    # Keyword in the database it should point to, or create a new one
    if isinstance(instance, UserKeyword):
        with session.no_autoflush:
            keyword = session.query(Keyword).\
                filter_by(keyword=instance._keyword_keyword).\
                first()
            if keyword is None:
                keyword = Keyword(keyword=instance._keyword_keyword)
            instance.keyword = keyword
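A hedged usage sketch for the listener variant (names follow the documentation's User/Keyword example; untested): attaching the object to the Session is what resolves the string into an existing or new Keyword.
session = Session(engine)
user = User(name='log')
user.keywords.append('cheese inspector')  # stored as a plain string for now
session.add(user)                         # after_attach resolves/creates the Keyword rows
session.commit()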
Good day everyone,
I have a file of strings corresponding to the fields of my SQLAlchemy object. Some fields are floats, some are ints, and some are strings.
I'd like to be able to coerce my string into the proper type by interrogating the column definition. Is this possible?
For instance:
class MyClass(Base):
    ...
    my_field = Column(Float)
It feels like one should be able to say something like MyClass.my_field.column.type and either ask the type to coerce the string directly, or write some conditionals and call int(x) or float(x) as needed.
I wondered whether this would happen automatically if all the values were strings, but I received Oracle errors because the type was incorrect.
Currently I naively coerce: if it's float()able, that's my value, otherwise it's a string. I trust that integral floats will become integers upon insertion because they are represented exactly, but the runtime value is wrong (e.g. 1.0 vs 1) and it just seems sloppy.
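For reference, that naive coercion looks roughly like this (a sketch, not my exact code):
def naive_coerce(value):
    # Use the float if the string parses as one, otherwise keep the string.
    try:
        return float(value)
    except ValueError:
        return value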
Thanks for your input!
SQLAlchemy 0.7.4
You can iterate over columns of the mapped Table:
for col in MyClass.__table__.columns:
    print(col, repr(col.type))
... so you can check the type of each field by its name like this:
def get_col_type(cls_, fld_):
    for col in cls_.__table__.columns:
        if col.name == fld_:
            return col.type  # this contains the instance of SA type

assert Float == type(get_col_type(MyClass, 'my_field'))
I would cache the results though if your file is large in order to save the for-loop on every row imported from the file.
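A hedged sketch of that caching idea: build a name-to-type lookup once and reuse it for every row, instead of looping over the columns each time (Float/Integer/String here are the SQLAlchemy types the model already uses):
from sqlalchemy import Float, Integer, String

# One pass over the table's columns, reused for every row of the file.
col_type_for = {col.name: type(col.type) for col in MyClass.__table__.columns}

coerce_for = {Float: float, Integer: int, String: str}  # extend as needed
value = coerce_for[col_type_for['my_field']]('3.14')    # -> 3.14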
Type coercion for sqlalchemy prior to committing to some database.
How can I verify Column data types in the SQLAlchemy ORM?
from sqlalchemy import (
    Column,
    Integer,
    String,
    DateTime,
)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import event
import datetime

Base = declarative_base()

type_coercion = {
    Integer: int,
    String: str,
    DateTime: datetime.datetime,
}


# this event is called whenever an attribute
# on a class is instrumented
@event.listens_for(Base, 'attribute_instrument')
def configure_listener(class_, key, inst):
    if not hasattr(inst.property, 'columns'):
        return

    # this event is called whenever a "set"
    # occurs on that instrumented attribute
    @event.listens_for(inst, "set", retval=True)
    def set_(instance, value, oldvalue, initiator):
        desired_type = type_coercion.get(inst.property.columns[0].type.__class__)
        coerced_value = desired_type(value)
        return coerced_value


class MyObject(Base):
    __tablename__ = 'mytable'

    id = Column(Integer, primary_key=True)
    svalue = Column(String)
    ivalue = Column(Integer)
    dvalue = Column(DateTime)


x = MyObject(svalue=50)
assert isinstance(x.svalue, str)
I'm not sure if I'm reading this question correctly, but I would do something like:
class MyClass(Base):
    some_float = Column(Float)
    some_string = Column(String)
    some_int = Column(Integer)
    ...

    def __init__(self, some_float, some_string, some_int, ...):
        if isinstance(some_float, float):
            self.some_float = some_float
        else:
            try:
                self.some_float = float(some_float)
            except ValueError:
                # do something intelligent
                pass
        if isinstance(some_string, str):
            ...
And I would repeat the checking process for each column. I wouldn't trust anything to do it "automatically". I also expect your file of strings to be well structured; otherwise something more complicated would have to be done.
Assuming your file is a CSV (I'm not good with file reads in Python, so treat this as a rough sketch):
import csv

with open('thisfile.csv', newline='') as f:
    for thisline in csv.reader(f):  # each row is an ordered list of strings
        thisthing = MyClass(some_float=thisline[0], some_string=thisline[1])  # ... and so on
        DBSession.add(thisthing)