How to change the format of SQLAlchemy engine 'echo' messages? - sqlalchemy

IIUC, the format is set in log.py (lines 33..38 in 1.4.7):
def _add_default_handler(logger):
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
But all my attempts to set the format to a bare '%(message)s' have failed. For example,
logging.StreamHandler(sys.stdout).setFormatter('%(message)s')
has no effect.
When developing a program with SQLAlchemy, I want to see only the queries executed by the engine; the extra fields asctime, levelname and name are distracting.
The SQLAlchemy documentation has an entire subchapter about engine logging, but it covers only levels and says nothing about formatting. On the other hand, I guess changing the format should be possible, because in the SQLAlchemy tutorial the logging messages are presented just the way I would like them.

First, some remarks.
When you write
logging.StreamHandler(sys.stdout)
you're calling a constructor, i.e. you get a new instance of the
logging.StreamHandler class, not the handler that sqla's log module
created in _add_default_handler().
Modifying it has no effect unless you add this handler to an active logger.
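To illustrate the remark above, here is a minimal sketch that uses only the standard library. The logger name demo.engine and the in-memory streams are assumptions made for the demonstration; they are not part of SQLAlchemy. Note also that setFormatter() expects a logging.Formatter instance rather than a bare format string.

```python
import io
import logging

# Hypothetical logger name, used only for this illustration.
demo_logger = logging.getLogger("demo.engine")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the root logger out of this demo

# A handler that is created but never attached receives nothing.
detached_stream = io.StringIO()
detached = logging.StreamHandler(detached_stream)
detached.setFormatter(logging.Formatter("%(message)s"))

# A handler that is attached to the logger receives every record.
attached_stream = io.StringIO()
attached = logging.StreamHandler(attached_stream)
attached.setFormatter(logging.Formatter("%(message)s"))
demo_logger.addHandler(attached)

demo_logger.info("SELECT 1")

detached_output = detached_stream.getvalue()
attached_output = attached_stream.getvalue()
```

Only the attached handler sees the message, and it is formatted with the bare '%(message)s' pattern.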
If you carefully read the docs page you mentioned (https://docs.sqlalchemy.org/en/14/core/engines.html#configuring-logging), you'll find some hints:
It’s important to note that these two flags work independently of
any existing logging configuration, and will make use of
logging.basicConfig() unconditionally. This has the effect of being
configured in addition to any existing logger configurations.
Therefore, when configuring logging explicitly, ensure all echo flags
are set to False at all times, to avoid getting duplicate log lines.
logging.basicConfig() accepts a good set of parameters, but deals only with
the root logger, from which other loggers will inherit settings.
As a first step I suggest you keep the level and logger name in output, to let
you know who is speaking in your messages.
>>> import logging
>>> # 1. Get rid of timestamp for all modules, and set other defaults
... logging.basicConfig(format="%(levelname)s %(name)s %(message)s", level="INFO")
... logging.info("Root logger talking")
...
INFO root Root logger talking
>>> # 2. Preset top level logger for SQLAlchemy
... logging.getLogger('sqlalchemy').setLevel("INFO")
...
>>> # 3. Run your code
... import sqlalchemy as sqla
... engine = sqla.create_engine('sqlite:///:memory:')
...
>>> from sqlalchemy.ext.declarative import declarative_base
...
... Base = declarative_base()
... from sqlalchemy import Column, Integer, String
>>> class User(Base):
...     __tablename__ = 'users'
...
...     id = Column(Integer, primary_key=True)
...     name = Column(String)
...
INFO sqlalchemy.orm.mapper.Mapper (User|users) _configure_property(id, Column)
INFO sqlalchemy.orm.mapper.Mapper (User|users) _configure_property(name, Column)
INFO sqlalchemy.orm.mapper.Mapper (User|users) Identified primary key columns: ColumnSet([Column('id', Integer(), table=<users>, primary_key=True, nullable=False)])
INFO sqlalchemy.orm.mapper.Mapper (User|users) constructed
>>> Base.metadata.create_all(engine)
INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
INFO sqlalchemy.engine.base.Engine ()
INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
INFO sqlalchemy.engine.base.Engine ()
INFO sqlalchemy.engine.base.Engine PRAGMA main.table_info("users")
INFO sqlalchemy.engine.base.Engine ()
INFO sqlalchemy.engine.base.Engine PRAGMA temp.table_info("users")
INFO sqlalchemy.engine.base.Engine ()
INFO sqlalchemy.engine.base.Engine
CREATE TABLE users (
id INTEGER NOT NULL,
name VARCHAR,
PRIMARY KEY (id)
)
INFO sqlalchemy.engine.base.Engine ()
INFO sqlalchemy.engine.base.Engine COMMIT
>>>
To answer your question more specifically, here's a solution:
import logging, sys
sql_logger = logging.getLogger("sqlalchemy.engine.base.Engine")
hdlr = logging.StreamHandler(sys.stdout)
hdlr.setFormatter(logging.Formatter("[SQL] %(message)s"))
sql_logger.addHandler(hdlr)
sql_logger.setLevel(logging.INFO)
import sqlalchemy as sqla
engine = sqla.create_engine('sqlite:///:memory:')
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
from sqlalchemy import Column, Integer, String
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
Base.metadata.create_all(engine)
You'll need some more work to handle other modules' log messages
while avoiding duplicates.
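One possible way to avoid those duplicates, sketched with the standard library only: give the SQL logger its own handler and turn off propagation, so the root logger's handlers never see the same record. The helper name capture_sql is hypothetical. The logger name sqlalchemy.engine and the simulated child logger sqlalchemy.engine.Engine are assumptions; the exact name the engine logs under may differ between SQLAlchemy versions (the output above shows sqlalchemy.engine.base.Engine). The message is emitted by hand here instead of by a real engine.

```python
import io
import logging


def capture_sql(logger_name="sqlalchemy.engine", fmt="[SQL] %(message)s"):
    """Attach a dedicated handler and stop propagation to the root logger."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(fmt))
    sql_logger = logging.getLogger(logger_name)
    sql_logger.addHandler(handler)
    sql_logger.setLevel(logging.INFO)
    # Without this line, root handlers would print the record a second time.
    sql_logger.propagate = False
    return stream


# A root handler standing in for an application-wide logging setup.
root_stream = io.StringIO()
root_handler = logging.StreamHandler(root_stream)
root_handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
logging.getLogger().addHandler(root_handler)

sql_stream = capture_sql()

# Simulated engine message; a real engine would emit this itself.
logging.getLogger("sqlalchemy.engine.Engine").info("SELECT 1")

sql_output = sql_stream.getvalue()
root_output = root_stream.getvalue()
```

Attaching the handler to the parent logger rather than to one fully qualified name means child loggers below it are covered as well. As the documentation quoted above says, keep echo set to False when configuring logging explicitly.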

Related

Obtain enriched MetaData from Sqlalchemy Model

I'm trying to create a package that manages DB connection, ORM models and migration. It's separated from web service project such as a Flask application, so flask-sqlalchemy is not considered.
This is how I organize my project (only list out parts related to this question):
alembic.ini
src/
  * my_project/
    * db/
      - connections.py
    * models/
      * abstract/
        - abstract_base.py
      * realized/
        - realized_model.py
migrations/
  * versions/
  - env.py
  - script.py.mako
src/my_project/db/connections.py:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
ENGINE = create_engine("db://url")
ModelBase = declarative_base(bind=ENGINE)
Session = sessionmaker(ENGINE)
src/my_project/models/abstract/abstract_base.py:
from sqlalchemy import Column, Integer, DateTime
from my_project.db.connections import ModelBase
class BaseWithTimestamp(ModelBase):
    __abstract__ = True
    id = Column(Integer, primary_key=True, nullable=False)
    created_at = Column(DateTime, nullable=False)
src/my_project/models/realized/realized_model.py:
from sqlalchemy import Column, String
from my_project.models.abstract.abstract_base import BaseWithTimestamp
class Note(BaseWithTimestamp):
    __tablename__ = "notes"
    content = Column(String(300), nullable=True)
env.py (the same as alembic default except metadata setup):
# ...
from my_project.db.connections import ModelBase
# ...
target_metadata = ModelBase.metadata
# ...
Supposedly, when linked to an empty database, alembic should generate migration script that creates table notes with three columns specified in model Note when running revision command with auto-generation switched on. However, what I got is an empty migration script.
Hence I tried doing this in interactive shell to see what's stored in Base's metadata:
from my_project.db.connections import ModelBase
ModelBase.metadata.tables # => FacadeDict({})
Note/notes is expected to appear in the metadata's table list, but the result above shows that no table was registered in Base's metadata, which I think is the root cause of the empty migration script. Is there anything I'm missing or doing wrong?
It seems that one needs to import all related model classes explicitly after the declarative base is imported, so that these models get added to its metadata:
migrations/env.py
# ...
from my_project.db.connections import ModelBase
from my_project.models.abstract.abstract_base import BaseWithTimestamp
from my_project.models.realized.realized_model import Note
# ...
target_metadata = ModelBase.metadata
# ...
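Why the imports matter can be sketched with a plain-Python analogy. This is not SQLAlchemy's actual implementation; the Registry class and the __init_subclass__ hook below are stand-ins, assumed here only to show that a table is registered when its class body is executed, which for a module-level class happens on import.

```python
class Registry:
    """Stand-in for the metadata object that collects tables."""
    tables = {}


class ModelBase:
    """Stand-in for the declarative base (not the real SQLAlchemy class)."""
    __abstract__ = True

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register only concrete classes, i.e. those that do not
        # declare __abstract__ = True in their own body.
        if not cls.__dict__.get("__abstract__", False):
            Registry.tables[cls.__tablename__] = cls


# Nothing is registered until a concrete model class has been defined.
tables_before = dict(Registry.tables)


class BaseWithTimestamp(ModelBase):
    __abstract__ = True


class Note(BaseWithTimestamp):
    __tablename__ = "notes"


tables_after = sorted(Registry.tables)
```

In this sketch the registry is empty before the class definitions run and contains notes afterwards, which mirrors the empty ModelBase.metadata.tables seen in the interactive shell.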

Classmethod for retrieving a specific instance with sqlalchemy

I was trying to make my ORM code a bit more elegant by using classmethods in SQLAlchemy, and I wanted to write a method called get that retrieves a single existing instance of the ORM object from a few parameters. Since I need a session to do that, the only way I could figure out was to pass the session as a parameter to the get method. It works, but I can't shake the feeling that I am building an antipattern.
Here is my ORM class (simplified):
class GeocellZone(Base):
    __tablename__ = "geocell_zone"
    __table_args__ = (UniqueConstraint("study_area_id", "version_id", "name"),)
    id = Column(Integer, primary_key=True)
    study_area_id = Column(Integer, ForeignKey("study_area.id"))
    version_id = Column(Integer, ForeignKey("version.id"))
    name = Column(Text)
    geometry = Column(Geometry(srid=4326))
    @classmethod
    def get(cls, session, version, study_area, name):
        return session.query(cls) \
            .filter(cls.study_area_id == study_area.id) \
            .filter(cls.version_id == version.id) \
            .filter(cls.name == name) \
            .first()
And here is what it looks like when I call it in my application:
import os
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from myapp.orm import *
engine = create_engine(
    f"postgresql://{os.getenv('DB_USER')}:{os.getenv('DB_PASS')}@{os.getenv('DB_HOST')}:{os.getenv('DB_PORT')}/{os.getenv('DB_NAME')}",
    echo=False
)
session = sessionmaker(bind=engine)()
GeocellZone.get(session, version, study_area, "Antwerpen")
This works, it returns the exact GeocellZone instance that I want. But I feel dirty passing the session to the ORM class like this. Am I overthinking this? Or is there a better way to do this?

Is it possible to query using raw sql and get object back?

In Hibernate it's possible to query using raw sql and get entities (objects) back. Something like: createSQLQuery(sql).addEntity(User.class).list().
Is it possible to do the same in sqlalchemy?
As @Ilja notes via a link in a comment to the question, it is possible to do what you describe using .from_statement() as described in the documentation:
from sqlalchemy import Column, create_engine, Integer, select, String, text
from sqlalchemy.orm import declarative_base, Session
engine = create_engine("sqlite://")
Base = declarative_base()
class Person(Base):
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    def __repr__(self):
        return f"<Person(id={self.id}, name='{self.name}')>"
Base.metadata.create_all(engine)
# sample data
with Session(engine) as session, session.begin():
    session.add_all(
        [Person(name="Adam"), Person(name="Alicia"), Person(name="Brandon")]
    )
# test
with Session(engine) as session, session.begin():
    sql = "SELECT id FROM person WHERE name LIKE 'A%'"
    results = session.scalars(select(Person).from_statement(text(sql))).all()
    print(results)
    # [<Person(id=1, name='Adam')>, <Person(id=2, name='Alicia')>]
When using the entityManager you can try:
entityManager.createNativeQuery("select some native query", User.class)
According to the API:
public Query createNativeQuery(String sqlString, Class resultClass);

How to create postgresql's sequences in Alembic

I'm using alembic to maintain my tables. At the same time, I update my models using the declarative way.
This is one of the Alembic tables:
op.create_table(
    'groups',
    Column('id', Integer, Sequence('group_id_seq'), primary_key=True),
    Column('name', Unicode(50)),
    Column('description', Unicode(250)),
)
And the model is like the following:
class Group(Base):
    __tablename__ = 'groups'
    id = Column(Integer, Sequence('group_id_seq'), primary_key=True)
    name = Column(Unicode(50))
    description = Column(Unicode(250))
    def __init__(self, name, description):
        self.description = description
        self.name = name
You can see, I'm using the Sequence in both the alembic migration and in the declarative model.
But I have noticed that when using PostgreSQL (v9.1) no sequences are created by alembic, and so the models fail to create instances since they will use the nextval(<sequence name>) clause.
So, how can I create my alembic migrations so that the sequences are truly generated in postgresql?
Just add the following to your model:
field_seq = Sequence('groups_field_seq')
field = Column(Integer, field_seq, server_default=field_seq.next_value())
And add the following to your migration file (before creating the table):
from sqlalchemy.schema import Sequence, CreateSequence
op.execute(CreateSequence(Sequence('groups_field_seq')))
Found a hint at https://bitbucket.org/zzzeek/alembic/issue/60/autogenerate-for-sequences-as-well-as#comment-4100402
Following the CreateSequence approach from the previous link, I still have to jump through several hoops to make my migrations work in both SQLite and PostgreSQL. Currently I have:
def dialect_supports_sequences():
    return op._proxy.migration_context.dialect.supports_sequences
def create_seq(name):
    if dialect_supports_sequences():
        op.execute(CreateSequence(Sequence(name)))
And then I call create_seq whenever I need it.
Is this the best practice?
Not sure if I got your question right but as nobody else chose to answer, here is how I get perfectly normal ids:
Alembic:
op.create_table('event',
sa.Column('id', sa.INTEGER(), autoincrement=True, nullable=False),
The class:
class Event(SQLADeclarativeBase):
    __tablename__ = 'event'
    id = Column(Integer, primary_key = True)
I ran into this same issue recently, and here is how I solved it:
op.execute("create sequence SEQUENCE_NAME")
I ran the above command inside the upgrade function; for the downgrade, run the code below inside the downgrade function.
op.execute("drop sequence SEQUENCE_NAME")

Trouble defining multiple self-referencing foreign keys in a table

I have some code here. I recently added this root_id parameter. The goal of that is to let me determine whether a File belongs to a particular Project without having to add a project_id FK into File (which would result in a model cycle.) Thus, I want to be able to compare Project.directory to File.root. If that is true, File belongs to Project.
However, the File.root attribute is not being autogenerated for File. My understanding is that defining a FK foo_id into table Foo implicitly creates a foo attribute to which you can assign a Foo object. Then, upon session flush, foo_id is properly set to the id of the assigned object. In the snippet below that is clearly being done for Project.directory, but why not for File.root?
It definitely seems like it has to do with either 1) the fact that root_id is a self-referential FK or 2) the fact that there are several self-referential FKs in File and SQLAlchemy gets confused.
Things I've tried.
Trying to define a 'root' relationship() - I think this is wrong, this should not be represented by a join.
Trying to define a 'root' column_property() - allows read access to an already set root_id property, but when assigning to it, does not get reflected back to root_id
How can I do what I'm trying to do? Thanks!
from sqlalchemy import create_engine, Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, relationship, scoped_session, sessionmaker, column_property
Base = declarative_base()
engine = create_engine('sqlite:///:memory:', echo=True)
Session = scoped_session(sessionmaker(bind=engine))
class Project(Base):
    __tablename__ = 'projects'
    id = Column(Integer, primary_key=True)
    directory_id = Column(Integer, ForeignKey('files.id'))
class File(Base):
    __tablename__ = 'files'
    id = Column(Integer, primary_key=True)
    path = Column(String)
    parent_id = Column(Integer, ForeignKey('files.id'))
    root_id = Column(Integer, ForeignKey('files.id'))
    children = relationship('File', primaryjoin=id==parent_id, backref=backref('parent', remote_side=id), cascade='all')
Base.metadata.create_all(engine)
p = Project()
root = File()
root.path = ''
p.directory = root
f1 = File()
f1.path = 'test.txt'
f1.parent = root
f1.root = root
Session.add(f1)
Session.add(root)
Session.flush()
# do this otherwise f1 will be returned when calculating rf1
Session.expunge(f1)
rf1 = Session.query(File).filter(File.path == 'test.txt').one()
# this property does not exist
print(rf1.root)
My understanding is that defining a FK foo_id into table Foo implicitly creates a foo attribute to which you can assign a Foo object.
No, it doesn't. In the snippet, it just looks like it is being done for Project.directory, but if you look at the SQL statements being echo'ed, there is no INSERT at all for the projects table.
So, for it to work, you need to add these two relationships:
class Project(Base):
    ...
    directory = relationship('File', backref='projects')
class File(Base):
    ...
    root = relationship('File', primaryjoin='File.id == File.root_id', remote_side=id)