SQLAlchemy: how to convert a query with a subquery into a relationship

In the code below I want to replace all_holdings in Account with a property called holdings that returns what desired_holdings() returns: the holdings with the latest known quantity for each stock, which can change over time. I'm having trouble figuring out how to construct the call to relationship().
In addition, I'd appreciate comments on the appropriateness of the pattern (keeping historic data in a single table and using a max-date subquery to get the most recent rows), as well as on better alternatives or improvements to the query.
from sqlalchemy import Column, Integer, String, Date, DateTime, REAL, ForeignKey, func
from sqlalchemy.orm import relationship, aliased
from sqlalchemy.sql.operators import and_, eq
from sqlalchemy.ext.declarative import declarative_base
from db import session
import datetime
import string
Base = declarative_base()
class MySQLSettings(object):
    __table_args__ = {'mysql_engine':'InnoDB'}
class Account(MySQLSettings, Base):
    __tablename__ = 'account'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    all_holdings = relationship('Holding', backref='account')
    def desired_holdings(self):
        max_date_subq = session.query(Holding.account_id.label('account_id'),
                                      Holding.stock_id.label('stock_id'),
                                      func.max(Holding.as_of).label('max_as_of')). \
            group_by(Holding.account_id, Holding.stock_id).subquery()
        desired_query = session.query(Holding).join(Account,
                                                    Account.id==account.id).join(max_date_subq).\
            filter(max_date_subq.c.account_id==account.id).\
            filter(Holding.as_of==max_date_subq.c.max_as_of).\
            filter(Holding.account_id==max_date_subq.c.account_id).\
            filter(Holding.stock_id==max_date_subq.c.stock_id)
        return desired_query.all()
    def __init__(self, name):
        self.name = name
class Stock(MySQLSettings, Base):
    __tablename__ = 'stock'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    def __init__(self, name):
        self.name = name
class Holding(MySQLSettings, Base):
    __tablename__ = 'holding'
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey('account.id'), nullable=False)
    stock_id = Column(Integer, ForeignKey('stock.id'), nullable=False)
    quantity = Column(REAL)
    as_of = Column(Date)
    stock = relationship('Stock')
    def __str__(self):
        return "Holding(%f, '%s' '%s')"%(self.quantity, self.stock.name, str(self.as_of))
    def __init__(self, account, stock, quantity, as_of):
        self.account_id = account.id
        self.stock_id = stock.id
        self.quantity = quantity
        self.as_of = as_of
if __name__ == "__main__":
    ibm = Stock('ibm')
    session.add(ibm)
    account = Account('a')
    session.add(account)
    session.flush()
    session.add_all([ Holding(account, ibm, 100, datetime.date(2001, 1, 1)),
                      Holding(account, ibm, 200, datetime.date(2001, 1, 3)),
                      Holding(account, ibm, 300, datetime.date(2001, 1, 5)) ])
    session.commit()
    print "All holdings by relation:\n\t", \
        string.join([ str(h) for h in account.all_holdings ], "\n\t")
    print "Desired holdings query:\n\t", \
        string.join([ str(h) for h in account.desired_holdings() ], "\n\t")
The results when run are:
All holdings by relation:
    Holding(100.000000, 'ibm' '2001-01-01')
    Holding(200.000000, 'ibm' '2001-01-03')
    Holding(300.000000, 'ibm' '2001-01-05')
Desired holdings query:
    Holding(300.000000, 'ibm' '2001-01-05')

The following answer was provided by Michael Bayer after I posted to the SQLAlchemy Google group:
The desired_holdings() query is pretty complicated and I'm not seeing a win by trying to get relationship() to do it. relationship() is oriented towards maintaining the persistence between two classes, not as much a reporting technique (and anything with max()/group_by in it is referring to reporting).
I would stick @property on top of desired_holdings, use object_session(self) to get at "session", and be done.
See more information on query-enabled properties.
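The suggestion above can be sketched roughly as follows. This is only an illustration under some assumptions: it targets SQLAlchemy 1.4 or later, uses an in-memory SQLite database instead of the MySQL setup from the question, replaces the global `account` with `self`, and uses keyword constructors rather than the custom `__init__` methods.

```python
import datetime

from sqlalchemy import Column, Date, Float, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base, object_session, relationship

Base = declarative_base()


class Account(Base):
    __tablename__ = "account"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    all_holdings = relationship("Holding", backref="account")

    @property
    def holdings(self):
        """Read-only: the most recent holding per stock for this account."""
        session = object_session(self)  # the session this instance is attached to
        latest = (
            session.query(
                Holding.stock_id.label("stock_id"),
                func.max(Holding.as_of).label("max_as_of"),
            )
            .filter(Holding.account_id == self.id)
            .group_by(Holding.stock_id)
            .subquery()
        )
        return (
            session.query(Holding)
            .join(
                latest,
                (Holding.stock_id == latest.c.stock_id)
                & (Holding.as_of == latest.c.max_as_of),
            )
            .filter(Holding.account_id == self.id)
            .all()
        )


class Stock(Base):
    __tablename__ = "stock"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


class Holding(Base):
    __tablename__ = "holding"
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey("account.id"), nullable=False)
    stock_id = Column(Integer, ForeignKey("stock.id"), nullable=False)
    quantity = Column(Float)
    as_of = Column(Date)
    stock = relationship("Stock")


engine = create_engine("sqlite://")  # in-memory database, for the sketch only
Base.metadata.create_all(engine)

with Session(engine) as session:
    ibm = Stock(name="ibm")
    account = Account(name="a")
    holdings = [
        Holding(account=account, stock=ibm, quantity=q, as_of=datetime.date(2001, 1, d))
        for q, d in [(100, 1), (200, 3), (300, 5)]
    ]
    session.add_all([ibm, account] + holdings)
    session.commit()
    latest_quantities = [h.quantity for h in account.holdings]
    all_count = len(account.all_holdings)
```

Note that the property runs a query on every access and only works while the instance is attached to a session (object_session returns None for a detached object).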

Related

SQLalchemy mutually dependent foreign key constraints

I’m trying to define 2 entities like this:
class User(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(256), index=True, unique=True)
    main_token_id = Column(ForeignKey('token.id'), nullable=False)
    main_token = relationship('Token', uselist=False)
    tokens = relationship('Token', back_populates="user", foreign_keys=['token.id'])
class Token(Base):
    id = Column(Integer, primary_key=True)
    user_id = Column(ForeignKey('user.id'), nullable=False)
    user: User = relationship("user", back_populates="tokens")
I want the user to have access to the collection of all his tokens and I also want him to have a special, main token. I want to ensure that the user has just one main token and I need integrity provided by the foreign key. By both of them actually.
I have read Cascading deletes in mutually dependent tables in SQLAlchemy but I don't feel it helps. I would like to have the integrity from both sides.
How can I make this work? If the design is flawed how can I rephrase this so that I may keep my integrity guarantees?
A kludge I have used to work around this problem before is to add a column like precedence = Column(Integer, nullable=False) to tokens, set a unique constraint like UniqueConstraint('user_id', 'precedence'), and assign that integer manually when you create the tokens. The token with precedence 0 (or the lowest precedence) is the main token.
Here is an example. I'm sure some SQLAlchemy geniuses can perform the precedence swap without three updates, but in most cases a swap doesn't come up very often. There is a way to defer the unique constraint within a transaction, but I believe SQLite does not support that yet.
This relies on your application never moving the main token away from precedence 0, i.e. there is no integrity check to prevent that.
from sqlalchemy import (
create_engine,
UnicodeText,
Integer,
String,
ForeignKey,
UniqueConstraint,
update,
)
from sqlalchemy.schema import (
Table,
Column,
MetaData,
)
from sqlalchemy.sql import select
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy.orm import Session
from sqlalchemy.exc import IntegrityError
Base = declarative_base()
engine = create_engine("sqlite://", echo=False)
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String(256), index=True, unique=True)
    tokens = relationship('Token', backref="user", cascade="all, delete-orphan", order_by='Token.precedence')
    main_token = relationship('Token', primaryjoin='and_(User.id == Token.user_id, Token.precedence == 0)', viewonly=True, uselist=False)
class Token(Base):
    __tablename__ = 'tokens'
    id = Column(Integer, primary_key=True)
    precedence = Column(Integer, nullable=False)
    user_id = Column(ForeignKey('users.id'), nullable=False)
    __table_args__ = (UniqueConstraint('precedence', 'user_id', name='tokens_user_precedence'),)
Base.metadata.create_all(engine)
with Session(engine) as session:
    user = User(name='tokenizer')
    session.add(user)
    main_token = Token(user=user, precedence=0)
    session.add(main_token)
    session.add(Token(user=user, precedence=1))
    session.commit()
    assert session.query(Token).first()
    assert session.query(User).first()
    assert session.query(User).first().tokens
    assert session.query(User).first().tokens[0] == main_token
    # This viewonly relationship seems to be working.
    assert session.query(User).first().main_token == main_token
    # We don't want this so don't do this, no integrity checks here!!
    main_token.precedence = 100
    session.commit()
    assert not session.query(User).first().main_token
    # Put it back now.
    main_token.precedence = 0
    session.commit()
    assert session.query(User).first().main_token
    # Now check tokens are cleared.
    session.delete(user)
    session.commit()
    assert not session.query(Token).all()
    assert not session.query(User).all()
with Session(engine) as session:
    # Try making 2 main tokens.
    user = User(name='tokenizer')
    session.add(user)
    main_token = Token(user=user, precedence=0)
    main_token2 = Token(user=user, precedence=0)
    session.add_all([main_token, main_token2])
    try:
        session.commit()
    except IntegrityError as e:
        pass
    else:
        assert False, 'Exception should have occurred.'
with Session(engine) as session:
    # Try swapping the tokens.
    user = User(name='tokenizer')
    session.add(user)
    main_token = Token(user=user, precedence=0)
    session.add(main_token)
    other_token = Token(user=user, precedence=1)
    session.add(other_token)
    session.commit()
    old_precedence = other_token.precedence
    # Park the main token on an unused value so the unique constraint holds at each flush.
    main_token.precedence = -1
    session.flush()
    other_token.precedence = 0
    session.flush()
    main_token.precedence = old_precedence
    session.commit()
    # Verify the swap (the original bare comparisons checked nothing).
    assert user.tokens[0] == other_token
    assert user.tokens[1] == main_token
    assert user.main_token == other_token
    session.commit()
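The three-step swap above could be wrapped in a small helper. This is just a sketch: promote_to_main is a hypothetical name, the models are trimmed copies of the ones in the answer, and it assumes SQLAlchemy 1.4+ and that real precedence values are never negative, so -1 is free to use as a parking value.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, UniqueConstraint, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(256), index=True, unique=True)
    tokens = relationship("Token", backref="user", cascade="all, delete-orphan",
                          order_by="Token.precedence")
    main_token = relationship(
        "Token",
        primaryjoin="and_(User.id == Token.user_id, Token.precedence == 0)",
        viewonly=True, uselist=False)


class Token(Base):
    __tablename__ = "tokens"
    id = Column(Integer, primary_key=True)
    precedence = Column(Integer, nullable=False)
    user_id = Column(ForeignKey("users.id"), nullable=False)
    __table_args__ = (UniqueConstraint("precedence", "user_id", name="tokens_user_precedence"),)


def promote_to_main(session, token):
    """Hypothetical helper: make `token` the main token (precedence 0) of its user."""
    current = next((t for t in token.user.tokens if t.precedence == 0), None)
    if current is token:
        return
    old_precedence = token.precedence
    if current is not None:
        current.precedence = -1  # park it so (user_id, precedence) stays unique
        session.flush()
    token.precedence = 0
    session.flush()
    if current is not None:
        current.precedence = old_precedence
        session.flush()


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    user = User(name="tokenizer")
    main_token = Token(user=user, precedence=0)
    other_token = Token(user=user, precedence=1)
    session.add_all([user, main_token, other_token])
    session.commit()

    promote_to_main(session, other_token)
    session.commit()  # expires attributes, so main_token is reloaded on next access
    new_main_is_other = user.main_token is other_token
    precedences = (other_token.precedence, main_token.precedence)
```

As in the answer, nothing here stops other code from leaving a user without a precedence-0 token; the helper only keeps the swap itself consistent.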

Sqlalchemy eager loading of parent all properties in joined table inheritance

I have the following problem:
I have a hierarchy of classes with joined table inheritance:
class AdGroupModel(Base, AdwordsRequestMixin):
    __tablename__ = 'ad_groups'
    db_id = Column(BigInteger, primary_key=True)
    created_at = Column(DateTime(timezone=False), nullable=False, default=datetime.datetime.now())
    # ----RELATIONS-----
    # campaign MANY-to-ONE
    campaign_db_id = Column(BigInteger,
                            ForeignKey('campaigns.db_id', ondelete='CASCADE'),
                            nullable = True,
                            )
    # # ads ONE-to-MANY
    ads = relationship("AdModel",
                       backref="ad_group",
                       lazy="subquery",
                       passive_deletes=True,
                       single_parent=True,
                       cascade="all, delete, delete-orphan")
    # # # keywords ONE-to-MANY
    criteria = relationship("AdGroupCriterionModel",
                            backref="ad_group",
                            lazy="subquery",
                            passive_deletes=True,
                            single_parent=True,
                            cascade="all, delete, delete-orphan")
    # Joined Table Inheritance
    type = Column(Unicode(50))
    __mapper_args__ = {
        'polymorphic_identity': 'ad_group',
        'polymorphic_on': type
    }
class AdGroupCriterionModel(Base, AdGroupDependenceMixin):
    __tablename__ = 'ad_group_criterion'
    db_id = Column(BigInteger, primary_key=True)
    destination_url = Column(Unicode, nullable=True)
    status = Column(Enum("PAUSED", "ACTIVE", "DELETED",
                         name='criterion_status'), default="ACTIVE")
    # ----RELATIONS---
    # ad_group ONE-to-MANY
    ad_group_db_id = Column(BigInteger, ForeignKey('ad_groups.db_id',
                            ondelete='CASCADE'), nullable=True)
    # Joined Table Inheritance
    criterion_sub_type = Column(Unicode(50))
    __mapper_args__ = {
        'polymorphic_on': criterion_sub_type
    }
class AdGroupKeywordModel(AdGroupCriterionModel):
    __tablename__ = 'ad_group_keyword'
    __mapper_args__ = {'polymorphic_identity': 'Keyword'}
    db_id = Column(Integer, ForeignKey('ad_group_criterion.db_id'), primary_key=True)
    text = Column(Unicode, nullable=False)
class AdGroupDependenceMixin(object):
    _aggad_id = Column(BigInteger, nullable=True)
    _agname = Column(Unicode, nullable=True)
    @hybrid_property
    def ad_group_GAD_id(self):
        if self.ad_group is None:
            res = self._aggad_id
        else:
            res = self.ad_group.GAD_id
        return res
    @ad_group_GAD_id.setter
    def ad_group_GAD_id(self, value):
        self._aggad_id = value
        if value is not None:
            self.ad_group = None
    @ad_group_GAD_id.expression
    def ad_group_GAD_id(cls):
        what = case([( cls._aggad_id != None, cls._aggad_id)], else_=AdGroupModel.GAD_id)
        return what.label('adgroupgadid_expression')
    @hybrid_property
    def ad_group_name(self):
        if self.ad_group is None:
            return self._agname
        else:
            return self.ad_group.name
    @ad_group_name.setter
    def ad_group_name(self, value):
        self._agname = value
        if value is not None:
            self.campaign = None
    @ad_group_name.expression
    def ad_group_name(cls):
        what = case([( cls._agname != None, cls._agname)], else_=AdGroupModel.name)
        return what.label('adgroupname_expression')
And I load the Keyword objects from the database with the following query:
all_objects1 = self.database.session.query(AdGroupKeywordModel).join(AdGroupModel)\
    .options(subqueryload('ad_group'))\
    .filter(AdGroupModel.GAD_id!=None)\
    .limit(self.options.limit).all()
which returns objects of type AdGroupKeywordModel.
Unfortunately, every time I try to access the properties of AdGroupKeywordModel that live in the parent table (AdGroupCriterionModel), a query of this type is emitted:
sqlalchemy.engine.base.Engine
SELECT ad_group_criterion.destination_url AS ad_group_criterion_destination_url, ad_group_criterion.status AS ad_group_criterion_status, ad_group_criterion.ad_group_db_id AS ad_group_criterion_ad_group_db_id, ad_group_criterion.criterion_sub_type AS ad_group_criterion_criterion_sub_type, ad_group_keyword.text AS ad_group_keyword_text
FROM ad_group_criterion JOIN ad_group_keyword ON ad_group_criterion.db_id = ad_group_keyword.db_id
which seriously hurts performance.
What I would like is for all the attributes of AdGroupKeywordModel that come from the parent class (and from other classes defined in the relationships) to be loaded with the initial query and cached for further use, so that accessing them does not incur any further SQL statements.
It seems that eager loading is only defined for relationships, not for hierarchies. Is it possible to get this behaviour in SQLAlchemy for hierarchies as well?
Thanks
What I see is that only AdGroupModel has relationships with a lazy= definition (the keyword that configures eager loading for relationships), and the query only has a subqueryload('ad_group').
The only point at which ad_group or AdGroupModel touches AdGroupKeywordModel is AdGroupModel.criteria, whose backref is AdGroupCriterionModel.ad_group. I'm not familiar with the subqueryload syntax, but if I wanted to eager-load AdGroupCriterionModel.ad_group, I'd define criteria like this:
criteria = relationship(
    "AdGroupCriterionModel", backref=backref("ad_group", lazy="subquery"),
    lazy="subquery", passive_deletes=True, single_parent=True,
    cascade="all, delete, delete-orphan")
The key is to define the right lazy setting for the backref as well.
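To check whether a lazy setting on the backref has the intended effect, one option is to count the SQL statements emitted when the attribute is accessed. The sketch below uses deliberately simplified, hypothetical models (AdGroup and Criterion, without the inheritance or mixins from the question), sets only the backref side to lazy="subquery", and assumes SQLAlchemy 1.4+ with in-memory SQLite.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import Session, backref, declarative_base, relationship

Base = declarative_base()


class AdGroup(Base):
    __tablename__ = "ad_groups"
    db_id = Column(Integer, primary_key=True)
    name = Column(String(50))
    # The backref side (Criterion.ad_group) is eager-loaded with a subquery.
    criteria = relationship("Criterion", backref=backref("ad_group", lazy="subquery"))


class Criterion(Base):
    __tablename__ = "ad_group_criterion"
    db_id = Column(Integer, primary_key=True)
    text = Column(String(50))
    ad_group_db_id = Column(Integer, ForeignKey("ad_groups.db_id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Record every SQL statement sent to the database.
    statements.append(statement)


with Session(engine) as session:
    session.add(AdGroup(name="group-1",
                        criteria=[Criterion(text="shoes"), Criterion(text="boots")]))
    session.commit()

with Session(engine) as session:
    criteria = session.query(Criterion).all()  # also loads Criterion.ad_group eagerly
    before = len(statements)
    names = [c.ad_group.name for c in criteria]
    extra_queries = len(statements) - before  # statements caused by attribute access
```

If extra_queries stays at zero, the relationship was populated by the initial load. The same counting trick can help narrow down which attribute access triggers the extra SELECT in the original setup.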

Creating a self-referencing M2M relationship in SQLAlchemy (+Flask)

While trying to learn Flask, I am building a simple Twitter clone. This would include the ability for a User to follow other Users. I am trying to set up a relational database through SQLAlchemy to allow this.
I figured I would need a self-referencing many-to-many relationship on the User. Following from the SQLAlchemy documentation I arrived at:
#imports omitted
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///twitclone.db'
db = SQLAlchemy(app)
Base = declarative_base()
user_to_user = Table("user_to_user", Base.metadata,
    Column("follower_id", Integer, ForeignKey("user.id"), primary_key=True),
    Column("followed_id", Integer, ForeignKey("user.id"), primary_key=True)
)
class User(db.Model):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=False)
    handle = Column(String, unique=True)
    password = Column(String, unique=False)
    children = relationship("tweet")
    following = relationship("user",
        secondary=user_to_user,
        primaryjoin=id==user_to_user.c.follower_id,
        secondaryjoin=id==user_to_user.c.followed_id,
        backref="followed_by"
    )
#Tweet class goes here
db.create_all()
if __name__ == "__main__":
    app.run()
Running this code results in the database being created without any error messages. However, the whole part (table) connecting a user to a user is simply omitted. This is the definition of the User table:
CREATE TABLE user (
id INTEGER NOT NULL,
name VARCHAR,
handle VARCHAR,
password VARCHAR,
PRIMARY KEY (id),
UNIQUE (handle)
)
Why does SQLAlchemy not create the self-referential relationship for the User?
note: I am new to both Flask and SQLAlchemy and could be missing something obvious here.
Ok, it seems I mixed up two different styles of using SQLAlchemy with Flask: the declarative extension of SQLAlchemy and flask-sqlalchemy extension. Both are similar in capabilities with the difference being that the flask extension has some goodies like session handling. This is how I rewrote my code to strictly make use of flask-sqlalchemy.
from flask import Flask
# The old flask.ext.sqlalchemy import path was removed in Flask 1.0.
from flask_sqlalchemy import SQLAlchemy
from datetime import datetime
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///kwek.db'
db = SQLAlchemy(app)
# Table to handle the self-referencing many-to-many relationship for the User class:
# the first column holds the user who follows, the second the user who is being followed.
user_to_user = db.Table('user_to_user',
    db.Column("follower_id", db.Integer, db.ForeignKey("user.id"), primary_key=True),
    db.Column("followed_id", db.Integer, db.ForeignKey("user.id"), primary_key=True)
)
class User(db.Model):
    __tablename__ = 'user'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64), unique=False)
    handle = db.Column(db.String(16), unique=True)
    password = db.Column(db.String, unique=False)
    kweks = db.relationship("Kwek", lazy="dynamic")
    following = db.relationship("User",
        secondary=user_to_user,
        primaryjoin=id==user_to_user.c.follower_id,
        secondaryjoin=id==user_to_user.c.followed_id,
        backref="followed_by"
    )
    def __repr__(self):
        return '<User %r>' % self.name
class Kwek(db.Model):
    __tablename__ = 'kwek'
    id = db.Column(db.Integer, primary_key=True)
    content = db.Column(db.String(140), unique=False)
    # Pass the callable so the timestamp is taken per insert, not once at import time.
    post_date = db.Column(db.DateTime, default=datetime.now)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
    def __repr__(self):
        return '<Kwek %r>' % self.content
if __name__ == "__main__":
    app.run()
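For comparison, the same self-referential many-to-many can be sketched in plain SQLAlchemy without Flask. The point illustrated is that the association table and the model must share one MetaData for create_all() to emit both tables. This assumes SQLAlchemy 1.4+ and in-memory SQLite; the trimmed User model and the sample handles are illustrative only.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table registered on the SAME metadata as the User model.
user_to_user = Table(
    "user_to_user", Base.metadata,
    Column("follower_id", Integer, ForeignKey("user.id"), primary_key=True),
    Column("followed_id", Integer, ForeignKey("user.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    handle = Column(String(16), unique=True)
    following = relationship(
        "User",
        secondary=user_to_user,
        primaryjoin=id == user_to_user.c.follower_id,
        secondaryjoin=id == user_to_user.c.followed_id,
        backref="followed_by",
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
table_names = sorted(inspect(engine).get_table_names())

with Session(engine) as session:
    alice = User(handle="alice")
    bob = User(handle="bob")
    alice.following.append(bob)  # alice follows bob
    session.add_all([alice, bob])
    session.commit()
    bob_followers = [u.handle for u in bob.followed_by]
    alice_following = [u.handle for u in alice.following]
```

In the question's code the table was attached to Base.metadata while db.create_all() only knows about db.metadata, which is consistent with the association table being silently skipped.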

SQLAlchemy insert many-to-one entries

Sorry if this is a newbie question, but the documentation about the many-to-one relationship doesn't seem to cover this. I have been looking for something similar to this (under the "How to Insert / Add Data to Your Tables" section); however, in the example shown there the insertion is always unique.
Basically, I want to populate my database with data located on my local machine. For simplicity, I have reduced the problem to the MWE below. It consists of two tables called Price and Currency, implemented in declarative style.
model.py
from sqlalchemy import Column, Integer, String
from sqlalchemy import Float, BigInteger, ForeignKey
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Currency(Base):
    __tablename__ = 'Currency'
    id = Column(Integer, primary_key=True)
    unit = Column(String(16), unique=True)
    def __init__(self, unit):
        self.unit = unit
class Price(Base):
    __tablename__ = 'Price'
    id = Column(BigInteger, primary_key=True)
    currency_id = Column(Integer, ForeignKey("Currency.id"), nullable=False)
    currency = relationship("Currency", backref="Currency.id")
    hour1 = Column(Float)
    hour2 = Column(Float)
    def __init__(self, hour1, hour2):
        self.hour1 = hour1
        self.hour2 = hour2
Currently, I am populating the database using following code:
script.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from model import *
engine = create_engine('sqlite:///example.db', echo=True)
db_session = scoped_session(sessionmaker(autocommit=False,
                                         autoflush=False,
                                         bind=engine))
session = db_session()
Base.metadata.create_all(engine)
oPrice = Price(2.5, 2.5)
oPrice.currency = Currency("EUR")
session.add(oPrice)
tPrice = Price(5.5, 1.5)
tPrice.currency = Currency("EUR")
session.add(tPrice)
session.commit()
This creates an error
sqlalchemy.exc.IntegrityError: (IntegrityError) column unit is not unique u'INSERT INTO "Currency" (unit) VALUES (?)' ('EUR',)
What is the best strategy for populating my database so that the Currency.id and Price.currency_id mapping is correct? Should I make the model classes check for uniqueness before they are initialized, and do I do that in association with the other table?
I'd second what Antti has suggested since currencies have standard codes like 'INR', 'USD' etc, you can make currency_code as primary key.
Or in case you want to keep the numeric primary key then one of the options is:
http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject
Edit: adding an example based on the recipe in the link above (the variant with the class decorator).
database.py
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
engine = create_engine('sqlite:///example.db', echo=True)
db_session = scoped_session(sessionmaker(autocommit=False,
                                         autoflush=False,
                                         bind=engine))
model.py
from sqlalchemy import Column, Integer, String
from sqlalchemy import Float, BigInteger, ForeignKey
from sqlalchemy.orm import relationship, backref
from sqlalchemy.ext.declarative import declarative_base
from database import db_session
Base = declarative_base()
def _unique(session, cls, hashfunc, queryfunc, constructor, arg, kw):
    cache = getattr(session, '_unique_cache', None)
    if cache is None:
        session._unique_cache = cache = {}
    key = (cls, hashfunc(*arg, **kw))
    if key in cache:
        return cache[key]
    else:
        with session.no_autoflush:
            q = session.query(cls)
            q = queryfunc(q, *arg, **kw)
            obj = q.first()
            if not obj:
                obj = constructor(*arg, **kw)
                session.add(obj)
        cache[key] = obj
        return obj
def unique_constructor(scoped_session, hashfunc, queryfunc):
    def decorate(cls):
        def _null_init(self, *arg, **kw):
            pass
        def __new__(cls, bases, *arg, **kw):
            # no-op __new__(), called
            # by the loading procedure
            if not arg and not kw:
                return object.__new__(cls)
            session = scoped_session()
            def constructor(*arg, **kw):
                obj = object.__new__(cls)
                obj._init(*arg, **kw)
                return obj
            return _unique(
                session,
                cls,
                hashfunc,
                queryfunc,
                constructor,
                arg, kw
            )
        # note: cls must be already mapped for this part to work
        cls._init = cls.__init__
        cls.__init__ = _null_init
        cls.__new__ = classmethod(__new__)
        return cls
    return decorate
@unique_constructor(
    db_session,
    lambda unit: unit,
    lambda query, unit: query.filter(Currency.unit == unit)
)
class Currency(Base):
    __tablename__ = 'Currency'
    id = Column(Integer, primary_key=True)
    unit = Column(String(16), unique=True)
    def __init__(self, unit):
        self.unit = unit
class Price(Base):
    __tablename__ = 'Price'
    id = Column(BigInteger, primary_key=True)
    currency_id = Column(Integer, ForeignKey("Currency.id"), nullable=False)
    currency = relationship("Currency", backref="Currency.id")
    hour1 = Column(Float)
    hour2 = Column(Float)
    def __init__(self, hour1, hour2):
        self.hour1 = hour1
        self.hour2 = hour2
script.py:
from model import *
from database import engine, db_session as session
Base.metadata.create_all(engine)
oPrice = Price(2.5, 2.5)
oPrice.currency = Currency("EUR")
session.add(oPrice)
tPrice = Price(5.5, 1.5)
tPrice.currency = Currency("EUR")
session.add(tPrice)
session.commit()
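If the full recipe feels like more machinery than needed, a plain lookup-before-create helper covers the simple case. The sketch below is not part of the recipe: get_or_create_currency is a hypothetical name, the models are simplified (Integer primary keys, keyword constructors), and it assumes SQLAlchemy 1.4+ with in-memory SQLite and a single writer.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Currency(Base):
    __tablename__ = "Currency"
    id = Column(Integer, primary_key=True)
    unit = Column(String(16), unique=True)


class Price(Base):
    __tablename__ = "Price"
    id = Column(Integer, primary_key=True)
    currency_id = Column(Integer, ForeignKey("Currency.id"), nullable=False)
    currency = relationship("Currency")
    hour1 = Column(Float)
    hour2 = Column(Float)


def get_or_create_currency(session, unit):
    """Hypothetical helper: return the existing Currency for `unit` or create it."""
    currency = session.query(Currency).filter_by(unit=unit).one_or_none()
    if currency is None:
        currency = Currency(unit=unit)
        session.add(currency)
        # Flush so the row exists for later lookups even with autoflush disabled.
        session.flush()
    return currency


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine, autoflush=False) as session:
    first = Price(hour1=2.5, hour2=2.5, currency=get_or_create_currency(session, "EUR"))
    second = Price(hour1=5.5, hour2=1.5, currency=get_or_create_currency(session, "EUR"))
    session.add_all([first, second])
    session.commit()
    same_currency = first.currency is second.currency
    currency_count = session.query(Currency).count()
```

With concurrent writers, two sessions could still race to insert the same unit; the unique constraint would then raise IntegrityError, which the caller would have to handle.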
The simplest solution is to use the currency codes as the primary keys in Currency, and as foreign keys in Price. Then you can have
price.currency_id = "EUR"
This also makes your database tables more readable - as in you won't have 28342 but 'GBP'.
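A minimal sketch of that natural-key layout, assuming SQLAlchemy 1.4+ and in-memory SQLite. The column name code and the String(3) length are illustrative choices, not taken from the question.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Currency(Base):
    __tablename__ = "Currency"
    code = Column(String(3), primary_key=True)  # e.g. 'EUR', 'GBP'


class Price(Base):
    __tablename__ = "Price"
    id = Column(Integer, primary_key=True)
    currency_id = Column(String(3), ForeignKey("Currency.code"), nullable=False)
    currency = relationship("Currency")
    hour1 = Column(Float)
    hour2 = Column(Float)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Currency(code="EUR"))  # inserted once, e.g. from a seed script
    session.add_all([
        Price(currency_id="EUR", hour1=2.5, hour2=2.5),
        Price(currency_id="EUR", hour1=5.5, hour2=1.5),
    ])
    session.commit()
    currency_count = session.query(Currency).count()
    price_codes = [p.currency.code for p in session.query(Price).order_by(Price.id)]
```

Because the foreign key value is the code itself, new prices can be created without first looking up the Currency row.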

After I create my tables using SQLAlchemy, how can I add additional columns to it?

This is my file so far:
from sqlalchemy import create_engine, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, backref
from sqlalchemy import Column, Integer, String
from sqlalchemy import Table, Text
engine = create_engine('mysql://root:ababab@localhost/alctest',
                       echo=False)
Base = declarative_base()
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key = True)
    name = Column(String(100))
    fullname = Column(String(100))
    password = Column(String(100))
    addresses = relationship("Address", order_by="Address.id", backref="user")
    def __init__(self, name, fullname, password):
        self.name = name
        self.fullname = fullname
        self.password = password
    def __repr__(self):
        return "<User('%s','%s', '%s')>" % (self.name, self.fullname, self.password)
class Address(Base):
    __tablename__ = 'addresses'
    id = Column(Integer, primary_key = True)
    email_address = Column(String(100), nullable=False)
    #foreign key, must define relationship
    user_id = Column(Integer, ForeignKey('users.id'))
    user = relationship("User", backref = backref('addresses',order_by=id))
Base.metadata.create_all(engine)
This file is pretty simple. It creates the User and Address tables. After I run this file, the tables are created.
But now I want to add a column to "User". How can I do that? What do I have to do?
You can add a column with the Table.append_column method.
test = Column('test', Integer)
User.__table__.append_column(test)
But this will not issue the ALTER TABLE command that adds the column to the database. According to the documentation for append_column, you have to run that command manually after adding the column to the model.
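Put together, the two steps might look like the sketch below. It assumes SQLAlchemy 1.4+ and an in-memory SQLite database in place of the MySQL one from the question; the ALTER TABLE statement is hand-written and would need adapting to the target database's dialect.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect, text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)  # creates "users" with id and name only

# Step 1: add the column to the in-memory Table definition.
User.__table__.append_column(Column("test", Integer))

# Step 2: alter the existing database table by hand; SQLAlchemy does not do this.
with engine.begin() as conn:
    conn.execute(text("ALTER TABLE users ADD COLUMN test INTEGER"))

# Reflect the live table to confirm that the column now exists.
column_names = [col["name"] for col in inspect(engine).get_columns("users")]
```

The inspector reads the schema from the database itself, so this check reflects the actual table rather than the Python-side definition.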
Short answer: you cannot. AFAIK, there is currently no way to do it from SQLAlchemy directly.
However, you can use sqlalchemy-migrate for this if you change your model frequently and have different versions rolled out to production. Otherwise it might be overkill, and you may be better off writing the ALTER TABLE ... scripts manually.