SQLAlchemy: order_by on a child field through a backref

The answer is no longer needed, as I changed focus in my code (see my comment on the answer). Please still post answers for future reference.
How do I retrieve the results of a one-to-many backref ordered by a field in the child? I need all somethings for the gid ordered by index, but at the moment they come back in arbitrary order even though they are ordered in the MS SQL Server table.
I have the following in my TurboGears 2 datamodels.py:
class Parcel(DeclarativeBase):
    __tablename__ = 'GENERAL'
    __table_args__ = ({'autoload': True})
    gid = Column(Integer, primary_key=True)
    somethings = relationship('Something', backref='Parcel')

class Something(DeclarativeBase):
    __tablename__ = 'SKETCH'
    __table_args__ = ({'autoload': True})
    gid = Column(Integer, ForeignKey('GENERAL.gid'), primary_key=True)
    index = Column(Integer, primary_key=True)
In TurboGears root.py:
query = DBSession.query(Parcel)
query = query.options(joinedload('somethings'))
query = query.filter(Parcel.gid == gid)
This returns all somethings for the gid, but unordered.

DBSession.query(Something).filter_by(gid=gid).order_by(Something.index).all()
Edit: relationship() accepts an order_by keyword argument that orders the related instances whenever the relationship is loaded. If you want to specify the ordering for the reverse direction, use the backref() function instead of the backref keyword and pass it the same order_by keyword argument you would pass to relationship().

Related

SQLAlchemy: Counting multiple relationships - best way?

Imagine the following (example) datamodel:
class Organization(db.Model):
    __tablename__ = 'organizations'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    friendly_name = db.Column(db.Text, nullable=False)
    users = db.relationship('User', back_populates='organizations')
    groups = db.relationship('Group', back_populates='organizations')

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    organization_id = db.Column(db.Integer, db.ForeignKey('organizations.id'))
    organizations = db.relationship('Organization', back_populates='users')

class Group(db.Model):
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    organization_id = db.Column(db.Integer, db.ForeignKey('organizations.id'))
    organizations = db.relationship('Organization', back_populates='groups')
(so basically an Organization has User and Group relationships)
What we want is to retrieve the counts for users and groups. Result should be similar to the following:
id   friendly_name   users_count   groups_count
1    o1              33            3
2    o2              12            2
3    o3              1             0
This can be achieved with a query similar to
query = db.session.query(
    Organization.friendly_name,
    func.count(User.id.distinct()).label('users_count'),
    func.count(Group.id.distinct()).label('groups_count'),
) \
    .outerjoin(User, Organization.users) \
    .outerjoin(Group, Organization.groups) \
    .group_by(Organization.id)
which seems quite overkill. The first intuitive approach would be something like
query = db.session.query(
    Organization.friendly_name,
    func.count(distinct(Organization.users)).label('users_count'),
    func.count(distinct(Organization.groups)).label('groups_count'),
)  # with or without outerjoins
which does not work (note: with a single relationship it would).
a) What's the difference between User.id.distinct() and distinct(Organization.users) in this case?
b) What would be the best/most performant/recommended way in SQLAlchemy to get a count for each relationship an object has?
Bonus: If the whole model were selected instead of Organization.friendly_name (...query(Organization, func....)), SQLAlchemy returns tuples of the form (Organization, users_count, groups_count). Is there a way to return just the Organization with the two counts as additional fields (as SQL would)?
b:
You can try a window function to count users and groups with good performance:
query = (
    db.session.query(
        Organization.friendly_name,
        func.count().over(partition_by=(User.id, Organization.id)).label('users_count'),
        func.count().over(partition_by=(Group.id, Organization.id)).label('groups_count'),
    )
    .outerjoin(User, Organization.users)
    .outerjoin(Group, Organization.groups)
)
bonus:
To return count as a field of Organization, you can use hybrid_property, but you would not be happy with the performance.
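For question b), another option that is sometimes used is one correlated scalar subquery per relationship, which avoids multiplying rows through two outer joins. The sketch below is not taken from the answer above. It assumes SQLAlchemy 1.4 or later, plain declarative models instead of Flask-SQLAlchemy, and the hypothetical table names org_users and org_groups and helper count_of.

```python
from sqlalchemy import Column, ForeignKey, Integer, Text, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Organization(Base):
    __tablename__ = "organizations"
    id = Column(Integer, primary_key=True)
    friendly_name = Column(Text, nullable=False)


class User(Base):
    __tablename__ = "org_users"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    organization_id = Column(Integer, ForeignKey("organizations.id"))


class Group(Base):
    __tablename__ = "org_groups"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    organization_id = Column(Integer, ForeignKey("organizations.id"))


def count_of(model):
    """Scalar subquery counting rows of `model` for the enclosing Organization row."""
    return (
        select(func.count(model.id))
        .where(model.organization_id == Organization.id)
        .scalar_subquery()
    )


engine = create_engine("sqlite://")  # in-memory database, demo only (assumption)
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Organization(id=1, friendly_name="o1"),
        Organization(id=2, friendly_name="o2"),
    ])
    session.add_all([User(organization_id=1) for _ in range(3)])
    session.add_all([Group(organization_id=1) for _ in range(2)])
    session.add(User(organization_id=2))
    session.commit()

    rows = (
        session.query(
            Organization.friendly_name,
            count_of(User).label("users_count"),
            count_of(Group).label("groups_count"),
        )
        .order_by(Organization.id)
        .all()
    )
    counts = [tuple(row) for row in rows]
```

Whether this beats the join plus DISTINCT version depends on the database and its indexes, so it is worth comparing query plans before settling on either.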

SqlAlchemy - apply filter before join statement

I am trying to contribute to 3c7's cool project, and I want to apply a filter to a join query (SQLAlchemy).
In short: multiple rules (table) can have multiple tags (table), and I want to filter out some rules based on some tags.
Rules table (rule_id, etc.)
Tags table (tag_id, etc.)
tags_rules (junction table) (rule_id, tag_id) -- no declaration
Issue: applying the filter after the join only removes rules that have exactly one tag (the one I specify). If a rule has multiple tags, one record is removed from the join result, but the rule still appears if any other tags are associated with it.
SQLAlchemy declaration:
class Rule(Base):
    id = Column(Integer, primary_key=True, index=True, autoincrement=True)
    name = Column(String(255), index=True)
    meta = relationship("Meta", back_populates="rule", cascade="all, delete, delete-orphan")
    strings = relationship("String", back_populates="rule", cascade="all, delete, delete-orphan")
    condition = Column(Text)
    imports = Column(Integer)
    tags = relationship("Tag", back_populates="rules", secondary=tags_rules)
    ruleset_id = Column(Integer, ForeignKey("ruleset.id"))
    ruleset = relationship("Ruleset", back_populates="rules")

class Tag(Base):
    id = Column(Integer, primary_key=True, autoincrement=True, index=True)
    name = Column(String(255), index=True)
    rules = relationship("Rule", back_populates="tags", secondary=tags_rules)
I tried subqueries, but the most feasible approach seems to be applying a filter on the tags table before the join.
Current implementation:
rules = rules.select_from(Tag).join(Rule.tags).filter(~Tag.name.in_(tags))
Any idea is greatly appreciated.
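One common way to express "rules that have none of these tags" is a NOT EXISTS condition built with the relationship's any() operator, which sidesteps the join entirely. This sketch is not from the original thread. It assumes SQLAlchemy 1.4 or later, trims the models to the tag relationship, and uses the hypothetical table names rule and tag and made-up sample data.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Junction table; column types are inferred from the foreign keys.
tags_rules = Table(
    "tags_rules",
    Base.metadata,
    Column("rule_id", ForeignKey("rule.id"), primary_key=True),
    Column("tag_id", ForeignKey("tag.id"), primary_key=True),
)


class Rule(Base):
    __tablename__ = "rule"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    tags = relationship("Tag", back_populates="rules", secondary=tags_rules)


class Tag(Base):
    __tablename__ = "tag"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    rules = relationship("Rule", back_populates="tags", secondary=tags_rules)


engine = create_engine("sqlite://")  # in-memory database, demo only (assumption)
Base.metadata.create_all(engine)

with Session(engine) as session:
    a, b = Tag(name="a"), Tag(name="b")
    session.add_all([
        Rule(name="only_a", tags=[a]),
        Rule(name="a_and_b", tags=[a, b]),
        Rule(name="only_b", tags=[b]),
        Rule(name="untagged"),
    ])
    session.commit()

    excluded = ["a"]
    # NOT EXISTS (a tag of this rule whose name is in `excluded`)
    rules = (
        session.query(Rule)
        .filter(~Rule.tags.any(Tag.name.in_(excluded)))
        .order_by(Rule.name)
        .all()
    )
    kept = [rule.name for rule in rules]
```

Because the condition is evaluated per rule rather than per joined row, a rule carrying an excluded tag is dropped no matter how many other tags it has.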

Optimizing hybrid_properties in SQLAlchemy

I have a piece of working code, but it is very inefficient: instead of a single query with a join, I get one initial query followed by one query per row in the response.
I have the following scenario:
class Job(Base, SerializeMixin, JobInterface):
    __tablename__ = 'job_subjobs'
    id = Column(Integer, primary_key=True, autoincrement=True)
    group_id = Column(Integer, ForeignKey("job_groups.id"), nullable=False)

class Crash(Base, SerializeMixin):
    __tablename__ = 'crashes'
    id = Column(Integer, primary_key=True, autoincrement=True)
    job_id = Column(Integer, ForeignKey("job_subjobs.id", ondelete='CASCADE'), nullable=False)
    job = relationship('Job', backref='Crash')

    @hybrid_property
    def job_identifier(self):
        return "{}:{}".format(self.job.group_id, self.job.id)
Given the above, when I query for all crashes, one SELECT fetches all of them. When I then iterate and ask for job_identifier, a separate SELECT is issued for each crash.
self.session.query(Crash).all()
Is there some way I can create a @hybrid_property referencing a different table and have it JOIN from the beginning and preload the expression?
I've experimented with @xxx.expression without success. If all else fails I can add another foreign key to the Crash table, but I would like to avoid changing the current data structure if possible.
I ended up using:
job = relationship('Job', backref='Crash', lazy='joined')
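The effect of eager loading here can be checked by counting the statements that are actually emitted. The sketch below is a simplified stand-in for the models above: it assumes SQLAlchemy 1.4 or later, drops the mixins and the job_groups foreign key, and requests the join per query with joinedload(), which should behave like lazy='joined' on the relationship for that one query. The statements list and the record listener are hypothetical helper names.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


class Job(Base):
    __tablename__ = "job_subjobs"
    id = Column(Integer, primary_key=True)
    group_id = Column(Integer, nullable=False)  # FK to job_groups omitted (assumption)


class Crash(Base):
    __tablename__ = "crashes"
    id = Column(Integer, primary_key=True)
    job_id = Column(Integer, ForeignKey("job_subjobs.id"), nullable=False)
    job = relationship("Job")

    @hybrid_property
    def job_identifier(self):
        return "{}:{}".format(self.job.group_id, self.job.id)


engine = create_engine("sqlite://")  # in-memory database, demo only (assumption)
Base.metadata.create_all(engine)

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record(conn, cursor, statement, parameters, context, executemany):
    # Collect every SQL string sent to the database.
    statements.append(statement)


with Session(engine) as session:
    session.add_all([Crash(job=Job(group_id=g)) for g in (10, 20, 30)])
    session.commit()

    statements.clear()
    crashes = session.query(Crash).options(joinedload(Crash.job)).all()
    # The jobs arrived with the crashes, so this loop triggers no extra SELECTs.
    identifiers = [crash.job_identifier for crash in crashes]
    select_count = sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))
```

Removing the joinedload() option in this sketch should raise select_count to one SELECT for the crashes plus one per distinct job.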

SQLAlchemy filter_by() on a foreign key producing weird SQL

I'm attempting to filter on a foreign key, and none of the SO answers I've found have produced any results.
Here are my query statements.
testing = Comments.query\
    .filter(Comments.post_id==post_id)
print(testing)
testing = Comments\
    .query.join(Post, aliased=True)\
    .filter(Comments.post_id==post_id)
print(testing)
Here's what my class definitions look like:
class Comments(db.Model):
    comment_id = db.Column(db.Integer, primary_key=True)
    post_id = db.Column(
        db.Integer,
        db.ForeignKey("Post.post_id"),
        nullable=False)

class post(db.Model):
    post_id = db.Column(db.Integer, primary_key=True)
Comments = db.relationship(
    'Comments',
    backref='Post',
    lazy='dynamic')
Below are the actual SQL queries produced by the first and second case. Both contain this weird :post_id_1 thing, and in both cases I get an empty result set back.
FROM "Comments"
WHERE "Comments".post_id = :post_id_1
FROM "Comments" JOIN "post" AS "post_1" ON "post_1".post_id = "Comments".post_id
WHERE "Comments".post_id = :post_id_1
If I run a simple
SELECT * FROM Comments WHERE post_id = 1
in the MySQL CLI, I get a result set.
Your model definition is odd; the following part is not correctly indented:
Comments = db.relationship(
    'Comments',
    backref='Post',
    lazy='dynamic')
Or maybe it's just a copy/paste issue (just to be sure).
What you call the "weird :post_id_1 thing" is in fact a named placeholder. It is replaced by the real value when the SQL statement is executed (this is mainly to avoid SQL injection; the driver is responsible for escaping the values).
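To see which value will be bound to such a placeholder, the statement can be compiled and its parameters inspected. This is a small sketch assuming SQLAlchemy 1.4 or later and a plain declarative model in place of Flask-SQLAlchemy; the literal_binds rendering is intended for debugging output only, not for building SQL that gets executed.

```python
from sqlalchemy import Column, Integer, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Comments(Base):
    __tablename__ = "Comments"
    comment_id = Column(Integer, primary_key=True)
    post_id = Column(Integer, nullable=False)  # foreign key omitted for brevity


stmt = select(Comments).where(Comments.post_id == 1)

# Printing a statement shows the placeholder, not the value.
compiled = stmt.compile()
sql_text = str(compiled)

# The value travels separately and is handed to the driver at execution time.
params = compiled.params

# For debugging only: render the value inline instead of the placeholder.
inlined = str(stmt.compile(compile_kwargs={"literal_binds": True}))
```

If params holds the expected post id and the result is still empty, the placeholder is probably not the cause, and the data or the model mapping deserves a closer look.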

SQLAlchemy recursive many-to-many relation

I have a case where I use a single table, called profile, to store both user and group data. So basically this table has a many-to-many relation with itself, for the cases where one user belongs to many groups or one group has many users.
I'm a bit confused about how this should be described.
Here's a simplified presentation of the class.
Entity relationship model
user_group_table = Table('user_group', metadata,
    Column('user_id', Integer, ForeignKey('profiles.id',
        onupdate="CASCADE", ondelete="CASCADE")),
    Column('group_id', Integer, ForeignKey('profiles.id',
        onupdate="CASCADE", ondelete="CASCADE"))
)

class Profile(Base):
    __tablename__ = 'profiles'
    id = Column(Integer, autoincrement=True, primary_key=True)
    name = Column(Unicode(16), unique=True)  # This can be either user- / groupname
    groups = relationship('Profile', secondary=user_group_table, backref='users')
    users = relationship('Profile', secondary=user_group_table, backref='groups')

# Example of the usage:
user = Profile()
user.name = 'Peter'
salesGroup = Profile()
salesGroup.name = 'Sales'
user.groups.append(salesGroup)
salesGroup.users
>[peter]
First of all, I agree with Raven's comment that you should use separate tables for Users and Groups. The reason is that you might end up with inconsistent data, where a User has other Users as its users relations, and you might also get cycles in the relationship tree.
Having said that, to make the relationship work, declare it as follows:
...
class Profile(Base):
    __tablename__ = 'profiles'
    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(Unicode(16), unique=True)  # This can be either user- / groupname
    groups = relationship('Profile',
        secondary=user_group_table,
        primaryjoin=user_group_table.c.user_id==id,
        secondaryjoin=user_group_table.c.group_id==id,
        backref='users')
...
Also see the Specifying Alternate Join Conditions section of the relationship() documentation.
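Put together with the usage example from the question, the declaration above can be exercised end to end. This is a minimal sketch assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the CASCADE options are left out and the association columns are made a composite primary key, both choices made only for this illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, Unicode, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

user_group_table = Table(
    "user_group",
    Base.metadata,
    Column("user_id", Integer, ForeignKey("profiles.id"), primary_key=True),
    Column("group_id", Integer, ForeignKey("profiles.id"), primary_key=True),
)


class Profile(Base):
    __tablename__ = "profiles"
    id = Column(Integer, primary_key=True)
    name = Column(Unicode(16), unique=True)  # either a user name or a group name
    # Both foreign keys point at profiles.id, so each side of the join
    # has to be spelled out explicitly.
    groups = relationship(
        "Profile",
        secondary=user_group_table,
        primaryjoin=user_group_table.c.user_id == id,
        secondaryjoin=user_group_table.c.group_id == id,
        backref="users",
    )


engine = create_engine("sqlite://")  # in-memory database, demo only (assumption)
Base.metadata.create_all(engine)

with Session(engine) as session:
    peter = Profile(name="Peter")
    sales = Profile(name="Sales")
    peter.groups.append(sales)
    session.add_all([peter, sales])
    session.commit()

    # The backref exposes the reverse direction of the same association rows.
    member_names = [profile.name for profile in sales.users]
    group_names = [profile.name for profile in peter.groups]
```

Only one relationship() is declared; the users side comes from the backref, which avoids the two declarations in the question overwriting each other.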