subquery in select statement

subquery in select statement - sqlalchemy

I have two tables (albums,pictures) in a one to many relationship and I want to display each albums details with one picture so I have the following query
select albums.name
, (
select pictures.path
from pictures
where pictures.albumid = albums.id
limit 1
) as picture
from albums
where ...
Now I'm struggling creating this on Pylons with sqlalchemy I tried to do the following
picture = Session.query(model.Picture)
sub_q = picture.filter_by(albumid = model.Album.id).limit(1).subquery()
album_q = Session.query(model.Album, sub_q)
result = album_q.all()
but it creates the following statement displaying the incorrect picture beacuse the table albums is included in the subquery
select albums.name
, (
select pictures.path
from pictures, albums
where pictures.albumid = albums.id
)
from albums
where ...
Am I doing it wrong?, is this even possible in sqlalchemy?.
That works perfectly. I'm sorry but I forgot to say I'm using sqlalchemy reflection. I'm gonna try backreferencing the object and see if it works.

If I understand the basic premise of the question (admittedly it's a little unclear), it sounds using relationships might accomplish this better. From what I'm gathering, a picture has one album, but an album has many pictures? Anyway, that's somewhat irrelevant as the relationship concept is what's important here.
You haven't specified how your model is setup, but using declarative syntax you could define the tables like below:
class Album(Base):
__tablename__ = 'albums'
id = Column(Integer, Sequence('album_id_sequence'), primary_key=True)
class Picture(Base):
__tablename__ = 'pictures'
id = Column(Integer, Sequence('picture_id_sequence'), primary_key=True)
albumid = Column(Integer, ForeignKey('albums.id'))
picture = Column(String(99))
album = relationship('Album', backref=backref('pictures'))
Sqlalchemy then takes care of all the hard work for you. Now every 'picture' has an 'album' property, callable via 'picture.album'. In code this looks like:
pictures = session.query(Picture).all()
for picture in pictures:
for album in picture.albums:
# do stuff with album
And vice-versa, if you have an album, you can get a picture (this is what the backref in the relationship definition does):
albums = session.query(Album).all()
for album in albums:
album.picture # the picture object related to the album
I haven't tested the code so chances are there are typos, but hopefully you get the general idea.

Related

What is the best practice for lookup values in SQLAlchemy?

I am writing a pretty basic Flask application using Flask-SQLAlchemy for tracking inventory and distribution. I could use some guidance on how the best way to handle a lookup table for common values. My database back end will be MySQL and ElasticSearch for searches.
If I have a common mapping structure where all data going into a specific table, say Vehicle, have a common list of values to look up against for the Vehicle.make column, what would the best way to achieve this be?
My thought for approaching this is one of two ways:
Lookup Table
I could set something up similar to this where I have a relationship, and store the make in VehicleMake. However, if my expected list of makes is low (say 10), this seems unnecessary.
class VehicleMake(Model):
id = Column(Integer, primary_key=True)
name = Column(String(16))
cars = relationship('Vehicle', backref='make', lazy='dynamic')
class Vehicle(Model):
id = Column(Integer, primary_key=True)
name = Column(String(32))
Store as a String
I could just store this as a string on the Vehicle model. But would it be a waste of space to store a common value as a string?
class Vehicle(Model):
id = Column(Integer, primary_key=True)
name = Column(String(32))
make = Column(String(16))
My original idea was just to have a dict containing a mapping like this and reference it as needed within the model. I am just not clear how to tie this in when returning the vehicle model.
MAKE_LIST = {
1: 'Ford',
2: 'Dodge',
3: 'Chevrolet'
}
Any feedback is welcome - and if there is documentation that covers this specific scenario I'm happy to read that and answer this question myself. My expected volume is going to be low (40-80 records per week) so it doesn't need to be ridiculously fast, I just want to follow best practices.

The short answer is it depends.
The long answer is that it depends on what you store along with the make of said vehicles and how often you expect to add new types.
If you need to store more than just the name of each make, but also some additional metadata, like the size of the gas tank, the cargo space, or even a sortkey, go for an additional table. The overhead of such a small table is minimal, and if you communicate with the frontend using make ids instead of make names, there is no problem at all with this. Just remember to add an index to vehicle.make_id to make the lookups efficient.
class VehicleMake(Model):
id = Column(Integer, primary_key=True)
name = Column(String(16))
cars = relationship('Vehicle', back_populates="make", lazy='dynamic')
class Vehicle(Model):
id = Column(Integer, primary_key=True)
name = Column(String(32))
make_id = Column(Integer, ForeignKey('vehicle_make.id'), nullable=False)
make = relationship("VehicleType", innerjoin=True)
Vehicle.query.get(1).make.name # == 'Ford', the make for vehicle 1
Vehicle.query.filter(Vehicle.make_id == 2).all() # all Vehicles with make id 2
Vehicle.query.join(VehicleMake)\
.filter(VehicleMake.name == 'Ford').all() # all Vehicles with make name 'Ford'
If you don't need to store any of that metadata, then the need for a separate table disappears. However, the general problem with strings is that there is a high risk of spelling errors and capital/lowercase letters screwing up your data consistency. If you don't need to add new makes much, it's a lot better to just use Enums, there are even MySQL specific ones in SQLAlchemy.
import enum
class VehicleMake(enum.Enum):
FORD = 1
DODGE = 2
CHEVROLET = 3
class Vehicle(Model):
id = Column(Integer, primary_key=True)
name = Column(String(32))
make = Column(Enum(VehicleMake), nullable=False)
Vehicle.query.get(1).make.name # == 'FORD', the make for vehicle 1
Vehicle.query.filter(Vehicle.make == VehicleMake(2)).all() # all Vehicles with make id 2
Vehicle.query.filter(Vehicle.make == VehicleMake.FORD).all() # all Vehicles with make name 'Ford'
The main drawback of enums is that they might be hard to extend with new values, although at least for Postgres the dialect specific version was a lot better at this than the general SQLAlchemy one, have a look at sqlalchemy.dialects.mysql.ENUM instead. If you want to extend your existing enum, you can always just execute raw SQL in your Flask-Migrate/Alembic migrations.
Finally, the benefits of using strings is that you can always programmatically enforce your data consistency. But, this comes at the cost that you have to programmatically enforce your data consistency. If the vehicle make can be changed or inserted by external users, even colleagues, this will get you in trouble unless you're very strict about what enters your database. For example, it might be nice to uppercase all values for easy grouping, since it effectively reduces how much can go wrong. You can do this during writing, or you can add an index on sqlalchemy.func.upper(Vehicle.make) and use hybrid properties to always query the uppercase value.
class Vehicle(Model):
id = Column(Integer, primary_key=True)
name = Column(String(32))
_make = Column('make', String(16))
#hybrid_property
def make(self):
return self._make.upper()
#make.expression
def make(cls):
return func.upper(cls._make)
Vehicle.query.get(1).make.upper() # == 'FORD', the make for vehicle 1
Vehicle.query.filter(Vehicle.make == 'FORD').all() # all Vehicles with make name 'FORD'
Before you make your choice, also think about how you want to present this to your user. If they should be able to add new options themselves, use strings or the separate table. If you want to show a dropdown of possibilities, use the enum or the table. If you have an empty database, it's going to be difficult to collect all string values to display in the frontend without needing to store this as a list somewhere in your Flask environment as well.

Django GenereicForeignKey v/s custom manual fields performance/optimization

I'm trying to build a typical social networking site. there are two types of objects mainly.
photo
status
a user can like photo and status. (Note that these two are mutually exclusive)
means, We have two table (1) for Image only and other for status only.
now when a user likes an object(it could be a photo or status) how should I store that info.
I want to design a efficient SQL schema for this.
Currently I'm using Genericforeignkey(GFK)
class LikedObject(models.Model):
content_type = models.ForeignKey(ContentType)
object_id = models.PositiveIntegerField()
content_object = GenericForeignKey('content_type', 'object_id')
but yesterday I thought if I can do this without using GFK efficiently?
class LikedObject(models.Model):
OBJECT_TYPE = (
('status', 'Status'),
('image', 'Image/Photo'),
)
user = models.ForeignKey(User, related_name="liked_objects")
obj_id = models.PositiveIntegerField()
obj_type = models.CharField(max_length=63, choices=OBJECT_TYPE)
the only difference I can understand is that I have to make two queries if I want to get all liked_status of a particular user
status_ids = LikedObject.objects.filter(user=user_obj, obj_type='status').values_list('object_id', flat=True)
status_objs = Status.objects.filter(id__in=status_ids)
Am I correct? so What would be the best approach in terms of easy querying/inserting or performance, etc.

You are basically implementing your own Generic Object, only you limit your ContentType to your hard coded OBJECT_TYPE.
If you are only going to access the database as in your example (get all status objects liked by user x), or a couple specific queries, then your own implementation can be a little faster, of course. But obviously, if later you have to add more objects, or do other things, you may find yourself implementing your whole full generic solution. And like they say, why reinvent the wheel.
If you want better performance, and really only have those two Models to worry about, you may just want to have two different Like tables (StatusLike and ImageLike) and use inheritance to share functionality.
class LikedObject(models.Model):
common_field = ...
class Meta:
abstract = True
def some_share_function():
...
class StatusLikeObject(LikedObject):
user = models.ForeignKey(User, related_name="status_liked_objects")
status = models.ForeignKey(Status, related_name="liked_objects")
class ImageLikeObject(LikedObject):
user = models.ForeignKey(User, related_name="image_liked_objects")
image = models.ForeignKey(Image, related_name="liked_objects")
Basically, either you have a lot of Models to worry about, and then you probably want to use the more Django generic object implementation, or you only have two models, and why even bother with a half generic solution. Just use two tables.

In this case, I would check if your data objects Status and Photo may have many common data fields, e.g. Status.user and Photo.user, Status.title and Photo.title, Status.pub_date and Photo.pub_date, Status.text and Photo.caption, etc.
Could you combine them into an Item object maybe? That Item would have a Item.type field, either "photo" or "status"? Then you would only have a single table and a single object type a user can "like". Much simpler at basically no cost.
Edit:
from django.db import models
from django.utils.timezone import now
class Item(models.Model):
data_type = models.SmallIntegerField(
choices=((1, 'Status'), (2, 'Photo')), default=1)
user = models.ForeignKey(User)
title = models.CharField(max_length=100)
pub_date = models.DateTimeField(default=now)
...etc...
class Like(models.Model):
user = models.ForeignKey(User, related_name="liked_objects")
item = models.ForeignKey(Item)

How to eagerly load all relationships in SQLAlchemy

I have the following model:
class Item(Base):
a = relationship(<whatever>)
b = relationship(<whatever>)
c = relationship(<whatever>)
d = relationship(<whatever>)
other_stuff = Column(<whatever>)
Most of the time, I just want to see the other_stuff column, so I don't specify lazy='joined' in the relationship. But sometimes, I want to see all the joined fields, and I want them to be loaded in one SQL query. I could do the following:
query(Item).options(joinedload('a')).options(joinedload('b')).options(joinedload('c')).options(joinedload('d'))
But I feel like this is a common enough use case that there has to be a prettier way to do it.

You can simply say .options(joinedload('*')).
For reference: http://docs.sqlalchemy.org/en/latest/orm/loading_relationships.html#wildcard-loading-strategies

sqlalchemy relations and query on relations

Suppose I have 3 tables in sqlalchemy. Users, Roles and UserRoles defined in declarative way. How would one suggest on doing something like this:
user = Users.query.get(1) # get user with id = 1
user_roles = user.roles.query.limit(10).all()
Currently, if I want to get the user roles I have to query any of the 3 tables and perform the joins in order to get the expected results. Calling directly user.roles brings a list of items that I cannot filter or limit so it's not very helpful. Joining things is not very helpful either since I'm trying to make a rest interface with requests such as:
localhost/users/1/roles so just by that query I need to be able to do Users.query.get(1).roles.limit(10) etc etc which really should 'smart up' my rest interface without too much bloated code and if conditionals and without having to know which table to join on what. Users model already has the roles as a relationship property so why can't I simply query on a relationship property like I do with normal models?

Simply use Dynamic Relationship Loaders. Code below verbatim from the documentation linked to above:
class User(Base):
__tablename__ = 'user'
posts = relationship(Post, lazy="dynamic")
jack = session.query(User).get(id)
# filter Jack's blog posts
posts = jack.posts.filter(Post.headline=='this is a post')
# apply array slices
posts = jack.posts[5:20]

Django Combining AND and OR Queries with ManyToMany Field

hoping someone can help me out with this.
I'm trying to figure out whether I can construct a query that will allow me to retrieve items from my db based on a ForeignKey field and a ManyToManyField at the same time. The challenging part is that it will need to filter on multiple ManyToMany objects.
An example will hopefully make this clearer. Here are my (simplified) models:
class Item(models.Model):
name = models.CharField(max_length=200)
brand = models.ForeignKey(User, related_name='brand')
tags = models.ManyToManyField(Tag, blank=True, null=True)
def __unicode__(self):
return self.name
class Meta:
ordering = ['-id']
class Tag(models.Model):
name = models.CharField(max_length=64, unique=True)
def __unicode__(self):
return self.name
I would like to build a query that retrieves items based on two criteria:
Items that were uploaded by users that a user is following (called 'brand' in the model). So for example if a user is following the Paramount user account, I would want all items where brand = Paramount.
Items that match the keywords in saved searches. For example the user could make and save the following search: "80s comedy". In this case I would want all items where the tags include both "80s" and "comedy".
Now I know how to construct the query for each independently. For #1, it's:
items = Item.objects.filter(brand=brand)
And for #2 (based on the docs):
items = Item.objects.filter(tags__name='80s').filter(tags__name='comedy')
My question is: is it possible to construct this as a single query so that I don't have to take the hit of converting each query into a list, joining them, and removing duplicates?
The challenge seems to be that there is no way to use Q objects to construct queries where you need an item's manytomany field (in this case tags) to match multiple values. The following query:
items = Item.objects.filter(Q(tags__name='80s') & Q(tags__name='comedy'))
does NOT work.
(See: https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships)
Thanks in advance for your help on this!

After much research I could not find a way to combine this into a single query, so I ended up converting my QuerySets into lists and combining them.

Django's filters automatically AND. Q objects are only needed if you're trying to add ORs. Also, the __in query filter will help you out alot.
Assuming users have multiple brands they like and you want to return any brand they like, you should use:
`brand__in=brands`
Where brands is the queryset returned by something like someuser.brands.all().
The same can be used for your search parameters:
`tags__name__in=['80s','comedy']`
That will return things tagged with either '80s' or 'comedy'. If you need both (things tagged with both '80s' AND 'comedy'), you'll have to pass each one in a successive filter:
keywords = ['80s','comedy']
for keyword in keywords:
qs = qs.filter(tags__name=keyword)
P.S. related_name values should always specify the opposite relationship. You're going to have logic problems with the way you're doing it currently. For example:
brand = models.ForeignKey(User, related_name='brand')
Means that somebrand.brand.all() will actually return Item objects. It should be:
brand = models.ForeignKey(User, related_name='items')
Then, you can get a brands's items with somebrand.items.all(). Makes much more sense.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

subquery in select statement - sqlalchemy

Related

What is the best practice for lookup values in SQLAlchemy?

Django GenereicForeignKey v/s custom manual fields performance/optimization

How to eagerly load all relationships in SQLAlchemy

sqlalchemy relations and query on relations

Django Combining AND and OR Queries with ManyToMany Field

Categories

Resources