Order of defining association object, related tables using Flask-SQLAlchemy?

I'm working through Miguel Grinberg's Flask book.
In chapter 12, he has you define an association object Follow with followers and the followed, both mapping to a user, as well as adding followers and followed to the User class.
I originally put the association table after the User table, and got an error when I ran python manage.py db upgrade:
line 75, in User
    followed = db.relationship('Follow', foreign_keys=[Follow.follower_id],
NameError: name 'Follow' is not defined
Then I moved the association object class Follow above the class User definition, and re-ran the migration. This time it worked.
Can someone explain the reason for this?
Both class definitions seem to need the other.
Is order something I should know about flask-sqlalchemy specifically, sqlalchemy, or ORM in general?
The SQLAlchemy documentation says "we can define the association_table at a later point, as long as it’s available to the callable after all module initialization is complete" and the relationship is defined in the class itself.
That is, for the case where you're using an association_table to show the relationship between two separate models. I didn't see anything about this case in the Flask-SQLAlchemy or SQLAlchemy documentation, but it's very possible I just didn't recognize the answer when I saw it.
class User(UserMixin, db.Model):
    __tablename__ = 'users'
    ...
    followed = db.relationship('Follow',
                               foreign_keys=[Follow.follower_id],
                               backref=db.backref('follower', lazy='joined'),
                               lazy='dynamic',
                               cascade='all, delete-orphan')
    followers = db.relationship('Follow',
                                foreign_keys=[Follow.followed_id],
                                backref=db.backref('followed', lazy='joined'),
                                lazy='dynamic',
                                cascade='all, delete-orphan')
Order of definition with:
class Follow(db.Model):
    __tablename__ = 'follows'
    follower_id = db.Column(db.Integer, db.ForeignKey('users.id'), primary_key=True)
    followed_id = db.Column(db.Integer, db.ForeignKey('users.id'), primary_key=True)
    timestamp = db.Column(db.DateTime, default=datetime.utcnow)
Or maybe order doesn't matter at all, and I am misattributing a problem?

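First of all, a name must already be defined at the point where Python evaluates it. A class body runs as soon as the class statement is reached, so foreign_keys=[Follow.follower_id] inside User looks up the name Follow immediately and fails if class Follow is defined later in the module. The string 'Follow' passed as the first argument is different: SQLAlchemy resolves it later, when the mappers are configured, so that part does not need Follow to exist yet.
Second, the SQLAlchemy documentation you quote describes defining a third table, a plain association table, to create the relationship. With that approach the User and Follow classes would not access each other's attributes in the class body, so it would not cause a definition order error.
Finally, if you do not define an association table and instead refer to the attributes of an association object directly, you have to put the classes in the right order.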
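One way to sidestep the ordering problem is sketched below with plain SQLAlchemy rather than Flask-SQLAlchemy. When the declarative system is used, relationship() also accepts foreign_keys as a string, which is evaluated when the mappers are configured rather than when the class body runs. The username column, the in-memory SQLite engine, and the alice/bob objects are assumptions made for illustration; the lazy, cascade, and timestamp settings from the question are left out for brevity.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    username = Column(String)  # assumed column, only for illustration
    # The strings are resolved at mapper configuration time, so Follow
    # does not have to exist yet when this class body executes.
    followed = relationship('Follow', foreign_keys='Follow.follower_id',
                            backref='follower')
    followers = relationship('Follow', foreign_keys='Follow.followed_id',
                             backref='followed')


class Follow(Base):  # defined after User on purpose
    __tablename__ = 'follows'
    follower_id = Column(Integer, ForeignKey('users.id'), primary_key=True)
    followed_id = Column(Integer, ForeignKey('users.id'), primary_key=True)


engine = create_engine('sqlite://')  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice, bob = User(username='alice'), User(username='bob')
    session.add_all([alice, bob, Follow(follower=alice, followed=bob)])
    session.commit()
    # Copy plain values out while the session is still open.
    alice_id, bob_id = alice.id, bob.id
    followed_ids = [f.followed_id for f in alice.followed]
    follower_ids = [f.follower_id for f in bob.followers]
```

With the string form, either class can come first in the module; the NameError only appears when Follow.follower_id is written as a bare Python expression before Follow exists.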

Related

What's the difference between an association table and a regular table?

I don't think I fully understand association tables. I know how to work with normal tables, i.e. add rows and whatnot, but I don't understand how to work with an association table.
Why would I use the below
student_identifier = db.Table('student_identifier',
    db.Column('class_id', db.Integer, db.ForeignKey('classes.class_id')),
    db.Column('user_id', db.Integer, db.ForeignKey('students.user_id'))
)
Vs
class studentIdent(db.model):
    db.Column(db.Integer, db.ForeignKey('classes.class_id')),
    db.Column(db.Integer, db.ForeignKey('students.user_id'))
As mentioned in a comment to the question, you would not bother creating a class for the association table if it only contains the foreign keys linking the two tables in the many-to-many relationship. In that case your first example – an association table – would be sufficient.
However, if you want to store additional information about the nature of the link between the two tables then you will want to create an association object so you can manipulate those additional attributes:
class StudentIdent(db.Model):
    __tablename__ = "student_identifier"
    # Positional arguments (type, ForeignKey) must come before keyword arguments.
    course_id = db.Column(
        db.Integer,
        db.ForeignKey('courses.course_id'),
        primary_key=True,
        autoincrement=False
    )
    user_id = db.Column(
        db.Integer,
        db.ForeignKey('students.user_id'),
        primary_key=True,
        autoincrement=False
    )
    enrolment_type = db.Column(db.String(20))
    # reason for student taking this course
    # e.g., "core course", "elective", "audit"
and then you could create the link between a given student and a particular course by creating a new instance of the association object:
thing = StudentIdent(course_id=3, user_id=6, enrolment_type="elective")
Note: This is just a basic linkage. You can get more sophisticated by explicitly declaring a relationship between the ORM objects.
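A minimal sketch of that more explicit linkage, written in plain SQLAlchemy rather than Flask-SQLAlchemy. The Student and Course classes, their single-column layouts, the enrolments attribute name, and the in-memory SQLite engine are assumptions made for illustration; only StudentIdent follows the example above.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Student(Base):  # hypothetical model
    __tablename__ = 'students'
    user_id = Column(Integer, primary_key=True)
    enrolments = relationship('StudentIdent', back_populates='student')


class Course(Base):  # hypothetical model
    __tablename__ = 'courses'
    course_id = Column(Integer, primary_key=True)
    enrolments = relationship('StudentIdent', back_populates='course')


class StudentIdent(Base):
    __tablename__ = 'student_identifier'
    course_id = Column(Integer, ForeignKey('courses.course_id'), primary_key=True)
    user_id = Column(Integer, ForeignKey('students.user_id'), primary_key=True)
    enrolment_type = Column(String(20))
    # Relationships let you navigate from the link to both ends and back.
    student = relationship('Student', back_populates='enrolments')
    course = relationship('Course', back_populates='enrolments')


engine = create_engine('sqlite://')  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    student, course = Student(user_id=6), Course(course_id=3)
    link = StudentIdent(student=student, course=course, enrolment_type='elective')
    session.add_all([student, course, link])
    session.commit()
    # Copy plain values out while the session is still open.
    enrolment_types = [e.enrolment_type for e in student.enrolments]
    linked_course_ids = [e.course_id for e in student.enrolments]
```

Here the foreign key values on the link row are filled in from the related objects at flush time, so you assign objects instead of raw ids.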

Can not mix get and filter together?

When I run the following query:
answer = sess.query(User).filter(User.id==1).get(1)
I get this error: sqlalchemy.exc.InvalidRequestError: Query.get() being called on a Query with existing criterion.
The query:
answer = sess.query(User).get(1)
works fine.
Why does the first one not work?
My class definition:
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    adr = relationship('Address', backref='uuu')
From documentation of Query.get:
get() is only used to return a single mapped instance, not multiple instances or individual column constructs, and strictly on a single primary key value. The originating Query must be constructed in this way, i.e. against a single mapped entity, with no additional filtering criterion. Loading options via options() may be applied however, and will be used if the object is not yet locally present.
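The rule quoted above can be seen in a small self-contained sketch. It assumes an in-memory SQLite database and a trimmed-down User model without the Address relationship; the variable names are illustrative. Use either a primary-key lookup on an unfiltered query, or a filter followed by a method such as one_or_none(), but not both.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine('sqlite://')  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as sess:
    sess.add(User(id=1, name='alice'))
    sess.commit()

    # Option 1: primary-key lookup on a query with no criteria.
    by_pk = sess.query(User).get(1)
    # Option 2: ordinary filtering, then fetch at most one row.
    by_filter = sess.query(User).filter(User.id == 1).one_or_none()
    same_object = by_pk is by_filter
    found_name = by_pk.name

    # Mixing the two is rejected, as in the question.
    try:
        sess.query(User).filter(User.id == 1).get(1)
        mixed_raises = False
    except InvalidRequestError:
        mixed_raises = True
```

Both working options return the same instance here because the session keeps one object per primary key.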

Django GenericForeignKey vs custom manual fields performance/optimization

I'm trying to build a typical social networking site. There are mainly two types of objects:
photo
status
A user can like a photo or a status. (Note that these two are mutually exclusive.)
That is, we have two tables: one for images only and the other for statuses only.
Now, when a user likes an object (it could be a photo or a status), how should I store that information?
I want to design an efficient SQL schema for this.
Currently I'm using GenericForeignKey (GFK):
class LikedObject(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = GenericForeignKey('content_type', 'object_id')
But yesterday I wondered whether I could do this efficiently without using GFK:
class LikedObject(models.Model):
    OBJECT_TYPE = (
        ('status', 'Status'),
        ('image', 'Image/Photo'),
    )
    user = models.ForeignKey(User, related_name="liked_objects")
    obj_id = models.PositiveIntegerField()
    obj_type = models.CharField(max_length=63, choices=OBJECT_TYPE)
The only difference I can see is that I have to make two queries if I want to get all liked statuses of a particular user:
status_ids = LikedObject.objects.filter(user=user_obj, obj_type='status').values_list('obj_id', flat=True)
status_objs = Status.objects.filter(id__in=status_ids)
Am I correct? What would be the best approach in terms of easy querying/inserting, performance, etc.?
You are basically implementing your own Generic Object, only you limit your ContentType to your hard coded OBJECT_TYPE.
If you are only going to access the database as in your example (get all status objects liked by user x), or a couple specific queries, then your own implementation can be a little faster, of course. But obviously, if later you have to add more objects, or do other things, you may find yourself implementing your whole full generic solution. And like they say, why reinvent the wheel.
If you want better performance, and really only have those two Models to worry about, you may just want to have two different Like tables (StatusLike and ImageLike) and use inheritance to share functionality.
class LikedObject(models.Model):
    common_field = ...
    class Meta:
        abstract = True
    def some_share_function(self):
        ...
class StatusLikeObject(LikedObject):
    user = models.ForeignKey(User, related_name="status_liked_objects")
    status = models.ForeignKey(Status, related_name="liked_objects")
class ImageLikeObject(LikedObject):
    user = models.ForeignKey(User, related_name="image_liked_objects")
    image = models.ForeignKey(Image, related_name="liked_objects")
Basically, either you have a lot of Models to worry about, and then you probably want to use the more Django generic object implementation, or you only have two models, and why even bother with a half generic solution. Just use two tables.
In this case, I would check if your data objects Status and Photo may have many common data fields, e.g. Status.user and Photo.user, Status.title and Photo.title, Status.pub_date and Photo.pub_date, Status.text and Photo.caption, etc.
Could you combine them into an Item object maybe? That Item would have a Item.type field, either "photo" or "status"? Then you would only have a single table and a single object type a user can "like". Much simpler at basically no cost.
Edit:
from django.db import models
from django.utils.timezone import now
class Item(models.Model):
    data_type = models.SmallIntegerField(
        choices=((1, 'Status'), (2, 'Photo')), default=1)
    user = models.ForeignKey(User)
    title = models.CharField(max_length=100)
    pub_date = models.DateTimeField(default=now)
    # ...etc...
class Like(models.Model):
    user = models.ForeignKey(User, related_name="liked_objects")
    item = models.ForeignKey(Item)

SQLAlchemy db.session.query() vs model.query

For a simple return all results query should one method be preferred over the other? I can find uses of both online but can't really find anything describing the differences.
db.session.query([my model name]).all()
[my model name].query.all()
I feel that [my model name].query.all() is more descriptive.
It is hard to give a clear answer, as there is a high degree of preference subjectivity in answering this question.
From one perspective, the db.session is desired, because the second approach requires it to be incorporated in your model as an added step - it is not there by default as part of the Base class. For instance:
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base, scoped_session, sessionmaker
Base = declarative_base()
DBSession = scoped_session(sessionmaker())
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    password = Column(String)
session = DBSession()
print(User.query)
That code fails with the following error:
AttributeError: type object 'User' has no attribute 'query'
You need to do something like this:
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    password = Column(String)
    query = DBSession.query_property()
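As a rough check that the two spellings are interchangeable once query_property() is in place, here is a self-contained sketch. The in-memory SQLite engine, the shortened User model, and the two sample rows are assumptions made for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, scoped_session, sessionmaker

engine = create_engine('sqlite://')  # in-memory database
DBSession = scoped_session(sessionmaker(bind=engine))
Base = declarative_base()


class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # Exposes User.query, bound to the current scoped session.
    query = DBSession.query_property()


Base.metadata.create_all(engine)
DBSession.add_all([User(name='ann'), User(name='bob')])
DBSession.commit()

# Both forms run against the same session and return the same rows.
via_model = sorted(u.name for u in User.query.all())
via_session = sorted(u.name for u in DBSession.query(User).all())
DBSession.remove()
```

The model-level form is only a shortcut for the session-level one, so the choice between them is about readability rather than behaviour.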
However, it could also be argued that just because it is not enabled by default, that doesn't invalidate it as a reasonable way to launch queries. Furthermore, in the flask-sqlalchemy package (which simplifies sqlalchemy integration into the flask web framework) this is already done for you as part of the Model class (doc). Adding the query property to a model can also be seen in the sqlalchemy tutorial (doc):
class User(object):
    query = db_session.query_property()
    ....
Thus, people could argue either approach.
I personally have a preference for the second method when I am selecting from a single table. For example:
serv = Service.query.join(Supplier, SupplierUsr).filter(SupplierUsr.username == usr).all()
This is because it is of smaller line length and still easily readable.
If I am selecting from more than one table or specifying columns, then I would use the session query method, as it is extracting information from more than one model.
deliverables = db.session.query(Deliverable.column1, BatchInstance.column2).\
    join(BatchInstance, Service, Supplier, SupplierUser).\
    filter(SupplierUser.username == str(current_user)).\
    order_by(Deliverable.created_time.desc()).all()
That said, a counter argument could be made in always using the session.query method as it makes the code more consistent, and when reading left to right, the reader immediately knows that the sqlalchemy directive they are going to read will be query, before mentally absorbing what tables and columns are involved.
At the end of the day, the answer to your question is subjective and there is no correct answer, and any code readability benefits either way are tiny. The only thing where I see a strong benefit is not to use model query if you are selecting from many tables and instead use the session.query method.

Create Active/Archive Models in a DRY way (Django)

I have a model like the following, which is growing too large and needs to be split into a separate active table. At the end of the day, one table will contain all objects and the other will only contain active objects.
class Tickets(models.Model):
    price = ....
    number = .....
    date = ....
    active = ....
    parent = models.ForeignKey('self', related_name='children')
    ManyMoreFields
There are two sources of complexity:
1) The parent field on the ActiveTickets table is going to point to the Tickets table. The related_name should not change.
2) The ActiveTickets and Tickets table both have proxy Models that inherit from them.
class CityTickets(Tickets):
    class Meta:
        proxy = True
class ActiveCityTickets(ActiveTickets):
    class Meta:
        proxy = True
Obviously, I could just copy and paste all of the fields in Ticket (there are many), but that is not the right way of doing it. I've tried to use Abstract inheritance and Mixins (defining the fields in a separate class that is inherited by both Tickets and ActiveTickets).
One issue with abstract inheritance is that the ForeignKey field, parent, is causing issues since it's duplicative and the related_name is the same. Generally, my attempts have caused my unit and functional tests to fail.
What are some elegant approaches here? Should I think about creating two separate MySQL tables and then just using a single Model with multiple managers (and db routers)? Is that reasonable?
Maybe this helps:
class Base(models.Model):
    m2m = models.ManyToManyField(OtherModel, related_name="%(app_label)s_%(class)s_related")
    class Meta:
        abstract = True
https://docs.djangoproject.com/en/dev/topics/db/models/#be-careful-with-related-name
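To see why that placeholder pattern avoids the clash, here is a small pure-Python illustration of the substitution the linked documentation describes: '%(class)s' becomes the lowercased name of the concrete child class and '%(app_label)s' the lowercased app label. The helper expand_related_name and the app label ticketing are hypothetical; this only mimics the documented behaviour and is not Django's own code.

```python
RELATED_NAME_TEMPLATE = "%(app_label)s_%(class)s_related"


def expand_related_name(template, app_label, class_name):
    """Fill the placeholders the way the Django docs describe (lowercased)."""
    return template % {
        "app_label": app_label.lower(),
        "class": class_name.lower(),
    }


# Each concrete child of the abstract base gets its own reverse name.
names = [
    expand_related_name(RELATED_NAME_TEMPLATE, "ticketing", cls)
    for cls in ("Tickets", "ActiveTickets")
]
print(names)
```

Because each child class yields a different reverse name, two models inheriting the same field from an abstract base no longer collide.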