What's the difference between an association table and a regular table? - sqlalchemy

I don't think I fully understand association tables. I know how to work with normal tables (adding rows and so on), but I don't understand how to work with an association table.
Why would I use the following
student_identifier = db.Table('student_identifier',
    db.Column('class_id', db.Integer, db.ForeignKey('classes.class_id')),
    db.Column('user_id', db.Integer, db.ForeignKey('students.user_id'))
)
vs.
class StudentIdent(db.Model):
    class_id = db.Column(db.Integer, db.ForeignKey('classes.class_id'), primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey('students.user_id'), primary_key=True)

As mentioned in a comment to the question, you would not bother creating a class for the association table if it only contains the foreign keys linking the two tables in the many-to-many relationship. In that case your first example – an association table – would be sufficient.
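To make "working with" a plain association table concrete, here is a minimal sketch using plain SQLAlchemy (not Flask-SQLAlchemy) and an in-memory SQLite database. The Student and Class models, their name/title columns, and the composite primary key on the association table are assumptions added for illustration; the question does not show them. The point is that you never insert into the association table yourself: you append to the relationship and the ORM writes the link row.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Plain association table: only the two foreign keys, no mapped class.
student_identifier = Table(
    "student_identifier",
    Base.metadata,
    Column("class_id", Integer, ForeignKey("classes.class_id"), primary_key=True),
    Column("user_id", Integer, ForeignKey("students.user_id"), primary_key=True),
)


class Student(Base):  # hypothetical model, not shown in the question
    __tablename__ = "students"
    user_id = Column(Integer, primary_key=True)
    name = Column(String(50))
    # `secondary` tells the ORM which table stores the links.
    classes = relationship("Class", secondary=student_identifier, back_populates="students")


class Class(Base):  # hypothetical model, not shown in the question
    __tablename__ = "classes"
    class_id = Column(Integer, primary_key=True)
    title = Column(String(50))
    students = relationship("Student", secondary=student_identifier, back_populates="classes")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice = Student(name="Alice")
    algebra = Class(title="Algebra")
    alice.classes.append(algebra)  # the ORM inserts the association row on flush
    session.add(alice)
    session.commit()

    link_rows = session.execute(select(student_identifier)).all()
    linked_titles = [c.title for c in alice.classes]
```

After the commit, the student_identifier table holds one row linking the two records, even though no code touched that table directly.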
However, if you want to store additional information about the nature of the link between the two tables then you will want to create an association object so you can manipulate those additional attributes:
class StudentIdent(db.Model):
    __tablename__ = "student_identifier"
    course_id = db.Column(
        db.Integer,
        db.ForeignKey('courses.course_id'),  # positional arguments must come before keyword arguments
        primary_key=True,
        autoincrement=False
    )
    user_id = db.Column(
        db.Integer,
        db.ForeignKey('students.user_id'),
        primary_key=True,
        autoincrement=False
    )
    enrolment_type = db.Column(db.String(20))
    # reason for student taking this course
    # e.g., "core course", "elective", "audit"
and then you could create the link between a given student and a particular course by creating a new instance of the association object:
thing = StudentIdent(course_id=3, user_id=6, enrolment_type="elective")
Note: This is just a basic linkage. You can get more sophisticated by explicitly declaring a relationship between the ORM objects.
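As a rough sketch of what "explicitly declaring a relationship" could look like, the following uses plain SQLAlchemy with in-memory SQLite. The class names Student, Course and Enrolment, and the name/title columns, are assumptions chosen for illustration; only the student_identifier table layout comes from the answer above.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Student(Base):  # hypothetical model
    __tablename__ = "students"
    user_id = Column(Integer, primary_key=True)
    name = Column(String(50))
    enrolments = relationship("Enrolment", back_populates="student")


class Course(Base):  # hypothetical model
    __tablename__ = "courses"
    course_id = Column(Integer, primary_key=True)
    title = Column(String(50))
    enrolments = relationship("Enrolment", back_populates="course")


class Enrolment(Base):  # the association object, with one extra attribute
    __tablename__ = "student_identifier"
    course_id = Column(Integer, ForeignKey("courses.course_id"), primary_key=True)
    user_id = Column(Integer, ForeignKey("students.user_id"), primary_key=True)
    enrolment_type = Column(String(20))
    student = relationship("Student", back_populates="enrolments")
    course = relationship("Course", back_populates="enrolments")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    bob = Student(name="Bob")
    physics = Course(title="Physics")
    # Linking through objects: no need to know the integer keys in advance.
    session.add(Enrolment(student=bob, course=physics, enrolment_type="elective"))
    session.commit()

    summary = [(e.course.title, e.enrolment_type) for e in bob.enrolments]
```

With the relationships in place you can navigate from a student to each enrolment and on to the course, reading the extra attribute along the way.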

Related

SQLAlchemy ORM Basic Relationship Patterns -- Provide an example or template. Especially for "One to Many" and "One to One"

Can you give me an example of how to use the software library SQLAlchemy ORM? In particular, how do I build standard database relationships like "One to Many" and "One to One"?
I know that the SQLAlchemy documentation already provides some examples at Basic Relationship Patterns , but I'm looking for examples that explain what's happening for the beginner user and especially discussing tradeoffs that need to be considered.
I've created some examples / templates with explanatory comments:
( a heavier formatted version is here )
# Building 1-to-Many Relationship
# https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html#one-to-many
# back_populates() targets are class attribute names.
# The example is made clearer using my data type prefix notation and specifically o_ as class attributes (but o_ are not table columns!)
# Note the difference between parent_id (integer) and o_parent_obj (sqla object)
# Note: l_children_list is a list of sqla objects.
# Read back_populates() as "this relationship back populates as the following class attribute on the opposing class"
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    l_children_list = relationship("Child", back_populates="o_parent_obj")  # not a table column
class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    o_parent_obj = relationship("Parent", back_populates="l_children_list")  # not a table column
# Building 1-to-1 Relationship
# https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html#one-to-one
# Two changes:
# To convert this to “one-to-one”, the “one-to-many” or “collection” side is converted into a scalar relationship using the uselist=False flag.
# Add unique constraint (optional)
#
# Child.o_parent_obj will be 1-to-1 because there is only 1 value in the Child.parent_id column.
#
# Parent.o_first_child will be 1-to-1 at the ORM level, because ORM forces the value to be a scalar via uselist=False flag.
# Tip in docs: 1-to-1 enforcement at the db level is also possible and can be considered:
# This is a db design decision because it's a trade off: it provides referential integrity at the db level but at the cost of an additional db index.
# Enforce 1-to-1 for Parent.o_first_child at the db level as follows:
# put unique constraint on Child.parent_id column to make sure all Child rows point to different Parent rows.
# Note: this unique constraint is separate from the foreign key itself. The foreign key only relies on parent.id being unique in the Parent table; it says nothing about uniqueness of parent_id in the Child table.
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    o_first_child = relationship("Child", back_populates="o_parent_obj", uselist=False)  # uselist=False enforces 1-to-1 at ORM level
class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'), unique=True)  # unique constraint enforces 1-to-1 at db level. Optional. Creates db index.
    o_parent_obj = relationship("Parent", back_populates="o_first_child")
# Building 1-way bookkeeping properties Relationship. "many-to-one"
"""
In the db schema design, sometimes there are "bookkeeping" property fields whose value is a row in another table.
These are typically less important fields that are not part of the core db design.
Rather, they are more like bookkeeping properties, and the data type for this field is another table.
For example, created_by field to track which user created the data entry.
The python code might want to get the user who created the data, but it won't start from the user to get all the data entries that he created.
Thus, these are called "1-way bookkeeping properties".
Oh, there is a name for these: "many-to-one" relationships, but even the top google search results are poor at explaining them.
How to build a "1-way bookkeeping property"?
"""
class UserAccount(Base):
    __tablename__ = 'user_account'
    id = Column(Integer, primary_key=True)
    # UserAccount does not track any of the bookkeeping properties that point to it.
class SomeData(Base):
    __tablename__ = 'some_data'
    id = Column(Integer, primary_key=True)
    i_user_who_created_the_data = Column(Integer, ForeignKey('user_account.id'))
    o_user_who_created_the_data = relationship("UserAccount", foreign_keys="[SomeData.i_user_who_created_the_data]")  # 1-way ORM ability: from SomeData to UserAccount
    i_user_who_last_viewed_the_data = Column(Integer, ForeignKey('user_account.id'))
    o_user_who_last_viewed_the_data = relationship("UserAccount", foreign_keys="[SomeData.i_user_who_last_viewed_the_data]")  # 1-way ORM ability: from SomeData to UserAccount
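To see the one-to-one trade-off from the comments above in action, here is a small self-contained sketch against in-memory SQLite (the engine URL and the variable names are assumptions for illustration). It shows two things: back_populates keeps both sides in sync in memory, and the optional unique constraint is what stops a second Child row from pointing at the same Parent when the relationship is bypassed.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    o_first_child = relationship("Child", back_populates="o_parent_obj", uselist=False)


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"), unique=True)
    o_parent_obj = relationship("Parent", back_populates="o_first_child")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent()
    child = Child(o_parent_obj=parent)
    # back_populates mirrors the assignment onto the other class, before any flush.
    synced_in_memory = parent.o_first_child is child

    session.add(parent)  # the child is saved too, via the relationship cascade
    session.commit()
    parent_id = parent.id

    # Bypass the ORM relationship and set the raw foreign key on a second row.
    session.add(Child(parent_id=parent_id))
    try:
        session.commit()
        unique_violation = False
    except IntegrityError:
        session.rollback()
        unique_violation = True
```

Without unique=True on Child.parent_id, the second commit would succeed and the database would hold two children for one parent, even though the ORM attribute only ever exposes a single object.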

What is the best practice for lookup values in SQLAlchemy?

I am writing a pretty basic Flask application using Flask-SQLAlchemy for tracking inventory and distribution. I could use some guidance on the best way to handle a lookup table for common values. My database back end will be MySQL, with ElasticSearch for searches.
If I have a common mapping structure where all data going into a specific table, say Vehicle, have a common list of values to look up against for the Vehicle.make column, what would the best way to achieve this be?
My thought for approaching this is one of two ways:
Lookup Table
I could set something up similar to this where I have a relationship, and store the make in VehicleMake. However, if my expected list of makes is low (say 10), this seems unnecessary.
class VehicleMake(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(16))
    cars = relationship('Vehicle', backref='make', lazy='dynamic')
class Vehicle(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
Store as a String
I could just store this as a string on the Vehicle model. But would it be a waste of space to store a common value as a string?
class Vehicle(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    make = Column(String(16))
My original idea was just to have a dict containing a mapping like this and reference it as needed within the model. I am just not clear how to tie this in when returning the vehicle model.
MAKE_LIST = {
    1: 'Ford',
    2: 'Dodge',
    3: 'Chevrolet'
}
Any feedback is welcome - and if there is documentation that covers this specific scenario I'm happy to read that and answer this question myself. My expected volume is going to be low (40-80 records per week) so it doesn't need to be ridiculously fast, I just want to follow best practices.
The short answer is it depends.
The long answer is that it depends on what you store along with the make of said vehicles and how often you expect to add new types.
If you need to store more than just the name of each make, but also some additional metadata, like the size of the gas tank, the cargo space, or even a sortkey, go for an additional table. The overhead of such a small table is minimal, and if you communicate with the frontend using make ids instead of make names, there is no problem at all with this. Just remember to add an index to vehicle.make_id to make the lookups efficient.
class VehicleMake(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(16))
    cars = relationship('Vehicle', back_populates='make', lazy='dynamic')
class Vehicle(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    make_id = Column(Integer, ForeignKey('vehicle_make.id'), nullable=False, index=True)  # index for efficient lookups
    make = relationship('VehicleMake', back_populates='cars', innerjoin=True)
Vehicle.query.get(1).make.name  # == 'Ford', the make for vehicle 1
Vehicle.query.filter(Vehicle.make_id == 2).all()  # all Vehicles with make id 2
Vehicle.query.join(VehicleMake)\
    .filter(VehicleMake.name == 'Ford').all()  # all Vehicles with make name 'Ford'
If you don't need to store any of that metadata, then the need for a separate table disappears. However, the general problem with strings is that spelling errors and inconsistent capitalisation can easily break your data consistency. If you rarely need to add new makes, it's a lot better to just use Enums; SQLAlchemy even has MySQL-specific ones.
import enum
class VehicleMake(enum.Enum):
    FORD = 1
    DODGE = 2
    CHEVROLET = 3
class Vehicle(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    make = Column(Enum(VehicleMake), nullable=False)
Vehicle.query.get(1).make.name  # == 'FORD', the make for vehicle 1
Vehicle.query.filter(Vehicle.make == VehicleMake(2)).all()  # all Vehicles with make id 2
Vehicle.query.filter(Vehicle.make == VehicleMake.FORD).all()  # all Vehicles with make name 'Ford'
The main drawback of enums is that they can be hard to extend with new values. At least for Postgres, the dialect-specific type was a lot better at this than the generic SQLAlchemy one; since you are on MySQL, have a look at sqlalchemy.dialects.mysql.ENUM instead. If you want to extend your existing enum, you can always just execute raw SQL in your Flask-Migrate/Alembic migrations.
Finally, the benefit of using strings is that you can always programmatically enforce your data consistency. But this comes at the cost that you have to programmatically enforce your data consistency. If the vehicle make can be changed or inserted by external users, even colleagues, this will get you in trouble unless you're very strict about what enters your database. For example, it might be nice to uppercase all values for easy grouping, since it reduces how much can go wrong. You can do this during writing, or you can add an index on sqlalchemy.func.upper(Vehicle.make) and use hybrid properties to always query the uppercase value.
from sqlalchemy import func
from sqlalchemy.ext.hybrid import hybrid_property
class Vehicle(Model):
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    _make = Column('make', String(16))
    @hybrid_property
    def make(self):
        return self._make.upper()
    @make.expression
    def make(cls):
        return func.upper(cls._make)
Vehicle.query.get(1).make  # == 'FORD', the make for vehicle 1
Vehicle.query.filter(Vehicle.make == 'FORD').all()  # all Vehicles with make name 'FORD'
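The other option mentioned above, uppercasing "during writing", could be sketched with SQLAlchemy's validates decorator. This is one possible approach, not necessarily what the answer had in mind; it uses plain SQLAlchemy with in-memory SQLite, and the method name normalise_make and the sample vehicle names are made up for the example.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, validates

Base = declarative_base()


class Vehicle(Base):
    __tablename__ = "vehicle"
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    make = Column(String(16), nullable=False)

    @validates("make")
    def normalise_make(self, key, value):
        # Called whenever `make` is assigned, including via the constructor.
        return value.strip().upper()


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Vehicle(name="F-150", make="ford"),
        Vehicle(name="Focus", make=" Ford "),
    ])
    session.commit()

    stored_makes = sorted({v.make for v in session.query(Vehicle)})
```

Both rows end up with the same stored value, so grouping and filtering need no special handling at query time. The validator only covers writes that go through the ORM attribute; bulk updates or raw SQL would bypass it.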
Before you make your choice, also think about how you want to present this to your user. If they should be able to add new options themselves, use strings or the separate table. If you want to show a dropdown of possibilities, use the enum or the table. If you have an empty database, it's going to be difficult to collect all string values to display in the frontend without needing to store this as a list somewhere in your Flask environment as well.

Order of defining association object, related tables using Flask-SQLAlchemy?

I'm working through Miguel Grinberg's Flask book.
In chapter 12, he has you define an association object Follow with followers and the followed, both mapping to a user, as well as adding followers and followed to the Users class.
I originally put the association table after the User table, and got an error when I ran python manage.py db upgrade:
line 75, in User followed = db.relationship('Follow', foreign_keys= [Follow.follower_id],
NameError: name 'Follow' is not defined
Then I moved the association object class Follow above the class User definition, and re-ran the migration. This time it worked.
Can someone explain the reason for this?
Both class definitions seem to need the other.
Is order something I should know about flask-sqlalchemy specifically, sqlalchemy, or ORM in general?
The SQLAlchemy documentation says "we can define the association_table at a later point, as long as it’s available to the callable after all module initialization is complete" and the relationship is defined in the class itself.
That applies, however, to the case where you're using an association_table to express the relationship between two separate models. I didn't see anything about my case in the Flask-SQLAlchemy or SQLAlchemy documentation, but it's very possible I just didn't recognize the answer when I saw it.
class User(UserMixin, db.Model):
    __tablename__ = 'users'
    ...
    followed = db.relationship('Follow',
                               foreign_keys=[Follow.follower_id],
                               backref=db.backref('follower', lazy='joined'),
                               lazy='dynamic',
                               cascade='all, delete-orphan')
    followers = db.relationship('Follow',
                                foreign_keys=[Follow.followed_id],
                                backref=db.backref('followed', lazy='joined'),
                                lazy='dynamic',
                                cascade='all, delete-orphan')
Order of definition with:
class Follow(db.Model):
    __tablename__ = 'follows'
    follower_id = db.Column(db.Integer, db.ForeignKey('users.id'), primary_key=True)
    followed_id = db.Column(db.Integer, db.ForeignKey('users.id'), primary_key=True)
    timestamp = db.Column(db.DateTime, default=datetime.utcnow)
Or maybe order doesn't matter at all, and I am misattributing a problem?
First of all, a class must already be defined before you can reference it by name. Definition order matters here: foreign_keys=[Follow.follower_id] is evaluated while the User class body runs, so you cannot use Follow if it doesn't exist yet.
Second, the SQLAlchemy documentation you quote describes defining a third table (an association table) to create the relationship. With that approach the User and Follow classes would not access each other's attributes, so it would not cause a definition order error.
Finally, if you don't define an association table, then you have to put the classes in the right order to use their attributes.
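As an aside, SQLAlchemy's declarative relationship() also accepts foreign_keys as a string, which is only evaluated once all mappers are configured, so the order problem can be sidestepped. The sketch below assumes plain SQLAlchemy with in-memory SQLite rather than the book's Flask setup; it swaps backref for explicit back_populates and leaves out the timestamp column and the lazy/cascade options to stay short.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    # Follow does not exist yet. The strings are resolved later, when the
    # mappers are configured, so no NameError is raised here.
    followed = relationship(
        "Follow", foreign_keys="Follow.follower_id", back_populates="follower"
    )
    followers = relationship(
        "Follow", foreign_keys="Follow.followed_id", back_populates="followed"
    )


class Follow(Base):
    __tablename__ = "follows"
    follower_id = Column(Integer, ForeignKey("users.id"), primary_key=True)
    followed_id = Column(Integer, ForeignKey("users.id"), primary_key=True)
    follower = relationship("User", foreign_keys=[follower_id], back_populates="followed")
    followed = relationship("User", foreign_keys=[followed_id], back_populates="followers")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice, bob = User(), User()
    session.add(Follow(follower=alice, followed=bob))
    session.commit()

    alice_id, bob_id = alice.id, bob.id
    alice_follows = [f.followed_id for f in alice.followed]
    bob_follower_ids = [f.follower_id for f in bob.followers]
```

Here User is defined first and still maps cleanly, because nothing in its class body touches the Follow name directly.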

multiple foreign keys on a django model

I am seeking to create a relational database design in Django where one table has relationships with multiple models in the DB.
The sample models are excerpted below.
from __future__ import unicode_literals
from django.db import models
class State(models.Model):
    state_name = models.CharField(max_length=100, unique=True)
class District(models.Model):
    state_id = models.ForeignKey(State, on_delete=models.CASCADE)
    district_name = models.CharField(max_length=100)
class County(models.Model):
    county_id = models.ForeignKey(County, on_delete=models.CASCADE)
    district_id = models.ForeignKey(District, on_delete=models.CASCADE)
    county_name = models.CharField(max_length=100, unique=True)
class Kiosk(models.Model):
    county_id = models.ForeignKey(County, on_delete=models.CASCADE)
    kiosk_name = models.CharField(max_length=100)
    kiosk_type = models.CharField(max_length=100)
    kiosk_size = models.CharField(max_length=100)
class Operator(models.Model):
    kiosk_id = models.ForeignKey(County, on_delete=models.CASCADE)
    operator_name = models.CharField(max_length=100)
The overall goal is to register kiosks and their operators in the administrative territories. All relationships between the models are one-to-many. Administrative territories are hierarchical from the States-Counties-Townships which according to the schema design leads to one table having many foreign keys. For example Township(state_id, county_id) and Kiosk(state_id, county_id, township_id) and so forth.
If such a design is appropriate, then how would I model it in Django such that a single model like Kiosk has 2 or 3 foreign keys relating to the other models?
If I attempt to add foreign keys as they appear in the County model, I get the following error on applying migrations.
You are trying to add a non-nullable field 'region_id' to county without a default; we can't do that (the database needs something to populate existing rows).
Please select a fix:
1) Provide a one-off default now (will be set on all existing rows with a null value for this column)
2) Quit, and let me add a default in models.py
Select an option:
This is obviously not the way to do it, and I am seeking guidance from anyone who might have a solution to this problem.
I am working in Django 1.10.
Thank you all.

simple django join query without foreign key

Below are two models, Teacher and LoginStudent. A teacher can teach multiple sections, and multiple students can be in the same section, so the section field cannot be a foreign key. If I want to find all courses taken by a particular student, what should I do? Is there a simple Django query for this, like the join I would write in SQL?
class Teacher(models.Model):
    username = models.CharField(max_length=50)
    password = models.CharField(max_length=89)
    course = models.CharField(max_length=30)
    section = models.CharField(max_length=30)
class LoginStudent(models.Model):
    username = models.CharField(max_length=50)
    password = models.CharField(max_length=89)
    section = models.CharField(max_length=30)
OK, I would recommend sticking to Django's default user system and building one-to-one profiles of the specific type where needed. Later on you can differentiate between users based on which profile exists, or you could implement permissions that are tied to the user type.
from django.contrib.auth.models import User
from django.db import models
from django.db.models import Q
# for example, this could be a way to extend users to hold teachers and students
# (on_delete is required from Django 2.0 onward)
class TeacherProfile(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE, related_name='teacher_profile')
    # other relevant teacher profile items
ONLY_TEACHERS_FILTER = Q(teacher_profile__isnull=False) & Q(student_profile__isnull=True)
class StudentProfile(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE, related_name='student_profile')
    # other relevant student profile items
    sections = models.ManyToManyField('Section', related_name='students')  # quoted because Section is defined below
class Section(models.Model):
    name = models.CharField(max_length=50)
    # other section fields go here...
class Course(models.Model):
    name = models.CharField(max_length=50)
    teacher = models.ForeignKey(User, on_delete=models.CASCADE, related_name='courses', limit_choices_to=ONLY_TEACHERS_FILTER)
    sections = models.ManyToManyField(Section, related_name='courses')
Now, to answer the question of which courses a student attends (user is the student's User instance):
queryset = Course.objects.filter(sections__students__user=user).distinct()
Hope it helps!