SQLAlchemy, one to many vs many to one

I have the following data:
CREATE TABLE `groups` (
`bookID` INT NOT NULL,
`groupID` INT NOT NULL,
PRIMARY KEY(`bookID`),
KEY( `groupID`)
);
and a book table which basically has books( bookID, name, ... ), but WITHOUT groupID. There is no way for me to determine what the groupID is at the time of the insert for books.
I want to do this in sqlalchemy. Hence I tried mapping Book to the books joined with groups on book.bookID=groups.bookID.
I made the following:
tb_groups = Table( 'groups', metadata,
Column('bookID', Integer, ForeignKey('books.bookID'), primary_key=True ),
Column('groupID', Integer),
)
tb_books = Table( 'books', metadata,
Column('bookID', Integer, primary_key=True),
)
tb_joinedBookGroup = sql.join( tb_books, tb_groups,
tb_books.c.bookID == tb_groups.c.bookID)
and defined the following mapper:
mapper( Group, tb_groups, properties={
'books': relation(Book, backref='group')
})
mapper( Book, tb_joinedBookGroup )
...
However, when I executed this piece of code, I realized that each book object has a field groups, which is a list, and each group object has a books field which is a single value. I think my definition here must be confusing sqlalchemy about the many-to-one vs one-to-many relationship.
Can someone help me sort this out?
My desired goal is:
g.books = [b, b, b, .. ]
book.group = g
where g is an instance of group, and b is an instance of book.

Pass uselist=False to relation() to say that the property should represent a scalar value, not a collection. This works regardless of whether there is a primary key on this column, but you probably want to define a unique constraint anyway.


What is the point of providing a JOIN condition when there are foreign keys?

TL;DR: Why do we have to add ON table1.column = table2.column?
This question asks roughly why do we need to have foreign keys if joining works just fine without them. Here, I'd like to ask the reverse. Given the simplest possible database, like this:
CREATE TABLE class (
class_id INT PRIMARY KEY,
class_name VARCHAR(40)
);
CREATE TABLE student (
student_id INT PRIMARY KEY,
student_name VARCHAR(40),
class_id INT,
FOREIGN KEY(class_id) REFERENCES class(class_id) ON DELETE SET NULL
);
… and a simple join, like this:
SELECT student_id, student_name, class_name
FROM student
JOIN class
ON student.class_id = class.class_id;
… why can't we just omit the ON clause?
SELECT student_id, student_name, class_name
FROM student
JOIN class;
To me, the line FOREIGN KEY(class_id) REFERENCES class(class_id) … in the definition of student already includes all the necessary information for the FROM student JOIN class to have an implicit ON student.class_id = class.class_id condition; but we still have to add it. Why is that?
To answer this, consider what the JOIN operation does. It does not check whether your two tables or collections are related. So a plain join without a condition (no ON), where the database accepts it, returns a large result containing every combination of rows from the two tables.
The ON clause filters that down to the result you expect.
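A minimal sketch of that difference, using Python's built-in sqlite3 module and the class/student schema from the question. The sample rows are made up for illustration, and SQLite is an assumption here: it accepts a JOIN with no ON clause and treats it as a cross join, while some other databases reject that syntax outright.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class (class_id INTEGER PRIMARY KEY, class_name TEXT);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    student_name TEXT,
    class_id INTEGER REFERENCES class(class_id) ON DELETE SET NULL
);
-- Illustrative rows only (not from the original question).
INSERT INTO class VALUES (1, 'Math'), (2, 'Art');
INSERT INTO student VALUES (10, 'Ann', 1), (11, 'Bob', 2), (12, 'Cy', 1);
""")

# No ON clause: every student is paired with every class.
unfiltered = conn.execute(
    "SELECT student_id, class_name FROM student JOIN class"
).fetchall()

# With ON: only the pairs whose class_id values match survive.
filtered = conn.execute(
    "SELECT student_id, class_name FROM student JOIN class "
    "ON student.class_id = class.class_id"
).fetchall()

print(len(unfiltered))  # 6 (3 students x 2 classes)
print(len(filtered))    # 3 (one row per student)
```

The foreign key declared on student.class_id plays no part in either query; it only constrains which values can be stored.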
Reposting Damien_The_Unbeliever's comment as an answer
you don't have to join on foreign keys;
sometimes multiple foreign keys exist between the same pair of tables.
Also, SQL is a crusty language without many shortcuts for the most common use case.
A JOIN condition is an expression that specifies the matching criteria, and it is evaluated during the join. It can cause a failure only if there is a syntax error.
A FOREIGN KEY is a rule for the data consistency checking subsystem, and it is checked when data changes. It causes a failure if the data state (intermediate and/or final) does not match the rule.
In other words, there is nothing in common between them, they are completely different and unrelated things.
I feel like I have to reiterate parts of the question. Please, give it a second read - Dima Parzhitsky
Imagine that your proposal is accepted. I have tables:
CREATE TABLE users (userid INT PRIMARY KEY);
CREATE TABLE messages (sender INT REFERENCES users (userid),
receiver INT REFERENCES users (userid));
I write SELECT * FROM users JOIN messages.
Which reference should be used for the join condition? And justify your assumption...
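To make the ambiguity concrete, here is a small sketch with Python's sqlite3 module and the users/messages tables above. The inserted rows are invented for illustration. Both joins below follow a declared foreign key, yet they return different results, so neither can be picked as the implicit one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY);
CREATE TABLE messages (
    sender INTEGER REFERENCES users (userid),
    receiver INTEGER REFERENCES users (userid)
);
-- Illustrative rows only: (sender, receiver).
INSERT INTO users VALUES (1), (2), (3);
INSERT INTO messages VALUES (1, 2), (1, 3), (2, 1);
""")

# Join along the first foreign key: messages counted per sender.
by_sender = conn.execute(
    "SELECT userid, COUNT(*) FROM users "
    "JOIN messages ON users.userid = messages.sender "
    "GROUP BY userid ORDER BY userid"
).fetchall()

# Join along the second foreign key: messages counted per receiver.
by_receiver = conn.execute(
    "SELECT userid, COUNT(*) FROM users "
    "JOIN messages ON users.userid = messages.receiver "
    "GROUP BY userid ORDER BY userid"
).fetchall()

print(by_sender)    # [(1, 2), (2, 1)]
print(by_receiver)  # [(1, 1), (2, 1), (3, 1)]
```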

Referencing data from multiple tables

I have three simple entities :
Course is like a Book, a sell-able product. The Course entity represents a course and has various properties, such as Duration, Fees, Author, Type and so on.
Course
{
int Id;
string Title;
}
A topic is like an individual page in a Book, it has the actual learning content. A topic may appear in multiple courses.
Topic
{
int Id;
string Title;
}
In the context of a book, a Quiz is also an individual page, which holds questions instead of learning content. Again, a Quiz may appear in multiple courses.
Quiz
{
int Id;
string Title;
}
Now that I have individual Topics and Quizzes, I wish to have a table that will assemble Topics and Quizzes into a Book. Consider this table as the Table of Contents in a book. Below is an outline of what I expect it to look like:
CourseContents
{
int CourseId; // Foreign-Key to Courses.Id
int Page; // Foreign-Key to either Topic.Id or Quiz.Id
int SNo; // Sequence of this page (topic/quiz) in the course, much like page number in a book.
int Type // Type of the page i.e, Quiz or Topic.
}
Is there any way to achieve this in RDBMS ?
Attempted Solution
One approach I am looking at is creating a table that generates a unique identifier for a given Course Item, then using it in the mapping tables Courses-Topics and Courses-Quizzes. Please refer below:
CourseContents
{
int Id; // CourseContentId Primary-Key for this table
int CourseId; // Foreign key to Course.Id
int SNo; // Serial number of an item in this course;
}
CourseTopics
{
int TopicId; // Foreign-Key to Topics.Id
int CourseContentsId; // Foreign-Key to CourseContents.Id
}
CourseQuizzes
{
int QuizId; // Foreign-Key to Quizzes.Id
int CourseContentsId; // Foreign-Key to CourseContents.Id
}
Problem: The CourseContentId represents a particular position (of a Topic/Quiz) in a particular course. Two items cannot occupy the same position in a course sequence, hence one CourseContentId must be associated with just one item in either CourseTopics or CourseQuizzes. How can we put a unique constraint on CourseContentsId across two tables?
Further Addition
The above problem can be solved by adding a ContentType column to the CourseContents, CourseTopics and CourseQuizzes tables, then applying constraints on the tables to make sure:
CourseContents has a unique combination of CourseContentId and ContentType.
CourseTopics & CourseQuizzes must have the same content Type across.
Adding a Foreign key referencing CourseContents(CourseContentId, ContentType) in CourseTopics & CourseQuizzes tables.
This will ensure that a CourseContentId will not appear in both the tables.
The CourseContentId represents a particular position (of a Topic/Quiz) in a particular course.
CourseTopics
{
int TopicId; // Foreign-Key to Topics.Id
int CourseContentsId; -- first of 3-part FK
int Page; -- added
int SNo; -- added
PRIMARY KEY(TopicId, CourseContentsId, Page, SNo), -- for JOINing one way
INDEX (CourseContentsId, Page, SNo, TopicId) -- for JOINing the other way
}
Meanwhile, ...
I guess that your main Problem is embodied in this one line:
int Page; // Foreign-Key to either Topic.Id or Quiz.Id
That is impractical. The solution is to have a single table for Topic and Quiz and differentiate from there.
CREATE TABLE CourseContents (
CourseContentsId INTEGER NOT NULL PRIMARY KEY,
CourseContentType CHAR(1) CHECK (CourseContentType IN ('T', 'Q')),
CourseId INTEGER REFERENCES Courses(Id),
SNo INTEGER NOT NULL,
CONSTRAINT UniqueCourseContent UNIQUE (CourseId, SNo),
CONSTRAINT UniqueCourseContentMapping UNIQUE (CourseContentsId, CourseContentType)
);
Course Contents table generates a Unique Id ( CourseContentsId ) for each CourseId and SNo combination, which can then be referenced in the Topics & Quizzes table. As there are two different tables (Topics & Quizzes), we introduce another column that identifies the Type of the Content(Topic/Quiz) that it is linked to. By using a composite UNIQUE constraint on CourseContentsId & CourseContentType we make sure that each entry can be linked to only one content type.
CREATE TABLE CourseTopics (
CourseContentsId INTEGER NOT NULL,
CourseContentType CHAR(1) DEFAULT 'T' CHECK (CourseContentType = 'T'),
TopicId INTEGER REFERENCES Topics(Id),
PRIMARY KEY (CourseContentsId, CourseContentType),
FOREIGN KEY (CourseContentsId, CourseContentType) REFERENCES CourseContents (CourseContentsId, CourseContentType)
);
The CourseTopics table is a mapping table between Topics and Courses (we have a many-to-many relationship between the Courses & Topics tables). The foreign & primary key to the CourseContents table ensures that we'll have one entry for each CourseContent (in other words, Course & SNo). The table restricts CourseContentType to only accept 'T', which means a given CourseContentId must have a Content Type of Topic in order to be linked with a Topic.
CREATE TABLE CourseQuizzes (
CourseContentsId INTEGER NOT NULL,
CourseContentType CHAR(1) DEFAULT 'Q' CHECK (CourseContentType = 'Q'),
QuizId INTEGER REFERENCES Quizzes(Id),
PRIMARY KEY (CourseContentsId, CourseContentType),
FOREIGN KEY (CourseContentsId, CourseContentType) REFERENCES CourseContents (CourseContentsId, CourseContentType)
);
Similar to the Topics table we now create CourseQuizzes table. Only difference is here we have CourseContentType 'Q'.
Finally, to simplify querying, we can create a view that joins these tables together. For example, the view below will list: CourseId, SNo, ContentType, TopicId, QuizId. In the context of a book, with this view you can get what's on a particular Page Number (SNo) of a given book (Course), with the type of content on the page (Topic or Quiz) and the id of the content.
CREATE VIEW CourseContents_All AS
SELECT CourseContents.CourseId, CourseContents.SNo, CourseContents.CourseContentType, CourseTopics.TopicId, CourseQuizzes.QuizId
FROM CourseContents
LEFT JOIN CourseTopics ON (CourseContents.CourseContentsId = CourseTopics.CourseContentsId)
LEFT JOIN CourseQuizzes ON (CourseContents.CourseContentsId = CourseQuizzes.CourseContentsId);
The advantages I see in this approach are:
This structure follows inheritance, which means we can support more content types by just adding another table and modifying the CourseContentType Check constraint in the CourseContents table.
For a given CourseId and SNo, I also know the content type. This will certainly help in the application code.
Note: MySQL versions before 8.0.16 parse CHECK constraints but do not enforce them. On those versions, use triggers instead.
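As a rough check of how these constraints behave, the sketch below builds a trimmed version of the schema in SQLite through Python's sqlite3 module. SQLite is an assumption chosen only because it enforces both CHECK and composite foreign keys out of the box. The references to Courses and Quizzes are left out to keep it short, and try_insert is a hypothetical helper, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;  -- SQLite only enforces foreign keys when asked
CREATE TABLE CourseContents (
    CourseContentsId INTEGER NOT NULL PRIMARY KEY,
    CourseContentType CHAR(1) CHECK (CourseContentType IN ('T', 'Q')),
    CourseId INTEGER,
    SNo INTEGER NOT NULL,
    UNIQUE (CourseId, SNo),
    UNIQUE (CourseContentsId, CourseContentType)
);
CREATE TABLE CourseQuizzes (
    CourseContentsId INTEGER NOT NULL,
    CourseContentType CHAR(1) DEFAULT 'Q' CHECK (CourseContentType = 'Q'),
    QuizId INTEGER,
    PRIMARY KEY (CourseContentsId, CourseContentType),
    FOREIGN KEY (CourseContentsId, CourseContentType)
        REFERENCES CourseContents (CourseContentsId, CourseContentType)
);
-- Position 1 of course 100 is a Topic, position 2 is a Quiz (made-up data).
INSERT INTO CourseContents VALUES (1, 'T', 100, 1), (2, 'Q', 100, 2);
""")


def try_insert(sql):
    """Hypothetical helper: run one INSERT and report whether it was accepted."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    # Slot 1 is a Topic, so (1, 'Q') has no parent row: the foreign key fails.
    try_insert("INSERT INTO CourseQuizzes (CourseContentsId, QuizId) VALUES (1, 7)"),
    # Claiming type 'T' inside the quiz table violates its CHECK constraint.
    try_insert("INSERT INTO CourseQuizzes VALUES (1, 'T', 7)"),
    # Slot 2 really is a Quiz, so this link is accepted.
    try_insert("INSERT INTO CourseQuizzes (CourseContentsId, QuizId) VALUES (2, 7)"),
]
print(results)  # ['rejected', 'rejected', 'ok']
```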

SQL correct practice when making a collection of tables

I wasn't sure what to call this post but here is my problem:
I have two tables: Online_Module & Offline_Module. These two tables are used in my program to determine whether the learning module has to be taken online or on location.
Now I also have a table called Academy. An Academy consists of many modules. For this I wanted to create the following sub-table: Academy_has_Module
And here lies the problem. Because the Online_Module and Offline_Module are not in the same table, one of the values in my Academy_has_Module will always be null.
Here are some pictures that show the buildup of these tables:
As you can see, one of the values will always be null. I want to know, what is best practice in situations like this?
1) create a table Module, holding the fields common to the online and offline module:
id INT,
description
material
status
category
2) then keep the offline/online module tables (edit: minus the fields -except the id field - you are now keeping in the Module table ), but make them FK reference the new Module table, using the Module table as an intermediate link.
Edit:
Now, not to overwhelm you with a lot of stuff, but there are several questions you have to Q&A yourself, ie:
- can module be only offline/online?
- if yes, do i want to enforce it 100% in DB?
Because with my solution, you can have one Module and have it referenced by several Offline/Online modules. There are ways to solve it, but I think they would go far beyond what you asked; I am just mentioning it so you know.
As to getting the information (maybe not the best, but this is my level for now, if anyone knows better, feel free to teach me a new trick :)). *Notice: ugly coding, too tired for full up coding style :D *:
select *, case
when OffM.Id is null and OnM.Id is null then 'No module!'
when OffM.Id is not null and OnM.Id is not null then 'Too many modules!'
when OffM.Id is null and OnM.Id is not null then 'Online module!'
when OffM.Id is not null and OnM.Id is null then 'Offline module!'
end --probably a different, better way to compare?
from Academy_has_Module as AHM
join Module as M
on AHM.Module = M.Id
left join OfflineModule as OffM
on OffM.ModuleID = M.Id
left join OnlineModule as OnM
on OnM.ModuleID = M.Id
But now, from what I understand, if you added a ModuleType to the Module table, ORM frameworks (I don't really use them that much, to be honest, so no first-hand experience) can use it to return an object of the correct class. But this goes much deeper into the whole architecture and technology used in your project, and is outside the scope of this question and even my actual experience.
EDIT2:
Ok, one more thing that came to my mind: Is it not reasonable to change the online/offline module table structure to be same somehow? for instance online module :
-could have a mentor/responsible person as well
-could be open from - to datetime
-offline module doesnt have a name?
-location for online module would be ie 'Online'
and just merge the tables together with a ModuleType, either as a constraint or as an FK to an enumeration table (not sure what the right term is in English). Maybe a little bit forced, but it could make your life simpler (again, a lot of this depends on the overall requirements; I have never seen even a simple table being added in a single iteration, it always influences something and propagates through the design). Sometimes it's better to waste a few bytes of space per record than to try to be too cute and get bitten in the posterior down the road.
Have a nice day
This is superclass/subclass design (or super-entity/sub-entity if preferred). To expand on @VladislavZalesak's answer, place the common attributes into a super-entity table. But don't forget to implement the proper data integrity checks.
create table Module(
ID int not null primary key,
ModType char( 1 ) not null,
Name varchar( 45 ) not null,
Description text,
MaterialID int,
StatusID int,
CategoryID int,
constraint TypeCK check( ModType in( 'O', 'F' ))
);
create unique index ModuleType_UIX on Module( ID, ModType );
Now, if ID is the PK, it must be unique. So why, you ask, do we create a unique index with ID and ModType? So we can reference them as a group:
create table OnlineModule(
ID int not null primary key,
ModType char( 1 ) not null default 'O',
Price double,
Time varchar( 45 ),
constraint OnlineTypeCK check( ModType = 'O' ),
constraint OnlineTypeFK foreign key( ID, ModType )
references Module( ID, ModType )
);
create table OfflineModule(
ID int not null primary key,
ModType char( 1 ) not null default 'F',
StartDate datetime,
EndDate datetime,
Mentor varchar( 45 ),
constraint OfflineTypeCK check( ModType = 'F' ),
constraint OfflineTypeFK foreign key( ID, ModType )
references Module( ID, ModType )
);
Now your Academy_has_Module table needs only the one FK to Module, which gives most of the information you need, even the fact that it is offline or online.
Both submodules have an identity relationship with the module table, not just by ID, which would be technically sufficient, but also by module type, which implements the sub/super relationship. The check constraints keep the structure enforced.
You can create views which join module to either submodule to show all the fields of each type of submodule.
It's extensible in that you can add new submodules. But if you think that would be likely, you might want to create a submodule_type table and define the ModType field of Module as an FK to that table. That would eliminate having to alter the check constraint of Module every time you added a new subtype.
Kinda cumbersome when you first look at it but it's flexible and solves some up-front design problems.
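The view idea mentioned above could look roughly like this. It is a sketch in SQLite via Python's sqlite3 module, with the tables trimmed to a few columns; the view name AllModules and the sample rows are assumptions made for the example, not part of the design above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;
CREATE TABLE Module (
    ID INT NOT NULL PRIMARY KEY,
    ModType CHAR(1) NOT NULL CHECK (ModType IN ('O', 'F')),
    Name VARCHAR(45) NOT NULL
);
CREATE UNIQUE INDEX ModuleType_UIX ON Module (ID, ModType);
CREATE TABLE OnlineModule (
    ID INT NOT NULL PRIMARY KEY,
    ModType CHAR(1) NOT NULL DEFAULT 'O' CHECK (ModType = 'O'),
    Price DOUBLE,
    FOREIGN KEY (ID, ModType) REFERENCES Module (ID, ModType)
);
CREATE TABLE OfflineModule (
    ID INT NOT NULL PRIMARY KEY,
    ModType CHAR(1) NOT NULL DEFAULT 'F' CHECK (ModType = 'F'),
    Mentor VARCHAR(45),
    FOREIGN KEY (ID, ModType) REFERENCES Module (ID, ModType)
);
-- Hypothetical view: one row per module, subtype columns NULL where unused.
CREATE VIEW AllModules AS
SELECT m.ID, m.ModType, m.Name, o.Price, f.Mentor
FROM Module AS m
LEFT JOIN OnlineModule AS o ON o.ID = m.ID AND o.ModType = m.ModType
LEFT JOIN OfflineModule AS f ON f.ID = m.ID AND f.ModType = m.ModType;
-- Made-up sample data.
INSERT INTO Module VALUES (1, 'O', 'SQL basics'), (2, 'F', 'Workshop');
INSERT INTO OnlineModule (ID, Price) VALUES (1, 9.5);
INSERT INTO OfflineModule (ID, Mentor) VALUES (2, 'Kim');
""")

rows = conn.execute("SELECT * FROM AllModules ORDER BY ID").fetchall()
for row in rows:
    print(row)
# (1, 'O', 'SQL basics', 9.5, None)
# (2, 'F', 'Workshop', None, 'Kim')
```

A table like Academy_has_Module would then only need to store Module.ID, and the ModType column already says which subtype columns are filled in.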

automatically retrieve data from related tables

I'm working with a database that contains a table called model_meta which contains metadata about all the various models in use by the application. Here is the relevant data structure:
CREATE TABLE model_meta (
id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(64),
oo INT(11),
om INT(11),
mo INT(11),
mm INT(11),
INDEX (name)
);
CREATE TABLE inventory (
id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
type VARCHAR(255),
customers_id INT(11)
);
CREATE TABLE customers (
id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255),
contact VARCHAR(255)
);
The columns oo, om, mo, and mm in the model_meta table contain a comma-separated list of ids to which that model has the specified relationship (i.e. one-to-one, one-to-many, etc.).
When a client requests data, all I'm given is the name of the table they're requesting (e.g. 'inventory') - from that, I need to determine what relationships exist and query those tables to return the appropriate result set.
Given a single variable (let's call it $input) that contains the name of the requested model, here are the steps:
1) Get the model metadata: SELECT model_meta.* FROM model_meta WHERE model_meta.name = $input;
2) Determine which, if any, of the relationship columns (oo, om, mo, mm) contain values, keeping in mind that they can contain a comma-separated list of values.
3) Use the values from step 2 to determine the name of the related model(s). For the sake of example, let's just say that only mo contains a value and we'll refer to it as $mo.
So: SELECT model_meta.name FROM model_meta WHERE model_meta.id = $mo;
Let's call this result $related.
4) Finally, select data from the requested table and all tables that are related to it, keeping in mind that we may be dealing with a one-to-one, one-to-many, many-to-one, or many-to-many relationship. For this specific example:
In pseudo-SQL: SELECT $input.*, $related.* FROM $input LEFT JOIN $related ON ($related.id = $input.$related_id);
This method uses three separate queries - the first to gather metadata about the requested table, the second to gather the names of related tables, and the third to query those tables and return the actual data.
My question: Is there an elegant way to combine any of these queries, reducing their number from 3 to 2, or even down to one single query?
The real goal, of course, is to automate the retrieval of data from related tables in some way, without the client having to know how the tables are related.
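For reference, the three-query process described above might be sketched like this with Python's sqlite3 module. It does not reduce the number of queries; it only spells out the steps. Several things here are assumptions: the relationship columns are declared as TEXT so they can hold id lists, only the mo column is handled, the join column is guessed as the related table name plus _id (as in inventory.customers_id), and fetch_with_many_to_one and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model_meta (
    id INTEGER PRIMARY KEY, name TEXT,
    oo TEXT, om TEXT, mo TEXT, mm TEXT  -- TEXT so they can hold id lists
);
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, contact TEXT);
CREATE TABLE inventory (id INTEGER PRIMARY KEY, type TEXT, customers_id INTEGER);
INSERT INTO model_meta VALUES
    (1, 'inventory', NULL, NULL, '2', NULL),
    (2, 'customers', NULL, '1', NULL, NULL);
INSERT INTO customers VALUES (7, 'Acme', 'ann@example.com');
INSERT INTO inventory VALUES (1, 'widget', 7), (2, 'gadget', NULL);
""")


def fetch_with_many_to_one(conn, model_name):
    # Query 1: metadata for the requested model (only "mo" is handled here).
    meta = conn.execute(
        "SELECT mo FROM model_meta WHERE name = ?", (model_name,)
    ).fetchone()
    if meta is None:
        raise ValueError(f"unknown model: {model_name}")
    related_ids = [int(p) for p in (meta[0] or "").split(",") if p.strip()]

    # Query 2: resolve the related ids to table names.
    related_names = []
    if related_ids:
        marks = ", ".join("?" for _ in related_ids)
        cursor = conn.execute(
            f"SELECT name FROM model_meta WHERE id IN ({marks}) ORDER BY id",
            related_ids,
        )
        related_names = [name for (name,) in cursor]

    # Query 3: table names cannot be bound as parameters, so they are
    # interpolated; they come only from model_meta, never from the client.
    columns = [f'"{model_name}".*'] + [f'"{n}".*' for n in related_names]
    joins = [
        f'LEFT JOIN "{n}" ON "{n}".id = "{model_name}"."{n}_id"'
        for n in related_names
    ]
    column_list = ", ".join(columns)
    join_list = " ".join(joins)
    sql = (
        f'SELECT {column_list} FROM "{model_name}" {join_list} '
        f'ORDER BY "{model_name}".id'
    )
    return conn.execute(sql).fetchall()


rows = fetch_with_many_to_one(conn, "inventory")
for row in rows:
    print(row)
# (1, 'widget', 7, 7, 'Acme', 'ann@example.com')
# (2, 'gadget', None, None, None, None)
```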

How to use Django with legacy readonly database tables with composite primary keys?

I want to use Django for a client project that has a legacy database. If at all possible I would like to be able to use the Django admin interface. However, the database has tables with multicolumn primary keys, which I see Django does not like - the admin interface throws a MultipleObjectsReturned exception.
What are the options here? I can add new tables, but I can't change existing tables since other projects are already adding data to the database. I've seen other questions mentioning surrogate keys, but it seems like that would require changing the tables.
EDIT: The database in question is a MySQL database.
Since you are talking about a legacy read-only database, perhaps you can create an external schema (views) with no multi-column PKs. For example, you can combine the key fields into a single value. Here is an example:
Tables:
create table A (
a1 int not null,
a2 int not null,
t1 varchar(100),
primary key (a1, a2)
);
create table B (
b1 int not null,
b2 int not null,
a1 int not null,
a2 int not null,
t1 varchar(100),
primary key (b1, b2),
constraint b_2_a foreign key (a1,a2)
references A (a1, a2)
);
External schema to be read by django:
create view vA as
select
a1* 1000000 + a2 as a, A.*
from A;
create view vB as
select
b1* 1000000 + b2 as b,
a1* 1000000 + a2 as a, B.*
from B;
django models:
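from django.db import models

class A(models.Model):
    a = models.IntegerField( primary_key=True )
    a1 = ...
    class Meta:
        managed = False  # backed by a view, so Django must not create or alter it
        db_table = 'vA'

class B(models.Model):
    b = models.IntegerField( primary_key=True )
    b1 = ...
    # db_column='a' because the view exposes the key as "a", not "a_id"
    a = models.ForeignKey( A, on_delete=models.DO_NOTHING, db_column='a' )
    a1 = ...
    class Meta:
        managed = False
        db_table = 'vB'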
You can refine this technique to build varchar keys so that indexes can be used. I have not written more samples because I don't know which database brand you use.
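To see what the combined key looks like, here is a small sketch of the vA view in SQLite through Python's sqlite3 module (SQLite is used only for convenience; the question is about MySQL). The rows and the split_key helper are made up for illustration. Note the built-in assumption of the multiplier: a2 must stay below 1000000, otherwise two different (a1, a2) pairs can produce the same key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (
    a1 INT NOT NULL,
    a2 INT NOT NULL,
    t1 VARCHAR(100),
    PRIMARY KEY (a1, a2)
);
CREATE VIEW vA AS
SELECT a1 * 1000000 + a2 AS a, A.* FROM A;
-- Made-up sample rows.
INSERT INTO A VALUES (1, 2, 'first'), (3, 45, 'second');
""")

rows = conn.execute("SELECT a, a1, a2 FROM vA ORDER BY a").fetchall()
print(rows)  # [(1000002, 1, 2), (3000045, 3, 45)]


def split_key(a):
    """Hypothetical helper: recover (a1, a2) from the combined key."""
    return divmod(a, 1000000)


print(split_key(3000045))  # (3, 45)
```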
More information:
Do Django models support multiple-column primary keys?
ticket 373
Alternative methods