I have two tables: Student and User.
The Student table has a primary key of INT(9).
The User table has a primary key of MEDIUMINT.
Now please take a look at this picture
(source: imgh.us)
Now the problem is: in the messages table I have the messageFrom and messageTo columns, and I don't know whether the sender or the receiver is a Student or a User.
I can't reference the two tables because of the different primary key types; however, I am trying to avoid major changes as far as possible.
The same issue is everywhere: the reportedPosts table, the comment table. Everywhere.
How can I get around this issue, or what are possible solutions to fix it?
And please feel free to give feedback on the database structure; I would like to know and learn from your advice.
Thanks in advance.
Users and Students (the entities, not the tables) are both examples of People. There are attributes of Users that don't belong to Students and there are attributes of Students that don't belong to Users. Also, there are actions Users may take that Students cannot take and vice versa. However, there are attributes common to both (name, address, phone number, etc.) and actions both may take (send/receive messages, post comments, etc.). This strongly implies a separate table to contain the common attributes and allow the common actions.
create table People(
ID MediumInt auto_generating primary key,
PType char( 1 ) not null check( PType in( 'U', 'S' )), -- User or Student
Name varchar( 64 ) not null,
Address varchar( 128 ),
..., -- other common attributes
constraint UQ_PeopleIDType unique( ID, PType ) -- create anchor for FKs
);
create table Users(
UserID MediumInt not null primary key,
UType char( 1 ) not null check( UType = 'U' ), -- not null, so the composite FK is always checked
..., -- attributes for Users
constraint FK_Users_People foreign key( UserID, UType )
references People( ID, PType )
);
create table Students(
StudentID MediumInt not null primary key,
SType char( 1 ) not null check( SType = 'S' ), -- not null, so the composite FK is always checked
..., -- attributes for Students
constraint FK_Students_People foreign key( StudentID, SType )
references People( ID, PType )
);
Notice that if a Person is created with a type of 'S' (Student), the ID value for that Person can only be inserted into the Students table.
Now all tables that must refer to Users may FK to the Users table and those that must refer to Students may FK to the Students table. When tables can refer to either, they may FK to the People table.
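As a rough check of that claim, here is a minimal sketch that runs the same design in SQLite through Python's standard sqlite3 module. The SQLite column types, the sample rows, and the Messages table with its MessageFrom, MessageTo and Body columns are assumptions made for illustration; they are not taken from the original schema.

```python
import sqlite3

# Sketch only: SQLite syntax stands in for the generic DDL above.
# The Messages table and all sample data are assumptions for illustration.
SCHEMA = """
create table People(
    ID    integer primary key,
    PType text not null check( PType in( 'U', 'S' )),
    Name  text not null,
    unique( ID, PType )                 -- anchor for the composite FKs
);
create table Users(
    UserID integer primary key,
    UType  text not null check( UType = 'U' ),
    foreign key( UserID, UType ) references People( ID, PType )
);
create table Students(
    StudentID integer primary key,
    SType     text not null check( SType = 'S' ),
    foreign key( StudentID, SType ) references People( ID, PType )
);
create table Messages(
    ID          integer primary key,
    MessageFrom integer not null references People( ID ),
    MessageTo   integer not null references People( ID ),
    Body        text not null
);
"""

def try_insert(conn, sql, params):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite does not enforce FKs unless asked
conn.executescript(SCHEMA)
conn.execute("insert into People( ID, PType, Name ) values( 1, 'U', 'Ann' )")
conn.execute("insert into People( ID, PType, Name ) values( 2, 'S', 'Bob' )")
conn.commit()

insert_user = "insert into Users( UserID, UType ) values( ?, ? )"
insert_msg = "insert into Messages( MessageFrom, MessageTo, Body ) values( ?, ?, ? )"

ok_user = try_insert(conn, insert_user, (1, "U"))      # person 1 really is a User
wrong_table = try_insert(conn, insert_user, (2, "U"))  # person 2 is a Student: FK fails
wrong_type = try_insert(conn, insert_user, (2, "S"))   # check constraint fails
ok_msg = try_insert(conn, insert_msg, (1, 2, "hi"))    # User to Student is fine
bad_msg = try_insert(conn, insert_msg, (1, 99, "hi"))  # nobody has ID 99

print(ok_user, wrong_table, wrong_type, ok_msg, bad_msg)
```

The messages table only needs the People ID, so it never has to know whether either end is a User or a Student.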
Foreign Keys do not solve all problems. Add a suitable index instead of depending on the FK. Then:
Plan A: Do the equivalent of FK checks in the application code, or
Plan B: Forgo any FK checks.
I have a property management application with a full blown accounting system built into it. I have a journal entries table that controls all the postings for various accounting activities such as:
Invoices
Payments
Bills
Deposits
In some cases it's necessary to join these entities to the journal entries table to aggregate accounting entries by different properties and units.
I'm looking for the best way to do this. I have several options:
1) Add a foreign key on the journal entry table to link to the invoice_id, payment_id, bill_id, deposit_id, however most combinations of these will be mutually exclusive (i.e. a deposit would not have a payment) so I would have cases where for a given journal entry I would have nulls in those foreign keys that do not apply to that given journal entry.
2) I could create a single foreign key, let's call it doc_id and another column doc_type to indicate the type of document (Invoice, Payment, Bill, Deposit, etc) and have the combination of doc_id and the document_type_id to reference a primary key on one of the extension tables (i.e. doc_id = 1 & doc_type = Invoice that combination would reference the primary key on the Invoice table).
Which is the better way to go about this or am I thinking about this all wrong?
This sounds like a standard base entity/sub entity pattern. There is one table, let's call it JournalEntries, which contains the attributes that all journal entries have in common: ID, type of entry, when it was created, who created it, and so on.
create table JournalEntries(
ID Int auto_generating primary key,
EType char( 1 ) not null check( EType in( 'I', 'P', 'B', 'D' )), -- Invoice, Payment, etc.
Amount currency not null,
CreateDate Date not null,
..., -- other common attributes
constraint UQ_JournalEntryType unique( ID, EType ) -- create anchor for FKs
);
Notice that ID is the primary key so therefore unique. So the constraint making the combination of ID and EType unique is redundant from a domain definition point of view. All it does is define an anchor for foreign keys.
These FKs will be in the subentity tables -- one table for each subentity: Invoice, Payment, Bill and Deposit. Note that if an entry is defined in the JournalEntries table as a Deposit (EType = 'D') a corresponding entry can only be made in the Deposits table. You can't, for example, mistakenly use that ID in, say, the Payments table.
Let's define one of the subentity tables:
create table Invoices(
ID int primary key, -- value generated by JournalEntries table
IType char( 1 ) not null check( IType = 'I' ), -- Nothing but invoices
..., -- Invoice-specific attributes
constraint FK_InvoiceToEntry foreign key( ID, IType )
references JournalEntries( ID, EType )
);
Now let's create an activity that always has one Invoice associated with it and may have any number of other entries. The constraints ensure only invoices can be inserted and the ID value must match a JournalEntries entry that is defined as an Invoice.
create table Activities(
ID int auto_generating primary key,
InvID int not null,
IType char( 1 ) not null check( IType = 'I' ),
..., -- other data
constraint FK_ActivityInvoice foreign key( InvID, IType )
references JournalEntries( ID, EType )
);
There may be any number of additional entries and they may be any of the entry types, so you need an intersection table:
create table ActivityEntries(
ActID int not null,
EntID int not null,
DateEntered date not null,
constraint FK_ActEntry_Activity foreign key( ActID )
references Activities( ID ),
constraint FK_ActEntry_JEntry foreign key( EntID )
references JournalEntries( ID )
);
Note that a "Journal Entry" is the JournalEntries data joined with the associated data from one of the subentity tables. So FK references to any journal entry should refer to the JournalEntries table, not any of the subentity tables, even if you know what kind of entry it is. The Activities rows therefore refer to the JournalEntries table and include the EType field in the FK as an additional data integrity check, because the entry must be an invoice. The intersection table may contain any type of entry, so its FK target is just the PK.
Note: for illustration purposes, the type indicator in the JournalEntries table was constrained by a check constraint. In an actual database, a much better design would be an entry types lookup table. That maintains the data integrity but is much more flexible. (Also note that MySQL versions before 8.0.16 parse check constraints but do not enforce them.)
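A minimal sketch of the lookup-table variant, again using SQLite through Python's sqlite3 module. The EntryTypes table name, its columns, and the extra 'R' (Refund) type are hypothetical and only serve to show that a new type becomes valid by inserting a row rather than by altering the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # FK enforcement is off by default in SQLite
conn.executescript("""
create table EntryTypes(                  -- hypothetical lookup table
    ID   text primary key,
    Name text not null
);
insert into EntryTypes values
    ( 'I', 'Invoice' ), ( 'P', 'Payment' ), ( 'B', 'Bill' ), ( 'D', 'Deposit' );

create table JournalEntries(
    ID         integer primary key,
    EType      text not null references EntryTypes( ID ),  -- replaces the check
    Amount     numeric not null,
    CreateDate text not null,
    unique( ID, EType )                   -- still the anchor for subentity FKs
);
""")

def add_entry(etype, amount):
    """Insert a journal entry; return its ID, or None if the type is unknown."""
    try:
        with conn:
            cur = conn.execute(
                "insert into JournalEntries( EType, Amount, CreateDate ) "
                "values( ?, ?, date('now') )",
                (etype, amount),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None

invoice_id = add_entry("I", 100)   # accepted: 'I' is in the lookup table
rejected = add_entry("R", 50)      # rejected: 'R' is not defined yet

with conn:                         # adding a type is just a data change
    conn.execute("insert into EntryTypes values( 'R', 'Refund' )")
refund_id = add_entry("R", 50)     # accepted now

print(invoice_id, rejected, refund_id)
```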
I have a conceptual question regarding how best to organise my database.
Currently I have four core tables users, teachers, students and notifications. However both the teachers and students tables inherit from the users table so contain the foreign key user_id.
The notifications table as you might have guessed refers to notifications. These need to appear for all users that belong to an employee group i.e. under the employment of another.
Both students and teachers can employ other users.
So the crux is I need an eloquent way of modelling this. The basic workflow of the code would be the below:
getCurrentUser->getAllEmployer(s)->getNotifications
This is the Laravel Eloquent I'm used to $user->employers()->notifications;
Unfortunately it's not as simple as that as in this case an employer can refer to two tables.
So my choices are as follows.
1) Create an Eloquent Relationship for both the student and teacher relationship as employers. The shortfall being I need to write if tests to check if the current user belongs to either and this code would be repeated frequently.
2) Add a teacher_id and student_id to the users table. However one would obviously be redundant in each record. The chance of needing to add other columns is very likely as well due to the emergence of new employer entities.
3) Create an employer_employee table that contains two columns both referencing a user_id. A SQL query would LEFT JOIN both student and teacher tables with the employer_employee table and then a JOIN with notifications would return all those relevant. However would so many joins reduce the speed of the query when compared with the other options.
4) Something I haven't considered.
I'm really looking for the most efficient, scalable solution.
Any help is appreciated. If you could clarify why your answer is the most efficient scalable solution as well that would be superb.
There is a similar question here using a Media supertype and adding subtypes of CD, VCR, DVD, etc.
This is scalable in that in creating, say, a BluRay subtype, you create the table to contain the BluRay-specific data and add an entry to the MediaTypes table. No changes needed for existing data or code -- except, of course, to add the code that will work with BluRay data.
In your case, Users would be the supertype table with Teachers and Students the subtype tables.
create table Users(
ID int not null auto_generating,
Type char( 1 ) check( Type in( 'T', 'S' )),
-- other data common to all users,
constraint PK_Users primary key( ID ),
constraint UQ_UserType unique( ID, Type ),
constraint FK_UserTypes foreign key( Type )
references UserTypes( ID )
);
create table Teachers(
TeacherID int not null,
TeacherType char( 1 ) not null check( TeacherType = 'T' ),
-- other data common to all teachers...,
constraint PK_Teachers primary key( TeacherID ),
constraint FK_TeacherUser foreign key( TeacherID, TeacherType )
references Users( ID, Type )
);
The makeup of the Students table would be similar to the Teachers table.
Since both teachers and students may employ other teachers and students, the table that contains this relationship would refer to the Users table.
create table Employment(
EmployerID int not null,
EmployeeID int not null,
-- other data concerning the employment...,
constraint CK_EmploymentDupes check( EmployerID <> EmployeeID ),
constraint PK_Employment primary key( EmployerID, EmployeeID ),
constraint FK_EmploymentEmployer foreign key( EmployerID )
references Users( ID ),
constraint FK_EmploymentEmployee foreign key( EmployeeID )
references Users( ID )
);
As I understand it, Notifications are grouped by employer:
create table Notifications(
EmployerID int not null,
NotificationDate date,
NotificationData varchar( 500 ),
-- other notification data...,
constraint FK_NotificationsEmployer foreign key( EmployerID )
references Users( ID )
);
The queries should be simple enough. For example, if a user wanted to see all the notifications from his employer(s):
select e.EmployerID, n.NotificationDate, n.NotificationData
from Employment e
join Notifications n
on n.EmployerID = e.EmployerID
where e.EmployeeID = :UserID;
This is an initial sketch, of course. Refinements are possible. But to your numbered points:
The Employment table relates employers to employees. The only check is to make sure employers cannot employ themselves, but otherwise any user can be both an employee and an employer.
The Users table forces each user to be either a teacher ('T') or student ('S'). Only users defined as 'T' can be placed in the Teachers table and only users defined as 'S' can be placed in the Students table.
The Employment table joins only to the Users table, not to both the Teachers and Students tables. But this is because both teachers and students can be both employers and employees, not for any performance reason. In general, don't worry about performance during the initial design. Your primary concern at this point is data integrity. Relational databases are very good with joins. If a performance issue should crop up, then fix it. Don't restructure your data to solve problems that do not yet exist and may never exist.
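To see the notifications query run end to end, here is a small sketch using SQLite through Python's sqlite3 module. The tables are trimmed to the columns the query touches, and the three users and their notifications are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")
conn.executescript("""
create table Users(
    ID   integer primary key,
    Type text not null check( Type in( 'T', 'S' ))
);
create table Employment(
    EmployerID integer not null references Users( ID ),
    EmployeeID integer not null references Users( ID ),
    check( EmployerID <> EmployeeID ),
    primary key( EmployerID, EmployeeID )
);
create table Notifications(
    EmployerID       integer not null references Users( ID ),
    NotificationDate text,
    NotificationData text
);
-- Sample data (assumed): user 3 is employed by a teacher (1) and a student (2).
insert into Users values ( 1, 'T' ), ( 2, 'S' ), ( 3, 'S' );
insert into Employment values ( 1, 3 ), ( 2, 3 );
insert into Notifications values
    ( 1, '2024-01-05', 'Staff meeting' ),
    ( 2, '2024-01-06', 'Shift change' ),
    ( 3, '2024-01-07', 'Posted by user 3' );
""")

def notifications_for(user_id):
    """All notifications posted by the employers of the given user."""
    return conn.execute(
        """
        select e.EmployerID, n.NotificationDate, n.NotificationData
          from Employment e
          join Notifications n
            on n.EmployerID = e.EmployerID
         where e.EmployeeID = :UserID
         order by n.NotificationDate
        """,
        {"UserID": user_id},
    ).fetchall()

for row in notifications_for(3):
    print(row)
```

The query never needs to know whether an employer is a teacher or a student, which is the reason for hanging Employment off the Users table.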
Well, give this a try and see how it works.
I couldn't think of a title for this and so didn't even know where to start researching for myself.
I have to make a database where I have a table for CD/DVDs, but the type of entertainment on them requires different attributes in terms of metadata/information. For example, music CDs have artist, publisher, producer, CDNo., etc., whereas a piece of software may have similarities but has some attributes that music won't have, and likely the same with movies and games. I'm not sure how this would work in terms of an E-R diagram. So far I have decided on:
CD/DVDs being in the items table or stock table not sure on the name yet.
tbl_items -> item_id,
item_format(DVD or CD, maybe expand to blu-ray or hd-dvd),
item_entertainment_type(Music, Movie etc.) <--- Maybe in another not sure.
foreign key to a metadata table, this is so that when deliveries for new CD/DVDs are made if the metadata already exists I just enter a new item and so its a one to many between metadata and items (items >-- meta).
The question, I think, is: is it bad practice to have nullable foreign key fields and just choose which to add a relation to, so musicMeta_id INT NULL, FOREIGN KEY musicMetaID REFERENCES tbl_musicMeta(musicMeta_id)
like that for each type? Or somehow merge them, or is there a trick databases have?
I'm using MySQL with php.
Thanks!
There is no general rule or Best Practice that foreign keys should not be nullable. Many times it makes perfect sense for an entity not to have a relationship with another entity. For example, you may have a table of artists you track but, at the moment, you have no CDs recorded by those artists.
As for having Media (CD, DVD, BluRay) that can be either music/audio or software, you can have a table with the information in common and then two foreign keys, one to each extension table (AudioData and SoftwareData), but one must be NULL. This presents a situation called, among other things, an exclusive arc. This is generally considered to be...problematic.
Think of a superclass and two derived classes in an OO language like Java or C++. One way to represent that in a relational schema is:
create table Media(
ID int not null, -- identity, auto_generated, generated always as identity...
Type char( 1 ) not null,
Format char( 1 ) not null,
... <other common data>,
constraint PK_Media primary key( ID ),
constraint FK_Media_Type foreign key( Type )
references MediaTypes( ID ), -- A-A/V, S-Software, G-Game
constraint FK_Media_Format foreign key( Format )
references MediaFormats( ID ) -- C-CD, D-DVD, B-BluRay, etc.
);
create unique index UQ_Media_ID_Type on Media( ID, Type );
create table AVData( -- For music and video
ID int not null,
Type char( 1 ) not null,
... <audio-only data>,
constraint PK_AVData primary key( ID ),
constraint CK_AVData_Type check( Type = 'A' ),
constraint FK_AVData_Media foreign key( ID, Type )
references Media( ID, Type )
);
create table SWData( -- For software, data
ID int not null,
Type char( 1 ) not null,
... <software-only data>,
constraint PK_SWData primary key( ID ),
constraint CK_SWData_Type check( Type = 'S' ),
constraint FK_SWData_Media foreign key( ID, Type )
references Media( ID, Type )
);
create table GameData( -- For games
ID int not null,
Type char( 1 ) not null,
... <game-only data>,
constraint PK_GameData primary key( ID ),
constraint CK_GameData_Type check( Type = 'G' ),
constraint FK_GameData_Media foreign key( ID, Type )
references Media( ID, Type )
);
Now if you are looking for a movie, you search the AVData table, then join with the Media table for the rest of the information and so on with software or games. If you have an ID value but don't know what kind it is, search the Media table and the Type value will tell you which of the three (or more) data tables to join with. The point is that the FK is referring to the generic table, not from it.
Of course, a movie or game or software can be released on more than one media type, so you can have intersection tables between the Media table and the respective data tables. On the other hand, those are generally labeled with different SKUs so you may want to also treat them as different items.
The code, as you might expect, can get fairly complicated, though not too bad. On the other hand, our design goal is not code simplicity but data integrity. This makes it impossible to mix, for instance, game data with a movie item. And you get rid of having a set of fields where only one must have a value and the others must be null.
My opinion: Get rid of the FOREIGN KEYs; just be sure you have suitable INDEXes.
I have a mysql database table called Vehicles.
This table has many rows: car, helicopter, plane, etc.
I want each of these rows to have their own table, so I can have different data for each vehicle type.
My question is, how can I make this table so that each row references not another row on another table, but the table itself. I thought of just using the table name, but it feels a bit hackish.
If I'm reading this correctly, you are thinking of a Base or Master entity table to hold all attributes that the vehicles have in common, then a derived table for all the attributes associated with each vehicle.
Something like this maintains data integrity and is scalable:
create table Vehicles(
ID int not null auto_generated primary key,
VType char( 1 ) not null check( VType in( 'C', 'H', 'P' )),
... <other common attributes>,
constraint UQ_ID_VType unique( ID, VType )
);
create table cars(
ID int not null primary key,
VType char( 1 ) not null check( VType = 'C' ),
... <other car-only attributes>,
constraint FK_Car_Vehicle foreign key( ID, VType )
references Vehicles( ID, VType )
);
create table Helicopters(
ID int not null primary key,
VType char( 1 ) not null check( VType = 'H' ),
... <other helicopter-only attributes>,
constraint FK_Helicopter_Vehicle foreign key( ID, VType )
references Vehicles( ID, VType )
);
Etc. The Vehicles table can contain only defined vehicle types and each sub-table can only contain their particular vehicle type and each ID/Type combination must exist in Vehicles first. Beyond that, each one can define any attributes it needs to describe its particular vehicle type. To add Motorcycle, add 'M' to the check constraint of VType field of Vehicles table and create a Motorcycles table with 'M' as the only acceptable value of its VType field.
If all you have is the vehicle ID value, you locate that in the Vehicles table to find out the type -- car, helicopter, etc. -- and that tells you which sub-table to access to get the rest of the information.
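That two-step lookup can be sketched as follows, using SQLite through Python's sqlite3 module. The Name, Doors and RotorCount columns, the sample rows, and the SUBTABLES mapping are assumptions for illustration; only Cars and Helicopters are wired up here.

```python
import sqlite3

# Type code -> sub-table name. A fixed mapping, so the table name that ends up
# in the SQL text never comes from user input. Planes ('P') are not mapped here.
SUBTABLES = {"C": "Cars", "H": "Helicopters"}

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row            # rows behave like read-only mappings
conn.execute("pragma foreign_keys = on")
conn.executescript("""
create table Vehicles(
    ID    integer primary key,
    VType text not null check( VType in( 'C', 'H', 'P' )),
    Name  text not null,
    unique( ID, VType )
);
create table Cars(
    ID    integer primary key,
    VType text not null check( VType = 'C' ),
    Doors integer,
    foreign key( ID, VType ) references Vehicles( ID, VType )
);
create table Helicopters(
    ID         integer primary key,
    VType      text not null check( VType = 'H' ),
    RotorCount integer,
    foreign key( ID, VType ) references Vehicles( ID, VType )
);
insert into Vehicles values ( 1, 'C', 'Hatchback' ), ( 2, 'H', 'Rescue' );
insert into Cars values ( 1, 'C', 5 );
insert into Helicopters values ( 2, 'H', 2 );
""")

def load_vehicle(vehicle_id):
    """Return common and type-specific attributes as one dict, or None."""
    base = conn.execute(
        "select * from Vehicles where ID = ?", (vehicle_id,)
    ).fetchone()
    if base is None:
        return None
    merged = dict(base)
    table = SUBTABLES.get(base["VType"])  # step 1: the type picks the sub-table
    if table is not None:
        detail = conn.execute(            # step 2: fetch the type-specific row
            f"select * from {table} where ID = ?", (vehicle_id,)
        ).fetchone()
        if detail is not None:
            merged.update(dict(detail))
    return merged

print(load_vehicle(1))
print(load_vehicle(2))
```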
As an alternative, and it helps with the scalability factor, have the VType field of the Vehicles table be a FK to a VehicleTypes table. Then you don't have to redefine the table itself to add a new type of vehicle.
I have 2 tables, customers and affiliates. I need to make sure that customers.email and affiliates.email are exclusive. In other words, a person cannot be both a customer and an affiliate. It's basically the opposite of a foreign key. Is there a way to do this?
You can use a table that stores emails with a unique constraint on the email, and reference that table from the customer and affiliate tables. (You still need to ensure that no two records reference the same key.)
You can use a trigger before insert and before update to check that the email is not already present in the other table.
Or you can leave this validation to the application logic: not in the database, but in the application code.
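A minimal sketch of the trigger option, written for SQLite and driven from Python's sqlite3 module. The trigger names and error messages are made up, and the syntax is SQLite's; MySQL triggers are written differently, so treat this as an outline of the idea rather than something to paste into MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table customers(  id integer primary key, email text not null unique );
create table affiliates( id integer primary key, email text not null unique );

-- Reject a customer email that an affiliate already uses (insert and update).
create trigger customers_email_ins before insert on customers
when exists( select 1 from affiliates where email = new.email )
begin
    select raise( abort, 'email already belongs to an affiliate' );
end;
create trigger customers_email_upd before update of email on customers
when exists( select 1 from affiliates where email = new.email )
begin
    select raise( abort, 'email already belongs to an affiliate' );
end;

-- Mirror image for affiliates.
create trigger affiliates_email_ins before insert on affiliates
when exists( select 1 from customers where email = new.email )
begin
    select raise( abort, 'email already belongs to a customer' );
end;
create trigger affiliates_email_upd before update of email on affiliates
when exists( select 1 from customers where email = new.email )
begin
    select raise( abort, 'email already belongs to a customer' );
end;
""")

def add(table, email):
    """Insert an email into one of the two tables; False if it was rejected."""
    if table not in ("customers", "affiliates"):
        raise ValueError("unknown table: " + table)
    try:
        with conn:
            conn.execute(f"insert into {table}( email ) values( ? )", (email,))
        return True
    except sqlite3.DatabaseError:
        return False

first = add("customers", "a@example.com")      # new email: accepted
clash = add("affiliates", "a@example.com")     # already a customer: rejected
other = add("affiliates", "b@example.com")     # different email: accepted
print(first, clash, other)
```

Triggers like these check one statement at a time, so two concurrent transactions could still slip the same email into both tables on some engines; the shared-table approach avoids that.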
There is no key you can do this with, but it sounds like you shouldn't be using two tables. Instead, you can have one table with either customer/affiliate data (that needs to be unique in this table) and another table that has the type (customer/affiliate).
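-- Column types below are assumptions for illustration.
CREATE TABLE People (
pplid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
pplEmail VARCHAR(255) NOT NULL,
ptid INT NOT NULL,
UNIQUE KEY (pplEmail)
);
CREATE TABLE PeopleType (
ptid INT NOT NULL PRIMARY KEY,
ptType VARCHAR(20) NOT NULL
);
INSERT INTO PeopleType VALUES (1, 'affiliates'), (2, 'customers');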
You can try the following.
Create a new table, which will be a master for customers and affiliates:
CREATE TABLE party
(
id int not null auto_increment primary key ,
party_type enum('customer','affiliate') not null,
email varchar(100),
UNIQUE (id,party_type)
);
--Then
CREATE TABLE customer
(
....
party_id INT NOT NULL,
party_type enum('customer') NOT NULL DEFAULT 'customer',
PRIMARY KEY (party_id,party_type),
FOREIGN KEY (party_id,party_type) REFERENCES party(id,party_type)
);
CREATE TABLE affiliates
(
....
party_id INT NOT NULL,
party_type enum('affiliate') NOT NULL DEFAULT 'affiliate',
PRIMARY KEY (party_id,party_type),
FOREIGN KEY (party_id,party_type) REFERENCES party(id,party_type)
);
-- enum is used because MySQL versions before 8.0.16 parse but do not enforce CHECK constraints
This way each party can be of only one type.