Quick question about DB design! In this example there are users and schedules. Each user can have many schedules and each schedule can belong to many users.
I have two tables, 'user' and 'schedule', that each have a unique identifier/primary key (user_id and schedule_id): these tables have a many-to-many relationship.
This is where I am unsure/inexperienced: In order to connect them together and adhere to good db design, I want to create a link table that has two columns, user_id and schedule_id. I plan to make these both primary keys (therefore a composite key). However, do I also add two foreign keys, one on user_id linked to the 'user' table and one on schedule_id linked to the 'schedule' table?
TLDR: I plan to use a composite key in 2-column 'link' table that connects two tables. Should/Do I also need to make those into foreign keys?
PKs and FKs serve different purposes. In a link table, you need the PK to preserve uniqueness of the data. However, if you do not also create the FKs then you may end up with data integrity problems because the ID could be deleted from the original table and not the link table.
Sometimes people think they can get away without the FKs because they will enforce data integrity through the application. Almost always this is because they find it annoying when the constraints won't let them do something they want to do. Of course that is the purpose of the constraint, to prevent users and developers from doing things they should not. Data integrity must be preserved through the database; it is too important to risk letting the application handle it. I have seen a lot of data from hundreds of databases and the ones with the worst data are invariably the ones where the devs thought they could manage stuff like table relationships through the application. There are always holes when you do this and eventually they come back to bite you and then they can be very difficult to fix properly.
Related
A database was created with 5 tables. These tables were populated with data upon creation - perhaps it was imported from a previous database.
When the DB was created, primary keys were created for each table, however foreign keys were not.
How do I run a query to identify which tables columns contain data that relates to the PK in other tables? Effectively, how do I identify the FK column(s) on each table? Some tables may contain 2 FK's.
The end goal is to identify the FK('s) in each table and properly set up the table with appropriate FK structure and table relations.
Don't try to use queries to automate this database design / reverse-engineering process. (If you had 500 tables, maybe. But you only have five.)
Eyeball your table definitions. If you have, for example, an id primary key column in your user table, your contact table might have a user_id column. That is the FK to user.id. It will help you greatly if you really understand how your tables tie together with FKs.
And, keep in mind that your system will still work tolerably well if you don't bother to actually declare these foreign keys. What you'll lose:
constraints, in which the database engine prevents, for example a contact.user_id column value that doesn't point to any user.id row.
possibly some helpful indexing.
MySql Workbench has a reverse engineering feature. It inspects the definition of a database and does its best to sort out various entities (tables) and the relationships (foreign key dependencies) between them. It presents graphical e:r diagrams and can generate DDL. That can help you understand a database and set up appropriate FKs. But still, check the relationships it suggests: this data is yours, not Workbench's.
This is my first database design of my home library. I have a question about how to implement the primary keys in each table, I am also interested to know if it is acceptable to have many tables associated with one foreign as I have with the many relationships with "contributor_id."
Not all of my tables have a primary key. If a primary key should NOT be the foreign key (as I have come to understand it) what could serve as a primary key in copy_info and book_info tables? I am not sure I have correctly implemented the intermediary tables where there many to many relationships might exist. Is this a situation where I should require a composite primary
key?
Maybe there are differing opinions on how to do this, but any insight would be appreciated. Where I am brand new to this, please excuse if my question is not sufficiently specific.
I just give few quick advices:
I understand the 4 tables on the left are intermediary tables between book_info and other 4 tables with translators info, authors info and so on; am I right? So is a many-to-many relationships (a book can have many authors and one authors has many books), but if you set up the tables like this every time you try to insert the same contributor for another book it will cause a duplicate key error! So, remove the keys from there.
the right way to assign a key to these aggregate tables with be to assign the key to both values (so you will never break the "uniqueness" constraint), BUT is completely unuseful (and a waste of space and Db resources) to assign keys in these simple aggregate tables
books and book_info is a one-to-one relationship? So it makes sense to split this in 2 separate tables ONLY for tidiness purposes (for example if you have a lot of fields and you want to separate them by some logical criteria)
books and book_info is another many-to-many relationship? In this case the "id" field as key is a correct solution, but is not correct to couple together the tables books and copy_info (better to couple the copy_info to the book_info, so you wil split it to two one-to-many relationship (easier to manage in this case)
i hope I helped a little, anyway you can easily solve it with little practice on the field.
I was misunderstanding how to implement a many-to-many relationship. Once I put in junction tables, my tables all had primary keys.
Updated Design
I'm using MySQL and have been planning out the database structure for a system I'm building out. As I've been going along, I started to wonder if it was acceptable to have a particular foreign key constraint in many different tables. From what I understand, it would be fine, as it makes sense. But I'd like to double check.
For example, I have a users table, and I use the user_id as a foreign key for many tables, sometimes multiple times in one table. For example, I have a one-to-one relationship with a user_settings table, which of course stores the user_id. And then I have a companies table, which alone has a few references to the user_id key. In this case, I have a column that keeps track of the user that created the company in the system (created_by), a column for the main contact (main_contact, who is also a user of the system), and there might be another reference. So that alone, already has the user_id key being used as a foreign key constraint 3-4 times.
Just to add another bit of info, I have a tasks table and that of course needs to reference the user_id to keep track of who it's assigned to, and I also have another column that keeps track of the user that created the task. That would be assigned_to and created_by, respectively.
There are more tables though that reference back to that key. I might be up to 8 references already. I do believe I've designed it properly so far, but based on what I've mentioned, does this sound fine?
Your foreign key usage seems fine to me - after all, you are simply representing logical relationships between your tables.
A user within your system interacts with the data in many ways, and to define these relationships your approach is the correct one.
The key point I think is that under a lot circumstances, you won't always want (or need) to make all the joins that represent your relationships - simply the ones that you need in that context.
As per my undestanding the way you are defining is fine i.e to use a user id to many tables as foreign key.
If your line:: I have a companies table, which alone has a few references to the user_id key doesn't mean that you are using multipe user_id in same table and I know you are not.
I am designing a database for student information. I wish to implement the best practices regarding separate tables and use of Primary and Foreign Keys.
Let's say I have the following tables (High Level):
Users
Student Information
Student Transcripts
Student Records
There will be different users with different levels. Also, the information in Student Info/Transcripts/Records will all have a Foreign Key with the ID that's in Users.
SO, it would be dumb to just clump all the tables into one big table, wouldn't it? Is it a good idea to keep all this information separate and just use Primary/Foreign keys to link things together, as well as maybe Joins? I just personally think a big table would be quite messy and through this way, it allows one to keep similar data together with its own kind.
Thanks for all input on the matter!
Why can't I just leave those relationships out?
What's the point of them?
I can stil run queries and treat them like it a relationship myself...
Yes, you can always leave the foreign key constraints out but then you will be responsible about the integrity of your data. If you use foreign key constraints, then you won't have to worry about the referential integrity among tables. You can read more about referential integrity from Wikipedia. I will also try to explain it with an example below.
Think of a shopping cart scenario. You have three tables: item, shopping_cart and shopping_cart_item. You can choose not to define any relationship between these tables, that's fine for any SQL solution. When user starts shopping, you create a shopping cart by adding a shopping_cart entry. As user adds items to his shopping cart, you save this information by adding rows to shopping_cart_item table.
One problem may occur at this step: If you have a buggy code that assigns incorrect shopping_cart_id's to shopping_cart_items, then you will definitely end up with incorrect data! Yes, you can have this case even with a foreign key constraint if the assigned id actually exists in the shopping_cart table. But this error will be more detectable when a foreign key exists since it would not insert shopping_cart_item record when the foreign key constraint fails.
Let's continue with the assumption that your code is not buggy and you won't have first type of referential integrity. Then suddenly a user wants to stop shopping and delete the cart and you chose to implement this case by deleting the shopping_cart and shopping_cart_item entries. Then you will have to delete entries in both tables with two separate queries. If something goes wrong after you delete shopping_cart entries, then you will again have a referential integrity problem: You will have shopping_cart_items that are not related to any shopping_cart. You will then have to introduce transaction managing, try to provide meaningful data to your business logic about the error happened in data access layer, etc..
In this type of scenario's, foreign keys can save life. You can define a foreign key constraint that will prevent insertion of any sort of incorrect data and you can define cascade operations that will automatically perform deletion of related data.
If there is anything unclear, just leave a comment and I can improve the answer.
Apart from what the others have said about why you technically want (actually: need) them:
foreign key constraints also document your model.
When looking at a model without FK constraints you have no idea which table relates to which. But with FK constraints in place you immediately see how things belong together.
You create FOREIGN KEYs to instruct the database engine to ensure that you never perform an action on the database that creates invalid records.
So, if you create a FOREIGN KEY relationship between users.id and visits.userid the engine will refuse to perform any actions that result in a userid value in visits that does not exist in users. This might be adding an unknown userid to visits, removing an id from users that already exists in visits, or updating either field to "break" the relationship.
That is why PRIMARY and FOREIGN KEYs are referred to as referential integrity constraints. The tell your database engine how to keep your data correct.
It doesn't allow you to enter an id which does not exist in another table, for example, if you have products and you keep owner Id, by creating a foreign key ton the owner id to id field of the owners table, you do not allow users to create an object record which has an owner id which does not exist in the owner table. such things are called referential intergrity.
The foreign key constraint helps you ensure referential integrity.
If you delete a row in one table, mysql can automatically delete all rows in other tables that the deleted row refers to via the foreign key. You can also make it reject the delete command.
Also when you try to insert a row, mysql can automatically create new rows in other tables, so the foreign key does not refer to nothing.
That is what referential integrity is all about.
Databases can be affected by more than just the application. Not all data changes go through the application even if they are supposed to. People change stuff directly on the database all the time. Rules that need to apply to all data all the time belong on the database. Suppose you can update the prices of your stock. That's great for updating anindividual price. But what happens when the boss decides to raise all prices by 15%. No one is going to go through and change 10,000 prices one at a time through the GUI, they are going to write a quick SQL script to do the update. Or suppose two suppliers join together to have one company and you want to change all of thie items to be the new company. Those kinds of changes happen to databases every day and they too need to follow the rules for data integrity.
New developers may not know about all the places where the foreign key relationships should exist and thus make mistakes which cause the data to be no longer useful.
Databases without foreign key constraints have close to a 100% chance of having bad data in them. Do you really want to have orders where you can't identify who the customers were?
THe FKS will prevent you from deleting a customer who has orders for instance or if you use a natural key of company_name and the name changes, all related records must be changed with the key change.
Or suppose you decide to put a new GUI together and dump the old one, then you might have to figure out all the FK relationships again (because you are using a different datalayer or ORM) and the chances are you might miss some.
It is irresponsible in the extreme to not put in FK relationships. You are risking the lifeblood of your company's business because you think it is a pain to do. I'd fire you if you suggested not using FKs because I would know I couldn't trust my company's data to you.