best practice for foreign key relationships - mysql

Which of the following two options is the better practice for creating the foreign key?
possibility 1:
table1 : user(id,name,password)
table2 : exams(id,name)
table3 : user_exam(id,user_id,exam_id)
possibility 2:
table1 : exams(id,name)
table2 : user(id,name,password, exam_id)

Based on what you are modeling, I would guess that you have a many-to-many relationship between exams and users: a user can take several exams and an exam can be taken by several users. You could also have exams without users and users without exams. In this case model 2 does not work at all.
In model 2 each user row can hold only one exam_id, so you would have to create multiple user records each time an exam is added for them. This increases the likelihood of data integrity problems, especially since the password is stored there. Do not even consider using model 2 unless you can guarantee there will never be a need for more than one exam per user.
Depending on what type of exams you are talking about, the user_exam table should probably include additional information such as a date. What else you might need depends on the meaning of the data you are modeling.
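A minimal sketch of possibility 1, written with Python's built-in sqlite3 module purely so that it runs anywhere; it is not MySQL syntax verbatim (on MySQL you would use InnoDB tables and AUTO_INCREMENT keys). The taken_on column and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE user  (id INTEGER PRIMARY KEY, name TEXT NOT NULL, password TEXT NOT NULL);
CREATE TABLE exams (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Junction table: one row per (user, exam) pairing.
CREATE TABLE user_exam (
    id       INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES user (id),
    exam_id  INTEGER NOT NULL REFERENCES exams (id),
    taken_on TEXT                       -- hypothetical extra attribute (a date)
);
""")

# Made-up sample data: one user sitting two exams.
conn.execute("INSERT INTO user (id, name, password) VALUES (1, 'alice', 'not-a-real-hash')")
conn.executemany("INSERT INTO exams (id, name) VALUES (?, ?)",
                 [(1, "algebra"), (2, "physics")])
conn.executemany("INSERT INTO user_exam (user_id, exam_id, taken_on) VALUES (?, ?, ?)",
                 [(1, 1, "2024-01-10"), (1, 2, "2024-02-01")])

# The user row is stored once, no matter how many exams are attached to it.
exam_names = [row[0] for row in conn.execute("""
    SELECT e.name
    FROM user_exam ue
    JOIN exams e ON e.id = ue.exam_id
    WHERE ue.user_id = 1
    ORDER BY e.name
""")]

# A link to an exam that does not exist is refused by the foreign key.
try:
    conn.execute("INSERT INTO user_exam (user_id, exam_id) VALUES (1, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

With model 2 the second exam would have required a second copy of the user row, password included.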

Your question is more of an opinion-based question, but as far as I know the foreign keys should be in the junction table. So possibility 1 is the ideal approach.

Related

MySQL Database Layout/Modelling/Design Approach / Relationships

Scenario: Multiple Types to a single type; one to many.
So for example:
parent multiple type: students table, suppliers table, customers table, hotels table
child single type: banking details
So a student may have multiple banking details, as can a supplier, etc etc.
Layout Option 1 students table (id) + students_banking_details (student_id) table with the appropriate id relationship, repeat per parent type.
Layout Option 2 students table (+others) + banking_details table. banking_details would have a parent_id column for linking and a parent_type field for determining what the parent is (student / supplier / customers etc).
Layout Option 3 students table (+others) + banking_details table. Then I would create another association table per parent type (eg: students_banking_details) for the linking of student_id and banking_details_id.
Layout Option 4 students table (+others) + banking_details table. banking_details would have a column for each parent type, ie: student_id, supplier_id, customers_id - etc.
Other? Your input...
My thoughts on each of these:
Option 1: Multiple tables holding the same type of information seems wrong. If I want to change what gets stored about banking details, that's also several tables I have to change as opposed to one.
Option 2: Seems like the most viable option. Apparently this doesn't maintain 'referential integrity' though. I don't know how important that is to me if I'm just going to be cleaning up children programmatically when I delete the parents?
Option 3: Same as (2) except with an extra table per type, so my logic tells me this would be slower than (2), with more tables and the same outcome.
Option 4: Seems dirty to me, with a bunch of null fields in the banking_details table.
Before going any further: if you do decide on a design for storing banking details which lacks referential integrity, please tell me who's going to be running it so I can never, ever do business with them. It's that important. Constraints in your application logic may be followed; things happen, exceptions, interruptions, inconsistencies which are later reflected in data because there aren't meaningful safeguards. Constraints in your schema design must be followed. Much safer, and banking data is something to be as safe as possible with.
You're correct in identifying #1 as suboptimal; an account is an account, no matter who owns it. #2 is out because referential integrity is non-negotiable. #3 is, strictly speaking, the most viable approach, although if you know you're never going to need to worry about expanding the number of entities who might have banking details, you could get away with #4 and a CHECK constraint to ensure that each row only has a value for one of the four foreign keys -- but you're using MySQL, which parses but ignores CHECK constraints in versions before 8.0.16, so unless you can rely on a newer version, go with #3.
Index your foreign keys and performance will be fine. Views are nice to avoid boilerplate JOINs if you have a need to do that.
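As a rough illustration of option #3, here is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL so that it is runnable as-is. The account_number column, the index name, the ON DELETE CASCADE rule and the sample rows are all assumptions, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE students        (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE suppliers       (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE banking_details (id INTEGER PRIMARY KEY, account_number TEXT NOT NULL);

-- One association table per parent type, each with real foreign keys.
CREATE TABLE students_banking_details (
    student_id         INTEGER NOT NULL REFERENCES students (id) ON DELETE CASCADE,
    banking_details_id INTEGER NOT NULL REFERENCES banking_details (id),
    PRIMARY KEY (student_id, banking_details_id)
);
CREATE TABLE suppliers_banking_details (
    supplier_id        INTEGER NOT NULL REFERENCES suppliers (id) ON DELETE CASCADE,
    banking_details_id INTEGER NOT NULL REFERENCES banking_details (id),
    PRIMARY KEY (supplier_id, banking_details_id)
);
-- The primary key already covers lookups by parent; index the other FK too.
CREATE INDEX idx_students_bd_account ON students_banking_details (banking_details_id);
CREATE INDEX idx_suppliers_bd_account ON suppliers_banking_details (banking_details_id);

INSERT INTO students VALUES (1, 'Sam');
INSERT INTO suppliers VALUES (1, 'Acme');
INSERT INTO banking_details VALUES (1, 'ACC-1'), (2, 'ACC-2'), (3, 'ACC-3');
INSERT INTO students_banking_details VALUES (1, 1), (1, 2);
INSERT INTO suppliers_banking_details VALUES (1, 3);
""")

student_accounts = [row[0] for row in conn.execute("""
    SELECT b.account_number
    FROM students_banking_details sb
    JOIN banking_details b ON b.id = sb.banking_details_id
    WHERE sb.student_id = 1
    ORDER BY b.account_number
""")]

# The schema itself refuses a link to a student that does not exist.
try:
    conn.execute("INSERT INTO students_banking_details VALUES (42, 3)")
    bad_link_rejected = False
except sqlite3.IntegrityError:
    bad_link_rejected = True

# Deleting the parent removes its links without any application-side cleanup.
conn.execute("DELETE FROM students WHERE id = 1")
links_after_delete = conn.execute(
    "SELECT COUNT(*) FROM students_banking_details").fetchone()[0]
```

Note that in this sketch the cascade only removes the link rows; the banking_details rows themselves remain and would need their own cleanup rule if they must not outlive their owner.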

SQL many-to-many relation 3 ways

Hej all,
Let's say I have 4 tables named "user", "office", "product" and "event", and another table named "document". The same document can be assigned to one or many users, offices, products and events, so we need a many-to-many relationship. I see 3 ways to do that:
-a table named "user_document" and others named "office_document", "product_document" and "event_document". Each has a field "document_id", which is a foreign key to the document id, and another field (for example "user_id" in user_document) which is a foreign key to the user id, and so on for office, product and event.
OR
-a table named "document_ownership" with the fields "document_id", "user_id", "office_id", "product_id" and "event_id". Here document_id is NOT NULL and the other fields can be NULL, with one or more of them set. For example, if I assign the same document to a user and a product, I will have a row with document_id, user_id and product_id not NULL.
OR
-a table named "document_ownership" with the fields "document_id", "relation_type" and "relation_id". Here relation_type is either a string naming the related table, or a foreign key to an additional table named for example "relationtype" holding strings such as "user" (id=1), "office" (id=2), "product" (id=3) and "event" (id=4), which also name the related table. relation_id is the id of the row in the table specified by relation_type.
My question is: what are the pros and cons of these 3 ways of doing what I want, and which is the best practice?
Thanks in advance for your advice,
Michal
This question is not really answerable as asked. A purist would say that approach 1 is correct, but it is not always that simple. Think of it like this: your database design should express the relationships between the data and what the data means. So each of your approaches implies several things about the nature of the data.
Approach 1 says that user, office, product and event are important, and oh yeah they can have documents. Maybe.
Approach 2 says that documents are important, and we need to track what each document relates to. So the document is the key thing and everything else is annotated around that.
Approach 3 is more complicated and technical and does not really give an idea of how you want the data to be used.
In all cases the data is the same. It is just a matter of designing the data to tell the story of how it should be used.
Sorry to wax lyrical. Just my $0.02.
In a conceptual data model (Merise) view you have:
Document-0,n---------0,n-User
Document-0,n---------0,n-Event
...
This is the logical view.
When you transform this to the physical data view you will end up with one more table for each relation.
So the first solution is the way to go if you want to apply best practice in data modeling.
Concerning the two other solutions, which break some normal forms:
The second solution is a total no-go. You will have a lot of NULL values everywhere and will struggle to do some basic statistics because of that.
The third solution, which looks like a plate of spaghetti, will work overall and is, in my view, a good alternative, if you can handle the loss of integrity constraints.
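The "loss of integrity constraints" in the third solution can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs as-is; only the user side of the first solution is shown, and the sample rows and the try_insert helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE user     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT);

-- Solution 1: one link table per related table, both columns are real FKs.
CREATE TABLE user_document (
    user_id     INTEGER NOT NULL REFERENCES user (id),
    document_id INTEGER NOT NULL REFERENCES document (id),
    PRIMARY KEY (user_id, document_id)
);

-- Solution 3: a generic link; relation_id cannot be declared as a FK
-- because the table it points to varies from row to row.
CREATE TABLE document_ownership (
    document_id   INTEGER NOT NULL REFERENCES document (id),
    relation_type TEXT    NOT NULL,
    relation_id   INTEGER NOT NULL
);

INSERT INTO user VALUES (1, 'michal');
INSERT INTO document VALUES (1, 'spec');
""")

def try_insert(sql):
    """Hypothetical helper: return True if the row was accepted, False if refused."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# There is no user 999 in either case.
typed_link_accepted = try_insert("INSERT INTO user_document VALUES (999, 1)")
generic_link_accepted = try_insert(
    "INSERT INTO document_ownership VALUES (1, 'user', 999)")
```

The typed link table refuses the dangling reference, while the generic table stores it silently, so the cleanup burden moves into application code.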

How can I join two tables with two different primary keys into another table?

I have two tables: students and courses, assuming that each student can be in more than one course and that each course can have more than one student.
[Table Students]
id (PK)
name
age
etc...
[Table Courses]
id (PK)
name
duration
etc...
What I want to do is relate both tables through another table, for example studying, in which I will store the course or courses that each student is taking. Like this:
[Table studying]
idStudent
idCourse
What I have deduced
I think that idStudent and idCourse should be foreign keys, because the information is stored in students and courses respectively, each with a unique primary key, and to respect the consistency of the database. A relation cannot exist without the information of both the student and the course, or with the information of only one of them.
I also know that some tables have two primary keys, so that a value of one primary key can be repeated in the table, but not the same values of both primary keys at the same time.
My questions
Do these ids (idStudent, idCourse) have to be primary keys or foreign keys?
Should the table studying have another column with an ID?
Is my deduction on the right track?
P.S: I do not need sql statements, I just need help to clarify my confusion.
Thanks in advance!
Do these ids (idStudent, idCourse) have to be primary keys or foreign keys?
You want them to be foreign keys, because the existence of each record in your third table depends on the existence of the records in the first two; that is, there cannot be a "Student Course" or a "Course with Students" without both the course and the student. There could be (if you don't make those columns foreign keys), but you would break referential integrity.
On the other hand, having FKs is usually a good thing because it makes sure that you don't remove records that other rows depend on by mistake (which is what the constraint is for in the first place), unless you did something like cascade deleting.
Should the table studying have another column with an ID?
No, it does not have to, but sometimes it is good practice because some software, such as object-relational mappers and diagramming tools, may rely on every table having a by-convention primary key. Some do not even support composite keys, so while it is not mandatory it can help in the future and it does not hurt. Of course this all depends on what you are using the database for and how (pure SQL, which engine you use, whether you use a framework, etc.).
Is my deduction on the right track?
It is all relative, but I think your logic is good. My advice is to always design your data schemas to be as flexible as you can, because as a project grows it's harder (and more costly) to make those changes down the road. Invest time in thinking about how you may expand your application's functionality and whether the schema will adapt to it.
Your deduction is correct.
In fact, you should have a composite primary key consisting of both columns (idStudent, idCourse), because this tuple identifies a row in the table. You do not need an additional ID column (you can of course add one and make it the primary key, but you do not need it if a student can be assigned to a given course only once).
To respect the integrity, both columns (separately) should be foreign keys - idStudent should be referencing id column of Students table and idCourse should reference id column of Courses table.
If you like, you can make them the primary key of the studying table, but this is not strictly necessary: the table models a many-to-many relation and can work without a primary key. You also need to know that when you make the pair (student id, course id) the primary key, there can be only one row for each pair, which is equivalent to a unique constraint: a student can take a course only once. In the future you might want to add a start_date to this table, and then this kind of primary key could be a problem and you would need to modify it.
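The asker did not want SQL, but for later readers here is a small sketch of the composite-key variant discussed in the answers. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it can be run directly; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE courses  (id INTEGER PRIMARY KEY, name TEXT, duration INTEGER);

-- Each column is a foreign key on its own; the pair is the primary key.
CREATE TABLE studying (
    idStudent INTEGER NOT NULL REFERENCES students (id),
    idCourse  INTEGER NOT NULL REFERENCES courses (id),
    PRIMARY KEY (idStudent, idCourse)
);

INSERT INTO students VALUES (1, 'Ana', 20), (2, 'Ben', 22);
INSERT INTO courses  VALUES (1, 'SQL', 30), (2, 'Python', 40);
""")

# A student value may repeat, and a course value may repeat...
conn.executemany("INSERT INTO studying VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])
enrolment_count = conn.execute("SELECT COUNT(*) FROM studying").fetchone()[0]

# ...but the same (student, course) pair may not appear twice.
try:
    conn.execute("INSERT INTO studying VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# And a pair that points at a missing course is refused by the foreign key.
try:
    conn.execute("INSERT INTO studying VALUES (2, 99)")
    missing_course_rejected = False
except sqlite3.IntegrityError:
    missing_course_rejected = True
```

If a student may later retake a course, the pair alone no longer identifies a row, which is the situation the last answer warns about.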

Model a table that can have a relationship with several tables

I have a table called 'notes', on this table I need to track who made that note, but the problem is that the creator of the note can be a user stored in one of three possible tables:
users
leads
managers
I have thought of simply creating three fields on 'notes' to represent the three possible relations: note.user, note.lead, note.manager
With this approach I would be forced to create three table joins when requesting the notes to gather the creators information, and I don't think that is the way to go, so I would like to hear your ideas or comments and what would be the best approach on this.
For me personally this smells like a design problem in a totally different part of the schema: Are managers not users? Do leads carry person information?
With any approach that creates a relation between one column and one of three others, you will need three joins for the select. If you can't rectify the underlying problem, I recommend you use
note_type ENUM('users','leads','managers')
as an additional field and
SELECT
...
IFNULL(users.name, IFNULL(managers.name, leads.name)) AS name
..
FROM notes
LEFT JOIN users ON notes.note_type='users' AND users.id=notes.note_source
LEFT JOIN managers ON notes.note_type='managers' AND managers.id=notes.note_source
LEFT JOIN leads ON notes.note_type='leads' AND leads.id=notes.note_source
...
for the query
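A runnable sketch of that query, using Python's built-in sqlite3 module as a stand-in for MySQL. SQLite has no ENUM type, so note_type is emulated with a CHECK constraint; COALESCE replaces the nested IFNULL calls (both exist in MySQL). The body column and the sample names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE leads    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE managers (id INTEGER PRIMARY KEY, name TEXT);

CREATE TABLE notes (
    id          INTEGER PRIMARY KEY,
    body        TEXT,     -- hypothetical column for the note text
    -- stands in for ENUM('users','leads','managers')
    note_type   TEXT NOT NULL CHECK (note_type IN ('users', 'leads', 'managers')),
    note_source INTEGER NOT NULL
);

-- The three creators deliberately share id 1: only note_type tells them apart.
INSERT INTO users    VALUES (1, 'Uma');
INSERT INTO leads    VALUES (1, 'Leo');
INSERT INTO managers VALUES (1, 'Mia');
INSERT INTO notes VALUES
    (1, 'first',  'users',    1),
    (2, 'second', 'leads',    1),
    (3, 'third',  'managers', 1);
""")

# At most one of the three LEFT JOINs matches per note; COALESCE picks
# the name from whichever one did.
note_authors = conn.execute("""
    SELECT notes.body,
           COALESCE(users.name, managers.name, leads.name) AS name
    FROM notes
    LEFT JOIN users    ON notes.note_type = 'users'    AND users.id    = notes.note_source
    LEFT JOIN managers ON notes.note_type = 'managers' AND managers.id = notes.note_source
    LEFT JOIN leads    ON notes.note_type = 'leads'    AND leads.id    = notes.note_source
    ORDER BY notes.id
""").fetchall()
```

Nothing in this schema stops note_source from pointing at a row that does not exist, which is why the answer treats it as a fallback rather than a fix.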
I think you need to abstract out the concept of a user id, so that it does not depend on their role. The author of a note could then be specified by the user id.
Users could be assigned roles, and maybe more than one.
The correct way to structure this would be to pull all common data out of users, leads, and managers. Unify this data into a "contact" table. Then if you want to get all notes for a given manager:
managers->contacts->notes
for a lead:
leads->contacts->notes
Notice your original post: "the problem is that the creator of the note can be a user stored in one of three possible tables"
From the structure of your sentence you even admit that all these entities have something in common; they are all users. Why not make the DB reflect this?
You have to model a parent table for the three tables you already have. Define a table that generalizes the user, leads and manager tables, something like "Person". It holds the ids of all rows in the three tables plus any attributes they have in common. When you define the relationship, you put the foreign key "Person_ID" on the note table. And when you model the user, leads and manager tables, their primary key is also a foreign key to the Person table.
So you would have something like this:
Table users:
CREATE TABLE Users (
    person_id INT PRIMARY KEY,
    -- ...(attributes of Users)
    FOREIGN KEY (person_id) REFERENCES Person (person_id)
);
The model I describe is common to any relational design where you have to model parents and children.
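To show how the note lookup collapses to a single join under this parent-table design, here is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The lowercase table names, the role-specific columns (login, source, department) and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
-- Parent table: one row per person, whatever their role.
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Child tables: the primary key doubles as a foreign key to person.
CREATE TABLE users    (person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
                       login TEXT);
CREATE TABLE leads    (person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
                       source TEXT);
CREATE TABLE managers (person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
                       department TEXT);

-- Notes reference the parent only, so one foreign key covers every role.
CREATE TABLE notes (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person (person_id),
    body      TEXT
);

INSERT INTO person   VALUES (1, 'Uma'), (2, 'Mia');
INSERT INTO users    VALUES (1, 'uma01');
INSERT INTO managers VALUES (2, 'sales');
INSERT INTO notes    VALUES (1, 1, 'call back'), (2, 2, 'approve');
""")

# A single join finds the author, regardless of which child table they are in.
authors = conn.execute("""
    SELECT n.body, p.name
    FROM notes n
    JOIN person p ON p.person_id = n.person_id
    ORDER BY n.id
""").fetchall()

# A note written by a person who does not exist is refused.
try:
    conn.execute("INSERT INTO notes VALUES (3, 77, 'ghost')")
    ghost_note_rejected = False
except sqlite3.IntegrityError:
    ghost_note_rejected = True
```

This sketch does not prevent the same person_id from appearing in more than one child table; whether that is allowed (a manager who is also a user) is a modeling decision.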

Different database tables joining on single table

So imagine you have multiple tables in your database, each with its own structure and each with a PRIMARY KEY of its own.
Now you want to have a Favorites table so that users can add items as favorites. Since there are multiple tables the first thing that comes in mind is to create one Favorites table per table:
Say you have a table called Posts with PRIMARY KEY (post_id) and you create a Post_Favorites with PRIMARY KEY (user_id, post_id)
This would probably be the simplest solution, but could it be possible to have one Favorites table joining across multiple tables?
I've thought of the following as a possible solution:
Create a new table called Master with primary key (master_id). Add triggers on all tables in your database on insert, to generate a new master_id and write it along the row in your table. Also let's consider that we also write in the Master table, where the master_id has been used (on which table)
Now you can have one Favorites table with PRIMARY KEY (user_id, master_id)
You can select from the Favorites table, join with each individual table on the master_id and get the favorites per table. But would it be possible to get all the favorites with one query (maybe not a query, but a stored procedure)?
Do you think that this is a stupid approach? Since you will perform one query per table what are you gaining by having a single table?
What are your thoughts on the matter?
One way would be to sub-type all possible tables to a generic super-type (Entity) and then link user preferences to that super-type.
I think you're on the right track, but a table-based inheritance approach would be great here:
Create a table master_ids, with just one column: an int-identity primary key field called master_id.
On your other tables, (users as an example), change the user_id column from being an int-identity primary key to being just an int primary key. Next, make user_id a foreign key to master_ids.master_id.
This largely preserves data integrity. The only place you can trip up is if you have a master_id = 1, and with a user_id = 1 and a post_id = 1. For a given master_id, you should have only one entry across all tables. In this scenario you have no way of knowing whether master_id 1 refers to the user or to the post. A way to make sure this doesn't happen is to add a second column to the master_ids table, a type_id column. Type_id 1 can refer to users, type_id 2 can refer to posts, etc.. Then you are pretty much good.
Code "gymnastics" may be a bit necessary for inserts. If you're using a good ORM, it shouldn't be a problem. If not, stored procs for inserts are the way to go. But you're having your cake and eating it too.
I'm not sure I really understand the alternative you propose.
But in general, when given the choice of 1) "more tables" or 2) "a mega-table supported by a bunch of fancy code work" ..your interests are best served by more tables without the code gymnastics.
A red flag was "Add triggers on all tables in your database": each trigger fire is a performance hit of its own.
The database designers have built in all kinds of technology to optimize tables/indexes, much of it behind the scenes without you knowing it. Just sit back and enjoy the ride.
Try these for inspiration Database Answers ..no affiliation to me.
An alternative to your approach might be to have the favorites table as (user_id, object_id, object_type). When inserting into the favorites table, just insert the type of the favorite as well. However, I don't see a simple query being able to work with either your approach or mine. One way to go about it might be to use UNION to get one combined result set and then identify what type of record each row is based on the type. Another thing you can do is turn the UNION query into a MySQL VIEW and simply query that VIEW.
The benefit of using a single table for favorites is simplicity, although some might consider it to go against database normalization rules. On the upside, you don't have to create so many favorites tables, and you can add anything to favorites easily by just coming up with a new object_type identifier.
It sounds like you have an is-a type relationship that needs to be modeled. All of the items that can be favourited are a type of "item". It sounds like you are on the right track, but I wouldn't use triggers. What could be the right answer, if I have understood correctly, is to pull all the common fields into a single table called items (master is a poor name, master of what?). This should include all the common data that would be needed when you need a user's favourite items; I'd expect this to include fields like item_id (primary key), item_type and human_readable_name, and maybe some metadata about when the item was created, modified etc. Each of your specific item types would have its own table containing data specific to that item type, with an item_id field that has a foreign key relationship to the items table. Then you'd wrap each item type in its own insertion, update and selection SPs (i.e. InsertItemCheese, UpdateItemMonkey, SelectItemCarKeys). The favourites table would then work as you describe, but you only need to select from the items table. If your app needs the specific data for each item type, it would have to be queried for each item (caching is your friend here).
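The UNION-as-a-VIEW idea can be sketched as follows, using Python's built-in sqlite3 module as a stand-in for MySQL so it runs as-is. The videos table, the view name favorite_items, the object_type strings and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts  (post_id  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE videos (video_id INTEGER PRIMARY KEY, title TEXT);  -- hypothetical 2nd type

-- One favorites table for every kind of object; object_id has no FK.
CREATE TABLE favorites (
    user_id     INTEGER NOT NULL,
    object_id   INTEGER NOT NULL,
    object_type TEXT    NOT NULL,
    PRIMARY KEY (user_id, object_type, object_id)
);

-- One SELECT per object type, glued together and hidden behind a view.
CREATE VIEW favorite_items AS
    SELECT f.user_id, f.object_type, p.title
    FROM favorites f
    JOIN posts p ON f.object_type = 'post' AND p.post_id = f.object_id
    UNION ALL
    SELECT f.user_id, f.object_type, v.title
    FROM favorites f
    JOIN videos v ON f.object_type = 'video' AND v.video_id = f.object_id;

INSERT INTO posts  VALUES (1, 'Hello');
INSERT INTO videos VALUES (1, 'Intro');
INSERT INTO favorites VALUES
    (7, 1, 'post'),
    (7, 1, 'video'),
    (7, 2, 'post');   -- points at a post that does not exist
""")

# Callers query the view as if it were one table.
favorites_of_user_7 = conn.execute("""
    SELECT object_type, title
    FROM favorite_items
    WHERE user_id = 7
    ORDER BY object_type
""").fetchall()
```

Supporting a new favoritable type means adding one more branch to the view; the dangling favorite is not rejected, it simply drops out of the joined result.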
If MySQL supports SPs with multiple result sets you could write one that outputs all the items as a result set, then a result set for each item type if you need all the specific item data in one go. For most cases I would not expect you to need all the data all the time.
Keep in mind that not EVERY use of a PK column needs a constraint. For example a logging table. Even though a logging table has a copy of the PK column from the table being logged, you can't build a constraint.
What would be the worst possible case. You insert a record for Oprah's TV show into the favorites table and then next year you delete the Oprah Show from the list of TV shows but don't delete that ID from the Favorites table? Will that break anything? Probably not. When you join favorites to TV shows that record will fall out of the result set.
There are a couple of ways to share values for PK's. Oracle has the advantage of sequences. If you don't have those you can add a "Step" to your Autonumber fields. There's always a risk though.
Say you think you'll never have more than 10 tables of "things which could be favorited". Then start your PKs at 0 for the first table and increment by 10, at 1 for the second table and increment by 10, at 2 for the third... and so on. That will guarantee that all the values will be unique across those 10 tables. The risk is that a future requirement will add table 11. You can always 'pad' your guesstimate.
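The arithmetic behind that numbering scheme, as a small Python sketch. The function names are hypothetical, and how you configure the start value and step on a real database depends on the engine.

```python
from itertools import count, islice

def interleaved_ids(table_index, table_count):
    """Yield the keys reserved for one table: start at its index, step by the table count."""
    if not 0 <= table_index < table_count:
        raise ValueError("table_index must be in range(table_count)")
    return count(start=table_index, step=table_count)

def owner_table(pk, table_count):
    """Recover which table a key belongs to from the key alone."""
    return pk % table_count

# With 10 tables, the first table gets 0, 10, 20, ... and the second 1, 11, 21, ...
first_table_ids = list(islice(interleaved_ids(0, 10), 3))
second_table_ids = list(islice(interleaved_ids(1, 10), 3))
```

Because every key is congruent to its table index modulo the table count, no two tables can ever issue the same value. The scheme breaks as soon as an eleventh table is needed, which is the risk the answer points out.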