Understanding referential integrity in databases - mysql

I just started learning databases. My professor mentioned the term "referential integrity" and I'm trying to understand it.
Here's my understanding. If I have
Table Manager: Manager_id , Manager_name
Table Employee: Employee_id , Manager_id , Employee_name
Manager_id is the primary key for "Manager table" & foreign key for "Employee table"
If I delete/update any Manager_id in Manager, then all the Employee entries having that manager id will get deleted/updated. That's a cascading update or delete.
But what if I try to delete or update Manager_id in the "Employee table"? Will it correspondingly delete the entries from the Manager table?

"If I delete/update any Manager_id from Manager then all the entries having that manager id will get deleted/updated. Thats cascading update or delete."
This happens only if you have defined ON UPDATE/ON DELETE rules for the foreign key. By default, cascading updates and deletes are disabled, because they might cause unexpected changes in your data.
"But what if I try to delete or update manager_id in the "employee table" ? Will it correspondingly delete the entries from Manager table?"
It will delete only that single employee, because the related manager could still be referenced by some other employee.
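As a rough illustration of the default behaviour described above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere; SQLite needs the PRAGMA shown, whereas MySQL's InnoDB enforces foreign keys without it). The table and column names come from the question; the sample rows are made up.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on
conn.executescript("""
    CREATE TABLE Manager (
        Manager_id   INTEGER PRIMARY KEY,
        Manager_name TEXT
    );
    CREATE TABLE Employee (
        Employee_id   INTEGER PRIMARY KEY,
        Manager_id    INTEGER REFERENCES Manager(Manager_id),  -- no ON DELETE / ON UPDATE rule
        Employee_name TEXT
    );
    INSERT INTO Manager VALUES (1, 'Manager1');
    INSERT INTO Employee VALUES (10, 1, 'Employee1'), (11, 1, 'Employee2');
""")

# Deleting a referenced manager is rejected because no cascade rule was declared.
try:
    conn.execute("DELETE FROM Manager WHERE Manager_id = 1")
    parent_delete_blocked = False
except sqlite3.IntegrityError:
    parent_delete_blocked = True

# Deleting an employee removes only that employee; the manager row is untouched.
conn.execute("DELETE FROM Employee WHERE Employee_id = 10")
managers_left = conn.execute("SELECT COUNT(*) FROM Manager").fetchone()[0]
employees_left = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(parent_delete_blocked, managers_left, employees_left)
```

The manager survives the employee delete, and the other employee still points at it.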

The 'cascade' part goes from the Primary Key towards the Foreign Key, but not the other way around.
When you delete Manager1, let's say, it makes no sense for Employee1 to still have Manager1 as a manager, since Manager1 no longer exists.
But when you change Employee1's manager from Manager1 to Manager2 - then that is still fine assuming both managers still exist.
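To see the direction of the cascade concretely, here is a small sketch, again assuming SQLite (through Python's sqlite3) as a stand-in for MySQL. The ON UPDATE CASCADE / ON DELETE CASCADE clauses are standard SQL that MySQL's InnoDB also accepts; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
    CREATE TABLE Manager (Manager_id INTEGER PRIMARY KEY, Manager_name TEXT);
    CREATE TABLE Employee (
        Employee_id   INTEGER PRIMARY KEY,
        Manager_id    INTEGER REFERENCES Manager(Manager_id)
                          ON UPDATE CASCADE ON DELETE CASCADE,
        Employee_name TEXT
    );
    INSERT INTO Manager VALUES (1, 'Manager1'), (2, 'Manager2');
    INSERT INTO Employee VALUES (10, 1, 'Employee1'), (11, 1, 'Employee2');
""")

# Child-side change: move Employee1 to Manager2. Nothing happens to Manager.
conn.execute("UPDATE Employee SET Manager_id = 2 WHERE Employee_id = 10")
managers_after_reassign = conn.execute("SELECT COUNT(*) FROM Manager").fetchone()[0]

# Parent-side update: renumber Manager1; the new key cascades to Employee2.
conn.execute("UPDATE Manager SET Manager_id = 5 WHERE Manager_id = 1")
emp2_manager = conn.execute(
    "SELECT Manager_id FROM Employee WHERE Employee_id = 11").fetchone()[0]

# Parent-side delete: removing Manager2 also removes Employee1, who references it.
conn.execute("DELETE FROM Manager WHERE Manager_id = 2")
remaining = [r[0] for r in conn.execute(
    "SELECT Employee_name FROM Employee ORDER BY Employee_id")]
print(managers_after_reassign, emp2_manager, remaining)
```

Changes made on the Employee side never touch Manager; only changes to the referenced key travel, and only towards the referencing table.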

Related

Constraints on a child table in MySQL

I have this situation:
MANAGER (ManagerID, Salary, .... , email)
PROJECT (ProjectID, ..., Date)
Since there is an M:N relationship between Manager and Project, I'll have a third table:
Manager_has_Project( ManagerID, ProjectID )
where ( ManagerID, ProjectID ) is the compound PK for Manager_has_Project
Let's suppose we have to delete from our database a Manager who has created some projects: SQL won't let us do that. We could add the constraint "ON DELETE CASCADE" on the FK ManagerID in the child table, but in that case we would lose information about, for example, how many managers worked on a project. The alternative is "ON DELETE SET NULL", but since ManagerID is part of the compound PK of Manager_has_Project, it can't be set to NULL.
What would you recommend?
If you want to keep the information, use soft deletes rather than actually removing the rows.
That is, add a column, say is_deleted or deletion_datetime that indicates that a Manager has been deleted. Then you can keep all the information, even about "deleted" managers.
You can use views so "normal" queries would only return managers who are not deleted.
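A minimal sketch of the soft-delete idea, assuming SQLite via Python's sqlite3 in place of MySQL. The deletion_datetime column follows the suggestion above; the view name ActiveManager and the sample rows are hypothetical, and the columns elided in the question are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
    CREATE TABLE Manager (
        ManagerID         INTEGER PRIMARY KEY,
        email             TEXT,
        deletion_datetime TEXT   -- NULL means the manager is still active
    );
    CREATE TABLE Project (ProjectID INTEGER PRIMARY KEY);
    CREATE TABLE Manager_has_Project (
        ManagerID INTEGER NOT NULL REFERENCES Manager(ManagerID),
        ProjectID INTEGER NOT NULL REFERENCES Project(ProjectID),
        PRIMARY KEY (ManagerID, ProjectID)
    );
    -- Hypothetical view that "normal" queries would read from.
    CREATE VIEW ActiveManager AS
        SELECT ManagerID, email FROM Manager WHERE deletion_datetime IS NULL;

    INSERT INTO Manager (ManagerID, email)
        VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO Project VALUES (100);
    INSERT INTO Manager_has_Project VALUES (1, 100), (2, 100);
""")

# "Delete" manager 1 by stamping the row instead of removing it.
conn.execute(
    "UPDATE Manager SET deletion_datetime = CURRENT_TIMESTAMP WHERE ManagerID = 1")

active_ids = [r[0] for r in conn.execute("SELECT ManagerID FROM ActiveManager")]
managers_on_project = conn.execute(
    "SELECT COUNT(*) FROM Manager_has_Project WHERE ProjectID = 100").fetchone()[0]
print(active_ids, managers_on_project)
```

The bridging rows stay intact, so the project still reports two managers even though only one is active.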

Foreign Key settings with regards to bridging tables

I have a question on foreign key settings with regards to bridging tables. I still am unsure of how the deletion process works. My foreign keys are currently all set to On Delete: No Action, so does that mean that in the case of a bridging table, in order to delete records in one or both of the parent tables, I would have to delete the records they feature in in the bridging table first or does it work differently with many-to-many relationships? Apologies if this is a simple, dumb question but it seems pretty difficult for someone new to databases to find clear, simple, easy-to-follow documentation anywhere to explain these things.
The rule is pretty straightforward:
You can't delete a row if some other row exists that references the one you want to delete.
Example: A college photography course is created as a row in the courses table.
INSERT INTO courses SET course_id = 1234, title = 'Photography';
People enroll in the course:
INSERT INTO enrollments SET course_id = 1234, student_id = 9877;
INSERT INTO enrollments SET course_id = 1234, student_id = 9876;
INSERT INTO enrollments SET course_id = 1234, student_id = 9875;
Then the instructor wants to cancel the course.
DELETE FROM courses WHERE course_id = 1234;
This is blocked, because there are rows in enrollments that reference the row in courses.
Likewise, a student may want to withdraw from school this semester. They try to remove their record:
DELETE FROM students WHERE student_id = 9877;
This is blocked, because the student is still enrolled in the photography class.
The enrollments table is a bridging table (I call these intersection tables, but there's no official terminology for these types of tables). It is basically a pair of foreign key columns, which reference the respective tables courses and students.
The foreign key constraints in enrollments require that each of the referenced rows in the other two tables exist. You can't delete either the courses row or the students row while there's an enrollment that references it.
The way to handle this is to delete the dependent row (the one that has the foreign key constraint) before you delete the referenced row.
The optional ON DELETE CASCADE syntax makes a foreign key constraint handle this automatically. That is, deleting a row in courses would automatically delete any rows that reference the course. If you don't use this option, then trying to delete the course returns an error.
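The two behaviours can be put side by side in a short sketch. It assumes SQLite through Python's sqlite3 rather than MySQL, and it deliberately declares the two foreign keys differently (the course key without a rule, the student key with ON DELETE CASCADE) purely to contrast them; the answer's own example blocks both deletes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
    CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE students (student_id INTEGER PRIMARY KEY);
    CREATE TABLE enrollments (
        course_id  INTEGER NOT NULL REFERENCES courses(course_id),   -- default: no cascade
        student_id INTEGER NOT NULL REFERENCES students(student_id)
                       ON DELETE CASCADE,                            -- contrast case
        PRIMARY KEY (course_id, student_id)
    );
    INSERT INTO courses VALUES (1234, 'Photography');
    INSERT INTO students VALUES (9877), (9876), (9875);
    INSERT INTO enrollments VALUES (1234, 9877), (1234, 9876), (1234, 9875);
""")

def count(sql):
    """Return the single number produced by a COUNT(*) query."""
    return conn.execute(sql).fetchone()[0]

# Blocked: enrollments still reference the course.
try:
    conn.execute("DELETE FROM courses WHERE course_id = 1234")
    course_delete_blocked = False
except sqlite3.IntegrityError:
    course_delete_blocked = True

# Cascades: the student's enrollment row is removed along with the student.
conn.execute("DELETE FROM students WHERE student_id = 9877")
enrollments_after_withdrawal = count("SELECT COUNT(*) FROM enrollments")

# Manual order for the non-cascading key: dependent rows first, then the course.
conn.execute("DELETE FROM enrollments WHERE course_id = 1234")
conn.execute("DELETE FROM courses WHERE course_id = 1234")
courses_left = count("SELECT COUNT(*) FROM courses")
print(course_delete_blocked, enrollments_after_withdrawal, courses_left)
```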

SQL: delete from joint tables where count condition

I have these tables:
tutors:
email, firstname, lastname...
courses:
url, tutorid, description
reviews:
review, courseid, author
author in reviews and tutorid in courses are foreign keys that reference tutors.email.
I need to delete all tutors that have 2 or more courses without a description.
I first tried to just select such tutors:
select tutors.email, COUNT(courses.url)
from courses
left join tutors on courses.tutorid = tutors.email
where description is null
group by tutors.email;
This works fine. However, I'm not sure how to delete the tutors with the given emails, considering that tutors.email is a foreign key in other tables.
If you don't delete the records in related tables that contain the foreign key, you will end up with so-called orphaned records, which violates the referential integrity rule of an RDBMS.
It is also possible that the RDBMS enforces referential integrity, in which case you won't be able to delete the tutors until you have deleted all the associated records in the other tables first.
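One possible way to do the deletes in a safe order is sketched below, using SQLite via Python's sqlite3 as a stand-in for MySQL. Several things here are assumptions rather than facts from the question: that reviews.courseid references courses.url, that url is the key of courses, the temporary table name doomed, and the sample rows. Note that this also removes the affected tutors' courses and any reviews tied to them, which may or may not be what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
    CREATE TABLE tutors (email TEXT PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE courses (
        url         TEXT PRIMARY KEY,                 -- assumed key
        tutorid     TEXT REFERENCES tutors(email),
        description TEXT
    );
    CREATE TABLE reviews (
        review   TEXT,
        courseid TEXT REFERENCES courses(url),        -- assumed to reference courses.url
        author   TEXT REFERENCES tutors(email)
    );
    INSERT INTO tutors VALUES
        ('ann@example.com', 'Ann', 'A'), ('bob@example.com', 'Bob', 'B');
    INSERT INTO courses VALUES
        ('c1', 'ann@example.com', NULL),
        ('c2', 'ann@example.com', NULL),
        ('c3', 'bob@example.com', NULL),
        ('c4', 'bob@example.com', 'Intro to SQL');
    INSERT INTO reviews VALUES
        ('ok', 'c4', 'ann@example.com'), ('good', 'c1', 'bob@example.com');

    -- Collect the tutors to remove once, before any rows disappear.
    CREATE TEMP TABLE doomed AS
        SELECT tutorid AS email FROM courses
        WHERE description IS NULL
        GROUP BY tutorid
        HAVING COUNT(*) >= 2;

    -- Children first: reviews written by those tutors or attached to their
    -- courses, then their courses, and only then the tutors themselves.
    DELETE FROM reviews
        WHERE author IN (SELECT email FROM doomed)
           OR courseid IN (SELECT url FROM courses
                           WHERE tutorid IN (SELECT email FROM doomed));
    DELETE FROM courses WHERE tutorid IN (SELECT email FROM doomed);
    DELETE FROM tutors  WHERE email   IN (SELECT email FROM doomed);
""")

remaining_tutors = [r[0] for r in conn.execute("SELECT email FROM tutors")]
remaining_reviews = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
print(remaining_tutors, remaining_reviews)
```

Declaring the foreign keys with ON DELETE CASCADE would let the database do the child deletes itself, at the cost of making the removal less visible.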

Foreign keys in mysql

I have a question about how to relate some tables. I have these tables:
User table: username (primary key)
Team table: team_name(primary key), username (foreign key references User(username))
With this relationship, a user can have more than one team.
Group table: group_name (primary key)
I want a group to be able to have many teams, but these teams have to belong to different users, so two teams of the same user cannot be in the same group.
I have thought of relating the three tables this way:
Group_teams table: (group_name, username, team_name). This table would have a composite primary key (group_name and username); this way I would be sure that a user cannot have more than one team in the same group.
In addition, I think that I should create a composite foreign key that references User(username) and Team(team_name), to check that the user's team exists. Finally, I should create another foreign key that references Group(group_name), to check that the group exists.
I'm not sure this is the right way, because I get errors when I try to do it. Could you help me and tell me your opinions?
If a user can be on (at most) only one team, then you have a 0/1 - many relationship.
The easiest approach is to have TeamId in the Users table. This would be a foreign key reference to Teams.
There is no need for a third table to represent this relationship.
There's no need to reproduce the username field in the Group_teams table, just have group_name & team_name.
Create a trigger on the Group_teams table that fires before a new row is inserted and checks whether the group already contains another team with the same username.
Have a look at the question "How do you check constraints from another table when entering a row into a table?", specifically this answer from Jim V describing such a setup.
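A rough sketch of such a trigger follows. It is written for SQLite (through Python's sqlite3), whose trigger syntax differs from MySQL's: SQLite rejects the row with RAISE(ABORT, ...), while a MySQL trigger would typically use SIGNAL instead. The lowercase table names, the trigger name, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
    CREATE TABLE users  (username   TEXT PRIMARY KEY);
    CREATE TABLE teams  (team_name  TEXT PRIMARY KEY,
                         username   TEXT NOT NULL REFERENCES users(username));
    CREATE TABLE groups (group_name TEXT PRIMARY KEY);
    CREATE TABLE group_teams (
        group_name TEXT NOT NULL REFERENCES groups(group_name),
        team_name  TEXT NOT NULL REFERENCES teams(team_name),
        PRIMARY KEY (group_name, team_name)
    );

    -- Reject a new row if the group already holds a team owned by the same user.
    CREATE TRIGGER one_team_per_user_per_group
    BEFORE INSERT ON group_teams
    BEGIN
        SELECT RAISE(ABORT, 'user already has a team in this group')
        WHERE EXISTS (
            SELECT 1
            FROM group_teams gt
            JOIN teams existing ON existing.team_name = gt.team_name
            JOIN teams incoming ON incoming.team_name = NEW.team_name
            WHERE gt.group_name = NEW.group_name
              AND existing.username = incoming.username
        );
    END;

    INSERT INTO users  VALUES ('alice'), ('bob');
    INSERT INTO teams  VALUES ('alpha', 'alice'), ('beta', 'alice'), ('gamma', 'bob');
    INSERT INTO groups VALUES ('g1');
""")

conn.execute("INSERT INTO group_teams VALUES ('g1', 'alpha')")  # alice's first team: accepted
conn.execute("INSERT INTO group_teams VALUES ('g1', 'gamma')")  # bob's team: accepted
try:
    conn.execute("INSERT INTO group_teams VALUES ('g1', 'beta')")  # alice's second team
    second_team_rejected = False
except sqlite3.DatabaseError:
    second_team_rejected = True

rows_in_g1 = conn.execute(
    "SELECT COUNT(*) FROM group_teams WHERE group_name = 'g1'").fetchone()[0]
print(second_team_rejected, rows_in_g1)
```

This sketch only guards inserts; a change to a team's owner or an update of group_teams would need its own check.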

How to make proper use of foreign keys

I'm developing a helpdesk-like system, and I want to employ foreign keys, to make sure the DB structure is decent, but I don't know if I should use them at all, and how to employ them properly.
Are there any good tutorials on how (and when) to use Foreign keys ?
Edit: The part I'm most confused about is the ON DELETE .. ON UPDATE .. part. Let's say I have the following tables:
table 'users'
id int PK auto_increment
department_id int FK (departments.department_id) NULL
name varchar
table 'departments'
id int PK auto_increment
name
users.department_id is a foreign key referencing departments.department_id. How do ON UPDATE and ON DELETE work here when I want to delete the department or the user?
ON DELETE and ON UPDATE control how changes you make in the key (referenced) table propagate to the dependent table. With CASCADE, ON UPDATE means that the key values get changed in the dependent table to maintain the relation, and ON DELETE means that dependent records get deleted to maintain integrity.
Example: Say you have
Users: Name = Bob, Department = 1
Users: Name = Jim, Department = 1
Users: Name = Roy, Department = 2
and
Departments: id = 1, Name = Sales
Departments: id = 2, Name = Bales
Now if you change the departments table to modify the first record to read id = 5, Name = Sales, then with "UPDATE" you would also change the first two records to read Department = 5 -- and without "UPDATE" you wouldn't be allowed to make the change!
Similarly, if you deleted Department 2, then with "DELETE" you would also delete the record for Roy! And without "DELETE" you wouldn't be allowed to remove the department without first removing Roy.
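The Bob/Jim/Roy example can be replayed in a few lines. This is a sketch that assumes SQLite via Python's sqlite3 instead of MySQL, uses departments.id as the referenced column (as in the question's table listing), and declares both rules as CASCADE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        department_id INTEGER REFERENCES departments(id)
                          ON UPDATE CASCADE ON DELETE CASCADE,
        name          TEXT
    );
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Bales');
    INSERT INTO users (department_id, name) VALUES (1, 'Bob'), (1, 'Jim'), (2, 'Roy');
""")

# Renumber Sales from 1 to 5: Bob and Jim follow automatically.
conn.execute("UPDATE departments SET id = 5 WHERE id = 1")

# Delete department 2: Roy's row is deleted with it.
conn.execute("DELETE FROM departments WHERE id = 2")

users = conn.execute(
    "SELECT name, department_id FROM users ORDER BY name").fetchall()
print(users)
```

Since users.department_id is nullable in the question, ON DELETE SET NULL would be another option: Roy would stay, with no department.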
You will need foreign keys if you are splitting your database into tables and you are working with a DBMS (e.g. MySQL, Oracle and others). I assume from your tags you are using MySQL.
If you don't use foreign keys, your database will become hard to manage and maintain. Normalisation, which relies on foreign keys, helps ensure data consistency.
See here for foreign keys. See here for why foreign keys are important in a relational database.
That said, denormalization is often used when efficiency is the main factor in the design. If that is the case, you may want to move away from what I have told you.
Hope this helps.