Should I be using onDelete=cascade with my foreign keys? - mysql

Related question: Foreign key constraints: When to use ON UPDATE and ON DELETE.
We'll take an example, a company table with a user table containing people from theses company
CREATE TABLE COMPANY (
company_id INT NOT NULL,
company_name VARCHAR(50),
PRIMARY KEY (company_id)
) ENGINE=INNODB;
CREATE TABLE USER (
user_id INT,
user_name VARCHAR(50),
company_id INT,
INDEX company_id_idx (company_id),
FOREIGN KEY (company_id) REFERENCES COMPANY (company_id) ON...
) ENGINE=INNODB;
ON DELETE CASCADE : dangerous : if you delete a company row in table COMPANY the engine will delete as well the related USERs. This is dangerous but can be used to make automatic cleanups on secondary tables (so it can be something you want, but quite certainly not for a COMPANY<->USER example)
Now, let's suppose that I have multiple companies, each with multiple customers. I make a habit of having a primary auto index key on each table, and using that as a foreign key on child tables.
So, since my company_id is auto generated and guaranteed to be unique, is there any danger in me setting the foreign key company_id in the users table to onDelete=cascade?
Obviously, my GUI has lots of "are you sure that you are certain that you really want to delete this? Action cannot be undone!"
But, if I don't onDelete=cascade, then before I can DELETE FROM companies WHERE company_id=X, I first have to DELETE FROM users WHERE company_id=X, which is what I have been doing until now.
I am considering onDelete=cascade for the first time & just want to be sure that I have grokked it. Deleting dependent rows can get tedious when the dependency tree is multiple levels deep.
Also, since the keys are auto index, they won;t change, so I can't see that I would need onUpdate.
[Update] One answer was concerned about deleting business data. That's just an example from a related question.
Imagine architecture: a single user can have plans of multiple sites, each with multiple buildings, each with multiple floors, each with multiple rooms.
It is a cascading, tree-like, hierarchy. Does it make sense to have onDelete=Cascade there? I think so, but want to hear from those more more knowledgeable

So much of it will depend on your specific use case. Since you are trying to delete the users anyway, and you want it to happen automatically as part of the cleanup it seems like a good candidate for using ON DELETE to me.
I probably wouldn't be deleting these records though. I would be deactivating them, setting the company to inactive. Then ON UPDATE would be a good candidate, cascading the inactive state down to all users for the company.
I would hesitate to do the delete for two reasons:
First, if the company returns, this allows you to restore the pieces you want for faster setup. And less likely to trigger a restore-from-backup if a company is incorrectly deleted.
Second, I assume that the these entities propagate out into other tables too. I wouldn't want to delete a client/suppliers order history just because we no longer have an active relationship. Even if you don't delete records from those other tables, you'll wind up orphaning the companyId/userId likely in those records.

Related

Are these too many foreign keys?

I'm using MySQL and have been planning out the database structure for a system I'm building out. As I've been going along, I started to wonder if it was acceptable to have a particular foreign key constraint in many different tables. From what I understand, it would be fine, as it makes sense. But I'd like to double check.
For example, I have a users table, and I use the user_id as a foreign key for many tables, sometimes multiple times in one table. For example, I have a one-to-one relationship with a user_settings table, which of course stores the user_id. And then I have a companies table, which alone has a few references to the user_id key. In this case, I have a column that keeps track of the user that created the company in the system (created_by), a column for the main contact (main_contact, who is also a user of the system), and there might be another reference. So that alone, already has the user_id key being used as a foreign key constraint 3-4 times.
Just to add another bit of info, I have a tasks table and that of course needs to reference the user_id to keep track of who it's assigned to, and I also have another column that keeps track of the user that created the task. That would be assigned_to and created_by, respectively.
There are more tables though that reference back to that key. I might be up to 8 references already. I do believe I've designed it properly so far, but based on what I've mentioned, does this sound fine?
Your foreign key usage seems fine to me - after all, you are simply representing logical relationships between your tables.
A user within your system interacts with the data in many ways, and to define these relationships your approach is the correct one.
The key point I think is that under a lot circumstances, you won't always want (or need) to make all the joins that represent your relationships - simply the ones that you need in that context.
As per my undestanding the way you are defining is fine i.e to use a user id to many tables as foreign key.
If your line:: I have a companies table, which alone has a few references to the user_id key doesn't mean that you are using multipe user_id in same table and I know you are not.

A many-to-many primary key for "appointments"

I know this question has been asked a lot but my example seems different.
I have two entities: Doctor and Client, and a many-to-many relationship between them to create the entity Appointment, which has, say "appointment_date_time" for an attribute.
I'm using the foreign keys from Doctor and Client to create a composite primary key in Appointment, but since there can be many appointments between the same doctor and person, should the "date_time" also be included as part of the primary key so there's no duplicates? Or would the two foreign keys be enough to query off of?
Thanks!
Your PRIMARY KEY needs to always be unique, so including the datetime would make an usable composite PRIMARY KEY that would (probably, unless you could have multiple appointments at the same time for 2 different purposes, which might happen) be unique.
However this is unlikely the best approach practically speaking, as if they move the appointment then this time will change (or maybe changed the doctor). Then you have no way to reference the appointment statically, say for example if you associated some extra data to it during creation or had to reference it as an audit entry. It also means any references to it that you do create would need to store all 3 columns.
As such I would simply look to create an auto incrementing primary key in this case, and simply index on both doctor and client for fast searches.

MySQL fixed ids for categories, how to ensure they are static

I have a system that has categories that are joined to events. All of these categories are simple, they have an id, and name. One thing that worries me is that when I am creating these categories, the ids should always remain the same, be static. If I deleted one, let's say "Politics" at id=1, all of those events would have an orphaned category. One solution I thought of is to just assign string ids to them, so if they do happen to get deleted, it wouldn't really matter. What kind of solution do you recommend?
From my perspective it seems like you could keep the ids and just put a constraint that doesn't allow you to delete the record, only edit them. Another, is to use string ids, but that seems like a pain, although it seems to solve the problem of worrying about the ids being messed with.
Yes, this is what foreign key constraints are for. They also allow for rules on how to handle deletes -- for example, allowing a delete to cascade through dependent records, which is of course very dangerous and not what you would want in this situation. A simple basic constraint will do.
HOWEVER!!!! This is the important thing to understand about mysql. The default mysql engine (myisam) has absolutely no support for foreign key constraints. You need to use an engine that supports them -- most commonly innodb.
If you specify a constraint when you're generating your DDL, a myisam table will accept the constraint but simply ignore it, so make sure all your related tables are setup/altered to be innodb tables before you add your constraint(s).
How do you add a constraint?
ALTER TABLE `event` ADD CONSTRAINT `category_event`
FOREIGN KEY (`category_id`) REFERENCES `category` (`category_id`);
In this example, it assumes your event table has the foreign key category_id in it, to create your linkage. After adding this constraint, if you attempt to delete a row from the category table, and an existing event row contains that key, mysql will disallow the DELETE and return an error. This is discussed in great detail here: http://dev.mysql.com/doc/refman/5.6/en/innodb-foreign-key-constraints.html

In MySQL, why do I have to define ForeignKey relationships?

Why can't I just leave those relationships out?
What's the point of them?
I can stil run queries and treat them like it a relationship myself...
Yes, you can always leave the foreign key constraints out but then you will be responsible about the integrity of your data. If you use foreign key constraints, then you won't have to worry about the referential integrity among tables. You can read more about referential integrity from Wikipedia. I will also try to explain it with an example below.
Think of a shopping cart scenario. You have three tables: item, shopping_cart and shopping_cart_item. You can choose not to define any relationship between these tables, that's fine for any SQL solution. When user starts shopping, you create a shopping cart by adding a shopping_cart entry. As user adds items to his shopping cart, you save this information by adding rows to shopping_cart_item table.
One problem may occur at this step: If you have a buggy code that assigns incorrect shopping_cart_id's to shopping_cart_items, then you will definitely end up with incorrect data! Yes, you can have this case even with a foreign key constraint if the assigned id actually exists in the shopping_cart table. But this error will be more detectable when a foreign key exists since it would not insert shopping_cart_item record when the foreign key constraint fails.
Let's continue with the assumption that your code is not buggy and you won't have first type of referential integrity. Then suddenly a user wants to stop shopping and delete the cart and you chose to implement this case by deleting the shopping_cart and shopping_cart_item entries. Then you will have to delete entries in both tables with two separate queries. If something goes wrong after you delete shopping_cart entries, then you will again have a referential integrity problem: You will have shopping_cart_items that are not related to any shopping_cart. You will then have to introduce transaction managing, try to provide meaningful data to your business logic about the error happened in data access layer, etc..
In this type of scenario's, foreign keys can save life. You can define a foreign key constraint that will prevent insertion of any sort of incorrect data and you can define cascade operations that will automatically perform deletion of related data.
If there is anything unclear, just leave a comment and I can improve the answer.
Apart from what the others have said about why you technically want (actually: need) them:
foreign key constraints also document your model.
When looking at a model without FK constraints you have no idea which table relates to which. But with FK constraints in place you immediately see how things belong together.
You create FOREIGN KEYs to instruct the database engine to ensure that you never perform an action on the database that creates invalid records.
So, if you create a FOREIGN KEY relationship between users.id and visits.userid the engine will refuse to perform any actions that result in a userid value in visits that does not exist in users. This might be adding an unknown userid to visits, removing an id from users that already exists in visits, or updating either field to "break" the relationship.
That is why PRIMARY and FOREIGN KEYs are referred to as referential integrity constraints. The tell your database engine how to keep your data correct.
It doesn't allow you to enter an id which does not exist in another table, for example, if you have products and you keep owner Id, by creating a foreign key ton the owner id to id field of the owners table, you do not allow users to create an object record which has an owner id which does not exist in the owner table. such things are called referential intergrity.
The foreign key constraint helps you ensure referential integrity.
If you delete a row in one table, mysql can automatically delete all rows in other tables that the deleted row refers to via the foreign key. You can also make it reject the delete command.
Also when you try to insert a row, mysql can automatically create new rows in other tables, so the foreign key does not refer to nothing.
That is what referential integrity is all about.
Databases can be affected by more than just the application. Not all data changes go through the application even if they are supposed to. People change stuff directly on the database all the time. Rules that need to apply to all data all the time belong on the database. Suppose you can update the prices of your stock. That's great for updating anindividual price. But what happens when the boss decides to raise all prices by 15%. No one is going to go through and change 10,000 prices one at a time through the GUI, they are going to write a quick SQL script to do the update. Or suppose two suppliers join together to have one company and you want to change all of thie items to be the new company. Those kinds of changes happen to databases every day and they too need to follow the rules for data integrity.
New developers may not know about all the places where the foreign key relationships should exist and thus make mistakes which cause the data to be no longer useful.
Databases without foreign key constraints have close to a 100% chance of having bad data in them. Do you really want to have orders where you can't identify who the customers were?
THe FKS will prevent you from deleting a customer who has orders for instance or if you use a natural key of company_name and the name changes, all related records must be changed with the key change.
Or suppose you decide to put a new GUI together and dump the old one, then you might have to figure out all the FK relationships again (because you are using a different datalayer or ORM) and the chances are you might miss some.
It is irresponsible in the extreme to not put in FK relationships. You are risking the lifeblood of your company's business because you think it is a pain to do. I'd fire you if you suggested not using FKs because I would know I couldn't trust my company's data to you.

simple DB design question

This is probably a very stupid question, but I am just not sure which solution is the most elegant and the best(most performant) way to go in the following scenario.
I have the following tables:
Customer, Company, Meter, Reading
all of the tables above the line are supposed to be linked to one or more records of a "Comment" table. Which is the best way to model this relationship?
I am seeing two solutions here:
1.) use m:n relationships: CustomerComment, CompanyComment, etc. -> easy to extend later on, but a lot of new tables
2.) use 1:n relationships: Comment table has a field for the PK of the tables above (Customer_id, Company_id, ...) -> minimal table approach, but "harder" to extend since I would have to add a new field to the comment table whenever there is a new table that needs to be have comments
The target is a modular application, which may or may not have any of those four tables.
Which one is the better one - or are there more?
Thanks!
This is the problem with using integers for primary keys. You have a few solutions you can use.
The true unique ID for any given row for Customer, Company, Meter, Reading is a UUID. Maybe because of the database design the primary key has to be an integer but that is ok. This means you never have to add fields to the COMMENTS table if you have a new type in your system. It will always reference by the types ID.
Your tables can look like this:
CUSTOMER
ID UUID
COMPANY
ID UUID
METER
ID UUID
COMMENTS
ID
RELATED_TO UUID
COMMENT TEXT
Now you can attach comments to any table that has a unique ID.
If you want to support referential constraints
OBJECT is a table that holds all of the ID's of all the pieces of data you have in your system. We really start building a system in which you can associate any comment with anything you want. This may not be suitable in your design.
OBJECT
ID UUID
CUSTOMER
ID UUID
FOREIGN_KEY (ID) REFERENCES OBJECT(ID) ON DELETE CASCADE
COMPANY
ID UUID
FOREIGN_KEY (ID) REFERENCES OBJECT(ID) ON DELETE CASCADE
METER
ID UUID
FOREIGN_KEY (ID) REFERENCES OBJECT(ID) ON DELETE CASCADE
COMMENTS
ID
RELATED_TO UUID
COMMENT TEXT
FOREIGN_KEY (RELATED_TO) REFERENCES OBJECT(ID) ON DELETE CASCADE
This complicates the design in order to assure you don't need to add 2 tables for each new type in the system. Each design has sacrifices. In this one you've complicated things by saying for every each entry whether it be Company, Customer, Meter I need an associated ID int he Object table so I can put a foreign key on it.
I prefer one for each pair - CustomerComment, CompanyComment, etc. It eventually will speed up your queries, and while it isn't as 'extensible' as a single CommentLink table, you'll need to make schema changes when you add something else that needs comments anyway.
I would use seperate tables, that way you can keep the referential constraints simple.