Let's say my database is for an ecommerce store. It holds records of users and orders in two tables, 'users' and 'orders'.
The 'orders' table has a userId column (a foreign key) that references the id column of the 'users' table.
When I try to delete a user, the database throws an error because the 'orders' table has a record referencing that user's id. How should I handle this?
I found 3 ways to overcome this.
Not use foreign keys.
Use 'ON DELETE CASCADE', so that when I delete a user from the 'users' table, the related order records of that user are deleted automatically. (Is that a good idea?)
Delete all child records first, and then delete the parent record.
What is the best way?
For the simple question of how you delete a record and its references...
Foreign keys are critical for the integrity of your database. Without foreign keys you easily wind up with records that refer to objects which no longer exist. Don't remove them.
Manually deleting the referencing rows is error prone and will break when you change the schema.
on delete cascade allows you to have referential integrity, and also be able to delete records.
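As a minimal sketch of the on delete cascade option, here is the users/orders pair from the question set up in SQLite through Python's sqlite3 module. The choice of SQLite and the name and total columns are assumptions made for illustration; the question does not say which DBMS is in use, and the exact DDL differs between systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL            -- hypothetical column
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        total   REAL NOT NULL         -- hypothetical column
    );
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders (user_id, total) VALUES (1, 9.99), (1, 20.00), (2, 5.00);
""")

# Deleting the parent row also removes every order that references it.
with conn:
    conn.execute("DELETE FROM users WHERE id = 1")

remaining_orders = conn.execute("SELECT user_id FROM orders").fetchall()
print(remaining_orders)  # [(2,)]
```

The delete succeeds without touching orders by hand, and no order is left pointing at a user that no longer exists.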
The comments got into the larger question of whether deleting user and order records is a good idea. There is not enough information in your question to know what is best. That would be another question.
However, even if you decide to use a status field to set users and orders as inactive (a timestamp, not a flag, because you'll want to know when the user was deactivated), you still want to set up the tables with on delete cascade so when you eventually do delete inactive records (for example, perhaps an annual cleanup, or for testing, or due to a mistake) the delete will work.
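A rough sketch of that combination, assuming SQLite via Python's sqlite3 module: users carry a nullable timestamp (called deactivated_at here, a made-up name), and a periodic cleanup deletes users deactivated before some cutoff, relying on on delete cascade to remove their orders. The purge_inactive helper and the sample dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE users (
        id             INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        deactivated_at TEXT               -- NULL means the user is active
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE
    );
    INSERT INTO users (id, name, deactivated_at) VALUES
        (1, 'alice', NULL),
        (2, 'bob',   '2020-03-01T00:00:00'),
        (3, 'carol', '2024-06-01T00:00:00');
    INSERT INTO orders (user_id) VALUES (1), (2), (3);
""")

def purge_inactive(conn, cutoff):
    """Delete users deactivated before `cutoff` (ISO-8601 text); return how many."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM users "
            "WHERE deactivated_at IS NOT NULL AND deactivated_at < ?",
            (cutoff,),
        )
    return cur.rowcount  # counts deleted users only, not cascaded orders

purged = purge_inactive(conn, "2023-01-01T00:00:00")
```

Because the timestamp records when the user was deactivated, the cleanup can keep recently deactivated accounts (carol here) and remove only the old ones.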
For instance, say there are 20 tables that each have a foreign key to one table; let's call it Child. When I delete a record from Child, will the database check whether that record is referenced from any of those tables, or does something else happen?
My question is how this foreign key relationship impacts the performance of delete operations.
I'm using Hibernate, and I have an entity with only 3 columns that is used in many other entities through one-to-one mappings.
I'm thinking of making this entity embeddable for performance tuning, because if I keep it as an entity, the mapping between tables is done with a foreign key. When I delete an entity, only two queries run: delete parent, then delete child. But because the child's key is referenced from many other tables with a lot of records, the database will check each of those tables for a reference to the child record before deleting it. I want to target this by making the child embeddable, so that its columns are included in the parent tables. Will this help?
Performance depends very much on which DBMS you're using, how your tables are designed, indexed and stored, and how much data you have.
In general, foreign key constraints save time and effort and prevent mistakes. Without a foreign key constraint, you would have to enforce integrity yourself.
For example, manually cascading a delete or update would be done in multiple round-trips to the database which would normally be wrapped in a transaction. Manually checking for related records to restrict changes would also require additional queries and data transferred between server and client.
If you missed anything or another user modified related data between your queries, you might end up with invalid data, which can be very costly - both in terms of DBA time as well as customer satisfaction.
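To make the manual approach concrete, here is a sketch of a hand-written cascade wrapped in a transaction, assuming SQLite via Python's sqlite3 module. The users, orders and reviews tables and the delete_user_manually helper are invented for the example. The reviews table stands in for a table added after the helper was written, which is the "if you missed anything" case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE users   (id INTEGER PRIMARY KEY);
    CREATE TABLE orders  (id INTEGER PRIMARY KEY,
                          user_id INTEGER NOT NULL REFERENCES users(id));
    -- Hypothetical table added later; the helper below does not know about it.
    CREATE TABLE reviews (id INTEGER PRIMARY KEY,
                          user_id INTEGER NOT NULL REFERENCES users(id));
    INSERT INTO users (id) VALUES (1), (2);
    INSERT INTO orders (user_id) VALUES (1), (1), (2);
    INSERT INTO reviews (user_id) VALUES (1);
""")

def delete_user_manually(conn, user_id):
    """Delete children, then the parent, in one transaction. Returns success."""
    try:
        with conn:  # commit if both statements succeed, roll back otherwise
            conn.execute("DELETE FROM orders WHERE user_id = ?", (user_id,))
            conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
        return True
    except sqlite3.IntegrityError:
        # A table we forgot about still references the user.
        return False

ok_user_2 = delete_user_manually(conn, 2)  # True: only orders reference user 2
ok_user_1 = delete_user_manually(conn, 1)  # False: reviews still references user 1
```

The foreign key on reviews is what catches the omission, and the transaction is what keeps user 1's orders from being deleted on their own when the second statement fails.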
I'm trying to de-duplicate user accounts in our system and I know there are lots of questions out there about removing/identifying duplicates (such as Remove duplicate rows in MySQL), but I haven't seen any that required maintaining referential records.
I have a users table and a subscriptions table with a foreign key field User_ID common to both and set to CASCADE in subscriptions.
I'd like to remove all duplicates in the users table but in doing so, all of the records corresponding to User_ID in the subscriptions table would be lost due to the CASCADE behavior.
Is it possible to UPDATE the users table, altering the User_ID of the duplicate records to the one I want to keep, without colliding with the unique index, allowing all referential records to be updated accordingly and finally removing the duplicate User record without cascading the delete?
The added complication is that the User_ID field in the users table is obviously indexed with unique.
EDIT: I should add that this is a simplified example, our DB has 100+ tables many of which have foreign keys based on the User_ID.
So in the end, as @MarcB helped me discover above, the correct answer is to have planned better in the beginning ;)
We're going to have to write a programmatic solution to join accounts manually. We're lucky enough to have DAOs/DTOs for every object type, so dealing with the referential records shouldn't be too bad. It will just be an intense operation, so it will require some good planning.
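One way such a programmatic merge might look, sketched with SQLite via Python's sqlite3 module rather than the DAO layer mentioned above. Instead of changing User_ID in users (which collides with the unique index), it repoints the referencing rows at the account being kept and only then deletes the duplicate, so the cascade has nothing left to remove. The REFERENCING list, the merge_users helper and the email and plan columns are all hypothetical; with 100+ tables the list would have to name every referencing table and column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE users (User_ID INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE subscriptions (
        id      INTEGER PRIMARY KEY,
        User_ID INTEGER NOT NULL REFERENCES users(User_ID) ON DELETE CASCADE,
        plan    TEXT NOT NULL
    );
    INSERT INTO users (User_ID, email) VALUES (1, 'a@example.com'), (2, 'a@example.com');
    INSERT INTO subscriptions (User_ID, plan) VALUES (1, 'basic'), (2, 'pro');
""")

# Every (table, column) pair that references users.User_ID. Hard-coded and
# trusted, because identifiers cannot be passed as SQL parameters.
REFERENCING = [("subscriptions", "User_ID")]

def merge_users(conn, keep_id, dup_id):
    """Move all rows referencing dup_id over to keep_id, then delete dup_id."""
    with conn:  # all-or-nothing
        for table, column in REFERENCING:
            conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                (keep_id, dup_id),
            )
        conn.execute("DELETE FROM users WHERE User_ID = ?", (dup_id,))

merge_users(conn, keep_id=1, dup_id=2)
```

If a referencing table has its own unique constraint involving User_ID, the UPDATE can itself collide, and those rows would need a per-table decision before the merge.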
I have a Customers table, a Sports table, and Customers_Sports join table. The last table tells me which customers play what sports, ie contains foreign keys to the other two tables.
Foreign key constraints are enforced, and foreign keys cannot be null.
Using LINQ, is there a simple way to delete a customer and at the same time delete all the records in the join table that reference the customer?
I can do it the hard way, ie first delete relevant records from the join table, then delete the customer's record from Customers.
Simple is the real kicker. There's not an automatic way to do it. You're basically stuck deleting everything from the Customers_Sports table that matches the given customer, then deleting the customer yourself. I believe if you delete both before you do a SubmitChanges() you shouldn't run into any foreign key constraint violations.
If you wanted to get really fancy, you could use reflection to create a generic function that does this any time there is a foreign key, along the lines of "Linq to SQL cascading delete with reflection".
Setting up the database to do this for you is probably a lot less error prone. The performance is probably better too, although unless you know you have a performance problem you shouldn't worry about performance.
Why can't I just leave those relationships out?
What's the point of them?
I can still run queries and treat them like a relationship myself...
Yes, you can always leave the foreign key constraints out, but then you will be responsible for the integrity of your data. If you use foreign key constraints, you won't have to worry about referential integrity among tables. You can read more about referential integrity on Wikipedia. I will also try to explain it with an example below.
Think of a shopping cart scenario. You have three tables: item, shopping_cart and shopping_cart_item. You can choose not to define any relationship between these tables; that's fine for any SQL solution. When a user starts shopping, you create a shopping cart by adding a shopping_cart entry. As the user adds items to the cart, you save this information by adding rows to the shopping_cart_item table.
One problem may occur at this step: if you have buggy code that assigns incorrect shopping_cart_ids to shopping_cart_items, you will definitely end up with incorrect data! Yes, this can happen even with a foreign key constraint, if the assigned id actually exists in the shopping_cart table. But the error is more detectable when a foreign key exists, because the shopping_cart_item record is not inserted when the foreign key constraint fails.
Let's continue with the assumption that your code is not buggy and you won't have the first type of referential integrity problem. Then a user suddenly wants to stop shopping and delete the cart, and you choose to implement this by deleting the shopping_cart and shopping_cart_item entries. You will have to delete entries in both tables with two separate queries. If something goes wrong after you delete the shopping_cart entries, you will again have a referential integrity problem: you will have shopping_cart_items that are not related to any shopping_cart. You will then have to introduce transaction management, try to give your business logic meaningful information about the error that happened in the data access layer, and so on.
In this type of scenario, foreign keys can be a lifesaver. You can define a foreign key constraint that prevents insertion of rows that reference nonexistent data, and you can define cascade operations that automatically delete related data.
If there is anything unclear, just leave a comment and I can improve the answer.
Apart from what the others have said about why you technically want (actually: need) them:
foreign key constraints also document your model.
When looking at a model without FK constraints you have no idea which table relates to which. But with FK constraints in place you immediately see how things belong together.
You create FOREIGN KEYs to instruct the database engine to ensure that you never perform an action on the database that creates invalid records.
So, if you create a FOREIGN KEY relationship between users.id and visits.userid the engine will refuse to perform any actions that result in a userid value in visits that does not exist in users. This might be adding an unknown userid to visits, removing an id from users that already exists in visits, or updating either field to "break" the relationship.
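Those refusals can be sketched with SQLite via Python's sqlite3 module (an assumption; the answer does not name an engine). The is_rejected helper is made up for the example, and the foreign key has no ON DELETE or ON UPDATE clause, so the engine falls back to refusing changes that would break the relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE visits (
        id     INTEGER PRIMARY KEY,
        userid INTEGER NOT NULL REFERENCES users(id)  -- no cascade clause
    );
    INSERT INTO users (id) VALUES (1);
    INSERT INTO visits (userid) VALUES (1);
""")

def is_rejected(conn, sql):
    """Run one statement; report whether the engine refused it."""
    try:
        with conn:  # rolls back if the statement raises
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "add unknown userid to visits": is_rejected(
        conn, "INSERT INTO visits (userid) VALUES (99)"),
    "remove user that has visits": is_rejected(
        conn, "DELETE FROM users WHERE id = 1"),
    "point visit at unknown user": is_rejected(
        conn, "UPDATE visits SET userid = 99 WHERE userid = 1"),
    "change id of user that has visits": is_rejected(
        conn, "UPDATE users SET id = 2 WHERE id = 1"),
}
```

Each entry in results comes back True, and the data is left exactly as it was before the four attempts.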
That is why PRIMARY and FOREIGN KEYs are referred to as referential integrity constraints. They tell your database engine how to keep your data correct.
It doesn't allow you to enter an id which does not exist in another table. For example, if you have products and you keep an owner id on each, then by creating a foreign key from the owner id to the id field of the owners table, you prevent users from creating a product record whose owner id does not exist in the owners table. This is called referential integrity.
The foreign key constraint helps you ensure referential integrity.
If you delete a row in one table, MySQL can automatically delete all rows in other tables that refer to the deleted row via the foreign key. You can also make it reject the delete command.
Also, when you try to insert a row, MySQL checks that the row its foreign key refers to already exists and rejects the insert if it does not, so the foreign key never refers to nothing. It does not create the missing row for you.
That is what referential integrity is all about.
Databases can be affected by more than just the application. Not all data changes go through the application even if they are supposed to. People change stuff directly on the database all the time. Rules that need to apply to all data all the time belong on the database. Suppose you can update the prices of your stock. That's great for updating an individual price. But what happens when the boss decides to raise all prices by 15%? No one is going to go through and change 10,000 prices one at a time through the GUI; they are going to write a quick SQL script to do the update. Or suppose two suppliers join together to form one company and you want to change all of their items to belong to the new company. Those kinds of changes happen to databases every day, and they too need to follow the rules for data integrity.
New developers may not know about all the places where the foreign key relationships should exist and thus make mistakes which cause the data to be no longer useful.
Databases without foreign key constraints have close to a 100% chance of having bad data in them. Do you really want to have orders where you can't identify who the customers were?
The FKs will, for instance, prevent you from deleting a customer who has orders. Or, if you use a natural key such as company_name and the name changes, they will make sure all related records are changed along with the key.
Or suppose you decide to put a new GUI together and dump the old one, then you might have to figure out all the FK relationships again (because you are using a different datalayer or ORM) and the chances are you might miss some.
It is irresponsible in the extreme to not put in FK relationships. You are risking the lifeblood of your company's business because you think it is a pain to do. I'd fire you if you suggested not using FKs because I would know I couldn't trust my company's data to you.
I'm new to foreign key constraints. I will formulate a simple example to explain my situation.
I have a table user and a table entry. In user there is a user.firstEntry column, which is a foreign key to entry.EntryID. In entry there is an entry.userID column, which is a foreign key to user.userID. These IDs are all auto-increment values.
Are cycles like that forbidden? Then I will have to change the design?
I am not able to insert some valid entry into both tables, because the first insert already says that there's a problem with the constraints. Auto commit is off.
What shall I do?
Thanks
A bit of a strange design, but you can do this:
When creating a User, set firstEntry to NULL.
Insert an Entry with that user's id.
Update Users and set firstEntry to the id of the inserted entry.
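The three steps above can be sketched as follows, assuming SQLite via Python's sqlite3 module. The name and body columns and the create_user_with_first_entry helper are invented for the example. SQLite accepts a foreign key to a table that is created later in the same script; other engines may need the second constraint added with ALTER TABLE once both tables exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE user (
        userID     INTEGER PRIMARY KEY AUTOINCREMENT,
        name       TEXT NOT NULL,                      -- hypothetical column
        firstEntry INTEGER REFERENCES entry(EntryID)   -- nullable on purpose
    );
    CREATE TABLE entry (
        EntryID INTEGER PRIMARY KEY AUTOINCREMENT,
        userID  INTEGER NOT NULL REFERENCES user(userID),
        body    TEXT NOT NULL                          -- hypothetical column
    );
""")

def create_user_with_first_entry(conn, name, body):
    """Insert a user and its first entry, then link them, in one transaction."""
    with conn:
        # 1. Create the user with firstEntry left NULL.
        user_id = conn.execute(
            "INSERT INTO user (name, firstEntry) VALUES (?, NULL)", (name,)
        ).lastrowid
        # 2. Insert the entry, which can now reference the existing user.
        entry_id = conn.execute(
            "INSERT INTO entry (userID, body) VALUES (?, ?)", (user_id, body)
        ).lastrowid
        # 3. Point the user at the entry that was just created.
        conn.execute(
            "UPDATE user SET firstEntry = ? WHERE userID = ?", (entry_id, user_id)
        )
    return user_id, entry_id

user_id, entry_id = create_user_with_first_entry(conn, "alice", "hello")
```

The cost of this approach is that firstEntry must allow NULL, so the schema alone no longer guarantees that every user has a first entry.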
Both user and entry need the other to have been created beforehand, and since neither can be created without the other, you will have this problem, at least as long as foreign key constraint checking is on.
From what I can understand of your question, each user seems to have multiple entries. So your table design could look like Table_User(user_id(pk), user_name etc) and the entry table could be Table_Entry(entry_id(pk), entry_whatever,...,user_id(fk to user table)), since it seems the user is independent but the entries depend on users.
A foreign key constraint is supposed to prevent you from adding invalid data to the foreign key column.
In most cases it will check to see if the value actually exists in the specified table. Because you have a cycle between your user and entry tables, when you attempt to create an entry it will check whether the value of entry.userID exists in the user table. It will do the same when you attempt to add a new user: it will check the entry table for the value you entered for user.firstEntry. If both user and entry are new, there is no way to link the two because of your cycle. A new entry record needs an existing user and a new user record needs an existing entry. When both tables are empty I don't think you will be able to satisfy the constraint.
I would suggest keeping the foreign key to userID in the entry table (since I'm assuming entries are linked to users) and finding some other way to represent a user's first entry. Maybe a user_entry_history table or something along those lines.
DISCLAIMER - It's been a while since I messed with database design.
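One possible "other way", sketched with SQLite via Python's sqlite3 module: drop user.firstEntry entirely and look the first entry up when it is needed. This assumes that "first" means the entry with the lowest auto-increment EntryID for that user, which the question does not confirm. The first_entry_id helper and the name and body columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE user (
        userID INTEGER PRIMARY KEY AUTOINCREMENT,
        name   TEXT NOT NULL
    );
    CREATE TABLE entry (
        EntryID INTEGER PRIMARY KEY AUTOINCREMENT,
        userID  INTEGER NOT NULL REFERENCES user(userID),
        body    TEXT NOT NULL
    );
    INSERT INTO user (name) VALUES ('alice'), ('bob');
    INSERT INTO entry (userID, body) VALUES
        (2, 'bob 1'), (1, 'alice 1'), (1, 'alice 2');
""")

def first_entry_id(conn, user_id):
    """Return the lowest EntryID belonging to the user, or None if there is none."""
    row = conn.execute(
        "SELECT MIN(EntryID) FROM entry WHERE userID = ?", (user_id,)
    ).fetchone()
    return row[0]

alice_first = first_entry_id(conn, 1)
bob_first = first_entry_id(conn, 2)
```

With only one foreign key left, from entry to user, the cycle is gone and rows can be inserted in the natural order: user first, then entries.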