I know this is an odd question, because I've always been taught to use a foreign key constraint. However, I've come across a case where a foreign key reference value must be kept for historical purposes after the referenced record is deleted.
It is a task management system in which a task occurrence references a parent task containing the recurrence rule. This parent task can be deleted, but the occurrence itself must remain intact with the now non-existent parent id. If the parent task cannot be found, the system simply returns an error, e.g. "parent task no longer exists." The parent id cannot be set to null on delete because it is used elsewhere in the occurrence as an identifying key.
Another example: what about a YouTube video that was removed while still appearing in a playlist? Similar situation, right? The playlist still references the video, but the video no longer exists, so the playlist shows an error instead.
Do I simply not define a foreign key at all and just create the parent_id reference column as a normal column? I just want to be sure how this is normally handled when one table references another, but the former is not constrained by the existence of the latter.
Having a constraint is just a technical helper to enforce the semantics defined for the database, i.e. "this column contains a number that is not only an INTEGER(32) but also an identifier for a record in some other table". As such, a constraint isn't strictly necessary, but it:
makes the intention of the field clear (self documentation)
keeps your data "clean" by preventing incorrect data from being inserted
gives the database engine a hint concerning the content of the table which may allow the db to perform more efficiently.
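To make that enforcement concrete, here is a minimal sketch using Python's built-in sqlite3 module (the question is about MySQL, but the constraint semantics are the same); the task/occurrence tables are hypothetical stand-ins for the questioner's schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("""
    CREATE TABLE occurrence (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES task(id)
    )
""")
conn.execute("INSERT INTO task (id, title) VALUES (1, 'weekly report')")
conn.execute("INSERT INTO occurrence (id, parent_id) VALUES (10, 1)")  # OK

# Inserting a reference to a non-existent parent is rejected by the constraint:
try:
    conn.execute("INSERT INTO occurrence (id, parent_id) VALUES (11, 999)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The second insert never reaches the table, which is exactly the "keeps your data clean" point above.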
That said, the "proper" way to accomplish what you've described would be not to physically delete the parent record in the first place. Instead, mark the parent as deleted. Since you're keeping the record for historical purposes, surely you'll want to be able to know what the parent used to be, even if it's no longer active or valid.
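A minimal sketch of that soft-delete approach, again in sqlite3, using a hypothetical `deleted` flag on the parent; the row (and the foreign key) stays intact while normal queries filter it out:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, title TEXT, deleted INTEGER NOT NULL DEFAULT 0)")
conn.execute("CREATE TABLE occurrence (id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL REFERENCES task(id))")
conn.execute("INSERT INTO task (id, title) VALUES (1, 'standup')")
conn.execute("INSERT INTO occurrence (id, parent_id) VALUES (10, 1)")

# "Delete" the parent without breaking any references:
conn.execute("UPDATE task SET deleted = 1 WHERE id = 1")

# Normal queries see only active parents:
active = conn.execute("SELECT id FROM task WHERE deleted = 0").fetchall()
# The occurrence can still join to its historical parent:
hist = conn.execute(
    "SELECT t.title FROM occurrence o JOIN task t ON t.id = o.parent_id"
).fetchone()
```

`active` comes back empty while `hist` still resolves, so the history survives the "delete".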
Second option would be to create a dummy "parent record deleted" reference. Whenever you delete a parent, you update remaining references to point to the dummy record instead. At least you wouldn't rely on errors to implement expected and valid behaviour.
Finally, I see no reason you shouldn't be able to set the foreign key to NULL. It sounds like you're using the foreign key as part of the primary key of the record in question ("is being used .. as an identifying key"). You almost certainly should not be doing that; if that's the root cause of the problem, start by changing it.
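Once the occurrence has its own surrogate key and parent_id is a plain nullable column, ON DELETE SET NULL handles the cleanup automatically. A sketch in sqlite3 (table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY)")
# occurrence has its own surrogate key; parent_id is nullable and NOT part of the PK
conn.execute("""
    CREATE TABLE occurrence (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES task(id) ON DELETE SET NULL
    )
""")
conn.execute("INSERT INTO task (id) VALUES (1)")
conn.execute("INSERT INTO occurrence (id, parent_id) VALUES (10, 1)")

# Deleting the parent nulls out the reference instead of failing:
conn.execute("DELETE FROM task WHERE id = 1")
row = conn.execute("SELECT parent_id FROM occurrence WHERE id = 10").fetchone()
```

After the delete, `row` is `(None,)`: the occurrence survives and the dangling reference is gone.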
Do I simply not define a foreign key at all and just create the parent_id reference column as a normal column?
Yes. At least that's how I learned it and how we handle this at work.
You might then want to set an index on the reference column.
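A sketch of this approach in sqlite3: parent_id is an ordinary indexed column with no constraint, so it happily keeps pointing at a row that no longer exists (table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No FOREIGN KEY clause: parent_id is just an ordinary integer column.
conn.execute("CREATE TABLE occurrence (id INTEGER PRIMARY KEY, parent_id INTEGER)")
# An index on the reference column keeps lookups by parent fast.
conn.execute("CREATE INDEX idx_occurrence_parent_id ON occurrence(parent_id)")

# Accepted even though no parent row 999 exists anywhere:
conn.execute("INSERT INTO occurrence (id, parent_id) VALUES (10, 999)")
```

The trade-off is that the application, not the database, is now responsible for deciding what a dangling parent_id means.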
For instance, suppose 20 tables have a foreign key to a table, let's call it Child. When I delete a record from Child, does the database check whether the record is referenced from somewhere, or does some other scenario apply?
My question is: how do these foreign key relations impact the performance of delete operations?
I'm using Hibernate, and I have an entity with only 3 columns that is used in many other entities via one-to-one mappings.
I'm thinking of making this entity embeddable for performance tuning, because if I keep it as an entity, the mapping between tables is done via a foreign key. When I delete an entity, only two queries run: delete the parent, then delete the child. But since the child's foreign key is referenced from many other tables with lots of records, the database must check, while deleting the child record, whether any of those tables still reference it. I want to address this by making the child embeddable, so that the child's columns are included in the parent tables. Will this help?
Performance depends very much on which DBMS you're using, how your tables are designed, indexed and stored, and how much data you have.
In general, foreign key constraints save time and effort and prevent mistakes. Without a foreign key constraint, you would have to enforce integrity yourself.
For example, manually cascading a delete or update would be done in multiple round-trips to the database which would normally be wrapped in a transaction. Manually checking for related records to restrict changes would also require additional queries and data transferred between server and client.
If you missed anything or another user modified related data between your queries, you might end up with invalid data, which can be very costly - both in terms of DBA time as well as customer satisfaction.
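For illustration, a manual cascade without a foreign key might look like the following in Python's sqlite3 module (hypothetical parent/child tables). Both statements must run inside one transaction, which is exactly the extra round-trip-and-bookkeeping cost described above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.execute("INSERT INTO parent VALUES (1)")
conn.executemany("INSERT INTO child VALUES (?, 1)", [(10,), (11,)])

# Manual cascade: every child table must be cleaned up explicitly, and the
# statements wrapped in one transaction so a failure can't leave orphans.
with conn:  # commits on success, rolls back on exception
    conn.execute("DELETE FROM child WHERE parent_id = ?", (1,))
    conn.execute("DELETE FROM parent WHERE id = ?", (1,))
```

With a declared `ON DELETE CASCADE` constraint, the single `DELETE FROM parent` statement would do all of this atomically on the server side.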
First of all, my apologies if this question is a duplicate, but I find it difficult to put my problem into short, precise words.
I've got these entities.
The left one contains groups (like in Unix, so data can be made available to a whole group at once); at the moment there is always exactly one. The right one contains projects, and the middle one ensures that one group can gain access to several projects.
As you can see, there are foreign key relationships among them. Now, I want to create a new project in nmd__tree. When doing that, it returns an error:
Cannot add or update a child row: a foreign key constraint fails
(nmd.nmd__tree, CONSTRAINT FK_nmd__tree FOREIGN KEY (treeid)
REFERENCES nmd__helperusergrouphierarchy (treeidfk))
This makes sense, since nmd__tree relies on a valid foreign key in the helper entity. But at the same time, it presents the problem that treeidfk isn't yet known, since it is autogenerated in nmd__tree.
A solution could be to remove the relations, insert the record in nmd__tree, extract the newly written primary key (treeid), and create a record in the middle helper entity with the new id. It would work, but it's really not very elegant. Also, removed relations present other potential problems.
My intention is to create a query that deals with this problem by creating both records at once. I know it isn't possible to make a double insert; I found this suggestion (my version doesn't write any records), as well as an article suggesting stored procedures, which I don't see why it should make a difference.
I would really appreciate a push in the right direction, please.
It seems you've got your constraints defined in the wrong direction: the middle table should have the two foreign key constraints, not the two end tables. That way, you can insert records into the two end tables and then link them up using the middle table.
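A sketch of that insert order using sqlite3; `grp`, `tree`, and `group_tree` are simplified stand-ins for the nmd__ tables, with both foreign keys living on the middle table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE grp  (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, name TEXT)")
# Only the link table carries foreign keys, pointing at both end tables:
conn.execute("""
    CREATE TABLE group_tree (
        group_id INTEGER NOT NULL REFERENCES grp(id),
        tree_id  INTEGER NOT NULL REFERENCES tree(id),
        PRIMARY KEY (group_id, tree_id)
    )
""")

# Insert the two end rows first, pick up their autogenerated ids, then link:
gid = conn.execute("INSERT INTO grp (name) VALUES ('admins')").lastrowid
tid = conn.execute("INSERT INTO tree (name) VALUES ('project A')").lastrowid
conn.execute("INSERT INTO group_tree VALUES (?, ?)", (gid, tid))
```

In MySQL the same pattern uses `LAST_INSERT_ID()` to pick up the autogenerated key between the inserts.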
I have a Rails app with records that contain foreign IDs linking to other records (belongs_to).
I am aware that index IDs start at 1 by default, and it's common to have a validation in the model that ensures foreign ID values are greater than zero.
In my example, I would like to have the option of not having a foreign ID set (i.e. the record is not [yet] linked to another 'owning' record).
Would it be appropriate to remove the validation from the model, and then set the value of the foreign ID to zero in this case, indicating that it is not assigned?
For some reason, I don't seem to be able to find this stated in documentation anywhere, maybe I'm using the wrong terminology, or it's too obvious to document ;-)
My question is: can I use CASCADE with a composite primary key?
I have a table FbUser and a table FbFriends. The FbFriends table has UID and FID as a composite primary key; in other tables it is referenced as a foreign key (UID, FID).
If I issue the statement DELETE FROM FbFriends WHERE UID = "10" AND FID = "2" CASCADE, will that delete the child rows as well?
ON DELETE CASCADE is an attribute of the foreign key. It is not a clause that you add to your DELETE statement. If the foreign key is defined to delete child rows when the parent is deleted, it doesn't matter whether the foreign key is defined on a single column or on multiple columns, the delete will cascade.
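For instance, here is a sketch in sqlite3 with a hypothetical child table FriendNotes carrying the composite foreign key. Note that the DELETE statement itself is plain; the cascade comes entirely from the constraint definition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE FbFriends (UID INTEGER, FID INTEGER, PRIMARY KEY (UID, FID))")
# Hypothetical child table; the composite FK carries the ON DELETE CASCADE attribute:
conn.execute("""
    CREATE TABLE FriendNotes (
        note TEXT,
        UID  INTEGER,
        FID  INTEGER,
        FOREIGN KEY (UID, FID) REFERENCES FbFriends (UID, FID) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO FbFriends VALUES (10, 2)")
conn.execute("INSERT INTO FriendNotes VALUES ('met at work', 10, 2)")

# A plain DELETE, no CASCADE clause, and the child row goes with the parent:
conn.execute("DELETE FROM FbFriends WHERE UID = 10 AND FID = 2")
left = conn.execute("SELECT COUNT(*) FROM FriendNotes").fetchone()[0]
```

`left` ends up as 0: the composite key changes nothing about how the cascade behaves.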
Personally, though, I'm not a big fan of cascading deletes or any other "magic" that happens outside the logic of the code at hand. I've seen too many cases where a cascading foreign key or an overlooked trigger turned a routine change into a mess: an ORM misconfigured to do a DELETE followed by an INSERT rather than an UPDATE, or a developer's script that deletes and reloads some rows in a table, each quietly modifying any number of other tables. If the original developer fails to realize those tables are impacted by his change, he'll certainly fail to test the data in them, and the change can easily get promoted to production before users start seeing the problem and crying. Sure, it's more verbose to explicitly delete from the child table before the parent table, but doing so generally makes it much more likely that someone can read and follow your code in its entirety.
In the Oracle realm, for example, Tom Kyte is against cascade deletes. You can also find various cases where cascading constraints caused unexpected behavior because the developers maintaining a system didn't remember that someone long ago had configured the constraints in a particular way. Personally, I'd much rather get an error telling me that the database can't delete a row because there are child rows rather than potentially losing data that I didn't intend to lose.
When I am using Foreign Keys in MySQL, I will get an error if the source value is 0 (because there is no target record with ID 0). Therefore, I am changing the source column to be NULL, and then it works. However, I am not sure if this is the right way this should be done. Is it the right way, or can I somehow keep the source ID set to 0 instead of NULL?
Foreign keys are constraints. This means that if the value of the column that has the foreign key is set to anything (and "anything" does not include NULL), that value must exist in the referenced table or MySQL will throw an error.
So, in short, you can either set the value to NULL, remove the foreign key constraint and set the value to whatever you desire, including 0, or add a record with a 0 in the referenced table. Of these options setting the value to NULL seems the cleanest.
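A sketch of the difference using sqlite3 (the behaviour is the same in MySQL); the target/source table names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, target_id INTEGER REFERENCES target(id))")
conn.execute("INSERT INTO target (id) VALUES (1)")

conn.execute("INSERT INTO source VALUES (1, 1)")     # OK: target 1 exists
conn.execute("INSERT INTO source VALUES (2, NULL)")  # OK: NULL is exempt from the FK check

failed = False
try:
    conn.execute("INSERT INTO source VALUES (3, 0)")  # fails: no target with id 0
except sqlite3.IntegrityError:
    failed = True
```

The 0 is treated like any other value and must exist in the referenced table; NULL is the only value the constraint skips.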
It is the right way. 0 is a value and null says that there is nothing in the column.
Yes, this is the right way. The whole point of an FK is to enforce that a record with the referenced ID actually exists. So if you set the FK column to 0, there must be a record with ID 0.
The only way around this is to make the FK column NULLable, as you did.
At any rate, why would you want to set the FK column to 0? The canonical value for "does not exist" in SQL is NULL.
Using NULL is better than zero for two reasons. First, it's clearer that it's a "special" value (nothing forces table ids to always be non-zero, although that is often true for auto-generated ids), and second, it works in SQL with the foreign key constraint.
So what you are doing is common practice: many people use NULL as a marker that says "missing value", and that's what SQL's foreign key constraint expects.
Another way to handle missing values is to use a third "link" table that has an entry only if there is a connection between the two records (as you would do in a many-to-many relation). This avoids the need for a NULL and so is preferred by some database purists, but it makes everything more complex. See Nullable Foreign Key bad practice? for more discussion.
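A sketch of that link-table variant in sqlite3, with hypothetical item/owner tables; an unowned item simply has no link row, so no column ever needs to hold NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE item  (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE owner (id INTEGER PRIMARY KEY)")
# Link table: a row exists only when an item actually has an owner.
conn.execute("""
    CREATE TABLE item_owner (
        item_id  INTEGER PRIMARY KEY REFERENCES item(id),
        owner_id INTEGER NOT NULL REFERENCES owner(id)
    )
""")
conn.execute("INSERT INTO item (id) VALUES (1)")  # unowned: simply no link row
conn.execute("INSERT INTO item (id) VALUES (2)")
conn.execute("INSERT INTO owner (id) VALUES (7)")
conn.execute("INSERT INTO item_owner VALUES (2, 7)")

# Finding unowned items becomes an anti-join instead of a NULL test on item itself:
unowned = conn.execute("""
    SELECT i.id FROM item i
    LEFT JOIN item_owner io ON io.item_id = i.id
    WHERE io.item_id IS NULL
""").fetchall()
```

Every column in every table can now be declared NOT NULL, at the cost of one extra table and a join.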
Yes, this is the right way and the correct pattern to use in those cases.
As stated, the indicated approach in those structures is to leave the column as null, meaning the row is not linked to any counterpart in the foreign table. It wouldn't be considered "right" in database theory, but it is a widely used pattern, so it isn't actually considered "wrong" by most database designers. I guess you could say it's the kind of pattern you learn not to flag when looking for mistakes in a structure.
The pattern is considered incorrect because primary key columns are expected to be non-null, and in that sense the columns in the referencing table should be identical to the primary key column of the referenced table, that is, never null. In practice, however, most databases impose no physical impediment to defining the column differently, which is what makes the null value possible.
The problem with this architecture shows up when the table gets too big, e.g. more than 1000 rows (yes, "big" can be that low, especially on small infrastructure!): response times start to get long and "questionable". In some databases (Oracle's single-column B-tree indexes, for example) NULL values are not stored in the index, so the engine ends up doing a full scan. This pattern is therefore best reserved for tables known to stay very small. Otherwise, I recommend the variant with an external table, where the "null" option in your case would be a "not found" row in that other table.