I've had some behavior in MS Access 2019 that surprised me and I've boiled it down to the following:
I've got two tables with a different number of records:
I need to establish an outer join between them that includes all records from table [test 1] and only those records of [test 2] where the joined fields are equal, and I'd like to have referential integrity so that I can't accidentally delete or modify a joined field on one side only.
When I open the Relationships window with the two tables and drag field [ID] from table [test 1] to [test 2], the join properties I need appear as choice 2:
When I attempt to create the join, I get an error message saying that data in table [test 2] violates referential integrity:
However, if I define the join in the opposite direction by dragging field [ID] from table [test 2] to [test 1], the result is different. First, the join properties I need appear as choice 3:
I've seen that difference before, and it's no problem. But the surprise is that when I attempt to create the join, now it works:
So my ability to establish referential integrity appears to depend on which direction I drag the field to set up the join. (Does that make the join left vs. right?) I don't remember seeing anything before about directional dependence (or perhaps it could be called non-commutativity) of referential integrity. The purpose of referential integrity is to prevent me from deleting or modifying a joined field in one table without making the corresponding change in the other. How does that objective depend on which direction I drag the field to set up the join?
Short Answer. No, referential integrity is not commutative.
Column X references column Y is not the same as column Y references column X.
Deep dive. The idea of foreign keys is fundamental to the relational model of data. Without it, the expressive power of the model would be so hampered that it would never have caught on the way it did some 50 years ago. A column can serve as a foreign key whether or not a foreign key constraint is declared on it. Still, the constraint will usually be helpful, for the reasons you mention in your question.
And you are right that dragging from X to Y won't produce the same constraint as dragging Y to X.
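The asymmetry is easy to see in a small sketch. It uses Python's built-in sqlite3 module rather than Access, purely so that it runs on its own. The tables x and y and their rows are made up for illustration, with x.id declared as referencing y.id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE y (id INTEGER PRIMARY KEY);                   -- referenced side
    CREATE TABLE x (id INTEGER PRIMARY KEY REFERENCES y(id));  -- referencing side
    INSERT INTO y (id) VALUES (1), (2), (3);
    INSERT INTO x (id) VALUES (1), (2);
""")


def is_rejected(sql):
    """Return True if the statement violates the foreign key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "insert x=4 (no such y)": is_rejected("INSERT INTO x (id) VALUES (4)"),
    "insert y=4 (no such x)": is_rejected("INSERT INTO y (id) VALUES (4)"),
    "delete y=1 (x=1 refers to it)": is_rejected("DELETE FROM y WHERE id = 1"),
    "delete x=2 (nothing refers to it)": is_rejected("DELETE FROM x WHERE id = 2"),
}

for action, rejected in results.items():
    print(action, "->", "rejected" if rejected else "allowed")
```

The constraint only restricts inserts on the referencing side and deletes on the referenced side. Swapping which table references which swaps those restrictions.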
When you join across a foreign key, an outer join usually keeps every row from the referenced side and only the matching rows from the referencing side. Outer joins and inner joins are each useful in different circumstances, which is why Access offers you three join options.
Related
I do something like this:
SET foreign_key_checks = 0;
-- many different operations on several tables
SET foreign_key_checks = 1;
How can I verify that my entire base is consistent? I want to be sure that all relationships are properly maintained. For example, if I delete a "country" with id: 20, I want to make sure that no "city" has a non-existent relationship "country_id" = 20.
It's easier if you do not SET foreign_key_checks = 0. If you keep the constraint enforcement on, then you can't make inconsistencies or broken references. You get an error if you try. So you should consider not turning off the FK checks if referential integrity is important.
If you do think you have inconsistencies, you can run a query like the following to verify there are no "orphans" that reference a parent that no longer exists:
SELECT cities.city_id
FROM cities
LEFT OUTER JOIN countries
ON cities.country_id = countries.country_id
WHERE countries.country_id IS NULL;
Because the JOIN condition is an equality on country_id, every matched row has a non-NULL countries.country_id. The left outer join returns NULL for all columns of countries when there is no match, so testing countries.country_id IS NULL in the WHERE clause returns only the cities that have no match in the other table. Note that this also includes cities whose own country_id is NULL; add AND cities.country_id IS NOT NULL if those should not count as orphans.
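Here is a minimal, runnable sketch of that orphan check. It uses Python's built-in sqlite3 module in place of MySQL so it needs no server. The rows and country names are invented, and no foreign key is declared, which mimics a database where the checks were switched off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (country_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cities (city_id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER);
    INSERT INTO countries VALUES (10, 'Freedonia'), (20, 'Ruritania');
    INSERT INTO cities VALUES
        (1, 'f-town', 10), (2, 'r-town', 20), (3, 'r-port', 20), (4, 'nowhere', NULL);
    -- Nothing stops this delete, so cities 2 and 3 become orphans.
    DELETE FROM countries WHERE country_id = 20;
""")

ORPHAN_QUERY = """
    SELECT cities.city_id
    FROM cities
    LEFT OUTER JOIN countries
      ON cities.country_id = countries.country_id
    WHERE countries.country_id IS NULL
"""

# Every city without a matching country, including the one with a NULL country_id.
unmatched = [r[0] for r in conn.execute(ORPHAN_QUERY + " ORDER BY cities.city_id")]

# Only cities that point at a country that does not exist.
orphans = [
    r[0]
    for r in conn.execute(
        ORPHAN_QUERY + " AND cities.country_id IS NOT NULL ORDER BY cities.city_id"
    )
]

print(unmatched, orphans)
```

Whether a NULL country_id counts as a problem depends on whether the column is meant to be optional, so both variants are shown.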
You must do a separate query for each relationship in your database. This can be quite a chore, and if the tables are very large, it can take a long time.
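One way to make that chore less error-prone is to generate the queries from a list of relationships. The sketch below is only an illustration under assumptions: the RELATIONSHIPS list, the streets table and the count_orphans helper are hypothetical, and SQLite (through Python's sqlite3 module) stands in for MySQL. In MySQL a similar list could probably be read from information_schema.KEY_COLUMN_USAGE, but only for constraints that are actually declared.

```python
import sqlite3

# Hypothetical relationship list: (child table, child column, parent table, parent column).
RELATIONSHIPS = [
    ("cities", "country_id", "countries", "country_id"),
    ("streets", "city_id", "cities", "city_id"),
]


def count_orphans(conn, relationships):
    """Run one orphan-count query per relationship and collect the results."""
    counts = {}
    for child, child_col, parent, parent_col in relationships:
        # Identifiers come from our own trusted list, so plain string formatting is used.
        sql = (
            f"SELECT COUNT(*) FROM {child} AS c "
            f"LEFT OUTER JOIN {parent} AS p ON c.{child_col} = p.{parent_col} "
            f"WHERE c.{child_col} IS NOT NULL AND p.{parent_col} IS NULL"
        )
        counts[(child, child_col)] = conn.execute(sql).fetchone()[0]
    return counts


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (country_id INTEGER PRIMARY KEY);
    CREATE TABLE cities (city_id INTEGER PRIMARY KEY, country_id INTEGER);
    CREATE TABLE streets (street_id INTEGER PRIMARY KEY, city_id INTEGER);
    INSERT INTO countries VALUES (10);
    INSERT INTO cities VALUES (1, 10), (2, 20);        -- country 20 does not exist
    INSERT INTO streets VALUES (1, 1), (2, 9), (3, 9); -- city 9 does not exist
""")

report = count_orphans(conn, RELATIONSHIPS)
print(report)
```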
I once had to do this many years ago in a buggy application that had no foreign key constraints (it used MyISAM tables). I ran a script to do all these orphan-checks every night, and eventually it grew to dozens of queries, and took hours to run.
Then comes the part that is even harder: once you do find some orphaned records, what do you do about them? Do you delete the orphans? Do you change their foreign key column to reference a parent record that does still exist? Do you restore the parent record? It could be any of these options, and you must have the orphaned records reviewed case by case, by someone with the knowledge and authority to choose how to resolve the issue.
It's far better to keep the constraints enforced so you don't have to do that work.
Hello people I have this foreign key dilemma. Let's say we have Table A and Table B and Table C.
Table A is a child of supertable B, and its records are connected through a foreign key on id from A to B (one way). Now table C contains information that could be applied to A and B. I know that having this information on table B will come in handy, but I am not sure about table A; technically the information could also belong to table A.
Now my question is, would it be better, to have table A access information in table C through its parent row in table B or make a "shortcut" from table A to table C and reference table C directly?
To simplify those two options:
Option 1: table A references table B + table B references table C
Option 2: table A references table B + table B references table C + table A references table C
Is there any benefit to option 2, given that the same information is only one table away in option 1?
A FOREIGN KEY is
an implicitly generated index (for performance) and
a constraint (for data integrity);
A FK is only indirectly "how you reference another table". So, I would prefer you simply talk about columns. (Any column could be used to 'reference' any other table.)
One of many textbook principles is DRY -- Don't Repeat Yourself. It is a wise principle because eventually something will go wrong and the repeated data will become inconsistent. The extra link from A to C is redundant.
On the other hand, in huge datasets, all sorts of textbook principles are violated to provide the required performance. (By "huge", I mean billions of rows, perhaps millions, but not thousands.)
Since you seem to be just starting out, I recommend not having the short cut and worrying about performance when you hit a problem. Yes, it will take an extra JOIN.
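For a sense of what that extra JOIN looks like under option 1, here is a small sketch. The column names (b_id, c_id, info) and the rows are assumptions, and Python's built-in sqlite3 module is used only so that the example runs without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id INTEGER PRIMARY KEY, info TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, c_id INTEGER REFERENCES c(id));
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER REFERENCES b(id));
    INSERT INTO c VALUES (1, 'shared info');
    INSERT INTO b VALUES (10, 1);
    INSERT INTO a VALUES (100, 10), (101, 10);
""")

# Option 1: reach C from A by going through the parent row in B.
rows = conn.execute("""
    SELECT a.id, c.info
    FROM a
    JOIN b ON a.b_id = b.id
    JOIN c ON b.c_id = c.id
    ORDER BY a.id
""").fetchall()

print(rows)
```

The link to C is stored once, on B, so there is no second copy on A that could drift out of sync.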
For novices, performance problems usually happen pretty soon, but not because of the lack of a shortcut; there are many other lessons to learn first. Hint: Learn about "composite indexes"; I think it is the number one performance technique that beginners fail to learn about. (And focusing on FKs distracts from focusing on INDEXes.)
They say a foreign key is to make possible a relationship between two tables, but I can do this in my statements with JOINs. Exactly what can I do with a foreign key in a SQL statement that I can't do with a JOIN? Or is a foreign key only to help us while we are working with tables in the database?
Relationships between rows of two tables can be established by storing a "common value" in columns of each table. (This is a fundamental tenet of relational database theory.)
A FOREIGN KEY is an integrity constraint in the database. If there is a foreign key constraint defined (and enforced), the database will prohibit invalid values from being stored in a row (by INSERT and UPDATE statements), and prevent rows from being removed (by DELETE statements) while other rows still reference them.
A JOIN operation in a SQL statement just allows us to access multiple tables. Typically, a join operation will include conditions that require a "match" of foreign key in one table with a primary key of another table. But this isn't required. It's possible to "join" tables on a huge variety of conditions, or on no condition at all (CROSS JOIN).
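As a rough illustration that a join needs neither a foreign key nor an equality condition, the sketch below joins two unrelated, made-up tables (products and budgets) on an inequality and then with no condition at all. It runs on Python's built-in sqlite3 module; the table names and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No primary keys and no foreign keys are declared anywhere.
    CREATE TABLE products (name TEXT, price INTEGER);
    CREATE TABLE budgets (owner TEXT, max_price INTEGER);
    INSERT INTO products VALUES ('pen', 2), ('book', 15), ('lamp', 40);
    INSERT INTO budgets VALUES ('ann', 20), ('bob', 5);
""")

# Join on an arbitrary condition: which products fit which budget?
affordable = conn.execute("""
    SELECT budgets.owner, products.name
    FROM budgets
    JOIN products ON products.price <= budgets.max_price
    ORDER BY budgets.owner, products.name
""").fetchall()

# Join on no condition at all: every budget paired with every product.
pair_count = conn.execute(
    "SELECT COUNT(*) FROM budgets CROSS JOIN products"
).fetchone()[0]

print(affordable, pair_count)
```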
A foreign key is designed to protect database integrity. You can read data with a join without any foreign key being present (and we do it all the time).
What a foreign key will do is prevent you from corrupting your data by doing things like deleting the parent record that a child record refers to. If you attempt to delete the parent record without deleting the child first, it will error, preventing the data corruption. It can also be configured so that if you delete the parent, child records are automatically deleted.
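Both behaviours can be tried in a few lines. This sketch assumes hypothetical tables parent, strict_child and cascade_child, and uses Python's built-in sqlite3 module so it is self-contained. The ON DELETE CASCADE clause is standard SQL, but details such as the exact error will differ in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE strict_child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent(id)
    );
    CREATE TABLE cascade_child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent(id) ON DELETE CASCADE
    );
    INSERT INTO parent VALUES (1), (2);
    INSERT INTO strict_child VALUES (10, 1);
    INSERT INTO cascade_child VALUES (20, 2), (21, 2);
""")

# Parent 1 still has a strict child, so the delete is refused.
try:
    conn.execute("DELETE FROM parent WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# Parent 2 only has cascading children, so they are removed along with it.
conn.execute("DELETE FROM parent WHERE id = 2")
remaining_children = conn.execute("SELECT COUNT(*) FROM cascade_child").fetchone()[0]

print(delete_blocked, remaining_children)
```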
We don't use FKs (foreign keys) to query or update.
Tables represent application relationships. When some values or entities identified by values are related in a certain way we put that row in the table for that relationship. We get or put rows that participate in relationships combined from base table relationships by writing queries mentioning the base tables. JOIN of tables returns the rows that are related by one's relationship AND by the others. UNION returns the rows that are related by one's relationship OR the other. ON and WHERE become AND. Etc. (Is there any rule of thumb to construct SQL query from a human-readable description?) By setting columns equal we force the same value or entity to play roles in multiple relationships. There might or might not be a FK between them, but we don't need to know that to query or update.
FKs get called "relationships", but they're not. They are facts. (They are also "instances" from a "meta" relationship on tables & columns.) They state that the subrow values for some columns in a table are always also subrow values for some columns that are PRIMARY KEY or UNIQUE in some table. (This also means that a certain implication using the tables' relationships is always true in the application situation.) Declaring a FOREIGN KEY to the DBMS means that it can reject update attempts that don't satisfy that constraint as errors. FK declarations are also tied to CASCADE rules in SQL DBMSs, simplifying updates.
I am using a MySQL database. In my relational data model, I've got two entities that relate 1:1 to each other. In my schema, a 1:1 relation is set up by putting a FK field in one of the two tables, that relates to the PK of the other table. Both tables have PKs and they are both auto increment BIGINTs.
I am wondering whether it would be possible to have an ON DELETE CASCADE behaviour on them that works both ways.
i.e. A 1:1 B, means that [ deleting A also deletes B ] as well as [ deleting B also deletes A ].
I realise that this may not be absolutely necessary in terms of proper application design, but I am just wondering whether it is actually possible. As far as I recall, you can't put an FK constraint on a PK.
It'd be impossible to insert such records if you have a 2-way relationship enforced. Chicken-and-egg. Record in table #1 can't be inserted because there's no matching record in table #2, and table #2 cannot be inserted into because there's nothing in table #1 to hook to.
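The chicken-and-egg problem can be reproduced in a small sketch. It assumes two hypothetical tables a and b, each with a NOT NULL foreign key to the other, and uses SQLite through Python's sqlite3 module as a stand-in. With its default immediate checking it behaves like InnoDB here, since InnoDB does not defer foreign key checks to commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER NOT NULL REFERENCES b(id));
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER NOT NULL REFERENCES a(id));
""")


def try_insert(sql):
    """Return 'ok' if the row was stored, 'rejected' on a constraint violation."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Whichever table goes first, the row it must point at does not exist yet.
a_first = try_insert("INSERT INTO a (id, b_id) VALUES (1, 1)")
b_first = try_insert("INSERT INTO b (id, a_id) VALUES (1, 1)")

print(a_first, b_first)
```

Making one of the two columns nullable, so that one row can be inserted first and linked afterwards, is the usual way out, at the cost of no longer guaranteeing the pairing at every moment.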
You can disable FK constraints temporarily (set foreign_key_checks = 0), but this should never be done in a "real" system. It's intended more for loading dumps where the table load order cannot be guaranteed.
I'm designing a small database for a personal project, and one of the tables, call it table C, needs to have a foreign key to one of two tables, call them A and B, differing by entry. What's the best way to implement this?
Ideas so far:
Create the table with two nullable foreign key fields connecting to the two tables.
Possibly with a trigger to reject inserts and updates that would result in 0 or 2 of them being null.
Two separate tables with identical data
This breaks the rule about duplicating data.
What's a more elegant way of solving this problem?
You're describing a design called Polymorphic Associations. This often gets people into trouble.
What I usually recommend:
A --> D <-- B
      ^
      |
      C
In this design, you create a common parent table D that both A and B reference. This is analogous to a common supertype in OO design. Now your child table C can reference the super-table and from there you can get to the respective sub-table.
Through constraints and compound keys you can make sure a given row in D can be referenced only by A or B but not both.
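One possible way to set that up is sketched below. The kind column, its 'A'/'B' values and all column names are assumptions made for the example, and it runs on SQLite via Python's sqlite3 module. The idea is that D records which subtype each row belongs to, and A and B each carry a compound foreign key that can only match rows of their own kind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE d (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('A', 'B')),
        UNIQUE (id, kind)              -- target for the compound foreign keys
    );
    CREATE TABLE a (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL DEFAULT 'A' CHECK (kind = 'A'),
        FOREIGN KEY (id, kind) REFERENCES d (id, kind)
    );
    CREATE TABLE b (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL DEFAULT 'B' CHECK (kind = 'B'),
        FOREIGN KEY (id, kind) REFERENCES d (id, kind)
    );
    CREATE TABLE c (
        id   INTEGER PRIMARY KEY,
        d_id INTEGER NOT NULL REFERENCES d (id)  -- C only knows about the supertable
    );
    INSERT INTO d (id, kind) VALUES (1, 'A');
    INSERT INTO a (id) VALUES (1);
    INSERT INTO c (id, d_id) VALUES (100, 1);
""")

# Row 1 of D is of kind 'A', so B cannot also claim it.
try:
    conn.execute("INSERT INTO b (id) VALUES (1)")
    claimed_by_both = True
except sqlite3.IntegrityError:
    claimed_by_both = False

print(claimed_by_both)
```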
If you're sure that C will only ever be referring to one of two tables (and not one of N), then your first choice is a sensible approach (and is one I've used before). But if you think the number of foreign key columns is going to keep increasing, this suggests there's some similarity or overlap that could be incorporated, and you might want to reconsider.
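For the two-column approach, here is a minimal sketch. It swaps the trigger mentioned in the question for a CHECK constraint, which is an assumption on my part: it works in SQLite (used here through Python's sqlite3 module) and in databases that enforce CHECK, but MySQL only enforces CHECK from 8.0.16 on, so a trigger may still be needed there. The column names a_id and b_id are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY);
    CREATE TABLE c (
        id   INTEGER PRIMARY KEY,
        a_id INTEGER REFERENCES a (id),
        b_id INTEGER REFERENCES b (id),
        -- Exactly one of the two references must be set.
        CHECK ((a_id IS NULL) <> (b_id IS NULL))
    );
    INSERT INTO a VALUES (1);
    INSERT INTO b VALUES (1);
""")


def accepted(a_id, b_id):
    """Return True if a row of C with these references can be inserted."""
    try:
        conn.execute("INSERT INTO c (a_id, b_id) VALUES (?, ?)", (a_id, b_id))
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = [
    accepted(1, None),     # points at A only
    accepted(None, 1),     # points at B only
    accepted(None, None),  # points at neither
    accepted(1, 1),        # points at both
]

print(outcomes)
```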