I'm designing a small database for a personal project, and one of the tables, call it table C, needs to have a foreign key to one of two tables, call them A and B, differing by entry. What's the best way to implement this?
Ideas so far:
Create the table with two nullable foreign key fields connecting to the two tables.
Possibly with a trigger to reject inserts and updates that would result in 0 or 2 of them being null.
Two separate tables with identical data
This breaks the rule about duplicating data.
What's a more elegant way of solving this problem?
You're describing a design called Polymorphic Associations. This often gets people into trouble.
What I usually recommend:
A --> D <-- B
      ^
      |
      C
In this design, you create a common parent table D that both A and B reference. This is analogous to a common supertype in OO design. Now your child table C can reference the super-table and from there you can get to the respective sub-table.
Through constraints and compound keys you can make sure a given row in D can be referenced only by A or B but not both.
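A minimal sketch of how the compound-key idea could look, using Python's built-in sqlite3 module so it runs as-is. The `kind` column, the lowercase table names and the sample rows are assumptions made for illustration; they are not part of the original answer, and other DBMSs may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE d (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN ('A', 'B')),
    UNIQUE (id, kind)               -- target of the compound foreign keys
);
CREATE TABLE a (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'A' CHECK (kind = 'A'),
    FOREIGN KEY (id, kind) REFERENCES d (id, kind)
);
CREATE TABLE b (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'B' CHECK (kind = 'B'),
    FOREIGN KEY (id, kind) REFERENCES d (id, kind)
);
CREATE TABLE c (
    id   INTEGER PRIMARY KEY,
    d_id INTEGER NOT NULL REFERENCES d (id)
);
""")

def try_insert(sql):
    """Run an INSERT and report whether the constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Row 1 of d is declared to be of kind 'A', so only table a may claim it.
conn.execute("INSERT INTO d (id, kind) VALUES (1, 'A')")
a_accepted = try_insert("INSERT INTO a (id) VALUES (1)")
b_accepted = try_insert("INSERT INTO b (id) VALUES (1)")  # (1, 'B') is not in d
conn.execute("INSERT INTO c (id, d_id) VALUES (10, 1)")
```

Because each sub-table pins `kind` to its own value, the pair `(id, kind)` in D can be matched by a row in A or a row in B, never both.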
If you're sure that C will only ever be referring to one of two tables (and not one of N), then your first choice is a sensible approach (and is one I've used before). But if you think the number of foreign key columns is going to keep increasing, this suggests there's some similarity or overlap that could be incorporated, and you might want to reconsider.
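For the two-nullable-columns option, a CHECK constraint can stand in for the trigger on systems that enforce it. The sketch below uses Python's built-in sqlite3 module; the column names `a_id` and `b_id` are assumptions. Note that MySQL versions before 8.0.16 parse CHECK clauses but do not enforce them, so there a trigger would still be the fallback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY);
CREATE TABLE c (
    id   INTEGER PRIMARY KEY,
    a_id INTEGER REFERENCES a (id),
    b_id INTEGER REFERENCES b (id),
    -- exactly one of the two references must be set
    CHECK ((a_id IS NULL) <> (b_id IS NULL))
);
INSERT INTO a (id) VALUES (1);
INSERT INTO b (id) VALUES (1);
""")

def accepted(a_id, b_id):
    """Return True if a row of c with these references passes the constraints."""
    try:
        conn.execute("INSERT INTO c (a_id, b_id) VALUES (?, ?)", (a_id, b_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, `accepted(1, None)` and `accepted(None, 1)` succeed, while rows with both or neither reference set are rejected.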
Hello people I have this foreign key dilemma. Let's say we have Table A and Table B and Table C.
Table A is a child of super table B, and the records are connected through a foreign key on id from A to B (one way). Now table C contains information that could apply to both A and B. I know that having this information on table B will come in handy, but I am not sure about table A; technically the information could also belong to table A.
Now my question is: would it be better to have table A access the information in table C through its parent row in table B, or to make a "shortcut" from table A to table C and reference table C directly?
To simplify those two options:
Option 1: table A references table B + table B references table C
Option 2: table A references table B + table B references table C + table A references table C
Is there any benefit to option 2, since the same information is one table away in option 1?
A FOREIGN KEY is
an implicitly generated index (for performance) and
a constraint (for data integrity);
A FK is only indirectly "how you reference another table". So, I would prefer you simply talk about columns. (Any column could be used to 'reference' any other table.)
One of many textbook principles is DRY -- Don't Repeat Yourself. It is a wise principle because eventually something will go wrong and the repeated data will become inconsistent. The extra link from A to C is redundant.
On the other hand, in huge datasets, all sorts of textbook principles are violated to provide the required performance. (By "huge", I mean billions of rows, perhaps millions, but not thousands.)
Since you seem to be just starting out, I recommend not having the short cut and worrying about performance when you hit a problem. Yes, it will take an extra JOIN.
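As a rough illustration of what "an extra JOIN" means for Option 1, here is a sketch using Python's built-in sqlite3 module. The column names (`b_id`, `c_id`, `info`) and the sample rows are assumptions, since the question does not give the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE c (id INTEGER PRIMARY KEY, info TEXT NOT NULL);
CREATE TABLE b (id INTEGER PRIMARY KEY, c_id INTEGER NOT NULL REFERENCES c (id));
CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER NOT NULL REFERENCES b (id));
INSERT INTO c VALUES (1, 'shared info');
INSERT INTO b VALUES (10, 1);
INSERT INTO a VALUES (100, 10);
""")

def info_for_a(a_id):
    """Reach table c from table a through its parent row in b (Option 1)."""
    row = conn.execute(
        """
        SELECT c.info
        FROM a
        JOIN b ON b.id = a.b_id
        JOIN c ON c.id = b.c_id
        WHERE a.id = ?
        """,
        (a_id,),
    ).fetchone()
    return row[0] if row else None
```

There is only one path from A to C here, so the two tables cannot disagree about which C row applies.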
For novices, performance problems usually happen pretty soon, but not because of the lack of a shortcut; there are many other lessons to learn first. Hint: Learn about "composite indexes"; I think it is the number one performance technique that beginners fail to learn about. (And focusing on FKs distracts from focusing on INDEXes.)
I am creating a new database design and got stuck on a problem. I have a table with attributes (A, B, C, D, E), where (A, B, C, D) is a composite key that uniquely determines E. My problem is that attribute D, which is part of the primary key, is multi-valued. Currently I am thinking of using comma-separated values for D, but that has limitations: when searching for E, the values in D must be in the same order as they were inserted. For example:
Let D be i,j,k. So my table T is A,B,C,D(i,j,k) -> E
Now if I want the result, I have to run the query with the values in the same order, as (A,B,C,D(i,j,k)).
So I am wondering whether there is a better way to do this.
Don't use comma separated values as one of the fields in a composite primary key. When you do this, you are really creating a table that is not in First Normal Form, even though it may appear to be 1NF to the DBMS.
First Normal Form was devised way back when the relational model was brand new. Its purpose was to guarantee keyed access to all data. Don't define your keys in a way that defeats the purpose of having keys.
Here's what to do: First decompose the rows of your table into separate rows for each individual value in D, so that D doesn't have to be a multivalue any more. This will conform to 1NF, but will probably be in violation of 2NF, since non key values may be determined by just A, B, and C. You will probably want to decompose into two tables so as to conform to 2NF. And you may want to normalize even more than that.
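One possible shape of the two-table decomposition, sketched with Python's built-in sqlite3 module. It assumes the case the answer anticipates, where E is determined by A, B and C alone; the table names `t` and `t_d` and the sample values are hypothetical. If different D sets under the same (A, B, C) could map to different E values, this layout would not be enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE t (
    a TEXT NOT NULL, b TEXT NOT NULL, c TEXT NOT NULL,
    e TEXT NOT NULL,
    PRIMARY KEY (a, b, c)
);
CREATE TABLE t_d (
    a TEXT NOT NULL, b TEXT NOT NULL, c TEXT NOT NULL,
    d TEXT NOT NULL,                 -- one row per individual D value
    PRIMARY KEY (a, b, c, d),
    FOREIGN KEY (a, b, c) REFERENCES t (a, b, c)
);
""")
conn.execute("INSERT INTO t VALUES ('a1', 'b1', 'c1', 'e1')")
conn.executemany(
    "INSERT INTO t_d VALUES ('a1', 'b1', 'c1', ?)", [("i",), ("j",), ("k",)]
)

def find_e(a, b, c, d_values):
    """Return e if every value in d_values (non-empty, any order) is stored."""
    wanted = sorted(set(d_values))
    marks = ", ".join("?" for _ in wanted)
    row = conn.execute(
        f"""
        SELECT t.e
        FROM t
        JOIN t_d ON t_d.a = t.a AND t_d.b = t.b AND t_d.c = t.c
        WHERE t.a = ? AND t.b = ? AND t.c = ? AND t_d.d IN ({marks})
        GROUP BY t.a, t.b, t.c, t.e
        HAVING COUNT(*) = ?
        """,
        (a, b, c, *wanted, len(wanted)),
    ).fetchone()
    return row[0] if row else None
```

Since each D value is its own row, the lookup no longer depends on the order in which the values were inserted.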
In MySQL, I have 2 tables, A and B, with many-to-many relationships. And I have a pivot table, A_B, for the relationships between table A and B.
In table A, there are some rows that I want to link to *ALL* (including current and future) rows in table B. For example, let's say currently there are 5 rows in table B. Row A2 in table A is linked to all 5 rows in table B. In the future, when new rows are added to table B, I want row A2 to be automatically linked to ALL those new rows in table B.
Currently I am thinking of using MySQL Trigger to add this function. But I am not sure whether this is the correct way to do this or not.
Or I should add a new column in table A, and use it as an indicator for the default relationship? Then in my programming codes, I can check the column value whether a row in table A has relationship to all rows in table B.
Pivot table
In Relational Database terms, that is an Associative Table.
What you have is a need, such that designated rows in Table_A are related to all rows in Table_B; and the rest are related via Table_A_B. The great Dr E F Codd asks us to think of data in terms of sets. There are two sets in Table_A. Let's call the first set All_B and the second set Specific_B, for which Table_A_B identifies the specific relations.
Therefore it is not a "default" relationship; it is two separate relationships, based on the Subtype.
You don't need the "default" relationship you described. That would result in masses of rows that serve no purpose, because the rows in Table_B for All_B are known, they do not need to be stored.
You don't need triggers (which would have been necessary to keep inserting Table_A_B Rows whenever an All_B row is inserted). Ugly as sin.
The correct method is to use Exclusive Subtypes, named for the sets, where:
Specific_B is related to Table_B, via Table_A_B
Table_A itself, and All_B, are related to nothing.
Your code can determine which Subtype any Table_A row is, and join Table_B accordingly:
via Table_A_B for Specific_B
Cartesian Product Table_A::Table_B for All_B.
Picture
Typical XOR Gate • Table Relation Diagram
An Exclusive Subtype requires a Discriminator column in the Basetype.
If you don't want to bother with full Relational Integrity, then sure, an Indicator column in Table_A will suffice. That is the Discriminator column in Table_A, minus the Subtypes.
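The simpler indicator-column variant could be sketched as follows, using Python's built-in sqlite3 module. This is not the full Exclusive Subtype structure with separate subtype tables; the `kind` column, its two values and the lowercase table names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE table_a (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN ('ALL_B', 'SPECIFIC_B'))  -- discriminator
);
CREATE TABLE table_b (id INTEGER PRIMARY KEY);
CREATE TABLE table_a_b (
    a_id INTEGER NOT NULL REFERENCES table_a (id),
    b_id INTEGER NOT NULL REFERENCES table_b (id),
    PRIMARY KEY (a_id, b_id)
);
INSERT INTO table_a VALUES (1, 'SPECIFIC_B'), (2, 'ALL_B');
INSERT INTO table_b VALUES (1), (2), (3);
INSERT INTO table_a_b VALUES (1, 2);
""")

def linked_b_ids(a_id):
    """Return the table_b ids related to one table_a row, by its kind."""
    rows = conn.execute(
        """
        SELECT ab.b_id
        FROM table_a a
        JOIN table_a_b ab ON ab.a_id = a.id
        WHERE a.id = ? AND a.kind = 'SPECIFIC_B'
        UNION ALL
        SELECT b.id
        FROM table_a a
        CROSS JOIN table_b b
        WHERE a.id = ? AND a.kind = 'ALL_B'
        ORDER BY 1
        """,
        (a_id, a_id),
    ).fetchall()
    return [r[0] for r in rows]
```

A row added to `table_b` later shows up for every `ALL_B` row without any trigger and without storing extra rows in `table_a_b`.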
Response to Comment
This is the first time I learn about the concept of exclusive subtype. Now I know how to solve the problem
In that case, study this document on Subtypes. The links include code that you will need, such that the integrity is Declarative. If you would like a full discussion, and particularly to avoid ugly methods to enforce "integrity", read this answer.
I think using database triggers is the most appropriate way to do what you want, because it preserves the independence of the DB from the program and keeps the DB complete by itself.
But you might also consider the grouping concept used in user account management DBs. You can create a group such as admins, relate all records of table B to it, and assign only the relevant records of A to this group.
I am creating a DB for my project and I am facing a doubt regarding best practice.
My concrete case is:
I have a table that stores the floors of a building called "floor"
I have a second table that stores the buildings called "building"
I have a third table that stores the relationship between them, called building_x_floor
The problem is this 3rd table.
What should I do?
Have only two columns, one holding a FK to the PK of building and another holding an FK to the PK of floor;
Have the two columns above and a third column with a PK, and control consistency with a trigger, forbidding the insertion of a duplicate tuple of (idbuilding, idfloor)?
My first thought was to use the first option, but from googling around and talking to people I heard that it is not always the best option.
So I am asking for guidance.
I am using MySQL 5.6.17.
You don't need a third table, because there is a one-to-many relationship between building and floor.
So one building has many floors and a floor belongs to one building. Don't overcomplicate things. Even if you do need a table with composite keys, you should be careful: you need to override the equals and hashCode methods.
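A minimal sketch of the one-to-many layout without a join table, using Python's built-in sqlite3 module. The `name` and `level` columns and the sample data are assumptions; only the table names and `idbuilding`/`idfloor` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE building (
    idbuilding INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE floor (
    idfloor    INTEGER PRIMARY KEY,
    idbuilding INTEGER NOT NULL REFERENCES building (idbuilding),
    level      INTEGER NOT NULL,
    UNIQUE (idbuilding, level)      -- a building has each level only once
);
INSERT INTO building VALUES (1, 'HQ');
INSERT INTO floor (idbuilding, level) VALUES (1, 0), (1, 1), (1, 2);
""")

def floors_of(idbuilding):
    """List the levels that belong to one building."""
    rows = conn.execute(
        "SELECT level FROM floor WHERE idbuilding = ? ORDER BY level",
        (idbuilding,),
    ).fetchall()
    return [r[0] for r in rows]

def add_floor(idbuilding, level):
    """Insert a floor; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO floor (idbuilding, level) VALUES (?, ?)",
            (idbuilding, level),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

The foreign key lives on the "many" side, so each floor row names exactly one building.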
I am still not comfortable with that approach. I am not saying it is wrong or inappropriate, far from it. I am trying to understand how the information would be organized and how well it would perform.
If I have a 1:* relationship, such as a student attending more than one subject during a semester of a university course, I would have the third table with (semester, idstudent, iddiscipline).
If I try to get rid of the join table, my relationship would be made with a FK inside the student table or inside the subject table. That does not make sense, because the student table holds the registration information of a person, while the discipline table holds the data of a discipline, such as content and hours; it is more of a parametric table.
So I would need a table for the join.
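For a case that really does need a join table, a composite primary key over the foreign key columns rejects duplicate tuples by itself, so neither a surrogate key nor a trigger is required for that purpose. A sketch with Python's built-in sqlite3 module follows; the table name `enrollment`, the `name` and `title` columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE student    (idstudent    INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE discipline (iddiscipline INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollment (
    semester     TEXT    NOT NULL,
    idstudent    INTEGER NOT NULL REFERENCES student (idstudent),
    iddiscipline INTEGER NOT NULL REFERENCES discipline (iddiscipline),
    PRIMARY KEY (semester, idstudent, iddiscipline)  -- rejects duplicate tuples
);
INSERT INTO student VALUES (1, 'Ana');
INSERT INTO discipline VALUES (1, 'Databases'), (2, 'Networks');
""")

def enroll(semester, idstudent, iddiscipline):
    """Insert one enrollment; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO enrollment VALUES (?, ?, ?)",
            (semester, idstudent, iddiscipline),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Enrolling the same student in the same discipline twice in one semester fails, while the same pair in a different semester is accepted.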
I have a parent table called 'Website' which holds records about websites. I have a child table called 'SupportSystem' which holds records about different types of support systems such as email, phone, ticketing, live chat etc. There is an intermediate table 'Website_SupportSystem' which joins these tables in a many-many relationship.
If the SupportSystem for a Website is ticketing, I also want to record the software platform, e.g. WHMCS. My instinct is to create a new lookup table called SupportPlatform and relate this to the existing join table 'Website_SupportSystem' and store the data there. However, then there is no relationship between the SupportSystem and SupportPlatform. If I relate those then I end up with a circular reference.
Can you see what I am doing wrong? What would be the best way to model this data?
You could use super-type/subtype relationship, as shown in the diagram.
SupportSystem table contains columns common to all support systems.
Email, Ticketing, Phone and LiveChat tables have columns specific to each one.
Primary key in the subtype table is also a foreign key to the super-type table.
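Since the diagram is not reproduced here, the following is only a guess at its shape, sketched with Python's built-in sqlite3 module. It shows two of the four subtype tables; the column names (`SupportSystemId`, `Name`, `Platform`, `Address`) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE SupportSystem (
    SupportSystemId INTEGER PRIMARY KEY,
    Name            TEXT NOT NULL      -- column common to all support systems
);
CREATE TABLE Ticketing (
    -- the subtype's primary key is also a foreign key to the super-type
    SupportSystemId INTEGER PRIMARY KEY
        REFERENCES SupportSystem (SupportSystemId),
    Platform        TEXT NOT NULL      -- ticketing-specific column
);
CREATE TABLE Email (
    SupportSystemId INTEGER PRIMARY KEY
        REFERENCES SupportSystem (SupportSystemId),
    Address         TEXT NOT NULL      -- email-specific column
);
INSERT INTO SupportSystem VALUES (1, 'ticketing'), (2, 'email');
INSERT INTO Ticketing VALUES (1, 'WHMCS');
INSERT INTO Email VALUES (2, 'support@example.com');
""")

def ticketing_platform(support_system_id):
    """Return the platform if this support system is a ticketing system."""
    row = conn.execute(
        """
        SELECT t.Platform
        FROM SupportSystem s
        JOIN Ticketing t ON t.SupportSystemId = s.SupportSystemId
        WHERE s.SupportSystemId = ?
        """,
        (support_system_id,),
    ).fetchone()
    return row[0] if row else None

def add_ticketing(support_system_id, platform):
    """Insert a Ticketing row; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO Ticketing VALUES (?, ?)", (support_system_id, platform)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

The platform is stored only where it applies, so non-ticketing systems carry no empty platform column.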
I would add a new column "SupportPlatformId" to the "SupportSystem" table that looks up the "SupportPlatform" table, because "SupportSystem" to "SupportPlatform" is probably one-to-one or many-to-one.
Hence: Website -> (via Website_SupportSystem) SupportSystem -> SupportPlatform
Data about a Support Platform should be stored in the SupportPlatform table.
You can add a third foreign key, namely SupportPlatformID, to the Website_SupportSystem table. If you do this, your intermediate table now records a ternary relationship, of the kind many-to-many-to-many. If this reflects the reality, then so be it.
If you want to relate SupportSystems and SupportPlatforms, just use the intermediate table as an intermediate table in the joins. You can even do a three way join to join all three entities via the intermediate table.
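A sketch of the third-foreign-key variant and the join through the intermediate table, using Python's built-in sqlite3 module. It makes two simplifying assumptions that go beyond the answer: the platform column is nullable (for support systems without a platform), and the key stays on (WebsiteId, SupportSystemId), which allows at most one platform per pair. The `Name` columns and sample rows are also hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Website (WebsiteId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE SupportSystem (SupportSystemId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE SupportPlatform (SupportPlatformId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Website_SupportSystem (
    WebsiteId         INTEGER NOT NULL REFERENCES Website (WebsiteId),
    SupportSystemId   INTEGER NOT NULL REFERENCES SupportSystem (SupportSystemId),
    -- assumption: NULL when no platform applies (e.g. email)
    SupportPlatformId INTEGER REFERENCES SupportPlatform (SupportPlatformId),
    PRIMARY KEY (WebsiteId, SupportSystemId)
);
INSERT INTO Website VALUES (1, 'example.com');
INSERT INTO SupportSystem VALUES (1, 'email'), (2, 'ticketing');
INSERT INTO SupportPlatform VALUES (1, 'WHMCS');
INSERT INTO Website_SupportSystem VALUES (1, 1, NULL), (1, 2, 1);
""")

def support_for(website_name):
    """Join all three entities through the intermediate table."""
    return conn.execute(
        """
        SELECT ss.Name, sp.Name
        FROM Website w
        JOIN Website_SupportSystem wss ON wss.WebsiteId = w.WebsiteId
        JOIN SupportSystem ss ON ss.SupportSystemId = wss.SupportSystemId
        LEFT JOIN SupportPlatform sp
               ON sp.SupportPlatformId = wss.SupportPlatformId
        WHERE w.Name = ?
        ORDER BY ss.Name
        """,
        (website_name,),
    ).fetchall()
```

The LEFT JOIN keeps support systems that have no platform in the result, with the platform name returned as `None`.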
An alternative would be to create another intermediate table, SupportPlatform_SupportSystem, with a pair of foreign keys, namely SupportSystemID and SupportPlatformID. If this reflects the reality better, so be it. Then you can join it all together with a five table join, if need be.