MySQL > Should pseudokey table columns be identifying or non-identifying relationships? - mysql

Take the following database tables
|===========|
| user      |
|===========|
| id        |
| username  |
| password  |
|===========|

|===========|
| blog      |
|===========|
| id        |
| date      |
| content   |
| author_id |
|===========|
blog.author_id is supposed to be connected to a particular user.id: the user who wrote the blog entry.
My question is with regards to 1:1, 1:n identifying and non-identifying relationships... I don't really understand them very much. Should this relationship be one of these types of relationships or not? And if so, which one? And what is the advantage of this?

In this example, each blog record has exactly one author, while a user can write many blog records, so the relationship from user to blog is 1:n. The reason they exist as separate entities/tables is the grouping of information: user-related data doesn't belong with a blog record, and it would be duplicated if someone wrote more than one blog.
The reason you want that implemented as a foreign key constraint is because the constraint ensures that the author for the blog record exists in the user table. Otherwise, it could be nonsense/bad data. The foreign key doesn't stop duplicates -- you'd need a primary or unique key for that -- the foreign key only validates data.
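The point about the foreign key validating data, but not preventing duplicates, can be sketched as follows. This is a minimal illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the column types and sample rows are assumptions, not part of the question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL. SQLite enforces foreign keys
# only after this PRAGMA is switched on for the connection.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE user (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL
);
CREATE TABLE blog (
    id        INTEGER PRIMARY KEY,
    date      TEXT NOT NULL,
    content   TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES user (id)  -- the foreign key
);
INSERT INTO user (id, username, password) VALUES (1, 'alice', 'secret');
""")

# Two entries by the same author are accepted: the foreign key only checks
# that the author exists, it does not enforce uniqueness.
conn.execute("INSERT INTO blog (date, content, author_id) VALUES ('2020-01-01', 'first', 1)")
conn.execute("INSERT INTO blog (date, content, author_id) VALUES ('2020-01-02', 'second', 1)")


def insert_orphan_blog():
    """Try to store a blog entry whose author does not exist."""
    try:
        conn.execute(
            "INSERT INTO blog (date, content, author_id) VALUES ('2020-01-03', 'ghost', 99)"
        )
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_orphan_blog()
blog_count = conn.execute("SELECT COUNT(*) FROM blog").fetchone()[0]
print(orphan_accepted, blog_count)
```

The entry pointing at the non-existent user 99 is rejected, while both entries by user 1 are stored.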
Now that Nanne clarified the identifying/non-identifying terminology for me, the blog.author_id would be the identifying relationship. Because it's identifying who (what user record) the author is.
The id column in both tables can be assumed to be the primary key, because an artificial/surrogate key is the most common primary key. Which makes these columns the non-identifying relationship...

As a blog and a user are separate things, and not defined by each other, these are non-identifying relationships. One can be something without the other, even though the author_id might be mandatory.
Also see this link for more explanation about the two terms: What's the difference between identifying and non-identifying relationships?

Related

How to model diamond like many-to-many relationship in database ERD

Legend:
PK (Blue): Primary key
FK (Green): Foreign key
PFK (Blue): Primary Key and Foreign Key at the same time
How to model a diamond like (if term is correct) relationship? Better to explain using a simplified example:
There is organization, item and tag entities.
My aim is to model:
Every tag is unique by itself and belongs to a single organization.
Every item is unique by itself and belongs to a single organization.
Items have many tags (joined using M2M table) and related tag/item pairs must belong to same organization. (i.e. item from organization A cannot pair with a tag from organization B)
I diagrammed two alternative solutions, but neither of them satisfied me.
Diagram 1 breaks the 3rd aim: items and tags are unique by themselves using id as primary key, but there is nothing to stop inserting pairs into item_tag that belong to different organizations.
Diagram 2 does not break, but bends, the 1st and 2nd aims: organization_id is added as a Primary and Foreign Key to the item and tag tables, and the item_tag.organization_id column references both. This prevents pairs from different organizations. The tag.id and item.id columns are now part of an unnecessary composite primary key, because in reality the single id column represents uniqueness of the item and tag.
How can I model those requirements correctly?
To enforce referential integrity, you'll have to ...
include organization_id in all tables
create logically redundant UNIQUE (or PK) constraints on (organization_id, id) in both tables tag and item
have multicolumn FK constraints in item_tag matching the columns of those UNIQUE constraints.
If you don't include the organization_id (logically redundantly) there would be nothing to keep you from linking items and tags from different organizations (by mistake).
That would be your diagram 2. But do you really need data type uuid for tags? bigint or even int should suffice, while being a bit smaller and faster.
Closely related case with code example for PostgreSQL:
Enforcing constraints “two tables away”
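The three steps listed in this answer can be sketched as follows. This is a minimal illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL, integer ids instead of uuid, and made-up sample rows. It keeps id as the primary key of item and tag and adds the logically redundant UNIQUE constraint as the target of the multicolumn foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

conn.executescript("""
CREATE TABLE organization (id INTEGER PRIMARY KEY);

CREATE TABLE item (
    id              INTEGER PRIMARY KEY,
    organization_id INTEGER NOT NULL REFERENCES organization (id),
    UNIQUE (organization_id, id)  -- logically redundant, gives the FK a target
);

CREATE TABLE tag (
    id              INTEGER PRIMARY KEY,
    organization_id INTEGER NOT NULL REFERENCES organization (id),
    UNIQUE (organization_id, id)
);

CREATE TABLE item_tag (
    organization_id INTEGER NOT NULL,
    item_id         INTEGER NOT NULL,
    tag_id          INTEGER NOT NULL,
    PRIMARY KEY (item_id, tag_id),
    -- Both FKs share organization_id, so item and tag must agree on it.
    FOREIGN KEY (organization_id, item_id) REFERENCES item (organization_id, id),
    FOREIGN KEY (organization_id, tag_id)  REFERENCES tag  (organization_id, id)
);

INSERT INTO organization (id) VALUES (1), (2);
INSERT INTO item (id, organization_id) VALUES (10, 1);
INSERT INTO tag  (id, organization_id) VALUES (20, 1), (21, 2);
""")


def link(organization_id, item_id, tag_id):
    """Return True if the pair was stored, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO item_tag (organization_id, item_id, tag_id) VALUES (?, ?, ?)",
            (organization_id, item_id, tag_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


same_org = link(1, 10, 20)        # item 10 and tag 20 both belong to organization 1
cross_org_as_1 = link(1, 10, 21)  # tag 21 belongs to organization 2
cross_org_as_2 = link(2, 10, 21)  # item 10 belongs to organization 1
print(same_org, cross_org_as_1, cross_org_as_2)
```

Whichever organization_id is supplied for the mixed pair, one of the two foreign keys fails, so only the same-organization pair is stored.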

MySQL Database Normalization .. one table to connect multiple others?

Let's assume I have a very large database with tons of tables in it.
Certain of these tables contain datasets to be connected to each other like
table: album
table: artist
--> connected by table: album_artist
table: company
table: product
--> connected by table: company_product
The tables album_artist and company_product each contain 3 columns: a primary key, plus albumID/artistID and companyID/productID respectively.
Is it a good practice to do something like an "assoc" table which is made up like
-----------------------------------------------------------
| id int(11) primary | leftID | assocType       | rightID |
|---------------------------------------------------------|
| 1                  | 10     | company:product | 4       |
| 2                  | 6      | company:product | 5       |
| 3                  | 4      | album:artist    | 10      |
-----------------------------------------------------------
I'm not sure if this is the way to go or if there's anything else than creating multiple connection tables?!
No, it is not a good practice. It is a terrible practice, because referential integrity goes out the window. Referential integrity is the guarantee provided by the RDBMS that a foreign key in one row refers to a valid row in another table. In order for the database to be able to enforce referential integrity, each referring column must refer to one and only one referred column of one and only one referred table.
No, no, a thousand times no. Don't overthink your many-to-many relationships. Just keep them simple. There's nothing to gain and a lot to lose by trying to consolidate all your relationships in a single table.
If you have a many-to-many relationship between, say, guitarist and drummer, then you need a guitarist_drummer table with two columns in it: guitarist_id and drummer_id. That table's primary key should be composed of both columns. And you should have another index that's made of the two columns in the opposite order. Don't add a third column with an autoincrementing id to those join tables. That's a waste, and it allows duplicated pairs in those tables, which is generally confusing.
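The join table described here can be sketched as follows. This is a minimal illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the name columns, the index name and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

conn.executescript("""
CREATE TABLE guitarist (guitarist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE drummer   (drummer_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Two columns only: the pair itself is the primary key.
CREATE TABLE guitarist_drummer (
    guitarist_id INTEGER NOT NULL REFERENCES guitarist (guitarist_id),
    drummer_id   INTEGER NOT NULL REFERENCES drummer (drummer_id),
    PRIMARY KEY (guitarist_id, drummer_id)
);

-- Second index with the columns reversed, for lookups starting from the drummer.
CREATE INDEX guitarist_drummer_reverse
    ON guitarist_drummer (drummer_id, guitarist_id);

INSERT INTO guitarist (guitarist_id, name) VALUES (1, 'G1');
INSERT INTO drummer   (drummer_id, name)   VALUES (1, 'D1');
""")


def pair(guitarist_id, drummer_id):
    """Return True if the pair was stored, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO guitarist_drummer (guitarist_id, drummer_id) VALUES (?, ?)",
            (guitarist_id, drummer_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first_time = pair(1, 1)       # stored
duplicate = pair(1, 1)        # rejected by the composite primary key
unknown_drummer = pair(1, 2)  # rejected by the foreign key
print(first_time, duplicate, unknown_drummer)
```

With an extra autoincrementing id as the primary key, the duplicate pair would have been accepted unless a separate UNIQUE constraint were added.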
People who took the RDBMS class in school will immediately recognize how these tables work. That's good, because it means you don't have to be the only programmer on this project for the rest of your life.
Pro tip: Use the same column name everywhere. Make your guitarist table contain a primary key called guitarist_id rather than id. It makes your relationship tables easier to understand. And, if you use a reverse engineering tool like SQL Developer, that tool will have an easier time with your schema.
The answer is that it "depends" on the situation. In your case and most others, no, it does not make sense. A link table does make sense if you are modelling a many <-> many relationship: the constraints can be enforced by the link table with foreign keys and a unique constraint. Probably the best use case would be if you had numerous tables pointing to a single table. Each table could have its own link table with indexes on it. This would be beneficial if one of the tables is a large table, and you need to fetch the linked records separately.

Can A Foreign Key Be Used More than Once?

Apologies for the newbie question.
The primary key of a table, such as Holiday, would be something like Holiday_ID. A Holiday references a get-away ticket that you can buy to go on a type of holiday, based on the ticket you buy.
Suppose I used Holiday_ID in a composite entity with Customer_ID to identify an instance of Holiday associated with customer, for whatever purpose.
However, suppose I also want to keep track of other information related to this instance: how much the customer has paid for the ticket, and how much the customer has yet to pay for the ticket.
I have two options:
a) I can create another composite entity. However, I am not sure if I can do that, because I am not sure if you can use a particular foreign key more than once
b) I can create a composite/associate entity, however, I am not sure if you can create a composite entity with more than two foreign keys?
To answer the technical parts of your question, once you create a composite unique or primary key, ONLY ONE record in the table can have the same values in the set of fields defined in that key. SO, no, you cannot reuse the holidayId key WITH THE SAME customer. You can use it with another, different customer if you wish.
Second, there is no limit to the number of attributes that can be included in a Unique or primary key. If you need, and if it's appropriate and conforms to the rules of normalization, the key can include all the attributes of the table.
Third, to answer your question below: any column, or set of columns, in a table can be defined as a Foreign Key, as long as it is also the primary key or unique key of some table in the database. And there can be any number of FKs defined in a table; they can even overlap (you can have HolidayId as a FK, and also have HolidayId and CustomerId as a composite FK). The only restriction is that the FK must reference a Primary or Unique Key of some table in the database. (It can also be the same table the FK is in, as when you add a supervisorId to an employee table that is a FK to the EmployeeId of the same employee table.)
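These three points can be sketched as follows. This is a minimal illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the booking and payment tables, their columns and the sample rows are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

conn.executescript("""
CREATE TABLE holiday  (holiday_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Composite key made of two FKs, plus extra attributes about the pair.
CREATE TABLE booking (
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
    holiday_id  INTEGER NOT NULL REFERENCES holiday (holiday_id),
    amount_paid NUMERIC NOT NULL DEFAULT 0,
    amount_due  NUMERIC NOT NULL DEFAULT 0,
    PRIMARY KEY (customer_id, holiday_id)
);

-- Overlapping FKs: holiday_id alone, and (customer_id, holiday_id) together.
CREATE TABLE payment (
    payment_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    holiday_id  INTEGER NOT NULL REFERENCES holiday (holiday_id),
    amount      NUMERIC NOT NULL,
    FOREIGN KEY (customer_id, holiday_id) REFERENCES booking (customer_id, holiday_id)
);

-- Self-referencing FK.
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    supervisor_id INTEGER REFERENCES employee (employee_id)
);

INSERT INTO holiday  VALUES (1, 'Aruba'), (2, 'Hawaii');
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO booking (customer_id, holiday_id, amount_paid, amount_due)
    VALUES (1, 1, 100, 400), (2, 1, 0, 500);  -- same holiday, two customers
INSERT INTO employee VALUES (1, NULL), (2, 1);
""")


def accepted(sql):
    """Return True if the statement ran, False if a constraint rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


bookings_for_holiday_1 = conn.execute(
    "SELECT COUNT(*) FROM booking WHERE holiday_id = 1"
).fetchone()[0]
duplicate_booking = accepted("INSERT INTO booking (customer_id, holiday_id) VALUES (1, 1)")
payment_ok = accepted("INSERT INTO payment (customer_id, holiday_id, amount) VALUES (1, 1, 50)")
# Holiday 2 exists, but customer 1 never booked it, so the composite FK fails.
payment_without_booking = accepted(
    "INSERT INTO payment (customer_id, holiday_id, amount) VALUES (1, 2, 50)"
)
bad_supervisor = accepted("INSERT INTO employee VALUES (3, 99)")
print(bookings_for_holiday_1, duplicate_booking, payment_ok,
      payment_without_booking, bad_supervisor)
```

The same holiday_id appears in two bookings and in two foreign keys of payment, which is allowed; only a repeated (customer_id, holiday_id) pair, a payment for a pair that was never booked, or a supervisor that does not exist is rejected.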
This example illustrates one of the problems of using surrogate keys without also using a natural key. To wit: what, exactly, is a "Holiday"? Is Christmas 2016 the same "Holiday" as Christmas 2015? Is Christmas in Aruba the same holiday as Christmas in Hawaii?
And then, regarding the composite table that identifies associations of customer with Holiday: is it the same association if the customer goes to Aruba on Christmas the next year, or a different instance? What does the row in the table represent if the customer wants 5 tickets?
The first thing that should be done in database design is a logical design which defines, as clearly and unambiguously as possible, in business terms, the meanings of the entities for each table in the database.

MySQL - Should every table contain its own id/primary column?

I'm putting together a question and answers application. The answers are only going to exist as long as there is a question that relates to them.
So I've decided not to give the answers table its own id column and have made the primary key a foreign key that relates to the question_id.
Questions table:
id | title
Answers table:
question_id | title
Should I keep it this way or give the answers table its own id column?
If there is a possibility of multiple answers for a single question, then it will be better to have a primary key on the answers table too, so that each row can be identified uniquely even if we get duplicate answers, as follows:
id | question_id | title
1  | 1           | 5
2  | 1           | 5
3  | 2           | true
But, in case you are anticipating only a single answer for each question then it is better to merge it to the question table as both question and answer are directly dependent on a single primary key.
id | question  | answer
1  | quest 1 ? | 5
2  | quest 2 ? | 5
3  | quest 3 ? | true
4  | quest 4 ? | null
I hope, this clarifies your doubt.
To expound a bit on the two valuable comments that have been made, in my experience, the following is the most effective set of rules to follow when defining a database schema (I will give reasons after):
Create a Primary Key for each table
Create a surrogate key to be that Primary Key
When you have a one-to-many relationship (as you do in your questions & answers tables), include the PK from the one table (your questions table) in the many table (your answers table). NOTE: this is exactly as you have done it, except that the answers table doesn't have its own PK & surrogate key
When there is a many to many relationship between two tables create a linkage/join/relationship table which has a one to many relationship to your two tables (meaning you put the Primary Key of each table into the relationship table as a foreign key to the two tables, respectively)
REASONS (in the same order):
Primary key columns guarantee uniqueness for each row within the scope of the table itself (no other database object has to be involved & every row will be required to be unique). They also provide a default index in most databases, which speeds up queries because lookups by key can avoid full table scans. As mentioned, this effectively meets first normal form.
I've found surrogate keys to be a powerful & effective way to simplify both database design & relationships between tables. If you aren't familiar please read here: http://en.wikipedia.org/wiki/Surrogate_key
You have done this already, so I'm assuming you understand the benefits.
This is here simply to provide an example of how using surrogate keys as primary keys in every table can help you as a database schema grows. If you need to add other tables in the future, you won't have to spend as much time & effort figuring out how to join them: you already have all the keys you need to easily create a join table (for instance, if you later add users to the mix, users can be the author of either a question or an answer OR both; this could get a little hairy if you attempt to associate the SAME value to both the question & answers tables independently. In fact, it won't work)
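Applied to the questions and answers tables, the first three rules can be sketched as follows. This is a minimal illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL. The ON DELETE CASCADE clause is an added assumption, chosen to model the requirement that answers only exist as long as their question does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

conn.executescript("""
CREATE TABLE questions (
    id    INTEGER PRIMARY KEY,  -- surrogate key
    title TEXT NOT NULL
);

CREATE TABLE answers (
    id          INTEGER PRIMARY KEY,  -- the answers table's own surrogate key
    question_id INTEGER NOT NULL
                REFERENCES questions (id) ON DELETE CASCADE,  -- assumption
    title       TEXT NOT NULL
);

INSERT INTO questions (id, title) VALUES (1, 'quest 1 ?'), (2, 'quest 2 ?');
INSERT INTO answers (question_id, title) VALUES (1, '5'), (1, '5'), (2, 'true');
""")

# Two identical answers to question 1 are still distinguishable by their own id.
duplicate_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM answers WHERE question_id = 1 AND title = '5' ORDER BY id"
    )
]

# Removing a question removes its answers along with it.
conn.execute("DELETE FROM questions WHERE id = 1")
remaining_answers = conn.execute("SELECT COUNT(*) FROM answers").fetchone()[0]
print(duplicate_ids, remaining_answers)
```

With question_id alone as the primary key of answers, the second answer to question 1 could not have been stored at all.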

Database modeling for a weak entity

I have 2 tables in my database orders and orderHistory.
----------------- -----------------------
| orders | | orderHistory |
----------------- -----------------------
| orderID (PK) | | historyLineID (PK) |
| orderDate | | status |
| price | | quantity |
----------------- -----------------------
Now an order can have multiple history lines. However, a history line can't exist on its own. I heard this is called a weak entity and therefore the PK from orders must be part of the PK of table orderHistory.
Questions
Is this really a correct weak entity relationship? Are there other ways to identify them?
Should I add the PK of table order to table orderHistory and make it a composite primary key?
In case I decide to add a new record to orderHistory, how will I add a new composite key? (orderID is available from table orders, but historyLineID should be auto incremented.)
What if I decide to model this as a normal One-To-Many relationship where orderID is added as a foreign key only instead? what are the cons of doing so?
Will ignoring Weak entities at all cause any problems later in a design provided all tables are in 3rd normal form?
Note
Both orderID & historyLineID are surrogate keys.
Thanks in advance.
An entity is not weak because it can't exist independently, but because it can't be identified independently. Therefore, a relationship that "leads" to a weak entity is called "identifying" relationship. In practice, this means that the parent's primary key is migrated into (usually proper) subset of child's PK (the term "weak entity" is usually defined in relation to primary keys, though it could in theory apply to any key).
It is perfectly legit to have an entity that can't exist independently, but can be identified independently - in other words, one that is in a non-identifying relationship through a non-NULL foreign key.
You have to ask: can historyLineID be unique alone, or in combination with orderID? I suspect the latter is the case, which would make it a weak entity.
Is this really a correct weak entity relationship?
What you have shown us isn't a weak entity - parent's PK is not migrated into the child's PK.
Are there other ways to identify them?
You have essentially two options:
orderHistory has a composite PK: {orderID, historyLineID}, where orderID is FK. BTW, this PK could be considered "natural".
orderHistory has a surrogate PK: {orderHistoryID}, while orderID is outside of the PK. You'd still need to have an alternate key {orderID, historyLineID} though.
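The two options can be sketched side by side as follows. This is a minimal illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the table names orderHistory_composite and orderHistory_surrogate, the helper add_history_line and the sample rows are hypothetical. The helper also shows one possible way to number history lines within an order, since an auto-increment column cannot restart per parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

conn.executescript("""
CREATE TABLE orders (
    orderID   INTEGER PRIMARY KEY,
    orderDate TEXT NOT NULL,
    price     NUMERIC NOT NULL
);

-- Option 1: identifying relationship, the parent's key is part of the child's PK.
CREATE TABLE orderHistory_composite (
    orderID       INTEGER NOT NULL REFERENCES orders (orderID),
    historyLineID INTEGER NOT NULL,
    status        TEXT NOT NULL,
    quantity      INTEGER NOT NULL,
    PRIMARY KEY (orderID, historyLineID)
);

-- Option 2: surrogate PK, with the composite kept as an alternate (UNIQUE) key.
CREATE TABLE orderHistory_surrogate (
    orderHistoryID INTEGER PRIMARY KEY,
    orderID        INTEGER NOT NULL REFERENCES orders (orderID),
    historyLineID  INTEGER NOT NULL,
    status         TEXT NOT NULL,
    quantity       INTEGER NOT NULL,
    UNIQUE (orderID, historyLineID)
);

INSERT INTO orders (orderID, orderDate, price)
    VALUES (1, '2020-01-01', 10), (2, '2020-01-02', 20);
""")


def add_history_line(order_id, status, quantity):
    """Insert a history line numbered 1, 2, 3, ... within its own order."""
    next_line = conn.execute(
        "SELECT COALESCE(MAX(historyLineID), 0) + 1 "
        "FROM orderHistory_composite WHERE orderID = ?",
        (order_id,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO orderHistory_composite (orderID, historyLineID, status, quantity) "
        "VALUES (?, ?, ?, ?)",
        (order_id, next_line, status, quantity),
    )
    return next_line


line_numbers = [
    add_history_line(1, "placed", 2),
    add_history_line(1, "shipped", 2),
    add_history_line(2, "placed", 5),  # numbering restarts for the second order
]

# Option 2 still rejects a repeated (orderID, historyLineID) via the alternate key.
insert_sql = (
    "INSERT INTO orderHistory_surrogate (orderID, historyLineID, status, quantity) "
    "VALUES (1, 1, 'placed', 2)"
)
conn.execute(insert_sql)
try:
    conn.execute(insert_sql)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(line_numbers, duplicate_rejected)
```

The MAX + 1 numbering is only safe here because a single connection does all the writing; with concurrent writers it would need a transaction with suitable locking or a retry on a key violation.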
Should I add the PK of table order to table orderHistory and make it a composite primary key?
Yes, this is the first option described above. Unless you have child relationships on orderHistory itself, this is also the best solution. If orderHistory does have children, then this may or may not be the best solution, depending on several factors.
What if I decide to model this as a normal One-To-Many relationship where orderID is added as a foreign key instead? what are the cons of doing so?
This is not either-or. A field can be both FK and a part of a (primary or alternate) key, as shown above.
Will ignoring Weak entities at all cause any problems later in a design provided all tables are in 3rd normal form?
You won't be able to reach 3NF unless you specify your keys correctly, and you won't be able to do that without considering which entity can be identified independently and which can't.
It is a weak entity relationship because of the reliance, but it is essentially an instance of indecisiveness. An order may have one to many history lines, but each history line must have an orderID, correct?
It sounds like an optional-mandatory relationship.
So your orderId has "optional" attributes in orderHistory...
2. You can partly solve the problem by making the primary key a composition of orderID and historyLineID
3. You will have to do a involuted relationship on the orderID table. so you will have to join back on the order.orderID and then create the new historyLineID, otherwise you cannot create on something that doesn't exist yet.
4. This is the way it should be. It is much easier to understand this way for future people working on the script, and probably yourself. Use the foreign key to create an orderID (parent) with multiple historyLineID's (children), because the order can have multiple order lines, this method would probably be the best.