First of all, I have read several similar questions with "technical" answers that look like C & P. What I need is a clear example. The normalization is 3NF.
In this project, the administrative panel lets you create cities and zones, and each zone has to belong to a city. You also create hotels and assign them to the corresponding zone, and finally create aliases for each hotel, since people know the same hotel under different names. The tables hotels and hotels_alias are used to fill an autocomplete input.
The price is calculated according to the service (standard, private or VIP), the zone, the number of passengers and the season. I have not yet created the logic or tables to calculate the price per passenger and season, which is why they are not in the diagram below.
A good explanation I found is
What's the difference between identifying and non-identifying relationships?
However I have some doubts.
Example 1
hotels_alias can not exist without the table hotels that in turn can not exist without the zones table and this in turn does not exist without cities. Since a city is divided into many zones, hotels belong to these zones, zones that are part of a city, and hotel aliases belong to a hotel and can not exist if there is no hotel.
So far it is clear that cities are a strong or parent entity and zones, hotels and hotels_alias are child entities.
In the EER diagram you can see that it has an identifying relationship. The first question is: is it correct that, despite being child entities, they have their own ID, and that this ID is PK, NN and AI? In some examples, these child entities do not have their own ID, so their PK is formed by two FKs from the related tables, as in an N:N (zones_has_servicees) relationship.
If in fact child tables do not have to have their own ID because they must be able to identify themselves by their parent table, then how would you be able to update or delete an area, or a hotel or a hotel alias?
DELETE FROM zones WHERE name = 'name'
Is this correct? Should I create an index on the name column? What advantages, if any, would there be in using the name column instead of its own ID? Is it okay for a child table to have its own ID and create a composite PK from this ID and the ID of its parent table? Does this type of relationship serve any function, or is it only there so that engines like InnoDB can perform an ON DELETE CASCADE action?
What happens if I have two zones with the same name? For example, Hotel Zone, which both Cancun and Tulum have. Would the DELETE then be:
DELETE FROM zones WHERE name = 'name' AND cities_id = ID
Given what a parent and a child entity are, why does WordPress create relationships like the one below, where wp_postmeta and wp_posts are joined by a weak relationship? Presumably a wp_postmeta cannot exist without a wp_posts, right? It does the same with comments and users.
WP EER
First, your example 1 is not an EER diagram (rather call it a table diagram). To be called an ER or EER diagram, you have to use a notation (like Chen's notation) that represents entity-relationship model concepts and distinguishes entity sets from relationships. In the ER model, both entity relations and relationship relations are implemented using tables, and neither maps to FK constraints, which are just an integrity mechanism. Many people confuse the ER model with the old network data model.
Second, identifying relationships are used in conjunction with weak entity sets, in which the regular (parent) entity set's primary key forms part of the weak (child) entity set's primary key. When an entity set is identified by its own attributes, it's a regular entity set.
To delete a row from a weak entity relation, you would usually identify it by its primary key. Weak entity sets generally have a composite primary key, consisting of its parent's key and an additional weak key. The weak key need only be unique in conjunction with the parent key. For example, if zones were identified by cities_id and name, you could delete a zone by specifying those attributes:
DELETE FROM zones WHERE cities_id = 1 AND name = 'name';
The composite primary key should automatically be indexed and uniquely constrained by your DBMS if you declared it as the PK. The advantage of weak entity sets is that, in some cases, this method of identification is more natural than introducing a meaningless surrogate key.
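As a rough illustration of the weak entity set described above, here is a minimal sketch using Python's built-in sqlite3 module. It assumes zones really are identified by (cities_id, name); the column definitions and sample rows are made up for the example and are not taken from the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE cities (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Weak entity: the parent key (cities_id) plus the weak key (name) form the PK.
CREATE TABLE zones (
    cities_id INTEGER NOT NULL REFERENCES cities(id),
    name      TEXT    NOT NULL,
    PRIMARY KEY (cities_id, name)
);
INSERT INTO cities (id, name) VALUES (1, 'Cancun'), (2, 'Tulum');
-- The same zone name may appear once per city.
INSERT INTO zones (cities_id, name) VALUES (1, 'Hotel Zone'), (2, 'Hotel Zone');
""")

# Delete exactly one zone by giving its full composite key.
deleted = conn.execute(
    "DELETE FROM zones WHERE cities_id = ? AND name = ?", (1, "Hotel Zone")
).rowcount
remaining = conn.execute("SELECT cities_id, name FROM zones").fetchall()
```

Only the Cancun row is removed; the Tulum zone with the same name is untouched because the city is part of its identity.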
It's not a good idea to have a table with a composite primary key consisting of a unique surrogate ID together with another attribute like its parent's ID. Besides the risk of unintended duplicate values if uniqueness isn't correctly enforced, it unnecessarily over-complicates what would otherwise be a straightforward table with a simple surrogate PK.
Your WordPress diagram doesn't illustrate weak entity sets or identifying relationships (and it's not an EER diagram, as mentioned before). The tables you mentioned each have their own surrogate keys. Note that there's no such thing as a weak relationship in the ER model.
I have two tables, matches and tournaments, with the structure below:
MATCH
MATCH_ID
PLAYER_ID_1
PLAYER_ID_2
RESULT
TOURNAMENT_ID
and
TOURNAMENT
TOURNAMENT_ID
NAME
OTHER_DETAILS
One tournament will have multiple matches,
and a match may or may not have a tournament id.
Use cases:
retrieve all matches
retrieve all matches by tournaments
Is it good to have tournament id in match table? Or should I create a separate joining table for tournament and match mapping? Which will have good performance when the volume increases?
TOURNAMENT_ID has a 1:M relationship to MATCH. It seems to be a straightforward foreign key. The standard way of implementing foreign keys - even optional foreign keys - is a column on the child table with a foreign key constraint. This would support both your use cases.
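A minimal sketch of that layout, using Python's sqlite3 module. The table and column names follow the question; the data types, the index name and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE tournaments (
    tournament_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE matches (
    match_id      INTEGER PRIMARY KEY,
    player_id_1   INTEGER NOT NULL,
    player_id_2   INTEGER NOT NULL,
    result        TEXT,
    -- Optional foreign key: NULL means the match belongs to no tournament.
    tournament_id INTEGER REFERENCES tournaments(tournament_id)
);
-- Index (hypothetical name) to keep per-tournament lookups cheap as volume grows.
CREATE INDEX idx_matches_tournament ON matches(tournament_id);
INSERT INTO tournaments VALUES (1, 'Spring Open');
INSERT INTO matches VALUES
    (10, 1, 2, '2-0', 1),
    (11, 3, 4, '1-2', 1),
    (12, 1, 3, NULL, NULL);
""")

# Use case 1: retrieve all matches.
all_matches = conn.execute(
    "SELECT match_id FROM matches ORDER BY match_id"
).fetchall()
# Use case 2: retrieve all matches of one tournament.
by_tournament = conn.execute(
    "SELECT match_id FROM matches WHERE tournament_id = ? ORDER BY match_id", (1,)
).fetchall()
```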
A separate table would normally be a head scratcher. I say "normally" because there are schools of thought which abominate NULL columns in databases, either for practical reasons (NULLs can do weird things to our code and need wrangling) or for academic reasons (NULL is contrary to Relational Algebra). So, if you have a data model which forbids the use of nulls you will need a TOURNAMENT_MATCH table to hold Matches which are part of a Tournament. It also would be likely to perform slightly worse than a foreign key column on MATCH, but unless you have a vast amount of data you won't notice the difference.
There is a use case for join tables (also known as junction or intersection tables) and that is implementing many-to-many relationships. Suppose we add a third table to the mix, PLAYER. A Player can participate in many Tournaments and a Tournament has many Players. Classic M:N relationship. So we can resolve it with a join table REGISTERED_PLAYER, which has a compound key of (TOURNAMENT_ID, PLAYER_ID) and the appropriate foreign keys to TOURNAMENT and PLAYER.
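For illustration, the REGISTERED_PLAYER join table might look like this in a small sqlite3 sketch. The three table names come from the answer; the remaining columns and the sample data are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE tournament (tournament_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE player     (player_id     INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Join table: the compound key is the pair of foreign keys.
CREATE TABLE registered_player (
    tournament_id INTEGER NOT NULL REFERENCES tournament(tournament_id),
    player_id     INTEGER NOT NULL REFERENCES player(player_id),
    PRIMARY KEY (tournament_id, player_id)
);
INSERT INTO tournament VALUES (1, 'Spring Open'), (2, 'Autumn Cup');
INSERT INTO player VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO registered_player VALUES (1, 1), (1, 2), (2, 1);
""")

# Registering the same player twice for the same tournament violates the compound key.
try:
    conn.execute("INSERT INTO registered_player VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

registrations = conn.execute("SELECT COUNT(*) FROM registered_player").fetchone()[0]
```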
For the sake of completeness I will mention Link tables from Data Vault modelling. This is an interesting modelling technique for Data Warehouses, where - gross simplification alert - tables are defined as Hubs (business and technical keys) and Satellites (immutable attribute records). This approach allows for the capture of data changes over time. Foreign key relationships between Hubs are implemented through Link tables, to support changing relationships over time.
There are several benefits to Data Vault for wrangling large amounts of data in a time-sensitive fashion but an easy-to-understand physical data model isn't one of them. Anyway, find out more.
The simple rule: for one-to-many mapping always prefer a foreign key association to a join table association.
It is hard to control a join table using a standard @OneToMany Hibernate mapping: you can't just delete rows from a join table, or add an additional row. You will need to use a list on the Tournament side to do things like that. Another option is to use an additional entity for the join table.
Note: Match can have a tournaments list too, but it looks like Tournament is the owner of the association.
A few opinions have been offered in other answers, here is mine.
You do NOT want a separate join table, you would only need that if a Match can be in multiple Tournaments. In your example, just use a foreign key.
The only other suggestion is that if the Match is not part of a Tournament then it is not actually "unknown" which is the meaning of NULL, it is actually something else like "Individual Match". So consider adding a row to your Tournament table, maybe using a known key like 0 or -1, and using that for matches that are not part of a tournament.
I am trying as an exercise for an exam to transfer a database from the ER model to a relational database.
However, I am very unsure whether my solution makes sense. In particular, the two relationships between location and has are causing me great problems. I thought I could add one ZipCode as a regular primary key to the table has and a second ZipCode as a foreign key. I would be very grateful if someone could help me with this.
My Solution so far:
If you are following Chen ER design with this Chen ER diagram then you need a table for every entity type box and every relationship (association) type diamond and a FK (foreign key) for every participation/role line for a relationship type.
(It is a bad idea to call lines/FKs "relationships" or "associations" in a Chen context because diamonds/tables represent relationship types and lines/FKs represent participations.)
So your Ship tourID would be dropped in favour of relationship/table takes with lines/FKs to Ship & Tour. And you would have two FKs in the has table to Location. It doesn't matter that you need different column names in the relationship table than in the participant table. A FK just says the values in some table & column list appear in some other table & column list. The diagram says the names are start & target; use them.
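To make the two-FK idea concrete, here is a hedged sqlite3 sketch. The diagram is not reproduced here, so everything except the names has, start, target, Location and ZipCode is an assumption: in particular the tour table stands in for whatever third entity participates in has, and the key of has is assumed to be the combination of its participants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE location (zipcode TEXT PRIMARY KEY);
CREATE TABLE tour (tour_id INTEGER PRIMARY KEY);  -- hypothetical third participant
-- Relationship table: one FK per participation line.
-- start and target are two different columns referencing the same Location key.
CREATE TABLE has (
    tour_id INTEGER NOT NULL REFERENCES tour(tour_id),
    start   TEXT    NOT NULL REFERENCES location(zipcode),
    target  TEXT    NOT NULL REFERENCES location(zipcode),
    PRIMARY KEY (tour_id, start, target)
);
INSERT INTO location VALUES ('10115'), ('20095');
INSERT INTO tour VALUES (1);
INSERT INTO has VALUES (1, '10115', '20095');
""")

# A target zip code that is not a known Location is rejected by the FK.
try:
    conn.execute("INSERT INTO has VALUES (1, '10115', '99999')")
    unknown_target_rejected = False
except sqlite3.IntegrityError:
    unknown_target_rejected = True
```

The column names in has differ from the referenced column name zipcode, which is fine: each FK only states that the values of one column list must appear in another.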
Don't use a flaccid uninformative name like has. If you picked a better name and/or explained when a triplet of entities satisfied the has relationship then we could know what reasonable designs would be. Eg you may not be using cardinalities correctly. The Chen way is, a number or range tells for some instance of the entity type how many relationship instances it can participate in. Another way is, a number or range tells you for some combination of entity instances of the other participating entity types how many instances of the line's entity type can participate with it. If the latter has a zero that means a relationship instance can have a NULL. But that can't arise in a Chen design; participating entity instance combinations identify relationship instances and form PKs (primary keys).
However, a Chen design can't express all relational designs. And we can represent the same data as a Chen ER schema by rearranging tables. Eg dropping binary relationship tables that are not many:many and putting FKs (sometimes nullable) into entity tables instead, just as you did with takes, Ship & Tour. Some methods have non-Chen diagrams expressing such designs directly. Others allow it in the move from Chen diagram to schema. You have to ask your teachers whether they care just what variations from the Chen style of ER diagrams and corresponding schemas you are permitted to make.
(It is this dropping in non-Chen methods of explicit 1:many relationships/associations and their representation by FKs that leads to FKs being incorrectly (but commonly) called "relationships" or "associations".)
I currently have two tables with the same primary key. Can two tables have the same primary key?
Also, are all the tables in third normal form?
Ticket:
-------------------
Ticket_id* PK
Flight_name* FK
Names*
Price
Tax
Number_bags
Travel class:
-------------------
Ticket id * PK
Customer_5star
Customer_normal
Customer_2star
Airmiles
Lounge_discount
ticket_economy
ticket_business
ticket_first
food allowance
drink allowance
the rest of the tables in the database are below
Passengers:
Names* PK
Credit_card_number
Credit_card_issue
Ticket_id *
Address
Flight:
Flight_name* PK
Flight_date
Source_airport_id* FK
Dest_airport_id* FK
Source
Destination
Plane_id*
Airport:
Source_airport_id* PK
Dest_airport_id* PK
Source_airport_country
Dest_airport_country
Pilot:
Pilot_name* PK
Plane id* FK
Pilot_grade
Month
Hours flown
Rate
Plane:
Plane_id* PK
Pilot_name* FK
This is not meant as an answer but it became too long for a comment...
Not to sound harsh, but your model has some serious flaws and you should probably take it back to the drawing board.
Consider what would happen if a Passenger buys a second Ticket for instance. The Passenger table should not hold any reference to tickets. Maybe a passenger can have more than one credit card though? Shouldn't Credit Cards be in their own table? The same applies to Addresses.
Why does the Airport table hold information that really is about destinations (or paths/trips)? You already record trip information in the Flights table. It seems to me that the Airport table should hold information pertaining to a particular airport (like name, location?, IATA code et cetera).
Can a Pilot just be associated with one single Plane? Doesn't sound very likely. The pilot table should not hold information about planes.
And the Planes table should not hold information on pilots as a plane surely can be connected to more than one pilot.
And so on... there are most likely other issues too, but these pointers should give you something to think about.
The only tables that sort of look ok to me are Ticket and Flight.
Re same primary key:
Yes there can be multiple tables with the same primary key. Both in principle and in good practice. We declare a primary or other unique column set to say that those columns (and supersets of them) are unique in a table. When that is the case, declare such column sets. This happens all the time.
Eg: A typical reasonable case is "subtyping"/"subtables", where entities of a kind identified by a candidate key of one table are always or sometimes also of the kind identified by the same values in another table. (If always then the one table's candidate key values are also in the other table's. And so we would declare a foreign key from the one to the other. We would say the one table's kind of entity is a subtype of the other's.) On the other hand sometimes one table is used with attributes of both kinds and attributes inapplicable to one kind are not used. (Ie via NULL or a tag indicating kind.)
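A minimal sqlite3 sketch of the subtype pattern, loosely based on the asker's Ticket and Travel class tables. It assumes every travel_class row describes an existing ticket; the columns are trimmed down and the data types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE ticket (
    ticket_id INTEGER PRIMARY KEY,
    price     REAL
);
-- Same primary key as ticket, and that key is also a foreign key to ticket.
CREATE TABLE travel_class (
    ticket_id INTEGER PRIMARY KEY REFERENCES ticket(ticket_id),
    airmiles  INTEGER
);
INSERT INTO ticket VALUES (1, 199.0);
INSERT INTO travel_class VALUES (1, 500);
""")

# A travel_class row for a ticket that does not exist is rejected.
try:
    conn.execute("INSERT INTO travel_class VALUES (2, 0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Joining on the shared key recombines the two tables' attributes.
joined = conn.execute(
    "SELECT t.ticket_id, t.price, c.airmiles "
    "FROM ticket t JOIN travel_class c ON c.ticket_id = t.ticket_id"
).fetchall()
```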
Whether you should have cases of the same primary key depends on other criteria for good design as applied to your particular situation. You need to learn design including normalization.
Eg: All keys simple and 3NF implies 5NF, so if your two tables have the same set of values as only & simple primary key in every state and they are both in 3NF then their join contains exactly the same information as they do separately. Still, maybe you would keep them separate for clarity of design, for likelihood of change or for performance based on usage. You didn't give that information.
Re normal forms:
Normal forms apply to tables. The highest normal form of a table is a property independent of any other table. (Although you might choose that form based on what forms & tables are alternatives.)
In order to normalize or determine a table's highest normal form one needs to know (in general) all the functional dependencies in it. (For normal forms above BCNF, also join dependencies.) You didn't give them. They are determined by what the meaning of the table is (ie how to determine what rows go in it in any given situation) and the possible situations that can arise. You didn't give them. Your expectation that we could tell you about the normal forms your tables are in without giving such information suggests that you do not understand normalization and need to educate yourself about it.
Proper design also needs this information and in general all valid states that can arise from situations that arise. Ie constraints among given tables. You didn't give them.
Having two tables with the same key goes against the idea of removing redundancy in normalization.
Excluding that, are these tables in 1NF and 2NF?
Judging by the Names field, I'd suggest that table1 is not. If multiple names can belong to one ticket, then you need a new table, most likely with a composite key of ticket_id,name.
Question
Is there a way to have a many-to-many relationship among 3 tables without the use of automatic incrementers (usually ID), or are ID's required for this?
Why I ask
I have 3 related tables. Since one-to-one relationships apparently can't be declared directly, I made a 4th table with one-to-many relationships to the other 3 tables. However, since each table still has a primary key, a value can only be used once in a table, which I don't want.
What I have
Connectors has multiple Pockets which have multiple pins.
The 4th Table is ConnectorFullInfo
There is no requirement that a table have an "automatic incrementer" as a primary key.
But, a familiar pattern is to add a surrogate ID column as primary key on entity tables. The "ideal" primary key will be "anonymous" (carry no meaningful information), "unique" (no duplicate values), "simple" (single column, short simple native datatype), ...
There are a couple of schools of thought on whether it's a good idea to introduce a surrogate key. I will also note that there are those who have been later burned by the decision to use a natural key rather than a surrogate key. And there are those that haven't yet been burned by that decision.
In the case of "association" tables (tables introduced to resolve many-to-many relationships), the combination of the foreign keys can be used as the primary key. I often do this.
BUT, if the association table itself turns out to be an entity table, with its own attributes, I will introduce a surrogate ID column. As an example, take the association between person and club: a person can be a member of multiple clubs, and a club can have multiple members...
club +--< membership >--+ person
When we start adding attributes to membership (such as status, date_joined, office_held, etc.), at that point membership isn't just an association table; it's turning into an entity. When I suspect that an association is actually an entity, I'll add the surrogate ID column.
The other case where I will add a surrogate ID column to an association table is when we want to allow "duplicates", where we want to allow multiple associations. In that case, I will also introduce a surrogate ID column.
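The two variants described above can be compared side by side in a short sqlite3 sketch. The table club_person and all column types are hypothetical; club, person, membership, status and date_joined come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE club   (club_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Variant 1, pure association: the pair of foreign keys is the primary key.
CREATE TABLE club_person (
    club_id   INTEGER NOT NULL REFERENCES club(club_id),
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    PRIMARY KEY (club_id, person_id)
);
-- Variant 2, association treated as an entity: surrogate ID plus own attributes.
-- Without a UNIQUE (club_id, person_id) constraint, repeated associations are allowed.
CREATE TABLE membership (
    membership_id INTEGER PRIMARY KEY,
    club_id       INTEGER NOT NULL REFERENCES club(club_id),
    person_id     INTEGER NOT NULL REFERENCES person(person_id),
    status        TEXT,
    date_joined   TEXT
);
INSERT INTO club VALUES (1, 'Chess');
INSERT INTO person VALUES (1, 'Ada');
INSERT INTO club_person VALUES (1, 1);
INSERT INTO membership (club_id, person_id, status, date_joined) VALUES
    (1, 1, 'lapsed', '2019-01-01'),
    (1, 1, 'active', '2023-06-01');
""")

# Variant 1 refuses a second row for the same (club, person) pair.
try:
    conn.execute("INSERT INTO club_person VALUES (1, 1)")
    pair_duplicate_rejected = False
except sqlite3.IntegrityError:
    pair_duplicate_rejected = True

# Variant 2 keeps both memberships, each with its own surrogate ID.
membership_rows = conn.execute("SELECT COUNT(*) FROM membership").fetchone()[0]
```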
Yes you can but,
It is customary to identify a table row by a unique numeric identifier, which is more efficient.
I've read this question: What's the difference between identifying and non-identifying relationships?
But I'm still not too sure...
What I have is three tables.
Users
Objects
Pictures
A user can own many objects and can also post many pictures per individual object.
My gut feeling tells me this is an identifying relationship, because I'll need the userID in the objects table and I'll need the objectID in the pictures tables...
Or am I wrong? The explanations in the other topic limit themselves to the theoretical explanation of the way the database interprets it after it's already been coded, not how the objects are connected in real life. I'm kinda confused as to how to make the decision of identifying versus non-identifying when thinking about how I'm going to build the database.
Both sound like identifying relationships to me. If you have heard the terms one-to-one or one-to-many, and many-to-many, one-to- relationships are identifying relationships, and many-to-many relationships are non-identifying relationships.
If the child identifies its parent, it is an identifying relationship. In the link you have given, if you have a phone number, you know who it belongs to (it only belongs to one).
If the child does not identify its parent, it is a non-identifying relationship. In the link, it mentions states. Think of a state as a row in a table representing mood. "Happy" doesn't identify a particular person, but many people.
Edit: Other real life examples:
A physical address is a non-identifying relationship, because many people may reside at one address. On the other hand, an email address is (usually considered) an identifying relationship.
A Social Security Number is an identifying relationship, because it only belongs to one person
Comments on Youtube videos are identifying relationships, because they only belong to one video.
An original of a painting only has one owner (identifying), while many people may own reprints of the painting (non-identifying).
I think that an easier way to visualize it is to ask yourself if the child record can exist without the parent. For example, an order line item requires an order header to exist. Thus, an order line item must have the order header identifier as part of its key and hence, this is an example of an identifying relationship.
On the other hand, telephone numbers can exist without being owned by a person, although a person may have several phone numbers. In this case, the person who owns the phone number is a non-key attribute, making this a non-identifying relationship, since the phone numbers can exist irrespective of the owner (hence, the phone number's owner can be null, whereas in the order line item example the order header identifier cannot be null).
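The two cases above could be sketched as follows with sqlite3. The table and column names are invented for the example; the point is only where the parent's key sits in each child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE order_header (order_id INTEGER PRIMARY KEY);
-- Identifying: the parent's key is part of the child's primary key and cannot be NULL.
CREATE TABLE order_line (
    order_id INTEGER NOT NULL REFERENCES order_header(order_id),
    line_no  INTEGER NOT NULL,
    PRIMARY KEY (order_id, line_no)
);
CREATE TABLE person (person_id INTEGER PRIMARY KEY);
-- Non-identifying: the child has its own key; the parent reference is a nullable non-key column.
CREATE TABLE phone_number (
    phone_id  INTEGER PRIMARY KEY,
    number    TEXT NOT NULL,
    person_id INTEGER REFERENCES person(person_id)
);
INSERT INTO order_header VALUES (1);
INSERT INTO order_line VALUES (1, 1);
INSERT INTO phone_number (number, person_id) VALUES ('555-0100', NULL);
""")

# An order line without an order header is rejected.
try:
    conn.execute("INSERT INTO order_line (order_id, line_no) VALUES (NULL, 2)")
    headerless_line_rejected = False
except sqlite3.IntegrityError:
    headerless_line_rejected = True

# A phone number without an owner is stored without complaint.
unowned_phones = conn.execute(
    "SELECT COUNT(*) FROM phone_number WHERE person_id IS NULL"
).fetchone()[0]
```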
NickC Said: one-to- relationships are identifying relationships, and many-to-many relationships are non-identifying relationships
The explanation seems totally wrong to me. You can have:
One-to-One Non-identifying Relationships
One-to-Many Non-identifying Relationships
One-to-One Identifying Relationships
One-to-Many Identifying Relationships
Many-to-Many Identifying Relationships
Imagine you have the following tables: customer, products and feedback. All of them are based on the customer_id which exists in the customer table. By NickC's definition there should not be any Many-to-Many Identifying Relationships; however, in my example you can clearly see that a Feedback can exist only if the relevant Product exists and has been bought by the Customer, so Customer, Products and Feedback should be Identifying.
You can take a look at MySQL Manual, explaining how to add Foreign Keys on MySQL Workbench as well.
Mahdi, your instincts are correct. This is a duplicate question and this up-voted answer is not correct or complete.
Look at the top two answers here:
difference between identifying non-identifying
Identifying vs non-identifying has nothing to do with identity.
Simply ask yourself: can the child record exist without the parent? If the answer is yes, then it is non-identifying.
The core issue is whether the primary key of the child includes the foreign key of the parent. In the non-identifying relationship the child's primary key (PK) cannot include the foreign key (FK).
Ask yourself this question
Can the child record exist without the parent record?
If the child can exist without the parent, then the relationship is non-identifying. (Thank you MontrealDevOne for stating it more clearly)
One-to-one identifying relationship
Social security numbers fit nicely into this category. Let's imagine, for example, that social security numbers cannot exist without a person (perhaps they can in reality, but not in our database). The person_id would be the PK for the person table, which includes columns such as a name and address (let's keep it simple). The social_security_number table would include the ssn column and the person_id column as a foreign key. Since this FK can be used as the PK for the social_security_number table, it is an identifying relationship.
One-to-one non-identifying relationship
At a large office complex you might have an office table that includes the room numbers by floor and building number with a PK, and a separate employee table. The employee table (child) has a FK which is the office_id column from the office table PK. While each employee has only one office and (for this example) every office only has one employee this is a non-identifying relationship since offices can exist without employees, and employees can change offices or work in the field.
One-to-many relationships
One-to-many relationships can be categorized easily by asking the same question.
Many-to-many relationships
Many-to-many relationships are always identifying relationships. This may seem counterintuitive, but bear with me. Take two tables, library and books: each library has many books, and a copy of each book exists in many libraries.
Here's what makes it an identifying relationship:
In order to implement this you need a linking table with two columns which are the primary keys of each table. Call them the library_id column and the ISBN column. This new linking table has no separate primary key, but wait! The foreign keys become a multi-column primary key for the linking table, since duplicate records in the linking table would be meaningless. The links cannot exist without the parents; therefore, this is an identifying relationship. I know, yuck right?
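Here is one way the linking table might look, as a sqlite3 sketch. The name library_books, the column types and the ON DELETE CASCADE clauses are assumptions added for the example; cascading is just one way of making sure a link never outlives its parents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite for FKs and cascades to apply
conn.executescript("""
CREATE TABLE library (library_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE books   (isbn TEXT PRIMARY KEY, title TEXT NOT NULL);
-- Linking table: no separate ID; the two foreign keys together are the primary key.
CREATE TABLE library_books (
    library_id INTEGER NOT NULL REFERENCES library(library_id) ON DELETE CASCADE,
    isbn       TEXT    NOT NULL REFERENCES books(isbn) ON DELETE CASCADE,
    PRIMARY KEY (library_id, isbn)
);
INSERT INTO library VALUES (1, 'Central'), (2, 'Branch');
INSERT INTO books VALUES ('9780000000001', 'Book A'), ('9780000000002', 'Book B');
INSERT INTO library_books VALUES
    (1, '9780000000001'),
    (1, '9780000000002'),
    (2, '9780000000001');
""")

# Removing a parent removes its links too, so no link survives without both parents.
conn.execute("DELETE FROM library WHERE library_id = 1")
links_left = conn.execute("SELECT library_id, isbn FROM library_books").fetchall()
```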
Most of the time the type of relationship does not matter.
All that said, usually you don't have to worry about which you have. Just assign the proper primary and foreign keys to each table and the relationship will discover itself.
EDIT: NicoleC, I read the answer you linked and it does agree with mine. I take his point about SSN, and agree that is a bad example. I'll try to think up another clearer example there. However if we start to use real-world analogies in defining a database relationship the analogies always break down. It matters not, whether an SSN identifies a person, it matters whether you used it as a foreign key.