database design two tables vs join table, which is better - mysql

I have two tables matches and tournaments with below structure,
MATCH
MATCH_ID
PLAYER_ID_1
PLAYER_ID_2
RESULT
TOURNAMENT_ID
and
TOURNAMENT
TOURNAMENT_ID
NAME
OTHER_DETAILS
with one tournament will have multiple matches
and a match may or may not have tournament id
use Cases:
retrieve all matches
retrieve all matches by tournaments
Is it good to have tournament id in match table? Or should I create a separate joining table for tournament and match mapping? Which will have good performance when the volume increases?

TOURNAMENT_ID has a 1:M relationship to MATCH. It seems to be a straightforward foreign key. The standard way of implementing foreign keys - even optional foreign keys - is a column on the child table with a foreign key constraint. This would support both your use cases.
A separate table would normally be a head scratcher. I say "normally" because there are schools of thought which abominate NULL columns in databases; either for practical reasons - NULLs can do weird things to our code and need wrangling - and academic reasons - NULL is contrary to Relational Algebra. So, if you have a data model which forbids the use of nulls you will need a TOURNAMENT_MATCH table to hold Matches which are part of a Tournament. It also would be likely to perform slightly worse than a foreign key column on MATCH, but unless you have a vast amount of data you won't notice the difference.
There is a use case for join tables (also known as junction or intersection tables) and that is implementing many-to-many relationships. Suppose we add a third table to the mix, PLAYER. A Player can participate in many Tournaments and a Tournament has many Players. Classic M:N relationship. So we can resolve it with a join table REGISTERED_PLAYER. which as a compound key of (TOURNAMENT_ID,PLAYER_ID) and the appropriate foreign keys to TOURNAMENT and PLAYER.
For the sake of completeness I will mention Link tables from Data Vault modelling. This is an interesting modelling technique for Data Warehouses, where - gross simplification alert - tables are defined as Hubs (business and technical keys) and Satellites (immutable attribute records). This approach allows for the capture of data changes over time. Foreign key relationships between Hubs are implemented through Link tables, to support changing relationships over time.
There are several benefits to Data Vault for wrangling large amounts of data in a time-sensitive fashion but an easy-to-understand physical data model isn't one of them. Anyway, find out more.

The simple rule: for one-to-many mapping always prefer a foreign key association to a join table association.
It is hard to control a join table using a standard #OneToMany Hibernate mapping — you can't just delete rows from a join table, or add an additional row. You will need to use list on the Tournament side to do things like that. Another option is to use an additional entity for a join table.
Note: Match can has a tournaments list too, but looks like Tournament is the owner of the association.

A few opinions have been offered in other answers, here is mine.
You do NOT want a separate join table, you would only need that if a Match can be in multiple Tournaments. In your example, just use a foreign key.
The only other suggestion is that if the Match is not part of a Tournament then it is not actually "unknown" which is the meaning of NULL, it is actually something else like "Individual Match". So consider adding a row to your Tournament table, maybe using a known key like 0 or -1, and using that for matches that are not part of a tournament.

Related

Normalize two tables with same primary key to 3NF

I have two tables currently with the same primary key, can I have these two tables with the same primary key?
Also are all the tables in 3rd normal form
Ticket:
-------------------
Ticket_id* PK
Flight_name* FK
Names*
Price
Tax
Number_bags
Travel class:
-------------------
Ticket id * PK
Customer_5star
Customer_normal
Customer_2star
Airmiles
Lounge_discount
ticket_economy
ticket_business
ticket_first
food allowance
drink allowance
the rest of the tables in the database are below
Passengers:
Names* PK
Credit_card_number
Credit_card_issue
Ticket_id *
Address
Flight:
Flight_name* PK
Flight_date
Source_airport_id* FK
Dest_airport_id* FK
Source
Destination
Plane_id*
Airport:
Source_airport_id* PK
Dest_airport_id* PK
Source_airport_country
Dest_airport_country
Pilot:
Pilot_name* PK
Plane id* FK
Pilot_grade
Month
Hours flown
Rate
Plane:
Plane_id* PK
Pilot_name* FK
This is not meant as an answer but it became too long for a comment...
Not to sound harsh, but your model has some serious flaws and you should probably take it back to the drawing board.
Consider what would happen if a Passenger buys a second Ticket for instance. The Passenger table should not hold any reference to tickets. Maybe a passenger can have more than one credit card though? Shouldn't Credit Cards be in their own table? The same applies to Addresses.
Why does the Airport table hold information that really is about destinations (or paths/trips)? You already record trip information in the Flights table. It seems to me that the Airport table should hold information pertaining to a particular airport (like name, location?, IATA code et cetera).
Can a Pilot just be associated with one single Plane? Doesn't sound very likely. The pilot table should not hold information about planes.
And the Planes table should not hold information on pilots as a plane surely can be connected to more than one pilot.
And so on... there are most likely other issues too, but these pointers should give you something to think about.
The only tables that sort of looks ok to me are Ticket and Flight.
Re same primary key:
Yes there can be multiple tables with the same primary key. Both in principle and in good practice. We declare a primary or other unique column set to say that those columns (and supersets of them) are unique in a table. When that is the case, declare such column sets. This happens all the time.
Eg: A typical reasonable case is "subtyping"/"subtables", where entities of a kind identified by a candidate key of one table are always or sometimes also of the kind identifed by the same values in another table. (If always then the one table's candidate key values are also in the other table's. And so we would declare a foreign key from the one to the other. We would say the one table's kind of entity is a subtype of the other's.) On the other hand sometimes one table is used with attributes of both kinds and attributes inapplicable to one kind are not used. (Ie via NULL or a tag indicating kind.)
Whether you should have cases of the same primary key depends on other criteria for good design as applied to your particular situation. You need to learn design including normalization.
Eg: All keys simple and 3NF implies 5NF, so if your two tables have the same set of values as only & simple primary key in every state and they are both in 3NF then their join contains exactly the same information as they do separately. Still, maybe you would keep them separate for clarity of design, for likelihood of change or for performance based on usage. You didn't give that information.
Re normal forms:
Normal forms apply to tables. The highest normal form of a table is a property independent of any other table. (Athough you might choose that form based on what forms & tables are alternatives.)
In order to normalize or determine a table's highest normal form one needs to know (in general) all the functional dependencies in it. (For normal forms above BCNF, also join dependencies.) You didn't give them. They are determined by what the meaning of the table is (ie how to determine what rows go in it in any given situation) and the possible situtations that can arise. You didn't give them. Your expectation that we could tell you about the normal forms your tables are in without giving such information suggests that you do not understand normalization and need to educate yourself about it.
Proper design also needs this information and in general all valid states that can arise from situations that arise. Ie constraints among given tables. You didn't give them.
Having two tables with the same key goes against the idea of removing redundancy in normalization.
Excluding that, are these tables in 1NF and 2NF?
Judging by the Names field, I'd suggest that table1 is not. If multiple names can belong to one ticket, then you need a new table, most likely with a composite key of ticket_id,name.

In ERD modeling Does A Relation Map To A Database Table

The following is an Entity Relationship of a a Baseball League.
I'm having a bit of confusion understanding Relations and Attributes of Relations.
An description of the diagram follows:
According to the description, Participates is a Relation and Performance is an Attribute (complex) of Participates.
Questions:
How do Participates Map to actual tables in a database?
Would there be a Participates table with the fields that define Performance?
{Hitting(AtBat#, Inning#, HitType, Runs, RunsBattedIn,
StolenBases)}, {Pitching(Inning#, Hits, Runs, EarnedRuns, StrikeOuts, Walks, Outs, Balks, WildPitches)}, {Defense(Inning{FieldingRecord(Position,
PutOuts, Assists, Errors)})}
Similarly are Plays_For, Away_Team and Home_Team also tables.
As you create tables in a database (say MySql) how are Relations differentiated from Entities / Objects like Player, Team and Game.
Thanks for your help.
Question 1: Participates would be an actual table with foreign key columns for Player and Game as well as the column(s) for Performance. All M-N relationships need to be modelled in a separate table.
Question 2: To keep it as a semi-decent relational DB you would have to separate all the info into separate columns so that each column would only hold one singular data. If you didn't separate the data you would break the first normal form and would probably run into problems later in the design.
Question 3: As these three are 1-N you could also implement them with columns on the N-side. In the Game table for example you could have two foreign keys to Team table as well as all the data about the relationships in columns. For claritys sake you could make those relationships as separate tables also. As a sidenote: are you sure Player-Team is a 1-N-relationship so that a if a player changes teams the history-info about the StartDate and EndDate of the previous team is immediately lost?
Question 4: They are all treated absolutely the same - no differentiation.

SQL Many-To-Many relationships

Question
Is there a way to have a many-to-many relationship among 3 tables without the use of automatic incrementers (usually ID), or are ID's required for this?
Why I ask
I have 3 relative tables. Since one-to-one relationships seem to can't happen directly, I made a 4th to do one-to-many relationships to the other 3 tables. However, since there's still a primary key to each table, a value can only be used once in a table, which I don't want to happen.
What I have
Connectors has multiple Pockets which have multiple pins.
The 4th Table is ConnectorFullInfo
There is no requirement that a table have an "automatic incrementer" as a primary key.
But, a familiar pattern is to add a surrogate ID column as primary key on entity tables. The "ideal" primary key will be "anonymous" (carry no meaningful information), "unique" (no duplicate values), "simple" (single column, short simple native datatype), ...
There are a couple of schools of thought on whether it's a good idea to introduce a surrogate key. I will also note that there are those who have been later burned by the decision to use a natural key rather than a surrogate key. And there are those that haven't yet been burned by that decision.
In the case of "association" tables (tables introduced to resolve many-to-many relationships), the combination of the foreign keys can be used as the primary key. I often do this.
BUT, if the association table is itself turns out to be entity table, with it's own attributes, I will introduce a surrogate ID column. As an example, the association between person and club, a person can be a member of multiple clubs, and a club can have multiple members...
club +--< membership >--+ person
When we start adding attributes to membership (such as status, date_joined, office_held, etc... at that point membership isn't just an association table; it's turning into an entity. When I suspect that an association is actually an entity, so I'll add the surrogate ID column.
The other case where I will add a surrogate ID column to an association table is when we want to allow "duplicates", where we want to allow multiple associations. In that case, I will also introduce a surrogate ID column.
Yes you can but,
It is customary to represent a table row by a unique identifier which is the number, its becomes more efficient.

One-to-One Relation or Use the Same Table?

I have the following tables created:
Animes(id,title,date), Comics(id,title,date), TVSeries(id,title,season,episode,date)
Some of them have already foreign keys (for one-to-many or many-to-many relations) of directors, genres, articles and so on.
Now i would like to create two more tables Reviews(id,rating,date) and Posts(id,thumbid,articleid,reviewid).
A review is about one Anime and/or Comic TVSerie and vise-versa but properties of a review may be in more than one table. Its the same about a posts.
Is this a typical example of one-to-one relation in separate table or is it more efficient to add more properties to the existing tables?
So more tables and relations or less tables more columns?
Thank you and i hope my question isnt that stupid but im a bit confused.
In my view, It is better to avoid foreign key relationship for one-to-one relationship. It is best suitable for one - many relationships.
I'm not exactly sure what your requirements are, but the choices are as follows:
Have Reviews have 2 columns, either being a foreign key to the applicable table, can be NULL. This is really for when a single review can be about both.
Have a ReviewsComics and ReviewsAnime table. You'd then have all the fields from Reviews in each table (and no Reviews table).
An alternative (2) is to use them in conjunction with a Reviews table, then those 2 tables only has 2 fields which are foreign keys to Reviews and Comics/Anime respectively (thus no direct link between Reviews and Comics/Anime).
Have a base table to which Anime and Comics are linked to 1-to-1 and have reviews link to that table instead.
(Edit) If all the fields are all going to be the same (or similar) for Anime/Comics, you can merge them into a single table and add a type field, indicating Anime/Comics, then the problem goes away. This is similar to the base table option.
EDIT: The 2 reviews tables will probably give the best performance (unless you want to select all reviews for either, often), but with proper indices the performance shouldn't be an issue with any of the above.

Table relationship for subtypes

I have a parent table called 'Website' which holds records about websites. I have a child table called 'SupportSystem' which holds records about different types of support systems such as email, phone, ticketing, live chat etc. There is an intermediate table 'Website_SupportSystem' which joins these tables in a many-many relationship.
If the SupportSystem for a Website is ticketing, I also want to record the software platform .e.g. WHMCS. My instinct is to create a new lookup table called SupportPlatform and relate this to the existing join table 'Website_SupportSystem' and store the data there. However, then there is no relationship between the SupportSystem and SupportPlatform. If I relate those then I end up with a circular reference.
Can you see what I am doing wrong? What would be the best way to model this data?
You could use super-type/subtype relationship, as shown in the diagram.
SupportSystem table contains columns common to all support systems.
Email, Ticketing, Phone and LiveChat tables have columns specific to each one.
Primary key in the subtype table is also a foreign key to the super-type table.
I would add a new column 'SupportPlatformId" to the "SupportSystem" table which lookup to the table "SupportPlatform", because "SupportSystem" to "SupportPlatform" is probably one-to-one or many-to-one.
Hence: Website -> (via Website_SupportSystem) SupportSystem -> SupportPlatform
Data about a Support Platform should be stored in the SupportPlatform table.
You can add a third foreign key, namely SupportPlatfromID, to the Website_SupportSystem table. If you do this, your intermediate table now records a ternary relationship, of the kind many-to-many-to-many. If this reflects the reality, then so be it.
If you want to relate SupportSystems and SupportPlatforms, just use the intermediate table as an intermediate table in the joins. You can even do a three way join to join all three entities via the intermediate table.
An alternative would be to create another intermediate table, SupportPlatform_SupportSystem, with a pair of foreign keys, namely SupportSystemID and SupportPlatformID. If this reflects the reality better, so be it. Then you can join it all together with a five table join, if needs be.