MySQL Database Layout/Modelling/Design Approach / Relationships - mysql

Scenario: Multiple Types to a single type; one to many.
So for example:
parent multiple type: students table, suppliers table, customers table, hotels table
child single type: banking details
So a student may have multiple banking details, as can a supplier, etc etc.
Layout Option 1: students table (id) + students_banking_details (student_id) table with the appropriate id relationship, repeated per parent type.
Layout Option 2: students table (+others) + banking_details table. banking_details would have a parent_id column for linking and a parent_type field for determining what the parent is (student / supplier / customer etc).
Layout Option 3: students table (+others) + banking_details table. Then I would create another association table per parent type (eg: students_banking_details) for the linking of student_id and banking_details_id.
Layout Option 4: students table (+others) + banking_details table. banking_details would have a column for each parent type, ie: student_id, supplier_id, customers_id - etc.
Other? Your input...
My thoughts on each of these:
(1) Multiple tables of the same type of information seems wrong. If I want to change what gets stored about banking details, that's also several tables I have to change as opposed to one.
(2) Seems like the most viable option. Apparently this doesn't maintain 'referential integrity' though. I don't know how important that is to me if I'm just going to be cleaning up children programmatically when I delete the parents?
(3) Same as (2) except with an extra table per type, so my logic tells me this would be slower than (2), with more tables and the same outcome.
(4) Seems dirty to me, with a bunch of null fields in the banking_details table.

Before going any further: if you do decide on a design for storing banking details which lacks referential integrity, please tell me who's going to be running it so I can never, ever do business with them. It's that important. Constraints in your application logic may be followed, but things happen: exceptions, interruptions, and inconsistencies which are later reflected in the data because there aren't meaningful safeguards. Constraints in your schema design must be followed. Much safer, and banking data is something to be as safe as possible with.
You're correct in identifying #1 as suboptimal; an account is an account, no matter who owns it. #2 is out because referential integrity is non-negotiable. #3 is, strictly speaking, the most viable approach, although if you know you're never going to need to worry about expanding the number of entities who might have banking details, you could get away with #4 and a CHECK constraint to ensure that each row only has a value for one of the four foreign keys -- but you're using MySQL, which ignores CHECK constraints (versions before 8.0.16 parse them but do not enforce them), so go with #3.
Index your foreign keys and performance will be fine. Views are nice to avoid boilerplate JOINs if you have a need to do that.
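A minimal sketch of option #3 with a view on top, written as Python driving an in-memory SQLite database so that it runs without a server. The SQL would need small dialect changes for MySQL/InnoDB. The `iban` column, the view name `student_accounts` and the sample rows are assumptions made up for illustration, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to; InnoDB does so by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE banking_details (id INTEGER PRIMARY KEY, iban TEXT NOT NULL);

-- Option #3: one association table per parent type.
-- UNIQUE on banking_details_id keeps the relationship one-to-many
-- (each banking detail has at most one student) and indexes that FK.
CREATE TABLE students_banking_details (
    student_id INTEGER NOT NULL REFERENCES students(id),
    banking_details_id INTEGER NOT NULL UNIQUE REFERENCES banking_details(id),
    PRIMARY KEY (student_id, banking_details_id)
);

-- A view hides the boilerplate JOINs.
CREATE VIEW student_accounts AS
SELECT s.id AS student_id, s.name, b.iban
FROM students s
JOIN students_banking_details sb ON sb.student_id = s.id
JOIN banking_details b ON b.id = sb.banking_details_id;

INSERT INTO students VALUES (1, 'Ada');
INSERT INTO banking_details VALUES (10, 'HYPOTHETICAL-IBAN-1');
INSERT INTO students_banking_details VALUES (1, 10);
""")

accounts = conn.execute("SELECT name, iban FROM student_accounts").fetchall()


def is_rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A link to a banking_details row that does not exist violates the FK.
orphan_rejected = is_rejected(
    "INSERT INTO students_banking_details VALUES (1, 999)"
)
```

Suppliers, customers and hotels would each get their own association table of the same shape, all pointing at the single banking_details table.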

Related

SQL for one to one between a single table

I'd like to know what the best way of reflecting relations between precisely two rows from a single (my)sql table is?
Exemplified, we have:
table Person { id, name }
If I want to reflect that persons can be married monogamously (in liberal countries at least), is it better to use foreign keys within the Person:
table Person { id, name, spouse_id(FK(Person.id)) }
and then create stored procedures to marry and divorce Persons (ensuring mutual registration of the marriage or annulment of it), plus triggers to handle on_delete events,
or use a mapping table:
table Marriage {
spouse_a(FK(Person.id)),
spouse_b(FK(Person.id)) + constraint(NOT IN spouse_a)
}
This way divorces (delete) would simply be delete queries without triggers to cascade, and marriage wouldn't require stored procedure.
The constraint is to prevent polygamy / multi-marriage
I guess the second option is preferred? What is the best way to do this?
I need to be able to update this relation on and off, so it has to be manageable.
EDIT:
Thanks for the replies - in practice the application is physical point-to-point interfaces in networking, where it really is a 1:1 relationship (monogamous marriage), and change in government, trends etc will not change this :)
I'm going to use a separate table with A & B, having A < B checked.
To ensure monogamy, you simply want to ensure that the spouses are unique. So, this almost does what you want:
create table marriage (
spouse_a int not null unique,
spouse_b int not null unique
);
The only problem is that a given spouse can be in either column. One normally handles this with a check constraint:
check (spouse_a < spouse_b)
Voila! Uniqueness for the relationship.
Unfortunately, MySQL does not support check constraints (before version 8.0.16 they are parsed but ignored). So you can implement this using a trigger or at the application layer.
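A rough sketch of the trigger route, using Python's built-in sqlite3 module as a stand-in so it runs on its own. In MySQL the trigger body would use SIGNAL rather than SQLite's RAISE. The trigger name `marriage_guard` is made up, and the second check is an added assumption on my part: the two UNIQUE columns plus the ordering rule still let one id appear once in each column, as in (1, 2) and (2, 3), so the trigger also looks across both columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE marriage (
    spouse_a INTEGER NOT NULL UNIQUE,
    spouse_b INTEGER NOT NULL UNIQUE
);

CREATE TRIGGER marriage_guard BEFORE INSERT ON marriage
BEGIN
    -- Stand-in for: check (spouse_a < spouse_b)
    SELECT RAISE(ABORT, 'spouse_a must be smaller than spouse_b')
    WHERE NEW.spouse_a >= NEW.spouse_b;
    -- Reject a person who already appears in either column.
    SELECT RAISE(ABORT, 'person is already married')
    WHERE EXISTS (
        SELECT 1 FROM marriage
        WHERE spouse_a IN (NEW.spouse_a, NEW.spouse_b)
           OR spouse_b IN (NEW.spouse_a, NEW.spouse_b)
    );
END;
""")


def try_marry(a, b):
    """Return True if the row was accepted, False if the trigger refused it."""
    try:
        conn.execute("INSERT INTO marriage VALUES (?, ?)", (a, b))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_marry(1, 2),  # first marriage
    try_marry(2, 3),  # person 2 is already married
    try_marry(2, 1),  # wrong ordering
    try_marry(3, 4),  # unrelated pair
]
```

A divorce stays a plain DELETE on the row. Updates would need a matching BEFORE UPDATE trigger, which is left out here for brevity.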
Option #1 - Add relationships structurally
You can add one additional table for every conceivable relationship between two people. But then, when someone asks for a new relationship you forgot to add structurally, you'll need to add a new table.
And then, there will be relationships for three people at a time. And then four. And then, variable size relationships. You name it.
Option #2 - Model relationships as tables
To make it foolproof (well... never possible) you could model the relationships into a new table. This table can have several properties such as size, and you can also model restrictions on it. For example, you can decide to have a single person be the "leader of the cult" if you wish to.
This option requires more effort to design, but will accommodate many more options, including ideas from your client that you never thought of before.

Does data redundancy in different tables not follow Third Normal Form (3NF)?

I have 4 tables. Each of them contain the following attributes:
Table 1 :
Person (Id (Primary key), Name, Occupation, Location, SecondJob, PerHour, HoursWorked, Phone, Workphone)
Table 2 :
Job (Id (Foreign key that refers to Person), Title, Name, Location, Salary)
Table 3 :
SecondJob (Id (Foreign key that refers to Person), Title, Name)
Table 4:
PhoneNumber (Id (Foreign key that refers to Person), Name, Phone, Workphone)
I can obtain the values of each attribute like Name, Title, Phone and Workphone from the Person table with the following pseudo SQL statement:
Select (ATTRIBUTE NAME) FROM Person WHERE Id IN (PERSONS ID)
Does the fact that some of the information is being repeated in DIFFERENT TABLES (Data Redundancy), break (ie, not follow) the Third Normal Form (3NF)?
Or should the values be put into the other Tables separately and reason what attribute is identifying with the Primary Key of the Table?
I calculate Salary in Job by getting PerHour and HoursWorked from Person, then multiply them. I have also heard that this is redundant Data, due to the fact that it is data that you could extrapolate from existing Data within the Tables.
But, does this break the Third Normal Form??
Does the fact that information is repeated in DIFFERENT TABLES (Data Redundancy), break against 3NF Normalization?
No. A table value or variable is or isn't in a given NF. This is independent of any other table. (We do also talk about a database being in NF when all of its tables are in that NF.)
Normalization can be reasonably said to remove redundancy. But there is lots of redundancy not addressed by normalization. And there is lots of redundancy that is not bad. And duplication is not necessarily redundancy. Just because data is repeated doesn't mean "information" is repeated. What data says by being or not being in a table depends on the meaning of the table.
But you seem to think that just because duplicating data in a different table doesn't violate 3NF that it doesn't violate other principles of good design. That's wrong. Also, it's 5NF that matters. The only reason lower NFs are used is that SQL DBMSs don't support 5NF well.
Or should I just put in the values into the other Tables separately and reason what attribute is identifying with the Primary Key of the Table?
I guess you are trying to say, Should I only put the values in one table each and reconstruct the second table via queries involving shared keys? Ie, if you can get the values in a column by querying the rest of the database then should you avoid having that column? Generally speaking, yes.
Your question assumes a misconception. It's not a matter of "(exclusive) or" here. You should do both.
I calculate Salary in Job by getting PerHour and HoursWorked from Person, then multiply them. I heard that this is also redundant Data, due to it being data that you could extrapolate from existing Data in the Tables.
It is redundant given the rest of the database, because you could use a query instead. And if you don't constrain salary values appropriately then that is bad redundancy. Even if you do, the column and constraint complicate the schema.
But does it break 3NF Normalization?
No, because the NF of a table is independent of other tables. But that doesn't mean it's ok.
(If you added Salary to Person, the new table would not be in 3NF. But then, SQL DBMSs have computed columns that make that ok, by making the non-3NF table with Salary a view of the 3NF table without it.)
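As an illustration of the "view of the 3NF table" idea, here is a small sketch using Python's sqlite3 module as a self-contained stand-in for a MySQL server. The view name `PersonWithSalary`, the reduced column set and the sample figures are assumptions for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Base table stores only the independent facts.
CREATE TABLE Person (
    Id INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    PerHour REAL NOT NULL,
    HoursWorked REAL NOT NULL
);

-- Salary is derived on demand instead of being stored a second time.
CREATE VIEW PersonWithSalary AS
SELECT Id, Name, PerHour, HoursWorked,
       PerHour * HoursWorked AS Salary
FROM Person;

INSERT INTO Person VALUES (1, 'Ann', 20.0, 35.0);
""")

query = "SELECT Salary FROM PersonWithSalary WHERE Id = 1"
salary_before = conn.execute(query).fetchone()[0]

# Changing an input changes the derived value; nothing can drift out of sync.
conn.execute("UPDATE Person SET HoursWorked = 40.0 WHERE Id = 1")
salary_after = conn.execute(query).fetchone()[0]
```

Because Salary is never stored, there is no constraint to write and no second copy to keep consistent.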
Learn some database design method(s) and how they apply principles of good design. Your tables needlessly address overlapping aspects of the application. Also learn about JOIN in writing queries.

Normalize two tables with same primary key to 3NF

I have two tables currently with the same primary key, can I have these two tables with the same primary key?
Also are all the tables in 3rd normal form
Ticket:
-------------------
Ticket_id* PK
Flight_name* FK
Names*
Price
Tax
Number_bags
Travel class:
-------------------
Ticket id * PK
Customer_5star
Customer_normal
Customer_2star
Airmiles
Lounge_discount
ticket_economy
ticket_business
ticket_first
food allowance
drink allowance
the rest of the tables in the database are below
Passengers:
Names* PK
Credit_card_number
Credit_card_issue
Ticket_id *
Address
Flight:
Flight_name* PK
Flight_date
Source_airport_id* FK
Dest_airport_id* FK
Source
Destination
Plane_id*
Airport:
Source_airport_id* PK
Dest_airport_id* PK
Source_airport_country
Dest_airport_country
Pilot:
Pilot_name* PK
Plane id* FK
Pilot_grade
Month
Hours flown
Rate
Plane:
Plane_id* PK
Pilot_name* FK
This is not meant as an answer but it became too long for a comment...
Not to sound harsh, but your model has some serious flaws and you should probably take it back to the drawing board.
Consider what would happen if a Passenger buys a second Ticket for instance. The Passenger table should not hold any reference to tickets. Maybe a passenger can have more than one credit card though? Shouldn't Credit Cards be in their own table? The same applies to Addresses.
Why does the Airport table hold information that really is about destinations (or paths/trips)? You already record trip information in the Flights table. It seems to me that the Airport table should hold information pertaining to a particular airport (like name, location?, IATA code et cetera).
Can a Pilot just be associated with one single Plane? Doesn't sound very likely. The pilot table should not hold information about planes.
And the Planes table should not hold information on pilots as a plane surely can be connected to more than one pilot.
And so on... there are most likely other issues too, but these pointers should give you something to think about.
The only tables that sort of look ok to me are Ticket and Flight.
Re same primary key:
Yes there can be multiple tables with the same primary key. Both in principle and in good practice. We declare a primary or other unique column set to say that those columns (and supersets of them) are unique in a table. When that is the case, declare such column sets. This happens all the time.
Eg: A typical reasonable case is "subtyping"/"subtables", where entities of a kind identified by a candidate key of one table are always or sometimes also of the kind identified by the same values in another table. (If always then the one table's candidate key values are also in the other table's. And so we would declare a foreign key from the one to the other. We would say the one table's kind of entity is a subtype of the other's.) On the other hand sometimes one table is used with attributes of both kinds and attributes inapplicable to one kind are not used. (Ie via NULL or a tag indicating kind.)
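A minimal sketch of the subtype arrangement, with two tables sharing the same primary key and a foreign key from one to the other. It uses Python's sqlite3 module as a stand-in so it is runnable as-is. The cut-down `ticket` and `travel_class` tables and their `price` and `airmiles` columns are only loosely based on the question and should be read as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE ticket (
    ticket_id INTEGER PRIMARY KEY,
    price REAL NOT NULL
);

-- Same primary key as ticket, and also a foreign key to it:
-- every travel_class row describes exactly one existing ticket.
CREATE TABLE travel_class (
    ticket_id INTEGER PRIMARY KEY REFERENCES ticket(ticket_id),
    airmiles INTEGER NOT NULL
);

INSERT INTO ticket VALUES (1, 199.0);
INSERT INTO travel_class VALUES (1, 500);
""")


def is_rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# No ticket 2 exists, so the foreign key refuses the row.
missing_parent_rejected = is_rejected("INSERT INTO travel_class VALUES (2, 100)")
# Ticket 1 already has a travel_class row, so the primary key refuses a second.
duplicate_rejected = is_rejected("INSERT INTO travel_class VALUES (1, 50)")

joined = conn.execute(
    "SELECT t.ticket_id, t.price, c.airmiles "
    "FROM ticket t JOIN travel_class c ON c.ticket_id = t.ticket_id"
).fetchall()
```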
Whether you should have cases of the same primary key depends on other criteria for good design as applied to your particular situation. You need to learn design including normalization.
Eg: All keys simple and 3NF implies 5NF, so if your two tables have the same set of values as only & simple primary key in every state and they are both in 3NF then their join contains exactly the same information as they do separately. Still, maybe you would keep them separate for clarity of design, for likelihood of change or for performance based on usage. You didn't give that information.
Re normal forms:
Normal forms apply to tables. The highest normal form of a table is a property independent of any other table. (Although you might choose that form based on what forms & tables are alternatives.)
In order to normalize or determine a table's highest normal form one needs to know (in general) all the functional dependencies in it. (For normal forms above BCNF, also join dependencies.) You didn't give them. They are determined by what the meaning of the table is (ie how to determine what rows go in it in any given situation) and the possible situations that can arise. You didn't give them. Your expectation that we could tell you about the normal forms your tables are in without giving such information suggests that you do not understand normalization and need to educate yourself about it.
Proper design also needs this information and in general all valid states that can arise from situations that arise. Ie constraints among given tables. You didn't give them.
Having two tables with the same key goes against the idea of removing redundancy in normalization.
Excluding that, are these tables in 1NF and 2NF?
Judging by the Names field, I'd suggest that table1 is not. If multiple names can belong to one ticket, then you need a new table, most likely with a composite key of ticket_id,name.
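The suggested extra table could look roughly like this. It is a sketch using Python's sqlite3 module so it can be run directly. The table name `ticket_name` and the sample passengers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE ticket (
    ticket_id INTEGER PRIMARY KEY,
    price REAL NOT NULL
);

-- One row per (ticket, name) pair instead of a multi-valued Names column.
CREATE TABLE ticket_name (
    ticket_id INTEGER NOT NULL REFERENCES ticket(ticket_id),
    name TEXT NOT NULL,
    PRIMARY KEY (ticket_id, name)
);

INSERT INTO ticket VALUES (1, 250.0);
INSERT INTO ticket_name VALUES (1, 'A. Jones');
INSERT INTO ticket_name VALUES (1, 'B. Jones');
""")

names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM ticket_name WHERE ticket_id = 1 ORDER BY name"
    )
]

# The composite key refuses the same name twice on one ticket.
try:
    conn.execute("INSERT INTO ticket_name VALUES (1, 'A. Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```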

Is it proper to make a grand-parent key, a primary key, in its grand-child, in a multi-level identifying relationship?

Asked this here a couple of days ago, but haven't gotten many views, let alone a response, so I'm reposting to stackoverflow.
I'm modeling a DB for a conference ticketing system. In this system attendees are members of an attendee group, which belong to a conference. These relationships are identifying, and therefore FKs must be PKs in the respective children.
My current model:
Q: Is it proper to have attendeeGroupConferenceId FK, as a PK, in the attendee table, as MySQL Workbench has automatically set up for me?
On one side one would get a performance boost by keeping it in there for quick association at "check in". However, it is not strictly necessary, since the combination of id, attendeeGroupId, and a corresponding lookup of conferenceId in the respective attendeeGroup table, is enough. (It therefore becomes redundant data.)
To me, it feels like it might violate some form of normalization, but I plan on keeping it in for the speed boost as described. I'm just curious about what proper design says about giving it PK status or not.
You definitely don't need the attendeeGroupConferenceId in your attendee table. It's redundant, and note that the candidate key is the combination of (attendeeGroupId, personId), not the attendeeGroupConferenceId alone.
The table attendee also seems to violate the Second normal form (2NF) as it is.
My suggestion is to remove the attribute attendeeGroupConferenceId. In any case you can just join the tables in your queries to get extra info rather than keeping an extra attribute.
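A sketch of that join, using Python's sqlite3 module as a runnable stand-in. The diagram from the question is not available here, so the table layouts are guesses built from the column names mentioned in the thread (attendeeGroupId, personId, conferenceId). The helper `conference_for` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE conference (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

CREATE TABLE attendeeGroup (
    id INTEGER PRIMARY KEY,
    conferenceId INTEGER NOT NULL REFERENCES conference(id)
);

-- No attendeeGroupConferenceId here: the group already determines it.
CREATE TABLE attendee (
    attendeeGroupId INTEGER NOT NULL REFERENCES attendeeGroup(id),
    personId INTEGER NOT NULL,
    PRIMARY KEY (attendeeGroupId, personId)
);

INSERT INTO conference VALUES (1, 'ExampleConf');
INSERT INTO attendeeGroup VALUES (5, 1);
INSERT INTO attendee VALUES (5, 42);
""")


def conference_for(person_id):
    """Look up a person's conference through their attendee group."""
    row = conn.execute(
        """
        SELECT g.conferenceId
        FROM attendee a
        JOIN attendeeGroup g ON g.id = a.attendeeGroupId
        WHERE a.personId = ?
        """,
        (person_id,),
    ).fetchone()
    return row[0] if row else None


conference_id = conference_for(42)
```

The join runs on primary keys, so the lookup at check-in should stay cheap without a redundant column.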

Guaranteeing a FK relationship through multiple tables

I'm using MySQL / InnoDB, and using foreign keys to preserve relationships across tables. In the following scenario (depicted below), a 'manager' is associated with a 'recordLabel', and an 'artist' is also associated with a 'recordLabel'. When an 'album' is created, it is associated with an 'artist' and a 'manager', but both the artist and the manager need to be associated with the same recordLabel. How can I guarantee that relationship with the current table setup, or do I need to redesign the tables?
You cannot achieve this result using pure DRI - Declarative Referential Integrity, or the linking of foreign keys to ensure the schema's referential integrity.
There are 2 ways to solve this problem:
Consider the requirement a database problem, and use a trigger on INSERT and UPDATE to validate the requirements, and fail otherwise.
Consider the nested link a business logic requirement, and implement it in your business logic in PHP/C#/whatever.
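The trigger route might look roughly like the following. It is written against SQLite through Python's sqlite3 module so it runs standalone. A MySQL trigger would use SIGNAL instead of RAISE. The column and trigger names are assumptions, since the original diagram is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE recordLabel (id INTEGER PRIMARY KEY);
CREATE TABLE artist (
    id INTEGER PRIMARY KEY,
    recordLabel_id INTEGER NOT NULL REFERENCES recordLabel(id)
);
CREATE TABLE manager (
    id INTEGER PRIMARY KEY,
    recordLabel_id INTEGER NOT NULL REFERENCES recordLabel(id)
);
CREATE TABLE album (
    id INTEGER PRIMARY KEY,
    artist_id INTEGER NOT NULL REFERENCES artist(id),
    manager_id INTEGER NOT NULL REFERENCES manager(id)
);

-- Validate on INSERT and on UPDATE, and fail otherwise.
CREATE TRIGGER album_same_label_ins BEFORE INSERT ON album
BEGIN
    SELECT RAISE(ABORT, 'artist and manager are on different labels')
    WHERE (SELECT recordLabel_id FROM artist WHERE id = NEW.artist_id)
       <> (SELECT recordLabel_id FROM manager WHERE id = NEW.manager_id);
END;
CREATE TRIGGER album_same_label_upd BEFORE UPDATE ON album
BEGIN
    SELECT RAISE(ABORT, 'artist and manager are on different labels')
    WHERE (SELECT recordLabel_id FROM artist WHERE id = NEW.artist_id)
       <> (SELECT recordLabel_id FROM manager WHERE id = NEW.manager_id);
END;

INSERT INTO recordLabel VALUES (1), (2);
INSERT INTO artist VALUES (1, 1);
INSERT INTO manager VALUES (1, 1), (2, 2);
""")


def accepted(sql, params=()):
    """Return True if the statement succeeds, False if it is refused."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


same_label_ok = accepted("INSERT INTO album VALUES (?, ?, ?)", (1, 1, 1))
mixed_label_ok = accepted("INSERT INTO album VALUES (?, ?, ?)", (2, 1, 2))
update_ok = accepted("UPDATE album SET manager_id = 2 WHERE id = 1")
```

These triggers only guard writes to album. Changing an artist's or manager's recordLabel_id afterwards would need its own check.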
As a sidenote, I think the structure is rather strange from a practical perspective - as far as I know an Artist is signed to a RecordLabel, and assigned a Manager separately (either from the label or individually, many artists retain their own manager when switching to another label). Linking the Manager also to the Album only makes sense to record historic managers, enabling you to retrieve who was the manager to the artist when the album was released, but that automatically means your requirement is invalid if the artist switches labels and/or managers later on. I think therefore it is wrong from a practical data view to enforce this link.
What you do is add recordLabel_id to the albums table. Then you put two two-column indexes on albums, (recordLabel_id, artist_id) and (recordLabel_id, managers_id), and use each as a composite foreign key to the matching (recordLabel_id, id) pair in the artist or manager table.
Because recordLabel_id can only have one value in each row of the albums table, you will have ensured integrity.
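One way to read this suggestion is as a pair of composite foreign keys that share the album's label column. The sketch below tries that reading in SQLite through Python's sqlite3 module. InnoDB likewise needs an index on the referenced column pair, and a UNIQUE one is the standard choice. The exact column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE recordLabel (id INTEGER PRIMARY KEY);

-- UNIQUE (recordLabel_id, id) gives the composite FKs something to reference.
CREATE TABLE artist (
    id INTEGER PRIMARY KEY,
    recordLabel_id INTEGER NOT NULL REFERENCES recordLabel(id),
    UNIQUE (recordLabel_id, id)
);
CREATE TABLE manager (
    id INTEGER PRIMARY KEY,
    recordLabel_id INTEGER NOT NULL REFERENCES recordLabel(id),
    UNIQUE (recordLabel_id, id)
);

-- Both foreign keys use the same recordLabel_id column of the album row,
-- so artist and manager are forced onto one label.
CREATE TABLE album (
    id INTEGER PRIMARY KEY,
    recordLabel_id INTEGER NOT NULL,
    artist_id INTEGER NOT NULL,
    manager_id INTEGER NOT NULL,
    FOREIGN KEY (recordLabel_id, artist_id)
        REFERENCES artist(recordLabel_id, id),
    FOREIGN KEY (recordLabel_id, manager_id)
        REFERENCES manager(recordLabel_id, id)
);

INSERT INTO recordLabel VALUES (1), (2);
INSERT INTO artist VALUES (1, 1);
INSERT INTO manager VALUES (1, 1), (2, 2);
""")


def try_album(album_id, label_id, artist_id, manager_id):
    """Return True if the album row is accepted, False if an FK refuses it."""
    try:
        conn.execute(
            "INSERT INTO album VALUES (?, ?, ?, ?)",
            (album_id, label_id, artist_id, manager_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


same_label_ok = try_album(1, 1, 1, 1)   # artist 1 and manager 1 are on label 1
mixed_label_ok = try_album(2, 1, 1, 2)  # manager 2 is on label 2
```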