Normalization of data for database?

Here is the data I have to normalize:
//
1NF
Customer ID [First Name (PK), Last Name (PK), Phone, Address, Town, Postcode, Email]
Booking [Date (PK), Room (PK), Type, Occupants, Nights, Arrival Time]
ExtraID [Item Name, Item Cost, Date (FK), Room (FK)]
//
First Name + Last Name = Composite Key
Date + Room = Composite Key
//
Is this ok?
Also, to move to 2NF I have to identify partial dependencies. As far as I can see, Phone, Address, Town, Postcode and Email each require both parts of the composite key?
So is this in 2NF already?
Thank you.

It's generally a good idea to use synthetic keys, especially where people are concerned: none of the natural keys is truly fit to serve as a primary key (we might go into biometrics here, but that would be a bit off topic).
So there would need to be a customers table with its own customers_pk primary key, which could be either a sequence in the RDBMS or, say, a GUID (http://en.wikipedia.org/wiki/Globally_unique_identifier).
There should be a historical table for rooms: the rates tend to change, as do a room's other characteristics. We could define a room as a unique physical object and also decide whether a room stays the same after remodelling (there are pros and cons to that; it depends on the business standpoint).
I would create a separate dictionary for the extras (we could use receipt IDs in there and CDR references) and then link the extras to the bookings via a table with its own primary key, a foreign key to the booking's primary key and a foreign key to the extra's primary key.
Now, the bookings table should have its own primary key, then the customer's key, the room's key, dates (they could be of a date type, or we could create a separate time dimension listing useful information such as whether it's high season or whether there are local events), the number of occupants, the sum of extras charges (perhaps not the best idea, since it can be derived from the link table) and the grand total.
You could use a separate sequence for each table's keys or just one for all of them; the latter is a bit more elegant.
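A minimal sketch of the layout described above, using Python's built-in sqlite3 module as a stand-in for MySQL so it can be run as-is. All table and column names are illustrative assumptions, the historical rooms table is left out for brevity, and costs are stored as integer cents.

```python
import sqlite3

# SQLite stands in for MySQL here; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,          -- synthetic key
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL
);
CREATE TABLE rooms (
    room_id     INTEGER PRIMARY KEY,
    room_number TEXT NOT NULL UNIQUE,
    room_type   TEXT NOT NULL
);
CREATE TABLE extras (                         -- the "dictionary" of extras
    extra_id        INTEGER PRIMARY KEY,
    item_name       TEXT NOT NULL UNIQUE,
    item_cost_cents INTEGER NOT NULL
);
CREATE TABLE bookings (
    booking_id   INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customers (customer_id),
    room_id      INTEGER NOT NULL REFERENCES rooms (room_id),
    arrival_date TEXT NOT NULL,
    nights       INTEGER NOT NULL,
    occupants    INTEGER NOT NULL,
    UNIQUE (room_id, arrival_date)            -- the old natural key
);
CREATE TABLE booking_extras (                 -- links extras to bookings
    booking_extra_id INTEGER PRIMARY KEY,
    booking_id       INTEGER NOT NULL REFERENCES bookings (booking_id),
    extra_id         INTEGER NOT NULL REFERENCES extras (extra_id),
    quantity         INTEGER NOT NULL DEFAULT 1
);
""")

# Two different people with the same name: impossible with a name-based PK.
conn.executemany(
    "INSERT INTO customers (first_name, last_name) VALUES (?, ?)",
    [("John", "Smith"), ("John", "Smith")],
)
conn.execute("INSERT INTO rooms (room_number, room_type) VALUES ('101', 'double')")
conn.executemany(
    "INSERT INTO extras (item_name, item_cost_cents) VALUES (?, ?)",
    [("Breakfast", 1200), ("Parking", 800)],
)
conn.execute(
    "INSERT INTO bookings (customer_id, room_id, arrival_date, nights, occupants) "
    "VALUES (1, 1, '2016-12-24', 3, 2)"
)
conn.executemany(
    "INSERT INTO booking_extras (booking_id, extra_id, quantity) VALUES (?, ?, ?)",
    [(1, 1, 2), (1, 2, 1)],
)

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# The extras total is derived on demand instead of being stored on the booking.
extras_total_cents = conn.execute("""
    SELECT SUM(e.item_cost_cents * be.quantity)
    FROM booking_extras AS be
    JOIN extras AS e ON e.extra_id = be.extra_id
    WHERE be.booking_id = 1
""").fetchone()[0]
print(customer_count, extras_total_cents)
```

In MySQL the integer keys would typically be declared with AUTO_INCREMENT; the rest of the DDL should carry over with minor type changes.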

Related

Can A Foreign Key Be Used More than Once?

Apologies for the newbie question.
The primary key of a table such as Holiday would be something like Holiday_ID. A Holiday row refers to a get-away ticket that you can buy; the type of holiday you go on depends on the ticket you buy.
Suppose I used Holiday_ID in a composite entity with Customer_ID to identify an instance of Holiday associated with a customer, for whatever purpose.
However, suppose I also want to keep track of other information related to this instance: how much the customer has paid for the ticket, and how much the customer has yet to pay.
I have two options:
a) I can create another composite entity. However, I am not sure I can do that, because I am not sure whether a particular foreign key can be used more than once.
b) I can create a composite/associative entity. However, I am not sure whether a composite entity can have more than two foreign keys.

To answer the technical parts of your question: once you create a composite unique or primary key, only one record in the table can have a given combination of values in the fields of that key. So no, you cannot reuse the same HolidayId with the same customer. You can use it with a different customer if you wish.
Second, there is no limit to the number of attributes that can be included in a unique or primary key. If you need it, and if it's appropriate and conforms to the rules of normalization, the key can include all the attributes of the table.
Third, to answer your question below: any column or set of columns in a table can be defined as a foreign key, as long as it matches the primary key or a unique key of some table in the database. A table can have any number of FKs, and they can even overlap (you can have HolidayId as an FK and also have HolidayId and CustomerId as a composite FK). The only restriction is that each FK must reference a primary or unique key of some table in the database. That table can be the same table the FK is in, as when you add a SupervisorId to an Employee table as an FK to the EmployeeId of that same table.
This example illustrates one of the problems of using surrogate keys without also using a natural key. To wit: what, exactly, is a "Holiday"? Is Christmas 2016 the same "Holiday" as Christmas 2015? Is Christmas in Aruba the same holiday as Christmas in Hawaii?
And then, about the composite table that identifies associations of a customer with a Holiday: is it the same association if the customer goes to Aruba at Christmas the next year, or a different instance? What does the row in the table represent if the customer wants 5 tickets?
The first thing that should be done in database design is a logical design which defines, as clearly and unambiguously as possible, in business terms, the meanings of the entities for each table in the database.
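To make the composite-key points above concrete, here is a rough sketch (SQLite via Python's sqlite3 standing in for MySQL; the table names, column names and sample rows are assumptions for illustration). The association table carries its own attributes for the amounts paid and due, and its composite primary key rejects a second row for the same holiday and customer.

```python
import sqlite3

# Hypothetical schema; SQLite is used only so the example is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE holiday  (holiday_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE holiday_customer (
    holiday_id  INTEGER NOT NULL REFERENCES holiday (holiday_id),
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
    amount_paid INTEGER NOT NULL DEFAULT 0,   -- extra attributes live here
    amount_due  INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (holiday_id, customer_id)     -- one row per pair
);
""")
conn.execute("INSERT INTO holiday VALUES (1, 'Christmas in Aruba')")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

# The same holiday_id may appear with different customers.
conn.execute("INSERT INTO holiday_customer VALUES (1, 1, 200, 300)")
conn.execute("INSERT INTO holiday_customer VALUES (1, 2, 500, 0)")

# The same (holiday_id, customer_id) pair a second time violates the key.
try:
    conn.execute("INSERT INTO holiday_customer VALUES (1, 1, 0, 500)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM holiday_customer").fetchone()[0]
print(duplicate_rejected, row_count)
```

Whether one row per pair is the right rule is exactly the business question the answer raises: if a customer can buy 5 tickets or return next year, the key would need another column (a booking date or ticket number, say).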

Autoincrement ID with a prefix

I'm designing a table 'employees', which contains an auto-increment primary key that represents the ID of the employee.
I want to prefix the ID with a number designating the city: city 1: 1, city 2: 2, etc.
So the IDs should look like xyy, where x represents the city and yy the ID of the employee.
When I add a new employee I select the city x, and I would like the yy values to auto-increment.
Is that possible using SQL commands?

That is not good database design. You really should have a separate column for the city in your table. If you have many cities, the cities should perhaps be in their own table. What you are trying to do is overly complex and although 'everything is possible', I would not recommend it.

You are effectively packing two fields into one and violating the principle of atomicity and the 1NF in the process. On top of that, your key is not minimal (so to speak).
Instead, keep two separate fields: ID and CITY.
ID alone is the primary key. In your own words, ID is auto-increment, so it alone is unique.
You can easily concatenate ID and CITY together for display purposes in your query or VIEW or even in the client code. There is no reason to "precook" the concatenated value in the table itself.
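A small sketch of that idea, assuming hypothetical column names and a hyphen as the display separator, with SQLite (through Python's sqlite3) standing in for MySQL. The ID stays a plain auto-increment key, the city is its own column, and a view builds the combined value for display only.

```python
import sqlite3

# Hypothetical names; the '-' separator is an arbitrary display choice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- unique on its own
    name        TEXT NOT NULL,
    city_id     INTEGER NOT NULL                    -- separate, atomic column
);
CREATE VIEW employee_display AS
SELECT employee_id,
       name,
       city_id || '-' || employee_id AS display_id  -- built at query time
FROM employees;
""")
conn.executemany(
    "INSERT INTO employees (name, city_id) VALUES (?, ?)",
    [("Ann", 1), ("Bob", 2), ("Cid", 2)],
)

display_ids = [
    row[0]
    for row in conn.execute(
        "SELECT display_id FROM employee_display ORDER BY employee_id"
    )
]
print(display_ids)  # ['1-1', '2-2', '2-3']
```

In MySQL the same view would use CONCAT(city_id, '-', employee_id), since || is not string concatenation there by default.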

Given this requirement from the comments, "Unique ID should provide users with an info of the city, company requirements", I would do this.
Table Employee would have EmployeeId as the primary key. Other fields would be FirstName, LastName, BirthDate, Gender, etc.
Table City would have CityId as the primary key. Other fields would be the name of the city, ProvinceState, Country, or whatever is appropriate.
Table EmployeeCity would have a primary key of EmployeeId, CityId, and StartDate. The field EndDate would not be part of the primary key.
The primary key of EmployeeCity satisfies the requirement of a unique identifier which leads to city information. Also, if an employee changes cities, it's a simple matter of updating one record and adding another.
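As a rough illustration of the EmployeeCity design (SQLite via Python's sqlite3 as a stand-in for MySQL; the snake_case names, the sample cities and the move_employee helper are all assumptions, not part of the answer), moving an employee closes the open row and adds a new one:

```python
import sqlite3

# Hypothetical schema following the three tables described above.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL
);
CREATE TABLE city (city_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE employee_city (
    employee_id INTEGER NOT NULL REFERENCES employee (employee_id),
    city_id     INTEGER NOT NULL REFERENCES city (city_id),
    start_date  TEXT NOT NULL,
    end_date    TEXT,                          -- NULL while still current
    PRIMARY KEY (employee_id, city_id, start_date)
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 'Lee')")
conn.executemany("INSERT INTO city VALUES (?, ?)", [(1, "Leeds"), (2, "York")])
conn.execute("INSERT INTO employee_city VALUES (1, 1, '2015-01-01', NULL)")


def move_employee(conn, employee_id, new_city_id, move_date):
    """Close the employee's current city row and open a new one."""
    conn.execute(
        "UPDATE employee_city SET end_date = ? "
        "WHERE employee_id = ? AND end_date IS NULL",
        (move_date, employee_id),
    )
    conn.execute(
        "INSERT INTO employee_city VALUES (?, ?, ?, NULL)",
        (employee_id, new_city_id, move_date),
    )


move_employee(conn, 1, 2, "2016-06-01")

current_city = conn.execute("""
    SELECT c.name
    FROM employee_city AS ec
    JOIN city AS c ON c.city_id = ec.city_id
    WHERE ec.employee_id = 1 AND ec.end_date IS NULL
""").fetchone()[0]
history_rows = conn.execute("SELECT COUNT(*) FROM employee_city").fetchone()[0]
print(current_city, history_rows)
```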

To make PK on unique combination of columns or add a numeric rowID

This is more of a design problem than a programming one.
I have a table where I store details about retail products:
Name Barcode BarcodeFormat etc...
----------------------------------------
(Name, Barcode, BarcodeFormat) are three columns that together uniquely identify a record in the table (a candidate key). However, I have other tables that need an FK to this one, so I introduced an auto_increment column itemId and made that the PK.
My question is: should I have the PK as (itemId, Name, Barcode, BarcodeFormat), or would it be better to have PK(itemId) and UNIQUE(Name, Barcode, BarcodeFormat)?
My primary concern is performance in terms of INSERT and SELECT operations, but comments on size are also welcome.
I'm using an InnoDB table with MySQL.

Definitely: PK(itemId) and UNIQUE(Name, Barcode, BarcodeFormat).
You don't want the hassle of using a multi-part key for all your joins etc.
You may one day have rows without barcode values, which then won't be unique, so you don't want uniqueness hard-wired into your model (you can easily drop the unique constraint without breaking any relationships).
The constraint on uniqueness is a business-level issue, not a database entity one: you'll always need a key, but you may not always need the business rule of uniqueness.
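A short sketch of the recommended layout, again using SQLite through Python's sqlite3 in place of InnoDB; the stock table, the column names and the sample barcode are made up for illustration. Other tables reference the single-column surrogate key, while the UNIQUE constraint still rejects duplicate products.

```python
import sqlite3

# Hypothetical schema: surrogate PK plus a UNIQUE constraint on the natural key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE items (
    item_id        INTEGER PRIMARY KEY AUTOINCREMENT,
    name           TEXT NOT NULL,
    barcode        TEXT NOT NULL,
    barcode_format TEXT NOT NULL,
    UNIQUE (name, barcode, barcode_format)     -- business rule, droppable later
);
CREATE TABLE stock (
    stock_id INTEGER PRIMARY KEY,
    item_id  INTEGER NOT NULL REFERENCES items (item_id),  -- one-column FK
    quantity INTEGER NOT NULL
);
""")

insert_item = (
    "INSERT INTO items (name, barcode, barcode_format) "
    "VALUES ('Tea', '0000000000017', 'EAN13')"   # made-up sample values
)
item_id = conn.execute(insert_item).lastrowid
conn.execute("INSERT INTO stock (item_id, quantity) VALUES (?, 10)", (item_id,))

# Inserting the same natural key again is rejected by the UNIQUE constraint.
try:
    conn.execute(insert_item)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(item_id, duplicate_rejected, item_count)
```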

Unless you have millions of products, or very high throughput requirements it won't make much difference in terms of performance.
My preference is to have a surrogate PK (i.e. the auto increment column, your second option of PK(itemId) and UNIQUE(Name, Barcode, BarcodeFormat) ) because this is easier to manage if business keys change.

You have two candidate keys. We call the three-column compound key the 'natural key' and the auto_increment column (in this case) the 'surrogate key'. Both require unique constraints ('unique' in lower case to denote logical) at the database level.
Optionally, one candidate key may be designated 'primary'. The choice of which key (if any) should get this designation is arbitrary. Beware of anyone giving you definitive advice on this matter!

If you have already added an itemId, then you should use that as the PK and put a UNIQUE constraint on the other three columns.
If you don't have an itemId, then you could use the other columns as the PK, but it may become difficult to carry that key everywhere it is referenced. In this case that is not great, because a product is an entity and should have an id; if it were just a relationship, it would be acceptable not to have an id column.

Database structure: Would this structure work with this m:m?

Here is my issue: (Using MySQL)
I have 2 entities called 'shops' and 'clients'. I also have a M:M table between 'clients' and 'shops' called 'clients_shops' (CakePHP naming convention). The reason I am doing it this way is that this is a SaaS application where 'clients' may have many 'shops' and 'shops' will definitely have many 'clients'.
However, I don't want to give a shop the ability to UPDATE/DELETE a 'client' record since what really needs to happen is that the 'shop' will EDIT/DELETE that 'client' from their own records, rather than from a master 'clients' table which is managed by the 'clients'.
Anyway, using this structure a 'shop' can run a query on the 'clients_shops' table to get a list of their clients, and a 'client' can run a query to get a list of their 'shops'. Good so far...
So far, the database looks like this:
table.clients
client_id (PK, AI, NN)
table.shops
shop_id (PK, AI, NN)
table.clients_shops
clients_shops_id (PK,AI,NN)
client_id (FK)
shop_id (FK)
The ORM looks like this:
shops hasMany clients_shops
clients hasMany clients_shops
So far so good (I think...) but here is my question. Let's say that there is a third table named 'trips'. The 'trips' table stores information on individual bookings whereby a 'client' will make reservations for a 'trip' that is provided by a 'shop'. This is where my brain is getting mushy. How should I set this relationship up?
Is it this way:
table.trips
trips_id (PK,AI,NN)
clients_shops_id (FK) [which would contain keys for both the shop and the client]
Or is there a better way to do this, like another table that uses clients.client_id AND clients_shops.clients_shops_id.
Thanks in advance to anyone that actually read this whole thing!

Unless it's required by your ORM, you don't need a surrogate foreign key for clients/shops and everything that refers to it.
Make a composite PRIMARY KEY instead and refer to it from elsewhere:
CREATE TABLE clients_shops
(
    client_id INT NOT NULL,
    shop_id INT NOT NULL,
    PRIMARY KEY (client_id, shop_id)
);
CREATE TABLE trips
(
    trip_id INT NOT NULL PRIMARY KEY,
    client_id INT NOT NULL,
    shop_id INT NOT NULL,
    trip_data …,
    -- MySQL requires the referenced columns to be listed explicitly
    CONSTRAINT fk_trips_clients_shops
        FOREIGN KEY (client_id, shop_id)
        REFERENCES clients_shops (client_id, shop_id)
);
This model assumes that you maintain clients/shops relationships separately from the clients' transactions and do not let clients buy from shops unless they are "related".
Probably you want the relationship to appear automatically whenever a trip is ordered by a client from a shop. In this case, you only need the second table, and the first table is a mere
SELECT DISTINCT client_id, shop_id
FROM trips
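A runnable sketch of that second variant, where only trips is stored and the client/shop relationship is derived from it. SQLite (via Python's sqlite3) stands in for MySQL, and the view name and sample rows are assumptions.

```python
import sqlite3

# Hypothetical data; clients_shops is a view here, not a stored table.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE clients (client_id INTEGER PRIMARY KEY);
CREATE TABLE shops   (shop_id   INTEGER PRIMARY KEY);
CREATE TABLE trips (
    trip_id   INTEGER PRIMARY KEY,
    client_id INTEGER NOT NULL REFERENCES clients (client_id),
    shop_id   INTEGER NOT NULL REFERENCES shops (shop_id)
);
CREATE VIEW clients_shops AS
SELECT DISTINCT client_id, shop_id
FROM trips;
""")
conn.executemany("INSERT INTO clients VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO shops VALUES (?)", [(10,), (20,)])
conn.executemany(
    "INSERT INTO trips (client_id, shop_id) VALUES (?, ?)",
    [(1, 10), (1, 10), (2, 10), (2, 20)],   # client 1 booked shop 10 twice
)

# The relationship appears as soon as a trip exists, with no separate insert.
pairs = conn.execute(
    "SELECT client_id, shop_id FROM clients_shops ORDER BY client_id, shop_id"
).fetchall()
print(pairs)  # [(1, 10), (2, 10), (2, 20)]
```

The trade-off is that a client cannot be linked to a shop before booking a first trip; if that is needed, the stored clients_shops table from the first variant is the better fit.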

Here is the Logical Diagram to handle what you are looking for. Depending on your requirements, you can change the non-identifying relationships (Client::Trip & Shop::Trip) to identifying relationships. If you do, I would limit the change to Shop::Trip only. Also adjust the cardinality as you see fit.

I would probably make the trips table like this:
table.trips
trip_id (PK)
shop_id (FK to shops)
client_id (FK to clients)
other_trip_column_etc
I would not reference the m-m table clients_shops from the trips table - just reference the shop and client tables with individual foreign keys.
The clients_shops table represents the current relationship between a client and a shop. The trip should not depend on these relationships, because they could potentially change in the future, and you probably wouldn't want the trip's data to change over time - it should be a transactional record that specifies exactly what shop, client, and trip was scheduled at that given time, regardless of the current relationship between that client and shop.

DB schema for a booking system of fitness class

I need a schema for fitness class.
The booking system needs to store the maximum number of students a class can take, the number of students who have booked to join the class, the students' IDs, the datetime, etc.
A student table needs to store the classes which he/she booked. But this may not be needed if I store the students' IDs in the class table.
I am hoping to get some good ideas.
Thanks in advance.

Student: ID, Name, ...
Class: ID, Name, MaxStudents, ...
Student_in_Class: STUDENT_ID, CLASS_ID, DATE_ENROLL
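One way to read this three-table outline, sketched with SQLite through Python's sqlite3 as a stand-in for MySQL. The enroll helper, the fitness_class table name and the sample data are assumptions added for illustration. The number of booked students is counted from the link table instead of being stored, and the capacity check compares it with max_students.

```python
import sqlite3

# Hypothetical schema based on the Student / Class / Student_in_Class outline.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE fitness_class (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    max_students INTEGER NOT NULL
);
CREATE TABLE student_in_class (
    student_id  INTEGER NOT NULL REFERENCES student (id),
    class_id    INTEGER NOT NULL REFERENCES fitness_class (id),
    date_enroll TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (student_id, class_id)        -- no double booking
);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cid")])
conn.execute("INSERT INTO fitness_class VALUES (1, 'Yoga', 2)")


def enroll(conn, student_id, class_id):
    """Book a place if the class is not full; return True on success.

    Not safe against concurrent bookings; a real system would wrap the
    check and the insert in one transaction with suitable locking.
    """
    booked, capacity = conn.execute(
        "SELECT (SELECT COUNT(*) FROM student_in_class WHERE class_id = ?), "
        "       max_students "
        "FROM fitness_class WHERE id = ?",
        (class_id, class_id),
    ).fetchone()
    if booked >= capacity:
        return False
    conn.execute(
        "INSERT INTO student_in_class (student_id, class_id) VALUES (?, ?)",
        (student_id, class_id),
    )
    return True


results = [enroll(conn, student_id, 1) for student_id in (1, 2, 3)]
print(results)  # [True, True, False]
```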

*Not a MySQL guru, I typically deal with MS SQL, but I think you'll get the idea. You might need to dig a little in the MySQL docs to find data types that match the ones I've suggested (for example, MySQL uses AUTO_INCREMENT where MS SQL uses IDENTITY). I've also added brief explanations for some types to clarify what they're for, since this is MySQL and not MS SQL.
Class_Enrollment - stores the classes each student is registered for
  Class_Enrollment_ID INT IDENTITY PK ("identity is made specifically
    to serve as an id and it's a field that the system will manage
    on its own. It automatically gets updated when a new record is
    created. I would try to find something similar in MySQL")
  Class_ID INT FK
  Student_ID INT FK
  Date_Time smalldatetime ("smalldatetime just stores the date as a
    smaller range of years than datetime + time up to minutes")
  put a unique constraint index on class_id and student_id to prevent duplicates
Class - stores your classes
  Class_ID INT IDENTITY PK
  Name VARCHAR('size') UNIQUE CONSTRAINT INDEX ("UNIQUE CONSTRAINT INDEX is
    like a PK, but you can have more than one in a table")
  Max_Enrollment INT ("unless you have a different max for different sessions
    of the same class, then you only need to define max enrollment once per
    class, so it belongs in the class table, not the Class_Enrollment table")
Student - stores your students
  Student_ID INT IDENTITY PK
  First_Name VARCHAR('size')
  Last_Name VARCHAR('size')
  Date_of_Birth smalldatetime ("smalldatetime can store just the date,
    will automatically put 0's for the time, works fine")
  put a unique constraint index on first name, last name, and date of birth to eliminate duplicates (you may have two John Smiths, but two John Smiths with the exact same birth date in the same database is unlikely unless it's a very large database. Otherwise, consider using first name, last name, and phone as a unique constraint)
