Normalization to 3NF?

I have to normalize this table to at least 3NF:
customerID | customerName | petID | petName | transID | transName | transDetails | Price
with the FDs as follows:
customerID -> {customerName, petID}
petID -> {petName, transID}
transID -> {transName, transDetails, Price}
Now my answer to this is:
customer(customerID, customerName) //customerID as PK
pet(petID, petName, customerID) //petID as PK and customerID as FK
transaction(transID, transName, transDetails, Price, petID) //transID as PK and petID as FK
I don't fully understand this yet, and I can't absorb the information right now because my brain is pulp from uni :( So, am I right?
Looking at the FDs more closely, should the actual 3NF be:
pet(petID, petName, transID)
customer(customerID, customerName, petID)
transaction(transID, transName, transDetails, Price)

It is unclear what the proper relationship between customers and pets is. From the functional dependencies it seems that all three entities are related 1:1, which means your first answer can work as long as the foreign keys are also unique (to prevent 1:N).
The second answer satisfies all three requirements for 3NF:
every field functionally depends on the key
the whole key (no composite keys)
and nothing but the key (no transitive dependencies between non-key fields)
But I would keep the foreign keys away from the pet or customer table. Each can exist on their own and we want to avoid nulls in foreign key columns. The foreign keys belong in the transaction table because a transaction is what links a pet to a customer, regardless of the desired cardinality.
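For reference, the second decomposition from the question (the one that follows the FDs directly) can be sketched as below. This is only an illustration: it uses Python's sqlite3 instead of MySQL so it runs anywhere, the table is named trans because "transaction" is a reserved word, and the sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

# One table per FD: each non-key column depends only on its table's key.
conn.executescript("""
CREATE TABLE trans (
    transID      INTEGER PRIMARY KEY,
    transName    TEXT,
    transDetails TEXT,
    Price        REAL
);
CREATE TABLE pet (
    petID   INTEGER PRIMARY KEY,
    petName TEXT,
    transID INTEGER REFERENCES trans(transID)
);
CREATE TABLE customer (
    customerID   INTEGER PRIMARY KEY,
    customerName TEXT,
    petID        INTEGER REFERENCES pet(petID)
);
-- Made-up sample data, for illustration only
INSERT INTO trans VALUES (100, 'Grooming', 'Full wash and trim', 25.0);
INSERT INTO pet VALUES (10, 'Rex', 100);
INSERT INTO customer VALUES (1, 'Alice', 10);
""")

# Joining along the foreign keys rebuilds the original wide row.
rows = conn.execute("""
    SELECT c.customerID, c.customerName, p.petID, p.petName,
           t.transID, t.transName, t.transDetails, t.Price
    FROM customer c
    JOIN pet p ON p.petID = c.petID
    JOIN trans t ON t.transID = p.transID
    ORDER BY c.customerID
""").fetchall()
print(rows)
```

The join shows that nothing from the original table is lost by splitting it up.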

Related

Entity-Relationship Diagram Redundancy: store, product, orders, categories

I'm trying to design a model which allows a user to be a buyer and seller with a single account, but some teachers told me that this diagram is wrong because it has redundancy.
I have reviewed the diagram but I haven't found a way to solve this redundancy. In the orders table I need to know who the buyer is, so for this reason I didn't delete this from the table. Any ideas?

The only thing that is "redundant" (not normalized, to be exact) in your schema is this:
You don't need to make a special ID; a composite PK is enough.
-------------------
| ORDERPRODUCT |
-------------------
| PK | PRODUCT_ID |
| PK | ORDER_ID |
-------------------
ALTER TABLE ORDERPRODUCT
ADD CONSTRAINT pk
PRIMARY KEY (PRODUCT_ID, ORDER_ID);

On top of what @Blag has said, for Categories, you have two fields that might do the same thing: categoryname and description. You already have an identifier with PK_IdCategory, so one of those might be unnecessary.

In what case might adding an ID field to a many-to-many table be a good idea?

Let's say there are two entities - Product and Image with a many-to-many relationship between them. The order of images associated with each product does matter.
Product
------------------------------------
ProductID (primary key)
ProductName
...
Image
------------------------------------
ImageID (primary key)
Url
Size
...
What are the cons and pros of the following three many-to-many "bridge" table approaches for solving this problem?
ProductImage
------------------------------------
ProductImageID (primary key, identity)
ProductID (foreign key)
FullImageID (foreign key)
ThumbImageID (foreign key)
OrderNumber
or
ProductImage
------------------------------------
ProductID (primary key, foreign key)
IndexNumber (primary key)
FullImageID (foreign key)
ThumbImageID (foreign key)
or
ProductImage
------------------------------------
ProductID (primary key, foreign key)
FullImageID (primary key, foreign key)
ThumbImageID (foreign key)
OrderNumber (index)

There is no purpose (that I have ever found) in adding a surrogate key (i.e. the IDENTITY field) to a many-to-many "bridge" table (or whatever you want to call it). However, none of your proposed schemas is correct.
In order to get the ideal setup, you first need to determine the scope / context of the following requirement:
The order of images associated with each product does matter.
Should the ordering of the images be the same, in relation to each other, regardless of what Products they are associated with? Meaning, images A, B, C, and D are always in alphabetical order, regardless of what combination of them any particular Product has.
Or, can the ordering change based on the Product that the Image is associated with?
If the ordering of the Images needs to remain consistent across Products, then the OrderNumber field needs to go into the Image table. Else, if the ordering can change per Product, then the OrderNumber field goes into this bridge / relationship table.
In either case:
the PK is the combination of FKs:
A Primary Key uniquely, and hopefully reliably (meaning that it doesn't change), identifies each row. And if at all possible, it should be meaningful. Using the combination of the two FK fields gives exactly that while enforcing that uniqueness (so that one Product cannot be given the same Image multiple times, and vice-versa). Even if these two fields weren't chosen as the PK, they would still need to be grouped into a UNIQUE INDEX or UNIQUE CONSTRAINT to enforce that data integrity (effectively making it an "alternate key"). But since these IDs won't be changing (only inserted and deleted) they are well suited to be the PK. And if you are using SQL Server (and maybe others) and decide to use this PK as the Clustered index, then you will have the benefit of having both ProductID and ImageID in any Non-Clustered Indexes. So when you need to sort by [OrderNumber], the Non-Clustered Index on that field will automatically be a covering index because the only two data fields you need from it are already there.
On the other hand, placing the [OrderNumber] field into the PK has a few downsides:
It can change, which is not ideal for PKs.
It removes the ability to enforce that a ProductID and ImageID can only relate to each other one time. Hence you would need that additional UNIQUE INDEX or UNIQUE CONSTRAINT in order to maintain the data integrity. Otherwise, even if you include all 3 fields in the PK, the ProductID + ImageID combination can still appear multiple times with different values of [OrderNumber].
there is no need for an IDENTITY field:
With the above information in mind, all of the requirements of a PK have already been met. Adding a surrogate key / auto-increment field adds no value, but does take up additional space.
And to address the typical reply to the above statement regarding the surrogate key not adding any value, some will say that it makes JOINs easier if this combination of ProductID+ImageID needs to be Foreign Keyed to a child table. Maybe each combination can have attributes that are not singular like [OrderNumber] is. An example might be "tags" (although those would most likely be associated with just ImageID, but it works as a basic example). Some people prefer to only place a single ID field in the child table because it is "easier". Well, it's not easier. By placing both ImageID and ProductID fields in the child table and doing the FK on both back to this PK, you now have meaningful values in the child table and will not need to JOIN to this [ProductImage] table all of the time just to get that information (which will probably be needed in most queries that are not simply listing or updating those attributes for a particular ProductID+ImageID combination). And if it is not clear, adding a surrogate key still requires a UNIQUE INDEX or UNIQUE CONSTRAINT to enforce the data integrity of unique ProductID+ImageID combinations (as stated above in the first bullet point).
And placing both ID fields into the child table is another reason to stay away from fields that can change when choosing a PK: if you have FKs defined, you need to set the FK to ON UPDATE CASCADE so that the new value for the PK propagates to all child tables, else the UPDATE will fail.
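The ON UPDATE CASCADE point can be seen in a small sketch. This uses Python's sqlite3 rather than SQL Server, and the table names (parent_plain, child_plain, parent_cascade, child_cascade) are hypothetical, chosen only to contrast the two behaviours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

conn.executescript("""
CREATE TABLE parent_plain (id INTEGER PRIMARY KEY);
CREATE TABLE child_plain (
    parent_id INTEGER REFERENCES parent_plain(id)  -- no ON UPDATE action
);
CREATE TABLE parent_cascade (id INTEGER PRIMARY KEY);
CREATE TABLE child_cascade (
    parent_id INTEGER REFERENCES parent_cascade(id) ON UPDATE CASCADE
);
INSERT INTO parent_plain VALUES (1);
INSERT INTO child_plain VALUES (1);
INSERT INTO parent_cascade VALUES (1);
INSERT INTO child_cascade VALUES (1);
""")

# Without a cascade, changing a referenced key value is rejected.
try:
    conn.execute("UPDATE parent_plain SET id = 2 WHERE id = 1")
    plain_update_failed = False
except sqlite3.IntegrityError:
    plain_update_failed = True

# With ON UPDATE CASCADE, the new key value propagates to the child row.
conn.execute("UPDATE parent_cascade SET id = 2 WHERE id = 1")
cascaded = conn.execute("SELECT parent_id FROM child_cascade").fetchall()
print(plain_update_failed, cascaded)
```

Either way, a key that never changes avoids the question entirely.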
ProductImage
------------------------------------
ProductID (primary key, foreign key to Product table)
FullImageID (primary key, foreign key to Image table)
ThumbImageID (foreign key; shouldn't this field be in the Image table?)
OrderNumber TINYINT (only here if ordering is per Product, else is in Image table)
The only reason I can see for adding a surrogate key in this situation is if there is a requirement from some other software. Things such as SQL Server Replication (or was it Service Broker?) and/or Entity Framework and/or Full-Text Search. Not sure if those examples do require it, but I have definitely seen 1 or 2 "features" that require a single-field PK.
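As a rough sketch of the bridge table described in this answer (per-Product ordering, PK made of the two FKs), here is a version using Python's sqlite3 instead of SQL Server. The index name and the sample rows are made up, and the thumbnail column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

conn.executescript("""
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE Image (ImageID INTEGER PRIMARY KEY, Url TEXT);
CREATE TABLE ProductImage (
    ProductID   INTEGER NOT NULL REFERENCES Product(ProductID),
    ImageID     INTEGER NOT NULL REFERENCES Image(ImageID),
    OrderNumber INTEGER NOT NULL,
    PRIMARY KEY (ProductID, ImageID)  -- the pair of FKs is the key
);
-- Hypothetical index name; supports sorting a product's images
CREATE INDEX IX_ProductImage_Order ON ProductImage (ProductID, OrderNumber);
INSERT INTO Product VALUES (1, 'Lamp');
INSERT INTO Image VALUES (10, 'a.png'), (11, 'b.png');
INSERT INTO ProductImage VALUES (1, 11, 1), (1, 10, 2);
""")

# The composite PK rejects a second link between the same Product and Image.
try:
    conn.execute("INSERT INTO ProductImage VALUES (1, 10, 3)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Images for product 1 in display order.
ordered = [row[0] for row in conn.execute(
    "SELECT ImageID FROM ProductImage WHERE ProductID = 1 ORDER BY OrderNumber")]
print(duplicate_rejected, ordered)
```

Because OrderNumber is not part of the key, reordering images is a plain UPDATE that touches no key values.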

The best way to achieve this is by having three tables: one for products, one for images and one for their relationship.
products
--------
+ product_id (pk)
- product_name
- product_description
- ...
images
------
+ image_id (pk)
- image_title
- ...
product_images
--------------
+ product_id (fk)
+ image_id (fk)

Why do you have separate tables for fullImage and thumbImage?
Table 1 is better since it allows you to identify individual rows inside the table.
Table 2: I'm sure you can't have two primary keys.
It might be better to have an Image table as follows.
ImageId (primary)
FullImage [actual value/FK]
ThumbNail [actual value/FK]
and then,
ProductImageID (primary)
ProductID [FK]
ImageID [FK]
Hope that helps,
Regards,
Rainy

Composite primary key vs single primary key and unique index

I'm modeling a voting system which has the following entities:
category
nominee
phase
As the names suggest, I'll be storing categories and nominees in the respective tables. The voting will have two phases. In the first phase there'll be 8 nominees per category. The 4 most voted nominees will pass to the second (and final) phase.
So far I have this structure (simplified)
category
id PK
name
nominee
id PK
name
phase
id PK
name
My problem is how to model the voting part. I think I have 2 options, but I'm not sure which one is better or what are the pros / cons of each:
Option 1: Having a category_nominee table with a composite 3 column primary key (I'm pretty sure the "canonical" PK here is formed by these 3 fields; not sure about performance implications; I'm using mysql)
category_nominee
category_id PK
nominee_id PK
phase_id PK
What I don't like about this is that to reference category_nominee from the vote table I'll have to use these 3 columns again, since I don't have a single identifier in category_nominee. So, in the vote table I'll have to repeat the 3 columns:
vote
id
category_id FK
nominee_id FK
phase_id FK
Additionally, I'm not sure if category_id should point to category.id or to category_nominee.category_id (I'm leaning towards the latter)
Option 2: Create an autoincremented id column in category_nominee and make category_id, nominee_id and phase_id a composite unique key.
category_nominee
id
category_id Unique
nominee_id Unique
phase_id Unique
vote
id PK
category_nominee_id FK
This will simplify referencing a category_nominee record and will avoid some repetition. I expect to have much more records in vote than in category_nominee. Still I'm not sure which option is more convenient.
SQL Fiddle for option 1
SQL Fiddle for option 2
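For illustration, Option 1 can be sketched with a composite foreign key, so that vote points at category_nominee as a whole rather than at category, nominee and phase separately. This is a sketch in Python's sqlite3 rather than MySQL, and the sample categories, nominees and phases are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE nominee  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE phase    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE category_nominee (
    category_id INTEGER NOT NULL REFERENCES category(id),
    nominee_id  INTEGER NOT NULL REFERENCES nominee(id),
    phase_id    INTEGER NOT NULL REFERENCES phase(id),
    PRIMARY KEY (category_id, nominee_id, phase_id)
);
CREATE TABLE vote (
    id          INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL,
    nominee_id  INTEGER NOT NULL,
    phase_id    INTEGER NOT NULL,
    -- one composite FK to the three-column key of category_nominee
    FOREIGN KEY (category_id, nominee_id, phase_id)
        REFERENCES category_nominee (category_id, nominee_id, phase_id)
);
-- Made-up sample data
INSERT INTO category VALUES (1, 'Best Film');
INSERT INTO nominee VALUES (1, 'Nominee A'), (2, 'Nominee B');
INSERT INTO phase VALUES (1, 'First'), (2, 'Final');
INSERT INTO category_nominee VALUES (1, 1, 1), (1, 2, 1), (1, 1, 2);
""")

# Nominee A reached the final phase, so this vote is accepted.
conn.execute("INSERT INTO vote (category_id, nominee_id, phase_id) VALUES (1, 1, 2)")

# Nominee B did not reach the final phase, so the composite FK rejects the vote.
try:
    conn.execute("INSERT INTO vote (category_id, nominee_id, phase_id) VALUES (1, 2, 2)")
    invalid_vote_rejected = False
except sqlite3.IntegrityError:
    invalid_vote_rejected = True

vote_count = conn.execute("SELECT COUNT(*) FROM vote").fetchone()[0]
print(invalid_vote_rejected, vote_count)
```

With three separate FKs to category, nominee and phase instead, the second vote would have been accepted even though that nominee is not in that phase.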

From what I learned about modeling data, option 1 is the good option. Maybe this is the reason for the existence of foreign keys. Never seen option 2.
But in your option 1, category_nominee and vote are duplicates. Implement something like this:
category
id PK
name
nominee
id PK
name
phase
id PK
name
vote
(category_id FK
nominee_id FK
phase_id FK) PK
// other fields, required or not
Nothing prevents you from renaming the (category_nominee.)category_id field, if you want unique column names in all your tables. You simply have to link this column to the origin column as a foreign key.

Does it make sense to have three primary keys, two of which are foreign keys, in one table?

I've created a database with three tables in it:
Restaurant
restaurant_id (autoincrement, PK)
Owner
owner_id (autoincrement, PK)
restaurant_id (FK to Restaurant)
Deal
deal_id (autoincrement)
owner_id (FK to Owner)
restaurant_id (FK to Restaurant)
(PK: deal_id, owner_id, restaurant_id)
There can be many owners for each restaurant. I chose two foreign keys for Deal so I can reference the deal by either the owner or the restaurant. The deal table would have three primary keys, two being foreign keys. And it would have two one-to-many relationships pointing to it. All of my foreign keys are primary keys and I don't know if I'll regret doing it like this later on down the road. Does this design make sense, and seem good for what I'm trying to achieve?
Edit: What I really need to accomplish here is this: when an owner is logged in and viewing their account, I want them to be able to see and edit all the deals that are associated with that particular restaurant. And because there can be more than one owner per restaurant, I need to be able to perform a query something like: select * from deals where restaurant_id = restaurant_id. In other words, if I'm an owner and I'm logged in, I need to be able to make the query: get all of the deals that are related not just to me, the owner, but to all of the owners associated with this restaurant.

You're having some trouble with terminology.
A table can only ever have one primary key. It is not possible to create a table with two different primary keys. You can create a table with two different unique indexes (which are much like a primary key) but only one primary key can exist.
What you're asking about is whether you should have a composite or compound primary key; a primary key using more than one column.
Your design is okay, but as written you probably have no need for the column deal_id. It seems to me that restaurant_id and owner_id together are enough to uniquely identify a row in Deal. (This may not be true if one owner can have two different ownership stakes in a single restaurant as the result of recapitalization or buying out another owner, but you don't mention anything like that in your problem statement).
In this case, deal_id is largely wasted storage. There might be an argument to be made for using the deal_id column if you have many tables that have foreign keys pointing to Deal, or if you have instances in which you want to display to the user Deals for multiple restaurants and owners at the same time.
If one of those arguments sways you to adopt the deal_id column, then it, and only it, should be the primary key. There would be nothing added by including the other two columns since the autoincrement value itself would be unique.

If you have a unique field, this should be the PK; that would be the incremented field.
In this specific case it gives you nothing at all to add more fields to this key, and it actually somewhat impacts performance (don't ask me how much; you benchmark it).

If you want to create two foreign keys in the deal table, one for the restaurant and one for the owner, the logic is that a table could exist in the deal even without an owner, or an owner could exist in the deal even without identifying the table; you could still identify the table because it is used as a foreign key in the owner table. But if you are going to put values in every column that you defined as a foreign key, then I think it is going to be redundant. I'm not sure how you will use the deal table later on, but by its name I assume it would be used to identify whether a restaurant table is being reserved by a customer. Given how you have designed your database, you could already identify which table they have reserved without specifying the table as a foreign key in the deal table, since it is already a foreign key in the owner table. You just have to be wise in defining relationships between your tables and avoid redundancy as much as possible. :)

I think it is not best.
First of all, the Deal table PK should be the deal_id. There is no reason to add additional columns to it, and if you did want to refer to the deal_id in another table, you'd have to include the restaurant_id and owner_id, which is not good. Whether deal_id should also be the clustered index (a.k.a. index organized on this column) depends on the data access pattern. Will your database be full of deal_id values most often used for lookup, or will you primarily be looking deals up by owner_id or restaurant_id?
Also, using two separate FKs the way you have described it (as far as I can tell!) would allow a deal to have an owner and restaurant combination that is not valid (combining an owner that does not belong to that restaurant). In the Deal table, instead of one FK to Owner and one FK to Restaurant, if you must have both columns, there should be a composite FK to only the Owner table on (OwnerID, RestaurantID) with a corresponding unique key in the Owner table to allow this link up.
However, with such a simple table structure I don't really see the problem in leaving RestaurantID out of the Deal table, since the OwnerID always fully implies the RestaurantID. Obviously your deals cannot be linked only with the restaurant, because that would imply a 1:M relationship on Deal:Owner. The cost of searching based on Restaurant through the Owner table shouldn't really be that bad.
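A minimal sketch of that composite FK, using Python's sqlite3 instead of the asker's database and made-up sample rows. The column names follow the question (owner_id, restaurant_id), not the OwnerID / RestaurantID spelling used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

conn.executescript("""
CREATE TABLE Restaurant (restaurant_id INTEGER PRIMARY KEY);
CREATE TABLE Owner (
    owner_id      INTEGER PRIMARY KEY,
    restaurant_id INTEGER NOT NULL REFERENCES Restaurant(restaurant_id),
    UNIQUE (owner_id, restaurant_id)  -- target for the composite FK below
);
CREATE TABLE Deal (
    deal_id       INTEGER PRIMARY KEY,
    owner_id      INTEGER NOT NULL,
    restaurant_id INTEGER NOT NULL,
    FOREIGN KEY (owner_id, restaurant_id)
        REFERENCES Owner (owner_id, restaurant_id)
);
-- Made-up sample data: owners 1 and 2 share restaurant 1, owner 3 has restaurant 2
INSERT INTO Restaurant VALUES (1), (2);
INSERT INTO Owner VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO Deal VALUES (1, 1, 1), (2, 2, 1), (3, 3, 2);
""")

# Owner 3 belongs to restaurant 2, so pairing them with restaurant 1 is rejected.
try:
    conn.execute("INSERT INTO Deal VALUES (4, 3, 1)")
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True

# All deals for a restaurant, regardless of which owner created them.
deals_for_restaurant_1 = [row[0] for row in conn.execute(
    "SELECT deal_id FROM Deal WHERE restaurant_id = ? ORDER BY deal_id", (1,))]
print(mismatch_rejected, deals_for_restaurant_1)
```

The last query is the "all deals for my restaurant" lookup from the question's edit, and it needs no join because restaurant_id is kept in Deal.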

It's not wrong; it works. But it's not recommended.
Autoincrement primary keys work without foreign keys (or master keys).
In some databases, you cannot use several fields as a single primary key.
Compound (composite) primary keys are more difficult to handle in a query.
Compound Primary Key Query Example:
SELECT
D.*
FROM
Restaurant AS R,
Owner AS O,
Deal AS D
WHERE
(1=1) AND
(D.RestaurantKey = R.RestaurantKey) AND
(D.OwnerKey = O.OwnerKey)
Versus
Single Primary Key Query Example:
SELECT
D.*
FROM
Owner AS O,
Deal AS D
WHERE
(D.OwnerKey = O.OwnerKey)
Sometimes you have to change the foreign key value of a record so that it points to another record. For example, your customers have already ordered, the deal record is registered, and they decide to change from one restaurant table to another. So the data must be updated in the "Owner" and "Deal" tables.
+----------+-------------+
| OwnerKey | OwnerName   |
+----------+-------------+
| 1        | Anne Smith  |
| 2        | John Connor |
| 3        | Mike Doe    |
+----------+-------------+
+----------+---------+------------+
| OwnerKey | DealKey | Food       |
+----------+---------+------------+
| 1        | 1       | Hamburguer |
| 2        | 2       | Hot-Dog    |
| 3        | 3       | Hamburguer |
| 1        | 3       | Soda       |
| 2        | 1       | Apple Pie  |
| 3        | 3       | Chips      |
+----------+---------+------------+
If you use compound primary keys, you have to create a new record for "Owner" and new records for "Deal", copy the other fields, and delete the previous records.
If you use single keys, you just have to change the foreign key of Table, without inserting or deleting new records.
Cheers.

How to save a marital relationship in a database

I have to save this information in a database
Person -> is married to -> Person
Where should I save that information? What is the proper design pattern to apply here?
Thank you!

If you can only be married to one person: 1:1
-------------
- Person -
-------------
id (key)
married_to_id (foreign key)
If you can be married to more than one person or want to keep track of previous marriages: n:n
-------------
- Person -
-------------
person_id (key)
-------------
- Marriage -
-------------
first_person_id (foreign key)
second_person_id (foreign key)
start_date
end_date
(also first_person_id + second_person_id + date form a unique key for marriage. You could leave out the date, but then remarriages wouldn't be tracked)

Here is a hypothetical schema you can use. All people are in a single table, and each person has a unique id. Marriages are in a relationship table, with foreign keys.
PERSONS
- ID - INTEGER, PK
- FIRSTNAME - VARCHAR(20)
- LASTNAME - VARCHAR(20)
- SEX - CHAR(1)
- ... any other fields
MARRIAGES
- PERSON1_ID - INTEGER, FK
- PERSON2_ID - INTEGER, FK
- MARRIAGE_DATE - DATE
- ANNULMENT_DATE - DATE
- ... any other fields

This is a great question for teaching schema design. What seems like a simple problem can easily become quite complicated:
E.g., how to handle:
- marriages of more than two people
- different types of marriage (legal, religious, other)
- concurrent marriages
- repeat marriages
- divorce
- self-marriage (hey, it happened on Glee!)
The trick, if there is one, is to carefully think out all the permutations of what you are trying to model. Only then do you actually go ahead and model it.

I would recommend the following structure.
Let's say the table name is Person.
PersonId (int, Key)
MarriedTo (int, nullable)
.....
No need to create a foreign key relationship.

This sounds like a use for a simple lookup table. The important part is having two fields: one a foreign key for Person1's ID field, the other a foreign key for Person2's ID field. Any details about the marriage (dates, whether it is still current and so on) would also be stored in this table.
That would facilitate people having had multiple marriages, polygamous relationships and so on. If you want a simple 1:1 relationship you could just include a foreign key reference to the spouse in the person table, but it would be considerably less flexible.

You could do it with a "Spouse" column on the "Person" table which can be null (for the case of an unmarried person).
If married, this holds the id of the other person and is a foreign key.
A better solution would be a separate "Marriage" table that has at least three columns:
MarriageId
Person1Id
Person2Id
...
The person IDs are foreign keys into the "Person" table, and you should make the combination of Person1Id and Person2Id unique, storing each pair in a fixed order (for example, the lower ID first), to avoid adding a row where the people are swapped over.
Though it should be pointed out that both these models are quite basic and make assumptions about how many people can be in one marriage ;)
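A unique key on the two person columns does not by itself catch a swapped pair, since (1, 2) and (2, 1) are different values. One possible way around that, sketched here with Python's sqlite3, is to always store the lower ID first and enforce it with a CHECK. The add_marriage helper and the sample people are hypothetical, and the sketch assumes two-person marriages with no self-marriage and no remarriage of the same couple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

conn.executescript("""
CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Marriage (
    MarriageId INTEGER PRIMARY KEY,
    Person1Id  INTEGER NOT NULL REFERENCES Person(PersonId),
    Person2Id  INTEGER NOT NULL REFERENCES Person(PersonId),
    CHECK (Person1Id < Person2Id),    -- fixed order: lower ID first
    UNIQUE (Person1Id, Person2Id)     -- each pair at most once
);
INSERT INTO Person VALUES (1, 'Ann'), (2, 'Bob');
""")

def add_marriage(conn, a, b):
    """Hypothetical helper: normalise the pair order before inserting."""
    first, second = sorted((a, b))
    conn.execute(
        "INSERT INTO Marriage (Person1Id, Person2Id) VALUES (?, ?)",
        (first, second))

add_marriage(conn, 2, 1)  # stored as (1, 2)

# The same couple in the other order hits the UNIQUE constraint.
try:
    add_marriage(conn, 1, 2)
    swapped_duplicate_rejected = False
except sqlite3.IntegrityError:
    swapped_duplicate_rejected = True

# A raw insert that bypasses the helper is stopped by the CHECK.
try:
    conn.execute("INSERT INTO Marriage (Person1Id, Person2Id) VALUES (2, 1)")
    raw_swapped_rejected = False
except sqlite3.IntegrityError:
    raw_swapped_rejected = True

marriage_count = conn.execute("SELECT COUNT(*) FROM Marriage").fetchone()[0]
print(swapped_duplicate_rejected, raw_swapped_rejected, marriage_count)
```

To track remarriages, a date column could be added to the unique key, as one of the other answers suggests.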
