Composite primary key vs single primary key and unique index - mysql

I'm modeling a voting system which has the following entities:
category
nominee
phase
As the names suggest, I'll be storing categories and nominees in the respective tables. The voting will have two phases. In the first phase there'll be 8 nominees per category. The 4 most-voted nominees will pass to the second (and final) phase.
So far I have this structure (simplified):
category
id PK
name
nominee
id PK
name
phase
id PK
name
My problem is how to model the voting part. I think I have 2 options, but I'm not sure which one is better or what are the pros / cons of each:
Option 1: Having a category_nominee table with a composite 3-column primary key (I'm pretty sure the "canonical" PK here is formed by these 3 fields; I'm not sure about the performance implications; I'm using MySQL)
category_nominee
category_id PK
nominee_id PK
phase_id PK
What I don't like about this is that to reference category_nominee from the vote table I'll have to use these 3 columns again, since I don't have a single identifier in category_nominee. So, in the vote table I'll have to repeat the 3 columns:
vote
id
category_id FK
nominee_id FK
phase_id FK
Additionally, I'm not sure if category_id should point to category.id or to category_nominee.category_id (I'm leaning towards the latter)
Option 2: Create an autoincremented id column in category_nominee and make category_id, nominee_id and phase_id a composite unique key.
category_nominee
id
category_id Unique
nominee_id Unique
phase_id Unique
vote
id PK
category_nominee_id FK
This will simplify referencing a category_nominee record and will avoid some repetition. I expect to have many more records in vote than in category_nominee. Still, I'm not sure which option is more convenient.
SQL Fiddle for option 1
SQL Fiddle for option 2
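To make option 2 concrete, here is a minimal runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the DDL is close but not identical (in MySQL the id columns would typically be INT AUTO_INCREMENT). The sample rows and the try_sql helper are made up for illustration.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; the DDL is close but not identical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE nominee  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE phase    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Option 2: surrogate id plus a composite unique key on the natural key.
CREATE TABLE category_nominee (
    id          INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES category (id),
    nominee_id  INTEGER NOT NULL REFERENCES nominee (id),
    phase_id    INTEGER NOT NULL REFERENCES phase (id),
    UNIQUE (category_id, nominee_id, phase_id)
);

-- A vote points at one category_nominee row through a single column.
CREATE TABLE vote (
    id                  INTEGER PRIMARY KEY,
    category_nominee_id INTEGER NOT NULL REFERENCES category_nominee (id)
);

-- Hypothetical sample data.
INSERT INTO category VALUES (1, 'Category A');
INSERT INTO nominee  VALUES (1, 'Nominee A');
INSERT INTO phase    VALUES (1, 'First phase');
INSERT INTO category_nominee VALUES (1, 1, 1, 1);
""")


def try_sql(sql, params=()):
    """Run one statement; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout the unique key still rejects a second (1, 1, 1) row in category_nominee, and a vote for a non-existent category_nominee id is refused by the foreign key.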

From what I learned about data modeling, option 1 is the better option. Maybe this is the very reason foreign keys exist. I have never seen option 2 used.
But in your option 1, category_nominee and vote duplicate each other. Implement something like this:
category
id PK
name
nominee
id PK
name
phase
id PK
name
vote
(category_id FK
nominee_id FK
phase_id FK) PK
// other fields, required or not
Nothing prevents you from renaming the (category_nominee.)category_id field, if you want unique column names in all your tables. You simply have to link this column to the origin column as a foreign key.
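On the open question of where vote.category_id should point under option 1, one possibility is a composite foreign key from vote to category_nominee. The sketch below is only an illustration under that assumption, again using Python's sqlite3 in place of MySQL; the sample data and the cast_vote and tally helpers are hypothetical.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; the DDL is close but not identical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE nominee  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE phase    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Option 1: the three foreign keys together form the primary key.
CREATE TABLE category_nominee (
    category_id INTEGER NOT NULL REFERENCES category (id),
    nominee_id  INTEGER NOT NULL REFERENCES nominee (id),
    phase_id    INTEGER NOT NULL REFERENCES phase (id),
    PRIMARY KEY (category_id, nominee_id, phase_id)
);

-- vote repeats the three columns and references them as one composite key,
-- so a vote can only name a nominee actually listed for that category/phase.
CREATE TABLE vote (
    id          INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL,
    nominee_id  INTEGER NOT NULL,
    phase_id    INTEGER NOT NULL,
    FOREIGN KEY (category_id, nominee_id, phase_id)
        REFERENCES category_nominee (category_id, nominee_id, phase_id)
);

-- Hypothetical sample data: nominee 3 exists but is not nominated.
INSERT INTO category VALUES (1, 'Category A');
INSERT INTO nominee  VALUES (1, 'Nominee A'), (2, 'Nominee B'), (3, 'Nominee C');
INSERT INTO phase    VALUES (1, 'First phase');
INSERT INTO category_nominee VALUES (1, 1, 1), (1, 2, 1);
""")


def cast_vote(category_id, nominee_id, phase_id):
    """Insert a vote; return False if the composite foreign key rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO vote (category_id, nominee_id, phase_id) VALUES (?, ?, ?)",
                (category_id, nominee_id, phase_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def tally(category_id, phase_id):
    """Count votes per nominee without joining back to category_nominee."""
    return conn.execute(
        "SELECT nominee_id, COUNT(*) FROM vote"
        " WHERE category_id = ? AND phase_id = ?"
        " GROUP BY nominee_id ORDER BY COUNT(*) DESC, nominee_id",
        (category_id, phase_id),
    ).fetchall()
```

The trade-off shows up in tally: because the natural key columns live in vote, counting votes per nominee needs no join, at the cost of three columns per vote row.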

Related

In what case adding an ID field to a many-to-many table might be a good idea?

Let's say there are two entities - Product and Image with a many-to-many relationship between them. The order of images associated with each product does matter.
Product
------------------------------------
ProductID (primary key)
ProductName
...
Image
------------------------------------
ImageID (primary key)
Url
Size
...
What are the cons and pros of the following three many-to-many "bridge" table approaches for solving this problem?
ProductImage
------------------------------------
ProductImageID (primary key, identity)
ProductID (foreign key)
FullImageID (foreign key)
ThumbImageID (foreign key)
OrderNumber
or
ProductImage
------------------------------------
ProductID (primary key, foreign key)
IndexNumber (primary key)
FullImageID (foreign key)
ThumbImageID (foreign key)
or
ProductImage
------------------------------------
ProductID (primary key, foreign key)
FullImageID (primary key, foreign key)
ThumbImageID (foreign key)
OrderNumber (index)
There is no purpose (that I have ever found) in adding a surrogate key (i.e. the IDENTITY field) to a many-to-many "bridge" table (or whatever you want to call it). However, none of your proposed schemas is correct.
In order to get the ideal setup, you first need to determine the scope / context of the following requirement:
The order of images associated with each product does matter.
Should the ordering of the images be the same, in relation to each other, regardless of what Products they are associated with? Meaning, images A, B, C, and D are always in alphabetical order, regardless of what combination of them any particular Product has.
Or, can the ordering change based on the Product that the Image is associated with?
If the ordering of the Images needs to remain consistent across Products, then the OrderNumber field needs to go into the Image table. Else, if the ordering can change per Product, then the OrderNumber field goes into this bridge / relationship table.
In either case:
the PK is the combination of FKs:
A Primary Key uniquely, and hopefully reliably (meaning that it doesn't change), identifies each row. And if at all possible, it should be meaningful. Using the combination of the two FK fields gives exactly that while enforcing that uniqueness (so that one Product cannot be given the same Image multiple times, and vice-versa). Even if these two fields weren't chosen as the PK, they would still need to be grouped into a UNIQUE INDEX or UNIQUE CONSTRAINT to enforce that data integrity (effectively making it an "alternate key"). But since these IDs won't be changing (only inserted and deleted) they are well suited to be the PK. And if you are using SQL Server (and maybe others) and decide to use this PK as the Clustered index, then you will have the benefit of having both ProductID and ImageID in any Non-Clustered Indexes. So when you need to sort by [OrderNumber], the Non-Clustered Index on that field will automatically be a covering index because the only two data fields you need from it are already there.
On the other hand, placing the [OrderNumber] field into the PK has a few downsides:
It can change, which is not ideal for PKs.
It removes the ability to enforce that a ProductID and ImageID can only relate to each other one time. Hence you would need that additional UNIQUE INDEX or UNIQUE CONSTRAINT in order to maintain the data integrity. Else, even if you include all 3 fields in the PK, it still allows the ProductID + ImageID combination to appear multiple times, once for each different value of [OrderNumber].
there is no need for an IDENTITY field:
With the above information in mind, all of the requirements of a PK have already been met. Adding a surrogate key / auto-increment field adds no value, but does take up additional space.
And to address the typical reply to the above statement regarding the surrogate key not adding any value, some will say that it makes JOINs easier if this combination of ProductID+ImageID needs to be Foreign Keyed to a child table. Maybe each combination can have attributes that are not singular like [OrderNum] is. An example might be "tags" (although those would most likely be associated with just ImageID, but it works as a basic example). Some people prefer to only place a single ID field in the child table because it is "easier". Well, it's not easier. By placing both ImageID and ProductID fields in the child table and doing the FK on both back to this PK, you now have meaningful values in the child table and will not need to JOIN to this [ProductImage] table all of the time just to get that information (which will probably be needed in most queries that are not simply listing or updating those attributes for a particular ProductID+ImageID combination). And if it is not clear, adding a surrogate key still requires a UNIQUE INDEX or UNIQUE CONSTRAINT to enforce the data integrity of unique ProductID+ImageID combinations (as stated above in the first bullet point).
And placing both ID fields into the child table is another reason to stay away from fields that can change when choosing a PK: if you have FKs defined, you need to set the FK to ON UPDATE CASCADE so that the new value for the PK propagates to all child tables, else the UPDATE will fail.
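The ON UPDATE CASCADE point can be tried out with a small sketch. It uses Python's sqlite3 as a stand-in (SQLite, like InnoDB, rejects the parent update when no cascade is declared); the two child tables, tag_cascade and tag_plain, and the renumber_image helper are hypothetical names invented for the comparison.

```python
import sqlite3

# SQLite (in memory) stands in for the real server; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE product_image (
    product_id INTEGER NOT NULL,
    image_id   INTEGER NOT NULL,
    PRIMARY KEY (product_id, image_id)
);

-- Child table whose composite FK follows changes to the parent key.
CREATE TABLE tag_cascade (
    product_id INTEGER NOT NULL,
    image_id   INTEGER NOT NULL,
    tag        TEXT NOT NULL,
    FOREIGN KEY (product_id, image_id)
        REFERENCES product_image (product_id, image_id) ON UPDATE CASCADE
);

-- Same shape, but with no ON UPDATE clause.
CREATE TABLE tag_plain (
    product_id INTEGER NOT NULL,
    image_id   INTEGER NOT NULL,
    tag        TEXT NOT NULL,
    FOREIGN KEY (product_id, image_id)
        REFERENCES product_image (product_id, image_id)
);

INSERT INTO product_image VALUES (1, 10), (2, 20);
INSERT INTO tag_cascade VALUES (1, 10, 'front');
INSERT INTO tag_plain   VALUES (2, 20, 'back');
""")


def renumber_image(product_id, old_image_id, new_image_id):
    """Change part of the parent's composite PK; return False if an FK blocks it."""
    try:
        with conn:
            conn.execute(
                "UPDATE product_image SET image_id = ?"
                " WHERE product_id = ? AND image_id = ?",
                (new_image_id, product_id, old_image_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Here renumber_image(1, 10, 11) goes through and the tag_cascade row follows the new key, while renumber_image(2, 20, 21) is refused because the tag_plain row would be orphaned.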
ProductImage
------------------------------------
ProductID (primary key, foreign key to Product table)
FullImageID (primary key, foreign key to Image table)
ThumbImageID (foreign key; shouldn't this field be in the Image table?)
OrderNumber TINYINT (only here if ordering is per Product, else is in Image table)
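As a rough sketch of that layout for the per-Product ordering case: the code below uses Python's sqlite3 in place of SQL Server or MySQL, leaves ThumbImageID out (assuming it moves to the Image table), and uses invented sample rows plus two hypothetical helpers, attach and image_urls.

```python
import sqlite3

# SQLite (in memory) as a stand-in; column names are lower-cased for brevity.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT NOT NULL);
CREATE TABLE image   (image_id   INTEGER PRIMARY KEY, url TEXT NOT NULL);

-- Bridge table: the PK is the combination of the two FKs, no surrogate id.
CREATE TABLE product_image (
    product_id   INTEGER NOT NULL REFERENCES product (product_id),
    image_id     INTEGER NOT NULL REFERENCES image (image_id),
    order_number INTEGER NOT NULL,  -- ordering is per product in this sketch
    PRIMARY KEY (product_id, image_id)
);
CREATE INDEX ix_product_image_order ON product_image (product_id, order_number);

-- Hypothetical sample data: image 10 is used by both products.
INSERT INTO product VALUES (1, 'Chair'), (2, 'Table');
INSERT INTO image   VALUES (10, 'a.jpg'), (11, 'b.jpg');
INSERT INTO product_image VALUES (1, 10, 2), (1, 11, 1), (2, 10, 1);
""")


def attach(product_id, image_id, order_number):
    """Link an image to a product; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO product_image VALUES (?, ?, ?)",
                (product_id, image_id, order_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def image_urls(product_id):
    """Return the product's image URLs in their per-product order."""
    rows = conn.execute(
        "SELECT image.url FROM product_image"
        " JOIN image ON image.image_id = product_image.image_id"
        " WHERE product_image.product_id = ?"
        " ORDER BY product_image.order_number",
        (product_id,),
    )
    return [url for (url,) in rows]
```

The composite PK is what stops the same image from being attached to a product twice; the order_number column only decides the sort.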
The only reason I can see for adding a surrogate key in this situation is if there is a requirement from some other software. Things such as SQL Server Replication (or was it Service Broker?) and/or Entity Framework and/or Full-Text Search. Not sure if those examples do require it, but I have definitely seen 1 or 2 "features" that require a single-field PK.
The best way to achieve this is by having three tables: one for products, one for images and one for their relationship.
products
--------
+ product_id (pk)
- product_name
- product_description
- ...
images
------
+ image_id (pk)
- image_title
- ...
product_images
--------------
+ product_id (fk)
+ image_id (fk)
Why do you have separate tables for fullImage and thumbImage?
Table 1 is better since it allows you to identify individual rows inside the table.
For Table 2, I'm fairly sure you can't have two primary keys (although a single composite primary key spanning both columns is allowed).
It might be better to have an Image table as follows.
ImageId (primary)
FullImage [actual value/FK]
ThumbNail [actual value/FK]
and then,
ProductImageID (primary)
ProductID [FK]
ImageID [FK]
Hope that helps,
Regards,
Rainy

Creating unique Key as FK - MySQL

I have an endpoint /user which creates a unique UUID for a user. It inserts the data (phone no, gender, age) into a Cassandra table and then forwards the same data, along with the newly created user_id, to another server that uses MySQL as its DB.
In MySQL, the table is as follows:
id(varchar)
phone no
age
gender
etc.
But I've read that using VARCHAR as a PK is a very bad solution. Hence I modified my table as follows:
id (integer auto increment)
user_id (varchar unique)
phone no
age
gender
etc.
I have another endpoint /recharge, which contains the user_id (UUID), recharge_amount, operator, etc..
My recharge table is as follows:
user_id FK
amount
operator
The problem is that whenever I receive data for /recharge, I need to look up the respective id of the user in the users table in order to reference it in the recharge table, which is an extra operation, i.e. for every insert there will be an extra read.
Can I reference/use the unique key as my FK in the recharge table? If not, what would be a possible solution?
Yes, you can use a unique key as the target of a foreign key.
For a column to be referenced by an FK in another table, it has to be a PK or a unique key.
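A minimal sketch of that arrangement, with Python's sqlite3 standing in for MySQL (in MySQL, id would be INT AUTO_INCREMENT and user_id a VARCHAR or CHAR(36) column with a UNIQUE index). The create_user and add_recharge helpers and the sample values are hypothetical.

```python
import sqlite3
import uuid

# SQLite (in memory) stands in for MySQL; the DDL is close but not identical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,   -- surrogate key
    user_id  TEXT NOT NULL UNIQUE,  -- UUID issued by the /user endpoint
    phone_no TEXT,
    age      INTEGER,
    gender   TEXT
);

-- The FK targets the unique key users.user_id, not the primary key users.id.
CREATE TABLE recharge (
    user_id  TEXT NOT NULL REFERENCES users (user_id),
    amount   INTEGER NOT NULL,
    operator TEXT NOT NULL
);
""")


def create_user(phone_no, age, gender):
    """Insert a user with a fresh UUID and return that UUID."""
    user_id = str(uuid.uuid4())
    with conn:
        conn.execute(
            "INSERT INTO users (user_id, phone_no, age, gender) VALUES (?, ?, ?, ?)",
            (user_id, phone_no, age, gender),
        )
    return user_id


def add_recharge(user_id, amount, operator):
    """Insert a recharge keyed by UUID; no prior lookup of users.id is needed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO recharge (user_id, amount, operator) VALUES (?, ?, ?)",
                (user_id, amount, operator),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The insert into recharge uses the UUID from the request directly, and the database still refuses a UUID that has no matching users row.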

table relationship to many different tables

I'm creating a system that involves uploads. These uploads need to be attached to one of a number of things, e.g. a message, a contract, a project.
Is it okay to have one table for attachments and then link them to these types? The caveat is that each attachment needs to be linked to an individual id from one of these types,
eg. in the attachment table
type - links to a table with the list of message contract etc
id - the id of the row for that type, so if the type is message it would refer to message.id, and if it was a contract it would refer to contract.id
But then there are no foreign key checks. Yet it seems odd to have to add one foreign key per type, e.g.
type
message_id (FK)
contract_id (FK)
project_id (FK)
Edit: there are more than 3 tables, more like 5-6, and perhaps more in the future too.
I would recommend:
A table for attachment (attachment_id + other columns necessary for your attachment)
For each possible type (message, contract, project), you will have a relationship table.
Example:
MessageAttachmentTable: message_id (FK), attachment_id (FK)
ContractAttachmentTable: contract_id (FK), attachment_id (FK)
That way, you can have all the database integrity constraints with no unused columns.
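A minimal sketch of the relationship-table approach, using Python's sqlite3 as a stand-in for whatever database is in use. Only two of the types are shown, and the LINK_TABLES mapping, the link and attachments_for helpers, and the sample rows are all hypothetical.

```python
import sqlite3

# SQLite (in memory) as a stand-in; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE attachment (attachment_id INTEGER PRIMARY KEY, file_name TEXT NOT NULL);
CREATE TABLE message    (id INTEGER PRIMARY KEY, body  TEXT);
CREATE TABLE contract   (id INTEGER PRIMARY KEY, title TEXT);

-- One relationship table per owning type, each with real foreign keys.
CREATE TABLE message_attachment (
    message_id    INTEGER NOT NULL REFERENCES message (id),
    attachment_id INTEGER NOT NULL REFERENCES attachment (attachment_id),
    PRIMARY KEY (message_id, attachment_id)
);
CREATE TABLE contract_attachment (
    contract_id   INTEGER NOT NULL REFERENCES contract (id),
    attachment_id INTEGER NOT NULL REFERENCES attachment (attachment_id),
    PRIMARY KEY (contract_id, attachment_id)
);

INSERT INTO attachment VALUES (1, 'a.pdf'), (2, 'b.pdf');
INSERT INTO message    VALUES (1, 'hello');
INSERT INTO contract   VALUES (1, 'Contract 1');
""")

# Fixed mapping from type name to (relationship table, owner column); only these
# known identifiers are ever interpolated into SQL text.
LINK_TABLES = {
    "message": ("message_attachment", "message_id"),
    "contract": ("contract_attachment", "contract_id"),
}


def link(kind, owner_id, attachment_id):
    """Attach a file to a message or contract; False if a constraint rejects it."""
    table, column = LINK_TABLES[kind]
    try:
        with conn:
            conn.execute(
                f"INSERT INTO {table} ({column}, attachment_id) VALUES (?, ?)",
                (owner_id, attachment_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def attachments_for(kind, owner_id):
    """List the file names attached to one message or contract."""
    table, column = LINK_TABLES[kind]
    rows = conn.execute(
        f"SELECT a.file_name FROM {table} AS t"
        f" JOIN attachment AS a ON a.attachment_id = t.attachment_id"
        f" WHERE t.{column} = ? ORDER BY a.attachment_id",
        (owner_id,),
    )
    return [name for (name,) in rows]
```

Adding a new owning type then means adding one more relationship table and one more LINK_TABLES entry, without touching attachment itself.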
Three NULLable fields with foreign keys to the respective tables is in my opinion the most sensible approach.
Moreover, if you have three foreign key fields, you don't even have to store the "type", since it is determined by the foreign key field which is not NULL.
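A sketch of the nullable-foreign-key variant follows, using Python's sqlite3 as a stand-in. The CHECK constraint requiring exactly one owner is an added assumption, not something stated above (and MySQL only enforces CHECK constraints from version 8.0.16 on); the add_attachment and owner_type helpers are hypothetical.

```python
import sqlite3

# SQLite (in memory) as a stand-in; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE message  (id INTEGER PRIMARY KEY);
CREATE TABLE contract (id INTEGER PRIMARY KEY);
CREATE TABLE project  (id INTEGER PRIMARY KEY);

CREATE TABLE attachment (
    id          INTEGER PRIMARY KEY,
    file_name   TEXT NOT NULL,
    message_id  INTEGER REFERENCES message (id),   -- NULL unless owned by a message
    contract_id INTEGER REFERENCES contract (id),
    project_id  INTEGER REFERENCES project (id),
    -- Assumption: every attachment has exactly one owner.
    CHECK ((message_id IS NOT NULL) + (contract_id IS NOT NULL)
           + (project_id IS NOT NULL) = 1)
);

INSERT INTO message  VALUES (1);
INSERT INTO contract VALUES (1);
INSERT INTO project  VALUES (1);
""")


def add_attachment(file_name, message_id=None, contract_id=None, project_id=None):
    """Insert an attachment; return its id, or None if a constraint rejects it."""
    try:
        with conn:
            cur = conn.execute(
                "INSERT INTO attachment (file_name, message_id, contract_id, project_id)"
                " VALUES (?, ?, ?, ?)",
                (file_name, message_id, contract_id, project_id),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


def owner_type(attachment_id):
    """Derive the 'type' from whichever FK column is not NULL."""
    row = conn.execute(
        "SELECT CASE WHEN message_id  IS NOT NULL THEN 'message'"
        "            WHEN contract_id IS NOT NULL THEN 'contract'"
        "            ELSE 'project' END"
        " FROM attachment WHERE id = ?",
        (attachment_id,),
    ).fetchone()
    return row[0] if row else None
```

No type column is stored; owner_type reads it off the populated foreign key, which is the point made in the answer above.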

Database Design with regard to my business logic

I am building an invoicing application with the following business logic.
a) Place a new order for a customer. (An order is a group of three related components: estimate, invoice and purchaseorder.)
b) After placing an order, a new estimate can be generated. An order will have only one estimate. (An estimate consists of item details.)
c) With reference to the estimate of the order, an invoice can be generated. An invoice qualifies for a price discount. Apart from item details, an invoice contains some expenses. An order can contain only one invoice.
d) With reference to the invoice of the order, a PurchaseOrder can be generated. A PurchaseOrder contains item information about the vendor purchase. An order can contain multiple PurchaseOrders.
Here is the database table design I have come up with.
While it all looks good, I am having difficulty deciding where to store the list of items that belongs to a particular estimate, invoice or purchaseorder of the order.
I have thought of several solutions.
Approach A: create a separate item list table for each document type (estimate, invoice and purchase order).
Tables: estimate_item, invoice_item, purchaseorder_item. (These tables contain columns similar to those of order_item in the above image.)
Problem: all three tables consist of identical columns storing identical information; the only difference is the foreign key that will be stored.
Approach B: create one item list table, order_item.
Table name: order_item
Problem: I'm not sure what to store as the foreign key in this table, since the foreign key can come from three different tables. I thought of a few ways of handling foreign keys in this table, as follows.
1) A type column naming the referenced table (example values: estimate, invoice, purchaseorder) plus a type_id column holding the key of a row in any of the three tables. Problem: I am using a naming convention for column names (for example, a column name ending with tablename_id defines a foreign key), and this method violates that rule.
2) Foreign key columns: order_id, estimate_id, invoice_id, purchaseorder_id. Problem: unnecessary foreign key columns are defined.
I want to know how I should store the foreign key in the order_item table so that it identifies the order and the estimate/invoice/purchaseorder it belongs to.
The relationships between the tables are:
id is the primary key for all the tables
table name: order relates to (contact, estimate, invoice, shipment) tables.
column name: contact_id (foreign key(referring to id column of the contact table)).
column name: estimate_id (foreign key(referring to id column of the estimate table)).
column name: invoice_id (foreign key(referring to id column of the invoice table)).
column name: shipment_id (foreign key(referring to id column of the shipment table)).
table name: purchaseorder (this has a one-to-many relationship with the order table)
column name: order_id (foreign key(referring to id column of the order table)).
column name: contact_id (foreign key(referring to id column of the contact table)).
The question is how to go about storing the foreign keys in the order_item table.
Thank you.
Update 1:
Please note that each of the estimate, invoice and purchaseorder tables will have items of its own, with no relation to the items of the others.
Hi, I'm not sure how the relations work. For instance, you have 'estimate' pointing to 'order item', but I don't see what key you have to make that join (or look-up). Likewise, 'order' points to 'estimate', but how are those two joined? I don't see any shared attributes between those entities.
I'm assuming 'id' is just something to make rows in each particular table unique, and that these are not ids that have value to the application. So I would think you'd need to carry estimate.reference number into the 'order item' table. This is just a cursory comment.
Also, it would help clarity if keys were listed first. In 'order item' you have the attribute 'order id' (which appears to be an FK) buried at the end of the list of other attributes, which makes this hard to read.
If I understand you correctly, each document associated with an order (i.e. an estimate, purchaseorder and/or invoice) might contain a different list of items.
If that is the case, I would probably create a Documents table along the following lines:
CREATE TABLE Documents (
  DocumentId INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  OrderId INT NOT NULL,
  -- you can move any fields common to all document types here
  -- e.g. date created, reference #, etc.
  -- `order` is a reserved word in MySQL, so it has to be quoted with backticks
  FOREIGN KEY (OrderId) REFERENCES `order` (id)
);
And then your order_item, estimate, purchaseorder and invoice tables all reference their associated entry in this table:
ALTER TABLE [tablename]
  ADD COLUMN DocumentId INT NOT NULL,
  ADD FOREIGN KEY (DocumentId) REFERENCES Documents (DocumentId);
Is this what you're after?
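To show how the pieces could fit together, here is a small sketch of the Documents idea using Python's sqlite3 as a stand-in for MySQL. The table is called orders here purely to avoid quoting the reserved word order, the column lists are cut down to the bare minimum, and the add_item and items_for_document helpers and sample rows are hypothetical.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; the DDL is close but not identical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
-- "orders" is a hypothetical name used to sidestep the reserved word ORDER.
CREATE TABLE orders (id INTEGER PRIMARY KEY);

-- One row per estimate, invoice or purchase order belonging to an order.
CREATE TABLE documents (
    document_id INTEGER PRIMARY KEY,
    order_id    INTEGER NOT NULL REFERENCES orders (id)
);

CREATE TABLE estimate (
    id          INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL UNIQUE REFERENCES documents (document_id)
);
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL UNIQUE REFERENCES documents (document_id),
    discount    INTEGER NOT NULL DEFAULT 0
);

-- A single item table: each item belongs to exactly one document.
CREATE TABLE order_item (
    id          INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents (document_id),
    description TEXT NOT NULL,
    quantity    INTEGER NOT NULL
);

-- Hypothetical sample data: order 1 with an estimate and an invoice.
INSERT INTO orders    VALUES (1);
INSERT INTO documents VALUES (1, 1), (2, 1);
INSERT INTO estimate (id, document_id)           VALUES (1, 1);
INSERT INTO invoice  (id, document_id, discount) VALUES (1, 2, 5);
""")


def add_item(document_id, description, quantity):
    """Add an item to a document; return False if the document does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO order_item (document_id, description, quantity)"
                " VALUES (?, ?, ?)",
                (document_id, description, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def items_for_document(document_id):
    """Return (description, quantity) pairs for one document, in insert order."""
    return conn.execute(
        "SELECT description, quantity FROM order_item"
        " WHERE document_id = ? ORDER BY id",
        (document_id,),
    ).fetchall()
```

With this shape, order_item needs only one foreign key column, and the estimate and the invoice of the same order can carry different item lists.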

MySql foreign key, how it actually works

Maybe a newbie question about foreign keys, but I want to know the answer.
Let's say I have 2 tables:
products
--------
product_id (int)
name (unique) (varchar)
description (text)
vendor (varchar) (foreign key: vendors.name)
AND
vendors
--------
name (varchar)
I know that I should use a vendor_id (int), but this is just an example to help me ask my question.
So: if I create vendor: Apple, and product: 1, iPhone 4, Description.., Apple then the varchar "Apple" will be stored both in products and vendors, or just in vendors (because of the foreign key)?
Is this a wrong db design?
This is called "normalization" in the database. In your example, there are a couple things to consider:
In order for products to have a foreign key to vendors, vendors needs a key. Is name the primary key for vendors? If so, then the foreign key would also be a varchar. In that case, yes, the value "Apple" would be stored in both. (Note that this isn't a very good idea.)
If you add a vendor_id integer column to the vendors table, and it is the primary key for that table, then you can add a vendor_id (or any other name) column to the products table and make it a foreign key to the vendors table. In this case, only that integer would be stored in both tables. This is where the data becomes normalized. A small, simpler data type (integer) links the tables, which contain the actual data which describes the records.
Only that key value is stored in both tables. It's used as a reference to join the tables when selecting data. For example, in order to select a given product and its vendor, you'd do something like this:
SELECT products.name, products.description, vendors.name AS vendor
FROM products INNER JOIN vendors ON products.vendor_id = vendors.vendor_id
WHERE products.product_id = ?id
This would "join" the two tables into a single table (not really, just for the query) and select the record from it.
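The integer-key variant and the join can be tried with a short sketch. It uses Python's sqlite3 as a stand-in for MySQL, so the placeholder is written :id rather than ?id; the product_with_vendor helper is hypothetical, and the sample row is the iPhone example from the question.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; the DDL is close but not identical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE vendors (
    vendor_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE products (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    description TEXT,
    vendor_id   INTEGER NOT NULL REFERENCES vendors (vendor_id)
);

INSERT INTO vendors  VALUES (1, 'Apple');
-- Only the integer 1 is stored in products; the text 'Apple' lives in vendors.
INSERT INTO products VALUES (1, 'iPhone 4', 'Description..', 1);
""")


def product_with_vendor(product_id):
    """Return (product name, description, vendor name), or None if not found."""
    return conn.execute(
        "SELECT products.name, products.description, vendors.name AS vendor"
        " FROM products INNER JOIN vendors ON products.vendor_id = vendors.vendor_id"
        " WHERE products.product_id = :id",
        {"id": product_id},
    ).fetchone()
```

Because the vendor name is stored once, renaming the vendor in the vendors table is reflected in every product that joins to it.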
It will be stored in both. The foreign-key constraint requires that every value in products.vendor appear somewhere in vendors.name.
(By the way, note that MySQL only enforces foreign-key constraints on storage engines that support them, such as InnoDB; MyISAM parses the definitions but ignores them.)