I have two columns in a table that will always be unique, vendor_identifier and product_identifier. Both of them are about equal length. Should I add both of them as primary keys, or only one, or some variant of that? Is there any difference between adding one or two here?
Are you querying by both keys, or by one at a time?
Depending on the answer, you can use a composite index or two separate indexes. If you use a composite index, remember that the column you filter on most often should be the leftmost one.
But basically it all depends on the architecture of your app and the DB schema you choose to use.
In MySQL (with InnoDB) the primary key gets the clustered index, so you should make the primary key be the unique identifier you will most frequently query. (This includes joins.)
It's not quite clear from your question if those two fields are each unique on their own, or if they're only guaranteed to be unique as a combination. If they should always be unique individually, then at the least you should put a separate unique index on each of them. If they're only unique in combination, then that's your only guarantee of uniqueness and the primary key should be the two of them together as a single key.
You can only have one be the primary key. You can have the other be a UNIQUE key.
Whichever you prefer to be the default PRIMARY KEY is your choice.
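A minimal sketch of the "one PRIMARY KEY, the other UNIQUE" option, assuming each column is unique on its own. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table name vendor_product and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vendor_product (
        vendor_identifier  TEXT NOT NULL PRIMARY KEY,  -- the one queried most often
        product_identifier TEXT NOT NULL UNIQUE        -- still guaranteed unique
    )
""")
conn.execute("INSERT INTO vendor_product VALUES ('V1', 'P1')")


def try_insert(vendor, product):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO vendor_product VALUES (?, ?)", (vendor, product))
        return True
    except sqlite3.IntegrityError:
        return False


dup_vendor = try_insert("V1", "P2")   # repeats vendor_identifier
dup_product = try_insert("V2", "P1")  # repeats product_identifier
fresh = try_insert("V2", "P2")        # both values are new
```

Both duplicates are rejected regardless of which column carries the PRIMARY KEY; the choice mainly affects which lookups benefit from the clustered index in MySQL.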
There is something you need to ask yourself:
Will a table that has both columns allow multiple products?
Will a table that has both columns allow multiple vendors?
Will a table that has both columns allow the tuple (vendor,product) one or more times?
Answering these rhetorical questions will help you decide whether a table has one of the following as the PRIMARY KEY
vendor_identifier
product_identifier
vendor_identifier,product_identifier
Consider the following:
(1) is the combination of vendor_id and product_id also guaranteed to be unique?
(2) will you always search with both vendor_id and product_id?
A compound primary key only makes sense if you can answer yes to both. If you cannot, then just select the one with higher cardinality to be the primary key and make a secondary index on the other.
Since you don't describe your tables, I'm going to suggest that you actually have 3 tables here:
VENDOR
--------
vendor_id
other_cols
PRODUCT
---------
product_id
other_cols
VENDOR_PRODUCT
--------------
vendor_id
product_id
price-description-dates etc.
In this case, the VENDOR_ID in the VENDOR table is the PK.
The PRODUCT_ID in the PRODUCT table is the PK (for that table).
The VENDOR_ID in the VENDOR_PRODUCT table is a foreign key.
The PRODUCT_ID in the VENDOR_PRODUCT table is a foreign key.
You may choose whether or not to enforce uniqueness on the pair VENDOR_ID, PRODUCT_ID in the VENDOR_PRODUCT table. If unique, the pair may act as a COMPOUND KEY in that table. If you need to reference rows in VENDOR_PRODUCT from somewhere else in your schema, then you may consider a new single-value primary key instead of copying these two columns to the new table and trying to get the FK definitions correct.
Assuming your vendor_identifier is a foreign key relating to a vendor table, and product_identifier is a foreign key relating to a product table, I'd create an autonumber field (vendor_product_identifier, perhaps?) to be the primary key of the table that has both vendor_id and product_id in it. Then I'd place a unique index on the combination of vendor_id and product_id.
So, the general idea would be:
Vendor
------
vendor_identifier PK
name
phone
etc...
Product
-------
product_identifier PK
name
category
etc...
Vendor_Product
--------------
vendor_product_identifier //"AUTONUMBER PK"
vendor_identifier //"FK to Vendor, and part of COMBOINDEX1"
product_identifier //"FK to Product, and part of COMBOINDEX1"
etc...
Having a new key for vendor_product gives you just one key to pass around on the application side to refer to a combination of both vendor and product. Having a unique index on the combination of vendor_id and product_id in the vendor_product table ensures that you won't get duplicate entries for that combination of data either (has to be a unique index though, not just an index).
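The layout above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive schema: it runs against SQLite through Python's sqlite3 module (in MySQL the autonumber column would be AUTO_INCREMENT), and the sample vendor and product rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE vendor  (vendor_identifier  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (product_identifier INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE vendor_product (
        vendor_product_identifier INTEGER PRIMARY KEY AUTOINCREMENT,
        vendor_identifier  INTEGER NOT NULL REFERENCES vendor(vendor_identifier),
        product_identifier INTEGER NOT NULL REFERENCES product(product_identifier)
    );
    -- Unique (not just plain) index on the combination.
    CREATE UNIQUE INDEX comboindex1
        ON vendor_product (vendor_identifier, product_identifier);
    INSERT INTO vendor  VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO product VALUES (10, 'Widget');
""")

insert_sql = (
    "INSERT INTO vendor_product (vendor_identifier, product_identifier) "
    "VALUES (?, ?)"
)
first_key = conn.execute(insert_sql, (1, 10)).lastrowid
second_key = conn.execute(insert_sql, (2, 10)).lastrowid

try:
    conn.execute(insert_sql, (1, 10))  # same vendor/product pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The application only has to pass around first_key or second_key, while the unique index still blocks a repeated (vendor, product) pair.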
Related
Let's say that I have an order table and an item table:
CREATE TABLE if not exists ORDERS (
ORDERID INTEGER AUTO_INCREMENT,
ORDERTYPE VARCHAR (20) NOT NULL,
ShippedTime VARCHAR(40),
ORDERDATE DATE,
PRIMARY KEY (ORDERID)
);
CREATE TABLE if not exists ITEM(
ITEMID INTEGER AUTO_INCREMENT,
NAME VARCHAR (20) NOT NULL,
PRICE INTEGER NOT NULL CHECK (PRICE > 0),
PRIMARY KEY (ITEMID)
);
and the relation between the two tables will be EXISTOF:
CREATE TABLE if not exists EXISTOF (
ORDERID INTEGER NOT NULL,
ITEMID INTEGER NOT NULL,
FOREIGN KEY (ORDERID) REFERENCES ORDERS(ORDERID) ON DELETE CASCADE,
FOREIGN KEY (ITEMID) REFERENCES ITEM(ITEMID) ON DELETE CASCADE,
PRIMARY KEY (ORDERID,ITEMID)
);
The idea is that each order has multiple items and each item can belong to many orders.
If I do it like this, it will not work, because the ids are primary keys, so I can't insert multiple items for a specific order, and an item can't belong to multiple orders.
Does anyone have a recommendation for how to do that?
Your Existof table is not flexible enough. The way most order processing systems deal with this situation is to add a column, which we can call Quantity, to the Existof table. The default value is 1, but other quantities can be put in as well.
So if a given order includes, say, 5 reams of paper, and a ream of paper is a product, the entry for this item in Existof will have a quantity of 5.
This assumes that all 5 reams are interchangeable, and therefore described by the same data. If some of the paper reams are of different colors, then they ought to be different products.
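A small sketch of the Quantity idea, run against SQLite via Python's sqlite3 module rather than MySQL. The sample order and item ids are invented. It also shows that the composite primary key only forbids a repeated (orderid, itemid) pair: one order can still hold several items, and one item can still appear in several orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE existof (
        orderid  INTEGER NOT NULL,
        itemid   INTEGER NOT NULL,
        quantity INTEGER NOT NULL DEFAULT 1 CHECK (quantity > 0),
        PRIMARY KEY (orderid, itemid)
    );
    -- Order 1 contains 5 units of item 100 and 1 unit of item 101.
    INSERT INTO existof (orderid, itemid, quantity) VALUES (1, 100, 5);
    INSERT INTO existof (orderid, itemid) VALUES (1, 101);
    -- Item 100 also appears in order 2.
    INSERT INTO existof (orderid, itemid) VALUES (2, 100);
""")

rows = conn.execute(
    "SELECT orderid, itemid, quantity FROM existof ORDER BY orderid, itemid"
).fetchall()
```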
Create an intermediate table OrderItems with foreign keys item_id and order_id. There are other options but this is the easiest way I find to break down many-many relationships!
"... have to be ..." -- no. FOREIGN KEYs are never "required".
A FK provides three things:
A dynamic check that there is a matching element. This is useful as an integrity check on the data, but is not mandatory.
An INDEX to make the above check significantly faster. Manually specifying an INDEX is just as good. Anyway, a PRIMARY KEY is an index.
"Cascading delete, etc". This is an option that few schemas use, or even need.
There are 3 main types of "relations" between tables:
1:1 -- But why bother having two tables? The columns could simply be in a single table. (There are exceptions.)
1:many -- (This sounds like "many items in one order"??) That is implemented by simply having order_id in the Items table. (And index that column.) Optionally, it can be a FK. Others call the table OrderItems. And it links to a Products table.
many:many -- This is when you need an extra table with (usually) exactly two columns, namely ids into the other two tables. (Eg, Student vs class) Each column could be an FK, but the optimal indexes are PRIMARY KEY(a_id, b_id) and INDEX(b_id, a_id). The FKs would see that you already have indexes starting with a_id and b_id, so it would not create an extra index. Do not have "a unique junction table ID"; it is less efficient than the PK I suggest here. (More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table)
Back to your proposed design. I suggest that "item" implies the product and quantity of that product and the price charged at that time. Hence it needs to be 1:many. And that "product" is what you are thinking of. Please change the table name so I am not confused.
Now, another issue... Price. Is the price fixed forever? Or is the price going to be different for today's Orders than for yesterday's? Again, the Item and Price are tied to one Order. There may be a Price on the Product table, and that may be "current_price", which gets used when creating new Orders.
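One way to read the price point above is to copy current_price into the order line when the order is created, so that later price changes do not rewrite history. The sketch below assumes that reading. It uses SQLite through Python's sqlite3 module; the order_item table, the unit_price column, and the add_item helper are hypothetical names, and prices are integers as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_id    INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        current_price INTEGER NOT NULL
    );
    CREATE TABLE order_item (
        order_id   INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES product(product_id),
        quantity   INTEGER NOT NULL DEFAULT 1,
        unit_price INTEGER NOT NULL,  -- price charged at order time
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO product VALUES (1, 'Ream of paper', 500);
""")


def add_item(order_id, product_id, quantity):
    """Add an order line, snapshotting the product's current price."""
    conn.execute(
        """
        INSERT INTO order_item (order_id, product_id, quantity, unit_price)
        SELECT ?, product_id, ?, current_price
        FROM product
        WHERE product_id = ?
        """,
        (order_id, quantity, product_id),
    )


add_item(1, 1, 5)                                   # yesterday's order
conn.execute("UPDATE product SET current_price = 650 WHERE product_id = 1")
add_item(2, 1, 1)                                   # today's order

prices = conn.execute(
    "SELECT order_id, unit_price FROM order_item ORDER BY order_id"
).fetchall()
```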
ShippedTime VARCHAR(40) -- Perhaps should be DATETIME?
I have two tables in my database which belong together:
mp_order and mp_order_items.
mp_order has the main information about a customer's order, like address, date, etc.
(order_id, customer_company, customer_name, customer_adress, order_date, ... [etc.])
mp_order_items has the products/items which were ordered
(order_id, item_id, item_qty)
Because order_id and item_id can each repeat (but not in combination), I can't set one column as the primary key.
Should I add another column as a unique identifier for the individual entries, or is it valid to have a table without a primary key?
You have two options:
Define a primary key on (order_id, item_id)
Define a synthetic primary key, such as an auto-incremented column.
I prefer the second method. It is more flexible for the future:
Perhaps an order could contain the same items, but with different pricing or shipping addresses or shipping times.
The rows are uniquely defined with a single number, which makes it easier to find them if you need to modify rows in the future.
The rows are more easily referenced in another table, for instance, if you had a returns table.
Of course, having a composite primary key also works and is a very viable method for implementing the logic as well.
Since your requirement is that order_id and item_id cannot repeat in combination, meaning that (ord_134, itm_123) can't appear twice, I believe you need to create a COMPOSITE KEY.
PRIMARY KEY(order_id, item_id)
Basically, a combination of both Order Id and Item Id is what will uniquely identify a record in the table.
There is a caveat, if required, while defining a FOREIGN KEY, you can't link the tables using just order_id. You will need to include all the columns that are part of the COMPOSITE KEY inside the FOREIGN KEY relation.
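To illustrate the caveat, here is a rough sketch of a child table that references the composite key. It assumes a hypothetical mp_returns table and runs on SQLite via Python's sqlite3 module; the FOREIGN KEY clause has the same shape in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE mp_order_items (
        order_id INTEGER NOT NULL,
        item_id  INTEGER NOT NULL,
        item_qty INTEGER NOT NULL,
        PRIMARY KEY (order_id, item_id)
    );
    CREATE TABLE mp_returns (
        return_id INTEGER PRIMARY KEY,
        order_id  INTEGER NOT NULL,
        item_id   INTEGER NOT NULL,
        -- Both columns of the composite key must be listed together.
        FOREIGN KEY (order_id, item_id)
            REFERENCES mp_order_items (order_id, item_id)
    );
    INSERT INTO mp_order_items VALUES (134, 123, 2);
""")

# A return for an existing order line is accepted.
conn.execute("INSERT INTO mp_returns (order_id, item_id) VALUES (134, 123)")

# The order exists, but this item was never part of it.
try:
    conn.execute("INSERT INTO mp_returns (order_id, item_id) VALUES (134, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

return_count = conn.execute("SELECT COUNT(*) FROM mp_returns").fetchone()[0]
```

Carrying both columns into every referencing table is the cost of the composite key; a synthetic single-column key avoids it.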
Let's assume we have two tables: products and orders. As it is a many-to-many relationship, I've created an extra table: ordersproducts.
As I read from many threads, two primary keys are recommended in this case - table ordersproducts:
order_id (PK), product_id (PK, FK),
However, in this situation there can't be duplicates in the table. Order_id can be duplicated, but product_id has to be unique, and I need a bit more flexibility - order_id should be able to duplicate and so should product_id.
Works correctly after removing the primary keys, leaving only the foreign key at product_id, however - table without primary keys doesn't seem right, does it?
Always have a PRIMARY KEY. It sounds like you need either (not both) of these:
PRIMARY KEY(order_id, product_id)
PRIMARY KEY(product_id, order_id)
These say (because a MySQL PK must be unique) that there may be duplicates of either column, but the pair is never duplicated.
Since you probably want to go both directions (given an order, find all the products and given a product, find all the orders), you need indexes both ways:
PRIMARY KEY(order_id, product_id),
INDEX(product_id, order_id)
Remember, a PK is UNIQUE and is an INDEX.
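A runnable sketch of that pair of indexes, using SQLite through Python's sqlite3 module as a stand-in for MySQL. The index name idx_product_order and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ordersproducts (
        order_id   INTEGER NOT NULL,
        product_id INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    -- Second index for lookups in the other direction.
    CREATE INDEX idx_product_order ON ordersproducts (product_id, order_id);
    INSERT INTO ordersproducts VALUES (1, 10), (1, 11), (2, 10);
""")

# Given an order, find all the products.
products_in_order_1 = [
    row[0]
    for row in conn.execute(
        "SELECT product_id FROM ordersproducts "
        "WHERE order_id = 1 ORDER BY product_id"
    )
]

# Given a product, find all the orders.
orders_with_product_10 = [
    row[0]
    for row in conn.execute(
        "SELECT order_id FROM ordersproducts "
        "WHERE product_id = 10 ORDER BY order_id"
    )
]

# Either column may repeat, but the pair may not.
try:
    conn.execute("INSERT INTO ordersproducts VALUES (1, 10)")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True
```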
Here are more tips on virtually any many:many table. That discusses a generic solution to your generic problem.
When should I use KEY, PRIMARY KEY, UNIQUE KEY and INDEX?
KEY and INDEX are synonyms in MySQL. They mean the same thing. In databases you would use indexes to improve the speed of data retrieval. An index is typically created on columns used in JOIN, WHERE, and ORDER BY clauses.
Imagine you have a table called users and you want to search for all the users which have the last name 'Smith'. Without an index, the database would have to go through all the records of the table: this is slow, because the more records you have in your database, the more work it has to do to find the result. On the other hand, an index will help the database skip quickly to the relevant pages where the 'Smith' records are held. This is very similar to how we, humans, go through a phone book directory to find someone by the last name: We don't start searching through the directory from cover to cover, as long we inserted the information in some order that we can use to skip quickly to the 'S' pages.
Primary keys and unique keys are similar. A primary key is a column, or a combination of columns, that can uniquely identify a row. It is a special case of unique key. A table can have at most one primary key, but more than one unique key. When you specify a unique key on a column, no two distinct rows in a table can have the same value.
Also note that columns defined as primary keys or unique keys are automatically indexed in MySQL.
KEY and INDEX are synonyms.
You should add an index when performance measurements and EXPLAIN show you that the query is inefficient because of a missing index. Adding an index can improve the performance of queries (but it can slow down modifications to the table).
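As a rough illustration of checking the plan before and after adding an index, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. MySQL's EXPLAIN reports different columns, so treat this only as an analogy; the users table, the index name, and the plan helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, last_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO users (last_name) VALUES (?)",
    [("Smith",), ("Jones",), ("Smith",)],
)


def plan(sql):
    """Return the textual detail column of SQLite's query plan for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT id FROM users WHERE last_name = 'Smith'"

before = plan(query)  # no usable index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_users_last_name ON users (last_name)")
after = plan(query)   # the new index is now used for the lookup

print(before)
print(after)
```

The exact wording of the plan differs between SQLite versions, so the sketch prints it rather than relying on a fixed string.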
You should use UNIQUE when you want to constrain the values in that column (or columns) to be unique, so that attempts to insert duplicate values result in an error.
A PRIMARY KEY is both a unique constraint and it also implies that the column is NOT NULL. It is used to give an identity to each row. This can be useful for joining with another table via a foreign key constraint. While it is not required for a table to have a PRIMARY KEY it is usually a good idea.
Primary key does not allow NULL values, but unique key allows NULL values.
We can declare only one primary key in a table, but a table can have multiple unique keys (column assign).
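The NULL behaviour of a unique key can be sketched like this, using SQLite through Python's sqlite3 module; the account table and its values are invented. Only the UNIQUE side is shown, because SQLite treats NULL in primary key columns differently from MySQL, so it is not a faithful stand-in for that half of the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        id    INTEGER NOT NULL PRIMARY KEY,
        email TEXT UNIQUE              -- nullable on purpose
    )
""")

# Several rows may leave the unique column empty.
conn.execute("INSERT INTO account VALUES (1, NULL)")
conn.execute("INSERT INTO account VALUES (2, NULL)")
conn.execute("INSERT INTO account VALUES (3, 'a@example.com')")

# A repeated non-NULL value is rejected.
try:
    conn.execute("INSERT INTO account VALUES (4, 'a@example.com')")
    duplicate_email_rejected = False
except sqlite3.IntegrityError:
    duplicate_email_rejected = True

null_count = conn.execute(
    "SELECT COUNT(*) FROM account WHERE email IS NULL"
).fetchone()[0]
```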
PRIMARY KEY and UNIQUE KEY are similar but serve different purposes. A primary key identifies each table row (i.e., no two rows can have the same key). You can have only one primary key per table, although it may span several columns.
A unique key makes the values in a column (or combination of columns) unique (i.e., no two rows may have the same value there). You can have more than one unique key per table, unlike the primary key.
A plain INDEX does not enforce uniqueness. MySQL builds a separate lookup structure for the indexed column, which makes it faster to retrieve rows when a query filters on that column. The disadvantage is that if you do many updates, deletes, or inserts, MySQL has to maintain the index as well (and that can become a performance bottleneck).
Hope this helps.
Unique Keys: Columns in which no two rows have the same value.
Primary Key: A minimal set of columns that uniquely identifies every row in a table (i.e. no two rows share the same values in all the columns that make up the key). A table can have more than one such candidate key, but only one of them is declared as the primary key; the others can be declared as unique keys. If no single column is unique, then several column values are needed to identify a row; for example, (first_name, last_name, father_name, mother_name) could constitute the primary key in some tables.
Index: used to optimize queries. If you are going to search or sort the results on the basis of some column many times (e.g. most people will search for students by name and not by roll number), then those queries can be optimized by indexing the column values, for example with a B-tree.
The primary key is used to work with different tables. This is the foundation of relational databases. If you have a book database it's better to create 2 tables - 1) books and 2) authors with INT primary key "id". Then you use id in books instead of authors name.
The unique key is used if you don't want to have repeated entries. For example you may have title in your book table and want to be sure there is only one entry for each title.
Primary key - we can put only one primary key on a table, and we cannot leave that column blank when entering values into the table.
Unique key - we can put more than one unique key on a table, and we may leave that column blank when entering values into the table.
A column takes only unique (non-repeating) values when a primary or unique key is applied to it.
Unique Key :
More than one value can be null.
No two tuples can have same values in unique key.
One or more unique keys can be combined to form a primary key, but not vice versa.
Primary Key
Can contain more than one unique keys.
Uniquely represents a tuple.
I have two tables A,B. Both tables will have more than 1 million records.
A has two columns - id, photo_id. id is the primary key and photo_id is unique.
A needs to be referenced in B.
3 questions:
Can I ignore A's id and use photo_id to link the two tables?
Is there any benefit of using the primary column as opposed to using a unique column in B?
What difference will it make to have a foreign key? I probably won't use foreign key since it's not supported on the server I'm using. Will this make a significant difference when there are 1+ mil records?
Skip having an id-column. If you have a photo_id that is already unique you should use that instead. A primary key (in MySQL InnoDB) is automatically clustered, which means that the data is stored in the index, making for VERY efficient retrieval of an entire row if you use the primary key as reference.
To answer your questions:
Yes. And remove the id-column. It is an artificial key and provides no benefit over using the photo_id.
Yes. The primary key index is clustered and makes for very efficient querying on both exact and range-queries. (i.e. select * from photos where 2 < id AND id < 10)
A foreign key puts a constraint on your database tables and ensures that the data in the tables is in a consistent state. Without foreign keys you need some application-level logic to ensure consistency.
I would only remove your id column if you are positive that photo_id values will never change. If multiple rows of your B table reference a specific A row and the photo_id for that row needs to be updated, you will want to be referencing the id column from your B table.
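A sketch of what a foreign key on photo_id would provide, for servers where foreign keys are available. It runs on SQLite through Python's sqlite3 module; the column b_id, the ON UPDATE CASCADE clause, and the sample values are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE a (
        id       INTEGER PRIMARY KEY,
        photo_id INTEGER NOT NULL UNIQUE
    );
    CREATE TABLE b (
        b_id     INTEGER PRIMARY KEY,
        -- Referencing a UNIQUE column instead of the primary key is allowed.
        photo_id INTEGER NOT NULL
            REFERENCES a (photo_id) ON UPDATE CASCADE
    );
    INSERT INTO a VALUES (1, 500);
    INSERT INTO b (photo_id) VALUES (500);
""")

# A row in b pointing at a photo that does not exist is refused.
try:
    conn.execute("INSERT INTO b (photo_id) VALUES (999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# If a photo_id does change, the cascade rewrites the referencing rows.
conn.execute("UPDATE a SET photo_id = 501 WHERE id = 1")
b_photo_id = conn.execute("SELECT photo_id FROM b").fetchone()[0]
```

Without the constraint, both the orphan check and the update of referencing rows have to be done by application code.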