Like
PRIMARY KEY (ID),
KEY name (name),
KEY desc (desc),
etc.
What are they useful for?
Keys are used to enforce data integrity in your database: uniqueness within a table and referential integrity between tables.
A primary key is, as its name suggests, the primary identification of a given row in your table. That is, each row's primary key will uniquely identify that row.
A unique key is a key that enforces uniqueness on that set of columns. It is similar to a primary key in that it will also uniquely identify a row in a table. However, there is the added benefit of allowing NULL in some of those combinations. There can only be 1 primary key, but you can have many unique keys.
A foreign key is used to enforce a relationship between 2 tables (think parent/child table). That way, a child table cannot have a value of X in its parent column unless X actually appears in the parent table. This prevents orphaned records from appearing.
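The three kinds of key described above can be tried out in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere), and the table names parent and child and their columns are hypothetical. SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.executescript("""
    CREATE TABLE parent (
        id    INTEGER PRIMARY KEY,   -- primary key: identifies each row
        email TEXT UNIQUE            -- unique key: no duplicate values
    );
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent (id)  -- foreign key
    );
""")

def rejected(sql):
    """Run one statement; return True if a key constraint rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

conn.execute("INSERT INTO parent (id, email) VALUES (1, 'a@example.com')")
duplicate_pk = rejected("INSERT INTO parent (id, email) VALUES (1, 'b@example.com')")
duplicate_email = rejected("INSERT INTO parent (id, email) VALUES (2, 'a@example.com')")
orphan_child = rejected("INSERT INTO child (id, parent_id) VALUES (1, 99)")
valid_child = rejected("INSERT INTO child (id, parent_id) VALUES (2, 1)")
print(duplicate_pk, duplicate_email, orphan_child, valid_child)
```

The first three statements break a primary key, a unique key and a foreign key respectively; the last one points at an existing parent row and goes through.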
The primary key constraint ensures that the column(s) are:
not null
unique (unique sets if more than one column)
KEY is MySQL's terminology in CREATE TABLE statements for an index. Indexes are not ANSI currently, but all databases use indexes to speed up data retrieval (at the cost of insertion/update/deletion, because of maintenance to keep the index relevant).
There are other key constraints:
unique
foreign key (for referential integrity)
...but your question doesn't include examples of them.
Keys are also called indexes. They are used for speeding up queries. Additionally, keys can be constraints (unique key and foreign key). The primary key is also a unique key, and it identifies the records. A table can have other unique keys as well, which do not allow a value to be duplicated in a given column. A foreign key enforces referential integrity (@Derek Kromm has already written an excellent description). An ordinary key is used only for speeding up queries. You need to index the columns used in the WHERE clause of your queries. If you have no index on the column, MySQL will need to read the whole table to find the records you need. When an index is used, MySQL reads only the index (which is usually a B+ tree) and then reads only those records from the table that it found in the index.
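The difference between reading the whole table and going through an index can be seen by asking the database for its query plan. A minimal sketch, assuming SQLite (via Python's sqlite3) in place of MySQL; the users table and the index name idx_users_name are hypothetical, and MySQL's EXPLAIN output looks different from SQLite's EXPLAIN QUERY PLAN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

lookup = "SELECT * FROM users WHERE name = 'user500'"
plan_without_index = query_plan(lookup)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_users_name ON users (name)")
plan_with_index = query_plan(lookup)     # now the plan names idx_users_name
print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, so the interesting part is only whether the index name shows up in it.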
A PRIMARY KEY creates a unique, not-null constraint for each row in the table. Searching by this key is also the fastest. You can create only one primary key per table.
An ordinary key/index speeds up searching by this column, as well as sorting, grouping, and joining with other tables by this key.
Index drawbacks:
Adding new indexes to a table slows down INSERT/UPDATE/DELETE statements, so you should select the columns to index very carefully.
Keys are used to relate tables to one another, so that you can create joins to select data from multiple tables.
What, you didn't find the Wikipedia entry comprehensive? ;-)
So, a key, in a relational database (such as MySQL, PostgreSQL, Oracle, etc) is a data constraint on a column or set of columns. The most common keys are the Primary key and foreign keys and unique keys.
A foreign key specifically relates the data of one table to data in another table. You might see that a table blog_posts has a foreign key to users based on a user_id column. This means that every user_id in blog_posts will have a corresponding entry in the users table (this is a one-to-many relationship -- a topic for another time).
If a column (or group of columns) has a unique key, that means that there can only be one occurrence of each key value in the table. Often you'll see things like email addresses be unique keys -- you only want one email address per user. I've also seen a combination of columns match to a unique key -- the five columns, first_name, last_name, address, city, and state, will often be a unique key -- realistically, there can only be one William Gates at 1835 73rd Ave NE, Medina, Washington. (I do realize that it is possible for a William Gates Jr. to be born, but the designers of that database didn't really care).
The primary key is the primary, unique identifier of a given table. By definition it is a unique key. It is something which cannot be null and must be unique. It holds a special place of prominence among the indexes of a given table.
I know that foreign keys need not reference only primary keys but they can also reference a field that has a unique constraint on it. For my scenario, I am setting up a quiz where for each test, I have a set of questions. My table design is like this
The point is, in my 2nd table where I will put all the answer options, I want the question number field to link to the first table question number. How do I do this? Or is there an alternative to this design?
Thank you
Ideally there should be a question_id primary key column in the test_question table, and you would use this as the foreign key in the test_answer table.
With your composite primary key in the test_question table, you should make a corresponding composite foreign key:
CONSTRAINT FOREIGN KEY (test_id, question_no) REFERENCES test_question (test_id, question_no)
This is in addition to the foreign key just for the test_id column.
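A composite foreign key of this shape can be sketched as follows. The example runs on SQLite through Python's sqlite3 module (an assumption, chosen so it is self-contained), and the columns question_text, option_no and option_text are hypothetical, since the original table design is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE test_question (
        test_id       INTEGER NOT NULL,
        question_no   INTEGER NOT NULL,
        question_text TEXT,                    -- hypothetical column
        PRIMARY KEY (test_id, question_no)
    );
    CREATE TABLE test_answer (
        test_id     INTEGER NOT NULL,
        question_no INTEGER NOT NULL,
        option_no   INTEGER NOT NULL,          -- hypothetical column
        option_text TEXT,                      -- hypothetical column
        PRIMARY KEY (test_id, question_no, option_no),
        FOREIGN KEY (test_id, question_no)
            REFERENCES test_question (test_id, question_no)
    );
    INSERT INTO test_question VALUES (1, 1, 'What does KEY mean in MySQL?');
    INSERT INTO test_answer VALUES (1, 1, 1, 'An index');
""")

try:
    # Test 1 has no question 7, so the pair (1, 7) has no parent row.
    conn.execute("INSERT INTO test_answer VALUES (1, 7, 1, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

answer_count = conn.execute("SELECT COUNT(*) FROM test_answer").fetchone()[0]
print(orphan_rejected, answer_count)
```

The foreign key is checked on the pair of columns together, so an answer is accepted only when both its test_id and its question_no match one row of test_question.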
Add another table purely for answers, and link them via the question_no field.
A DB table should hold information on one sort of item. Questions and answers are separate sorts of information so should be in separate tables. Adding a separate table also allows changes to questions and answers independently. Additionally, if they are separate, you could add a language field to each table and have a multi-lingual quiz
Short answer:
You can JOIN on any columns or expressions. There is no "requirement" for a FOREIGN KEY, PRIMARY KEY, UNIQUE, or anything else.
Long answer:
However,... For performance (in large tables), some things make a difference.
If you are JOINing to a PK, Unique key, or even an indexed column, the query could run faster.
Why have a FOREIGN KEY? An FK is two things:
A "constraint" that says that the value must exist in the other table. Also, with things like ON DELETE CASCADE, it can provide actions to take if the indicated row is removed. The constraint requires looking in the other table each time a write occurs (eg INSERT).
An Index. That is, specifying a FK automatically adds an INDEX (if not already present) to make the constraint faster.
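The "actions" part of a foreign key can be illustrated with ON DELETE CASCADE. A small sketch, assuming SQLite through Python's sqlite3 instead of MySQL; the tables t1 and t2 and the column t1_id are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY);
    CREATE TABLE t2 (
        id    INTEGER PRIMARY KEY,
        t1_id INTEGER NOT NULL REFERENCES t1 (id) ON DELETE CASCADE
    );
    INSERT INTO t1 (id) VALUES (1), (2);
    INSERT INTO t2 (id, t1_id) VALUES (10, 1), (11, 1), (12, 2);
""")

# Removing the parent row also removes the child rows that point at it.
conn.execute("DELETE FROM t1 WHERE id = 1")
remaining = [row[0] for row in conn.execute("SELECT id FROM t2 ORDER BY id")]
print(remaining)
```

Only the t2 row that references the surviving parent (id 2) is left after the delete.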
Getting the id
Here is the "usual" way to do a pair of inserts, where you need the second to 'point' to the first:
INSERT INTO t1 ... -- with an AUTO_INCREMENT id
grab LAST_INSERT_ID() -- that id
INSERT INTO t2 ... -- and include the id from above
For AUTO_INCREMENT to work it must be the first column of some key. (Note: a PRIMARY KEY is a UNIQUE is a key (aka INDEX).)
Optionally you can specify a FK on the second table to point out the connection between the tables.
And, as spelled out in other answers, a FK could involve more than one column.
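The pair of inserts described under "Getting the id" might look like this in practice. The sketch uses SQLite through Python's sqlite3, where the cursor attribute lastrowid plays the role that LAST_INSERT_ID() plays in MySQL; the column names name, t1_id and note are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY AUTOINCREMENT, t1_id INTEGER, note TEXT);
""")

cur = conn.execute("INSERT INTO t1 (name) VALUES ('first')")
new_id = cur.lastrowid  # SQLite's counterpart of MySQL's LAST_INSERT_ID()
conn.execute("INSERT INTO t2 (t1_id, note) VALUES (?, 'points at t1')", (new_id,))

# The JOIN works on the stored id whether or not a FOREIGN KEY is declared.
linked = conn.execute(
    "SELECT t1.name, t2.note FROM t2 JOIN t1 ON t1.id = t2.t1_id"
).fetchone()
print(new_id, linked)
```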
Entities and Relations
Sometimes, a set of tables like yours is best 'designed' this way:
Determine the "entities": users, tests, questions, answers
Relations and whether they are 1:1, 1:many, or many:many... Users:test is many-to-many; tests:questions is 1:many (unless you want questions to be shared between tests).
Answers is more complex since each 1 answer depends on the user and question.
1:1 -- rarely practical; may as well merge the tables together.
1:many -- a link (FK?) in one table to the other.
many:many -- need a bridge table with (usually) 2 columns, namely ids linking to the two tables.
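For the many:many case, a bridge table between users and tests could be sketched like this. SQLite through Python's sqlite3 is assumed, and the bridge table name user_tests and all column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tests (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Bridge table: one row per (user, test) pair.
    CREATE TABLE user_tests (
        user_id INTEGER NOT NULL REFERENCES users (id),
        test_id INTEGER NOT NULL REFERENCES tests (id),
        PRIMARY KEY (user_id, test_id)
    );
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO tests VALUES (1, 'SQL basics'), (2, 'Indexing');
    INSERT INTO user_tests VALUES (1, 1), (1, 2), (2, 2);
""")

# Who took the 'Indexing' test? Walk users -> bridge -> tests.
takers = [
    row[0]
    for row in conn.execute(
        """
        SELECT u.name
        FROM users AS u
        JOIN user_tests AS ut ON ut.user_id = u.id
        JOIN tests AS t ON t.id = ut.test_id
        WHERE t.title = 'Indexing'
        ORDER BY u.name
        """
    )
]
print(takers)
```

Making the pair (user_id, test_id) the primary key of the bridge table means the same user cannot be linked to the same test twice.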
At work we have a big database with unique indexes instead of primary keys and all works fine.
I'm designing a new database for a new project and I have a dilemma:
In DB theory, the primary key is a fundamental element, that's OK, but in REAL projects what are the advantages and disadvantages of both?
What do you use in projects?
EDIT: ...and what about primary keys and replication on MS SQL server?
What is a unique index?
A unique index on a column is an index on that column that also enforces the constraint that you cannot have two equal values in that column in two different rows. Example:
CREATE TABLE table1 (foo int, bar int);
CREATE UNIQUE INDEX ux_table1_foo ON table1(foo); -- Create unique index on foo.
INSERT INTO table1 (foo, bar) VALUES (1, 2); -- OK
INSERT INTO table1 (foo, bar) VALUES (2, 2); -- OK
INSERT INTO table1 (foo, bar) VALUES (3, 1); -- OK
INSERT INTO table1 (foo, bar) VALUES (1, 4); -- Fails!
Duplicate entry '1' for key 'ux_table1_foo'
The last insert fails because it violates the unique index on column foo when it tries to insert the value 1 into this column for a second time.
In MySQL a unique constraint allows multiple NULLs.
It is possible to make a unique index on multiple columns.
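Both points, multiple NULLs and a unique index over several columns, can be checked with a short script. This is a sketch on SQLite through Python's sqlite3, which happens to treat NULLs in unique indexes the same way as described for MySQL here; the index name ux_table1_foo_bar is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (foo INT, bar INT);
    CREATE UNIQUE INDEX ux_table1_foo_bar ON table1 (foo, bar);
""")

def try_insert(foo, bar):
    """Return True if the row was accepted, False if the unique index rejected it."""
    try:
        conn.execute("INSERT INTO table1 (foo, bar) VALUES (?, ?)", (foo, bar))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(1, 2),        # accepted
    try_insert(1, 3),        # accepted: foo repeats, but the pair (1, 3) is new
    try_insert(1, 2),        # rejected: the pair (1, 2) already exists
    try_insert(None, None),  # accepted
    try_insert(None, None),  # accepted again: NULLs are not treated as equal
]
print(results)
```

With a multi-column unique index it is the combination of values that must be unique, not each column on its own.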
Primary key versus unique index
Things that are the same:
A primary key implies a unique index.
Things that are different:
A primary key also implies NOT NULL, but a unique index can be nullable.
There can be only one primary key, but there can be multiple unique indexes.
If there is no clustered index defined then the primary key will be the clustered index.
You can see it like this:
A Primary Key IS Unique
A unique value doesn't have to be the representation of the element.
Meaning? Well, a primary key is used to identify the element. If you have a "Person", you would like to have a personal identification number (SSN or such) that is primary to your Person.
On the other hand, the person might have an e-mail address which is unique but doesn't identify the person.
I always have Primary Keys, even in relationship tables ( the mid-table / connection table ) I might have them. Why? Well I like to follow a standard when coding, if the "Person" has an identifier, the Car has an identifier, well, then the Person -> Car should have an identifier as well!
Foreign keys work with unique constraints as well as primary keys. From Books Online:
A FOREIGN KEY constraint does not have to be linked only to a PRIMARY KEY constraint in another table; it can also be defined to reference the columns of a UNIQUE constraint in another table.
For transactional replication, you need the primary key. From Books Online:
Tables published for transactional replication must have a primary key. If a table is in a transactional replication publication, you cannot disable any indexes that are associated with primary key columns. These indexes are required by replication. To disable an index, you must first drop the table from the publication.
Both answers are for SQL Server 2005.
The choice of when to use a surrogate primary key as opposed to a natural key is tricky. Answers such as, always or never, are rarely useful. I find that it depends on the situation.
As an example, I have the following tables:
CREATE TABLE toll_booths (
    id INTEGER NOT NULL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    ...
    UNIQUE(name)
);
CREATE TABLE cars (
    vin VARCHAR(17) NOT NULL PRIMARY KEY,
    license_plate VARCHAR(10) NOT NULL,
    ...
    UNIQUE(license_plate)
);
CREATE TABLE drive_through (
    id INTEGER NOT NULL PRIMARY KEY,
    toll_booth_id INTEGER NOT NULL REFERENCES toll_booths(id),
    vin VARCHAR(17) NOT NULL REFERENCES cars(vin),
    at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    amount NUMERIC(10,4) NOT NULL,
    ...
    UNIQUE(toll_booth_id, vin)
);
We have two entity tables (toll_booths and cars) and a transaction table (drive_through). The toll_booths table uses a surrogate key because it has no natural attribute that is guaranteed not to change (the name can easily be changed). The cars table uses a natural primary key because it has a non-changing unique identifier (vin). The drive_through transaction table uses a surrogate key for easy identification, but also has a unique constraint on the attributes that are guaranteed to be unique at the time the record is inserted.
http://database-programmer.blogspot.com has some great articles on this particular subject.
There are no disadvantages of primary keys.
To add some information to @MrWiggles's and @Peter Parker's answers: when a table doesn't have a primary key, you won't be able to edit data in some applications (they will say something like "cannot edit / delete data without primary key"). PostgreSQL allows multiple NULL values in a UNIQUE column, whereas a PRIMARY KEY doesn't allow NULLs. Also, some ORMs that generate code may have problems with tables without primary keys.
UPDATE:
As far as I know it is not possible to replicate tables without primary keys in MSSQL, at least without problems (details).
If something is a primary key, depending on your DB engine, the entire table gets sorted by the primary key. This means that lookups are much faster on the primary key because it doesn't have to do any dereferencing as it has to do with any other kind of index. Besides that, it's just theory.
In addition to what the other answers have said, some databases and systems may require a primary key to be present. One situation comes to mind: when using enterprise replication with Informix, a PK must be present for a table to participate in replication.
As long as you do not allow NULL for a value, they should be handled the same, but NULL is handled differently across databases (AFAIK MS-SQL does not allow more than one NULL value in a UNIQUE column, while MySQL and Oracle do).
So you should define this column as NOT NULL with a UNIQUE INDEX.
There is no such thing as a primary key in relational data theory, so your question has to be answered on the practical level.
Unique indexes are not part of the SQL standard. The particular implementation of a DBMS will determine what are the consequences of declaring a unique index.
In Oracle, declaring a primary key will result in a unique index being created on your behalf, so the question is almost moot. I can't tell you about other DBMS products.
I favor declaring a primary key. This has the effect of forbidding NULLs in the key column(s) as well as forbidding duplicates. I also favor declaring REFERENCES constraints to enforce referential integrity. In many cases, declaring an index on the column(s) of a foreign key will speed up joins. This kind of index should in general not be unique.
There are some disadvantages of CLUSTERED INDEXES vs UNIQUE INDEXES.
As already stated, a CLUSTERED INDEX physically orders the data in the table.
This means that when you have a lot of inserts or deletes on a table containing a clustered index, every time you change the data (well, almost, depending on your fill factor), the physical table needs to be updated to stay sorted.
In relatively small tables this is fine, but with tables that hold gigabytes of data, where inserts/deletes affect the sorting, you will run into problems.
I almost never create a table without a numeric primary key. If there is also a natural key that should be unique, I also put a unique index on it. Joins are faster on integers than multicolumn natural keys, data only needs to change in one place (natural keys tend to need to be updated which is a bad thing when it is in primary key - foreign key relationships). If you are going to need replication use a GUID instead of an integer, but for the most part I prefer a key that is user readable especially if they need to see it to distinguish between John Smith and John Smith.
The few times I don't create a surrogate key are when I have a joining table that is involved in a many-to-many relationship. In this case I declare both fields as the primary key.
My understanding is that a primary key and a unique index with a not-null constraint are the same (*), and I suppose one chooses one or the other depending on what the specification explicitly states or implies (a matter of what you want to express and explicitly enforce). If it requires uniqueness and not-null, then make it a primary key. If it just happens that all parts of a unique index are not-null without any requirement for that, then just make it a unique index.
The sole remaining difference is that you may have multiple not-null unique indexes, while you can't have multiple primary keys.
(*) Excepting a practical difference: a primary key can be the default unique key for some operations, like defining a foreign key. For example, if one defines a foreign key referencing a table and does not provide the column name, and the referenced table has a primary key, then the primary key will be the referenced column. Otherwise, the referenced column will have to be named explicitly.
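The default described in the footnote can be observed on a database that accepts the shorthand. The sketch below assumes SQLite through Python's sqlite3, which allows REFERENCES without a column list; other products may insist on the column being named. The person and car tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE person (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    -- No column list after REFERENCES person: the primary key (id) is implied.
    CREATE TABLE car (
        plate    TEXT PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES person
    );
    INSERT INTO person VALUES (1, 'ann@example.com');
    INSERT INTO car VALUES ('ABC-123', 1);
""")

try:
    conn.execute("INSERT INTO car VALUES ('XYZ-999', 42)")  # no person has id 42
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

car_count = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
print(orphan_rejected, car_count)
```

To reference the unique email column instead, the column would have to be spelled out, as in REFERENCES person (email).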
Others here have mentioned DB replication, but I don't know about it.
In SQL Server, a unique index can have one NULL value. By default it is created as a NON-CLUSTERED INDEX.
A primary key cannot contain NULL values. By default it is created as a CLUSTERED INDEX.
In MSSQL, Primary keys should be monotonically increasing for best performance on the clustered index. Therefore an integer with identity insert is better than any natural key that might not be monotonically increasing.
If it were up to me...
You need to satisfy the requirements of the database and of your applications.
Adding an auto-incrementing integer or long id column to every table to serve as the primary key takes care of the database requirements.
You would then add at least one other unique index to the table for use by your application. This would be the index on employee_id, or account_id, or customer_id, etc. If possible, this index should not be a composite index.
I would favor indices on several fields individually over composite indices. The database can use the single-field indices whenever the WHERE clause includes those fields, but it can use a composite index only when the WHERE clause constrains the leading field(s) of that index, meaning it can't use the second field in a composite index unless you provide both the first and second in your WHERE clause.
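How a composite index gets matched can be probed with query plans. A sketch under assumptions: SQLite through Python's sqlite3 rather than the asker's DBMS, a hypothetical orders table, and a hypothetical index idx_customer_status on (customer_id, status). In this setup what matters is which columns the WHERE clause constrains, not the order in which they are written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INT, status TEXT, total INT);
    CREATE INDEX idx_customer_status ON orders (customer_id, status);
""")

def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

# Leading column only: the composite index is usable.
first_only = query_plan("SELECT * FROM orders WHERE customer_id = 1")
# Both columns, written in the "wrong" order: still usable.
both_swapped = query_plan(
    "SELECT * FROM orders WHERE status = 'paid' AND customer_id = 1"
)
# Second column only: the planner falls back to scanning the table.
second_only = query_plan("SELECT * FROM orders WHERE status = 'paid'")

for plan in (first_only, both_swapped, second_only):
    print(plan)
```

Other planners have extra tricks (index skip scans, for example), so the third case is not guaranteed to be a full scan everywhere.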
I am all for using calculated or Function type indices - and would recommend using them over composite indices. It makes it very easy to use the function index by using the same function in your where clause.
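A function-based index and the matching WHERE clause might look like the following. This is a sketch on SQLite through Python's sqlite3 (expression indexes need SQLite 3.9 or later); the users table and the index name idx_users_lower_email are hypothetical, and the syntax differs between database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT);
    CREATE INDEX idx_users_lower_email ON users (lower(email));
""")

def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

# Same function in the WHERE clause as in the index definition: index is used.
same_function = query_plan(
    "SELECT * FROM users WHERE lower(email) = 'ann@example.com'"
)
# Bare column: the expression index does not apply.
bare_column = query_plan("SELECT * FROM users WHERE email = 'ann@example.com'")
print(same_function)
print(bare_column)
```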
This takes care of your application requirements.
It is highly likely that other non-primary indices are actually mappings of that index's key value to a primary key value, not rowid()'s. This allows for physical sorting operations and deletes to occur without having to recreate these indices.
When should I use KEY, PRIMARY KEY, UNIQUE KEY and INDEX?
KEY and INDEX are synonyms in MySQL. They mean the same thing. In databases you would use indexes to improve the speed of data retrieval. An index is typically created on columns used in JOIN, WHERE, and ORDER BY clauses.
Imagine you have a table called users and you want to search for all the users which have the last name 'Smith'. Without an index, the database would have to go through all the records of the table: this is slow, because the more records you have in your database, the more work it has to do to find the result. On the other hand, an index will help the database skip quickly to the relevant pages where the 'Smith' records are held. This is very similar to how we, humans, go through a phone book directory to find someone by the last name: We don't start searching through the directory from cover to cover, as long as we inserted the information in some order that we can use to skip quickly to the 'S' pages.
Primary keys and unique keys are similar. A primary key is a column, or a combination of columns, that can uniquely identify a row. It is a special case of unique key. A table can have at most one primary key, but more than one unique key. When you specify a unique key on a column, no two distinct rows in a table can have the same value.
Also note that columns defined as primary keys or unique keys are automatically indexed in MySQL.
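The automatic indexing of primary and unique keys can be inspected directly. The sketch below looks at SQLite (through Python's sqlite3) rather than MySQL, where SHOW INDEX would be the tool to reach for; the users table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        username TEXT PRIMARY KEY,
        email    TEXT UNIQUE,
        city     TEXT               -- no key here, so no index either
    )
""")

# The first three columns of each PRAGMA index_list row are (seq, name, unique).
indexes = conn.execute("PRAGMA index_list(users)").fetchall()
index_names = sorted(row[1] for row in indexes)
all_unique = all(row[2] == 1 for row in indexes)
print(index_names, all_unique)
```

No CREATE INDEX statement was issued, yet two unique indexes exist: one backing the primary key and one backing the UNIQUE column.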
KEY and INDEX are synonyms.
You should add an index when performance measurements and EXPLAIN show you that the query is inefficient because of a missing index. Adding an index can improve the performance of queries (but it can slow down modifications to the table).
You should use UNIQUE when you want to constrain the values in that column (or columns) to be unique, so that attempts to insert duplicate values result in an error.
A PRIMARY KEY is both a unique constraint and it also implies that the column is NOT NULL. It is used to give an identity to each row. This can be useful for joining with another table via a foreign key constraint. While it is not required for a table to have a PRIMARY KEY it is usually a good idea.
Primary key does not allow NULL values, but unique key allows NULL values.
We can declare only one primary key in a table, but a table can have multiple unique keys.
PRIMARY KEY and UNIQUE KEY are similar, but they serve different functions. A primary key makes each table row unique (i.e., there cannot be 2 rows with the exact same key). You can have only 1 primary key in a database table.
A unique key makes a column's values unique across table rows (i.e., no 2 rows may have the exact same value). You can have more than 1 unique key in a table (unlike the primary key, of which there can be only 1).
A plain INDEX does not enforce uniqueness; it only speeds up lookups. MySQL (for example) will create an index structure for the column that is indexed. This way, it's easier to retrieve the row when the query filters on that indexed column. The disadvantage is that if you do many updates/deletes/inserts, MySQL has to maintain the index structures (and that can be a performance bottleneck).
Hope this helps.
Unique key: a column (or set of columns) in which no two rows have the same value.
Primary key: a minimal set of columns which can uniquely identify every row in a table (i.e. no two rows are the same in all the columns constituting the primary key). There can be more than one such candidate key in a table, but only one of them is declared as the primary key. If a single column is unique, it can serve as a key on its own. If no single column is unique, then more than one column will be required to identify a row; for example, (first_name, last_name, father_name, mother_name) can in some tables constitute the primary key.
Index: used to optimize the queries. If you are going to search or sort the results on the basis of some column many times (e.g. most people are going to search for students by name and not by their roll no.), then it can be optimized if the column values are all "indexed", for example with a B-tree.
The primary key is used to work with different tables. This is the foundation of relational databases. If you have a book database it's better to create 2 tables - 1) books and 2) authors with INT primary key "id". Then you use id in books instead of authors name.
The unique key is used if you don't want to have repeated entries. For example you may have title in your book table and want to be sure there is only one entry for each title.
Primary key - we can put only one primary key on a table, and we cannot leave that column blank when entering values into the table.
Unique key - we can put more than one unique key on a table, and we may leave that column blank when entering values into the table.
A column takes unique values (no duplicates) when a primary or unique key is applied to it.
Unique Key :
More than one value can be null.
No two tuples can have same values in unique key.
One or more unique keys can be combined to form a primary key, but not vice versa.
Primary Key
Can contain more than one unique keys.
Uniquely represents a tuple.
I created a foreign key on one field in my table, and an index was automatically created on that field. Why was the index automatically created on the foreign key field?
Indices are created on foreign keys to improve performance. If you have a foreign key, it is common to want to get related items, and an index allows you to get these items quickly. The actions imposed on foreign keys (ON DELETE, ON UPDATE) also take advantage of the index to work fast. Finally, integrity checks need to be performed when creating foreign keys; this requires searches, and these searches take advantage of the indexes.
Foreign Key fields link to the contents of fields in other tables.
If we create a table like so:
Table Patient
id
name
address
And a table Illness
Table Illness
id
patient_id foreign key to patient.id
description
MySQL checks that each patient_id in illness actually matches an id in patient. It also does the reverse: if a patient is deleted, it checks to make sure the patient is not referenced in illness.
In order to do this efficiently it needs to index these fields; otherwise it would spend too much time doing full table scans.
Besides, the word key is a synonym for index so it makes sense to index keys :-).
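Both checks from the patient/illness example can be reproduced in a few lines. This sketch uses SQLite through Python's sqlite3 as a stand-in for MySQL. One difference is an assumption to keep in mind: SQLite does not add the index on illness.patient_id by itself, so it is created by hand here under the hypothetical name idx_illness_patient_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE illness (
        id          INTEGER PRIMARY KEY,
        patient_id  INTEGER NOT NULL REFERENCES patient (id),
        description TEXT
    );
    -- Declared by hand here; the answer above says MySQL adds this index itself.
    CREATE INDEX idx_illness_patient_id ON illness (patient_id);
    INSERT INTO patient VALUES (1, 'Ann', '1 Main St'), (2, 'Bob', '2 Side St');
    INSERT INTO illness VALUES (1, 1, 'flu');
""")

def rejected(sql):
    """Run one statement; return True if the foreign key check rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

unknown_patient = rejected("INSERT INTO illness VALUES (2, 99, 'cold')")
delete_referenced = rejected("DELETE FROM patient WHERE id = 1")
delete_unreferenced = rejected("DELETE FROM patient WHERE id = 2")
print(unknown_patient, delete_referenced, delete_unreferenced)
```

The delete of patient 1 is the "reverse" check: the database has to look into illness for rows with patient_id = 1, which is exactly the lookup an index on that column makes cheap.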