Why do we need a primary key in table? [closed] - mysql

Primary key:
Used to serve as a unique identifier for each row in a table.
Cannot accept NULL values.
Creates clustered index.
Only one primary key
Unique key:
Used to serve as a unique identifier for a row in a table.
Can accept one NULL value.
Creates non-clustered index
More than one unique key
We mostly use the primary key to identify each row uniquely, but a unique key does that as well.
I am not sure whether I am right, but a unique key with a NOT NULL constraint seems to provide almost the same behavior as a primary key.
For example:
CREATE TABLE Persons (
ID int NOT NULL UNIQUE,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int);
So why do we really need a primary key, as a unique key can help us achieve whatever we want?
We can also use a unique key as a foreign key to the reference table as well.
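The question's claim can be checked with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in, so engine-specific behavior in MySQL may differ; the Orders table and the rejected() helper are hypothetical names added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default

# Persons has no PRIMARY KEY; ID is only NOT NULL UNIQUE, as in the question.
conn.execute("""
    CREATE TABLE Persons (
        ID int NOT NULL UNIQUE,
        LastName varchar(255) NOT NULL,
        FirstName varchar(255),
        Age int)
""")
# Orders is a hypothetical child table whose foreign key targets the unique column.
conn.execute("""
    CREATE TABLE Orders (
        OrderID int NOT NULL UNIQUE,
        PersonID int NOT NULL REFERENCES Persons(ID))
""")


def rejected(sql):
    """Return True if the statement violates a constraint, False if it succeeds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


conn.execute("INSERT INTO Persons (ID, LastName) VALUES (1, 'Smith')")
duplicate_rejected = rejected("INSERT INTO Persons (ID, LastName) VALUES (1, 'Jones')")
null_rejected = rejected("INSERT INTO Persons (ID, LastName) VALUES (NULL, 'Jones')")
orphan_rejected = rejected("INSERT INTO Orders (OrderID, PersonID) VALUES (10, 99)")
valid_child_rejected = rejected("INSERT INTO Orders (OrderID, PersonID) VALUES (11, 1)")
```

In this sketch the duplicate, the NULL and the orphaned child row are all refused, while the child row pointing at an existing ID goes through, which is the behavior one would expect from a primary key too.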

A lot of relational purists regard the elevation of one key as the "primary key" as a mistake [1] and believe that all candidate keys should be treated equally. If you're of that school of thought (as you appear to be) then you're correct, you don't need a PK.
[1] In the design of SQL itself, in having this feature.
Similarly, theorists see no problem with keys consisting of zero columns (a table with that key declared does make sense, and can only contain 0 or 1 row). But here SQL will prevent you from declaring such a key.

It's true that a UNIQUE KEY on NOT NULL column(s) does fill the same role as a PRIMARY KEY.
Folks should remember that a primary key is a constraint, and may be defined on multiple columns. Primary key does not have to be a single-column auto-inc integer.
In MySQL's InnoDB engine, the table is stored as a clustered index by its primary key, or if it doesn't have a primary key, then it's a clustered index by its first non-null unique key. (a table can have multiple unique keys). See https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html
But isn't it quicker and more clear to say the clustered index is by the primary key? :-)

There is a semantic difference.
Ignoring the minor differences such as handling of NULLs, a primary key is essentially a unique constraint that is simply designated as the primary one. It's the one true way to identify a given row. If I had a particular row that I wanted you to see, what would I tell you about the row to make sure you look at the exact same row? I'd tell you the primary key. "Hey, I think we have a problem with row # 78. Can you look at row # 78?" There might be other unique constraints on the table, but the PK is the go-to.
The primary key is the canonical identification of a row. It might be a single column or a combination of columns. It is entirely possible, valid, and common, to have a primary key and no unique constraints.
Practical Considerations
Anecdotally, foreign keys very often map rows across tables and therefore the primary key of a table is likely to be repeated across tables. When doing an update of the value, therefore, foreign keys get impacted. Due to this, it is often desirable to have an immutable primary key. So-called surrogate keys are popular for this reason, though there are schools of thought on that subject. [1]
Documentation tools typically inspect primary keys and foreign keys in order to determine the relationships between tables. For example, you can auto-generate an ER diagram by traversing the FK to PK relationships.
[1] With that said, keep in mind that foreign keys can point to any column of any table, including the same table. Pointing to the primary key of a table is extremely common because it usually makes the most sense, but there are cases where it does not make sense.
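As a rough illustration of how a documentation tool might discover relationships, the sketch below walks the foreign keys of every table and prints the FK to PK edges. It relies on SQLite's PRAGMA foreign_key_list via Python's sqlite3 module; the authors and books tables and the relationships() helper are hypothetical, and other engines expose this metadata differently (for example through information_schema).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL);
""")


def relationships(conn):
    """Return sorted (child_table, child_column, parent_table, parent_column) tuples."""
    edges = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # foreign_key_list columns: id, seq, table, from, to, on_update, on_delete, match
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            edges.append((table, fk[3], fk[2], fk[4]))
    return sorted(edges)


edges = relationships(conn)
for child_table, child_col, parent_table, parent_col in edges:
    print(f"{child_table}.{child_col} -> {parent_table}.{parent_col}")
```

Each edge is one line in an ER diagram; a real tool would add cardinality and nullability on top of this.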

Although of similar definitions, they have different applications. The primary key is a column that serves to uniquely identify each tuple in the table. Furthermore, it adds integrity constraints to the table and maintains the relationship with foreign keys. The unique key is simpler, it only serves to uniquely identify a tuple in the table.


Why do we need a primary key, and how is it related to an index?

I have a scenario with two tables that both contain unique elements and the same sort of records. One table has a primary key and the other doesn't. So what is the advantage of having a primary key if the elements are unique in both tables? And how is a primary key related to an index?
I was asked this question in a Nokia interview. It is pretty confusing, so please answer with some sort of example.
what is the advantage of having primary key?
Primary key induces or rather forces the column to have two conditions-
UNIQUE VALUE
NOT NULL
So when a row is inserted it must satisfy both conditions. If the table already has records, uniqueness is checked when the constraint is added. If there are duplicate entries for that attribute then you can't add the primary key constraint.
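A minimal sketch of that check, under one loud assumption: SQLite (used here through Python's sqlite3 module) cannot add a primary key to an existing table, so a unique index stands in for the constraint. The staff table and try_add_uniqueness() helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (badge INTEGER, name TEXT)")  # hypothetical table
conn.executemany("INSERT INTO staff VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (2, "Cid")])  # badge 2 appears twice


def try_add_uniqueness(conn):
    """Try to enforce uniqueness on staff.badge; return True on success."""
    try:
        conn.execute("CREATE UNIQUE INDEX ux_staff_badge ON staff(badge)")
        return True
    except sqlite3.Error:
        return False


added_with_duplicates = try_add_uniqueness(conn)   # existing duplicates block it
conn.execute("DELETE FROM staff WHERE name = 'Cid'")
added_after_cleanup = try_add_uniqueness(conn)     # succeeds once the data is clean
```

The same idea applies when adding a PRIMARY KEY in engines that support it: the existing rows have to be cleaned up first.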
How is a primary key related to an index?
When you declare an attribute as PRIMARY KEY, an index is created on that attribute by default.
This helps in faster access to the records when the number of records is high (=> faster fetching).
But for a small table indexing can slow things down, as the index needs to be updated every time you insert/update a row.

sql management studio [duplicate]

At work we have a big database with unique indexes instead of primary keys and all works fine.
I'm designing new database for a new project and I have a dilemma:
In DB theory, primary key is fundamental element, that's OK, but in REAL projects what are advantages and disadvantages of both?
What do you use in projects?
EDIT: ...and what about primary keys and replication on MS SQL server?
What is a unique index?
A unique index on a column is an index on that column that also enforces the constraint that you cannot have two equal values in that column in two different rows. Example:
CREATE TABLE table1 (foo int, bar int);
CREATE UNIQUE INDEX ux_table1_foo ON table1(foo); -- Create unique index on foo.
INSERT INTO table1 (foo, bar) VALUES (1, 2); -- OK
INSERT INTO table1 (foo, bar) VALUES (2, 2); -- OK
INSERT INTO table1 (foo, bar) VALUES (3, 1); -- OK
INSERT INTO table1 (foo, bar) VALUES (1, 4); -- Fails!
Duplicate entry '1' for key 'ux_table1_foo'
The last insert fails because it violates the unique index on column foo when it tries to insert the value 1 into this column for a second time.
In MySQL a unique constraint allows multiple NULLs.
It is possible to make a unique index on multiple columns.
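The last two points can be sketched as follows. The sketch runs on SQLite through Python's sqlite3 module, which, like MySQL, accepts several NULLs in a uniquely indexed column; table2 and its index name are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (foo int, bar int)")
conn.execute("CREATE UNIQUE INDEX ux_table1_foo ON table1(foo)")

# Several NULLs are accepted in the uniquely indexed column.
conn.execute("INSERT INTO table1 (foo, bar) VALUES (NULL, 1)")
conn.execute("INSERT INTO table1 (foo, bar) VALUES (NULL, 2)")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM table1 WHERE foo IS NULL").fetchone()[0]

# A unique index over two columns constrains the pair, not each column alone.
conn.execute("CREATE TABLE table2 (foo int, bar int)")
conn.execute("CREATE UNIQUE INDEX ux_table2_foo_bar ON table2(foo, bar)")
conn.execute("INSERT INTO table2 (foo, bar) VALUES (1, 1)")
conn.execute("INSERT INTO table2 (foo, bar) VALUES (1, 2)")  # same foo, new bar: OK
try:
    conn.execute("INSERT INTO table2 (foo, bar) VALUES (1, 2)")  # same pair again
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True
```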
Primary key versus unique index
Things that are the same:
A primary key implies a unique index.
Things that are different:
A primary key also implies NOT NULL, but a unique index can be nullable.
There can be only one primary key, but there can be multiple unique indexes.
If there is no clustered index defined then the primary key will be the clustered index.
You can see it like this:
A Primary Key IS Unique
A Unique value doesn't have to be the Representation of the Element
Meaning? Well, a primary key is used to identify the element: if you have a "Person" you would like to have a Personal Identification Number (SSN or such) which is Primary to your Person.
On the other hand, the person might have an e-mail which is unique, but doesn't identify the person.
I always have Primary Keys, even in relationship tables ( the mid-table / connection table ) I might have them. Why? Well I like to follow a standard when coding, if the "Person" has an identifier, the Car has an identifier, well, then the Person -> Car should have an identifier as well!
Foreign keys work with unique constraints as well as primary keys. From Books Online:
A FOREIGN KEY constraint does not have
to be linked only to a PRIMARY KEY
constraint in another table; it can
also be defined to reference the
columns of a UNIQUE constraint in
another table
For transactional replication, you need the primary key. From Books Online:
Tables published for transactional
replication must have a primary key.
If a table is in a transactional
replication publication, you cannot
disable any indexes that are
associated with primary key columns.
These indexes are required by
replication. To disable an index, you
must first drop the table from the
publication.
Both answers are for SQL Server 2005.
The choice of when to use a surrogate primary key as opposed to a natural key is tricky. Answers such as, always or never, are rarely useful. I find that it depends on the situation.
As an example, I have the following tables:
CREATE TABLE toll_booths (
id INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
...
UNIQUE(name)
)
CREATE TABLE cars (
vin VARCHAR(17) NOT NULL PRIMARY KEY,
license_plate VARCHAR(10) NOT NULL,
...
UNIQUE(license_plate)
)
CREATE TABLE drive_through (
id INTEGER NOT NULL PRIMARY KEY,
toll_booth_id INTEGER NOT NULL REFERENCES toll_booths(id),
vin VARCHAR(17) NOT NULL REFERENCES cars(vin),
at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
amount NUMERIC(10,4) NOT NULL,
...
UNIQUE(toll_booth_id, vin)
)
We have two entity tables (toll_booths and cars) and a transaction table (drive_through). The toll_booths table uses a surrogate key because it has no natural attribute that is guaranteed not to change (the name can easily be changed). The cars table uses a natural primary key because it has a non-changing unique identifier (vin). The drive_through transaction table uses a surrogate key for easy identification, but also has a unique constraint on the attributes that are guaranteed to be unique at the time the record is inserted.
http://database-programmer.blogspot.com has some great articles on this particular subject.
There are no disadvantages of primary keys.
To add just some information to @MrWiggles' and @Peter Parker's answers: when a table doesn't have a primary key, you won't be able to edit data in some applications (they will end up saying something like cannot edit / delete data without primary key). PostgreSQL allows multiple NULL values in a UNIQUE column, while PRIMARY KEY doesn't allow NULLs. Also, some ORMs that generate code may have problems with tables without primary keys.
UPDATE:
As far as I know it is not possible to replicate tables without primary keys in MSSQL, at least not without problems.
If something is a primary key, depending on your DB engine, the entire table gets sorted by the primary key. This means that lookups are much faster on the primary key because it doesn't have to do any dereferencing as it has to do with any other kind of index. Besides that, it's just theory.
In addition to what the other answers have said, some databases and systems may require a primary key to be present. One situation comes to mind: when using enterprise replication with Informix, a PK must be present for a table to participate in replication.
As long as you do not allow NULL for a value, they should be handled the same, but NULL is handled differently across databases (AFAIK MS-SQL does not allow more than one NULL value in a UNIQUE column, while MySQL and Oracle do).
So you should define this column as NOT NULL with a UNIQUE INDEX.
There is no such thing as a primary key in relational data theory, so your question has to be answered on the practical level.
Unique indexes are not part of the SQL standard. The particular implementation of a DBMS will determine what are the consequences of declaring a unique index.
In Oracle, declaring a primary key will result in a unique index being created on your behalf, so the question is almost moot. I can't tell you about other DBMS products.
I favor declaring a primary key. This has the effect of forbidding NULLs in the key column(s) as well as forbidding duplicates. I also favor declaring REFERENCES constraints to enforce referential integrity. In many cases, declaring an index on the column(s) of a foreign key will speed up joins. This kind of index should in general not be unique.
There are some disadvantages of CLUSTERED INDEXES vs UNIQUE INDEXES.
As already stated, a CLUSTERED INDEX physically orders the data in the table.
This means that when you have a lot of inserts or deletes on a table containing a clustered index, every time (well, almost, depending on your fill factor) you change the data, the physical table needs to be updated to stay sorted.
In relatively small tables, this is fine, but when getting to tables that have GBs worth of data, and inserts/deletes affect the sorting, you will run into problems.
I almost never create a table without a numeric primary key. If there is also a natural key that should be unique, I also put a unique index on it. Joins are faster on integers than multicolumn natural keys, data only needs to change in one place (natural keys tend to need to be updated which is a bad thing when it is in primary key - foreign key relationships). If you are going to need replication use a GUID instead of an integer, but for the most part I prefer a key that is user readable especially if they need to see it to distinguish between John Smith and John Smith.
The few times I don't create a surrogate key are when I have a joining table that is involved in a many-to-many relationship. In this case I declare both fields as the primary key.
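A joining table of that kind might look like the sketch below, run on SQLite through Python's sqlite3 module. The person, car and person_car tables are hypothetical; NOT NULL is spelled out on the key columns because SQLite, unlike most engines, does not imply it for such a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE car (id INTEGER PRIMARY KEY, plate TEXT NOT NULL UNIQUE);
    -- Both columns together form the primary key of the joining table.
    CREATE TABLE person_car (
        person_id INTEGER NOT NULL REFERENCES person(id),
        car_id INTEGER NOT NULL REFERENCES car(id),
        PRIMARY KEY (person_id, car_id));
    INSERT INTO person VALUES (1, 'John Smith'), (2, 'John Smith');
    INSERT INTO car VALUES (1, 'ABC123');
    INSERT INTO person_car VALUES (1, 1), (2, 1);
""")

try:
    conn.execute("INSERT INTO person_car VALUES (1, 1)")  # the same link a second time
    duplicate_link_rejected = False
except sqlite3.IntegrityError:
    duplicate_link_rejected = True

link_count = conn.execute("SELECT COUNT(*) FROM person_car").fetchone()[0]
```

The surrogate ids tell the two John Smiths apart, and the composite key keeps each person/car link from being recorded twice.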
My understanding is that a primary key and a unique index with a not‑null constraint are the same (*); and I suppose one chooses one or the other depending on what the specification explicitly states or implies (a matter of what you want to express and explicitly enforce). If it requires uniqueness and not‑null, then make it a primary key. If it just happens that all parts of a unique index are not‑null without any requirement for that, then just make it a unique index.
The sole remaining difference is, you may have multiple not‑null unique indexes, while you can't have multiple primary keys.
(*) Excepting a practical difference: a primary key can be the default unique key for some operations, like defining a foreign key. E.g. if one defines a foreign key referencing a table and does not provide the column name, and the referenced table has a primary key, then the primary key will be the referenced column. Otherwise, the referenced column will have to be named explicitly.
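That default can be seen in a short sketch on SQLite via Python's sqlite3 module, which follows the same rule; the parent and child tables are hypothetical, and other engines may be stricter about the shorthand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, code TEXT NOT NULL UNIQUE);
    -- No column is named after REFERENCES parent, so the primary key is the target.
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent);
    INSERT INTO parent VALUES (1, 'A');
    INSERT INTO child VALUES (100, 1);
""")

try:
    conn.execute("INSERT INTO child VALUES (101, 2)")  # there is no parent with id 2
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

child_count = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
```

To target the unique code column instead, the reference would have to name it explicitly, as in REFERENCES parent(code).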
Others here have mentioned DB replication, but I don't know about it.
In SQL Server, a unique index can contain only one NULL value, and it is created as a NON-CLUSTERED index by default.
A primary key cannot contain NULL values, and it is created as a CLUSTERED index by default.
In MSSQL, Primary keys should be monotonically increasing for best performance on the clustered index. Therefore an integer with identity insert is better than any natural key that might not be monotonically increasing.
If it were up to me...
You need to satisfy the requirements of the database and of your applications.
Adding an auto-incrementing integer or long id column to every table to serve as the primary key takes care of the database requirements.
You would then add at least one other unique index to the table for use by your application. This would be the index on employee_id, or account_id, or customer_id, etc. If possible, this index should not be a composite index.
I would favor indices on several fields individually over composite indices. The database can use a single-field index whenever the where clause includes that field, but it can only use a composite index through its leading fields - meaning it can't use the second field in a composite index unless the where clause also constrains the first. The order in which the conditions are written in the where clause does not matter.
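How a composite index is matched against a where clause can be inspected with the query planner. This is a sketch on SQLite via Python's sqlite3 module, with a hypothetical table t and index idx_t_a_b; the exact plan text varies between engines and versions, so only the presence of the index name is checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c TEXT)")
conn.execute("CREATE INDEX idx_t_a_b ON t(a, b)")  # composite index, a is the leading column


def plan(sql):
    """Return the planner's description of the first step of the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable detail


plan_a = plan("SELECT c FROM t WHERE a = 1")              # leading column only
plan_a_b = plan("SELECT c FROM t WHERE b = 2 AND a = 1")  # both columns, written in any order
plan_b = plan("SELECT c FROM t WHERE b = 2")              # second column only

for label, text in (("a", plan_a), ("a and b", plan_a_b), ("b", plan_b)):
    print(label, "->", text)
```

In this setup the first two queries are planned through idx_t_a_b, while the query on b alone falls back to scanning the table.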
I am all for using calculated or Function type indices - and would recommend using them over composite indices. It makes it very easy to use the function index by using the same function in your where clause.
This takes care of your application requirements.
It is highly likely that other non-primary indices are actually mappings of that index's key value to a primary key value, not rowid()'s. This allows for physical sorting operations and deletes to occur without having to recreate these indices.

When is it proper to use unique key in a table than a primary key?

Since a primary key and a unique key are similar, I have trouble grasping the difference between the two. I know a primary key doesn't accept NULL and a unique key accepts NULL once, since a NULL value is a unique value and so can only be accepted once. But the idea of a primary key is to have uniqueness in every row, which a unique key also provides. That's why I'm asking when it is proper to use a primary key over a unique key and vice versa.
A UNIQUE constraint is similar to PRIMARY key, but you can have more than one UNIQUE constraint per table.
When you declare a UNIQUE constraint, SQL Server creates a UNIQUE index to speed up the process of searching for duplicates. In this case the index defaults to NONCLUSTERED index, because you can have only one CLUSTERED index per table.
The number of UNIQUE constraints per table is limited by the number of indexes allowed on the table, i.e. 249 NONCLUSTERED indexes and one possible CLUSTERED index.
Contrary to PRIMARY KEY, UNIQUE constraints can accept NULL, but just once. If the constraint is defined on a combination of fields, then every field can accept NULL and can hold values, as long as the combination of values is unique.
Executive summary: It is important for every base table to have a key, using either PRIMARY KEY or NOT NULL UNIQUE. The difference between the two is not a relational consideration and is not important from a logical point of view; rather, it is merely a psychological consideration.
a relvar can have several keys, but we choose just one for underlining
and call that one the primary key. The choice is arbitrary, so the
concept of primary is not really very important from a logical point
of view. The general concept of key, however, is very important! The
term candidate key means exactly the same as key (i.e., the addition
of candidate has no real significance—it was proposed by Ted Codd
because he regarded each key as a candidate for being nominated as the
primary key)... SQL allows a subset of a table's columns to be
declared as a key for that table. It also allows one of them to be
nominated as the primary key. Specifying a key to be primary makes
for a certain amount of convenience in connection with other
constraints that might be needed
What Is a Key? by Hugh Darwen
it's usual... to single out one key as the primary key (and any other
keys for the relvar in question are then said to be alternate keys).
But whether some key is to be chosen as primary, and if so which one,
are essentially psychological issues, beyond the purview of the
relational model as such. As a matter of good practice, most base
relvars probably should have a primary key—but, to repeat, this rule,
if it is a rule, really isn't a relational issue as such... Strong
recommendation [to SQL users]: For base tables, at any rate, use
PRIMARY KEY and/or UNIQUE specifications to ensure that every such
table does have at least one key.
SQL and Relational Theory: How to Write Accurate SQL Code
By C. J. Date
In standard SQL PRIMARY KEY
implies uniqueness but you can specify that explicitly (using UNIQUE).
implies NOT NULL but you can specify that explicitly when creating columns (but you should be avoiding nulls anyhow!)
allows you to omit its columns in a FOREIGN KEY but you can specify them explicitly.
can be declared for only one key per table but it is not clear why (Codd, who originally proposed the concept, did not impose such a restriction).
In some products PRIMARY KEY implies the table's clustered index but you can specify that explicitly (you may not want the primary key to be the clustered index!)
For some people PRIMARY KEY has purely psychological significance:
they think it signifies that the key will be referenced in a foreign key (this was proposed by Codd but not actually adopted by standard SQL nor SQL vendors).
they think it signifies the sole key of the table (but the failure to enforce other candidate keys leads to loss of data integrity).
they think it implies a 'surrogate' or 'artificial ' key with no significance to the business (but actually imposes unwanted significance on the enterprise by being exposed to users).
A table can have multiple UNIQUE key but only one PRIMARY key is allowed for a table.
If your unique key is a NOT NULL UNIQUE KEY then it is always a good idea to promote it to PRIMARY KEY.
If your storage engine is InnoDB and the table has neither a PRIMARY KEY nor a UNIQUE index whose columns are all NOT NULL, InnoDB automatically generates a hidden clustered index (GEN_CLUST_INDEX) on a synthetic row ID column, which can have some performance impact. Hence it is better to always define a primary key with the InnoDB storage engine.
A PK is considered to be a unique identifier of the row. It should never be subject to changes. For example the ID of the User.
A UK is considered to be unique throughout the whole column. It is not necessarily an identifier of the row as it may be subject to changes. For example the username or email address of the User.

What are keys used for in MySQL? [duplicate]

Like
PRIMARY KEY (ID),
KEY name (name),
KEY desc (desc),
etc.
what are they useful for?
Keys are used to enforce referential integrity in your database.
A primary key is, as its name suggests, the primary identification of a given row in your table. That is, each row's primary key will uniquely identify that row.
A unique key is a key that enforces uniqueness on that set of columns. It is similar to a primary key in that it will also uniquely identify a row in a table. However, there is the added benefit of allowing NULL in some of those combinations. There can only be 1 primary key, but you can have many unique keys.
A foreign key is used to enforce a relationship between 2 tables (think parent/child table). That way, a child table can not have a value of X in its parent column unless X actually appears in the parent table. This prevents orphaned records from appearing.
The primary key constraint ensures that the column(s) are:
not null
unique (unique sets if more than one column)
KEY is MySQL's terminology in CREATE TABLE statements for an index. Indexes are not ANSI currently, but all databases use indexes to speed up data retrieval (at the cost of insertion/update/deletion, because of maintenance to keep the index relevant).
There are other key constraints:
unique
foreign key (for referential integrity)
...but your question doesn't include examples of them.
Keys are also called indexes. They are used for speeding up queries. Additionally, keys can be constraints (unique key and foreign key). The primary key is also a unique key and it identifies the records. A table can have other unique keys as well, which do not allow a value to be duplicated in a given column. A foreign key enforces referential integrity (@Derek Kromm already wrote an excellent description). An ordinary key is used only for speeding up queries. You need to index the columns used in the WHERE clause of the queries. If you have no index on the column, MySQL will need to read the whole table to find the records you need. When an index is used, MySQL reads only the index (which is usually a B+ tree) and then reads only those records from the table that it found in the index.
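The difference between reading the whole table and going through an index shows up in the query plan. The sketch below uses SQLite via Python's sqlite3 module rather than MySQL (where EXPLAIN plays a similar role); the users table and idx_users_email index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")


def plan(sql):
    """Return the planner's description of the first step of the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable detail


query = "SELECT name FROM users WHERE email = 'a@example.com'"

plan_without_index = plan(query)  # no index on email yet: the table is scanned
conn.execute("CREATE INDEX idx_users_email ON users(email)")  # an ordinary, non-unique key
plan_with_index = plan(query)     # the same query can now go through the index

print("before:", plan_without_index)
print("after: ", plan_with_index)
```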
Primary KEY is for creating unique/not null constraint for each row in the table. Also searching by this key is the fastest. You can create only one PK in the table.
Ordinary key/index is key for speeding your searching by this column, sorting, grouping and joining with other table by this key.
Indexes drawback:
Adding new indexes to a table will slow down insert/update/delete statements. So you should select the columns to index in your table very carefully.
Keys are used for relating tables to each other, so that you can create joins in order to select data from multiple tables
What, you didn't find the wikipedia entry comprehensive? ;-)
So, a key, in a relational database (such as MySQL, PostgreSQL, Oracle, etc) is a data constraint on a column or set of columns. The most common keys are the Primary key and foreign keys and unique keys.
A foreign key specifically relates the data of one table to data in another table. You might see that a table blog_posts has a foreign key to users based on a user_id column. This means that every user_id in blog_posts will have a corresponding entry in the users table (this is a one-to-many relationship -- a topic for another time).
If a column (or group of columns) has a unique key, that means that there can only be one such incidence of the key in the table. Often you'll see things like email addresses be unique keys -- you only want one email address per user. I've also seen a combination of columns match to a unique key -- the five columns, first_name, last_name, address, city, and state, will often be a unique key -- realistically, there can only be one William Gates at 1835 73rd Ave NE, Medina, Washington. (I do realize that it is possible for a William Gates Jr. to be born, but the designers of that database didn't really care).
The primary key is the primary, unique identifier of a given table. By definition it is a unique key. It is something which cannot be null and must be unique. It holds a special place of prominence among the indexes of a given table.

How many primary key is possible in a table?

How many primary keys are possible in a table in MySQL database.
You can't have several of what is called "primary". The answer is: one. A primary key can contain several columns, though. Then it's what you called a "composite primary key"
For this kind of question, you will always find an answer in the manual:
http://dev.mysql.com/doc/refman/5.5/en/create-table.html
You can only have one primary key, and it can be composite (or not). You can also have many unique indexes, which are logically identical to primary keys (but some functions are not available for them)
you should understand primary key as "the" first index of a table, on many RDBMS it is mandatory to have a primary key in order to have other indexes
You can only have one primary key. From the MySQL documentation:
A PRIMARY KEY is a unique index where all key columns must be defined
as NOT NULL. If they are not explicitly declared as NOT NULL, MySQL
declares them so implicitly (and silently). A table can have only one
PRIMARY KEY. If you do not have a PRIMARY KEY and an application asks
for the PRIMARY KEY in your tables, MySQL returns the first UNIQUE
index that has no NULL columns as the PRIMARY KEY.
You posted a comment about composite primary keys. I suggest doing some reading in the MySQL manual to learn about them: http://dev.mysql.com/doc/refman/5.5/en/create-table.html. Plus there are questions here on SO about composite primary keys; you just have to look for them.
One, hence "primary". You can have other "unique" keys/indexes on a table that signify uniqueness in that column/columns (and would likely be referred to as a candidate key).
One.
You can however use several fields to construct the primary key, if that is what you are looking for.