Primary Key vs Candidate Key - Relational DBMS - relational-database

My understanding is that the primary key is a randomly chosen candidate key from a theoretical perspective.
According to this definition
' Candidate Key – A Candidate Key can be any column or a combination
of columns that can qualify as unique key in database. There can be
multiple Candidate Keys in one table. Each Candidate Key can qualify
as Primary Key.
Primary Key – A Primary Key is a column or a combination of columns
that uniquely identify a record. Only one Candidate Key can be Primary
Key.'
The sentences 'Each Candidate Key can qualify as Primary Key.' and 'Only one Candidate Key can be Primary Key.' are logically consistent only if the primary key is chosen arbitrarily from among the candidate keys. Is this correct?
What special properties does a Primary key have that a Candidate key does not?

The quoted definitions of CK & PK are wrong. Beware, most Stack Overflow/Stack Exchange answers re the relational model are very poor. Eg: You quote Database Administrators. Eg: All answers at the duplicate link merit downvotes except nvogel's. Follow a published academic textbook on information modelling, the relational model & DB design. (Manuals for languages & tools to record & use designs are not such textbooks.) (Nor are wiki articles or web posts.) Ask 1 specific researched non-duplicate question where stuck. PS It is more accurate to say that "PK" is not part of theory. – philipxy

Related

Why do we need a primary key in table? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
Primary key:
Used to serve as a unique identifier for each row in a table.
Cannot accept NULL values.
Creates clustered index.
Only one primary key
Unique key:
Used to serve as a unique identifier for a row in a table.
Can accept one NULL value.
Creates non-clustered index
More than one unique key
We mostly use the primary key to identify each row uniquely, but a unique key does that as well.
I am not sure whether I am right, but a unique key with a NOT NULL constraint seems to provide almost the same behavior as a primary key.
For example:
CREATE TABLE Persons (
ID int NOT NULL UNIQUE,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int);
So why do we really need a primary key, as a unique key can help us achieve whatever we want?
We can also use a unique key as a foreign key to the reference table as well.
A lot of relational purists regard the elevation of one key as the "primary key" as a mistake [1] and believe that all candidate keys should be treated equally. If you're of that school of thought (as you appear to be) then you're correct, you don't need a PK.
[1] In the design of SQL itself, in having this feature.
Similarly, theorists see no problem with keys consisting of zero columns (a table with that key declared does make sense, and can only contain 0 or 1 row). But here SQL will prevent you from declaring such a key.
It's true that a UNIQUE KEY on NOT NULL column(s) does fill the same role as a PRIMARY KEY.
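To make that concrete, here is a minimal sketch that builds the Persons table from the question and checks what the NOT NULL UNIQUE column rejects. It assumes SQLite, driven through Python's standard sqlite3 module, purely so the snippet can be run anywhere; the try_insert helper and the sample rows are invented for illustration, and other products may word their errors differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Persons (
        ID int NOT NULL UNIQUE,
        LastName varchar(255) NOT NULL,
        FirstName varchar(255),
        Age int
    )
""")
conn.execute("INSERT INTO Persons (ID, LastName) VALUES (1, 'Smith')")


def try_insert(sql):
    """Run one INSERT and report whether the constraints let it through."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# A second row with the same ID violates UNIQUE.
duplicate_result = try_insert(
    "INSERT INTO Persons (ID, LastName) VALUES (1, 'Jones')")
# A row without an ID violates NOT NULL.
null_result = try_insert(
    "INSERT INTO Persons (ID, LastName) VALUES (NULL, 'Brown')")

print(duplicate_result)
print(null_result)
```

Both inserts are refused, which is the same protection a PRIMARY KEY on ID would give in this table.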
Folks should remember that a primary key is a constraint, and may be defined on multiple columns. Primary key does not have to be a single-column auto-inc integer.
In MySQL's InnoDB engine, the table is stored as a clustered index by its primary key, or if it doesn't have a primary key, then it's a clustered index by its first non-null unique key. (a table can have multiple unique keys). See https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html
But isn't it quicker and more clear to say the clustered index is by the primary key? :-)
There is a semantic difference.
Ignoring the minor differences such as the handling of NULLs, a primary key is essentially a unique constraint that is simply designated as the primary one. It's the one true way to identify a given row. If I had a particular row that I wanted you to see, what would I tell you about the row to make sure you look at the exact same row? I'd tell you the primary key. "Hey, I think we have a problem with row #78. Can you look at row #78?" There might be other unique constraints on the table, but the PK is the go-to.
The primary key is the canonical identification of a row. It might be a single column or a combination of columns. It is entirely possible, valid, and common, to have a primary key and no unique constraints.
Practical Considerations
Anecdotally, foreign keys very often map rows across tables and therefore the primary key of a table is likely to be repeated across tables. When doing an update of the value, therefore, foreign keys get impacted. Due to this, it is often desirable to have an immutable primary key. So-called surrogate keys are popular for this reason, though there are schools of thought on that subject. [1]
Documentation tools typically inspect primary keys and foreign keys in order to determine the relationships between tables. For example, you can auto-generate an ER diagram by traversing the FK to PK relationships.
[1] With that said, keep in mind that foreign keys can point to any column of any table, including the same table. Pointing to the primary key of a table is extremely common because it usually makes the most sense, but there are cases where it does not make sense.
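As a rough illustration of how a documentation tool might discover those relationships, the sketch below walks the declared foreign keys of a small schema and prints one edge per reference. It assumes SQLite and its PRAGMA foreign_key_list; the users/orders tables and the foreign_key_edges helper are made up for the example, and other products expose the same metadata through their own catalog views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users (user_id)
    );
""")


def foreign_key_edges(connection):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    edges = []
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # Each row is (id, seq, parent_table, from_column, to_column, ...).
        for fk in connection.execute(f"PRAGMA foreign_key_list({table})"):
            edges.append((table, fk[3], fk[2], fk[4]))
    return edges


edges = foreign_key_edges(conn)
for child, child_col, parent, parent_col in edges:
    print(f"{child}.{child_col} -> {parent}.{parent_col}")
```

Each printed edge is one line an ER diagram generator could draw between two tables.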
Although their definitions are similar, they have different applications. The primary key is a column (or set of columns) that serves to uniquely identify each tuple in the table. Furthermore, it adds integrity constraints to the table and maintains the relationship with foreign keys. The unique key is simpler: it only serves to uniquely identify a tuple in the table.

Which column we should choose as primary key if we have 2 unique columns

I know what's the difference between unique and primary key as explained in following links
http://sqlhints.com/2013/06/02/difference-between-primary-key-and-unique-key-in-sql-server/
difference between primary key and unique key
but I was asked something different in an interview:
If I have 2 columns in a table that can each uniquely identify
entries, say a student's roll no and national id no, which one
should I select as the primary key? And what is the difference between a primary
key and a unique key if we add a NOT NULL constraint to the unique key?
Any help regarding this?
Thanks in advance.
I am a big advocate of the third way -- having an auto-incremented, integer primary key. You can then put unique indexes on the others.
Such a key offers several things. First, it allows you to determine the order of insertion of records -- at least to a close approximation. Second, it makes foreign key references easier. I typically name such columns <tablename>Id, which makes external references obvious.
Third, integers are a pretty efficient mechanism for indexes (fixed width values, typically four bytes). Admittedly, they do incur a bit of overhead in the original table, but that is usually minor.
Fourth, the columns may not be applicable. In the United States, the closest we have to a national id number is a social security number -- and it is not guaranteed to be unique. Further, there may be students who enroll in the school that do not have such numbers -- foreign students, for instance.
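A sketch of that layout, for what it's worth: an auto-incremented integer as the primary key, with unique constraints on the two natural identifiers. It assumes SQLite via Python's sqlite3 module; the students table, its column names and the sample rows are invented. Note that SQLite lets several rows leave national_id NULL under a UNIQUE constraint, whereas some products (SQL Server, for instance) would accept only one such row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        student_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        roll_no     TEXT NOT NULL UNIQUE,               -- candidate key
        national_id TEXT UNIQUE,                        -- may be missing
        name        TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO students (roll_no, national_id, name) VALUES (?, ?, ?)",
    [("R-001", "N-1234", "Asha"),
     ("R-002", None, "Ben"),      # e.g. a foreign student with no national id
     ("R-003", None, "Chloe")],
)

# The surrogate key reflects insertion order.
rows = conn.execute(
    "SELECT student_id, roll_no FROM students ORDER BY student_id").fetchall()
print(rows)

# The unique constraint still protects the natural identifier.
try:
    conn.execute("INSERT INTO students (roll_no, name) VALUES ('R-001', 'Dup')")
    duplicate_roll_accepted = True
except sqlite3.IntegrityError:
    duplicate_roll_accepted = False
print(duplicate_roll_accepted)
```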

When is it proper to use unique key in a table than a primary key?

Primary key and unique key are similar, so I have trouble grasping the difference between the two. I know a primary key doesn't accept NULL and a unique key accepts a single NULL, since a NULL value is treated as a unique value and so can be accepted only once. But the idea of a primary key is uniqueness in every row, which a unique key also provides. That's why I'm asking when it is proper to use a primary key over a unique key and vice versa.
A UNIQUE constraint is similar to PRIMARY key, but you can have more than one UNIQUE constraint per table.
When you declare a UNIQUE constraint, SQL Server creates a UNIQUE index to speed up the process of searching for duplicates. In this case the index defaults to NONCLUSTERED index, because you can have only one CLUSTERED index per table.
The number of UNIQUE constraints per table is limited by the number of indexes allowed on the table, i.e. 249 NONCLUSTERED indexes (999 from SQL Server 2008 onward) and one possible CLUSTERED index.
Unlike a PRIMARY KEY, a UNIQUE constraint can accept NULL, but just once. If the constraint is defined on a combination of fields, then every field can accept NULL and can hold values, as long as the combination of values is unique.
Also Refer other link (MSDN)
Executive summary: It is important for every base table to have a key, using either PRIMARY KEY or NOT NULL UNIQUE. The difference between the two is not a relational consideration and is not important from a logical point of view; rather, it is merely a psychological consideration.
a relvar can have several keys, but we choose just one for underlining
and call that one the primary key. The choice is arbitrary, so the
concept of primary is not really very important from a logical point
of view. The general concept of key, however, is very important! The
term candidate key means exactly the same as key (i.e., the addition
of candidate has no real significance—it was proposed by Ted Codd
because he regarded each key as a candidate for being nominated as the
primary key)... SQL allows a subset of a table's columns to be
declared as a key for that table. It also allows one of them to be
nominated as the primary key. Specifying a key to be primary makes
for a certain amount of convenience in connection with other
constraints that might be needed
What Is a Key? by Hugh Darwen
it's usual... to single out one key as the primary key (and any other
keys for the relvar in question are then said to be alternate keys).
But whether some key is to be chosen as primary, and if so which one,
are essentially psychological issues, beyond the purview of the
relational model as such. As a matter of good practice, most base
relvars probably should have a primary key—but, to repeat, this rule,
if it is a rule, really isn't a relational issue as such... Strong
recommendation [to SQL users]: For base tables, at any rate, use
PRIMARY KEY and/or UNIQUE specifications to ensure that every such
table does have at least one key.
SQL and Relational Theory: How to Write Accurate SQL Code
By C. J. Date
In standard SQL, PRIMARY KEY:
implies uniqueness but you can specify that explicitly (using UNIQUE).
implies NOT NULL but you can specify that explicitly when creating columns (but you should be avoiding nulls anyhow!)
allows you to omit its columns in a FOREIGN KEY but you can specify them explicitly.
can be declared for only one key per table but it is not clear why (Codd, who originally proposed the concept, did not impose such a restriction).
In some products PRIMARY KEY implies the table's clustered index but you can specify that explicitly (you may not want the primary key to be the clustered index!)
For some people PRIMARY KEY has purely psychological significance:
they think it signifies that the key will be referenced in a foreign key (this was proposed by Codd but not actually adopted by standard SQL nor SQL vendors).
they think it signifies the sole key of the table (but the failure to enforce other candidate keys leads to loss of data integrity).
they think it implies a 'surrogate' or 'artificial' key with no significance to the business (but actually imposes unwanted significance on the enterprise by being exposed to users).
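The point about omitting the referenced columns in a FOREIGN KEY can be tried out with a small sketch. It assumes SQLite (which, like standard SQL, falls back to the parent's primary key when REFERENCES names only the table) and Python's sqlite3 module; the departments/employees schema is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    );
    -- No column list after REFERENCES: the parent's primary key is implied.
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES departments
    );
    INSERT INTO departments (dept_id, name) VALUES (10, 'Research');
    INSERT INTO employees (emp_id, dept_id) VALUES (1, 10);
""")

# dept_id 99 does not exist in departments, so this row should be refused.
try:
    conn.execute("INSERT INTO employees (emp_id, dept_id) VALUES (2, 99)")
    unknown_department_accepted = True
except sqlite3.IntegrityError:
    unknown_department_accepted = False
print(unknown_department_accepted)
```

Had the reference been meant for departments.name instead, the column would have to be spelled out, which is the convenience difference the list describes.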
A table can have multiple UNIQUE keys, but only one PRIMARY key is allowed per table.
If your unique key is a NOT NULL UNIQUE KEY, then it is always a good idea to promote it to PRIMARY KEY.
If your storage engine is InnoDB and the table has neither a PRIMARY KEY nor a NOT NULL UNIQUE index, InnoDB automatically creates a hidden clustered index on an internal 6-byte row ID, which can have some performance impact; hence it is better to always declare a primary key with the InnoDB storage engine.
A PK is considered to be a unique identifier of the row. It should never be subject to change. For example, the ID of the User.
A UK is considered to be unique throughout the whole column. It is not necessarily an identifier of the row, as it may be subject to change. For example, the username or email address of the User.

What are keys used for in MySQL? [duplicate]

Closed 11 years ago.
Possible Duplicate:
mySQL's KEY keyword?
Like
PRIMARY KEY (ID),
KEY name (name),
KEY desc (desc),
etc.
what are they useful for?
Keys are used to enforce referential integrity in your database.
A primary key is, as its name suggests, the primary identification of a given row in your table. That is, each row's primary key will uniquely identify that row.
A unique key is a key that enforces uniqueness on that set of columns. It is similar to a primary key in that it will also uniquely identify a row in a table. However, there is the added benefit of allowing NULL in some of those combinations. There can only be 1 primary key, but you can have many unique keys.
A foreign key is used to enforce a relationship between 2 tables (think parent/child table). That way, a child table can not have a value of X in its parent column unless X actually appears in the parent table. This prevents orphaned records from appearing.
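A small sketch of the parent/child rule described above, assuming SQLite through Python's sqlite3 module (where foreign key enforcement has to be switched on per connection). The parent and child tables and the attempt helper are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL,
        FOREIGN KEY (parent_id) REFERENCES parent (id)
    );
    INSERT INTO parent (id) VALUES (1);
    INSERT INTO child (id, parent_id) VALUES (100, 1);
""")


def attempt(sql):
    """Run one statement; return True if the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A child pointing at a parent that does not exist would be an orphan.
orphan_insert_ok = attempt("INSERT INTO child (id, parent_id) VALUES (101, 2)")
# Deleting a parent that still has children would orphan them too.
parent_delete_ok = attempt("DELETE FROM parent WHERE id = 1")

print(orphan_insert_ok, parent_delete_ok)
```

Both statements are refused, so the child table can never reference a parent row that is not there.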
The primary key constraint ensures that the column(s) are:
not null
unique (unique sets if more than one column)
KEY is MySQL's terminology in CREATE TABLE statements for an index. Indexes are not ANSI currently, but all databases use indexes to speed up data retrieval (at the cost of insertion/update/deletion, because of maintenance to keep the index relevant).
There are other key constraints:
unique
foreign key (for referential integrity)
...but your question doesn't include examples of them.
Keys are also called indexes. They are used for speeding up queries. Additionally, keys can be constraints (unique key and foreign key). The primary key is also a unique key, and it identifies the records. A table can have other unique keys as well, which prevent duplicate values in a given column. A foreign key enforces referential integrity (@Derek Kromm already wrote an excellent description). An ordinary key is used only for speeding up queries. You need to index the columns used in the WHERE clause of your queries. If you have no index on the column, MySQL will need to read the whole table to find the records you need. When an index is used, MySQL reads only the index (which is usually a B+ tree) and then reads only those records from the table that it found in the index.
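The difference between reading the whole table and using an index can be seen in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in, since MySQL's EXPLAIN output looks different; the people table, the idx_people_name index and the plan helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [(f"person{i}", i % 90) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable steps of SQLite's query plan."""
    # The description is the last column of each EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT id, age FROM people WHERE name = 'person42'"

plan_without_index = plan(query)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_people_name ON people (name)")
plan_with_index = plan(query)      # expected to search via the new index

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of people and the second names idx_people_name.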
A primary key creates a unique/not null constraint for each row in the table. Searching by this key is also the fastest. You can create only one PK per table.
An ordinary key/index speeds up searching by its column, as well as sorting, grouping and joining with other tables on that column.
Index drawbacks:
Adding new indexes to a table slows down insert/update/delete statements, so you should choose the columns to index very carefully.
Keys are used to relate tables to one another, so you can create joins to select data from multiple tables.
What, you didn't find the wikipedia entry comprehensive? ;-)
So, a key, in a relational database (such as MySQL, PostgreSQL, Oracle, etc) is a data constraint on a column or set of columns. The most common keys are the Primary key and foreign keys and unique keys.
A foreign key specifically relates the data of one table to data in another table. You might see that a table blog_posts has a foreign key to users based on a user_id column. This means that every user_id in blog_posts will have a corresponding entry in the users table (this is a one-to-many relationship -- a topic for another time).
If a column (or group of columns) has a unique key, that means that there can only be one occurrence of each key value in the table. Often you'll see things like email addresses be unique keys -- you only want one email address per user. I've also seen a combination of columns match to a unique key -- the five columns, first_name, last_name, address, city, and state, will often be a unique key -- realistically, there can only be one William Gates at 1835 73rd Ave NE, Medina, Washington. (I do realize that it is possible for a William Gates Jr. to be born, but the designers of that database didn't really care).
The primary key is the primary, unique identifier of a given table. By definition it is a unique key. It is something which cannot be null and must be unique. It holds a special place of prominence among the indexes of a given table.

Difference between Unique Key and Primary Keys

I came across the following SQL in a book:
CREATE TABLE `categories` (
id SMALLINT NOT NULL AUTO_INCREMENT,
category VARCHAR(30) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `category` (`category`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
I was wondering is there a reason why I would need a PRIMARY and UNIQUE KEY in the same table? I guess, underlying that question is, what is the difference between PRIMARY and UNIQUE keys?
The relational model says there's no essential difference between one key and another. That is, when a relation has more than one candidate key, there are no theoretical reasons for declaring that this key is more important than that key. Essentially, that means there's no theoretical reason for identifying one key as a primary key, and all the others as secondary keys. (There might be practical reasons, though.)
Many relations have more than one candidate key. For example, a relation of US states might have data like this.
State     Abbr     Postal Code
-------   ------   -----------
Alabama   Ala.     AL
Alaska    Alaska   AK
Arizona   Ariz.    AZ
...
Wyoming   Wyo.     WY
It's clear that values in each of those three columns are unique--there are three candidate keys.
If you were going to build a table in SQL to store those values, you might do it like this.
CREATE TABLE states (
state varchar(15) primary key,
abbr varchar(10) not null unique,
postal_code char(2) not null unique
);
And you'd do something like that because SQL doesn't have any other way to say "My table has three separate candidate keys."
I didn't have any particular reason for choosing "state" as the primary key. I could have just as easily chosen "abbr" or "postal_code". Any of those three columns can be used as the target for a foreign key reference, too.
And as far as that goes, I could have built the table like this, too.
CREATE TABLE states (
state varchar(15) not null unique,
abbr varchar(10) not null unique,
postal_code char(2) not null unique
);
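As a quick check that a candidate key other than the primary key can be a foreign key target, here is a sketch that points a reference at postal_code. It assumes SQLite via Python's sqlite3 module; the addresses table and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE states (
        state       varchar(15) primary key,
        abbr        varchar(10) not null unique,
        postal_code char(2)     not null unique
    );
    -- The reference targets postal_code, which is unique but not the primary key.
    CREATE TABLE addresses (
        address_id  INTEGER PRIMARY KEY,
        city        TEXT NOT NULL,
        postal_code char(2) NOT NULL REFERENCES states (postal_code)
    );
    INSERT INTO states VALUES ('Alabama', 'Ala.', 'AL');
    INSERT INTO states VALUES ('Alaska', 'Alaska', 'AK');
    INSERT INTO addresses (city, postal_code) VALUES ('Juneau', 'AK');
""")

# 'ZZ' is not a postal code in states, so the reference cannot be satisfied.
try:
    conn.execute(
        "INSERT INTO addresses (city, postal_code) VALUES ('Nowhere', 'ZZ')")
    unknown_code_accepted = True
except sqlite3.IntegrityError:
    unknown_code_accepted = False
print(unknown_code_accepted)
```

The same schema works unchanged if state is declared not null unique instead of primary key, since the foreign key only needs its target to be a declared key.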
I'm surprised that nobody mentioned that a primary key can be referenced as a foreign key in other tables.
Also, a unique constraint allows NULL values.
The reason you need two uniqueness restrictions (one being the Primary Key) is that you are using Id as a surrogate key. I.e., it is an arbitrary value that has no meaning in relation to the data itself. Without the unique key (or colloquially known as "business key" i.e, a key that the user would recognize as being enforced), a user could add two identical category values with different arbitrary Id values. Since users should never see the surrogate key, they would not know why they are seeing a duplicate even though the database would think they are different.
When using surrogate keys, having another unique constraint on something other than the surrogate key is critical to avoid duplicate data.
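The duplicate problem is easy to reproduce. The sketch below compares two versions of a categories table, one with only the surrogate key and one that also declares the business key unique. It assumes SQLite via Python's sqlite3 module; the table names categories_loose and categories_strict and the helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Surrogate key only: nothing stops the same category appearing twice.
    CREATE TABLE categories_loose (
        id       INTEGER PRIMARY KEY,
        category VARCHAR(30) NOT NULL
    );
    -- Surrogate key plus a unique constraint on the business key.
    CREATE TABLE categories_strict (
        id       INTEGER PRIMARY KEY,
        category VARCHAR(30) NOT NULL UNIQUE
    );
""")


def rows_after_two_inserts(table):
    """Insert 'Books' twice and return how many copies were stored."""
    for _ in range(2):
        try:
            conn.execute(f"INSERT INTO {table} (category) VALUES ('Books')")
        except sqlite3.IntegrityError:
            pass  # the second insert is refused where the constraint exists
    return conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE category = 'Books'").fetchone()[0]


loose_count = rows_after_two_inserts("categories_loose")
strict_count = rows_after_two_inserts("categories_strict")
print(loose_count, strict_count)
```

The loose table ends up with two 'Books' rows that differ only in their generated id; the strict table keeps one.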
Depending on who you talk to and how they read the specification, unique keys (a redundant term, by the way, since a "key" is by definition unique) are also not supposed to allow nulls. However, one can also read the specifications as saying that Unique constraints, unlike Primary Key constraints, are in fact supposed to allow nulls (how many nulls are allowed also varies by vendor). Most products, including MySQL, do allow nulls in Unique constraints whereas Primary Key constraints do not.
Similarity
Both a PRIMARY and UNIQUE index create a constraint that requires all values to be distinct (1).
Difference
The PRIMARY key (implicitly) defines all key columns as NOT NULL; additionally, a table can only have one primary key.
(1) Each NULL value is considered to be distinct.
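Footnote (1) can be checked with a short sketch. It assumes SQLite via Python's sqlite3 module, which treats NULLs in a UNIQUE index as distinct in the same way the footnote describes; as noted elsewhere in this thread, some products (SQL Server, for instance) accept only a single NULL. The contacts table is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")

# Two NULL emails and one real address.
conn.executemany(
    "INSERT INTO contacts (email) VALUES (?)",
    [(None,), (None,), ("a@example.com",)],
)
null_count = conn.execute(
    "SELECT COUNT(*) FROM contacts WHERE email IS NULL").fetchone()[0]

# A repeated non-NULL value is still a violation.
try:
    conn.execute("INSERT INTO contacts (email) VALUES ('a@example.com')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(null_count, duplicate_accepted)
```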
A UNIQUE constraint and a PRIMARY KEY are similar in that both enforce uniqueness of the column(s) on which they are defined.
Some basic differences between a Primary Key and a Unique Key are as follows.
Primary key
Primary key cannot have a NULL value.
Each table can have only single primary key.
Primary key is implemented as indexes on the table. By default this index is clustered index.
Primary key can be referenced from another table as a Foreign Key.
We can generate ID automatically with the help of Auto Increment field. Primary key supports Auto Increment value.
Unique Constraint
Unique Constraint may have a NULL value.
Each table can have more than one Unique Constraint.
Unique Constraint is also implemented as indexes on the table. By default this index is Non-clustered index.
Unique Constraint can also be referenced from another table as a Foreign Key in most products, although referencing the Primary key is far more common.
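Unique Constraint columns are less commonly paired with Auto Increment, although products such as MySQL only require an Auto Increment column to be indexed.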
You can find detailed information from: http://www.oracleinformation.com/2014/04/difference-between-primary-key-and-unique-key.html