I'm creating a database using MySQL for a music streaming application for my school project. It has a table "song_discoveries" with these columns: user_id, song_id and discovery_date. It has no primary key. "user_id" and "song_id" are foreign keys, and "discovery_date" is self-explanatory. My problem is that I want to ensure there are no duplicate rows in this table, since a user can obviously discover a song only once, but I'm not sure whether to use a unique constraint on all of the columns or to create a composite primary key of all columns. My main concerns are: what is the best practice for this, and which has better performance? Are there any alternatives to these approaches?
In MySQL (with the default InnoDB storage engine), a table is stored as a clustered index sorted by the primary key.
If the table has no primary key, but does have a unique key constraint on non-NULL columns, then the unique key becomes the clustered index, and acts almost exactly the same as a primary key. There's no performance difference between these two cases.
The only difference between a primary and a unique key on non-NULL columns is that you can specify the name of the unique key, but the primary key is always called PRIMARY.
If the goal is "no duplicate rows in this table", you first need to identify what makes a record unique. If uniqueness is guaranteed by the combination of user_id, song_id and discovery_date, then that combination should be your composite primary key.
Thinking a bit more: if we apply the rule "a user can discover a song only once", then the composite primary key should be (user_id, song_id), which guarantees that the same song is never added twice for the same user. If a user can discover the same song on multiple days, keep the key as the combination of all three fields.
If you go with user/song then a table can look like this:
CREATE TABLE song_discoveries (
user_id int NOT NULL,
song_id int NOT NULL,
discovery_date DATE NOT NULL,
PRIMARY KEY (user_id, song_id)
);
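To see the (user_id, song_id) key doing its job, here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, so it runs without a server; record_discovery is a hypothetical helper name, not something from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the table definition is the one above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE song_discoveries (
        user_id        INTEGER NOT NULL,
        song_id        INTEGER NOT NULL,
        discovery_date DATE    NOT NULL,
        PRIMARY KEY (user_id, song_id)
    )
""")

def record_discovery(user_id, song_id, discovery_date):
    """Hypothetical helper: insert one discovery, return False if it is a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO song_discoveries VALUES (?, ?, ?)",
                (user_id, song_id, discovery_date),
            )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected a second (user_id, song_id) pair.
        return False

first = record_discovery(1, 10, "2024-01-01")
repeat = record_discovery(1, 10, "2024-02-01")      # same user and song, later date
other_song = record_discovery(1, 11, "2024-01-01")  # same user, different song
row_count = conn.execute("SELECT COUNT(*) FROM song_discoveries").fetchone()[0]
```

If you wanted to allow the same song on different days, you would put discovery_date into the key as well, and the second call would then succeed.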
Related
Let's say I have table A with two junction tables B and C. How would I go about creating primary keys for table A? I have two tables of this type in a diagram I drew (the circle keys are foreign keys, by the way).
Image with junction tables
Your games table needs only one primary key: this identifies each specific game. In the junction tables, the primary keys are composed of the game primary key and the directors (or types) primary key.
Taken from the reference in the tutorial MySQL Primary Key:
CREATE TABLE roles(
role_id INT AUTO_INCREMENT,
role_name VARCHAR(50),
PRIMARY KEY(role_id)
);
It is difficult to provide information about your specific question because there is too little detail in it.
From your comment "if a table has two junction tables attached to it, would it need to have two primary keys?". No.
A primary key is actually a logical concept (a design mechanism) used to define a logical model. A primary key is a set of attributes (columns) that together uniquely identify each and every tuple (row) in a relation (table). One of the rules of a primary key is that there is only one per relation.
As mentioned, the logical model serves as the design for the physical model: relations become tables, attributes become columns, primary keys may become unique indexes, foreign keys may become indexes in the related table, and so on.
Many RDBMSs allow a PRIMARY KEY to be specified in a physical table definition. Most also allow FOREIGN KEYs to be defined on a physical table. What they do with them may vary from one implementation to another. Many use the definition of a PRIMARY KEY to create a UNIQUE INDEX of some sort to enforce the rule that the key "must uniquely identify" each and every record in the table.
So, no, your games_directors table does not need, nor can it have, two primary keys. If you did choose to specify a PRIMARY KEY, you would need to specify all the columns that uniquely identify records in the games_directors table - most likely PRIMARY KEY (game_id, director_id).
Similarly, the PRIMARY KEY for the games table would likely be PRIMARY KEY (game_id), for the directors would likely be PRIMARY KEY (director_id) and for game types it would likely be PRIMARY KEY (game_type_id).
You might use foreign keys from your games_directors table to ensure that, when records are added to it, the corresponding game exists in the games table and the corresponding director exists in the directors table. In this case, your games_directors table will have two foreign key relationships (one to games and another to directors). But only one PRIMARY KEY.
So you might end up with something like this:
create table games (
game_id integer,
PRIMARY KEY (game_id)
);
create table directors (
director_id integer,
PRIMARY KEY (director_id)
);
CREATE TABLE games_directors (
game_id INTEGER NOT NULL,
director_id INTEGER NOT NULL,
commission_paid DECIMAL(10,2),
PRIMARY KEY (game_id, director_id),
FOREIGN KEY (game_id) REFERENCES games(game_id),
FOREIGN KEY (director_id) REFERENCES directors(director_id)
);
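To illustrate how the single composite primary key and the two foreign keys work together, here is a rough sketch. It assumes Python's sqlite3 with an in-memory SQLite database rather than PostgreSQL or MySQL; try_link is a hypothetical helper, and the PRAGMA line is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite only enforces foreign keys after this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE games (game_id INTEGER PRIMARY KEY);
    CREATE TABLE directors (director_id INTEGER PRIMARY KEY);
    CREATE TABLE games_directors (
        game_id INTEGER NOT NULL,
        director_id INTEGER NOT NULL,
        commission_paid DECIMAL(10,2),
        PRIMARY KEY (game_id, director_id),
        FOREIGN KEY (game_id) REFERENCES games(game_id),
        FOREIGN KEY (director_id) REFERENCES directors(director_id)
    );
    INSERT INTO games VALUES (1);
    INSERT INTO directors VALUES (7);
""")

def try_link(game_id, director_id):
    """Hypothetical helper: True if the junction row was accepted, False otherwise."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO games_directors (game_id, director_id) VALUES (?, ?)",
                (game_id, director_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

valid_link = try_link(1, 7)        # game 1 and director 7 both exist
duplicate_link = try_link(1, 7)    # same pair again: blocked by the primary key
missing_game = try_link(2, 7)      # game 2 does not exist: blocked by a foreign key
missing_director = try_link(1, 8)  # director 8 does not exist: blocked by a foreign key
```

One primary key spanning two columns covers the "no duplicate pairs" rule, while each foreign key separately covers the "must exist in the parent table" rule.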
NB: I didn't tested the above using PostgreSql. The syntax should work for most RDBMS's, but some may require tweaking slightly.
Indexes can be used to speed up access to individual records within a table. For example, you might want to create an index on director name or director id (depending upon how you most frequently access that table). If you mostly access the directors table with an equality condition such as where director_name = 'fred', then an index on director_name might make sense.
Indexes become more useful as the number of records in the tables grows.
I hope this answers your question. :-)
I am working on a project and I realized I am unsure about how to use multiple primary keys. I have a table named "User_Details" that has the details of Customer ID, email address and password. From my understanding, I can use both Customer ID and email address as the primary key. In this case do I use only one as Primary Key or both? If I use both, do they become composite primary keys?
(PS. I have other tables, where the foreign key is the customer ID)
You can only have one primary key, but you could definitely have other unique fields.
Usually an integer id is preferred over a string as the primary key. An id is presumably auto-assigned, whereas an email could change, which would be a problem for foreign key relations.
Since you already use customer Id as a foreign key in other tables, I would suggest you continue to do that.
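A small sketch of that layout, with the customer ID as the primary key and the email as a separate unique column. It assumes Python's sqlite3 with in-memory SQLite instead of your actual database, and the orders table and sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch for FK enforcement
conn.executescript("""
    CREATE TABLE user_details (
        customer_id INTEGER PRIMARY KEY,   -- the one primary key
        email       TEXT NOT NULL UNIQUE,  -- unique, but not part of the primary key
        password    TEXT NOT NULL
    );
    -- Hypothetical child table that references the customer ID.
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES user_details(customer_id)
    );
    INSERT INTO user_details VALUES (1, 'ann@example.com', 'secret');
    INSERT INTO orders VALUES (100, 1);
""")

# The email can change without touching any row that references the customer.
with conn:
    conn.execute(
        "UPDATE user_details SET email = 'ann@new.example' WHERE customer_id = 1"
    )
order_owner_email = conn.execute(
    "SELECT u.email FROM orders o JOIN user_details u USING (customer_id) "
    "WHERE o.order_id = 100"
).fetchone()[0]

# A second customer cannot reuse an email that is already taken.
try:
    with conn:
        conn.execute("INSERT INTO user_details VALUES (2, 'ann@new.example', 'pw')")
    duplicate_email_rejected = False
except sqlite3.IntegrityError:
    duplicate_email_rejected = True
```

Had the email been the primary key, that UPDATE would have had to be propagated to every referencing table.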
You can only have one primary key, but that primary key can span multiple columns. Alternatively, you can have unique indexes on your table, which work a bit like a primary key in that they enforce unique values and speed up querying of those values.
The easiest way, though, is a composite primary key, which is a primary key made from two or more columns. For example:
CREATE TABLE userdata (
userid INT,
userdataid INT,
info char(200),
primary key (userid, userdataid)
);
A table can have multiple candidate keys. Each candidate key is a column or set of columns that are UNIQUE, taken together, and also NOT NULL. Thus, specifying values for all the columns of any candidate key is enough to determine that there is one row that meets the criteria, or no rows at all.
Candidate keys are a fundamental concept in the relational data model.
It's common practice, if multiple keys are present in one table, to designate one of the candidate keys as the primary key. It's also common practice to cause any foreign keys to the table to reference the primary key, rather than any other candidate key.
I recommend these practices, but there is nothing in the relational model that requires selecting a primary key among the candidate keys.
I'm using a MySQL database.
In which situations should I create a unique key or a primary key?
Primary Key:
There can only be one primary key constraint in a table
In some DBMS it cannot be NULL - e.g. MySQL adds NOT NULL
Primary Key is a unique key identifier of the record
Unique Key:
Can be more than one unique key in one table
Unique key can have NULL values
It can be a candidate key
Unique key can be NULL ; multiple rows can have NULL values and therefore may not be considered "unique"
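The NULL behaviour of a unique key is easy to check. The sketch below assumes Python's sqlite3 with in-memory SQLite, which, like MySQL, lets several rows hold NULL in a UNIQUE column (SQL Server behaves differently). The members table and insert_member helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE members (
        member_id INTEGER PRIMARY KEY,
        phone     TEXT UNIQUE          -- nullable unique column
    )
""")

def insert_member(member_id, phone):
    """Hypothetical helper: True if the row was accepted, False on a constraint error."""
    try:
        with conn:
            conn.execute("INSERT INTO members VALUES (?, ?)", (member_id, phone))
        return True
    except sqlite3.IntegrityError:
        return False

first_null = insert_member(1, None)
second_null = insert_member(2, None)          # a second NULL is not a duplicate
first_phone = insert_member(3, "555-0100")
repeat_phone = insert_member(4, "555-0100")   # same non-NULL value: rejected
null_rows = conn.execute(
    "SELECT COUNT(*) FROM members WHERE phone IS NULL"
).fetchone()[0]
```

So the constraint only compares non-NULL values with each other, which is why a nullable unique key cannot be relied on to identify every row.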
Unique Key (UK): It's a column or a group of columns that can identify a uniqueness in a row.
Primary Key (PK): It's also a column or group of columns that can identify a uniqueness in a row.
So the Primary key is just another name for unique key, but the default implementation in SQL Server is different for Primary and Unique Key.
By Default:
PK creates a Clustered index and UK creates a Non Clustered Index.
PK is not null, but UK allows nulls (Note: By Default)
There can be only one PK on a table, but there can be multiple UKs
You can override the default implementation depending upon your need.
It really depends on what your aim is when deciding whether to create a UK or a PK. It follows an analogy like:
"In a team of three people, all of them are peers, but one of them will be the leader among those peers: PK and UK have a similar relation." I would suggest reading this article. The example given by the author may not seem suitable, but try to get an overall idea.
http://tsqltips.blogspot.com/2012/06/difference-between-unique-key-and.html
An organization or business has many physical entities (such as people, resources, machines, etc.) and virtual entities (their tasks, transactions, activities).
Typically, the business needs to record and process information about those business entities.
These business entities are identified within the whole business domain by a key.
From an RDBMS perspective, a key (a.k.a. candidate key) is a value or set of values that uniquely identifies an entity.
A database table may have many keys, each of which might be eligible to be the primary key.
All of these keys (the primary key, unique keys, etc.) are collectively called candidate keys.
The key that the DBA selects from the candidate keys for looking up records is called the primary key.
Difference between Primary Key and Unique key
1. Behavior: The primary key is used to identify a row (record) in a table, whereas a unique key prevents duplicate values in a column (with the exception of null entries).
2. Indexing: By default, SQL Server creates a clustered index on the primary key if the table does not already have one, and a non-clustered index on a unique key.
3. Nullability: A primary key cannot include null values, whereas a unique key can.
4. Existence: A table can have at most one primary key, but it can have multiple unique keys.
5. Modifiability: Primary key values should be treated as stable, because every referencing foreign key depends on them; unique-key values can be changed more freely.
For more information and Examples:
http://dotnetauthorities.blogspot.in/2013/11/Microsoft-SQL-Server-Training-Online-Learning-Classes-Integrity-Constraints-PrimaryKey-Unique-Key_27.html
A primary key must be unique.
A unique key does not have to be the primary key - see candidate key.
That is, there may be more than one combination of columns on a table that can uniquely identify a row - only one of these can be selected as the primary key. The others, though unique are candidate keys.
Primary Key | Unique Key
--
A primary key can't accept NULL values | A unique key can accept NULL values, which is problematic in the context of being unique
A primary key cannot contain duplicate values | A unique key also cannot contain duplicate values
We can have only one primary key in a table | We can have more than one unique key in a table
We can make a primary key from one or more table fields | We can also make a unique key from one or more table fields
By default, a primary key creates a clustered index | By default, a unique key creates a non-clustered unique index
It is used to identify each record in the table | It prevents storing duplicate entries in a column
A primary key has the semantic of identifying the row of a database. Therefore there can be only one primary key for a given table, while there can be many unique keys.
Also for the same reason a primary key cannot be NULL (at least in Oracle, not sure about other databases)
Since it identifies the row, it should never ever change. Changing primary keys is bound to cause serious pain and probably eternal damnation.
Therefore, in most cases you want some artificial id for the primary key which isn't used for anything but identifying single rows in the table.
Unique keys on the other hand may change as much as you want.
A Primary key is a unique key.
Each table can have at most ONE primary key, but it can have multiple unique keys. A primary key is used to uniquely identify a table row. A primary key cannot be NULL since NULL is not a value.
Suppose the table is named employe.
Primary key
A primary key cannot accept null values. It enforces uniqueness of a column. We can have only one primary key in a table.
Unique key
A unique key can accept null values, and it also enforces uniqueness of a column. You may wonder how a column can be unique if it contains null values. Even though it accepts nulls, it still enforces uniqueness of the non-null values. Have a look at the picture: here Emp_ID is the primary key and Citizen ID is a unique key. We can use multiple unique keys in a table.
I know this question is several years old but I'd like to provide an answer to this explaining why rather than how
Purpose of Primary Key: To identify a row in a database uniquely => A row represents a single instance of the entity type modeled by the table. A primary key enforces integrity of an entity, AKA Entity Integrity. Primary Key would be a clustered index i.e. it defines the order in which data is physically stored in a table.
Purpose of Unique Key: Ok, with the Primary Key we have a way to uniquely identify a row. But I have a business need such that, another column/a set of columns should have unique values. Well, technically, given that this column(s) is unique, it can be a candidate to enforce entity integrity. But for all we know, this column can contain data originating from an external organization that I may have a doubt about being unique. I may not trust it to provide entity integrity. I just make it a unique key to fulfill my business requirement.
There you go!
If your database design is such that there is no need for a foreign key, then you can go with a unique key (but remember that a unique key allows NULL values; SQL Server allows only a single NULL).
If your database needs foreign keys, a primary key is the usual choice for them to reference.
Unique key:
It should be used when a column must hold unique values. With a unique key, null values are also allowed. Unique keys are columns whose values are unique and never repeated, for example your pet's name. The value can also be missing (null); in the context of databases, note that every null is considered different from every other null, except in SQL Server, where a unique constraint treats nulls as equal and allows only one.
Primary key:
It should be used when you have to uniquely identify a row. A primary key is unique for every row, with the constraint that it does not allow nulls. You have probably seen tables with an auto-increment column that is the primary key of the table. It can also be used as a foreign key in another table. Examples are orderId in an order table and billId in a bill table.
Now, coming back to when to use which:
1) Use a primary key on a column that cannot be null in the table and that you use as a foreign key in another table to create a relationship.
2) Use a unique key where it does not matter to the table or to the database as a whole whether the column is null, like snacks in a restaurant: it is possible that you don't order snacks.
Difference between Primary Key and Unique Key
Both a primary key and a unique key are used to uniquely identify a row in a table.
A primary key creates a clustered index on the column, whereas a unique key creates a non-clustered index on the column (the SQL Server default).
A primary key doesn't allow NULL values, whereas a unique key does (SQL Server allows one NULL; MySQL allows several).
Simply put, a primary key is unique and can't be null; a unique key can be null, and rows holding NULL are not checked against each other.
Primary Keys
The main purpose of the primary key is to provide a means to identify each record in the table.
The primary key provides a means to identify the row, using data within the row. A primary key can be based on one or more columns, such as first and last name; however, in many designs, the primary key is an auto-generated number from an identity column.
A primary key has the following characteristics:
There can only be one primary key for a table.
The primary key consists of one or more columns.
The primary key enforces the entity integrity of the table.
All columns defined must be defined as NOT NULL.
The primary key uniquely identifies a row.
Primary keys result in CLUSTERED unique indexes by default.
Unique Keys
A unique key is also called a unique constraint. A unique constraint can be used to ensure rows are unique within the database.
Don’t we already do that with the primary key? Yep, we do, but a table may have several sets of columns which you want unique.
In SQL Server the unique key has the following characteristics:
There can be multiple unique keys defined on a table.
Unique Keys result in NONCLUSTERED Unique Indexes by default.
One or more columns make up a unique key.
A column may be NULL, but only one NULL per column is allowed.
A unique constraint can be referenced by a Foreign Key Constraint.
source : here
A primary key’s main features are:
It must contain a unique value for each row of data.
It cannot contain null values.
Only one Primary key in a table.
A Unique key’s main features are:
It can also contain a unique value for each row of data.
It can also contain null values.
Multiple Unique keys in a table.
I came across the following SQL in a book:
CREATE TABLE `categories` (
id SMALLINT NOT NULL AUTO_INCREMENT,
category VARCHAR(30) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `category` (`category`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
I was wondering is there a reason why I would need a PRIMARY and UNIQUE KEY in the same table? I guess, underlying that question is, what is the difference between PRIMARY and UNIQUE keys?
The relational model says there's no essential difference between one key and another. That is, when a relation has more than one candidate key, there are no theoretical reasons for declaring that this key is more important than that key. Essentially, that means there's no theoretical reason for identifying one key as a primary key, and all the others as secondary keys. (There might be practical reasons, though.)
Many relations have more than one candidate key. For example, a relation of US states might have data like this.
State     Abbr     Postal Code
--
Alabama   Ala.     AL
Alaska    Alaska   AK
Arizona   Ariz.    AZ
...
Wyoming   Wyo.     WY
It's clear that values in each of those three columns are unique--there are three candidate keys.
If you were going to build a table in SQL to store those values, you might do it like this.
CREATE TABLE states (
state varchar(15) primary key,
abbr varchar(10) not null unique,
postal_code char(2) not null unique
);
And you'd do something like that because SQL doesn't have any other way to say "My table has three separate candidate keys."
I didn't have any particular reason for choosing "state" as the primary key. I could have just as easily chosen "abbr" or "postal_code". Any of those three columns can be used as the target for a foreign key reference, too.
And as far as that goes, I could have built the table like this, too.
CREATE TABLE states (
state varchar(15) not null unique,
abbr varchar(10) not null unique,
postal_code char(2) not null unique
);
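As a quick check of the claim that any candidate key can be the target of a foreign key, here is a sketch that points a foreign key at postal_code, which is unique but not the primary key. It assumes Python's sqlite3 with in-memory SQLite as the engine; the offices table and add_office helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch for FK enforcement
conn.executescript("""
    CREATE TABLE states (
        state       varchar(15) primary key,
        abbr        varchar(10) not null unique,
        postal_code char(2)     not null unique
    );
    -- Hypothetical child table referencing a non-primary candidate key.
    CREATE TABLE offices (
        office_id   INTEGER PRIMARY KEY,
        postal_code char(2) NOT NULL REFERENCES states(postal_code)
    );
    INSERT INTO states VALUES ('Alabama', 'Ala.', 'AL');
""")

def add_office(office_id, postal_code):
    """Hypothetical helper: True if the office row was accepted."""
    try:
        with conn:
            conn.execute("INSERT INTO offices VALUES (?, ?)", (office_id, postal_code))
        return True
    except sqlite3.IntegrityError:
        return False

known_state = add_office(1, "AL")    # matches a row in states
unknown_state = add_office(2, "ZZ")  # no such postal code: rejected
```

The reference works because postal_code is declared unique, not because of which column happens to carry the PRIMARY KEY label.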
I'm surprised that nobody mentioned that a primary key can be referenced as a foreign key in other tables.
Also, a unique constraint allows NULL values.
The reason you need two uniqueness restrictions (one being the Primary Key) is that you are using Id as a surrogate key. I.e., it is an arbitrary value that has no meaning in relation to the data itself. Without the unique key (or colloquially known as "business key" i.e, a key that the user would recognize as being enforced), a user could add two identical category values with different arbitrary Id values. Since users should never see the surrogate key, they would not know why they are seeing a duplicate even though the database would think they are different.
When using surrogate keys, having another unique constraint on something other than the surrogate key is critical to avoid duplicate data.
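Here is a minimal sketch of that failure mode. It assumes Python's sqlite3 with in-memory SQLite instead of MySQL, and the two table names are made up so both variants can sit side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Surrogate key only: nothing stops the same category appearing twice.
    CREATE TABLE categories_surrogate_only (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        category VARCHAR(30) NOT NULL
    );
    -- Surrogate key plus a unique constraint on the business key.
    CREATE TABLE categories_with_unique (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        category VARCHAR(30) NOT NULL UNIQUE
    );
""")

def add_category(table, name):
    """Insert a category; the table name comes from this script only, not from user input."""
    try:
        with conn:
            conn.execute(f"INSERT INTO {table} (category) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False

for table in ("categories_surrogate_only", "categories_with_unique"):
    add_category(table, "Books")
    add_category(table, "Books")  # the same business value a second time

rows_without_unique = conn.execute(
    "SELECT COUNT(*) FROM categories_surrogate_only").fetchone()[0]
rows_with_unique = conn.execute(
    "SELECT COUNT(*) FROM categories_with_unique").fetchone()[0]
```

In the first table the two "Books" rows differ only in their generated id, which is exactly the duplicate a user would see without understanding why.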
Depending on who you talk to and how they read the specification, unique keys (which is redundant, by the way: a "key" is by definition unique) are also not supposed to allow nulls. However, one can also read the specifications as saying that unique constraints, unlike primary key constraints, are in fact supposed to allow nulls (how many nulls are allowed also varies by vendor). Most products, including MySQL, do allow nulls in unique constraints, whereas primary key constraints do not.
Similarity
Both a PRIMARY and UNIQUE index create a constraint that requires all values to be distinct (1).
Difference
The PRIMARY key (implicitly) defines all key columns as NOT NULL; additionally, a table can only have one primary key.
(1) Each NULL value is considered to be distinct.
A UNIQUE constraint and a PRIMARY KEY are similar: both enforce uniqueness of the column on which they are defined.
Some basic differences between a primary key and a unique key are as follows.
Primary key
Primary key cannot have a NULL value.
Each table can have only single primary key.
Primary key is implemented as indexes on the table. By default this index is clustered index.
A primary key can be referenced by a foreign key in another table.
We can generate ID automatically with the help of Auto Increment field. Primary key supports Auto Increment value.
Unique Constraint
Unique Constraint may have a NULL value.
Each table can have more than one Unique Constraint.
Unique Constraint is also implemented as indexes on the table. By default this index is Non-clustered index.
A unique constraint can also be referenced by a foreign key in another table in most DBMSs (MySQL included), although referencing the primary key is the usual practice.
Unique Constraint doesn't support Auto Increment value.
You can find detailed information from: http://www.oracleinformation.com/2014/04/difference-between-primary-key-and-unique-key.html
When should I use KEY, PRIMARY KEY, UNIQUE KEY and INDEX?
KEY and INDEX are synonyms in MySQL. They mean the same thing. In databases you would use indexes to improve the speed of data retrieval. An index is typically created on columns used in JOIN, WHERE, and ORDER BY clauses.
Imagine you have a table called users and you want to search for all the users who have the last name 'Smith'. Without an index, the database would have to go through all the records of the table: this is slow, because the more records you have in your database, the more work it has to do to find the result. On the other hand, an index will help the database skip quickly to the relevant pages where the 'Smith' records are held. This is very similar to how we humans go through a phone book to find someone by last name: we don't search the directory from cover to cover, as long as the entries were inserted in some order that we can use to skip quickly to the 'S' pages.
Primary keys and unique keys are similar. A primary key is a column, or a combination of columns, that can uniquely identify a row. It is a special case of unique key. A table can have at most one primary key, but more than one unique key. When you specify a unique key on a column, no two distinct rows in a table can have the same value.
Also note that columns defined as primary keys or unique keys are automatically indexed in MySQL.
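The 'Smith' lookup can be sketched as follows. This assumes Python's sqlite3 with in-memory SQLite, using its EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the sample names and the index name idx_users_last_name are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO users (last_name) VALUES (?)",
    [("Smith",), ("Jones",), ("Smith",), ("Brown",)],
)

query = "SELECT user_id FROM users WHERE last_name = 'Smith'"

def query_plan(sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # the fourth column holds the detail text

plan_before = query_plan(query)  # no index on last_name yet: a full table scan

# KEY and INDEX are synonyms in MySQL; SQLite only has the CREATE INDEX spelling.
conn.execute("CREATE INDEX idx_users_last_name ON users (last_name)")

plan_after = query_plan(query)   # the planner can now seek through the index
smith_ids = sorted(row[0] for row in conn.execute(query))
```

The query returns the same rows either way; only the access path changes, which is why a missing index tends to show up as slowness rather than as wrong results.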
KEY and INDEX are synonyms.
You should add an index when performance measurements and EXPLAIN show you that the query is inefficient because of a missing index. Adding an index can improve the performance of queries (but it can slow down modifications to the table).
You should use UNIQUE when you want to constrain the values in that column (or columns) to be unique, so that attempts to insert duplicate values result in an error.
A PRIMARY KEY is both a unique constraint and it also implies that the column is NOT NULL. It is used to give an identity to each row. This can be useful for joining with another table via a foreign key constraint. While it is not required for a table to have a PRIMARY KEY it is usually a good idea.
Primary key does not allow NULL values, but unique key allows NULL values.
We can declare only one primary key in a table, but a table can have multiple unique keys.
PRIMARY KEY and UNIQUE KEY are similar, but they serve different functions. A primary key makes each table row unique (i.e., there cannot be two rows with the exact same key). You can only have one primary key in a database table.
A unique key makes the values of a column unique across rows (i.e., no two rows may have the same exact value in that column). You can have more than one unique key in a table (unlike the primary key, of which there is only one).
A plain INDEX does not enforce uniqueness. MySQL (for example) builds a separate index structure for the indexed column, which makes it faster to retrieve rows when a query filters on that column. The disadvantage is that if you do many updates, deletes or inserts, MySQL has to maintain the index structures (and that can be a performance bottleneck).
Hope this helps.
Unique key: a column (or set of columns) in which no two rows hold the same value.
Primary key: a minimal set of columns that uniquely identifies every row in a table (i.e. no two rows are the same in all the columns constituting the key). A table can have more than one such candidate key, but only one of them is declared as "the" primary key. If a single unique column exists, it can serve as the key on its own; otherwise more than one column is required to identify a row, e.g. (first_name, last_name, father_name, mother_name) can constitute the primary key in some tables.
Index: used to optimize queries. If you are going to search or sort the results on the basis of some column many times (e.g. most people are going to search for students by name and not by roll number), those lookups can be sped up if the column values are "indexed", for example with a B-tree.
The primary key is used to work with different tables. This is the foundation of relational databases. If you have a book database it's better to create 2 tables - 1) books and 2) authors with INT primary key "id". Then you use id in books instead of authors name.
The unique key is used if you don't want to have repeated entries. For example you may have title in your book table and want to be sure there is only one entry for each title.
Primary key - we can put only one primary key on a table, and we cannot leave that column blank when entering values into the table.
Unique key - we can put more than one unique key on a table, and we may leave that column blank when entering values into the table.
A column takes unique (non-repeating) values when a primary or unique key is applied to it.
Unique Key :
More than one value can be null.
No two tuples can have same values in unique key.
One or more unique keys can be combined to form a primary key, but not vice versa.
Primary Key
Can contain more than one unique key.
Uniquely represents a tuple.