Say I have the following table:
TABLE: widget
- widget_id (not null, unique, auto-increment)
- model_name (PK)
- model_year (PK)
model_name and model_year make up a composite key. Is there any problem to using widget_id as a FK in another table?
A key is any number of columns that can be used to uniquely identify each row within the table.
In the example you've shown, your widget table has two keys:
model_name, model_year
widget_id
In standard SQL, a foreign key may reference any declared key on the referenced table (either primary key or unique). I'd need to check MySQLs compliance.
From MySQL reference manual on foreign keys:
InnoDB permits a foreign key to reference any index column or group of columns. However, in the referenced table, there must be an index where the referenced columns are listed as the first columns in the same order.
As an alternative, if you wish to use the composite key from your referencing table, you'd have two columns in that table that correspond to model_name and model_year, and would then declare your foreign key constraint as:
ALTER TABLE OtherTable ADD CONSTRAINT
FK_OtherTable_Widgets (model_name,model_year)
references Widgets (model_name,model_Year).
Re InnoDB vs MyISAM, in the docs for ALTER TABLE
The FOREIGN KEY and REFERENCES clauses are supported by the InnoDB storage engine, which implements ADD [CONSTRAINT [symbol]] FOREIGN KEY (...) REFERENCES ... (...). See Section 13.6.4.4, “FOREIGN KEY Constraints”. For other storage engines, the clauses are parsed but ignored. The CHECK clause is parsed but ignored by all storage engines. See Section 12.1.17, “CREATE TABLE Syntax”. The reason for accepting but ignoring syntax clauses is for compatibility, to make it easier to port code from other SQL servers, and to run applications that create tables with references. See Section 1.8.5, “MySQL Differences from Standard SQL”.
I have no experience specific to my-sql, but with database-modeling in general
It is really important to understand the difference between primary and secondary keys.
Even if many db (I know for sure Oracle does) permit to specify an unique (simple or composite) key as the FK target, this is not considered a best practice. Use the PK instead.
FK to a secondary key should be used imo only to relate to tables that are not under your control.
In your specific case, I would certainly FK to widget_id: that is because the widget_id should be your PK, and the composite only made unique (and not null of course). This leads to better performance in mane cases, as you join only one column in queries, and is generally considered a best practice (google 'surrogate key' for more info)
MySQL will create an index on the column if there isn't one it can use:
InnoDB requires indexes on foreign
keys and referenced keys so that
foreign key checks can be fast and not
require a table scan. In the
referencing table, there must be an
index where the foreign key columns
are listed as the first columns in the
same order. Such an index is created
on the referencing table automatically
if it does not exist. (This is in
contrast to some older versions, in
which indexes had to be created
explicitly or the creation of foreign
key constraints would fail.)
index_name, if given, is used as
described previously.
http://dev.mysql.com/doc/refman/5.1/en/innodb-foreign-key-constraints.html
Let's say you have two tables. The first table looks like this:
TABLE: widget
- model_name (PK)
- model_year (PK)
- widget_id (not null, unique, auto-increment)
If you want to make another table that refers to unique records in the first table, it should look something like this:
TABLE B: sprocket
- part_number (PK)
- blah
- blah
- model_name_widget (FK to widget)
- model_year_widget (FK to widget)
- blah
With compound primary keys, you have to include all key fields in your FK references to make sure that you are uniquely specifying a record.
Related
At work we have a big database with unique indexes instead of primary keys and all works fine.
I'm designing new database for a new project and I have a dilemma:
In DB theory, primary key is fundamental element, that's OK, but in REAL projects what are advantages and disadvantages of both?
What do you use in projects?
EDIT: ...and what about primary keys and replication on MS SQL server?
What is a unique index?
A unique index on a column is an index on that column that also enforces the constraint that you cannot have two equal values in that column in two different rows. Example:
CREATE TABLE table1 (foo int, bar int);
CREATE UNIQUE INDEX ux_table1_foo ON table1(foo); -- Create unique index on foo.
INSERT INTO table1 (foo, bar) VALUES (1, 2); -- OK
INSERT INTO table1 (foo, bar) VALUES (2, 2); -- OK
INSERT INTO table1 (foo, bar) VALUES (3, 1); -- OK
INSERT INTO table1 (foo, bar) VALUES (1, 4); -- Fails!
Duplicate entry '1' for key 'ux_table1_foo'
The last insert fails because it violates the unique index on column foo when it tries to insert the value 1 into this column for a second time.
In MySQL a unique constraint allows multiple NULLs.
It is possible to make a unique index on mutiple columns.
Primary key versus unique index
Things that are the same:
A primary key implies a unique index.
Things that are different:
A primary key also implies NOT NULL, but a unique index can be nullable.
There can be only one primary key, but there can be multiple unique indexes.
If there is no clustered index defined then the primary key will be the clustered index.
You can see it like this:
A Primary Key IS Unique
A Unique value doesn't have to be the Representaion of the Element
Meaning?; Well a primary key is used to identify the element, if you have a "Person" you would like to have a Personal Identification Number ( SSN or such ) which is Primary to your Person.
On the other hand, the person might have an e-mail which is unique, but doensn't identify the person.
I always have Primary Keys, even in relationship tables ( the mid-table / connection table ) I might have them. Why? Well I like to follow a standard when coding, if the "Person" has an identifier, the Car has an identifier, well, then the Person -> Car should have an identifier as well!
Foreign keys work with unique constraints as well as primary keys. From Books Online:
A FOREIGN KEY constraint does not have
to be linked only to a PRIMARY KEY
constraint in another table; it can
also be defined to reference the
columns of a UNIQUE constraint in
another table
For transactional replication, you need the primary key. From Books Online:
Tables published for transactional
replication must have a primary key.
If a table is in a transactional
replication publication, you cannot
disable any indexes that are
associated with primary key columns.
These indexes are required by
replication. To disable an index, you
must first drop the table from the
publication.
Both answers are for SQL Server 2005.
The choice of when to use a surrogate primary key as opposed to a natural key is tricky. Answers such as, always or never, are rarely useful. I find that it depends on the situation.
As an example, I have the following tables:
CREATE TABLE toll_booths (
id INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
...
UNIQUE(name)
)
CREATE TABLE cars (
vin VARCHAR(17) NOT NULL PRIMARY KEY,
license_plate VARCHAR(10) NOT NULL,
...
UNIQUE(license_plate)
)
CREATE TABLE drive_through (
id INTEGER NOT NULL PRIMARY KEY,
toll_booth_id INTEGER NOT NULL REFERENCES toll_booths(id),
vin VARCHAR(17) NOT NULL REFERENCES cars(vin),
at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
amount NUMERIC(10,4) NOT NULL,
...
UNIQUE(toll_booth_id, vin)
)
We have two entity tables (toll_booths and cars) and a transaction table (drive_through). The toll_booth table uses a surrogate key because it has no natural attribute that is not guaranteed to change (the name can easily be changed). The cars table uses a natural primary key because it has a non-changing unique identifier (vin). The drive_through transaction table uses a surrogate key for easy identification, but also has a unique constraint on the attributes that are guaranteed to be unique at the time the record is inserted.
http://database-programmer.blogspot.com has some great articles on this particular subject.
There are no disadvantages of primary keys.
To add just some information to #MrWiggles and #Peter Parker answers, when table doesn't have primary key for example you won't be able to edit data in some applications (they will end up saying sth like cannot edit / delete data without primary key). Postgresql allows multiple NULL values to be in UNIQUE column, PRIMARY KEY doesn't allow NULLs. Also some ORM that generate code may have some problems with tables without primary keys.
UPDATE:
As far as I know it is not possible to replicate tables without primary keys in MSSQL, at least without problems (details).
If something is a primary key, depending on your DB engine, the entire table gets sorted by the primary key. This means that lookups are much faster on the primary key because it doesn't have to do any dereferencing as it has to do with any other kind of index. Besides that, it's just theory.
In addition to what the other answers have said, some databases and systems may require a primary to be present. One situation comes to mind; when using enterprise replication with Informix a PK must be present for a table to participate in replication.
As long as you do not allow NULL for a value, they should be handled the same, but the value NULL is handled differently on databases(AFAIK MS-SQL do not allow more than one(1) NULL value, mySQL and Oracle allow this, if a column is UNIQUE)
So you must define this column NOT NULL UNIQUE INDEX
There is no such thing as a primary key in relational data theory, so your question has to be answered on the practical level.
Unique indexes are not part of the SQL standard. The particular implementation of a DBMS will determine what are the consequences of declaring a unique index.
In Oracle, declaring a primary key will result in a unique index being created on your behalf, so the question is almost moot. I can't tell you about other DBMS products.
I favor declaring a primary key. This has the effect of forbidding NULLs in the key column(s) as well as forbidding duplicates. I also favor declaring REFERENCES constraints to enforce entity integrity. In many cases, declaring an index on the coulmn(s) of a foreign key will speed up joins. This kind of index should in general not be unique.
There are some disadvantages of CLUSTERED INDEXES vs UNIQUE INDEXES.
As already stated, a CLUSTERED INDEX physically orders the data in the table.
This mean that when you have a lot if inserts or deletes on a table containing a clustered index, everytime (well, almost, depending on your fill factor) you change the data, the physical table needs to be updated to stay sorted.
In relative small tables, this is fine, but when getting to tables that have GB's worth of data, and insertrs/deletes affect the sorting, you will run into problems.
I almost never create a table without a numeric primary key. If there is also a natural key that should be unique, I also put a unique index on it. Joins are faster on integers than multicolumn natural keys, data only needs to change in one place (natural keys tend to need to be updated which is a bad thing when it is in primary key - foreign key relationships). If you are going to need replication use a GUID instead of an integer, but for the most part I prefer a key that is user readable especially if they need to see it to distinguish between John Smith and John Smith.
The few times I don't create a surrogate key are when I have a joining table that is involved in a many-to-many relationship. In this case I declare both fields as the primary key.
My understanding is that a primary key and a unique index with a not‑null constraint, are the same (*); and I suppose one choose one or the other depending on what the specification explicitly states or implies (a matter of what you want to express and explicitly enforce). If it requires uniqueness and not‑null, then make it a primary key. If it just happens all parts of a unique index are not‑null without any requirement for that, then just make it a unique index.
The sole remaining difference is, you may have multiple not‑null unique indexes, while you can't have multiple primary keys.
(*) Excepting a practical difference: a primary key can be the default unique key for some operations, like defining a foreign key. Ex. if one define a foreign key referencing a table and does not provide the column name, if the referenced table has a primary key, then the primary key will be the referenced column. Otherwise, the the referenced column will have to be named explicitly.
Others here have mentioned DB replication, but I don't know about it.
Unique Index can have one NULL value. It creates NON-CLUSTERED INDEX.
Primary Key cannot contain NULL value. It creates CLUSTERED INDEX.
In MSSQL, Primary keys should be monotonically increasing for best performance on the clustered index. Therefore an integer with identity insert is better than any natural key that might not be monotonically increasing.
If it were up to me...
You need to satisfy the requirements of the database and of your applications.
Adding an auto-incrementing integer or long id column to every table to serve as the primary key takes care of the database requirements.
You would then add at least one other unique index to the table for use by your application. This would be the index on employee_id, or account_id, or customer_id, etc. If possible, this index should not be a composite index.
I would favor indices on several fields individually over composite indices. The database will use the single field indices whenever the where clause includes those fields, but it will only use a composite when you provide the fields in exactly the correct order - meaning it can't use the second field in a composite index unless you provide both the first and second in your where clause.
I am all for using calculated or Function type indices - and would recommend using them over composite indices. It makes it very easy to use the function index by using the same function in your where clause.
This takes care of your application requirements.
It is highly likely that other non-primary indices are actually mappings of that indexes key value to a primary key value, not rowid()'s. This allows for physical sorting operations and deletes to occur without having to recreate these indices.
In this database, key1 & key2 make up the composite primary key of table4, but i'm able to add a foreign key to table3 that comprise just key1.
Why MySQL allows this? Does the above database design make no sense at all?
This answer takes the question's "add a foreign key to table3" to mean that a FK (foreign key) was added in table3 referencing one of the columns of the composite PK (primary key) of table4. In standard SQL a FK can reference a proper/smaller subset of a PK.
This other answer presumably takes "add a foreign key to table3" to mean that a FK was added in table4 with one of the columns of the PK referencing table3. A FK column set in a table is independent of any PK or UNIQUE declarations in it.
In standard SQL a FK can reference a proper/smaller subset of a PK.
The referenced column list must be declared PRIMARY KEY or UNIQUE. (PRIMARY KEY creates a UNIQUE NOT NULL constraint.) (The constraint has to be explicit, even though any set of NOT NULL columns containing a set that is UNIQUE has to be unique.)
Unfortunately MySQL lets you declare a FK referencing a column list that is not UNIQUE. Even though such a FK or one referencing non-NULL columns (OK in standard SQL) is not implemented properly, and the documentation itself advises not doing it:
The handling of foreign key references to non-unique keys or keys that contain NULL values is not well defined for operations such as UPDATE or DELETE CASCADE. You are advised to use foreign keys that reference only keys that are both UNIQUE (or PRIMARY) and NOT NULL.
(You can ponder just what are and are not the well-defined operations, since the documentation doesn't actually clarify.)
1.8.2.3 Foreign Key Differences
13.1.18 CREATE TABLE Syntax
13.1.18.6 Using FOREIGN KEY Constraints
PS Re relational vs SQL
In the relational model a FK references a CK (candidate key). A superkey is a unique column set. A CK is a superkey containing no smaller superkey. One CK can be called the PK (primary key). When a column set's values must appear elsewhere we say there is an IND (inclusion dependency). A FK is an IND to a CK. When an IND is to a superkey we could call that a "foreign superkey".
An SQL PK or UNIQUE NOT NULL declares a superkey. It is a CK when it does not contain a smaller column set declared as SQL PK or UNIQUE NOT NULL. SQL FK declares a foreign superkey. So an SQL PK might actually be a relational PK (hence CK) and a UNIQUE NOT NULL might actually be a CK. An SQL FK to one of these actually is a relational FK.
Foreign keys are not dependent of primary keys at all. In fact, they have nothing to do with primary keys.
A primary keys single purpose is to identify a row uniquely.
A foreign keys purpose is to make sure that for each entry in the referencing table (the child table) there must be an entry in the referenced table (the parent table). It's only a technical requirement in MySQL that there must be an index (not necessarily a primary key or a unique key, a simple index suffices) on the referenced column.
Therefore your design makes sense, yes.
I'm using MySQL 5.7 with WB6.3 for my study project to design the ER diagram.
When I try to create many-to-many relationship. By default MySQL will mark the primary keys of both tables as primary key in the lookup table as well.
Is it really need to have primary keys from both table as primary key in the lookup table as well ?
Please see the table car_item in my diagram below I've removed primary key from both red(idcar,iditem). Please tell me if PK is needed for them
There should be a unique key for (car_idcar, item_list_iditem_list), but it doesn't necessarily have to be the primary key. This way, you ensure that you don't create duplicate relationships between the same rows in the two tables.
A table can only have one primary key, and the car_item table already has primary key id_car_item, so the foreign keys for the relation can't also be a primary key. But there can be an arbitrary number of unique keys.
Some purists might say that if there are two unique keys (a primary key is also a unique key), one of them is redundant. In your case, the id_car_item column may not really be necessary, as it's not common to refer to relations in other tables, the relation table is just used to join the other two tables. But this isn't always the case. For instance, a user table might have a unique username column (since you don't allow multiple users to have the same name), but also a userid primary key that's used as the foreign key in other tables (this allows renaming users without having to update all the foreign keys). Some database designers like to have an auto-increment column in every table, as it's useful as a reference in user interface applications.
It is common practice to have a composite primary key in a table that implements many-to-many. This key should consist of primary keys from both tables on each side of the many-to-many relationship.
Such approach guarantees data consistency and eliminates duplication.
In your case you should define a composite primary key for table car_item that would consist of two fields { car_idcar, item_list_iditem_list }
The column id_car_item is not needed and can be dropped, it was probably added automatically, as many RDBMS add surrogate PK by default.
This may sound confusing and or simple but..
If I use a foreign key from Table B in Table A, which has a separate primary key. Do I need to include Table A's primary key as a foreign key in Table B?
Thanks!
========================================================================
EDIT:
Okay, let me try and clarify my question a bit.
In the case above, should I use Taco_ID as a FK in Table 2? Or is does it completely unnecessary?
In general, you don't usually make foreign keys bidirectionally like that. If you do, it means that the two tables exist in a 1-to-1 relationship: Each taco has a type, and each taco type can only be used by one taco. If you have a relationship like this, there's not really any reason to have them in separate tables, they could just be additional columns in the same table.
Normally foreign keys are used for 1-to-many or many-to-many relationships.
A 1-to-many relationship would be if many different tacos can be of the same type. They each have Taco_Type_ID foreign key.
For a many-to-many relationship, you typically use a separate relation table.
CREATE TABLE Taco_Types (
Taco_ID INT, -- FK to Table1.Taco_ID
Taco_Type_ID INT, -- FK to Table2.Taco_Type_ID
PRIMARY KEY (Taco_ID, Taco_Type_ID)
);
foreign keys and primary keys are SOMETIMES related, but not always. A foreign key in a table literally just means "whatever value is in this field MUST exist in this other table over ---> here". Whether that value is a PK or not in that other table is irrelevant - it just has to exist, and it has to be unique. That may fit the bill of being a primary key, but being primary is NOT required.
You can have composite foreign keys, e.g. in a (silly) address book, the child table you list a person's phone numbers in COULD be keyed with (firstname, lastname), but that runs into the problem of "ok, which John Smith does this number belong to?".
In most databases, foreign key references must be to primary keys or unique keys (and NULLs are allowed). MySQL recommends this but does not require it:
However, the system does not enforce a
requirement that the referenced columns be UNIQUE or be declared NOT
NULL. The handling of foreign key references to nonunique keys or keys
that contain NULL values is not well defined for operations such as
UPDATE or DELETE CASCADE. You are advised to use foreign keys that
reference only UNIQUE (including PRIMARY) and NOT NULL keys.
Do you need to reference primary keys for a foreign key relationship? First, you don't even need to declare foreign key relationships. I think they are important because they allow the database to maintain referential integrity. But, they are not required. Nor is there any semantic difference in queries, based on the presence of foreign keys (for instance, NATURAL JOIN does not use them). The optimize can make use of the declared relationship.
Second, if you are going to declare the foreign key, I would recommend using the primary key of the referenced table. That is, after all, one of the major reasons for having primary keys.
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
mySQL's KEY keyword?
Like
PRIMARY KEY (ID),
KEY name (name),
KEY desc (desc),
etc.
what are they useful for?
Keys are used to enforce referential integrity in your database.
A primary key is, as its name suggests, the primary identification of a given row in your table. That is, each row's primary key will uniquely identify that row.
A unique key is a key that enforces uniqueness on that set of columns. It is similar to a primary key in that it will also uniquely identify a row in a table. However, there is the added benefit of allowing NULL in some of those combinations. There can only be 1 primary key, but you can have many unique keys.
A foreign key is used to enforce a relationship between 2 tables (think parent/child table). That way, a child table can not have a value of X in its parent column unless X actually appears in the parent table. This prevents orphaned records from appearing.
The primary key constraint ensures that the column(s) are:
not null
unique (unique sets if more than one column)
KEY is MySQL's terminology in CREATE TABLE statements for an index. Indexes are not ANSI currently, but all databases use indexes to speed up data retrieval (at the cost of insertion/update/deletion, because of maintenance to keep the index relevant).
There are other key constraints:
unique
foreign key (for referential integrity)
...but your question doesn't include examples of them.
keys are also called indexes. They are used for speeding up queries. Additionally keys can be constrains (unique key and foreign key). The primary key is also unique key and it identifies the records. The record can have other unique keys as well, that do not allow to duplicate a value in a given column. Foreign key enforces referential integrity (#Derek Kromm already wrote excellent description). The ordinary key is used only for speeding up queries. You need to index the columns used in the WHERE clause of the queries. If you have no index on the column, MySQL will need to read the whole table to find the records you need. When index is used, MySQL reads only the index (which is usually a B+ tree) and then read only those record from the table it found in the index.
Primary KEY is for creating unique/not null constraint for each row in the table. Also searching by this key is the fastest. You can create only one PK in the table.
Ordinary key/index is key for speeding your searching by this column, sorting, grouping and joining with other table by this key.
Indexes drawback:
Adding new indexes to table will influence on speed or running insert/update/delete statements. So you should select columns for indexing in your table very carefully.
Key are used for relation purposes between tables and you are able to create joins in order to select data from multiple tables
What, you didn't fine the wikipedia entry comprehensive? ;-)
So, a key, in a relational database (such as MySQL, PostgreSQL, Oracle, etc) is a data constraint on a column or set of columns. The most common keys are the Primary key and foreign keys and unique keys.
A foreign key specifically relates the data of one table to data in another table. You might see that a table blog_posts has a foreign key to users based on a user_id column. This means that every user_id in blog_posts will have a corresponding entry in the users column (this is a one-to-many relationship -- a topic for another time).
If a column (or group of columns) has a unique key, that means that there can only be one such incidence of the key in the table. Often you'll see things like email addresses be unique keys -- you only want one email address per user. I've also seen a combination of columns match to a unique key -- the five columns, first_name, last_name, address, city, and state, will often be a unique key -- realistically, there can only be one William Gates at 1835 73rd Ave NE, Medina, Washington. (I do realize that it is possible for a William Gates Jr. to be born, but the designers of that database didn't really care).
The primary key is the primary, unique identifier of a given table. By definition it is a unique key. It is something which cannot be null and must be unique. It holds a special place of prominence among the indexes of a given table.