I'm working on a small pizza delivery website and I ran into a small problem with the MySQL tables.
I found this on Stack Overflow: https://stackoverflow.com/a/10322293/80907
It mentions the following:
Both tables have to use the INNODB engine.
"In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. "
The first isn't really a problem, but the second rule is where I'm scratching my head.
This is a website where you can order pizzas, so I'm saving all data on the users and their order in the database.
Here's a screenshot of what I'm about to write out:
So I'll have a "Users" table and an "Orders" table. The Users will have to have a one-to-many relationship to Orders. Simply put, the order is identified by the user who created it. So it's one-to-many, identifying.
This means that the "Orders" table will have a foreign key, such as "Users_id".
The problem arises when you have to make a table for the many-to-many relationship between the Pizzas table and the Orders table.
This table, let's call it "Order_Details" (MySQL Workbench automatically called it "Orders_has_Pizzas"), must reference both "Orders" and "Pizzas".
Now, since "Orders" already references the Users table in an identifying relationship, that's part of the primary key of "Orders".
And let's get that rule out once again:
"In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. "
What this means is that you must reference the entire primary key. If I delete the "Order_Users_id" key, I'll get a 1005 error upon trying to create the database.
My question is simply: is there a way around this? Because as it is right now, I have that User id mentioned in 3 different tables.
Or, am I not understanding it properly and is it indeed necessary to do this for the sake of not having to query different tables for that data?
EDIT: People seem to disagree with me on the relation between Users and Orders being identifying.
I don't see how an individual order can be identified without knowing the id of the user. After the order is made, someone is going to have to deliver the pizza, meaning they'll need to know where to deliver it. That data is in the Users table. Therefore, the Users_id is part of the identity of a single order.
That's how I see it anyway. If I'm wrong, please explain why.
EDIT 2: Thanks to a_horse_with_no_name for clarifying the concept of "identity" in terms of databases, I see the error of my logic now. Info can be found in the comments.
To answer your original question, no, InnoDB foreign key constraints are not required to reference the entire primary key of the referenced table.
In other words, both of the following work in InnoDB:
mysql> ALTER TABLE Orders_has_Pizzas ADD FOREIGN KEY (Orders_id)
REFERENCES Orders (id);
Query OK, 0 rows affected (0.63 sec)
mysql> ALTER TABLE Orders_has_Pizzas ADD FOREIGN KEY (Orders_id,Orders_Users_id)
REFERENCES Orders (id, Users_id);
Query OK, 0 rows affected (0.02 sec)
In fact, InnoDB allows a foreign key to reference any indexed column, even if it's not part of the primary key:
mysql> CREATE TABLE Foo (fooid INT PRIMARY KEY, nonunique INT, KEY (nonunique));
Query OK, 0 rows affected (0.05 sec)
mysql> CREATE TABLE Bar (barid INT PRIMARY KEY, foo_nonid INT, FOREIGN KEY (foo_nonid)
REFERENCES Foo(nonunique));
Query OK, 0 rows affected (0.06 sec)
However, this is not standard SQL and I don't recommend doing it. It means that a row in Bar could reference more than one row in the parent table Foo. Which means a JOIN between these two on the foreign-primary key relationship could unexpectedly create a sort of mini-Cartesian product.
In the Orders table, it's possible for either column of a compound primary key to contain duplicates, because only the combination has to be unique. That means a row in Orders_has_Pizzas whose foreign key covers just one of those columns could theoretically reference multiple Orders.
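The mini-Cartesian effect is easy to reproduce. The following is only a sketch: the table names FooDemo and BarDemo are made up so they don't clash with the session above, and it assumes a MySQL/InnoDB server that accepts a foreign key to a non-unique index, as that session did.

```sql
-- Hypothetical tables mirroring Foo and Bar from the session above.
CREATE TABLE FooDemo (
  fooid     INT PRIMARY KEY,
  nonunique INT,
  KEY (nonunique)
) ENGINE=InnoDB;

CREATE TABLE BarDemo (
  barid     INT PRIMARY KEY,
  foo_nonid INT,
  -- Assumption: the server allows referencing a non-unique index.
  FOREIGN KEY (foo_nonid) REFERENCES FooDemo (nonunique)
) ENGINE=InnoDB;

INSERT INTO FooDemo VALUES (1, 10), (2, 10);  -- two parent rows share nonunique = 10
INSERT INTO BarDemo VALUES (100, 10);         -- a single child row

-- The one BarDemo row matches both FooDemo rows, so the join returns two rows.
SELECT b.barid, f.fooid
FROM BarDemo AS b
JOIN FooDemo AS f ON f.nonunique = b.foo_nonid;
```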
As for the question about an identifying relationship, I would agree that Orders has an identifying relationship with respect to Users. That is, it makes no sense for an order to exist with no referenced user.
But in a table where we use an auto-incrementing mechanism to generate unique ids, it seems redundant and unnecessary to add the extra column to the PK. Why would we need the primary key of Orders to contain the user's id? The id alone is guaranteed unique and therefore sufficient to uniquely address each row.
I would say that's a practical choice, whereas the theory would guide us to create the compound primary key in Orders.
It becomes more clear in a many-to-many table like your Orders_has_Pizzas. This table has an identifying relationship with both Orders and Pizzas. The primary key consists of two foreign keys, one referencing Orders and the other referencing Pizzas. There's no need for an auto-increment PK at all.
Some people add a superfluous auto-increment id for many-to-many tables, for the sake of a convention that every table has to have a single-column automatically-generated primary key. But there's no theoretical or practical reason to do this.
CREATE TABLE Orders_has_Pizzas (
  id INT AUTO_INCREMENT PRIMARY KEY, -- what is this column for?
  Orders_id INT,
  Orders_Users_id INT,
  Pizzas_id INT
);
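For comparison, here is a minimal sketch of the layout this answer leans toward: Orders keeps a single-column auto-increment key, and the junction table uses its two foreign keys as a compound primary key. The address, name and quantity columns are assumptions added for illustration; they are not in the original question.

```sql
CREATE TABLE Users (
  id      INT AUTO_INCREMENT PRIMARY KEY,
  address VARCHAR(255) NOT NULL            -- hypothetical column
) ENGINE=InnoDB;

CREATE TABLE Pizzas (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL               -- hypothetical column
) ENGINE=InnoDB;

CREATE TABLE Orders (
  id       INT AUTO_INCREMENT PRIMARY KEY, -- unique on its own
  Users_id INT NOT NULL,                   -- an order cannot exist without a user
  FOREIGN KEY (Users_id) REFERENCES Users (id)
) ENGINE=InnoDB;

CREATE TABLE Orders_has_Pizzas (
  Orders_id INT NOT NULL,
  Pizzas_id INT NOT NULL,
  quantity  INT NOT NULL DEFAULT 1,        -- hypothetical column
  PRIMARY KEY (Orders_id, Pizzas_id),      -- no surrogate id needed
  FOREIGN KEY (Orders_id) REFERENCES Orders (id),
  FOREIGN KEY (Pizzas_id) REFERENCES Pizzas (id)
) ENGINE=InnoDB;
```

With this layout the user id appears only in Users and Orders; the delivery address is reached by joining Orders_has_Pizzas to Orders to Users.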
I know that foreign keys need not reference only primary keys but they can also reference a field that has a unique constraint on it. For my scenario, I am setting up a quiz where for each test, I have a set of questions. My table design is like this
The point is, in my 2nd table where I will put all the answer options, I want the question number field to link to the first table question number. How do I do this? Or is there an alternative to this design?
Thank you
Ideally there should be a question_id primary key column in the test_question table, and you would use this as the foreign key in the test_answer table.
With your composite primary key in the test_question table, you should make a corresponding composite foreign key:
CONSTRAINT FOREIGN KEY (test_id, question_no) REFERENCES test_question (test_id, question_no)
This is in addition to the foreign key just for the test_id column.
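A sketch of how the composite foreign key could sit in the full table definitions. Only test_id, question_no and the table names test_question and test_answer come from the question; the test table, answer_no and the text columns are assumptions.

```sql
CREATE TABLE test (
  test_id INT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE test_question (
  test_id       INT NOT NULL,
  question_no   INT NOT NULL,
  question_text VARCHAR(255) NOT NULL,     -- hypothetical column
  PRIMARY KEY (test_id, question_no),
  FOREIGN KEY (test_id) REFERENCES test (test_id)
) ENGINE=InnoDB;

CREATE TABLE test_answer (
  test_id     INT NOT NULL,
  question_no INT NOT NULL,
  answer_no   INT NOT NULL,                -- hypothetical column
  answer_text VARCHAR(255) NOT NULL,       -- hypothetical column
  PRIMARY KEY (test_id, question_no, answer_no),
  FOREIGN KEY (test_id) REFERENCES test (test_id),
  -- Both columns together must match one row of test_question.
  CONSTRAINT FOREIGN KEY (test_id, question_no)
    REFERENCES test_question (test_id, question_no)
) ENGINE=InnoDB;
```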
Add another table purely for answers, and link them via the question_no field.
A DB table should hold information on one sort of item. Questions and answers are separate sorts of information so should be in separate tables. Adding a separate table also allows changes to questions and answers independently. Additionally, if they are separate, you could add a language field to each table and have a multi-lingual quiz.
Short answer:
You can JOIN on any columns or expressions. There is no "requirement" for a FOREIGN KEY, PRIMARY KEY, UNIQUE, or anything else.
Long answer:
However, for performance (in large tables), some things make a difference.
If you are JOINing to a PK, a unique key, or even an indexed column, the query could run faster.
Why have a FOREIGN KEY? An FK is two things:
A "constraint" that says that the value must exist in the other table. Also, with things like ON DELETE CASCADE, it can provide actions to take if the indicated row is removed. The constraint requires looking in the other table each time a write occurs (eg INSERT).
An Index. That is, specifying a FK automatically adds an INDEX (if not already present) to make the constraint faster.
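Both halves can be observed on a pair of throwaway tables. This is a sketch with made-up names (parent, child), assuming InnoDB:

```sql
CREATE TABLE parent (
  id INT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE child (
  id        INT PRIMARY KEY,
  parent_id INT NOT NULL,
  FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE CASCADE
) ENGINE=InnoDB;

-- The index half: besides PRIMARY, an index on parent_id is listed
-- even though none was declared explicitly.
SHOW INDEX FROM child;

-- The constraint half.
INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (10, 1);       -- accepted: parent 1 exists
-- INSERT INTO child VALUES (11, 2);     -- would be rejected: no parent row with id = 2
DELETE FROM parent WHERE id = 1;         -- ON DELETE CASCADE also removes child row 10
```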
Getting the id
Here is the "usual" way to do a pair of inserts, where you need the second to 'point' to the first:
INSERT INTO t1 ... -- with an AUTO_INCREMENT id
grab LAST_INSERT_ID() -- that id
INSERT INTO t2 ... -- and include the id from above
For AUTO_INCREMENT to work, the column must be the first column of some key. (Note: a PRIMARY KEY is a UNIQUE key, and a UNIQUE key is a key, aka INDEX.)
Optionally you can specify a FK on the second table to point out the connection between the tables.
And, as spelled out in other answers, a FK could involve more than one column.
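Spelled out as a runnable sketch, the pair of inserts might look like this. The column names (name, t1_id, note) are assumptions; only t1, t2 and LAST_INSERT_ID() come from the outline above.

```sql
CREATE TABLE t1 (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE t2 (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  t1_id INT NOT NULL,
  note  VARCHAR(50),
  FOREIGN KEY (t1_id) REFERENCES t1 (id)  -- the optional FK
) ENGINE=InnoDB;

INSERT INTO t1 (name) VALUES ('first');

-- Save the generated id right away: the next AUTO_INCREMENT insert
-- on this connection would overwrite LAST_INSERT_ID().
SET @t1_id = LAST_INSERT_ID();

INSERT INTO t2 (t1_id, note) VALUES (@t1_id, 'points at first');
```

LAST_INSERT_ID() is tracked per connection, so inserts made by other clients in between do not affect the value.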
Entities and Relations
Sometimes, a set of tables like yours is best 'designed' this way:
Determine the "entities": users, tests, questions, answers
Relations and whether they are 1:1, 1:many, or many:many... Users:test is many-to-many; tests:questions is 1:many (unless you want questions to be shared between tests).
Answers is more complex since each answer depends on both the user and the question.
1:1 -- rarely practical; may as well merge the tables together.
1:many -- a link (FK?) in one table to the other.
many:many -- need a bridge table with (usually) 2 columns, namely ids linking to the two tables.
For example, I have a course relationship table in which the combination of student id and course id is unique. When I create this relationship table, should I use an auto-increment column as the PK, or use student id and course id together as a multi-column PK?
Some people add an auto-increment column as PK to just about every table.
But I believe it is good to have a multi-column-PK in the case where the table is a relationship table between two or more tables.
On the other hand, it is more effort to delete a multi-column-PK table entry, because you need to give all columns in the multi-column-PK.
Also, check whether your technology stack (programming language) has problems with multi-column-PK.
This is something of a matter of opinion, but I put a synthetic primary key (auto-incremented id) in almost every table I create, including association/junction tables.
Why? Here are some reasons:
If I need to delete or update rows, then the primary key simplifies the process and reduces the chance of error.
The primary key captures the insertion order of the rows.
If the row needs to be referred to by another table, then you can refer to it by a primary key.
In some databases, the primary key is used to cluster the data (that is, sort the data on the data pages). An auto-incremented primary key ensures that data goes "at the end". A natural primary key can result in fragmented data.
As an example of the third point, you might have an attendance table that records -- by day -- whether a student attended a class s/he is enrolled in. This could refer to the enrollment table.
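A sketch of that arrangement, with hypothetical table and column names. The synthetic enrollment_id is what attendance refers to, while a UNIQUE key still prevents the same student from being enrolled in the same course twice:

```sql
CREATE TABLE student (
  student_id INT AUTO_INCREMENT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE course (
  course_id INT AUTO_INCREMENT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE enrollment (
  enrollment_id INT AUTO_INCREMENT PRIMARY KEY,          -- synthetic key
  student_id    INT NOT NULL,
  course_id     INT NOT NULL,
  UNIQUE KEY uq_student_course (student_id, course_id),  -- the natural key
  FOREIGN KEY (student_id) REFERENCES student (student_id),
  FOREIGN KEY (course_id)  REFERENCES course (course_id)
) ENGINE=InnoDB;

CREATE TABLE attendance (
  enrollment_id INT     NOT NULL,
  class_date    DATE    NOT NULL,
  attended      BOOLEAN NOT NULL,
  PRIMARY KEY (enrollment_id, class_date),
  -- One column is enough to point at an enrollment row.
  FOREIGN KEY (enrollment_id) REFERENCES enrollment (enrollment_id)
) ENGINE=InnoDB;
```

Without the synthetic key, attendance would have to carry both student_id and course_id to reference enrollment.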
I am stuck on a problem where I have to find the cardinality of a relationship between tables using MySQL. I am following this post: MySQL: How to determine foreign key relationships programmatically?
I have found all tables related to my table and the columns which are foreign keys. Now I also want to find the cardinality of the relationship, i.e. one-to-one, one-to-many or many-to-many. Any ideas or snippets would be highly appreciated.
Let us assume that table A has a foreign key f which refers to the primary key k of table B. Then you can learn the following from the schema:
If there is a UNIQUE constraint on A.f, then there can be at most one row in A for every row in B. Note that in the case of multi-column indices, all columns of the unique constraint must be part of the foreign key. You can use SHOW INDEX FROM tablename WHERE Non_unique = 0 to obtain information on the uniqueness constraints of a table.
If A.f is declared NOT NULL, then there will always be at least one row in B for every row in A. You can use SHOW COLUMNS FROM tablename to list the columns and see which of them allow NULL values.
If you interpret “one” as “zero or one”, then you get a one-to-one relation using a unique constraint, and a many-to-one relation (i.e. many rows in A referring to one row in B) without such a unique constraint.
A many-to-many relation would be modeled using a separate table, where each row represents one element of the relation, with many-to-one relations for both foreign keys it contains.
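The same checks can be run against information_schema instead of SHOW commands. The query below is a rough sketch rather than a complete solution: it only classifies single-column foreign keys in the current schema, and a multi-column foreign key would show up as one misleading row per column.

```sql
SELECT
  kcu.TABLE_NAME            AS referencing_table,
  kcu.COLUMN_NAME           AS fk_column,
  kcu.REFERENCED_TABLE_NAME AS referenced_table,
  c.IS_NULLABLE             AS fk_nullable,  -- 'NO': every row must point at a parent
  CASE WHEN EXISTS (
         SELECT 1
         FROM information_schema.STATISTICS AS s
         WHERE s.TABLE_SCHEMA = kcu.TABLE_SCHEMA
           AND s.TABLE_NAME   = kcu.TABLE_NAME
           AND s.COLUMN_NAME  = kcu.COLUMN_NAME
           AND s.NON_UNIQUE   = 0
           -- keep only unique indexes that consist of this one column
           AND NOT EXISTS (
                 SELECT 1
                 FROM information_schema.STATISTICS AS s2
                 WHERE s2.TABLE_SCHEMA = s.TABLE_SCHEMA
                   AND s2.TABLE_NAME   = s.TABLE_NAME
                   AND s2.INDEX_NAME   = s.INDEX_NAME
                   AND s2.SEQ_IN_INDEX > 1))
       THEN 'one-to-one'
       ELSE 'many-to-one'
  END AS cardinality
FROM information_schema.KEY_COLUMN_USAGE AS kcu
JOIN information_schema.COLUMNS AS c
  ON  c.TABLE_SCHEMA = kcu.TABLE_SCHEMA
  AND c.TABLE_NAME   = kcu.TABLE_NAME
  AND c.COLUMN_NAME  = kcu.COLUMN_NAME
WHERE kcu.TABLE_SCHEMA = DATABASE()
  AND kcu.REFERENCED_TABLE_NAME IS NOT NULL;
```

A table that shows up twice as referencing_table with 'many-to-one' rows, and has little else in it, is a likely candidate for a many-to-many bridge table.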
I see keys declared in table definitions like this:
PRIMARY KEY (ID),
KEY name (name),
KEY `desc` (`desc`),
etc.
What are they useful for?
Keys are used to enforce referential integrity in your database.
A primary key is, as its name suggests, the primary identification of a given row in your table. That is, each row's primary key will uniquely identify that row.
A unique key is a key that enforces uniqueness on that set of columns. It is similar to a primary key in that it will also uniquely identify a row in a table. However, there is the added benefit of allowing NULL in some of those combinations. There can only be 1 primary key, but you can have many unique keys.
A foreign key is used to enforce a relationship between 2 tables (think parent/child table). That way, a child table can not have a value of X in its parent column unless X actually appears in the parent table. This prevents orphaned records from appearing.
The primary key constraint ensures that the column(s) are:
not null
unique (unique sets if more than one column)
KEY is MySQL's terminology in CREATE TABLE statements for an index. Indexes are not ANSI currently, but all databases use indexes to speed up data retrieval (at the cost of insertion/update/deletion, because of maintenance to keep the index relevant).
There are other key constraints:
unique
foreign key (for referential integrity)
...but your question doesn't include examples of them.
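To see the four kinds side by side, here is a small sketch in MySQL syntax. The author and article tables and their columns are invented for the example.

```sql
CREATE TABLE author (
  id INT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE article (
  id        INT          NOT NULL,
  slug      VARCHAR(100) NOT NULL,
  title     VARCHAR(100) NOT NULL,
  author_id INT          NOT NULL,
  PRIMARY KEY (id),                               -- unique and not null
  UNIQUE KEY uq_slug (slug),                      -- unique constraint
  KEY idx_title (title),                          -- plain index, no constraint
  FOREIGN KEY (author_id) REFERENCES author (id)  -- referential integrity
) ENGINE=InnoDB;
```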
Keys are also called indexes. They are used for speeding up queries. Additionally, keys can be constraints (unique key and foreign key). The primary key is also a unique key and it identifies the records. A table can have other unique keys as well, which do not allow a value to be duplicated in a given column. A foreign key enforces referential integrity (@Derek Kromm already wrote an excellent description). An ordinary key is used only for speeding up queries. You need to index the columns used in the WHERE clause of the queries. If you have no index on the column, MySQL will need to read the whole table to find the records you need. When an index is used, MySQL reads only the index (which is usually a B+ tree) and then reads only those records from the table that it found in the index.
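EXPLAIN gives a rough way to check whether a query can use an index. The customer table below is hypothetical, and the plans mentioned in the comments are only what one would typically expect; the actual output depends on the data and the MySQL version.

```sql
CREATE TABLE customer (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  email VARCHAR(100) NOT NULL,
  city  VARCHAR(50)  NOT NULL
) ENGINE=InnoDB;

-- No index on city yet: the plan typically shows a full table scan (type = ALL).
EXPLAIN SELECT * FROM customer WHERE city = 'Oslo';

CREATE INDEX idx_city ON customer (city);

-- With the index in place, the plan can list idx_city under "key".
EXPLAIN SELECT * FROM customer WHERE city = 'Oslo';
```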
A primary key creates a unique, NOT NULL constraint that identifies each row in the table. Searching by this key is also the fastest. You can create only one PK per table.
An ordinary key/index speeds up searching by that column, as well as sorting, grouping and joining with other tables on it.
Index drawbacks:
Adding new indexes to a table will slow down insert/update/delete statements. So you should select the columns to index very carefully.
Keys are used to relate tables to one another, and they let you create joins in order to select data from multiple tables.
What, you didn't find the Wikipedia entry comprehensive? ;-)
So, a key, in a relational database (such as MySQL, PostgreSQL, Oracle, etc) is a data constraint on a column or set of columns. The most common keys are the Primary key and foreign keys and unique keys.
A foreign key specifically relates the data of one table to data in another table. You might see that a table blog_posts has a foreign key to users based on a user_id column. This means that every user_id in blog_posts will have a corresponding entry in the users table (this is a one-to-many relationship -- a topic for another time).
If a column (or group of columns) has a unique key, that means that there can only be one such instance of the key in the table. Often you'll see things like email addresses be unique keys -- you only want one email address per user. I've also seen a combination of columns match to a unique key -- the five columns, first_name, last_name, address, city, and state, will often be a unique key -- realistically, there can only be one William Gates at 1835 73rd Ave NE, Medina, Washington. (I do realize that it is possible for a William Gates Jr. to be born, but the designers of that database didn't really care).
The primary key is the primary, unique identifier of a given table. By definition it is a unique key. It is something which cannot be null and must be unique. It holds a special place of prominence among the indexes of a given table.
I thought a foreign key meant that a single row must reference a single row, but I'm looking at some tables where this is definitely not the case. Table1 has column1 with a foreign key constraint on column2 in table2, BUT there are many records in table2 with the same value in column2. There's also a non-unique index on column2. What does this mean? Does a foreign key constraint simply mean that at least one record must exist with the right values in the right columns? I thought it meant there must be exactly one such record (not sure how nulls fit into the picture, but I'm less concerned about that at the moment).
update: Apparently, this behavior is specific to MySQL, which is what I was using, but I didn't mention it in my original question.
From MySQL documentation:
InnoDB allows a foreign key constraint to reference a non-unique key. This is an InnoDB extension to standard SQL.
However, there is a practical reason to avoid foreign keys on non-unique columns of the referenced table: what should the semantics of "ON DELETE CASCADE" be in that case?
The documentation further advises:
The handling of foreign key references to nonunique keys or keys that contain NULL values is not well defined (...) You are advised to use foreign keys that reference only UNIQUE (including PRIMARY) and NOT NULL keys.
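Following that advice does not force the reference to target the primary key. A sketch with made-up tables, where the referenced column is a non-primary key that is both UNIQUE and NOT NULL:

```sql
CREATE TABLE country (
  id       INT AUTO_INCREMENT PRIMARY KEY,
  iso_code CHAR(2) NOT NULL,
  UNIQUE KEY uq_iso_code (iso_code)      -- UNIQUE and NOT NULL, but not the PK
) ENGINE=InnoDB;

CREATE TABLE city (
  id           INT AUTO_INCREMENT PRIMARY KEY,
  name         VARCHAR(100) NOT NULL,
  country_code CHAR(2) NOT NULL,
  -- Each city row matches exactly one country row, so the cascade is well defined.
  FOREIGN KEY (country_code) REFERENCES country (iso_code) ON DELETE CASCADE
) ENGINE=InnoDB;
```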
Your analysis is correct; the keys don't have to be unique, and constraints will act on the set of matching rows. Not usually a useful behavior, but situations can come up where it's what you want.
When this happens, it usually means that two foreign keys are being linked to each other.
Often the table that would contain the key as a primary key isn't even in the schema.
Example: Two tables, COLLEGES and STUDENTS, both contain a column called ZIPCODE.
If we do a quick check on
SELECT * FROM COLLEGES JOIN STUDENTS ON COLLEGES.ZIPCODE = STUDENTS.ZIPCODE
We might discover that the relationship is many to many. If our schema had a table called ZIPCODES, with primary key ZIPCODE, it would be obvious what's really going on.
But our schema has no such table. Just because our schema has no such table doesn't mean that such data doesn't exist, however. Somewhere, out in USPO land, there is just such a table. And both COLLEGES.ZIPCODE and STUDENTS.ZIPCODE are references to that table, even if we don't acknowledge it.
This has more to do with the philosophy of data than the practice of building databases, but it neatly illustrates something fundamental: the data has characteristics that we discover, and not only characteristics that we invent. Of course, what we discover could be what somebody else invented. That's certainly the case with ZIPCODE.
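If one did want to acknowledge that table in the schema, a sketch in MySQL syntax could look like this. COLLEGE_ID and STUDENT_ID are assumed columns; only the table names and ZIPCODE come from the example above.

```sql
CREATE TABLE ZIPCODES (
  ZIPCODE CHAR(5) PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE COLLEGES (
  COLLEGE_ID INT PRIMARY KEY,            -- hypothetical column
  ZIPCODE    CHAR(5) NOT NULL,
  FOREIGN KEY (ZIPCODE) REFERENCES ZIPCODES (ZIPCODE)
) ENGINE=InnoDB;

CREATE TABLE STUDENTS (
  STUDENT_ID INT PRIMARY KEY,            -- hypothetical column
  ZIPCODE    CHAR(5) NOT NULL,
  FOREIGN KEY (ZIPCODE) REFERENCES ZIPCODES (ZIPCODE)
) ENGINE=InnoDB;
```

Each table now has an ordinary many-to-one link to ZIPCODES, and the many-to-many pattern between COLLEGES and STUDENTS is just the two links meeting at the same zip code.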
Yes, you can create foreign keys to basically any column(s) in any table. Most times you'll create them to the primary key, though.
If you do use foreign keys that don't point to a primary key, you might also want to create a (non-unique) index to the column(s) being referenced for the sake of performance.
Depends on the RDBMS you're using. I think some do this for you implicitly, or use some other tricks. RTM.
PostgreSQL also refuses this (anyway, even if it is possible, it does not mean it is a good idea):
essais=> CREATE TABLE Cities (name TEXT, country TEXT);
CREATE TABLE
essais=> INSERT INTO Cities VALUES ('Syracuse', 'USA');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Syracuse', 'Greece');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Paris', 'France');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Aramits', 'France');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Paris', 'USA');
INSERT 0 1
essais=> CREATE TABLE People (name TEXT, city TEXT REFERENCES Cities(name));
ERROR: there is no unique constraint matching given keys for referenced table "cities"
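One way to make PostgreSQL accept the reference is to give Cities a unique key and point the foreign key at all of its columns. This is a sketch that assumes a fresh database (Cities is created again here) and adds a country column to People, which the session above does not have.

```sql
CREATE TABLE Cities (
  name    TEXT,
  country TEXT,
  UNIQUE (name, country)   -- 'Paris' may repeat, but not ('Paris', 'France')
);

CREATE TABLE People (
  name    TEXT,
  city    TEXT,
  country TEXT,            -- assumed extra column, needed to disambiguate the city
  FOREIGN KEY (city, country) REFERENCES Cities (name, country)
);
```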
Necromancing.
As others already said, you shouldn't reference a non-unique key as foreign key.
But what you can do instead (without delete cascade danger) is adding a check-constraint (at least in MS-SQL).
That's not exactly the same as a foreign key, but at least it will prevent the insertion of invalid/orphaned/dead data.
See here for reference (you'll have to port the MS-SQL code to MySQL syntax):
Foreign Key to non-primary key
Edit:
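Searching for the reasons for the downvote, according to Mysql CHECK Constraint, MySQL doesn't really support CHECK constraints. (That held before MySQL 8.0.16; later versions enforce them, although a CHECK still cannot look into another table.)
In the older versions you can define them in your DDL query for compatibility reasons, but they are just ignored...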
But as mentioned there, you can create a BEFORE INSERT and BEFORE UPDATE trigger, which will throw an error when the requirements of the data are not met, which is basically the same thing, except that it's an even bigger mess.
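A sketch of what such a trigger might look like in MySQL 5.5 or later, where SIGNAL is available. The zone and delivery tables are invented, and DELIMITER is a command of the mysql command-line client, not part of SQL itself.

```sql
CREATE TABLE zone (
  zone_code INT NOT NULL,
  label     VARCHAR(50),
  KEY (zone_code)                        -- deliberately non-unique
) ENGINE=InnoDB;

CREATE TABLE delivery (
  id        INT AUTO_INCREMENT PRIMARY KEY,
  zone_code INT NOT NULL
) ENGINE=InnoDB;

DELIMITER //
CREATE TRIGGER delivery_zone_check
BEFORE INSERT ON delivery
FOR EACH ROW
BEGIN
  -- Reject the row unless at least one zone carries the same code.
  IF NOT EXISTS (SELECT 1 FROM zone WHERE zone.zone_code = NEW.zone_code) THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'delivery.zone_code has no matching zone row';
  END IF;
END//
DELIMITER ;
```

A matching BEFORE UPDATE trigger would be needed as well, and nothing here stops someone from deleting the last zone row that a delivery depends on, which is part of why this is a bigger mess than a real foreign key.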
As to the question:
I thought a foreign key meant that a single row must reference a
single row, but I'm looking at some tables where this is definitely
not the case.
In any sane RDBMS, this is true.
The fact that this is possible in MySQL is just one more reason why
MySQL is an in-sane RDBMS.
It may be fast, but sacrificing referential integrity and data quality on the altar of speed is not my idea of a quality-rdbms.
In fact, if it's not ACID-compliant, it's not really a (correctly functioning) RDBMS at all.
What database are we talking about? In SQL Server 2005, I cannot create a foreign key constraint that references a column that does not have a unique constraint (primary key or otherwise).
create table t1
(
id int identity,
fk int
);
create table t2
(
id int identity,
);
CREATE NONCLUSTERED INDEX [IX_t2] ON [t2]
(
[id] ASC
);
ALTER TABLE t1 with NOCHECK
ADD CONSTRAINT FK_t2 FOREIGN KEY (fk)
REFERENCES t2 (id) ;
Msg 1776, Level 16, State 0, Line 1
There are no primary or candidate keys in the referenced table 't2'
that match the referencing column list in the foreign key 'FK_t2'.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.
If you could actually do this, you would effectively have a many-to-many relationship, which is not possible without an intermediate table. I would be truly interested in hearing more about this ...
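For contrast, a sketch in SQL Server syntax of the variant that is accepted: once the referenced column carries a UNIQUE constraint, it counts as a candidate key and the foreign key can be created. The constraint name UQ_t2_id is made up, and the script assumes t1 and t2 do not exist yet.

```sql
CREATE TABLE t2
(
    id int identity,
    CONSTRAINT UQ_t2_id UNIQUE (id)   -- makes id a candidate key
);

CREATE TABLE t1
(
    id int identity,
    fk int,
    CONSTRAINT FK_t2 FOREIGN KEY (fk) REFERENCES t2 (id)
);
```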
See this related question and answers as well.