I'm using MySQL and moving from MyISAM to InnoDB tables. I started by dumping the data and re-importing it without foreign keys or constraints, and I'm now adding those back one at a time to find errors.
I have ParentTable which can either be linked to ChildTableA or ChildTableB, but not both. Should the CREATE syntax be: (using CREATE syntax for simplicity rather than multiple ALTERs)
CREATE TABLE `ParentTable` (
`IDParentTable` bigint(20) unsigned NOT NULL auto_increment,
`IDChildTableA` bigint(20) unsigned NOT NULL default '0',
`IDChildTableB` bigint(20) unsigned NOT NULL default '0',
PRIMARY KEY (`IDParentTable`),
KEY `ParentTable_IDChildTableA` (`IDChildTableA`),
KEY `ParentTable_IDChildTableB` (`IDChildTableB`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Without thinking about it, I tried including:
CONSTRAINT `ParentTable_IDChildTableA` FOREIGN KEY (`IDChildTableA`) REFERENCES `ChildTableA` (`IDChildTableA`),
CONSTRAINT `ParentTable_IDChildTableB` FOREIGN KEY (`IDChildTableB`) REFERENCES `ChildTableB` (`IDChildTableB`)
Which failed, because many rows have 0 for IDChildTableA, and many rows have 0 for IDChildTableB. But, no rows have 0 for both. It's seeing that no ChildTableA exists with IDChildTableA of 0, and likewise with B.
Is there a proper way to handle this situation while keeping referential integrity? Without splitting ParentTable in two? A way to say it's OK if it's 0 or references a valid related table? Or, does wanting polymorphic tables mean I have to go without constraints?
BTW, I much prefer this route to having a single IDChildTable foreign key plus another column designating whether it points to table A or B. Not that I see how that would work for constraints either; I'm just saying I'd prefer not to go that route.
A column used as a foreign key can be nullable. Use a NULL value in the foreign key column to indicate "no row referenced."
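Applied to the table in the question, this could look roughly like the sketch below. It assumes ChildTableA and ChildTableB already exist as InnoDB tables whose primary keys are bigint(20) unsigned columns named IDChildTableA and IDChildTableB, as the question implies. The constraint names and the UPDATE step that turns the old 0 placeholders into NULL are my own assumptions, not something from the question.

```sql
-- Step 1: allow NULL in both reference columns
-- (they were NOT NULL DEFAULT '0' in the question).
ALTER TABLE `ParentTable`
  MODIFY `IDChildTableA` bigint(20) unsigned NULL DEFAULT NULL,
  MODIFY `IDChildTableB` bigint(20) unsigned NULL DEFAULT NULL;

-- Step 2: replace the 0 placeholder with NULL, meaning "no row referenced".
UPDATE `ParentTable` SET `IDChildTableA` = NULL WHERE `IDChildTableA` = 0;
UPDATE `ParentTable` SET `IDChildTableB` = NULL WHERE `IDChildTableB` = 0;

-- Step 3: add the foreign keys. A NULL value is not checked against the
-- referenced table, so rows that use only one of the two columns pass.
-- The constraint names here are hypothetical.
ALTER TABLE `ParentTable`
  ADD CONSTRAINT `fk_ParentTable_ChildTableA`
    FOREIGN KEY (`IDChildTableA`) REFERENCES `ChildTableA` (`IDChildTableA`),
  ADD CONSTRAINT `fk_ParentTable_ChildTableB`
    FOREIGN KEY (`IDChildTableB`) REFERENCES `ChildTableB` (`IDChildTableB`);
```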
It seems like you have your foreign keys backwards. Usually, the child table has a reference to the parent table.
parent (id int primary key)
childA (id int, parent_id int, ...)
childB (id int, parent_id int, ...)
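Spelled out as DDL, that layout might look like the following sketch. The table and column names come from the outline above; the NOT NULL choices, the primary keys on the child tables and the InnoDB engine clause are my assumptions.

```sql
CREATE TABLE parent (
  id INT NOT NULL PRIMARY KEY
) ENGINE=InnoDB;

-- Each child row points back at exactly one parent row.
CREATE TABLE childA (
  id INT NOT NULL PRIMARY KEY,
  parent_id INT NOT NULL,
  FOREIGN KEY (parent_id) REFERENCES parent (id)
) ENGINE=InnoDB;

CREATE TABLE childB (
  id INT NOT NULL PRIMARY KEY,
  parent_id INT NOT NULL,
  FOREIGN KEY (parent_id) REFERENCES parent (id)
) ENGINE=InnoDB;
```

Note that this layout by itself does not stop a parent from having rows in both child tables.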
EDIT
Related to the question regarding a foreign key column referencing two tables (based on a discriminator column)... that's not possible. A foreign key constraint can reference only one table.
To get something like that to work, you'd need to add two separate foreign key columns, each referencing one target table. You could make use of the extra discriminator column (A or B) to identify which foreign key column should be used, so one fk column would be populated with a reference and the other fk column would be set to NULL.
However, MySQL has no declarative constraint that would require exactly one of those two fk columns to be populated (at least before 8.0.16, since earlier versions parse CHECK clauses but ignore them), so the database would not enforce that rule. The extra discriminator column would actually be redundant, because you could derive it from which foreign key column was populated.
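On MySQL 8.0.16 or later, where CHECK constraints are enforced instead of being silently ignored, the "exactly one of the two" rule can be approximated declaratively. The sketch below is only an illustration under that version assumption. The table and column names follow the question, the constraint names are hypothetical, and the foreign keys deliberately have no ON UPDATE or ON DELETE actions, because MySQL does not accept a CHECK on columns that are used in foreign key referential actions.

```sql
-- Assumes MySQL 8.0.16+ (older versions parse CHECK but do not enforce it).
CREATE TABLE ChildTableA (
  IDChildTableA BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE ChildTableB (
  IDChildTableB BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE ParentTable (
  IDParentTable BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  IDChildTableA BIGINT UNSIGNED NULL,
  IDChildTableB BIGINT UNSIGNED NULL,
  PRIMARY KEY (IDParentTable),
  CONSTRAINT fk_parent_child_a
    FOREIGN KEY (IDChildTableA) REFERENCES ChildTableA (IDChildTableA),
  CONSTRAINT fk_parent_child_b
    FOREIGN KEY (IDChildTableB) REFERENCES ChildTableB (IDChildTableB),
  -- Exactly one of the two reference columns must be populated.
  CONSTRAINT chk_exactly_one_child CHECK (
    (IDChildTableA IS NOT NULL AND IDChildTableB IS NULL) OR
    (IDChildTableA IS NULL AND IDChildTableB IS NOT NULL)
  )
) ENGINE=InnoDB;
```

On older servers the same rule would have to be enforced in application code or with triggers.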
Does it make sense to create indexes for a table called user_movies with the following columns:
user_id
movie_id
There will be much more reading than inserting or updating on this table, but I'm not sure what to do. Also: is it acceptable to omit a primary key in this situation?
The correct definition for this table is as follows:
CREATE TABLE user_movies (
user_id INT NOT NULL,
movie_id INT NOT NULL,
PRIMARY KEY (user_id, movie_id),
FOREIGN KEY (user_id) REFERENCES users(user_id),
FOREIGN KEY (movie_id) REFERENCES movies(movie_id)
) ENGINE=InnoDB;
Notice "primary key" is a constraint, not a column. It's best practice to have a primary key constraint in every table. Do not confuse primary key constraint with an auto-generated pseudokey column.
In MySQL, declaring a foreign key or a primary key implicitly creates an index. Yes, these are beneficial.
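To see which indexes MySQL actually created for the table, you can ask the server directly. The statements below are standard MySQL. What I would expect to see for the definition above (an assumption worth verifying on your own server) is the PRIMARY index on (user_id, movie_id) plus one automatically created index on movie_id for the second foreign key, since the foreign key on user_id can reuse the leftmost column of the primary key.

```sql
-- List every index on the table, one row per indexed column.
SHOW INDEX FROM user_movies;

-- Show the full definition, including implicitly created keys
-- and any auto-generated constraint names.
SHOW CREATE TABLE user_movies;
```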
I would index both columns separately and yes you can eliminate the primary key.
I have always heard that you should create a unique index on BOTH columns, first one way (user_id + movie_id) then the other way (movie_id + user_id). It DOES work slightly faster (not much, about 10-20%) in my application with some quick and dirty testing.
It also makes sure you can't have two rows that tie the same movie_id to the same user_id (which could be good, but perhaps not always).
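A minimal sketch of that two-direction setup is shown below. The index name uq_movie_user is hypothetical and the foreign keys are left out to keep the example self-contained. Here the primary key serves as the (user_id, movie_id) direction and a second unique index covers (movie_id, user_id).

```sql
CREATE TABLE user_movies (
  user_id INT NOT NULL,
  movie_id INT NOT NULL,
  -- Direction 1: look up the movies of a given user.
  PRIMARY KEY (user_id, movie_id),
  -- Direction 2: look up the users of a given movie.
  UNIQUE KEY uq_movie_user (movie_id, user_id)
) ENGINE=InnoDB;

-- EXPLAIN shows which index the optimizer picks for each direction.
EXPLAIN SELECT movie_id FROM user_movies WHERE user_id = 42;
EXPLAIN SELECT user_id FROM user_movies WHERE movie_id = 7;
```

Whether the second index pays off depends on how often you query by movie_id; it also adds some cost to every insert.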
If you are using such a "join-table", you'll probably use some joins in your queries -- and those will probably benefit from an index on each one of those two columns (which means two separate indexes).
I'm looking at the MySQL docs here and trying to sort out the distinction between FOREIGN KEYs and CONSTRAINTs. I thought an FK was a constraint, but the docs seem to talk about them like they're separate things.
The syntax for creating an FK is (in part)...
[CONSTRAINT [symbol]] FOREIGN KEY
[index_name] (index_col_name, ...)
REFERENCES tbl_name (index_col_name,...)
So the "CONSTRAINT" clause is optional. Why would you include it or not include it? If you leave it out, does MySQL create a foreign key but not a constraint? Or is it more like a "CONSTRAINT" is nothing more than a name for your FK, so if you don't specify it you get an anonymous FK?
Any clarification would be greatly appreciated.
Thanks,
Ethan
Yes, a foreign key is a type of constraint. MySQL has uneven support for constraints:
PRIMARY KEY: yes as table constraint and column constraint.
FOREIGN KEY: yes as table constraint, but only with InnoDB and BDB storage engines; otherwise parsed but ignored.
CHECK: parsed but ignored in all storage engines before MySQL 8.0.16; enforced from 8.0.16 onward.
UNIQUE: yes as table constraint and column constraint.
NOT NULL: yes as column constraint.
DEFERRABLE and other constraint attributes: no support.
The CONSTRAINT clause allows you to name the constraint explicitly, either to make metadata more readable or else to use the name when you want to drop the constraint. The SQL standard requires that the CONSTRAINT clause is optional. If you leave it out, the RDBMS creates a name automatically, and the name is up to the implementation.
In general (not necessarily in MySQL), foreign keys are constraints, but constraints are not always foreign keys. Think of primary key constraints, unique constraints, etc.
Coming back to the specific question, you are correct, omitting CONSTRAINT [symbol] part will create a FK with an auto-generated name.
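If you did leave the name out and later need it, for example to drop the foreign key, you can look it up in information_schema. The query below is a sketch. The schema and table names (my_dbschema, my_table) and the generated name my_table_ibfk_1 are placeholders for illustration, so substitute whatever the query returns on your server.

```sql
-- List all constraints on one table, with their (possibly generated) names.
SELECT CONSTRAINT_NAME, CONSTRAINT_TYPE
FROM information_schema.TABLE_CONSTRAINTS
WHERE TABLE_SCHEMA = 'my_dbschema'
  AND TABLE_NAME = 'my_table';

-- Drop the foreign key using the name reported above.
-- InnoDB typically generates names of the form tablename_ibfk_N.
ALTER TABLE my_dbschema.my_table DROP FOREIGN KEY my_table_ibfk_1;
```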
As of now, our CREATE TABLE DDLs are of this format - notice the UNIQUE KEY and FOREIGN KEY definition syntax we have used.
CREATE TABLE my_dbschema.my_table (
id INT unsigned auto_increment PRIMARY KEY,
account_nbr INT NOT NULL,
account_name VARCHAR(50) NOT NULL,
active_flg CHAR(1) NOT NULL DEFAULT 'Y',
vendor_nbr INT NOT NULL,
create_ts TIMESTAMP NOT NULL DEFAULT current_timestamp,
create_usr_id VARCHAR(10) NOT NULL DEFAULT 'DFLTUSR',
last_upd_ts TIMESTAMP NOT NULL DEFAULT current_timestamp ON UPDATE current_timestamp,
last_upd_usr_id VARCHAR(10) NOT NULL DEFAULT 'DFLTUSR',
UNIQUE KEY uk1_my_table(account_nbr, account_name),
FOREIGN KEY fk1_my_table(vendor_nbr) REFERENCES vendor(vendor_nbr)
);
In this format, MySQL automatically creates indexes named uk1_my_table and fk1_my_table, but the FK object itself gets a different, system-defined name: my_table_ibfk_1 (i.e. tablename_ibfk_N). So ALTER TABLE my_table DROP FOREIGN KEY fk1_my_table won't work, which is frustrating and alarming, because there is no FK object by that name.
Here's an alternative DDL format for the constraints (Ref: https://dev.mysql.com/doc/refman/5.6/en/create-table-foreign-keys.html):
CREATE TABLE my_dbschema.my_table (
id INT unsigned auto_increment PRIMARY KEY,
account_nbr INT NOT NULL,
account_name VARCHAR(50) NOT NULL,
active_flg CHAR(1) NOT NULL DEFAULT 'Y',
vendor_nbr INT NOT NULL,
create_ts TIMESTAMP NOT NULL DEFAULT current_timestamp,
create_usr_id VARCHAR(10) NOT NULL DEFAULT 'DFLTUSR',
last_upd_ts TIMESTAMP NOT NULL DEFAULT current_timestamp ON UPDATE current_timestamp,
last_upd_usr_id VARCHAR(10) NOT NULL DEFAULT 'DFLTUSR',
CONSTRAINT uk1_my_table UNIQUE KEY (account_nbr, account_name),
CONSTRAINT fk1_my_table FOREIGN KEY (vendor_nbr) REFERENCES vendor(vendor_nbr)
);
In this format, MySQL still automatically creates indexes named uk1_my_table and fk1_my_table, but the FK object now has the name given in the DDL: fk1_my_table. So ALTER TABLE my_table DROP FOREIGN KEY fk1_my_table works, but it leaves behind the index of the same name.
Note also that ALTER TABLE my_table DROP INDEX fk1_my_table won't work while the FK still exists; it fails with an error message saying the index is being used in a FK. DROP INDEX works only after the DROP FOREIGN KEY command has been executed successfully.
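Put together, the clean-up for the second DDL format would be the following two statements, in this order. This is a sketch that assumes the table was created with the CONSTRAINT fk1_my_table clause shown above.

```sql
-- 1. Drop the constraint first; this works because the FK was named explicitly.
ALTER TABLE my_dbschema.my_table DROP FOREIGN KEY fk1_my_table;

-- 2. Only now can the index of the same name be dropped.
ALTER TABLE my_dbschema.my_table DROP INDEX fk1_my_table;
```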
Hope this explains and helps resolve the confusion.
Can't answer for MySQL but FK's are constraints. Anything that forces your data into a certain condition is a constraint. There are several kinds of constraints, Unique, Primary Key, Check and Foreign Keys are all constraints. Maybe MySQL has others.
Sometimes words are allowed in commands but not required, purely for readability, like the FROM in the DELETE statement.
If I'm not wrong, the constraints need indexes, so when you create, for example, a foreign key constraint MySQL automatically creates an index too.
I am going to throw my hat in the ring here, although I don't actually know if my answer is accurate, so if you know the internal guts of database engineering, please correct me. But if I am right, I think this will help.
A Foreign Key and its associated Foreign Key Constraint are not the same thing, in the way a car engine and a crank-shaft are not the same thing. The engine converts gasoline explosions into straight line motion (the pistons), and the crankshaft converts that straight-line motion into turning motion, which then turns the wheels of the car. Together, the engine and the crank-shaft make the car go.
Likewise, a Foreign Key and a Foreign Key Constraint are not the same thing, but they work together to create the idea of a "Foreign Key Relationship".
DEFINITIONS:
"Foreign Key" is short for Foreign Key Index.
"Constraint" is short for Foreign Key Constraint.
The Index and the Constraint together make the "Foreign Key Relationship".
The Foreign Key Relationship is the requirement for a value in a child table to exist in its parent table, thus ensuring data integrity in a database.
Because Key means Index, we don't say "Foreign Key Index". We just say Foreign Key, but not saying "Index" is the cause of much confusion.
Creating a Foreign Key (a Foreign Key Index) creates a search tree, also called a dictionary because the tree is used to look up values. Strictly speaking, InnoDB builds a B-tree rather than a binary search tree, but I will keep calling it a BST below. The tree is cached in memory and takes up physical disk space, but it allows O(log n) lookups (almost instant) when joining from the child table to the parent table.
Creating a Foreign Key Constraint is creating a rule, which is a piece of code that gets called when a statement modifies (INSERT, UPDATE, DELETE) data involved in the foreign key. A constraint behaves much like a database trigger. A constraint is like an email filter: a piece of code that gets called on a certain action, such as WHEN (new email) IF (From: crzy#xgfrnd.com) {SEND TO Trash Folder;}.
Thus, a Foreign Key "Constraint" would be a piece of code that gets called (a trigger, essentially), that looks like such: WHEN INSERT child_column IF (NOT IN parent_table) DO NOT ALLOW INSERT.
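To make that rule concrete, here is a small sketch of it firing. The table, column and constraint names are hypothetical, chosen to match the pseudo-rule above.

```sql
CREATE TABLE parent_table (
  parent_column INT NOT NULL PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE child_table (
  child_column INT NOT NULL,
  CONSTRAINT fk_child_parent
    FOREIGN KEY (child_column) REFERENCES parent_table (parent_column)
) ENGINE=InnoDB;

INSERT INTO parent_table (parent_column) VALUES (1);

-- Accepted: the value 1 exists in parent_table.
INSERT INTO child_table (child_column) VALUES (1);

-- Rejected: no parent row has the value 2, so MySQL raises error 1452
-- (cannot add or update a child row: a foreign key constraint fails).
INSERT INTO child_table (child_column) VALUES (2);
```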
And then you have your Cascades and Updates and Delete rules (constraints), with their various if / then conditions and operations, etc...
So, a "Foreign Key" is a BST dictionary mapping child table column values to parent table column values. The purpose of the Foreign Key is speed (NOT data integrity, since data integrity can be achieved, albeit slowly, without an index).
A Foreign Key Constraint is a rule: code that gets triggered on SQL statements, and that rule uses the BST as a dictionary for fast processing, to avoid traversing tables, which may eventually create Cartesian-like behavior. The purpose of the Foreign Key Constraint is data integrity.
I have never created a parent table where the referenced parent column was not itself a key in the parent table. So the question then is, is the Foreign Key Index (the BST dictionary) actually needed? The Constraint is definitely needed, to ensure data integrity, but Foreign Key Index (the BST dictionary) is actually not needed to fulfill the Foreign Key rule, thus, "Key" and "Index" have two different definitions. The "Index" is the BST tree, and the "Key" is the rule (the idea that the child value must exist in the parent table). In MySQL, however, the Foreign Key Index is needed, only because they programmed it that way, but they didn't have to. Having a BST tree is just faster, when the parent column is not itself indexed. I would never recommend making the referenced parent column not a key (index). But if someone did reference a non-indexed parent column using a Foreign Key Constraint without a BST, then the SQL operations would be progressively slow, and your application may eventually come to a crawl.
THE CONFUSION: Adjectives and verbs.
When we say "Foreign Key" colloquially, we are usually referring to the Foreign Key Relationship, not the Foreign Key Index. But the word Key means Index. So that's the root source of all the confusion. I.e. lack of reserved keyword definition standards. In a MySQL CREATE TABLE statement, FOREIGN KEY means the Index (the BST), and CONSTRAINT names the rule, therefore, the confusion is coming from the difference in the phrase "Foreign Key" when we speak, versus "Foreign Key" being defined in an actual SQL statement.
In computer code, "Foreign" is an adjective and "Key" is a noun, meaning the Index.
In colloquial speech, the phrase "Foreign Key" is an adjective, and the words "Index", "Constraint", and "relationship" are all nouns. When we speak to each other across office cubicles, "Foreign Key" means the "idea" of data integrity (i.e. the rule, not the index).
Unfortunately, programmers are always searching for short-hand ways of typing, which often causes confusion. Everything in computer science is a trade-off, and that includes coding style.
If the syntax for creating a MySQL table instead used the following reserved words, then the confusion would disappear: FOREIGN KEY CONSTRAINT fk1_rule_child_column FOREIGN KEY INDEX (fk_bst_child_to_parent_column) REFERENCES parent_table (parent_column)
Furthermore, since MySQL always creates both an Index and a Constraint, the MySQL creators could have completely hidden (abstraction) the dual element of the Foreign Key Relationship. Or, perhaps I should say, they could have bundled them together so the user doesn't have to think about the dual aspect, and instead just creates a "Foreign Key" my_foreign_key with the dual details hidden.
Nevertheless, MySQL is inexpensive, robust, and it's great. For the record, I have zero complaints, and I have only gratitude for the creators. For my part, they can do as they please.
Incidentally, as a style recommendation, you should ALWAYS name your parent and child columns the same, and your table names should ALWAYS contain their Foreign Key Relationships. So your tables
customers
products
attributes
orders
should instead be named
customers
products
product_attributes
customer_product_orders
That way, you and your successors know the foreign key relationships just by reading any table name. If that's too much typing for you, then
cust
prod
prod_attr
cust_prod_ord
That being said, I am guessing. I don't actually know if my BST and Rule explanation is correct, but I think it's correct, and hopefully will clear up this confusing issue. But I would appreciate if you database-guts guys out there would either confirm, modify, supplement, or deny what I have written, and if I am mistaken, what is the real answer, so we can finally get this multi-generational mystery solved. If I am completely off, and this answer needs to be deleted, that is fine too.
This is probably the most confusing topic in MySQL.
Many people say that, for instance, the 'PRIMARY KEY', the 'FOREIGN KEY', and the 'UNIQUE' key are actually indexes! (MySQL official documentation is included here)
Many others, on the other hand, say that they're rather constraints (which does make sense, cause when you use them, you're really imposing restrictions on the affected columns).
If they really are indexes, then what's the point of using the CONSTRAINT clause to give one a name, since you're supposed to be able to use the name you gave the index when you created it?
Example:
... FOREIGN KEY index_name (col_name1, col_name2, ...)
If FOREIGN KEY is an index, then we should be able to use the index_name to handle it. However, we can't.
But if they're not indexes but actual constraints which use indexes to work, then this does make sense.
In any case, we don't know. Actually, nobody seems to know.