I have a table like this:
something
a [INT]
b [INT]
c [INT]
...where a, b and c are separate foreign keys, each pointing to the id column of a different table. Since I want every row to be unique, and after having read this great answer, I think I should create a new index this way: UNIQUE INDEX(a, b, c) and (in my case) use INSERT IGNORE.
But as you can see, I would have one KEY for each column and then another extra UNIQUE INDEX containing all three. Is this a normal thing? It seems strange to me, and I have never seen it.
It is perfectly normal and reasonable to include a column in more than one index. However, if the combination of (a, b, c) is enough to uniquely identify a row, it seems that you want a PRIMARY KEY instead of a UNIQUE index here (technically there is very little difference, but semantically it might be the better choice).
Creating a primary key on Something(a, b, c) removes the need for a separate unique index. An additional unique index would make sense if your primary key was Something(a, b) and you wanted a unique index on (a, b, c). But since all three columns are foreign keys, a primary key is what you want.
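As a minimal sketch of the primary key variant, assuming hypothetical parent tables named table_a, table_b and table_c (the question does not name them) and the InnoDB engine:

```sql
-- Hypothetical parent tables; only their id columns matter here.
CREATE TABLE table_a (id INT PRIMARY KEY);
CREATE TABLE table_b (id INT PRIMARY KEY);
CREATE TABLE table_c (id INT PRIMARY KEY);

CREATE TABLE something (
    a INT NOT NULL,
    b INT NOT NULL,
    c INT NOT NULL,
    -- The combination identifies a row, so it serves as the primary key.
    -- Its leftmost column also provides the index the foreign key on a needs.
    PRIMARY KEY (a, b, c),
    -- Separate single-column keys for the other two foreign keys.
    KEY (b),
    KEY (c),
    FOREIGN KEY (a) REFERENCES table_a (id),
    FOREIGN KEY (b) REFERENCES table_b (id),
    FOREIGN KEY (c) REFERENCES table_c (id)
);

INSERT INTO table_a VALUES (1);
INSERT INTO table_b VALUES (1);
INSERT INTO table_c VALUES (1);

INSERT IGNORE INTO something (a, b, c) VALUES (1, 1, 1);
-- Same combination again: skipped with a warning instead of an error.
INSERT IGNORE INTO something (a, b, c) VALUES (1, 1, 1);
```

If the table already had a different primary key, replacing PRIMARY KEY (a, b, c) with UNIQUE KEY (a, b, c) should give the same duplicate handling.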
We have a table that is currently using a composite (i.e. multi-column) index.
Let's say
PRIMARY KEY(A, B)
Of course we can rapidly search based on A alone (Leftmost Index Prefix) and if we want to efficiently search based on B alone, we need to create a separate index for B.
My question is that if I am doing:
PRIMARY KEY (B)
is there any value in retaining
PRIMARY KEY (A,B)
In other words will there be any advantage retaining
PRIMARY KEY (A,B)
if I have
PRIMARY KEY (A)
and
PRIMARY KEY (B)
You are missing a key point about PRIMARY KEY -- it is by definition (at least in MySQL), UNIQUE. And do not have more columns than are needed to make the PK unique.
If B alone is unique, then have PRIMARY KEY(B) without any other columns in the PK definition.
If A is also unique, then do
PRIMARY KEY(B),
UNIQUE(A)
or swap them.
For a longer discussion of creating indexes, see my cookbook.
If it takes both columns to be "unique", then you may need
PRIMARY KEY(A, B),
INDEX(B)
or
PRIMARY KEY(B, A),
INDEX(A)
Until you have the SELECTs, it is hard to know what indexes to create.
You can't have multiple primary keys, so I'm going to assume you're really asking about having an ordinary index.
If you have an index on (A, B), it will be used for queries that use both columns, like:
WHERE A = 1 AND B = 2
as well as queries that just use A:
WHERE A = 3
But if you have a query that just uses B, e.g.
WHERE B = 4
it will not be able to use the index at all. If you need to optimize these queries, you should also have an index on B. So you might have:
UNIQUE KEY (A, B)
INDEX (B)
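To see which index the optimizer picks for each of these predicates, something like the following could be tried. The table name t, the index names and the extra val column are made up for the illustration:

```sql
CREATE TABLE t (
    A   INT NOT NULL,
    B   INT NOT NULL,
    val INT,
    UNIQUE KEY ab (A, B),   -- serves A = ... and A = ... AND B = ...
    INDEX b_only (B)        -- serves B = ... on its own
);

-- The "key" column of each EXPLAIN row names the chosen index.
EXPLAIN SELECT * FROM t WHERE A = 1 AND B = 2;
EXPLAIN SELECT * FROM t WHERE A = 3;
EXPLAIN SELECT * FROM t WHERE B = 4;
```

The exact plan depends on the MySQL version and on how much data the table holds, so it is worth running this against realistic data rather than an empty table.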
I have a child table, A, which needs to refer to either of two different tables, B and C. B and C are similar but need to be in different tables.
As I understand it, MySQL only allows a FK to refer to one table. Therefore, and having looked at other solutions, I've decided to create two columns in A to refer to either B or C. As it should only be B or C, I've added a constraint to prevent them both being NOT NULL:
CREATE TABLE conversions
(
id INT AUTO_INCREMENT,
kicker_id INT NOT NULL,
success BOOL NOT NULL,
try_id INT,
penalty_try_id INT,
PRIMARY KEY (id),
FOREIGN KEY (try_id),
FOREIGN KEY (penalty_try_id),
CONSTRAINT conversions_coll_null CHECK (try_id IS NULL OR penalty_try_id IS NULL)
);
Will this work? Is it a good design?
Thanks
This is a fine approach (assuming you add in the foreign key definitions), but with an important caveat: MySQL versions before 8.0.16 parse check constraints but do not enforce them. On those versions you can include the constraint in the definition, but it doesn't do anything.
If you want to insist on the constraint on such a version, then you need to use a trigger.
By the way, if you want to ensure that exactly one of the columns has a value, use XOR rather than OR. This would be expressed as:
CHECK (try_id IS NULL XOR penalty_try_id IS NULL)
(Of course, this doesn't do anything in MySQL before 8.0.16, but it shows the correct logic.)
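A sketch of how the pieces could fit together, assuming the two parent tables are called tries and penalty_tries (hypothetical names, the question does not give them). The CHECK clause is enforced from MySQL 8.0.16 onward; the trigger is a fallback for older versions, where the clause is parsed and ignored:

```sql
-- Hypothetical parent tables.
CREATE TABLE tries (id INT AUTO_INCREMENT PRIMARY KEY);
CREATE TABLE penalty_tries (id INT AUTO_INCREMENT PRIMARY KEY);

CREATE TABLE conversions (
    id             INT AUTO_INCREMENT,
    kicker_id      INT NOT NULL,
    success        BOOL NOT NULL,
    try_id         INT,
    penalty_try_id INT,
    PRIMARY KEY (id),
    -- No ON DELETE / ON UPDATE actions are declared here; MySQL restricts
    -- combining such actions with a CHECK on the same column.
    FOREIGN KEY (try_id) REFERENCES tries (id),
    FOREIGN KEY (penalty_try_id) REFERENCES penalty_tries (id),
    -- Exactly one of the two references must be set.
    CONSTRAINT conversions_one_parent
        CHECK (try_id IS NULL XOR penalty_try_id IS NULL)
);

-- Fallback for versions that do not enforce CHECK.
-- DELIMITER is a mysql command-line client directive, not SQL.
DELIMITER //
CREATE TRIGGER conversions_one_parent_bi
BEFORE INSERT ON conversions
FOR EACH ROW
BEGIN
    -- Reject rows where both columns are NULL or both are set.
    IF (NEW.try_id IS NULL) = (NEW.penalty_try_id IS NULL) THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Exactly one of try_id, penalty_try_id must be set';
    END IF;
END//
DELIMITER ;
```

A matching BEFORE UPDATE trigger would be needed as well to cover changes to existing rows.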
There is a table that contains more id data than real data.
user_id int unsigned NOT NULL,
project_id int unsigned NOT NULL,
folder_id int unsigned NOT NULL,
file_id int unsigned NOT NULL,
data TEXT NOT NULL
The only way to create a unique primary key for this table would be a composite of (user_id, project_id, folder_id, file_id). I have frequently seen 2 column composite primary keys, but is it ok to have 4 or even more? According to MySQL: "All storage engines support at least 16 indexes per table and a total index length of at least 256 bytes. Most storage engines have higher limits.", so I know at least it is possible to do.
Past this, there are frequent queries to this table for various combinations of these ids. For example, find all projects for user X, find all files for user X, find all files for project Y and folder Z, etc. Should there be a separate individual index key on each of the id columns, or if there is a composite primary key that already contains all the columns does this make further individual keys redundant? There will be about 10 million - 50 million rows in the table at any time.
To summarize: is it ok to have a composite primary key with 4 (or more) id columns, and if there is a composite key does it make additional individual keys for each of those columns redundant?
Yes, it is ok to have a composite primary key with 4 or more columns.
It doesn't necessarily make additional keys for each of those columns redundant. For example, a key (a, b, c) will not be useful for a query SELECT ... WHERE b = 4. For that type of query you would rather have key (b) or key (b, c).
You need to examine your expected queries to determine which indexes you'll need. See this talk for more details: http://youtu.be/AVNjqgf7zNw
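For the queries listed in the question, a possible layout might look like this. The table name file_data and the index name are assumptions for the sketch:

```sql
CREATE TABLE file_data (
    user_id    INT UNSIGNED NOT NULL,
    project_id INT UNSIGNED NOT NULL,
    folder_id  INT UNSIGNED NOT NULL,
    file_id    INT UNSIGNED NOT NULL,
    data       TEXT NOT NULL,
    PRIMARY KEY (user_id, project_id, folder_id, file_id),
    -- Added only because a frequent query filters on these two columns,
    -- which are not a leftmost prefix of the primary key.
    KEY project_folder (project_id, folder_id)
);

-- Both can use the leftmost prefix (user_id) of the primary key.
SELECT DISTINCT project_id FROM file_data WHERE user_id = 42;
SELECT file_id FROM file_data WHERE user_id = 42;

-- Cannot use the primary key for the lookup; uses project_folder instead.
SELECT file_id FROM file_data WHERE project_id = 7 AND folder_id = 3;
```

Each further index adds write cost, so it makes sense to add one only per query pattern that actually occurs.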
Yes, this is OK if the data model supports it. You haven't shared much about your overall DB schema and how these items relate to each other, so it is hard to determine whether this is the best approach. In other words, is this truly the only way in which these four items are related to each other, or are the files really related to projects, and projects related to users, or something like that, such that splitting this up into separate join tables makes more logical sense?
If you are querying individual columns within this primary key, this might suggest to me that your schema is not quite correct. At a minimum you might need to add individual indexes on these columns to support such queries.
You're going to regret creating a compound primary key. It becomes really obnoxious to address individual rows, and secondary indexes in MySQL (InnoDB) must contain the primary key as a row identifier. You can create a compound UNIQUE index, though.
You can have a composite key with a fairly large number of components, though keep in mind the more you add the bigger the index will get and the slower it will be to update when you do an INSERT. As your database grows in size, insert operations may get cripplingly slow.
This is why, whenever possible, you should try and minimize your index size.
I am creating an InnoDB table with four columns.
Table
column_a (tiny_int)
column_b (medium_int)
column_c (timestamp)
column_d (medium_int)
Primary Key -> column_a, column_b, column_c
From a logical standpoint, columns A, B, C must be made into a PK together. However, to increase performance and be able to read directly from the index (using index), I am considering a PK that comprises all 4 columns (A, B, C, D).
QUESTION
What would the performance be of appending an additional column to the Primary Key on an Innodb table?
CONSIDERATIONS
Surrogate primary keys are absolutely out of the question
No other indexes will exist on this table
Table is read/write intensive (both about equal)
Thank you!
In InnoDB, the PRIMARY KEY index structure includes all non-key fields and will automatically use them for covering index queries and row elimination. There is no separate "data" structure other than the PRIMARY KEY index structure. It is not necessary to add additional fields to the PRIMARY KEY definition itself. Note that it won't show Using index when it's using the PRIMARY KEY on an InnoDB table, because it's a different code path which doesn't trigger the addition of that message.
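A small sketch of the table from the question with the three-column primary key. The table name readings is hypothetical, and the column types are assumed MySQL spellings of the types given in the question:

```sql
CREATE TABLE readings (
    column_a TINYINT   NOT NULL,
    column_b MEDIUMINT NOT NULL,
    column_c TIMESTAMP NOT NULL,
    column_d MEDIUMINT,
    -- column_d is not part of the key, but InnoDB stores it in the
    -- leaf pages of this clustered index alongside the key columns.
    PRIMARY KEY (column_a, column_b, column_c)
) ENGINE = InnoDB;

-- Lookup by the full key; column_d is read from the same index entry.
EXPLAIN SELECT column_d
FROM readings
WHERE column_a = 1
  AND column_b = 2
  AND column_c = '2013-01-07 00:00:00';
```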
A few things to consider:
Unless the query in question filters on a leftmost prefix of the index columns, the index will not be used for lookups.
As jeremycole notes: in the InnoDB structure, all row data is stored in the B-tree leaf nodes of the clustered index (the PRIMARY KEY).
This concept is covered:
http://www.innodb.com/wp/wp-content/uploads/2009/05/innodb-file-formats-and-source-code-structure.pdf
http://blog.johnjosephbachir.org/2006/10/22/everything-you-need-to-know-about-designing-mysql-innodb-primary-keys/
... and in jeremy's blog post here:
http://blog.jcole.us/2013/01/07/the-physical-structure-of-innodb-index-pages/
As such, a query on A, B, C will be sufficient for efficiently obtaining all values on this InnoDB table.
I have two tables, let's say they are called table A and table B. An item from table B can be present in multiple instances of A, and each A can contain multiple Bs so I have a table called a_b which links them together by their primary keys. My question is when I define this association table, should I have a primary key on the association table? Or is it not needed? Just trying to avoid ending up on TDWTF, that's all :)
The primary key would be on the table A PK column and table B PK column in your association table. That way, you ensure you don't get any duplicate rows in your association table by accident.
One of the main purposes of a primary key is to guarantee uniqueness (entity integrity). That is, it keeps the data in your table clean, with no duplicates. The PK in this case will ensure you never have 2 duplicate rows in the association table.
I think you might want to use a primary key in order to show your intent. If for example you do not want
a, b
a, b
Then a primary key defined on A.a and B.b would make that more clear. If you don't care, but you have a, b and other fields, then adding a surrogate key as your primary key might help by giving you a uniform way to delete a row that you do not want. Otherwise you will have to delete where a = a and b = b and then pick some other field value from the row you want deleted. Whereas with a surrogate key you can just pick the row and say delete where mykey = 36 or something...
But really it depends on the business case. Many intersect tables have some kind of date range, or additional fields related to the relationship in addition to the keys of the two tables. Defining a primary key on the existing columns, a new surrogate key, some unique indexes, some constraints, or even having no indexes could all be valid courses of action depending upon your needs.
I would say definitely do whatever makes your intentions the most clear.
A separate key column is not needed. Both keys should form the primary key of your association table. If you're going to be doing bidirectional navigation, consider adding an index with the keys reversed.
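As a sketch, assuming parent tables a and b each have an INT id primary key (the column and index names are hypothetical):

```sql
CREATE TABLE IF NOT EXISTS a (id INT PRIMARY KEY);
CREATE TABLE IF NOT EXISTS b (id INT PRIMARY KEY);

CREATE TABLE a_b (
    a_id INT NOT NULL,
    b_id INT NOT NULL,
    -- Prevents duplicate (a, b) pairs and serves "all Bs for a given A".
    PRIMARY KEY (a_id, b_id),
    -- Reversed index for navigating from B to A.
    KEY b_a (b_id, a_id),
    FOREIGN KEY (a_id) REFERENCES a (id),
    FOREIGN KEY (b_id) REFERENCES b (id)
);

-- All Bs linked to one A, then all As linked to one B.
SELECT b_id FROM a_b WHERE a_id = 1;
SELECT a_id FROM a_b WHERE b_id = 2;
```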
A primary key is always needed.
However, I'd say it depends on what it should be. If you are going to use some sort of ORM system (e.g. Hibernate), then it is best to have a surrogate identifier, while the two foreign keys (pointing to tables A and B) should form a unique index.
Also, if there would ever be a need to reference such a relationship from another table then this surrogate identifier would be really handy.
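The surrogate-key variant might be sketched like this. The table name a_b_surrogate and the column and index names are made up for the example:

```sql
CREATE TABLE IF NOT EXISTS a (id INT PRIMARY KEY);
CREATE TABLE IF NOT EXISTS b (id INT PRIMARY KEY);

CREATE TABLE a_b_surrogate (
    -- Surrogate identifier that an ORM or another table can reference.
    id   INT AUTO_INCREMENT PRIMARY KEY,
    a_id INT NOT NULL,
    b_id INT NOT NULL,
    -- Still guarantees that each (a, b) pair appears only once.
    UNIQUE KEY a_b_pair (a_id, b_id),
    KEY (b_id),
    FOREIGN KEY (a_id) REFERENCES a (id),
    FOREIGN KEY (b_id) REFERENCES b (id)
);

-- A single relationship row can be addressed by its surrogate id.
DELETE FROM a_b_surrogate WHERE id = 36;
```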