What does changing the primary key of a table incur in MySQL? - mysql

In MySQL:
1) Whenever I change the primary key of a table, is it correct that the original index on the original primary key will be removed, and a new index on the new primary key will be created?
2) Is an index based on a primary key always clustered? If so, when the primary key of a table is changed, are the table's records moved so that they are stored in the order of the new primary key?
Thanks.

MySQL's default storage engine is InnoDB. InnoDB always stores a table as a clustered index, using the primary key as the clustered index. See https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html for details on this.
If you change the columns defined for your table's primary key, like the following for example:
ALTER TABLE MyTable DROP PRIMARY KEY, ADD PRIMARY KEY (id2);
This will require all the pages of that table to be copied to a new layout, using the newly defined primary key as the clustered index.
This isn't the only operation that requires a table copy. Most ALTER TABLE operations that change the size of a row also perform a table copy, e.g. adding or dropping a column, changing a data type (with some exceptions), or changing the nullability of a column; exactly which ones do depends on the MySQL version. See https://dev.mysql.com/doc/refman/8.0/en/innodb-create-index-overview.html for details.
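As a rough way to check whether a particular change needs a rebuild, you can request a specific algorithm and let MySQL refuse if it can't comply. The sketch below assumes MySQL 8.0.12 or later (where ALGORITHM=INSTANT exists) and a hypothetical layout for MyTable; every column name other than id2 is made up for illustration.

```sql
-- Hypothetical table definition, for illustration only
CREATE TABLE MyTable (
  id  INT NOT NULL,
  id2 INT NOT NULL,
  val VARCHAR(30),
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Probe: ask for a metadata-only change. This is expected to fail with an
-- error, because changing the clustered index means rewriting the rows.
-- (In the interactive client, the next statement still runs after the error.)
ALTER TABLE MyTable DROP PRIMARY KEY, ADD PRIMARY KEY (id2), ALGORITHM=INSTANT;

-- Explicitly request an in-place rebuild; the table data is still reorganized
-- around the new primary key, but without the older copy-to-tmp-table path.
ALTER TABLE MyTable DROP PRIMARY KEY, ADD PRIMARY KEY (id2), ALGORITHM=INPLACE;
```

If the requested algorithm isn't supported for the operation, the statement fails instead of silently falling back to another algorithm, so a failed probe leaves the table unchanged.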
P.S.: I don't bother to answer about MyISAM storage engine anymore. It's on its way to being deprecated. The sooner people stop considering MyISAM as a viable option, the better.

To answer the questions you asked:
1) Yes, a new index will be created for the new primary key.
2) For tables using the InnoDB storage engine, yes, the table will be reorganized with the new primary key as the cluster key (rows will be stored in index order by primary key; the table itself is organized as an index.) For tables using MyISAM storage engine, no.

Related

Does INDEX() create a clustered or non-clustered index in MySQL?

I am learning from a tutorial that uses INDEX() within a CREATE TABLE statement, but does not explain whether it is clustered or non-clustered. My question is: does INDEX() when used in a CREATE TABLE statement result in a clustered or non-clustered index?
For example:
CREATE TABLE test (a varchar(30), b varchar(30), index(a));
/* Is column A a clustered or non-clustered index? */
Also wondering how to do the opposite as well: if the example results in a non-clustered index, how do you write a clustered index, and vice versa?
TL;DR The primary key - and only the primary key - is a clustered index. If you don't explicitly define a primary key, the first suitable UNIQUE key is used. If you don't have either a primary key or a suitable UNIQUE key, MySQL generates a hidden clustered index. You cannot create a clustered index using INDEX().
As explained in the docs (emphasis added):
Every InnoDB table has a special index called the clustered index where the data for the rows is stored. Typically, the clustered index is synonymous with the primary key.
...
When you define a PRIMARY KEY on your table, InnoDB uses it as the clustered index. Define a primary key for each table that you create. If there is no logical unique and non-null column or set of columns, add a new auto-increment column, whose values are filled in automatically.
If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.
If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.
...
All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.
See also the definition of clustered index in the glossary, which defines it as "The InnoDB term for a primary key index," along with some additional details.
So, to answer your question, there's no way to create a clustered index, other than to create a primary key or, on a table without a primary key, a suitable UNIQUE key (all key columns NOT NULL). INDEX() just creates a secondary (i.e., non-clustered) key, no matter what you do with it.
* Note: as pointed out in the comments, some other databases don't have clustered indexes at all, and some allow more than one clustered index on a table. I'm only addressing MySQL in my answer.
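The three cases from the docs can be sketched as follows. The table names, column names, and the mydb schema are all hypothetical. The inspection query assumes MySQL 8.0, where the InnoDB dictionary views are named INNODB_TABLES and INNODB_INDEXES (5.7 uses INNODB_SYS_TABLES and INNODB_SYS_INDEXES), and it needs the PROCESS privilege.

```sql
-- 1) Explicit PRIMARY KEY: id is the clustered index, INDEX(a) is secondary
CREATE TABLE t_pk (
  id INT NOT NULL,
  a  VARCHAR(30),
  PRIMARY KEY (id),
  INDEX (a)
) ENGINE=InnoDB;

-- 2) No PRIMARY KEY, but a UNIQUE key whose columns are all NOT NULL:
--    that key is used as the clustered index, INDEX(a) is still secondary
CREATE TABLE t_uniq (
  code INT NOT NULL,
  a    VARCHAR(30),
  UNIQUE KEY (code),
  INDEX (a)
) ENGINE=InnoDB;

-- 3) Neither: InnoDB generates a hidden clustered index on a row ID,
--    and INDEX(a) is once again a secondary index
CREATE TABLE t_none (
  a VARCHAR(30),
  b VARCHAR(30),
  INDEX (a)
) ENGINE=InnoDB;

-- List the indexes InnoDB actually created for the three tables
SELECT t.NAME AS table_name, i.NAME AS index_name, i.TYPE
FROM information_schema.INNODB_INDEXES AS i
JOIN information_schema.INNODB_TABLES AS t ON t.TABLE_ID = i.TABLE_ID
WHERE t.NAME IN ('mydb/t_pk', 'mydb/t_uniq', 'mydb/t_none');
```

As I read the documentation for INNODB_INDEXES, TYPE values 1 and 3 mark clustered indexes (generated and user-defined respectively) and 0 and 2 mark secondary ones, so the index created by INDEX(a) should show up as secondary in every case.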
Is column A a clustered or non-clustered index?
It's a non-clustered index; only the primary key gets a clustered index. Remember, a table can have only one clustered index, so INDEX() definitely can't create one.

How to find out size of indexes in mysql (including primary keys)

The two common answers are to use SHOW TABLE STATUS and INFORMATION_SCHEMA.TABLES.
But it seems that neither of them counts the primary key's size.
I have tables with millions of records, a primary key, and no other indexes, and both of the methods mentioned above show Index_length: 0 for those tables. The tables are InnoDB.
Your primary key is your table. In an InnoDB table, the primary key index contains the actual row data, so the primary key effectively is the table.
Think about it for a moment. There are two different types of indexes on an InnoDB table: clustered and secondary. The difference is that a clustered index contains the data, while a secondary index contains the indexed columns and a pointer to the data. Thus a secondary index does not contain the data itself but rather the location of the data in the clustered index.
Normally a primary key is a clustered index. It would be highly inefficient to store both the table with all its values and then a clustered index with all its values. This would effectively double the size of the table.
So when an InnoDB table has a primary key, the size of the table is the size of the primary key index. In some database systems you can have a secondary index as the primary key and a separate index as the clustered key; however, InnoDB does not allow this.
Go read the following links for more details:
http://dev.mysql.com/doc/refman/5.0/en/innodb-table-and-index.html
http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
These links explain everything I have said above in more detail. Simply put, you already have the size of the primary key index: it is the size of your table.
Hope that helps.
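For completeness, here is a sketch of two queries that show this. The schema name mydb and table name my_table are placeholders. The second query assumes InnoDB persistent statistics are enabled (the default in recent versions) and that you are allowed to read the mysql schema; the sizes are estimates taken from the statistics, not exact byte counts.

```sql
-- Data_length covers the clustered index (the primary key plus the rows);
-- Index_length covers secondary indexes only, hence 0 when there are none.
SELECT table_name, data_length, index_length
FROM information_schema.TABLES
WHERE table_schema = 'mydb'
  AND table_name = 'my_table';

-- Per-index size estimate in bytes, including the PRIMARY entry.
-- stat_value for stat_name = 'size' is a page count, so multiply by page size.
SELECT index_name,
       stat_value * @@innodb_page_size AS size_bytes
FROM mysql.innodb_index_stats
WHERE database_name = 'mydb'
  AND table_name = 'my_table'
  AND stat_name = 'size';
```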

Why does MySQL use a temporary table to drop a primary key?

When using the command:
ALTER TABLE my_table DROP PRIMARY KEY;
The state shown by SHOW PROCESSLIST is:
copy to tmp table
Why would it need to use a tmp table to "drop" a primary key constraint?
Consider the case of a composite primary key. In this case, the DB engine has to create a new clustered index from a synthetic key, which will require moving rows around. (Keep in mind that rows are physically ordered on disk by the primary key.) Given the rarity of this situation, it's not really worth handling the special case where your primary key is already an integer.
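A minimal sketch of the composite-key case, using a hypothetical definition of my_table (the column names are assumptions). As far as I can tell from the MySQL 8.0 online DDL documentation, dropping a primary key without adding a new one in the same statement is only supported with ALGORITHM=COPY, whereas dropping and adding in a single statement can be done as one in-place rebuild.

```sql
-- Hypothetical table with a composite primary key
CREATE TABLE my_table (
  order_id INT NOT NULL,
  line_no  INT NOT NULL,
  qty      INT,
  PRIMARY KEY (order_id, line_no)
) ENGINE=InnoDB;

-- Drop only: the rows must be re-clustered on a hidden row ID,
-- so the whole table is copied.
ALTER TABLE my_table DROP PRIMARY KEY, ALGORITHM=COPY;

-- Adding a key again afterwards rebuilds the table a second time.
ALTER TABLE my_table ADD PRIMARY KEY (line_no, order_id);

-- Alternative to the two statements above: swap the key in one statement,
-- which rebuilds the table once.
-- ALTER TABLE my_table DROP PRIMARY KEY, ADD PRIMARY KEY (line_no, order_id);
```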

Is there a performance benefit to creating a multiple index on a primary key + foreign key?

If I have a table that has a primary key and a foreign key, and searches are frequently done with queries that include both (...WHERE primary=n AND foreign=x), is there any performance benefit to making a multiple index in MySQL using the two keys?
I understand that they are both indexes already, but I am uncertain if the foreign key is still seen as an index when included in another table. For example, would MySQL go to the primary key, and then compare all values of the foreign key until the right one is found, or does it already know where it is because the foreign key is also an index?
Update: I am using InnoDB tables.
For equality comparisons, you cannot get an improvement over the primary key index (because at that point, there is at most just one row that can match).
The access path would be:
1) Look up primary = n in the primary key index.
2) Get the single matching row from the table.
3) Check any other conditions using the row in the table.
A composite index might make some sense if you have a range scan on the primary key and want to narrow that down by the other column.
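A minimal sketch of both cases, using a hypothetical child table (all names are assumptions, and the foreign key constraint itself is left out to keep the example self-contained). EXPLAIN shows which index the optimizer picks; the actual plan depends on your data and MySQL version, so treat this as something to try rather than a guaranteed result.

```sql
-- Hypothetical table: primary key plus an indexed "foreign key" column
CREATE TABLE child (
  id        INT NOT NULL,
  parent_id INT NOT NULL,
  payload   VARCHAR(30),
  PRIMARY KEY (id),
  KEY idx_parent (parent_id)
) ENGINE=InnoDB;

-- Equality on the primary key: at most one row can match,
-- so no extra index can narrow it further.
EXPLAIN SELECT * FROM child WHERE id = 42 AND parent_id = 7;

-- Range on the primary key combined with equality on the other column:
-- this is where an index on both columns might help.
EXPLAIN SELECT * FROM child WHERE id BETWEEN 1000 AND 2000 AND parent_id = 7;

-- Explicit composite index to compare against
ALTER TABLE child ADD INDEX idx_parent_id (parent_id, id);
EXPLAIN SELECT * FROM child WHERE id BETWEEN 1000 AND 2000 AND parent_id = 7;
```

In InnoDB every secondary index record already carries the primary key columns, so idx_parent may serve the range query about as well as the explicit composite index; compare the EXPLAIN output before deciding to keep the extra index.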

How do I stop DataContext.CreateDatabase creating a clustered index for the primary key of a table?

When using DataContext.CreateDatabase() to create a database, I wish to stop Linq To Sql creating a clustered index on the primary key of a table.
This is because I wish to create a normal index for the primary key, as I need the clustered index to speed up range queries on a date field.
It seems that you can't control what DataContext.CreateDatabase does.