I have an extremely large MySQL table that I would like to partition. A simplified create of this table is as given below -
CREATE TABLE `myTable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`columnA` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`columnB` varchar(50) NOT NULL,
`columnC` int(11) DEFAULT NULL,
`columnD` varchar(255) DEFAULT NULL,
`columnE` int(11) DEFAULT NULL,
`columnF` varchar(255) DEFAULT NULL,
`columnG` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_B` (`columnB`),
UNIQUE KEY `UNIQ_B_C` (`columnB`,`columnC`),
UNIQUE KEY `UNIQ_C_D` (`columnC`,`columnD`),
UNIQUE KEY `UNIQ_E_F_G` (`columnE`,`columnF`,`columnG`)
)
I want to partition my table either by columnA or id, but the problem is that the MySQL Manual states -
In other words, every unique key on the table must use every column in the table's partitioning expression.
Which means that I cannot partition the table on either of those columns without changing my schema. For example, I have considered adding id to all my unique keys like so -
CREATE TABLE `myTable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`columnA` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`columnB` varchar(50) NOT NULL,
`columnC` int(11) DEFAULT NULL,
`columnD` varchar(255) DEFAULT NULL,
`columnE` int(11) DEFAULT NULL,
`columnF` varchar(255) DEFAULT NULL,
`columnG` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_B` (`columnB`,`id`),
UNIQUE KEY `UNIQ_B_C` (`columnB`,`columnC`,`id`),
UNIQUE KEY `UNIQ_C_D` (`columnC`,`columnD`,`id`),
UNIQUE KEY `UNIQ_E_F_G` (`columnE`,`columnF`,`columnG`,`id`)
)
Which I do not mind doing, except for the fact that it allows the creation of rows that should not be created. For example, under my original schema, the following insertion would not have worked twice -
INSERT INTO myTable (columnC, columnD) VALUES (1.0, 2.0);
But it works with the second schema, because columnC and columnD by themselves no longer form a unique key. I have considered getting around this by using triggers to prevent the creation of such rows, but then the trigger cost would reduce (or outweigh) the partitioning performance gain.
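For reference, a minimal sketch of such a trigger, covering only the UNIQ_C_D pair (the other unique keys would need similar triggers, plus BEFORE UPDATE variants; the trigger and message names here are made up):

DELIMITER $$
CREATE TRIGGER trg_mytable_uniq_c_d
BEFORE INSERT ON myTable
FOR EACH ROW
BEGIN
  -- Reject the row if this (columnC, columnD) pair already exists
  IF EXISTS (SELECT 1 FROM myTable
             WHERE columnC = NEW.columnC AND columnD = NEW.columnD) THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'Duplicate (columnC, columnD) pair';
  END IF;
END$$
DELIMITER ;

Note that, unlike a real unique index, this check is not atomic under concurrent inserts, so it would also need explicit locking to be fully safe.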
Edited:
Some additional information about this table:
The table has more than 1.2 billion records.
We are using MySQL 5.6.34 with the InnoDB engine, running on AWS RDS.
A few other indexes also exist on this table.
Because of the huge volume of data and the multiple indexes, inserting and retrieving data is expensive.
There are no unique indexes on timestamp or float data types; that was just a sample schema for illustration. Our actual table has a similar schema to the table above.
Other than partitioning, what options do we have to improve the performance of the table without losing any data and while maintaining the integrity constraints?
How do I partition a MySQL table that contains several unique keys?
Sorry to say, you don't.
Also, you shouldn't. Remember that UPDATE and INSERT operations on a table with unique keys must query the table to ensure the keys stay unique. If it were possible to partition a table so that unique keys weren't built into the partition expression, then every insert or update would require querying every partition. That would likely make partitioning worse than useless.
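To see the restriction in action, here is roughly what happens if you try it against the original schema (the exact error text varies by MySQL version):

ALTER TABLE myTable
  PARTITION BY RANGE (UNIX_TIMESTAMP(columnA)) (
    PARTITION p2016 VALUES LESS THAN (UNIX_TIMESTAMP('2017-01-01 00:00:00')),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
  );
-- ERROR 1503 (HY000): A PRIMARY KEY must include all columns in
-- the table's partitioning function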
I have those two tables schema:
CREATE TABLE `myTable` (
id int(11) NOT NULL AUTO_INCREMENT,
lat double NOT NULL,
lng double NOT NULL,
date datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
mobile bigint(11) unsigned NOT NULL,
date_updated datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `IDX_Datee` (`mobile`,`date`),
CONSTRAINT `FK_DeviceLocationss` FOREIGN KEY (`mobile`) REFERENCES `device` (`serial`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
And here is the second one:
CREATE TABLE `myTable2` (
lat double NOT NULL,
lng double NOT NULL,
date datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
mobile bigint(11) unsigned NOT NULL,
date_updated datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY `IDX_Datee2` (`mobile`,`date`),
CONSTRAINT `FK_DeviceLocationss2` FOREIGN KEY (`mobile`) REFERENCES `device` (`serial`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Each table contains around 4,000,000 records so far, and I'm trying to build the most suitable schema: one that is faster and consumes less storage.
When I check the status of each table in MySQL Workbench, I get a little confused:
First Table: [screenshot of table status]
Second Table: [screenshot of table status]
When I changed the IDX_Datee key from an ordinary index to the primary key, it no longer consumed any extra space.
I believe the second schema is better for me, but I don't have a good understanding of the difference.
Can anyone explain that?
The table is index-organized: the data records are stored in index order.
see https://dev.mysql.com/doc/refman/5.5/en/optimizing-primary-keys.html
"With the InnoDB storage engine, the table data is physically organized to do ultra-fast lookups and sorts based on the primary key column or columns"
so no extra index is necessary.
All operations (select, insert, delete, update) on a single row specified by the PK will be very fast and efficient. Drill down the BTree that contains the data and is organized by the PK, and there is the row to work with.
The PK takes a tiny amount of space, just as any BTree is slightly bigger than its leaf nodes alone. As a Rule-Of-Thumb, MySQL's BTrees (data or index) have a fanout of about 100. That is, each node has about 100 nodes under it. This implies only about 1% overhead in non-leaf nodes for the rest of the PK.
16KB (the InnoDB block size) / 61 (your approximate row size in bytes) is about 268 -- your "fanout".
For starters, I will suggest that DOUBLE (8 bytes) is gross overkill for latitude and longitude unless you are trying to distinguish one flea from another on a dog. Here is my table of representation choices for lat/lng.
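That table isn't reproduced here, but as one sketch of a more compact choice: DECIMAL with 4 decimal places gives roughly 16 m of resolution in 7 bytes per pair instead of 16:

ALTER TABLE myTable
  MODIFY lat DECIMAL(6,4) NOT NULL,   -- covers -90..90 in 3 bytes
  MODIFY lng DECIMAL(7,4) NOT NULL;   -- covers -180..180 in 4 bytes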
INT is 4 bytes. If you are sure you won't go past 16 million, change the PK to MEDIUMINT UNSIGNED (3 bytes). (I suggest this is too risky.)
The size of the PK is doubly important because it is included in every secondary key.
If (mobile, date) is unique, then it may as well be the PK. That shaves off two copies of id and speeds up queries based on mobile.
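A sketch of that change (it rebuilds the 4,000,000-row table, so expect it to take a while; the FK on mobile stays satisfied because mobile is the leftmost column of the new PK):

ALTER TABLE myTable
  DROP PRIMARY KEY,
  DROP COLUMN id,
  DROP KEY IDX_Datee,               -- redundant once (mobile, date) is the PK
  ADD PRIMARY KEY (mobile, date);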
If mobile contains phone numbers, some numbers won't fit. You would be better off with DECIMAL(11,0), which takes 5 bytes; DECIMAL(13,0) takes 6. If, instead, mobile is an AUTO_INCREMENT in some other table, then perhaps even SMALLINT UNSIGNED (2 bytes per copy, per table) would be better.
Your first table has 4 extra column copies (relative to the second table): id twice (once in the data BTree, once in the secondary index), plus the copies of mobile and date in the secondary index.
I need to give my website users the ability to select their country, province and city. So I want to display a list of countries, then a list of provinces in the selected country, then a list of cities in the selected province (I don't want any other UI solution for now). Of course, every name must be in the user's language, so I need additional tables for the translations.
Let's focus on the case of the cities. Here are the two tables:
CREATE TABLE `city` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`province_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_fk_city_province` (`province_id`),
CONSTRAINT `fk_city_province` FOREIGN KEY (`province_id`) REFERENCES `province` (`id`)
) ENGINE=InnoDB;
CREATE TABLE `city_translation` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`city_id` int(10) unsigned NOT NULL,
`locale_id` int(10) unsigned DEFAULT NULL,
`name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_fk_city_translation_city` (`city_id`),
KEY `idx_fk_city_translation_locale` (`locale_id`),
KEY `idx_city_translation_city_locale` (`city_id`,`locale_id`),
CONSTRAINT `fk_city_translation_city` FOREIGN KEY (`city_id`) REFERENCES `city` (`id`),
CONSTRAINT `fk_city_translation_locale` FOREIGN KEY (`locale_id`) REFERENCES `locale` (`id`)
) ENGINE=InnoDB;
The city table contains 4 million rows and the city_translation table 4 million × the number of languages available on my website. That is 12 million now. If in the future I want to support 10 languages, it will be 40 million...
So I am wondering: is it a bad idea (performance wise) to work with a table of this size, or is a good index (here on the join fields, city_id and locale_id) sufficient to make the size not matter?
If not, what are the common solutions used to solve this specific --but I guess common-- problem? I'm only interested in performance. I'm ok to denormalize if necessary, or even to use other tools if they are more appropriate (ElasticSearch?).
Get rid of id in city_translation. Instead, have PRIMARY KEY(city_id, locale_id). With InnoDB, this may double the speed by cutting out an unnecessary step in the JOINs. And you can shrink the disk footprint by also removing the two indexes starting with city_id.
Do you think you will go beyond 16M cities? I doubt it. So save one byte by changing (in all tables) city_id to MEDIUMINT UNSIGNED.
Save 3 bytes by changing locale_id to TINYINT UNSIGNED.
Those savings are multiplied by the number of columns and indexes mentioning them.
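Putting those three suggestions together, city_translation would look roughly like this (a sketch; city.id and locale.id must be changed to matching types for the foreign keys to line up):

CREATE TABLE `city_translation` (
  `city_id` mediumint unsigned NOT NULL,
  `locale_id` tinyint unsigned NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`city_id`, `locale_id`),   -- replaces id and both city_id indexes
  CONSTRAINT `fk_city_translation_city` FOREIGN KEY (`city_id`) REFERENCES `city` (`id`),
  CONSTRAINT `fk_city_translation_locale` FOREIGN KEY (`locale_id`) REFERENCES `locale` (`id`)
) ENGINE=InnoDB;

MySQL will still create the index it needs on locale_id for the second foreign key automatically.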
How big are the tables (GB)? What is the setting of innodb_buffer_pool_size? How much RAM is there? See if you can make that setting bigger than the total table size and yet no more than 70% of available memory. (That's the only "tunable" that is worth checking.)
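Both numbers are easy to check from SQL (a sketch; information_schema sizes are approximate):

SELECT @@innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gb;

SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS total_gb
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY total_gb DESC;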
I hope you have a default of CHARACTER SET utf8mb4 for the sake of Chinese users. (But that is another story.)
I'm having some problems creating a foreign key to an existing table in a MySQL database.
I have the table exp:
+-------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+-------+
| EID | varchar(45) | NO | PRI | NULL | |
| Comment | text | YES | | NULL | |
| Initials | varchar(255) | NO | | NULL | |
| ExpDate | date | NO | | NULL | |
| InsertDate | date | NO | | NULL | |
| inserted_by | int(11) unsigned | YES | MUL | NULL | |
+-------------+------------------+------+-----+---------+-------+
and I want to create a new table called sample_df referencing it, using the following:
CREATE TABLE sample_df (
df_id mediumint(5) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
sample_type mediumint(5) UNSIGNED NOT NULL,
df_10 boolean NOT NULL,
df_100 boolean NOT NULL,
df_1000 boolean NOT NULL,
df_above_1000 boolean NOT NULL,
target int(11) UNSIGNED NOT NULL,
assay mediumint(5) UNSIGNED ZEROFILL NOT NULL,
insert_date timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
inserted_by int(11) UNSIGNED NOT NULL,
initials varchar(255),
experiment varchar(45),
CONSTRAINT FOREIGN KEY (inserted_by) REFERENCES user (iduser),
CONSTRAINT FOREIGN KEY (target) REFERENCES protein (PID),
CONSTRAINT FOREIGN KEY (sample_type) REFERENCES sample_type (ID),
CONSTRAINT FOREIGN KEY (assay) REFERENCES assays (AID),
CONSTRAINT FOREIGN KEY (experiment) REFERENCES exp (EID)
);
But I get the error:
ERROR 1215 (HY000): Cannot add foreign key constraint
To get some more information, I did:
SHOW ENGINE INNODB STATUS\G
From which I got:
FOREIGN KEY (experiment) REFERENCES exp (EID)
):
Cannot find an index in the referenced table where the
referenced columns appear as the first columns, or column types
in the table and the referenced table do not match for constraint.
To me, the column types seem to match, since they are both varchar(45). (I also tried setting the experiment column to not null, but this didn't fix it). So I guess the problem must be that
Cannot find an index in the referenced table where the referenced columns appear as the first columns.
But I'm not quite sure what this means, or how to check/fix it. Does anyone have any suggestions? And what is meant by first columns?
Just throwing this into the mix of possible causes: I ran into this when the referencing table column had the same "type" but not the same signedness.
In my case, the referenced table column was TINYINT UNSIGNED and my referencing table column was TINYINT SIGNED. Aligning both columns solved the issue.
This error can also occur if the referenced table and the current table don't have the same character set.
According to http://dev.mysql.com/doc/refman/5.5/en/create-table-foreign-keys.html
MySQL requires indexes on foreign keys and referenced keys so that
foreign key checks can be fast and not require a table scan. In the
referencing table, there must be an index where the foreign key
columns are listed as the first columns in the same order.
InnoDB permits a foreign key to reference any index column or group of
columns. However, in the referenced table, there must be an index
where the referenced columns are listed as the first columns in the
same order.
So if the index on the referenced table exists but consists of several columns, and the desired column is not the first of them, the error will occur.
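You can check this quickly with SHOW INDEX; the referenced column must show up with Seq_in_index = 1 in some index:

SHOW INDEX FROM exp;
-- EID is the PRIMARY KEY here, so it does appear first; if it does in
-- your case too, look for a type/charset/collation mismatch instead.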
The cause of our error was due to violation of following rule:
Corresponding columns in the foreign key and the referenced key must
have similar data types. The size and sign of integer types must be
the same. The length of string types need not be the same. For
nonbinary (character) string columns, the character set and collation
must be the same.
As mentioned by @Anton, this can be caused by a data type mismatch.
In my case, I had a BIGINT(20) primary key and tried to create a foreign key referencing it as INT(10).
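The fix is to make the two columns identical in size and signedness before adding the constraint, e.g. (hypothetical table and column names):

-- parent.id is BIGINT(20); the child column must match exactly
ALTER TABLE child
  MODIFY parent_id BIGINT(20) NOT NULL;  -- was INT(10)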
Mine was a collation mismatch between the referenced table and the table to be created, so I had to explicitly set the collation of the key I was referencing.
First, I ran a query on the referenced table to get its collation:
show table STATUS like '<table_name_here>';
Then I copied the collation and explicitly declared it on employee_id in the creation query. In my case it was utf8_general_ci:
CREATE TABLE dbo.sample_db
(
id INT PRIMARY KEY AUTO_INCREMENT,
event_id INT SIGNED NOT NULL,
employee_id varchar(45) COLLATE utf8_general_ci NOT NULL,
event_date_time DATETIME,
CONSTRAINT sample_db_event_event_id_fk FOREIGN KEY (event_id) REFERENCES event (event_id),
CONSTRAINT sample_db_employee_employee_id_fk FOREIGN KEY (employee_id) REFERENCES employee (employee_id)
);
In my case, it turned out the referenced column wasn't declared primary or unique.
https://stackoverflow.com/a/18435114/1763217
For me it was just the charset and collation of the DB. I changed to utf8_unicode_ci and it worked.
In my case, it was an incompatibility of ENGINE and COLLATE; once I added ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci, it worked:
CREATE TABLE `some_table` (
`id` varchar(36) NOT NULL,
`col_id` varchar(36) NOT NULL,
PRIMARY KEY (`id`),
CONSTRAINT `FK_some_table_cols_col_id` FOREIGN KEY (`col_id`) REFERENCES `ref_table` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
The exact order of the primary key also needs to match with no extra columns in between.
I had a primary key setup where the column order actually matched, but the problem was that the primary key had an extra column in it that was not part of the referencing table's foreign key.
e.g.) table 2, column (a, b, c) -> table 1, column (a, b, d, c) -- THIS FAILS
I had to reorder the primary key columns so that not only are they ordered the same way, but there are no extra columns in the middle:
e.g.) table 2, column (a, b, c) -> table 1, column (a, b, c, d) -- THIS SUCCEEDS
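In other words, the foreign key columns must form a contiguous leftmost prefix of the referenced index. A sketch of the fix for the example above (hypothetical tables):

-- table 2's FK (a, b, c) needs (a, b, c) as the leading columns of
-- an index on table 1, so move d to the end:
ALTER TABLE table1
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (a, b, c, d);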
I had this error as well, and none of the answers pertained to me. In my case, my GUI automatically created the table with the primary unique identifier as "unassigned". This failed when I tried to create a foreign key, giving the exact same error. The primary key needs to be assigned.
If you write the SQL yourself, like id int unique auto_increment, you don't have this issue, but for some reason my GUI did this instead: id int unassigned unique auto_increment.
Hope this helps someone else down the road.
In my case, the id was created as an integer, while the referencing table created its foreign key as a bigint by default.
This caused a big nightmare in my Rails app, as the migration failed but the fields were actually created in the DB, so they showed up in the DB but not in the Rails app's schema.
Referencing the same column more than once in the same constraint also produces this Cannot find an index in the referenced table error, but can be difficult to spot on large tables. Split up the constraints and it will work as expected.
In some cases, I had to make the referenced field unique on top of defining it as the primary key.
But I found that not defining it as unique doesn't cause a problem in every case; I have not been able to figure out the exact scenarios. It probably has something to do with the nullable definition.
Just to throw another solution into the mix: I had ON DELETE set to SET NULL, but the field I was putting the foreign key on was NOT NULL, so making it nullable allowed the foreign key to be created.
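For illustration (hypothetical names), ON DELETE SET NULL can only work if the foreign key column is actually allowed to hold NULL:

ALTER TABLE child
  MODIFY parent_id INT UNSIGNED NULL;  -- was NOT NULL

ALTER TABLE child
  ADD CONSTRAINT fk_child_parent
  FOREIGN KEY (parent_id) REFERENCES parent (id)
  ON DELETE SET NULL;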
As others have said, the following things can be an issue:
Field Length - INT -> BIGINT, VARCHAR(20) -> VARCHAR(40)
Unsigned - UNSIGNED -> Signed
Mixed Collations
And just to add to this, I had the same issue today.
Both fields were ints of the same length, etc.; however, one was unsigned, and that was enough to break it.
Both needed to be declared unsigned.
I had the same problem and solved it by writing this piece of code in the OnModelCreating method. My problem was completely solved and my tables and migrations were created without errors. Please try it:
var cascadeFKs = modelBuilder.Model.GetEntityTypes()
.SelectMany(t => t.GetForeignKeys())
.Where(fk => !fk.IsOwnership && fk.DeleteBehavior == DeleteBehavior.Cascade);
foreach (var fk in cascadeFKs)
fk.DeleteBehavior = DeleteBehavior.Restrict;
It is mostly because the old table you are referring to does not have a data type / collation / engine compatible with the new table. The way to detect the difference is to dump the old table and see exactly how the database defines it:
mysqldump -uroot -p -h127.0.0.1 your_database your_old_table > dump_table.sql
The dump gives you enough information to compare:
create table your_old_table
(
  `id` varchar(32) not null
) Engine = InnoDB DEFAULT CHARSET=utf8mb3;
This only works if you have permission to dump your table schema.
I spent hours trying to get this to work. It turned out I had an older version of HeidiSQL, 11.0.0.5919, which was not displaying the UNSIGNED attribute in the CREATE TABLE statement I had copied from (a Heidi bug). It couldn't be seen in the table design view either.
So the original table had an UNSIGNED attribute, but my foreign key didn't. The solution was upgrading Heidi and adding the UNSIGNED attribute in the CREATE TABLE:
CREATE TABLE `extension` (
`ExtensionId` INT NOT NULL,
`AccountId` INT NOT NULL,
PRIMARY KEY (`ExtensionId`) USING BTREE,
INDEX `AccountId` (`AccountId`) USING BTREE,
CONSTRAINT `AccountId` FOREIGN KEY (`AccountId`) REFERENCES `accounts` (`AccountId`) ON UPDATE NO ACTION ON DELETE NO ACTION
)
COLLATE='utf8mb4_0900_ai_ci'
ENGINE=InnoDB
;
Change to:
`AccountId` INT UNSIGNED NOT NULL,
If the data type is the same, the error is probably due to a different charset and collation. Try altering the column to the referenced column's character set and collation with a query like this:
ALTER TABLE table_name MODIFY COLUMN
column_name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci;
and then run the foreign key query again.
(Altering the charset of the table and database didn't fix the incompatibility error for me until I altered the specific column, as suggested above.)
In my case I referred directly to a PRIMARY KEY and got the error shown above. After adding a "normal" index in addition to my primary key, it worked:
ALTER TABLE `TableName` ADD INDEX `id` (`id`);
Now it looks like this: [screenshot of the table's indexes]
When I try to drop the INDEX id again, I get the following error:
(1553): Cannot drop index 'id': needed in a foreign key constraint.
EDIT:
This is not a new question; I just want to show how I solved this problem and what kinds of problems may occur.
I ran a comparison INSERTing rows into an empty table using MySQL 5.6.
Each table contained a column (ascending) that was incremented serially by AUTO_INCREMENT, and a pair of columns (random_1, random_2) that received random, unique numbers.
In the first test, ascending was PRIMARY KEY and (random_1, random_2) were KEY. In the second test, (random_1, random_2) were PRIMARY KEY and ascending was KEY.
CREATE TABLE clh_test_pk_auto_increment (
ascending_pk BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, -- PK
random_ak_1 BIGINT UNSIGNED NOT NULL, -- AK1
random_ak_2 BIGINT UNSIGNED, -- AK2
payload VARCHAR(40),
PRIMARY KEY ( ascending_pk ),
KEY ( random_ak_1, random_ak_2 )
) ENGINE=MYISAM
AUTO_INCREMENT=1
;
CREATE TABLE clh_test_auto_increment (
ascending_ak BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, -- AK
random_pk_1 BIGINT UNSIGNED NOT NULL, -- PK1
random_pk_2 BIGINT UNSIGNED, -- PK2
payload VARCHAR(40),
PRIMARY KEY ( random_pk_1, random_pk_2 ),
KEY ( ascending_ak )
) ENGINE=MYISAM
AUTO_INCREMENT=1
;
Consistently, the second test (where the auto-increment column is not the PRIMARY KEY) runs slightly faster -- 5-6%. Can anyone speculate as to why?
Primary keys often determine the sequence in which the data is actually stored. If the primary key is incrementing, new data is simply appended. If the primary key is random, existing data must be shifted about to get the new row into the proper sequence. A basic (non-primary-key) index is typically much lighter in content and can be moved around faster with less overhead.
I know this to be true for other DBMSs; I would venture to guess that MySQL works similarly in this respect.
UPDATE
As stated by @BillKarwin in the comments below, this theory does not hold for MyISAM tables. As a follow-up theory, I'd refer to @KevinPostlewaite's answer below (which he has since deleted): the issue is the lack of AUTO_INCREMENT on a PRIMARY KEY, which must be unique. With AUTO_INCREMENT it is easier to determine that the values are unique, since they are guaranteed to be incremental. With random values, it may take some time to actually walk the index to make this determination.