I'm trying to insert (from Postgres via Grails) about 10 million records into a table with a primary key and 2 foreign keys. If I keep all the primary and foreign keys, and the indexes automatically generated along with them, the load takes about 7.5 hours to complete. If I drop all the keys and indexes before the inserts, it takes only 10 minutes to execute all the inserts. But when I used ALTER TABLE to add the keys back in, it took forever (more than 7 hours). Is there a way to improve the performance?
The concept table that this table links to has about 1 million records.
Here's the CREATE TABLE statement:
CREATE TABLE `concept_relationship` (
`concept_id_1` int(11) NOT NULL,
`concept_id_2` int(11) NOT NULL,
`relationship_id` int(11) NOT NULL,
`valid_start_date` date NOT NULL,
`valid_end_date` date NOT NULL DEFAULT '2099-12-31',
`invalid_reason` char(1) DEFAULT NULL,
PRIMARY KEY (`concept_id_1`,`concept_id_2`,`relationship_id`),
KEY `concept_id_1` (`concept_id_1`),
KEY `concept_id_2` (`concept_id_2`),
KEY `relationship_id` (`relationship_id`),
CONSTRAINT `FK_CONCEPT_REL_child` FOREIGN KEY (`concept_id_2`) REFERENCES `concept` (`concept_id`),
CONSTRAINT `FK_CONCEPT_REL_Parent` FOREIGN KEY (`concept_id_1`) REFERENCES `concept` (`concept_id`),
CONSTRAINT `FK_CONCEPT_REL_REL_TYPE` FOREIGN KEY (`relationship_id`) REFERENCES `relationship` (`relationship_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Thanks for your help
First, the index concept_id_1 is not needed. The primary key covers this index entirely.
My suggestion is to create the table without the keys or foreign references, except for the primary key. When you insert into the table, be sure that the input data is sorted by the keys of the primary key. Then add back the other keys with explicit index creation:
create index concept_relationship_idx1 on concept_relationship(concept_id_2);
And so on.
If this doesn't work efficiently, then reconsider the primary key. InnoDB stores the rows physically ordered by the primary key, which can be computationally intensive for inserts that arrive out of order. Add an auto-incremented primary key instead. Insert the data. Then create a unique index on the columns that currently make up the primary key, and indexes for the other keys.
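The load order suggested above could be sketched roughly as follows. This is only an outline under assumptions: concept_relationship_staging is a hypothetical table that already holds the rows to load, and the index names are made up for illustration.

```sql
-- 1. Create the table with only the primary key: no secondary indexes, no foreign keys.
CREATE TABLE concept_relationship (
  concept_id_1     int NOT NULL,
  concept_id_2     int NOT NULL,
  relationship_id  int NOT NULL,
  valid_start_date date NOT NULL,
  valid_end_date   date NOT NULL DEFAULT '2099-12-31',
  invalid_reason   char(1) DEFAULT NULL,
  PRIMARY KEY (concept_id_1, concept_id_2, relationship_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- 2. Load the rows sorted by the primary key columns.
--    concept_relationship_staging is a hypothetical source table.
INSERT INTO concept_relationship
  (concept_id_1, concept_id_2, relationship_id,
   valid_start_date, valid_end_date, invalid_reason)
SELECT concept_id_1, concept_id_2, relationship_id,
       valid_start_date, valid_end_date, invalid_reason
FROM concept_relationship_staging
ORDER BY concept_id_1, concept_id_2, relationship_id;

-- 3. Add the secondary indexes after the load, in a single statement.
--    No index on concept_id_1 alone: the primary key already leads with it.
ALTER TABLE concept_relationship
  ADD INDEX concept_relationship_idx2 (concept_id_2),
  ADD INDEX concept_relationship_idx3 (relationship_id);

-- 4. Add the foreign keys last, once the supporting indexes exist.
ALTER TABLE concept_relationship
  ADD CONSTRAINT FK_CONCEPT_REL_Parent
    FOREIGN KEY (concept_id_1) REFERENCES concept (concept_id),
  ADD CONSTRAINT FK_CONCEPT_REL_child
    FOREIGN KEY (concept_id_2) REFERENCES concept (concept_id),
  ADD CONSTRAINT FK_CONCEPT_REL_REL_TYPE
    FOREIGN KEY (relationship_id) REFERENCES relationship (relationship_id);
```

Whether step 4 is fast enough depends on the server version and settings, so it is worth timing on a copy of the data first.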
Related
Got an odd problem I can't solve after browsing dozens of forum posts and my local SQL books.
I've got two tables, and want to add a foreign key to one of them. The foreign key and primary key share the same datatype and charset and yet I cannot add the Foreign Key at all.
addon_account
name          type          comments
id            int(11)       Primary Key
name          varchar(60)   Primary Key
label         varchar(255)
shared        int(11)

addon_account_data
name          type          comments
id            int(11)       Primary Key
account_name  varchar(60)   Primary Key
money         double
owner         varchar()
The query I ran:
ALTER TABLE `addon_account_data` ADD FOREIGN KEY (`account_name`) REFERENCES `addon_account`(`name`) ON DELETE RESTRICT ON UPDATE RESTRICT;
Can't get it to work. Tosses out the same issue the entire time.
You are creating a foreign key on addon_account_data(account_name) that references addon_account(name). However, the referenced table has a composite primary key: addon_account(id, name). The referenced column name is not the first column of that key.
This is not allowed in MySQL, as explained in the documentation:
MySQL requires indexes on foreign keys and referenced keys so that foreign key checks can be fast and not require a table scan. In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order.
Possible solutions:
- add an additional column in the referencing table, giving addon_account_data(account_id, account_name), and create a composite foreign key to the corresponding columns in addon_account
- create an index on addon_account(name) (probably the simplest solution)
- change the order of the columns in the primary key of the referenced table, like addon_account(name, id) (you might want to first consider the impact this may have on performance)
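The second option might look like the sketch below. The index and constraint names are invented for illustration, and the sketch assumes that addon_account.name contains no duplicate values. A unique index is used so that each child row points at exactly one parent row.

```sql
-- Assumption: no two rows in addon_account share the same name;
-- if duplicates exist, this statement fails.
CREATE UNIQUE INDEX idx_addon_account_name ON addon_account (name);

-- With name now the leading column of an index, the original
-- foreign key definition has something to reference.
ALTER TABLE addon_account_data
  ADD CONSTRAINT fk_addon_account_data_name
  FOREIGN KEY (account_name) REFERENCES addon_account (name)
  ON DELETE RESTRICT ON UPDATE RESTRICT;
```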
I am not exactly a MySQL guy, but:
I believe the problem is that you are referencing only part of the primary key:
Your table addon_account has a composite key PK(id, name).
So, to put your relationship to work, you will need to add 'account_id' as part of the foreign key as well:
ALTER TABLE addon_account_data ADD FOREIGN KEY (account_id, account_name) REFERENCES addon_account(id, name)
This thread deals with something similar.
I hope this helps.
EDITED
I have installed a MySQL server instance on my local machine... (MySQL 8).
I have run the script below, and it worked (giving warnings about integer display width being a deprecated feature, so I would recommend omitting it):
CREATE TABLE addon_account(
id INT(11) NOT NULL,
`name` VARCHAR(60) NOT NULL,
label VARCHAR(255),
shared INT(11),
CONSTRAINT pk_addon_account PRIMARY KEY(id, `name`));
CREATE TABLE addon_account_data (
id INT(11) NOT NULL,
account_name VARCHAR(60) NOT NULL,
account_id INT(11),
money DOUBLE,
`owner` VARCHAR(255),
CONSTRAINT pk_addon_account_data PRIMARY KEY(id, account_name),
CONSTRAINT fk_addon_account_account_data FOREIGN KEY(account_id, account_name)
REFERENCES addon_account(id, `name`));
Could you try it and see if this works for you?
I am not that familiar with MySQL.
Make sure that the 2 tables have the same collation, for example COLLATE='utf8_general_ci'.
I've created a table for accounts/users with a primary key (UsersID, AccountsID) like below. Should I add the index for the Users table?
create table AccountsUsers
(
AccountsID int unsigned not null,
UsersID int unsigned not null,
Roles bigint unsigned null,
primary key (UsersID, AccountsID),
constraint AccountsUsers_Accounts_ID_fk
foreign key (AccountsID) references Accounts (ID)
on update cascade on delete cascade,
constraint AccountsUsers_Users_ID_fk
foreign key (UsersID) references Users (ID)
on update cascade on delete cascade
)
engine=InnoDB
;
create index AccountsUsers_Accounts_ID_fk
on AccountsUsers (AccountsID)
;
MySQL will create the necessary indexes for the foreign key automatically, if necessary.
In the case of your foreign key on UsersID, it can use the left-most column of your primary key. It doesn't need to create a new index for that foreign key.
In the case of your foreign key on AccountsID, MySQL will create a new index automatically. It can't use the fact that AccountsID is part of your primary key, because it isn't the left-most column.
After you do the CREATE TABLE, run SHOW CREATE TABLE AccountsUsers and you should see the new index it created for AccountsID.
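A quick way to check which indexes ended up on the table, using the table name from the question:

```sql
-- Shows the full definition, including any index MySQL added for a foreign key.
SHOW CREATE TABLE AccountsUsers;

-- Lists every index on the table, one row per indexed column.
SHOW INDEX FROM AccountsUsers;
```

In the question's script the index on AccountsID is created explicitly, so MySQL has no reason to add a second one for that foreign key.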
From the documentation
MySQL requires indexes on foreign keys and referenced keys so that
foreign key checks can be fast and not require a table scan. In the
referencing table, there must be an index where the foreign key
columns are listed as the first columns in the same order. Such an
index is created on the referencing table automatically if it does not
exist. This index might be silently dropped later, if you create
another index that can be used to enforce the foreign key constraint.
index_name, if given, is used as described previously.
In other words, if you don't already have the required indexes on the columns of your referencing table (AccountsUsers), MySQL will create them for you.
If the columns in the referenced tables (Accounts and Users) are not indexed, you will get an error. Yours look like they will be primary keys on their respective tables, so you should be fine.
I am trying to insert pseudo data into my db to get going, and in one particular table I have two columns which are FK's and PK's of the table; fk_product_manf_code and fk_content_id. To my understanding, these are considered composite keys in their current state.
So I add data to the table:
fk_product_manf_code fk_content_id
NOV-ABC123 1
I then want to associate another content_id to the same product_manf_code, so I perform the following:
INSERT INTO `mydb`.`package_contents`
(`fk_product_manf_code`, `fk_content_id`)
VALUES
('NOV-ABC123', 2);
However I'm greeted with the following error:
Error Code: 1062. Duplicate entry 'NOV-ABC123' for key 'fk_product_manf_code_UNIQUE'
I don't understand what's going on, because I thought a composite key makes the combination of the 2 columns unique? So why is it kicking up a fuss about just 1 column being unique?
Here is the table CREATE statement
CREATE TABLE `package_contents` (
`fk_product_manf_code` varchar(255) NOT NULL,
`fk_content_id` int(11) NOT NULL,
PRIMARY KEY (`fk_content_id`,`fk_product_manf_code`),
UNIQUE KEY `fk_content_id_UNIQUE` (`fk_content_id`),
UNIQUE KEY `fk_product_manf_code_UNIQUE` (`fk_product_manf_code`),
CONSTRAINT `content_id` FOREIGN KEY (`fk_content_id`) REFERENCES `contents` (`content_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `product_manf_code` FOREIGN KEY (`fk_product_manf_code`) REFERENCES `products` (`product_manf_code`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
So, you are learning why composite primary keys are a pain, especially for foreign key constraints. Not only are integer keys more efficient, but a single key is easier to work with.
I would suggest changing your table structure to be more like this:
CREATE TABLE package_contents (
package_contents_id int not null auto_increment primary key,
fk_product_manf_id int NOT NULL,
fk_content_id int(11) NOT NULL,
UNIQUE KEY (fk_content_id, fk_product_manf_id),
CONSTRAINT content_id FOREIGN KEY (fk_content_id)
REFERENCES contents(content_id) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT product_manf_code FOREIGN KEY (fk_product_manf_id)
REFERENCES products(product_manf_id) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Note that I changed the manufacturer code to an id as well. This should also reduce the size of the table, assuming that the "code" is longer than 4 bytes.
If you do this for all your tables, the database will be a bit more efficient, and you won't need superfluous unique constraints. The foreign key constraints should always be to primary keys (unless there is a very good reason for using a different unique key).
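For the table exactly as posted in the question, the error itself comes from the two single-column UNIQUE keys, which each force one column to be unique on its own, regardless of the composite primary key. A minimal sketch of removing them without redesigning the table follows; the name idx_product_manf_code is made up for illustration.

```sql
-- The foreign key on fk_product_manf_code needs an index that starts with that
-- column, and the primary key starts with fk_content_id. Add a plain
-- (non-unique) index first so the unique one is no longer required.
ALTER TABLE package_contents
  ADD INDEX idx_product_manf_code (fk_product_manf_code);

-- Drop the single-column unique keys. The primary key
-- (fk_content_id, fk_product_manf_code) still keeps each pair unique.
ALTER TABLE package_contents
  DROP INDEX fk_product_manf_code_UNIQUE,
  DROP INDEX fk_content_id_UNIQUE;

-- The insert from the question should now be accepted.
INSERT INTO package_contents (fk_product_manf_code, fk_content_id)
VALUES ('NOV-ABC123', 2);
```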
In a MySQL database where there are relationships between tables and the primary key of one table is stored as a foreign key in a second table, is there still a need to perform a join?
If there is, what is the point of declaring the relationship? I'd take a stab in the dark and say it has something to do with indexing, or that related tables can find related records much faster? I've tried Googling this, but can't seem to find much. I'm sure there is loads out there on this, but I don't know the keywords to search for.
Here is an example of table 1 and table 2:
------------------- Table 1 ----------------------
CREATE TABLE IF NOT EXISTS `db_hint`.`user` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`fb_id` INT NOT NULL,
`last_logged_in` DATETIME NULL,
`permissions` INT UNSIGNED NOT NULL,
PRIMARY KEY (`id`),
INDEX `permissions_id_idx` (`permissions` ASC),
CONSTRAINT `permissions_id`
FOREIGN KEY (`permissions`)
REFERENCES `db_hint`.`permissions` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
----------------- Table 2 ----------------------
CREATE TABLE IF NOT EXISTS `db_hint`.`user_stat` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`user_id` INT UNSIGNED NOT NULL,
PRIMARY KEY (`id`),
INDEX `user_id_idx3` (`user_id` ASC),
CONSTRAINT `user_id`
FOREIGN KEY (`user_id`)
REFERENCES `db_hint`.`user` (`id`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
When performing any kind of join, does the InnoDB engine use the relationship in any way? Thanks.
The point of declaring the foreign key is to enforce data consistency.
You will still need the JOIN in order to get desired data.
In MySQL, foreign keys can improve performance only indirectly, through the index that InnoDB requires on the foreign key column; don't expect more than what that index alone would give you.
To do a query involving two tables, you need JOIN ... ON ... to say how they are related. FOREIGN KEYs are not involved in a SELECT and have zero impact on the performance of a SELECT. You do not "have to have" FOREIGN KEYs to perform SELECTs.
A FOREIGN KEY is used during INSERTs (and other writes) to verify that a subsequent JOIN will actually find something in the other table. It is an overhead during the write -- the INSERT actively checks (via an index) that the referenced table has the indicated row.
FOREIGN KEYs may also do a cascading operation. For example, a DELETE can cause another DELETE to happen. I prefer to take such control in my application code.
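Using the two tables from the question, the difference between the two roles can be sketched like this. The value 999999 is an arbitrary id that is assumed not to exist in user.

```sql
-- Reading related rows: the relationship still has to be spelled out in the JOIN.
SELECT u.id, u.last_logged_in, s.id AS stat_id
FROM db_hint.`user` AS u
JOIN db_hint.user_stat AS s ON s.user_id = u.id;

-- Writing: this is where the foreign key does its work.
-- Assuming no user has id 999999, InnoDB rejects the row with a
-- foreign key constraint error instead of storing an orphan.
INSERT INTO db_hint.user_stat (user_id) VALUES (999999);
```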
Is there an improvement in performance in indexing foreign keys in InnoDB? As far as I have read, InnoDB automatically creates an index for the foreign key.
Here is the query given to me for creating the table.
DROP TABLE IF EXISTS `assignments`;
CREATE TABLE `assignments`
(
`id` INTEGER NOT NULL AUTO_INCREMENT,
`user` INTEGER NOT NULL,
`job` INTEGER NOT NULL,
`created_at` DATETIME,
`updated_at` DATETIME,
PRIMARY KEY (`id`),
INDEX `job_fk1` (`user`),
INDEX `job_fk2` (`job`),
CONSTRAINT `job_fk1`
FOREIGN KEY (`user`)
REFERENCES `users` (`id`),
CONSTRAINT `job_fk2`
FOREIGN KEY (`job`)
REFERENCES `jobs` (`id`)
) ENGINE=InnoDB;
In there, he created foreign keys named job_fk1 and job_fk2. He used the names of these foreign keys as the name of the index.
Is there an improvement in performance in indexing foreign keys in InnoDB?
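Answer: No. Declaring the indexes explicitly gains nothing here, because InnoDB creates an index for each foreign key on its own when no usable index exists, and duplicate indexes on the same column only add write overhead.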
You do not need
INDEX `job_fk1` (`user`),
INDEX `job_fk2` (`job`),
Those will be created automatically by InnoDB internally. But you do need an index on users (id) and jobs (id) for faster operations on the assignments table.
http://dev.mysql.com/doc/refman/5.0/en/innodb-foreign-key-constraints.html
"InnoDB requires indexes on foreign keys and referenced keys so that foreign key checks can be fast and not require a table scan. In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. Such an index is created on the referencing table automatically if it does not exist. (This is in contrast to some older versions, in which indexes had to be created explicitly or the creation of foreign key constraints would fail.) index_name, if given, is used as described previously."
You are correct that MySQL will create an index on a column, if it doesn't already exist, when creating a foreign key constraint. However, feel free to create an index on the column and remove the auto-generated one if you want.
You also might want additional multi-column indexes to aid queries like this make-believe one:
SELECT id, user, job
FROM assignments
WHERE job = 5
ORDER BY user
The multi-column index (job, user) would satisfy both the search and the sort, and since secondary indexes include the primary key, it would also act as a covering index in this case.
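A sketch of that index and a way to check how the query uses it. The index name assignments_job_user is made up for illustration.

```sql
-- Multi-column index: job is used for the WHERE filter, user for the ORDER BY.
CREATE INDEX assignments_job_user ON assignments (job, `user`);

-- Inspect the plan. If the index covers the query, the Extra column
-- would typically report "Using index" and no filesort.
EXPLAIN
SELECT id, `user`, job
FROM assignments
WHERE job = 5
ORDER BY `user`;
```

Once this index exists, it also starts with job, so it can serve the foreign key on job as well, and a separate single-column index on job becomes redundant.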