MySQL Slow INSERT on related big tables, 100% CPU use - mysql

I am building a website (LAMP stack) with an Amazon RDS MySQL instance as the back end (type db.m3.medium).
I am happy with database integrity, and it works perfectly with regards to SELECT/JOIN/ETC queries (everything is normalized, indexed, and foreign keyed, all tables have id primary keys and relevant secondary keys / unique keys).
I have a table 'df_products' with approx half a million products in it. The products need to be updated nightly. The process involves a PHP script reading over a large products data-file and inserting data into several tables (products table, product_colours table, brands table, etc), calling either INSERT or UPDATE depending on whether or not a row already exists. This is done as one giant transaction.
What I am seeing is the UPDATE commands are sufficiently fast (50/sec, not exactly lightning but it should do), however the INSERT commands are super slow (1/sec) and appear to be consuming 100% of the CPU. On a dual core instance we see 50% CPU use (i.e. one full core).
I assume that this is because indexes (1x PRIMARY + 5x INDEX + 1x UNIQUE + 1x FULLTEXT) are being rebuilt after every INSERT. However, I thought that putting the entire process into one transaction should stop indexes being rebuilt until the transaction is committed.
I have tried setting the following params via PHP but there is negligible performance improvement:
$this->db->query('SET unique_checks=0');
$this->db->query('SET foreign_key_checks=0;');
The process will take weeks to complete at this rate so we must improve performance. Google appears to suggest using LOAD DATA. However:
I would have to generate five files in order to populate five tables
The process would have to use UPDATE commands as opposed to INSERT since the tables already exist
I would still need to loop over the products and scan the database for what values already do and don't exist
The database is entirely InnoDB and I don't plan to move to MyISAM (I want transactions, foreign keys, etc). This means that I cannot disable indexes. Even if I did it would probably be a big performance drain as we need to check if a row already exists before we insert it, and without an index this will be super slow.
I have provided the products table definition below for information. Can you please advise what process we should be using to achieve faster INSERT/UPDATE on multiple large related tables? Or what optimisations we can make to our existing process?
Thank you,
CREATE TABLE `df_products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_brand` int(11) NOT NULL,
`title` varchar(255) NOT NULL,
`id_gender` int(11) NOT NULL,
`id_colourSet` int(11) DEFAULT NULL,
`id_category` int(11) DEFAULT NULL,
`desc` varchar(500) DEFAULT NULL,
`seoAlias` varchar(255) CHARACTER SET ascii NOT NULL,
`runTimestamp` timestamp NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `seoAlias_UNIQUE` (`seoAlias`),
KEY `idx_brand` (`id_brand`),
KEY `idx_category` (`id_category`),
KEY `idx_seoAlias` (`seoAlias`),
KEY `idx_colourSetId` (`id_colourSet`),
KEY `idx_timestamp` (`runTimestamp`),
KEY `idx_gender` (`id_gender`),
FULLTEXT KEY `fulltext_title` (`title`),
CONSTRAINT `fk_id_colourSet` FOREIGN KEY (`id_colourSet`) REFERENCES `df_productcolours` (`id_colourSet`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `fk_id_gender` FOREIGN KEY (`id_gender`) REFERENCES `df_lu_genders` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=285743 DEFAULT CHARSET=utf8

How many "genders" are there? If the usual 2, don't normalize it, don't index it, and don't use a 4-byte INT to store it; use a CHAR(1) CHARACTER SET ascii (only 1 byte) or an ENUM (1 byte).
Each unnecessary index is a performance drain on the load, regardless of how it is done.
For INSERT vs UPDATE, look into using INSERT ... ON DUPLICATE KEY UPDATE.
Load the nightly data into a separate table (this could be MyISAM with no indexes). Then run one query to update existing rows and one to insert new rows. (Each needs a JOIN.) See http://mysql.rjweb.org/doc.php/staging_table, especially the 2 SQLs used for "normalizing". They can be adapted to your situation.
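A minimal sketch of the staging-table approach, under some assumptions: `df_products_staging` is a hypothetical table name, its columns simply mirror `df_products`, rows are matched on the existing unique key `seoAlias`, and the brand/colour/category IDs have already been resolved. The real column list and matching rule may differ.

```sql
-- Hypothetical staging table: no secondary indexes, reloaded every night
-- (for example with LOAD DATA or multi-row INSERTs).
CREATE TABLE df_products_staging (
  id_brand     INT NOT NULL,
  title        VARCHAR(255) NOT NULL,
  id_gender    INT NOT NULL,
  id_colourSet INT DEFAULT NULL,
  id_category  INT DEFAULT NULL,
  `desc`       VARCHAR(500) DEFAULT NULL,
  seoAlias     VARCHAR(255) CHARACTER SET ascii NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- 1) One statement updates every product that already exists.
UPDATE df_products AS p
JOIN df_products_staging AS s ON s.seoAlias = p.seoAlias
SET p.id_brand     = s.id_brand,
    p.title        = s.title,
    p.id_gender    = s.id_gender,
    p.id_colourSet = s.id_colourSet,
    p.id_category  = s.id_category,
    p.`desc`       = s.`desc`,
    p.runTimestamp = NOW();

-- 2) One statement inserts every product that does not exist yet.
INSERT INTO df_products
  (id_brand, title, id_gender, id_colourSet, id_category, `desc`, seoAlias, runTimestamp)
SELECT s.id_brand, s.title, s.id_gender, s.id_colourSet, s.id_category,
       s.`desc`, s.seoAlias, NOW()
FROM df_products_staging AS s
LEFT JOIN df_products AS p ON p.seoAlias = s.seoAlias
WHERE p.id IS NULL;  -- no match in the live table, so the row is new
```

If the data file can contain the same `seoAlias` twice, the staging rows would need de-duplicating first, otherwise step 2 fails on the unique key.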
Any kind of multi-row query runs noticeably faster than 1-row at a time. (A 100-row INSERT runs 10 times as fast as 100 1-row inserts.)
innodb_flush_log_at_trx_commit = 2 will let the individual write statements run much faster. (The batched statements I suggest above will not gain much from this setting.)
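As a rough illustration of combining the multi-row and INSERT ... ON DUPLICATE KEY UPDATE suggestions, the sketch below upserts three rows in one statement. The values are made up, the referenced `id_gender` values are assumed to exist in `df_lu_genders`, and the duplicate check relies on the existing `seoAlias_UNIQUE` key.

```sql
-- One statement, three rows: new seoAlias values are inserted,
-- existing ones are updated in place.
INSERT INTO df_products
  (id_brand, title, id_gender, seoAlias, runTimestamp)
VALUES
  (12, 'Red canvas shoe',  1, 'red-canvas-shoe',  NOW()),
  (12, 'Blue canvas shoe', 1, 'blue-canvas-shoe', NOW()),
  (37, 'Wool scarf',       2, 'wool-scarf',       NOW())
ON DUPLICATE KEY UPDATE
  id_brand     = VALUES(id_brand),
  title        = VALUES(title),
  id_gender    = VALUES(id_gender),
  runTimestamp = VALUES(runTimestamp);
```

The `VALUES(col)` form works on the MySQL versions discussed here; MySQL 8.0.20 and later deprecate it in favour of a row alias. Batches of around 100 rows per statement are a reasonable starting point to measure against.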

Related

Regarding MySQL deployed on GCP load

I am almost done building a website that I target to have 10,000 users. It's free, so I'd like to keep the cost as low as possible.
All but two tables are less than 100,000 rows (read only). Of the remaining two, one table will have about 5,200 rows per user in total and nothing less. For the other I estimate about 1.5 million rows per user over two years, assuming they continue using it that long.
The latter table is as follows, and the former is the same except for col3...
CREATE TABLE `my_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`col1` int(11) NOT NULL,
`col2` mediumint(8) unsigned NOT NULL,
`col3` smallint(5) unsigned NOT NULL,
`col4` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `fk1_ix` (`col1`),
KEY `fk2_ix` (`col2`),
KEY `fk3_ix` (`col3`),
CONSTRAINT `fk1` FOREIGN KEY (`col1`) REFERENCES `pktbl1` (`id`) ON UPDATE CASCADE,
CONSTRAINT `fk2` FOREIGN KEY (`col2`) REFERENCES `pktbl2` (`id`) ON UPDATE CASCADE,
CONSTRAINT `fk3` FOREIGN KEY (`col3`) REFERENCES `pktbl3` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
Both the tables will on average be written to about 10-20 times a day, and read about 4-5 times, for every active user.
I'd like to estimate my running cost and have two primary questions and appreciate any other inputs.
1) Will MySQL reasonably be able to handle my load.
2) How much CPU / RAM do you reckon I'd need to handle my load with a response time / lag of up to one second.
My website is designed using PHP Yii2 framework, so I can just switch databases, if required. The queries are simple inserts and indexed select statements.
20 Billion rows, mostly in small tables? That sounds like 1 terabyte of disk space. Plan for that.
200K queries (write or read) per day? That's only a few per second. No problem on any server. This assumes, however, that the queries are not too complex.
Will MySQL handle it? Look around in this forum (better yet, dba.stackexchange.com) and you will see much bigger systems being discussed.
CPU -- Usually the least of the problems.
RAM -- Depends on the queries. These days, I would not start with anything less than 4GB.
Cloud -- It's a viable option. You pay extra, but have fewer hassles, especially if you need to upgrade.
Get 100 users on the system, then take stock of what you have. See what the numbers look like.
If you ever get into performance problems, first look at the queries and indexes to see if the slowest query can be improved.
There isn't really a way to estimate, but your requirements are very low. If you are able to do some load testing, that will help you.
However if you can tolerate a few minutes downtime, you are able to scale your Cloud SQL instance up or down. This might give you confidence that starting on an f1-micro or g1-small does not prevent you from upgrading should performance not meet your needs.

delete by primary key takes forever mysql

I am trying to delete by primary key from Table (300 rows) and it takes up to max query execution time and at the end returns ERROR 2013: 2013: Lost connection to MySQL server during query. This table has foreign key to the large table (200k rows). What can be an issue?
Query: DELETE FROM Table Where table_id=x
EDIT:
There are no triggers associated with this DELETE statement. DELETE/INSERT/UPDATE statements in some database tables work really slow while SELECT statements in whole database work perfectly fine.
EDIT#2:
Additional information from innodb trx table for the query:
trx_lock_structs 429704
trx_lock_memory_bytes 34698792
trx_rows_locked 214938
trx_isolation_level REPEATABLE READ
trx_unique_checks 1
trx_foreign_key_checks 1
This query deletes 1 row, and that row has no child rows. Why is the locked rows value so high?
EDIT#3
Investigating the situation further, I have determined that the tables with slow insert/update/delete operations are the ones that have a foreign key to the big table (200k rows). Is it necessary to remove these foreign keys, or is data integrity more important? 200k rows is not that much, so what could be the reason for these slow operations?
EDIT#4
SHOW CREATE TABLE:
CREATE TABLE `Table` (
`table_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`tableb_id` bigint(20) unsigned NOT NULL,
`tablec_id` bigint(20) unsigned DEFAULT NULL,
`bigtable_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`table_id`),
KEY `fk_tableb_id_idx` (`tableb_id`),
KEY `fk_bigtable_id_idx` (`bigtable_id`),
KEY `fk_tablec_id_idx` (`tablec_id`),
CONSTRAINT `fk_bigtable_id` FOREIGN KEY (`bigtable_id`) REFERENCES `Bigtable`
(`bigtable_id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `fk_tableb_id` FOREIGN KEY (`tableb_id`) REFERENCES `tablebs`
(`tableb_id`) ON DELETE CASCADE ON UPDATE NO ACTION,
CONSTRAINT `fk_tablec_id` FOREIGN KEY (`tablec_id`) REFERENCES `tablecs`
(`tablec_id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=271 DEFAULT CHARSET=utf8
BigTable is a typical Users table: an id plus additional information.
EDIT#5
EXPLAIN DELETE:
select_type : SIMPLE,
table : Table,
type : range,
possible_keys : PRIMARY,
key : PRIMARY,
key_len : 8,
ref : const,
rows : 1,
Extra : Using where
The reason is the cascade on the big table. Do you understand the complexity of this computation? It is not just a delete operation on 300 rows. It is basically 300 rows * 200k. Each delete would pass through the big table and perform a delete operation on the corresponding rows (based on the id).
Just follow the below mentioned steps:
1) Remove the references of this primary id of this table from other tables (if any)
2) Alter this table and set NO ACTION for the ON UPDATE and ON DELETE rules of the big-table foreign key
OR
remove all the foreign key constraints
3) delete from tableName
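Step 2 can be sketched as follows, using the constraint and column names from the SHOW CREATE TABLE output in the question. The id `42` is a placeholder. MySQL cannot change a foreign key's rules in place, so the constraint is dropped and re-created.

```sql
-- Drop the cascading foreign key to the big table ...
ALTER TABLE `Table` DROP FOREIGN KEY `fk_bigtable_id`;

-- ... and re-create it without cascading actions.
ALTER TABLE `Table`
  ADD CONSTRAINT `fk_bigtable_id`
  FOREIGN KEY (`bigtable_id`) REFERENCES `Bigtable` (`bigtable_id`)
  ON DELETE NO ACTION ON UPDATE NO ACTION;

-- Then retry the delete (42 is a placeholder id).
DELETE FROM `Table` WHERE table_id = 42;
```

Note the trade-off: with NO ACTION, deleting a `Bigtable` row that still has rows in `Table` is rejected instead of cascading, so the application has to delete the child rows first.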
I may suggest an empirical approach...
Delete operations with cascades affect indexes; a slow execution time may depend on the time needed to process the foreign key indexes, although your tables aren't huge. You say you have no triggers, functions or procedures, so you have to benchmark your query to find which part is taking so much time...
I assume that you have already verified index efficiency.
So to benchmark your delete query you should write the "inner deletion" queries by hand and benchmark each one, using BENCHMARK().
This way you will find which part of that deletion took so long.
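One possible way to find those "inner deletion" parts is sketched below. It assumes everything lives in the current schema; `child_table` and the id `42` are placeholders. The first query lists the foreign keys that point at `Table`, so each cascade path can be timed by hand. The last query shows open InnoDB transactions and their lock counts, the same view the question quoted.

```sql
-- Which tables reference `Table`, and with which cascade rules?
SELECT TABLE_NAME, CONSTRAINT_NAME, DELETE_RULE, UPDATE_RULE
FROM information_schema.REFERENTIAL_CONSTRAINTS
WHERE CONSTRAINT_SCHEMA = DATABASE()
  AND REFERENCED_TABLE_NAME = 'Table';

-- For each child found above, time the equivalent manual step
-- (child_table is a hypothetical name, 42 a placeholder id).
SELECT COUNT(*) FROM child_table WHERE table_id = 42;

-- Open transactions, oldest first, with the number of rows they lock.
SELECT trx_id, trx_state, trx_started, trx_rows_locked,
       trx_lock_structs, trx_query
FROM information_schema.INNODB_TRX
ORDER BY trx_started;
```

BENCHMARK() itself only repeats a scalar expression, so for whole statements the client's reported execution time may be the more practical measurement.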
What if each single part is reasonably fast?
You may have some misconfiguration in your my.cnf, so you could try checking with https://tools.percona.com/wizard recommendations...
Maybe you have some memory allocation limit or some thread/memory limit.
You can find many tutorials on mysql optimization like http://www.codingpedia.org/ama/optimizing-mysql-server-settings/
You can also find some scripts that may help you in finding mysql optimizations.
Without knowing your architecture, mysql configuration and your database structure, it's quite hard to give a 100% solution, but I hope you can find the way, I'll be glad to read about your findings. If something more will come to my mind I'll keep you posted.

Creating an auxiliary table to improve performance on a large MySQL table?

I have a client who has asked me to tune his MySQL database in order to implement some new features and to improve the performance of an already existing web app.
The biggest table (~90 GB) has over 200M rows, and is growing at periodic intervals (one per visit to any of the websites he owns). Having continuous INSERTs, each SELECT query performed from the backend page takes a while to complete, as indexes are regenerated each time.
I've done a simulation on my own server, switching from BTREE indexes to HASH indexes. Neither SELECTs nor INSERTs run any faster. The table uses MyISAM as its storage engine. There are only INSERTs and SELECTs, no UPDATEs or DELETEs.
I've come up with the idea of creating an auxiliary table, updated together with each INSERT, to speed up every SELECT query coming from the backend. I know this is bad practice, but I'm sure the performance will improve for the statistics page.
I'm not a database performance expert, as you may have noticed... Is there a better approach for this?
By the way, from phpMyAdmin I've seen that most indexes on the table have a cardinality of 0. In my simulation, this didn't happen. I'm not sure why this is happening.
Thanks a lot.
1st update: I've just learned that hash index isn't available for MyISAM engine.
2nd update: OK. Here's the table schema.
CREATE TABLE `visits` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`datetime` int(8) NOT NULL,
`webmaster_id` char(18) NOT NULL,
`country` char(2) NOT NULL,
`connection` varchar(15) NOT NULL,
`device` varchar(15) NOT NULL,
`provider` varchar(100) NOT NULL,
`ip_address` varchar(15) NOT NULL,
`url` varchar(300) NOT NULL,
`user_agent` varchar(300) NOT NULL,
PRIMARY KEY (`id`),
KEY `datetime` (`datetime`),
KEY `webmaster_id` (`webmaster_id`),
KEY `country` (`country`),
KEY `connection` (`connection`),
KEY `device` (`device`),
KEY `provider` (`provider`)
) ENGINE=InnoDB;
So, instead of performing queries like select count(*) from visits where datetime=20140715 and device="ios", won't it be best to fetch this from select count from visits_stats where datetime=20140715 and device="ios"?
INSERTs are, as said, much more frequent than SELECTs, but my client wants to improve the performance of the backend used to retrieve aggregated data. Using my approach, each visit would imply one INSERT and one INSERT/UPDATE (or REPLACE) which would increment one or more counters (I haven't decided the schema for the visits_stats table yet, the above query was just an example).
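A minimal sketch of such a counter table, assuming one counter per (datetime, device) pair. The `visits_stats` schema and the `visit_count` column name are hypothetical, since the schema has not been decided.

```sql
-- Hypothetical summary table: one row per day and device.
CREATE TABLE visits_stats (
  `datetime`  INT NOT NULL,
  device      VARCHAR(15) NOT NULL,
  visit_count INT UNSIGNED NOT NULL DEFAULT 0,
  PRIMARY KEY (`datetime`, device)
) ENGINE=InnoDB;

-- Run alongside each INSERT into visits: creates the counter row on the
-- first visit of the day for that device, increments it afterwards.
INSERT INTO visits_stats (`datetime`, device, visit_count)
VALUES (20140715, 'ios', 1)
ON DUPLICATE KEY UPDATE visit_count = visit_count + 1;

-- The statistics page then reads a single row instead of counting visits.
SELECT visit_count
FROM visits_stats
WHERE `datetime` = 20140715 AND device = 'ios';
```

INSERT ... ON DUPLICATE KEY UPDATE is used rather than REPLACE because REPLACE deletes and re-inserts the row, which would reset the counter.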
Apart from this, I've decided to replace some of the fields by their appropriate IDs from a foreign table. So far, data is stored in strings like connection=cable, device=android, and so on. I'm not sure how would this affect performance.
Thanks again.
Edit: I said before not to use partitions. But Bill is right that the way he described would work. Your only concern would be if you tried to select across the 101 partitions, then the whole thing would come to a standstill. If you don't intend to do this then partitioning would solve the problem. Fix your indexes first though.
Your primary problem is that MyISAM is not the best engine, neither is InnoDB. TokuDB would be your best bet, but you'd have to install that on the server.
Now, you need to prune your indexes. This is the major reason for the slowness. Remove the index on everything that isn't part of common SELECT statements. Add a multi-column index on exactly what is requested in the WHERE of your SELECT statements.
So (in addition to your primary key) you want an index on datetime, device only as a multi-column index, according to your posted SELECT statement.
If you change to TokuDB the inserts will be much faster. If you stick with MyISAM, you could speed the whole thing up by using INSERT DELAYED instead of INSERT. The only issue with this is that the inserts will not be live, but will be added whenever MySQL decides there is not too much load.
Alternatively, if the above still does not help, your final option would be to use two tables: one table that you SELECT from, and another that you INSERT into. Once a day or so you would then copy the insert table to the select table. This means the data in your select table could be up to 24 hours old.
Other than that you would have to completely change the table structure, for which I can't tell you how to do because it depends on what you are using it for exactly, or use something other than MySQL for this. However, my above optimizations should work.
I would suggest looking into partitioning. You have to add datetime to the primary key to make that work, because of a limitation of MySQL. The primary or unique keys must include the column by which you partition the table.
Also make the index on datetime into a compound index on (datetime, device). This will be a covering index for the query you showed, so the query can get its answer from the index alone, without having to touch table rows.
CREATE TABLE `visits` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`datetime` int(8) NOT NULL,
`webmaster_id` char(18) NOT NULL,
`country` char(2) NOT NULL,
`connection` varchar(15) NOT NULL,
`device` varchar(15) NOT NULL,
`provider` varchar(100) NOT NULL,
`ip_address` varchar(15) NOT NULL,
`url` varchar(300) NOT NULL,
`user_agent` varchar(300) NOT NULL,
PRIMARY KEY (`id`, `datetime`), -- compound primary key is necessary in this case
KEY `datetime` (`datetime`,`device`), -- compound index for the SELECT
KEY `webmaster_id` (`webmaster_id`),
KEY `country` (`country`),
KEY `connection` (`connection`),
KEY `device` (`device`),
KEY `provider` (`provider`)
) ENGINE=InnoDB
PARTITION BY HASH(datetime) PARTITIONS 101;
So when you query for select count(*) from visits where datetime=20140715 and device='ios', your query is only scanning one partition, with about 1% of the rows in the table. Then within that partition, it narrows down even further using the index.
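To check that the query really touches a single partition, something like the following could be run against the partitioned table. This is a sketch; which form of EXPLAIN applies depends on the server version.

```sql
-- MySQL 5.5 / 5.6: the extra "partitions" column lists the partitions read.
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM visits
WHERE `datetime` = 20140715 AND device = 'ios';

-- MySQL 5.7 and later: plain EXPLAIN already includes that column.
EXPLAIN
SELECT COUNT(*) FROM visits
WHERE `datetime` = 20140715 AND device = 'ios';

-- Approximate row counts per partition, to see how evenly they fill.
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'visits';
```

For InnoDB, `TABLE_ROWS` is an estimate, so it is only useful for spotting badly skewed partitions.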
Inserts should also improve, because they are updating much smaller indexes.
I use a prime number when doing hash partitioning, to help the partitions remain more evenly filled in case the dates inserted follow a regular pattern.
Converting a 90GB table to partitioning is going to take a long time. You can use pt-online-schema-change to avoid blocking your application.
You can even make more partitions if you want, in theory up to 1024 in MySQL 5.5 and 8192 in MySQL 5.6. Although with thousands of partitions, you may run into different bottlenecks, like the number of open files.
P.S.: HASH indexes are not supported by either MyISAM or InnoDB. HASH indexes are only supported by the MEMORY and NDB storage engines.
You are facing what is nowadays called Big Data querying / Big Data handling. There are many solutions for handling big data; unfortunately, none of them is easy to implement. You always need a team to structure Big Data to fulfill your needs. Some of the solutions are described below.
1. Big Table
Google uses this technique to create one very big table with thousands of columns (to minimize the number of records vertically). For this you have to analyze your data, partition it on the basis of similarity, and tag those similarities with appropriate names. Your queries must then first be analyzed by some algorithm to determine which column space has to be queried. Not simple enough.
2. Distribute the database across multiple machines
The Hadoop file system is an open-source Apache project created for solving the problem of storing and querying big data. In the early days space was the issue and systems were capable enough to process the small data, but now space is not an issue. Even small organizations have terabytes of data stored locally. But these terabytes of data cannot be processed in one go on one machine; even a giant machine can take days to process an aggregate operation. That is why Hadoop is there.
If you are an individual then you are definitely in trouble: you will need resources to do this painful task for you. But you can use the essence of these techniques without employing these technologies.
You are free to give these techniques a try. Just study articles about handling big data. Relational database queries are not going to work in your case.

Mysql design for logtable

I would like some advice about a MySQL table design for an event logger.
Our needs :
- track a lot of actions
- 10 000 actions / second
- 1 billion rows at this time
Our hardware :
- 2*Xeon (seen as 32 CPU by the system)
- 128 GB RAM
- 6*600 SSD with Raid 10
Our table design :
CREATE TABLE IF NOT EXISTS `log_event` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`id_event` smallint(6) NOT NULL,
`id_user` bigint(20) NOT NULL,
`date` int(11) NOT NULL,
`data` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `id_event_2` (`id_event`,`data`),
KEY `id_inscri` (`id_inscri`),
KEY `date` (`date`),
KEY `id_event_4` (`id_event`,`date`,`data`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
ALTER TABLE `log_event`
ADD CONSTRAINT `log_event_ibfk_1` FOREIGN KEY (`id_inscri`) REFERENCES `inscription` (`id_inscri`) ON DELETE CASCADE ON UPDATE CASCADE;
Our problem :
- We have an auto-increment as primary key, but it is not really used. Is it a problem to remove it? We will have no primary key if we remove it => how do we identify a row?
We would like to do partitioning, but with the foreign key it seems to be impossible?
We don't do bulk inserts. Is it a good idea to insert into a MEMORY table without indexes and copy the data every 5 minutes?
Do you have any ideas for optimization? Do you have best practices for this kind of system?
Thanks !
François
Primary keys of relational tables (relations) can be of two types:
Natural - exists in the subject area and completely determines each row of the relational table.
Natural primary keys might be simple (consisting of only one column) or complex (consisting of more than one column). It is not recommended to set a natural primary key on a large string column.
Artificial - a special column, injected by the database designer / developer to boost table performance: if the natural key is complex and has to be used in a related table (as a foreign key), or if it is simple but large and would produce data overhead when copied into a related table as a foreign key, or if it is expensive to search (for example, CRUD operations on VARCHAR IDs might be slower than on INT IDs). There might be other reasons. TL;DR: an artificial key is one special column that completely determines each row of a relational table and boosts its performance for CRUD operations.
We have an auto-increment as primary, but it is not really used. Is it
a problem to remove it ? We will no have primary key if we remove it
=> How to identify a line ?
If you do not need to reference this table from other tables, then you can probably remove the artificial key without any consequences. Still, I recommend you set some other PRIMARY KEY on this table to avoid data duplication, and for clarity (if it matters).
Your table by itself (if properly normalized) will have a natural key as one of its candidate keys. It might be a complex one (consisting of a few columns). That is normal. But don't put the primary key on strings, because a PRIMARY KEY always has an index, which will produce data overhead. If it is a combination of INT or "small" VARCHAR columns, that is fine.
Consider as an option: id_event + id_user + date.
We don't do bulk insert. Is it a good idea to insert in a Memory table
without index and copy data every 5 minutes ?
It is not a bad idea. But it is not a good idea either until it has been properly tested. Try to perform a load test before real use.
If you do not reference the MEMORY table from others, you can still join it with any other InnoDB table. But you will lose InnoDB functionality (referential integrity). If losing the parent table's ON DELETE CASCADE ON UPDATE CASCADE is not a concern, then it can be done. As for me, InnoDB is not so slow that you need to switch the table engine in your case.
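If the buffer idea is load-tested, one way to sketch the periodic copy is a table swap, so writers are never blocked while the batch is moved. The table names `log_event_buffer`, `log_event_buffer_new` and `log_event_buffer_old` are hypothetical, and the column list follows the column definitions in the question (`id_event`, `id_user`, `date`, `data`).

```sql
-- Hypothetical in-memory buffer: same data columns as log_event, no indexes.
CREATE TABLE log_event_buffer (
  id_event SMALLINT NOT NULL,
  id_user  BIGINT NOT NULL,
  `date`   INT NOT NULL,
  data     BIGINT NOT NULL
) ENGINE=MEMORY;

-- Every few minutes: swap in an empty buffer, then drain the full one.
CREATE TABLE log_event_buffer_new LIKE log_event_buffer;

RENAME TABLE log_event_buffer     TO log_event_buffer_old,
             log_event_buffer_new TO log_event_buffer;  -- atomic swap

-- One bulk insert into the InnoDB table instead of thousands of single rows.
INSERT INTO log_event (id_event, id_user, `date`, data)
SELECT id_event, id_user, `date`, data
FROM log_event_buffer_old;

DROP TABLE log_event_buffer_old;
```

Two caveats worth testing: MEMORY tables lose their contents on a server restart, so up to one interval of events can be lost, and their size is capped by `max_heap_table_size`.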

A simple INSERT query on InnoDB taking too much

I have this simple query:
INSERT IGNORE INTO beststat (bestid,period,rawView) VALUES ( 4510724 , 201205 , 1 )
On the table:
CREATE TABLE `beststat` (
`bestid` int(11) unsigned NOT NULL,
`period` mediumint(8) unsigned NOT NULL,
`view` mediumint(8) unsigned NOT NULL DEFAULT '0',
`rawView` mediumint(8) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`bestid`,`period`)
) ENGINE=InnoDB AUTO_INCREMENT=2020577 DEFAULT CHARSET=utf8
And it takes 1 sec to complete.
Side Note: actually it doesn't take always 1sec. Sometime it's done even in 0.05 sec. But often it takes 1 sec
This table (beststat) currently has ~500'000 records and its size is: 40MB. I have 4GB RAM and innodb buffer pool size = 104,857,600, with: Mysql: 5.1.49-3
This is the only InnoDB table in my database (others are MyISAM)
ANALYZE TABLE beststat shows: OK
Maybe there is something wrong with InnoDB settings?
I ran some simulations about 3 years ago as part of some evaluation project for a customer. They had a requirement to be able to search a table where data is constantly being added, and they wanted to be up to date up to a minute.
InnoDB showed much better results in the beginning, but quickly deteriorated (well before 1 million records), until I removed all indexes (including the primary key). At that point InnoDB became superior to MyISAM when executing inserts/updates. (I have much worse hardware than you; I ran the tests only on my laptop.)
Conclusion: inserts will always suffer if you have indexes, especially unique ones.
I would suggest following optimization:
Remove all indexes from your beststat table and use it as a simple dump.
If you really need these unique indexes, consider some programmable solution (like remembering the max bestid at all times, insisting that each new record is above that number, and immediately increasing the number). But do you really need so many unique fields? They all sound to me just like indexes.
Have a background thread move new records from InnoDB to another table (which can be MyISAM) where they would be indexed.
Consider dropping indexes temporarily and then after bulk update re-indexing the table, possibly switching two tables so that querying is never interrupted.
These are theoretical solutions, I admit, but they are the best I can offer given your question.
Oh, and if your table is planned to grow to many millions, consider a NoSQL solution.
So you have two unique indexes on the table. Your primary key is an autonumber. Since it is not really part of the data, but something you add to the data, it is what is called an artificial primary key. Now you have a unique index on bestid and period. If bestid and period are supposed to be unique, that would be a good candidate for the primary key.
InnoDB always stores a table as a tree (the clustered index). If you define a primary key, the tree is ordered by it; if you don't, InnoDB uses the first unique index whose columns are all NOT NULL, or failing that a hidden row ID. So in your case the tree is stored on disk based on the autonumber key. When you create the second index, it actually creates a second tree on disk with the bestid and period values in it. That index does not contain the other columns in the table, only bestid, period and your primary key value.
OK, so now you insert the data. The first thing MySQL does is ensure the unique index stays unique, so it reads the index to see whether you are trying to insert a duplicate value. This is where the slowdown comes into play. It first has to ensure uniqueness and then, if the test passes, write the data. Then it also has to insert the bestid, period and primary key value into the unique index. So the total is: 1 read of the index for the value, 1 insert of the row into the table, and 1 insert of bestid and period into the index, three operations in all. If you removed the autonumber and used only the unique index as the primary key, it would read the table and, if the value is unique, insert into the table: 1 read of the table to check values and 1 insert into the table. That is two operations versus three, so you do 33% less work by removing the redundant autonumber.
I hope this is clear as I am typing from my Android and autocorrect keeps on changing innodb to inborn. Wish I was at a computer.