Will partitions improve MySQL INSERT speed?

I'm doing a lot of INSERTs via LOAD DATA INFILE on MySQL 5.0. After many inserts, say a few hundred million rows (InnoDB, PK + a non-unique index, 64-bit Linux, 4GB RAM, RAID 1), the inserts slow down considerably and appear I/O-bound. Are partitions in MySQL 5.1 likely to improve performance if the data flows into separate partitions?

The previous answer is erroneous in its assumption that this will decrease performance. Quite the contrary.
Here's a lengthy but informative article on the why and how of partitioning in MySQL:
http://dev.mysql.com/tech-resources/articles/partitioning.html
Partitioning is typically used, as was mentioned, to group like data together. That way, when you decide to archive off or outright destroy a partition, your tables do not become fragmented. This does not hurt performance; it can actually increase it. It is not just deletions that fragment: updates and inserts can do so as well. By partitioning the data, you are telling the RDBMS the criteria (indexes) by which the data should be manipulated and queried.

Edit: SiLent SoNG is correct. DISABLE / ENABLE KEYS only works for MyISAM, not InnoDB. I never knew that, but I went and read the docs. http://dev.mysql.com/doc/refman/5.1/en/alter-table.html#id1101502.
Updating the indexes may be what's slowing it down. You can disable the indexes while you're doing your load and re-enable them afterwards, so they are rebuilt once for the whole table.
ALTER TABLE foo DISABLE KEYS;
LOAD DATA INFILE ... ;
ALTER TABLE foo ENABLE KEYS;
This will cause the indexes to all be updated in one go instead of per-row. This also leads to more balanced BTREE indexes.
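For InnoDB, where DISABLE KEYS has no effect (see the edit above), a rough equivalent is to drop the non-unique secondary index before the load and recreate it afterwards; the index name idx_bar, column bar, and file path below are made up for illustration:
ALTER TABLE foo DROP INDEX idx_bar;
LOAD DATA INFILE '/tmp/data.csv' INTO TABLE foo;
ALTER TABLE foo ADD INDEX idx_bar (bar);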

No improvement on MySQL 5.6
"MySQL can apply partition pruning to SELECT, DELETE, and UPDATE statements. INSERT statements currently cannot be pruned."
http://dev.mysql.com/doc/refman/5.6/en/partitioning-pruning.html

If the columns an INSERT checks (primary keys, for instance) are indexed, then partitioning will only decrease the speed: MySQL additionally has to decide which partition each row belongs to.
Queries are generally improved by adding indexes, not partitions. Partitioning is useful when you have tons of very old data (e.g. year < 2000) which is rarely used: then it is worthwhile to create a partition for that data (see the sketch below).
Cheers!
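To illustrate the archival idea, here is a hedged sketch of RANGE partitioning for pre-2000 data; the table and column names are hypothetical, and note that MySQL requires the partitioning column to be part of every unique key on the table, including the primary key:
ALTER TABLE sales PARTITION BY RANGE (sale_year) (
    PARTITION p_archive VALUES LESS THAN (2000),
    PARTITION p_current VALUES LESS THAN MAXVALUE
);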

Related

Can adding/removing keys and indexes damage data in PhpMyAdmin

I'm experimenting with various indexing settings for my mysql database.
I wonder, though: by removing or adding indexes, is there any possibility of damaging data rows in any way? Obviously I realise that if I make application queries fail, that can cause bad rows. I'm talking just about the structural queries themselves.
Or will I simply affect the efficiency of the database?
I just want to know whether it's safe to experiment, or whether I have to be cautious.
The data isn't in phpMyAdmin, it's in MySQL. Adding/removing an index will not affect your data integrity by default. With a unique index and the IGNORE keyword, it can.
That said, you should always have a backup of your data. It's easy to run a test like:
CREATE TABLE t1 LIKE t;
INSERT INTO t1 SELECT * FROM t;
ALTER TABLE t1 ADD INDEX ...;
Then compare the two tables (perhaps a COUNT is fine in your case).
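For example (CHECKSUM TABLE does a full scan, so it can be slow on big tables):
SELECT COUNT(*) FROM t;
SELECT COUNT(*) FROM t1;
CHECKSUM TABLE t, t1;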
Adding/removing indexes is safe in terms of the rows in your table. However as you note, too many indexes or poorly constructed indexes can be (very) detrimental to performance. Likewise, adding indexes on large tables can be a very expensive process, and can bring a MySQL server to its knees, so you're better off not "experimenting" on production tables.

Will making a table innodb and partitioning improve performance in mysql?

I have a MyISAM table with a composite unique key over 2 columns and 90 million rows. We are now facing memory and load issues, and after reading around the web I am planning to introduce partitioning and to change this table to InnoDB for better performance. But I have the following concerns:
Changing to InnoDB will involve a lot of downtime. Is it possible to minimize it?
Most of the SELECT queries are on one particular key column, on which I am planning to do HASH partitioning. How much will that affect queries on the other key column?
Will these changes improve performance to the extent claimed in theory? Is there a better solution for such cases? Any suggestion or experience would be helpful.
My queries are simple, like:
SELECT * FROM Table WHERE Col1 = 'Value';
SELECT * FROM Table WHERE Col1 = 'Value' AND Col2 IN (V1, V2, V3);
Inserts are very frequent.
InnoDB will probably help some. Conversion to InnoDB comes with some issues, as I state in My conversion blog.
Partitioning, per se, buys no performance gain. My partitioning blog lists 4 cases where you can, with design changes, gain performance.
Regardless of the Engine, your two queries will both benefit from
INDEX(col1, col2)
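In concrete form, assuming the table really is named Table as in the question (the index name is arbitrary):
ALTER TABLE `Table` ADD INDEX idx_col1_col2 (Col1, Col2);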
No form of partitioning will help. HASH partitioning is especially useless.
Conversion to InnoDB will take a lot of downtime, unless pt-online-schema-change will work for your case. Research it.
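As a sketch of what that invocation could look like (the database and table names mydb and mytable are hypothetical; always test on a copy first):
pt-online-schema-change --alter "ENGINE=InnoDB" D=mydb,t=mytable --execute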
Also read my answers and comments on
Can i set up Mysql to auto-partition?
for more specifics.
It may be that adding that index is the main performance gain. But you have to do a lengthy ALTER to get it. MyISAM does not have ALGORITHM=INPLACE.
InnoDB (speaking only about performance now) makes sense mainly when there are a lot of inserts and updates to your table, because it uses row-level locking rather than table locking.
If most of the queries on your table are SELECTs, then MyISAM will be faster.
Advice: in my.cnf, set key_buffer_size to 25% of your free RAM.
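For example, in my.cnf (the 8 GB of free RAM assumed here is just an illustrative figure; scale to your machine, and note key_buffer_size only benefits MyISAM):
[mysqld]
key_buffer_size = 2G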
If inserts on your database are very frequent, you will likely gain performance by switching to InnoDB, which doesn't lock entire tables for inserts, allowing other clients to select data concurrently.
Regarding question #1: if you are worried about downtime, I'd suggest you find a parallel dump/load solution for migrating your data to InnoDB. If you simply run an ALTER statement on your tables, that is a single-threaded operation which will be much slower.
Regarding #2, you'd have to post a schema along with your partitioning strategy and the queries you are worried about.

MySQL how to ignore indexing while inserting rows

I have a table in my MySQL database with around 5M rows. Inserting rows into the table is too slow because MySQL updates the indexes while inserting. How can I stop index updating during the inserts and do the indexing separately later?
Thanks
Kamrul
Sounds like your table might be over indexed. Maybe post your table definition here so we can have a look.
You have two choices:
Keep current indexes but remove unused ones. If you have 3 indexes on a table, every single write to the table causes 3 writes to the indexes. An index is only helpful during reads, so you may want to remove unused indexes. During a load, the indexes are updated, which slows the load down.
Drop your indexes before the load, then recreate them afterwards. You can drop your indexes before the data load, insert, and then rebuild. The rebuild might take longer than the slow inserts, and you will have to rebuild all indexes one by one. Also, recreating a unique index can fail if duplicates were loaded while the index was absent.
Now, I suggest you take a good look at the indexes on the table and remove any that are not used by your queries. Then try both approaches and see what works for you. There is no way I know of in MySQL to merely suspend indexes, as they need the inserted values to be written to their internal structures.
Another thing you might want to try is splitting the I/O over multiple drives, i.e. partitioning your table across several drives to get some hardware performance in place.

Inserting New Column in MYSQL taking too long

We have a huge database, and adding a new column is taking too long. Any way to speed things up?
Unfortunately, there's probably not much you can do. When adding a new column, MySQL makes a copy of the table and inserts the data into the new copy. You may find it faster to do:
CREATE TABLE new_table LIKE old_table;
ALTER TABLE new_table ADD COLUMN (column definition);
INSERT INTO new_table(old columns) SELECT * FROM old_table;
RENAME TABLE old_table TO tmp, new_table TO old_table;
DROP TABLE tmp;
This hasn't been my experience, but I've heard others have had success. You could also try disabling indices on new_table before the insert and re-enabling later. Note that in this case, you need to be careful not to lose any data which may be inserted into old_table during the transition.
Alternatively, if your concern is impacting users during the change, check out pt-online-schema-change which makes clever use of triggers to execute ALTER TABLE statements while keeping the table being modified available. (Note that this won't speed up the process however.)
There are four main things that you can do to make this faster:
If using innodb_file_per_table, the original table may be highly fragmented in the filesystem, so you can try defragmenting it first.
Make the buffer pool as big as sensible, so more of the data, particularly the secondary indexes, fits in it.
Make innodb_io_capacity high enough, perhaps higher than usual, so that insert buffer merging and flushing of modified pages will happen more quickly. Requires MySQL 5.1 with InnoDB plugin or 5.5 and later.
MySQL 5.1 with the InnoDB plugin and MySQL 5.5 and later support fast ALTER TABLE. One of the things this makes a lot faster is adding or rebuilding indexes that are both not unique and not in a foreign key. So you can do this:
A. ALTER TABLE ADD your column, DROP your non-unique indexes that aren't in FKs.
B. ALTER TABLE ADD back your non-unique, non-FK indexes.
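As a sketch of steps A and B, assuming a hypothetical table t with a non-unique, non-FK index idx_c2 on column c2:
ALTER TABLE t ADD COLUMN new_col INT, DROP INDEX idx_c2;
ALTER TABLE t ADD INDEX idx_c2 (c2);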
This should provide these benefits:
a. Less use of the buffer pool during step A, because the buffer pool only needs to hold some of the indexes: the ones that are unique or in FKs. Indexes are randomly updated during this step, so performance becomes much worse if they don't fully fit in the buffer pool. So there is more chance of your rebuild staying fast.
b. The fast ALTER TABLE rebuilds each index by sorting the entries and then building the index. This is faster and also produces an index with a higher page fill factor, so it'll be smaller and faster to start with.
The main disadvantage is that this is in two steps and after the first one you won't have some indexes that may be required for good performance. If that is a problem you can try the copy to a new table approach, using just the unique and FK indexes at first for the new table, then adding the non-unique ones later.
It's only in MySQL 5.6, but the feature request in http://bugs.mysql.com/bug.php?id=59214 increases the speed with which insert buffer changes are flushed to disk and limits how much space the buffer can take in the buffer pool. This can be a performance limit for big jobs. (The insert buffer is used to cache changes to secondary index pages.)
We know that this is still frustratingly slow sometimes, and that a true online ALTER TABLE is very highly desirable.
This is my personal opinion. For an official Oracle view, contact an Oracle public relations person.
James Day, MySQL Senior Principal Support Engineer, Oracle
Usually this kind of slowness means there are many indexes to rebuild, so I would suggest reconsidering your indexing.
Michael's solution may speed things up a bit, but perhaps you should have a look at the database and try to break the big table into smaller ones. Take a look at this: link. Normalizing your database tables may save you loads of time in the future.

How to improve MySQL INSERT and UPDATE performance?

Performance of INSERT and UPDATE statements in our database seems to be degrading and causing poor performance in our web app.
Tables are InnoDB and the application uses transactions. Are there any easy tweaks that I can make to speed things up?
I think we might be seeing some locking issues; how can I find out?
You could change the settings to speed InnoDB inserts up.
And even more ways to speed up InnoDB
...and one more optimization article
INSERT and UPDATE get progressively slower as the number of rows increases on a table with an index. InnoDB tables are even slower than MyISAM tables for inserts, and the delayed key write option is not available.
The most effective way to speed things up would be to save the data to a flat file first and then use LOAD DATA INFILE; this is about 20x faster.
The second option would be to create a temporary in-memory table, load the data into it, and then do an INSERT INTO ... SELECT in batches: once you have about 100 rows in your temp table, load them into the permanent one.
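A rough sketch of that staging approach, with hypothetical table names perm and staging (note the MEMORY engine does not support TEXT/BLOB columns):
CREATE TEMPORARY TABLE staging LIKE perm;
ALTER TABLE staging ENGINE=MEMORY;
-- after ~100 rows have accumulated in staging, flush them in one batch:
INSERT INTO perm SELECT * FROM staging;
TRUNCATE TABLE staging;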
Additionally, you can get a small improvement in speed by moving the index file onto a separate physical hard drive from the one where the data file is stored. Also try moving any binlogs onto a different device; the same applies to the temporary file location.
I would try setting your tables to delay index updates (note that, as mentioned above, this option is MyISAM-only and not available for InnoDB):
ALTER TABLE {name} DELAY_KEY_WRITE = 1;
If your reads are not relying on the indexes, delaying the key writes can help improve the performance of update queries.
I would not look at locking/blocking unless the number of concurrent users has been increasing over time.
If the performance has gradually degraded over time, I would look at the query plans with the EXPLAIN statement.
It would be helpful to have results from the development or initial production environment for comparison purposes.
Dropping or adding an index may be needed, or some other maintenance action mentioned in the other posts.
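For example, to see the plan MySQL chooses (EXPLAIN only supports SELECT before MySQL 5.6, so rewrite an UPDATE as an equivalent SELECT; the table and column here are hypothetical):
EXPLAIN SELECT * FROM orders WHERE status = 'pending';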