MySQL how to ignore indexing while inserting rows - mysql

I have a table in my MySQL database with around 5M rows. Inserting rows into the table is too slow because MySQL updates the indexes on every insert. How can I stop index updates while inserting and do the indexing separately later?
Thanks
Kamrul

Sounds like your table might be over indexed. Maybe post your table definition here so we can have a look.
You have two choices:
Keep the indexes you use and remove the unused ones. If you have 3 indexes on a table, every single write to the table causes 3 writes to the indexes. An index only helps reads, so you might want to remove unused indexes. During a load the indexes are updated, which slows the load down.
Drop your indexes before the load, then recreate them afterwards. The rebuild might take longer than the slow inserts, and you will have to rebuild every index you dropped. Also, recreating a unique index can fail if duplicates were loaded while the index was missing.
I suggest you take a good look at the indexes on the table and remove those that are not used in any queries. Then try both approaches and see what works for you. There is no way I know of in MySQL to disable indexes, as they need the inserted values to be written to their internal structures.
Another thing you might want to try is to split the IO over multiple drives, i.e. partition your table over several drives to get some hardware performance in place.
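The drop-and-recreate approach above can be sketched roughly as follows. The table `orders`, its columns, and the index names are hypothetical and only illustrate the shape of the statements. MySQL accepts several DROP INDEX or ADD INDEX clauses in a single ALTER TABLE, so the indexes do not have to be handled in separate statements.

```sql
-- Hypothetical table and index names, for illustration only.

-- 1. Drop the secondary indexes before the load.
ALTER TABLE orders
  DROP INDEX idx_customer,
  DROP INDEX uq_order_ref;

-- 2. Run the bulk load here (INSERT statements, LOAD DATA, ...).

-- 3. Look for duplicates before recreating the unique index;
--    ADD UNIQUE INDEX fails if this query returns any rows.
SELECT order_ref, COUNT(*) AS copies
FROM orders
GROUP BY order_ref
HAVING COUNT(*) > 1;

-- 4. Recreate the indexes in one statement.
ALTER TABLE orders
  ADD INDEX idx_customer (customer_id),
  ADD UNIQUE INDEX uq_order_ref (order_ref);
```

Whether this beats loading with the indexes in place depends on the table and the hardware, so it is worth timing both variants on a copy of the data.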

Related

How can I rebuild indexes on multiple tables every day at 3 AM in an efficient way?

I run the command below on multiple tables. Is this the right way to rebuild indexes, or is there a better way to do it every day at a specified time using an event?
OPTIMIZE TABLE table1, table2;
My second question: if another process (INSERT, DELETE, UPDATE) runs on the same table during the index rebuild, what happens to that process?
Is the behavior the same for both MariaDB and MySQL?
I work with both of those DBMSs, which is why I need to know the actual behavior in this scenario.
Thanks In Advance.
If you are using ENGINE=InnoDB (on MySQL or MariaDB), there is "never" any need to rebuild indexes or do OPTIMIZE TABLE.
Sure, either will do some "defragmentation", but, because of the way BTrees work, they become fragmented promptly. And, a fragmented BTree is only slightly slower.
Read and write operations are interfered with by anything that will rebuild an index -- another argument against periodic rebuilding.
About the only useful time to use OPTIMIZE is after you have DELETEd most of a table. In that situation, I have a list of better ways to do the big delete.
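To judge whether a table has enough reclaimable space to make OPTIMIZE worth considering at all, one option is to look at the size figures in information_schema. This is a minimal sketch; the schema name `mydb` is an assumption, and the numbers InnoDB reports here are approximations rather than exact measurements.

```sql
-- 'mydb' is a hypothetical schema name; table1 and table2 come from the question.
SELECT table_name,
       engine,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb,
       ROUND(data_free    / 1024 / 1024) AS free_mb   -- allocated but unused space
FROM information_schema.tables
WHERE table_schema = 'mydb'
  AND table_name IN ('table1', 'table2');
```

A large `free_mb` relative to `data_mb`, for example after a big DELETE, is the situation where a one-off rebuild is most likely to help.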

Optimize table on huge mysql tables without partition

We have a very large MySQL table that uses MyISAM. Whenever we run the OPTIMIZE TABLE command, the table is locked and performance suffers. The table is not read-only, so creating temporary tables and swapping them may not work. We are also unable to partition the table.
Is there any other way or tool to achieve what OPTIMIZE TABLE does without degrading performance? Any suggestion would be of great help.
Thanks in advance.
http://dev.mysql.com/doc/refman/5.5/en/optimize-table.html
For InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE, which
rebuilds the table (...)
Therefore, I would not expect any improvement in switching to InnoDB, as Quassnoi probably suggests.
By definition, OPTIMIZE TABLE needs some exclusive access to the table, hence the degraded performance while it runs.
Nevertheless, there could be some steps to take to reduce the time taken by OPTIMIZE, depending on how your table is "huge" :
if your table has many fields, it might need to be normalized. Alternatively, you might want to split the table vertically by spreading its columns across several "narrower" tables and establishing one-to-one relations between them.
if your table has many records, implement a "manual" partitioning in your application code. A simple step would be to create an "archive" table that holds rarely updated records. This way you only need to optimize a smaller set of records (the non-archive table).
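The "archive" table idea from the list above might look like the sketch below. The table `events`, the column `updated_at`, and the cutoff date are assumptions made for illustration. MyISAM has no transactions, so a row that is modified between the INSERT and the DELETE can end up in an inconsistent state; running the move during a quiet period or under LOCK TABLES avoids that.

```sql
-- Hypothetical names: events (live table), events_archive, updated_at.

-- Same structure and indexes as the live table.
CREATE TABLE events_archive LIKE events;

-- Copy the rarely updated rows to the archive ...
INSERT INTO events_archive
SELECT * FROM events
WHERE updated_at < '2015-01-01';

-- ... and remove them from the live table.
DELETE FROM events
WHERE updated_at < '2015-01-01';

-- Only the much smaller live table needs optimizing from now on.
OPTIMIZE TABLE events;
```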
The OPTIMIZE TABLE command locks the table, which decreases performance.
You can download Percona Toolkit and use its pt-online-schema-change command to optimize the table.
This command does not lock the table while it is being optimized.
See the link below:
https://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-change.html

I am inserting a lot of records into a large table, should I remove the indices until I am finished?

I am importing a lot of data (something like 75 million inserts) into a MySQL database with a few different tables.
I have indexes on a lot of columns. Should I remove them while I do the inserts and just add them back after it is done? Will that have a significant impact on performance?
I get the feeling the import has slowed down now that I have imported a few hundred thousand records, and I suspect the indexes might be the cause.
Would any more information be useful?
There is no yes/no answer for that.
The most important point is this: if the table is used while importing, do not disable the indices. Just imagine a few simple queries falling back to a full table scan after you have inserted 74 million records.
Closely related: can you make sure the table is not needed in the window after it has been filled but before the indices have been built?
If you can do the insert on a completely "cold" table, I'd drop the indices and rebuild them later.
Yes, certainly. You should remove the indexes before importing a large number of records. Otherwise the import will not only take a long time, it will also leave the existing indexes heavily fragmented. You would have to rebuild the indexes anyway to restore optimal performance.
If you remove the indexes before doing the import, then import will be faster. After the import, create the indexes again and then the indexes will be created fresh and they will have no fragmentation and the search performance will be faster as well.
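Besides dropping indexes outright, MySQL has a few switches that reduce index and constraint work during a bulk import. The sketch below uses a hypothetical table name, `import_target`. DISABLE KEYS only affects non-unique indexes on MyISAM tables, and InnoDB does not act on it; the session variables in the second half are the ones the MySQL manual suggests for bulk loading InnoDB tables. With `unique_checks` off, it is up to you to make sure the imported data contains no duplicate keys.

```sql
-- MyISAM: stop maintaining non-unique indexes during the import,
-- then rebuild them in one pass afterwards.
ALTER TABLE import_target DISABLE KEYS;
-- ... run the inserts here ...
ALTER TABLE import_target ENABLE KEYS;

-- InnoDB: relax per-row checks for this session only.
SET unique_checks = 0;
SET foreign_key_checks = 0;
-- ... run the inserts here ...
SET unique_checks = 1;
SET foreign_key_checks = 1;
```

As the first answer points out, none of this is a good idea if other queries rely on those indexes while the import is running.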

Inserting New Column in MYSQL taking too long

We have a huge database, and inserting a new column is taking too long. Is there any way to speed things up?
Unfortunately, there's probably not much you can do. When inserting a new column, MySQL makes a copy of the table and inserts the new data there. You may find it faster to do
CREATE TABLE new_table LIKE old_table;
ALTER TABLE new_table ADD COLUMN (column definition);
INSERT INTO new_table(old columns) SELECT * FROM old_table;
RENAME TABLE old_table TO tmp, new_table TO old_table;
DROP TABLE tmp;
This hasn't been my experience, but I've heard others have had success. You could also try disabling indices on new_table before the insert and re-enabling later. Note that in this case, you need to be careful not to lose any data which may be inserted into old_table during the transition.
Alternatively, if your concern is impacting users during the change, check out pt-online-schema-change which makes clever use of triggers to execute ALTER TABLE statements while keeping the table being modified available. (Note that this won't speed up the process however.)
There are four main things that you can do to make this faster:
If using innodb_file_per_table the original table may be highly fragmented in the filesystem, so you can try defragmenting it first.
Make the buffer pool as big as sensible, so more of the data, particularly the secondary indexes, fits in it.
Make innodb_io_capacity high enough, perhaps higher than usual, so that insert buffer merging and flushing of modified pages will happen more quickly. Requires MySQL 5.1 with InnoDB plugin or 5.5 and later.
MySQL 5.1 with InnoDB plugin and MySQL 5.5 and later support fast alter table. One of the things that it makes a lot faster is adding or rebuilding indexes that are both not unique and not in a foreign key. So you can do this:
A. ALTER TABLE ADD your column, DROP your non-unique indexes that aren't in FKs.
B. ALTER TABLE ADD back your non-unique, non-FK indexes.
This should provide these benefits:
a. Less use of the buffer pool during step A because the buffer pool will only need to hold some of the indexes, the ones that are unique or in FKs. Indexes are randomly updated during this step so performance becomes much worse if they don't fully fit in the buffer pool. So more chance of your rebuild staying fast.
b. The fast alter table rebuilds the index by sorting the entries then building the index. This is faster and also produces an index with a higher page fill factor, so it'll be smaller and faster to start with.
The main disadvantage is that this is in two steps and after the first one you won't have some indexes that may be required for good performance. If that is a problem you can try the copy to a new table approach, using just the unique and FK indexes at first for the new table, then adding the non-unique ones later.
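Steps A and B above could be written along these lines. The table, column, and index names are hypothetical; only indexes that are non-unique and not used by a foreign key are dropped and re-added.

```sql
-- Hypothetical table, column, and index names.

-- Step A: add the column and drop the non-unique, non-FK indexes
-- in the same table rebuild.
ALTER TABLE big_table
  ADD COLUMN note VARCHAR(64) NULL,
  DROP INDEX idx_created_at,
  DROP INDEX idx_status;

-- Step B: add the non-unique indexes back; fast index creation
-- builds them by sorting rather than by row-by-row insertion.
ALTER TABLE big_table
  ADD INDEX idx_created_at (created_at),
  ADD INDEX idx_status (status);
```

Between the two statements, queries that depend on `idx_created_at` or `idx_status` will run without them, which is the disadvantage described above.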
It's only in MySQL 5.6, but the feature request in http://bugs.mysql.com/bug.php?id=59214 increases the speed with which insert buffer changes are flushed to disk and limits how much space the insert buffer can take in the buffer pool. This can be a performance limit for big jobs. The insert buffer is used to cache changes to secondary index pages.
We know that this is still frustratingly slow sometimes and that a true online alter table is very highly desirable.
This is my personal opinion. For an official Oracle view, contact an Oracle public relations person.
James Day, MySQL Senior Principal Support Engineer, Oracle
Usually, a slow column insert means that there are many indexes, so I would suggest reconsidering your indexing.
Michael's solution may speed things up a bit, but perhaps you should have a look at the database and try to break the big table into smaller ones. Take a look at this: link. Normalizing your database tables may save you loads of time in the future.

Generating a massive 150M-row MySQL table

I have a C program that mines a huge data source (20GB of raw text) and generates loads of INSERTs to execute on a simple blank table (4 integer columns with 1 primary key). Set up as a MEMORY table, the entire task completes in 8 hours. After finishing, about 150 million rows exist in the table. Eight hours is a completely decent number for me. This is a one-time deal.
The problem comes when trying to convert the MEMORY table back into MyISAM so that (A) I'll have the memory freed up for other processes and (B) the data won't be killed when I restart the computer.
ALTER TABLE memtable ENGINE = MyISAM
I've let this ALTER TABLE query run for over two days now, and it's not done. I've now killed it.
If I create the table initially as MyISAM, the write speed seems terribly poor (especially due to the fact that the query requires the use of the ON DUPLICATE KEY UPDATE technique). I can't temporarily turn off the keys. The table would become over 1000 times larger if I were to and then I'd have to reprocess the keys and essentially run a GROUP BY on 150,000,000,000 rows. Umm, no.
One of the key constraints to realize: The INSERT query UPDATEs records if the primary key (a hash) exists in the table already.
At the very beginning of an attempt at strictly using MyISAM, I'm getting a rough speed of 1,250 rows per second. Once the index grows, I imagine this rate will tank even more.
I have 16GB of memory installed in the machine. What's the best way to generate a massive table that ultimately ends up as an on-disk, indexed MyISAM table?
Clarification: There are many, many UPDATEs going on from the query (INSERT ... ON DUPLICATE KEY UPDATE val=val+whatever). This isn't, by any means, a raw dump problem. My reasoning for trying a MEMORY table in the first place was for speeding-up all the index lookups and table-changes that occur for every INSERT.
If you intend to make it a MyISAM table, why are you creating it in memory in the first place? If it's only for speed, I think the conversion to a MyISAM table is going to negate any speed improvement you get by creating it in memory to start with.
You say inserting directly into an "on disk" table is too slow (though I'm not sure how you're deciding that when your current method is taking days). You may be able to turn off or remove the uniqueness constraints, use a DELETE query later to re-establish uniqueness, and then re-enable or re-add the constraints. I have used this technique when importing into an InnoDB table in the past, and found that even with the later DELETE it was much faster overall.
Another option might be to create a CSV file instead of the INSERT statements, and either load it into the table using LOAD DATA INFILE (I believe that is faster than the inserts, but I can't find a reference at present) or use it directly via the CSV storage engine, depending on your needs.
Sorry to keep throwing comments at you (last one, probably).
I just found this article, which provides an example of converting a large table from MyISAM to InnoDB. While this isn't what you are doing, the author uses an intermediate MEMORY table and describes going from memory to InnoDB in an efficient way: ordering the table in memory the way that InnoDB expects it to be ordered in the end. If you aren't tied to MyISAM it might be worth a look, since you already have a "correct" memory table built.
I don't use MySQL, I use SQL Server, and this is the process I use to handle a file of similar size. First I dump the file into a staging table that has no constraints. Then I identify and delete the duplicates from the staging table. Then I search for existing records that might match and put the id field into a column in the staging table. Then I update where the id field column is not null and insert where it is null. One of the reasons I do all the work of getting rid of the duplicates in the staging table is that it means less impact on the prod table when I run it, and thus it is faster in the end. My whole process runs in less than an hour (and actually does much more than I describe, as I also have to denormalize and clean the data) and affects production tables for less than 15 minutes of that time. I don't have to worry about adjusting any constraints or dropping indexes or any of that, since I do most of my processing before I hit the prod table.
Consider whether a similar process might work better for you. Also, could you use some sort of bulk import to get the raw data into the staging table (I pull the 22 gig file I have into staging in around 16 minutes) instead of working row by row?
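In MySQL terms, the staging approach combined with the earlier LOAD DATA INFILE suggestion might be sketched as below. Everything concrete here is an assumption: the table and column names, the file path, the CSV layout, and the rule for combining duplicate keys (summing `val` and taking an arbitrary representative for the other columns). The question also notes that the raw, un-deduplicated data would be far larger than the final table, so this is only practical if the staging table fits on disk or the load can be done in chunks.

```sql
-- Hypothetical names and file path, for illustration only.

-- Staging table with no keys, so loading does no index maintenance.
CREATE TABLE staging (
  hash  INT UNSIGNED NOT NULL,
  col_a INT NOT NULL,
  col_b INT NOT NULL,
  val   INT NOT NULL
) ENGINE=MyISAM;

-- Bulk-load the raw rows written out by the C program as CSV.
-- The server must be allowed to read this path (see secure_file_priv).
LOAD DATA INFILE '/tmp/rows.csv'
INTO TABLE staging
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(hash, col_a, col_b, val);

-- Collapse duplicates in one pass and fill the blank final table.
-- Assumes final_table is empty and has hash as its primary key.
INSERT INTO final_table (hash, col_a, col_b, val)
SELECT hash, MIN(col_a), MIN(col_b), SUM(val)
FROM staging
GROUP BY hash;
```

This replaces the many per-row INSERT ... ON DUPLICATE KEY UPDATE lookups with a single grouped pass, at the cost of the extra disk space for the staging table.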