InnoDB indexes before and after importing - mysql

I'm trying to import a large SQL file that was generated by mysqldump for an InnoDB table, but it is taking a very long time even after adjusting some parameters in my.cnf and disabling AUTOCOMMIT (as well as FOREIGN_KEY_CHECKS and UNIQUE_CHECKS, though the table has no foreign or unique keys). I'm wondering if it's taking so long because of the several indexes on the table.
Looking at the SQL file, it appears that the indexes are created in the CREATE TABLE statement, prior to inserting all the data. Based on my (limited) research and personal experience, I've found that it's faster to add the indexes after inserting all the data. Doesn't it have to check the indexes for every INSERT? I know that mysqldump has a --disable-keys option which does exactly that – disables the keys prior to inserting – but apparently this only works with MyISAM tables and not InnoDB.
So why couldn't mysqldump omit the keys from the CREATE TABLE statement for InnoDB tables, then run an ALTER TABLE after all the data is inserted? Or does InnoDB work differently, so that there is no speed difference?
Thanks!

I experimented with this concept a bit at a past job, where we needed a fast method of copying schemas between MySQL servers.
There is indeed a performance overhead when you insert into tables that have secondary indexes. Inserts need to update the clustered index (aka the table) and also update the secondary indexes. The more indexes a table has, the more overhead each insert incurs.
InnoDB has a feature called the change buffer which helps a bit by postponing index updates, but they have to get merged eventually.
Inserts to a table with no secondary indexes are faster, so it's tempting to try to defer index creation until after your data is loaded, as you describe.
Percona Server, a branch of MySQL, experimented with a mysqldump --optimize-keys option. When you use this option, it changes the output of mysqldump to have CREATE TABLE with no indexes, then INSERT all data, then ALTER TABLE to add the indexes after the data is loaded. See https://www.percona.com/doc/percona-server/LATEST/management/innodb_expanded_fast_index_creation.html
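The transformed dump looks roughly like this (the table, columns, and index name here are illustrative, not Percona's actual output):

-- the table is created with only its primary key
CREATE TABLE t (
  id INT NOT NULL PRIMARY KEY,
  customer_id INT NOT NULL,
  note VARCHAR(100)
) ENGINE=InnoDB;

-- all rows are inserted with no secondary indexes to maintain
INSERT INTO t (id, customer_id, note) VALUES (1, 42, 'first row');

-- secondary indexes are built in one pass at the end
ALTER TABLE t ADD INDEX idx_customer (customer_id);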
But in my experience, the net improvement in performance was small. It still takes a while to insert a lot of rows, even into tables with no indexes. Then the restore needs to run an ALTER TABLE to build the indexes, which takes a while for a large table. When you count the time of the INSERTs plus the extra time to build indexes, it's only a few (low single-digit) percent faster than inserting the traditional way, into a table with indexes.
Another benefit of this post-processing index creation is that the indexes are stored more compactly, so if you need to save disk space, that's a better reason to use this technique.
I found it much more beneficial to performance to restore by loading several tables in parallel.
The mysqlpump tool (introduced in MySQL 5.7) supports multi-threaded dump.
The open-source tool mydumper supports multi-threaded dump, and also has a multi-threaded restore tool called myloader. The worst downside of mydumper/myloader is that the documentation is virtually non-existent, so you need to be an intrepid power user to figure out how to run them.
Another strategy is to use mysqldump --tab to dump tab-delimited data files instead of SQL scripts. For each table, it writes an SQL file with the table definition and a separate text file with the data. You have to recreate the tables by running all the SQL files (this is quick), and then use mysqlimport to bulk-load the data files, which is much faster than executing SQL scripts to restore the data. The mysqlimport tool even has a --use-threads option for parallel execution.
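A minimal sketch of the workflow (the database name and paths are hypothetical, and the server must be allowed to write to the dump directory; see the secure_file_priv option):

# dump each table to a .sql (definition) file and a .txt (tab-delimited data) file
mysqldump --tab=/tmp/dump mydatabase

# recreate the tables from the definition files (this is quick)
cat /tmp/dump/*.sql | mysql mydatabase

# bulk-load the data files, four tables at a time
mysqlimport --use-threads=4 mydatabase /tmp/dump/*.txt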
Test carefully with different numbers of parallel threads. My experience is that 4 threads is the best. With greater parallelism, InnoDB becomes a bottleneck. But your experience may be different, depending on the version of MySQL and your server hardware's performance capacity.
The fastest restore method of all is to use a physical backup tool, the most popular being Percona XtraBackup. This allows for fast backups and even faster restores: the backed-up files are literally ready to be copied into place and used as live tablespace files. The downside is that you must shut down your MySQL Server to perform the restore.
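A rough outline with xtrabackup (paths are hypothetical; check the Percona documentation for the exact procedure on your version):

# take a physical backup while the server is running
xtrabackup --backup --target-dir=/backups/full

# apply the redo log so the copied files are consistent
xtrabackup --prepare --target-dir=/backups/full

# to restore: stop mysqld, copy the files into an empty datadir, fix ownership
xtrabackup --copy-back --target-dir=/backups/full
chown -R mysql:mysql /var/lib/mysql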

Related

Reduce database size (mysqldump) before restoring

I have a 42GB mysqldump of my database (which is around 100GB on disk). I have searched the web for a way to reduce the disk size of the database; I mean, once the dump is restored, I want the disk usage to drop from 100GB to 87-90GB. I haven't found any relevant information yet.
I would appreciate it if anyone could guide me a little bit on this.
Thanks
You could filter the CREATE TABLE statements so they create compressed tables as they restore:
sed -e 's/ENGINE=InnoDB/& ROW_FORMAT=COMPRESSED/' dump.sql | mysql ...
Another idea is to drop some or all of the indexes in large tables before restoring data. Insert ALTER TABLE <tablename> DROP KEY <indexname>; statements after the CREATE TABLE and before the subsequent INSERT statements.
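For example, for a hypothetical orders table with two secondary indexes:

-- inserted into the dump right after the CREATE TABLE for orders
ALTER TABLE orders DROP KEY idx_customer;
ALTER TABLE orders DROP KEY idx_created_at;
-- the dump's INSERT statements then run against the slimmer table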
Even if you decide later that you want the indexes after all, creating the index after data has been loaded often results in a more compact index.
Removing indexes might hurt the performance of queries that need them. But if making the database smaller is more important, then it's up to you how much query performance to sacrifice.
I'll leave it to you to figure out how you want to edit a 42GB file. Different solutions exist depending on your environment (Mac, Windows, Linux).

Is it more efficient to use a hybrid MySQL storage engine (InnoDB+MyISAM) for log files?

I am planning a centralized log server, which receives syslog from many devices (up to 2000).
It should be able to query and sort events.
I have read "Mysql storage engine for log table".
Since MyISAM is better for selects while InnoDB is better for writes, how about using a hybrid engine to get the benefits of both?
Use InnoDB for writing, to benefit from row-level locking.
Use MyISAM for read-only, older logs.
Use MERGE to split one large table into many smaller tables.
Here are the steps:
Create an InnoDB table A; syslog-ng will insert a row into it whenever a log message is received.
Every midnight, create a MyISAM table named 'yyyymmdd' and move rows from table A into it. The data will persist there.
Use a table with the MERGE engine to combine all the 'yyyymmdd' tables for query operations.
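A rough sketch of what I mean in SQL (table and column names are made up):

-- live table that syslog-ng inserts into
CREATE TABLE log_live (
  logged_at DATETIME NOT NULL,
  host VARCHAR(64) NOT NULL,
  message TEXT,
  KEY (logged_at)
) ENGINE=InnoDB;

-- at midnight: freeze yesterday's rows into a MyISAM table
CREATE TABLE log_20240101 LIKE log_live;
ALTER TABLE log_20240101 ENGINE=MyISAM;
INSERT INTO log_20240101 SELECT * FROM log_live WHERE logged_at < CURDATE();
DELETE FROM log_live WHERE logged_at < CURDATE();

-- MERGE table over the daily MyISAM tables, for queries
CREATE TABLE log_archive (
  logged_at DATETIME NOT NULL,
  host VARCHAR(64) NOT NULL,
  message TEXT,
  KEY (logged_at)
) ENGINE=MERGE UNION=(log_20240101) INSERT_METHOD=NO;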
Considering both queries and writes, is this a more efficient strategy than using a single InnoDB/MyISAM table?

MySQL inserts, updates very slow

Our server database is MySQL 5.1.
We have 754 tables in our DB. We create a table for each project, hence the large number of tables.
Over the past week I have noticed a very long delay in inserts and updates to any table. If I create a new table and insert into it, it takes about one minute to insert around 300 records.
Whereas our test database on the same server has 597 tables, and the same insertion is very fast there.
The default engine is MyISAM, but we have a few tables in InnoDB.
There were a few triggers running. After I deleted the triggers it became somewhat faster, but it is still not fast enough.
Use EXPLAIN (or its synonym DESCRIBE) to inspect your query execution plans.
See http://dev.mysql.com/doc/refman/5.1/en/explain.html for its usage.
As @swapnesh mentions, the DESCRIBE command is very useful for performance debugging.
You can also check your installation for issues using:
https://raw.github.com/rackerhacker/MySQLTuner-perl/master/mysqltuner.pl
You use it like this:
wget https://raw.github.com/rackerhacker/MySQLTuner-perl/master/mysqltuner.pl
chmod +x mysqltuner.pl
./mysqltuner.pl
Of course, here I am assuming that you run some kind of a Unix based system.
You can also use OPTIMIZE TABLE. According to the manual, it does the following:
Reorganizes the physical storage of table data and associated index data, to reduce storage space and improve I/O efficiency when accessing the table. The exact changes made to each table depend on the storage engine used by that table.
The syntax is:
OPTIMIZE TABLE tablename
Inserts are typically faster when made in bulk rather than one by one. Try inserting 10, 30, or 100 records per statement.
If you use JDBC, you may be able to achieve the same effect with batching, without changing the SQL.
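For example, the multi-row form looks like this (table and columns are hypothetical):

INSERT INTO project_records (item_name, item_value) VALUES
  ('a', 1),
  ('b', 2),
  ('c', 3);  -- ...and so on, tens of rows per statement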

How do I make a MySQL database run completely in memory?

I noticed that my database server supports the MEMORY storage engine. I want to make a database I have already built on InnoDB run completely in memory, for performance.
How do I do that? I explored PHPMyAdmin, and I can't find a "change engine" functionality.
Assuming you understand the consequences of using the MEMORY engine as mentioned in comments, and here, as well as some others you'll find by searching about (no transaction safety, locking issues, etc) - you can proceed as follows:
MEMORY tables are stored differently than InnoDB, so you'll need to use an export/import strategy. First dump each table separately to a file using SELECT * FROM tablename INTO OUTFILE 'table_filename'. Create the MEMORY database and recreate the tables you'll be using with this syntax: CREATE TABLE tablename (...) ENGINE = MEMORY;. You can then import your data using LOAD DATA INFILE 'table_filename' INTO TABLE tablename for each table.
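A sketch of that sequence for a single hypothetical table (the server must be permitted to write to the output path; see the secure_file_priv option):

-- 1. dump the InnoDB table's rows to a server-side file
SELECT * FROM mytable INTO OUTFILE '/tmp/mytable.dat';

-- 2. recreate the table with the MEMORY engine
CREATE TABLE mytable_mem (
  id INT NOT NULL PRIMARY KEY,
  name VARCHAR(50)
) ENGINE = MEMORY;

-- 3. load the rows back in
LOAD DATA INFILE '/tmp/mytable.dat' INTO TABLE mytable_mem;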
It is also possible to place the MySQL data directory in a tmpfs, thus speeding up database read and write calls. It might not be the most efficient way to do this, but sometimes you can't just change the storage engine.
Here is my fstab entry for my MySQL data directory
none /opt/mysql/server-5.6/data tmpfs defaults,size=1000M,uid=999,gid=1000,mode=0700 0 0
You may also want to take a look at the innodb_flush_log_at_trx_commit=2 setting. Maybe this will speed up your MySQL sufficiently.
innodb_flush_log_at_trx_commit changes MySQL's disk-flush behaviour. When set to 2, it only flushes the log buffer to disk once per second. By default, every commit causes a flush (and with autocommit enabled, every insert is a commit), which generates much more I/O load.
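The setting is dynamic, so you can try it at runtime before making it permanent in my.cnf:

-- flush the InnoDB log buffer once per second instead of at every commit
SET GLOBAL innodb_flush_log_at_trx_commit = 2;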
Memory Engine is not the solution you're looking for. You lose everything that you went to a database for in the first place (i.e. ACID).
Here are some better alternatives:
Don't use joins; very few large apps (e.g. Google, Flickr, Netflix) do, because joins perform poorly over large data sets.
"A LEFT [OUTER] JOIN can be faster than an equivalent subquery because the server might be able to optimize it better—a fact that is not specific to MySQL Server alone."
— The MySQL Manual
Make sure the columns you're querying against have indexes. Use EXPLAIN to confirm they are being used.
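For example, with a hypothetical users table (look at the key column of the EXPLAIN output):

-- if the key column in the output is NULL, the query is doing a full scan
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
-- add an index so the lookup can use it
ALTER TABLE users ADD INDEX idx_email (email);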
Use and increase your query cache and the memory available to your indexes, to keep them in memory and serve frequent lookups from there.
Denormalize your schema, especially for simple joins (e.g. getting fooId from barMap).
The last point is key. I used to love joins, but then had to run joins on a few tables with 100M+ rows. No good. You're better off inserting the data you're joining against into the target table (if it's not too much) and querying against indexed columns; you'll get your result in a few ms.
I hope those help.
If your database is small enough (or if you add enough memory), it will effectively run in memory anyway, since your data will be cached after the first request.
Changing the database table definitions to use the memory engine is probably more complicated than you need.
If you have enough memory to load the tables into memory with the MEMORY engine, you have enough to tune the innodb settings to cache everything anyway.
"How do I do that? I explored PHPMyAdmin, and I can't find a "change engine" functionality."
In direct response to this part of your question: you can issue ALTER TABLE tbl ENGINE=MEMORY; (or any other engine) and it'll recreate the table in the chosen engine.
In place of the MEMORY storage engine, one can consider MySQL Cluster. It is said to give similar performance while supporting disk-backed operation for durability. I've not tried it, but it looks promising (and has been in development for a number of years).
You can find the official MySQL Cluster documentation here.
Additional thoughts:
Ramdisk - set the temp directory MySQL uses to a RAM disk; very easy to set up.
memcache - a memcached server is easy to set up; use it to store the results of your queries for X amount of time.

Changing tables from MyISAM to InnoDB makes the system slow

Hi, I am using MySQL 5.0.x
I have just changed a lot of the tables from MyISAM to InnoDB
With the MyISAM tables it took about 1 minute to install our database
With InnoDB it takes about 15 minutes to install the same database
Why does the InnoDB take so long?
What can I do to speed things up?
The database install performs the following steps:
1) Drops the schema
2) Create the schema
3) Create tables
4) Create stored procedures
5) Insert default data
6) Insert data via stored procedure
EDIT:
Inserting the default data takes most of the time.
Modify the Insert Data step to start a transaction at the beginning and commit it at the end. You will get an improvement, I guarantee it. (If you have a lot of data, you might want to break the transaction up per table.)
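Roughly, with a hypothetical settings table:

-- one flush at COMMIT instead of one per statement
START TRANSACTION;
INSERT INTO settings (name, value) VALUES ('locale', 'en');
INSERT INTO settings (name, value) VALUES ('theme', 'default');
-- ... the rest of the default-data inserts ...
COMMIT;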
If your application does not use transactions at all, then you should set the parameter innodb_flush_log_at_trx_commit to 2. This will give you a lot of performance back, because you almost certainly have autocommit enabled, which generates many more transactions than InnoDB's default parameters are configured for. This setting stops it unnecessarily flushing the disk buffers on every commit.
15 minutes doesn't seem excessive to me. After all, it's a one-time cost.
I'm not certain, but I would imagine that part of the explanation is that referential integrity isn't free. InnoDB has to do more work to guarantee it, so of course it takes more time.
Maybe your script needs to be altered to add constraints after the tables are created.
Like duffymo said, disable your constraints (indexes and foreign/primary keys) before inserting the data.
Maybe you should also restore some of the indexes before the data is inserted via the stored procedure, if it uses a lot of SELECT statements.
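For example, with a made-up table and index:

ALTER TABLE accounts DROP KEY idx_account_name;
-- ... bulk inserts of default data run without index maintenance ...
ALTER TABLE accounts ADD KEY idx_account_name (account_name);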