Optimizing DROP TABLE on a huge InnoDB table - mysql

I have an innodb table with over 140 million rows taking 26GB. I would like to drop this table however this is taking way too long, probably days. How can I speed up this query?
processor: Intel® Xeon® E3-1220 4 Cores x 3.1 GHz (3.4 Turbo Boost)
ram: 12 GB DDR3 ECC

If you don't want to check referential integrity while deleting records, use
TRUNCATE TABLE table_name
More in MySQL documentation.

This may depend on your innodb_file_per_table setting. You may find Performance problem with Innodb and DROP TABLE relevant:
If you’re not using innodb_file_per_table you’re not affected because
in this case tablespace does not need to be dropped so pages from
Buffer pool LRU do not need to be discarded. The raw performance of
dropping tables I measured on the test server was 10 times better with
single tablespace, though this will vary a lot depending on buffer
pool size. MyISAM tables creating/dropping was several times faster
still.
Also check Slow DROP TABLE

This solved my problem, who would've thought it was because of integrity checks:
https://dba.stackexchange.com/questions/53515/drop-table-on-a-huge-innodb-table/

Related

Will switch to MyISAM Engine help to improve the speed of reading operations?

I'm currently have a few tables with InnoDB Engine. 10-20 connections are constantly inserts data into those tables. I use MySQL RDS instance on AWS. Metric shows about 300 Write IOPS (counts/second). However, INSERT operations lock the table, and if someone want to perform a query like SELECT COUNT(*) FROM table; it could literally take a few hours for the first time before MySQL cache the result.
I'm not a DBA and my knowledge about DB are very limited. So the question is if I'll switch to MyISAM Engine will it help to improve the time of READ operations?
SELECT COUNT(*) without WHERE is bad query for InnoDB, as it does not cache the row count like MyISAM do. So if you have issue with this particular query, you have to cache the count somewhere - in a stats table for example.
After you remove this specific type of query, you can talk about InnoDB vs MyISAM read performance. Generally writes do not block reads in InnoDB - is uses MVCC for this. InnoDB performance however is very dependent of how much RAM you have set for the buffer pool.
InnoDB and MyISAM are very different in how they store data. You can always optimize for one of them and knowing the differences can help you in designing your application. Generally you can have as good performance for reading as in MyISAM in InnoDB tables - you just can use count without where clause, and you always should have a suitable index for where clauses, as in InnoDB table scan will be slower than in MyISAM.
I think you should stick with your current setup. InnoDB is supposed not to lock the table when inserting rows, since it uses the MVCC technique. On the other hand, MyISAM locks the entire table when new rows are inserted.
So, if you have many writes, you should stick with InnoDB.
Innodb is a better overall engine in general. There are some benchmarks out there that put read operations in myiasm a little ahead of innodb. However, if your site is big enough to notice this performance difference, you should be on innodb anyway because of all the other efficiencies. Innodb alone wins because of the row level locking instead if table level locking in myiasm when backing up your database.

Performance difference between Innodb and Myisam in Mysql

I have a mysql table with over 30 million records that was originally being stored with myisam. Here is a description of the table:
I would run the following query against this table which would generally take around 30 seconds to complete. I would change #eid each time to avoid database or disk caching.
select count(fact_data.id)
from fact_data
where fact_data.entity_id=#eid
and fact_data.metric_id=1
I then converted this table to innoDB without making any other changes and afterwards the same query now returns in under a second every single time I run the query. Even when I randomly set #eid to avoid caching, the query returns in under a second.
I've been researching the differences between the two storage types to try to explain the dramatic improvement in performance but haven't been able to come up with anything. In fact, much of what I read indicates that Myisam should be faster.
The queries I'm running are against a local database with no other processes hitting the database at the time of the tests.
That's a surprisingly large performance difference, but I can think of a few things that may be contributing.
MyISAM has historically been viewed as faster than InnoDB, but for recent versions of InnoDB, that is true for a much, much smaller set of use cases. MyISAM is typically faster for table scans of read-only tables. In most other use cases, I typically find InnoDB to be faster. Often many times faster. Table locks are a death knell for MyISAM in most of my usage of MySQL.
MyISAM caches indexes in its key buffer. Perhaps you have set the key buffer too small for it to effectively cache the index for your somewhat large table.
MyISAM depends on the OS to cache table data from the .MYD files in the OS disk cache. If the OS is running low on memory, it will start dumping its disk cache. That could force it to keep reading from disk.
InnoDB caches both indexes and data in its own memory buffer. You can tell the OS not to also use its disk cache if you set innodb_flush_method to O_DIRECT, though this isn't supported on OS X.
InnoDB usually buffers data and indexes in 16kb pages. Depending on how you are changing the value of #eid between queries, it may have already cached the data for one query due to the disk reads from a previous query.
Make sure you created the indexes identically. Use explain to check if MySQL is using the index. Since you included the output of describe instead of show create table or show indexes from, I can't tell if entity_id is part of a composite index. If it was not the first part of a composite index, it wouldn't be used.
If you are using a relatively modern version of MySQL, run the following command before running the query:
set profiling = 1;
That will turn on query profiling for your session. After running the query, run
show profiles;
That will show you the list of queries for which profiles are available. I think it keeps the last 20 by default. Assuming your query was the first one, run:
show profile for query 1;
You will then see the duration of each stage in running your query. This is extremely useful for determining what (e.g., table locks, sorting, creating temp tables, etc.) is causing a query to be slow.
My first suspicion would be that the original MyISAM table and/or indexes became fragmented over time resulting in the performance slowly degrading. The InnoDB table would not have the same problem since you created it with all the data already in it (so it would all be stored sequentially on disk).
You could test this theory by rebuilding the MyISAM table. The easiest way to do this would be to use a "null" ALTER TABLE statement:
ALTER TABLE mytable ENGINE = MyISAM;
Then check the performance to see if it is better.
Another possibility would be if the database itself is simply tuned for InnoDB performance rather than MyISAM. For example, InnoDB uses the innodb_buffer_pool_size parameter to know how much memory should be allocated for storing cached data and indexes in memory. But MyISAM uses the key_buffer parameter. If your database has a large innodb buffer pool and a small key buffer, then InnoDB performance is going to be better than MyISAM performance, especially for large tables.
What are your index definitions, there are ways in which you can create indexes for MyISAM in which your index fields will not be used when you think they would.

mysql innodb vs myisam inserts

I have a table with 17 million rows. I need to grab 1 column of that table and insert it all into another table. Here's what I did:
INSERT IGNORE INTO table1(name) SELECT name FROM main WHERE ID < 500001
InnoDB executes in around 3 minutes and 45 seconds
However, MyISAM executes in just below 4 seconds. Why the difference?
I see everyone praising InnoDB but honestly I don't see how it's better for me. It's so much slower. I understand that it's great for integrity and whatnot, but many of my tables will not be updated (just read). Should I even bother with InnoDB?
The difference is most likely due to configuration of innoDB, which takes a bit more tweaking than myISAM. The idea of innoDB is to keep most of your data in memory, and flushing/reading to disk only when you have a few spare cpu cycles.
should you even bother with InnoDB is a really good question. If you're going to keep using MySQL, it's highly recommended you get some experience with InnoDB. But if you're doing a quick-and-dirty job for a database that won't see a lot of traffic and not worried about scale, then the ease of MyISAM may just be a win for you. InnoDB can be overkill in many instances where someone just wants a simple database.
but many of my tables will not be updated
You can still get a performance lift from InnoDB if you are doing 99% reading. If you configure your buffer pool size to hold your entire database in memory, InnoDB will NEVER have to go to disk to get your data, even if it misses the mysql query cache.
In MyISAM, there is a good chance you have to read the row from disk, and you're leaving the operating system to do the caching and optimization for you.
innodb-buffer-pool-size
My first guess is to check innodb_buffer_pool_size which ships out of the box set to 8M. It's recommended to have this around 80% of your total memory. Once you hit that limit, innodb performance will drop significantly because it needs to flush something out of the buffer to make room for the new data, which can be expensive
autocommit=0
Also, make sure autocommit is turned off while you load your table, or flushing will happen on every insert. You can turn it back on after you're done, and it's a client-side setting. very safe.
Loading tables typically happens once
Think about if you really want to tune your database to accommodate "inserting 17million rows". How often do you do this? MyISAM might be quicker in this instance, but when you have 100 concurrent connections all reading and modifying this table at the same time, you'll find a well-tuned innoDB will win and MyISAM will choke on table locks.
How MyISAM sees this operation
MyISAM will be very good at this without any tuning, because under the covers, you're simply appending each row to a file (and updating an index). Your OS and disk caching will handle all those performance problems.
How InnoDB sees this operation
Innodb will know the table needs a write, so it throws the row into the insert buffer.
You give it no time before the next insert, so innoDB has no time to deal with the buffer, it runs out of room and is forced to 'hold up' the insert while it writes to the buffer pool and updates indexes.
Next, your buffer pool fills up, and innoDB is forced to 'hold up' the insert and flush some page out of the buffer pool to disk.
And you keep throwing inserts at it like crazy.
Note that when you do tune InnoDB to give you a MySQL> prompt very fast after you do this, InnoDB will still be scrambling underneath the covers to catch up in it's spare time, but will be willing to execute a new transaction for you.
MUST READ:
http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html (see bulk data loading tips)
You're saying right upto some extend. InnoDB is slower than MyISAM but in which cases?
Everything is not made to meet everyone's requirements. INNODB is a transactional database engine while MyISAM is not. Therefore to make it ACID compliance and transactions aware storage engine, we have to pay its cost in terms of response time.
Further more InnoDB runs faster if it is properly tuned using my.ini or other configuration file.
At the end I am able to understand following reasons why people are praising InnoDB:
It is ACID compliant and transaction supported engine
It take row-level locking while working on a table while MyISAM take table-level locks
InnoDB is highly tunable for multi-core/multi-process machines to improve concurrency
Last but not the least comment from my side; anything can meet "everyone's" needs so its solely depends in which scenario you're comparing both engines.
Check out MYISAM vs Innodb comparison on Wikipedia.
http://en.wikipedia.org/wiki/Comparison_of_MySQL_database_engines

Mysql MEMORY table vs InnoDB table (many inserts, few reads)

I run my sites all on InnoDB tables which is working really well so far. Now I like to know what is going on in real-time on my sites, so I store each pageview (page, referrer, IP, hostname, etc) in an InnoDB table. There are about 100 inserts per second, and this table is only read once in a while when i'm browsing the logs.
I clean out the table every minute with a cron that removes old items. This leaves about 35.000 rows in that table on average, with a size of about 5MB.
Would it be easier on the server if I were to transfer the InnoDB table to a MEMORY table? As far as I can see this would save a lot of disk IO right? Restarting Mysql would result in a loss of data, but this does not matter in my case.
Question: In my case, would you recommend a Memory table over a InnoDB table?
Yes I would. The conditions you mention (a lot of writes, periodic purging of data, data persistence not required) make it pretty much an ideal candidate for MEMORY.
please optimize your innodb settings:
As long as you have configured InnoDB to use enough memory to hold your entire table (with innodb_buffer_pool_size), and there is not excessive pressure from other InnoDB tables on the same server, the data will remain in memory. If you're concerned about write performance (and again barring other uses of the same system) you can reduce durability to drastically increase write performance by setting innodb_flush_log_at_trx_commit = 0 and disabling binary logging.
Using any sort of triggers with temporary tables will be a mess to maintain, and won't give you any benefits of transactionality on the temporary tables.
You can find more details right here:
http://dev.mysql.com/doc/refman/4.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

Slow DROP TEMPORARY TABLE

Ran into an interesting problem with a MySQL table I was building as a temporary table for reporting purposes.
I found that if I didn't specify a storage engine, the DROP TEMPORARY TABLE command would hang for up to half a second.
If I defined my table as ENGINE = MEMORY this short hang would disappear.
As I have a solution to this problem (using MEMORY tables), my question is why would a temporary table take a long time to drop? Do they not use the MEMORY engine by default? It's not even a very big table, a couple of hundred rows with my current test data.
Temporary tables, by default, will be created where ever the mysql configuration tells it to, typically /tmp or somewhere else on a disk. You can set this location (and even multiple locations) to a RAM disk location such as /dev/shm.
Hope this helps!
If the temporary file is created with InnoDb engine, which may be the case if your default engine was InnoDb, and the InnoDb buffer pool is large, DROP TEMPORARY TABLE may take some time since it needs to scan all pages to discard.
It was mentionned in a comment to this stack overflow question.
Note also that DROP (TEMPORARY) TABLE uses a LOCK that may have huge impact on all your server. See for example this.
At my work, we recently had a server slow down because we had an InnoDb buffer pool of 80 Gb and some SQL requests had been optimized using InnoDb temporary tables.
About 100 such DROP TEMPORARY TABLE requests every 5 minutes were sufficient to have a huge impact. And the problem was hard to debug since slow query log would tell us that UPDATEs of a single row accessed by primary key in some other table was taking two seconds, and there was an enormous amount of such updates. But even if most query time was spent on these updates, the problem was really because of the DROP TEMPORARY TABLE requests.