InnoDB: breaking and fixing indexes - mysql

From time to time it happens that some indexes in our tables get broken and the DB starts consuming 100% CPU load; after some time it gets completely stuck. Even simple queries won't finish and restarts don't help.
What I've found is that I can either drop and recreate the indexes one by one (which can take a loooong time and a lot of investigation) or just run alter table mytable engine=innodb; on the suspicious table. This actually works quite well: it fixes everything and things get back to normal. But I have no idea what actually happens in the background or why it helps. Also – would it help to do this manually once a month? Is it a good idea to automate this? Is there some way to do a DB health check?
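For completeness, the rebuild I run looks roughly like this (mytable stands in for whichever table looks suspicious):

    -- Rebuilds the table and all of its secondary indexes in place
    ALTER TABLE mytable ENGINE=InnoDB;
    -- For InnoDB, OPTIMIZE TABLE is mapped to a similar rebuild and refreshes index statistics
    OPTIMIZE TABLE mytable;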

A guess...
You have an older version of MySQL/Percona, one that either does not have "persistent statistics" or does not have it enabled.
And you have a nasty query that sometimes leads the Optimizer to pick the wrong query plan.
The quick fix (which may or may not work) is to run ANALYZE TABLE on the table(s) in the slow query.
A better fix may be to upgrade the version.
Meanwhile, let's see the query, its EXPLAIN, and SHOW CREATE TABLE for each table involved. There may be a way to reformulate it to be less flaky.
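Something along these lines, with the real table and query filled in, is what I'd want to see (mytable is a placeholder):

    -- Quick fix: refresh the optimizer's statistics for the table
    ANALYZE TABLE mytable;
    -- Then post these with the question
    SHOW CREATE TABLE mytable\G
    EXPLAIN SELECT ...;   -- the slow query goes here, unchanged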

Related

How do Indexes work in MySQL?

I found a lot of information on how indexes work in MySQL by looking at the following SO link: How do MySQL indexes work? However, I am facing a MySQL issue I cannot resolve, and I'm unsure whether it is related to indexing or not.
The problem is: I used multiple indexes in most of my tables, and everything seems to be working fine. However, when I restore an old backup over my existing data, the size of the db keeps growing (it almost doubles each time).
Example: I was using a MySQL db named DB1 last week; I made a backup and continued to use DB1. A few days later, I needed to continue from that backup, so I restored it to DB1.
Before the restore, DB1's size was 115MB, but afterward it was suddenly 350MB.
Can anyone help shed some light on what might be happening?
This is not surprising. If you have lots of indexes, it's not unusual for them to take up as much space as the data itself.
When you are talking about 115MB vs. 350MB though, I'd guess the increase in query speed you get is probably worth that extra couple hundred megs of disk space. If not, then you might want to take a closer look at your indexes and make sure they are all actually providing some benefit.
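If you want to see how much of that is index versus data, a query along these lines against information_schema will break it down (substitute your schema name for DB1):

    -- Data vs. index size per table, in MB
    SELECT table_name,
           ROUND(data_length  / 1024 / 1024, 1) AS data_mb,
           ROUND(index_length / 1024 / 1024, 1) AS index_mb
    FROM information_schema.TABLES
    WHERE table_schema = 'DB1'
    ORDER BY data_length + index_length DESC;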

Process of adding an index to production db

Let's say I have a table with 1M rows and I need to add an index on it. What would be the process of doing so? Am I able to run ALTER TABLE table ADD INDEX column (column) directly on production, or do I need to take other things into account before adding the index?
How does one normally go about adding an index on a live production db?
You should first try adding the same index on your test database under similar load conditions and check that it doesn't cause problems. It's possible that by creating the index you lock the table for some time and cause other queries to fail.
One million rows is a large table, but it's not huge. You will probably find that adding the index completes reasonably quickly. Unless you have real-time constraints it's unlikely to cause serious issues. It's definitely worth testing it first, though.
Personally I'd wait until things were quiet, but you don't have to take the database offline. Users might see a delay, but it shouldn't time out. (If it does, something else is wrong.)
Note that adding an index doesn't make the dbms copy the whole table (as adding a column might).
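As a concrete sketch (the table, column, and index names below are placeholders):

    -- Plain form; on older MySQL versions this may copy the table and block writes
    ALTER TABLE mytable ADD INDEX idx_mycol (mycol);
    -- On MySQL 5.6+ InnoDB you can request an online build explicitly;
    -- the statement fails instead of silently blocking if it can't comply
    ALTER TABLE mytable ADD INDEX idx_mycol (mycol), ALGORITHM=INPLACE, LOCK=NONE;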

Mysql Lock times in slow query log

I have an application that has been running fine for quite a while, but recently a couple of items have started popping up in the slow query log.
All the queries are complex, ugly multi-join select statements that could use refactoring. I believe all of them have blobs, meaning they get written to disk. The part that makes me curious is why some of them have a lock time associated with them. None of the queries have any specific locking protocols set by the application. As far as I know, by default you can read against locks unless specified otherwise.
So my question: what scenarios would cause a select statement to have to wait for a lock (and thereby be reported in the slow query log)? Assume both InnoDB and MyISAM environments.
Could the disk interaction be listed as some sort of lock time? If yes, is there documentation around that says this?
thanks in advance.
MyISAM will give you concurrency problems: an entire table is completely locked while an insert is in progress.
InnoDB should have no problems with reads, even while a write/transaction is in progress, thanks to its MVCC.
However, just because a query is showing up in the slow-query log doesn't mean the query is slow - how many seconds, how many records are being examined?
Put "EXPLAIN" in front of the query to get a breakdown of the examinations going on for the query.
Here's a good resource for learning about EXPLAIN (outside of the excellent MySQL documentation about it)
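For example (the tables here are just illustrative):

    -- Prepend EXPLAIN to the slow query exactly as the application runs it
    EXPLAIN SELECT o.id, c.name
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.created_at > '2013-01-01';
    -- Big numbers in the rows column, or "Using temporary; Using filesort" in Extra,
    -- usually point at the part of the query that needs an index or a rewrite.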
I'm not certain about MySQL, but I know that in SQL Server select statements do NOT read against locks. Doing so would allow you to read uncommitted data, and potentially see duplicate records or miss a record entirely. The reason is that if another process is writing to the table, the database engine may decide it's time to reorganize some data and shift it around on disk. So it moves a record you already read to the end and you see it again, or it moves one from the end up higher, past the point you've already read.
There's a guy on the net somewhere who actually wrote a couple of scripts to prove that this happens and I tried them once and it only took a few seconds before a duplicate showed up. Of course, he designed the scripts in a fashion that would make it more likely to happen, but it proves that it definitely can happen.
This is okay behaviour if your data doesn't need to be accurate and can certainly help prevent deadlocks. However, if you're working on an application dealing with something like people's money then that's very bad.
In SQL Server you can use the WITH NOLOCK hint to tell your select statement to ignore locks. I'm not sure what the equivalent in MySql would be but maybe someone else here will say.
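For what it's worth, the closest MySQL equivalent I'm aware of is not a per-query hint but the session isolation level:

    -- Roughly comparable to WITH (NOLOCK): allows dirty reads for this session
    SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT ...;   -- your query goes here
    -- Switch back to the InnoDB default afterwards
    SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;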

MySQL Crippling performance - indexes

Just wondering if anyone knows a quick way to check the health of some indexes on a table. The one we are having trouble with is quite a large table, but it has indexes so should be ok ("show indexes from mytable" shows them as present).
But it's going really slowly whenever we try to access this table, so we're wondering if we need to rebuild the indexes or something. None of us here are DBAs, so we'd really appreciate any tips; it's getting quite urgent :(
It's a MyISAM table by the way, dumped from a v4 DB to a v5 database.
Thanks
Check Table
Turn on slow query logging if it's not already on.
Run explain on the slow queries to investigate why they're running slow.
MyISAM tables do not always update index distribution information, so sometimes we need to do it manually: http://dev.mysql.com/doc/refman/5.0/en/analyze-table.html
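In practice that just means running something like this now and then (mytable is a placeholder):

    -- Recompute the key distribution statistics the optimizer relies on
    ANALYZE TABLE mytable;
    -- CHECK TABLE, as suggested above, reports whether the table or its indexes are corrupted
    CHECK TABLE mytable;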
Thanks for the help everyone, really appreciate it (I know it's been a week or so since I posted this, it's been a very busy time...). It turned out that the indexes were fine, but they were disabled. We think it happened because the backup we took crashed halfway through. Apparently a backup disables the indexes, then re-enables them afterwards. Since it crashed, they never got re-enabled. Once we turned them back on, it's super quick, phew....
Hope that's useful for someone else
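If anyone hits the same thing, this is roughly how to spot and fix it (mytable is a placeholder):

    -- Disabled MyISAM keys normally show up as 'disabled' in the Comment column
    SHOW INDEX FROM mytable;
    -- Rebuild/re-enable the keys; this can take a while on a big table
    ALTER TABLE mytable ENABLE KEYS;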

How to tell if a MySQL process is stuck?

I have a long-running process in MySQL. It has been running for a week. There is one other connection, to a replication master, but I have halted slave processing so there's effectively nothing else going on.
How can I tell if this process is still working? I knew it would take a long time which is why I put it on its own database instance, but this is longer than I anticipated. Obviously, if it is still doing work, I don't want to kill it. If it is zombied, then I don't know how to get the work done that it's supposed to be doing.
It's in the "Sending data" state. The table is an InnoDB one but without any FK references that are used by the query. The InnoDB status shows no errors or locks since the query started.
Any thoughts are appreciated.
Try "SHOW PROCESSLIST" to see what's active.
Of course if you kill it, it may then want to take just as much time rolling it back.
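Concretely, these two are the first things to look at:

    -- One row per connection: its state ("Sending data", etc.) and how long
    -- the current statement has been running
    SHOW FULL PROCESSLIST;
    -- InnoDB transaction and lock details; the "undo log entries" count for the
    -- transaction can give a rough idea of whether it's still making progress
    SHOW ENGINE INNODB STATUS\G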
You need to kill it and come up with better indices.
I did a job for a guy. Had a table with about 35 million rows. His batch process, like yours, had been running a week, with no end in sight. I added some indexes, made some changes to the order and methods of his batch process, and got the whole thing down to about two and a half hours. On a slower machine.
Given what you've said, it's not stuck. However, there is absolutely no guarantee that it will actually finish in anything resembling a reasonable amount of time. Adding indices will almost certainly help, and depending on the type of query, refactoring it into a series of queries that use temp tables could give you a huge performance boost. I wouldn't suggest waiting around for it to maybe finish.
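A sketch of what I mean by staging the work through a temp table (all names are made up):

    -- Materialize the expensive intermediate result once
    CREATE TEMPORARY TABLE tmp_recent_orders AS
        SELECT customer_id, SUM(total) AS total_spent
        FROM orders
        WHERE created_at > '2013-01-01'
        GROUP BY customer_id;
    -- Index it, then join against it instead of repeating the work per row
    ALTER TABLE tmp_recent_orders ADD INDEX (customer_id);
    SELECT c.name, t.total_spent
    FROM customers c
    JOIN tmp_recent_orders t ON t.customer_id = c.id;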
For better performance on a database that size, you may want to look at a document-based database such as MongoDB. It will take more hard drive space to store the database, but depending on your current schema, you may get much better performance.