How to tell if a MySQL process is stuck? - mysql

I have a long-running process in MySQL. It has been running for a week. There is one other connection, to a replication master, but I have halted slave processing so there's effectively nothing else going on.
How can I tell if this process is still working? I knew it would take a long time which is why I put it on its own database instance, but this is longer than I anticipated. Obviously, if it is still doing work, I don't want to kill it. If it is zombied, then I don't know how to get the work done that it's supposed to be doing.
It's in the "Sending data" state. The table is an InnoDB one but without any FK references that are used by the query. The InnoDB status shows no errors or locks since the query started.
Any thoughts are appreciated.

Try "SHOW PROCESSLIST" to see what's active.
Of course if you kill it, it may then want to take just as much time rolling it back.
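One way to check whether the query is still doing work, rather than only seeing that it is listed, is to sample a row counter a few times and see whether it moves. The sketch below assumes you record `Innodb_rows_read` from `SHOW GLOBAL STATUS` about a minute apart; because nothing else is running on this instance, any movement can be attributed to the long query. The function names and the `min_rate` threshold are hypothetical, chosen only for illustration.

```python
from __future__ import annotations

from typing import Sequence


def rows_read_per_second(samples: Sequence[tuple[float, int]]) -> list[float]:
    """Turn (unix_time, Innodb_rows_read) samples, oldest first, into rates."""
    rates = []
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("samples must be in strictly increasing time order")
        rates.append((r1 - r0) / elapsed)
    return rates


def looks_stuck(samples: Sequence[tuple[float, int]], min_rate: float = 1.0) -> bool:
    """True only if every interval read fewer than min_rate rows per second.

    With fewer than two samples there is no interval to judge, so the
    answer is False (not enough evidence to call it stuck).
    """
    rates = rows_read_per_second(samples)
    return bool(rates) and all(rate < min_rate for rate in rates)
```

A counter that keeps climbing only shows that rows are being read; it says nothing about how close the query is to finishing.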

You need to kill it and come up with better indices.
I did a job for a guy. Had a table with about 35 million rows. His batch process, like yours, had been running a week, with no end in sight. I added some indexes, made some changes to the order and methods of his batch process, and got the whole thing down to about two and a half hours. On a slower machine.

Given what you've said, it's not stuck. However, there is absolutely no guarantee that it will actually finish in anything resembling a reasonable amount of time. Adding indexes will almost certainly help, and depending on the type of query, refactoring it into a series of queries that use temp tables could give you a huge performance boost. I wouldn't suggest waiting around for it to maybe finish.

For better performance on a database that size, you may want to look at a document based database such as mongoDB. It will take more hard drive space to store the database, but depending on your current schema, you may get much better performance.

Related

InnoDB: breaking and fixing indexes

From time to time some indexes in our tables get broken, the DB starts consuming 100% CPU, and after a while it gets completely stuck. Even simple queries won't finish and restarts don't help.
What I found is to either drop and recreate the indexes one by one (which might take a long time and a lot of investigation) or just call alter table mytable engine=innodb; on the suspicious table. This actually works quite well: it fixes everything and everything gets back to normal. But I have no idea what actually happens in the background and why it helps. Also, would it help to do this manually once a month? Is it a good idea to automate this? Is there some way to do a DB health check?
A guess...
You have an older version of MySQL/Percona, one that either does not have "persistent statistics" or does not have it enabled.
And you have a nasty query that sometimes leads the Optimizer to pick the wrong query plan.
The quick fix (that may or may not work) is to run ANALYZE TABLE of the table(s) in the slow query.
A better fix may be to upgrade the version.
Meanwhile, let's see the query, its EXPLAIN, and SHOW CREATE TABLE for each table involved. There may be a way to reformulate it to be less flaky.

Mysql Lock times in slow query log

I have an application that has been running fine for quite a while, but recently a couple of items have started popping up in the slow query log.
All the queries are complex and ugly multi join select statements that could use refactoring. I believe all of them have blobs, meaning they get written to disk. The part that gets me curious is why some of them have a lock time associated with them. None of the queries have any specific locking protocols set by the application. As far as I know, by default you can read against locks unless explicitly specified.
So my question: what scenarios would cause a select statement to have to wait for a lock (and thereby be reported in the slow query log)? Assume both InnoDB and MyISAM environments.
Could the disk interaction be listed as some sort of lock time? If yes, is there documentation around that says this?
Thanks in advance.
MyISAM will give you concurrency problems: an entire table is completely locked when an insert is in progress.
InnoDB should have no problems with reads, even while a write/transaction is in progress, due to its MVCC.
However, just because a query is showing up in the slow-query log doesn't mean the query is slow - how many seconds, how many records are being examined?
Put "EXPLAIN" in front of the query to get a breakdown of the examinations going on for the query.
here's a good resource for learning about EXPLAIN (outside of the excellent MySQL documentation about it)
I'm not certain about MySQL, but I know that in SQL Server select statements do NOT read against locks. Doing so would allow you to read uncommitted data, and potentially see duplicate records or miss a record entirely. The reason is that if another process is writing to the table, the database engine may decide it's time to reorganize some data and shift it around on disk. So it moves a record you already read to the end and you see it again, or it moves one from the end up higher, to a point you've already passed.
There's a guy on the net somewhere who actually wrote a couple of scripts to prove that this happens and I tried them once and it only took a few seconds before a duplicate showed up. Of course, he designed the scripts in a fashion that would make it more likely to happen, but it proves that it definitely can happen.
This is okay behaviour if your data doesn't need to be accurate and can certainly help prevent deadlocks. However, if you're working on an application dealing with something like people's money then that's very bad.
In SQL Server you can use the WITH (NOLOCK) hint to tell your select statement to ignore locks. I'm not sure what the equivalent in MySQL would be but maybe someone else here will say.

Extreme low-priority SELECT query in MySQL

Is it possible to issue an (expensive, but low-priority) SELECT query to MySQL in such a way that if an UPDATE query appears in the queue, MySQL will immediately terminate the query, and re-append it to the end of the queue?
If re-appending to the queue is not possible, I'm happy with simply killing the SELECT query.
No, not really.
I am not sure exactly what you need, but my guess is that you need to either optimize the SELECT to not lock an entire table, or get the replication going and do the SELECT on the slave rather than the master.
You could theoretically find out what the MySQL process ID is of the SELECT query, and in your application send a KILL before you do any update.
Well, sort of maybe.
A client runs an application which occasionally throws out queries that completely kill performance for everything else on the server. We have monitoring, and if we've got a suitable person ready to react, we can deal with that query manually, and we learn about the problems in the app by doing things that way.
But to prevent major outages if no one is on the ball, we have an automated script which terminates long-running queries, so the server does recover in the event that no one is available to intervene within 15 minutes.
Far from ideal, but that's where things are currently at with this project, and it does prevent the occasional extended outages that used to occur. We can only move just so fast with fixing up the problem queries.
Anyway, you could run something similar, that looks at the running queries and recognises when you have an update waiting on one of your large selects, and in that event it kills the select. Doing this sort of check a few times a minute is not overly expensive. I'd want to do a bit of testing before running.
So, whether you can solve your problem this way depends on what your tolerance is for how long an update can be delayed. Running this every minute (as we do) is no problem at all. Running it every second would noticeably add to the overall load. You'd need to test how far you can reasonably go in between those points.
This approach means some delay before the select gets pushed out of the way, but it saves you having to build this logic into potentially many different places in your application.
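A minimal sketch of the decision step such a script might use is below. It assumes the rows come from `SHOW FULL PROCESSLIST` as dictionaries keyed by the usual column names (`Id`, `Time`, `State`, `Info`). The function name, the `min_seconds` threshold, and the set of "waiting" state strings are assumptions for illustration; the state text differs between MySQL versions, so check what your server actually reports.

```python
from __future__ import annotations

# Thread states taken to mean "waiting for a table lock". The exact strings
# vary by MySQL version; these two are assumptions to adjust for your server.
WAITING_STATES = {"Locked", "Waiting for table level lock"}


def _statement(row: dict) -> str:
    """Upper-cased statement text of a processlist row ('' if Info is NULL)."""
    return (row.get("Info") or "").lstrip().upper()


def selects_to_kill(processlist: list[dict], min_seconds: int = 60) -> list[int]:
    """Return ids of long-running SELECTs, but only if an UPDATE is waiting."""
    update_waiting = any(
        row.get("State") in WAITING_STATES and _statement(row).startswith("UPDATE")
        for row in processlist
    )
    if not update_waiting:
        return []
    return [
        row["Id"]
        for row in processlist
        if _statement(row).startswith("SELECT")
        and (row.get("Time") or 0) >= min_seconds
    ]
```

Each returned id would then be passed to a `KILL <id>` statement. This version does not check that the waiting UPDATE is actually blocked by the SELECT it kills, which is one of the things to test before running it against a real server.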
--
Regarding breaking up your query, you're most likely better off restricting the chunks by id range from one or more tables in your query rather than by offset and limit.
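As a rough illustration of chunking by id range, the generator below yields inclusive `(low, high)` pairs that cover a range of ids. It assumes an integer key column (called `id` here purely as an example) whose minimum and maximum you have already looked up; the function name and parameters are hypothetical.

```python
from __future__ import annotations

from typing import Iterator


def id_ranges(min_id: int, max_id: int, chunk_size: int) -> Iterator[tuple[int, int]]:
    """Yield inclusive (low, high) id ranges covering min_id..max_id."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + chunk_size - 1, max_id)
        yield low, high
        low = high + 1
```

Each pair would go into a condition such as `WHERE id BETWEEN low AND high`. Unlike a growing OFFSET, that condition lets the server seek straight to the start of each chunk through the index instead of scanning and discarding all the earlier rows.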
--
There may also be good solutions available based on partitioning your tables so that the queries don't collide as badly. Make sure you have a very good grasp on what you are doing for this though.

How much time does MySQL need to build an index

How much time does MySQL need to build an index of a table with 30,000,000 entries that are strings of length 256?
At the moment it seems to take hours and I don't know how long I should wait till I conclude that MySql simply failed at building an index.
You may run SHOW PROCESSLIST \G in mysql console to watch its state. I had a similar problem just a couple of hours ago, but my table was much smaller.
Here is a list of thread states you will definitely need. After an hour of waiting I realized that the ALTER TABLE / CREATE INDEX statement was in the Locked state, so I needed to restart mysqld and run the statement once again. That time I had the index built in 15 minutes.
By the way, I recommend running index creation from the mysql console; GUI tools may add complications of their own to the process.
It could easily take hours. It all depends on the machine specs, load, etc. To see whether it has failed, check something like top or watch your hard drives: if they're going mad, it's still indexing.
Depending on your OS, you may check for disk activity (i.e. is it reading/writing the DB files?) to find out whether it has failed or not.
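If you would rather script the disk check than watch it by eye, one possible approach is to snapshot the sizes and modification times of the files in the database directory twice and compare the two. This is only a sketch: which directory to point it at depends on your installation, and the function names are made up for this example.

```python
import os


def snapshot(directory):
    """Map each regular file in directory to (size_in_bytes, mtime)."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                result[entry.name] = (info.st_size, info.st_mtime)
    return result


def changed_files(before, after):
    """Names of files that are new or whose size/mtime differs, sorted."""
    return sorted(name for name, meta in after.items() if before.get(name) != meta)
```

Take one snapshot, wait a minute or so, take another, and pass both to `changed_files`. An empty result over several intervals suggests nothing is being written, though reads alone would not show up this way.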

MySQL slow query log - how slow is slow?

What do you find is the optimal setting for mysql slow query log parameter, and why?
I recommend these three lines
log_slow_queries
set-variable = long_query_time=1
log-queries-not-using-indexes
The first and second will log any query over a second. As others have pointed out, a one-second query is pretty far gone if you are shooting for a high transaction rate on your website, but I find that it turns up some real WTFs: queries that should be fast, but for whatever combination of data they were run against were not.
The last will log any query that does not use an index. Unless you're doing data warehousing, any common query should have the best index you can find, so pay attention to its output.
Although it's certainly not for production, this last option
log = /var/log/mysql/mysql.log
will log all queries, which can be useful if you are trying to tune a specific page or action.
Whatever time /you/ feel is unacceptably slow for a query on your systems.
It depends on the kind of queries you run and the kind of system; a query taking several seconds might not matter if it's some back-end reporting system doing complex data-mining etc where a delay doesn't matter, but might be completely unacceptable on a user-facing system which is expected to return results promptly.
Set it to whatever you like. The only problem is that in a stock MySQL (at least in versions before 5.1.21), it can only be set in increments of 1 second, which is too coarse for some people.
Most heavily used production servers execute far too many queries to log them all. The slow log is a way of filtering the log so that we can see the ones which take a long time (most queries are likely to be executed almost instantly). It's a bit of a blunt instrument.
Set it to 1 sec if you like, you're probably not going to run out of disc space or create a performance problem by doing that.
It's really about the risk of enabling the slow log: don't do it if you feel it's likely to cause further disc or performance problems.
Of course you could enable the slow log on a non-production server and put simulated load through, but that is never quite the same.
Peter Zaitsev posted a nice article about using the slow query log. One thing he notes as important is to also consider how often a certain query is used. Reports that run once a day don't need to be fast. But something that runs very often might be a problem even if it takes half a second. And you can't detect that without the microslow patch.
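To make the frequency point concrete, here is a small sketch that ranks queries by total time spent rather than by the time of a single execution. It assumes you have already reduced your log to `(query_pattern, seconds)` pairs; the function name and the sample patterns are hypothetical.

```python
from collections import defaultdict


def rank_by_total_time(entries):
    """Return (pattern, count, total_seconds) tuples, largest total first."""
    totals = defaultdict(lambda: [0, 0.0])
    for pattern, seconds in entries:
        totals[pattern][0] += 1
        totals[pattern][1] += seconds
    ranked = [(pattern, count, total) for pattern, (count, total) in totals.items()]
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked


# One 30-second report versus a half-second lookup that runs 100 times.
entries = [("daily_report", 30.0)] + [("lookup", 0.5)] * 100
print(rank_by_total_time(entries))
```

In this made-up sample the half-second lookup accounts for 50 seconds in total and so ranks above the 30-second report, even though no single execution of it would cross a one-second threshold.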
Not only is it a blunt instrument as far as resolution is concerned, but it is also MySQL-instance wide, so if you have different databases with differing performance requirements you're kind of out of luck. Obviously there are ways around that, but it's important to keep that in mind when choosing your slow log setting.
Aside from performance requirements of your application, another factor to consider is what you're trying to log. Are you using the log to catch queries that would threaten the stability of your db instance (ones that cause deadlocks or Cartesian joins, for instance) or queries that affect the performance for specific users and that might require a little tuning? That will influence where you set your threshold.