I regularly see those entries in the process list:
the state is "starting", the info is empty (it does not show a query) and the time goes up to several seconds.
I could not find any information about it in the documentation. Is this normal and what does it mean? Is it caused by degrading query performance?
I was trying to run the ANALYZE TABLE command on one table out of 900 tables in MySQL 5.7.30. It stalled the whole process list: connections spiked immediately and many statements were found in the state "Waiting for table flush", to the point that we reached our max_connections limit of 2500. We have been running ANALYZE TABLE for the last 3 years, but in the last month we have hit this issue four times. If we don't analyze our tables, we see severe performance issues and a lot of queries enter the "statistics" state. What are your thoughts on it?
You most definitely shouldn't be running ANALYZE regularly or automatically. It sounds like you were dodging the bullet of queries stuck in the "Waiting for table flush" state purely because the load on your servers was sufficiently low that you didn't notice it before. You should only ever run this on a table sparingly, when you have clear, definitive evidence that the index statistics on that table are sufficiently detached from reality to cause the query optimiser to regularly come up with egregiously poor execution plans.
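One way to gather that kind of evidence, as a minimal sketch: compare the optimiser's estimate for an index (for example the Cardinality column reported by SHOW INDEX) with a measured value (for example from SELECT COUNT(DISTINCT ...) on the same column). The helper name `stats_look_stale` and the default factor of 10 are assumptions made up for illustration, not MySQL settings.

```python
def stats_look_stale(estimated: int, actual: int, factor: float = 10.0) -> bool:
    """Return True when estimate and measurement differ by more than `factor`.

    `estimated` is the cardinality the optimiser believes in, `actual` is
    the value measured on the real data. Both names are hypothetical.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    # Clamp to 1 so that an estimate of 0 (empty table) cannot divide by zero.
    low, high = sorted((max(estimated, 1), max(actual, 1)))
    return high / low > factor
```

Only when a check like this fires repeatedly for a table would a one-off ANALYZE TABLE on that table be worth considering.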
Our production server gets stuck in the "Init for update" state whenever we start a query like:
update
<some_big_table>
set
<primary_key> = <some_sequence>.nextval
order by
<some_indexed_field>
While the query is stuck in this state, all other queries get stuck in the "commit" or "writing to binlog" state.
I couldn't find any relevant documentation for the same either.
That has to change every row in the table. So it effectively locks the entire table. And it takes a long time.
Hence, it blocks other queries touching the table for any purpose.
As for the "state": like most states, it does not mean much, and it is possibly misleading. (I would expect it to be finished with "init" and to be "performing" the update.)
What does it mean if the MySQL query:
SHOW PROCESSLIST;
returns "Sending data" in the State column?
I imagine it means the query has been executed and MySQL is sending "result" data to the client, but I'm wondering why it's taking so much time (up to an hour).
Thank you.
This is quite a misleading status. It should be called "reading and filtering data".
This means that MySQL has some data stored on the disk (or in memory) which is yet to be read and sent over. It may be the table itself, an index, a temporary table, a sorted output etc.
If you have a 1M records table (without an index) of which you need only one record, MySQL will still output the status as "sending data" while scanning the table, despite the fact it has not sent anything yet.
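To make that concrete, here is a toy model in Python (not how MySQL is implemented, just an illustration) that counts how many rows have to be examined to find one record with and without an index. The function names and the sample data are made up for this sketch, and it uses 100,000 rows instead of 1M to stay light on memory.

```python
def full_scan(rows, wanted_id):
    """Find a row by id without an index: every row has to be examined."""
    examined = 0
    match = None
    for row in rows:
        examined += 1
        if row[0] == wanted_id:
            match = row
    return match, examined


def indexed_lookup(index, wanted_id):
    """Find a row by id through an index: a single probe."""
    match = index.get(wanted_id)
    return match, (1 if match is not None else 0)


# Toy table of (id, name) rows and a dictionary standing in for an index on id.
rows = [(i, f"name-{i}") for i in range(100_000)]
index = {row[0]: row for row in rows}
```

Here `full_scan(rows, 42)` examines all 100,000 rows to return a single match, while `indexed_lookup(index, 42)` examines one. The time spent walking through those rows is what shows up as "Sending data", even though only one row is ever sent.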
MySQL 8.0.17 and later: This state is no longer indicated separately, but rather is included in the Executing state.
In this state:
The thread is reading and processing rows for a SELECT statement, and sending data to the client.
Operations occurring during this state tend to perform large amounts of disk access (reads).
That's why it takes more time to complete and is often the longest-running state over the lifetime of a given query.
I have a service that sits on top of a MySQL 5.5 database (InnoDB). The service has a background job that is supposed to run every week or so. On a high level the background job does the following:
1. Do some initial DB read and write in one transaction.
2. Execute UMQ (described below) with a set of parameters in one transaction. If no records are returned, we are done!
3. Process the result from UMQ (this is a bit heavy, so it is done outside of any DB transaction).
4. Write the outcome of the previous step to the DB in one transaction (this writes to tables queried by UMQ and ensures that the same records are not found again by UMQ).
5. Go to step 2.
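The steps above can be sketched roughly as the following loop. This is not the actual service code: the four callables are hypothetical placeholders for the real steps, and `max_iterations` is a safety guard added for the sketch.

```python
from typing import Callable, Sequence


def run_background_job(
    initial_step: Callable[[], None],
    run_umq: Callable[[], Sequence],
    process: Callable[[Sequence], object],
    write_outcome: Callable[[object], None],
    max_iterations: int = 1000,
) -> int:
    """Run the job loop and return the number of batches processed."""
    initial_step()                     # step 1: initial read/write, one transaction
    for iteration in range(max_iterations):
        batch = run_umq()              # step 2: UMQ in its own transaction
        if not batch:
            return iteration           # no records returned: done
        outcome = process(batch)       # step 3: heavy work, outside any transaction
        write_outcome(outcome)         # step 4: persist, so UMQ skips these records
    raise RuntimeError("UMQ kept returning records; giving up")
```

Each pass corresponds to one iteration in the timings below, so the UMQ runs once per batch of up to 1000 records.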
UMQ - Ugly Monster Query: This is a nasty database query that joins a bunch of tables, has conditions on columns in several of these tables and includes a NOT EXISTS subquery with some more joins and conditions. UMQ includes an ORDER BY and also has LIMIT 1000. Even though the query is bad, I have done what I can here: there are indexes on all columns filtered on and the joins are all over foreign key relations.
I do expect UMQ to be heavy and take some time, which is why it's executed in a background job. However, what I'm seeing is rapidly degrading performance until it eventually causes a timeout in my service (maybe 50 times slower after 10 iterations).
First I thought that it was because the data queried by UMQ changes (see step 4 above), but that wasn't it: if I took the last query (the one that caused the timeout) from the slow query log and executed it myself directly, I got the same behavior, but only until I restarted the MySQL service. After the restart, the exact same query on the exact same data that took >30 seconds before the restart now took <0.5 seconds. I can reproduce this behavior every time by restoring the database to its initial state and restarting the process.
Also, using the trick described in this question I could see that the query scans around 60K rows after restart as opposed to 18M rows before. EXPLAIN tells me that around 10K rows should be scanned and the result of EXPLAIN is always the same. No other processes are accessing the database at the same time and the lock_time in the slow query log is always 0. SHOW ENGINE INNODB STATUS before and after restart gives me no hints.
So finally the question: Does anybody have any clue of why I'm seeing this behavior? And how can I analyze this further?
I have the feeling that I need to configure MySQL differently in some way but I have searched and tested like crazy without coming up with anything that makes a difference.
Turns out that the behavior I saw was the result of how the MySQL optimizer uses InnoDB statistics to decide on an execution plan. This article put me on the right track (even though it does not exactly discuss my problem). The most important thing I learned from this is that MySQL calculates statistics on startup and then once in a while. These statistics are then used to optimize queries.
The way I had set up the test data, the table T (where most writes are done in step 4) started out empty. After each iteration T would contain more and more records, but the InnoDB statistics had not yet been updated to reflect this. Because of this, the MySQL optimizer always chose an execution plan for UMQ (which includes a JOIN with T) that worked well when T was empty but got worse and worse the more records T contained.
To verify this I added an ANALYZE TABLE T; before every execution of UMQ and the rapid degradation disappeared. No lightning performance but acceptable. I also saw that leaving the database for half an hour or so (maybe a bit shorter but at least more than a couple of minutes) would allow the InnoDB statistics to refresh automatically.
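For reference, that verification step could look roughly like the sketch below. It assumes a Python DB-API (PEP 249) style cursor from whichever MySQL driver is in use. The helper name `run_with_fresh_stats` is made up for illustration, and the placeholder style for `params` depends on the driver. This is meant as a diagnostic aid, not something to leave in production code.

```python
import re

# Table names cannot be passed as query parameters, so only plain identifiers are allowed.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def run_with_fresh_stats(cursor, table, query, params=()):
    """Refresh the statistics of `table`, then run `query` and return its rows."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    cursor.execute(f"ANALYZE TABLE `{table}`")
    cursor.fetchall()  # ANALYZE TABLE returns a status result set; read it before moving on
    cursor.execute(query, params)
    return cursor.fetchall()
```

If the slowdown disappears with this in place, stale statistics are the likely cause.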
In a real scenario the relative difference in index cardinality for the tables involved in UMQ will look quite different and will not change as rapidly so I have decided that I don't really need to do anything about it.
Thank you very much for the analysis and answer. I had been chasing this issue for several days during CI on MariaDB 10.1 and Bacula server 9.4 (Debian Buster).
The situation was that after a fresh server installation during a CI cycle, the first two tests (backup and restore) ran smoothly on the unrestarted MariaDB server, and only the third test showed that one particular UMQ took about 20 minutes (building the directory tree during the restore process from a table with about 30k rows).
Unless the MariaDB server was restarted or the table was analyzed, the problem would not go away. ANALYZE TABLE or the restart changed the cardinality of the fields and the internal query processing exactly as stated in the linked article.