I have MyISAM tables with a big number of rows - 10M to 500M. The tables are used to store time series data infrequently and I would like to optimize it for SELECT, which I am doing throung 2 indexes: the epoch and a classifier column (which contain a few thousand categories).
My issue is that the first SELECT I do for a particular category is quite long (10 to 50 sec) while the subsequent ones are really fast, even when using SQL_NO_CACHE. Such a query will typically return between 100,000 and 1M elements.
Profiling shows that MySQL spends a lot of time "Sending data". This would indicate that a lot of the time spent is done doing disk I/O. But I do not really understand where the bottleneck is:
Is the bottleneck in the BTREE read? The tree has only a few thousands nodes and then less than 1M points in the selected node. I cannot believe it would take 30 seconds to do that on a modern machine, even with old school hard drives.
Is it in reading the rows in the table? That is again less than a million rows with an average length of ~40bytes.
Something else I am not accounting for?
Here are the query results:
mysql> SELECT SQL_NO_CACHE COUNT(`Time`) FROM archive_1 WHERE Channel=63;
+---------------+
| COUNT(`Time`) |
+---------------+
| 450619 |
+---------------+
1 row in set (28.67 sec)
mysql> SELECT SQL_NO_CACHE COUNT(`Time`) FROM archive_1 WHERE Channel=63;
+---------------+
| COUNT(`Time`) |
+---------------+
| 450619 |
+---------------+
1 row in set (2.20 sec)
mysql> SELECT SQL_NO_CACHE COUNT(`Time`) FROM archive_1 WHERE Channel=63;
+---------------+
| COUNT(`Time`) |
+---------------+
| 450619 |
+---------------+
1 row in set (0.88 sec)
mysql> SHOW PROFILES;
+----------+-------------+-----------------------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+-------------+-----------------------------------------------------------------------------------+
| 1 | 28.66720725 | SELECT SQL_NO_CACHE COUNT(`Time`) FROM archive_1 WHERE Channel=63 |
| 2 | 2.19872350 | SELECT SQL_NO_CACHE COUNT(`Time`) FROM archive_1 WHERE Channel=63 |
| 3 | 0.87811475 | SELECT SQL_NO_CACHE COUNT(`Time`) FROM archive_1 WHERE Channel=63 |
+----------+-------------+-----------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
mysql> SHOW PROFILE FOR QUERY 1;
+----------------------+-----------+
| Status | Duration |
+----------------------+-----------+
| starting | 0.000113 |
| checking permissions | 0.000010 |
| Opening tables | 0.000027 |
| System lock | 0.000017 |
| init | 0.000030 |
| optimizing | 0.000018 |
| statistics | 0.055731 |
| preparing | 0.000024 |
| executing | 0.000008 |
| Sending data | 28.611161 |
| end | 0.000019 |
| query end | 0.000005 |
| closing tables | 0.000014 |
| freeing items | 0.000021 |
| logging slow query | 0.000003 |
| logging slow query | 0.000004 |
| cleaning up | 0.000005 |
+----------------------+-----------+
17 rows in set (0.00 sec)
mysql> SHOW PROFILE FOR QUERY 2;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000105 |
| checking permissions | 0.000011 |
| Opening tables | 0.000036 |
| System lock | 0.000015 |
| init | 0.000028 |
| optimizing | 0.000019 |
| statistics | 0.032255 |
| preparing | 0.000024 |
| executing | 0.000007 |
| Sending data | 2.166140 |
| end | 0.000020 |
| query end | 0.000004 |
| closing tables | 0.000014 |
| freeing items | 0.000025 |
| logging slow query | 0.000003 |
| cleaning up | 0.000018 |
+----------------------+----------+
16 rows in set (0.00 sec)
mysql> SHOW PROFILE FOR QUERY 3;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000071 |
| checking permissions | 0.000009 |
| Opening tables | 0.000018 |
| System lock | 0.000012 |
| init | 0.000021 |
| optimizing | 0.000014 |
| statistics | 0.000059 |
| preparing | 0.000020 |
| executing | 0.000007 |
| Sending data | 0.877795 |
| end | 0.000021 |
| query end | 0.000004 |
| closing tables | 0.000015 |
| freeing items | 0.000029 |
| logging slow query | 0.000015 |
| cleaning up | 0.000006 |
+----------------------+----------+
16 rows in set (0.00 sec)
The particular table I am querying contains 107,407,213 rows with a data length of 4,237,427,600 bytes and an index length of 4,255,541,248 bytes. I optimized it yesterday and there where no added data since then.
If the query is I/O bound, I can always switch to a SSD and I also have the option to store the time index as an integer instead of a double. But so far I do not understand where my bottleneck is and I would like to avoid major changes before knowing more.
SQL_NO_CACHE means that mysql should not use the query cache.
The disk/buffer cache is still used, which is why the first query takes longer time, and the subseqent queries are faster.
Related
I'm trying to get an rough (order-of-magnitude) estimate of how long time the following query could take:
mysql> EXPLAIN SELECT t1.col1, t1_col4 FROM t1 LEFT JOIN t2 ON t1.col1=t2.col1 WHERE col2=0 AND col3 IS NULL;
+----+-------------+--------------------+------+---------------+------------+---------+-----------------------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+------+---------------+------------+---------+-----------------------------+---------+--------------------------+
| 1 | SIMPLE | t1 | ref | foobar | foobar | 4 | const | 9715129 | |
| 1 | SIMPLE | t2 | ref | col1 | col1 | 4 | db2.t1.col1 | 42318 | Using where; Using index |
+----+-------------+--------------------+------+---------------+------------+---------+-----------------------------+---------+--------------------------+
2 rows in set (0.00 sec)
mysql>
This can be done when using SHOW PROFILES syntax.
When you open a MySQL session, you could set the variable "profiling" to 1 or ON.
mysql> SET profiling = 1;
So all the statements sent to the server will be profiled and stored in a historical and shown later by typing the command:
mysql> SHOW PROFILES;
See, from MySQL manual:
mysql> SET profiling = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> DROP TABLE IF EXISTS t1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> CREATE TABLE T1 (id INT);
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW PROFILES;
+----------+----------+--------------------------+
| Query_ID | Duration | Query |
+----------+----------+--------------------------+
| 0 | 0.000088 | SET PROFILING = 1 |
| 1 | 0.000136 | DROP TABLE IF EXISTS t1 |
| 2 | 0.011947 | CREATE TABLE t1 (id INT) |
+----------+----------+--------------------------+
3 rows in set (0.00 sec)
mysql> SHOW PROFILE;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| checking permissions | 0.000040 |
| creating table | 0.000056 |
| After create | 0.011363 |
| query end | 0.000375 |
| freeing items | 0.000089 |
| logging slow query | 0.000019 |
| cleaning up | 0.000005 |
+----------------------+----------+
7 rows in set (0.00 sec)
mysql> SHOW PROFILE FOR QUERY 1;
+--------------------+----------+
| Status | Duration |
+--------------------+----------+
| query end | 0.000107 |
| freeing items | 0.000008 |
| logging slow query | 0.000015 |
| cleaning up | 0.000006 |
+--------------------+----------+
4 rows in set (0.00 sec)
mysql> SHOW PROFILE CPU FOR QUERY 2;
+----------------------+----------+----------+------------+
| Status | Duration | CPU_user | CPU_system |
+----------------------+----------+----------+------------+
| checking permissions | 0.000040 | 0.000038 | 0.000002 |
| creating table | 0.000056 | 0.000028 | 0.000028 |
| After create | 0.011363 | 0.000217 | 0.001571 |
| query end | 0.000375 | 0.000013 | 0.000028 |
| freeing items | 0.000089 | 0.000010 | 0.000014 |
| logging slow query | 0.000019 | 0.000009 | 0.000010 |
| cleaning up | 0.000005 | 0.000003 | 0.000002 |
+----------------------+----------+----------+------------+
References (updated at: 2014-09-04):
- SHOW PROFILE Syntax
- The INFORMATION_SCHEMA PROFILING Table
- How To Use MySQL Query Profiling (The Digital Ocean recently published a great article concerning this issue.)
I am running a simple query on a simple table on the same machine as the server running 5.5. It is taking 22sec to return ~7000 rows from a 20 million row table. Upon profiling most of the time is taken up by multiple "Waiting for query cache lock". What is "Waiting for query cache lock" and why is this query taking so long? Is it something with the way I set up the server?
Here is the profile (note the time for the operation is actually from the row below as stated here):
mysql> show profile for query 4;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000015 |
| Waiting for query cache lock | 0.000003 |
| checking query cache for query | 0.000045 |
| checking permissions | 0.000006 |
| Opening tables | 0.000027 |
| System lock | 0.000007 |
| Waiting for query cache lock | 0.000032 |
| init | 0.000018 |
| optimizing | 0.000008 |
| statistics | 0.033109 |
| preparing | 0.000019 |
| executing | 0.000002 |
| Sending data | 4.575480 |
| Waiting for query cache lock | 0.000005 |
| Sending data | 5.527728 |
| Waiting for query cache lock | 0.000005 |
| Sending data | 5.743041 |
| Waiting for query cache lock | 0.000004 |
| Sending data | 6.191706 |
| end | 0.000007 |
| query end | 0.000005 |
| closing tables | 0.000028 |
| freeing items | 0.000008 |
| Waiting for query cache lock | 0.000002 |
| freeing items | 0.000182 |
| Waiting for query cache lock | 0.000002 |
| freeing items | 0.000002 |
| storing result in query cache | 0.000004 |
| logging slow query | 0.000001 |
| logging slow query | 0.000002 |
| cleaning up | 0.000003 |
+--------------------------------+----------+
Here is the table:
mysql> SHOW CREATE TABLE prvol;
"Table","Create Table"
"prvol","CREATE TABLE `prvol` (
`ticker` varchar(10) DEFAULT NULL,
`date` date DEFAULT NULL,
`close` float unsigned DEFAULT NULL,
KEY `Index 1` (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1"
Here is the query:
mysql> select close from prvol where date = '20100203';
EDIT: After running with SQL_NO_CACHE, all the time is now in the execution. Could this just be normal for a table this size on a 2.4GHz, 3GB ram machine?
+----------------------+-----------+
| Status | Duration |
+----------------------+-----------+
| starting | 0.000052 |
| checking permissions | 0.000007 |
| Opening tables | 0.000027 |
| System lock | 0.000008 |
| init | 0.000019 |
| optimizing | 0.000008 |
| statistics | 0.034766 |
| preparing | 0.000011 |
| executing | 0.000002 |
| Sending data | 22.071324 |
| end | 0.000012 |
| query end | 0.000005 |
| closing tables | 0.000020 |
| freeing items | 0.000170 |
| logging slow query | 0.000001 |
| logging slow query | 0.000003 |
| cleaning up | 0.000004 |
+----------------------+-----------+
EDIT: Include results of explain.
mysql> explain extended select cp from prvol where date = '20100208';
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref |rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | prvol | ref | Index 1 | Index 1 | 4 | const |6868 | 100.00 | Using where |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.08 sec)
I solved my slow query problem. To summarize the problem, it was taking 22sec to query 7000 rows from a 20mln row, 1.7GB indexed table. The problem was that the cache was too small and the query had to go to disk for every query. I would think the disk access would be faster than what I was seeing because I was going off an indexed column so the amount of data read off disk should have been small. But I'm guessing there is a lot of overhead with accessing the InnoDB storage on disk.
Once I set innodb_buffer_pool_size=1024M in the my.ini file, the initial query would take a long time, but all subsequent queries would finish in under a second.
Unfortunately, the profiling didn't really help.
This is a known problem with MySQL. It's really well described here:
https://web.archive.org/web/20160129162137/http://www.psce.com/blog/kb/how-query-cache-can-cause-performance-problems/
Query cache can help you a lot but at the same time it can become a bottleneck.
In the below output, what does the 22.071324 duration refer to - Sending data or executing?
The natural way to read the output is 22 sec for Sending data, and this post from MySQL refers to the times this way. Also, the documentation does not caution that the table needs to be interpreted in a special way.
So why am I asking this question? I came across this post, which refers to this post, where someone claims they read the source code and the 22 sec time should be interpreted as corresponding to executing.
Can anyone definitively say one way or another?
mysql> show profile for query 4;
+----------------------+-----------+
| Status | Duration |
+----------------------+-----------+
| starting | 0.000052 |
| checking permissions | 0.000007 |
| Opening tables | 0.000027 |
| System lock | 0.000008 |
| init | 0.000019 |
| optimizing | 0.000008 |
| statistics | 0.034766 |
| preparing | 0.000011 |
| executing | 0.000002 |
| Sending data | 22.071324 |
| end | 0.000012 |
| query end | 0.000005 |
| closing tables | 0.000020 |
| freeing items | 0.000170 |
| logging slow query | 0.000001 |
| logging slow query | 0.000003 |
| cleaning up | 0.000004 |
+----------------------+-----------+
I can definitely confirm that 22 seconds should be interpreted as "executing" rather than sending the data. Digging trough the source code of MySQL and talking to several people confirmed that.
I seem to be having slow inserts, updates and deletes on all tables on a specific database with MySQL. Not a lot of data in those tables (from 2k to 20k). Small number of columns (5-10), indexes (two of them), and no duplicate index issue. I'm running MySQL 5.0.45 with MyISAM.
I run the following query and it takes about 5-7 seconds:
UPDATE accounts SET updated_at = '2010-10-09 11:22:53' WHERE id = 8;
Selects seem to come back right away.
Explain gives me the following:
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | accounts | index | NULL | PRIMARY | 4 | NULL | 1841 | Using index |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
The profiler doesn't show any significant data for anything other than a seemingly high number of context switches:
+----------------------+----------+-------------------+---------------------+
| Status | Duration | Context_voluntary | Context_involuntary |
+----------------------+----------+-------------------+---------------------+
| (initialization) | 0.000057 | 0 | 0 |
| checking permissions | 0.000008 | 0 | 0 |
| Opening tables | 0.000013 | 0 | 0 |
| System lock | 0.000005 | 0 | 0 |
| Table lock | 0.000005 | 0 | 0 |
| init | 0.000061 | 0 | 0 |
| Updating | 0.000101 | 0 | 0 |
| end | 7.957233 | 7951 | 2 |
| query end | 0.000008 | 0 | 0 |
| freeing items | 0.000011 | 0 | 0 |
| closing tables | 0.000007 | 1 | 0 |
| logging slow query | 0.000002 | 0 | 0 |
+----------------------+----------+-------------------+---------------------+
This might also help:
+----------------------+----------+-----------------------+---------------+-------------+
| Status | Duration | Source_function | Source_file | Source_line |
+----------------------+----------+-----------------------+---------------+-------------+
| (initialization) | 0.000057 | check_access | sql_parse.cc | 5306 |
| checking permissions | 0.000008 | open_tables | sql_base.cc | 2629 |
| Opening tables | 0.000013 | mysql_lock_tables | lock.cc | 153 |
| System lock | 0.000005 | mysql_lock_tables | lock.cc | 162 |
| Table lock | 0.000005 | mysql_update | sql_update.cc | 167 |
| init | 0.000061 | mysql_update | sql_update.cc | 429 |
| Updating | 0.000101 | mysql_update | sql_update.cc | 560 |
| end | 7.957233 | mysql_execute_command | sql_parse.cc | 5122 |
| query end | 0.000008 | mysql_parse | sql_parse.cc | 6116 |
| freeing items | 0.000011 | dispatch_command | sql_parse.cc | 2146 |
| closing tables | 0.000007 | log_slow_statement | sql_parse.cc | 2204 |
| logging slow query | 0.000002 | dispatch_command | sql_parse.cc | 2169 |
+----------------------+----------+-----------------------+---------------+-------------+
Additional info:
It's running on a CentOS-5 VPS with 4 gigs of ram guaranteed. No index on the updated_at column and not triggers anywhere.
[New things that I tried]
Created a new table (using like)
running innodb and inserted all
records from one of the affected
tables. (same problem)
Backed up the database and restored it to a
different database within the same
server instance. (same problem)
Restored that same backup to my
local machine and I didn't have a
problem.
Tried another database
within the same mysql server
instance that has the problem
database and the other database (a
Wordpress DB) ran
updates/inserts/deletes just fine.
Restarted mysqld and restarted the entire server last night (same problem)
Updated MySQL to version 5.0.77 (same problem)
Deleted all indexes from one of the affected tables (same problem)
Any ideas what to look at next or what might be the issue? Seems to be more of a recent problem though I can't say when it started to show up exactly.
If you have variable length rows, you might need to run OPTIMIZE TABLE occasionally.
Finally found the answer. That database was somehow missing the MYD and MYI files and still running. Not sure how that's possible considering that the MYD file holds the data for MyISAM tables but that was causing the slow inserts/updates/deletes.
I ran an ALTER TABLE to set the engine to MyISAM (which it already was) and it recreated those files. Updates/inserts/deletes running fast again!
Is there an explanation of these statuses anywhere?
http://dev.mysql.com/tech-resources/articles/using-new-query-profiler.html
my specific question is in regards to this query:
select count(*)
from 135_5m.record_updates u, 135_5m.records r
where r.record_id = u.record_id and
(u.date_updated > null or null is null) and
u.date_updated <= '2011-01-03';
which returns a single number - 4053904. So why would the majority of the time be spent in "Sending data"? Is it just poorly named? Surely "Sending data" must be doing more than just sending data?
+--------------------------------+-----------+-------+
| Status | Duration | Swaps |
+--------------------------------+-----------+-------+
| starting | 0.000224 | 0 |
| checking query cache for query | 0.000188 | 0 |
| checking permissions | 0.000012 | 0 |
| checking permissions | 0.000017 | 0 |
| Opening tables | 0.000036 | 0 |
| System lock | 0.000015 | 0 |
| Table lock | 0.000067 | 0 |
| init | 0.000105 | 0 |
| optimizing | 0.000052 | 0 |
| statistics | 0.000254 | 0 |
| preparing | 0.000061 | 0 |
| executing | 0.000017 | 0 |
| Sending data | 32.079549 | 0 |
| end | 0.000036 | 0 |
| query end | 0.000012 | 0 |
| freeing items | 0.000089 | 0 |
| storing result in query cache | 0.000022 | 0 |
| logging slow query | 0.000008 | 0 |
| logging slow query | 0.000008 | 0 |
| cleaning up | 0.000011 | 0 |
+--------------------------------+-----------+-------+
http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html
Executing means the thread has started execution, Sending data apparently covers both the processing of the rows and sending the count back to the client.
before sending data to client, mysql need to read data, the phase of read data may take the majrity time of "sending data"