MySQL gurus
We have a read-only dataset of around 4 million rows. It is stored as a single InnoDB table with a handful of indexes, like this:
CREATE TABLE IF NOT EXISTS dim_users (
ds varchar(10),
user_id varchar(32),
date_joined varchar(10),
payer boolean,
source varchar(12),
country varchar(2),
is1dayactive boolean,
is7dayactive boolean,
is28dayactive boolean,
istest boolean
) ENGINE = InnoDB;
ALTER TABLE dim_users ADD INDEX (ds);
ALTER TABLE dim_users ADD INDEX (user_id);
ALTER TABLE dim_users ADD INDEX (date_joined);
ALTER TABLE dim_users ADD INDEX (source);
ALTER TABLE dim_users ADD INDEX (country);
MySQL 8 is running on Ubuntu 18.04 at the servers.com provider, on a cloud server with 8 GB of RAM and swap turned off.
The database has the default config, except for:
[mysqld]
innodb-buffer-pool-size=6G
read_only=1
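For reference, the effective setting and the buffer pool's disk-read counters can be checked with standard MySQL 8 statements (a sketch of what can be verified, not output from our server):
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- Innodb_buffer_pool_reads counts reads that missed the buffer pool and went to disk.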
The problem is that aggregation queries like select count(*) from dim_users or select country,count(*) from dim_users group by 1 order by 1 desc; take far too long (up to 12 seconds) compared to a default Postgres 11 setup on the same vhost.
A default PostgreSQL 11 setup runs the last query in less than 2 seconds.
We tried profiling and analyzing this query on MySQL:
Explain:
mysql> explain analyze select country,count(*) from dim_users group by 1 order by 1 desc;
+---------------+
| EXPLAIN
+---------------+
| -> Group aggregate: count(0) (actual time=241.091..14777.679 rows=11 loops=1)
| -> Index scan on dim_users using country (reverse) (cost=430171.90 rows=4230924) (actual time=0.054..11940.166 rows=4228692 loops=1)
|
+---------------+
1 row in set (14.78 sec)
Profile:
mysql> SELECT SEQ,STATE,DURATION,SWAPS,SOURCE_FUNCTION,SOURCE_FILE,SOURCE_LINE FROM INFORMATION_SCHEMA.PROFILING WHERE QUERY_ID=1;
+-----+--------------------------------+-----------+-------+-------------------------+----------------------+-------------+
| SEQ | STATE | DURATION | SWAPS | SOURCE_FUNCTION | SOURCE_FILE | SOURCE_LINE |
+-----+--------------------------------+-----------+-------+-------------------------+----------------------+-------------+
| 2 | starting | 0.000115 | 0 | NULL | NULL | NULL |
| 3 | Executing hook on transaction | 0.000008 | 0 | launch_hook_trans_begin | rpl_handler.cc | 1119 |
| 4 | starting | 0.000013 | 0 | launch_hook_trans_begin | rpl_handler.cc | 1121 |
| 5 | checking permissions | 0.000010 | 0 | check_access | sql_authorization.cc | 2176 |
| 6 | Opening tables | 0.000050 | 0 | open_tables | sql_base.cc | 5591 |
| 7 | init | 0.000012 | 0 | execute | sql_select.cc | 677 |
| 8 | System lock | 0.000020 | 0 | mysql_lock_tables | lock.cc | 331 |
| 9 | optimizing | 0.000007 | 0 | optimize | sql_optimizer.cc | 282 |
| 10 | statistics | 0.000046 | 0 | optimize | sql_optimizer.cc | 502 |
| 11 | preparing | 0.000035 | 0 | optimize | sql_optimizer.cc | 583 |
| 12 | executing | 13.552128 | 0 | ExecuteIteratorQuery | sql_union.cc | 1409 |
| 13 | end | 0.000037 | 0 | execute | sql_select.cc | 730 |
| 14 | query end | 0.000008 | 0 | mysql_execute_command | sql_parse.cc | 4606 |
| 15 | waiting for handler commit | 0.000018 | 0 | ha_commit_trans | handler.cc | 1589 |
| 16 | closing tables | 0.000015 | 0 | mysql_execute_command | sql_parse.cc | 4657 |
| 17 | freeing items | 0.000039 | 0 | mysql_parse | sql_parse.cc | 5330 |
| 18 | cleaning up | 0.000020 | 0 | dispatch_command | sql_parse.cc | 2184 |
+-----+--------------------------------+-----------+-------+-------------------------+----------------------+-------------+
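(For context, the profile above was gathered the standard way: profiling has to be enabled before running the query, and SHOW PROFILES supplies the QUERY_ID used in the INFORMATION_SCHEMA.PROFILING lookup.)
SET profiling = 1;
select country,count(*) from dim_users group by 1 order by 1 desc;
SHOW PROFILES; -- lists the query ids for the statements profiled in this session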
As you can see, the GROUP BY query takes ages to complete.
So, my questions are the following.
I've never used MySQL before; my main DB has been PostgreSQL for years, and I don't know whether this is normal behaviour for MySQL, i.e. whether it is simply this slow at executing GROUP BY queries.
Otherwise, if I'm missing something important here, e.g. DB or OS configuration, could you please give me some advice on how to properly set up MySQL for read-only use?
Thank you!
Related
Recently, in my application, which uses MariaDB 10.6, I have been facing some weird issues where the same query takes more than the expected time and consumes more I/O at random times.
I enabled the slow query log to trace this, and it shows a query stuck for more than 9 minutes and consuming more I/O.
# Time: 230119 15:25:02
# User@Host: user[user] @ [192.*.*.*]
# Thread_id: 125616 Schema: DB QC_hit: No
# Query_time: 567.099806 Lock_time: 0.000500 Rows_sent: 48 Rows_examined: 10859204
# Rows_affected: 0 Bytes_sent: 0
SET timestamp=1674152702;
select column1,column2....columnN where column1=v1 and column2=v2 and column3=v3 and column4=v4 and column5=v5;
Looking at the DB process list, many queries are in the state "Waiting for table metadata lock", and this snowballs into bigger issues.
| 106804 | userx | IP | DB | Query | 4239 | Sending data | Q1 | 0.000 |
| 106838 | userx | IP | DB | Query | 1980 | Waiting for table metadata lock | Q2 | 0.000 |
| 107066 | userx | IP | DB | Sleep | 0 | | NULL | 0.000 |
| 107196 | userx | IP | DB | Sleep | 1 | | NULL | 0.000 |
| 107223 | userx | IP | DB | Query | 4363 | Sending data | Q3 | 0.000 |
| 107277 | userx | IP | DB | Query | 3221 | Sending data | Q4 | 0.000 |
| 107299 | userx | IP | DB | Sleep | 26 | | NULL | 0.000 |
| 107324 | userx | IP | DB | Sleep | 54 | | NULL | 0.000 |
| 107355 | userx | IP | DB | Sleep | 0 | | NULL | 0.000 |
| 107357 | userx | IP | DB | Sleep | 1 | | NULL | 0.000 |
| 107417 | userx | IP | DB | Query | 1969 | Waiting for table metadata lock | | 0.000 |
| 107462 | userx | IP | DB | Sleep | 55 | | NULL | 0.000 |
| 107489 | userx | IP | DB | Query | 1979 | Waiting for table metadata lock | Q5 | 0.000 |
| 107492 | userx | IP | DB | Sleep | 25 | | NULL | 0.000 |
| 107519 | userx | IP | DB | Query | 1981 | Waiting for table metadata lock | Q6 | 0.000 |
Currently, manually killing the suspect query with the KILL command unblocks the other queries so they can complete, and via the MariaDB max_statement_time property we can terminate long-running queries automatically.
But is there a way to check what was killed by max_statement_time? I am unable to find any traces in error.log.
The actual query should return around 1765 records, while the slow query log reports Rows_sent as 48.
Is it a problem with scanning the table, or did the record fetching get stuck after some time?
Or am I misinterpreting the Rows_sent count in the slow query log output?
127.0.0.1:3307> select column1,column2....columnN where column1=v1 and column2=v2 and column3=v3 and column4=v4 and column5=v5;
+----------+
| count(*) |
+----------+
| 1756 |
+----------+
1 row in set (0.006 sec)
---EDITED---
I missed adding column5; it has now been added to the query above.
The table is indexed; here is the EXPLAIN of the statement:
127.0.0.1:3307> explain extended select..... from Tablename where column1=v1 and column2=v2 and column3=v3 and column4=v4 and column5=v5;
+------+-------------+-------+------+---------------+---------+---------+-------------------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+---------+---------+-------------------------------+------+----------+-------+
| 1 | SIMPLE | s | ref | PRIMARY | PRIMARY | 7 | const,const,const,const,const | 73 | 100.00 | |
+------+-------------+-------+------+---------------+---------+---------+-------------------------------+------+----------+-------+
1 row in set, 1 warning (0.007 sec)
Assuming that v1..v4 are literal constants (numeric, string, date, etc), and assuming you really mean = in all 4 cases, then
INDEX(col1, col2, col3, col4)
(in any order) is optimal for that one SELECT. This will also minimize I/O for that query.
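As a concrete sketch (Tablename and column1..column4 are the placeholders from the EXPLAIN in the question, not real names):
ALTER TABLE Tablename
  ADD INDEX idx_c1_c2_c3_c4 (column1, column2, column3, column4);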
There are many reasons for "high" I/O. I'll cover a few of them.
Data is bigger than the cache, the size of which is controlled by innodb_buffer_pool_size. How big is the table?
That setting should be configured to be about 70% of available RAM. How much RAM do you have?
If any items in the query are TEXT or BLOB, there may be extra I/O.
Currently you do not have a good index, so it had to check 10859204 rows to find the 48 you needed. With the index above, only 49 rows need to be fetched.
"Waiting for table metadata lock" -- This implies that you are doing some serious blocking or LOCKing or ALTERing. What other queries are going on?
Since the buffer_pool is shared among all MySQL processes, some other query may be at fault.
Adding the index should eliminate the 48 vs 1765 puzzle and any need for max_statement_time.
You should use ENGINE=InnoDB; MyISAM has more locking/blocking issues.
If you provide more specifics, I may have more tips.
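On the buffer-pool and table-size questions above, the numbers can be pulled like this (a sketch; 'DB' is the schema name shown in the slow-log header, and the information_schema sizes are estimates):
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024) AS total_mb
  FROM information_schema.tables
 WHERE table_schema = 'DB';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';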
(after your edit)
The "Rows" in EXPLAIN is an estimate; it can easily be off by a factor of 2 either way.
Rows_sent in the slowlog is exact.
I just updated my MySQL version to 5.7. A SELECT query that has four INNER JOINs and that previously took around 3 seconds to execute is now taking so long that I can't even keep track of it. A bit of profiling shows that the 'Sending data' stage is taking too long. Can someone tell me what is going wrong? Here's some data. Note that the query is still running at this point in time:
+----------------------+-----------+
| Status | Duration |
+----------------------+-----------+
| starting | 0.001911 |
| checking permissions | 0.000013 |
| checking permissions | 0.000003 |
| checking permissions | 0.000003 |
| checking permissions | 0.000006 |
| Opening tables | 0.000030 |
| init | 0.000406 |
| System lock | 0.000018 |
| optimizing | 0.000019 |
| statistics | 0.000509 |
| preparing | 0.000052 |
| executing | 0.000004 |
| Sending data | 31.881794 |
| end | 0.000021 |
| query end | 0.003540 |
| closing tables | 0.000032 |
| freeing items | 0.000214 |
| cleaning up | 0.000028 |
+----------------------+-----------+
Here's the output of EXPLAIN:
+----+-------------+--------------------+------------+------+---------------+------------+---------+-------+---------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------+------------+------+---------------+------------+---------+-------+---------+----------+----------------------------------------------------+
| 1 | SIMPLE | movie_data_primary | NULL | ref | cinestopId | cinestopId | 26 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | mg | NULL | ALL | NULL | NULL | NULL | NULL | 387498 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | crw | NULL | ALL | NULL | NULL | NULL | NULL | 1383452 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | cst | NULL | ALL | NULL | NULL | NULL | NULL | 2184556 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+--------------------+------------+------+---------------+------------+---------+-------+---------+----------+----------------------------------------------------+
Looks like an indexing problem when you upgrade the MySQL version.
The documentation says:
If you perform a binary upgrade without dumping and reloading tables, you cannot upgrade directly from MySQL 4.1 to 5.1 or higher. This occurs due to an incompatible change in the MyISAM table index format in MySQL 5.0. Upgrade from MySQL 4.1 to 5.0 and repair all MyISAM tables. Then upgrade from MySQL 5.0 to 5.1 and check and repair your tables.
Modifications to the handling of character sets or collations might change the character sort order, which causes the ordering of entries in any index that uses an affected character set or collation to be incorrect. Such changes result in several possible problems:
Comparison results that differ from previous results
Inability to find some index values due to misordered index entries
Misordered ORDER BY results
Tables that CHECK TABLE reports as being in need of repair
Check these links:
1) checking-table-incompatibilities
2) check-table
3) rebuilding-tables
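For illustration, the check-and-repair those pages describe boils down to statements like these (a sketch; movie_data_primary is just one of the tables from the EXPLAIN above, and REPAIR TABLE only applies to MyISAM-family tables):
CHECK TABLE movie_data_primary FOR UPGRADE;
REPAIR TABLE movie_data_primary;
-- or rebuild the table in place:
ALTER TABLE movie_data_primary FORCE;
-- mysqlcheck --all-databases --check-upgrade automates the check across every table.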
I'm looking to improve the speed of queries on a very large MySQL analytics table that I have. This table is tracking playercount on gameservers and the structure looks as so:
`server_tracker` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ip` int(10) unsigned NOT NULL,
`port` smallint(5) unsigned NOT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`players` tinyint(3) unsigned NOT NULL,
`map` varchar(28) NOT NULL,
`portjoin` smallint(5) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_tracking_ip_port` (`ip`,`port`)
) ENGINE=InnoDB AUTO_INCREMENT=310729056 DEFAULT CHARSET=utf8 ROW_FORMAT=FIXED
This table is inserted into very frequently, with 10k+ servers being tracked 10+ times an hour. However, every hour the data is taken and averaged out, and put into an "averaged" table with basically the same structure.
Currently I have the IP/port pair set up as a key. However, the hourly averaging can sometimes be a tad slow, so I am curious whether it would be worth putting an index on the timestamp, which is frequently used to select data from a certain timeframe, like so:
SELECT `players`
FROM `server_tracker`
WHERE `ip` = x
AND `port` = x
AND `date` > NOW()
AND `date` < NOW() + INTERVAL 60 MINUTE
ORDER BY `id` DESC
This is the only type of query ran on this table. The table is only used for fetching the playercount from gameservers within a specific timeframe. The data is never updated or changed.
However, I am a bit new to all of this, and I am not sure whether putting an index on the timestamp would do much of anything. Just looking for some friendly advice.
Results of EXPLAIN SELECT players FROM server_tracker WHERE ip = x AND port = x AND date > NOW() AND date < NOW() + INTERVAL 60 MINUTE ORDER BY id DESC
+----+-------------+-----------------+------+----------------------+----------------------+---------+-------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+------+----------------------+----------------------+---------+-------------+-------+-------------+
| 1 | SIMPLE | server_tracker | ref | idx_tracking_ip_port | idx_tracking_ip_port | 6 | const,const | 15354 | Using where |
+----+-------------+-----------------+------+----------------------+----------------------+---------+-------------+-------+-------------+
One of the most important things to know about MySQL is that, with very few exceptions, only ONE index can be used per table in a query.
So it does not help much to put an index on a single column when all four columns are used in the WHERE clause.
Only a combined (composite) index over these fields helps.
The order of the fields is very important, so that the index can also be used for other queries.
An example:
An index on field1, field2 and field3 is used when your WHERE clause references field1, or field1 and field2, or field1, field2 and field3. It is not used if the WHERE clause only references field2, or field3, or field2 and field3. So always include the first field.
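Applied to the table in the question, the combined index would look like this (a sketch, using the column names from the CREATE TABLE above):
ALTER TABLE server_tracker
  ADD INDEX idx_ip_port_date (ip, port, `date`);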
The easiest way to find out whether and how the query uses an index is to run it with EXPLAIN: the output tells you directly whether an index is used and which one. If there are several rows, you can multiply the individual values in the rows column together as a rough indicator; the smaller this number, the better your query performs.
MariaDB [tmp]> EXPLAIN select * from content;
+------+-------------+---------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+---------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | content | ALL | NULL | NULL | NULL | NULL | 13 | |
+------+-------------+---------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)
MariaDB [tmp]>
Alternatively, you can use the profiler to see how long the query spends in each phase, which helps when optimizing the server.
An example:
MariaDB [(none)]> use tmp
Database changed
MariaDB [tmp]> SET PROFILING=ON;
Query OK, 0 rows affected (0.00 sec)
MariaDB [tmp]>
MariaDB [tmp]> SELECT * FROM content;
+----+------+---------------------+--------+------+--------------+------+------+------+------+
| id | Wert | Zeitstempel | WertID | aaa | d | e | wwww | n | ddd |
+----+------+---------------------+--------+------+--------------+------+------+------+------+
| 1 | 10 | 2001-01-01 00:00:00 | 1 | NULL | 1.5000 | NULL | NULL | 1 | NULL |
| 2 | 12.3 | 2001-01-01 00:01:00 | 2 | NULL | 2.5000 | NULL | NULL | 2 | NULL |
| 3 | 17.4 | 2001-01-01 00:02:00 | 3 | NULL | 123456.1250 | NULL | NULL | 3 | NULL |
| 4 | 10.9 | 2001-01-01 01:01:00 | 1 | NULL | 1000000.0000 | NULL | NULL | 4 | NULL |
| 5 | 15.4 | 2001-01-01 01:02:00 | 2 | NULL | NULL | NULL | NULL | 5 | NULL |
| 6 | 20.9 | 2001-01-01 01:03:00 | 3 | NULL | NULL | NULL | NULL | 6 | NULL |
| 7 | 22 | 2001-01-02 00:00:00 | 1 | NULL | NULL | NULL | NULL | 7 | NULL |
| 8 | 12.3 | 2001-01-02 00:01:00 | 2 | NULL | NULL | NULL | NULL | 8 | NULL |
| 9 | 17.4 | 2001-01-02 00:02:00 | 3 | NULL | NULL | NULL | NULL |
+----+------+---------------------+--------+------+--------------+------+------+------+------+
13 rows in set (0.00 sec)
MariaDB [tmp]>
MariaDB [tmp]> SHOW PROFILE;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000031 |
| checking permissions | 0.000005 |
| Opening tables | 0.000036 |
| After opening tables | 0.000004 |
| System lock | 0.000003 |
| Table lock | 0.000002 |
| After opening tables | 0.000005 |
| init | 0.000013 |
| optimizing | 0.000006 |
| statistics | 0.000013 |
| preparing | 0.000010 |
| executing | 0.000002 |
| Sending data | 0.000073 |
| end | 0.000003 |
| query end | 0.000003 |
| closing tables | 0.000006 |
| freeing items | 0.000003 |
| updating status | 0.000012 |
| cleaning up | 0.000003 |
+----------------------+----------+
19 rows in set (0.00 sec)
MariaDB [tmp]>
I seem to be having slow inserts, updates and deletes on all tables in a specific database with MySQL. There is not a lot of data in those tables (from 2k to 20k rows), a small number of columns (5-10), two indexes, and no duplicate index issue. I'm running MySQL 5.0.45 with MyISAM.
I run the following query and it takes about 5-7 seconds:
UPDATE accounts SET updated_at = '2010-10-09 11:22:53' WHERE id = 8;
Selects seem to come back right away.
Explain gives me the following:
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | accounts | index | NULL | PRIMARY | 4 | NULL | 1841 | Using index |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
The profiler doesn't show any significant data for anything other than a seemingly high number of context switches:
+----------------------+----------+-------------------+---------------------+
| Status | Duration | Context_voluntary | Context_involuntary |
+----------------------+----------+-------------------+---------------------+
| (initialization) | 0.000057 | 0 | 0 |
| checking permissions | 0.000008 | 0 | 0 |
| Opening tables | 0.000013 | 0 | 0 |
| System lock | 0.000005 | 0 | 0 |
| Table lock | 0.000005 | 0 | 0 |
| init | 0.000061 | 0 | 0 |
| Updating | 0.000101 | 0 | 0 |
| end | 7.957233 | 7951 | 2 |
| query end | 0.000008 | 0 | 0 |
| freeing items | 0.000011 | 0 | 0 |
| closing tables | 0.000007 | 1 | 0 |
| logging slow query | 0.000002 | 0 | 0 |
+----------------------+----------+-------------------+---------------------+
This might also help:
+----------------------+----------+-----------------------+---------------+-------------+
| Status | Duration | Source_function | Source_file | Source_line |
+----------------------+----------+-----------------------+---------------+-------------+
| (initialization) | 0.000057 | check_access | sql_parse.cc | 5306 |
| checking permissions | 0.000008 | open_tables | sql_base.cc | 2629 |
| Opening tables | 0.000013 | mysql_lock_tables | lock.cc | 153 |
| System lock | 0.000005 | mysql_lock_tables | lock.cc | 162 |
| Table lock | 0.000005 | mysql_update | sql_update.cc | 167 |
| init | 0.000061 | mysql_update | sql_update.cc | 429 |
| Updating | 0.000101 | mysql_update | sql_update.cc | 560 |
| end | 7.957233 | mysql_execute_command | sql_parse.cc | 5122 |
| query end | 0.000008 | mysql_parse | sql_parse.cc | 6116 |
| freeing items | 0.000011 | dispatch_command | sql_parse.cc | 2146 |
| closing tables | 0.000007 | log_slow_statement | sql_parse.cc | 2204 |
| logging slow query | 0.000002 | dispatch_command | sql_parse.cc | 2169 |
+----------------------+----------+-----------------------+---------------+-------------+
Additional info:
It's running on a CentOS-5 VPS with 4 GB of RAM guaranteed. There is no index on the updated_at column and no triggers anywhere.
[New things that I tried]
Created a new table (using LIKE) running InnoDB and inserted all records from one of the affected tables. (same problem)
Backed up the database and restored it to a different database within the same server instance. (same problem)
Restored that same backup to my local machine and I didn't have a problem.
Tried another database within the same MySQL server instance that hosts the problem database, and that other database (a WordPress DB) ran updates/inserts/deletes just fine.
Restarted mysqld and restarted the entire server last night (same problem)
Updated MySQL to version 5.0.77 (same problem)
Deleted all indexes from one of the affected tables (same problem)
Any ideas what to look at next or what might be the issue? Seems to be more of a recent problem though I can't say when it started to show up exactly.
If you have variable length rows, you might need to run OPTIMIZE TABLE occasionally.
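For example (accounts is just the table from the UPDATE above, used for illustration):
OPTIMIZE TABLE accounts;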
Finally found the answer. That database was somehow missing the .MYD and .MYI files but was still running. Not sure how that's possible, considering that the .MYD file holds the data for MyISAM tables, but that was what was causing the slow inserts/updates/deletes.
I ran an ALTER TABLE to set the engine to MyISAM (which it already was) and it recreated those files. Updates/inserts/deletes are running fast again!
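In other words, a no-op engine change of this form rebuilds the table files (a sketch; accounts is the table from the UPDATE earlier):
ALTER TABLE accounts ENGINE = MyISAM;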
Is there an explanation of these statuses anywhere?
http://dev.mysql.com/tech-resources/articles/using-new-query-profiler.html
My specific question is in regard to this query:
select count(*)
from 135_5m.record_updates u, 135_5m.records r
where r.record_id = u.record_id and
(u.date_updated > null or null is null) and
u.date_updated <= '2011-01-03';
which returns a single number: 4053904. So why would the majority of the time be spent in "Sending data"? Is it just poorly named? Surely "Sending data" must be doing more than just sending data?
+--------------------------------+-----------+-------+
| Status | Duration | Swaps |
+--------------------------------+-----------+-------+
| starting | 0.000224 | 0 |
| checking query cache for query | 0.000188 | 0 |
| checking permissions | 0.000012 | 0 |
| checking permissions | 0.000017 | 0 |
| Opening tables | 0.000036 | 0 |
| System lock | 0.000015 | 0 |
| Table lock | 0.000067 | 0 |
| init | 0.000105 | 0 |
| optimizing | 0.000052 | 0 |
| statistics | 0.000254 | 0 |
| preparing | 0.000061 | 0 |
| executing | 0.000017 | 0 |
| Sending data | 32.079549 | 0 |
| end | 0.000036 | 0 |
| query end | 0.000012 | 0 |
| freeing items | 0.000089 | 0 |
| storing result in query cache | 0.000022 | 0 |
| logging slow query | 0.000008 | 0 |
| logging slow query | 0.000008 | 0 |
| cleaning up | 0.000011 | 0 |
+--------------------------------+-----------+-------+
http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html
"Executing" means the thread has started execution; "Sending data" apparently covers both the processing of the rows and sending the result back to the client.
Before sending data to the client, MySQL needs to read the data, and that read phase may take the majority of the time reported as "Sending data".
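As an illustration of that point, the row-reading work that ends up counted under "Sending data" can be estimated up front by running EXPLAIN on the same query (a sketch, rewritten with explicit JOIN syntax):
EXPLAIN
SELECT COUNT(*)
  FROM 135_5m.record_updates u
  JOIN 135_5m.records r ON r.record_id = u.record_id
 WHERE (u.date_updated > NULL OR NULL IS NULL)
   AND u.date_updated <= '2011-01-03';
-- The rows column approximates how many rows each table must read,
-- which is where most of the "Sending data" time is spent.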