I've recently moved a site to a different server, and while the overall performance is better, there's one specific SQL query that's taking about 5 seconds to execute now, while it only takes 0.1 seconds on the old server.
Query:
SELECT t1.*
FROM wp_ap_activity t1 NATURAL JOIN (SELECT max(activity_date) AS activity_date
FROM wp_ap_activity
WHERE activity_q_id IN(126187,125933,126043,126083,100007,125781,125628,125615,125716,125728,126115,126061,126028,125429,124783,125651,126092,125510,126062,126058,125923,125727,125948,125085,126033,125975,125537,124664,126031,125947,125938,123327,125908,125467,125471,125852,125558,125980,125226,125904,124454,103489,125935,125925,124472,122940,125949,125950,125139,112744,124718,124626,125859,125903,125406,66537,125722,125887,125810,124810,125782,125823,125799,108626,99836,85975,74147,69962,69510,68598,68593,125875,125620,92246,112851,108528,108629,112864,106120,119571,125798,118205,125831,108547,125550,125813,124297,125223,125792,125536,125730,123848,125411,125598,125638,125698,125519,125700,125697,125151,125688,125445,125715,125083,125669,125665,125673,124777,123975,125528,125724,125146,125610,124784,125617,125631,125637,124765,125496,125647,125571,125245,125264,125513,125534,124854,125527,125543,125535,125515,125337,125221,125202,125549,125530,125531,125541,124952,125358,125502,125427,125525,125123,125361,125252,125421,125491,125263,125260,124743)
GROUP BY activity_q_id) t2
ORDER BY t2.activity_date
New Server
MySQL Version: 10.3.16-MariaDB-1:10.3.16+maria~jessie
Execution Time: 5.2812 seconds
Table Name: wp_ap_activity
Table Rows: 109,947
Space Usage:
Data: 9.5 MiB
Index: 10.5 MiB
Effective: 20.1 MiB
Total: 20.1 MiB
SHOW CREATE TABLE wp_ap_activity results:
CREATE TABLE `wp_ap_activity` (
`activity_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`activity_action` varchar(45) NOT NULL,
`activity_q_id` bigint(20) unsigned NOT NULL,
`activity_a_id` bigint(20) unsigned DEFAULT NULL,
`activity_c_id` bigint(20) unsigned DEFAULT NULL,
`activity_user_id` bigint(20) unsigned NOT NULL,
`activity_date` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`activity_id`),
KEY `activity_q_id` (`activity_q_id`),
KEY `activity_a_id` (`activity_a_id`),
KEY `activity_user_id` (`activity_user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=113859 DEFAULT CHARSET=utf8mb4
Old Server
MySQL Version: 5.7.26-0ubuntu0.16.04.1
Execution Time: 0.1842 seconds
Table Name: wp_ap_activity
Table Rows: 109,759
Space Usage:
Data: 1.5 MiB
Index: 4.5 MiB
Total: 6 MiB
SHOW CREATE TABLE wp_ap_activity results:
CREATE TABLE `wp_ap_activity` (
`activity_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`activity_action` varchar(45) NOT NULL,
`activity_q_id` bigint(20) unsigned NOT NULL,
`activity_a_id` bigint(20) unsigned DEFAULT NULL,
`activity_c_id` bigint(20) unsigned DEFAULT NULL,
`activity_user_id` bigint(20) unsigned NOT NULL,
`activity_date` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`activity_id`),
KEY `activity_q_id` (`activity_q_id`),
KEY `activity_a_id` (`activity_a_id`),
KEY `activity_user_id` (`activity_user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=113657 DEFAULT CHARSET=utf8mb4
The table structure, primary keys, indexes are identical.
The new table reports its total size as 20.1 MiB, while the old one is much smaller at 6 MiB. I'm not sure why this is happening or whether it is related to the slow performance.
Both tables are InnoDB and have collation set to utf8mb4_general_ci
Any advice is greatly appreciated.
Explain query:
New Server (performs slow)
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|----------------|-------|---------------|---------------|---------|------|--------|-------------------------------------------------|
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 970 | Using temporary; Using filesort |
| 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 109514 | Using where; Using join buffer (flat, BNL join) |
| 2 | DERIVED | wp_ap_activity | range | activity_q_id | activity_q_id | 8 | NULL | 970 | Using index condition |
Old Server (performs fast)
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|----|-------------|----------------|------------|-------|---------------|---------------|---------|---------------------------|-------|----------|----------------------------------------------|
| 1 | PRIMARY | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 20270 | 100.00 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 5 | helpdesk.t1.activity_date | 10 | 100.00 | Using index |
| 2 | DERIVED | wp_ap_activity | NULL | range | activity_q_id | activity_q_id | 8 | NULL | 970 | 100.00 | Using index condition |
Updated EXPLAIN after applying the fix provided by Wilson Hauck. Query speed down to ~0.005 seconds!
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|----------------|-------|---------------|---------------|---------|------------------|------|-----------------------------|
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 966 | Using where; Using filesort |
| 1 | PRIMARY | t1 | ref | activity_date | activity_date | 5 | t2.activity_date | 1 | |
| 2 | DERIVED | wp_ap_activity | range | activity_q_id | activity_q_id | 8 | NULL | 966 | Using index condition |
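Your EXPLAINs are nowhere close to being identical; look at the end of each line. The slow plan (new server) has BNL in the Extra column, meaning Block Nested Loop processing: t1 is read with type ALL (about 109,514 rows) and matched against the derived table through the join buffer, because there is no usable index for the join. That is slow here and to be avoided.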
You NEED an index on activity_date on EACH table.
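From the MySQL command prompt, run SHOW INDEX FROM wp_ap_activity; on EACH server to compare the indexes and their cardinality estimates. Note that SHOW INDEX only displays this information; to refresh the index statistics so they are current, run ANALYZE TABLE wp_ap_activity; as well.
Change the queries to SELECT SQL_NO_CACHE ......... for testing, so that results are not served from the Query Cache, and take your timings again from the SECOND and THIRD execution of each query to compare.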
Let us know your results, please.
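The steps above could look like the following. This is a minimal sketch: the index name activity_date is taken from the updated EXPLAIN in the question, and the IN list is shortened to three ids purely for illustration.

```sql
-- Add the missing index on the join column (run on each server).
ALTER TABLE wp_ap_activity ADD INDEX activity_date (activity_date);

-- Refresh index statistics so the optimizer sees current cardinalities.
ANALYZE TABLE wp_ap_activity;

-- Time the query without the Query Cache; compare the 2nd and 3rd runs.
SELECT SQL_NO_CACHE t1.*
FROM wp_ap_activity t1
NATURAL JOIN (
    SELECT MAX(activity_date) AS activity_date
    FROM wp_ap_activity
    WHERE activity_q_id IN (126187, 125933, 126043)  -- shortened list, illustration only
    GROUP BY activity_q_id
) t2
ORDER BY t2.activity_date;
```

With the index in place, the join on activity_date can use a ref lookup instead of scanning t1, which matches the updated EXPLAIN shown in the question.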
I have a particular MariaDB Query that utilizes some joins and some IN ('...') conditions.
Generally it returns results in under 2 seconds on large data sets (~50M records). However, when a very large number of values is supplied in the IN condition (e.g. 1000+ values), the query takes 5+ hours, and the execution plan changes completely when running ANALYZE on the queries.
I'm looking to understand why this is the case and for suggestions on how to resolve the bottleneck. At present I think the simplest option may be to drop the IN condition completely and filter the results in PHP instead of SQL, since without the IN condition results are returned in under 1 second on the same tables.
ANALYZE results from query where small IN set used.
+------+-------------+-------+-------+------------------------+----------+---------+--------------------+-------+---------+----------+------------+---------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+-------------+-------+-------+------------------------+----------+---------+--------------------+-------+---------+----------+------------+---------------------------------------------------------------------+
| 1 | SIMPLE | t1 | range | calldate,call_id | calldate | 7 | NULL | 13400 | 7162.00 | 100.00 | 33.01 | Using index condition; Using where; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | ref | PRIMARY,called,call_id | call_id | 4 | crimson.t1.call_id | 1 | 3.21 | 100.00 | 35.24 | Using where |
| 1 | SIMPLE | d1 | ref | digits,leg_id | leg_id | 4 | crimson.t2.xid | 1 | 1.94 | 100.00 | 0.58 | Using index condition; Using where |
| 1 | SIMPLE | g1 | ref | call_id | call_id | 4 | crimson.t1.call_id | 1 | 3.00 | 100.00 | 0.00 | Using where |
+------+-------------+-------+-------+------------------------+----------+---------+--------------------+-------+---------+----------+------------+---------------------------------------------------------------------+
4 rows in set (1.154 sec)
ANALYZE results from the same query and conditions where 1200 IN set has been used.
+------+--------------+-------------+--------+------------------------+---------+---------+--------------------+------+---------+----------+------------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+--------------+-------------+--------+------------------------+---------+---------+--------------------+------+---------+----------+------------+---------------------------------+
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 1222 | 1222.00 | 100.00 | 100.00 | Using temporary; Using filesort |
| 1 | PRIMARY | d1 | ref | digits,leg_id | digits | 29 | tvc_0._col_1 | 5 | 6192.72 | 100.00 | 100.00 | Using index condition |
| 1 | PRIMARY | t2 | eq_ref | PRIMARY,called,call_id | PRIMARY | 8 | crimson.d1.leg_id | 1 | 1.00 | 100.00 | 36.73 | Using where |
| 1 | PRIMARY | g1 | ref | call_id | call_id | 4 | crimson.t2.call_id | 1 | 3.32 | 100.00 | 0.05 | Using where |
| 1 | PRIMARY | t1 | ref | calldate,call_id | call_id | 4 | crimson.t2.call_id | 1 | 5.19 | 100.00 | 0.03 | Using where |
| 2 | MATERIALIZED | <derived3> | ALL | NULL | NULL | NULL | NULL | 1222 | 1222.00 | 100.00 | 100.00 | |
| 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+------+--------------+-------------+--------+------------------------+---------+---------+--------------------+------+---------+----------+------------+---------------------------------+
7 rows in set (5 hours 16 min 16.738 sec)
ANALYZE without any IN condition at all.
+------+-------------+-------+-------+------------------------+----------+---------+--------------------+-------+---------+----------+------------+---------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+-------------+-------+-------+------------------------+----------+---------+--------------------+-------+---------+----------+------------+---------------------------------------------------------------------+
| 1 | SIMPLE | t1 | range | calldate,call_id | calldate | 7 | NULL | 13400 | 7162.00 | 100.00 | 33.01 | Using index condition; Using where; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | ref | PRIMARY,called,call_id | call_id | 4 | crimson.t1.call_id | 1 | 3.21 | 100.00 | 35.24 | Using where |
| 1 | SIMPLE | g1 | ref | call_id | call_id | 4 | crimson.t1.call_id | 1 | 3.57 | 100.00 | 0.06 | Using where |
| 1 | SIMPLE | d1 | ref | leg_id | leg_id | 4 | crimson.t2.xid | 1 | 1.33 | 100.00 | 100.00 | Using index condition |
+------+-------------+-------+-------+------------------------+----------+---------+--------------------+-------+---------+----------+------------+---------------------------------------------------------------------+
4 rows in set (0.093 sec)
Example Tables:
CREATE TABLE `digit_dial_map_x` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`leg_id` int(11) NOT NULL,
`sequence` int(2) NOT NULL,
`digits` varchar(26) DEFAULT NULL,
`category` varchar(2) NOT NULL DEFAULT 'M',
INDEX `leg_id` (`leg_id`),
INDEX `digits` (`digits`),
INDEX `category` (`category`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `call_legs_x` (
`xid` bigint(20) NOT NULL AUTO_INCREMENT,
`call_id` int(11) NOT NULL,
`calldate` date NOT NULL,
`start_time` time DEFAULT NULL,
`duration_hr` varchar(6) DEFAULT NULL,
`duration_min` varchar(2) DEFAULT NULL,
`duration_sec` varchar(2) DEFAULT NULL,
`calling` varchar(7) DEFAULT NULL,
`called` varchar(7) DEFAULT NULL,
`ans` varchar(2) DEFAULT NULL,
`ans_time` varchar(4) DEFAULT NULL,
`digits_dialed` varchar(26) DEFAULT NULL,
`digits_actual` varchar(26) DEFAULT NULL,
`ani` varchar(20) DEFAULT NULL,
`dnis` varchar(10) DEFAULT NULL,
`extn` varchar(10) DEFAULT NULL,
`trans_conf` varchar(2) DEFAULT NULL,
`third_party` varchar(7) DEFAULT NULL,
`sysid` varchar(3) DEFAULT NULL,
`call_log_id` varchar(12) DEFAULT NULL,
`assoc_log_id` varchar(12) DEFAULT NULL,
`raw_id` int(11) NOT NULL,
`leg` varchar(2) DEFAULT NULL,
`call_start` datetime DEFAULT NULL,
`call_end` datetime DEFAULT NULL,
`call_start_utc` datetime DEFAULT NULL,
`call_end_utc` datetime DEFAULT NULL,
INDEX `calldate` (`calldate`, `start_time`),
INDEX `called` (`called`),
INDEX `call_id` (`call_id`),
INDEX `digits_dialed` (`digits_dialed`),
INDEX `raw_id` (`raw_id`),
INDEX `call_start` (`call_start`),
INDEX `call_end` (`call_end`),
INDEX `call_start_utc` (`call_start`),
INDEX `call_end_utc` (`call_end`),
INDEX `calling` (`calling`),
INDEX `ans_time` (`ans_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Example Query (omitted the 1200 IN options):
SELECT t1.call_id, t2.ans_time, t2.ans, ((t2.duration_hr * 3600) + (t2.duration_min *60) + t2.duration_sec) as duration, t2.digits_dialed, t2.digits_actual, t2.dnis, t2.trans_conf, t1.ani, t1.calling, t2.called, d1.digits, g1.extn
FROM call_legs_55 as t1
JOIN call_legs_55 as t2 ON t1.call_id=t2.call_id
JOIN digit_dial_map_55 as d1 ON t2.xid=d1.leg_id
JOIN call_legs_55 as g1 ON t1.call_id=g1.call_id
WHERE (t1.calldate BETWEEN '2019-11-25' AND '2019-11-25') AND NOT ((t1.calldate = '2019-11-25') AND (t1.start_time < '00:00:00')) AND NOT((t1.calldate = '2019-11-25') AND (t1.start_time > '24:00:00'))
AND (t1.calling IN ('T6001','T6002') )
AND d1.digits IN ('...')
AND t2.called !='X9999'
AND t1.calling != 'X9999'
AND t1.calling != ''
AND t2.ans_time != ''
AND (g1.extn IN ('52043','52042','52132','52116') AND g1.extn != t1.calling)
GROUP BY CONCAT(t1.call_id, g1.extn);
digit_dial_map_x is a many:many mapping table, correct? Its indexes are very inefficient.
Recommend:
PRIMARY KEY(leg_id, digits),
INDEX(digits, leg_id)
More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
(The table did not happen to be such.)
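Separately from the index suggestion, the second ANALYZE output (<subquery2>, MATERIALIZED, tvc_0._col_1) looks like MariaDB's rewrite of a long IN list into a subquery. Assuming the server is MariaDB 10.3.18 or later, where the in_predicate_conversion_threshold system variable is available (default 1000), one quick experiment is to disable that conversion for the session and re-run ANALYZE. This is a diagnostic sketch under that assumption, not a confirmed fix for this query.

```sql
-- Assumption: MariaDB 10.3.18+ (the variable is not exposed in regular builds before that).
SHOW VARIABLES LIKE 'in_predicate_conversion_threshold';

-- 0 disables the IN-list-to-subquery conversion for this session only.
SET SESSION in_predicate_conversion_threshold = 0;

-- Then re-run ANALYZE on the original query with the 1200-value IN list
-- and compare the resulting plan with the small-IN plan shown in the question.
```

If the plan goes back to starting from t1 via the calldate index, the conversion was what changed the join order.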
I have the following query. The main tables are company_reports and project_con_messages, with 770,000 and 1,040,000 records respectively.
When I run this query it takes about 20 seconds, and when I remove the last table, company_par_user_settings, it takes less than 0.5 seconds.
company_par_user_settings (short name: pus) has about 200,000 records and holds the user settings for each company partner. The table has a composite unique index on the partner_id and user_id fields. I also tried removing that index and replacing it with simple indexes on partner_id and user_id, but in the end it made no real difference to the running time.
SELECT *
FROM company_reports rep
LEFT JOIN system_users usr ON rep.user_id=usr.id
LEFT JOIN company_rep_subjects sbj ON rep.subject_id=sbj.id
INNER JOIN company_partners par ON rep.partner_id=par.id
LEFT JOIN project_con_messages mes ON rep.message_id=mes.id
LEFT JOIN company_par_user_settings pus ON par.id=pus.partner_id AND 1=pus.user_id
WHERE 1=1
ORDER BY rep.id DESC
LIMIT 0,50
Below is the EXPLAIN output for the above query:
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | rep | NULL | ALL | partner_id | NULL | NULL | NULL | 772236 | 100.00 | Using temporary; Using filesort |
| 1 | SIMPLE | par | NULL | eq_ref | PRIMARY | PRIMARY | 3 | portal_ebrahim.rep.partner_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | usr | NULL | eq_ref | PRIMARY | PRIMARY | 2 | portal_ebrahim.rep.user_id | 1 | 100.00 | Using where |
| 1 | SIMPLE | sbj | NULL | eq_ref | PRIMARY | PRIMARY | 2 | portal_ebrahim.rep.subject_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | mes | NULL | eq_ref | PRIMARY | PRIMARY | 4 | portal_ebrahim.rep.message_id | 1 | 100.00 | NULL |
| 1 | SIMPLE | pus | NULL | ALL | NULL | NULL | NULL | NULL | 191643 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
I would appreciate it if someone could help me with the right indexes or any other solution that makes the query run faster.
Edit:
Here is the SHOW CREATE TABLE output for company_par_user_settings:
CREATE TABLE `company_par_user_settings` (
`id` mediumint(9) NOT NULL AUTO_INCREMENT,
`partner_id` mediumint(8) unsigned NOT NULL,
`user_id` smallint(5) unsigned NOT NULL,
`access` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT '0-Not specified',
`access_category` tinyint(1) unsigned NOT NULL DEFAULT '0',
`notify` tinyint(1) unsigned NOT NULL DEFAULT '0',
`stars` tinyint(1) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `partner_id` (`partner_id`,`user_id`),
KEY `stars` (`stars`)
) ENGINE=MyISAM AUTO_INCREMENT=198729 DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci
This is the short version of my query:
SELECT product.* FROM product_list product
LEFT JOIN language_item language ON (product.title=language.languageVariable)
WHERE language.languageID = 1
ORDER BY language.languageValue ASC
With the ORDER BY, the query takes 3 seconds. When I remove the ORDER BY, it takes 0.3 seconds. Can you recommend a change to speed it up?
product.title and language.languageVariable hold a language variable such as global.product.title1, and languageValue is the title, such as car, doll or something else.
CREATE TABLE `language_item` (
`languageItemID` int(10) UNSIGNED NOT NULL,
`languageID` int(10) UNSIGNED NOT NULL DEFAULT '0',
`languageVariable` varchar(255) NOT NULL DEFAULT '',
`languageValue` mediumtext NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `language_item`
ADD PRIMARY KEY (`languageItemID`),
ADD UNIQUE KEY `languageVariable` (`languageVariable`,`languageID`),
ADD KEY `languageValue` (`languageValue`(300));
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | SIMPLE | product | NULL | ALL | PRIMARY,inactive,archive,productCategoryID | NULL | NULL | NULL | 1475 | 88.27 | Using where; Using temporary; Using filesort
1 | SIMPLE | language | NULL | ref | languageVariable | languageVariable | 767 | db.product.title | 136 | 1.00 | Using index condition
Try this:
SELECT d.* from (
SELECT product.*, language.languageValue AS lv
FROM product_list product
JOIN language_item language ON (product.title=language.languageVariable)
WHERE language.languageID = 1
) as d
ORDER BY d.lv ASC
I have the following MySQL (MyISAM) table with about 3 Million rows.
CREATE TABLE `tasks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`node` smallint(6) NOT NULL,
`pid` int(11) NOT NULL,
`job` int(11) NOT NULL,
`a_id` int(11) DEFAULT NULL,
`user_id` int(11) NOT NULL,
`state` int(11) NOT NULL,
`start_time` int(11) NOT NULL,
`end_time` int(11) NOT NULL,
`stop_time` int(11) NOT NULL,
`end_stream` int(11) NOT NULL,
`message` varchar(255) DEFAULT NULL,
`rate` float NOT NULL,
`exiting` int(11) NOT NULL DEFAULT '0',
`bytes` int(11) NOT NULL,
`motion` tinyint(4) NOT NULL,
PRIMARY KEY (`id`),
KEY `a_id` (`a_id`),
KEY `job` (`job`),
KEY `state` (`state`),
KEY `end_time` (`end_time`),
KEY `start_time` (`start_time`)
) ENGINE=MyISAM AUTO_INCREMENT=100 DEFAULT CHARSET=utf8;
Now when I run the following query, MySQL is only using the a_id index and needs to scan a few thousand rows.
SELECT count(id) AS tries FROM `tasks` WHERE ( job='1' OR job='3' )
AND a_id='614' AND state >'80' AND state < '100' AND start_time >='1386538013';
When I add an additional index KEY newkey (a_id,state,start_time), MySQL still uses a_id only and not newkey. The new index is used only when I add a FORCE INDEX hint to the query. Reordering the conditions in the query does not help.
Any ideas? I don't necessarily want hints in my statements. The fact that MySQL is not doing this automatically indicates to me that there is an issue with my table, keys or query somewhere. Any help is highly appreciated.
Additional info:
mysql> show index in tasks;
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| tasks | 0 | PRIMARY | 1 | id | A | 3130554 | NULL | NULL | | BTREE | | |
| tasks | 1 | a_id | 1 | a_id | A | 2992 | NULL | NULL | YES | BTREE | | |
| tasks | 1 | job | 1 | job | A | 5 | NULL | NULL | | BTREE | | |
| tasks | 1 | state | 1 | state | A | 9 | NULL | NULL | | BTREE | | |
| tasks | 1 | end_time | 1 | end_time | A | 1565277 | NULL | NULL | | BTREE | | |
| tasks | 1 | newkey | 1 | a_id | A | 2992 | NULL | NULL | YES | BTREE | | |
| tasks | 1 | newkey | 2 | state | A | 8506 | NULL | NULL | | BTREE | | |
| tasks | 1 | newkey | 3 | start_time | A | 3130554 | NULL | NULL | | BTREE | | |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
EXPLAIN with and without quotes:
mysql> DESCRIBE SELECT count(id) AS tries FROM `tasks` WHERE ( job='1' OR job='3' ) AND a_id='614' AND state >'80' AND state < '100' AND start_time >='1386538013';
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| 1 | SIMPLE | tasks | ref | a_id,job,state,newkey | a_id | 5 | const | 740 | Using where |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
1 row in set (0.10 sec)
mysql> DESCRIBE SELECT count(id) AS tries FROM `tasks` WHERE ( job=1 OR job=3 ) AND a_id = 614 AND state > 80 AND state < 100 AND start_time >= 1386538013;
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
| 1 | SIMPLE | tasks | ref | a_id,job,state,newkey | a_id | 5 | const | 740 | Using where |
+----+-------------+-------+------+----------------------------+-----------+---------+-------+------+-------------+
1 row in set (0.01 sec)
A few things... I would have a SINGLE compound index on
( a_id, job, state, start_time )
This helps optimize the query on all the criteria, in what I believe is the best-tuned sequence: a single a_id, then two jobs, a small state range, then the time. Next, notice there are no quotes. It appears you were comparing numeric columns against strings; leave the values numeric, since numeric comparison is faster than string comparison.
Also, with all of these columns in the index, it is a COVERING index, meaning the engine does NOT have to go to the raw row data to read the other values when testing whether a record qualifies.
SELECT
count(*) AS tries
FROM
tasks
WHERE
a_id = 614
AND job IN ( 1, 3 )
AND state > 80 AND state < 100
AND start_time >= 1386538013;
Now, why this index? Consider the following scenario. You have two rooms full of boxes. In the first room, each box is an "a_id"; within each box the jobs are in order, within each job are the state ranges, and finally the rows are ordered by start time.
In the other room, the boxes are sorted by start time, within each of those by a_id, and finally by state.
In which room would it be easier to find what you need? That is how you should think about indexes. I would rather go to one box for "A_ID = 614", then jump to Job 1 and then to Job 3. Within each of Job 1 and Job 3, grab states 80-100, then filter by time. You, however, know your data and the volume behind each criterion better, and may adjust the order.
Finally, count(ID) vs count(*). All I care about is that a record qualified. I don't need the actual ID, because the filtering criteria have already decided whether the row is included, so why look (in this case) for the actual "ID"?
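A minimal sketch of creating that compound index and checking the plan follows. The index name idx_aid_job_state_start is made up for illustration; the column order is the one proposed above.

```sql
-- Hypothetical index name; column order follows the answer above.
ALTER TABLE tasks
  ADD INDEX idx_aid_job_state_start (a_id, job, state, start_time);

-- Check which index the optimizer now picks for the rewritten query.
EXPLAIN
SELECT COUNT(*) AS tries
FROM tasks
WHERE a_id = 614
  AND job IN (1, 3)
  AND state > 80 AND state < 100
  AND start_time >= 1386538013;
```

If the Extra column of the EXPLAIN output contains "Using index", the query is being answered from the index alone, which is what a covering index is meant to achieve.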
MySQL probably estimates that using the a_id key will need less I/O.
The cardinality of the a_id key is probably good enough.
What does EXPLAIN say for the hinted and unhinted queries?
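One way to make that comparison, as a sketch using the newkey index from the question, is to run EXPLAIN on the same query with and without FORCE INDEX and compare the key and rows columns:

```sql
-- Plan chosen by the optimizer on its own.
EXPLAIN
SELECT COUNT(id) AS tries
FROM tasks
WHERE (job = 1 OR job = 3)
  AND a_id = 614
  AND state > 80 AND state < 100
  AND start_time >= 1386538013;

-- Same query, forced onto the composite index from the question.
EXPLAIN
SELECT COUNT(id) AS tries
FROM tasks FORCE INDEX (newkey)
WHERE (job = 1 OR job = 3)
  AND a_id = 614
  AND state > 80 AND state < 100
  AND start_time >= 1386538013;
```

If the rows estimate for newkey is not clearly lower than the 740 reported for a_id, that would explain why the optimizer keeps choosing a_id.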
If most rows with a_id=614 have a state above 80 and below 100, this could happen. Have you tried one of the indexes below?
INDEX(a_id, start_time, state)
INDEX(start_time, a_id, state)
I have a table with 18,310,298 records right now,
and the following query:
SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE
`event` = '11' AND
`obj_type` = '2' AND
`region` = 'us' AND
DATE(`date`) = DATE('20120213010502');
The table has the following structure:
CREATE TABLE `logs` (
`log_id` int(11) NOT NULL AUTO_INCREMENT,
`event` tinyint(4) NOT NULL,
`obj_type` tinyint(1) NOT NULL DEFAULT '0',
`obj_id` int(11) unsigned NOT NULL DEFAULT '0',
`region` varchar(3) NOT NULL DEFAULT '',
`date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`log_id`),
KEY `event` (`event`),
KEY `obj_type` (`obj_type`),
KEY `region` (`region`),
KEY `for_stat` (`event`,`obj_type`,`obj_id`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table'
and MySQL EXPLAIN shows the following:
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| 1 | SIMPLE | logs | ref | event,obj_type,region,for_stat | for_stat | 2 | const,const | 837216 | 100.00 | Using where; Using index |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
Running this query at daily peak usage time takes about 5 seconds.
What can I do to make it faster?
UPDATED: Following the comments, I modified the index and removed the DATE function from the WHERE clause:
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| logs | 0 | PRIMARY | 1 | log_id | A | 15379109 | NULL | NULL | | BTREE | |
| logs | 1 | event | 1 | event | A | 14 | NULL | NULL | | BTREE | |
| logs | 1 | obj_type | 1 | obj_type | A | 14 | NULL | NULL | | BTREE | |
| logs | 1 | region | 1 | region | A | 14 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 1 | event | A | 157 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 2 | obj_type | A | 157 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 3 | region | A | 157 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 4 | date | A | 157 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
mysql> explain extended SELECT COUNT(obj_id) as cnt
-> FROM `common`.`logs`
-> WHERE `event`= '11' AND
-> `obj_type` = '2' AND
-> `region`= 'est' AND
-> date between '2012-11-25 00:00:00' and '2012-11-25 23:59:59';
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | logs | range | event,obj_type,region,for_stat | for_stat | 21 | NULL | 9674 | 75.01 | Using where |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
It seems it's running faster. Thanks everyone.
The EXPLAIN output shows that the query is using only the first two columns of the for_stat index.
This is because the query doesn't use obj_id in the WHERE clause. If you create a new key without obj_id (or modify the existing key to reorder the columns), more of the key can be used and you may see better performance:
KEY `for_stat2` (`event`,`obj_type`,`region`,`date`)
If it's still too slow, changing the last condition, where you use DATE(), as said by Salman and Sashi, might improve things.
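As a sketch, the new key could be added and checked like this. The key name for_stat2 and its columns come from the suggestion above; the query is the one from the question.

```sql
-- Add the suggested key without obj_id.
ALTER TABLE `common`.`logs`
  ADD KEY `for_stat2` (`event`, `obj_type`, `region`, `date`);

-- Re-check the plan for the original query.
EXPLAIN
SELECT COUNT(obj_id) AS cnt
FROM `common`.`logs`
WHERE `event` = 11
  AND `obj_type` = 2
  AND `region` = 'us'
  AND DATE(`date`) = DATE('20120213010502');
```

In the question's EXPLAIN, key_len is 2, which corresponds to the two tinyint columns event and obj_type. A larger key_len on for_stat2 would indicate that region is now being used from the index as well.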
@Joni already explained what is wrong with your index. As for the query, I assume that your example selects all records for 2012-02-13 regardless of time. You can change the WHERE clause to use >= and < instead of the DATE cast:
SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE
`event` = 11 AND
`obj_type` = 2 AND
`region` = 'us' AND
`date` >= DATE('20120213010502') AND
`date` < DATE('20120213010502') + INTERVAL 1 DAY
Applying the DATE() function to the date column prevents MySQL from using the index for that condition, so far more rows have to be examined.
Try this:
SELECT COUNT(obj_id) as cnt
FROM
`common`.`logs`
WHERE
`event` = 11
AND
`obj_type` = 2
AND
`region` = 'us'
AND
`date` >= DATE('20120213010502')
AND
`date` < DATE('20120213010502') + INTERVAL 1 DAY
Since logging (inserts) needs to be fast too, use as few indexes as possible.
Reporting queries may take longer; they are administrative work and do not necessarily need every index.
CREATE TABLE `logs` (
`log_id` int(11) NOT NULL AUTO_INCREMENT,
`event` tinyint(4) NOT NULL,
`obj_type` tinyint(1) NOT NULL DEFAULT '0',
`obj_id` int(11) unsigned NOT NULL DEFAULT '0',
`region` varchar(3) NOT NULL DEFAULT '',
`date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`log_id`),
KEY `for_stat` (`event`,`obj_type`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table'
As for the date search, @SashiKant and @SalmanA have already answered.
In MySQL you should order index columns by cardinality: columns with fewer distinct values in the table are placed closer to the left.
You can also try changing the region column to ENUM() and searching on date with a BETWEEN clause.
MySQL is not using the third column of the index because using it takes more effort than simply filtering (this is a common thing in MySQL).