I noticed a serious problem recently, when my table grew to over 620000 records. The following query:
SELECT *,UNIX_TIMESTAMP(`time`) AS `time` FROM `log` WHERE (`projectname`="test" OR `projectname` IS NULL) ORDER BY `time` DESC LIMIT 0, 20
takes about 2.5 s to execute on a local database. How can I speed it up?
The EXPLAIN command produces the following output:
id: 1
select_type: SIMPLE
table: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 387
ref: const
rows: 310661
Extra: Using where; Using filesort
I have indexes on the projectname and time columns.
Any help?
EDIT: Thanks to ypercube's response, I was able to decrease the query execution time. But as soon as I add another condition to the WHERE clause (AND severity="changes"), it takes 2 s again. Is it a good solution to include all of the possible WHERE columns in my composite index?
id: 1
select_type: SIMPLE
table: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 419
ref: const, const
rows: 315554
Extra: Using where; Using filesort
Table structure:
CREATE TABLE `log` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`projectname` VARCHAR(128) DEFAULT NULL,
`time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`master` VARCHAR(128) NOT NULL,
`itemName` VARCHAR(128) NOT NULL,
`severity` VARCHAR(10) NOT NULL DEFAULT 'info',
`message` VARCHAR(255) NOT NULL,
`more` TEXT NOT NULL,
PRIMARY KEY (`id`),
KEY `projectname` (`severity`,`projectname`,`time`)
) ENGINE=INNODB AUTO_INCREMENT=621691 DEFAULT CHARSET=utf8
Add an index on (projectname, time):
ALTER TABLE log
ADD INDEX projectname_time_IX -- choose a name for the index
(projectname, time) ;
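And then use the original column for the ORDER BY. In the question, the alias time shadows the column of the same name, so the sort runs on the computed UNIX_TIMESTAMP value and cannot use the index. Giving the alias a different name avoids that: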
SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE (projectname = 'test' OR projectname IS NULL)
ORDER BY time DESC
LIMIT 0, 20 ;
or this variation - to make sure that the index is used effectively:
( SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE projectname = 'test'
ORDER BY time DESC
LIMIT 20
)
UNION ALL
( SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE projectname IS NULL
ORDER BY time DESC
LIMIT 20
)
ORDER BY time DESC
LIMIT 20 ;
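To check whether the new index is actually picked up, one branch of the query can be run through EXPLAIN. This is only a sketch and assumes the projectname_time_IX index from above has already been created; the exact output depends on the data and the MySQL version.

```sql
-- Assumes the index projectname_time_IX (projectname, time) already exists.
EXPLAIN
SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE projectname = 'test'
ORDER BY time DESC
LIMIT 20;
```

If the index is used for both the filter and the sort, the key column should show projectname_time_IX and "Using filesort" should no longer appear in Extra.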
Related
I have a query like this:
SELECT id, terrain, occupied, c_type
FROM map
WHERE x >= $x-$radius
AND x <= $x+$radius
AND y >= $y-$radius
AND y <= $y+$radius
ORDER BY
x ASC,
y ASC
My table looks like this:
CREATE TABLE `map` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`occupied` tinyint(2) NOT NULL DEFAULT '0',
`c_type` tinyint(4) NOT NULL DEFAULT '0',
`x` int(11) NOT NULL,
`y` int(11) NOT NULL,
`terrain` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
I removed all indexes except the PRIMARY KEY, because I am unsure how indexing works in SQL.
What can I do to tune this query? Thanks...
This is not a duplicate, check the comments!
The best you can do with that query is
INDEX(x, y) -- In this order
That will be effective with
WHERE x >= $x-$radius
AND x <= $x+$radius
but not for filtering on y.
And it will (probably) avoid the "filesort" for ORDER BY x ASC, y ASC. Note that the index order must match this order.
Provide EXPLAIN SELECT ... for any attempted SELECT.
And, please switch to InnoDB before MyISAM is removed.
Here is a quick cookbook on indexing in MySQL.
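The advice above could look roughly like this in practice. The index name x_y_idx and the literal bounds (standing in for $x-$radius and so on) are made up for illustration; they are not from the question.

```sql
-- Hypothetical index name; column order matches the ORDER BY x, y.
ALTER TABLE map ADD INDEX x_y_idx (x, y);

-- Switch the storage engine, as recommended above.
ALTER TABLE map ENGINE=InnoDB;

-- Placeholder bounds: these stand for $x-$radius, $x+$radius, $y-$radius, $y+$radius.
EXPLAIN
SELECT id, terrain, occupied, c_type
FROM map
WHERE x >= 90 AND x <= 110
  AND y >= 40 AND y <= 60
ORDER BY x ASC, y ASC;
```

The EXPLAIN output is the thing to post back: it shows whether the index was chosen and whether "Using filesort" is gone.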
So, I added index gmp with columns x, y, id, terrain, occupied, c_type and EXPLAIN SELECT displays:
id: 1
select_type: SIMPLE
table: map
type: range
possible_keys: gmp
key: gmp
key_len: 8
ref: NULL
rows: 1955
Extra: Using where; Using index
So, I guess it works now.
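For reference, the index described above would presumably have been created with something like the statement below. This is reconstructed from the column list in the text, not copied from the actual schema.

```sql
-- Reconstructed from the description: all selected columns are in the index,
-- so the query can be answered from the index alone ("Using index" in Extra).
ALTER TABLE map
  ADD INDEX gmp (x, y, id, terrain, occupied, c_type);
```

The trade-off of such a covering index is size: it stores a copy of every listed column, so writes to map become somewhat more expensive.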
The Facts:
Dedicated Server, 4 Cores, 16GB
MySQL 5.5.29-0ubuntu0.12.10.1-log - (Ubuntu)
One Table, 1.9M rows and growing
I need all sorted rows for an export, or just the top 5. The query takes 25 seconds, of which 23.3 s is spent in "Copying To Tmp Table".
I tried InnoDB and MyISAM, changing the index order, using an MD5 hash of some_text for the GROUP BY, and partitioning the table by day.
day is a Unix timestamp and is always present.
lang, some_bool, some_filter, ano_filter and rel_id may appear in the WHERE clause, but do not have to.
Here is the MyISAM example:
The table
mysql> SHOW CREATE TABLE data \G;
*************************** 1. row ***************************
Table: data
Create Table: CREATE TABLE `data` (
`data_id` bigint(20) NOT NULL AUTO_INCREMENT,
`rel_id` int(11) NOT NULL,
`some_text` varchar(255) DEFAULT NULL,
`lang` varchar(3) DEFAULT NULL,
`some_bool` tinyint(1) DEFAULT NULL,
`some_filter` varchar(40) DEFAULT NULL,
`ano_filter` varchar(10) DEFAULT NULL,
`day` int(11) DEFAULT NULL,
PRIMARY KEY (`data_id`),
KEY `cnt_idx` (`some_filter`,`ano_filter`,`rel_id`,`lang`,`some_bool`,`some_text`,`day`)
) ENGINE=MyISAM AUTO_INCREMENT=1900099 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
The query
mysql> EXPLAIN SELECT `some_text` , COUNT(*) AS `num` FROM `data`
WHERE `lang` = 'en' AND `day` BETWEEN '1364342400' AND
'1366934399' GROUP BY `some_text` ORDER BY `num` DESC \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: data
type: index
possible_keys: NULL
key: cnt_idx
key_len: 947
ref: NULL
rows: 1900098
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.00 sec)
mysql> SELECT `some_text` , COUNT(*) AS `num` FROM `data`
WHERE `lang` = 'en' AND `day` BETWEEN '1364342400' AND '1366934399'
GROUP BY `some_text` ORDER BY `num` DESC LIMIT 5 \G;
...
*************************** 5. row ***************************
5 rows in set (24.26 sec)
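Any idea how to speed up that thing?
The index is not being used to narrow the search, because of its column order: EXPLAIN shows type: index with possible_keys: NULL, which means all 1.9M index entries are scanned. Indexes work left to right, and the query does not filter on the leading columns (some_filter, ano_filter, rel_id). For this query to use an index for lookups, you would need an index on (lang, day).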
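A minimal sketch of that index follows. The index names are invented here, and the second, wider variant is an extra suggestion that is not part of the answer: it adds some_text so the query might be served from the index alone.

```sql
-- Hypothetical name; equality column (lang) first, range column (day) second.
ALTER TABLE data ADD INDEX lang_day_idx (lang, day);

-- Optional variant (an assumption, not from the answer): also covers some_text.
-- ALTER TABLE data ADD INDEX lang_day_text_idx (lang, day, some_text);

EXPLAIN
SELECT some_text, COUNT(*) AS num
FROM data
WHERE lang = 'en' AND day BETWEEN 1364342400 AND 1366934399
GROUP BY some_text
ORDER BY num DESC
LIMIT 5;
```

Even with the index, "Using temporary; Using filesort" will probably remain, because the grouping column is not the one the rows are read in order of and the sort is on a computed count. The gain would come from reading far fewer rows.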
Table structure:
CREATE TABLE IF NOT EXISTS `logs` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user` bigint(20) unsigned NOT NULL,
`type` tinyint(1) unsigned NOT NULL,
`date` int(11) unsigned NOT NULL,
`plus` decimal(10,2) unsigned NOT NULL,
`minus` decimal(10,2) unsigned NOT NULL,
`tax` decimal(10,2) unsigned NOT NULL,
`item` bigint(20) unsigned NOT NULL,
`info` char(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `item` (`item`),
KEY `user` (`user`),
KEY `type` (`type`),
KEY `date` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=0 ROW_FORMAT=FIXED;
Query:
SELECT logs.item, COUNT(logs.item) AS total FROM logs WHERE logs.type = 4 GROUP BY logs.item;
The table holds 110k records, of which 50k are type 4.
Execution time: 0.13 seconds
I know this is fast, but can I make it faster?
I am expecting 1 million records and thus the time would grow quite a bit.
Analyze queries with EXPLAIN:
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type
key: type
key_len: 1
ref: const
rows: 1
Extra: Using where; Using temporary; Using filesort
The "Using temporary; Using filesort" indicates some costly operations. Because the optimizer knows it can't rely on the rows with each value of item being stored together, it needs to read all the matching rows and collect the count per distinct item in a temporary table. It then sorts the resulting temp table to produce the result.
You need an index on the logs table on columns (type, item) in that order. Then the optimizer knows it can leverage the index tree to scan each value of logs.item fully before moving on to the next value. By doing this, it can skip the temporary table to collect values, and skip the implicit sorting of the result.
mysql> CREATE INDEX logs_type_item ON logs (type,item);
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type,logs_type_item
key: logs_type_item
key_len: 1
ref: const
rows: 1
Extra: Using where
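One possible follow-up, offered as a suggestion rather than part of the answer: since indexes work left to right, the new (type, item) index can serve any lookup that the single-column index on type served. Once EXPLAIN confirms that queries use logs_type_item, the old index could likely be dropped to save space and write overhead.

```sql
-- Assumption: no query relies on an index hint naming the old index.
-- Verify with EXPLAIN on the affected queries before dropping.
DROP INDEX `type` ON logs;
```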
I have the following table with 2 million rows in it.
CREATE TABLE `gen_fmt_lookup` (
`episode_id` varchar(30) DEFAULT NULL,
`media_type` enum('Audio','Video') NOT NULL DEFAULT 'Video',
`service_id` varchar(50) DEFAULT NULL,
`genre_id` varchar(30) DEFAULT NULL,
`format_id` varchar(30) DEFAULT NULL,
`masterbrand_id` varchar(30) DEFAULT NULL,
`signed` int(11) DEFAULT NULL,
`actual_start` datetime DEFAULT NULL,
`scheduled_start` datetime DEFAULT NULL,
`scheduled_end` datetime DEFAULT NULL,
`discoverable_end` datetime DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
KEY `idx_discoverability_gn` (`media_type`,`service_id`,`genre_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`),
KEY `idx_discoverability_fmt` (`media_type`,`service_id`,`format_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Below is the query, with its EXPLAIN output, that I am running against this table:
mysql> EXPLAIN select episode_id,scheduled_start
from gen_fmt_lookup
where media_type='video'
and service_id in ('mobile_streaming_100','mobile_streaming_200','iplayer_streaming_h264_flv_vlo','mobile_streaming_500','iplayer_stb_uk_stream_aac_concrete','captions','iplayer_uk_stream_aac_rtmp_concrete','iplayer_streaming_n95_3g','iplayer_uk_download_oma_wifi','iplayer_uk_stream_aac_3gp_concrete')
and genre_id in ('100001','100002','100003','100004','100005','100006','100007','100008','100009','100010')
and NOW() BETWEEN actual_start and scheduled_end
group by episode_id order by min(scheduled_start) limit 1 offset 100\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: nitro_episodes_gen_fmt_lookup
type: range
possible_keys: idx_discoverability_gn,idx_discoverability_fmt
key: idx_discoverability_gn
key_len: 96
ref: NULL
rows: 31719
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.16 sec)
So my questions are:
Is the index used in the query execution the best one? If not, can someone please suggest a better index?
Can MySQL use a composite index with two dates in the WHERE clause? The query above has the condition "NOW() BETWEEN actual_start AND scheduled_end", but MySQL uses the index idx_discoverability_gn with a key length of only 96, which means it uses the index only up to (media_type, service_id, genre_id, actual_start). Why can't it use the index up to (media_type, service_id, genre_id, actual_start, scheduled_end)?
What else can I do to improve performance?
You have a range check, so a clustered index on (scheduled_start, actual_start, scheduled_end) might help. Your current indexes are not very useful. You can get rid of them and build one primary key (episode_id) and another index (service_id, genre_id, episode_id).
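A rough sketch of the secondary index suggested above, with the index name invented for illustration. The primary key is left commented out on purpose: it only works if episode_id is unique and never NULL, and the schema shown (DEFAULT NULL, plus a GROUP BY episode_id in the query) makes that doubtful.

```sql
-- Hypothetical index name; columns as suggested in the answer.
ALTER TABLE gen_fmt_lookup
  ADD INDEX idx_service_genre_episode (service_id, genre_id, episode_id);

-- Only valid if episode_id is unique and contains no NULLs,
-- which the schema above does not guarantee:
-- ALTER TABLE gen_fmt_lookup
--   MODIFY episode_id VARCHAR(30) NOT NULL,
--   ADD PRIMARY KEY (episode_id);
```

Re-running the EXPLAIN from the question afterwards would show whether the optimizer prefers the new index over idx_discoverability_gn. Keep the old indexes until that comparison has been made.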
Here's my SHOW CREATE TABLE tbl:
CREATE TABLE IF NOT EXISTS `msc_pagestats` (
`id` int(10) unsigned NOT NULL auto_increment,
`domain` varchar(250) NOT NULL,
`file` varchar(200) NOT NULL,
`simbol` varchar(100) NOT NULL,
`request_time` timestamp NULL default CURRENT_TIMESTAMP,
`querystring` mediumtext NOT NULL,
`host` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
KEY `myindex` (`simbol`,`request_time`,`file`,`domain`)
) ENGINE=MyISAM DEFAULT CHARSET=latin2 AUTO_INCREMENT=248008 ;
So basically this table keeps track of which symbols have been most accessed, most viewed and most searched within the site, based on a query string.
My query is:
SELECT `simbol`, count(*) AS requests
FROM msc_pagestats
WHERE 1=1 AND request_time > '20100504000000'
AND simbol NOT LIKE ''
GROUP BY `simbol`
ORDER BY requests DESC
LIMIT 0, 15;
This query EXPLAINED:
id: 1
select_type: SIMPLE
table: msc_pagestats
type: index
possible_keys: NULL
key: myindex
key_len: 561
ref: NULL
rows: 24961
Extra: Using where; Using index; Using temporary; Using filesort
So the query tries to get the most accessed symbols in the last hour or today.
Here's what I've tried doing in order to get rid of using temporary and using filesort:
Adding an ID as primary key and using COUNT(id) AS requests instead of COUNT(*) AS requests;
Removing the WHERE 1=1 and simbol NOT LIKE '' conditions; it doesn't make a big difference though;
Adding a multi-column index instead of the regular indexes; previously there were indexes on each column, e.g. (KEY(request_time), KEY(file), KEY(domain), KEY(simbol)).
I'm not that good at optimization, so I've run out of options.
Here's a dump of my 'mysq_slow_query' file:
Query_time: 3 Lock_time: 0 Rows_sent: 15 Rows_examined: 220297
use kmarket;
SELECT `simbol`, count(*) AS requests
FROM msc_pagestats
WHERE 1=1 AND request_time > '20100504000000'
AND simbol NOT LIKE ''
GROUP BY `simbol`
ORDER BY requests DESC
LIMIT 0, 15;
Any help would be appreciated, thanks :)
There is not much point in adding an index on a field calculated at run time; it would still need to be sorted/indexed on each run.
An index on (request_time,simbol) may allow the optimiser to exclude a lot of rows more quickly and also reduce the key length.
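As a sketch, the suggested index could be added and checked like this. The index name is made up for the example.

```sql
-- Hypothetical name; range column first, as suggested above.
ALTER TABLE msc_pagestats
  ADD INDEX request_time_simbol_idx (request_time, simbol);

EXPLAIN
SELECT `simbol`, COUNT(*) AS requests
FROM msc_pagestats
WHERE request_time > '20100504000000'
  AND simbol NOT LIKE ''
GROUP BY `simbol`
ORDER BY requests DESC
LIMIT 0, 15;
```

The temporary table and filesort for the GROUP BY and the ORDER BY on the count are likely to stay. What should change is the rows estimate and the key_len, since only entries newer than the cutoff need to be read.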