Table structure:
CREATE TABLE IF NOT EXISTS `logs` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user` bigint(20) unsigned NOT NULL,
`type` tinyint(1) unsigned NOT NULL,
`date` int(11) unsigned NOT NULL,
`plus` decimal(10,2) unsigned NOT NULL,
`minus` decimal(10,2) unsigned NOT NULL,
`tax` decimal(10,2) unsigned NOT NULL,
`item` bigint(20) unsigned NOT NULL,
`info` char(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `item` (`item`),
KEY `user` (`user`),
KEY `type` (`type`),
KEY `date` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=0 ROW_FORMAT=FIXED;
Query:
SELECT logs.item, COUNT(logs.item) AS total FROM logs WHERE logs.type = 4 GROUP BY logs.item;
Table holds 110k records out of which 50k type 4 records.
Execution time: 0.13 seconds
I know this is fast, but can I make it faster?
I am expecting 1 million records and thus the time would grow quite a bit.
Analyze queries with EXPLAIN:
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type
key: type
key_len: 1
ref: const
rows: 1
Extra: Using where; Using temporary; Using filesort
The "Using temporary; Using filesort" indicates some costly operations. Because the optimizer knows it can't rely on the rows with each value of item being stored together, it needs to scan the whole table and collect the count per distinct item in a temporary table. Then sort the resulting temp table to produce the result.
You need an index on the logs table on columns (type, item) in that order. Then the optimizer knows it can leverage the index tree to scan each value of logs.item fully before moving on to the next value. By doing this, it can skip the temporary table to collect values, and skip the implicit sorting of the result.
mysql> CREATE INDEX logs_type_item ON logs (type,item);
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type,logs_type_item
key: logs_type_item
key_len: 1
ref: const
rows: 1
Extra: Using where
Related
i have a below query and I don't know how to do explain plan for it. so what i have is temp table create query and table structure.
create temporary table if not exists tmp_staging_task_ids as
select distinct s.usr_task_id
from ue_events_staging s
where s.queue_id is null
limit 6500;
the above select query explain plan ;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: s
partitions: NULL
type: ref
possible_keys: ue_events_staging_queue_id,usr_task_id,queue_id_usr_task_id,queue_id_app_id
key: queue_id_usr_task_id
key_len: 303
ref: const
rows: 17774428
filtered: 100.00
Extra: Using where; Using index; Using temporary
Query;
update ue_events_staging s
join tmp_staging_task_ids t on t.usr_task_id = s.usr_task_id
set s.queue_id = 'queue_id';
table structure;
Create Table: CREATE TABLE `ue_events_staging` (
`id` bigint NOT NULL AUTO_INCREMENT,
`queue_id` varchar(100) DEFAULT NULL,
`usr_task_id` bigint NOT NULL,
`app_id` bigint NOT NULL,
`platform` tinyint NOT NULL,
`capture_time` bigint NOT NULL,
`input_type` varchar(50) NOT NULL,
`type` varchar(100) NOT NULL,
`event_type` varchar(10) NOT NULL,
`screen` varchar(100) NOT NULL,
`object_name` varchar(255) DEFAULT NULL,
`app_custom_tag` varchar(255) DEFAULT NULL,
`exception_class_name` varchar(250) DEFAULT NULL,
`exception_tag` varchar(250) DEFAULT NULL,
`non_responsive` tinyint(1) DEFAULT '0',
`is_first` tinyint(1) DEFAULT '0',
`is_second` tinyint(1) DEFAULT '0',
`is_last` tinyint(1) DEFAULT '0',
`is_quit` tinyint(1) DEFAULT '0',
`x_coordinate` double DEFAULT NULL,
`y_coordinate` double DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `ue_events_staging_queue_id` (`queue_id`),
KEY `usr_task_id` (`usr_task_id`),
KEY `screen` (`app_id`,`platform`,`screen`),
KEY `app_id_queue_id` (`app_id`,`queue_id`),
KEY `queue_id_usr_task_id` (`queue_id`,`usr_task_id`),
KEY `queue_id_app_id` (`queue_id`,`app_id`)
please check the possibilities it takes around 3.5K seconds and causes load.
This looks like you're doing your updates in batches of 6500 rows.
If you don't need that temporary table, you can refactor your update query to stand alone. You don't need the temporary table because you can put its WHERE queue_id IS NULL directly into the WHERE of your UPDATE.
UPDATE ue_events_staging
SET queue_id = 'queue_id'
WHERE queue_id IS NULL
LIMIT 6500;
Your temporary table creation step pulls 6500 distinct (arbitrarily chosen) usr_task_id values from your table. Some of those values may relate to more than one row in your table, so your UPDATE statement may update more than 6500 rows in your table.
The refactoring I suggest will update 6500 arbitarily chosen rows in your table. At the end of the statement it's possible that some rows with a particular usr_task_id value will be updated and others will not. If that's acceptable for your business rules it will be faster.
If your business rules require all rows with each particular usr_task_id value to be updated at once, you could try this to simplify both statements.
create temporary table if not exists tmp_staging_task_ids as
select s.usr_task_id
from ue_events_staging s
where s.queue_id is null
limit 6500;
update ue_events_staging
set queue_id = 'queue_id'
where usr_task_id IN
(select usr_task_id from tmp_staging_task_ids);
This gets rid of the DISTINCT operator in the creation of your temporary table and may save a little time. The IN clause implies DISTINCT values.
"Arbitrarily chosen"? Statements without ORDER BY and with LIMIT clauses instruct MySQL to choose rows arbitrarily. MySQL picks the rows that are fastest to retrieve (hopefully).
I have a database with a table called QuizMatches. The table has the following structure:
CREATE TABLE `QuizMatches` (
`QuizMatchesGuid` binary(16) NOT NULL,
`DateStarted` datetime NOT NULL,
`LatestChanged` datetime NOT NULL,
`HostFBUserToken` varchar(250) NOT NULL,
`GuestFBUserToken` varchar(250) NOT NULL,
`ArrayOfQuestionIDs` varchar(200) NOT NULL,
`ArrayOfQuestionResponseTimesAndAnswersHost` varchar(900) NOT NULL,
`ArrayOfQuestionResponseTimesAndAnswersGuest` varchar(900) NOT NULL,
`MatchFinished` int(1) NOT NULL DEFAULT '0',
`Category` varchar(45) NOT NULL,
`JsonQuestions` varchar(4000) NOT NULL DEFAULT '[]',
`DateFinished` datetime NOT NULL,
`LatestPushSentDate` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`QuizMatchesGuid`),
KEY `HostFBUserTokenIX` (`HostFBUserToken`),
KEY `GuestFBUserTokenIX` (`GuestFBUserToken`),
KEY `MatchFinishedIX` (`MatchFinished`),
KEY `LatestChangedIX` (`LatestChanged`),
KEY `LatestPushSentDateIX` (`LatestPushSentDate`),
KEY `DateFinishedIX` (`LatestChanged`,`HostFBUserToken`,`GuestFBUserToken`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
There is a large number of rows in this table and it is heavily used by multiple clients especially queries like the following are executed:
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, DateStarted,
LatestChanged, HostFBUserToken, GuestFBUserToken,
ArrayOfQuestionIDs, ArrayOfQuestionResponseTimesAndAnswersHost,
ArrayOfQuestionResponseTimesAndAnswersGuest, JsonQuestions
FROM CrystalDBQuiz.QuizMatches
ORDER BY LatestChanged DESC
LIMIT 10
The main problem seem to be that the database performs a full index scan. I have tried with different combinations of indexes but to no success.
If I run EXPLAIN on the above SELECT query I receive the following:
id: 1
select_type: SIMPLE
table: 'QuizMatches'
type: index
possible_keys: NULL
key: 'LatestChangedIX'
key_len: 8
ref: NULL
rows: 10
Extra:
Is there a way I can optimize SELECTS as the above example towards this database table?
If you are using the LIMIT statement for pagination I suggest you to use LatestChanged value for this due to ordering. So your query will turn to
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, DateStarted, LatestChanged,
HostFBUserToken, GuestFBUserToken, ArrayOfQuestionIDs,
ArrayOfQuestionResponseTimesAndAnswersHost,
ArrayOfQuestionResponseTimesAndAnswersGuest, JsonQuestions
FROM CrystalDBQuiz.QuizMatches
WHERE LatestChanged<[lastValue]
ORDER BY LatestChanged DESC
LIMIT 10
I've noticed a serious problem recently, when my database increased to over 620000 records. Following query:
SELECT *,UNIX_TIMESTAMP(`time`) AS `time` FROM `log` WHERE (`projectname`="test" OR `projectname` IS NULL) ORDER BY `time` DESC LIMIT 0, 20
has an execution time about 2,5s on a local database. I was wondering how can I speed it up?
The EXPLAIN commands produces following output:
ID: 1
select type: SIMPLE
TABLE: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 387
ref: const
rows: 310661
Extra: Using where; using filesort
I've got indexes set on projectname, time columns.
Any help?
EDIT: Thanks to ypercube response, I was able to decrease query execution time. But when I only add another condition to WHERE clause (AND severity="changes") it lasts 2s again. Is it a good solution to include all of the possible "WHERE" columns to my merged-index?
ID: 1
select type: SIMPLE
TABLE: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 419
ref: const, const
rows: 315554
Extra: Using where; using filesort
Table structure:
CREATE TABLE `log` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`projectname` VARCHAR(128) DEFAULT NULL,
`time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`master` VARCHAR(128) NOT NULL,
`itemName` VARCHAR(128) NOT NULL,
`severity` VARCHAR(10) NOT NULL DEFAULT 'info',
`message` VARCHAR(255) NOT NULL,
`more` TEXT NOT NULL,
PRIMARY KEY (`id`),
KEY `projectname` (`severity`,`projectname`,`time`)
) ENGINE=INNODB AUTO_INCREMENT=621691 DEFAULT CHARSET=utf8
Add an index on (projectname, time):
ALTER TABLE log
ADD INDEX projectname_time_IX -- choose a name for the index
(projectname, time) ;
And then use the original column for the ORDER BY
SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE (projectname = 'test' OR projectname IS NULL)
ORDER BY time DESC
LIMIT 0, 20 ;
or this variation - to make sure that the index is used effectively:
( SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE projectname = 'test'
ORDER BY time DESC
LIMIT 20
)
UNION ALL
( SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE projectname IS NULL
ORDER BY time DESC
LIMIT 20
)
ORDER BY time DESC
LIMIT 20 ;
I'm struggling to understand if I've indexed this query properly, it's somewhat slow and I feel it could use optimization. MySQL 5.1.70
select snaps.id, snaps.userid, snaps.ins_time, usr.gender
from usersnaps as snaps
join user as usr on usr.id = snaps.userid
left join user_convert as conv on snaps.userid = conv.userid
where (conv.level is null or conv.level = 4) and snaps.active = 'N'
and (usr.status = "unfilled" or usr.status = "unapproved") and usr.active = 1
order by snaps.ins_time asc
usersnaps table (irrelevant deta removed, size about 250k records) :
CREATE TABLE IF NOT EXISTS `usersnaps` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`userid` int(11) unsigned NOT NULL DEFAULT '0',
`picture` varchar(250) NOT NULL,
`active` enum('N','Y') NOT NULL DEFAULT 'N',
`ins_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`userid`),
KEY `userid` (`userid`,`active`),
KEY `ins_time` (`ins_time`),
KEY `active` (`active`)
) ENGINE=InnoDB;
user table (irrelevant deta removed, size about 300k records) :
CREATE TABLE IF NOT EXISTS `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`active` tinyint(1) NOT NULL DEFAULT '1',
`status` enum('15','active','approval','suspended','unapproved','unfilled','rejected','suspended_auto','incomplete') NOT NULL DEFAULT 'approval',
PRIMARY KEY (`id`),
KEY `status` (`status`,`active`)
) ENGINE=InnoDB;
user_convert table (size about : 60k records) :
CREATE TABLE IF NOT EXISTS `user_convert` (
`userid` int(10) unsigned NOT NULL,
`level` tinyint(4) NOT NULL,
UNIQUE KEY `userid` (`userid`),
KEY `level` (`level`)
) ENGINE=InnoDB;
Explain extended returns :
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE snaps ref userid,default_pic,active active 1 const 65248 100.00 Using where; Using filesort
1 SIMPLE usr eq_ref PRIMARY,active,status PRIMARY 4 snaps.userid 1 100.00 Using where
1 SIMPLE conv eq_ref userid userid 4s snaps.userid 1 100.00 Using where
Using filesort is probably your performance killer.
You need the records from usersnaps where active = 'N' and you need them sorted by ins_time.
ALTER TABLE usersnaps ADD KEY active_ins_time (active,ins_time);
Indexes are stored in sorted order, and read in sorted order... so if the optimizer chooses that index, it will go for the records with active = 'N' and -- hey, look at that -- they're already sorted by ins_time -- because of that index. So as it reads the rows referenced by the index, the result-set internally is already in the order you want it to ORDER BY, and the optimizer should realize this... no filesort required.
I would recommend changing the userid index (assuming you're not using it right now) to have active first and userid later.
That should make it more useful for this query.
I have following table with 2 million rows in it.
CREATE TABLE `gen_fmt_lookup` (
`episode_id` varchar(30) DEFAULT NULL,
`media_type` enum('Audio','Video') NOT NULL DEFAULT 'Video',
`service_id` varchar(50) DEFAULT NULL,
`genre_id` varchar(30) DEFAULT NULL,
`format_id` varchar(30) DEFAULT NULL,
`masterbrand_id` varchar(30) DEFAULT NULL,
`signed` int(11) DEFAULT NULL,
`actual_start` datetime DEFAULT NULL,
`scheduled_start` datetime DEFAULT NULL,
`scheduled_end` datetime DEFAULT NULL,
`discoverable_end` datetime DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
KEY `idx_discoverability_gn` (`media_type`,`service_id`,`genre_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`),
KEY `idx_discoverability_fmt` (`media_type`,`service_id`,`format_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
Below is query with explain which I am running against this table
mysql> EXPLAIN select episode_id,scheduled_start
from gen_fmt_lookup
where media_type='video'
and service_id in ('mobile_streaming_100','mobile_streaming_200','iplayer_streaming_h264_flv_vlo','mobile_streaming_500','iplayer_stb_uk_stream_aac_concrete','captions','iplayer_uk_stream_aac_rtmp_concrete','iplayer_streaming_n95_3g','iplayer_uk_download_oma_wifi','iplayer_uk_stream_aac_3gp_concrete')
and genre_id in ('100001','100002','100003','100004','100005','100006','100007','100008','100009','100010')
and NOW() BETWEEN actual_start and scheduled_end
group by episode_id order by min(scheduled_start) limit 1 offset 100\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: nitro_episodes_gen_fmt_lookup
type: range
possible_keys: idx_discoverability_gn,idx_discoverability_fmt
key: idx_discoverability_gn
key_len: 96
ref: NULL
rows: 31719
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.16 sec)
So my questions are
Is the index used in query execution best? And if not can someone please suggest better index?
Can mysql use composite index with 2 dates in where clause? As in the query above where clause has and condition "and NOW() BETWEEN actual_start and scheduled_end " but mysql is using index 'idx_discoverability_gn' with key length of 96 only. Which means it is using index upto (media_type,service_id,genre_id,actual_start) only.why can't it use index upto (media_type,service_id,genre_id,actual_start,scheduled_end) ?
What else I can do to improve performance?
You have a range check, so a clustered index on (scheduled_start, actual_start, scheduled_end) might help. Your current indexes are not very useful.You can get rid of them and build one primary key (episode_id) and another index (service_id, genre-id, episode_id)