Mysql big table multiple dates in where clause query performance

I have the following table with 2 million rows in it.
CREATE TABLE `gen_fmt_lookup` (
`episode_id` varchar(30) DEFAULT NULL,
`media_type` enum('Audio','Video') NOT NULL DEFAULT 'Video',
`service_id` varchar(50) DEFAULT NULL,
`genre_id` varchar(30) DEFAULT NULL,
`format_id` varchar(30) DEFAULT NULL,
`masterbrand_id` varchar(30) DEFAULT NULL,
`signed` int(11) DEFAULT NULL,
`actual_start` datetime DEFAULT NULL,
`scheduled_start` datetime DEFAULT NULL,
`scheduled_end` datetime DEFAULT NULL,
`discoverable_end` datetime DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
KEY `idx_discoverability_gn` (`media_type`,`service_id`,`genre_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`),
KEY `idx_discoverability_fmt` (`media_type`,`service_id`,`format_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Below is the query, with its EXPLAIN output, that I am running against this table:
mysql> EXPLAIN select episode_id,scheduled_start
from gen_fmt_lookup
where media_type='video'
and service_id in ('mobile_streaming_100','mobile_streaming_200','iplayer_streaming_h264_flv_vlo','mobile_streaming_500','iplayer_stb_uk_stream_aac_concrete','captions','iplayer_uk_stream_aac_rtmp_concrete','iplayer_streaming_n95_3g','iplayer_uk_download_oma_wifi','iplayer_uk_stream_aac_3gp_concrete')
and genre_id in ('100001','100002','100003','100004','100005','100006','100007','100008','100009','100010')
and NOW() BETWEEN actual_start and scheduled_end
group by episode_id order by min(scheduled_start) limit 1 offset 100\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: nitro_episodes_gen_fmt_lookup
type: range
possible_keys: idx_discoverability_gn,idx_discoverability_fmt
key: idx_discoverability_gn
key_len: 96
ref: NULL
rows: 31719
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.16 sec)
So my questions are:
Is the index used in the query execution the best one? If not, can someone please suggest a better index?
Can MySQL use a composite index with 2 dates in the WHERE clause? The query above has the condition "and NOW() BETWEEN actual_start and scheduled_end", but MySQL is using index 'idx_discoverability_gn' with a key length of only 96, which means it is using the index only up to (media_type,service_id,genre_id,actual_start). Why can't it use the index up to (media_type,service_id,genre_id,actual_start,scheduled_end)?
What else can I do to improve performance?

You have a range check, so a clustered index on (scheduled_start, actual_start, scheduled_end) might help. Your current indexes are not very useful. You can get rid of them and build one primary key (episode_id) and another index (service_id, genre_id, episode_id).
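As a rough sketch, the secondary index suggested above could be created as follows. The index name idx_service_genre_episode is made up for illustration. The primary key is left as a comment because it only works if episode_id is unique and NOT NULL, which the posted schema does not guarantee.

```sql
-- Hypothetical index name; column order as suggested in the answer.
ALTER TABLE gen_fmt_lookup
  ADD KEY idx_service_genre_episode (service_id, genre_id, episode_id);

-- Only applicable if every row has a distinct, non-NULL episode_id:
-- ALTER TABLE gen_fmt_lookup
--   MODIFY episode_id varchar(30) NOT NULL,
--   ADD PRIMARY KEY (episode_id);
```

Re-running the EXPLAIN from the question afterwards would show whether the optimizer actually prefers the new index.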

Related

does this update query have any possible rewrite options?

I have the query below and I don't know how to produce an explain plan for it. What I have is the temporary table creation query and the table structure.
create temporary table if not exists tmp_staging_task_ids as
select distinct s.usr_task_id
from ue_events_staging s
where s.queue_id is null
limit 6500;
The explain plan for the above SELECT query:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: s
partitions: NULL
type: ref
possible_keys: ue_events_staging_queue_id,usr_task_id,queue_id_usr_task_id,queue_id_app_id
key: queue_id_usr_task_id
key_len: 303
ref: const
rows: 17774428
filtered: 100.00
Extra: Using where; Using index; Using temporary
Query:
update ue_events_staging s
join tmp_staging_task_ids t on t.usr_task_id = s.usr_task_id
set s.queue_id = 'queue_id';
Table structure:
Create Table: CREATE TABLE `ue_events_staging` (
`id` bigint NOT NULL AUTO_INCREMENT,
`queue_id` varchar(100) DEFAULT NULL,
`usr_task_id` bigint NOT NULL,
`app_id` bigint NOT NULL,
`platform` tinyint NOT NULL,
`capture_time` bigint NOT NULL,
`input_type` varchar(50) NOT NULL,
`type` varchar(100) NOT NULL,
`event_type` varchar(10) NOT NULL,
`screen` varchar(100) NOT NULL,
`object_name` varchar(255) DEFAULT NULL,
`app_custom_tag` varchar(255) DEFAULT NULL,
`exception_class_name` varchar(250) DEFAULT NULL,
`exception_tag` varchar(250) DEFAULT NULL,
`non_responsive` tinyint(1) DEFAULT '0',
`is_first` tinyint(1) DEFAULT '0',
`is_second` tinyint(1) DEFAULT '0',
`is_last` tinyint(1) DEFAULT '0',
`is_quit` tinyint(1) DEFAULT '0',
`x_coordinate` double DEFAULT NULL,
`y_coordinate` double DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `ue_events_staging_queue_id` (`queue_id`),
KEY `usr_task_id` (`usr_task_id`),
KEY `screen` (`app_id`,`platform`,`screen`),
KEY `app_id_queue_id` (`app_id`,`queue_id`),
KEY `queue_id_usr_task_id` (`queue_id`,`usr_task_id`),
KEY `queue_id_app_id` (`queue_id`,`app_id`)
Please check the possibilities. The update takes around 3.5K seconds and causes load.

This looks like you're doing your updates in batches of 6500 rows.
If you don't need that temporary table, you can refactor your update query to stand alone. You don't need the temporary table because you can put its WHERE queue_id IS NULL directly into the WHERE of your UPDATE.
UPDATE ue_events_staging
SET queue_id = 'queue_id'
WHERE queue_id IS NULL
LIMIT 6500;
Your temporary table creation step pulls 6500 distinct (arbitrarily chosen) usr_task_id values from your table. Some of those values may relate to more than one row in your table, so your UPDATE statement may update more than 6500 rows in your table.
The refactoring I suggest will update 6500 arbitrarily chosen rows in your table. At the end of the statement it's possible that some rows with a particular usr_task_id value will be updated and others will not. If that's acceptable for your business rules, it will be faster.
If your business rules require all rows with each particular usr_task_id value to be updated at once, you could try this to simplify both statements.
create temporary table if not exists tmp_staging_task_ids as
select s.usr_task_id
from ue_events_staging s
where s.queue_id is null
limit 6500;
update ue_events_staging
set queue_id = 'queue_id'
where usr_task_id IN
(select usr_task_id from tmp_staging_task_ids);
This gets rid of the DISTINCT operator in the creation of your temporary table and may save a little time. The IN clause implies DISTINCT values.
"Arbitrarily chosen"? Statements without ORDER BY and with LIMIT clauses instruct MySQL to choose rows arbitrarily. MySQL picks the rows that are fastest to retrieve (hopefully).
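If the choice of rows should be repeatable rather than arbitrary, a single-table UPDATE in MySQL accepts ORDER BY together with LIMIT. A minimal sketch, assuming that taking the rows with the lowest id first is acceptable; the ordering may change which index the optimizer picks, so the timing should be checked on real data.

```sql
-- Assumption: oldest rows (lowest id) should be claimed first.
UPDATE ue_events_staging
SET queue_id = 'queue_id'
WHERE queue_id IS NULL
ORDER BY id
LIMIT 6500;
```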

Performance issue full index scan in mysql database

I have a database with a table called QuizMatches. The table has the following structure:
CREATE TABLE `QuizMatches` (
`QuizMatchesGuid` binary(16) NOT NULL,
`DateStarted` datetime NOT NULL,
`LatestChanged` datetime NOT NULL,
`HostFBUserToken` varchar(250) NOT NULL,
`GuestFBUserToken` varchar(250) NOT NULL,
`ArrayOfQuestionIDs` varchar(200) NOT NULL,
`ArrayOfQuestionResponseTimesAndAnswersHost` varchar(900) NOT NULL,
`ArrayOfQuestionResponseTimesAndAnswersGuest` varchar(900) NOT NULL,
`MatchFinished` int(1) NOT NULL DEFAULT '0',
`Category` varchar(45) NOT NULL,
`JsonQuestions` varchar(4000) NOT NULL DEFAULT '[]',
`DateFinished` datetime NOT NULL,
`LatestPushSentDate` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`QuizMatchesGuid`),
KEY `HostFBUserTokenIX` (`HostFBUserToken`),
KEY `GuestFBUserTokenIX` (`GuestFBUserToken`),
KEY `MatchFinishedIX` (`MatchFinished`),
KEY `LatestChangedIX` (`LatestChanged`),
KEY `LatestPushSentDateIX` (`LatestPushSentDate`),
KEY `DateFinishedIX` (`LatestChanged`,`HostFBUserToken`,`GuestFBUserToken`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
There is a large number of rows in this table and it is heavily used by multiple clients. In particular, queries like the following are executed:
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, DateStarted,
LatestChanged, HostFBUserToken, GuestFBUserToken,
ArrayOfQuestionIDs, ArrayOfQuestionResponseTimesAndAnswersHost,
ArrayOfQuestionResponseTimesAndAnswersGuest, JsonQuestions
FROM CrystalDBQuiz.QuizMatches
ORDER BY LatestChanged DESC
LIMIT 10
The main problem seems to be that the database performs a full index scan. I have tried different combinations of indexes, but without success.
If I run EXPLAIN on the above SELECT query I receive the following:
id: 1
select_type: SIMPLE
table: 'QuizMatches'
type: index
possible_keys: NULL
key: 'LatestChangedIX'
key_len: 8
ref: NULL
rows: 10
Extra:
Is there a way I can optimize SELECT queries like the example above against this table?

If you are using LIMIT for pagination, I suggest filtering on the LatestChanged value of the last row you fetched, since that is the column you order by. Your query then becomes:
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, DateStarted, LatestChanged,
HostFBUserToken, GuestFBUserToken, ArrayOfQuestionIDs,
ArrayOfQuestionResponseTimesAndAnswersHost,
ArrayOfQuestionResponseTimesAndAnswersGuest, JsonQuestions
FROM CrystalDBQuiz.QuizMatches
WHERE LatestChanged<[lastValue]
ORDER BY LatestChanged DESC
LIMIT 10
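One caveat with filtering on LatestChanged alone is that rows sharing the same timestamp as the last row of the previous page can be skipped. A hedged sketch of a variant that uses the primary key as a tie-breaker follows. The user variables @last_changed and @last_guid and their values are hypothetical stand-ins for what the client remembers from the previous page, and whether the optimizer still walks LatestChangedIX efficiently with the OR condition should be verified with EXPLAIN.

```sql
-- Hypothetical values taken from the last row of the previous page.
SET @last_changed = '2013-05-01 12:00:00';
SET @last_guid = UNHEX('00112233445566778899AABBCCDDEEFF');

SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, DateStarted, LatestChanged,
       HostFBUserToken, GuestFBUserToken, ArrayOfQuestionIDs,
       ArrayOfQuestionResponseTimesAndAnswersHost,
       ArrayOfQuestionResponseTimesAndAnswersGuest, JsonQuestions
FROM CrystalDBQuiz.QuizMatches
WHERE LatestChanged < @last_changed
   OR (LatestChanged = @last_changed AND QuizMatchesGuid < @last_guid)
ORDER BY LatestChanged DESC, QuizMatchesGuid DESC
LIMIT 10;
```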

Finding optimal indexes for this MySQL query

I'm struggling to understand whether I've indexed this query properly. It's somewhat slow and I feel it could use optimization. MySQL 5.1.70
select snaps.id, snaps.userid, snaps.ins_time, usr.gender
from usersnaps as snaps
join user as usr on usr.id = snaps.userid
left join user_convert as conv on snaps.userid = conv.userid
where (conv.level is null or conv.level = 4) and snaps.active = 'N'
and (usr.status = "unfilled" or usr.status = "unapproved") and usr.active = 1
order by snaps.ins_time asc
usersnaps table (irrelevant data removed, about 250k records):
CREATE TABLE IF NOT EXISTS `usersnaps` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`userid` int(11) unsigned NOT NULL DEFAULT '0',
`picture` varchar(250) NOT NULL,
`active` enum('N','Y') NOT NULL DEFAULT 'N',
`ins_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`userid`),
KEY `userid` (`userid`,`active`),
KEY `ins_time` (`ins_time`),
KEY `active` (`active`)
) ENGINE=InnoDB;
user table (irrelevant data removed, about 300k records):
CREATE TABLE IF NOT EXISTS `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`active` tinyint(1) NOT NULL DEFAULT '1',
`status` enum('15','active','approval','suspended','unapproved','unfilled','rejected','suspended_auto','incomplete') NOT NULL DEFAULT 'approval',
PRIMARY KEY (`id`),
KEY `status` (`status`,`active`)
) ENGINE=InnoDB;
user_convert table (about 60k records):
CREATE TABLE IF NOT EXISTS `user_convert` (
`userid` int(10) unsigned NOT NULL,
`level` tinyint(4) NOT NULL,
UNIQUE KEY `userid` (`userid`),
KEY `level` (`level`)
) ENGINE=InnoDB;
EXPLAIN EXTENDED returns:
id  select_type  table  type    possible_keys              key      key_len  ref           rows   filtered  Extra
1   SIMPLE       snaps  ref     userid,default_pic,active  active   1        const         65248  100.00    Using where; Using filesort
1   SIMPLE       usr    eq_ref  PRIMARY,active,status      PRIMARY  4        snaps.userid  1      100.00    Using where
1   SIMPLE       conv   eq_ref  userid                     userid   4        snaps.userid  1      100.00    Using where

Using filesort is probably your performance killer.
You need the records from usersnaps where active = 'N' and you need them sorted by ins_time.
ALTER TABLE usersnaps ADD KEY active_ins_time (active,ins_time);
Indexes are stored in sorted order, and read in sorted order... so if the optimizer chooses that index, it will go for the records with active = 'N' and -- hey, look at that -- they're already sorted by ins_time -- because of that index. So as it reads the rows referenced by the index, the result-set internally is already in the order you want it to ORDER BY, and the optimizer should realize this... no filesort required.

I would recommend changing the userid index (assuming you're not using it right now) to have active first and userid later.
That should make it more useful for this query.
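A minimal sketch of that change, assuming no other query depends on the existing (userid, active) column order. The new index name active_userid is made up for illustration.

```sql
-- Replace the (userid, active) index with one that leads on active.
ALTER TABLE usersnaps
  DROP KEY userid,
  ADD KEY active_userid (active, userid);
```

Comparing EXPLAIN output before and after the change would show whether the optimizer picks the new index for this query.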

MySQL SELECT COUNT with GROUP and ORDER performance issue

The Facts:
Dedicated Server, 4 Cores, 16GB
MySQL 5.5.29-0ubuntu0.12.10.1-log - (Ubuntu)
One Table, 1.9M rows and growing
I need all sorted rows for export, or a chunk of 5. The query takes 25 seconds, of which 23.3 s is spent in the Copying To Tmp Table state.
I tried InnoDB and MyISAM, changing the index order, using an MD5 hash of some_text as GROUP BY, and partitioning the table by day.
day is a Unix timestamp and is always present.
lang, some_bool, some_filter, ano_filter and rel_id may appear in the WHERE clause, but do not have to.
Here is the MyISAM example:
The table
mysql> SHOW CREATE TABLE data \G;
*************************** 1. row ***************************
Table: data
Create Table: CREATE TABLE `data` (
`data_id` bigint(20) NOT NULL AUTO_INCREMENT,
`rel_id` int(11) NOT NULL,
`some_text` varchar(255) DEFAULT NULL,
`lang` varchar(3) DEFAULT NULL,
`some_bool` tinyint(1) DEFAULT NULL,
`some_filter` varchar(40) DEFAULT NULL,
`ano_filter` varchar(10) DEFAULT NULL,
`day` int(11) DEFAULT NULL,
PRIMARY KEY (`data_id`),
KEY `cnt_idx` (`some_filter`,`ano_filter`,`rel_id`,`lang`,`some_bool`,`some_text`,`day`)
) ENGINE=MyISAM AUTO_INCREMENT=1900099 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
The query
mysql> EXPLAIN SELECT `some_text` , COUNT(*) AS `num` FROM `data`
WHERE `lang` = 'en' AND `day` BETWEEN '1364342400' AND
'1366934399' GROUP BY `some_text` ORDER BY `num` DESC \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: data
type: index
possible_keys: NULL
key: cnt_idx
key_len: 947
ref: NULL
rows: 1900098
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.00 sec)
mysql> SELECT `some_text` , COUNT(*) AS `num` FROM `data`
WHERE `lang` = 'en' AND `day` BETWEEN '1364342400' AND '1366934399'
GROUP BY `some_text` ORDER BY `num` DESC LIMIT 5 \G;
...
*************************** 5. row ***************************
5 rows in set (24.26 sec)
Any idea how to speed up that thing?

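The index is not being used to narrow the search because of the column order in the index: EXPLAIN shows type: index, which is a full scan of cnt_idx over all 1.9M entries. Indexes work left to right, and neither lang nor day is a leading column of cnt_idx. For this query to use an index efficiently, you would need an index on (lang, day).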
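A sketch of the index described above. The index names are made up for illustration, and the commented-out variant, which adds some_text so the query can be answered from the index alone, is an assumption that should be checked against the engine's key length limits before use.

```sql
-- Equality column first (lang), then the range column (day).
CREATE INDEX lang_day ON data (lang, day);

-- Possible covering variant (hypothetical, use instead of the one above):
-- CREATE INDEX lang_day_text ON data (lang, day, some_text);
```

Because day is filtered by a range, the GROUP BY on some_text will most likely still need a temporary table, but over far fewer rows than a full index scan.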

Mysql Query run faster

Table structure:
CREATE TABLE IF NOT EXISTS `logs` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user` bigint(20) unsigned NOT NULL,
`type` tinyint(1) unsigned NOT NULL,
`date` int(11) unsigned NOT NULL,
`plus` decimal(10,2) unsigned NOT NULL,
`minus` decimal(10,2) unsigned NOT NULL,
`tax` decimal(10,2) unsigned NOT NULL,
`item` bigint(20) unsigned NOT NULL,
`info` char(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `item` (`item`),
KEY `user` (`user`),
KEY `type` (`type`),
KEY `date` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=0 ROW_FORMAT=FIXED;
Query:
SELECT logs.item, COUNT(logs.item) AS total FROM logs WHERE logs.type = 4 GROUP BY logs.item;
The table holds 110k records, of which 50k are type 4 records.
Execution time: 0.13 seconds
I know this is fast, but can I make it faster?
I am expecting 1 million records and thus the time would grow quite a bit.

Analyze queries with EXPLAIN:
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type
key: type
key_len: 1
ref: const
rows: 1
Extra: Using where; Using temporary; Using filesort
The "Using temporary; Using filesort" indicates some costly operations. Because the optimizer knows it can't rely on the rows with each value of item being stored together, it has to read every matching row and collect the count per distinct item in a temporary table, then sort that temporary table to produce the result.
You need an index on the logs table on columns (type, item) in that order. Then the optimizer knows it can leverage the index tree to scan each value of logs.item fully before moving on to the next value. By doing this, it can skip the temporary table to collect values, and skip the implicit sorting of the result.
mysql> CREATE INDEX logs_type_item ON logs (type,item);
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type,logs_type_item
key: logs_type_item
key_len: 1
ref: const
rows: 1
Extra: Using where
