I have the query below and I don't know how to do an explain plan for it. What I have is the temporary table creation query and the table structure.
create temporary table if not exists tmp_staging_task_ids as
select distinct s.usr_task_id
from ue_events_staging s
where s.queue_id is null
limit 6500;
The explain plan for the above SELECT query:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: s
partitions: NULL
type: ref
possible_keys: ue_events_staging_queue_id,usr_task_id,queue_id_usr_task_id,queue_id_app_id
key: queue_id_usr_task_id
key_len: 303
ref: const
rows: 17774428
filtered: 100.00
Extra: Using where; Using index; Using temporary
The update query:
update ue_events_staging s
join tmp_staging_task_ids t on t.usr_task_id = s.usr_task_id
set s.queue_id = 'queue_id';
Table structure:
Create Table: CREATE TABLE `ue_events_staging` (
`id` bigint NOT NULL AUTO_INCREMENT,
`queue_id` varchar(100) DEFAULT NULL,
`usr_task_id` bigint NOT NULL,
`app_id` bigint NOT NULL,
`platform` tinyint NOT NULL,
`capture_time` bigint NOT NULL,
`input_type` varchar(50) NOT NULL,
`type` varchar(100) NOT NULL,
`event_type` varchar(10) NOT NULL,
`screen` varchar(100) NOT NULL,
`object_name` varchar(255) DEFAULT NULL,
`app_custom_tag` varchar(255) DEFAULT NULL,
`exception_class_name` varchar(250) DEFAULT NULL,
`exception_tag` varchar(250) DEFAULT NULL,
`non_responsive` tinyint(1) DEFAULT '0',
`is_first` tinyint(1) DEFAULT '0',
`is_second` tinyint(1) DEFAULT '0',
`is_last` tinyint(1) DEFAULT '0',
`is_quit` tinyint(1) DEFAULT '0',
`x_coordinate` double DEFAULT NULL,
`y_coordinate` double DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `ue_events_staging_queue_id` (`queue_id`),
KEY `usr_task_id` (`usr_task_id`),
KEY `screen` (`app_id`,`platform`,`screen`),
KEY `app_id_queue_id` (`app_id`,`queue_id`),
KEY `queue_id_usr_task_id` (`queue_id`,`usr_task_id`),
KEY `queue_id_app_id` (`queue_id`,`app_id`)
Please check the possibilities; it takes around 3,500 seconds and causes load.
This looks like you're doing your updates in batches of 6500 rows.
You can refactor your update query to stand alone: you don't need the temporary table, because you can put its WHERE queue_id IS NULL condition directly into the WHERE clause of your UPDATE.
UPDATE ue_events_staging
SET queue_id = 'queue_id'
WHERE queue_id IS NULL
LIMIT 6500;
Your temporary table creation step pulls 6500 distinct (arbitrarily chosen) usr_task_id values from your table. Some of those values may relate to more than one row in your table, so your UPDATE statement may update more than 6500 rows in your table.
The refactoring I suggest will update 6500 arbitrarily chosen rows in your table. At the end of the statement it's possible that some rows with a particular usr_task_id value will be updated and others will not. If that's acceptable for your business rules, it will be faster.
If your business rules require all rows with each particular usr_task_id value to be updated at once, you could try this to simplify both statements.
create temporary table if not exists tmp_staging_task_ids as
select s.usr_task_id
from ue_events_staging s
where s.queue_id is null
limit 6500;
update ue_events_staging
set queue_id = 'queue_id'
where usr_task_id IN
(select usr_task_id from tmp_staging_task_ids);
This gets rid of the DISTINCT operator in the creation of your temporary table and may save a little time. The IN clause implies DISTINCT values.
"Arbitrarily chosen"? Statements without ORDER BY and with LIMIT clauses instruct MySQL to choose rows arbitrarily. MySQL picks the rows that are fastest to retrieve (hopefully).
I have 3 tables. The first one is called map_life, the second one is called scripts and the third one is called npc_data.
I'm running the following query to get all the properties from map_life while also getting the script column from scripts and the storage_cost column from npc_data if the ids match.
SELECT life.*
, script.script
, npc.storage_cost
FROM map_life life
LEFT
JOIN scripts script
ON script.objectid = life.lifeid
AND script.script_type = 'npc'
LEFT
JOIN npc_data npc
ON npc.npcid = life.lifeid
As you can see, map_life id is lifeid, while scripts id is objectid and npc_data id is npcid.
This query is taking about 5 seconds to execute, and I have no idea why. Here are the CREATE statements for all 3 of those tables; maybe I'm missing something?
CREATE TABLE `mcdb83`.`map_life` (
`id` bigint(21) unsigned NOT NULL AUTO_INCREMENT,
`mapid` int(11) NOT NULL,
`life_type` enum('npc','mob','reactor') NOT NULL,
`lifeid` int(11) NOT NULL,
`life_name` varchar(50) DEFAULT NULL COMMENT 'For reactors, specifies a handle so scripts may interact with them; for NPC/mob, this field is useless',
`x_pos` smallint(6) NOT NULL DEFAULT '0',
`y_pos` smallint(6) NOT NULL DEFAULT '0',
`foothold` smallint(6) NOT NULL DEFAULT '0',
`min_click_pos` smallint(6) NOT NULL DEFAULT '0',
`max_click_pos` smallint(6) NOT NULL DEFAULT '0',
`respawn_time` int(11) NOT NULL DEFAULT '0',
`flags` set('faces_left') NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `lifetype` (`mapid`,`life_type`)
) ENGINE=InnoDB AUTO_INCREMENT=32122 DEFAULT CHARSET=latin1;
CREATE TABLE `mcdb83`.`scripts` (
`script_type` enum('npc','reactor','quest','item','map_enter','map_first_enter') NOT NULL,
`helper` tinyint(3) NOT NULL DEFAULT '-1' COMMENT 'Represents the quest state for quests, and the index of the script for NPCs (NPCs may have multiple scripts).',
`objectid` int(11) NOT NULL DEFAULT '0',
`script` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`script_type`,`helper`,`objectid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Lists all the scripts that belong to NPCs/reactors/etc. ';
CREATE TABLE `mcdb83`.`npc_data` (
`npcid` int(11) NOT NULL,
`storage_cost` int(11) NOT NULL DEFAULT '0',
`flags` set('maple_tv','is_guild_rank') NOT NULL DEFAULT '',
PRIMARY KEY (`npcid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
For this query:
SELECT l.*, s.script, npc.storage_cost
FROM map_life l LEFT JOIN
scripts s
ON s.objectid = l.lifeid AND
s.script_type = 'npc' LEFT JOIN
npc_data npc
ON npc.npcid = l.lifeid;
You want indexes on: scripts(objectid, script_type, script) and npc_data(npcid, storage_cost). The order of the columns in these indexes is important.
map_life.lifeid does not have any index defined, therefore the joins will result in full table scans. Define an index on the map_life.lifeid field.
In the scripts table the primary key is defined on the following fields in this order: script_type, helper, objectid. The join is done on objectid and there is a constant filter criterion on script_type. Because the order of the fields in the index is wrong, MySQL cannot use the primary key for this query. For this query the order of the fields in the index should be: objectid, script_type, helper.
The above will significantly speed up the joins. You may further increase the speed of the query if your indexes actually cover all fields that are in the query, because in that case MySQL does not even have to touch the tables.
Consider adding an index with the following fields, in this order, to the scripts table: objectid, script_type, script; and an (npcid, storage_cost) index to the npc_data table. However, these indexes may slow down insert/update/delete statements, so do some performance testing before adding them to the production environment.
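A sketch of the suggested DDL (the index names are placeholders of my own choosing):
ALTER TABLE map_life ADD KEY lifeid_ix (lifeid);
ALTER TABLE scripts ADD KEY objectid_type_script_ix (objectid, script_type, script);
ALTER TABLE npc_data ADD KEY npcid_storage_ix (npcid, storage_cost);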
I have two tables with the following schema,
CREATE TABLE `open_log` (
`delivery_id` varchar(30) DEFAULT NULL,
`email_id` varchar(50) DEFAULT NULL,
`email_activity` varchar(30) DEFAULT NULL,
`click_url` text,
`email_code` varchar(30) DEFAULT NULL,
`on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `sent_log` (
`email_id` varchar(50) DEFAULT NULL,
`delivery_id` varchar(50) DEFAULT NULL,
`email_code` varchar(50) DEFAULT NULL,
`delivery_status` varchar(50) DEFAULT NULL,
`tries` int(11) DEFAULT NULL,
`creation_ts` varchar(50) DEFAULT NULL,
`creation_dt` varchar(50) DEFAULT NULL,
`on_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The email_id and delivery_id columns in both tables make up a unique key.
The open_log table have 2.5 million records where as sent_log table has 0.25 million records.
I want to filter out the records from open log table based on the unique key (email_id and delivery_id).
I'm writing the following query.
SELECT * FROM open_log
WHERE CONCAT(email_id,'^',delivery_id)
IN (
SELECT DISTINCT CONCAT(email_id,'^',delivery_id) FROM sent_log
)
The problem is that the query is taking too much time to execute. I've waited an hour for the query to complete, without success.
Kindly suggest what I can do to make it faster, since the tables are large.
Thanks,
Faisal Nasir
First, rewrite your query using exists:
SELECT *
FROM open_log ol
WHERE EXISTS (SELECT 1
FROM sent_log sl
WHERE sl.email_id = ol.email_id and sl.delivery_id = ol.delivery_id
);
Then, add an index so this query will run faster:
create index idx_sendlog_emailid_deliveryid on sent_log(email_id, delivery_id);
Your query is slow for a variety of reasons:
The use of string concatenation makes it impossible for MySQL to use an index.
The select distinct in the subquery is unnecessary.
Exists can be faster than in.
If this query runs often, you can speed it up greatly by creating a bigint hash column, even if its values are not unique.
For example, you can add the column like this:
alter table sent_log add column for_get bigint;
After that, create a trigger (or run an UPDATE) to put a hash into that bigint column:
for_get=CONV(substr(md5(concat(email_id, delivery_id)),1,10),16,10)
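A minimal sketch of the backfill, index, and trigger, assuming the hash expression above (the trigger name is a placeholder; the same column, backfill, and trigger would be repeated for open_log):
update sent_log
set for_get = CONV(substr(md5(concat(email_id, delivery_id)),1,10),16,10);
create index idx_sent_log_for_get on sent_log(for_get);
create trigger sent_log_for_get_bi before insert on sent_log
for each row
set new.for_get = CONV(substr(md5(concat(new.email_id, new.delivery_id)),1,10),16,10);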
If you have such a column in both tables, with an index on it, the query will look like:
SELECT *
FROM open_log ol
left join sent_log sl on sl.for_get=ol.for_get
WHERE sl.email_id is not null and sl.email_id = ol.email_id and sl.delivery_id = ol.delivery_id;
That query will be fast.
I have a database with a table called QuizMatches. The table has the following structure:
CREATE TABLE `QuizMatches` (
`QuizMatchesGuid` binary(16) NOT NULL,
`DateStarted` datetime NOT NULL,
`LatestChanged` datetime NOT NULL,
`HostFBUserToken` varchar(250) NOT NULL,
`GuestFBUserToken` varchar(250) NOT NULL,
`ArrayOfQuestionIDs` varchar(200) NOT NULL,
`ArrayOfQuestionResponseTimesAndAnswersHost` varchar(900) NOT NULL,
`ArrayOfQuestionResponseTimesAndAnswersGuest` varchar(900) NOT NULL,
`MatchFinished` int(1) NOT NULL DEFAULT '0',
`Category` varchar(45) NOT NULL,
`JsonQuestions` varchar(4000) NOT NULL DEFAULT '[]',
`DateFinished` datetime NOT NULL,
`LatestPushSentDate` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`QuizMatchesGuid`),
KEY `HostFBUserTokenIX` (`HostFBUserToken`),
KEY `GuestFBUserTokenIX` (`GuestFBUserToken`),
KEY `MatchFinishedIX` (`MatchFinished`),
KEY `LatestChangedIX` (`LatestChanged`),
KEY `LatestPushSentDateIX` (`LatestPushSentDate`),
KEY `DateFinishedIX` (`LatestChanged`,`HostFBUserToken`,`GuestFBUserToken`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
There is a large number of rows in this table and it is heavily used by multiple clients. In particular, queries like the following are executed:
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, DateStarted,
LatestChanged, HostFBUserToken, GuestFBUserToken,
ArrayOfQuestionIDs, ArrayOfQuestionResponseTimesAndAnswersHost,
ArrayOfQuestionResponseTimesAndAnswersGuest, JsonQuestions
FROM CrystalDBQuiz.QuizMatches
ORDER BY LatestChanged DESC
LIMIT 10
The main problem seems to be that the database performs a full index scan. I have tried different combinations of indexes, but without success.
If I run EXPLAIN on the above SELECT query I receive the following:
id: 1
select_type: SIMPLE
table: 'QuizMatches'
type: index
possible_keys: NULL
key: 'LatestChangedIX'
key_len: 8
ref: NULL
rows: 10
Extra:
Is there a way I can optimize SELECTs like the above example against this database table?
If you are using the LIMIT clause for pagination, I suggest you use the LatestChanged value for this, since it is your ordering column. Your query then becomes:
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, DateStarted, LatestChanged,
HostFBUserToken, GuestFBUserToken, ArrayOfQuestionIDs,
ArrayOfQuestionResponseTimesAndAnswersHost,
ArrayOfQuestionResponseTimesAndAnswersGuest, JsonQuestions
FROM CrystalDBQuiz.QuizMatches
WHERE LatestChanged<[lastValue]
ORDER BY LatestChanged DESC
LIMIT 10
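A usage sketch of this keyset pagination with an abbreviated column list (the timestamp literal is hypothetical; in practice [lastValue] is the smallest LatestChanged returned by the previous page):
-- first page: no WHERE clause
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, LatestChanged
FROM CrystalDBQuiz.QuizMatches
ORDER BY LatestChanged DESC
LIMIT 10;
-- subsequent pages: seek past the last row already seen
SELECT HEX(QuizMatchesGuid) AS QuizMatchesGuid, LatestChanged
FROM CrystalDBQuiz.QuizMatches
WHERE LatestChanged < '2015-06-01 10:00:00'
ORDER BY LatestChanged DESC
LIMIT 10;
This way the existing LatestChangedIX index can satisfy both the range condition and the ORDER BY, so MySQL only needs to read about 10 index entries per page.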
I'm struggling to understand whether I've indexed this query properly; it's somewhat slow and I feel it could use optimization. MySQL 5.1.70.
select snaps.id, snaps.userid, snaps.ins_time, usr.gender
from usersnaps as snaps
join user as usr on usr.id = snaps.userid
left join user_convert as conv on snaps.userid = conv.userid
where (conv.level is null or conv.level = 4) and snaps.active = 'N'
and (usr.status = "unfilled" or usr.status = "unapproved") and usr.active = 1
order by snaps.ins_time asc
usersnaps table (irrelevant details removed, size about 250k records):
CREATE TABLE IF NOT EXISTS `usersnaps` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`userid` int(11) unsigned NOT NULL DEFAULT '0',
`picture` varchar(250) NOT NULL,
`active` enum('N','Y') NOT NULL DEFAULT 'N',
`ins_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`userid`),
KEY `userid` (`userid`,`active`),
KEY `ins_time` (`ins_time`),
KEY `active` (`active`)
) ENGINE=InnoDB;
user table (irrelevant details removed, size about 300k records):
CREATE TABLE IF NOT EXISTS `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`active` tinyint(1) NOT NULL DEFAULT '1',
`status` enum('15','active','approval','suspended','unapproved','unfilled','rejected','suspended_auto','incomplete') NOT NULL DEFAULT 'approval',
PRIMARY KEY (`id`),
KEY `status` (`status`,`active`)
) ENGINE=InnoDB;
user_convert table (size about 60k records):
CREATE TABLE IF NOT EXISTS `user_convert` (
`userid` int(10) unsigned NOT NULL,
`level` tinyint(4) NOT NULL,
UNIQUE KEY `userid` (`userid`),
KEY `level` (`level`)
) ENGINE=InnoDB;
Explain extended returns:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE snaps ref userid,default_pic,active active 1 const 65248 100.00 Using where; Using filesort
1 SIMPLE usr eq_ref PRIMARY,active,status PRIMARY 4 snaps.userid 1 100.00 Using where
1 SIMPLE conv eq_ref userid userid 4 snaps.userid 1 100.00 Using where
Using filesort is probably your performance killer.
You need the records from usersnaps where active = 'N' and you need them sorted by ins_time.
ALTER TABLE usersnaps ADD KEY active_ins_time (active,ins_time);
Indexes are stored in sorted order, and read in sorted order... so if the optimizer chooses that index, it will go for the records with active = 'N' and -- hey, look at that -- they're already sorted by ins_time -- because of that index. So as it reads the rows referenced by the index, the result-set internally is already in the order you want it to ORDER BY, and the optimizer should realize this... no filesort required.
I would recommend changing the userid index (assuming you're not using it right now) to have active first and userid later.
That should make it more useful for this query.
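A sketch of that change, assuming the existing userid index is not needed elsewhere in its current form:
ALTER TABLE usersnaps
DROP KEY userid,
ADD KEY active_userid (active, userid);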
I have the following table with 2 million rows in it.
CREATE TABLE `gen_fmt_lookup` (
`episode_id` varchar(30) DEFAULT NULL,
`media_type` enum('Audio','Video') NOT NULL DEFAULT 'Video',
`service_id` varchar(50) DEFAULT NULL,
`genre_id` varchar(30) DEFAULT NULL,
`format_id` varchar(30) DEFAULT NULL,
`masterbrand_id` varchar(30) DEFAULT NULL,
`signed` int(11) DEFAULT NULL,
`actual_start` datetime DEFAULT NULL,
`scheduled_start` datetime DEFAULT NULL,
`scheduled_end` datetime DEFAULT NULL,
`discoverable_end` datetime DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
KEY `idx_discoverability_gn` (`media_type`,`service_id`,`genre_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`),
KEY `idx_discoverability_fmt` (`media_type`,`service_id`,`format_id`,`actual_start`,`scheduled_end`,`scheduled_start`,`episode_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Below is the query I am running against this table, with its EXPLAIN output:
mysql> EXPLAIN select episode_id,scheduled_start
from gen_fmt_lookup
where media_type='video'
and service_id in ('mobile_streaming_100','mobile_streaming_200','iplayer_streaming_h264_flv_vlo','mobile_streaming_500','iplayer_stb_uk_stream_aac_concrete','captions','iplayer_uk_stream_aac_rtmp_concrete','iplayer_streaming_n95_3g','iplayer_uk_download_oma_wifi','iplayer_uk_stream_aac_3gp_concrete')
and genre_id in ('100001','100002','100003','100004','100005','100006','100007','100008','100009','100010')
and NOW() BETWEEN actual_start and scheduled_end
group by episode_id order by min(scheduled_start) limit 1 offset 100\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: nitro_episodes_gen_fmt_lookup
type: range
possible_keys: idx_discoverability_gn,idx_discoverability_fmt
key: idx_discoverability_gn
key_len: 96
ref: NULL
rows: 31719
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.16 sec)
So my questions are:
Is the index used in the query execution the best one? If not, can someone please suggest a better index?
Can MySQL use a composite index with 2 dates in the where clause? The WHERE clause of the query above has the condition "and NOW() BETWEEN actual_start and scheduled_end", but MySQL is using the index 'idx_discoverability_gn' with a key length of only 96, which means it is using the index only up to (media_type,service_id,genre_id,actual_start). Why can't it use the index up to (media_type,service_id,genre_id,actual_start,scheduled_end)?
What else I can do to improve performance?
You have a range check, so a clustered index on (scheduled_start, actual_start, scheduled_end) might help. Your current indexes are not very useful. You can get rid of them and build one primary key (episode_id) and another index (service_id, genre_id, episode_id).
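A sketch of the second suggestion, assuming episode_id is in fact unique and non-null so it can serve as the primary key (the secondary index name is a placeholder):
ALTER TABLE gen_fmt_lookup
DROP KEY idx_discoverability_gn,
DROP KEY idx_discoverability_fmt,
MODIFY episode_id varchar(30) NOT NULL,
ADD PRIMARY KEY (episode_id),
ADD KEY idx_service_genre_episode (service_id, genre_id, episode_id);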