Efficient select on huge table of ranges - mysql

I have a MySQL table that contains the fields rangeFrom and rangeTo.
I want to select rows with a condition like rangeFrom <= ? AND rangeTo >= ? within a join:
EXPLAIN SELECT *
FROM Version
JOIN Contract FORCE INDEX FOR JOIN (versionRangeFrom)
ON Version.id >= Contract.versionRangeFrom
AND Version.id <= Contract.versionRangeTo
WHERE Version.completedAt = '2016-06-06 10:00:01';
Which mysql explains like this:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Version ref PRIMARY,completedAt completedAt 6 const 1 NULL
1 SIMPLE Contract ALL versionRangeFrom NULL NULL NULL 640744 Range checked for each record (index map: 0x8)
So it has to work through 640744 rows, which takes about 1-2 seconds.
However, inserting the version id directly into the query works fine:
EXPLAIN SELECT *
FROM Contract
WHERE 5 >= Contract.versionRangeFrom AND 5 <= Contract.versionRangeTo;
This is then explained like this:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Contract range versionRangeFrom versionRangeFrom 4 NULL 534 Using index condition; Using where
So in this case MySQL only goes through 534 rows, and that takes only about 30 ms.
So how do I prepare for such a range check correctly? It seems that MySQL is unable to use indexes in those cases. I can work around it by using two queries, but I'd rather have one.
Here are the schemas:
CREATE TABLE `Version` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`completedAt` datetime DEFAULT NULL,
`createdAt` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `completedAt` (`completedAt`)
)
CREATE TABLE `Contract` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`contractId` bigint(20) unsigned NOT NULL,
`startAt` bigint(20) NOT NULL DEFAULT '0',
`endAt` bigint(20) NOT NULL DEFAULT '0',
`tradeStartAt` bigint(20) NOT NULL DEFAULT '0',
`tradeEndAt` bigint(20) NOT NULL DEFAULT '0',
`latestAiId` bigint(20) NOT NULL DEFAULT '0',
`type` varchar(4) COLLATE utf8_unicode_ci NOT NULL,
`daPreis` int(11) NOT NULL DEFAULT '0',
`lastTradePreis` int(11) NOT NULL DEFAULT '0',
`lastTradeVol` int(11) NOT NULL DEFAULT '0',
`VWAID` double NOT NULL DEFAULT '0',
`versionRangeFrom` int(10) unsigned NOT NULL,
`versionRangeTo` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tradeStartAt` (`tradeStartAt`),
KEY `contractId` (`contractId`),
KEY `versionRangeFrom` (`versionRangeFrom`)
)

A value other than 5 would not necessarily need to look at only 534 rows.
The problem (from <= x AND x <= to) is non-trivial, and there is no simple answer to it.
Shrinking the table size would help a tiny bit. Don't use BIGINT (8 bytes) when some smaller datatype would suffice.
Perhaps the only real solution involves a major revamp of the schema and code. See my blog.
Edit
In certain situations (for example, when the ranges do not overlap), this subquery may be used to find the row in question efficiently:
( SELECT Contract.id FROM ...
WHERE Version.id >= Contract.versionRangeFrom
ORDER BY versionRangeFrom DESC
LIMIT 1 )
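As a rough sketch of how that idea could look for a single id, assuming the Contract ranges never overlap (an assumption the question does not confirm) and reusing the literal 5 from the question: the derived table picks the one candidate row with the closest range start, sorted descending so that the nearest start comes first, and the outer WHERE then checks the upper bound.

```sql
-- Sketch only: valid when no two Contract ranges overlap, so that at most
-- one row can contain a given version id.
SELECT candidate.*
FROM (
    SELECT *
    FROM Contract
    WHERE versionRangeFrom <= 5        -- range starts at or before the id
    ORDER BY versionRangeFrom DESC     -- closest start first
    LIMIT 1
) AS candidate
WHERE candidate.versionRangeTo >= 5;   -- the id must also lie before the range end
```

If the ranges can overlap, this shortcut returns at most one of the matching rows, so it is not a drop-in replacement for the join. Rows whose versionRangeTo is NULL are also excluded by the outer comparison.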

Related

How to optimise this slow MySQL query - late row lookups?

I'm converting a site over to use XenForo as forum software, however this site has millions of thread rows in the MySQL table. If I try to browse a paginated listing of threads, it slows to a crawl the further I go. Once I'm at page 10,000 it takes almost 30s.
My aim is to improve the query below, perhaps by using late row lookups so that I can make this query run faster:
SELECT thread.*
,
user.*, IF(user.username IS NULL, thread.username, user.username) AS username,
NULL AS thread_read_date,
0 AS thread_is_watched,
0 AS user_post_count
FROM xf_thread AS thread
LEFT JOIN xf_user AS user ON
(user.user_id = thread.user_id)
WHERE (thread.node_id = 152) AND (thread.sticky = 0) AND (thread.discussion_state IN ('visible'))
ORDER BY thread.last_post_date DESC
LIMIT 20 OFFSET 238340
Run Time: 4.383607
Select Type Table Type Possible Keys Key Key Len Ref Rows Extra
SIMPLE thread ref node_id_last_post_date,node_id_sticky_state_last_post node_id_last_post_date 4 const 552480 Using where
SIMPLE user eq_ref PRIMARY PRIMARY 4 sitename.thread.user_id 1
Schema:
CREATE TABLE `xf_thread` (
`thread_id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`node_id` INT(10) UNSIGNED NOT NULL,
`title` VARCHAR(150) NOT NULL,
`reply_count` INT(10) UNSIGNED NOT NULL DEFAULT '0',
`view_count` INT(10) UNSIGNED NOT NULL DEFAULT '0',
`user_id` INT(10) UNSIGNED NOT NULL,
`username` VARCHAR(50) NOT NULL,
`post_date` INT(10) UNSIGNED NOT NULL,
`sticky` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0',
`discussion_state` ENUM('visible','moderated','deleted') NOT NULL DEFAULT 'visible',
`discussion_open` TINYINT(3) UNSIGNED NOT NULL DEFAULT '1',
`discussion_type` VARCHAR(25) NOT NULL DEFAULT '',
`first_post_id` INT(10) UNSIGNED NOT NULL,
`first_post_likes` INT(10) UNSIGNED NOT NULL DEFAULT '0',
`last_post_date` INT(10) UNSIGNED NOT NULL,
`last_post_id` INT(10) UNSIGNED NOT NULL,
`last_post_user_id` INT(10) UNSIGNED NOT NULL,
`last_post_username` VARCHAR(50) NOT NULL,
`prefix_id` INT(10) UNSIGNED NOT NULL DEFAULT '0',
`sonnb_xengallery_import` TINYINT(3) DEFAULT '0',
PRIMARY KEY (`thread_id`),
KEY `node_id_last_post_date` (`node_id`,`last_post_date`),
KEY `node_id_sticky_state_last_post` (`node_id`,`sticky`,`discussion_state`,`last_post_date`),
KEY `last_post_date` (`last_post_date`),
KEY `post_date` (`post_date`),
KEY `user_id` (`user_id`)
) ENGINE=INNODB AUTO_INCREMENT=2977 DEFAULT CHARSET=utf8
Can anyone help me improve the speed of this query? I'm a real MySQL novice, but I am running the same dataset on other forum software and it is much faster - so I'm sure there is a way somehow. This table is INNODB and I'd consider the server well optimised.
This might help: http://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/
The concept is to query just the indexed columns with your required paging/ordering, then join this list back to the table to fetch the other columns you want.
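A minimal sketch of that late row lookup applied to the query from the question; the derived-table alias `page` is made up for illustration, and the read-tracking columns from the original select list are omitted for brevity. The inner query only needs thread_id plus the filter and sort columns, so it can walk an index without fetching full rows; the full rows are then read for just the 20 ids that survive the LIMIT.

```sql
SELECT thread.*,
       user.*,
       IF(user.username IS NULL, thread.username, user.username) AS username
FROM (
    -- Paging happens here, on narrow rows (ideally straight from an index)
    SELECT thread_id
    FROM xf_thread
    WHERE node_id = 152
      AND sticky = 0
      AND discussion_state = 'visible'
    ORDER BY last_post_date DESC
    LIMIT 20 OFFSET 238340
) AS page
JOIN xf_thread AS thread ON thread.thread_id = page.thread_id
LEFT JOIN xf_user AS user ON user.user_id = thread.user_id
-- Re-apply the ordering, since the join does not guarantee it
ORDER BY thread.last_post_date DESC;
```

The skipped 238340 index entries still have to be scanned, so this reduces the cost per skipped row rather than removing the OFFSET cost altogether.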
Your User table is already indexed by user ID... good.
For your thread table, I would have a compound index on it with the key
( node_id, sticky, discussion_state, last_post_date )
This way, the index is optimized on all parts in the WHERE clause... AND since it has the last_post_date too, that can be utilized by the ORDER BY clause. Order By clauses are notorious for killing query performance.
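The schema in the question already defines an index with exactly these columns, node_id_sticky_state_last_post, yet the EXPLAIN output shows the optimizer choosing node_id_last_post_date. One experiment (a sketch, not a guaranteed fix) is to force that index with an index hint and compare the resulting plan with the original one:

```sql
EXPLAIN
SELECT thread.thread_id
FROM xf_thread AS thread FORCE INDEX (node_id_sticky_state_last_post)
WHERE thread.node_id = 152
  AND thread.sticky = 0
  AND thread.discussion_state = 'visible'
ORDER BY thread.last_post_date DESC
LIMIT 20 OFFSET 238340;
```

Whether the hinted plan is actually faster depends on the data distribution, so it is worth timing both variants before keeping the hint.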

Mysql doesn't use my Index correctly

First, sorry: I am French and I don't speak English very well.
I have a rather strange problem with my index.
My table t_bloc (it contains all the posts):
CREATE TABLE `t_bloc` (
 `id_bloc` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `id_rubrique` int(10) unsigned NOT NULL DEFAULT '0',
 `titre` varchar(100) NOT NULL DEFAULT 'A compléter',
 `contenu` text NOT NULL,
 `titre_page` varchar(100) NOT NULL,
 `desc_courte` varchar(255) NOT NULL,
 `url` varchar(255) NOT NULL,
 `follow_url` tinyint(1) unsigned NOT NULL DEFAULT '1' ,
 `image` varchar(120) NOT NULL,
 `video` varchar(120) NOT NULL,
 `note` tinyint(3) unsigned NOT NULL DEFAULT '0',
 `champopt1` text NOT NULL,
 `champopt2` text NOT NULL,
 `permalien` varchar(100) NOT NULL,
 `en_ligne` tinyint(1) NOT NULL DEFAULT '0',
 `id_utilisateur` int(10) unsigned NOT NULL DEFAULT '1' ,
 `date_crea` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
 `date_modif` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
 `nb_commentaires` smallint(5) unsigned NOT NULL DEFAULT '0',
 `nb_jaimes` int(10) unsigned NOT NULL DEFAULT '0',
 `nb_jaimespas` int(10) unsigned NOT NULL DEFAULT '0',
 `pourcentage_jaimes` smallint(2) unsigned NOT NULL DEFAULT '0' ,
 `niveau_classement` tinyint(1) unsigned NOT NULL DEFAULT '1' ,
 `id_bloc_parent` int(10) unsigned NOT NULL DEFAULT '0' ,
 `id_membre_bloc` int(10) unsigned NOT NULL DEFAULT '0' ,
 `valeur_pts_bloc` smallint(5) unsigned NOT NULL DEFAULT '0' ,
 `commentaires_actifs` tinyint(1) unsigned NOT NULL DEFAULT '1' ,
 PRIMARY KEY (`id_bloc`),
 KEY `date_modif` (`date_modif`),
 KEY `id_bloc_parent` (`id_bloc_parent`),
 KEY `idx_rub_ligne_niv_etc` (`id_rubrique`,`en_ligne`,`niveau_classement`,`note`,`pourcentage_jaimes`,`date_modif`),
 KEY `idx_tri` (`en_ligne`,`niveau_classement`,`note`,`pourcentage_jaimes`,`date_modif`),
 KEY `idx_ligne_membre` (`en_ligne`,`id_membre_bloc`,`id_rubrique`),
 KEY `id_membre_bloc` (`id_membre_bloc`),
 KEY `idx_rub_ligne_date` (`id_rubrique`,`en_ligne`,`date_modif`,`id_bloc`),
 KEY `idx_ligne_date` (`en_ligne`,`date_modif`),
 FULLTEXT KEY `idx_fullindex` (`titre`,`contenu`)
) ENGINE=MyISAM AUTO_INCREMENT=456469 DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC
My table t_taxon_bloc:
CREATE TABLE `t_taxon_bloc` (
 `id_taxon` int(10) unsigned NOT NULL,
 `id_bloc` int(10) unsigned NOT NULL,
 `url_plateforme` varchar(255) NOT NULL,
 `width_flash` smallint(5) unsigned NOT NULL,
 `height_flash` smallint(5) unsigned NOT NULL,
 PRIMARY KEY (`id_taxon`,`id_bloc`),
 KEY `id_bloc` (`id_bloc`),
 KEY `id_taxon` (`id_taxon`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC
I have a problem when I execute this query:
SELECT b.id_bloc
FROM t_bloc AS b
WHERE b.en_ligne = 1
AND EXISTS ( SELECT 1 FROM t_taxon_bloc AS TB WHERE TB.id_bloc = b.id_bloc AND TB.id_taxon = 83 )
ORDER BY b.en_ligne DESC, b.date_modif DESC
LIMIT 0, 20
I get the following explain:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY b ref idx_tri,idx_ligne_membre,idx_ligne_date idx_ligne_date 1 const 58210 Using where
2 DEPENDENT SUBQUERY TB eq_ref PRIMARY,id_bloc,id_taxon PRIMARY 8 const,sitajeuxtestbourrage.b.id_bloc 1 Using index
It does not fully use the idx_ligne_date index, and rows = all the rows in the table.
Extra = « Using where »
But if I create the following index, idx_ligne_date_idbloc (en_ligne, date_modif, id_bloc),
the index is used a little better and the application runs a little faster:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY b ref idx_tri,idx_ligne_membre,idx_ligne_date_idbloc idx_ligne_date_idbloc 1 const 58252 Using where; Using index
2 DEPENDENT SUBQUERY TB eq_ref PRIMARY,id_bloc,id_taxon PRIMARY 8 const,sitajeuxtestbourrage.b.id_bloc 1 Using index
Extra = « using Where, Using Index »
My questions:
id_bloc does not appear in the WHERE and ORDER BY clauses, so why am I required to add id_bloc to my composite index?
And why is rows (again) my whole table, and not 20 (LIMIT 0,20)?
You seem to have a misunderstanding of what "Using index" means in the Extra column. Have a look at this French explanation: http://use-the-index-luke.com/fr/sql/plans-dexecution/mysql/operations
id_bloc does not appear in the WHERE and ORDER BY clauses, so why am I required to add id_bloc to my composite index?
You are not required to, but doing so allows a so-called index-only scan (which is indicated by the words "Using index" in Extra). You can learn about index-only scans on this French page: http://use-the-index-luke.com/fr/sql/regrouper-les-donnees/parcours-d-index-couvrants
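To make the index-only scan concrete, here is a small sketch using the index from the question. The SELECT is deliberately simplified (the EXISTS subquery is left out) so that the only columns touched on t_bloc are en_ligne, date_modif and id_bloc, all of which are stored in the index itself.

```sql
-- The index from the question; skip this statement if it already exists.
ALTER TABLE t_bloc
    ADD INDEX idx_ligne_date_idbloc (en_ligne, date_modif, id_bloc);

-- Every column referenced below is part of idx_ligne_date_idbloc, so MySQL
-- can answer from the index alone and report "Using index" in Extra.
EXPLAIN
SELECT b.id_bloc
FROM t_bloc AS b
WHERE b.en_ligne = 1
ORDER BY b.en_ligne DESC, b.date_modif DESC
LIMIT 0, 20;
```

t_bloc is a MyISAM table, whose secondary indexes do not carry the primary key columns implicitly, which is why id_bloc has to be listed explicitly. With InnoDB, the primary key is appended to every secondary index automatically.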

MySQL Indexes for extremely slow queries

The following query, regardless of environment, takes more than 30 seconds to compute.
SELECT COUNT( r.response_answer )
FROM response r
INNER JOIN (
SELECT G.question_id
FROM question G
INNER JOIN answer_group AG ON G.answer_group_id = AG.answer_group_id
WHERE AG.answer_group_stat = 'statistic'
) AS q ON r.question_id = q.question_id
INNER JOIN org_survey os ON os.org_survey_code = r.org_survey_code
WHERE os.survey_id =42
AND r.response_answer = 5
AND DATEDIFF( NOW( ) , r.added_dt ) <1000000
AND r.uuid IS NOT NULL
When I explain the query,
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1087
1 PRIMARY r ref question_id,org_survey_code,code_question,uuid,uor question_id 4 q.question_id 1545 Using where
1 PRIMARY os eq_ref org_survey_code,survey_id,org_survey_code_2 org_survey_code 12 survey_2.r.org_survey_code 1 Using where
2 DERIVED G ALL agid NULL NULL NULL 1680
2 DERIVED AG eq_ref PRIMARY PRIMARY 1 survey_2.G.answer_group_id 1 Using where
I have a very basic knowledge of indexing, but I have tried nearly every combination I can think of and cannot seem to improve the speed of this query. The responses table is right around 2 million rows, question is about 1500 rows, answer_group is about 50, and org_survey is about 8,000.
Here is the basic structure for each:
CREATE TABLE `response` (
`response_id` int(10) unsigned NOT NULL auto_increment,
`response_answer` text NOT NULL,
`question_id` int(10) unsigned NOT NULL default '0',
`org_survey_code` varchar(7) NOT NULL,
`uuid` varchar(40) default NULL,
`added_dt` datetime default NULL,
PRIMARY KEY (`response_id`),
KEY `question_id` (`question_id`),
KEY `org_survey_code` (`org_survey_code`),
KEY `code_question` (`org_survey_code`,`question_id`),
KEY `IDX_ADDED_DT` (`added_dt`),
KEY `uuid` (`uuid`),
KEY `response_answer` (`response_answer`(1)),
KEY `response_question` (`response_answer`(1),`question_id`)
) ENGINE=MyISAM AUTO_INCREMENT=2298109 DEFAULT CHARSET=latin1
CREATE TABLE `question` (
`question_id` int(10) unsigned NOT NULL auto_increment,
`question_text` varchar(250) NOT NULL default '',
`question_group` varchar(250) default NULL,
`question_position` tinyint(3) unsigned NOT NULL default '0',
`survey_id` tinyint(3) unsigned NOT NULL default '0',
`answer_group_id` mediumint(8) unsigned NOT NULL default '0',
`seq_id` int(11) NOT NULL default '0',
PRIMARY KEY (`question_id`),
KEY `question_group` (`question_group`(10)),
KEY `survey_id` (`survey_id`),
KEY `agid` (`answer_group_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1860 DEFAULT CHARSET=latin1
CREATE TABLE `org_survey` (
`org_survey_id` int(11) NOT NULL auto_increment,
`org_survey_code` varchar(10) NOT NULL default '',
`org_id` int(11) NOT NULL default '0',
`org_manager_id` int(11) NOT NULL default '0',
`org_url_id` int(11) default '0',
`division_id` int(11) default '0',
`sector_id` int(11) default NULL,
`survey_id` int(11) NOT NULL default '0',
`process_batch` tinyint(4) default '0',
`added_dt` datetime default NULL,
PRIMARY KEY (`org_survey_id`),
UNIQUE KEY `org_survey_code` (`org_survey_code`),
KEY `org_id` (`org_id`),
KEY `survey_id` (`survey_id`),
KEY `org_survey_code_2` (`org_survey_code`,`total_taken`),
KEY `org_manager_id` (`org_manager_id`),
KEY `sector_id` (`sector_id`)
) ENGINE=MyISAM AUTO_INCREMENT=9268 DEFAULT CHARSET=latin1
CREATE TABLE `answer_group` (
`answer_group_id` tinyint(3) unsigned NOT NULL auto_increment,
`answer_group_name` varchar(50) NOT NULL default '',
`answer_group_type` varchar(20) NOT NULL default '',
`answer_group_stat` varchar(20) NOT NULL default 'demographic',
PRIMARY KEY (`answer_group_id`)
) ENGINE=MyISAM AUTO_INCREMENT=53 DEFAULT CHARSET=latin1
I know there are small things I can probably do to improve the efficiency of the database, such as reducing the size of integers where it's unnecessary. However, those are fairly trivial considering the ridiculous time it takes just to produce a result here. How can I properly index these tables, based on what explain has shown me? It seems that I have tried a large variety of combinations to no avail. Also, is there anything else that anyone can see that will optimize the table and reduce the query? I need it to be computed in less than a second. Thanks in advance!
1. If you want the index on r.added_dt to be used, instead of:
DATEDIFF(NOW(), r.added_dt) < 1000000
use:
CURDATE() - INTERVAL 1000000 DAY < r.added_dt
Anyway, the above condition checks whether added_dt is less than a million days old. Do you really store dates that old? If not, you can simply remove this condition.
If you want to keep this condition, an index on added_dt would help a lot. Your query as it is now checks every row for this condition, calling the DATEDIFF() function once per row of the response table.
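A short sketch of the difference, using a hypothetical 30-day window instead of the million days from the question (the 30 is made up purely for illustration):

```sql
-- Function wrapped around the column: the value must be computed per row,
-- so the index on added_dt cannot be used for a range scan.
SELECT COUNT(*)
FROM response r
WHERE DATEDIFF(NOW(), r.added_dt) < 30;

-- Bare column compared with a value that is computed once:
-- a range scan on the added_dt index becomes possible.
SELECT COUNT(*)
FROM response r
WHERE r.added_dt > CURDATE() - INTERVAL 30 DAY;
```

The two forms can differ for rows on the boundary day, because DATEDIFF() ignores the time of day while the second comparison does not.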
2. Since r.response_answer cannot be NULL, instead of:
SELECT COUNT( r.response_answer )
use:
SELECT COUNT( * )
COUNT(*) is faster than COUNT(field).
3. Two of the three fields that you use for joining tables have different datatypes:
ON question.answer_group_id
= answer_group.answer_group_id
CREATE TABLE question (
...
answer_group_id mediumint(8) ..., <--- mediumint
CREATE TABLE answer_group (
answer_group_id tinyint(3) ..., <--- tinyint
-------------------------------
ON org_survey.org_survey_code
= response.org_survey_code
CREATE TABLE response (
...
org_survey_code varchar(7) NOT NULL, <--- 7
CREATE TABLE org_survey (
...
org_survey_code varchar(10) NOT NULL default '', <--- 10
The mediumint datatype is not the same as tinyint, and the same goes for varchar(7) and varchar(10). When they are used in a join, MySQL has to spend time converting from one type to the other. Convert one of them so that both have identical datatypes. This is not the main issue with the query, but the change will also help all other queries that use these joins.
After making this change, run ANALYZE TABLE on the table. It will help MySQL produce better execution plans.
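One possible way to carry out this advice is sketched below. It widens the narrower column of each pair, on the assumption that widening is the safer direction because no existing value can be truncated; narrowing the other side would work as well if all stored values fit.

```sql
-- answer_group.answer_group_id: tinyint -> mediumint, to match question.answer_group_id
ALTER TABLE answer_group
    MODIFY answer_group_id mediumint(8) unsigned NOT NULL auto_increment;

-- response.org_survey_code: varchar(7) -> varchar(10), to match org_survey.org_survey_code
ALTER TABLE response
    MODIFY org_survey_code varchar(10) NOT NULL;

-- Refresh the index statistics afterwards
ANALYZE TABLE answer_group, response;
```

On a table the size of response, the ALTER rebuilds the table and may take a while.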
You have a response_answer = 5 condition, where response_answer is text. It's not an error, but it's better to use response_answer = '5' (the conversion of 5 to '5' will be done by MySQL anyway, if you don't do that).
The real issue is that you don't have a compound index on the 3 fields that are used in the WHERE conditions. Try adding this one:
ALTER TABLE response
ADD INDEX ind_u1_ra1_aa
(uuid(1), response_answer(1), added_dt) ;
(this may take a while as your table is not small)
Can you try the following query? I've removed the sub-query from your original one. This may let the optimiser produce a better execution plan.
SELECT COUNT(r.response_answer)
FROM response r
INNER JOIN question q ON r.question_id = q.question_id
INNER JOIN answer_group ag ON q.answer_group_id = ag.answer_group_id
INNER JOIN org_survey os ON os.org_survey_code = r.org_survey_code
WHERE
ag.answer_group_stat = 'statistic'
AND os.survey_id = 42
AND r.response_answer = 5
AND DATEDIFF(NOW(), r.added_dt) < 1000000
AND r.uuid IS NOT NULL

Why does MySQL stop using an index for a join when I select non-indexed fields in the field list

I have the following two tables:
CREATE TABLE `temporal_expressions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dated_obj_type` varchar(255) DEFAULT NULL,
`dated_obj_id` int(11) DEFAULT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`start_time` int(11) DEFAULT NULL,
`end_time` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`lock_version` int(11) NOT NULL DEFAULT '0',
`wday` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `te_search` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`),
KEY `te_calendar` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`start_time`,`end_time`),
KEY `te_search_wday` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`wday`),
KEY `te_calendar_wday` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`start_time`,`end_time`,`wday`),
KEY `te_index` (`wday`,`dated_obj_type`,`start_date`,`end_date`,`start_time`,`end_time`,`dated_obj_id`)
) ENGINE=InnoDB AUTO_INCREMENT=8162445 DEFAULT CHARSET=latin1
CREATE TABLE `asset_blocks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`block_type` int(11) DEFAULT '0',
`spaces_left` int(11) DEFAULT NULL,
`provider_note` varchar(255) DEFAULT NULL,
`extra_data` text,
`lock_version` int(11) DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`type` varchar(255) DEFAULT NULL,
`service_provider_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `type` (`type`,`id`),
KEY `service_provider_id` (`service_provider_id`,`type`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=516867 DEFAULT CHARSET=latin1
If I run explain on this query (note that I am only selecting fields in the te_calendar_wday index from temporal_expressions) it uses the index for the join as expected
EXPLAIN SELECT asset_blocks.*, temporal_expressions.id,
temporal_expressions.dated_obj_type, temporal_expressions.dated_obj_id,
temporal_expressions.start_date, temporal_expressions.end_date,
temporal_expressions.start_time
FROM `asset_blocks`
LEFT OUTER JOIN `temporal_expressions`
ON `temporal_expressions`.dated_obj_id = `asset_blocks`.id
AND `temporal_expressions`.dated_obj_type = 'AssetBlock'
WHERE ( temporal_expressions.start_date <= '2010-11-25'
AND temporal_expressions.end_date >= '2010-11-01'
AND temporal_expressions.start_time < 1000 AND temporal_expressions.end_time > 1200
AND temporal_expressions.wday IN (1,2,3,4,5,6)
AND asset_blocks.id IN (1,2,3,4,5,6,7,8,9) )
1 SIMPLE temporal_expressions range te_search,te_calendar,te_search_wday,te_calendar_wday,te_index te_calendar_wday 272 NULL 9 Using where; Using index
1 SIMPLE asset_blocks eq_ref PRIMARY PRIMARY 4 lb_production.temporal_expressions.dated_obj_id 1
However, if I run this query (note that I have added a non-indexed field to the field list) it no longer uses the index (it uses a join buffer). Is this intentional or am I missing something?
EXPLAIN SELECT asset_blocks.*, temporal_expressions.id,
temporal_expressions.dated_obj_type, temporal_expressions.dated_obj_id,
temporal_expressions.start_date, temporal_expressions.end_date,
temporal_expressions.start_time, temporal_expressions.created_at
FROM `asset_blocks`
LEFT OUTER JOIN `temporal_expressions`
ON `temporal_expressions`.dated_obj_id = `asset_blocks`.id
AND `temporal_expressions`.dated_obj_type = 'AssetBlock'
WHERE ( temporal_expressions.start_date <= '2010-11-25'
AND temporal_expressions.end_date >= '2010-11-01'
AND temporal_expressions.start_time < 1000 AND temporal_expressions.end_time > 1200
AND temporal_expressions.wday IN (1,2,3,4,5,6)
AND asset_blocks.id IN (1,2,3,4,5,6,7,8,9) )
1 SIMPLE asset_blocks range PRIMARY PRIMARY 4 NULL 9 Using where
1 SIMPLE temporal_expressions range te_search,te_calendar,te_search_wday,te_calendar_wday,new_te_index te_search 272 NULL 9 Using where; Using join buffer
I cannot be sure if this is the case here, but:
If you select only indexed fields, MySQL can answer the whole query out of the index and does not even load the table data file.
If you select a field that is not indexed, it has to load the table data.
When making its execution plan, in certain cases (see comment) MySQL decides to do a full table scan although an index is present. This is because it's much quicker to read all data blindly than to look up every entry in the index and then read the data.
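One way to test this explanation (a sketch; the optimizer's choice depends on table statistics and the MySQL version) is to force the index from the first plan onto the second query and compare both the EXPLAIN output and the actual run time. The select list is shortened here for readability.

```sql
EXPLAIN
SELECT asset_blocks.*, temporal_expressions.id,
       temporal_expressions.start_date, temporal_expressions.end_date,
       temporal_expressions.created_at
FROM asset_blocks
LEFT OUTER JOIN temporal_expressions FORCE INDEX (te_calendar_wday)
    ON temporal_expressions.dated_obj_id = asset_blocks.id
   AND temporal_expressions.dated_obj_type = 'AssetBlock'
WHERE temporal_expressions.start_date <= '2010-11-25'
  AND temporal_expressions.end_date >= '2010-11-01'
  AND temporal_expressions.start_time < 1000
  AND temporal_expressions.end_time > 1200
  AND temporal_expressions.wday IN (1,2,3,4,5,6)
  AND asset_blocks.id IN (1,2,3,4,5,6,7,8,9);
```

Because created_at is not part of te_calendar_wday, "Using index" cannot appear for temporal_expressions here even when the index is used: each matching index entry still requires a lookup of the table row.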

MySQL stuck on "using filesort" when doing an "order by"

I can't seem to get my query to stop using filesort.
This is my query:
SELECT s.`pilot`, p.`name`, s.`sector`, s.`hull`
FROM `pilots` p
LEFT JOIN `ships` s ON ( (s.`game` = p.`game`)
AND (s.`pilot` = p.`id`) )
WHERE p.`game` = 1
AND p.`id` <> 2
AND s.`sector` = 43
AND s.`hull` > 0
ORDER BY p.`last_move` DESC
Table structures:
CREATE TABLE IF NOT EXISTS `pilots` (
`id` mediumint(5) unsigned NOT NULL AUTO_INCREMENT,
`game` tinyint(3) unsigned NOT NULL DEFAULT '0',
`last_move` int(10) NOT NULL DEFAULT '0',
UNIQUE KEY `id` (`id`),
KEY `last_move` (`last_move`),
KEY `game_id_lastmove` (`game`,`id`,`last_move`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ;
CREATE TABLE IF NOT EXISTS `ships` (
`id` mediumint(5) unsigned NOT NULL AUTO_INCREMENT,
`game` tinyint(3) unsigned NOT NULL DEFAULT '0',
`pilot` mediumint(5) unsigned NOT NULL DEFAULT '0',
`sector` smallint(5) unsigned NOT NULL DEFAULT '0',
`hull` smallint(4) unsigned NOT NULL DEFAULT '50',
UNIQUE KEY `id` (`id`),
KEY `game` (`game`),
KEY `pilot` (`pilot`),
KEY `sector` (`sector`),
KEY `hull` (`hull`),
KEY `game_2` (`game`,`pilot`,`sector`,`hull`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ;
The explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p ref id,game_id_lastmove game_id_lastmove 1 const 7 Using where; Using filesort
1 SIMPLE s ref game,pilot,sector... game_2 6 const,fightclub_alpha.p.id,const 1 Using where; Using index
edit: I cut some of the unnecessary pieces out of my queries/table structure.
Anybody have any ideas?
The best thing you can do is to create these indexes:
ships: an index covering the fields game + pilot + sector + hull (in this specific order)
pilots: game + id
This particular query will always use filesort, because it has the non-equality (range) condition p.id <> 2.
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following ...
The key used to fetch the rows is not the same as the one used in the ORDER BY