Mysql not use index over huge table - mysql

I have next table:
CREATE TABLE `test` (
`fingerprint` varchar(80) COLLATE utf8_unicode_ci NOT NULL,
`country` varchar(5) COLLATE utf8_unicode_ci NOT NULL,
`loader` int(10) unsigned NOT NULL,
`date` date NOT NULL,
`installer` int(10) unsigned DEFAULT NULL,
`browser` varchar(5) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`version` varchar(5) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`os` varchar(10) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`language` varchar(10) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
PRIMARY KEY (`fingerprint`, `date`),
KEY `date_1` (`date`),
KEY `date_2` (`date`,`loader`,`installer`,`country`,`browser`,`os`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Right now it contains 10M records and will increase per 2M records / day.
My question, why MySQL use "Using Where" on next query:
explain select count(*) from test where date between '2013-08-01' and '2013-08-10'
1 SIMPLE test range date_1,date_2 date_1 3 1601644 Using where; Using index
Update, why next question have type - All and Using where then:
explain select * from test use key(date_1) where date between '2013-08-01' and '2013-08-10'
1 SIMPLE test ALL date_1 null null null 3648813 Using where

It does use the index.
It says so right there: Using where; Using index. The "using where" doesn't mean full scan, it means it's using the WHERE condition you provided.
The 1601644 number also hints at that: it means it expect to read roughly 1.6M records, not the whole 10M in the table, and it correlates with your ~2M/day estimate.
In short, it seems to be doing well, it's just a lot of data you will retrieve.
Still, it's reading the table data too, when it seems the index should be enough. Try changing the count(*) with count(date), so date is the only field mentioned in the whole query. If you get only Using index, then it could be faster.

Your query is not just "Using where", it is actually "Using where; Using index". This means the index is used to match your WHERE condition and the index is being used to perform lookups of key values. This is the best case scenario, because in fact the table has never been scanned, the query could be processed with the index only.
Here you can find a full description of the meaning of the output you are looking at.
Your second query only shows the "Using where" notice. This means the index is only used to filter rows. The data must be read from the table (no "Using index" notice), because the index does not contain all the row data (you selected all columns, but the chosen index only covers date). If you had a covering index (that covers all columns), this index would probably be used instead.

Related

MYSQL - How to add index for group by / order by / sum / with where

I am processing a mysql table with 40K rows. Current execution time is around 2 seconds with the table indexed.could some one guide me how to optimized this query and table better? and how to getrid of "Using where; Using temporary; Using filesort" ??. Any help is appreciated.
The goup by with be for the following cases...
LS_CHG_DTE_OCR
LS_CHG_DTE_OCR/RES_STATE_HSE
LS_CHG_DTE_OCR/RES_STATE_HSE/RES_CITY_HSE
LS_CHG_DTE_OCR/RES_STATE_HSE/RES_CITY_HSE/POSTAL_CDE_HSE
Thanks in advance
SELECT DATE_FORMAT(`LS_CHG_DTE_OCR`, '%Y-%b') AS fmt_date,
SUM(IF(`TYPE`='Connect',COUNT_SUBS,0)) AS connects,
SUM(IF(`TYPE`='Disconnect',COUNT_SUBS,0)) AS disconnects,
SUM(IF(`TYPE`='Connect',ROUND(REV,2),0)) AS REV,
SUM(IF(`TYPE`='Upgrade',COUNT_SUBS,0)) AS upgrades,
SUM(IF(`TYPE`='Downgrade',COUNT_SUBS,0)) AS downgrades,
SUM(IF(`TYPE`='Upgrade',ROUND(REV,2),0)) AS upgradeRev FROM `hsd`
WHERE LS_CHG_DTE_OCR!='' GROUP BY MONTH(LS_CHG_DTE_OCR) ORDER BY LS_CHG_DTE_OCR ASC
CREATE TABLE `hsd` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`SYS_OCR` varchar(255) DEFAULT NULL,
`PRIN_OCR` varchar(255) DEFAULT NULL,
`SERV_CDE_OHI` varchar(255) DEFAULT NULL,
`DSC_CDE_OHI` varchar(255) DEFAULT NULL,
`LS_CHG_DTE_OCR` datetime DEFAULT NULL,
`SALESREP_OCR` varchar(255) DEFAULT NULL,
`CHANNEL` varchar(255) DEFAULT NULL,
`CUST_TYPE` varchar(255) DEFAULT NULL,
`LINE_BUS` varchar(255) DEFAULT NULL,
`ADDR1_HSE` varchar(255) DEFAULT NULL,
`RES_CITY_HSE` varchar(255) DEFAULT NULL,
`RES_STATE_HSE` varchar(255) DEFAULT NULL,
`POSTAL_CDE_HSE` varchar(255) DEFAULT NULL,
`ZIP` varchar(100) DEFAULT NULL,
`COUNT_SUBS` double DEFAULT NULL,
`REV` double DEFAULT NULL,
`TYPE` varchar(255) DEFAULT NULL,
`lat` varchar(100) DEFAULT NULL,
`long` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx` (`LS_CHG_DTE_OCR`,`CHANNEL`,`CUST_TYPE`,`LINE_BUS`,`RES_CITY_HSE`,`RES_STATE_HSE`,`POSTAL_CDE_HSE`,`ZIP`,`COUNT_SUBS`,`TYPE`)
) ENGINE=InnoDB AUTO_INCREMENT=402342 DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC
Using where; Using temporary; Using filesort[enter image description here][1]
The only condition you apply is LS_CHG_DTE_OCR != "". Other than that you are doing a full table scan because of the aggregations. Index wise you can't do much here.
I ran into the same problem. I had fully optimized my queries (I had joins and more conditions) but the table kept growing and with it query time. Finally I decided to mirror the data to ElasticSearch. In my case it cut down query time to about 1/20th to 1/100th (for different queries).
The only possible index for that SELECT is INDEX(LS_CHG_DTE_OCR). But it is unlikely for it to be used.
Perform the WHERE -- If there are a lot of '' values, then the index may be used for filtering.
GROUP BY MONTH(...) -- You might be folding the same month from multiple years. The Optimizer can't tell, so it will punt on using the index.
ORDER BY LS_CHG_DTE_OCR -- This is done after the GROUP BY; the ORDER BY can't be performed until the data is gathered -- too late for any index. However, if multiple years are folded together, you could get some strange results. Cure it by making the ORDER BY be the same as the GROUP BY. This will also prevent an extra sort that is caused by the GROUP BY and ORDER BY being different.
Yeah, if that idx you added has all the columns in the SELECT, then it is a "covering index". But it won't help any because of the comments above. "Using index" won't help a lot.
GROUP BY LS_CHG_DTE_OCR/RES_STATE_HSE -- Eh? Divide a DATETIME by a VARCHAR? That sounds like a disaster.
This table will grow even bigger over time, correct? Consider building and maintaining Summary Table(s) with month as part of the PRIMARY KEY.

GROUP BY Query -why so slow

I am trying to generate a group query on a large table (more than 8 million rows). However I can reduce the need to group all the data by date. I have a view that captures that dates I require and this limits the query bu it's not much better.
Finally I need to join to another table to pick up a field.
I am showing the query, the create on the main table and the query explain below.
Main Query:
SELECT pgi_raw_data.wsp_channel,
'IOM' AS wsp,
pgi_raw_data.dated,
pgi_accounts.`master`,
pgi_raw_data.event_id,
pgi_raw_data.breed,
Sum(pgi_raw_data.handle),
Sum(pgi_raw_data.payout),
Sum(pgi_raw_data.rebate),
Sum(pgi_raw_data.profit)
FROM pgi_raw_data
INNER JOIN summary_max
ON pgi_raw_data.wsp_channel = summary_max.wsp_channel
AND pgi_raw_data.dated > summary_max.race_date
INNER JOIN pgi_accounts
ON pgi_raw_data.account = pgi_accounts.account
GROUP BY pgi_raw_data.event_id
ORDER BY NULL
The create table:
CREATE TABLE `pgi_raw_data` (
`event_id` char(25) NOT NULL DEFAULT '',
`wsp_channel` varchar(5) NOT NULL,
`dated` date NOT NULL,
`time` time DEFAULT NULL,
`program` varchar(5) NOT NULL,
`track` varchar(25) NOT NULL,
`raceno` tinyint(2) NOT NULL,
`detail` varchar(30) DEFAULT NULL,
`ticket` varchar(20) NOT NULL DEFAULT '',
`breed` varchar(12) NOT NULL,
`pool` varchar(10) NOT NULL,
`gross` decimal(11,2) NOT NULL,
`refunds` decimal(11,2) NOT NULL,
`handle` decimal(11,2) NOT NULL,
`payout` decimal(11,4) NOT NULL,
`rebate` decimal(11,4) NOT NULL,
`profit` decimal(11,4) NOT NULL,
`account` mediumint(10) NOT NULL,
PRIMARY KEY (`event_id`,`ticket`),
KEY `idx_account` (`account`),
KEY `idx_wspchannel` (`wsp_channel`,`dated`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1
This is my view for summary_max:
CREATE ALGORITHM=UNDEFINED DEFINER=`root`#`localhost` SQL SECURITY DEFINER VIEW
`summary_max` AS select `pgi_summary_tbl`.`wsp_channel` AS
`wsp_channel`,max(`pgi_summary_tbl`.`race_date`) AS `race_date`
from `pgi_summary_tbl` group by `pgi_summary_tbl`.`wsp
And also the evaluated query:
1 PRIMARY <derived2> ALL 6 Using temporary
1 PRIMARY pgi_raw_data ref idx_account,idx_wspchannel idx_wspchannel
7 summary_max.wsp_channel 470690 Using where
1 PRIMARY pgi_accounts ref PRIMARY PRIMARY 3 gf3data_momutech.pgi_raw_data.account 29 Using index
2 DERIVED pgi_summary_tbl ALL 42282 Using temporary; Using filesort
Any help on indexing would help.
At a minimum you need indexes on these fields:
pgi_raw_data.wsp_channel,
pgi_raw_data.dated,
pgi_raw_data.account
pgi_raw_data.event_id,
summary_max.wsp_channel,
summary_max.race_date,
pgi_accounts.account
The general (not always) rule is anything you are sorting, grouping, filtering or joining on should have an index.
Also: pgi_summary_tbl.wsp
Also, why the order by null?
The first thing is to be sure that you have indexes on pgi_summary_table(wsp_channel, race_date) and pgi_accounts(account). For this query, you don't need indexes on these columns in the raw data.
MySQL has a tendency to use indexes even when they are not the most efficient path. I would start by looking at the performance of the "full" query, without the joins:
SELECT pgi_raw_data.wsp_channel,
'IOM' AS wsp,
pgi_raw_data.dated,
-- pgi_accounts.`master`,
pgi_raw_data.event_id,
pgi_raw_data.breed,
Sum(pgi_raw_data.handle),
Sum(pgi_raw_data.payout),
Sum(pgi_raw_data.rebate),
Sum(pgi_raw_data.profit)
FROM pgi_raw_data
GROUP BY pgi_raw_data.event_id
If this has better performance, you may have a situation where the indexes are working against you. The specific problem is called "thrashing". It occurs when a table is too bit to fit into memory. Often, the fastest way to deal with such a table is to just read the whole thing. Accessing the table through an index can result in an extra I/O operation for most of the rows.
If this works, then do the joins after the aggregate. Also, consider getting more memory, so the whole table will fit into memory.
Second, if you have to deal with this type of data, then partitioning the table by date may prove to be a very useful option. This will allow you to significantly reduce the overhead of reading the large table. You do have to be sure that the summary table can be read the same way.

MYSQL performance slow using filesort

I have a simple mysql query, but when I have a lot of records (currently 103,0000), the performance is really slow and it says it is using filesort, im not sure if this is why it is slow. Has anyone any suggestions to speed it up? or stop it using filesort?
MYSQL query :
SELECT *
FROM adverts
WHERE (price >= 0)
AND (status = 1)
AND (approved = 1)
ORDER BY date_updated DESC
LIMIT 19990, 10
The Explain results :
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE adverts range price price 4 NULL 103854 Using where; Using filesort
Here is the adverts table and indexes:
CREATE TABLE `adverts` (
`advert_id` int(10) NOT NULL AUTO_INCREMENT,
`user_id` int(10) NOT NULL,
`type_id` tinyint(1) NOT NULL,
`breed_id` int(10) NOT NULL,
`advert_type` tinyint(1) NOT NULL,
`headline` varchar(50) NOT NULL,
`description` text NOT NULL,
`price` int(4) NOT NULL,
`postcode` varchar(7) NOT NULL,
`town` varchar(60) NOT NULL,
`county` varchar(60) NOT NULL,
`latitude` float NOT NULL,
`longitude` float NOT NULL,
`telephone1` varchar(15) NOT NULL,
`telephone2` varchar(15) NOT NULL,
`email` varchar(80) NOT NULL,
`status` tinyint(1) NOT NULL DEFAULT '0',
`approved` tinyint(1) NOT NULL DEFAULT '0',
`date_created` datetime NOT NULL,
`date_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`expiry_date` datetime NOT NULL,
PRIMARY KEY (`advert_id`),
KEY `price` (`price`),
KEY `user` (`user_id`),
KEY `type_breed` (`type_id`,`breed_id`),
KEY `headline_keywords` (`headline`),
KEY `date_updated` (`date_updated`),
KEY `type_status_approved` (`advert_type`,`status`,`approved`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
The problem is that MySQL only uses one index when executing the query. If you add a new index that uses the 3 fields in your WHERE clause, it will find the rows faster.
ALTER TABLE `adverts` ADD INDEX price_status_approved(`price`, `status`, `approved`);
According to the MySQL documentation ORDER BY Optimization:
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:
The key used to fetch the rows is not the same as the one used in the ORDER BY.
This is what happens in your case.
As the output of EXPLAIN tells us, the optimizer uses the key price to find the rows. However, the ORDER BY is on the field date_updated which does not belong to the key price.
To find the rows faster AND sort the rows faster, you need to add an index that contains all the fields used in the WHERE and in the ORDER BY clauses:
ALTER TABLE `adverts` ADD INDEX status_approved_date_updated(`status`, `approved`, `date_updated`);
The field used for sorting must be in the last position in the index. It is useless to include price in the index, because the condition used in the query will return a range of values.
If EXPLAIN still shows that it is using filesort, you may try forcing MySQL to use an index you choose:
SELECT adverts.*
FROM adverts
FORCE INDEX(status_approved_date_updated)
WHERE price >= 0
AND adverts.status = 1
AND adverts.approved = 1
ORDER BY date_updated DESC
LIMIT 19990, 10
It is usually not necessary to force an index, because the MySQL optimizer most often does the correct choice. But sometimes it makes a bad choice, or not the best choice. You will need to run some tests to see if it improves performance or not.
Remove the ticks around the '0' - it currently may prevent using the index but I am not sure.
Nevertheless it is better style since price is int type and not a character column.
SELECT adverts .*
FROM adverts
WHERE (
price >= 0
)
AND (
adverts.status = 1
)
AND (
adverts.approved = 1
)
ORDER BY date_updated DESC
LIMIT 19990 , 10
MySQL does not make use of the key date_updated for the sorting but just uses the price key as it is used in the WHERE clause. You could try to to use index hints:
http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
Add something like
USE KEY FOR ORDER BY (date_updated)
I have two suggestions. First, remove the quotes around the zero in your where clause. That line should be:
price >= 0
Second, create this index:
CREATE INDEX `helper` ON `adverts`(`status`,`approved`,`price`,`date_created`);
This should allow MySQL to find the 10 rows specified by your LIMIT clause by using only the index. Filesort itself is not a bad thing... the number of rows that need to be processed is.
Your WHERE condition uses price, status, approved to select, and then date_updated is used to sort.
So you need a single index with those fields; I'd suggest indexing on approved, status, price and date_updated, in this order.
The general rule is placing WHERE equalities first, then ranges (more than, less or equal, between, etc), and sorting fields last. (Note that leaving one field out might make the index less usable, or even unusable, for this purpose).
CREATE INDEX advert_ndx ON adverts (approved, status, price, date_updated);
This way, access to the table data is only needed after LIMIT has worked its magic, and you will slow-retrieve only a small number of records.
I'd also remove any unneeded indexes, which would speed up INSERTs and UPDATEs.

Mysql performance with nested indices

I have a mysql table (articles) with a nested index (blog_id, published), and performs poorly. I see a lot of these in my slow query logs:
- Query_time: 23.184007 Lock_time: 0.000063 Rows_sent: 380 Rows_examined: 6341
SELECT id from articles WHERE category_id = 11 AND blog_id IN (13,14,15,16,17,18,19,20,21,22,23,24,26,27,6330,6331,8269,12218,18889) order by published DESC LIMIT 380;
I have trouble understanding why mysql would run through all rows with those blog_ids to figure out my top 380 rows. I would expect the whole purpose of the nested index is to speed that up. To the very least, even a naive implementation, should look-up by blog_id and get it's top 380 rows ordered by published. That should be fast, since, we can figure out the exact 200 rows, due to the nested index. And then sort the resulting 19*200=3800 rows.
If one were to implement it in the most optimal way, you would put a heap from the set of all blog-id based streams and pick the one with the max(published) and repeat it 200 times. Each operation should be fast.
I'm surely missing something since Google, Facebook, Twitter, Microsoft and all the big companies are using mysql for production purposes. Any one with experience?
Edit: Updating as per, thieger's answer. I tried index hinting, and it doesn't seem to help. Results are attached below, at the end. Mysql order by optimisation claims to address the concern theiger is raising:
I agree that MySQL might possibly use
the composite blog_id-published-index,
but only for the blog_id part of the
query.
SELECT * FROM t1 WHERE
key_part1=constant ORDER BY
key_part2;
Atleast mysql seems to claim it can be used beyond just the WHERE clause (blog_id part of the query). Any help theiger?
Thanks,
-Prasanna
[myprasanna at gmail dot com]
CREATE TABLE IF NOT EXISTS `articles` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` int(11) DEFAULT NULL,
`blog_id` int(11) DEFAULT NULL,
`cluster_id` int(11) DEFAULT NULL,
`title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`description` text COLLATE utf8_unicode_ci,
`keywords` text COLLATE utf8_unicode_ci,
`image_url` varchar(511) COLLATE utf8_unicode_ci DEFAULT NULL,
`url` varchar(511) COLLATE utf8_unicode_ci DEFAULT NULL,
`url_hash` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`author` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`categories` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`published` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`is_image_crawled` tinyint(1) DEFAULT NULL,
`image_candidates` text COLLATE utf8_unicode_ci,
`title_hash` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`article_readability_crawled` tinyint(1) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_articles_on_url_hash` (`url_hash`),
KEY `index_articles_on_cluster_id` (`cluster_id`),
KEY `index_articles_on_published` (`published`),
KEY `index_articles_on_is_image_crawled` (`is_image_crawled`),
KEY `index_articles_on_category_id` (`category_id`),
KEY `index_articles_on_title_hash` (`title_hash`),
KEY `index_articles_on_article_readability_crawled` (`article_readability_crawled`),
KEY `index_articles_on_blog_id` (`blog_id`,`published`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=562907 ;
SELECT id from articles USE INDEX(index_articles_on_blog_id) WHERE category_id = 11 AND blog_id IN (13,14,15,16,17,18,19,20,21,22,23,24,26,27,6330,6331,8269,12218,18889) order by published DESC LIMIT 380;
....
380 rows in set (11.27 sec)
explain SELECT id from articles USE INDEX(index_articles_on_blog_id) WHERE category_id = 11 AND blog_id IN (13,14,15,16,17,18,19,20,21,22,23,24,26,27,6330,6331,8269,12218,18889) order by published DESC LIMIT 380\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: articles
type: range
possible_keys: index_articles_on_blog_id
key: index_articles_on_blog_id
key_len: 5
ref: NULL
rows: 8640
Extra: Using where; Using filesort
1 row in set (0.00 sec)
Did you try EXPLAIN to see whether your index is used at all? Did you ANALYZE to update the index statistics?
I agree that MySQL might possibly use the composite blog_id-published-index, but only for the blog_id part of the query. If the index is not used after ANALYZE, you can try giving MySQL a hint with USE INDEX or even FORCE INDEX, but the MySQL optimizer may also correctly assume that a sequential scan is faster than using the index. For your kind of query, I would also propose to add an index on category_id and blog_id and try to use that.
Aside from thieger's excellent answer, you might also want to check:
if an index on (category_id,blog_id,published) is any use.
if there is enough room to keep all indexes in memory (innodb buffer pool usage & flushes for instance, mysqlreport is a very handy tool in that respect)
MySQL has a cutoff mechanism where if it detects that it will probably have to look at more than about a third of the table anyway, it won't use the index. Since it appears your query will match just over 6000 rows of an 8000-odd row table, that is definitely what is happening.
In addition, MySQL can't usually use an index twice on the same table, nor can it use more than one. In this case, it won't use the index for the ORDER BY clause because it has different columns specified than in the WHERE clause.

Any way to avoid a filesort when order by is different to where clause?

I have an incredibly simple query (table type InnoDb) and EXPLAIN says that MySQL must do an extra pass to find out how to retrieve the rows in sorted order.
SELECT * FROM `comments`
WHERE (commentable_id = 1976)
ORDER BY created_at desc LIMIT 0, 5
exact explain output:
table select_type type extra possible_keys key key length ref rows
comments simple ref using where; using filesort common_lookups common_lookups 5 const 89
commentable_id is indexed. Comments has nothing trick in it, just a content field.
The manual suggests that if the order by is different to the where, there is no way filesort can be avoided.
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
I also tried order by id as well as it's equivalent but makes no difference, even if I add id as an index (which I understand is not required as id is indexed implicitly in MySQL).
thanks in advance for any ideas!
To Mark -- here's SHOW CREATE TABLE
CREATE TABLE `comments` (
`id` int(11) NOT NULL auto_increment,
`user_id` int(11) default NULL,
`commentable_type` varchar(255) default NULL,
`commentable_id` int(11) default NULL,
`content` text,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
`hidden` tinyint(1) default '0',
`public` tinyint(1) default '1',
`access_point` int(11) default '0',
`item_id` int(11) default NULL,
PRIMARY KEY (`id`),
KEY `created_at` (`created_at`),
KEY `common_lookups` (`commentable_id`,`commentable_type`,`hidden`,`created_at`,`public`),
KEY `index_comments_on_item_id` (`item_id`),
KEY `index_comments_on_item_id_and_created_at` (`item_id`,`created_at`),
KEY `index_comments_on_user_id` (`user_id`),
KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=31803 DEFAULT CHARSET=latin1
Note that the MySQL term filesort doesn't necessarily mean it writes to the disk. It just means it's going to sort without using an index. If the result set is small enough, MySQL will sort it in memory, which is orders of magnitude faster than disk I/O.
You can increase the amount of memory MySQL allocates for in-memory filesorts using the sort_buffer_size server variable. In MySQL 5.1, the default sort buffer size is 2MB, and the maximum you can allocate is 4GB.
update: Regarding Jonathan Leffler's comment about measuring how long the sorting takes, you can learn how to use SHOW PROFILE FOR QUERY which will give you the breakdown of how long each phase of query execution takes.
Try adding a combined index on (commentable_id, created_at).