MySQL ORDER BY DESC is fast but ASC is very slow - mysql

For some reason when I sort this query by DESC it's super fast, but if sorted by ASC it's extremely slow.
This takes about 150 milliseconds:
SELECT posts.id
FROM posts USE INDEX (published)
WHERE posts.feed_id IN ( 4953,622,1,1852,4952,76,623,624,10 )
ORDER BY posts.published DESC
LIMIT 0, 50;
This takes about 32 seconds:
SELECT posts.id
FROM posts USE INDEX (published)
WHERE posts.feed_id IN ( 4953,622,1,1852,4952,76,623,624,10 )
ORDER BY posts.published ASC
LIMIT 0, 50;
The EXPLAIN is the same for both queries.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE posts index NULL published 5 NULL 50 Using where
I've tracked it down to "USE INDEX (published)". If I take that out it's the same performance both ways. But the EXPLAIN shows the query is less efficient overall.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE posts range feed_id feed_id 4 \N 759 Using where; Using filesort
And here's the table.
CREATE TABLE `posts` (
`id` int(20) NOT NULL AUTO_INCREMENT,
`feed_id` int(11) NOT NULL,
`post_url` varchar(255) NOT NULL,
`title` varchar(255) NOT NULL,
`content` blob,
`author` varchar(255) DEFAULT NULL,
`published` int(12) DEFAULT NULL,
`updated` datetime NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `post_url` (`post_url`,`feed_id`),
KEY `feed_id` (`feed_id`),
KEY `published` (`published`)
) ENGINE=InnoDB AUTO_INCREMENT=196530 DEFAULT CHARSET=latin1;
Is there a fix for this?

Your index is sorted DESC, so when you ask for ascending order it needs to do a lot more work to bring the rows back in that order.

I wouldn't suggest you create another index on the table; every time a row is inserted or deleted, each index on the table needs to be updated, slowing down INSERT queries.
The index is definitely what's slowing it down. Maybe you could try IGNORE-ing it:
SELECT posts.id
FROM posts IGNORE INDEX (published)
WHERE posts.feed_id IN ( 4953,622,1,1852,4952,76,623,624,10 )
ORDER BY posts.published ASC
LIMIT 0, 50;
Or, since the field is already KEYed, you might try the following:
SELECT posts.id
FROM posts USE KEY (published)
WHERE posts.feed_id IN ( 4953,622,1,1852,4952,76,623,624,10 )
ORDER BY posts.published ASC
LIMIT 0, 50;

You could get your data set first, and then order it.
Something like
SELECT posts.id FROM (
SELECT posts.id
FROM posts USE INDEX (published)
WHERE posts.feed_id IN ( 4953,622,1,1852,4952,76,623,624,10 )
LIMIT 0, 50
) AS posts
ORDER BY posts.id ASC;
It should first use the index to find all records that satisfy your WHERE clause, and then order them. The ordering would be performed on a smaller set. Give it a try and then tell us.
Best Regards.

You want to add an index across (feed_id, published):
ALTER TABLE posts ADD INDEX (feed_id, published)
That'll make this query run best, and you won't need to force a particular index with USE INDEX.
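A minimal sketch of how that could be checked. The index name feed_published is made up for illustration, and the plan you actually get depends on your data and MySQL version, so read the EXPLAIN output rather than assuming the result.

```sql
-- Hypothetical index name; what matters is the column order (feed_id first, then published).
ALTER TABLE posts ADD INDEX feed_published (feed_id, published);

-- Same query as in the question, but without the USE INDEX hint.
EXPLAIN
SELECT posts.id
FROM posts
WHERE posts.feed_id IN ( 4953,622,1,1852,4952,76,623,624,10 )
ORDER BY posts.published ASC
LIMIT 0, 50;
```

Because the IN list spans several feed_id values, MySQL may still report "Using filesort", but it only has to sort the rows belonging to those feeds. InnoDB secondary indexes also carry the primary key, so this index already contains posts.id.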

How about flipping the WHERE condition?
SELECT posts.id
FROM posts USE INDEX (published)
WHERE posts.feed_id IN ( 10,624,623,76,4952,1852,622,4953 )
ORDER BY posts.published DESC;

Related

MySQL - how to optimize query with order by

I am trying to generate a list of the 5 most recent history items for a collection of user tasks. If I remove the ORDER BY, the execution time drops from ~2 seconds to < 20 msec.
Indexes are on
h.task_id
h.mod_date
i.task_id
i.user_id
This is the query
SELECT h.*
, i.task_id
, i.user_id
, i.name
, i.completed
FROM h
, i
WHERE i.task_id = h.task_id
AND i.user_id = 42
ORDER
BY h.mod_date DESC
LIMIT 5
Here is the explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE i ref PRIMARY,UserID UserID 4 const 3091 Using temporary; Using filesort
1 SIMPLE h ref TaskID TaskID 4 myDB.i.task_id 7
Here are the show create tables:
CREATE TABLE `h` (
`history_id` int(6) NOT NULL AUTO_INCREMENT,
`history_code` tinyint(4) NOT NULL DEFAULT '0',
`task_id` int(6) NOT NULL,
`mod_date` datetime NOT NULL,
`description` text NOT NULL,
PRIMARY KEY (`history_id`),
KEY `TaskID` (`task_id`),
KEY `historyCode` (`history_code`),
KEY `modDate` (`mod_date`)
) ENGINE=InnoDB AUTO_INCREMENT=185647 DEFAULT CHARSET=latin1
and
CREATE TABLE `i` (
`task_id` int(6) NOT NULL AUTO_INCREMENT,
`user_id` int(6) NOT NULL,
`name` varchar(60) NOT NULL,
`due_date` date DEFAULT NULL,
`create_date` date NOT NULL,
`completed` tinyint(1) NOT NULL DEFAULT '0',
`task_description` blob,
PRIMARY KEY (`task_id`),
KEY `name_2` (`name`),
KEY `UserID` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12085 DEFAULT CHARSET=latin1
INDEX(task_id, mod_date, history_id) -- in this order
Will be "covering" and the columns will be in the optimal order
Also, DROP
KEY `TaskID` (`task_id`)
So that the Optimizer won't be tempted to use it.
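As a rough sketch, both changes could be made in one statement. The index name task_mod_history is an assumption; pick whatever fits your naming scheme.

```sql
-- Hypothetical index name. Drops the single-column TaskID index and adds
-- the compound index in the order suggested above.
ALTER TABLE h
  DROP INDEX TaskID,
  ADD INDEX task_mod_history (task_id, mod_date, history_id);
```

In InnoDB the primary key (history_id) is appended to every secondary index anyway, so listing it explicitly mostly documents the intent.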
Try changing the index on h.task_id so it's this compound index.
ALTER TABLE h DROP INDEX TaskID, ADD INDEX TaskID (task_id, mod_date DESC);
This may (or may not) allow MySQL to shortcut some or all of the extra work in your ORDER BY ... LIMIT ... request. It's a notorious performance antipattern, by the way, but sometimes necessary.
Edit: the index didn't help. So let's try a so-called deferred join so we don't have to ORDER and then LIMIT all the data from your h table.
Start with this subquery. It retrieves only the primary key values for the rows involved in your results, and will generate just five rows.
SELECT h.history_id, i.task_id
FROM h
JOIN i ON h.task_id = i.task_id
WHERE i.user_id = 42
ORDER BY h.mod_date DESC
LIMIT 5
Why this subquery? It handles the work-intensive ORDER BY ... LIMIT operation while manipulating only the primary keys and the date. It still must sort tons of rows only to discard all but five, but the rows it has to handle are much shorter. Because this subquery does the heavy work, you focus on optimizing it, rather than the whole query.
Keep the index I suggested above, because it covers the subquery for h.
Then, join it to the rest of your query like this. That way you'll only have to retrieve the expensive h.description column for the five rows you care about.
SELECT h.* , i.task_id, i.user_id , i.name, i.completed
FROM h
JOIN i ON i.task_id = h.task_id
JOIN (
SELECT h.history_id, i.task_id
FROM h
JOIN i ON h.task_id = i.task_id
WHERE i.user_id = 42
ORDER BY h.mod_date DESC
LIMIT 5
) selected ON h.history_id = selected.history_id
AND i.task_id = selected.task_id
ORDER BY h.mod_date DESC
LIMIT 5

MySql Join slow with SUM() of results

Does anyone know a more efficient way to execute this query?
SELECT SQL_CALC_FOUND_ROWS p.*, IFNULL(SUM(v.visits),0) AS visits
FROM posts AS p
LEFT JOIN visits_day v ON v.post_id=p.post_id
GROUP BY p.post_id
ORDER BY p.post_id DESC LIMIT 20 OFFSET 0
The visits_day table has one record per day, per user, per post. As the table grows, this query becomes extremely slow.
I can't add a column with the total visit count because I need to list the posts by most visits per day, per week, etc.
Does anyone know a better solution to this?
Thanks
CREATE TABLE `visits_day` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`post_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`day` date NOT NULL,
`visits` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=52302 DEFAULT CHARSET=utf8
CREATE TABLE `posts` (
`post_id` int(11) NOT NULL AUTO_INCREMENT,
`link` varchar(300) NOT NULL,
`date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`title` varchar(500) NOT NULL,
`img` varchar(300) NOT NULL,
PRIMARY KEY (`post_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1027 DEFAULT CHARSET=utf8
With SQL_CALC_FOUND_ROWS, the query must evaluate everything, just not deliver all the rows. Getting rid of that should be beneficial.
To actually touch only 20 rows, we need to get through the WHERE, GROUP BY and ORDER BY with a single index. Otherwise, we might have to touch all the rows, sort them then deliver 20. The obvious index is (post_id); I suspect that is already indexed as PRIMARY KEY(post_id)? (It would help if you provide SHOW CREATE TABLE when asking questions.)
Another way to do the join, and get the desired result of zero, is as follows. Note that it eliminates the need for GROUP BY.
SELECT p.*,
IFNULL( ( SELECT SUM(visits)
FROM visits_day
WHERE post_id = p.post_id
),
0) AS visits
FROM posts AS p
ORDER BY post_id DESC
LIMIT 20 OFFSET 0
If you really need the count, then consider SELECT COUNT(*) FROM posts.
ON v.post_id=p.post_id in your query and WHERE post_id = p.post_id beg for INDEX(post_id) on visits_day. That will speed up both variants considerably.
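A minimal sketch of that index. The index name idx_post_id is made up for illustration.

```sql
-- Hypothetical index name. Lets both the LEFT JOIN and the correlated
-- subquery find a post's visit rows without scanning the whole table.
ALTER TABLE visits_day ADD INDEX idx_post_id (post_id);

-- Optional variation (an assumption, not part of the suggestion above):
-- adding visits as a second column would let SUM(visits) be read from the index alone.
-- ALTER TABLE visits_day ADD INDEX idx_post_visits (post_id, visits);
```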

MySQL Index with ordering

I have a table with 5 million rows. I didn't add my indexes here:
CREATE TABLE `my_table` (
`Id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`Title` CHAR(200) NULL DEFAULT NULL,
`ProjectId` INT(10) UNSIGNED NOT NULL,
`RoleId` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`Id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB;
When I run below SQL, it takes more than 1 minute.
SELECT *
FROM `my_table` t
WHERE
t.ProjectId IN (123, 456, 789) AND
t.RoleId IN (111, 222, 333)
ORDER BY Title DESC
LIMIT 25
The question is how to properly add indexes to the table. Can you suggest any solutions?
Explain for index "ProjectId" and "RoleId" is:
key = IndxProjectIdRoleId
ref = NULL,
rows: 32,463
Extra: Using where; Using filesort
Thanks for any suggestion.
You can try indexes on (ProjectId, RoleId, Title) and (RoleId, ProjectId, Title). They may not help much. The problem is that you have two inequalities in the where.
One of these is likely to be better than the current execution plan. However, it might not help so much.
MySQL actually has good documentation on multi-column indexes. You might want to review it.
A more complicated version of the query might work better:
(SELECT *
FROM `my_table` t
WHERE t.ProjectId = 123 AND t.RoleId = 111
ORDER BY Title DESC
LIMIT 25
) UNION ALL
(SELECT *
FROM `my_table` t
WHERE t.ProjectId = 123 AND t.RoleId = 222
ORDER BY Title DESC
LIMIT 25
)
UNION ALL
. . . -- The other 7 combinations
ORDER BY Title DESC
LIMIT 25;
This much longer version of the query can take advantage of either of the above indexes so each should be quite fast. In the end, the query has to sort up to 9 * 25 (225) records, and that should be pretty fast, even without an index.
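A sketch of how one branch could be checked against the first of those indexes. The index name project_role_title is an assumption.

```sql
-- Hypothetical index name; columns in the order suggested above.
ALTER TABLE my_table ADD INDEX project_role_title (ProjectId, RoleId, Title);

-- One branch of the UNION ALL. With equality on both leading columns, the rows
-- for this pair are already stored in Title order inside the index, so the
-- ORDER BY ... LIMIT 25 should be satisfiable without a filesort.
EXPLAIN
SELECT *
FROM `my_table` t
WHERE t.ProjectId = 123 AND t.RoleId = 111
ORDER BY Title DESC
LIMIT 25;
```

If the Extra column for this branch still shows "Using filesort", the optimizer is probably not picking the new index, and the UNION ALL rewrite will not help much.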
I suggest a composite index:
INDEX my_index_name (ProjectId,RoleId )
In your case:
CREATE TABLE `my_table` (
`Id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`Title` CHAR(200) NULL DEFAULT NULL,
`ProjectId` INT(10) UNSIGNED NOT NULL,
`RoleId` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`Id`),
INDEX my_index_name (ProjectId,RoleId)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB;
Then check whether the reverse column order is more selective:
INDEX my_index_name (RoleId, ProjectId)
And since your table has only a few columns, you can also try an index that covers every column:
INDEX my_index_name (ProjectId, RoleId, Title, Id)
and select this way:
SELECT Id, Title, ProjectId, RoleId
FROM `my_table` t
WHERE
t.ProjectId IN (123, 456, 789) AND
t.RoleId IN (111, 222, 333)
ORDER BY Title DESC
LIMIT 25;

MySQL JOIN time reduction

This query is taking over a minute to complete:
SELECT keyword, count(*) as 'Number of Occurences'
FROM movie_keyword
JOIN
keyword
ON keyword.`id` = movie_keyword.`keyword_id`
GROUP BY keyword
ORDER BY count(*) DESC
LIMIT 5
Every keyword has an ID associated with it (keyword_id column). And that ID is used to look up the actual keyword from the keyword table.
movie_keyword has 2.8 million rows
keyword has 127,000
However to return just the most used keyword_id's takes only 1 second:
SELECT keyword_id, count(*)
FROM movie_keyword
GROUP BY keyword_id
ORDER BY count(*) DESC
LIMIT 5
Is there a more efficient way of doing this?
Output with EXPLAIN:
1 SIMPLE keyword ALL PRIMARY NULL NULL NULL 125405 Using temporary; Using filesort
1 SIMPLE movie_keyword ref idx_keywordid idx_keywordid 4 imdb.keyword.id 28 Using index
Structure:
CREATE TABLE `movie_keyword` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`movie_id` int(11) NOT NULL,
`keyword_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_mid` (`movie_id`),
KEY `idx_keywordid` (`keyword_id`),
KEY `keyword_ix` (`keyword_id`),
CONSTRAINT `movie_keyword_keyword_id_exists` FOREIGN KEY (`keyword_id`) REFERENCES `keyword` (`id`),
CONSTRAINT `movie_keyword_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4256379 DEFAULT CHARSET=latin1;
CREATE TABLE `keyword` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`keyword` text NOT NULL,
`phonetic_code` varchar(5) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_keyword` (`keyword`(5)),
KEY `idx_pcode` (`phonetic_code`),
KEY `keyword_ix` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=127044 DEFAULT CHARSET=latin1;
Untested, but this should work and be significantly faster in my opinion. I'm not sure whether you're allowed to use LIMIT in a subquery in MySQL, but there are other ways around that.
SELECT keyword, count(*) as 'Number of Occurences'
FROM movie_keyword
JOIN
keyword
ON keyword.`id` = movie_keyword.`keyword_id`
WHERE movie_keyword.keyword_id IN (
SELECT keyword_id
FROM movie_keyword
GROUP BY keyword_id
ORDER BY count(*) DESC
LIMIT 5
)
GROUP BY keyword
ORDER BY count(*) DESC;
This should be faster because you don't join all 2.8 million entries in movie_keyword with keyword, just the ones that actually match, which I'm guessing are significantly fewer.
EDIT: since MySQL doesn't support LIMIT inside an IN subquery, you have to run
SELECT keyword_id
FROM movie_keyword
GROUP BY keyword_id
ORDER BY count(*) DESC
LIMIT 5;
first and, after fetching the results, run the second query:
SELECT keyword, count(*) as 'Number of Occurences'
FROM movie_keyword
JOIN
keyword
ON keyword.`id` = movie_keyword.`keyword_id`
WHERE movie_keyword.keyword_id IN (RESULTS_FROM_FIRST_QUERY_SEPARATED_BY_COMMAS)
GROUP BY keyword
ORDER BY count(*) DESC;
Replace RESULTS_FROM_FIRST_QUERY_SEPARATED_BY_COMMAS with the proper values programmatically from whatever language you're using.
The query seems fine, but I think the table structure is not. Try adding an index on the column keyword.id:
CREATE INDEX keyword_ix ON keyword (id);
or
ALTER TABLE keyword ADD INDEX keyword_ix (id);
It would be much better if you could post the structures of your tables, keyword and movie_keyword. Which of the two is the main table and which is the referencing table?
SELECT keyword, count(movie_keyword.id) as 'Number of Occurences'
FROM movie_keyword
INNER JOIN keyword
ON keyword.`id` = movie_keyword.`keyword_id`
GROUP BY keyword
ORDER BY `Number of Occurences` DESC
LIMIT 5
I know this is a pretty old question, but because I think that xception forgot about derived tables in MySQL, I want to suggest another solution. It requires only one query and it avoids joining the big table in full. If someone has data this big and can test it (maybe the question's author), please share the results.
SELECT keyword.keyword, _temp.occurences
FROM (
SELECT keyword_id, COUNT( keyword_id ) AS occurences
FROM movie_keyword
GROUP BY keyword_id
ORDER BY occurences DESC
LIMIT 5
) AS _temp
JOIN keyword ON _temp.keyword_id = keyword.id
ORDER BY _temp.occurences DESC

Slow query with multiple where and order by clauses

I'm trying to find a way to speed up a slow (filesort) MySQL query.
Tables:
categories (id, lft, rgt)
questions (id, category_id, created_at, votes_up, votes_down)
Example query:
SELECT * FROM questions q
INNER JOIN categories c ON (c.id = q.category_id)
WHERE c.lft > 1 AND c.rgt < 100
ORDER BY q.created_at DESC, q.votes_up DESC, q.votes_down ASC
LIMIT 4000, 20
If I remove the ORDER BY clause, it's fast. I know MySQL doesn't like both DESC and ASC orders in the same clause, so I tried adding a composite (created_at, votes_up) index to the questions table and removed q.votes_down ASC from the ORDER BY clause. That didn't help and it seems that the WHERE clause gets in the way here because it filters by columns from another (categories) table. However, even if it worked, it wouldn't be quite right since I do need the q.votes_down ASC condition.
What are good strategies to improve performance in this case? I'd rather avoid restructuring the tables, if possible.
EDIT:
CREATE TABLE `categories` (
`id` int(11) NOT NULL auto_increment,
`lft` int(11) NOT NULL,
`rgt` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `lft_idx` (`lft`),
KEY `rgt_idx` (`rgt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `questions` (
`id` int(11) NOT NULL auto_increment,
`category_id` int(11) NOT NULL,
`votes_up` int(11) NOT NULL default '0',
`votes_down` int(11) NOT NULL default '0',
`created_at` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `questions_FI_1` (`category_id`),
KEY `votes_up_idx` (`votes_up`),
KEY `votes_down_idx` (`votes_down`),
KEY `created_at_idx` (`created_at`),
CONSTRAINT `questions_FK_1` FOREIGN KEY (`category_id`) REFERENCES `categories` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE q ALL questions_FI_1 NULL NULL NULL 31774 Using filesort
1 SIMPLE c eq_ref PRIMARY,lft_idx,rgt_idx PRIMARY 4 ttt.q.category_id 1 Using where
Try a subquery to get the desired categories:
SELECT * FROM questions
WHERE category_id IN ( SELECT id FROM categories WHERE lft > 1 AND rgt < 100 )
ORDER BY created_at DESC, votes_up DESC, votes_down ASC
LIMIT 4000, 20
Try selecting only what you need in your query, instead of the SELECT *
Why not to use SELECT * ( ALL ) in MySQL
Try putting the conditions on the joined table into the ON clause:
SELECT * FROM questions q
INNER JOIN categories c ON (c.id = q.category_id AND c.lft > 1 AND c.rgt < 100)
ORDER BY q.created_at DESC, q.votes_up DESC, q.votes_down ASC
LIMIT 4000, 20