Optimize sorting query - mysql

I couldn't find a way to optimize the following query:
SELECT *
FROM tbl
WHERE type='51' AND `start`<='2012-01-19'
ORDER BY end DESC
LIMIT 5
I've tried indexing each column separately (type, start, end) and indexing all of them in a single index, but MySQL keeps telling me that it needs to do a filesort.
Is this query just impossible to optimize?

Yes, as long as you have a range comparison in WHERE and sort by a different field, MySQL cannot use an index for sorting.
It could if you had WHERE type='51' AND `start`='2012-01-19' ORDER BY end DESC, or WHERE type='51' AND `start`<='2012-01-19' ORDER BY `start` DESC.
http://dev.mysql.com/doc/refman/5.5/en/order-by-optimization.html -- and here is a chapter relevant to your problem
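The reason a range condition breaks index-ordered sorting can be sketched with a toy model. This is only an illustration in Python, not MySQL itself: the index on (type, start, end) is modeled as a sorted list of tuples, and the sample rows are made up.

```python
# Toy model of a B-tree index on (type, start, end): entries kept in sorted order.
# The rows are made-up sample data, not taken from the question.
rows = [
    (51, "2012-01-10", "2012-03-01"),
    (51, "2012-01-15", "2012-02-01"),
    (51, "2012-01-19", "2012-04-01"),
    (51, "2012-01-19", "2012-01-25"),
    (51, "2012-01-25", "2012-05-01"),
    (52, "2012-01-01", "2012-06-01"),
]
index = sorted(rows)  # index order: type, then start, then end


def ends_in_index_order(predicate):
    """Return the `end` values of matching entries, in index order."""
    return [end for (type_, start, end) in index if predicate(type_, start)]


# Equality on both leading columns: matching entries arrive sorted by end.
eq_ends = ends_in_index_order(lambda t, s: t == 51 and s == "2012-01-19")

# Range on start: entries arrive sorted by start, so end is in no useful order.
range_ends = ends_in_index_order(lambda t, s: t == 51 and s <= "2012-01-19")

print(eq_ends == sorted(eq_ends))        # True
print(range_ends == sorted(range_ends))  # False
```

With equality on every column before `end`, the matching entries form one contiguous run that is already ordered by `end`. With a range on `start`, the run is ordered by `start` first, so a separate sort is still needed.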

It depends greatly on the column types and on what is in the table, but all you should really need are indexes on the type, start, and end columns.
For an extra boost, you can create a composite index on (type, start).

My first thought is that (*) is expanding into columns that are not indexed.
If not, you will most certainly benefit from a multiple-column index.
I would experiment with the order of the columns in the index definition, taking into account the cardinality (approximate number of unique values) of each column.

Thanks for all your answers; they taught me a lot about the issue.
And finally I got it! It looks like this is actually possible to optimize, or at least to remove the "Using filesort" from the EXPLAIN output.
Here are the indexes I used:
KEY `start` (`start`),
KEY `typeend` (`type`,`end`)
Now executing:
EXPLAIN SELECT *
FROM tbl
WHERE type='51' AND `start`<='2012-01-19'
ORDER BY end DESC
LIMIT 5
Leads to:
select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
SIMPLE | tbl | ref | start,type,typeend | typeend | 5 | const | 19 | Using where

I would recommend the following:
add index (type, end, start)
rewrite the query:
SELECT *
FROM (
SELECT id -- `id` is the primary key
FROM tbl
WHERE type='51' AND `start`<='2012-01-19'
ORDER BY end DESC
LIMIT 5) as ids
JOIN tbl
USING (id); -- `id` is the primary key

Related

MySQL table with composite primary key LIMIT 1 is very slow while Limit 2 is fast

I have a ProductInfo table with the following indexes:
Primary: ProductCode, Model, Added_Date, id
Index: id
The composite primary key consists of the columns I use in the following query:
SELECT * FROM ProductInfo WHERE
ProductCode='45678' AND
Model='PQA-1' AND
(Added_Date >='2021-08-01 00:00:00' AND Added_Date <='2021-08-14 23:59:59')
ORDER BY Added_Date ASC;
This query works fine.
Problem
The following query is fast
select * from ProductInfo WHERE ProductCode="45678" order by id desc limit 1;
Here is the explain
But the following query is very slow. Please note that the query is the same; only the ProductCode is different.
select * from ProductInfo WHERE ProductCode="78342" order by id desc limit 1;
Here is the explain.
However, same query with Limit 2 is fast
select * from ProductInfo WHERE ProductCode="78342" order by id desc limit 2;
Here is the explain
What is the cause? Is my indexing correct? What will be the solution?
Thanks
The first query benefits from this index:
PRIMARY KEY(ProductCode, Model, Added_Date)
All the other queries could not fully benefit from either of your indexes. The Optimizer guessed at one index in one case; the other index in the other. This would work well for both cases:
INDEX(ProductCode, Id)
One reason for the timing differences is the distribution of data in the table. Your examples may not show the same timings if you change the 78342 to some other value.
The new index I recommend will make those queries "fast" regardless of the ProductCode searched on. And "Rows" will say "1" or "2" instead of about "25000".
It sounds like there are about 25K rows with ProductCode = 78342.
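Why INDEX(ProductCode, Id) lets the lookup touch only one or two entries can be sketched with a toy model. This is an illustration in Python, not MySQL's actual implementation: the index is modeled as a sorted list of (product_code, id) pairs, and the sample values are hypothetical.

```python
import bisect

# Toy model of INDEX(ProductCode, Id): (product_code, id) pairs in sorted order.
# The sample values are hypothetical.
index = sorted([
    ("45678", 3), ("45678", 9), ("45678", 12),
    ("78342", 1), ("78342", 7), ("78342", 20),
    ("99999", 5),
])


def last_id_for(product_code):
    """Mimic ORDER BY id DESC LIMIT 1: find the last entry with this prefix."""
    # Position just past every (product_code, <any id>) entry.
    pos = bisect.bisect_right(index, (product_code, float("inf")))
    if pos == 0 or index[pos - 1][0] != product_code:
        return None  # no row with this ProductCode
    return index[pos - 1][1]


print(last_id_for("78342"))  # 20
print(last_id_for("00000"))  # None
```

Because all entries for one ProductCode sit next to each other and are ordered by id, the highest id is found with a single seek, regardless of how many rows share that ProductCode.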

Database: Sorting on non-indexed column

Does creating an index on a column that is used in the sort order improve performance? I need to know this for MySQL, PostgreSQL and Oracle.
Example query:
select * from article_comments where article_id = 245 order by date_created desc limit 30;
In the article_comments table, article_id is an indexed field, whereas date_created is not.
If we create an index on date_created, will it improve performance?
Table size is around 5-7 million rows.
Does creating an index on a column that is used in the sort order improve performance?
A general answer is: it depends on many factors, especially on the conditions used in the WHERE clause.
For your query, an index on the date_created column alone doesn't eliminate the sort operation, because of the WHERE article_id = 245 clause.
In this case you need to create a multicolumn index in order to skip sorting:
CREATE INDEX somename ON article_comments(article_id, date_created DESC)
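The effect of the multicolumn index can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that no MySQL, PostgreSQL or Oracle server is at hand; SQLite's planner and its plan wording differ from those systems, and the table layout (including the `body` column and the `idx_article` name) is hypothetical.

```python
import sqlite3


def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = ("SELECT * FROM article_comments WHERE article_id = 245 "
         "ORDER BY date_created DESC LIMIT 30")
ddl = ("CREATE TABLE article_comments ("
       "id INTEGER PRIMARY KEY, article_id INTEGER, date_created TEXT, body TEXT)")

# Case 1: only article_id is indexed, as in the question.
single = sqlite3.connect(":memory:")
single.execute(ddl)
single.execute("CREATE INDEX idx_article ON article_comments(article_id)")
plan_single = plan(single, query)

# Case 2: the multicolumn index from the answer.
multi = sqlite3.connect(":memory:")
multi.execute(ddl)
multi.execute("CREATE INDEX somename ON article_comments(article_id, date_created DESC)")
plan_multi = plan(multi, query)

print(plan_single)
print(plan_multi)
```

In SQLite, the first plan should mention a temporary B-tree for the ORDER BY (its counterpart to a separate sort step), while the second should use `somename` without one. On the real target database, run EXPLAIN there to confirm the same behavior.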
I can only speak for Oracle. Yes, an index can make your sorting faster.
In the example "Function-Based Index for Language-Dependent Sorting", Oracle shows an index whose only purpose is to improve the performance of an ORDER BY operation.
Another example regarding sorting is shown here: Full Index Scan

mysql order by -id vs order by id desc

I wish to fetch the last 10 rows from a table of 1M rows.
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`updated_date` datetime NOT NULL,
PRIMARY KEY (`id`)
)
One way of doing this is -
select * from test order by -id limit 10;
10 rows in set (0.14 sec)
Another way of doing this is -
select * from test order by id desc limit 10;
10 rows in set (0.00 sec)
So I did an 'EXPLAIN' on these queries -
Here is the result for the query where I use 'order by desc'
EXPLAIN select * from test order by id desc limit 10;
And here is the result for the query where I use 'order by -id'
EXPLAIN select * from test order by -id limit 10;
I thought these would be the same, but it seems there are differences in the execution plan.
RDBMSs use heuristics to calculate the execution plan; they cannot always determine the semantic equivalence of two statements, as that is too difficult a problem (in terms of theoretical and practical complexity).
So MySQL is not able to use the index, because you do not have an index on "-id", which is an expression applied to the field "id". It seems trivial, but RDBMSs must minimize the amount of time needed to compute plans, so they get stuck on simple problems.
When no optimization can be found for a query (e.g. using an index), the system falls back to the implementation that works in any case: a scan of the full table.
As you can see in the EXPLAIN results:
1 : order by id
MySQL is using the index on id, so it needs to read only 10 rows, as they are already in index order. In this case MySQL also doesn't need to use the filesort algorithm.
2 : order by -id
MySQL is not using the index on id, so it needs to read all the rows (e.g. 455952) to get your expected results. In this case MySQL needs to use the filesort algorithm, because no index exists on the expression -id. So it will obviously take more time :)
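According to the MySQL manual, MySQL cannot use an index to resolve the ORDER BY in cases such as the following.
You use ORDER BY with an expression that includes terms other than the key column name: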
SELECT * FROM t1 ORDER BY ABS(key);
SELECT * FROM t1 ORDER BY -key;
You index only a prefix of a column named in the ORDER BY clause. In this case, the index cannot be used to fully resolve the sort order. For example, if you have a CHAR(20) column, but index only the first 10 bytes, the index cannot distinguish values past the 10th byte and a filesort will be needed.
The type of table index used does not store rows in order. For example, this is true for a HASH index in a MEMORY table.
Please follow this link: http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
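The difference between the two ORDER BY forms can be reproduced in miniature. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that no MySQL server is at hand; SQLite reports a temporary B-tree where MySQL reports "Using filesort", and the 100 sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test ("
             "id INTEGER PRIMARY KEY AUTOINCREMENT, updated_date TEXT NOT NULL)")
conn.executemany("INSERT INTO test (updated_date) VALUES (?)",
                 [("2020-01-01 00:00:00",)] * 100)


def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


desc_sql = "SELECT id FROM test ORDER BY id DESC LIMIT 10"
neg_sql = "SELECT id FROM test ORDER BY -id LIMIT 10"

plan_desc = plan(desc_sql)  # can walk the primary key backwards
plan_neg = plan(neg_sql)    # must evaluate -id for every row, then sort

# Both forms return the same rows; only the work done to get them differs.
same_rows = conn.execute(desc_sql).fetchall() == conn.execute(neg_sql).fetchall()

print(plan_desc)
print(plan_neg)
print(same_rows)
```

The two queries are semantically equivalent here, but only the `ORDER BY id DESC` form names the key column directly, so only that form can be answered in index order.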

Do Mysql sub queries use indexes too?

I have a table that I do fulltext searching on. It's already getting big with a relatively small number of users: 20 million rows.
Searches will only ever need to cover rows that belong to the PKs relevant to the search, i.e. rows that belong to that user, which is at most about 200,000 per user. I figured that if the fulltext search were only done on a subquery that first selects that user's rows, it should be super fast, e.g.:
SELECT * FROM
(SELECT * FROM table1 WHERE userID = 2 ) AS r
WHERE MATCH (r.fullTextCol1) AGAINST ('+monkey* ' IN BOOLEAN MODE)
ORDER BY r.fullTextCol1, r.fullTextCol2 ASC LIMIT 0,50
However, this query takes 4 seconds.
EXPLAIN says...
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 185927 Using where; Using filesort
2 DERIVED table1 ref PRIMARY,unique unique 4 193082
My indexes are:
PRIMARY (userID, userSubList, userItemID)
FULLTEXT fullTextCol1
FULLTEXT fullTextCol2
The subquery seems to not use the userID index at all.
Is my thinking right in approaching it like this, sub-selecting the relevant user rows to search on?
Thanks for your time and help.
Have you tried it like this?
SELECT *
FROM table1
WHERE userID = 2
AND MATCH (fullTextCol1) AGAINST ('+monkey* ' IN BOOLEAN MODE)
ORDER BY fullTextCol1, fullTextCol2 ASC LIMIT 0,50;
Or run it without ORDER BY to check whether the search itself or the ordering is slow (or both).
EDIT
In your case, a composite index on (userID, fullTextCol1) would be needed, but MySQL does not support that. This has already been answered elsewhere; see "Compound FULLTEXT index in MySQL".
Please let me know whether the above answer makes sense and what the result is.

Can MySQL use index in a RANGE QUERY with ORDER BY?

I have a MySQL table:
CREATE TABLE mytable (
id INT NOT NULL AUTO_INCREMENT,
other_id INT NOT NULL,
expiration_datetime DATETIME,
score INT,
PRIMARY KEY (id)
)
I need to run a query of the form:
SELECT * FROM mytable
WHERE other_id=1 AND expiration_datetime > NOW()
ORDER BY score LIMIT 10
If I add this index to mytable:
CREATE INDEX order_by_index
ON mytable ( other_id, expiration_datetime, score);
Would MySQL be able to use the entire order_by_index in the query above?
It seems like it should be able to, but then according to MySQL's documentation: "The index can also be used even if the ORDER BY does not match the index exactly, as long as all of the unused portions of the index and all the extra ORDER BY columns are constants in the WHERE clause."
The above passage seems to suggest that the index would only be used in a constant query, while mine is a range query.
Can anyone clarify whether the index would be used in this case? If not, is there any way I could force the use of the index?
Thanks.
MySQL will use the index to satisfy the where clause, and will use a filesort to order the results.
It can't use the index for the ORDER BY because expiration_datetime is compared with a range rather than tested for equality with a constant. Therefore, the rows being returned will not all share a common prefix in the index, so the index can't be used for the sort.
For example, consider a sample set of 4 index records for your table:
a) [1,'2010-11-03 12:00',1]
b) [1,'2010-11-03 12:00',3]
c) [1,'2010-11-03 13:00',2]
d) [2,'2010-11-03 12:00',1]
If I run your query at 2010-11-03 11:00, it will return rows a, b, and c (d is excluded because its other_id is 2). In index order their scores are 1, 3, 2, which is not sorted by score. Thus MySQL needs to do an extra pass to sort the results and can't use the index for the ordering in this case.
Can anyone clarify if index would be used in this case? If not, any way I could force the use of index?
You have a range in the filtering condition, and the ORDER BY does not match the column used in the range.
These two conditions cannot both be served by a single index.
To choose which index to create, you need to run these queries:
SELECT COUNT(*)
FROM mytable
WHERE other_id = 1
AND (score, id) <
(
SELECT score, id
FROM mytable
WHERE other_id = 1
AND expiration_datetime > NOW()
ORDER BY
score, id
LIMIT 9, 1 -- the 10th matching row; a row subquery compared with < must return a single row
)
and
SELECT COUNT(*)
FROM mytable
WHERE other_id = 1
AND expiration_datetime >= NOW()
and compare their outputs.
If the second query yields about the same number of rows as the first one, or more, then you should use an index on (other_id, score) (and let it filter on expiration_datetime).
If the second query yields significantly fewer rows than the first one, you should use an index on (other_id, expiration_datetime) (and let it sort on score).
This article might be interesting to you:
Choosing index
It sounds like you've already checked the documentation and set up the index. Use EXPLAIN and see:
EXPLAIN SELECT * FROM mytable
WHERE other_id=1 AND expiration_datetime > NOW()
ORDER BY score LIMIT 10