MySQL ORDER BY not using index

I am using MySQL.
I created an index on 'playCount' 'desc' in table D.
However, it was not used.
So I created an index on 'aId ASC, playCount DESC' in table D.
But that was not used either.
The ORDER BY is very slow; please tell me how to create an index for my query.
explain SELECT `A`.`id` AS `id`, `A`.`title` AS `title`, `A`.`img` AS `img`
FROM `A` `A`
INNER JOIN `B` `B` ON `B`.`aId`=`A`.`id`
INNER JOIN `C` `C` ON `C`.`id`=`B`.`cId`
LEFT JOIN `D` `D` ON `D`.`aId`=`A`.`id`
GROUP BY `A`.`id`
ORDER BY `D`.`playCount` DESC
LIMIT 10;

There are at least 2 reasons why an index for the ORDER BY may be ignored.
That query will be performed this way:
1. Join together all valid combinations (based on the ONs) of rows among those tables. This generates a potentially large temp table. This temp table will include a bunch of columns -- title, etc.
2. Perform the GROUP BY. This probably requires sorting the temp table above. This may shrink back down to a smaller temp table. Since this GROUP BY must be performed before the ORDER BY, no index relating to the ORDER BY can help.
3. Sort (again) to get the desired ORDER BY.
4. Deliver the first 10 rows. This effectively tosses any bulky thing (title?) (except for the first 10) that had been carried around since step 1.
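To make the order of operations concrete, here is a toy sketch of the same pipeline in plain Python. It is not how MySQL is implemented; the rows in `A` and `D` are made up, and the joins to B and C are left out for brevity. The point it illustrates is that the final sort runs on the grouped result, not on table D, so an index on D cannot supply the order.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical toy rows standing in for tables A and D (not from the question).
A = [{"id": 1, "title": "a"}, {"id": 2, "title": "b"}, {"id": 3, "title": "c"}]
D = [{"aId": 1, "playCount": 5}, {"aId": 2, "playCount": 9}, {"aId": 2, "playCount": 1}]

def run(limit):
    # Step 1: join. LEFT JOIN keeps A rows that have no match in D.
    joined = []
    for a in A:
        matches = [d for d in D if d["aId"] == a["id"]] or [{"playCount": None}]
        joined += [{**a, "playCount": m["playCount"]} for m in matches]
    # Step 2: GROUP BY id. Sort the joined rows, then keep one row per group.
    # (Which row's playCount survives is arbitrary; here it is the first one.)
    joined.sort(key=itemgetter("id"))
    grouped = [next(rows) for _, rows in groupby(joined, key=itemgetter("id"))]
    # Step 3: sort again for ORDER BY playCount DESC (NULLs go last).
    grouped.sort(key=lambda r: (r["playCount"] is None, -(r["playCount"] or 0)))
    # Step 4: LIMIT.
    return grouped[:limit]

print([r["id"] for r in run(2)])
```

Every row has to pass through steps 1 to 3 before the LIMIT can discard anything, which is why the query is slow even though only 10 rows are returned.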
If there were a WHERE clause, the addition of an INDEX might help.
INDEX(aId ASC, playCount DESC) -- Well, I need to ask what version of MySQL you are using. Mixing ASC and DESC has always been allowed, and the sorting has always worked correctly. But DESC has been ignored in the index until version 8.0. (Still, as I have already pointed out, the index cannot be used.)
If you want to discuss this further, please provide SHOW CREATE TABLE for each table, EXPLAIN SELECT ..., the approximate size of each table, and whether the tables are related 1:1 or many:many or many:1.

Related

About mysql query with inner Join content

I'm a beginner in PHP and I want to ask whether the query and table schema I have set up are the right approach for performance. Note: if you want me to do it a different way, please provide a sample. Thanks.
$digerilanlar = DB::get('
SELECT Count(siparisler.hid) AS siparissayisi,
siparisler.hid, ilanlar.id, ilanlar.seflink, ilanlar.kategori, ilanlar.baslik,
ilanlar.yayin, ilanlar.tutar, ilanlar.sure, ilanlar.onecikan, ilanlar.guncellemetarihi,
uyeler.nick, uyeler.foto, uyeler.online, uyeler.ban FROM ilanlar
inner join uyeler ON uyeler.id=ilanlar.ilansahibi
LEFT JOIN siparisler ON ilanlar.id = siparisler.hid
WHERE ilanlar.kategori= '.$kat->id.' and ilanlar.yayin=1 and uyeler.ban=0
GROUP BY ilanlar.id
ORDER BY guncellemetarihi DESC
LIMIT 0,12');
DATABASE DESIGN
Table engine: MyISAM, MySQL version 5.7.14
TABLE: ilanlar
ilansahibi (int) = index
kategori (int) = index
yayin (int) = index
TABLE: uyeler
ban (int) = index
TABLE: siparisler
hid (int) = index
This will probably require two temp tables and two sorts:
GROUP BY ilanlar.id
ORDER BY guncellemetarihi DESC
Assuming that guncellemetarihi is the update date, the following is not identical, but it probably gives you what you want with only one temp table and sort:
GROUP BY guncellemetarihi, id
ORDER BY guncellemetarihi DESC, id DESC
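As a quick sanity check of the claim that the two forms return the same rows, here is a small sketch using Python's built-in sqlite3 as a stand-in for MySQL. The rows are made up and the table definitions are cut down to the columns involved; it only checks the results, not how many temp tables or sorts MySQL would need.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ilanlar (id INTEGER PRIMARY KEY, guncellemetarihi TEXT);
CREATE TABLE siparisler (hid INTEGER);
INSERT INTO ilanlar VALUES (1, '2020-01-03'), (2, '2020-01-01'), (3, '2020-01-02');
INSERT INTO siparisler VALUES (1), (1), (3);
""")

base = """
SELECT ilanlar.id, COUNT(siparisler.hid)
FROM ilanlar LEFT JOIN siparisler ON ilanlar.id = siparisler.hid
"""
# Original form: group on id, sort on the date.
original = con.execute(
    base + "GROUP BY ilanlar.id ORDER BY guncellemetarihi DESC").fetchall()
# Rewritten form: group and sort on the same column list.
rewritten = con.execute(
    base + "GROUP BY guncellemetarihi, ilanlar.id "
           "ORDER BY guncellemetarihi DESC, ilanlar.id DESC").fetchall()

print(original == rewritten)
```

Because id is unique, adding guncellemetarihi to the GROUP BY does not change the groups. The two forms can differ only in how rows with the same date are ordered relative to each other.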
COUNT(x) checks x for being NOT NULL. If that is not necessary, simply do COUNT(*).
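One caveat worth checking before making that switch: with a LEFT JOIN, the NULL check does matter. The sketch below uses Python's sqlite3 as a stand-in for MySQL, with made-up rows. A listing with no orders gets 0 from COUNT(hid) but 1 from COUNT(*), because the NULL-extended row is still a row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ilanlar (id INTEGER PRIMARY KEY);
CREATE TABLE siparisler (hid INTEGER);
INSERT INTO ilanlar VALUES (1), (2);
INSERT INTO siparisler VALUES (1), (1);   -- listing 2 has no orders
""")

rows = con.execute("""
SELECT ilanlar.id, COUNT(siparisler.hid), COUNT(*)
FROM ilanlar LEFT JOIN siparisler ON ilanlar.id = siparisler.hid
GROUP BY ilanlar.id
ORDER BY ilanlar.id
""").fetchall()

for listing_id, count_hid, count_star in rows:
    print(listing_id, count_hid, count_star)
```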
SELECT COUNT(hid), hid
does not make sense. The COUNT implies that there may be multiple "hids", but hid implies that there is only one. (Since I don't understand the objective, I cannot advise which direction to change things.)
This composite INDEX may help:
ilanlar: INDEX(kategori, yayin, ilansahibi, id)
You should switch from ENGINE=MyISAM to ENGINE=InnoDB.
More on making indexes: Index Cookbook
To discuss further, please provide SHOW CREATE TABLE and EXPLAIN SELECT ...

Why can't MySQL use an index for ORDER BY if ASC and DESC are mixed?

Let's say I have a very simple table like this:
CREATE TABLE `t1` (
`key_part1` INT UNSIGNED NOT NULL,
`key_part2` INT UNSIGNED NOT NULL,
`value` TEXT NOT NULL,
PRIMARY KEY (`key_part1`, `key_part2`)
) ENGINE=InnoDB
Using this table, I want to issue a query like this:
SELECT *
FROM `t1`
ORDER BY `key_part1` ASC, `key_part2` DESC
LIMIT 1
I had hoped that the ORDER BY in this query would get satisfied by the index. However, according to the MySQL documentation:
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:
You mix ASC and DESC:
SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;
I tried a query similar to the above query and as expected, the EXPLAIN output says that such a query does a filesort. This doesn't totally make sense to me because I can do the following:
SELECT *
FROM `t1`
WHERE `key_part1` = (
SELECT `key_part1`
FROM `t1`
ORDER BY `key_part1` ASC
LIMIT 1
)
ORDER BY `key_part2` DESC
LIMIT 1
When I EXPLAIN this, it says that neither the subquery nor the outer query uses a filesort. Furthermore, I tried this kind of trick on a big table I have with a similar structure and found that it speeds up my query by 3 orders of magnitude.
My questions are
Are the two queries I show here equivalent? They seem like they are, but I may be missing something. If they are not, what kind of data would I need to have in my table to cause them to give different results?
Is there a reason that MySQL can't do this optimization trick on its own, or is this just a case of an optimization that is possible, but just hasn't been written into MySQL?
If it matters, I am using MySQL 5.6.22.
Further clarification:
By "equivalent" I mean "produce the same result". Additionally, I am very aware that if I were to change LIMIT 1 to LIMIT 2 or something, the queries would no longer produce the same results. I am not interested in those cases, only in the case with LIMIT 1.
It's not that MySQL is missing an optimization "trick", it's a property of how compound indexes work. MySQL can only do an index scan in one direction at a time, and has to go with how the index is ordered (so it can do computer-sciency things like binary search, etc).
Let's look at your sample query:
SELECT *
FROM t1
WHERE key_part1 = (
SELECT key_part1
FROM t1
ORDER BY key_part1 ASC
LIMIT 1
)
ORDER BY key_part2 DESC
LIMIT 1
This can order on key_part2 because all returned rows will have an identical key_part1. So basically MySQL can ignore that part of the index; it is functionally identical to ORDER BY key_part1 DESC, key_part2 DESC. The direction of the ORDER BY in your subquery is irrelevant to the outer sort because it only picks a single key_part1 value.
Edit
To be clear, your example query really looks like this:
SELECT *
FROM t1
WHERE key_part1 = #{some value}
ORDER BY key_part2 DESC
LIMIT 1
Where #{some value} is the result of a subselect. It should be clear now why this sort doesn't need a filesort, because you are not sorting by key_part1 at all. In fact, there is no need to, because all returned rows will have identical key_part1.
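For the LIMIT 1 case the question asks about, the equivalence of the two queries can be checked on a small table. This is a sketch using Python's sqlite3 in place of MySQL, with made-up rows; it says nothing about filesort, only that both forms pick the same row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t1 (
  key_part1 INTEGER NOT NULL,
  key_part2 INTEGER NOT NULL,
  value     TEXT    NOT NULL,
  PRIMARY KEY (key_part1, key_part2)
);
INSERT INTO t1 VALUES (2, 1, 'c'), (1, 1, 'a'), (1, 7, 'b'), (2, 9, 'd');
""")

# Mixed-direction sort over the whole table.
mixed = con.execute(
    "SELECT * FROM t1 ORDER BY key_part1 ASC, key_part2 DESC LIMIT 1").fetchone()

# Two-step form: fix the smallest key_part1, then sort on key_part2 only.
two_step = con.execute("""
SELECT * FROM t1
WHERE key_part1 = (SELECT key_part1 FROM t1 ORDER BY key_part1 ASC LIMIT 1)
ORDER BY key_part2 DESC
LIMIT 1
""").fetchone()

print(mixed, two_step)
```

Both pick the row with the smallest key_part1 and, within it, the largest key_part2. With a larger LIMIT the two-step form would never leave the first key_part1 group, which is where the two queries diverge.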

Optimizing primary id relation table

The tag_relation table has only tag_id and comment_id fields, and both of them are indexed. (There is no primary key.) It is an InnoDB table.
The following query takes a long time to execute. How can I make it faster?
The comment_id, tag_id, status and datetime fields are all indexed. I really have no idea how to optimize it further.
SELECT
text
FROM comment
INNER JOIN tag_relation
ON tag_relation.comment_id=comment.comment_id
WHERE tag_id='1022278'
AND status=1
ORDER BY comment.datetime DESC LIMIT 0,20
The main cause of the slowness is the tag_relation table, which has 1.5 million records. When it had fewer records, execution was faster.
Query plan:
This is your query:
SELECT c.text
FROM comment c INNER JOIN
tag_relation tr
ON tr.comment_id = c.comment_id
WHERE tr.tag_id = 1022278 AND c.status = 1
ORDER BY c.datetime DESC
LIMIT 0, 20;
First, notice that I removed the single quotes from the value 1022278. If this is really a number, the single quotes can sometimes confuse SQL optimizers. There are two ways to go about optimizing this query, depending on the selectivity of the various conditions. The first is to have the indexes:
tag_relation(tag_id, comment_id)
comment(comment_id, status, datetime, text)
The second of these is a covering index for comment, and its most important part is the comment_id column.
The second approach is:
comment(status, comment_id, datetime)
tag_relation(comment_id, tag_id)
The basic issue is which table is scanned first for the join. Using this index, the query would be processed as:
SELECT c.text
FROM comment c
WHERE c.status = 1 AND
EXISTS (SELECT 1
FROM tag_relation tr
WHERE tr.comment_id = c.comment_id AND tr.tag_id = 1022278
)
ORDER BY c.datetime DESC
LIMIT 0, 20;
I'm not 100% sure that this avoids the file sort on the result set, but it might work.
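To check that the EXISTS form returns the same rows as the original join, here is a sketch using Python's sqlite3 as a stand-in for MySQL. The rows are made up, and the index name `ix_tr_tag_comment` is hypothetical. It verifies the results only; whether MySQL avoids the filesort has to be checked with EXPLAIN on the real tables.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, status INTEGER,
                      datetime TEXT, text TEXT);
CREATE TABLE tag_relation (tag_id INTEGER, comment_id INTEGER);
CREATE INDEX ix_tr_tag_comment ON tag_relation (tag_id, comment_id);
INSERT INTO comment VALUES
  (1, 1, '2020-01-01', 'old'), (2, 1, '2020-01-03', 'new'),
  (3, 0, '2020-01-04', 'hidden'), (4, 1, '2020-01-02', 'other tag');
INSERT INTO tag_relation VALUES (1022278, 1), (1022278, 2), (1022278, 3), (5, 4);
""")

join_rows = con.execute("""
SELECT c.text
FROM comment c
JOIN tag_relation tr ON tr.comment_id = c.comment_id
WHERE tr.tag_id = 1022278 AND c.status = 1
ORDER BY c.datetime DESC
LIMIT 20
""").fetchall()

exists_rows = con.execute("""
SELECT c.text
FROM comment c
WHERE c.status = 1
  AND EXISTS (SELECT 1 FROM tag_relation tr
              WHERE tr.comment_id = c.comment_id AND tr.tag_id = 1022278)
ORDER BY c.datetime DESC
LIMIT 20
""").fetchall()

print(join_rows == exists_rows)
```

The two forms agree as long as tag_relation has no duplicate (tag_id, comment_id) pairs. With duplicates, the join would repeat a comment and the EXISTS form would not.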
If I understand correctly, you have one index on tag_id and another on comment_id. Try creating a composite index like:
create index ... on tag_relation(tag_id, comment_id)
This will make the index with tag_id redundant so it can be dropped.
AFAIK MySQL cannot do index anding, but even if it could a composite index would be more efficient.
I think the problem is in the "status" field. Although it is indexed, the index is not being used. It says "using where" for that table. You can force the use of the index for status but I'm not sure it will be useful, depending on selectivity, i.e., how many different values can "status" take. Alternatively, the documentation says that if "status" allows for NULL then you'll see the "using where". Does it allow for NULLs? If so, consider restricting it.
I just noticed that I overlooked the "ORDER BY": comment.datetime will need an index.
If you already have an index, then try a subquery:
SELECT text
FROM tag_relation
INNER JOIN (SELECT c.comment_id, c.text, c.datetime
FROM comment c
WHERE c.status = 1) comment
ON tag_relation.comment_id = comment.comment_id
WHERE tag_id='1022278'
ORDER BY comment.datetime DESC LIMIT 0,20

Why does the query take a long time in mysql even with a LIMIT clause?

Say I have an Order table that has 100+ columns and 1 million rows. It has a PK on OrderID and FK constraint StoreID --> Store.StoreID.
1) select * from `Order` order by OrderID desc limit 10;
The above takes a few milliseconds.
2) select * from `Order` o join `Store` s on s.StoreID = o.StoreID order by OrderID desc limit 10;
This can somehow take many seconds. The more inner joins I add, the slower it gets.
3) select OrderID, column1 from `Order` o join `Store` s on s.StoreID = o.StoreID order by OrderID desc limit 10;
Limiting the columns we select seems to speed up the execution.
There are a few points that I don't understand here, and I would really appreciate it if anyone more knowledgeable about MySQL (or RDBMS query execution in general) could enlighten me.
Query 1 is fast since it's just a reverse lookup by PK and the DB only needs to return the first 10 rows it encounters.
I don't see why Query 2 should take forever. Shouldn't the operation be the same? That is, get the first 10 rows by PK and then join with the other tables. Since there's a FK constraint, it is guaranteed that the relationship will be satisfied. So the DB doesn't need to join more rows than necessary and then trim the result, right? Unless the FK constraint allows a null FK? In which case I guess a left join would make this much faster than an inner join?
Lastly, I'm guessing Query 3 is simply faster because fewer columns are used in those unnecessary joins? But why would the query execution need the other columns while joining? Shouldn't it just join using PKs first, and then get the columns for just the 10 rows?
Thanks!
My understanding is that the MySQL engine applies LIMIT after any joins happen.
From http://dev.mysql.com/doc/refman/5.0/en/select.html, The HAVING clause is applied nearly last, just before items are sent to the client, with no optimization. (LIMIT is applied after HAVING.)
EDIT: You could try using this query to take advantage of the PK speed.
select * from (select * from `Order` order by OrderID desc limit 10) o
join `Store` s on s.StoreID = o.StoreID;
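Here is a small check that limiting first and joining afterwards returns the same rows as the original query, sketched with Python's sqlite3 in place of MySQL. The tables are cut down, the `name` column and the data are made up, and an outer ORDER BY is added because a derived table does not guarantee the order of the final result.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Store (StoreID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY,
                      StoreID INTEGER NOT NULL REFERENCES Store(StoreID));
INSERT INTO Store VALUES (1, 'north'), (2, 'south');
""")
# 100 hypothetical orders, each pointing at an existing store.
con.executemany('INSERT INTO "Order" VALUES (?, ?)',
                [(i, 1 + i % 2) for i in range(1, 101)])

join_then_limit = con.execute("""
SELECT o.OrderID, s.name
FROM "Order" o JOIN Store s ON s.StoreID = o.StoreID
ORDER BY o.OrderID DESC
LIMIT 10
""").fetchall()

limit_then_join = con.execute("""
SELECT o.OrderID, s.name
FROM (SELECT * FROM "Order" ORDER BY OrderID DESC LIMIT 10) o
JOIN Store s ON s.StoreID = o.StoreID
ORDER BY o.OrderID DESC
""").fetchall()

print(join_then_limit == limit_then_join)
```

The results match only because StoreID is NOT NULL and every order has a matching store. If the inner join could drop rows, limiting first would return fewer than 10 rows.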
All of your examples are asking for tablescans of the existing tables, so none of them will be more or less performant than the degree to which mysql can cache the data or results. Some of your queries have order by or join criteria, which can take advantage of indexes purely to make the joining process more efficient, however, that still is not the same as having a set of criteria that will trigger the use of indexes.
LIMIT is not a criterion -- it can be thought of as filtration once a result set is determined. You save time on the client, once the result set is prepared, but not on the server.
Really, the only way to get the answers you are seeking is to become familiar with:
EXPLAIN EXTENDED your_sql_statement
The output of EXPLAIN will show you how many rows are being looked at by mysql, as well as whether or not any indexes are being used.

I'm not sure if I have the correct indexes or if I can improve the speed of my query in MySQL?

My query has a join, and it looks like it's using two indexes which makes it more complicated. I'm not sure if I can improve on this, but I thought I'd ask.
The query produces a list of records whose keywords are similar to those of the record being queried.
Here's my query.
SELECT match_keywords.padid,
COUNT(match_keywords.word) AS matching_words
FROM keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
WHERE match_keywords.word IS NOT NULL
AND current_program_keywords.padid = 25695
GROUP BY match_keywords.padid
ORDER BY matching_words DESC
LIMIT 0, 11
The EXPLAIN
Word is varchar(40).
You can start by removing the IS NOT NULL test, which is redundant: COUNT on the field already skips NULLs, and an equality join never matches a NULL. It also looks like you would want to omit 25695 from match_keywords, otherwise 25695 (or other) would surely show up as the "best" match within your 11 row limit?
SELECT match_keywords.padid,
COUNT(match_keywords.word) AS matching_words
FROM keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
WHERE current_program_keywords.padid = 25695
GROUP BY match_keywords.padid
ORDER BY matching_words DESC
LIMIT 0, 11
Next, consider how you would do it as a person.
1. You would start with a padid (25695) and retrieve all the words for that padid.
2. From that list of words, go back into the table again and, for each matching word, get their padids (assumed to have no duplicates on padid + word).
3. Group the padids together and count them.
4. Order the counts and return the highest 11.
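The manual procedure above can be sketched in a few lines of Python. The `keywords` rows are made up, and excluding the starting padid is the suggestion from earlier in this answer, not something the original query does.

```python
from collections import Counter

# Hypothetical (padid, word) rows standing in for the keywords table.
keywords = [
    (25695, "audio"), (25695, "editor"), (25695, "mp3"),
    (1, "audio"), (1, "editor"), (1, "video"),
    (2, "mp3"),
    (3, "audio"), (3, "editor"), (3, "mp3"),
]

def similar(padid, limit=11):
    # Retrieve all the words for the starting padid.
    words = {w for p, w in keywords if p == padid}
    # Go back into the table: every other padid that shares one of those
    # words contributes one match per shared word.
    counts = Counter(p for p, w in keywords if w in words and p != padid)
    # Order the counts and return the highest ones.
    return counts.most_common(limit)

print(similar(25695))
```

The two lookups go in opposite directions: first from padid to word, then from word to padid. That is why one index leading with each column is worth testing.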
With your list of 3 separate single-column indexes, the first two steps (both involve only 2 columns) will always have to jump from index back to data to get the other column. Covering indexes may help here - create two composite indexes to test
create index ix_keyword_pw on keywords(padid, word);
create index ix_keyword_wp on keywords(word, padid);
With these composite indexes in place, you can remove the single-column indexes on padid and word since they are covered by these two.
Note: You always have to temper SELECT performance against
size of indexes (the more you create the more to store)
insert/update performance (the more indexes, the longer it takes to commit since it has to update the data, then update all indexes)
Try the following. Ensure there is an index on padid and one on word. Using STRAIGHT_JOIN with the tables in this order should make MySQL filter on the padid of the current keyword first, then join to the others. The join condition also excludes a join of the record to itself. Finally, since the inner join matches on equality, checking the current keyword for NULL is enough: a non-NULL word can never join to a NULL value, so there is no need to test every row of the match_keywords alias for NULL.
SELECT STRAIGHT_JOIN
match_keywords.padid,
COUNT(*) AS matching_words
FROM
keywords current_program_keywords
INNER JOIN keywords match_keywords
ON match_keywords.word = current_program_keywords.word
and match_keywords.padid <> 25695
WHERE
current_program_keywords.padid = 25695
AND current_program_keywords.word IS NOT NULL
GROUP BY
match_keywords.padid
ORDER BY
matching_words DESC
LIMIT
0, 11
You should index the following fields (check which table each one belongs to):
match_keywords.padid
current_program_keywords.padid
match_keywords.word
current_program_keywords.word
Hope this helps speed it up.