Please, help me!
How to optimize a query like:
SELECT idu
FROM `user`
WHERE `username`!='manager'
AND `username`!='user1#yahoo.com'
ORDER BY lastdate DESC
This is the explain:
explain SELECT idu FROM `user` WHERE `username`!='manager' AND `username`!='ser1#yahoo.com' order by lastdate DESC;
+----+-------------+-------+------+----------------------------+------+---------+------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+------+---------+------+--------+-----------------------------+
| 1 | SIMPLE | user | ALL | username,username-lastdate | NULL | NULL | NULL | 208478 | Using where; Using filesort |
+----+-------------+-------+------+----------------------------+------+---------+------+--------+-----------------------------+
1 row in set (0.00 sec)
To avoid file sorting in a big database.
Since this query is just scanning all rows, you need an index on lastdate to avoid MySQL from having to order the results manually (using filesort, which isn't always to disk/temp table).
For super read performance, add the following multi-column "covering" index:
user(lastdate, username, idu)
A "covering" index would allow MySQL to just scan the index instead of the actual table data.
If using InnoDB and any of the above columns are your primary key, you don't need it in the index.
Related
I have a table like this (details elided for readability):
CREATE TABLE UserData (
id bigint NOT NULL AUTO_INCREMENT,
userId bigint NOT NULL DEFAULT '0', ...
c6 int NOT NULL DEFAULT '0', ...
hidden int NOT NULL DEFAULT '0', ...
c22 int NOT NULL DEFAULT '0', ...
PRIMARY KEY (id), ...
KEY userId_hidden_c6_c22_idx (userId,hidden,c6,c22), ...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3
and was happily doing queries on it like this in MySQL 5.7:
mysql> select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----
| ...
+----
10 rows in set (0.03 sec)
However, in MySQL 8.0 these queries started doing this:
mysql> select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----
| ...
+----
10 rows in set (1.56 sec)
Explain shows the following, 5.7:
mysql> explain select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
| 1 | SIMPLE | UserData | NULL | ref | userId_hidden_c6_c22_idx | userId_hidden_c6_c22_idx | 12 | const,const | 78062 | 100.00 | Using index condition; Using filesort |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+---------------------------------------+
8.0:
mysql> explain select * from UserData use index (userId_hidden_c6_c22_idx) where (userId = 123 AND hidden = 0) order by id DESC limit 10 offset 0;
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
| 1 | SIMPLE | UserData | NULL | ref | userId_hidden_c6_c22_idx | userId_hidden_c6_c22_idx | 12 | const,const | 79298 | 100.00 | Using filesort |
+----+-------------+----------+------------+------+---------------+---------------+---------+-------------+-------+----------+----------------+
The main difference seems to be that 5.7 is Using index condition; Using filesort and 8.0 is only Using filesort.
Why did 8.0 stop using the index condition, and how can I get it to start using it?
EDIT: Why did performance drop 10-100x with MySQL 8.0? It looks like it's because it stopped using the Index Condition Pushdown Optimization - how can I get it to start using it?
The table has ~150M rows in it, and that user has ~75k records, so I guess it could be a change in the size-based heuristics or whatever goes into the MySQL decision making?
In the EXPLAIN you show, the type column is ref and the key column names the index, which indicates it is using that index to optimize the lookup.
You are making an incorrect interpretation of what "index condition" means in the extra column. Admittedly it does sound like "using the index" versus not using the index if that note is absent.
The note about "index condition" is referring to Index Condition Pushdown, which is not related to using the index, but it's about delegating other conditions to be filtered at the storage engine level. Read about it here: https://dev.mysql.com/doc/refman/8.0/en/index-condition-pushdown-optimization.html
It's unfortunate that the notes reported by EXPLAIN are so difficult to understand. You really have to study a lot of documentation to understand how to read those notes.
This would be much faster in either version because it would stop after 10 rows. That is, the "filesort" would be avoided.
INDEX(userId, hidden, id)
This won't do "Using index" (aka "covering"), but neither did your attempts. That is different from "Using index condition" (aka "ICP", as you point out).
Try these to get more insight:
EXPLAIN FORMAT_JSON SELECT ...
EXPLAIN ANALYZE SELECT ...
(No, I cannot explain the regression.)
I have the following MYSQL query:
SELECT statusdate,consigneenamee,weight,productcode,
pieces,statustime,statuscode,statusdesc,generatingiata,
shipmentdate,shipmenttime,consigneeairportcode,signatory
FROM notes
where (shipperref='180908184' OR shipperref='180908184'
OR shipperref='180908184' OR shipperref='180908184 '
OR shipperref like 'P_L_%180908184')
order by edicheckpointdate asc, edicheckpointtime asc;
I added an index to speed up this query using the MYSQL Workbench but when executing the EXPLAIN command, it still does not show the key and shows as NULL:
+----+-------------+---------------+------+---------------+------+---------+------+---------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+---------------+------+---------+------+---------+-----------------------------+
| 1 | SIMPLE | dhltracking_2 | ALL | index2 | NULL | NULL | NULL | 3920874 | Using where; Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+---------+-----------------------------+
Any reason why this is happening and how I can speed up this query?
My Index:
You have LIKE statement in your query and I think that your index spans on more than 20-30% of table rows (or more..) and that's why MySQL can ignore it for performance reasons.
My proposal:
Add FORCE INDEX as #Solarflare proposes
Use FULLTEXT index (works on CHAR and VARCHAR also) and use MATCH ... AGAINST search (https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html)
Sorry that this is such a specific and probably cliche question, but it is really causing me major problems.
Everyday I have to do several hundred thousands select statements that look like these two (this is one example but they're all pretty much the same just with different word1):
SELECT pibn,COUNT(*) AS aaa FROM research_storage1
USE INDEX (word21pibn)
WHERE word1=270299 AND word2=0
GROUP BY pibn
ORDER BY aaa DESC
LIMIT 1000;
SELECT pibn,page FROM research_storage1
USE INDEX (word12num)
WHERE word1=270299 AND word2=0
ORDER BY num DESC
LIMIT 1000;
The first statement is quick-as-a-flash and takes a fraction of a second. The second statement takes about 2 seconds, which is way too long considering I have hundreds of thousands to do.
The indexes are:
word21pibn: word2, word1, pibn
word12num: word1, word2, num
The results of explain (for both extended and partitions are):
mysql> explain extended SELECT pibn,COUNT(*) AS aaa FROM research_storage1 USE INDEX (word21pibn) WHERE word1=270299 AND word2=0 GROUP BY pibn ORDER BY aaa DESC LIMIT 1000;
+----+-------------+-------------------+------+---------------+------------+---------+-------------+------+----------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------+------+---------------+------------+---------+-------------+------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | research_storage1 | ref | word21pibn | word21pibn | 6 | const,const | 1549 | 100.00 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-------------------+------+---------------+------------+---------+-------------+------+----------+-----------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain partitions SELECT pibn,COUNT(*) AS aaa FROM research_storage1 USE INDEX (word21pibn) WHERE word1=270299 AND word2=0 GROUP BY pibn ORDER BY aaa DESC LIMIT 1000;
+----+-------------+-------------------+------------+------+---------------+------------+---------+-------------+------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+------------+------+---------------+------------+---------+-------------+------+-----------------------------------------------------------+
| 1 | SIMPLE | research_storage1 | p99 | ref | word21pibn | word21pibn | 6 | const,const | 1549 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-------------------+------------+------+---------------+------------+---------+-------------+------+-----------------------------------------------------------+
1 row in set (0.00 sec)
mysql> explain extended SELECT pibn,page FROM research_storage1 USE INDEX (word12num) WHERE word1=270299 AND word2=0 ORDER BY num DESC LIMIT 1000;
+----+-------------+-------------------+------+---------------+-----------+---------+-------------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------+------+---------------+-----------+---------+-------------+------+----------+-------------+
| 1 | SIMPLE | research_storage1 | ref | word12num | word12num | 6 | const,const | 818 | 100.00 | Using where |
+----+-------------+-------------------+------+---------------+-----------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain partitions SELECT pibn,page FROM research_storage1 USE INDEX (word12num) WHERE word1=270299 AND word2=0 ORDER BY num DESC LIMIT 1000;
+----+-------------+-------------------+------------+------+---------------+-----------+---------+-------------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+------------+------+---------------+-----------+---------+-------------+------+-------------+
| 1 | SIMPLE | research_storage1 | p99 | ref | word12num | word12num | 6 | const,const | 818 | Using where |
+----+-------------+-------------------+------------+------+---------------+-----------+---------+-------------+------+-------------+
1 row in set (0.00 sec)
The only difference I see is that the second statement does not have Using index in the extra column of describe. Though this does not make sense because the index was designed for that statement, so I don't see why it would not be used.
Any idea?
Try adding the pbin and page column to the word12num compound index. Then all the information you need for your query will be in the index, like it is in your first query.
Edit I missed the pbin column you're selecting; sorry about that.
If your compound index turns out to contain (word1, word2, num, pbin, page) then everything in your second query can come from the index.
If you look at the Extra column under your first query's EXPLAIN, one of the blurbs in there is Using index. #sebas pointed this out. This means, actually, Using index only. This means the server can satisfy your query by just consulting the index without having to consult the table. That's why it is so fast: the server doesn't have to bang the disk heads around random-accessing the table to get the extra columns. Using index is not present in your second query's EXPLAIN.
The columns mentioned in WHERE come first. Then we have the columns in ORDER BY. Finally we have the columns you're simply SELECTing. Why use this particular order for columns in the index? The server finds its way to the first index entry matching the SELECT, then can read the index sequentially to satisfy the query.
It is indeed expensive to construct and maintain a compound index on a big table. You are looking at a basic tradeoff in DBMS design: do you want to spend time constructing the table or looking things up in it? Only you know whether it's better to incur the cost when building the table or when looking things up in it.
Which Tables and columns should have indexes? I have an index on flow_permanent_id and entry_id, but it seems to be using filesort?
What can I do to optimize this?
mysql> explain SELECT COUNT(*) AS count_all, entry_id AS entry_id FROM `votes` WHERE `votes`.`flow_permanent_id` = '4fab490cdc1c82cfa800000a' GROUP BY entry_id;
+----+-------------+-------+------+----------------------------------+----------------------------------+---------+-------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------------+----------------------------------+---------+-------+------+----------------------------------------------+
| 1 | SIMPLE | votes | ref | index_votes_on_flow_permanent_id | index_votes_on_flow_permanent_id | 74 | const | 1 | Using where; Using temporary; Using filesort |
+----+-------------+-------+------+----------------------------------+----------------------------------+---------+-------+------+----------------------------------------------+
1 row in set (0.00 sec)
To avoid the filesort, you'll want a composite index on (flow_permanent_id, entry_id) so that MySQL can use the index for both the WHERE and the GROUP BY.
First of all - use COUNT(1) instead of COUNT( * ), in some cases that may improve performance. Never use COUNT(*).
Second: it looks like you index is used, it's listed in the 'key' column of EXPLAIN output.
Your "GROUP BY" is the thing that's causing filesort. Usually it's ORDER BY that's the culprit, but I've seen GROUP BY can do that as well.
Hope this helps.
Please consider following schema
CREATE table articles (
id Int UNSIGNED NOT NULL AUTO_INCREMENT,
cat_id Int UNSIGNED NOT NULL,
status Int UNSIGNED NOT NULL,
date_added Datetime,
Primary Key (id)) ENGINE = InnoDB;
CREATE INDEX cat_list_INX ON articles (cat_id, status, date_added);
CREATE INDEX categories_list_INX ON articles (cat_id, status);
I have written following two queries which can be completely satisfied by the above two indicies but MySQL is putting where in extra column.
mysql> EXPLAIN SELECT cat_id FROM articles USE INDEX (cat_list_INX) WHERE cat_id=158 AND status=2 ORDER BY date_added DESC LIMIT 500, 5;
+----+-------------+----------+------+---------------+--------------+---------+-------------+-------+--------------------------+
| id | select_type | table ref | | type | possible_keys | key | key_len | rows | Extra |
+----+-------------+----------+------+---------------+--------------+---------+-------------+-------+--------------------------+
| 1 | SIMPLE | articles | ref | cat_list_INX | cat_list_INX | 5 | const,const | 50698 | Using where; Using index |
+----+-------------+----------+------+---------------+--------------+---------+-------------+-------+--------------------------+
mysql> EXPLAIN SELECT cat_id FROM articles USE INDEX (categories_list_INX) WHERE cat_id=158 AND status=2;
+----+-------------+----------+------+---------------------+---------------------+---------+-------------+-------+--------------------------+
| id | select_type | tab key |le | type | possible_keys | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------------+---------------------+---------+-------------+-------+--------------------------+
| 1 | SIMPLE | articles | ref | categories_list_INX | categories_list_INX | 5 | const,const | 52710 | Using where; Using index |
+----+-------------+----------+------+---------------------+---------------------+---------+-------------+-------+--------------------------+
As far as I know where requires an additional disk seek. Why it's not just using index?
The first query is filtering records at the mysql level outside of the storage engine because of your "ORDER BY" clause using date_added field.
This can be mitigated by moving the date_added field first in the index like this
CREATE INDEX cat_list_INX ON articles (date_added, cat_id, status);
The 2nd query - my version of mysql is not showing a "Using where" - I would not expect to either - maybe its because I have no records.
mysql> EXPLAIN SELECT cat_id FROM articles USE INDEX (categories_list_INX) WHERE cat_id=158 AND status=2;
+----+-------------+----------+------+---------------------+---------------------+---------+-------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------------+---------------------+---------+-------------+------+-------------+
| 1 | SIMPLE | articles | ref | categories_list_INX | categories_list_INX | 8 | const,const | 1 | Using index |
+----+-------------+----------+------+---------------------+---------------------+---------+-------------+------+-------------+
1 row in set (0.00 sec)
Extra column info from High Performance MySQL:
Using Index: This indicates that MySQL will use a covering index to avoid accessing the table. Don't confuse covering indexes with the index access type.
Using Where: This means the MySQL server will post-filter rows after the storage engine retrieves them. Many WHERE conditions that involve columns in an index can be checked by the storage engine when (and if) it reads the index, so not all queries with a WHERE clause will show "Using where". Sometimes the presence of "Using where" is a hint that the query can benefit from different indexing.