How to improve this sql query performance? - mysql

SELECT id, name, detail FROM student WHERE id NOT IN (1,788,103,100) ORDER BY id DESC LIMIT 1000,10
The table is tiny (10,000 rows). I have to consider two point, "IN query" and "LIMIT query".
Here are the DDLs and the EXPLAIN. I'm using MySQL 5.6.4.
CREATE TABLE student
( id int(11) NOT NULL AUTO_INCREMENT
, name varchar(45) NOT NULL
, detail varchar(255) NOT NULL
, PRIMARY KEY (id)
) ENGINE = MyISAM;
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | student| ALL | Primary,id | NULL | NULL | NULL | 13 | |

The LIMIT and ORDER BY clauses mean that the query has to build the whole table and then order it and then go the record 1000 and then extract the next 10 records.
Why are you looking for 10 records starting at record 1000?
Removing the ORDER BY clause would make it faster as the query would only need to extract 1010 records.

I cannot replicate this finding...
SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.5.16 |
+-----------+
SELECT COUNT(*) FROM student;
+----------+
| COUNT(*) |
+----------+
| 131072 |
+----------+
SELECT id
FROM student
WHERE id
NOT IN (1,788,103,100)
ORDER
BY id DESC
LIMIT 1000,10;
+--------+
| id |
+--------+
| 195591 |
| 195590 |
| 195589 |
| 195588 |
| 195587 |
| 195586 |
| 195585 |
| 195584 |
| 195583 |
| 195582 |
+--------+
10 rows in set (0.00 sec)
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | student | range | PRIMARY | PRIMARY | 4 | NULL | 131069 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+

Related

Efficiently query on first two digits of indexed int column in MySQL

I have a table (MySQL 8.0.26, InnoDB) containing an indexed column of MEDIUMINTs that denote the date a record was created:
date_created MEDIUMINT NOT NULL
INDEX idx_created (date_created)
E.g., the entry "210516" denotes 2021-05-16.
Are the following queries roughly equally efficient in utilizing the index?
WHERE 210000<=date_created AND date_created<220000,
WHERE date_created DIV 10000 = 21,
WHERE date_created LIKE '21%', and
WHERE LEFT(date_created, 2) = '21'
I am currently using WHERE date_created DIV 10000 = 21 in my code but wonder if I should alter all queries to make them more efficient.
Thanks a lot in advance.
Look at the type column in EXPLAIN. If it says "ALL" it means it must do a table-scan of all the rows, evaluating the condition expression for each row. This is not using the index.
mysql> explain select * from mytable where 21000<=date_created and date_created < 22000;
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | mytable | NULL | range | date_created | date_created | 4 | NULL | 1 | 100.00 | Using index condition |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
mysql> explain select * from mytable where date_created like '21%';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | mytable | NULL | ALL | date_created | NULL | NULL | NULL | 8192 | 11.11 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
mysql> explain select * from mytable where date_created div 10000 = 21;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | mytable | NULL | ALL | NULL | NULL | NULL | NULL | 8192 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
mysql> explain select * from mytable where left(date_created, 2) = '21';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | mytable | NULL | ALL | NULL | NULL | NULL | NULL | 8192 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
MySQL 8.0 supports expression indexes, which helps a couple of the cases:
mysql> alter table mytable add index expr1 ((left(date_created, 2)));
mysql> explain select * from mytable where left(date_created, 2) = '21';
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| 1 | SIMPLE | mytable | NULL | ref | expr1 | expr1 | 11 | const | 1402 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
mysql> alter table mytable add index expr2 ((date_created DIV 10000));
mysql> explain select * from mytable where date_created div 10000 = 21;
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| 1 | SIMPLE | mytable | NULL | ref | expr2 | expr2 | 5 | const | 1402 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
But expression indexes won't help the LIKE '21%' search, because you'd have to hard-code the value '21%' in the expression for the index definition. You could use that index to search for that value only, not for the value of a different year.

How to make an efficient UPDATE like my SELECT in MariaDB

Background
I made a small table of 10 rows from a previous SELECT already ran (SavedAnimals).
I have a massive table (animals) which I would like to UPDATE using the rows with the same id as each row in my new table.
What I have tried so far
I can quickly SELECT the desired rows from the big table like this:
mysql> EXPLAIN SELECT * FROM animals WHERE ignored=0 and id IN (SELECT animal_id FROM SavedAnimals);
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 10 | |
| 1 | PRIMARY | animals | eq_ref | PRIMARY | PRIMARY | 8 | db_staging.SavedAnimals.animal_id | 1 | Using where |
| 2 | MATERIALIZED | SavedAnimals | ALL | NULL | NULL | NULL | NULL | 10 | |
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
But the "same" command on the UPDATE is not quick:
mysql> EXPLAIN UPDATE animals SET ignored=1, ignored_when=CURRENT_TIMESTAMP WHERE ignored=0 and id IN (SELECT animal_id FROM SavedAnimals);
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
| 1 | PRIMARY | animals | index | NULL | PRIMARY | 8 | NULL | 34269464 | Using where |
| 2 | DEPENDENT SUBQUERY | SavedAnimals | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
2 rows in set (0.00 sec)
The UPDATE command never finishes if I run it.
QUESTION
How do I make mariaDB run with the Materialized select_type on the UPDATE like it does on the SELECT?
OR
Is there a totally separate way that I should approach this which would be quick?
Notes
Version: 10.3.23-MariaDB-log
Use JOIN rather than WHERE...IN. MySQL tends to optimize them better.
UPDATE animals AS a
JOIN SavedAnimals AS sa ON a.id = sa.animal_id
SET a.ignored=1, a.ignored_when=CURRENT_TIMESTAMP
WHERE a.ignored = 0
You should find an EXISTS clause more efficient than an IN clause. For example:
UPDATE animals a
SET a.ignored = 1,
a.ignored_when = CURRENT_TIMESTAMP
WHERE a.ignored = 0
AND EXISTS (SELECT * FROM SavedAnimals sa WHERE sa.animal_id = a.id)

DELETE statement is not using INDEX on table and executing for long time

There is one huge table which is having 25M records and when we try to delete the records by manually passing the value it is using the INDEX and query is executing faster.
Below are details.
MySQL [(none)]> explain DELETE FROM isca51410_octopus_prod_eai.WMSERVICE WHERE contextid in ('1121','1245','5432','12412','1212','7856','2342','1345','5312','2342','3432','5321');
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | range | IDX_BIG_CID | IDX_BIG_CID | 109 | const | 12 | 100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
But when we try to pass the values by using select query it is not using index and query is executing for more time.
Below is the explain plan.
MySQL [(none)]> explain DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | ALL | NULL | NULL | NULL | NULL | 25730673 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | TABLE_2 | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
Here DATABASE_2.TABLE_2 is a table where the values will change everytime and row count will be less than 100.
How to make use of index IDX_BIG_CID on table DATABASE1_1.BIG_TABLE for the below query
DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
Don't use IN ( SELECT ... ). Use a multi-table DELETE. (See the ref manual.)

MySQL index on timestamp column not used for large date ranges

I have table as
+-------------------+----------------+------+-----+---------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------------+------+-----+---------------------+-----------------------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| runtime_id | bigint(20) | NO | MUL | NULL | |
| place_id | bigint(20) | NO | MUL | NULL | |
| amended_timestamp | varchar(50) | YES | | NULL | |
| applicable_at | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| schedule_time | timestamp | NO | MUL | 0000-00-00 00:00:00 | |
| quality_indicator | varchar(10) | NO | | NULL | |
| flow_rate | decimal(15,10) | NO | | NULL | |
+-------------------+----------------+------+-----+---------------------+-----------------------------+
I have index on schedule_time as
create index table_index on table(schedule_time asc);
The table currently has 2121552+ records.
The thing I fail to understand is when I do explain
explain select runtime_id from table where schedule_time >= now() - INTERVAL 1 DAY;
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
| 1 | SIMPLE | table | range | table_index | table_index | 4 | NULL | 38088 | Using where |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
1 row in set (0.00 sec)
Above index is used, but the below one not.
mysql> explain select runtime_id from table where schedule_time >= now() - INTERVAL 30 DAY;
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | table | ALL | table_index | NULL | NULL | NULL | 2118107 | Using where |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
I'll really appreciate if someone can point out whats wrong here, as the data is updated every 12 minutes and as the time passes by query for 30 days or may be 60 days will get very slow.
The final query where I plan to use it is as follows
select avg(flow_rate),c.group from table a ,(select runtime_id from table where schedule_time >= now() - INTERVAL 1 DAY group by schedule_time ) b,place c where a.runtime_id = b.runtime_id and a.place_id = c.id group by c.group;
Update =====>
As per the comments between fails too.
mysql> explain select runtime_id from table where schedule_time between '2013-07-17 12:48:00' and '2013-08-17 12:48:00';
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | table | ALL | table_index | NULL | NULL | NULL | 2118431 | Using where |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
mysql> explain select runtime_id from table where schedule_time between '2013-08-16 12:48:00' and '2013-08-17 12:48:00';
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
| 1 | SIMPLE | table | range | table_index | table_index | 4 | NULL | 38770 | Using where |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
1 row in set (0.00 sec)
Update 2 =======>
mysql> select count(*) from table where schedule_time between '2013-08-16 12:48:00' and '2013-08-17 12:48:00';
+----------+
| count(*) |
+----------+
| 19440 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from table where schedule_time between '2013-07-17 12:48:00' and '2013-08-17 12:48:00';
+----------+
| count(*) |
+----------+
| 597132 |
+----------+
1 row in set (0.00 sec)
Server version: 5.5.24-0ubuntu0.12.04.1 (Ubuntu)
The MySQL optimizer tries to do the fastest thing. Where it thinks that using the index will take as long or longer than doing a table scan, it abandons the available index.
This is what you see it doing in your examples:
where the range is small (1 day) the index will be faster;
where the range is large, you're going to be hitting so much more of the table you might as well scan the table directly (remember, using the index involves searching the index and then grabbing the indexed records from the table - two sets of seeks).
If you think you know better than the optimizer (it isn't perfect), use INDEX hints:
The USE INDEX (index_list) hint tells MySQL to use only one of the
named indexes to find rows in the table. The alternative syntax IGNORE
INDEX (index_list) tells MySQL to not use some particular index or
indexes. These hints are useful if EXPLAIN shows that MySQL is using
the wrong index from the list of possible indexes.

Particular index not used by MySQL

I'm trying to run the following query in my database:
SELECT * FROM ts_cards WHERE ( cardstatus= 2 OR cardstatus= 3 ) AND ( cardtype= 1 OR cardtype= 2 ) ORDER BY cardserial DESC LIMIT 10;
All three fields (cardstatus, cardtype and cardserial) are indexed:
mysql> SHOW INDEX FROM ts_cards;
+----------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| ts_cards | 0 | PRIMARY | 1 | card_id | A | 15000134 | NULL | NULL | | BTREE | |
| ts_cards | 1 | CardID | 1 | cardserial | A | 15000134 | NULL | NULL | | BTREE | |
| ts_cards | 1 | CardType | 1 | cardtype | A | 17 | NULL | NULL | | BTREE | |
| ts_cards | 1 | CardHolder | 1 | cardstatusholder | A | 17 | NULL | NULL | | BTREE | |
| ts_cards | 1 | CardExpiration | 1 | cardexpiredstatus | A | 17 | NULL | NULL | | BTREE | |
| ts_cards | 1 | CardStatus | 1 | cardstatus | A | 17 | NULL | NULL | | BTREE | |
+----------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.22 sec)
(Yes, I know the index's names suck)
However, by default, MySQL uses only cardstatus' index:
mysql> EXPLAIN SELECT * FROM `ts_cards` WHERE ( cardstatus= 2 OR cardstatus= 3 ) AND ( cardtype= 1 OR cardtype= 2 ) ORDER BY cardserial DESC LIMIT 10;
+----+-------------+----------+-------+---------------------+------------+---------+------+---------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------------+------------+---------+------+---------+-----------------------------+
| 1 | SIMPLE | ts_cards | range | CardType,CardStatus | CardStatus | 1 | NULL | 3215967 | Using where; Using filesort |
+----+-------------+----------+-------+---------------------+------------+---------+------+---------+-----------------------------+
1 row in set (0.00 sec)
(It doesn't even consider the index on cardserial but I guess that's another problem.)
Using "USE KEY" or "FORCE KEY" can make it use cardtype's index, but not both cardtype and cardstatus:
mysql> EXPLAIN SELECT * FROM `ts_cards` FORCE KEY (CardType) WHERE ( cardstatus= 2 OR cardstatus= 3 ) AND ( cardtype= 1 OR cardtype= 2 ) ORDER BY cardserial DESC LIMIT 10;
+----+-------------+----------+-------+---------------+----------+---------+------+---------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+----------+---------+------+---------+-----------------------------+
| 1 | SIMPLE | ts_cards | range | CardType | CardType | 1 | NULL | 6084861 | Using where; Using filesort |
+----+-------------+----------+-------+---------------+----------+---------+------+---------+-----------------------------+
1 row in set (0.00 sec)
mysql> EXPLAIN SELECT * FROM `ts_cards` FORCE KEY (CardType,CardStatus) WHERE ( cardstatus= 2 OR cardstatus= 3 ) AND ( cardtype= 1 OR cardtype= 2 ) ORDER BY cardserial DESC LIMIT 10;
+----+-------------+----------+-------+---------------------+------------+---------+------+---------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------------+------------+---------+------+---------+-----------------------------+
| 1 | SIMPLE | ts_cards | range | CardType,CardStatus | CardStatus | 1 | NULL | 3215967 | Using where; Using filesort |
+----+-------------+----------+-------+---------------------+------------+---------+------+---------+-----------------------------+
1 row in set (0.00 sec)
How can I force MySQL to use BOTH indexes to speed up the query? Both cardtype and cardstatus indexes seem to be defined in the same way yet cardstatus seems to take precedence over cardtype.
IIRC, MySQL cannot use two distinct indexes in the same query. To make use of both indexes, MySQL would need to merge them into one (link to manual). Here is an example if such merge (click on "View Execution Plan"). Notice the "index_merge" of the first SELECT.
Disclaimer: I'm not absolutely sure about the above information.
In your case, despite your hints, the optimizer still considers that the direct scanning of the second table is faster than merging indexes (your tables probably have a very large number of rows, hence a very large, costly-to-manipulate index).
I advise:
ALTER TABLE ADD INDEX CardTypeStatus (cardtype, cardstatus);
This creates an index on both columns. Your query will probably be able to use this index. You may want to drop your CardType index afterwards: queries can still use the two-column index even if they search on the cardtype column only (but not if they search on cardstatus only).
More information about multiple-column indexes: http://dev.mysql.com/doc/refman/5.5/en/multiple-column-indexes.html