MySQL Indexing - IN vs. Equals indexing issues

The following queries run almost instantaneously on a MySQL server:
SELECT table_name.id
FROM table_name
WHERE table_name.id in (10000);
SELECT table_name.id
from table_name
where table_name.id = (SELECT table_name.id
FROM table_name
WHERE table_name.id in (10000)
);
But if I change the second query as follows, it takes more than 20 seconds:
SELECT table_name.id
from table_name
where table_name.id in (SELECT table_name.id
FROM table_name
WHERE table_name.id in (10000)
);
Running EXPLAIN gives the following output. It looks like there is some issue with how MySQL uses its indexes when the IN keyword is involved.
For first query:
+----+-------------+---------------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | table_name | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index |
+----+-------------+---------------+-------+---------------+---------+---------+-------+------+-------------+
For second query:
+----+-------------+---------------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | PRIMARY | table_name | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index |
| 2 | SUBQUERY | table_name | const | PRIMARY | PRIMARY | 4 | | 1 | Using index |
+----+-------------+---------------+-------+---------------+---------+---------+-------+------+-------------+
For third query:
+----+--------------------+------------+-------+---------------+---------+---------+-------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+-------+---------------+---------+---------+-------+---------+--------------------------+
| 1 | PRIMARY | table_name | index | NULL | sentTo | 5 | NULL | 6250751 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | table_name | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index |
+----+--------------------+------------+-------+---------------+---------+---------+-------+---------+--------------------------+
I am using InnoDB and have also tried changing the third query to force the use of the index.

With =, the subquery runs only once: it is evaluated as a scalar and its single value is then looked up through the primary key.
With IN, the subquery becomes a dependent subquery (see DEPENDENT SUBQUERY in the EXPLAIN output), so it is re-evaluated for each of the roughly 6 million rows of the outer index scan. That is what hurts performance.
Try to use joins for these cases.
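As a rough sketch of the join approach, the slow IN query could be rewritten along these lines. The aliases t and s are made up for illustration, and this has not been run against the original data, so check the plan with EXPLAIN:

```sql
-- Join against the subquery as a derived table instead of using IN (...).
-- t and s are hypothetical aliases; table_name and id are from the question.
SELECT t.id
FROM table_name AS t
JOIN (
    SELECT id
    FROM table_name
    WHERE id IN (10000)
) AS s ON s.id = t.id;
```

Because id is the primary key, the join cannot produce duplicate rows, so it should return the same result as the IN version.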

Related

Efficiently query on first two digits of indexed int column in MySQL

I have a table (MySQL 8.0.26, InnoDB) containing an indexed column of MEDIUMINTs that denote the date a record was created:
date_created MEDIUMINT NOT NULL
INDEX idx_created (date_created)
E.g., the entry "210516" denotes 2021-05-16.
Are the following queries roughly equally efficient in utilizing the index?
WHERE 210000<=date_created AND date_created<220000,
WHERE date_created DIV 10000 = 21,
WHERE date_created LIKE '21%', and
WHERE LEFT(date_created, 2) = '21'
I am currently using WHERE date_created DIV 10000 = 21 in my code but wonder if I should alter all queries to make them more efficient.
Thanks a lot in advance.
Look at the type column in EXPLAIN. If it says "ALL" it means it must do a table-scan of all the rows, evaluating the condition expression for each row. This is not using the index.
mysql> explain select * from mytable where 21000<=date_created and date_created < 22000;
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | mytable | NULL | range | date_created | date_created | 4 | NULL | 1 | 100.00 | Using index condition |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
mysql> explain select * from mytable where date_created like '21%';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | mytable | NULL | ALL | date_created | NULL | NULL | NULL | 8192 | 11.11 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
mysql> explain select * from mytable where date_created div 10000 = 21;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | mytable | NULL | ALL | NULL | NULL | NULL | NULL | 8192 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
mysql> explain select * from mytable where left(date_created, 2) = '21';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | mytable | NULL | ALL | NULL | NULL | NULL | NULL | 8192 | 100.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
MySQL 8.0 supports expression indexes (functional key parts, available since 8.0.13), which help in a couple of the cases:
mysql> alter table mytable add index expr1 ((left(date_created, 2)));
mysql> explain select * from mytable where left(date_created, 2) = '21';
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| 1 | SIMPLE | mytable | NULL | ref | expr1 | expr1 | 11 | const | 1402 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
mysql> alter table mytable add index expr2 ((date_created DIV 10000));
mysql> explain select * from mytable where date_created div 10000 = 21;
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| 1 | SIMPLE | mytable | NULL | ref | expr2 | expr2 | 5 | const | 1402 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+-------+---------+-------+------+----------+-------+
But expression indexes won't help the LIKE '21%' search, because you'd have to hard-code the value '21%' in the expression for the index definition. You could use that index to search for that value only, not for the value of a different year.
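If adding extra indexes is not desirable, the plain range condition from the question already works for any year, since it can use the existing index on date_created. A minimal sketch, assuming the YYMMDD encoding described in the question (the bounds would typically be computed in application code):

```sql
-- All rows created in 2021: range scan on the ordinary date_created index.
SELECT *
FROM mytable
WHERE date_created >= 210000
  AND date_created <  220000;

-- All rows created in 2022: same shape, different bounds, no new index needed.
SELECT *
FROM mytable
WHERE date_created >= 220000
  AND date_created <  230000;
```

The column is left bare on one side of each comparison, which is what allows the optimizer to consider a range access.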

Does querying int column with string datatype have any performance impact in mysql queries?

Assuming I have a table as:
create table any_table (any_column_1 int, any_column_2 varchar(255));
create index any_table_any_column_1_IDX USING BTREE ON any_table (any_column_1);
(Note: Index type should not matter here)
I was wondering whether querying any_column_1 with an int or a string literal has any impact on performance, i.e. does
select * from any_table where any_column_1 = 12345;
have any differences in terms of performance with this one?
select * from any_table where any_column_1 = '12345';
I have looked around the web and have not found this particular case.
It should be fine to do this either way for an indexed integer column. When you compare an integer column to a constant, the constant value is cast to an integer whether you format it as an integer or a string.
You can confirm this with EXPLAIN. In both cases, the EXPLAIN shows that it will use the index (type: ref indicates an index lookup), and the performance will be the same.
mysql> explain select * from any_table where any_column_1 = 12345;
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_1_IDX | any_table_any_column_1_IDX | 5 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
mysql> explain select * from any_table where any_column_1 = '12345';
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_1_IDX | any_table_any_column_1_IDX | 5 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
If you had indexed the string column in your example, any_column_2, it would make a difference because the collation of a string column must match the collation of the value you compare it to. A string literal will be cast to a compatible collation by default, so it uses the index:
create index any_table_any_column_2_IDX USING BTREE ON any_table (any_column_2);
mysql> explain select * from any_table where any_column_2 = '12345';
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_2_IDX | any_table_any_column_2_IDX | 768 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
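But when you compare the string column to an integer literal, MySQL has to convert each column value to a number for the comparison, and many different strings can convert to the same number. So you get warnings, and the index cannot be used. The EXPLAIN shows type: ALL, so it will do a table-scan, which will perform poorly if you query a table with many rows.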
mysql> explain select * from any_table where any_column_2 = 12345;
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | any_table | NULL | ALL | any_table_any_column_2_IDX | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
1 row in set, 3 warnings (0.00 sec)
mysql> show warnings;
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'any_table_any_column_2_IDX' due to type or collation conversion on field 'any_column_2' |
| Warning | 1739 | Cannot use range access on index 'any_table_any_column_2_IDX' due to type or collation conversion on field 'any_column_2' |
| Note | 1003 | /* select#1 */ select `test2`.`any_table`.`any_column_1` AS `any_column_1`,`test2`.`any_table`.`any_column_2` AS `any_column_2` from `test2`.`any_table` where (`test2`.`any_table`.`any_column_2` = 12345) |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
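To see why the index on the string column cannot serve a numeric comparison, it may help to look at how several distinct strings compare against the same integer. This is only an illustrative sketch (the column aliases are made up); each comparison is expected to evaluate to 1, because the string is converted to a number first:

```sql
-- Different strings, same numeric value: an index ordered by string value
-- cannot locate all of these with a single lookup.
SELECT '12345'   = 12345 AS plain_match,
       ' 12345'  = 12345 AS leading_space,
       '012345'  = 12345 AS leading_zero,
       '12345.0' = 12345 AS with_decimal;
```

Quoting the value in the query (any_column_2 = '12345') keeps it a string-to-string comparison, which is the case that used the index above.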

DELETE statement is not using INDEX on table and executing for long time

We have one huge table with 25M records. When we delete records by passing the values manually, the query uses the index and executes quickly.
Below are details.
MySQL [(none)]> explain DELETE FROM isca51410_octopus_prod_eai.WMSERVICE WHERE contextid in ('1121','1245','5432','12412','1212','7856','2342','1345','5312','2342','3432','5321');
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | range | IDX_BIG_CID | IDX_BIG_CID | 109 | const | 12 | 100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
But when we pass the values through a SELECT subquery, the index is not used and the query takes much longer.
Below is the explain plan.
MySQL [(none)]> explain DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | ALL | NULL | NULL | NULL | NULL | 25730673 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | TABLE_2 | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
Here DATABASE_2.TABLE_2 is a table whose values change every time; its row count will be less than 100.
How can we make the query below use the index IDX_BIG_CID on table DATABASE1_1.BIG_TABLE?
DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
Don't use IN ( SELECT ... ). Use a multi-table DELETE. (See the ref manual.)
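A minimal sketch of what the multi-table DELETE could look like for the tables in the question. The aliases b and t are made up, and it assumes contextid has the same type and collation in both tables (otherwise the index may still not be usable), so run it through EXPLAIN first:

```sql
-- Delete only from BIG_TABLE (alias b); TABLE_2 (alias t) just drives the join.
DELETE b
FROM DATABASE1_1.BIG_TABLE AS b
JOIN DATABASE_2.TABLE_2 AS t
  ON t.contextid = b.contextid;
```

With fewer than 100 rows in TABLE_2, the hope is that the optimizer reads TABLE_2 first and probes IDX_BIG_CID once per row instead of scanning all 25M rows.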

Mysql 5.6 optimizer doesn't use indexes in small tables joins

We have two tables: the first (contact) is relatively big, with 250k rows, and the second (user) is small, with fewer than 10 rows. On MySQL 5.6 I get the following EXPLAIN result:
EXPLAIN SELECT
o0_.id AS id_0,
o8_.first_name,
o8_.last_name
FROM
contact o0_
LEFT JOIN user o8_ ON o0_.user_owner_id = o8_.id
LIMIT
25 OFFSET 100
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 253030 | Using index |
| 1 | SIMPLE | o8_ | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
2 rows in set (0,00 sec)
When I use FORCE INDEX FOR JOIN:
EXPLAIN SELECT
o0_.id AS id_0,
o8_.first_name,
o8_.last_name
FROM
contact o0_
LEFT JOIN user o8_ force index for join(`PRIMARY`) ON o0_.user_owner_id = o8_.id
LIMIT
25 OFFSET 100
or add an index on the fields that appear in the select clause (first_name, last_name) of the user table:
alter table user add index(first_name, last_name);
the EXPLAIN result changes to this:
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 253030 | Using index |
| 1 | SIMPLE | o8_ | eq_ref | PRIMARY | PRIMARY | 4 | o0_.user_owner_id | 1 | NULL |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
2 rows in set (0,00 sec)
On MySQL 5.5 I get the same EXPLAIN result without any additional indexes:
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 255706 | Using index |
| 1 | SIMPLE | o8_ | eq_ref | PRIMARY | PRIMARY | 4 | o0_.user_owner_id | 1 | |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
2 rows in set (0.00 sec)
Why do I need to force the PRIMARY index or add extra indexes on MySQL 5.6?
The same behavior occurs with other selects that join small tables.
If a table has so few rows, it may actually be faster to do a full table scan than to go to an index, locate the records, and then go back to the table. If the user table has other fields apart from the 3 in the query, you may consider adding a covering index, but frankly, I do not think any of this would have a significant effect on the speed of the query.

MySQL index not used

I have a table with an index on two columns (user_id, date)
and the SQL query
select user_id, stat.in, stat.out, stat.time, date
from stat
where user_id in (select id from users force index (street_id) where street_id=30);
or
select user_id, stat.in, stat.out, stat.time, date
from stat where user_id in (select id from users force index (street_id) where street_id=30)
and date between STR_TO_DATE('2010-01-01 00:00:00', '%Y-%m-%d %H:%i:%s') and STR_TO_DATE('2014-05-22 23:59:59', '%Y-%m-%d %H:%i:%s');
In both cases the index should be used, but I think the problem is in the IN clause. If it's possible, how can I make it work?
Explain:
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
| 1 | PRIMARY | stat | ALL | NULL | NULL | NULL | NULL | 32028701 | Using where |
| 2 | DEPENDENT SUBQUERY | users | ref | street_id | street_id | 8 | const | 650 | Using where; Using index |
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
If I search with a single user_id, the index is used:
explain select user_id, stat.in, stat.out, stat.time, date
from stat
where user_id=3991;
Explain:
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
| 1 | SIMPLE | stat | ref | user_id_2 | user_id_2 | 8 | const | 2973 | |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
First of all, the IN clause in the query is what is creating havoc, and if I am not wrong the indexes are not set up properly either.
So here is how it should be done. Let's say the tables are as follows:
create table users (id int, name varchar(100),street_id int);
insert into users values
(1,'a',20),(2,'b',30),(3,'c',10),(4,'d',20),(5,'e',10),(6,'f',40),(7,'g',20),
(8,'h',10),(9,'i',10),(10,'j',40);
create table stat (user_id int ,`in` int, `out` int, time int , date date);
insert into stat values
(1,1,1,20,'2014-01-01'),
(1,1,1,20,'2014-01-02'),
(3,1,1,20,'2014-01-01'),
(2,1,1,20,'2014-01-01'),
(4,1,1,20,'2014-01-02'),
(6,1,1,20,'2014-01-02'),
(7,1,1,20,'2014-01-02'),
(8,1,1,20,'2014-01-02'),
(1,1,1,20,'2014-01-02'),
(2,1,1,20,'2014-01-02'),
(3,1,1,20,'2014-01-03'),
(4,1,1,20,'2014-01-04'),
(5,1,1,20,'2014-01-04'),
(6,1,1,20,'2014-01-04'),
(7,1,1,20,'2014-01-04'),
(2,1,1,20,'2014-01-04'),
(3,1,1,20,'2014-01-04'),
(4,1,1,20,'2014-01-05'),
(5,1,1,20,'2014-01-05'),
(6,1,1,20,'2014-01-05'),
(7,1,1,20,'2014-01-05'),
(8,1,1,20,'2014-01-05'),
(9,1,1,20,'2014-01-05'),
(10,1,1,20,'2014-01-05'),
(1,1,1,20,'2014-01-06'),
(4,1,1,20,'2014-01-06');
Now add some indexes to the tables:
alter table users add index id_idx (id);
alter table users add index street_idx(street_id);
alter table stat add index user_id_idx(user_id);
Now, running EXPLAIN on the same query that you are trying to execute yields:
EXPLAIN
select user_id, stat.`in`, stat.`out`, stat.time, date
from stat
where user_id in (select id from users force index (street_idx) where street_id=30);
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
| 1 | PRIMARY | stat | ALL | NULL | NULL | NULL | NULL | 26 | Using where |
| 2 | DEPENDENT SUBQUERY | users | ref | street_idx | street_idx | 5 | const | 1 | Using where |
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
It still looks like it is trying to scan the entire stat table.
Now let's modify the query to use a JOIN and see what EXPLAIN has to say. Note that I have an index on the joining key in both tables, and that the keys are of the same type and size.
EXPLAIN
select
s.user_id,
s.`in`,
s.`out`,
s.time,
s.date
from stat s
join users u on u.id = s.user_id
where u.street_id=30 ;
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| 1 | SIMPLE | u | ref | id_idx,street_idx | street_idx | 5 | const | 1 | Using where |
| 1 | SIMPLE | s | ref | user_id_idx | user_id_idx | 5 | test.u.id | 3 | Using where |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
Better, huh? Now let's try a range search:
EXPLAIN
select
s.user_id,
s.`in`,
s.`out`,
s.time,
s.date
from stat s
join users u on u.id = s.user_id
where
u.street_id=30
and s.date between '2014-01-01' AND '2014-01-06'
;
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| 1 | SIMPLE | u | ref | id_idx,street_idx | street_idx | 5 | const | 1 | Using where |
| 1 | SIMPLE | s | ref | user_id_idx | user_id_idx | 5 | test.u.id | 3 | Using where |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
Still better, right?
So the takeaway is: try to avoid IN (SELECT ...) queries. Use a JOIN on indexed columns, and index the columns you search on properly.
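One thing the last EXPLAIN does not show is the date range being served by an index, since the test schema above only indexes user_id. A composite index like the one described in the question might help there. This is a sketch only; the index name user_date_idx is made up, and the resulting plan has not been verified:

```sql
-- Composite index: equality column (user_id) first, range column (date) second.
ALTER TABLE stat ADD INDEX user_date_idx (user_id, date);

-- Re-check the plan for the range search after adding the index.
EXPLAIN
SELECT s.user_id, s.`in`, s.`out`, s.time, s.date
FROM stat s
JOIN users u ON u.id = s.user_id
WHERE u.street_id = 30
  AND s.date BETWEEN '2014-01-01' AND '2014-01-06';
```

If the optimizer picks the new index, the key and key_len columns for table s should change accordingly. On a table as small as this test data it may well keep the single-column index.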