Assuming I have a table as:
create table any_table (any_column_1 int, any_column_2 varchar(255));
create index any_table_any_column_1_IDX USING BTREE ON any_table (any_column_1);
(Note: Index type should not matter here)
I was wondering if querying any_column with int or string have any impact on performance, i.e. does
select * from any_table where any_column_1 = 12345;
have any differences in terms of performance with this one?
select * from any_table where any_column_1 = '12345';
I have looked around the web and really have not faced this particular case.
It should be fine to do this either way for an indexed integer column. When you compare an integer column to a constant, the constant value is cast to an integer whether you format it as an integer or a string.
You can confirm this with EXPLAIN. In both cases, the EXPLAIN shows that it will use the index (type: ref indicates an index lookup), and the performance will be the same.
mysql> explain select * from any_table where any_column_1 = 12345;
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_1_IDX | any_table_any_column_1_IDX | 5 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
mysql> explain select * from any_table where any_column_1 = '12345';
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_1_IDX | any_table_any_column_1_IDX | 5 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
If you had indexed the string column in your example, any_column_2, it would make a difference because the collation of a string column must match the collation of the value you compare it to. A string literal will be cast to a compatible collation by default, so it uses the index:
create index any_table_any_column_2_IDX USING BTREE ON any_table (any_column_2);
mysql> explain select * from any_table where any_column_2 = '12345';
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_2_IDX | any_table_any_column_2_IDX | 768 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
But an integer literal has no collation, so you get warnings, and the index cannot be used. The EXPLAIN shows type: ALL so it will do a table-scan and that will have poor performance if you query a table with many rows.
mysql> explain select * from any_table where any_column_2 = 12345;
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | any_table | NULL | ALL | any_table_any_column_2_IDX | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
1 row in set, 3 warnings (0.00 sec)
mysql> show warnings;
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'any_table_any_column_2_IDX' due to type or collation conversion on field 'any_column_2' |
| Warning | 1739 | Cannot use range access on index 'any_table_any_column_2_IDX' due to type or collation conversion on field 'any_column_2' |
| Note | 1003 | /* select#1 */ select `test2`.`any_table`.`any_column_1` AS `any_column_1`,`test2`.`any_table`.`any_column_2` AS `any_column_2` from `test2`.`any_table` where (`test2`.`any_table`.`any_column_2` = 12345) |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
There is one huge table which is having 25M records and when we try to delete the records by manually passing the value it is using the INDEX and query is executing faster.
Below are details.
MySQL [(none)]> explain DELETE FROM isca51410_octopus_prod_eai.WMSERVICE WHERE contextid in ('1121','1245','5432','12412','1212','7856','2342','1345','5312','2342','3432','5321');
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | range | IDX_BIG_CID | IDX_BIG_CID | 109 | const | 12 | 100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
But when we try to pass the values by using select query it is not using index and query is executing for more time.
Below is the explain plan.
MySQL [(none)]> explain DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | ALL | NULL | NULL | NULL | NULL | 25730673 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | TABLE_2 | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
Here DATABASE_2.TABLE_2 is a table where the values will change everytime and row count will be less than 100.
How to make use of index IDX_BIG_CID on table DATABASE1_1.BIG_TABLE for the below query
DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
Don't use IN ( SELECT ... ). Use a multi-table DELETE. (See the ref manual.)
SELECT id, name, detail FROM student WHERE id NOT IN (1,788,103,100) ORDER BY id DESC LIMIT 1000,10
The table is tiny (10,000 rows). I have to consider two point, "IN query" and "LIMIT query".
Here are the DDLs and the EXPLAIN. I'm using MySQL 5.6.4.
CREATE TABLE student
( id int(11) NOT NULL AUTO_INCREMENT
, name varchar(45) NOT NULL
, detail varchar(255) NOT NULL
, PRIMARY KEY (id)
) ENGINE = MyISAM;
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | student| ALL | Primary,id | NULL | NULL | NULL | 13 | |
The LIMIT and ORDER BY clauses mean that the query has to build the whole table and then order it and then go the record 1000 and then extract the next 10 records.
Why are you looking for 10 records starting at record 1000?
Removing the ORDER BY clause would make it faster as the query would only need to extract 1010 records.
I cannot replicate this finding...
SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.5.16 |
+-----------+
SELECT COUNT(*) FROM student;
+----------+
| COUNT(*) |
+----------+
| 131072 |
+----------+
SELECT id
FROM student
WHERE id
NOT IN (1,788,103,100)
ORDER
BY id DESC
LIMIT 1000,10;
+--------+
| id |
+--------+
| 195591 |
| 195590 |
| 195589 |
| 195588 |
| 195587 |
| 195586 |
| 195585 |
| 195584 |
| 195583 |
| 195582 |
+--------+
10 rows in set (0.00 sec)
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | student | range | PRIMARY | PRIMARY | 4 | NULL | 131069 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
have some table with index for two columns (user_id,date)
and SQL query
select user_id, stat.in, stat.out, stat.time, date
from stat
where user_id in (select id from users force index (street_id) where street_id=30);
or
select user_id, stat.in, stat.out, stat.time, date
from stat where user_id in (select id from users force index (street_id) where street_id=30)
and date between STR_TO_DATE('2010-01-01 00:00:00', '%Y-%m-%d %H:%i:%s') and TR_TO_DATE('2014-05-22 23:59:59', '%Y-%m-%d %H:%i:%s')
In two case index must work, but I sink problem in in statement. If it's possible, how make it work?
Explain:
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
| 1 | PRIMARY | stat | ALL | NULL | NULL | NULL | NULL | 32028701 | Using where |
| 2 | DEPENDENT SUBQUERY | users | ref | street_id | street_id | 8 | const | 650 | Using where; Using index |
+----+--------------------+-------+------+---------------+-----------+---------+-------+----------+--------------------------+
if search with one user_id index work
explain select user_id, stat.in, stat.out, stat.time, date
from stat
where user_id=3991;
Explain:
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
| 1 | SIMPLE | stat | ref | user_id_2 | user_id_2 | 8 | const | 2973 | |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-------+
First thing in the query the IN clause is creating havoc and if I am not wrong the indexes are not done properly.
So here is how it should be lets say the tables are as
create table users (id int, name varchar(100),street_id int);
insert into users values
(1,'a',20),(2,'b',30),(3,'c',10),(4,'d',20),(5,'e',10),(6,'f',40),(7,'g',20),
(8,'h',10),(9,'i',10),(10,'j',40);
create table stat (user_id int ,`in` int, `out` int, time int , date date);
insert into stat values
(1,1,1,20,'2014-01-01'),
(1,1,1,20,'2014-01-02'),
(3,1,1,20,'2014-01-01'),
(2,1,1,20,'2014-01-01'),
(4,1,1,20,'2014-01-02'),
(6,1,1,20,'2014-01-02'),
(7,1,1,20,'2014-01-02'),
(8,1,1,20,'2014-01-02'),
(1,1,1,20,'2014-01-02'),
(2,1,1,20,'2014-01-02'),
(3,1,1,20,'2014-01-03'),
(4,1,1,20,'2014-01-04'),
(5,1,1,20,'2014-01-04'),
(6,1,1,20,'2014-01-04'),
(7,1,1,20,'2014-01-04'),
(2,1,1,20,'2014-01-04'),
(3,1,1,20,'2014-01-04'),
(4,1,1,20,'2014-01-05'),
(5,1,1,20,'2014-01-05'),
(6,1,1,20,'2014-01-05'),
(7,1,1,20,'2014-01-05'),
(8,1,1,20,'2014-01-05'),
(9,1,1,20,'2014-01-05'),
(10,1,1,20,'2014-01-05'),
(1,1,1,20,'2014-01-06'),
(4,1,1,20,'2014-01-06');
Now add some indexes on the table
alter table users add index id_idx (id);
alter table users add index street_idx(street_id);
alter table stat add index user_id_idx(user_id);
Now if we execute the same query that you are trying to do using explain yields
EXPLAIN
select user_id, stat.`in`, stat.`out`, stat.time, date
from stat
where user_id in (select id from users force index (street_id) where street_id=30);
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
| 1 | PRIMARY | stat | ALL | NULL | NULL | NULL | NULL | 26 | Using where |
| 2 | DEPENDENT SUBQUERY | users | ref | street_idx | street_idx | 5 | const | 1 | Using where |
+----+--------------------+-------+------+---------------+------------+---------+-------+------+-------------+
It still looks like trying to scan the entire table.
Now lets modify the query and use JOIN and see what explain has to say, note that I have index on both table for the joining key and which are of same type and size.
EXPLAIN
select
s.user_id,
s.`in`,
s.`out`,
s.time,
s.date
from stat s
join users u on u.id = s.user_id
where u.street_id=30 ;
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| 1 | SIMPLE | u | ref | id_idx,street_idx | street_idx | 5 | const | 1 | Using where |
| 1 | SIMPLE | s | ref | user_id_idx | user_id_idx | 5 | test.u.id | 3 | Using where |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
Better hun ?? Now lets try a range search
EXPLAIN
select
s.user_id,
s.`in`,
s.`out`,
s.time,
s.date
from stat s
join users u on u.id = s.user_id
where
u.street_id=30
and s.date between '2014-01-01' AND '2014-01-06'
;
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
| 1 | SIMPLE | u | ref | id_idx,street_idx | street_idx | 5 | const | 1 | Using where |
| 1 | SIMPLE | s | ref | user_id_idx | user_id_idx | 5 | test.u.id | 3 | Using where |
+----+-------------+-------+------+-------------------+-------------+---------+-----------+------+-------------+
Still better right ??
So the underlying agenda is try avoiding IN queries. Use JOIN on indexed column and for search columns indexed them properly.