What I'm trying to do is index the first name of a person and the date they were born.
The table is laid out like this:
CREATE TABLE test
(
id INT NOT NULL AUTO_INCREMENT,
fname VARCHAR(10) NOT NULL,
sname VARCHAR(10) NOT NULL,
age INT NOT NULL,
born DATETIME NOT NULL,
PRIMARY KEY(id),
INDEX name_age(fname, sname, age),
INDEX name_date(fname, born)
)
However the index isn't recognised in a where statement like so:
mysql> EXPLAIN SELECT *
-> FROM `test`
-> WHERE fname = "coby"
-> AND born
-> BETWEEN "1900-05-02 06:23:00"
-> AND "2100-05-02 06:23:00";
+----+-------------+-------+------+--------------------+----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------+----------+---------+-------+------+-------------+
| 1 | SIMPLE | test | ref | name_age,name_date | name_age | 12 | const | 45 | Using where |
+----+-------------+-------+------+--------------------+----------+---------+-------+------+-------------+
1 row in set (0.00 sec)
However it is recognised in an order by statement:
mysql> EXPLAIN SELECT *
-> FROM `test`
-> WHERE fname = "coby"
-> ORDER BY born;
+----+-------------+-------+------+--------------------+-----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------+-----------+---------+-------+------+-------------+
| 1 | SIMPLE | test | ref | name_age,name_date | name_date | 12 | const | 45 | Using where |
+----+-------------+-------+------+--------------------+-----------+---------+-------+------+-------------+
1 row in set (0.00 sec)
How do I make it so that the index is recognised in a where statement?
Any help would be appreciated.
The index is recognized, as seen in the column possible_keys of the result of the first EXPLAIN statement. It just happens, that for 45 rows the other index produces an equal or better query plan: The selectivity of your date range is close to zero.
The ORDER BY is another pair of shoes: As you use the index not only for selection, but also for ordering, it now becomes useful.
Related
Recently I have considered how Index condition Pushdown work, and whether make a difference of locking in Repeatable read isolation level.
However, I find ICP does not take effect in MySQL 8.0, but work in 5.7. The repro steps as below shown ( based on guide in Index Condition Pushdown Optimization
Prepare table and data
CREATE TABLE `fruit` (
`id` bigint NOT NULL AUTO_INCREMENT,
`name` varchar(32) NOT NULL,
`age` int DEFAULT NULL,
`data` varchar(16) DEFAULT '',
PRIMARY KEY (`id`),
KEY `i_age_name` (`age`,`name`)
) ENGINE=InnoDB;
-- Prepare data
delimiter ;;
CREATE PROCEDURE `load_data`()
BEGIN
DECLARE i int DEFAULT 1;
WHILE i < 300 DO
INSERT INTO fruit (`name`, `age`, `data`) VALUES
(substring(MD5(RAND()),1,20), i / 10 + 1, 'test data'),
(substring(MD5(RAND()),1,20), i / 10 + 1, 'hello world');
SET i = i + 1;
END WHILE;
END ;;
delimiter ;
call load_data();
execute a query
explain select * from fruit where age = 10 and name like '%a';
We hope it Extra column could show Using index condition, but not found, I do not know why?
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | fruit | NULL | ref | i_age_name | i_age_name | 5 | const | 20 | 11.11 | Using where |
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-------------+
In Mysql 5.7, the query execution plan shows ICP is used
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | fruit | NULL | ref | i_age_name | i_age_name | 5 | const | 20 | 11.11 | Using index condition |
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-----------------------+
My problem is: simple select query takes a long time (3 minutes).
Structure:
mysql> show create table seventhcont_exceptionreport;
seventhcont_exceptionreport | CREATE TABLE `seventhcont_exceptionreport` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`body_html` longtext NOT NULL,
`datetime_created` datetime NOT NULL,
`subject` varchar(256) NOT NULL,
`host` varchar(128) NOT NULL,
`exc_value` varchar(512) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=74607 DEFAULT CHARSET=utf8 |
Rows count:
mysql> select count(*) from seventhcont_exceptionreport;
+----------+
| count(*) |
+----------+
| 7064 |
+----------+
1 row in set (0.00 sec)
Query 1 (normal):
mysql> select id, datetime_created from seventhcont_exceptionreport order by id LIMIT 100 OFFSET 6000;
...
100 rows in set (0.30 sec)
Query 2 (very slow):
mysql> select id, datetime_created from seventhcont_exceptionreport order by id LIMIT 100 OFFSET 7000;
...
63 rows in set (3 min 40.56 sec)
!!! 3 minutes and 40 sec.
Why?
UPDATE
Explain for query 1:
mysql> EXPLAIN select id, datetime_created from seventhcont_exceptionreport order by id LIMIT 100 OFFSET 6000;
+----+-------------+-----------------------------+-------+---------------+---------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+-------+---------------+---------+---------+------+------+-------+
| 1 | SIMPLE | seventhcont_exceptionreport | index | NULL | PRIMARY | 4 | NULL | 6100 | |
+----+-------------+-----------------------------+-------+---------------+---------+---------+------+------+-------+
1 row in set (0.00 sec)
Explain for query 2:
mysql> EXPLAIN select id, datetime_created from seventhcont_exceptionreport order by id LIMIT 100 OFFSET 7000;
+----+-------------+-----------------------------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+------+---------------+------+---------+------+------+----------------+
| 1 | SIMPLE | seventhcont_exceptionreport | ALL | NULL | NULL | NULL | NULL | 7067 | Using filesort |
+----+-------------+-----------------------------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)
UPDATE
Analyze table:
mysql> ANALYZE TABLE seventhcont_exceptionreport;
+--------------------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------------------------------+---------+----------+----------+
| 7k.seventhcont_exceptionreport | analyze | status | OK |
+--------------------------------+---------+----------+----------+
1 row in set (2.51 sec)
I am no MySQL specialist, but I might be able to point you in the right direction.
In the first query, we can see in the Explain Plan, that an index access was used. In contrast, for the second query, we can see that a non-index access is performed (type index vs ALL). Also, we can see that MySQL is using Using filesort.
This means MySQL cannot perform the sort operation on the index and is therefore performing it on the data itself. This could be because the sort buffer is too small (also see https://www.percona.com/blog/2009/03/05/what-does-using-filesort-mean-in-mysql/).
Therefore, try to increase the size of your sort buffer (soft_buffer_size).
Trying to select a random row from a table, based on autoincremented primary key with no holes.
The table schema :
CREATE TABLE IF NOT EXISTS `testTable` (
`id` int(9) NOT NULL AUTO_INCREMENT,
`data` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
INSERT INTO `testTable` (`id`, `data`) VALUES
(1, 'hello'),
(2, 'world'),
(3, 'new'),
(4, 'data'),
(5, 'more and more'),
(6, 'data '),
(7, 'more rows here'),
(8, 'most rows here'),
(9, 'testing'),
(10,'last');
Queries:
1/ explain select * from testTable where id = ceil(Rand()*10) limit 1 ;
http://sqlfiddle.com/#!2/6e2b1/1
Result :
| ID | SELECT_TYPE | TABLE | TYPE | POSSIBLE_KEYS | KEY | KEY_LEN | REF | ROWS | EXTRA |
--------------------------------------------------------------------------------------------------------
| 1 | SIMPLE | testTable | ALL | (null) | (null) | (null) | (null) | 10 | Using where |
2/ explain select * from testTable where id = 7 limit 1 ;
http://sqlfiddle.com/#!2/6e2b1/2
Result:
| ID | SELECT_TYPE | TABLE | TYPE | POSSIBLE_KEYS | KEY | KEY_LEN | REF | ROWS | EXTRA |
---------------------------------------------------------------------------------------------------
| 1 | SIMPLE | testTable | const | PRIMARY | PRIMARY | 4 | const | 1 | |
Why is query#1 not using the index, when ceil(rand()*10) should ideally evaluate to a constant which can then be compared to the primary key ? Shouldn't the optimizer work that way ? Or am I missing something obvious here.
The key can't be used with that query because RAND() is called for each row and returns a different value each time.
You may try this code instead:
SET #rand_value := CEIL(RAND()*10);
EXPLAIN SELECT * FROM testTable WHERE id = #rand_value;
It first computes a random value and assigns it to a variable, then uses it in the query.
As pointed out by aneroid, the LIMIT 1 is useless: since the condition applies to the primary key, the query will never return more than one row.
With this query, the output is:
| ID | SELECT_TYPE | TABLE | TYPE | POSSIBLE_KEYS | KEY | KEY_LEN | REF | ROWS | EXTRA |
---------------------------------------------------------------------------------------------------
| 1 | SIMPLE | testTable | const | PRIMARY | PRIMARY | 4 | const | 1 | |
From the MySQL documentation for RAND():
RAND() in a WHERE clause is re-evaluated every time the WHERE is executed.
So it's not comparing the Primary Key with a constant, it's a changing value every time (in this case, for each row). If you remove the LIMIT 1 in your query, you will see more rows coming up with the different PKs matched -- which shows the "re-evaluated every time" behaviour.
Edit: See Jocelyn's example as a way to generate the random number first and then get a row with the matching PK id (the LIMIT 1 is not required, btw). Similarly stated in Najzero's comment.
I've researched this, but I still cannot explain why:
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155
Is significantly slower than:
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155
115ms Vs 478ms. They are both using InnoDB and there are relationships defined. The 'card_legality' contains approx 200k rows, while the 'legality' table contains 11 rows. Here is the structure for each:
CREATE TABLE `card_legality` (
`card_id` varchar(8) NOT NULL DEFAULT '',
`legality_id` int(3) NOT NULL,
`cl_boolean` tinyint(1) NOT NULL,
PRIMARY KEY (`card_id`,`legality_id`),
KEY `legality_id` (`legality_id`),
CONSTRAINT `card_legality_ibfk_2` FOREIGN KEY (`legality_id`) REFERENCES `legality` (`legality_id`),
CONSTRAINT `card_legality_ibfk_1` FOREIGN KEY (`card_id`) REFERENCES `card` (`card_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
And:
CREATE TABLE `legality` (
`legality_id` int(3) NOT NULL AUTO_INCREMENT,
`l_name` varchar(16) NOT NULL DEFAULT '',
PRIMARY KEY (`legality_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=latin1;
I could simply use LEFT-JOIN, but it doesn't seem quite right... any thoughts, please?
UPDATE:
As requested, I've included the results of explain for each. I had run it previously, but I dont pretend to have a thorough understanding of it..
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cl ALL PRIMARY NULL NULL NULL 199747 Using where
1 SIMPLE l eq_ref PRIMARY PRIMARY 4 hexproof.co.uk.cl.legality_id 1
AND, inner join:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE l ALL PRIMARY NULL NULL NULL 11
1 SIMPLE cl ref PRIMARY,legality_id legality_id 4 hexproof.co.uk.l.legality_id 33799 Using where
It is because of the varchar on card_id. MySQL can't use the index on card_id as card_id as described here mysql type conversion. The important part is
For comparisons of a string column with a number, MySQL cannot use an
index on the column to look up the value quickly. If str_col is an
indexed string column, the index cannot be used when performing the
lookup in the following statement:
SELECT * FROM tbl_name WHERE str_col=1;
The reason for this is that there are many different strings that may
convert to the value 1, such as '1', ' 1', or '1a'.
If you change your queries to
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'
and
SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'
You should see a huge improvement in speed and also see a different EXPLAIN.
Here is a similar (but easier) test to show this:
> desc id_test;
+-------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| id | varchar(8) | NO | PRI | NULL | |
+-------+------------+------+-----+---------+-------+
1 row in set (0.17 sec)
> select * from id_test;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+----+
9 rows in set (0.00 sec)
> explain select * from id_test where id = 1;
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | id_test | index | PRIMARY | PRIMARY | 10 | NULL | 9 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
> explain select * from id_test where id = '1';
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | id_test | const | PRIMARY | PRIMARY | 10 | const | 1 | Using index |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)
In the first case there is Using where; Using index and the second is Using index. Also ref is either NULL or CONST. Needless to say, the second one is better.
L2G has it pretty much summed up, although I suspect it could be because of the varchar type used for card_id.
I actually printed out this informative page for benchmarking / profiling quickies. Here is a quick poor-mans profiling technique:
Time a SQL on MySQL
Enable Profiling
mysql> SET PROFILING = 1
...
RUN your SQLs
...
mysql> SHOW PROFILES;
+----------+------------+-----------------------+
| Query_ID | Duration | Query |
+----------+------------+-----------------------+
| 1 | 0.00014600 | SELECT DATABASE() |
| 2 | 0.00024250 | select user from user |
+----------+------------+-----------------------+
mysql> SHOW PROFILE for QUERY 2;
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000034 |
| checking query cache for query | 0.000033 |
| checking permissions | 0.000006 |
| Opening tables | 0.000011 |
| init | 0.000013 |
| optimizing | 0.000004 |
| executing | 0.000011 |
| end | 0.000004 |
| query end | 0.000002 |
| freeing items | 0.000026 |
| logging slow query | 0.000002 |
| cleaning up | 0.000003 |
+--------------------------------+----------+
Good-luck, oh and please post your findings!
I'd try EXPLAIN on both of those queries. Just prefix each SELECT with EXPLAIN and run them. It gives really useful info on how mySQL is optimizing and executing queries.
I'm pretty sure that MySql has better optimization for Left Joins - no evidence to back this up at the moment.
ETA : A quick scout round and I can't find anything concrete to uphold my view so.....
I have simple function consist of one sql query
CREATE FUNCTION `GetProductIDFunc`( in_title char (14) )
RETURNS bigint(20)
BEGIN
declare out_id bigint;
select id into out_id from products where title = in_title limit 1;
RETURN out_id;
END
Execution time of this function takes 5 seconds
select Benchmark(500 ,GetProductIdFunc('sample_product'));
Execution time of plain query takes 0.001 seconds
select Benchmark(500,(select id from products where title = 'sample_product' limit 1));
"Title" field is indexed. Why function execution takes so much time and how can I optimize it?
edit:
Execution plan
mysql> EXPLAIN EXTENDED select id from products where title = 'sample_product' limit 1;
+----+-------------+----------+-------+---------------+------------+---------+-------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+-------+---------------+------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | products | const | Index_title | Index_title | 14 | const | 1 | 100.00 | Using index |
+----+-------------+----------+-------+---------------+------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN select GetProductIdFunc('sample_product');
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)
This could be a character set issue. If the function is using a different character set than the table column, it would lead to very slow performance despite the index.
Run show create table products\G to determine the character set for the column.
Run show variables like 'character_set%'; to see what the relevant default character sets are for your DB.
Try this:
CREATE FUNCTION `GetProductIDFunc`( in_title char (14) )
RETURNS bigint(20)
BEGIN
declare out_id bigint;
set out_id = (select id from products where title = in_title limit 1);
RETURN out_id;
END