I am trying to figure out the most efficient way to extract values from database that has the structure similar to this:
table test:
int id (primary, auto increment)
varchar(50) stuff,
varchar(50) important_stuff;
where I need to do a query like
select * from test where important_stuff like 'prefix%';
The size of the entire table is approximately 10 million rows, however there are only about 500-1000 distinct values for important_stuff. My current solution is indexing important_stuff however the performance is not satisfactory. Will it be better to create a separate table that will match distinct important_stuff to a certain id, which will be stored in the 'test' table and then do
(select id from stuff_lookup where important_stuff like 'prefix%') a join select * from test b where b.stuff_id=a.id
or this:
select * from test where stuff_id exists in(select id from stuff_lookup where important_stuff like 'prefix%')
What is the best way to optimize things like that?
How big is innodb_buffer_pool_size? How much RAM is available? The former should be about 70% of the latter. You'll see in a minute why I bring up this setting.
Based on your 3 suggested SELECTs, the original one will work as good as the two complex ones. In some other case, the complex formulation might work better.
INDEX(important_stuff) is the 'best' index for
select * from test where important_stuff like 'prefix%';
Now, let's study how that query works with that index:
Reach into the BTree index, starting at 'prefix'. (Effort: Virtually instantaneous)
Scan forward for, say, 1000 entries. That will be about 10 InnoDB blocks (16KB each). Each entry will have the PRIMARY KEY (id). (Effort: <= 10 disk hits)
For each entry, look up the row (so you can get "*"). That's 1000 PK lookups in the BTree that contains both the PK and the data. At best, they might all be in 10 blocks. At worst, they could be in 1000 separate blocks. (Effort: 10-1000 blocks)
Total Effort: ~1010 blocks (worst case).
A standard spinning disk can handle ~100 reads/second. So. we are looking at 10 seconds.
Now, run the query again. Guess what; all those blocks are now in RAM (cached in the "buffer_pool", which is hopefully big enough for all of them). And it runs in less than 1 second.
OPTIMIZE TABLE was not necessary! It was not a statistics refresh, but rather caching that sped up the query.
I'm not MySQL user but I made some tests on my local database. I've added 10 millions rows as you wrote and distinct datas from third column are loaded quite fast. These are my results.
mysql> describe bigtable;
+-----------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| stuff | varchar(50) | NO | | NULL | |
| important_stuff | varchar(50) | NO | MUL | NULL | |
+-----------------+-------------+------+-----+---------+----------------+
3 rows in set (0.03 sec)
mysql> select count(*) from bigtable;
+----------+
| count(*) |
+----------+
| 10000089 |
+----------+
1 row in set (2.87 sec)
mysql> select count(distinct important_stuff) from bigtable;
+---------------------------------+
| count(distinct important_stuff) |
+---------------------------------+
| 1000 |
+---------------------------------+
1 row in set (0.01 sec)
mysql> select distinct important_stuff from bigtable;
....
| is_987 |
| is_988 |
| is_989 |
| is_99 |
| is_990 |
| is_991 |
| is_992 |
| is_993 |
| is_994 |
| is_995 |
| is_996 |
| is_997 |
| is_998 |
| is_999 |
+-----------------+
1000 rows in set (0.15 sec)
Important information is that I refreshed statistics on this table (before this operation I needed ~10 seconds to load these data).
mysql> optimize table bigtable;
Related
My paintings table looks like this
| id | artist_id | name
| 1 | 7 | landscape painting
| 2 | 15 | flowers painting
| 3 | 15 | scuffed painting
The artist_id is indexed and the name has a fulltext index on it. The table contains about 10M record.
Queries that match the name agains some keywords are ok:
select * from `paintings` where match (`name`) against ('+scuffed*' in boolean mode) limit 10;
10 rows in set (0.04 sec)
But when I sometimes want to only check for a certain painting done by a certain artist:
select * from `paintings` where `artist_id` = 15 and match (`name`) against ('+scuffed*' in boolean mode) limit 10;
7 rows in set (0.40 sec)
As you can see it takes 10x longer to run the query when I include the artist_id. I also tried running a nested query in order to get only paintings that have specific ids:
select * from `paintings` where id in (SELECT id from paintings where artist_id = 15) and match (`name`) against ('+scuffed*' in boolean mode) limit 10;
7 rows in set (0.44 sec)
This ended up being even slower.
How can this query be optimized to work well with and without a where clause on the artist_id?
Thank you!
You need to create a COMPOSITE INDEX KEY consisting of columns (id and artist_id) to speed up your query:
mysql> ALTER TABLE paintings ADD INDEX cmp_id_artist_id (id, artist_id);
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> SHOW INDEXES FROM paintings;
+-----------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-----------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| paintings | 1 | cmp_id_artist_id | 1 | id | A | 3 | NULL | NULL | | BTREE | | | YES | NULL |
| paintings | 1 | cmp_id_artist_id | 2 | artist_id | A | 3 | NULL | NULL | YES | BTREE | | | YES | NULL |
| paintings | 1 | ftindex_name | 1 | name | NULL | 3 | NULL | NULL | YES | FULLTEXT | | | YES | NULL |
+-----------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
3 rows in set (0.00 sec)
And now you can test again your 2nd query:
mysql> select * from `paintings` where `artist_id` = 15 and match (`name`) against ('+scuffed*' in boolean mode) limit 10;
+----+-----------+------------------+
| id | artist_id | name |
+----+-----------+------------------+
| 3 | 15 | scuffed painting |
+----+-----------+------------------+
1 row in set (0.00 sec)
Run EXPLAIN SELECT ... to see how the query is performed.
I think your second query is as optimal is can be. MySQL will perform the MATCH first, then check any other conditions.
You could add INDEX(artist_id), but I don't think that will help.
More
Let me provide another approach, then talk through the pros an cons.
PRIMARY KEY(artist_id, id),
INDEX(id),
FULLTEXT(name),
-- When searching _only_ by `name`:
WHERE MATCH(name) AGAINST('+string' IN BOOLEAN MODE)
-- When searching by artist _and_ name:
WHERE artist_id = 123
AND name LIKE '%string%';
Comments:
Without the artist, you get the full speed of FULLTEXT.
With artist, you abandon FULLTEXT (because it seems not to work for your dataset) and switch to the composite index, plus need to look only at those rows with the given artist_id.
id needs to be indexed to keep AUTO_INCREMENT happy; a simple INDEX(id) suffices.
The PRIMARY KEY must be unique; including id suffices.
Starting the PK with `artist_id clusters the rows for a given artist together, thereby speeding the query up (some for small tables, a lot for big tables).
This requires that you create two different queries, but that is probably not a big problem.
For an artist that has a lot of works, it may be 'too' slow -- because it is failing to use FULLTEXT.
FULLTEXT and LIKE have different rules for what will/won't match. So you may get different answers.
Another recent Question discussed why AGAINST('... +the ...' IN BOOLEAN MODE) never finds any rows.
AGAINST("+color +purple") will find purple color, but LIKE '%color purple% won't.
Be careful about benchmarking any approaches -- the performance will depend on how many works an artist has, and many other un-obvious differences.
I'm trying to store about 100 Million domain names in a MySQL database, but I can't figure out the right INDEX method to use on the domain names.
The issue being that LIKE queries will also be executed:
SELECT id FROM domains WHERE domain LIKE '%.example.com'
or
SELECT id FROM domains WHERE domain LIKE 'example.%'
If it makes it easier, '%example%' is not a requirement, but at best a nice to have / be able to.
What would be the proper index to use? Left to right (example.%) should be realitivly straight forward, but right to left (%.example.com) is problematic but the most common query.
I'm using MariaDB 10.3 on Linux. DB running on a PCI-e SSD, lookup times longer then 10 seconds should be coincided "unacceptable"
You can spend one virtual permanent column (rdomain) in your table where the virtual function stores the domainname in reverse order like REVERSE(domain). so it is possible to search from start of string i.e. search for '%.mydomain.com' -> WHERE rdomain like REVERSE('%.mydomain.com
the table
CREATE TABLE `myreverse` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`domain` varchar(64) CHARACTER SET latin1 DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_domain` (`domain`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
add the column
ALTER TABLE myreverse
ADD COLUMN rdomain VARCHAR(64) AS (REVERSE(domain)),
ADD KEY idx_rdomain (rdomain);
insert some data
INSERT INTO `myreverse` (`id`, `domain`)
VALUES
(2, 'img.google.com'),
(3, 'w3.google.com'),
(1, 'www.coogle.com'),
(4, 'www.google.de'),
(5, 'www.mydomain.com');
see the data
mysql> SELECT * from myreverse;
+----+------------------+------------------+
| id | domain | rdomain |
+----+------------------+------------------+
| 1 | www.google.com | moc.elgoog.www |
| 2 | img.google.com | moc.elgoog.gmi |
| 3 | w3.coogle.com | moc.elgooc.3w |
| 4 | www.google.de | ed.elgoog.www |
| 5 | www.mydomain.com | moc.niamodym.www |
+----+------------------+------------------+
5 rows in set (0.01 sec)
mysql>
now you can query with reverse order and MySQL can use the index.
query
mysql> select * from myreverse WHERE rdomain like REVERSE('%.google.com');
+----+----------------+----------------+
| id | domain | rdomain |
+----+----------------+----------------+
| 3 | w3.google.com | moc.elgoog.3w |
| 2 | img.google.com | moc.elgoog.gmi |
+----+----------------+----------------+
2 rows in set (0.00 sec)
mysql>
Here you can see that the optimizer use the index.
mysql> EXPLAIN select * from myreverse WHERE rdomain like REVERSE('%.google.com');
+----+-------------+-----------+------------+-------+---------------+-------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | myreverse | NULL | range | idx_rdomain | idx_rdomain | 195 | NULL | 2 | 100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)
mysql>
I'm not sure an index would help you here. If you can't change the database, your options seem limited. One thing you could do, is if you're running both a subdomain and domain query back to back, to run the subdomain query first. That should help reduce the number of rows the domain query has to cover.
It would definitely help if you split the URL between subdomains and domains into different columsn in the database. Have indexes for both of them. Then you could query the subdomains only and the domains only. It should speed things up. And if there are a lot of repeating values, you should normalize those fields so to remove repetition and speed up queries even more.
Can someone explain this... before I have myself committed? The first result set should have two results, same as the second, no?
mysql> SELECT * FROM kuru_footwear_2.customer_address_entity_varchar
-> WHERE attribute_id=31 AND entity_id=324134;
+----------+----------------+--------------+-----------+-------+
| value_id | entity_type_id | attribute_id | entity_id | value |
+----------+----------------+--------------+-----------+-------+
| 885263 | 2 | 31 | 324134 | NULL |
+----------+----------------+--------------+-----------+-------+
1 row in set (0.00 sec)
mysql> SELECT * FROM kuru_footwear_2.customer_address_entity_varchar
-> WHERE value_id=885263 OR value_id=950181;
+----------+----------------+--------------+-----------+-------+
| value_id | entity_type_id | attribute_id | entity_id | value |
+----------+----------------+--------------+-----------+-------+
| 885263 | 2 | 31 | 324134 | NULL |
| 950181 | 2 | 31 | 324134 | NULL |
+----------+----------------+--------------+-----------+-------+
2 rows in set (0.00 sec)
attribute_id is a SMALLINT(5)
entity_id is a INT(10)
The problem is that you have a unique index on (entity_id,attribute_id). The query optimizer notices this when you write a query whose WHERE clause is covered by the index, and only returns 1 row since the uniqueness of the index implies that there's at most one matching row.
I'm not sure how you can have those duplicates in the first place, it seems like there's something corrupted in the table. Adding a unique index to a table will normally remove any duplicates. In fact, this is often suggested as a way to get rid of duplicates in a table, see How do I delete all the duplicate records in a MySQL table without temp tables.
In the first statement's selection (the 'WHERE' clause), you are using AND; in the second statement, you are using OR. This boils down to the definition of these logical operators. MySQL's official documentation doesn't say much about these other than that AND and OR are their own natural logical operators. If this is confusing, you may want to read up on basic Boolean Algebra.
I was busying myself with exploring GROUP BY optimizations. On a classical "max salary per departament" query. And suddenly weird results. The dump below goes straight from my console. NO COMMAND were issued between these two EXPLAINS. Only some time had passed.
mysql> explain select name, t1.dep_id, salary
from emploee t1
JOIN ( select dep_id, max(salary) msal
from emploee
group by dep_id
) t2
ON t1.salary=t2.msal and t1.dep_id = t2.dep_id
order by salary desc;
+----+-------------+------------+-------+---------------+--------+---------+-------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+-------------------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | Using temporary; Using filesort |
| 1 | PRIMARY | t1 | ref | dep_id | dep_id | 8 | t2.dep_id,t2.msal | 1 | |
| 2 | DERIVED | emploee | index | NULL | dep_id | 8 | NULL | 84 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+-------------------+------+---------------------------------+
3 rows in set (0.00 sec)
mysql> explain select name, t1.dep_id, salary
from emploee t1
JOIN ( select dep_id, max(salary) msal
from emploee
group by dep_id
) t2
ON t1.salary=t2.msal and t1.dep_id = t2.dep_id
order by salary desc;
+----+-------------+------------+-------+---------------+--------+---------+-------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+-------------------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 4 | Using temporary; Using filesort |
| 1 | PRIMARY | t1 | ref | dep_id | dep_id | 8 | t2.dep_id,t2.msal | 3 | |
| 2 | DERIVED | emploee | range | NULL | dep_id | 4 | NULL | 9 | Using index for group-by |
+----+-------------+------------+-------+---------------+--------+---------+-------------------+------+---------------------------------+
3 rows in set (0.00 sec)
As you may notice, it examined ten times less rows in second run. I assume it's because some inner counters got changed. But I don't want to depend on these counters. So - is there a way to hint mysql to use "Using index for group by" behavior only?
Or - if my speculations are wrong - is there any other explanation on the behavior and how to fix it?
CREATE TABLE `emploee` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`dep_id` int(11) NOT NULL,
`salary` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `dep_id` (`dep_id`,`salary`)
) ENGINE=InnoDB AUTO_INCREMENT=85 DEFAULT CHARSET=latin1 |
+-----------+
| version() |
+-----------+
| 5.5.19 |
+-----------+
Hm, showing the cardinality of indexes may help, but keep in mind: range's are usually slower then indexes there.
Because it think it can match the full index in the first one, it uses the full one. In the second one, it drops the index and goes for a range, but guesses the total number of rows satisfying that larger range wildly lower then the smaller full index, because all cardinality has changed. Compare it to this: why would "AA" match 84 rows, but "A[any character]" match only 9 (note that it uses 8 bytes of the key in the first, 4 bytes in the second)? The second one will in reality not read less rows, EXPLAIN just guesses the number of rows differently after an update on it's metadata of indexes. Not also that EXPLAIN does not tell you what a query will do, but what it probably will do.
Updating the cardinality can or will occur when:
The cardinality (the number of different key values) in every index of a table is calculated when a table is opened, at SHOW TABLE STATUS and ANALYZE TABLE and on other circumstances (like when the table has changed too much). Note that all tables are opened, and the statistics are re-estimated, when the mysql client starts if the auto-rehash setting is set on (the default).
So, assume 'at any point' due to 'changed too much', and yes, connecting with the mysql client can alter the behavior in choosing indexes of a server. Also: reconnecting of the mysql client after it lost its connection after a timeout counts as connecting with auto-rehash AFAIK. If you want to give mysql help to find the proper method, run ANALYZE TABLE once in a while, especially after heavy updating. If you think the cardinality it guesses is often wrong, you can alter the number of pages it reads to guess some statistics, but keep in mind a higher number means a longer running update of that cardinality, and something you don't want to happen that often when 'data has changed to much' on a table with a lot of operations.
TL;DR: it guesses rows differently, but you'd actually prefer the first behavior if the data makes that possible.
Adding:
On this previously linked page, we can probably also find why especially dep_id might have this problem:
small values like 1 or 2 can result in very inaccurate estimates of cardinality
I could imagine the number of different dep_id's is typically quite small, and I've indeed observed a 'bouncing' cardinality on non-unique indexes with quite a small range compared to the number of rows in my own databases. It easily guesses a range of 1-10 in the hundreds and then down again the next time, just based on the specific sample pages it picks & some algorithm that tries to extrapolate that.
mysql> desc users;
+-------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| email | varchar(128) | NO | UNI | | |
| password | varchar(32) | NO | | | |
| screen_name | varchar(64) | YES | UNI | NULL | |
| reputation | int(10) unsigned | NO | | 0 | |
| imtype | varchar(1) | YES | MUL | 0 | |
| last_check | datetime | YES | MUL | NULL | |
| robotno | int(10) unsigned | YES | | NULL | |
+-------------+------------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)
mysql> create index i_users_imtype_robotno on users(imtype,robotno);
Query OK, 24 rows affected (0.25 sec)
Records: 24 Duplicates: 0 Warnings: 0
mysql> explain select * from users where imtype!='0' and robotno is null;
+----+-------------+-------+------+------------------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+------------------------+------+---------+------+------+-------------+
| 1 | SIMPLE | users | ALL | i_users_imtype_robotno | NULL | NULL | NULL | 24 | Using where |
+----+-------------+-------+------+------------------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
But this way,it's used:
mysql> explain select * from users where imtype in ('1','2') and robotno is null;
+----+-------------+-------+-------+------------------------+------------------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------+------------------------+---------+------+------+-------------+
| 1 | SIMPLE | users | range | i_users_imtype_robotno | i_users_imtype_robotno | 11 | NULL | 3 | Using where |
+----+-------------+-------+-------+------------------------+------------------------+---------+------+------+-------------+
1 row in set (0.01 sec)
Besides,this one also did not use index:
mysql> explain select id,email,imtype from users where robotno=1;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 24 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
SELECT *
FROM users
WHERE imtype != '0' and robotno is null
This condition is not satisified by a single contiguous range of (imtype, robotno).
If you have records like this:
imtype robotno
$ NULL
$ 1
0 NULL
0 1
1 NULL
1 1
2 NULL
2 1
, ordered by (imtype, robotno), then the records 1, 5 and 7 would be returned, while other records wouldn't.
You'll need create this index to satisfy the condition:
CREATE INDEX ix_users_ri ON users (robotno, imptype)
and rewrite your query a little:
SELECT *
FROM users
WHERE (
robotno IS NULL
AND imtype < '0'
)
OR
(
robotno IS NULL
AND imtype > '0'
)
, which will result in two contiguous blocks:
robotno imtype
--- first block start
NULL $
--- first block end
NULL 0
--- second block start
NULL 1
NULL 2
--- second block end
1 $
1 0
1 1
1 2
This index will also serve this query:
SELECT id, email, imtype
FROM users
WHERE robotno = 1
, which is not served now by any index for the same reason.
Actually, the index for this query:
SELECT *
FROM users
WHERE imtype in ('1', '2')
AND robotno is null
is used only for coarse filtering on imtype (note using where in the extra field), it doesn't range robotno's
You need an index that has robotno as the first column. Your existing index is (imtype,robotno). Since imtype is not in the where clause, it can't use that index.
An index on (robotno,imtype) could be used for queries with just robotno in the where clause, and also for queries with both imtype and robotno in the where clause (but not imtype by itself).
Check out the docs on how MySQL uses indexes, and look for the parts that talk about multi-column indexes and "leftmost prefix".
BTW, if you think you know better than the optimizer, which is often the case, you can force MySQL to use a specific index by appending
FORCE INDEX (index_name) after FROM users.
It's because 'robotno' is potentially a primary key, and it uses that instead of the index.
A database systems query planner determines whether to do an index scan or not by analyzing the selectivity of the query's where clause relative to the index. (Indexes are also used to join tables together, but you only have users here.)
The first query has where imtype != '0'. This would select nearly all of the rows in users, assuming you have a large number of distinct values of imtype. The inequality operator is inherently unselective. So the MySQL query planner is betting here that reading through the index won't help and that it may as well just do a sequential scan through the whole table, since it probably would have to do that anyway.
On the other hand, had you said where imtype ='0', equality is a highly selective operator, and MySQL would bet that by reading just a few index blocks it could avoid reading nearly all of the blocks of the users table itself. So it would pick the index.
In your second example, where imtype in ('1','2'), MySQL knows that the index will be highly selective (though only half as selective as where imtype = '0'), and it will again bet that using the index will lead to a big payoff, as you discovered.
In your third example, where robotno=1, MySQL probably can't effectively use the index on users(imtype,robotno) since it would need to read in all the index blocks to find the robotno=1 record numbers: the index is sorted by imtype first, then robotno. If you had another index on users(robotno), MySQL would eagerly use it though.
As a footnote, if you had two indexes, one on users(imtype), and the other on users(imtype,robotno), and your query was on where imtype = '0', either index would make your query fast, but MySQL would probably select users(imtype) simply because it's more compact and fewer blocks would need to be read from it.
I'm being very simplistic here. Early database systems would just look at imtype's datatype and make a very rough guess at the selectivity of your query, but people very quickly realized that giving the query planner interesting facts like the total size of the table, the number of ditinct values in each column, etc. would enable it to make much smarter decisions. For instance if you had a users table where imtype was only every '0' or '1', the query planner might choose the index, since in that case the where imtype != '0' is more selective.
Take a look at the MySQL UPDATE STATISTICS statement and you'll see that its query planner must be sophisticated. For that reason I'd hesitate a great deal before using the FORCE statement to dictate a query plan to it. Instead, use UPDATE STATISTICS to give the query planner improved information to base its decisions on.
Your index is over users(imtype,robotno). In order to use this index, either imtype or imtype and robotno must be used to qualify the rows. You are just using robotno in your query, thus it can't use this index.