I am reading the book High Performance MySQL and experimenting with a new database to test a few things.
I am not sure whether I am doing something wrong, though.
I have a table called table_users
Structure:
ID(Integer)
FullName(Char)
UserName(Char)
Password(Char)
SecurityID(TinyINT)
LocationID(TinyINT)
Active(TinyINT)
My indexes are as follows:
PRIMARY : ID
FullName : UNIQUE : FullName
FK_table_users_LocationID (foreign key reference) : INDEX : LocationID
FK_table_users_SecurityID (foreign key reference) : INDEX : SecurityID
Active : INDEX : Active
All are BTREE
While reading the book, I am trying to use the following MySQL statement to view the Extra information for a SELECT statement:
EXPLAIN
SELECT * FROM table_users WHERE
FullName = 'Jeff';
No matter what the WHERE clause points to in this call, the Extra column is either empty or Using where. If I SELECT ID ... WHERE FullName = 'Jeff' it returns Using where; Using index, but not when I do SELECT FullName ... WHERE FullName = 'Jeff'.
I am not familiar with indexes at all and am trying to wrap my head around them, but I am a bit confused by this. Shouldn't EXPLAIN report Using index whenever I reference an indexed column?
Thanks.
Using index doesn't mean what it seems to mean. Have a look at covering indexes. If EXPLAIN says "Using index", it means that MySQL could return the data for your query from the index alone, without reading the actual rows. SELECT * can only use a covering index if every column of the table is in the index, which is usually not the case.
I seem to remember a chapter in High Performance MySQL that talks about covering indexes and how to read EXPLAIN results.
What version of MySQL are you using? Here's a test I ran on Percona Server 5.5.16:
mysql> create table table_users (
id int auto_increment primary key,
fullname char(20),
username char(20),
unique key (fullname)
);
Query OK, 0 rows affected (0.03 sec)
mysql> insert into table_users values (default, 'billk', 'billk');
Query OK, 1 row affected (0.00 sec)
mysql> explain select * from table_users where fullname='billk'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table_users
type: const
possible_keys: fullname
key: fullname
key_len: 21
ref: const
rows: 1
Extra:
1 row in set (0.00 sec)
This shows that it's using the fullname index, looking up by a constant value, but it's not an index-only query.
mysql> explain select fullname from table_users where fullname='billk'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table_users
type: const
possible_keys: fullname
key: fullname
key_len: 21
ref: const
rows: 1
Extra: Using index
1 row in set (0.00 sec)
This is as expected: it's able to get the fullname column from the fullname index, so this is an index-only query.
mysql> explain select id from table_users where fullname='billk'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table_users
type: const
possible_keys: fullname
key: fullname
key_len: 21
ref: const
rows: 1
Extra: Using index
1 row in set (0.00 sec)
Searching on fullname but fetching the primary key is also an index-only query, because the leaf nodes of InnoDB secondary indexes (e.g. the unique key) implicitly contain the primary key value. So this query is able to traverse the BTREE for fullname, and as a bonus it gets the id too.
mysql> explain select fullname, username from table_users where fullname='billk'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table_users
type: const
possible_keys: fullname
key: fullname
key_len: 21
ref: const
rows: 1
Extra:
1 row in set (0.00 sec)
As soon as the select-list includes any column that's not part of the index, it can no longer be an index-only query. First it searches the BTREE for fullname, to find the primary key value. Then it uses that id value to traverse the BTREE for the clustered index, which is how InnoDB stores the whole table. There it finds the other columns for the given row, including username.
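One way to make a query like the last one index-only is to give it an index that contains every column it reads. The sketch below uses a separate, hypothetical table (users_demo) and index name (fullname_username), neither of which is part of the test above. It leaves out the unique key so that the composite index is the only candidate. I have not run it on the same server, so treat the expected Extra value as an assumption.

```sql
-- Hypothetical demo table: same columns as the test above, but with a
-- composite secondary index instead of the unique key on fullname.
CREATE TABLE users_demo (
  id INT AUTO_INCREMENT PRIMARY KEY,
  fullname CHAR(20),
  username CHAR(20),
  KEY fullname_username (fullname, username)
) ENGINE=InnoDB;

INSERT INTO users_demo VALUES (DEFAULT, 'billk', 'billk');

-- Both selected columns are stored in the fullname_username index, so the
-- query should be answerable from the index alone (Extra: Using index,
-- possibly together with Using where).
EXPLAIN SELECT fullname, username FROM users_demo WHERE fullname = 'billk';
```

The trade-off is a wider index that costs more to store and to maintain on writes, so this is only worth doing for queries that run often.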
Related
I have three columns a, b and c, and I have indexed them as (a,b,c). I have a query like this:
SELECT * FROM tablename WHERE a=something and c=someone
My question is: does this query use this index or not?
It may use the first column (a) of the index, but it can't use the third column (c).
One way you can tell is from the output of EXPLAIN.
Here's an example:
mysql> create table tablename (a int, b int, c int, key (a,b,c));
...I filled it with some random data...
mysql> explain SELECT * FROM tablename WHERE a=125 and c=456\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tablename
type: ref
possible_keys: a
key: a
key_len: 5
ref: const
rows: 20
Extra: Using where; Using index
The above shows ref: const, which means only one of the constant values is used to find rows in the index. Also, key_len: 5 shows that only a subset of the index is used, since an index entry with three integers should be larger than 5 bytes.
mysql> explain SELECT * FROM tablename WHERE a=125 and b = 789 and c=456\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tablename
type: ref
possible_keys: a
key: a
key_len: 15
ref: const,const,const
rows: 1
Extra: Using index
When we use conditions on all three columns, it shows ref: const,const,const showing that all three values are being used to look up index entries. And the key_len is large enough to be an entry of three integers.
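The key_len values follow from the column definitions: each nullable INT key part takes 4 bytes plus 1 byte for the NULL flag. If the a = ... AND c = ... query matters, one option is an index whose leading columns are exactly those two. This is only a sketch: the index name a_c is made up for illustration, and whether the optimizer picks it over the existing index depends on its cost estimates.

```sql
-- key_len arithmetic for the outputs above (a, b, c are nullable INTs):
--   one key part    : 4 bytes + 1 NULL byte = 5   (lookup on a only)
--   three key parts : 3 * 5                 = 15  (lookup on a, b and c)

-- Hypothetical extra index with a and c as the leading columns.
ALTER TABLE tablename ADD KEY a_c (a, c);

-- If this index is chosen, both conditions can be used for the lookup;
-- key_len would then be 10 (two key parts) and ref would be const,const.
EXPLAIN SELECT * FROM tablename WHERE a = 125 AND c = 456;
```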
As Mihal says, if you prefix the query with EXPLAIN, the optimizer will tell you whether it uses the index. Bill is partially correct in that it will only look up the value for a in the index. However, if the table contains only the columns a, b and c, then the index is covering, and the values for b and c will be read from the index without reference to the table data. The DBMS will still scan through all the index entries for that value of a, rather than going directly to the specified value for c.
It may be possible to fudge a query to make it use an index to a greater depth, assuming that b is an integer whose values all fall within the bounds below (they are the signed MEDIUMINT range):
SELECT *
FROM tablename
WHERE a='something'
AND b BETWEEN -8388608 AND 8388607
AND c='someone'
I have a large table (250M rows) with a column group_id that broadly divides the rows into groups. It has the following index:
mysql> show indexes from table\G;
*************************** 13. row ***************************
Table: table
Non_unique: 1
Key_name: myindex
Seq_in_index: 1
Column_name: group_id
Collation: A
Cardinality: 181819
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
*************************** 14. row ***************************
Table: table
Non_unique: 1
Key_name: myindex
Seq_in_index: 2
Column_name: id
Collation: A
Cardinality: 213456239
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
I want to execute the following query:
mysql> explain select * from `table` WHERE (`table`.`type_id` IN (11, 17, 12, 19) AND `table`.`group_id` = 310248) ORDER BY `table`.`id` ASC LIMIT 201\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table
type: index
possible_keys: [SOME INDEX NAMES]
key: PRIMARY
key_len: 4
ref: NULL
rows: 257386914
Extra: Using where
1 row in set (0.00 sec)
I understand that it will need to scan some rows because of the problems with indexing for WHERE ... IN (). Amazingly to me, however, it chooses to scan almost as many rows as possible by using the primary key index.
The following seems unambiguously (and obviously) superior:
mysql> explain select * from `table` USE INDEX (myindex) WHERE (`table`.`type_id` IN (11, 17, 12, 19) AND `table`.`group_id` = 310248) ORDER BY `table`.`id` ASC LIMIT 201\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table
type: ref
possible_keys: myindex
key: myindex
key_len: 5
ref: const
rows: 1883760
Extra: Using where
1 row in set (0.00 sec)
Using a larger value for LIMIT (2000), using different values of group_id, removing the ORDER BY and removing the type_id filter all cause it to use the index. I have run ANALYZE TABLE.
It's worth noting that the row estimates are quite high:
mysql> select count(*) from table where group_id=310248 and type_id in (11, 17, 12, 19) ;
+----------+
| count(*) |
+----------+
| 583868 |
+----------+
1 row in set (0.61 sec)
mysql version:
Ver 5.1.57-rel12.8-log for debian-linux-gnu on x86_64 ((Percona Server (GPL), 12.8, Revision 233))
Why would MySQL choose a plan that it thinks will involve scanning 257386914 rows rather than 1883760? I understand that it might value sequential reads, but why would it choose the index for 2000 rows, but not for 200 rows? Why would filtering by a different group_id change the plan?
Edited: I have also tried creating the index (group_id, id, type_id) so that all sorting can be done using only an index scan, but I can't get it to ever select that index.
Did you have a question?
Note that because that predicate on the type_id column has to be checked, and because your query is returning at least one column that is not in the index, MySQL will have to visit the data pages of the table, in order to access the values for those columns.
So, MySQL may be favoring the cluster key, since that's where the data pages are; the cluster key also allows MySQL to avoid a sort operation ("Using filesort"). (We do note that the execution plan that uses your index also avoids a sort operation.)
If you want MySQL to favor your index, you might consider including type_id as a third column in that index, if that is at all selective.
Alternatively, you might consider modifying your query to "ORDER BY group_id, id" to influence the optimizer.
Have you measured the performance of the query, both with the hint and without the hint?
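A minimal sketch of how that comparison could be done from the mysql client, using the two queries from the question. FLUSH STATUS resets the session counters (it needs the RELOAD privilege), and SQL_NO_CACHE keeps the query cache from hiding the real cost. The Handler_read% counters give a rough picture of how many rows each plan actually touched.

```sql
-- Reset the session status counters (requires the RELOAD privilege).
FLUSH STATUS;

-- Plan chosen by the optimizer, bypassing the query cache.
SELECT SQL_NO_CACHE * FROM `table`
WHERE `table`.`type_id` IN (11, 17, 12, 19) AND `table`.`group_id` = 310248
ORDER BY `table`.`id` ASC LIMIT 201;

-- Row reads performed by the statement above.
SHOW SESSION STATUS LIKE 'Handler_read%';

-- Same measurement for the hinted plan.
FLUSH STATUS;

SELECT SQL_NO_CACHE * FROM `table` USE INDEX (myindex)
WHERE `table`.`type_id` IN (11, 17, 12, 19) AND `table`.`group_id` = 310248
ORDER BY `table`.`id` ASC LIMIT 201;

SHOW SESSION STATUS LIKE 'Handler_read%';
```

Run each variant more than once, since the first run may also pay for pulling pages into the buffer pool.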
I have the following table structure.
town:
id (MEDIUMINT, PRIMARY KEY, auto_increment),
town (VARCHAR(150), not null),
lat (FLOAT(10,6), not null),
lng (FLOAT(10,6), not null)
I frequently use the query "SELECT * FROM town ORDER BY town". I tried indexing town, but the index is not being used. What is the best way to index this table so that I can speed up my queries?
EXPLAIN output (a UNIQUE index is present on town):
mysql> EXPLAIN SELECT * FROM studpoint_town order by town \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: studpoint_town
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: Using filesort
1 row in set (0.00 sec)
Regards,
Ravi.
Your EXPLAIN output indicates that currently the studpoint_town table has only 3 rows. As explained in the manual:
The output from EXPLAIN shows ALL in the type column when MySQL uses a table scan to resolve a query. This usually happens under the following conditions:
[...]
The table is so small that it is faster to perform a table scan than to bother with a key lookup. This is common for tables with fewer than 10 rows and a short row length. Don't worry in this case.
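If you want to see what the plan looks like when the index is used, you can look up the index name and force it. This is only a sketch: the index name town is an assumption (check the SHOW INDEX output for the real one), and I have not run it against this table, so no output is shown.

```sql
-- List the indexes to find the real name of the unique index on town.
SHOW INDEX FROM studpoint_town;

-- Assumed index name: town. FORCE INDEX makes a table scan look very
-- expensive, so the optimizer should read the rows in index order
-- instead of sorting them with a filesort.
EXPLAIN SELECT * FROM studpoint_town FORCE INDEX (town) ORDER BY town;
```

On a three-row table this is for inspection only; the table scan chosen by the optimizer is the cheaper plan.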
I have the following query:
SELECT t.id
FROM account_transaction t
JOIN transaction_code tc ON t.transaction_code_id = tc.id
JOIN account a ON t.account_number = a.account_number
GROUP BY tc.id
When I do an EXPLAIN the first row shows, among other things, this:
table: t
type: ALL
possible_keys: account_id,transaction_code_id,account_transaction_transaction_code_id,account_transaction_account_number
key: NULL
rows: 465663
Why is key NULL?
Another issue you may be encountering is a data type mismatch. For example, if your column is a string data type (CHAR, for example) and your query compares it to an unquoted number, then MySQL won't use the index.
SELECT * FROM tbl WHERE col = 12345; # No index
SELECT * FROM tbl WHERE col = '12345'; # Index
Source: Just fought this same issue today, and learned the hard way on MySQL 5.1. :)
Edit: Additional information to verify this:
mysql> desc das_table \G
*************************** 1. row ***************************
Field: das_column
Type: varchar(32)
Null: NO
Key: PRI
Default:
Extra:
*************************** 2. row ***************************
[SNIP!]
mysql> explain select * from das_table where das_column = 189017 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: das_table
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 874282
Extra: Using where
1 row in set (0.00 sec)
mysql> explain select * from das_table where das_column = '189017' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: das_table
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 34
ref: const
rows: 1
Extra:
1 row in set (0.00 sec)
It might be because the statistics are broken, or because MySQL knows that you always have a 1:1 ratio between the two tables.
You can force an index to be used in the query, and see if that would speed up things. If it does, try to run ANALYZE TABLE to make sure statistics are up to date.
By specifying USE INDEX (index_list), you can tell MySQL to use only one of the named indexes to find rows in the table. The alternative syntax IGNORE INDEX (index_list) can be used to tell MySQL to not use some particular index or indexes. These hints are useful if EXPLAIN shows that MySQL is using the wrong index from the list of possible indexes.
You can also use FORCE INDEX, which acts like USE INDEX (index_list) but with the addition that a table scan is assumed to be very expensive. In other words, a table scan is used only if there is no way to use one of the given indexes to find rows in the table.
Each hint requires the names of indexes, not the names of columns. The name of a PRIMARY KEY is PRIMARY. To see the index names for a table, use SHOW INDEX.
From http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
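Applied to the query in the question, the hint could look like the sketch below. The index name transaction_code_id is taken from the possible_keys list in the EXPLAIN output; whether forcing it actually helps has to be measured.

```sql
-- Refresh the index statistics first.
ANALYZE TABLE account_transaction;

-- The hint goes after the table alias and names an index, not a column.
EXPLAIN
SELECT t.id
FROM account_transaction t FORCE INDEX (transaction_code_id)
JOIN transaction_code tc ON t.transaction_code_id = tc.id
JOIN account a ON t.account_number = a.account_number
GROUP BY tc.id;
```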
Index for the group by (=implicit order by)
...
GROUP BY tc.id
The GROUP BY does an implicit sort on tc.id.
tc.id is not listed as a possible key,
but t.transaction_code_id is.
Change the code to
SELECT t.id
FROM account_transaction t
JOIN transaction_code tc ON t.transaction_code_id = tc.id
JOIN account a ON t.account_number = a.account_number
GROUP BY t.transaction_code_id
This will put the potential index transaction_code_id into view.
Indexes for the joins
If the joins (nearly) fully join the three tables, there's no need to use the index, so MySQL doesn't.
Other reasons for not using an index
If a large percentage of the rows under consideration (40% IIRC) are filled with the same value, MySQL does not use an index, because not using the index is faster.
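A quick way to check whether this applies is to look at how the rows are distributed over the indexed column. The sketch below does that for transaction_code_id on the table from the question; the column aliases are made up for illustration.

```sql
-- Share of rows per transaction_code_id value, most common values first.
SELECT transaction_code_id,
       COUNT(*) AS row_count,
       COUNT(*) / (SELECT COUNT(*) FROM account_transaction) AS share
FROM account_transaction
GROUP BY transaction_code_id
ORDER BY row_count DESC
LIMIT 5;
```

If a handful of values account for most of the table, the index is not selective for those values and a scan is a reasonable choice.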
table:
foreign_id_1
foreign_id_2
integer
date1
date2
primary(foreign_id_1, foreign_id_2)
Query: delete from table where (foreign_id_1 = ? or foreign_id_2 = ?) and date2 < ?
Without the date condition the query takes about 40 seconds, which is too slow :( With the date condition it takes much longer.
The options are:
create another table and insert select, then rename
use limit and run query multiple times
split query to run for foreign_id_1 then foreign_id_2
use select then delete by single row
Is there any faster way?
mysql> explain select * from compatibility where user_id = 193 or person_id = 193 \G
id: 1
select_type: SIMPLE
table: compatibility
type: index_merge
possible_keys: PRIMARY,compatibility_person_id_user_id
key: PRIMARY,compatibility_person_id_user_id
key_len: 4,4
ref: NULL
rows: 2
Extra: Using union(PRIMARY,compatibility_person_id_user_id); Using where
1 row in set (0.00 sec)
mysql> explain select * from compatibility where (user_id = 193 or person_id = 193) and updated_at < '2010-12-02 22:55:33' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: compatibility
type: index_merge
possible_keys: PRIMARY,compatibility_person_id_user_id
key: PRIMARY,compatibility_person_id_user_id
key_len: 4,4
ref: NULL
rows: 2
Extra: Using union(PRIMARY,compatibility_person_id_user_id); Using where
1 row in set (0.00 sec)
Having an OR in your WHERE clause makes MySQL reluctant to use indexes on your user_id and/or person_id fields, if it doesn't refuse completely (assuming there are any such indexes; showing the CREATE TABLE would tell us).
If you can add indexes (or modify existing ones since I'm thinking of compound indexes), I'd likely add two:
ALTER TABLE compatibility
ADD INDEX user_id_updated_at (user_id, updated_at),
ADD INDEX person_id_updated_at (person_id, updated_at);
Correspondingly, assuming the rows don't have to be deleted atomically (i.e. at the same instant), split the DELETE into two statements:
DELETE FROM compatibility WHERE user_id = 193 AND updated_at < '2010-12-02 22:55:33';
DELETE FROM compatibility WHERE person_id = 193 AND updated_at < '2010-12-02 22:55:33';
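These statements can also be combined with the "use limit and run query multiple times" option from the question, so that each statement locks and removes only a small batch. A minimal sketch, where the batch size of 1000 is an arbitrary choice:

```sql
-- Delete at most 1000 matching rows per statement.
DELETE FROM compatibility
WHERE user_id = 193 AND updated_at < '2010-12-02 22:55:33'
LIMIT 1000;

-- Rows removed by the previous statement; rerun the DELETE until this is 0.
SELECT ROW_COUNT();

-- Then do the same for the other column.
DELETE FROM compatibility
WHERE person_id = 193 AND updated_at < '2010-12-02 22:55:33'
LIMIT 1000;

SELECT ROW_COUNT();
```

The loop itself would live in application code or a stored procedure; only the statements are shown here.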
By now the data has grown to 40M rows (+33%) and is still growing rapidly, so I've started looking for another solution, possibly a NoSQL one.
Thanks.