Why are no keys used in this EXPLAIN? - mysql

I was expecting this query to use a key.
mysql> DESCRIBE Foo;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| name | varchar(50) | NO | UNI | NULL | |
+-------+-------------+------+-----+---------+----------------+
mysql> EXPLAIN SELECT id FROM Foo WHERE name='foo';
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE noticed after reading const tables |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
Foo has a unique index on name, so why isn't the index being used in the SELECT?

From the MySQL Manual page entitled EXPLAIN Output Format:
Impossible WHERE noticed after reading const tables (JSON property:
message)
MySQL has read all const (and system) tables and notice that the WHERE
clause is always false.
And the definition of const tables, from the page entitled Constants and Constant Tables:
A MySQL constant is something more than a mere literal in the query.
It can also be the contents of a constant table, which is defined as
follows:
A table with zero rows, or with only one row
A table expression that is restricted with a WHERE condition,
containing expressions of the form column = constant, for all the
columns of the table's primary key, or for all the columns of any of
the table's unique keys (provided that the unique columns are also
defined as NOT NULL).
The second reference is a page and a half long; please refer to it. The manual describes the const join type as follows:
const
The table has at most one matching row, which is read at the start of
the query. Because there is only one row, values from the column in
this row can be regarded as constants by the rest of the optimizer.
const tables are very fast because they are read only once.
const is used when you compare all parts of a PRIMARY KEY or UNIQUE
index to constant values. In the following queries, tbl_name can be
used as a const table:
SELECT * FROM tbl_name WHERE primary_key=1;
SELECT * FROM tbl_name WHERE primary_key_part1=1 AND
primary_key_part2=2;
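Put together, the quoted passages suggest the unique index was in fact consulted: MySQL treated Foo as a const table, looked up name='foo' while optimizing, found no matching row, and reported the WHERE as impossible. One way to check this reading is sketched below. The inserted row is hypothetical test data, not something from the question.

```sql
-- Hypothetical test row; 'foo' is the value searched for in the question.
INSERT INTO Foo (name) VALUES ('foo');

-- With a matching row present, the unique index on name should now show up
-- in the key column, with a join type of const.
EXPLAIN SELECT id FROM Foo WHERE name = 'foo';
```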

It could be because the table Foo holds very little data. In that case the optimizer will choose a table scan rather than going through the index.
As the MySQL documentation says:
Indexes are less important for queries on small tables, or big tables
where report queries process most or all of the rows. When a query
needs to access most of the rows, reading sequentially is faster than
working through an index. Sequential reads minimize disk seeks, even
if not all the rows are needed for the query.


MySQL Partitioning and Automatic Movement of Rows

I have a table with ~6M rows, and each query extracts around 20,000-30,000 of them with index optimization. However, because a lot of people run these queries back to back (every 30 seconds or so), the site often times out for people.
I recently migrated the database to a 3-server MySQL cluster with a huge amount of RAM (512GB per server), and performance hasn't improved much.
I was wondering whether partitioning would be the best way to improve performance. As I have absolutely no experience with partitioning, I thought I would ask here.
My question is this: all of these rows have a column whose value is 0, 1, 2 or 3.
Would it be possible to place all the rows with value 1 in that column on one partition, and all the rows with value 2 on another? Would rows move automatically when the value is updated in the primary table? And most importantly, could it help performance, since a query would only have to look through 20,000-30,000 rows instead of 6,000,000?
Yes, MySQL supports partitioning. You can define partitions like this:
CREATE TABLE MyTable (
id INT AUTO_INCREMENT PRIMARY KEY,
somestuff INT,
otherstuff VARCHAR(100),
KEY (somestuff)
) PARTITION BY HASH(id) PARTITIONS 4;
INSERT INTO MyTable () VALUES (), (), (), ();
You can verify how many rows are in each partition after this:
SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME='MyTable';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0 | 1 |
| p1 | 1 |
| p2 | 1 |
| p3 | 1 |
+----------------+------------+
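For an integer expression, HASH partitioning places a row in the partition numbered MOD(expression, number_of_partitions). A minimal sketch against the example table, which lists the partition each row should land in:

```sql
-- partition_number N corresponds to partition pN
SELECT id, MOD(id, 4) AS partition_number
FROM MyTable
ORDER BY id;
```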
However, there are two things that trip people up when they try to use partitioning in MySQL:
First, as https://dev.mysql.com/doc/refman/5.7/en/partitioning-limitations-partitioning-keys-unique-keys.html says:
every unique key on the table must use every column in the table's partitioning expression.
This means that if you want to partition by somestuff in the example above, you can't. That would violate the requirement that the primary key include every column named in the partitioning expression.
ALTER TABLE MyTable PARTITION BY HASH(somestuff) PARTITIONS 4;
ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function
You can get around this by removing any primary key or unique key constraints from your table, but that leaves you with kind of a malformed table.
Second, partitioning speeds up queries only if you can take advantage of partition pruning, and this happens only if your query conditions include the column used in the partition expression.
mysql> EXPLAIN PARTITIONS SELECT * FROM MyTable WHERE SomeStuff = 3;
+----+-------------+---------+-------------+------+---------------+-----------+---------+-------+------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------------+------+---------------+-----------+---------+-------+------+-------+
| 1 | SIMPLE | MyTable | p0,p1,p2,p3 | ref | somestuff | somestuff | 5 | const | 4 | NULL |
+----+-------------+---------+-------------+------+---------------+-----------+---------+-------+------+-------+
Note this says it will need to scan partitions p0,p1,p2,p3 — i.e. the whole table. There is no partition pruning, therefore no performance improvement because it is not reducing the number of rows examined.
If you do search for a specific value in the column used in the partitioning expression, you can see that MySQL is able to reduce the number of partitions it scans:
mysql> EXPLAIN PARTITIONS SELECT * FROM MyTable WHERE id = 3;
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | MyTable | p3 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+-------+
Partitioning can help a lot in very specific circumstances, but partitioning isn't as versatile as most people think.
In most cases, it's better to define more specific indexes in your table to support the queries you need to run.
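If partitioning on the 0-3 column from the question is still wanted, a sketch could look like the following. The table name and the status column are hypothetical, and the primary key is widened to (id, status) only to satisfy the unique-key rule described above. Queries benefit only when they filter on status, and an UPDATE that changes status moves the row to the matching partition.

```sql
-- Hypothetical table: status stands in for the 0-3 column from the question.
CREATE TABLE MyTableByStatus (
  id INT NOT NULL AUTO_INCREMENT,
  status TINYINT NOT NULL,
  otherstuff VARCHAR(100),
  PRIMARY KEY (id, status)  -- the partitioning column must be part of every unique key
)
PARTITION BY LIST (status) (
  PARTITION p_status0 VALUES IN (0),
  PARTITION p_status1 VALUES IN (1),
  PARTITION p_status2 VALUES IN (2),
  PARTITION p_status3 VALUES IN (3)
);

-- The partitions column should list only p_status1 if pruning applies.
-- On MySQL 5.6 and earlier, write EXPLAIN PARTITIONS instead.
EXPLAIN SELECT * FROM MyTableByStatus WHERE status = 1;
```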

simple SQL statement takes longer time to execute [duplicate]

I have a very simple table called Device in a MySQL database.
+-----------------------------------+--------------+------+-----+----------------+
| Field | Type | Null | Key | Extra |
+-----------------------------------+--------------+------+-----+----------------+
| DTYPE | varchar(31) | NO | | |
| id | bigint(20) | NO | PRI | auto_increment |
| dateCreated | datetime | NO | | |
| dateModified | datetime | NO | | |
| phoneNumber | varchar(255) | YES | MUL | |
| version | bigint(20) | NO | | |
| oldPhoneNumber | varchar(255) | YES | | |
+-----------------------------------+--------------+------+-----+----------------+
This table has more than 100K records. I am running a very simple query:
select * from AttDevice where phoneNumber = 5107357058;
This query takes 4-6 seconds. But when I change the query slightly, as shown below:
select * from AttDevice where phoneNumber = '5107357058';
it takes almost no time to execute.
Notice that the phoneNumber column is a varchar. I don't understand why the former takes so long and the latter doesn't. The only difference between the two queries is the single quotes.
Does MySQL treat these two queries differently? If so, why?
EDIT 1
I used EXPLAIN and got the following output but don't know how to interpret these two results.
mysql> EXPLAIN select * from AttDevice where phoneNumber = 5107357058;
+----+-------------+-----------+------+---------------------------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------------------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | Device | ALL | phoneNumber,idx_Device_phoneNumber | NULL | NULL | NULL | 6482116 | Using where |
+----+-------------+-----------+------+---------------------------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
mysql> EXPLAIN select * from AttDevice where phoneNumber = '5107357058';
+----+-------------+-----------+------+---------------------------------------+-------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------------------------------+-------------+---------+-------+------+-------------+
| 1 | SIMPLE | Device | ref | phoneNumber,idx_Device_phoneNumber | phoneNumber | 258 | const | 2 | Using where |
+----+-------------+-----------+------+---------------------------------------+-------------+---------+-------+------+-------------+
1 row in set (0.00 sec)
Can someone explain the key, key_len and rows columns in the EXPLAIN output?
1) Thank you for the "EXPLAIN". We all (including you, I'm sure) knew that the problem was a type conversion: to compare a varchar column with a number, MySQL has to convert the string in every row to a number. But your "EXPLAIN" proved it.
2) Here's a nice, short article about EXPLAIN:
http://www.lornajane.net/posts/2011/explaining-mysqls-explain
The *possible_keys* shows which indexes apply to this query and the key
tells us which of those was actually used -... Finally the rows entry tell
us how many rows MySQL had to look at to find the result set.
Search value:   key:         type:  ref:   rows:
-------------   ----         -----  ----   -----
5107357058      NULL         ALL    NULL   6482116
'5107357058'    phoneNumber  ref    const  2
3) The "ref" column shows "the columns compared to the index". In the second case, the string literal ("constant") '5107357058' was compared to the key "phoneNumber". In the first case, there was no usable key (because your search condition was a completely different type); hence "ref" was NULL.
4) The "type" column is "the join type". "ref" means "All rows with matching index values are read from this table" (in this case, 2 rows). "ALL" means "full table scan", which in this case means 6 million rows.
5) Here's the mysql documentation for "EXPLAIN":
http://dev.mysql.com/doc/refman/5.5/en/explain-output.html
You fooled MySQL into making a bad choice by NOT quoting the phone number. Consider:
The column definition is varchar
In the first (unquoted) case you provided the value as an integer (long). I would have thought MySQL could figure this one out, but obviously it didn't, and did a full table scan.
In the second (quoted) case, you gave the search key in the correct datatype (character) and MySQL chose the index over the full-table-scan.
The varchar index cannot be used when you use a number as the operand. Here is an excerpt from the fine documentation on implicit type conversions:
For comparisons of a string column with a number, MySQL cannot use an index on the column to look up the value quickly. If str_col is an indexed string column, the index cannot be used when performing the lookup in the following statement:
SELECT * FROM tbl_name WHERE str_col=1;
The reason for this is that there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.
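The quoted rule can be tried directly. This is a small sketch using the strings from the documentation excerpt and the table and value from the question; nothing here was run against the asker's data.

```sql
-- Each string is converted to a number before the comparison, so all three
-- comparisons are expected to be true (1). The last one should also raise
-- a truncation warning.
SELECT '1' = 1, ' 1' = 1, '1a' = 1;

-- Quoting the literal keeps the comparison string-to-string, which lets
-- MySQL use the index on phoneNumber.
SELECT * FROM AttDevice WHERE phoneNumber = '5107357058';
```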
I believe that in the first example MySQL has to convert the varchar value of every row into a number. In the second example it does not. I'm guessing that's where the difference is coming from.
The first example goes through the table row by row; the second one uses the index.
http://dev.mysql.com/doc/refman/5.0/en/show-columns.html
If Key is MUL, multiple occurrences of a given value are permitted within the column. The column is the first column of a nonunique index or a unique-valued index that can contain NULL values.
So phoneNumber has a nonunique index. Instead of scanning every row, the second query can look up the matching values in that index, which speeds things up.
....I think.

MySQL MyISAM table index cardinality is zero

I have a table containing 60 million rows. The structure is like entryid, date, sourceid, detail, views. (entryid, date, sourceid, detail) is the PK, and I also have indexes for each field except views.
The problem is that the cardinalities of the four indexes are zero, but I am sure they should not be.
Why is that? And does it mean the indexes don't work?
It's possible that the table statistics have not been updated.
See this page on optimizing MyISAM tables:
To help MySQL better optimize queries, use ANALYZE TABLE or run
myisamchk --analyze on a table after it has been loaded with data.
This updates a value for each index part that indicates the average
number of rows that have the same value. (For unique indexes, this is
always 1.) MySQL uses this to decide which index to choose when you
join two tables based on a nonconstant expression. You can check the
result from the table analysis by using SHOW INDEX FROM tbl_name and
examining the Cardinality value. myisamchk --description --verbose
shows index distribution information.
The best way to determine whether an index is helping is to explain a query:
mysql> explain select 1;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)
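Applied to the table in the question, the advice above might be sketched like this. The table name entries and the literal 42 are placeholders; sourceid is one of the indexed columns the question mentions.

```sql
-- entries is a placeholder for the 60-million-row table.
ANALYZE TABLE entries;    -- refresh the index statistics
SHOW INDEX FROM entries;  -- the Cardinality column should no longer be zero

-- Check whether a real query actually picks up one of the indexes.
EXPLAIN SELECT * FROM entries WHERE sourceid = 42;
```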

Why does removing this index in MySQL speed up my query 100x?

I have the following MySQL table (simplified):
CREATE TABLE `track` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(256) NOT NULL,
`is_active` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `is_active` (`is_active`, `id`)
) ENGINE=MyISAM AUTO_INCREMENT=7495088 DEFAULT CHARSET=utf8
The 'is_active' column marks rows that I want to ignore in most, but not all, of my queries. I have some queries that read chunks out of this table periodically. One of them looks like this:
SELECT id,title from track where (track.is_active=1 and track.id > 5580702) ORDER BY id ASC LIMIT 10;
This query takes over a minute to execute. Here's the execution plan:
> EXPLAIN SELECT id,title from track where (track.is_active=1 and track.id > 5580702) ORDER BY id ASC LIMIT 10;
+----+-------------+-------+------+----------------+--------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+--------+---------+-------+---------+-------------+
| 1 | SIMPLE | t | ref | PRIMARY,is_active | is_active | 1 | const | 3747543 | Using where |
+----+-------------+-------+------+----------------+--------+---------+-------+---------+-------------+
Now, if I tell MySQL to ignore the 'is_active' index, the query happens instantaneously.
> EXPLAIN SELECT id,title from track IGNORE INDEX(is_active) WHERE (track.is_active=1 AND track.id > 5580702) ORDER BY id ASC LIMIT 10;
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
| 1 | SIMPLE | t | range | PRIMARY | PRIMARY | 4 | NULL | 1597518 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
Now, what's really strange is that if I FORCE MySQL to use the 'is_active' index, the query once again happens instantaneously!
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
| 1 | SIMPLE | t | range | is_active |is_active| 5 | NULL | 1866730 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+---------+-------------+
I just don't understand this behavior. In the 'is_active' index, rows should be sorted by is_active, followed by id. I use both the 'is_active' and 'id' columns in my query, so it seems like it should only need to do a few hops around the tree to find the IDs, then use those IDs to retrieve the titles from the table.
What's going on?
EDIT: More info on what I'm doing:
Query cache is disabled
Running OPTIMIZE TABLE and ANALYZE TABLE had no effect
6,620,372 rows have 'is_active' set to True. 874,714 rows have 'is_active' set to False.
Using FORCE INDEX(is_active) once again speeds up the query.
MySQL version 5.1.54
It looks like MySQL is making a poor decision about how to use the index.
From that query plan, it is showing it could have used either the PRIMARY or is_active index, and it has chosen is_active in order to narrow by track.is_active first. However, it is only using the first column of the index (track.is_active). That gets it 3747543 results which then have to be filtered and sorted.
If it had chosen the PRIMARY index, it would be able to narrow down to 1597518 rows using the index, and they would be retrieved in order of track.id already, which should require no further sorting. That would be faster.
New information:
In the third case where you are using FORCE INDEX, MySQL is using the is_active index but now instead of only using the first column, it is using both columns (see key_len). It is therefore now able to narrow by is_active and sort and filter by id using the same index, and since is_active is a single constant, the ORDER BY is satisfied by the second column (i.e. the rows from a single branch of the index are already in sorted order). This seems to be an even better outcome than using PRIMARY - and probably what you intended in the first place, right?
I don't know why it wasn't using both columns of this index without FORCE INDEX, unless the query has changed in a subtle way in between. If not I'd put it down to MySQL making bad decisions.
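The question shows only the plan for the forced-index case, not the statement itself. Presumably it looked something like the sketch below. This is an assumption, built from the original query and the FORCE INDEX(is_active) hint mentioned in the question's edit.

```sql
-- Assumed form of the third query. A key_len of 5 in the plan would mean
-- both index columns are used (1 byte for is_active plus 4 bytes for id).
EXPLAIN SELECT id, title
FROM track FORCE INDEX (is_active)
WHERE track.is_active = 1 AND track.id > 5580702
ORDER BY id ASC
LIMIT 10;
```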
I think the speedup is due to your where clause. I am assuming that it is only retrieving a small subset of the rows in the entire large table. It is faster to do a table scan of the retrieved data for is_active on the small subset than to do the filtering through a large index file. Traversing a single column index is much faster than traversing a combined index.
A few things you could try:
Run OPTIMIZE and CHECK on your table, so MySQL will recalculate the index values
Have a look at http://dev.mysql.com/doc/refman/5.1/en/index-hints.html - you can tell MySQL to choose the right index in different cases

MySQL datetime index is not working

Table structure:
+-------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| total | int(11) | YES | | NULL | |
| thedatetime | datetime | YES | MUL | NULL | |
+-------------+----------+------+-----+---------+----------------+
Total rows: 137967
mysql> explain select * from `out` where thedatetime <= NOW();
+----+-------------+-------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | out | ALL | thedatetime | NULL | NULL | NULL | 137967 | Using where |
+----+-------------+-------------+------+---------------+------+---------+------+--------+-------------+
The real query is much longer, with more table joins. The point is, I can't get the table to use the datetime index. This is going to be hard for me if I want to select all data up to a certain date. However, I noticed that I can get MySQL to use the index if I select a smaller subset of data.
mysql> explain select * from `out` where thedatetime <= '2008-01-01';
+----+-------------+-------------+-------+---------------+-------------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+-------------+---------+------+-------+-------------+
| 1 | SIMPLE | out | range | thedatetime | thedatetime | 9 | NULL | 15826 | Using where |
+----+-------------+-------------+-------+---------------+-------------+---------+------+-------+-------------+
mysql> select count(*) from `out` where thedatetime <= '2008-01-01';
+----------+
| count(*) |
+----------+
| 15990 |
+----------+
So, what can I do to make sure MySQL will use the index no matter what date I use?
There are two things in play here:
Index is not selective enough - if the condition matches more than roughly 30% of the rows, MySQL will usually decide a full table scan is more efficient. When you narrow the range, the index kicks in.
One index per table in a join
The real query is much longer,
with more table joins. The point is ...
The point is that, precisely because the query has joins, it probably can't use that index. MySQL can use one index per table in a join (unless it qualifies for an index-merge optimization). If the primary key is already used for the join, the thedatetime index won't be used. In order to use it, you need to create a multi-column index on the join key + thedatetime, in the correct order.
Check the EXPLAIN of the actual query to see which key MySQL uses for the join. Modify that index to include the thedatetime column as well, or create a new multi-column index from both (depending on what you use the join key for).
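As a sketch of that suggestion: join_key below is a hypothetical name for whatever column the real query joins on, and the index name is made up as well. The real query is not shown in the question, so the right column order depends on it.

```sql
-- join_key is hypothetical; replace it with the column the real query joins on.
ALTER TABLE `out`
  ADD INDEX idx_join_key_thedatetime (join_key, thedatetime);
```

After adding the index, re-run EXPLAIN on the real query to see whether the new index appears in the key column.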
Everything works as it is supposed to. :)
Indexes are there to speed up retrieval. They do it using index lookups.
In your first query the index is not used because you are retrieving ALL rows, and in this case using the index is slower (look up index, get row, look up index, get row... times the number of rows is slower than getting all rows at once, i.e. a table scan).
In the second query you are retrieving only a portion of the data, and in this case a table scan is much slower.
The job of the optimizer is to use the statistics that the RDBMS keeps on the index to determine the best plan. In the first case the index was considered, but the planner (correctly) threw it away.
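To test that judgment rather than take it on faith, an index hint can override the planner for one query. This is only a sketch for comparing timings: given the reasoning above, the forced plan may well turn out slower than the table scan. The index name thedatetime is taken from the possible_keys column in the question.

```sql
-- Force the index for this one statement and compare against the unhinted query.
EXPLAIN SELECT * FROM `out` FORCE INDEX (thedatetime)
WHERE thedatetime <= NOW();
```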
EDIT
You might want to read something like this to get some concepts and keywords regarding mysql query planner.