I've got a table with a column called "date".
The table looks something like this:
CREATE TABLE IF NOT EXISTS `offers_log_archive` (
...
`date` date DEFAULT NULL,
...
KEY `date` (`date`)
) ENGINE=InnoDB
I perform the following query on this table:
SELECT
*
FROM
offers_log_archive as ola
WHERE
ola.date >= "2012-12-01" and
ola.date <= "2012-12-31"
Then I did the following:
explain (SELECT
*
FROM
offers_log_archive as ola
WHERE
ola.date >= "2012-12-01" and
ola.date <= "2012-12-31" );
The result of this explain is
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ola ALL date NULL NULL NULL 6206460 Using where
Why do I get type ALL? As far as I know, that means the query inspects every row in the table and ignores the index on date, although I would expect MySQL to use it.
What is happening here, and why is the date index ignored?
Almost all values in your column fall within the range of the query, so the index would add little value. Worse, using it would be much more expensive than a simple table scan, because every matching index entry still requires a lookup of the full row to satisfy SELECT *.
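One way to check this reasoning is to measure what fraction of the table the range actually covers. The query below is only a sketch against the table from the question; the column aliases are made up for illustration.

```sql
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matching rows.
-- Rows whose date is NULL are ignored by SUM() but still counted by COUNT(*).
SELECT
  COUNT(*) AS total_rows,
  SUM(ola.`date` BETWEEN '2012-12-01' AND '2012-12-31') AS rows_in_range,
  SUM(ola.`date` BETWEEN '2012-12-01' AND '2012-12-31') / COUNT(*) AS fraction_in_range
FROM offers_log_archive AS ola;
```

If fraction_in_range comes out close to 1, a full scan is a reasonable choice for a SELECT * query.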
Edit
Try first running ANALYZE on the table:
ANALYZE TABLE MYTABLE
If that doesn't help, try changing the syntax to use BETWEEN:
WHERE ola.date BETWEEN '2012-12-01' AND '2012-12-31'
Related
I've been working on optimizing MySQL queries lately. One of the issues I've encountered is that DATE() does not seem to work well with a table partitioned by date range.
Here is the sample table:
CREATE TABLE `testing_db` (
`date_time` date NOT NULL,
`id` varchar(10) NOT NULL,
PRIMARY KEY (`date_time`,`id`) USING BTREE,
UNIQUE KEY `unique` (`date_time`,`id`),
KEY `idx_date_time` (`date_time`),
KEY `idx_id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
/*!50100 PARTITION BY RANGE (to_days(`date_time`))
(PARTITION p0 VALUES LESS THAN (TO_DAYS('2021-01-01')),
PARTITION p2021_01 VALUES LESS THAN (TO_DAYS('2021-01-31')),
PARTITION p2021_02 VALUES LESS THAN (TO_DAYS('2021-02-28')),
PARTITION future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;
Statement without DATE():
EXPLAIN
SELECT date_time, id FROM testing_db WHERE date_time = '2021-02-25';
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE testing_db p2021_02 ref PRIMARY,unique,idx_date_time,idx_id PRIMARY 3 const 1 100.00 Using index
Statement with DATE():
EXPLAIN
SELECT date_time, id FROM testing_db WHERE DATE(date_time) = '2021-02-25';
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE testing_db p0,p2021_01,p2021_02,future index idx_date_time 3 1 100.00 Using where; Using index
Comparing the two EXPLAIN outputs, the statement with DATE() clearly scans all partitions while the statement without DATE() does not. The impact may be significant on a large table.
I've researched similar issues, but it seems they are not relevant to this case:
Official Doc, DATE() extracts the date part of the date or datetime.
Mysql, partitioning not working on date range
https://bugs.mysql.com/bug.php?id=28928
Could you help figure it out? Thanks a lot!
The use of the DATE() function in your WHERE clause negates the use of any relevant index which causes a full table scan. The full table scan will need to read from all partitions.
In your example you are applying the DATE() function to a column of type DATE, so it serves no purpose.
INDEX(date_time) is unnecessary because there are two other indexes starting with that column.
A PRIMARY KEY is (in MySQL) a UNIQUE key, so your UNIQUE(date_time, id) is redundant.
Usually it is unwise to start any index with the partition key (date_time).
WHERE DATE(date_time) = ... is not "sargable". That is, no indexing of date_time can be used when hiding a column in a function (DATE()). (This is the main problem that you are asking about.)
Instead of using DATE(), use a range, such as:
WHERE date_time >= '2021-02-26'
AND date_time < '2021-02-26' + INTERVAL 1 DAY
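To confirm that the range form lets MySQL prune partitions, the same EXPLAIN can be run with the rewritten predicate. This sketch reuses the testing_db table and the date from the question; the exact plan will depend on your data and server version.

```sql
EXPLAIN
SELECT date_time, id
FROM testing_db
WHERE date_time >= '2021-02-25'
  AND date_time <  '2021-02-25' + INTERVAL 1 DAY;
-- Check the partitions column: it should no longer list every partition
-- (p0,p2021_01,p2021_02,future) as it did with DATE(date_time).
```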
Based on the above comments, plus other things, just these two indexes would be better:
PRIMARY KEY(id, date_time),
INDEX(date_time, id)
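As a sketch, that index change could be applied to the testing_db table from the question in one statement. The name idx_date_id is made up for illustration, and the statement rebuilds the table, so try it on a copy first.

```sql
ALTER TABLE testing_db
  DROP PRIMARY KEY,
  DROP INDEX `unique`,           -- redundant with the primary key
  DROP INDEX idx_date_time,      -- covered by the new (date_time, id) index
  DROP INDEX idx_id,             -- covered by the new primary key, which starts with id
  ADD PRIMARY KEY (id, date_time),
  ADD INDEX idx_date_id (date_time, id);  -- hypothetical index name
```

The partitioning column (date_time) is still part of the primary key, which MySQL requires for a partitioned table.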
Please don't call it date_time when it is only a DATE. My comments work for either datatype. The DATE() function is never needed around a column of datatype DATE nor a string that looks like a date.
Your partition definitions put the last day of each month in the 'wrong' partition.
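For example, p2021_01 is defined as LESS THAN (TO_DAYS('2021-01-31')), so rows dated 2021-01-31 land in p2021_02. A sketch of boundaries that keep each month together, assuming the intent is one partition per calendar month, uses the first day of the following month as the upper bound:

```sql
-- LESS THAN is exclusive, so the bound must be the first day of the NEXT month.
ALTER TABLE testing_db
PARTITION BY RANGE (TO_DAYS(`date_time`)) (
  PARTITION p0       VALUES LESS THAN (TO_DAYS('2021-01-01')),
  PARTITION p2021_01 VALUES LESS THAN (TO_DAYS('2021-02-01')),
  PARTITION p2021_02 VALUES LESS THAN (TO_DAYS('2021-03-01')),
  PARTITION future   VALUES LESS THAN MAXVALUE
);
```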
Be aware that PARTITIONing rarely helps with performance. I discuss that further in Pagination
I have a table with an indexed column named date_c with type Date. The problem is when I query something like this
explain select * from table where date_c > '2021-06-29'
the possible_keys is idx_..75 and key is idx_..75 too but when I change comparison like this:
explain select * from table where date_c < '2021-06-29'
in this case possible_keys is idx_..75 again but key is NULL
Why is that? Also I tried these but got the same results:
explain select * from table where date_c > DATE(NOW()) - INTERVAL 2 DAY -- key is not NULL
explain select * from table where date_c > DATE(NOW()) - INTERVAL 20 DAY -- key is NULL
Why does MySQL sometimes not use the key listed in possible_keys?
Here is the "explain" of my query:
explain
select eil.sell_fmt, count(sell_fmt) as itemCount
from table_items eil
where eil.cl_Id=123 and eil.si_Id='0'
and start_date <= now() and end_date is not null and end_date < NOW()
group by eil.sell_fmt
without date (start_date, end_date) filters:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE eil ref table_items_clid_siid_sellFmt 39 const,const 7393 Using where; Using index
With date filters:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE eil ref table_items_clid_siid_sellFmt 39 const,const 8400 Using where
possible_keys are:
table_items_clid_siid, table_items_clid_siid_itemId, table_items_clid_siid_startDate_endDate, table_items_clid_siid_sellFmt
The query without date filters is very fast (0.4 sec), but with date filters it takes about 30 seconds. The table holds only 14K records in total.
Table field types:
`cl_Id` int(11) NOT NULL,
`si_Id` varchar(11) NOT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`sell_fmt` varchar(20) DEFAULT NULL
I concatenated field-names to give index names, so you can estimate combined fields available in the index.
Can somebody guide me here? What is going on, what is the best course of action, and where am I going wrong?
I need one more suggestion: in another query on the same table, a user can filter on up to 10 fields, in no definite order (a random number of fields in random order). That type of search would be too slow again. What is the best strategy then? One covering index with "all" possible searchable fields? If yes, does the order of the fields in the index matter (i.e. if that order differs from the order of the fields in the query, will the index still be used)?
First, without seeing your CREATE TABLE statement, I can offer the following: create a composite (multi-column) index that best matches the columns your queries commonly use in the WHERE clause, starting with the column that narrows the rows down the most. You are explicitly filtering on cl_Id and si_Id plus the start and end dates, so those columns belong in the index. Since you also have a GROUP BY, I would add its column to the index as well, making it a completely COVERING index so the engine does not need to go back to the raw data to complete the query; it can resolve every field from the index directly.
I would have an index on
( cl_id, si_id, start_date, end_date, sell_fmt )
Finally, change your count from count(sell_fmt) to just count(*), which says "I don't care about a specific field; as long as a record is found, count it."
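Put together, the suggestion might look like the sketch below. The index name table_items_covering is made up for illustration, and the column list simply follows the one given above.

```sql
-- Hypothetical index name; columns as suggested above.
CREATE INDEX table_items_covering
    ON table_items (cl_Id, si_Id, start_date, end_date, sell_fmt);

SELECT eil.sell_fmt, COUNT(*) AS itemCount
FROM table_items eil
WHERE eil.cl_Id = 123
  AND eil.si_Id = '0'
  AND eil.start_date <= NOW()
  AND eil.end_date < NOW()   -- already excludes NULL end_date values
GROUP BY eil.sell_fmt;
```

One thing to keep in mind: sell_fmt is nullable, and COUNT(sell_fmt) skips NULLs while COUNT(*) does not, so the count reported for a NULL sell_fmt group will differ between the two forms.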
What is the best approach for my scenario?
I have a table with nearly 20,000,000 records, which basically stores what users have done on the site.
id -> primary int 11 auto increment
user_id -> index int 11 not null
create_date -> ( no index yet ) date-time not null
it has other columns but seems irrelevant to name them here
I know I must put an index on create_date, but should it be a single-column index or a two-column index, and which column should come first in the two-column index (given the large number of records)?
By the way, the query that I'm using now looks like this:
select max(id) -- in here I'm selecting actions that users have done, after this date, since date is today
from table t
where
t.create_date >= '2014-12-29 00:00:00'
group by t.user_id
Could you edit your question to include the EXPLAIN output of your SELECT? Meanwhile, you can try the following:
Partition the table on your date field, create_date.
Build your index with the most restrictive criterion first. I think that in your case create_date + user_id will work better:
CREATE INDEX index_name
ON table_name ( create_date , user_id );
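To see whether the new index is actually picked up, the query from the question can be run through EXPLAIN before and after creating it. In this sketch user_actions is a stand-in for the real table name, which the question does not give.

```sql
-- user_actions is a hypothetical table name.
EXPLAIN
SELECT MAX(id)
FROM user_actions t
WHERE t.create_date >= '2014-12-29 00:00:00'
GROUP BY t.user_id;
-- Compare the key and rows columns with and without the (create_date, user_id) index.
```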
I have been playing around with indexes on MySQL (5.5.24, WinXP), but I can't work out why the server does not use an index when LIKE is used.
The example is this:
I have created a test table:
create table testTable (
id varchar(50) primary key,
text1 varchar(50) not null,
startDate varchar(50) not null
) ENGINE = innodb;
Then, I added an index to startDate. (Please, do not ask why the column is a text and not a date time.. this is just a simple test):
create index jeje on testTable(startdate);
analyze table testTable;
After that, I added almost 200,000 rows in which startDate had 3 possible values (each one appearing in about a third of the rows, nearly 70,000 times).
So, if I run an EXPLAIN command like this:
explain select * from testTable use index (jeje) where startDate = 'aaaaaaaaa';
The answer is the following:
id = 1
select_type = SIMPLE
type = ref
possible_keys = jeje
key = jeje
rows = 88412
extra = Using where
So the key is used, and the rows estimate is in the region of 200,000/3, so all is OK.
The problem is that if I change the query to the following (just changing '=' to 'LIKE'):
explain select * from testTable use index(jeje) where startDate LIKE 'aaaaaaaaa';
In this case, the answer is:
id = 1
select_type = SIMPLE
type = ALL
possible_keys = jeje
key = null
rows = 176824
extra = Using where
So the index is not being used now (key is null, and rows is close to the full table, as type=ALL suggests).
The MySQL documentation says that LIKE DOES make use of indexes.
So, what am I not seeing here? Where is the problem?
Thanks for your help.
MySQL can ignore an index if using it would mean accessing more than roughly 30% of the table's rows.
You could try FORCE INDEX (index_name), which makes the optimizer treat a table scan as very expensive so that the index is strongly preferred.
The max_seeks_for_key system variable also affects whether the index is used or not:
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_max_seeks_for_key
Try changing this value to a smaller number.
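As a sketch, both suggestions can be tried in a single session against the test table from the question. The value 100 is an arbitrary example, not a recommendation.

```sql
-- Inspect the current setting, then lower it for this session only.
SHOW VARIABLES LIKE 'max_seeks_for_key';
SET SESSION max_seeks_for_key = 100;

-- Note the parentheses around the index name in the hint.
EXPLAIN
SELECT * FROM testTable FORCE INDEX (jeje)
WHERE startDate LIKE 'aaaaaaaaa';
```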
Search for similar requests on SO.
Based on Ubik's comment, and after changing the data, I found the following.
The Index IS used in these cases:
- explain select * from testTable force index (jeje) where startDate like 'aaaaaaadsfadsfadsfasafsafsasfsadsfa%';
- explain select * from testTable force index (jeje) where startDate like 'aaa';
But the index is NOT being used when I use this query:
- explain select * from testTable force index (jeje) where startDate like 'aaaaaaaaa';
Given that all the values in the startDate column have the same length (9 characters), when I run a LIKE query with a 9-character constant, PERHAPS MySQL prefers not to use the index because of some performance heuristic, and goes to the table instead.
My concern was whether I had made some kind of mistake in my original tests, but now I think that the index and the tests are correct, and that MySQL in some cases simply decides not to use the index, and I will rely on this.
For me, this is a closed task.
If somebody wants to add something to the thread, you are welcome to.