What is the optimal index for this query? - mysql

I have the following table in MySQL 8:
create table session
(
ID bigint unsigned auto_increment
primary key,
session_id varchar(255) null,
username varchar(255) null,
session_status varchar(255) null,
session_time int null,
time_created int null,
time_last_updated int null,
time_ended int null,
date_created date null
);
I'm executing the following statement:
select * from session where username = VALUE and session_id = VALUE order by time_created desc
What is the optimal index for the table to speed up this query?
The EXPLAIN query tells me I have two potential indexes, which are:
create index username_3
on session (username, time_created);
create index username_session_id_time_created_desc
on session (username, session_id, time_created desc);
I would have thought the index 'username_session_id_time_created_desc' would have been picked, however the EXPLAIN statement says that index 'username_3' is selected instead.
EDIT:
Result of SHOW CREATE TABLE session:
CREATE TABLE `session` (
`ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`session_id` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`username` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`session_status` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`session_time` int(11) DEFAULT NULL,
`time_created` int(11) DEFAULT NULL,
`time_last_updated` int(11) DEFAULT NULL,
`time_ended` int(11) DEFAULT NULL,
`date_created` date DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `username_3` (`username`,`time_created`),
KEY `username_session_id_time_created_desc` (`username`,`session_id`,`time_created`)
) ENGINE=InnoDB AUTO_INCREMENT=76149265 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
Result of EXPLAIN statement:
select_type: SIMPLE
type: ref
possible_keys: username_3,username_session_id_time_created_desc
key: username_3
key_len: 768
ref: const
rows: 1
Extra: Using where

For this query:
select *
from session
where username = %s and session_id = %s
order by time_created desc
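The optimal index is (username, session_id, time_created desc): the two equality columns come first, followed by the sort column, so MySQL can both filter and read the rows in the required order straight from the index. The first two columns can be in either order.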

First I thought you had a typo because you wrote
create index username_session_id_time_created_desc on session (username, session_id, time_created desc);
But your create table shows
KEY `username_session_id_time_created_desc` (`username`,`session_id`,`time_created`)
instead of
KEY `username_session_id_time_created_desc` (`username`,`session_id`,`time_created` DESC)
However, I now think MySQL 8.0 picks username_3 because it can do a backward index scan (backward_index_scan), reading the ascending index in reverse to produce the descending order.
How you can tell: there is no "Using filesort" in your EXPLAIN, so the optimizer must be using the username_3 index to sort the result set. (If you remove time_created from the indexes, you will see Using filesort. If you are not on MySQL 8.0, it may be that your version can also do the sorting with the username_3 index.)
A fiddle on 5.7 shows "key": "username_session_id_time_created_desc", though not always; it might depend on the index length (field length).
In MySQL 8.0, by contrast, it shows "key": "username_3" and "backward_index_scan": true.
With only the 3-column index present, it shows a lower query cost, so why choose the other index?
My guess is that the 2-column index is much shorter and, since a backward index scan is possible, the sorting is cheap, so the optimizer prefers less I/O at the price of a little more computation.
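One way to check this guess is to make the optimizer cost each index in turn and compare the output. This is only a sketch: the literal values 'alice' and 'abc123' are placeholders, not values from the question.

```sql
-- Placeholder literals; substitute a real username / session_id pair.
-- Plan and cost when restricted to the 2-column index:
EXPLAIN FORMAT=JSON
SELECT * FROM session FORCE INDEX (username_3)
WHERE username = 'alice' AND session_id = 'abc123'
ORDER BY time_created DESC;

-- Plan and cost when restricted to the 3-column index:
EXPLAIN FORMAT=JSON
SELECT * FROM session FORCE INDEX (username_session_id_time_created_desc)
WHERE username = 'alice' AND session_id = 'abc123'
ORDER BY time_created DESC;
```

In the JSON output, compare "query_cost", "used_key_parts" and, on MySQL 8.0, "backward_index_scan" between the two plans.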

Related

Why the index on my MySQL table is not being used?

I have a table in MySQL (InnoDB engine) with 100M records. Structure is as below:
CREATE TABLE LEDGER_AGR (
ID BIGINT(20) NOT NULL AUTO_INCREMENT,
`Booking` int(11) NOT NULL,
`LType` varchar(5) NOT NULL,
`PType` varchar(5) NOT NULL,
`FType` varchar(5) DEFAULT NULL,
`TType` varchar(10) DEFAULT NULL,
`AccountCode` varchar(55) DEFAULT NULL,
`AgAccountId` int(11) DEFAULT '0',
`TransactionDate` date NOT NULL,
`DebitAmt` decimal(37,6) DEFAULT '0.000000',
`CreditAmt` decimal(37,6) DEFAULT '0.000000',
KEY `TRANSACTION_DATE` (`TransactionDate`)
)
ENGINE=InnoDB;
When I am doing:
EXPLAIN
SELECT * FROM LEDGER_AGR
WHERE TransactionDate >= '2000-08-01'
AND TransactionDate <= '2017-08-01'
It is not using TRANSACTION_DATE index. But when I am doing:
EXPLAIN
SELECT * FROM LEDGER_AGR
WHERE TransactionDate = '2000-08-01'
it is using TRANSACTION_DATE index. Could someone explain please?
Range query #1 has poor selectivity. Equality query #2 has excellent selectivity. The optimizer is very likely to choose the index access path when the result rows will be < 1% of the total rows in the table. It is unlikely to prefer the index when the result rows will be a large fraction of the total, for example a half or a quarter of all rows, because fetching that many rows through a secondary index costs more than simply scanning the table.
A range of '2000-08-01' through '2000-08-03' would likely use the index.
cf: mysql not using index?
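To see the selectivity argument in numbers, the following sketch measures the fraction of rows the wide range matches and then runs EXPLAIN on a narrow range. It only uses the table and index from the question; note that the first statement scans the whole table.

```sql
-- Fraction of all rows matched by the wide range (full scan; run it off-peak).
SELECT SUM(TransactionDate >= '2000-08-01'
           AND TransactionDate <= '2017-08-01') / COUNT(*) AS matched_fraction
FROM LEDGER_AGR;

-- A narrow range: the optimizer is expected to pick TRANSACTION_DATE here.
EXPLAIN
SELECT * FROM LEDGER_AGR
WHERE TransactionDate >= '2000-08-01'
  AND TransactionDate <= '2000-08-03';

-- Forcing the index on the wide range lets you time it against the table scan.
EXPLAIN
SELECT * FROM LEDGER_AGR FORCE INDEX (TRANSACTION_DATE)
WHERE TransactionDate >= '2000-08-01'
  AND TransactionDate <= '2017-08-01';
```

If matched_fraction comes out close to 1, the table scan the optimizer chose is very likely the cheaper plan.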

MySQL EXPLAIN not using index with WHERE + ORDER BY (both part of the index)?

I've added an index (IDX_D34A04AD46C53D4C41FA5CD2) to my product table in order to speed up searching for enabled products, ordered by price ascending:
CREATE TABLE `product` (
`id` varchar(16) COLLATE utf8_unicode_ci NOT NULL,
`unit_price` decimal(13,4) NOT NULL,
`stock_qty` int(11) DEFAULT NULL,
`is_enabled` tinyint(1) NOT NULL,
`min_sale_qty` int(11) DEFAULT NULL,
`max_sale_qty` int(11) DEFAULT NULL,
`package_qty` int(11) DEFAULT NULL,
`name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`is_new` tinyint(1) NOT NULL,
`created_at` date NOT NULL,
`package_type` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_D34A04ADBF396750` (`id`),
KEY `IDX_D34A04AD41FA5CD2` (`unit_price`),
KEY `IDX_D34A04AD46C53D4C` (`is_enabled`),
KEY `IDX_D34A04AD46C53D4C41FA5CD2` (`is_enabled`,`unit_price`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Searching for active products, order by price, showing 50 items per page:
EXPLAIN SELECT * FROM product WHERE is_enabled > 0 ORDER BY unit_price ASC LIMIT 0, 50;
Output:
id: 1
select_type: SIMPLE
table: product
type: index
possible_keys: IDX_D34A04AD46C53D4C,IDX_D34A04AD46C53D4C41FA5CD2
key: IDX_D34A04AD41FA5CD2
key_len: 6
rows: 100
Extra: Using where
Can you simply explain what I'm doing wrong and why I can't achieve "Using Index Condition" in my example?
EDIT: from MySQL documentation:
The following queries use the index to resolve the ORDER BY part:
SELECT * FROM t1 WHERE key_part1 = constant ORDER BY key_part2;
It seems exactly the same example as mine.
This is your query:
SELECT *
FROM product
WHERE is_enabled > 0
ORDER BY unit_price ASC
LIMIT 0, 50;
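Because the condition on is_enabled is an inequality (a range), index use stops at that column: the rows within the range are not ordered by unit_price, so the composite index cannot be used for the sort. Alternatively, MySQL can use the index on unit_price for the ordering and apply the is_enabled filter to each row it reads, which is what your EXPLAIN shows.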
The following should use the index:
SELECT *
FROM product
WHERE is_enabled = 1
ORDER BY unit_price ASC
LIMIT 0, 50;
The value "Using index" in the "Extra" column indicates that MySQL uses a covering index and avoids accessing the table rows. That is not the case here. ("Using index condition", which you asked about, is a different value: it refers to index condition pushdown.)
Your query does use an index (your EXPLAIN shows the one on unit_price), but it cannot be a covering index because SELECT * retrieves all the columns.
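The two points above can be checked side by side. This sketch assumes the table and indexes from the question; the exact EXPLAIN output will depend on your data and MySQL version.

```sql
-- Equality on is_enabled: the composite index (is_enabled, unit_price) can
-- serve both the filter and the ORDER BY, so no filesort should be needed.
EXPLAIN
SELECT * FROM product
WHERE is_enabled = 1
ORDER BY unit_price ASC
LIMIT 0, 50;

-- Covering variant: every selected column lives in the composite index
-- (InnoDB secondary indexes also carry the primary key, here id),
-- so Extra should include "Using index".
EXPLAIN
SELECT id, is_enabled, unit_price FROM product
WHERE is_enabled = 1
ORDER BY unit_price ASC
LIMIT 0, 50;
```

The covering variant only helps if the page can be built from those three columns; as soon as name or stock_qty is needed, MySQL has to visit the table rows again.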

Which column(s) to index in MySQL

I'm trying to optimize the following table. According to phpMyAdmin, several stats regarding table scans are high, and indexes either do not exist or are not being used (Handler read rnd next: 5.7 M).
1.
$query = "
SELECT * FROM apps_discrep
WHERE discrep_station = '$station'
AND discrep_date = '$date'
ORDER BY discrep_timestart";
2.
$query = "
SELECT * FROM apps_discrep
WHERE discrep_date BETWEEN '$keyword' AND '$keyword3'
AND (discrep_station like '$keyword2%') ORDER BY discrep_date";
Would it be correct to index discrep_station, discrep_date, and discrep_timestart?
Currently the only index is the primary unique index on the auto-increment ID.
-- Table structure
`index` int(11) NOT NULL AUTO_INCREMENT,
discrep_station varchar(5) NOT NULL,
discrep_timestart time NOT NULL,
discrep_timestop time NOT NULL,
discrep_date date NOT NULL,
discrep_datetime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
discrep_show varchar(31) NOT NULL,
discrep_text text NOT NULL,
discrep_by varchar(11) NOT NULL,
discrep_opr varchar(11) NOT NULL,
email_traffic varchar(3) NOT NULL,
email_techs varchar(3) NOT NULL,
email_promos varchar(3) NOT NULL,
email_spots varchar(3) NOT NULL,
eas_row varchar(11) NOT NULL,
PRIMARY KEY (`index`)
ENGINE=MyISAM DEFAULT CHARSET=utf8;
It looks to me like both queries can be served by the same BTREE index, since MySQL can use any leftmost prefix of a multi-column index as if it were a separate index.
Consider this MySQL doc page as a reference.
ALTER TABLE xxx ADD KEY `key1` (`discrep_station`, `discrep_date`, `discrep_timestart`) USING BTREE;
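Your first query can use all 3 columns of the index: the two equality conditions plus discrep_timestart for the ORDER BY. The second query can use at most the first 2 columns; because discrep_station is matched with LIKE (a range condition), the range scan is bounded on discrep_station only, and the discrep_date condition is then checked as a filter.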
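As a sketch of how this could be verified, the statements below add the suggested index plus a second, date-first index for comparison on query 2. The index names and the literal values are made up for illustration; whether the second index is worth its write overhead depends on your data.

```sql
-- Hypothetical index names.
ALTER TABLE apps_discrep
  ADD KEY idx_station_date_start (discrep_station, discrep_date, discrep_timestart),
  ADD KEY idx_date_station (discrep_date, discrep_station);

-- Query 1: two equalities plus ORDER BY on the third column.
EXPLAIN
SELECT * FROM apps_discrep
WHERE discrep_station = 'ABCD'          -- placeholder value
  AND discrep_date = '2012-11-01'       -- placeholder value
ORDER BY discrep_timestart;

-- Query 2: a date range plus a LIKE prefix, ordered by date.
EXPLAIN
SELECT * FROM apps_discrep
WHERE discrep_date BETWEEN '2012-11-01' AND '2012-11-30'   -- placeholder values
  AND discrep_station LIKE 'AB%'                           -- placeholder value
ORDER BY discrep_date;
```

Check the key and key_len columns to see which index, and how many of its columns, each query actually uses, and drop whichever index EXPLAIN never picks.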

Any chance of speeding up this simple MySQL Query?

I'm stumped on this query. It searches an important table containing about 213k rows. The purpose of the query is to report traffic data for a month: the amount of traffic for each day of that month, and the sum of a decimal value for each day. The query is run frequently, so I need to optimize it as much as possible. It currently takes about 2 seconds on average.
SQL Fiddle: http://sqlfiddle.com/#!2/171f5/3/0
All suggestions will be greatly appreciated! Thank you.
Query:
SELECT `date_day`, COUNT(*) AS num, SUM(decval) AS sum_decval FROM (`tbl_traffic`)
WHERE `uuid` = '1' AND `date_year` = '2012' AND `date_month` = '11'
GROUP BY `date_day`;
Explain Result:
id: 1
select_type: SIMPLE
table: adb1_analytics
type: ref
possible_keys: keys1,keys2,keys3
key: keys1
key_len: 7
ref: const,const,const
rows: 106693
Extra: Using where
1 row in set (0.13 sec)
Table structure:
CREATE TABLE IF NOT EXISTS `tbl_traffic` (
`id` int(100) unsigned NOT NULL AUTO_INCREMENT,
`uuid` int(100) unsigned NOT NULL,
`country` char(2) CHARACTER SET latin1 DEFAULT NULL,
`browser` varchar(50) CHARACTER SET latin1 DEFAULT NULL,
`platform` varchar(50) CHARACTER SET latin1 DEFAULT NULL,
`referrer` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`decval` decimal(15,5) NOT NULL,
`date_year` smallint(4) unsigned NOT NULL,
`date_month` tinyint(2) unsigned NOT NULL,
`date_day` tinyint(2) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `keys1` (`uuid`,`date_year`,`date_month`,`date_day`),
KEY `keys2` (`date_year`,`date_month`,`referrer`),
KEY `keys3` (`date_year`,`date_month`,`country`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
You have no effective indexes on date_day.
I would recommend creating a key specifically for what you are fetching and calculating: (date_day, decval)
Use count(id) and add keys on date_day and decval.
A covering key over both fields might be even better
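A 'group by' generally makes MySQL read every row that matches the WHERE clause, and it may also need a temporary table or a sort unless an index supplies the grouping order.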
http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html
If you can't index it better (see njk's answer), then you might want to select your specific values first and then do the group, sum, etc. on that subset. If nothing else, at least it will be a smaller set to sort.
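A covering key for this particular query could look like the sketch below. The index name is made up, and the gain is an expectation to be measured, not a guarantee.

```sql
-- Hypothetical index name. The three equality columns come first, then the
-- GROUP BY column, then the aggregated column, so the whole query can be
-- answered from the index without touching the table rows.
ALTER TABLE tbl_traffic
  ADD KEY idx_uuid_ymd_decval (uuid, date_year, date_month, date_day, decval);

-- Numeric literals match the integer column types.
EXPLAIN
SELECT date_day, COUNT(*) AS num, SUM(decval) AS sum_decval
FROM tbl_traffic
WHERE uuid = 1 AND date_year = 2012 AND date_month = 11
GROUP BY date_day;
```

If the new key is picked, Extra should show "Using index", and keys1 becomes a redundant prefix of it that could be dropped.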

optimize query (2 simple left joins)

SELECT fcat.id,fcat.title,fcat.description,
count(DISTINCT ftopic.id) as number_topics,
count(DISTINCT fpost.id) as number_posts FROM fcat
LEFT JOIN ftopic ON fcat.id=ftopic.cat_id
LEFT JOIN fpost ON ftopic.id=fpost.topic_id
GROUP BY fcat.id
ORDER BY fcat.ord
LIMIT 100;
Indexes exist on ftopic.cat_id, fpost.topic_id, and fcat.ord.
EXPLAIN:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE fcat ALL PRIMARY NULL NULL NULL 11 Using temporary; Using filesort
1 SIMPLE ftopic ref PRIMARY,cat_id_2 cat_id_2 4 bloki.fcat.id 72
1 SIMPLE fpost ref topic_id_2 topic_id_2 4 bloki.ftopic.id 245
fcat - 11 rows,
ftopic - 1106 rows,
fpost - 363000 rows
The query takes 4.2 sec.
TABLES:
CREATE TABLE IF NOT EXISTS `fcat` (
`id` int(11) NOT NULL auto_increment,
`title` varchar(250) collate utf8_unicode_ci NOT NULL,
`description` varchar(250) collate utf8_unicode_ci NOT NULL,
`created` datetime NOT NULL,
`visible` tinyint(4) NOT NULL default '1',
`ord` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `ord` (`ord`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=12 ;
CREATE TABLE IF NOT EXISTS `ftopic` (
`id` int(11) NOT NULL auto_increment,
`cat_id` int(11) NOT NULL,
`title` varchar(100) collate utf8_unicode_ci NOT NULL,
`created` datetime NOT NULL,
`updated` timestamp NOT NULL default CURRENT_TIMESTAMP,
`lastname` varchar(200) collate utf8_unicode_ci NOT NULL,
`visible` tinyint(4) NOT NULL default '1',
`closed` tinyint(4) NOT NULL default '0',
`views` int(11) NOT NULL default '1',
PRIMARY KEY (`id`),
KEY `cat_id_2` (`cat_id`,`updated`,`visible`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1116 ;
CREATE TABLE IF NOT EXISTS `fpost` (
`id` int(11) NOT NULL auto_increment,
`topic_id` int(11) NOT NULL,
`pet_id` int(11) NOT NULL,
`content` text collate utf8_unicode_ci NOT NULL,
`imageName` varchar(300) collate utf8_unicode_ci NOT NULL,
`created` datetime NOT NULL,
`reply_id` int(11) NOT NULL,
`visible` tinyint(4) NOT NULL default '1',
`md5` varchar(100) collate utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `md5` (`md5`),
KEY `topic_id_2` (`topic_id`,`created`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=390971 ;
Thanks,
hamlet
You need to create a key covering both fcat.id and fcat.ord.
Bold rewrite
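This code is not functionally identical, but...
Because you want to know about distinct ftopic.id and fpost.id, I'm going to be bold and suggest two INNER JOINs instead of LEFT JOINs.
fpost.id is a primary key and each post appears only once in the joined result, so the DISTINCT can be dropped for it. Be aware that ftopic.id still repeats once per post, so count(ftopic.id) below counts topic-post rows; keep the DISTINCT on it if you need the exact number of topics.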
SELECT
fcat.id
, fcat.title
, fcat.description
, count(ftopic.id) as number_topics
, count(fpost.id) as number_posts
FROM fcat
INNER JOIN ftopic ON fcat.id = ftopic.cat_id
INNER JOIN fpost ON ftopic.id = fpost.topic_id
GROUP BY fcat.id
ORDER BY fcat.ord
LIMIT 100;
It depends on your data if this is what you are looking for, but I'm guessing it will be faster.
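If the exact counts of the original query are needed, another option is to aggregate each table before joining, so no DISTINCT is required and the 363000 post rows are grouped only once. This is a sketch that has not been timed against the tables in the question; the aliases p and t are arbitrary.

```sql
SELECT
    fcat.id
  , fcat.title
  , fcat.description
  , COALESCE(t.number_topics, 0) AS number_topics
  , COALESCE(t.number_posts, 0)  AS number_posts
FROM fcat
LEFT JOIN (
    -- One row per category: its topic count and the sum of its post counts.
    SELECT ftopic.cat_id
         , COUNT(*)     AS number_topics
         , SUM(p.posts) AS number_posts
    FROM ftopic
    LEFT JOIN (
        -- One row per topic: its post count.
        SELECT topic_id, COUNT(*) AS posts
        FROM fpost
        GROUP BY topic_id
    ) AS p ON p.topic_id = ftopic.id
    GROUP BY ftopic.cat_id
) AS t ON t.cat_id = fcat.id
ORDER BY fcat.ord
LIMIT 100;
```

The LEFT JOINs and COALESCE keep categories without topics, and topics without posts, in the result with a count of 0, as in the original query.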
All your indexes seem to be in order though.
MySQL does not use indexes for small sample sizes!
Note that the EXPLAIN shows that MySQL has only 11 rows to consider for fcat. That is not enough for MySQL to bother with an index, so it doesn't,
because going through the index for such small row counts slows things down.
MySQL is trying to speed things up, so it chooses not to use the index. This confuses a lot of people because we are trained so hard to rely on indexes. Small sample sizes don't give good EXPLAINs!
Increase the size of the test data so MySQL has more rows to consider and you should start seeing the index being used.
Common misconceptions about force index
Force index does not force MySQL to use an index as such.
It hints that MySQL should use a different index from the one it would naturally pick, and it pushes MySQL toward using an index by setting a very high cost on a table scan.
(In your case MySQL is not using a table scan, so FORCE INDEX has no effect.)
MySQL (like most other DBMSs on the planet) has a very strong urge to use indexes, so if it doesn't use any, that's because it estimates that using no index at all is faster.
How does MySQL know which index to use
One of the parameters the query optimizer uses is the stored cardinality of the indexes.
Over time these values change... But studying the table takes time, so MySQL doesn't do that unless you tell it to.
Another parameter that affects index selection is the predicted disk-seek-times that MySQL expects to encounter when performing the query.
Tips to improve index usage
ANALYZE TABLE will instruct MySQL to re-evaluate the indexes and update its key distribution (cardinality). (consider running it daily/weekly in a cron job)
SHOW INDEX FROM table will display the key distribution.
MyISAM tables and indexes fragment over time. Use OPTIMIZE TABLE to defragment the tables and rebuild the indexes.
FORCE/USE/IGNORE INDEX limits the options MySQL's query optimizer has to perform your query. Only consider it on complex queries.
Time the effect of your meddling with indexes on a regular basis. A forced index that speeds up your query today might slow it down tomorrow because the underlying data has changed.
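Applied to the tables in this question, the maintenance statements mentioned in the tips look like the sketch below. All three are standard MySQL statements; how often to run them is a judgment call.

```sql
-- Refresh the stored key distribution (cardinality) the optimizer relies on.
ANALYZE TABLE fcat, ftopic, fpost;

-- Inspect the result: see the Cardinality column for each index.
SHOW INDEX FROM fpost;

-- Defragment the MyISAM table and rebuild its indexes.
-- The table is locked while this runs, so schedule it for a quiet period.
OPTIMIZE TABLE fpost;
```

Re-run EXPLAIN on the slow query after ANALYZE TABLE to see whether the chosen plan changes.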