I've checked numerous other SO posts and MySQL docs but can't seem to get an answer on why an index isn't being used, and how to force it to be used - I can see many others are having similar problems, but can't find a solution.
The table looks like this
CREATE TABLE `countries_ip` (
`ipfrom` INT(10) UNSIGNED ZEROFILL NOT NULL,
`ipto` INT(10) UNSIGNED ZEROFILL NOT NULL,
`countrySHORT` CHAR(2) NULL DEFAULT NULL,
`country_id` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`ipfrom`, `ipto`, `country_id`),
INDEX `from_to_index` (`ipfrom`, `ipto`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB;
Not sure why the "from_to_index" is there - seems redundant to me. But anyway, the EXPLAIN query looks like this
EXPLAIN SELECT *
FROM track_report t, countries_ip ip
WHERE t.ip BETWEEN ip.ipfrom AND ip.ipto
and the result of the EXPLAIN is as follows:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t ALL getwebmaster NULL NULL NULL 36291
1 SIMPLE ip ALL PRIMARY,from_to_index NULL NULL NULL 153914 Range checked for each record (index map: 0x3)
As you can see, the PRIMARY KEY from the countries_ip table isn't being used and so the query takes a LONG time (countries_ip has over 150k records)
I'm probably missing something simple, but any advice would be appreciated on how I can optimize this query. Thanks in advance.
It might help to define an index on track_report.ip. See SQL Fiddle.
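As a minimal sketch of that suggestion (the index name `ip_idx` is made up for illustration, and the column type of `track_report.ip` is not shown in the question, so treat this as an assumption to adapt):

```sql
-- Hypothetical index name; track_report.ip is the column used in the range join.
ALTER TABLE track_report ADD INDEX ip_idx (ip);

-- Re-run the EXPLAIN afterwards to see whether the plan changes.
EXPLAIN SELECT *
FROM track_report t, countries_ip ip
WHERE t.ip BETWEEN ip.ipfrom AND ip.ipto;
```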
I modified the where clause to do an explicit comparison, and now it uses the from_to_index.
SELECT ip
FROM track_report t, countries_ip ip
where t.ip >= ip.ipfrom and t.ip <= ip.ipto
See SQL Fiddle.
I need to fill the location field in users table with a country name from geoip table, depending on the user's IP.
Here is the tables' CREATE code.
CREATE TABLE `geoip` (
`IP_FROM` INT(10) UNSIGNED ZEROFILL NOT NULL DEFAULT '0000000000',
`IP_TO` INT(10) UNSIGNED ZEROFILL NOT NULL DEFAULT '0000000000',
`COUNTRY_NAME` VARCHAR(50) NOT NULL DEFAULT '',
PRIMARY KEY (`IP_FROM`, `IP_TO`)
)
ENGINE=InnoDB;
CREATE TABLE `users` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`login` VARCHAR(25) NOT NULL DEFAULT '',
`password` VARCHAR(64) NOT NULL DEFAULT '',
`ip` VARCHAR(128) NULL DEFAULT '',
`location` VARCHAR(128) NULL DEFAULT '',
PRIMARY KEY (`id`),
UNIQUE INDEX `login` (`login`),
INDEX `ip` (`ip`(10))
)
ENGINE=InnoDB
ROW_FORMAT=DYNAMIC;
The update query I try to run is:
UPDATE users u
SET u.location =
(SELECT COUNTRY_NAME FROM geoip WHERE INET_ATON(u.ip) BETWEEN IP_FROM AND IP_TO)
The problem is that this query refuses to use PRIMARY index on the geoip table, though it would speed things up a lot. The EXPLAIN gives me:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY u index NULL PRIMARY 4 NULL 1254395
2 DEPENDENT SUBQUERY geoip ALL PRIMARY NULL NULL NULL 62271 Using where
I've ended up converting geoip table to the MEMORY engine for this query only, but I'd like to know what was the right way to do it.
UPDATE
The DBMS I'm using is MariaDB 10.0.17, if it could make a difference.
Did you try to force the index like this
UPDATE users u
SET u.location =
(SELECT COUNTRY_NAME FROM geoip FORCE INDEX (PRIMARY)
WHERE INET_ATON(u.ip) BETWEEN IP_FROM AND IP_TO)
Also, since ip can be NULL, it is probably interfering with index optimization.
The IP ranges are non-overlapping, correct? You are not getting any IPv6 addresses? (The world ran out of IPv4 a couple of years ago.)
No, the index won't be used, or at least won't perform as well as you would like. So, I have devised a scheme to solve that. However, it requires reformulating the schema and building a Stored Routine. See my IP-ranges blog; it has links to code for IPv4 and for IPv6. It will usually touch only one row in the table, rather than scanning half the table.
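For a single address, one commonly used way to get that "touch only one row" behaviour without a stored routine is sketched below. This is not necessarily the scheme from the blog. It assumes the ranges in `geoip` do not overlap, and it uses a user variable `@ip` and an example address that are not part of the original question.

```sql
-- Example address only; replace with the IP to look up.
SET @ip := INET_ATON('203.0.113.7');

SELECT c.COUNTRY_NAME
FROM (
    -- Find the one range that starts at or below @ip.
    -- ORDER BY ... DESC LIMIT 1 can typically be answered by reading
    -- the primary key backwards from @ip instead of scanning the table.
    SELECT IP_TO, COUNTRY_NAME
    FROM geoip
    WHERE IP_FROM <= @ip
    ORDER BY IP_FROM DESC
    LIMIT 1
) AS c
-- Reject the candidate if @ip falls into a gap between ranges.
WHERE c.IP_TO >= @ip;
```

If the ranges can overlap, the single candidate row is no longer guaranteed to be the right one, so this shortcut would not be safe.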
Edit
MySQL does not know that there is only one range (from/to) that should match, so it scans far too much. The difference between the two encodings of IP (INT UNSIGNED vs VARCHAR) makes it difficult to use a JOIN (instead of a subquery). Alas, a JOIN would not be any better, because MySQL does not understand that there is exactly one match. Give this a try:
UPDATE users u
SET u.location =
( SELECT COUNTRY_NAME
FROM geoip
WHERE INET_ATON(u.ip) BETWEEN IP_FROM AND IP_TO
LIMIT 1 -- added
)
If that fails to significantly improve the speed, then change from VARCHAR to INT UNSIGNED in users and try again (without INET_ATON).
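A rough sketch of that second step, assuming a new column is added rather than converting `ip` in place (the column name `ip_num` is hypothetical, and this only covers IPv4 addresses that `INET_ATON` can parse):

```sql
-- Hypothetical helper column holding the numeric form of users.ip.
ALTER TABLE users ADD COLUMN ip_num INT UNSIGNED NULL;

-- Convert once, instead of calling INET_ATON for every comparison.
UPDATE users SET ip_num = INET_ATON(ip);

-- Same update as above, now comparing integers with integers.
UPDATE users u
SET u.location =
    ( SELECT g.COUNTRY_NAME
      FROM geoip g
      WHERE u.ip_num BETWEEN g.IP_FROM AND g.IP_TO
      LIMIT 1
    );
```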
I am performing a very simple select over a simple table, where the column that I am filtering over has an index.
Here is the schema:
CREATE TABLE IF NOT EXISTS `tmp_inventory_items` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`transmission_id` int(11) unsigned NOT NULL,
`inventory_item_id` int(11) unsigned DEFAULT NULL,
`material_id` int(11) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `transmission_id` (`transmission_id`),
KEY `inventory_item_id` (`inventory_item_id`),
KEY `material_id` (`material_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=21 ;
Here is the SQL:
SELECT * FROM `tmp_inventory_items` WHERE `transmission_id` = 330
However, when explaining the query, I see that the index is NOT being used, why is that (the table has about 20 rows on my local machine)?
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tmp_inventory_items... ALL transmission_id NULL NULL NULL 13 Using where
No key is being used even if I hint MySQL with USE INDEX(transmission_id). This looks very strange to me (MySQL version 5.5.28).
Because MySQL's algorithms tell it that preparing an index and using it would use more resources than simply performing the query without one.
When you feed query syntax to a DBMS, one of the things it does is attempts to determine the most efficient way to process the query (usually there are at least tens of ways).
If you want to, you can use FORCE INDEX(transmission_id) (documented here), which tells MySQL to assume that a table scan is very expensive. It's not recommended here, though: for 20 rows, the difference just isn't worth it.
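For reference, the hinted query from the question would look roughly like this (a sketch for experimenting with EXPLAIN, not a recommendation for a 20-row table):

```sql
-- FORCE INDEX goes directly after the table name it applies to.
EXPLAIN
SELECT *
FROM `tmp_inventory_items` FORCE INDEX (`transmission_id`)
WHERE `transmission_id` = 330;
```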
Why is MySQL not using index_merge?
It looks like my server has index_merge ON, but the optimizer is still not taking it into consideration.
optimizer switch index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on
explain SELECT a,b FROM `zip25` WHERE b=91367 OR a=91367
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE zip25 ALL a,b NULL NULL NULL 752299 Using where
[EDIT]
Table Definition
CREATE TABLE `zip25` (
`a` char(5) DEFAULT NULL,
`b` char(5) DEFAULT NULL,
`distance` float NOT NULL,
KEY `a` (`a`),
KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Thanks in advance
The datatype of the fields is char, but you are using integers in the query. What happens is an implicit conversion: MySQL compares the two sides as numbers, so it has to convert the column value of every row. It doesn't look like a serious problem, but it actually prevents MySQL from using an index at all! Always mind the data types!
Change your query to:
explain SELECT a,b FROM `zip25` WHERE b="91367" OR a="91367"
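If the optimizer still does not choose index_merge for the OR condition, a commonly used rewrite is to split it into two single-index lookups. The sketch below is an alternative formulation, not something from the original answer. It uses UNION ALL and excludes rows already returned by the first branch, so the result matches the OR query even when the table contains duplicate rows.

```sql
-- Branch 1: rows matched through the index on b.
SELECT a, b FROM `zip25` WHERE b = '91367'
UNION ALL
-- Branch 2: rows matched through the index on a,
-- skipping those that branch 1 has already returned.
-- <=> is the NULL-safe comparison, so rows with b IS NULL are kept.
SELECT a, b FROM `zip25` WHERE a = '91367' AND NOT (b <=> '91367');
```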
This query:
explain
SELECT `Lineitem`.`id`, `Donation`.`id`, `Donation`.`order_line_id`
FROM `order_line` AS `Lineitem`
LEFT JOIN `donations` AS `Donation`
ON (`Donation`.`order_line_id` = `Lineitem`.`id`)
WHERE `Lineitem`.`session_id` = '1'
correctly uses the Donation.order_line_id and Lineitem.id indexes, shown in this EXPLAIN output:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Lineitem ref session_id session_id 97 const 1 Using where; Using index
1 SIMPLE Donation ref order_line_id order_line_id 4 Lineitem.id 2 Using index
However, this query, which simply includes another field:
explain
SELECT `Lineitem`.`id`, `Donation`.`id`, `Donation`.`npo_id`,
`Donation`.`order_line_id`
FROM `order_line` AS `Lineitem`
LEFT JOIN `donations` AS `Donation`
ON (`Donation`.`order_line_id` = `Lineitem`.`id`)
WHERE `Lineitem`.`session_id` = '1'
Shows that the Donation table does not use an index:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Lineitem ref session_id session_id 97 const 1 Using where; Using index
1 SIMPLE Donation ALL order_line_id NULL NULL NULL 3
All of the _id fields in the tables are indexed, but I can't figure out how adding this field into the list of selected fields causes the index to be dropped.
As requested by James C, here are the table definitions:
CREATE TABLE `donations` (
`id` int(10) unsigned NOT NULL auto_increment,
`npo_id` int(10) unsigned NOT NULL,
`order_line_detail_id` int(10) unsigned NOT NULL default '0',
`order_line_id` int(10) unsigned NOT NULL default '0',
`created` datetime default NULL,
`modified` datetime default NULL,
PRIMARY KEY (`id`),
KEY `npo_id` (`npo_id`),
KEY `order_line_id` (`order_line_id`),
KEY `order_line_detail_id` (`order_line_detail_id`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8
CREATE TABLE `order_line` (
`id` bigint(20) unsigned NOT NULL auto_increment,
`order_id` bigint(20) NOT NULL,
`npo_id` bigint(20) NOT NULL default '0',
`session_id` varchar(32) collate utf8_unicode_ci default NULL,
`created` datetime default NULL,
PRIMARY KEY (`id`),
KEY `order_id` (`order_id`),
KEY `npo_id` (`npo_id`),
KEY `session_id` (`session_id`)
) ENGINE=InnoDB AUTO_INCREMENT=23 DEFAULT CHARSET=utf8
I also did some reading about cardinality, and it looks like both the Donations.npo_id and Donations.order_line_id have a cardinality of 2. Hopefully this suggests something useful?
I'm thinking that a USE INDEX might solve the problem, but I'm using an ORM that makes this a bit tricky, and I don't understand why it wouldn't grab the correct index when the JOIN specifically names indexed fields?!?
Thanks for your brainpower!
The first explain has "Using index" at the end. This means that it was able to find the rows and return the result for the query by just looking at the index and not having to fetch/analyse any row data.
In the second query you add a column (npo_id) that is not part of the order_line_id index. This means that MySQL has to look at the data of the table. I'm not sure why the optimiser chose to do a table scan, but I think it's likely that if the table is fairly small it's easier for it to just read everything than to pick out details for individual rows.
edit: I think adding the following indexes will improve things even more and let all of the join use indexes only:
ALTER TABLE order_line ADD INDEX(session_id, id);
ALTER TABLE donations ADD INDEX(order_line_id, npo_id, id)
This will allow order_line to find the rows using session_id and then return id, and also allow donations to join on order_line_id and then return the other two columns.
Looking at the auto_increment values, I assume that there's not much data in there. It's worth noting that the amount of data in the tables will have an effect on the query plan, and it's good practice to put some sample data in there to test things out. For more detail have a look at this blog post I made some time back: http://webmonkeyuk.wordpress.com/2010/09/27/what-makes-a-good-mysql-index-part-2-cardinality/
I have the following table
CREATE TABLE `Config` (
`id` mediumint(9) NOT NULL AUTO_INCREMENT,
`type_id` mediumint(9) DEFAULT NULL,
`content_id` mediumint(9) DEFAULT NULL,
`menu_id` int(11) DEFAULT NULL,
`field` varchar(50) NOT NULL DEFAULT '',
`value` text NOT NULL,
PRIMARY KEY (`id`),
KEY `menu_id` (`menu_id`) USING BTREE,
KEY `type_id` (`type_id`,`content_id`,`menu_id`) USING BTREE
) ENGINE=MyISAM AUTO_INCREMENT=1;
It's filled with about 800k rows of test data. Whenever I run the following query it takes about 0.4 seconds to complete:
SELECT id, content_id, menu_id, field, `value`
FROM Config
WHERE type_id = ?
AND content_id = ?
An explain tells me, MySQL is doing a full tablescan instead of using an index:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Config ALL 792674 Using where
Can someone please explain what I am doing wrong here? What does the index have to look like so that it is used here? Sometimes the query has the extra condition AND menu_id = ?, which should benefit from it, too.
I once had a problem with a query that didn't use the index I had specified. It turned out that MySQL won't use your index if the result of your query exceeds a certain number of rows. For example, if the result makes up a large share of the table's total rows, it won't use your index. However, I don't know the specific percentage. You could try adjusting the query to return a smaller result to test this theory.
My question about the problem: MySQL datetime index is not working
0.4s isn't bad for 800,000 rows. The MySQL optimiser may determine it doesn't need your indexes.
You could try using "hints" to see if you can change performance outcomes:
http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
The accepted answer is actually right, but if you want MySQL to use the index regardless of the number of matching rows, you can specify the FORCE INDEX (index_name) hint.
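Applied to the Config table from the question, that could look like the sketch below. The index name `type_id` comes from the table definition. The literal values 1 and 2 are placeholders for the bound parameters. Refreshing the statistics first is an extra step added here as an assumption, since stale statistics can also influence the chosen plan.

```sql
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE Config;

-- Placeholder values stand in for the "?" parameters.
EXPLAIN
SELECT id, content_id, menu_id, field, `value`
FROM Config FORCE INDEX (type_id)
WHERE type_id = 1
  AND content_id = 2;
```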