How to make MySQL intersect keys of INTEGER field and FLOAT field? - mysql

I have the following MySQL table (table size - around 10K records):
CREATE TABLE `tmp_index_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`m_id` int(11) DEFAULT NULL,
`r_id` int(11) DEFAULT NULL,
`price` float DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `m_key` (`m_id`),
KEY `r_key` (`r_id`),
KEY `price_key` (`price`)
) ENGINE=InnoDB AUTO_INCREMENT=16390 DEFAULT CHARSET=utf8;
As you can see, I have two INTEGER fields (r_id and m_id) and one FLOAT field (price).
For each of these fields I have an index.
Now, when I run a query with condition on the first integer AND on the second one, everything is fine:
mysql> explain select * from tmp_index_test where m_id=1 and r_id=2;
+----+-------------+----------------+-------------+---------------+-------------+---------+------+------+-------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+-------------+---------------+-------------+---------+------+------+-------------------------------------------+
| 1 | SIMPLE | tmp_index_test | index_merge | m_key,r_key | r_key,m_key | 5,5 | NULL | 1 | Using intersect(r_key,m_key); Using where |
+----+-------------+----------------+-------------+---------------+-------------+---------+------+------+-------------------------------------------+
MySQL seems to handle this very well, since Extra shows Using intersect(r_key,m_key).
I'm not a MySQL expert, but as I understand it, MySQL first intersects the two indexes and only then fetches the matching rows from the table itself.
However, when I run a very similar query with a condition on an integer and a float instead of on two integers, MySQL refuses to intersect the indexes:
mysql> explain select * from tmp_index_test where m_id=3 and price=100;
+----+-------------+----------------+------+-----------------+-----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+------+-----------------+-----------+---------+-------+------+-------------+
| 1 | SIMPLE | tmp_index_test | ref | m_key,price_key | price_key | 5 | const | 1 | Using where |
+----+-------------+----------------+------+-----------------+-----------+---------+-------+------+-------------+
As you can see, MySQL decides to use the index of price only.
My first question is why, and how to fix it?
In addition, I need to run queries with a greater-than sign (>) instead of the equals sign (=) on price. Currently EXPLAIN shows that for such queries MySQL uses the integer key only.
mysql> explain select * from tmp_index_test where m_id=3 and price > 100;
+----+-------------+----------------+------+-----------------+-------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+------+-----------------+-------+---------+-------+------+-------------+
| 1 | SIMPLE | tmp_index_test | ref | m_key,price_key | m_key | 5 | const | 2 | Using where |
+----+-------------+----------------+------+-----------------+-------+---------+-------+------+-------------+
I need to somehow make MySQL do the intersection on the indexes first. Does anybody have an idea how?
Thanks a lot in advance!

From the MySQL manual:
ref is used if the join uses only a leftmost prefix of the key or if
the key is not a PRIMARY KEY or UNIQUE index (in other words, if the
join cannot select a single row based on the key value). If the key
that is used matches only a few rows, this is a good join type.
price is not unique or primary, so ref is chosen. I don't believe you can force an intersect.
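If the goal is simply to have both conditions resolved from an index, a composite index is a common alternative to index merge. The following is only a sketch: the index name m_price_key is made up for illustration, and whether the optimizer picks it depends on the data and the MySQL version.

```sql
-- Hypothetical index name. Column order matters: the equality column (m_id)
-- comes first, the column used for ranges (price) comes second.
ALTER TABLE tmp_index_test ADD KEY m_price_key (m_id, price);

-- Both conditions can now be resolved inside one index,
-- for the equality case and for the range case alike.
EXPLAIN SELECT * FROM tmp_index_test WHERE m_id = 3 AND price = 100;
EXPLAIN SELECT * FROM tmp_index_test WHERE m_id = 3 AND price > 100;
```

Note also that FLOAT is an approximate type, so equality tests on price can miss values that are not exactly representable (99.99, for example). A DECIMAL column avoids that if exact matches matter.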

Related

Slow query when searching for nearby coordinates

My query for searching nearby coordinates is slow (for now the query only looks at latitude). This is the MySQL query:
select ABS(propertyCoordinatesLat - 3.33234) as diff from tablename order by diff asc limit 0,20
Is there a way to improve this besides doing the sorting in a server-side script?
Table dump:
CREATE TABLE `property` (
`propertyID` bigint(20) NOT NULL,
`propertyName` varchar(100) NOT NULL,
`propertyCoordinatesLat` varchar(100) NOT NULL,
`propertyCoordinatesLng` varchar(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Indexes for dumped tables
--
--
-- Indexes for table `property`
--
ALTER TABLE `property`
ADD PRIMARY KEY (`propertyID`),
ADD KEY `propertyCoordinatesLat` (`propertyCoordinatesLat`,`propertyCoordinatesLng`),
ADD KEY `propertyCoordinatesLat_2` (`propertyCoordinatesLat`),
ADD KEY `propertyCoordinatesLng` (`propertyCoordinatesLng`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `property`
--
ALTER TABLE `property`
MODIFY `propertyID` bigint(20) NOT NULL AUTO_INCREMENT;
COMMIT;
The query is ordering by the difference between a string and a float. This odd calculation confuses and angers MySQL and results in a slow filesort.
mysql> explain select ABS(propertyCoordinatesLat - 3.33234) as diff from property order by diff
+----+-------------+----------+------------+-------+---------------+--------------------------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+--------------------------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | property | NULL | index | NULL | propertyCoordinatesLat_2 | 302 | NULL | 1 | 100.00 | Using index; Using filesort |
+----+-------------+----------+------------+-------+---------------+--------------------------+---------+------+------+----------+-----------------------------+
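Changing propertyCoordinatesLat and propertyCoordinatesLng to a more sensible numeric type lets MySQL optimize better: the index entries shrink (key_len drops from 302 to 5) and the subtraction no longer needs a string-to-number conversion for every row. Note that the EXPLAIN below orders by the column itself rather than by diff, which is why the filesort disappears; ordering by the computed diff cannot be read in order from a plain index on the column.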
alter table property change propertyCoordinatesLat propertyCoordinatesLat numeric(10,8) not null;
alter table property change propertyCoordinatesLng propertyCoordinatesLng numeric(11,8) not null;
mysql> explain select ABS(propertyCoordinatesLat - 3.33234) as diff from property order by propertyCoordinatesLat asc limit 0,20;
+----+-------------+----------+------------+-------+---------------+--------------------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+--------------------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | property | NULL | index | NULL | propertyCoordinatesLat_2 | 5 | NULL | 1 | 100.00 | Using index |
+----+-------------+----------+------------+-------+---------------+--------------------------+---------+------+------+----------+-------------+
If you want to get fancy, look into MySQL's spatial types. These will probably perform better, and definitely be more accurate.
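To get the 20 rows closest to a latitude while still reading the index in order, one option is to take up to 20 candidates from each side of the target value and sort only those. This is a sketch, not part of the original answer; it assumes the column has already been converted to a numeric type as above, and the alias candidates is arbitrary.

```sql
SELECT propertyID, diff
FROM (
    -- Up to 20 rows at or above the target, nearest first (index read ascending).
    (SELECT propertyID, propertyCoordinatesLat - 3.33234 AS diff
       FROM property
      WHERE propertyCoordinatesLat >= 3.33234
      ORDER BY propertyCoordinatesLat ASC
      LIMIT 20)
    UNION ALL
    -- Up to 20 rows below the target, nearest first (index read descending).
    (SELECT propertyID, 3.33234 - propertyCoordinatesLat AS diff
       FROM property
      WHERE propertyCoordinatesLat < 3.33234
      ORDER BY propertyCoordinatesLat DESC
      LIMIT 20)
) AS candidates
-- At most 40 rows reach this final sort.
ORDER BY diff ASC
LIMIT 20;
```

Each branch is a range condition on the indexed column, so the final sort only has to handle a handful of rows instead of the whole table.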

Why is a prefix index slower than a full index in MySQL?

Table (about 21 million rows):
CREATE TABLE `prefix` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`number` int(11) NOT NULL,
`string` varchar(750) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `idx_string_prefix10` (`string`(10)),
KEY `idx_string` (`string`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Selectivity of the 10-character prefix:
select count(distinct(left(string,10)))/count(*) from prefix;
+-------------------------------------------+
| count(distinct(left(string,10)))/count(*) |
+-------------------------------------------+
| 0.9999 |
+-------------------------------------------+
Results:
select sql_no_cache count(*) from prefix force index(idx_string_prefix10)
where string <"1505d28b"
243.96s,241.88s
select sql_no_cache count(*) from prefix force index(idx_string)
where string < "1505d28b"
7.96s,7.21s,7.53s
Why is the prefix index slower than the full index in MySQL?
explain select sql_no_cache count(*) from prefix force index(idx_string_prefix10)
where string < "1505d28b";
+----+-------------+--------+------------+-------+---------------------+---------------------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------------+---------------------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | prefix | NULL | range | idx_string_prefix10 | idx_string_prefix10 | 42 | NULL | 3489704 | 100.00 | Using where |
+----+-------------+--------+------------+-------+---------------------+---------------------+---------+------+---------+----------+-------------+
When you use a prefix index, the index holds only the first 10 characters of each value, so after reading an index entry MySQL also has to read the row itself to make sure the full value is selected by the WHERE condition. That's two reads per entry, and a lot more data scanned.
When you use a non-prefix index, MySQL can read the whole string value from the index, so it knows immediately whether the value is selected by the condition or can be skipped, without touching the row.
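One way to see the difference is to run the same EXPLAIN against the full index and compare the Extra column with the prefix-index plan shown in the question. This is a sketch; the exact output depends on the MySQL version and table statistics.

```sql
-- Prefix index: the plan in the question shows only "Using where",
-- meaning each matching index entry is followed by a row lookup.
EXPLAIN SELECT COUNT(*) FROM prefix FORCE INDEX (idx_string_prefix10)
WHERE string < '1505d28b';

-- Full index: if Extra includes "Using index", the query is answered
-- from the index alone and no row lookups are needed.
EXPLAIN SELECT COUNT(*) FROM prefix FORCE INDEX (idx_string)
WHERE string < '1505d28b';
```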

Order by is slow for LIMIT X (but fast for LIMIT X-1)

I know there are many questions already posted on this topic, but I haven't found a solution for my problem.
I have just one table that has 1.042.162 rows.
Table definition:
CREATE TABLE `tbllinks` (
`idLinks` int(11) NOT NULL AUTO_INCREMENT,
`linksText` varchar(500) DEFAULT NULL,
`linksLastChecked` datetime DEFAULT NULL,
`linksLastNewData` datetime DEFAULT NULL,
PRIMARY KEY (`idLinks`),
UNIQUE KEY `idtblLinks_UNIQUE` (`idLinks`),
UNIQUE KEY `linksText_UNIQUE` (`linksText`),
KEY `fasterDate` (`linksLastChecked`),
KEY `faster2` (`linksText`)
) ENGINE=InnoDB AUTO_INCREMENT=3029595 DEFAULT CHARSET=latin1;
This one:
SELECT * FROM tbllinks order by linksLastChecked asc limit 9324;
needs 0.094 sec.
And this one:
SELECT * FROM tbllinks order by linksLastChecked asc limit 9325;
needs 42.559 sec.
Both queries return only NULL values in the linksLastChecked column (which is correct), although most rows do have a value.
There is also nothing special about row 9325, so I really have no idea why one more row takes so much longer.
Edit:
With EXPLAIN it makes sense why it takes so long.
+----+-------------+----------+-------+---------------+------------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+------------+---------+------+------+-------+
| 1 | SIMPLE | tbllinks | index | NULL | fasterDate | 6 | NULL | 9324 | NULL |
+----+-------------+----------+-------+---------------+------------+---------+------+------+-------+
+----+-------------+----------+------+---------------+------+---------+------+--------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+------+---------+------+--------+----------------+
| 1 | SIMPLE | tbllinks | ALL | NULL | NULL | NULL | NULL | 805031 | Using filesort |
+----+-------------+----------+------+---------------+------+---------+------+--------+----------------+
The first one uses the index, the second one does not! But I still wonder why that is, and why the second one examines 805031 rows.
Edit: (Answer)
OK, sorry that I even asked. I was just searching with the wrong question.
While searching for "mysql doesn't use index" I found this nice article:
http://code.openark.org/blog/mysql/7-ways-to-convince-mysql-to-use-the-right-index
While USE INDEX didn't work for me, FORCE INDEX helped.
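For reference, the FORCE INDEX variant of the slow query would look roughly like this, using the fasterDate index from the table definition above. The ANALYZE TABLE line is an additional suggestion, not something the original poster reported trying.

```sql
-- Tell the optimizer to read rows in index order instead of
-- scanning the whole table and sorting it (filesort).
SELECT * FROM tbllinks FORCE INDEX (fasterDate)
ORDER BY linksLastChecked ASC
LIMIT 9325;

-- Refreshing index statistics may also change the optimizer's choice.
ANALYZE TABLE tbllinks;
```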

MySQL join issue, query hangs

I have a table holding numeric data points with timestamps, like so:
CREATE TABLE `value_table1` (
`datetime` datetime NOT NULL,
`value` decimal(14,8) DEFAULT NULL,
KEY `datetime` (`datetime`)
) ENGINE=InnoDB;
My table holds a data point for every 5 seconds, so timestamps in the table will be, e.g.:
"2013-01-01 10:23:35"
"2013-01-01 10:23:40"
"2013-01-01 10:23:45"
"2013-01-01 10:23:50"
I have a few such value tables, and it is sometimes necessary to look at the ratio between two value series.
I therefore attempted a join, but it seems to not work:
SELECT value_table1.datetime, value_table1.value / value_table2.rate
FROM value_table1
JOIN value_table2
ON value_table1.datetime = value_table2.datetime
ORDER BY value_table1.datetime ASC;
Running EXPLAIN on the query shows:
+----+-------------+--------------+------+---------------+------+---------+------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+------+---------------+------+---------+------+-------+---------------------------------+
| 1 | SIMPLE | value_table1 | ALL | NULL | NULL | NULL | NULL | 83784 | Using temporary; Using filesort |
| 1 | SIMPLE | value_table2 | ALL | NULL | NULL | NULL | NULL | 83735 | |
+----+-------------+--------------+------+---------------+------+---------+------+-------+---------------------------------+
Edit
Problem solved, no idea where my index disappeared to. EXPLAIN showed it, thanks!
Thanks!
As your explain shows, the query is not using indexes on the join. Without indexes, it has to scan every row in both tables to process the join.
First of all, make sure the columns used in the join are both indexed.
If they are, then it might be the column type that is causing issues. You could create an integer representation of the time, and then use that to join the two tables.
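Checking for the indexes and restoring a missing one could look like the sketch below. It assumes value_table2 has the same datetime column as value_table1 (as the join condition implies) and that its index is the one that went missing; the question does not say which table was affected.

```sql
-- List the indexes that actually exist on each table.
SHOW INDEX FROM value_table1;
SHOW INDEX FROM value_table2;

-- Re-create the index on the join column if it is missing.
ALTER TABLE value_table2 ADD KEY `datetime` (`datetime`);
```

After that, re-running EXPLAIN on the join should show a key for at least one of the two tables instead of ALL for both.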

MySQL datetime index is not working

Table structure:
+-------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| total | int(11) | YES | | NULL | |
| thedatetime | datetime | YES | MUL | NULL | |
+-------------+----------+------+-----+---------+----------------+
Total rows: 137967
mysql> explain select * from out where thedatetime <= NOW();
+----+-------------+-------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | out | ALL | thedatetime | NULL | NULL | NULL | 137967 | Using where |
+----+-------------+-------------+------+---------------+------+---------+------+--------+-------------+
The real query is much longer, with more table joins. The point is that I can't get the table to use the datetime index. This is going to be hard for me if I want to select all data up to a certain date. However, I noticed that I can get MySQL to use the index if I select a smaller subset of data.
mysql> explain select * from out where thedatetime <= '2008-01-01';
+----+-------------+-------------+-------+---------------+-------------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+-------------+---------+------+-------+-------------+
| 1 | SIMPLE | out | range | thedatetime | thedatetime | 9 | NULL | 15826 | Using where |
+----+-------------+-------------+-------+---------------+-------------+---------+------+-------+-------------+
mysql> select count(*) from out where thedatetime <= '2008-01-01';
+----------+
| count(*) |
+----------+
| 15990 |
+----------+
So, what can I do to make sure MySQL will use the index no matter which date I use?
There are two things in play here:
1. The index is not selective enough. If the index range covers more than roughly 30% of the rows, MySQL will decide a full table scan is more efficient. When you narrow the range, the index kicks in.
2. One index per table in a join.
"The real query is much longer, with more table joins, the point is ..."
The point is exactly that: because it has joins, it probably can't use that index. MySQL can use one index per table in a join (unless it qualifies for an index-merge optimization). If the primary key is already used for the join, thedatetime won't be used. In order to use it, you need to create a multi-column index on the join key + thedatetime, in the correct order.
Check the EXPLAIN of the actual query to see which key MySQL uses for the join. Modify that index to include the thedatetime column as well, or create a new multi-column index from both (depending on what you use the join key for).
Everything works as it is supposed to. :)
Indexes are there to speed up retrieval. They do it using index lookups.
In your first query the index is not used because you are retrieving ALL rows, and in this case using the index is slower (lookup index, get row, lookup index, get row... times the number of rows is slower than getting all rows with a table scan).
In the second query you are retrieving only a portion of the data, and in this case a table scan is much slower.
The job of the optimizer is to use the statistics that the RDBMS keeps on the index to determine the best plan. In the first case the index was considered, but the planner (correctly) threw it away.
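To check the planner's decision yourself, you can compare the default plan with a forced one and then time both queries. This is a sketch; the table name is quoted with backticks here because OUT is a reserved word in MySQL.

```sql
-- Default plan: the optimizer is free to choose a full table scan.
EXPLAIN SELECT * FROM `out` WHERE thedatetime <= NOW();

-- Forced plan: the optimizer must use the thedatetime index if it can.
EXPLAIN SELECT * FROM `out` FORCE INDEX (thedatetime)
WHERE thedatetime <= NOW();
```

If the forced version turns out slower when run for real, that supports the planner's choice of the table scan for wide date ranges.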
EDIT
You might want to read something like this to get some concepts and keywords regarding mysql query planner.