MySQL Query Optimization for GPS Tracking System - mysql

I have the following query:
SELECT * FROM `alltrackers`
WHERE `deviceid`='FT_99000083401624'
AND `locprovider`!='none'
ORDER BY `id` DESC
This is the SHOW CREATE TABLE output:
CREATE TABLE `alltrackers` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`deviceid` varchar(50) NOT NULL,
`gpsdatetime` int(11) NOT NULL,
`locprovider` varchar(30) NOT NULL,
PRIMARY KEY (`id`),
KEY `statename` (`statename`),
KEY `gpsdatetime` (`gpsdatetime`),
KEY `locprovider` (`locprovider`),
KEY `deviceid` (`deviceid`(18))
) ENGINE=MyISAM AUTO_INCREMENT=8665045 DEFAULT CHARSET=utf8;
I've removed the columns which I thought were unnecessary for this question.
This is the EXPLAIN output for this query:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE alltrackers ref locprovider,deviceid deviceid 56 const 156416 Using where; Using filesort
This particular query shows up as taking several seconds in mytop (mtop). I'm a bit confused, though, as the same query with a different "deviceid" doesn't take as long. Although I only need the last row, I've already removed LIMIT 1 because it makes the query take even longer. The table currently contains 3 million rows.
It is used for storing the locations from different GPS devices. Each GPS device has a unique device ID. Locations come in and are added to the table. For statistics I'm running the above query to find the time of the last received location from a certain device.
I'm open to advice on ways to further optimize the query or even the tables.
Many thanks in advance.

If you only need the last row, add an index on (deviceid, id, locprovider). It would be even faster with an index on (deviceid, id, locprovider, gpsdatetime):
ALTER TABLE alltrackers
ADD INDEX special_covering_IDX
(deviceid, id, locprovider, gpsdatetime) ;
Then try this out:
SELECT id, locprovider, gpsdatetime
FROM alltrackers
WHERE deviceid = 'FT_99000083401624'
AND locprovider <> 'none'
ORDER BY id DESC
LIMIT 1 ;
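Assuming the ALTER above has been applied, running EXPLAIN on the rewritten query should show the new index in the key column with "Using where; Using index" in Extra and no filesort:
EXPLAIN SELECT id, locprovider, gpsdatetime
FROM alltrackers
WHERE deviceid = 'FT_99000083401624'
AND locprovider <> 'none'
ORDER BY id DESC
LIMIT 1 ;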

Related

MySQL indexes not being used in large database

I have a very simple query on a large table (about 37 million rows). This query takes over 10 mins to run and should be fast as the indexes are built correctly (I think). I do not understand why this query is taking so long. I am hoping someone can guide me in the right direction:
Query:
select type_id, sub_type_id, max(settlement_date_time) as max_dt
from transaction_history
group by type_id, sub_type_id
Create Statement:
CREATE TABLE `transaction_history` (
`transaction_history_id` int(11) NOT NULL AUTO_INCREMENT,
`type_id` int(11) NOT NULL,
`sub_type_id` int(11) DEFAULT NULL,
`settlement_date_time` datetime DEFAULT NULL,
PRIMARY KEY (`transaction_history_id`),
KEY `sub_type_id_idx` (`sub_type_id`),
KEY `settlement_date` (`settlement_date_time`),
KEY `type_sub_type` (`type_id`,`sub_type_id`)
) ENGINE=InnoDB AUTO_INCREMENT=36832823 DEFAULT CHARSET=latin1;
Result from Explain:
id -> 1
select_type -> SIMPLE
table -> transaction_history
type -> index
possible_keys -> NULL
key -> type_sub_type
key_len -> 9
ref -> NULL
rows -> 37025337
filtered -> 100.00
Extra ->
Why is possible_keys NULL? It says it is using an index, but it does not seem like it is. Why is ref NULL? How can I make this query more efficient? Is there something wrong with the indexes? Do I have to change any values in the MySQL config file?
Thank you
(Apologies to the two commenters who already gave the necessary INDEX; I'll try to say enough more to justify giving an 'Answer'.)
Use the 'composite' (and 'covering') index:
INDEX(type_id, sub_type_id, settlement_date_time)
There is no WHERE, so no need to worry about such columns. First come the columns in the order listed in GROUP BY, then comes the other column. The Optimizer will probably hop through the index very efficiently.
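As DDL, that might look like this (the index name is just an example):
ALTER TABLE transaction_history
ADD INDEX type_subtype_date (type_id, sub_type_id, settlement_date_time);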
Why is possible_keys NULL? Because the 2-column index is useless here. In general, if more than about 20% of the table needs to be looked at, it is better to simply scan the table than to bounce between the index BTree and the data BTree.
More tips: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Trouble understanding how to get rid of a table scan in a basic MYSQL explain example

I'm trying to optimize a very basic MySQL example and I can't seem to figure out how to prevent the below query from doing a table scan when referencing the column uid. Using EXPLAIN, it shows a possible key correctly but doesn't actually use the key, and it scans all the rows.
CREATE TABLE `Foo` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`barId` int(10) unsigned NOT NULL,
`uid` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `barId` (`barId`),
KEY `uid` (`uid`)
)
explain
select count(uid) as userCount
FROM Foo
WHERE barId = 1
GROUP BY barId
id select_type table type possible_keys key rows Extra
1 SIMPLE Foo ALL barId NULL 4 Using where
Sample data
id,barId,uid
1,1,1
2,1,2
3,1,3
4,2,4
It looks like MySQL is being smart and realizing it would take more time to use the index with a table that small?
When I EXPLAIN it empty, the key is "barId".
With 4 rows (your sample data), the key is NULL.
With 4096 rows (I ran INSERT ... SELECT into itself a handful of times), the key returns to "barId".
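For reference, the INSERT ... SELECT trick mentioned above can be written like this (each run doubles the row count):
INSERT INTO Foo (barId, uid) SELECT barId, uid FROM Foo;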
From the MySQL manual:
Indexes are less important for queries on small tables, or big tables
where report queries process most or all of the rows. When a query
needs to access most of the rows, reading sequentially is faster than
working through an index. Sequential reads minimize disk seeks, even
if not all the rows are needed for the query.

MySQL: Get the first row that has a field >= value - is there a way to speed it up?

There is a table:
CREATE TABLE `test` (
`thing` int(10) unsigned NOT NULL,
`price` decimal(10,2) unsigned NOT NULL,
PRIMARY KEY (`thing`,`price`),
KEY `thing` (`thing`),
KEY `price` (`price`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and some values there:
INSERT INTO test(thing,price)
VALUES
(1,5.00),
(2,7.50),
(3,8.70),
(4,9.00),
(5,9.50),
(6,9.75),
(7,10.00),
(8,10.50),
(9,10.75),
(10,11.00),
(11,11.25);
I want to get the MINIMAL price from this table that is MORE than, say, 9.2, which is the (5,9.50) record. So I do:
SELECT thing, price FROM test WHERE price > 9.2 ORDER BY price LIMIT 1;
Its EXPLAIN output says that MySQL goes through all 7 rows that are more than 9.2:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE test range price price 5 \N 7 Using where; Using index
Is there a way to speed this up somehow? So that MySQL would give me ONE record that is a little more than my condition?
Thanks in advance for your help!
How fast do you want it to be? How slow is it now? This could be a case of premature optimization. Please understand that the number of rows in the EXPLAIN plan is just an estimate.
But nonetheless, try this to see if it makes any difference:
SELECT thing, min(price) FROM test WHERE price > 9.2
Create an index on price. You probably don't want to have the primary key on (thing, price), because that would allow a thing to have more than one price. Just make thing the primary key. (I am assuming that you don't want a particular thing to have multiple price values.)
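A sketch of that change, assuming each thing really does have only one price (note the table already has a KEY on price):
ALTER TABLE test
DROP PRIMARY KEY,
ADD PRIMARY KEY (thing),
DROP INDEX `thing`; -- the single-column `thing` key becomes redundant with the new PK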

mysql where + group by very slow

Here's a question I should be able to answer myself, but I can't, and I also haven't found any answer on Google:
I have a table that contains 5 million rows with this structure:
CREATE TABLE IF NOT EXISTS `files_history2` (
`FILES_ID` int(10) unsigned DEFAULT NULL,
`DATE_FROM` date DEFAULT NULL,
`DATE_TO` date DEFAULT NULL,
`CAMPAIGN_ID` int(10) unsigned DEFAULT NULL,
`CAMPAIGN_STATUS_ID` int(10) unsigned DEFAULT NULL,
`ON_HOLD` decimal(1,0) DEFAULT NULL,
`DIVISION_ID` int(11) DEFAULT NULL,
KEY `DATE_FROM` (`DATE_FROM`),
KEY `FILES_ID` (`FILES_ID`),
KEY `CAMPAIGN_ID` (`CAMPAIGN_ID`),
KEY `CAMP_DATE` (`CAMPAIGN_ID`,`DATE_FROM`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
When I execute
SELECT files_id, min( date_from )
FROM files_history2
WHERE campaign_id IS NOT NULL
GROUP BY files_id
the query sits with status "Sending data" for more than eight hours (at which point I killed the process).
Here the explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE files_history2 ALL CAMPAIGN_ID,CAMP_DATE NULL NULL NULL 5073254 Using where; Using temporary; Using filesort
I assume I have created the necessary keys, but then the query shouldn't take that long, should it?
I would suggest a different index... an index on (Files_ID, Date_From, Campaign_ID), and here's why...
Since your GROUP BY is on Files_ID, you want those grouped first. Then the MIN(Date_From), so that column goes in second position. Then, finally, the Campaign_ID to qualify for NOT NULL.
If you put all your campaign IDs first, great, you get all the NULLs out of the way... but now, with 1,000 campaigns, and Files_ID values that span MANY campaigns and also span many dates, the query is going to choke.
With the index I'm proposing, by putting Files_ID first, each "files_id" is already ordered to match your GROUP BY. Then, within that, all the earliest dates are at the top of the indexed list... great, almost there... then, by Campaign_ID, skip over whatever NULLs may be there and you are done, on to the next Files_ID.
Hope this makes sense, unless you have TONs of entries with NULL-value campaigns.
Also, by having all 3 parts of the index matching the criteria and output columns of your query, it never has to go back to the raw data file for the data, it gets it all from the index directly.
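Expressed as DDL, this suggestion might look like the following (the index name is just an example):
ALTER TABLE files_history2
ADD INDEX files_date_campaign (FILES_ID, DATE_FROM, CAMPAIGN_ID);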
I'd create a covering index (CAMPAIGN_ID, files_id, date_from) and check that performance. I suspect your issue is due to the grouping and date_from not being able to use the same index.
CREATE INDEX your_index_name ON files_history2 (CAMPAIGN_ID, files_id, date_from);
If this works, you could drop the single-column index CAMPAIGN_ID, as it's the leftmost prefix of the composite index.
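Dropping the redundant index would then look like this (using the index name from the CREATE TABLE above):
ALTER TABLE files_history2 DROP INDEX `CAMPAIGN_ID`;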
Well, the query is slow due to the aggregation (the MIN function) combined with the grouping.
One solution is to alter the query by moving the aggregation into a derived table in the FROM clause, which can be a lot faster than the approach you are using.
Try the following:
SELECT f.files_id, f1.datefrom
FROM files_history2 AS f
JOIN (
SELECT files_id, MIN(date_from) AS datefrom
FROM files_history2
WHERE campaign_id IS NOT NULL
GROUP BY files_id
) AS f1 ON f.files_id = f1.files_id AND f.date_from = f1.datefrom;
This should give much better performance; if it doesn't, a temporary table would be the remaining option.

MySQL query needs optimization

I got this query:
SELECT user_id
FROM basic_info
WHERE age BETWEEN 18 AND 22 AND gender = 0
ORDER BY rating
LIMIT 50
The table looks like (and it contains about 700k rows):
CREATE TABLE IF NOT EXISTS `basic_info` (
`user_id` mediumint(8) unsigned NOT NULL auto_increment,
`gender` tinyint(1) unsigned NOT NULL default '0',
`age` tinyint(2) unsigned NOT NULL default '0',
`rating` smallint(5) unsigned NOT NULL default '0',
PRIMARY KEY (`user_id`),
KEY `tmp` (`gender`,`rating`)
) ENGINE=MyISAM;
The query itself is optimized but it has to walk about 200k rows to do its job.
Here's the explain output:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE basic_info ref tmp,age tmp 1 const 200451 Using where
Is it possible to optimize the query so it won't walk over 200k rows?
Thanks!
There are two useful indexes that can help this query:
KEY gender_age (gender, age) -- this index can satisfy both the gender=0 condition as well as age BETWEEN 18 AND 22. However, because you have a range condition over the age field, adding the rating column to the index will not give sorted results -- hence MySQL will select all matching rows -- ignoring your LIMIT clause -- and do an additional filesort regardless.
KEY gender_rating (gender, rating) -- the index you already have; this index can satisfy the gender=0 condition and retrieves data already sorted by rating. However, the database has to scan all elements with gender=0 and eliminate those who are not in range age BETWEEN 18 AND 22
Changing schema
If the above does not help you enough, changing your schema is always possible. One such approach is turning the age BETWEEN condition into an equality condition, by defining an age group column; for instance, ages 0-12 will be in age group 1, ages 12-18 in age group 2, etc.
This way, having an index with (gender, agegroup, rating) and query with WHERE gender=0 AND agegroup=3 ORDER BY rating will retrieve all results from the index and already sorted. In this case, the LIMIT clause should only fetch 50 entries from the table and no more.
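A sketch of that schema change, with hypothetical column and index names and an application-defined bucketing:
ALTER TABLE basic_info
ADD COLUMN agegroup TINYINT UNSIGNED NOT NULL DEFAULT 0,
ADD KEY gender_agegroup_rating (gender, agegroup, rating);

-- after populating agegroup from age, the range turns into an equality:
SELECT user_id
FROM basic_info
WHERE gender = 0 AND agegroup = 3
ORDER BY rating
LIMIT 50;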
Extend your tmp key to include the age column:
KEY `tmp` (`age`,`gender`,`rating`)
Attempt to use InnoDB to improve performance?