I have a query like this:
SELECT id, terrain, occupied, c_type
FROM map
WHERE x >= $x-$radius
AND x <= $x+$radius
AND y >= $y-$radius
AND y <= $y+$radius
ORDER BY
x ASC,
y ASC
My table looks like this:
CREATE TABLE `map` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`occupied` tinyint(2) NOT NULL DEFAULT '0',
`c_type` tinyint(4) NOT NULL DEFAULT '0',
`x` int(11) NOT NULL,
`y` int(11) NOT NULL,
`terrain` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
I removed all indexes except the PRIMARY KEY because I am unsure how indexing works in SQL.
What can I do to tune this query? Thanks...
This is not a duplicate; check the comments!
The best you can do with that query is
INDEX(x, y) -- In this order
That will be effective with
WHERE x >= $x-$radius
AND x <= $x+$radius
but not for filtering on y.
And it will (probably) avoid the "filesort" for ORDER BY x ASC, y ASC. Note that the index order must match this order.
Provide EXPLAIN SELECT ... for any attempted SELECT.
And, please switch to InnoDB before MyISAM is removed.
Here is a quick cookbook on indexing in MySQL.
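As a rough sketch, the suggestions above could be applied like this. The index name map_x_y and the literal values standing in for $x, $y and $radius are placeholders chosen for illustration, not taken from the question.

```sql
-- Composite index with x first, then y (the name map_x_y is an arbitrary choice)
ALTER TABLE map ADD INDEX map_x_y (x, y);

-- Convert the table from MyISAM to InnoDB (this rebuilds the table)
ALTER TABLE map ENGINE=InnoDB;

-- Inspect the plan; 100, 100 and 10 are placeholder values for $x, $y and $radius
EXPLAIN SELECT id, terrain, occupied, c_type
FROM map
WHERE x >= 100 - 10
  AND x <= 100 + 10
  AND y >= 100 - 10
  AND y <= 100 + 10
ORDER BY x ASC, y ASC;
```

In the EXPLAIN output, check whether key shows the new index and whether "Using filesort" is absent from Extra.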
So, I added index gmp with columns x, y, id, terrain, occupied, c_type and EXPLAIN SELECT displays:
id: 1
select_type: SIMPLE
table: map
type: range
possible_keys: gmp
key: gmp
key_len: 8
ref: NULL
rows: 1955
Extra: Using where; Using index
So, I guess it works now.
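For reference, an index like the one described could be created with a statement along these lines. This is reconstructed from the column list above; the exact statement that was run is not shown, so treat it as an assumption.

```sql
-- Covering index: starts with (x, y) for the range filter and ORDER BY,
-- then carries every other column the SELECT reads
ALTER TABLE map ADD INDEX gmp (x, y, id, terrain, occupied, c_type);
```

"Using index" in the Extra column means the query is answered from the index alone, without reading the table rows.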
Related
I am working on a MySQL 5.6 database, and I have a table that looks something like this:
CREATE TABLE `items` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`account_id` int(11) NOT NULL,
`node_type_id` int(11) NOT NULL,
`property_native_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`parent_item_id` bigint(20) DEFAULT NULL,
`external_timestamp` datetime DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_items_on_acct_node_prop` (`account_id`,`node_type_id`,`property_native_id`),
KEY `index_items_on_account_id_and_external_timestamp` (`account_id`,`external_timestamp`),
KEY `index_items_on_account_id_and_created_at` (`account_id`,`created_at`),
KEY `parent_item_external_timestamp_idx` (`parent_item_id`,`external_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=194417315 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
I am trying to optimize a query doing this:
SELECT *
FROM items
WHERE parent_item_id = ?
AND external_timestamp < ( SELECT external_timestamp
FROM items
WHERE id = ?
) ORDER BY
external_timestamp LIMIT 5
Currently, there is an index on parent_item_id, so when I run this query with EXPLAIN, I get an "Extra" of "Using where; Using filesort".
When I modify the index to be (parent_item_id, external_timestamp), the EXPLAIN's "Extra" becomes "Using index condition".
The problem is that the EXPLAIN's "rows" field is still the same (which is usually a couple thousand rows, but it could be millions in some use-cases).
I know that I can do something like AND external_timestamp > (1 week ago) or something like that, but I'd really like the number of rows to be just the number of LIMIT, so 5 in that case.
Is it possible to instruct the database to lock onto a row and then get the 5 rows before it on that (parent_item_id, external_timestamp) index?
(I'm unclear on what you are trying to do. Perhaps you should provide some sample input and output.) See if this works for you:
SELECT i.*
FROM items AS i
WHERE i.parent_item_id = ?
AND i.external_timestamp < ( SELECT external_timestamp
FROM items
WHERE id = ? )
ORDER BY i.external_timestamp
LIMIT 5
Your existing INDEX(parent_item_id, external_timestamp) will probably be used; see EXPLAIN SELECT ....
If id was supposed to match in all 5 rows, then the subquery is not needed.
SELECT items.*
FROM items
CROSS JOIN ( SELECT external_timestamp
FROM items
WHERE id = ? ) subquery
WHERE items.parent_item_id = ?
AND items.external_timestamp < subquery.external_timestamp
ORDER BY external_timestamp LIMIT 5
id is PK, hence the subquery will return only one row (or none).
I noticed a serious problem recently, when my database grew to over 620,000 records. The following query:
SELECT *,UNIX_TIMESTAMP(`time`) AS `time` FROM `log` WHERE (`projectname`="test" OR `projectname` IS NULL) ORDER BY `time` DESC LIMIT 0, 20
takes about 2.5 s to execute on a local database. How can I speed it up?
The EXPLAIN command produces the following output:
ID: 1
select type: SIMPLE
TABLE: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 387
ref: const
rows: 310661
Extra: Using where; using filesort
I have indexes on the projectname and time columns.
Any help?
EDIT: Thanks to ypercube's response, I was able to decrease the query execution time. But as soon as I add another condition to the WHERE clause (AND severity="changes"), it takes 2 s again. Is it a good solution to include all of the possible WHERE columns in my multi-column index?
ID: 1
select type: SIMPLE
TABLE: log
type: ref_or_null
possible_keys: projectname
key: projectname
key_len: 419
ref: const, const
rows: 315554
Extra: Using where; using filesort
Table structure:
CREATE TABLE `log` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`projectname` VARCHAR(128) DEFAULT NULL,
`time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`master` VARCHAR(128) NOT NULL,
`itemName` VARCHAR(128) NOT NULL,
`severity` VARCHAR(10) NOT NULL DEFAULT 'info',
`message` VARCHAR(255) NOT NULL,
`more` TEXT NOT NULL,
PRIMARY KEY (`id`),
KEY `projectname` (`severity`,`projectname`,`time`)
) ENGINE=INNODB AUTO_INCREMENT=621691 DEFAULT CHARSET=utf8
Add an index on (projectname, time):
ALTER TABLE log
ADD INDEX projectname_time_IX -- choose a name for the index
(projectname, time) ;
Then use the original column in the ORDER BY:
SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE (projectname = 'test' OR projectname IS NULL)
ORDER BY time DESC
LIMIT 0, 20 ;
or this variation - to make sure that the index is used effectively:
( SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE projectname = 'test'
ORDER BY time DESC
LIMIT 20
)
UNION ALL
( SELECT *, UNIX_TIMESTAMP(time) AS unix_time
FROM log
WHERE projectname IS NULL
ORDER BY time DESC
LIMIT 20
)
ORDER BY time DESC
LIMIT 20 ;
Table structure:
CREATE TABLE IF NOT EXISTS `logs` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user` bigint(20) unsigned NOT NULL,
`type` tinyint(1) unsigned NOT NULL,
`date` int(11) unsigned NOT NULL,
`plus` decimal(10,2) unsigned NOT NULL,
`minus` decimal(10,2) unsigned NOT NULL,
`tax` decimal(10,2) unsigned NOT NULL,
`item` bigint(20) unsigned NOT NULL,
`info` char(10) NOT NULL,
PRIMARY KEY (`id`),
KEY `item` (`item`),
KEY `user` (`user`),
KEY `type` (`type`),
KEY `date` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=0 ROW_FORMAT=FIXED;
Query:
SELECT logs.item, COUNT(logs.item) AS total FROM logs WHERE logs.type = 4 GROUP BY logs.item;
The table holds 110k records, of which 50k are type 4 records.
Execution time: 0.13 seconds
I know this is fast, but can I make it faster?
I am expecting 1 million records and thus the time would grow quite a bit.
Analyze queries with EXPLAIN:
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type
key: type
key_len: 1
ref: const
rows: 1
Extra: Using where; Using temporary; Using filesort
The "Using temporary; Using filesort" indicates some costly operations. Because the optimizer knows it can't rely on the rows with each value of item being stored together, it needs to scan the whole table and collect the count per distinct item in a temporary table. It then sorts the resulting temporary table to produce the result.
You need an index on the logs table on columns (type, item) in that order. Then the optimizer knows it can leverage the index tree to scan each value of logs.item fully before moving on to the next value. By doing this, it can skip the temporary table to collect values, and skip the implicit sorting of the result.
mysql> CREATE INDEX logs_type_item ON logs (type,item);
mysql> EXPLAIN SELECT logs.item, COUNT(logs.item) AS total FROM logs
WHERE logs.type = 4 GROUP BY logs.item\G
id: 1
select_type: SIMPLE
table: logs
type: ref
possible_keys: type,logs_type_item
key: logs_type_item
key_len: 1
ref: const
rows: 1
Extra: Using where
I have a simple MySQL query, but when I have a lot of records (currently about 103,000), the performance is really slow. EXPLAIN says it is using filesort, and I'm not sure whether that is why it is slow. Does anyone have suggestions to speed it up, or to stop it from using filesort?
MySQL query:
SELECT *
FROM adverts
WHERE (price >= 0)
AND (status = 1)
AND (approved = 1)
ORDER BY date_updated DESC
LIMIT 19990, 10
The Explain results :
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE adverts range price price 4 NULL 103854 Using where; Using filesort
Here is the adverts table and indexes:
CREATE TABLE `adverts` (
`advert_id` int(10) NOT NULL AUTO_INCREMENT,
`user_id` int(10) NOT NULL,
`type_id` tinyint(1) NOT NULL,
`breed_id` int(10) NOT NULL,
`advert_type` tinyint(1) NOT NULL,
`headline` varchar(50) NOT NULL,
`description` text NOT NULL,
`price` int(4) NOT NULL,
`postcode` varchar(7) NOT NULL,
`town` varchar(60) NOT NULL,
`county` varchar(60) NOT NULL,
`latitude` float NOT NULL,
`longitude` float NOT NULL,
`telephone1` varchar(15) NOT NULL,
`telephone2` varchar(15) NOT NULL,
`email` varchar(80) NOT NULL,
`status` tinyint(1) NOT NULL DEFAULT '0',
`approved` tinyint(1) NOT NULL DEFAULT '0',
`date_created` datetime NOT NULL,
`date_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`expiry_date` datetime NOT NULL,
PRIMARY KEY (`advert_id`),
KEY `price` (`price`),
KEY `user` (`user_id`),
KEY `type_breed` (`type_id`,`breed_id`),
KEY `headline_keywords` (`headline`),
KEY `date_updated` (`date_updated`),
KEY `type_status_approved` (`advert_type`,`status`,`approved`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
The problem is that MySQL only uses one index when executing the query. If you add a new index that uses the 3 fields in your WHERE clause, it will find the rows faster.
ALTER TABLE `adverts` ADD INDEX price_status_approved(`price`, `status`, `approved`);
According to the MySQL documentation ORDER BY Optimization:
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:
The key used to fetch the rows is not the same as the one used in the ORDER BY.
This is what happens in your case.
As the output of EXPLAIN tells us, the optimizer uses the key price to find the rows. However, the ORDER BY is on the field date_updated which does not belong to the key price.
To find the rows faster AND sort the rows faster, you need to add an index that contains all the fields used in the WHERE and in the ORDER BY clauses:
ALTER TABLE `adverts` ADD INDEX status_approved_date_updated(`status`, `approved`, `date_updated`);
The field used for sorting must be in the last position in the index. It is useless to include price in the index, because the condition used in the query will return a range of values.
If EXPLAIN still shows that it is using filesort, you may try forcing MySQL to use an index you choose:
SELECT adverts.*
FROM adverts
FORCE INDEX(status_approved_date_updated)
WHERE price >= 0
AND adverts.status = 1
AND adverts.approved = 1
ORDER BY date_updated DESC
LIMIT 19990, 10
It is usually not necessary to force an index, because the MySQL optimizer most often makes the correct choice. But sometimes it makes a bad choice, or not the best one. You will need to run some tests to see whether forcing the index improves performance.
Remove the ticks around the '0' - it currently may prevent using the index but I am not sure.
Nevertheless it is better style since price is int type and not a character column.
SELECT adverts.*
FROM adverts
WHERE (
price >= 0
)
AND (
adverts.status = 1
)
AND (
adverts.approved = 1
)
ORDER BY date_updated DESC
LIMIT 19990 , 10
MySQL does not make use of the key date_updated for the sorting; it just uses the price key because price appears in the WHERE clause. You could try to use index hints:
http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
Add something like
USE KEY FOR ORDER BY (date_updated)
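Placed in the full query, the hint would look roughly like this. It is a sketch only: an index hint goes directly after the table name, and whether the optimizer actually produces a better plan with it has to be checked with EXPLAIN.

```sql
SELECT adverts.*
FROM adverts
USE KEY FOR ORDER BY (date_updated)  -- hint: consider this index for the sort only
WHERE price >= 0
  AND adverts.status = 1
  AND adverts.approved = 1
ORDER BY date_updated DESC
LIMIT 19990, 10;
```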
I have two suggestions. First, remove the quotes around the zero in your where clause. That line should be:
price >= 0
Second, create this index:
CREATE INDEX `helper` ON `adverts`(`status`,`approved`,`price`,`date_updated`);
This should allow MySQL to find the 10 rows specified by your LIMIT clause by using only the index. Filesort itself is not a bad thing... the number of rows that need to be processed is.
Your WHERE condition uses price, status, approved to select, and then date_updated is used to sort.
So you need a single index with those fields; I'd suggest indexing on approved, status, price and date_updated, in this order.
The general rule is placing WHERE equalities first, then ranges (more than, less or equal, between, etc), and sorting fields last. (Note that leaving one field out might make the index less usable, or even unusable, for this purpose).
CREATE INDEX advert_ndx ON adverts (approved, status, price, date_updated);
This way, access to the table data is only needed after LIMIT has worked its magic, and you will slow-retrieve only a small number of records.
I'd also remove any unneeded indexes, which would speed up INSERTs and UPDATEs.
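One way to review what is currently indexed before dropping anything is sketched below. Which indexes are actually unneeded depends on the other queries the application runs, so the index dropped here (price) is only a hypothetical example, not a recommendation.

```sql
-- List the existing indexes on the table
SHOW INDEX FROM adverts;

-- Hypothetical: drop an index only after confirming that no query relies on it
ALTER TABLE adverts DROP INDEX price;
```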
To start out here is a simplified version of the tables involved.
tbl_map has approx 4,000,000 rows, tbl_1 has approx 120 rows, and tbl_2 contains approx 5,000,000 rows. I know the data shouldn't be considered that large, given that Google, Yahoo!, etc. use much larger datasets. So I'm assuming that I'm missing something.
CREATE TABLE `tbl_map` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`tbl_1_id` bigint(20) DEFAULT '-1',
`tbl_2_id` bigint(20) DEFAULT '-1',
`rating` decimal(3,3) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tbl_1_id` (`tbl_1_id`),
KEY `tbl_2_id` (`tbl_2_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `tbl_1` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `tbl_2` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`data` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The query of interest is below (it uses ORDER BY t.id DESC instead of ORDER BY RAND()). The query takes as much as 5~10 seconds and causes a considerable wait when users view this page.
EXPLAIN SELECT t.data, t.id , tm.rating
FROM tbl_2 AS t
JOIN tbl_map AS tm
ON t.id = tm.tbl_2_id
WHERE tm.tbl_1_id =94
AND tm.rating IS NOT NULL
ORDER BY t.id DESC
LIMIT 200
1 SIMPLE tm ref tbl_1_id, tbl_2_id tbl_1_id 9 const 703438 Using where; Using temporary; Using filesort
1 SIMPLE t eq_ref PRIMARY PRIMARY 8 tm.tbl_2_id 1
I would just like to speed up the query, ensure that I have proper indexes, etc.
I appreciate any advice from DB Gurus out there! Thanks.
SUGGESTION : Index the table as follows:
ALTER TABLE tbl_map ADD INDEX (tbl_1_id,rating,tbl_2_id);
As per Rolando, yes, you definitely need an index on the map table, but I would expand it to ALSO include tbl_2_id, which serves your ORDER BY on table 2's ID (the same value is stored in the map table, so just use that index). Also, since the index now holds all 3 fields and is organized by the ID used for the key search and then by rating (NULL or not), the 3rd element is already in order for your ORDER BY clause.
INDEX (tbl_1_id,rating, tbl_2_id);
Then, I would just have the query as
SELECT STRAIGHT_JOIN
t.data,
t.id ,
tm.rating
FROM
tbl_map tm
join tbl_2 t
on tm.tbl_2_id = t.id
WHERE
tm.tbl_1_id = 94
AND tm.rating IS NOT NULL
ORDER BY
tm.tbl_2_id DESC
LIMIT 200