I have a table with ~1,500,000 records:
CREATE TABLE `item_locale` (
  `item_id` bigint(20) NOT NULL,
  `language` int(11) NOT NULL,
  `name` varchar(256) COLLATE utf8_czech_ci NOT NULL,
  `text` text COLLATE utf8_czech_ci,
  PRIMARY KEY (`item_id`,`language`),
  KEY `name` (`name`(255))
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_czech_ci;
item_id and language together form the primary key, and name has a prefix index of length 255.
With following query:
select item_id, name from item_locale order by name limit 50;
The select takes around 3 seconds even though only 50 rows were requested.
What can I do to speed up such query?
EDIT: Some of you suggested adding an INDEX. As I mentioned above, the name column is already indexed with a prefix length of 255.
I ran EXPLAIN on the query:
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
| 1 | SIMPLE | item_locale | ALL | NULL | NULL | NULL | NULL | 1558653 | Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
The strange thing is that it does not seem to use any index at all...
Retrieving 50 records is heavier too; since you are also using ORDER BY, try limiting it to 10.
Try using a query hint:
select item_id, name
from item_locale USE INDEX FOR ORDER BY (name)
order by name limit 50;
Also try:
select item_id, name
from item_locale FORCE INDEX (name)
order by name limit 50;
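One more thing worth checking: MySQL generally cannot use a *prefix* index such as `name`(255) to resolve an ORDER BY, so the filesort may persist until the full column is indexed. If no value actually exceeds 255 characters (an assumption about your data), shortening the column so it can be indexed in full might help:

```sql
-- Sketch: shorten `name` so the whole column fits in the index
-- (assumes no stored value is longer than 255 characters).
ALTER TABLE item_locale
  MODIFY `name` varchar(255) COLLATE utf8_czech_ci NOT NULL;
ALTER TABLE item_locale
  DROP INDEX `name`,
  ADD INDEX `name` (`name`);
```

With the full column indexed, the optimizer can walk the index in order and stop after 50 rows instead of sorting the whole table.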
In the end there was some kind of problem with the indexes: I dropped them all and recreated them, and now it finally works. Thanks.
Apply an index on the name field; it might speed things up a bit.
Version - 10.4.25-MariaDB
I have a table where the column name is the second part of the primary key (idarchive, name).
When I run COUNT(*) on the table WHERE name LIKE 'done%', it uses the index on name properly, but when I run SELECT * it does not use that separate index; it uses the primary key instead, which slows the query down.
Any idea what we can do here?
Any change to optimizer_switch, or any other alternative, that could help?
Note: we can't use FORCE INDEX, as the queries are not controllable.
Table structure:
CREATE TABLE `table1` (
`idarchive` int(10) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`idsite` int(10) unsigned DEFAULT NULL,
`date1` date DEFAULT NULL,
`date2` date DEFAULT NULL,
`period` tinyint(3) unsigned DEFAULT NULL,
`ts_archived` datetime DEFAULT NULL,
`value` double DEFAULT NULL,
PRIMARY KEY (`idarchive`,`name`),
KEY `index_idsite_dates_period` (`idsite`,`date1`,`date2`,`period`,`ts_archived`),
KEY `index_period_archived` (`period`,`ts_archived`),
KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Queries:
explain select count(*) from table1 WHERE name like 'done%' ;
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
| 1 | SIMPLE | table1 | range | name | name | 767 | NULL | 9131455 | Using where; Using index |
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
1 row in set (0.000 sec)
explain select * from table1 WHERE name like 'done%' ;
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
| 1 | SIMPLE | table1 | ALL | name | NULL | NULL | NULL | 18262910 | Using where |
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
1 row in set (0.000 sec)
Your SELECT COUNT(*) ... LIKE 'constant%' query is covered by your index on your name column. That is, the entire query can be satisfied by reading the index. So the query planner decides to range-scan your index to generate the result.
On the other hand, your SELECT * query needs all columns from all rows of the table. That can't be satisfied from any of your indexes. And, it's possible your WHERE name like 'done%' filter reads a significant fraction of the table, enough so the query planner decides the fastest way to satisfy it is to scan the entire table. The query planner figures this out by using statistics on the contents of the table, plus some knowledge of the relative costs of IO and CPU.
If you have just inserted many rows into the table you could try doing ANALYZE TABLE table1 and then rerun the query. Maybe after the table's statistics are updated you'll get a different query plan.
And, if you don't need all the columns, you could stop using SELECT * and instead name the columns you need. SELECT * is a notorious query-performance antipattern, because it often returns column data that never gets used. Once you know exactly what columns you want, you could create a covering index to provide them.
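For instance, if the queries against table1 only ever need idarchive, name, and value (the column list here is an assumption, not something you've stated), a covering index could look like this:

```sql
-- Sketch: a covering index for
--   SELECT idarchive, name, value FROM table1 WHERE name LIKE 'done%';
-- InnoDB implicitly appends the PK (idarchive, name) to every secondary
-- index, so (name, value) also covers idarchive.
ALTER TABLE table1 ADD INDEX idx_name_value (`name`, `value`);
```

Once the index covers every selected column, the range scan on name can satisfy the query without touching the table rows at all, just as your COUNT(*) query already does.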
These days the query planner does a pretty good job of optimizing simple queries such as yours.
In MariaDB you can say ANALYZE FORMAT=JSON SELECT.... It will run the query and show you details of the actual execution plan it used.
I have a table defined like this:
article | CREATE TABLE `article` (
`id` varchar(64) NOT NULL,
`type` varchar(16) DEFAULT NULL,
`title` varchar(1024) DEFAULT NULL,
`source` varchar(64) DEFAULT NULL,
`over` tinyint(1) DEFAULT NULL,
`taken` tinyint(1) DEFAULT NULL,
`released_at` varchar(32) DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_article_over` (`over`),
KEY `idx_article_created_at` (`created_at`),
KEY `idx_article_type` (`type`),
KEY `idx_article_taken` (`taken`),
KEY `idx_article_updated_at` (`updated_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
mysql> select count(1) from article;
+----------+
| count(1) |
+----------+
| 649773 |
+----------+
1 row in set (0.61 sec)
when I make a query:
SELECT * FROM `article` where taken=0 ORDER BY updated_at asc limit 10;
or
SELECT * FROM `article` where over=0 ORDER BY updated_at asc limit 10;
They are both very fast.
However, when I use this, it becomes very slow:
SELECT * FROM `article` where taken=0 and over=0 ORDER BY updated_at asc limit 10;
It takes 4.94s.
If the article table grows to 20 million rows, it takes much longer.
Here is the explain with 20 million rows:
mysql> explain SELECT * FROM `article` where taken=0 and processed=0 ORDER BY updated_at asc limit 10;
+----+-------------+-----------+------------+-------------+---------------------------------------------+---------------------------------------------+---------+------+---------+----------+-------------------------------------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------------+---------------------------------------------+---------------------------------------------+---------+------+---------+----------+-------------------------------------------------------------------------------------------+
| 1 | SIMPLE | article | NULL | index_merge | idx_article_processed,idx_article_taken | idx_article_processed,idx_article_taken | 2,2 | NULL | 6234059 | 100.00 | Using intersect(idx_article_processed,idx_article_taken); Using where; Using filesort |
+----+-------------+-----------+------------+-------------+---------------------------------------------+---------------------------------------------+---------+------+---------+----------+-------------------------------------------------------------------------------------------+
mysql> SELECT * FROM `judgement` where taken=0 and processed=0 ORDER BY updated_at asc limit 10;
+--------------------------------------+----------+-----------+---------------------------------------------------------------------------
| id | type | title | source| processed | released_at | created_at | updated_at | taken |
+--------------------------------------+----------+-----------+---------------------------------------------------------------------------
10 rows in set (9 min 15.97 sec)
Both taken and over have indexes; why does the query get worse when I put them together? Shouldn't it be much faster with more indexes?
I don't know the exact answer to the question of why it becomes slow when the article table grows to 20 million rows.
Your query is doing two operations:
index_merge - Using intersect(idx_article_processed,idx_article_taken)
Using filesort
I can only guess that with up to 20 million rows in the table MySQL can do both of these operations in memory, but beyond that limit one of them (or maybe both) no longer fits in the memory buffer and MySQL must fall back to a file on disk, which is much slower.
You can either increase the memory buffers by tweaking some MySQL parameters, or create indexes dedicated to your queries:
For this query:
SELECT * FROM `article` where taken=0 ORDER BY updated_at asc limit 10;
create this index:
CREATE INDEX my_new_index ON article( taken, updated_at )
For this query:
SELECT * FROM `article`
where taken=0 and over=0
ORDER BY updated_at asc limit 10;
create this index:
CREATE INDEX my_new_index1 ON article( taken, over, updated_at )
With the help of these new indexes, both the filesort and the merge operations will be eliminated.
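After creating the indexes you can confirm the plan has changed; with the composite index in place, EXPLAIN should report my_new_index1 in the key column and no filesort in Extra (a verification sketch):

```sql
EXPLAIN SELECT * FROM `article`
WHERE taken = 0 AND over = 0
ORDER BY updated_at ASC
LIMIT 10;
```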
The work involved in navigating an index can become worse than a table scan quite quickly, and an index on a yes/no column can be next to worthless if the values are evenly split.
If only a few rows match, consider building a separate table for those rows and joining back, removing rows as they are processed. In other databases you would build a conditional (partial) index.
It "became slow" because there are not that many rows with both taken=0 and over=0. And innodb_buffer_pool_size is too small. But, be careful, that setting should not be so large as to lead to swapping. How much RAM do you have available?
Server version:
[root@cat best]# /usr/libexec/mysqld --version
/usr/libexec/mysqld Ver 5.1.47 for redhat-linux-gnu on i386 (Source distribution)
Schema:
CREATE TABLE `Log` (
`EntryId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`EntryTime` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP(),
`Severity` ENUM(
'LOG_LEVEL_CRITICAL',
'LOG_LEVEL_ERROR',
'LOG_LEVEL_WARNING',
'LOG_LEVEL_NOTICE',
'LOG_LEVEL_INFO',
'LOG_LEVEL_DEBUG'
) NOT NULL,
`User` TEXT,
`Text` TEXT NOT NULL,
PRIMARY KEY(`EntryId`),
KEY `TimeId` (`EntryTime`,`EntryId`)
) ENGINE=InnoDB COMMENT="Log of server activity";
Query:
SELECT
`EntryId`,
`EntryTime`, -- or, ideally: UNIX_TIMESTAMP(`EntryTime`) AS `EntryTime_UnixTS`
`Severity`,
`User`,
`Text`
FROM `Log`
ORDER BY `EntryTime` DESC, `EntryId` DESC
LIMIT 0, 20
According to the execution plan (and observation), the index is not being used:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Log ALL \N \N \N \N 720 Using filesort
I've tried re-organising it a few ways with little success but, more than anything, would like to understand why this simple approach is failing. My understanding was that a left-most subset of any key can be used to optimise an ORDER BY operation.
Is my index wrong? Can I optimise the query?
Please note that I will also want to conditionally add, e.g.
WHERE `Severity` <= 'LOG_LEVEL_WARNING'
though I'd like to get the basic version working first if this makes the solution very different.
Reproduced on SQLFiddle under MySQL 5.5.32.
The reason is that your index includes the primary key in it. Since the table is InnoDB, the PK is implicitly appended to every secondary index, so the explicitly declared (EntryTime, EntryId) index is redundant: an index on (EntryTime) alone already behaves as (EntryTime, EntryId).
The solution is to have this index only on (EntryTime):
alter table Log drop index TimeId;
alter table Log add index TimeId(EntryTime);
explain SELECT `EntryId`, `EntryTime`, `Severity`, `User`, `Text` FROM `Log` ORDER BY `EntryTime` DESC, `EntryId` DESC LIMIT 0, 20;
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------+
| 1 | SIMPLE | Log | index | NULL | TimeId | 4 | NULL | 20 | NULL |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------+
HTH
I have a table with 25 million rows, indexed appropriately.
But adding the clause AND status IS NULL turns a super fast query into a crazy slow query.
Please help me speed it up.
Query:
SELECT
student_id,
grade,
status
FROM
grades
WHERE
class_id = 1
AND status IS NULL -- This line delays results from <200ms to 40-70s!
AND grade BETWEEN 0 AND 0.7
LIMIT 25;
Table:
CREATE TABLE IF NOT EXISTS `grades` (
`student_id` BIGINT(20) NOT NULL,
`class_id` INT(11) NOT NULL,
`grade` FLOAT(10,6) DEFAULT NULL,
`status` INT(11) DEFAULT NULL,
UNIQUE KEY `unique_key` (`student_id`,`class_id`),
KEY `class_id` (`class_id`),
KEY `status` (`status`),
KEY `grade` (`grade`)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Local development shows results instantly (<200 ms); the production server shows a huge slowdown (40-70 seconds!).
Can you point me in the right direction to debug?
Explain:
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
| 1 | SIMPLE | grades | index_merge | class_id,status,grade | status,class_id | 5,4 | NULL | 26811 | Using intersect(status,class_id); Using where |
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
A SELECT statement generally uses only one index per table; the index_merge in your EXPLAIN is the exception to that rule.
Presumably the earlier query just did a scan using the sole index class_id for your condition class_id = 1, which probably filters the result set nicely before the other conditions are checked.
The optimiser is 'incorrectly' choosing an index merge on class_id and status for the second query and examining 26,811 rows, which is probably not optimal. You could hint at the class_id index by adding USE INDEX (class_id) after the table name in the FROM clause.
You may get some joy with a composite index on (class_id, status, grade), which may run the query faster since it can match the first two columns and then range-scan grade. I'm not sure how this works with NULL, though.
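The composite index described above could be added like this (a sketch; the index name is made up, and note that status IS NULL is an index-usable equality-style predicate, so NULL rows can still be located through the index):

```sql
ALTER TABLE grades
  ADD INDEX idx_class_status_grade (class_id, status, grade);
```

With this in place the optimizer no longer needs the index_merge intersect: it can seek directly to class_id = 1, status IS NULL and range-scan grade BETWEEN 0 AND 0.7 within that slice.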
I'm guessing the ORDER BY pushed the optimiser to choose the class_id index again and returned your query to its original speed.
I'm hoping (and pretty sure) that someone out there is much better at MySQL queries than myself.
I have a query which checks a table that contains information on :
- a search term
- title and price results from various sites using this search term
For the sake of streamlining, I've inserted the data already converted to lowercase, with spaces removed, and trimmed to 11 characters to help reduce the load on the MySQL server.
The query is designed to find the maximum cost and minimum cost of likely equal titles and determine a price difference if it exists.
Having read some similar questions here, I've also prepended EXPLAIN EXTENDED to the query to see if that would help and I'm including the results along with the query.
The query as is :
SELECT
a.pricesrch11,
b.pricesrch11,
a.pricegroup11,
b.pricegroup11,
a.priceamt - b.priceamt AS pricediff
FROM ebssavings a
LEFT JOIN ebssavings b ON ( a.pricesrch11 = b.pricesrch11 )
AND (a.pricegroup11 = a.pricesrch11)
AND (b.pricegroup11 = a.pricesrch11)
WHERE a.priceamt - b.priceamt >0
GROUP BY a.pricesrch11
The results of the EXPLAIN :
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | a | ALL | pricesrch11,pricegroup11 | NULL | NULL | NULL | 8816 | Using where; Using temporary; Using filesort
1 | SIMPLE | b | ALL | pricesrch11,pricegroup11 | NULL | NULL | NULL | 6612 | Using where
ADDENDUM :
I just ran this query and got the following result :
Showing rows 0 - 4 ( 5 total, Query took 66.8119 sec)
CREATE TABLE IF NOT EXISTS ebssavings
( priceid int(44) NOT NULL auto_increment,
priceamt decimal(10,2) NOT NULL,
pricesrch11 varchar(11) character set utf8 collate utf8_unicode_ci NOT NULL,
pricegroup11 varchar(11) character set utf8 collate utf8_unicode_ci NOT NULL,
pricedate timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (priceid),
KEY priceamt (priceamt),
KEY pricesrch11 (pricesrch11),
KEY pricegroup11 (pricegroup11) )
ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8817
MORE INFO ON THE NEW INDEXES (removed pricegroup11, and made a composite index called srchandtitle from pricesrch11 and pricegroup11):
PRIMARY (BTREE, unique): priceid (cardinality 169)
priceamt (BTREE): priceamt (cardinality 56)
pricesrch11 (BTREE): pricesrch11 (cardinality 12)
srchandtitle (BTREE): pricesrch11 (cardinality 12), pricegroup11 (cardinality 169)
Create two indexes:
- one on pricesrch11
- a composite index on (pricesrch11, pricegroup11)
That is, remove the key on pricegroup11 and add a composite key on (pricesrch11, pricegroup11).
Also move the table to InnoDB.
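As a sketch, the suggested changes could look like this (reusing the srchandtitle index name that appears later in the question):

```sql
ALTER TABLE ebssavings
  DROP INDEX pricegroup11,
  ADD INDEX srchandtitle (pricesrch11, pricegroup11);
-- and to move the table to InnoDB:
ALTER TABLE ebssavings ENGINE=InnoDB;
```

In InnoDB the table is clustered on the primary key, and the composite secondary index lets the self-join resolve both equality conditions from one index instead of scanning.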
It seems that things have sped up now with the changes made to the table and the indexes.
I've emptied the table and am beginning again.
Thank you all for your help.
-A