I have a table defined like this:
article | CREATE TABLE `article` (
`id` varchar(64) NOT NULL,
`type` varchar(16) DEFAULT NULL,
`title` varchar(1024) DEFAULT NULL,
`source` varchar(64) DEFAULT NULL,
`over` tinyint(1) DEFAULT NULL,
`taken` tinyint(1) DEFAULT NULL,
`released_at` varchar(32) DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_article_over` (`over`),
KEY `idx_article_created_at` (`created_at`),
KEY `idx_article_type` (`type`),
KEY `idx_article_taken` (`taken`),
KEY `idx_article_updated_at` (`updated_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
mysql> select count(1) from article;
+----------+
| count(1) |
+----------+
|   649773 |
+----------+
1 row in set (0.61 sec)
When I make a query:
SELECT * FROM `article` where taken=0 ORDER BY updated_at asc limit 10;
or
SELECT * FROM `article` where over=0 ORDER BY updated_at asc limit 10;
They are both very fast.
However, when I use both conditions together, it becomes very slow:
SELECT * FROM `article` where taken=0 and over=0 ORDER BY updated_at asc limit 10;
It takes 4.94s.
If the article table grows to 20 million rows, it takes much longer.
Here is the explain with 20 million rows:
mysql> explain SELECT * FROM `article` where taken=0 and over=0 ORDER BY updated_at asc limit 10;
+----+-------------+---------+------------+-------------+-------------------------------------+-------------------------------------+---------+------+---------+----------+----------------------------------------------------------------------------------+
| id | select_type | table   | partitions | type        | possible_keys                       | key                                 | key_len | ref  | rows    | filtered | Extra                                                                            |
+----+-------------+---------+------------+-------------+-------------------------------------+-------------------------------------+---------+------+---------+----------+----------------------------------------------------------------------------------+
|  1 | SIMPLE      | article | NULL       | index_merge | idx_article_over,idx_article_taken  | idx_article_over,idx_article_taken  | 2,2     | NULL | 6234059 | 100.00   | Using intersect(idx_article_over,idx_article_taken); Using where; Using filesort |
+----+-------------+---------+------------+-------------+-------------------------------------+-------------------------------------+---------+------+---------+----------+----------------------------------------------------------------------------------+
mysql> SELECT * FROM `article` where taken=0 and over=0 ORDER BY updated_at asc limit 10;
+----+------+-------+--------+------+-------------+------------+------------+-------+
| id | type | title | source | over | released_at | created_at | updated_at | taken |
+----+------+-------+--------+------+-------------+------------+------------+-------+
10 rows in set (9 min 15.97 sec)
Both taken and over have indexes, so why does the query get worse when I put them together? Shouldn't it be much faster with more indexes?
I don't know an exact answer to the question "Why does it become slow when the article table grows to 20 million rows?"
Your query is doing two operations:
index_merge - Using intersect(idx_article_processed,idx_article_taken)
Using filesort
I can only guess that with up to 20 million rows in the table MySQL can do both of these operations in memory, but above that limit one of them (or maybe both) no longer fits in the memory buffers and MySQL must use a file on disk, which is much slower.
You can either increase the memory buffers by tweaking some MySQL parameters, or create indexes dedicated to your queries:
For this query:
SELECT * FROM `article` where taken=0 ORDER BY updated_at asc limit 10;
create this index:
CREATE INDEX my_new_index ON article (taken, updated_at);
For this query:
SELECT * FROM `article`
where taken=0 and over=0
ORDER BY updated_at asc limit 10;
create this index:
CREATE INDEX my_new_index1 ON article (taken, over, updated_at);
With the help of these new indexes, both the filesort and the index-merge operations will be eliminated.
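To verify that the optimizer actually picks up the new index, re-run EXPLAIN; if all went well, the key column should show my_new_index1 and Extra should no longer mention intersect or filesort:
EXPLAIN SELECT * FROM `article`
WHERE taken = 0 AND over = 0
ORDER BY updated_at ASC
LIMIT 10;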
The work involved in navigating an index can become worse than a table scan quite quickly. An index on a yes/no column can be next to worthless if the values are evenly split.
If you've only got a few rows that match, consider building another table for just those rows and joining back, removing rows from it when they are processed. In other databases you'd build a partial (conditional) index.
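A minimal sketch of that side-table approach, assuming a hypothetical table name article_todo (invented here for illustration):
-- Small table holding only the rows still waiting to be processed.
CREATE TABLE article_todo (
  id varchar(64) NOT NULL,
  updated_at timestamp NULL DEFAULT NULL,
  PRIMARY KEY (id),
  KEY idx_todo_updated_at (updated_at)
) ENGINE=InnoDB;

-- Fetch the next batch by joining back to the full table.
SELECT a.*
FROM article_todo t
JOIN article a ON a.id = t.id
ORDER BY t.updated_at ASC
LIMIT 10;

-- Remove rows from the side table as they are processed.
DELETE FROM article_todo WHERE id = ?;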
It "became slow" because there are not that many rows with both taken=0 and over=0. And innodb_buffer_pool_size is too small. But, be careful, that setting should not be so large as to lead to swapping. How much RAM do you have available?
Version - 10.4.25-MariaDB
I have a table where the column name is the second part of the primary key (idarchive, name).
When I run COUNT(*) with WHERE name LIKE 'done%', it uses the index on name properly, but when I run SELECT * it does not use the separate index; instead it scans via the primary key and the query slows down.
Any idea what we can do here?
Any changes to optimizer_switch, or any other alternative that could help?
Note: we can't use FORCE INDEX, as the queries are not controllable.
Table structure:
CREATE TABLE `table1` (
`idarchive` int(10) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`idsite` int(10) unsigned DEFAULT NULL,
`date1` date DEFAULT NULL,
`date2` date DEFAULT NULL,
`period` tinyint(3) unsigned DEFAULT NULL,
`ts_archived` datetime DEFAULT NULL,
`value` double DEFAULT NULL,
PRIMARY KEY (`idarchive`,`name`),
KEY `index_idsite_dates_period` (`idsite`,`date1`,`date2`,`period`,`ts_archived`),
KEY `index_period_archived` (`period`,`ts_archived`),
KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Queries:
explain select count(*) from table1 WHERE name like 'done%' ;
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
| 1 | SIMPLE | table1 | range | name | name | 767 | NULL | 9131455 | Using where; Using index |
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
1 row in set (0.000 sec)
explain select * from table1 WHERE name like 'done%' ;
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
| 1 | SIMPLE | table1 | ALL | name | NULL | NULL | NULL | 18262910 | Using where |
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
1 row in set (0.000 sec)
Your SELECT COUNT(*) ... LIKE 'constant%' query is covered by your index on your name column. That is, the entire query can be satisfied by reading the index. So the query planner decides to range-scan your index to generate the result.
On the other hand, your SELECT * query needs all columns from all rows of the table. That can't be satisfied from any of your indexes. And, it's possible your WHERE name like 'done%' filter reads a significant fraction of the table, enough so the query planner decides the fastest way to satisfy it is to scan the entire table. The query planner figures this out by using statistics on the contents of the table, plus some knowledge of the relative costs of IO and CPU.
If you have just inserted many rows into the table you could try doing ANALYZE TABLE table1 and then rerun the query. Maybe after the table's statistics are updated you'll get a different query plan.
And, if you don't need all the columns, you could stop using SELECT * and instead name the columns you need. SELECT * is a notorious query-performance antipattern, because it often returns column data that never gets used. Once you know exactly what columns you want, you could create a covering index to provide them.
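For example, if it turned out you only needed idsite and ts_archived (an assumption made here for illustration; substitute the columns you actually use), a covering index could look like this:
ALTER TABLE table1 ADD INDEX name_covering (name, idsite, ts_archived);

SELECT name, idsite, ts_archived
FROM table1
WHERE name LIKE 'done%';
-- EXPLAIN should now show "Using index", i.e. an index-only scan.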
These days the query planner does a pretty good job of optimizing simple queries such as yours.
In MariaDB you can say ANALYZE FORMAT=JSON SELECT.... It will run the query and show you details of the actual execution plan it used.
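For example, running it against the query above:
ANALYZE FORMAT=JSON SELECT * FROM table1 WHERE name LIKE 'done%';
-- The r_rows and r_filtered fields in the output are actual measured
-- values, which makes optimizer misestimates easy to spot.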
Server version:
[root@cat best]# /usr/libexec/mysqld --version
/usr/libexec/mysqld Ver 5.1.47 for redhat-linux-gnu on i386 (Source distribution)
Schema:
CREATE TABLE `Log` (
`EntryId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`EntryTime` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP(),
`Severity` ENUM(
'LOG_LEVEL_CRITICAL',
'LOG_LEVEL_ERROR',
'LOG_LEVEL_WARNING',
'LOG_LEVEL_NOTICE',
'LOG_LEVEL_INFO',
'LOG_LEVEL_DEBUG'
) NOT NULL,
`User` TEXT,
`Text` TEXT NOT NULL,
PRIMARY KEY(`EntryId`),
KEY `TimeId` (`EntryTime`,`EntryId`)
) ENGINE=InnoDB COMMENT="Log of server activity";
Query:
SELECT
`EntryId`,
`EntryTime`, -- or, ideally: UNIX_TIMESTAMP(`EntryTime`) AS `EntryTime_UnixTS`
`Severity`,
`User`,
`Text`
FROM `Log`
ORDER BY `EntryTime` DESC, `EntryId` DESC
LIMIT 0, 20
According to the execution plan (and observation), the index is not being used:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Log ALL \N \N \N \N 720 Using filesort
I've tried re-organising it a few ways with little success but, more than anything, would like to understand why this simple approach is failing. My understanding was that a left-most subset of any key can be used to optimise an ORDER BY operation.
Is my index wrong? Can I optimise the query?
Please note that I will also want to conditionally add, e.g.
WHERE `Severity` <= 'LOG_LEVEL_WARNING'
though I'd like to get the basic version working first if this makes the solution very different.
Reproduced on SQLFiddle under MySQL 5.5.32.
The reason is that your index already contains the primary key. Since this is InnoDB, the PK is implicitly appended to every secondary index as its right-most field(s), so the explicit EntryId column in TimeId is redundant: the effective index is (EntryTime, EntryId) either way, and with the duplicated column the optimizer fails to use it for this ORDER BY.
The solution is to define the index on (EntryTime) only and let InnoDB supply the EntryId suffix:
alter table Log drop index TimeId;
alter table Log add index TimeId(EntryTime);
explain SELECT `EntryId`, `EntryTime`, `Severity`, `User`, `Text` FROM `Log` ORDER BY `EntryTime` DESC, `EntryId` DESC LIMIT 0, 20;
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------+
| 1 | SIMPLE | Log | index | NULL | TimeId | 4 | NULL | 20 | NULL |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------+
HTH
A query that used to work just fine on a production server has started becoming extremely slow (in a matter of hours).
This is it:
SELECT * FROM news_articles WHERE published = '1' AND news_category_id = '4' ORDER BY date_edited DESC LIMIT 1;
This takes up to 20-30 seconds to execute (the table has ~200,000 rows).
This is the output of EXPLAIN:
+----+-------------+---------------+-------------+----------------------------+----------------------------+---------+------+------+--------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------------+----------------------------+----------------------------+---------+------+------+--------------------------------------------------------------------------+
| 1 | SIMPLE | news_articles | index_merge | news_category_id,published | news_category_id,published | 5,5 | NULL | 8409 | Using intersect(news_category_id,published); Using where; Using filesort |
+----+-------------+---------------+-------------+----------------------------+----------------------------+---------+------+------+--------------------------------------------------------------------------+
Playing around with it, I found that hinting a specific index (date_edited) makes it much faster:
SELECT * FROM news_articles USE INDEX (date_edited) WHERE published = '1' AND news_category_id = '4' ORDER BY date_edited DESC LIMIT 1;
This one takes milliseconds to execute.
EXPLAIN output for this one is:
+----+-------------+---------------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+---------------+-------------+---------+------+------+-------------+
| 1 | SIMPLE | news_articles | index | NULL | date_edited | 8 | NULL | 1 | Using where |
+----+-------------+---------------+-------+---------------+-------------+---------+------+------+-------------+
Columns news_category_id, published and date_edited are all indexed.
The storage engine is InnoDB.
This is the table structure:
CREATE TABLE `news_articles` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` text NOT NULL,
`subtitle` text NOT NULL,
`summary` text NOT NULL,
`keywords` varchar(500) DEFAULT NULL,
`body` mediumtext NOT NULL,
`source` varchar(255) DEFAULT NULL,
`source_visible` int(11) DEFAULT NULL,
`author_information` enum('none','name','signature') NOT NULL DEFAULT 'name',
`date_added` datetime NOT NULL,
`date_edited` datetime NOT NULL,
`views` int(11) DEFAULT '0',
`news_category_id` int(11) DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
`c_forwarded` int(11) DEFAULT '0',
`published` int(11) DEFAULT '0',
`deleted` int(11) DEFAULT '0',
`permalink` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `news_category_id` (`news_category_id`),
KEY `published` (`published`),
KEY `deleted` (`deleted`),
KEY `date_edited` (`date_edited`),
CONSTRAINT `news_articles_ibfk_3` FOREIGN KEY (`news_category_id`) REFERENCES `news_categories` (`id`) ON DELETE SET NULL ON UPDATE CASCADE,
CONSTRAINT `news_articles_ibfk_4` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=192588 DEFAULT CHARSET=utf8
I could change all the queries my web application makes to hint at that index, but this is considerable work.
Is there some way to tune MySQL so that the first query is made more efficient without actually rewriting all queries?
Just a few tips:
1 - It seems the fields published and news_category_id are INTEGER. If so, remove the single quotes from the values in your query; type mismatches can hurt performance.
2 - Your published field probably has very few distinct values (most likely 1 = yes and 0 = no, or something like that). If so, it is not a good field to index at all: a scan of such an index still has to visit a large share of the records. In that case, move news_category_id to be the first condition in your WHERE clause.
3 - "Don't forget about the left-most index." This holds for your SELECT, JOINs, WHERE, and ORDER BY. Indexes are your friends as long as you know how to play with them.
Hope this helps somehow.
Original:
SELECT * FROM news_articles
WHERE published = 1 AND news_category_id = 4
ORDER BY date_edited DESC LIMIT 1;
Since you have LIMIT 1, you're only selecting the latest row, but ORDER BY date_edited tells MySQL to sort the whole matching set and then take one row off the top. That is really slow, and it is why USE INDEX helps.
Try matching MAX(date_edited) in the WHERE clause instead. That should get the query planner to use its index automatically.
Match MAX(date_edited):
SELECT * FROM news_articles
WHERE published = 1 AND news_category_id = 4
AND date_edited = (SELECT MAX(date_edited) FROM news_articles
                   WHERE published = 1 AND news_category_id = 4);
Note that the subquery repeats the two filters; without them it would return the latest date_edited in the whole table, which may not belong to a published article in category 4.
Please change your query to:
SELECT * FROM news_articles WHERE published = 1 AND news_category_id = 4 ORDER BY date_edited DESC LIMIT 1;
Note that I have removed the quotes around '1' and '4' in the query.
A mismatch between the datatype of the value passed and the column type can prevent MySQL from using the indexes on these two columns.
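One option not mentioned in the answers above, but a common fix for this index_merge pattern, is a single composite index matching both equality filters and the sort column, so that the filter and the ORDER BY are satisfied by one index (a sketch, assuming the schema shown; the index name is invented):
ALTER TABLE news_articles
  ADD INDEX idx_pub_cat_edited (published, news_category_id, date_edited);
-- EXPLAIN should then show type=ref on the new index and no
-- "Using filesort" for the original query.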
I have a table with ~1,500,000 records:
CREATE TABLE `item_locale` (
`item_id` bigint(20) NOT NULL,
`language` int(11) NOT NULL,
`name` varchar(256) COLLATE utf8_czech_ci NOT NULL,
`text` text COLLATE utf8_czech_ci,
PRIMARY KEY (`item_id`,`language`),
KEY `name` (`name`(255))
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_czech_ci;
item_id and language form the primary key, and name has a prefix index of length 255.
With the following query:
select item_id, name from item_locale order by name limit 50;
The select takes around 3 seconds even though only 50 rows are required.
What can I do to speed up such a query?
EDIT: Some of you suggested adding an INDEX. As mentioned above, the name column is already indexed with a prefix length of 255.
I ran EXPLAIN on the query:
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
| 1 | SIMPLE | item_locale | ALL | NULL | NULL | NULL | NULL | 1558653 | Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
The strange thing is that it seems not to use any index...
Retrieving 50 records is heavier too; try limiting it to 10, since you are also using ORDER BY.
Try using a query hint:
select item_id, name
from item_locale USE INDEX FOR ORDER BY (name)
order by name limit 50;
Also try:
select item_id, name
from item_locale FORCE INDEX (name)
order by name limit 50;
In the end, there was some kind of problem with the indexes: I dropped them all and recreated them, and now it finally works. Thanks.
Applying an index on the name field might speed it up a bit.
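One detail worth knowing here: MySQL cannot use a column-prefix index such as name(255) to resolve an ORDER BY, which is one reason the hints above do not remove the filesort. With utf8 (3 bytes per character) a full index on varchar(256) would exceed the old 767-byte key limit, hence the prefix. If the data allows shortening the column (an assumption; check maximum lengths first), a full-column index can serve the sort directly:
-- Make sure nothing would be truncated by the shorter column.
SELECT MAX(CHAR_LENGTH(name)) FROM item_locale;

ALTER TABLE item_locale
  MODIFY name varchar(255) COLLATE utf8_czech_ci NOT NULL,
  DROP INDEX name,
  ADD INDEX name_full (name);  -- 255 * 3 = 765 bytes, within the limit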
My query runs for longer than 30 minutes. It is a simple query, and the table has indexes, yet we are unable to find out why it takes so much execution time, and it affects our entire DB performance.
Yesterday it ran for around 122.6 minutes.
Can anyone help me improve the query's performance?
This is my query:
SELECT tab1.customer_id, tab1.row_mod, tab1.row_create, tab1.event_id, tab1.event_type,
       tab1.new_value, tab1.old_value
FROM tab1 FORCE INDEX (tab1_n2)
WHERE customer_id >= 1 AND customer_id < 5000000 AND tab1.row_mod >= '2012-10-01'
   OR (tab1.row_create >= '2012-10-01' AND tab1.row_create < '2012-10-13');
Explain plan
+----+-------------+------------------+------+---------------------+------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+---------------------+------+---------+------+----------+-------------+
| 1 | SIMPLE | tab1 | ALL | tab1_n2 | NULL | NULL | NULL | 18490530 | Using where |
+----+-------------+------------------+------+---------------------+------+---------+------+----------+-------------+
1 row in set (0.00 sec)
Table structure:
mysql> show create table tab1\G
*************************** 1. row ***************************
Table: tab1
Create Table: CREATE TABLE `tab1` (
`customer_id` int(11) NOT NULL,
`row_mod` datetime DEFAULT NULL,
`row_create` datetime DEFAULT NULL,
`event_id` int(11) DEFAULT NULL,
`event_type` varchar(45) DEFAULT NULL,
`new_value` varchar(255) DEFAULT NULL,
`old_value` varchar(255) DEFAULT NULL,
KEY `customer_id1` (`customer_id`),
KEY `new_value_n1` (`new_value`),
KEY `tab1_n1` (`row_create`),
KEY `tab1_n2` (`row_mod`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Please help me tune it, even though it already has indexes.
Probably because you are forcing an index that does not make sense.
The row_mod condition is only one branch of the OR, so that index cannot satisfy the whole WHERE clause. Forcing every lookup through an index that does not eliminate rows can be a lot slower than a full table scan; a good rule of thumb is that an index should eliminate more than 90% of the rows.
Try the query without the FORCE INDEX part.
Then try a UNION of the two conditions, so that each condition can use its own index:
ALTER TABLE tab1 ADD INDEX idx_row_mod_customer_id (row_mod, customer_id);
ALTER TABLE tab1 ADD INDEX idx_row_create (row_create);
SELECT tab1.customer_id, tab1.row_mod, tab1.row_create, tab1.event_id, tab1.event_type,
tab1.new_value, tab1.old_value
FROM tab1
WHERE customer_id >= 1 AND customer_id < 5000000
  AND tab1.row_mod >= '2012-10-01'
UNION
SELECT tab1.customer_id, tab1.row_mod, tab1.row_create, tab1.event_id, tab1.event_type,
tab1.new_value, tab1.old_value
FROM tab1
WHERE tab1.row_create >= '2012-10-01' AND tab1.row_create < '2012-10-13';
To optimise further, you could add all selected columns to both indexes, saving MySQL from having to read the underlying rows at all. This will greatly increase the size of the indexes, and therefore their memory requirement.
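A sketch of that widening (the index names are invented), subject to InnoDB's 3072-byte total key-length limit; each utf8 varchar(255) contributes up to 765 bytes, so these keys fit, but wider columns would not:
ALTER TABLE tab1
  ADD INDEX idx_row_mod_cover (row_mod, customer_id, row_create, event_id,
                               event_type, new_value, old_value),
  ADD INDEX idx_row_create_cover (row_create, customer_id, row_mod, event_id,
                                  event_type, new_value, old_value);
-- EXPLAIN on each half of the UNION should then show "Using index",
-- meaning the rows are served entirely from the index.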