i have one MySQL issue. I have to optimize some queries on my website. One of them i have already done, but there are still some which i cannot resolve without your help.
I have a table called "news":
CREATE TABLE IF NOT EXISTS `news` (
`id` int(10) NOT NULL auto_increment,
`edited` smallint(1) NOT NULL default '0',
`site` varchar(30) default NULL,
`foreign_id` varchar(25) default NULL,
`title` varchar(255) NOT NULL,
`text` text NOT NULL,
`image` varchar(255) default NULL,
`horizontal` smallint(1) NOT NULL,
`image_author` varchar(255) default NULL,
`text_author` varchar(255) default NULL,
`lang` varchar(3) NOT NULL,
`link` varchar(255) NOT NULL,
`date` date NOT NULL,
`redirect` smallint(1) NOT NULL,
`parent` int(10) NOT NULL,
`views` int(5) NOT NULL,
`status` smallint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `lang` (`lang`,`status`),
KEY `date` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=47122 ;
as you can see i have two indexes: "lang" and "date"
I have tried some combinations of different indexes and this one has produced me the best results ... unfortunately only on my local computer. On the server i still have bad results. I want to say that the database is the same.
query:
SELECT id FROM news WHERE lang = 'en' AND STATUS =1 ORDER BY DATE DESC LIMIT 0, 10
localhost explain:
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | news | index | lang | date | 3 | NULL | 23 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
server explain:
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
| 1 | SIMPLE | news | ref | status | status | 13 | const,const | 15840 | Using where; Using filesort |
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
I have looked a lot of other similar topics, but unfortunately i cannot find any solution to work on my server. I will be very glad to here from you some solution with some explanation for that so i can optimize my other queries.
Thanks !
This is your query:
SELECT id
FROM news
WHERE lang = 'en' AND STATUS =1
ORDER BY DATE DESC
LIMIT 0, 10
The best index is one that contains all the fields used in the query (four fields in all). The ordering in the index is by equality conditions in the where clause followed by the order by clause followed by other columns in the select clause.
So, try this index: ndws(leng, status, date, id).
Related
Mysql version - 5.7.22
Table definition
CREATE TABLE `books` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uuid` char(32) COLLATE utf8mb4_unicode_ci NOT NULL,
`title` varchar(254) COLLATE utf8mb4_unicode_ci NOT NULL,
`created` datetime(6) NOT NULL,
`modified` datetime(6) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `book_uuid` (`uuid`),
) ENGINE=InnoDB AUTO_INCREMENT=115 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Query
SELECT DISTINCT `books`.`id`,
`books`.`uuid`,
`books`.`title`
FROM `books`
WHERE (`books`.`uuid` IN ("334222a0e99b4a3e97f577665055208e",
"979c059840964934816280ba85c67221",
"4e2978c765dd435998666ea3083666e5",
"535aa78ba80e4215bbf75fb1e20cc5f3",
"f969fb10c72b4875aabdf75c1b493524",
"1daa0015055444a4b1c0821618a7a4d9",
"04f34ede284a4b86b0adddb405d30a75",
"513cad12c88c44c6ab248d43643459b9",
"de2bde6d016f4381ad0ba714234386fa",
"f645c2c9f1594a199a960b97b7015986",
"3ce02c072f24447a8a7b269a19ec554f",
"75450daf9d024d9d9c0df038437ae2c2",
"0e822042b50b4f79bb38304e0acde6f0",
"38d808fb3f9a4f57b4f7b30a141e7169",
"ecd424abd3a94a339383f6f8e668655e"))
ORDER BY `books`.`id` DESC
LIMIT 15;
when i do explain on this query it doesn't pick the index
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | books | NULL | ALL | book_uuid | NULL | NULL | NULL | 107 | 12.15 | Using where; Using filesort |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
strangely the index is correctly used when there are 12 or less entries passed to the IN clause (without forcing index)
forcing the index works, but I cannot force index as this query is created by django ORM, cannot use django-mysql 's force_index as I am on Innodb
Note: I know 'distinct' and 'limit' can be avoided in this query, but its part of a bigger query, so i have kept it as is
the table structure is follow:
CREATE TABLE `crm_member` (
`member_id` int(11) NOT NULL AUTO_INCREMENT,
`shop_id` int(11) NOT NULL,
`nick` varchar(255) NOT NULL DEFAULT '',
`name` varchar(255) NOT NULL DEFAULT '',
`mobile` varchar(255) NOT NULL DEFAULT '',
`grade` int(11) NOT NULL DEFAULT '-1',
`trade_count` int(11) NOT NULL,
`trade_amount` float NOT NULL,
`last_trade_time` int(11) NOT NULL,
`trade_from` tinyint(4) NOT NULL,
`avg_price` float NOT NULL,
`seller_flag` tinyint(1) NOT NULL,
`is_black` tinyint(1) NOT NULL DEFAULT '0',
`created` int(11) NOT NULL,
PRIMARY KEY (`member_id`),
UNIQUE KEY `shop_id` (`shop_id`,`nick`),
KEY `last_trade_time` (`last_trade_time`),
KEY `idx_shop_id_grade` (`shop_id`,`grade`),
KEY `idx_shopid_created` (`shop_id`,`created`),
KEY `idx_trade_amount` (`shop_id`,`trade_amount`),
KEY `idx_trade_count` (`shop_id`,`trade_count`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
the table has 2148037 rows under the shop_id 3498706;
AND my query is like
SELECT AVG(trade_count) as trade_count, AVG(trade_amount) as trade_amount, AVG(grade) as grade from `crm_member_0141` WHERE shop_id = '3498706' and grade >= 0 and trade_count > 0 and is_black = 0 LIMIT 1
the query exec about 30 seconds .
the explain result
mysql> explain SELECT member_id, AVG(trade_count) as trade_count, AVG(trade_amount) as trade_amount, AVG(grade) as grade from `crm_member_0141` WHERE shop_id = 3498706 and grade >= 0 and trade_count > 0 and is_black = 0 order by member_id LIMIT 1;
+----+-------------+-----------------+------------+------+-------------------------------------------------------------------------------+---------+---------+-------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+-------------------------------------------------------------------------------+---------+---------+-------+---------+----------+-------------+
| 1 | SIMPLE | crm_member_0141 | NULL | ref | shop_id,idx_shop_id_grade,idx_shopid_created,idx_trade_amount,idx_trade_count | shop_id | 4 | const | 1074018 | 1.11 | Using where |
+----+-------------+-----------------+------------+------+-------------------------------------------------------------------------------+---------+---------+-------+---------+----------+-------------+
how to speed up the query?
It looks like mysql chooses to use the primary key instead of the shop_id, grade index, although shop_id, grade would probably speed up the selection of rows.
You can tell mysql to use a specific index for a query, using the USE INDEX directive :
try to add USE INDEX (idx_shop_id_grade) in your query after the table name and see if the computation is faster.
Otherwise, if this query is especially useful and frequent to call in your app, you can build a more specialised index for this query :
Your query selects on a particular value for shop_id and is_black, then does a range selection on grade and trade_count.
I would suggest to try indexing using shop_id, is_black, grade, trade_count.
note : obviously, test this first on test database
Please refer the table strcuture below.
CREATE TABLE `oarc` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`zID` int(11) NOT NULL,
`cID` int(11) NOT NULL,
`bID` int(11) NOT NULL,
`rtype` char(1) COLLATE utf8_unicode_ci NOT NULL,
`created` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1821039 ;
Other than the PRIMARY KEY, I have not set any index on this, and when I run the following query
select COUNT(oarc.ID) as total
from `oarc` where`oarc`.`rtype` = 'v'
group
by `oarc`.`zID`
I am getting the result in less than 1 second. But if I add index to zID it is taking more than 5 seconds.
Please see below explain result :
id | select_type | table | type | possible_keys | key | key_len | ref | row | Extra
--------------------------------------------------------------------------------------------------------
1 | SIMPLE | oarc | index | NULL | zone_ID | 4 | NULL | 1909387 | Using where
Currently the table have more than 1821039 records in it and it will increase on a hourly basis. What are the things I need to do in order to reduce the query execution time. I am expecting only something at the table and query level, nothing on my.cnf or server side because I can not do anything there.
Thanks in advance.
Is this better?
CREATE TABLE `oarc` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`zID` int(11) NOT NULL,
`cID` int(11) NOT NULL,
`bID` int(11) NOT NULL,
`rtype` char(1) COLLATE utf8_unicode_ci NOT NULL,
`created` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`ID`),
KEY(rtype,zid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1821039 ;
explain
select COUNT(oarc.ID) as total
from `oarc` where`oarc`.`rtype` = 'v'
group
by `oarc`.`zID`
+----+-------------+-------+------+---------------+-------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+--------------------------+
| 1 | SIMPLE | oarc | ref | rtype | rtype | 3 | const | 1 | Using where; Using index |
+----+-------------+-------+------+---------------+-------+---------+-------+------+--------------------------+
Here is the query:
select timespans.id as timespan_id, count(*) as num
from reports, timespans
where timespans.after_date >= '2011-04-13 22:08:38' and
timespans.after_date <= reports.authored_at and
reports.authored_at < timespans.before_date
group by timespans.id;
Here are the table defs:
CREATE TABLE `reports` (
`id` int(11) NOT NULL auto_increment,
`source_id` int(11) default NULL,
`url` varchar(255) default NULL,
`lat` decimal(20,15) default NULL,
`lng` decimal(20,15) default NULL,
`content` text,
`notes` text,
`authored_at` datetime default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
`data` text,
`title` varchar(255) default NULL,
`author_id` int(11) default NULL,
`orig_id` varchar(255) default NULL,
PRIMARY KEY (`id`),
KEY `index_reports_on_title` (`title`),
KEY `index_content_on_reports` (`content`(128))
CREATE TABLE `timespans` (
`id` int(11) NOT NULL auto_increment,
`after_date` datetime default NULL,
`before_date` datetime default NULL,
`after_offset` int(11) default NULL,
`before_offset` int(11) default NULL,
`is_common` tinyint(1) default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
`is_search_chunk` tinyint(1) default NULL,
`is_day` tinyint(1) default NULL,
PRIMARY KEY (`id`),
KEY `index_timespans_on_after_date` (`after_date`),
KEY `index_timespans_on_before_date` (`before_date`)
And here is the explain:
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9 | NULL | 84 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | reports | ALL | NULL | NULL | NULL | NULL | 183297 | Using where |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
And here is the explain after I create an index on authored_at. As you can see, the index is not actually getting used (I think...)
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| 1 | SIMPLE | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9 | NULL | 86 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | reports | ALL | index_reports_on_authored_at | NULL | NULL | NULL | 183317 | Range checked for each record (index map: 0x8) |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
There are about 142k rows in the reports table, and far fewer in the timespans table.
The query is taking about 3 seconds now.
The strange thing is that if I add an index on reports.authored_at, it actually makes the query far slower, about 20 seconds. I would have thought it would do the opposite, since it would make it easy to find the reports at either end of the range, and throw the rest away, rather than having to examine all of them.
Can someone clarify? I'm stumped.
Instead of two separate indexes for the timespan table, try merging them into a single multi-column index with before_date and after_date in a single index. Then add that index to authored_at as well.
i rewrite you query like this:
select t.id, count(*) as num from timespans t
join reports r where t.after_date >= '2011-04-13 22:08:38'
and r.authored_at >= '2011-04-13 22:08:38'
and r.authored_at < t.before_date
group by t.id order by null;
and change indexes of tables
alter table reports add index authored_at_idx(authored_at);
You can used partition feature of database on column after_date. It will help u a lot.
I currently have 2 tables that are used for a select query with a simple join. The first table houses around 6-9 million rows, and this gets used as the join. The primary table is anywhere from 1mil to 300mil rows. However, I notice when I join above 10mil rows on the primary table the select query goes from instant to very slow (3+ seconds and grows).
Here is my table structure and queries.
CREATE TABLE IF NOT EXISTS `links` (
`link_id` int(10) unsigned NOT NULL,
`domain_id` mediumint(7) unsigned NOT NULL,
`parent_id` int(11) unsigned DEFAULT NULL,
`hash` int(10) unsigned NOT NULL,
`url` text NOT NULL,
`type` enum('html','pdf') DEFAULT NULL,
`processed` enum('N','Y') NOT NULL DEFAULT 'N',
UNIQUE KEY `hash` (`hash`),
KEY `idx_processed` (`processed`),
KEY `domain_id` (`domain_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
CREATE TABLE IF NOT EXISTS `domains` (
`domain_id` mediumint(7) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(170) NOT NULL,
`blocked` enum('N','Y') NOT NULL DEFAULT 'N',
`count` mediumint(6) NOT NULL DEFAULT '0',
`mcount` mediumint(3) NOT NULL,
PRIMARY KEY (`domain_id`),
KEY `name` (`name`),
KEY `blocked` (`blocked`),
KEY `mcount` (`mcount`),
KEY `count` (`count`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10834389 ;
Query:
(SELECT link_id, url, hash FROM links, domains WHERE links.domain_id = domains.domain_id and mcount > 1 and processed='N' limit 200)
UNION
(SELECT link_id, url, hash FROM links where processed='N' and type='html' limit 200)
Explain select:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+-------+-------------------------+--------------- +---------+---------------------------+---------+-------------+
| 1 | PRIMARY | domains | range | PRIMARY,mcount | mcount | 3 | NULL | 257673 | Using where |
| 1 | PRIMARY | links | ref | idx_processed,domain_id | domain_id | 3 | crawler.domains.domain_id | 1 | Using where |
| 2 | UNION | links | ref | idx_processed | idx_processed | 1 | const | 7090017 | Using where |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+
Right now, I'm trying a partition with 20 partitions on links using domain_id as the key.
Any other options would be greatly appreciated.
A single SELECT statement would replace your entire UNION statement:
SELECT link_id, url, hash
FROM links, domains
WHERE links.domain_id = domains.domain_id
AND mcount > 1
AND processed='N'
AND type='html'
This may not be THE answer you are looking for, but it should help you simplify your question.
When things suddenly slow down you might want to check the size of your indexes (used in the query execution) vs size of various mysql buffers.