MySQL indexing makes GROUP BY slow

Please refer to the table structure below.
CREATE TABLE `oarc` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`zID` int(11) NOT NULL,
`cID` int(11) NOT NULL,
`bID` int(11) NOT NULL,
`rtype` char(1) COLLATE utf8_unicode_ci NOT NULL,
`created` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1821039 ;
Other than the PRIMARY KEY, I have not set any index on this, and when I run the following query
select COUNT(oarc.ID) as total
from `oarc`
where `oarc`.`rtype` = 'v'
group by `oarc`.`zID`
I get the result in less than 1 second. But if I add an index on zID, it takes more than 5 seconds.
Please see the EXPLAIN result below:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
--------------------------------------------------------------------------------------------------------
1 | SIMPLE | oarc | index | NULL | zone_ID | 4 | NULL | 1909387 | Using where
Currently the table has more than 1821039 records in it, and it will grow on an hourly basis. What do I need to do to reduce the query execution time? I am looking only for changes at the table and query level, nothing in my.cnf or on the server side, because I cannot change anything there.
Thanks in advance.

Is this better?
CREATE TABLE `oarc` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`zID` int(11) NOT NULL,
`cID` int(11) NOT NULL,
`bID` int(11) NOT NULL,
`rtype` char(1) COLLATE utf8_unicode_ci NOT NULL,
`created` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`ID`),
KEY(rtype,zid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1821039 ;
explain
select COUNT(oarc.ID) as total
from `oarc`
where `oarc`.`rtype` = 'v'
group by `oarc`.`zID`
+----+-------------+-------+------+---------------+-------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+--------------------------+
| 1 | SIMPLE | oarc | ref | rtype | rtype | 3 | const | 1 | Using where; Using index |
+----+-------------+-------+------+---------------+-------+---------+-------+------+--------------------------+
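For a table that already exists, the same composite index could be added without recreating the table. This is only a minimal sketch: the index name idx_rtype_zid is made up for illustration, and the query is a variant of the one above rather than the asker's exact statement.

```sql
-- Equality column (rtype) first, then the GROUP BY column (zID).
-- The index name idx_rtype_zid is hypothetical.
ALTER TABLE `oarc` ADD INDEX `idx_rtype_zid` (`rtype`, `zID`);

-- ID is NOT NULL, so COUNT(*) returns the same totals as COUNT(oarc.ID).
-- Selecting zID as well shows which group each total belongs to.
EXPLAIN
SELECT `zID`, COUNT(*) AS total
FROM `oarc`
WHERE `rtype` = 'v'
GROUP BY `zID`;
```

InnoDB stores the primary key in every secondary index, so this query can be answered from the index alone; that is what "Using index" in the Extra column indicates.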

Related

MySQL request optimization: select by simple column

I have a query like
SELECT `table1`.*
FROM `table1`
WHERE `table1`.`table2_id` IN (1,2,6,12,53,666)
and it takes more than 20 seconds to run.
The EXPLAIN output looks like this:
+----+-------------+--------------------------+------------+-------+-------------------------------------------------------------------------------+----------------------------------+---------+------+-------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------------+------------+-------+-------------------------------------------------------------------------------+----------------------------------+---------+------+-------+----------+-----------------------+
| 1 | SIMPLE | table1 | NULL | range | table2_id | table2_id | 4 | NULL | 74778 | 100.00 | Using index condition |
+----+-------------+--------------------------+------------+-------+-------------------------------------------------------------------------------+----------------------------------+---------+------+-------+----------+-----------------------+
The table looks like this:
CREATE TABLE `table1` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`table2_id` int(11) NOT NULL,
`table3_id` int(11) NOT NULL,
`field1` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`field2` int(11) NOT NULL DEFAULT '0',
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`field3` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `name_of_index_id` (`table3_id`),
KEY `other_name_of_index` (`field2`),
KEY `table2_id` (`table2_id`)
) ENGINE=InnoDB AUTO_INCREMENT=86623178 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
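An index lets MySQL fetch the matching rows quickly, at the cost of extra space to build and store it. If you are willing to trade space for speed, an index on table2_id is created with the following SQL statement. (Note that the table definition above already includes KEY `table2_id`, and the EXPLAIN shows it in use, so running the statement again would only add a duplicate index.)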
ALTER TABLE `table1` ADD INDEX(`table2_id`);
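Before adding an index, it may be worth checking which indexes the table already has and whether the query uses them. A small sketch using only the table and query from the question:

```sql
-- List the indexes currently defined on table1.
SHOW INDEX FROM `table1`;

-- Re-check the execution plan for the original query.
EXPLAIN
SELECT `table1`.*
FROM `table1`
WHERE `table1`.`table2_id` IN (1, 2, 6, 12, 53, 666);
```

If an index on table2_id is already listed and appears in the key column of the EXPLAIN output, creating another one on the same column is unlikely to change the plan.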

MySQL optimization query

I have a MySQL issue. I have to optimize some queries on my website. I have already done one of them, but there are still some that I cannot resolve without your help.
I have a table called "news":
CREATE TABLE IF NOT EXISTS `news` (
`id` int(10) NOT NULL auto_increment,
`edited` smallint(1) NOT NULL default '0',
`site` varchar(30) default NULL,
`foreign_id` varchar(25) default NULL,
`title` varchar(255) NOT NULL,
`text` text NOT NULL,
`image` varchar(255) default NULL,
`horizontal` smallint(1) NOT NULL,
`image_author` varchar(255) default NULL,
`text_author` varchar(255) default NULL,
`lang` varchar(3) NOT NULL,
`link` varchar(255) NOT NULL,
`date` date NOT NULL,
`redirect` smallint(1) NOT NULL,
`parent` int(10) NOT NULL,
`views` int(5) NOT NULL,
`status` smallint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `lang` (`lang`,`status`),
KEY `date` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=47122 ;
As you can see, I have two indexes: "lang" and "date".
I have tried several combinations of indexes, and this one gave me the best results ... unfortunately only on my local computer. On the server I still get bad results, even though the database is the same.
Query:
SELECT id FROM news WHERE lang = 'en' AND STATUS =1 ORDER BY DATE DESC LIMIT 0, 10
localhost explain:
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | news | index | lang | date | 3 | NULL | 23 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
server explain:
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
| 1 | SIMPLE | news | ref | status | status | 13 | const,const | 15840 | Using where; Using filesort |
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
I have looked at a lot of similar topics, but unfortunately I cannot find a solution that works on my server. I would be very glad to hear a solution with some explanation, so that I can optimize my other queries as well.
Thanks !
This is your query:
SELECT id
FROM news
WHERE lang = 'en' AND STATUS =1
ORDER BY DATE DESC
LIMIT 0, 10
The best index is one that contains all the fields used in the query (four fields in all). The column order in the index should be: the columns with equality conditions in the WHERE clause first, then the ORDER BY column, then any other columns in the SELECT list.
So, try this index: news(lang, status, date, id).
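The suggested index could be created roughly as follows. This is a sketch only; the index name idx_lang_status_date_id is made up for illustration.

```sql
-- Equality columns (lang, status), then the ORDER BY column (date),
-- then the selected column (id). The index name is hypothetical.
ALTER TABLE `news`
  ADD INDEX `idx_lang_status_date_id` (`lang`, `status`, `date`, `id`);

-- Re-check the plan on the server afterwards.
EXPLAIN
SELECT id
FROM news
WHERE lang = 'en' AND status = 1
ORDER BY `date` DESC
LIMIT 0, 10;
```

If the optimizer picks this index, the "Using filesort" shown in the server's EXPLAIN should no longer be needed, because the rows for a given lang and status are already stored in date order.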

How can I optimize a Mysql query that searches for rows in a certain date range

Here is the query:
select timespans.id as timespan_id, count(*) as num
from reports, timespans
where timespans.after_date >= '2011-04-13 22:08:38' and
timespans.after_date <= reports.authored_at and
reports.authored_at < timespans.before_date
group by timespans.id;
Here are the table defs:
CREATE TABLE `reports` (
`id` int(11) NOT NULL auto_increment,
`source_id` int(11) default NULL,
`url` varchar(255) default NULL,
`lat` decimal(20,15) default NULL,
`lng` decimal(20,15) default NULL,
`content` text,
`notes` text,
`authored_at` datetime default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
`data` text,
`title` varchar(255) default NULL,
`author_id` int(11) default NULL,
`orig_id` varchar(255) default NULL,
PRIMARY KEY (`id`),
KEY `index_reports_on_title` (`title`),
KEY `index_content_on_reports` (`content`(128))
CREATE TABLE `timespans` (
`id` int(11) NOT NULL auto_increment,
`after_date` datetime default NULL,
`before_date` datetime default NULL,
`after_offset` int(11) default NULL,
`before_offset` int(11) default NULL,
`is_common` tinyint(1) default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
`is_search_chunk` tinyint(1) default NULL,
`is_day` tinyint(1) default NULL,
PRIMARY KEY (`id`),
KEY `index_timespans_on_after_date` (`after_date`),
KEY `index_timespans_on_before_date` (`before_date`)
And here is the explain:
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9 | NULL | 84 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | reports | ALL | NULL | NULL | NULL | NULL | 183297 | Using where |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
And here is the explain after I create an index on authored_at. As you can see, the index is not actually getting used (I think...)
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| 1 | SIMPLE | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9 | NULL | 86 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | reports | ALL | index_reports_on_authored_at | NULL | NULL | NULL | 183317 | Range checked for each record (index map: 0x8) |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
There are about 142k rows in the reports table, and far fewer in the timespans table.
The query is taking about 3 seconds now.
The strange thing is that if I add an index on reports.authored_at, it actually makes the query far slower, about 20 seconds. I would have thought it would do the opposite, since it would make it easy to find the reports at either end of the range, and throw the rest away, rather than having to examine all of them.
Can someone clarify? I'm stumped.
Instead of two separate indexes on the timespans table, try a single multi-column index that covers both after_date and before_date. Then add an index on reports.authored_at as well.
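That suggestion might look like the following. It is a sketch under assumptions: the name idx_timespans_after_before is invented here, and index_reports_on_authored_at is the name shown in the asker's second EXPLAIN.

```sql
-- One composite index on timespans instead of two single-column ones.
-- The index name is hypothetical.
ALTER TABLE `timespans`
  ADD INDEX `idx_timespans_after_before` (`after_date`, `before_date`);

-- Index on the reports column used in the range comparison.
-- Skip this statement if an index with this name already exists.
ALTER TABLE `reports`
  ADD INDEX `index_reports_on_authored_at` (`authored_at`);
```

Running EXPLAIN on the query again afterwards shows whether the optimizer actually chooses the new indexes.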
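I rewrote your query like this, adding an explicit lower bound on r.authored_at (it is implied by the other conditions) so that the optimizer can consider a range scan on that column:
select t.id, count(*) as num from timespans t
join reports r where t.after_date >= '2011-04-13 22:08:38'
and r.authored_at >= '2011-04-13 22:08:38'
and t.after_date <= r.authored_at
and r.authored_at < t.before_date
group by t.id order by null;
and added an index on reports: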
alter table reports add index authored_at_idx(authored_at);
You can also use the database's partitioning feature on the after_date column. It may help you a lot.

MySql query taking too long - django on webfaction

I use Django on WebFaction, and I got a "MySql query taking too long" message.
The SQL is:
SELECT (1) AS `a` FROM `main_userprofile` WHERE `main_userprofile`.`id` = 98
This is a rather simple query, so why does it take so long?
Here is the 'create table':
main_userprofile | CREATE TABLE `main_userprofile` (
`id` int(11) NOT NULL auto_increment,
`user_id` int(11) NOT NULL,
`sex` smallint(6) NOT NULL,
`active_number` varchar(64) NOT NULL,
`phone_number` varchar(32) NOT NULL,
`work_number` varchar(32) NOT NULL,
...
...
PRIMARY KEY (`id`),
UNIQUE KEY `user_id` (`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1652 DEFAULT CHARSET=utf8 |
The id is the primary key.
The EXPLAIN:
explain SELECT (1) AS `a` FROM `main_userprofile` WHERE `main_userprofile`.`id` = 98;
+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | main_userprofile | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index |
+----+-------------+------------------+-------+---------------+---------+---------+-------+------+-------------+
It seems like id is not indexed, is it? Is id the primary key of your table? Could you post the result of SHOW CREATE TABLE main_userprofile;?

MySQL structure help for joins ( large tables)

I currently have 2 tables that are used in a select query with a simple join. The first table holds around 6-9 million rows and is the one being joined. The primary table has anywhere from 1 million to 300 million rows. However, I notice that once the primary table grows above 10 million rows, the select query goes from instant to very slow (3+ seconds, and growing).
Here is my table structure and queries.
CREATE TABLE IF NOT EXISTS `links` (
`link_id` int(10) unsigned NOT NULL,
`domain_id` mediumint(7) unsigned NOT NULL,
`parent_id` int(11) unsigned DEFAULT NULL,
`hash` int(10) unsigned NOT NULL,
`url` text NOT NULL,
`type` enum('html','pdf') DEFAULT NULL,
`processed` enum('N','Y') NOT NULL DEFAULT 'N',
UNIQUE KEY `hash` (`hash`),
KEY `idx_processed` (`processed`),
KEY `domain_id` (`domain_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
CREATE TABLE IF NOT EXISTS `domains` (
`domain_id` mediumint(7) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(170) NOT NULL,
`blocked` enum('N','Y') NOT NULL DEFAULT 'N',
`count` mediumint(6) NOT NULL DEFAULT '0',
`mcount` mediumint(3) NOT NULL,
PRIMARY KEY (`domain_id`),
KEY `name` (`name`),
KEY `blocked` (`blocked`),
KEY `mcount` (`mcount`),
KEY `count` (`count`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10834389 ;
Query:
(SELECT link_id, url, hash FROM links, domains WHERE links.domain_id = domains.domain_id and mcount > 1 and processed='N' limit 200)
UNION
(SELECT link_id, url, hash FROM links where processed='N' and type='html' limit 200)
Explain select:
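+----+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+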
| 1 | PRIMARY | domains | range | PRIMARY,mcount | mcount | 3 | NULL | 257673 | Using where |
| 1 | PRIMARY | links | ref | idx_processed,domain_id | domain_id | 3 | crawler.domains.domain_id | 1 | Using where |
| 2 | UNION | links | ref | idx_processed | idx_processed | 1 | const | 7090017 | Using where |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+
Right now, I'm trying a partition with 20 partitions on links using domain_id as the key.
Any other options would be greatly appreciated.
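A single SELECT statement could replace your entire UNION statement, provided you only want rows that satisfy both sets of conditions (the UNION returns rows that match either one, up to 200 from each branch):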
SELECT link_id, url, hash
FROM links, domains
WHERE links.domain_id = domains.domain_id
AND mcount > 1
AND processed='N'
AND type='html'
This may not be THE answer you are looking for, but it should help you simplify your question.
When things suddenly slow down, you might want to compare the size of the indexes used in the query execution with the size of the relevant MySQL buffers.
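One way to make that comparison is sketched below. It assumes the schema is named crawler, as the ref column of the EXPLAIN output suggests, and that the tables are MyISAM as in the definitions above.

```sql
-- Data and index size per table, in megabytes.
-- 'crawler' is taken from the EXPLAIN output and may differ on your server.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.tables
WHERE table_schema = 'crawler'
  AND table_name IN ('links', 'domains');

-- MyISAM caches index blocks in the key buffer.
SHOW VARIABLES LIKE 'key_buffer_size';
```

If index_mb for the indexes involved is well above key_buffer_size, index blocks may be read from disk repeatedly, which would be consistent with a query slowing down sharply once the table passes a certain size.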