MySQL query paralyzes site - mysql

Once in a while, at random intervals, our website gets completely paralyzed.
Looking at SHOW FULL PROCESSLIST;, I've noticed that when this happens, there is a specific query that is "Copying to tmp table" for a loooong time (sometimes 350 seconds), and almost all the other queries are "Locked".
The part I don't understand is that 90% of the time, this query runs fine. I see it going through in the process list and it finishes pretty quickly most of the time.
This query is called by an AJAX request on our homepage to display product recommendations based on your browsing history (à la Amazon).
Just sometimes, randomly (but too often), it gets stuck at "copying to tmp table".
Here is a caught instance of the query that was up 109 seconds when I looked:
SELECT DISTINCT product_product.id, product_product.name, product_product.retailprice, product_product.imageurl, product_product.thumbnailurl, product_product.msrp
FROM product_product, product_xref, product_viewhistory
WHERE
(
(product_viewhistory.productId = product_xref.product_id_1 AND product_xref.product_id_2 = product_product.id)
OR
(product_viewhistory.productId = product_xref.product_id_2 AND product_xref.product_id_1 = product_product.id)
)
AND product_product.outofstock='N'
AND product_viewhistory.cookieId = '188af1efad392c2adf82'
AND product_viewhistory.productId IN (24976, 25873, 26067, 26073, 44949, 16209, 70528, 69784, 75171, 75172)
ORDER BY product_xref.hits DESC
LIMIT 10
Of course the "cookieId" and the list of "productId" changes dynamically depending on the request.
I use php with PDO.
Edit: I figured some of the table structures involved might help:
CREATE TABLE IF NOT EXISTS `product_viewhistory` (
`userId` int(10) unsigned NOT NULL default '0',
`cookieId` varchar(30) collate utf8_unicode_ci NOT NULL,
`productId` int(11) NOT NULL,
`viewTime` timestamp NOT NULL default CURRENT_TIMESTAMP,
KEY `userId` (`userId`),
KEY `cookieId` (`cookieId`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE IF NOT EXISTS `product_xref` (
`id` int(11) NOT NULL auto_increment,
`product_id_1` int(11) default NULL,
`product_id_2` int(11) default NULL,
`hits` int(11) NOT NULL default '0',
PRIMARY KEY (`id`),
KEY `IDX_PROD1` (`product_id_1`),
KEY `IDX_PROD2` (`product_id_2`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=184531 ;
CREATE TABLE IF NOT EXISTS `product_product` (
`id` int(11) NOT NULL auto_increment,
`supplierid` int(11) NOT NULL default '0',
`suppliersku` varchar(100) NOT NULL default '',
`name` varchar(100) NOT NULL default '',
`cost` decimal(10,2) NOT NULL default '0.00',
`retailprice` decimal(10,2) NOT NULL default '0.00',
`weight` decimal(10,2) NOT NULL default '0.00',
`imageurl` varchar(255) NOT NULL default '',
`thumbnailurl` varchar(255) NOT NULL default '',
`sizechartlink` varchar(255) NOT NULL default '',
`content` text NOT NULL,
`remark` varchar(100) NOT NULL default '',
`colorchartlink` varchar(255) default NULL,
`outofstock` char(1) NOT NULL default '',
`summary` text NOT NULL,
`freehandoutlink` varchar(255) default NULL,
`msrp` decimal(10,2) default NULL,
`enabled` tinyint(1) NOT NULL default '1',
`sales_score` float NOT NULL default '0',
`sales_score_offset` float NOT NULL default '0',
`date_added` timestamp NULL default CURRENT_TIMESTAMP,
`brand` varchar(255) default NULL,
`tag_status` varchar(20) default NULL,
PRIMARY KEY (`id`),
KEY `product_retailprice_idx` (`retailprice`),
KEY `suppliersku` (`suppliersku`),
FULLTEXT KEY `product_name_summary_ft` (`name`,`summary`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Also, by request, the result of a EXPLAIN:
+----+-------------+---------------------+------+---------------------+----------+---------+-------+-------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+------+---------------------+----------+---------+-------+-------+------------------------------------------------+
| 1 | SIMPLE | product_xref | ALL | IDX_PROD1,IDX_PROD2 | NULL | NULL | NULL | 30035 | Using temporary; Using filesort |
| 1 | SIMPLE | product_viewhistory | ref | cookieId | cookieId | 92 | const | 682 | Using where |
| 1 | SIMPLE | product_product | ALL | PRIMARY | NULL | NULL | NULL | 31880 | Range checked for each record (index map: 0x1) |
+----+-------------+---------------------+------+---------------------+----------+---------+-------+-------+------------------------------------------------+
3 rows in set (0.00 sec)
Here is a new, updated version, as I realized I did not need product_viewhistory at all. It was left over from older code:
SELECT DISTINCT product_product.id, product_product.name, product_product.retailprice, product_product.imageurl, product_product.thumbnailurl, product_product.msrp
FROM product_product, product_xref
WHERE
(
(product_xref.product_id_1 IN (24976, 25873, 26067, 26073, 44949, 16209, 70528, 69784, 75171, 75172) AND product_xref.product_id_2 = product_product.id)
OR
(product_xref.product_id_2 IN (24976, 25873, 26067, 26073, 44949, 16209, 70528, 69784, 75171, 75172) AND product_xref.product_id_1 = product_product.id)
)
AND product_product.outofstock='N'
ORDER BY product_xref.hits DESC
LIMIT 10
And the new explain:
+----+-------------+-----------------+-------------+---------------------+---------------------+---------+------+-------+-------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+-------------+---------------------+---------------------+---------+------+-------+-------------------------------------------------------------------------------------+
| 1 | SIMPLE | product_xref | index_merge | IDX_PROD1,IDX_PROD2 | IDX_PROD1,IDX_PROD2 | 5,5 | NULL | 32 | Using sort_union(IDX_PROD1,IDX_PROD2); Using where; Using temporary; Using filesort |
| 1 | SIMPLE | product_product | ALL | PRIMARY | NULL | NULL | NULL | 31880 | Range checked for each record (index map: 0x1) |
+----+-------------+-----------------+-------------+---------------------+---------------------+---------+------+-------+-------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

The first thing to do is see what MySQL is doing under the hood with EXPLAIN, then go from there. It sounds like you have some indexing to do.

I rewrote your query as:
SELECT DISTINCT
pp.id,
pp.name,
pp.retailprice,
pp.imageurl,
pp.thumbnailurl,
pp.msrp
FROM PRODUCT_PRODUCT pp
LEFT JOIN PRODUCT_XREF px1 ON px1.product_id_2 = pp.id
LEFT JOIN PRODUCT_XREF px2 ON px2.product_id_1 = pp.id
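WHERE (EXISTS(SELECT NULL
FROM PRODUCT_VIEWHISTORY pvh
WHERE pvh.productid = px1.product_id_1
AND pvh.cookieId = '188af1efad392c2adf82'
AND pvh.productId IN (24976, 25873, 26067, 26073, 44949, 16209, 70528, 69784, 75171, 75172))
OR EXISTS(SELECT NULL
FROM PRODUCT_VIEWHISTORY pvh
WHERE pvh.productid = px2.product_id_2
AND pvh.cookieId = '188af1efad392c2adf82'
AND pvh.productId IN (24976, 25873, 26067, 26073, 44949, 16209, 70528, 69784, 75171, 75172)))
-- The outer parentheses matter: AND binds tighter than OR, so without them
-- the stock filter would apply only to the second EXISTS.
AND pp.outofstock = 'N'
-- GREATEST returns NULL if any argument is NULL, which happens whenever one
-- of the LEFT JOINs finds no match, so default missing hits to 0.
ORDER BY GREATEST(COALESCE(px1.hits, 0), COALESCE(px2.hits, 0)) DESC
LIMIT 10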
It would've been easier if the ORDER BY didn't rely on the PRODUCT_XREF.hits column. Too bad MySQL (before version 8.0) doesn't support Common Table Expressions (CTEs)/Subquery Factoring...
Having two different product_id references is a highly questionable approach. I recommend reviewing the data model.
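A different way to handle the two product_id columns, not taken from the answer above, is to split the OR into two index-friendly lookups and combine them with UNION ALL. This is only a sketch: it assumes the updated query from the question (without product_viewhistory), and it ranks each product by its highest hits value, which is one possible reading of the original DISTINCT plus ORDER BY hits. The aliases x, p and related_id are made up for illustration.

```sql
SELECT p.id, p.name, p.retailprice, p.imageurl, p.thumbnailurl, p.msrp
FROM (
    -- Viewed product is on the left side of the pair; can use IDX_PROD1.
    SELECT product_id_2 AS related_id, hits
    FROM product_xref
    WHERE product_id_1 IN (24976, 25873, 26067, 26073, 44949, 16209, 70528, 69784, 75171, 75172)
    UNION ALL
    -- Viewed product is on the right side of the pair; can use IDX_PROD2.
    SELECT product_id_1 AS related_id, hits
    FROM product_xref
    WHERE product_id_2 IN (24976, 25873, 26067, 26073, 44949, 16209, 70528, 69784, 75171, 75172)
) AS x
JOIN product_product AS p ON p.id = x.related_id  -- primary key lookup
WHERE p.outofstock = 'N'
GROUP BY p.id, p.name, p.retailprice, p.imageurl, p.thumbnailurl, p.msrp
ORDER BY MAX(x.hits) DESC  -- one row per product, best pairing first
LIMIT 10;
```

Whether this is actually faster depends on the data, so compare the EXPLAIN output of both forms before switching.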

You need to optimize your query. Run it from the mysql prompt or another MySQL client with EXPLAIN and check the execution plan. You may need to add indexes to your tables. Keep in mind that if you run this query a few times in a row, the MySQL server may serve the result from its query cache, so you shouldn't rely on those fast execution times. That may be why your query seems to run fine 90% of the time.
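To see the real cost rather than a cached result, the query can be timed with the query cache bypassed. A minimal sketch, assuming a MySQL version that still has the query cache (it was removed in 8.0); the ID list is shortened here purely for illustration:

```sql
-- Check whether the query cache is enabled at all.
SHOW VARIABLES LIKE 'query_cache_type';

-- Inspect the execution plan.
EXPLAIN
SELECT DISTINCT pp.id, pp.name
FROM product_product pp, product_xref px
WHERE ((px.product_id_1 IN (24976, 25873) AND px.product_id_2 = pp.id)
    OR (px.product_id_2 IN (24976, 25873) AND px.product_id_1 = pp.id))
  AND pp.outofstock = 'N'
ORDER BY px.hits DESC
LIMIT 10;

-- Time the same query while asking the server not to use the query cache.
SELECT DISTINCT SQL_NO_CACHE pp.id, pp.name
FROM product_product pp, product_xref px
WHERE ((px.product_id_1 IN (24976, 25873) AND px.product_id_2 = pp.id)
    OR (px.product_id_2 IN (24976, 25873) AND px.product_id_1 = pp.id))
  AND pp.outofstock = 'N'
ORDER BY px.hits DESC
LIMIT 10;
```

If the uncached run is consistently slow, the cache was hiding the problem; if it is usually fast, the occasional stalls are more likely tied to specific inputs or to lock contention.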

Related

How to optimize the below query and what is (Using where; Using join buffer (Block Nested Loop)) With EXPLAIN

I have been facing an issue with this query: whenever I run it, it takes 10 to 15 seconds to execute. Also, what does (Using where; Using join buffer (Block Nested Loop)) mean in the EXPLAIN output?
The query is
SELECT wp_posts.ID, post_title, post_content, wp_pvc_total.postcount,
wp_pvc_total.postnum
FROM wp_posts
LEFT JOIN wp_term_relationships
ON ( wp_posts.ID =
wp_term_relationships.object_id )
LEFT JOIN wp_term_taxonomy
ON ( wp_term_relationships.term_taxonomy_id =
wp_term_taxonomy.term_taxonomy_id )
LEFT JOIN wp_pvc_total ON ( wp_pvc_total.postnum = wp_posts.ID )
WHERE wp_posts.post_author = 630
AND wp_posts.post_author NOT IN(675)
AND wp_posts.ID != 48075
GROUP BY wp_posts.ID
ORDER BY wp_pvc_total.postcount DESC
LIMIT 0, 9;
Query with explain
+----+-------------+-----------------------+--------+---------------------------------------------------------------------------+-------------+---------+-------------------------------------------------------+-------+--------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------+--------+---------------------------------------------------------------------------+-------------+---------+-------------------------------------------------------+-------+--------------------------------------------------------+
| 1 | SIMPLE | wp_posts | range | PRIMARY,post_name,type_status_date,post_parent,post_author,idx_post_title | post_author | 16 | NULL | 1682 | Using index condition; Using temporary; Using filesort |
| 1 | SIMPLE | wp_term_relationships | ref | PRIMARY | PRIMARY | 8 | marriai1_topic.wp_posts.ID | 1 | Using index |
| 1 | SIMPLE | wp_term_taxonomy | eq_ref | PRIMARY | PRIMARY | 8 | marriai1_topic.wp_term_relationships.term_taxonomy_id | 1 | Using index |
| 1 | SIMPLE | wp_pvc_total | ALL | NULL | NULL | NULL | NULL | 19670 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-----------------------+--------+---------------------------------------------------------------------------+-------------+---------+-------------------------------------------------------+-------+--------------------------------------------------------+
The MySQL version is 5.6 with the InnoDB engine, and the table structures are:
CREATE TABLE `wp_posts` (
`ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`post_author` bigint(20) unsigned NOT NULL DEFAULT '0',
`post_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_date_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_content` longtext NOT NULL,
`post_title` text NOT NULL,
`post_excerpt` text NOT NULL,
`post_status` varchar(20) NOT NULL DEFAULT 'publish',
`comment_status` varchar(20) NOT NULL DEFAULT 'open',
`ping_status` varchar(20) NOT NULL DEFAULT 'open',
`post_password` varchar(255) NOT NULL DEFAULT '',
`post_name` varchar(200) NOT NULL DEFAULT '',
`to_ping` text NOT NULL,
`pinged` text NOT NULL,
`post_modified` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_modified_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`post_content_filtered` longtext NOT NULL,
`post_parent` bigint(20) unsigned NOT NULL DEFAULT '0',
`guid` varchar(255) NOT NULL DEFAULT '',
`menu_order` int(11) NOT NULL DEFAULT '0',
`post_type` varchar(20) NOT NULL DEFAULT 'post',
`post_mime_type` varchar(100) NOT NULL DEFAULT '',
`comment_count` bigint(20) NOT NULL DEFAULT '0',
PRIMARY KEY (`ID`),
KEY `post_name` (`post_name`(191)),
KEY `type_status_date` (`post_type`,`post_status`,`post_date`,`ID`),
KEY `post_parent` (`post_parent`),
KEY `post_author` (`post_author`),
FULLTEXT KEY `idx_post_title` (`post_title`)
)
and
CREATE TABLE `wp_pvc_total` (
`id` mediumint(9) NOT NULL AUTO_INCREMENT,
`postnum` varchar(255) NOT NULL,
`postcount` int(11) NOT NULL DEFAULT '750',
UNIQUE KEY `id` (`id`)
)
Add INDEX(postnum, postcount)
That will make reaching into wp_pvc_total more efficient. And it is "covering".
BNL is quite efficient, so I am not sure that this index will be better. But it seems like having no useful index is inefficient.
On a separate issue, KEY post_name (post_name(191)) is problematic. See http://mysql.rjweb.org/doc.php/limits#767_limit_in_innodb_indexes
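One caveat on the suggested index: postnum is a varchar(255) while wp_posts.ID is a bigint, and MySQL cannot use an index on a string column to look up a numeric value. A sketch of how the index could be added so the join can use it, assuming every postnum value is a plain numeric post ID (this is an assumption, so check first); the index name postnum_postcount is made up:

```sql
-- Assumption check: list any values that would not convert cleanly.
SELECT postnum
FROM wp_pvc_total
WHERE postnum NOT REGEXP '^[0-9]+$';

-- If the check returns no rows, match the column type to wp_posts.ID
-- and add the covering index suggested above.
ALTER TABLE wp_pvc_total
  MODIFY postnum BIGINT(20) UNSIGNED NOT NULL,
  ADD INDEX postnum_postcount (postnum, postcount);
```

After the change, the wp_pvc_total row in EXPLAIN should ideally switch from type ALL to ref with "Using index".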

How to optimize this query in MySQL

I have these two tables (Moodle 2.8):
CREATE TABLE `mdl_course` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`category` bigint(10) NOT NULL DEFAULT '0',
`sortorder` bigint(10) NOT NULL DEFAULT '0',
`fullname` varchar(254) NOT NULL DEFAULT '',
`shortname` varchar(255) NOT NULL DEFAULT '',
`idnumber` varchar(100) NOT NULL DEFAULT '',
`summary` longtext,
`summaryformat` tinyint(2) NOT NULL DEFAULT '0',
`format` varchar(21) NOT NULL DEFAULT 'topics',
`showgrades` tinyint(2) NOT NULL DEFAULT '1',
`newsitems` mediumint(5) NOT NULL DEFAULT '1',
`startdate` bigint(10) NOT NULL DEFAULT '0',
`marker` bigint(10) NOT NULL DEFAULT '0',
`maxbytes` bigint(10) NOT NULL DEFAULT '0',
`legacyfiles` smallint(4) NOT NULL DEFAULT '0',
`showreports` smallint(4) NOT NULL DEFAULT '0',
`visible` tinyint(1) NOT NULL DEFAULT '1',
`visibleold` tinyint(1) NOT NULL DEFAULT '1',
`groupmode` smallint(4) NOT NULL DEFAULT '0',
`groupmodeforce` smallint(4) NOT NULL DEFAULT '0',
`defaultgroupingid` bigint(10) NOT NULL DEFAULT '0',
`lang` varchar(30) NOT NULL DEFAULT '',
`theme` varchar(50) NOT NULL DEFAULT '',
`timecreated` bigint(10) NOT NULL DEFAULT '0',
`timemodified` bigint(10) NOT NULL DEFAULT '0',
`requested` tinyint(1) NOT NULL DEFAULT '0',
`enablecompletion` tinyint(1) NOT NULL DEFAULT '0',
`completionnotify` tinyint(1) NOT NULL DEFAULT '0',
`cacherev` bigint(10) NOT NULL DEFAULT '0',
`calendartype` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `mdl_cour_cat_ix` (`category`),
KEY `mdl_cour_idn_ix` (`idnumber`),
KEY `mdl_cour_sho_ix` (`shortname`),
KEY `mdl_cour_sor_ix` (`sortorder`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `mdl_log` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`time` bigint(10) NOT NULL DEFAULT '0',
`userid` bigint(10) NOT NULL DEFAULT '0',
`ip` varchar(45) NOT NULL DEFAULT '',
`course` bigint(10) NOT NULL DEFAULT '0',
`module` varchar(20) NOT NULL DEFAULT '',
`cmid` bigint(10) NOT NULL DEFAULT '0',
`action` varchar(40) NOT NULL DEFAULT '',
`url` varchar(100) NOT NULL DEFAULT '',
`info` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `mdl_log_coumodact_ix` (`course`,`module`,`action`),
KEY `mdl_log_tim_ix` (`time`),
KEY `mdl_log_act_ix` (`action`),
KEY `mdl_log_usecou_ix` (`userid`,`course`),
KEY `mdl_log_cmi_ix` (`cmid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And this query:
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
INNER JOIN mdl_course c ON l.course = c.id AND c.category <> 0
WHERE
l.id > [some large id]
AND
l.time > [some unix timestamp]
ORDER BY l.id ASC
LIMIT 0,200
The mdl_log table has over 200 million records, and I need to export it to a file using PHP without dying in the attempt. The main problem is that executing this query is too slow. The main killer is the join to the mdl_course table. If I remove it, everything works fast.
Here is the explain:
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
| 1 | SIMPLE | c | range | PRIMARY,mdl_cour_cat_ix | mdl_cour_cat_ix | 8 | NULL | 3152 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | l | ref | PRIMARY,mdl_log_coumodact_ix,mdl_log_tim_ix | mdl_log_coumodact_ix | 8 | xray2qasb.c.id | 618 | Using index condition; Using where |
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
Is there any way to remove usage of temporary and filesort? What do you propose here?
After some testing this query works fast as expected:
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
WHERE
l.id > 123456
AND
l.time > 1234
AND
EXISTS (SELECT * FROM mdl_course c WHERE l.course = c.id AND c.category <> 0 )
ORDER BY l.id ASC
LIMIT 0,200
Thanks to JamieD77 for his suggestion!
execution plan:
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
| 1 | PRIMARY | l | range | PRIMARY,mdl_log_tim_ix | PRIMARY | 8 | NULL | 99962199 | Using where |
| 2 | DEPENDENT SUBQUERY | c | eq_ref | PRIMARY,mdl_cour_cat_ix | PRIMARY | 8 | xray2qasb.l.course | 1 | Using where |
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
Try moving the category selection outside the JOIN. Here I put it in an IN() which the engine will cache on successive runs. I don't have 200M rows to test on, so YMMV.
DESCRIBE
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
WHERE
l.id > 1234567890
AND
l.time > 1234567890
AND
l.course IN (SELECT c.id FROM mdl_course c WHERE c.category > 0)
ORDER BY l.id ASC
LIMIT 0,200;
(In addition to using EXISTS...)
l.id > 123456 AND l.time > 1234
seems to beg for a 2-dimensional index.
99962199 -- the table is very big, correct?
Consider PARTITION BY RANGE on mdl_log on time. But...
Don't have more than about 50 partitions; other inefficiencies kick in then.
Partitioning probably won't help if id and time are sorta in lock-step. Typical case: id is AUTO_INCREMENT and time is approximately the time of the INSERT.
If that applies, consider:
PRIMARY KEY(time, id) -- see below
INDEX(id) -- Yes, this is sufficient for `id AUTO_INCREMENT`.
With those indexes, you could efficiently do
WHERE time > ...
ORDER BY time, id
which is probably what you really wanted.
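The index change and the batched export could look roughly like this. It is a sketch under the assumption stated above (id is AUTO_INCREMENT and time grows with it); the index name mdl_log_id_ix and the literal values are placeholders, and on a table this size the ALTER rebuilds everything, so try it on a copy first.

```sql
-- Cluster the table by (time, id); the separate index on id keeps
-- AUTO_INCREMENT working.
ALTER TABLE mdl_log
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (`time`, id),
  ADD INDEX mdl_log_id_ix (id);

-- First batch: rows come back in primary key order, so no filesort is needed.
SELECT l.id, l.userid, l.course, l.time, l.ip, l.action, l.info, l.module, l.url
FROM mdl_log l
WHERE l.time > 1234567890
ORDER BY l.time, l.id
LIMIT 200;

-- Next batch: resume after the last (time, id) pair of the previous batch
-- (here 1234567999 and 5000 stand in for those values).
SELECT l.id, l.userid, l.course, l.time, l.ip, l.action, l.info, l.module, l.url
FROM mdl_log l
WHERE l.time >= 1234567999
  AND (l.time > 1234567999 OR l.id > 5000)
ORDER BY l.time, l.id
LIMIT 200;
```

Resuming from the last pair seen avoids a growing OFFSET, so each batch costs about the same as the first.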

MySQL optimization query

I have a MySQL issue. I have to optimize some queries on my website. I have already done one of them, but there are still some that I cannot resolve without your help.
I have a table called "news":
CREATE TABLE IF NOT EXISTS `news` (
`id` int(10) NOT NULL auto_increment,
`edited` smallint(1) NOT NULL default '0',
`site` varchar(30) default NULL,
`foreign_id` varchar(25) default NULL,
`title` varchar(255) NOT NULL,
`text` text NOT NULL,
`image` varchar(255) default NULL,
`horizontal` smallint(1) NOT NULL,
`image_author` varchar(255) default NULL,
`text_author` varchar(255) default NULL,
`lang` varchar(3) NOT NULL,
`link` varchar(255) NOT NULL,
`date` date NOT NULL,
`redirect` smallint(1) NOT NULL,
`parent` int(10) NOT NULL,
`views` int(5) NOT NULL,
`status` smallint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `lang` (`lang`,`status`),
KEY `date` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=47122 ;
As you can see, I have two indexes: "lang" and "date".
I have tried some combinations of different indexes, and this one gave me the best results... unfortunately only on my local computer. On the server I still get bad results, even though the database is the same.
query:
SELECT id FROM news WHERE lang = 'en' AND STATUS =1 ORDER BY DATE DESC LIMIT 0, 10
localhost explain:
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | news | index | lang | date | 3 | NULL | 23 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
server explain:
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
| 1 | SIMPLE | news | ref | status | status | 13 | const,const | 15840 | Using where; Using filesort |
+----+-------------+-------+------+---------------+--------+---------+-------------+-------+-----------------------------+
I have looked at a lot of similar topics, but unfortunately I cannot find a solution that works on my server. I would be very glad to hear a solution from you, with some explanation, so that I can optimize my other queries.
Thanks !
This is your query:
SELECT id
FROM news
WHERE lang = 'en' AND STATUS =1
ORDER BY DATE DESC
LIMIT 0, 10
The best index is one that contains all the fields used in the query (four fields in all). The ordering in the index is: columns with equality conditions in the WHERE clause, followed by the ORDER BY columns, followed by any other columns in the SELECT clause.
So, try this index: news(lang, status, date, id).
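As a concrete statement, the four-column index could be created and checked like this. A minimal sketch; the index name lang_status_date_id is made up for illustration.

```sql
-- Equality columns first (lang, status), then the ORDER BY column (date),
-- then id so the index covers the whole query.
ALTER TABLE news
  ADD INDEX lang_status_date_id (lang, status, `date`, id);

-- Re-check the plan on the server after adding the index.
EXPLAIN
SELECT id
FROM news
WHERE lang = 'en' AND status = 1
ORDER BY `date` DESC
LIMIT 0, 10;
```

If the optimizer picks the new index, the plan should ideally show "Using index" and no "Using filesort", because rows for a fixed lang and status are already stored in date order.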

MySQL Get rows that only exist in one table performance

The basics of this query have been asked, and answered, many times before, but I'm still having trouble with performance. Here are the details:
I have the table, Products, that has 105724 rows.
I have an update table, _e360products, that has 51813 rows.
I am matching on an alphanumeric 10 character code, that is indexed (unique) on both tables.
I have tried:
SELECT _e360products.Product_Code, products.StockCode
FROM _e360products Left Join Products ON _e360products.Product_Code = Products.StockCode
WHERE products.StockCode IS NULL
and:
SELECT Product_Code
FROM _e360products
WHERE Product_code NOT IN (SELECT StockCode FROM Products)
and, just for a laugh, even:
SELECT Product_Code
FROM _e360products
WHERE (SELECT count(*) FROM Products WHERE StockCode = Product_code) = 0
None of these have returned results within 20 mins!
If I reverse the queries, i.e. getting unique rows from _e360products, I get results very quickly.
Does anyone have any ideas?
~~~~~ Update ~~~~~
Explain results are:
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
| 1 | SIMPLE | _e360products | index | NULL | Product_code | 12 | NULL | 50811 | Using index |
| 1 | SIMPLE | Products | eq_ref | stockcode | stockcode | 12 | plumbase_bkup._e360products.Product_code | 1 | Using where; Using index; Not exists |
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
CREATE TABLE `_e360products` (
`Product_code` varchar(10) CHARACTER SET latin1 NOT NULL DEFAULT '',
`Manufacturers_code` varchar(255) DEFAULT '',
`Description` varchar(255) DEFAULT '',
`Supplier` varchar(255) DEFAULT '',
`Price` varchar(20) DEFAULT '',
`VAT` varchar(20) DEFAULT '',
`Analysis_code` varchar(20) DEFAULT NULL,
PRIMARY KEY (`Product_code`),
KEY `Product_code` (`Product_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `products` (
`productid` int(11) NOT NULL AUTO_INCREMENT,
`QPUM2` varchar(50) NOT NULL DEFAULT '1',
`NWID` varchar(50) NOT NULL DEFAULT '0',
`NHEI` varchar(50) NOT NULL DEFAULT '0',
`NLEN` varchar(50) NOT NULL DEFAULT '0',
`donotdisplayprice` tinyint(2) DEFAULT '0',
`productname` text,
`stockcode` varchar(10) NOT NULL DEFAULT '',
`analysiscode` varchar(50) DEFAULT '',
`usestockcontrol` int(11) DEFAULT '0',
`stockvalue` int(11) DEFAULT '0',
`stock_notification_level` int(11) DEFAULT '0',
`sectionid` int(11) DEFAULT '0',
`productprice` varchar(50) DEFAULT '',
`productprice_incvat` varchar(50) DEFAULT '',
`deleted` int(11) DEFAULT '0',
PRIMARY KEY (`productid`),
UNIQUE KEY `stockcode` (`stockcode`) USING BTREE,
KEY `deleted` (`deleted`),
KEY `allowordering` (`allowordering`),
) ENGINE=MyISAM AUTO_INCREMENT=147440 DEFAULT CHARSET=latin1;
Note: the products table definition above doesn't include ALL the fields, as there are quite a few...
Please provide a query execution plan (EXPLAIN); it seems your index is not used. Also show us the CREATE TABLE statements for both tables.
Typo? StockCode[add space here]IS
SELECT _e360products.Product_Code, products.StockCode
FROM _e360products Left Join Products ON _e360products.Product_Code = Products.StockCode
WHERE products.StockCode IS NULL

How can I optimize a Mysql query that searches for rows in a certain date range

Here is the query:
select timespans.id as timespan_id, count(*) as num
from reports, timespans
where timespans.after_date >= '2011-04-13 22:08:38' and
timespans.after_date <= reports.authored_at and
reports.authored_at < timespans.before_date
group by timespans.id;
Here are the table defs:
CREATE TABLE `reports` (
`id` int(11) NOT NULL auto_increment,
`source_id` int(11) default NULL,
`url` varchar(255) default NULL,
`lat` decimal(20,15) default NULL,
`lng` decimal(20,15) default NULL,
`content` text,
`notes` text,
`authored_at` datetime default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
`data` text,
`title` varchar(255) default NULL,
`author_id` int(11) default NULL,
`orig_id` varchar(255) default NULL,
PRIMARY KEY (`id`),
KEY `index_reports_on_title` (`title`),
KEY `index_content_on_reports` (`content`(128))
CREATE TABLE `timespans` (
`id` int(11) NOT NULL auto_increment,
`after_date` datetime default NULL,
`before_date` datetime default NULL,
`after_offset` int(11) default NULL,
`before_offset` int(11) default NULL,
`is_common` tinyint(1) default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
`is_search_chunk` tinyint(1) default NULL,
`is_day` tinyint(1) default NULL,
PRIMARY KEY (`id`),
KEY `index_timespans_on_after_date` (`after_date`),
KEY `index_timespans_on_before_date` (`before_date`)
And here is the explain:
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9 | NULL | 84 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | reports | ALL | NULL | NULL | NULL | NULL | 183297 | Using where |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
And here is the explain after I create an index on authored_at. As you can see, the index is not actually getting used (I think...)
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| 1 | SIMPLE | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9 | NULL | 86 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | reports | ALL | index_reports_on_authored_at | NULL | NULL | NULL | 183317 | Range checked for each record (index map: 0x8) |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
There are about 142k rows in the reports table, and far fewer in the timespans table.
The query is taking about 3 seconds now.
The strange thing is that if I add an index on reports.authored_at, it actually makes the query far slower, about 20 seconds. I would have thought it would do the opposite, since it would make it easy to find the reports at either end of the range, and throw the rest away, rather than having to examine all of them.
Can someone clarify? I'm stumped.
Instead of two separate indexes on the timespans table, try merging them into a single multi-column index that contains both after_date and before_date. Then add an index on reports.authored_at as well.
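The multi-column index on timespans might be created like this. It is a sketch: the index name is made up, and after_date is placed first on the assumption that the constant range condition on it is the most selective filter.

```sql
-- One composite index instead of the two single-column ones.
ALTER TABLE timespans
  ADD INDEX index_timespans_on_after_and_before (after_date, before_date);

-- Compare the plan before and after the change.
EXPLAIN
select timespans.id as timespan_id, count(*) as num
from reports, timespans
where timespans.after_date >= '2011-04-13 22:08:38' and
timespans.after_date <= reports.authored_at and
reports.authored_at < timespans.before_date
group by timespans.id;
```

Given that the question reports the query getting slower with an index on authored_at, it is worth timing each index change separately rather than applying them all at once.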
I rewrote your query like this:
select t.id, count(*) as num from timespans t
join reports r on r.authored_at >= t.after_date
and r.authored_at < t.before_date
where t.after_date >= '2011-04-13 22:08:38'
-- implied by the conditions above; gives the optimizer a constant range on reports
and r.authored_at >= '2011-04-13 22:08:38'
group by t.id order by null;
and added this index:
alter table reports add index authored_at_idx(authored_at);
You can also use the database's partitioning feature on the after_date column. It may help you a lot.