MySQL Get rows that only exist in one table performance - mysql

The basics of this query have been asked, and answered, many times before, but I'm still having trouble with performance. Here are the details:
I have the table, Products, that has 105724 rows.
I have an update table, _e360products, that has 51813 rows.
I am matching on an alphanumeric 10 character code, that is indexed (unique) on both tables.
I have tried:
SELECT _e360products.Product_Code, products.StockCode
FROM _e360products Left Join Products ON _e360products.Product_Code = Products.StockCode
WHERE products.StockCode IS NULL
and:
SELECT Product_Code
FROM _e360products
WHERE Product_code NOT IN (SELECT StockCode FROM Products)
and, just for a laugh, even:
SELECT Product_Code
FROM _e360products
WHERE (SELECT count(*) FROM Products WHERE StockCode = Product_code) = 0
None of these have returned results within 20 mins!
If I reverse the queries, i.e. getting unique rows from _e360products, I get results very quickly.
Does anyone have any ideas?
~~~~~ Update ~~~~~
Explain results are:
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
| 1 | SIMPLE | _e360products | index | NULL | Product_code | 12 | NULL | 50811 | Using index |
| 1 | SIMPLE | Products | eq_ref | stockcode | stockcode | 12 | plumbase_bkup._e360products.Product_code | 1 | Using where; Using index; Not exists |
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
CREATE TABLE `_e360products` (
`Product_code` varchar(10) CHARACTER SET latin1 NOT NULL DEFAULT '',
`Manufacturers_code` varchar(255) DEFAULT '',
`Description` varchar(255) DEFAULT '',
`Supplier` varchar(255) DEFAULT '',
`Price` varchar(20) DEFAULT '',
`VAT` varchar(20) DEFAULT '',
`Analysis_code` varchar(20) DEFAULT NULL,
PRIMARY KEY (`Product_code`),
KEY `Product_code` (`Product_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `products` (
`productid` int(11) NOT NULL AUTO_INCREMENT,
`QPUM2` varchar(50) NOT NULL DEFAULT '1',
`NWID` varchar(50) NOT NULL DEFAULT '0',
`NHEI` varchar(50) NOT NULL DEFAULT '0',
`NLEN` varchar(50) NOT NULL DEFAULT '0',
`donotdisplayprice` tinyint(2) DEFAULT '0',
`productname` text,
`stockcode` varchar(10) NOT NULL DEFAULT '',
`analysiscode` varchar(50) DEFAULT '',
`usestockcontrol` int(11) DEFAULT '0',
`stockvalue` int(11) DEFAULT '0',
`stock_notification_level` int(11) DEFAULT '0',
`sectionid` int(11) DEFAULT '0',
`productprice` varchar(50) DEFAULT '',
`productprice_incvat` varchar(50) DEFAULT '',
`deleted` int(11) DEFAULT '0',
PRIMARY KEY (`productid`),
UNIQUE KEY `stockcode` (`stockcode`) USING BTREE,
KEY `deleted` (`deleted`),
KEY `allowordering` (`allowordering`)
) ENGINE=MyISAM AUTO_INCREMENT=147440 DEFAULT CHARSET=latin1;
Note: the Products table definition above doesn't include ALL the fields, as there are quite a few...

Please provide the query execution plan (EXPLAIN); it seems your index is not being used. Also show us the CREATE TABLE statements for both tables.

Typo? StockCode[add space here]IS
SELECT _e360products.Product_Code, products.StockCode
FROM _e360products Left Join Products ON _e360products.Product_Code = Products.StockCode
WHERE products.StockCode IS NULL
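One formulation the question does not list is NOT EXISTS. The sketch below uses the table and column names from the CREATE TABLE statements above. Depending on the server version, MySQL may run it as a dependent subquery or as an anti-join, and nothing in this thread establishes that it is faster here, so compare its EXPLAIN output with the plans above before relying on it.

```sql
-- Rows in _e360products that have no matching stock code in products.
SELECT e.Product_code
FROM _e360products AS e
WHERE NOT EXISTS (
    SELECT 1
    FROM products AS p
    WHERE p.stockcode = e.Product_code
);
```

Because Product_code is NOT NULL and stockcode is NOT NULL, this returns the same rows as the LEFT JOIN ... IS NULL and NOT IN versions.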

Related

Optimizing a simple query with many left joins

I have quite a large query which is used for a user search on 'map_item'.
SELECT map_item_name
FROM map_item
LEFT JOIN map_section_item ON map_section_item_item_id = map_item_id
LEFT JOIN map_section ON map_section_id = map_section_item_section_id
LEFT JOIN map_item_flag ON map_item_flag_item_id = map_item_id
LEFT JOIN flag ON flag_id = map_item_flag_flag_id
LEFT JOIN map ON map_id = map_section_map_id
LEFT JOIN place_map ON place_map_map_id = map_id
LEFT JOIN place ON place_id = place_map_place_id
LEFT JOIN place_category ON place_category_place_id = place_id
LEFT JOIN category ON category_id = place_category_category_id
LEFT JOIN review ON review_map_item_id = map_item_id
LEFT JOIN map_price ON map_price_item_id = map_item_id
LEFT JOIN county_list ON place_address_county = county_id
'map_item' has 5399 records in total and none of the joined tables have much data in at all.
If I run this query without the left joins (SELECT map_item_name FROM map_item) it returns in 0.00s as expected, but the above query with the joins takes around 10.00s.
All of the left joins are required in the query due to the different filters that the user can apply to the search, however the original query was taking a long time to run (20 seconds or so), and after stripping out most parts of the query I was left with the above (which is just the left joins) and even this is taking 18 seconds to run.
Here is the explain statement from the query:
+----+-------------+-------------------+--------+----------------------------------+----------------------------------+---------+-----------------------------------------------------------+------+-----------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+--------+----------------------------------+----------------------------------+---------+-----------------------------------------------------------+------+-----------------------------------------------------------------+
| 1 | SIMPLE | map_item | ALL | NULL | NULL | NULL | NULL | 5455 | NULL |
| 1 | SIMPLE | map_section_item | index | NULL | map_section_item_section_id | 8 | NULL | 5330 | Using where; Using index; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | map_section | eq_ref | PRIMARY | PRIMARY | 4 | bestmeal.map_section_item.map_section_item_section_id | 1 | NULL |
| 1 | SIMPLE | map_item_flag | ALL | NULL | NULL | NULL | NULL | 1509 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | flag | eq_ref | PRIMARY | PRIMARY | 4 | bestmeal.map_item_flag.map_item_flag_flag_id | 1 | Using index |
| 1 | SIMPLE | map | eq_ref | PRIMARY | PRIMARY | 4 | bestmeal.map_section.map_section_map_id | 1 | Using index |
| 1 | SIMPLE | place_map | index | NULL | branch_map_branch_id | 8 | NULL | 1275 | Using where; Using index; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | place | eq_ref | PRIMARY | PRIMARY | 4 | bestmeal.place_map.place_map_place_id | 1 | NULL |
| 1 | SIMPLE | place_category | ref | place_category_place_id | place_category_place_id | 4 | bestmeal.place.place_id | 1 | Using index |
| 1 | SIMPLE | category | eq_ref | PRIMARY | PRIMARY | 4 | bestmeal.place_category.place_category_category_id | 1 | Using index |
| 1 | SIMPLE | review | ref | review_map_item_id | review_map_item_id | 4 | bestmeal.map_item.map_item_id | 1 | Using index |
| 1 | SIMPLE | map_price | ref | map_price_item_id | map_price_item_id | 4 | bestmeal.map_item.map_item_id | 1 | Using index |
| 1 | SIMPLE | county_list | eq_ref | PRIMARY | PRIMARY | 4 | bestmeal.place.place_address_county | 1 | Using index |
+----+-------------+-------------------+--------+----------------------------------+----------------------------------+---------+-----------------------------------------------------------+------+-----------------------------------------------------------------+
All of these joins are made against indexed fields, and none of the tables that are joined have any unnecessary indexes in them which could be used instead of the intended index.
I'm not an expert when it comes to optimising queries, but I'm struggling to work out what I can do to speed this query up whilst keeping the left joins. I also can't really think of any alternative solutions which will return the same results without using the joins.
Does anybody have any ideas that will help me to increase the performance on this query or accomplish the user search using a different, faster method?
Edit
Table structures as requested:
CREATE TABLE `map_item` (
`map_item_id` int(11) NOT NULL AUTO_INCREMENT,
`map_item_account_id` int(11) NOT NULL DEFAULT '0',
`map_item_category_id` int(11) NOT NULL,
`map_item_name` varchar(255) DEFAULT NULL,
`map_item_description` text,
`map_item_tags` varchar(255) DEFAULT NULL,
`map_item_type` set('d','f') DEFAULT NULL,
PRIMARY KEY (`map_item_id`),
KEY `map_item_account_id` (`map_item_account_id`),
KEY `map_item_tags` (`map_item_tags`),
KEY `map_item_category_id` (`map_item_category_id`),
FULLTEXT KEY `map_item_keyword_search` (`map_item_name`,`map_item_description`,`map_item_tags`),
FULLTEXT KEY `map_item_name` (`map_item_name`),
FULLTEXT KEY `map_item_description` (`map_item_description`),
FULLTEXT KEY `map_item_tags_2` (`map_item_tags`)
) ENGINE=InnoDB AUTO_INCREMENT=5420 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `map_section_item` (
`map_section_item_id` int(11) NOT NULL AUTO_INCREMENT,
`map_section_item_section_id` int(11) NOT NULL DEFAULT '0',
`map_section_item_item_id` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`map_section_item_id`),
KEY `map_section_item_section_id` (`map_section_item_section_id`,`map_section_item_item_id`)
) ENGINE=InnoDB AUTO_INCREMENT=24410 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `map_section` (
`map_section_id` int(11) NOT NULL AUTO_INCREMENT,
`map_section_map_id` int(11) NOT NULL DEFAULT '0',
`map_section_map_draft_id` int(11) NOT NULL DEFAULT '0',
`map_section_column` tinyint(1) NOT NULL DEFAULT '1',
`map_section_name` varchar(255) DEFAULT NULL,
`map_section_description` text,
PRIMARY KEY (`map_section_id`),
KEY `map_section_map_draft_id` (`map_section_map_draft_id`),
KEY `map_section_map_id` (`map_section_map_id`),
FULLTEXT KEY `index_name` (`map_section_name`)
) ENGINE=InnoDB AUTO_INCREMENT=4254 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `map_item_flag` (
`map_item_flag_id` int(11) NOT NULL AUTO_INCREMENT,
`map_item_flag_item_id` int(11) NOT NULL DEFAULT '0',
`map_item_flag_flag_id` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`map_item_flag_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1547 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `flag` (
`flag_id` int(11) NOT NULL AUTO_INCREMENT,
`flag_category_id` int(11) NOT NULL DEFAULT '0',
`flag_name` varchar(255) DEFAULT NULL,
`flag_description` varchar(255) DEFAULT NULL,
`flag_img` varchar(255) DEFAULT NULL,
`flag_order` tinyint(2) NOT NULL DEFAULT '0',
PRIMARY KEY (`flag_id`),
KEY `flag_category_id` (`flag_category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `map` (
`map_id` int(11) NOT NULL AUTO_INCREMENT,
`map_account_id` int(11) NOT NULL DEFAULT '0',
`map_name` varchar(255) DEFAULT NULL,
`map_description` text,
`map_type` set('d','f') DEFAULT NULL,
`map_layout` set('columns','tabs','collapsed') DEFAULT NULL,
PRIMARY KEY (`map_id`),
KEY `map_account_id` (`map_account_id`)
) ENGINE=InnoDB AUTO_INCREMENT=138 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `place_map` (
`place_map_id` int(11) NOT NULL AUTO_INCREMENT,
`place_map_place_id` int(11) NOT NULL DEFAULT '0',
`place_map_map_id` int(11) NOT NULL DEFAULT '0',
`place_map_active` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`place_map_id`),
KEY `branch_map_branch_id` (`place_map_place_id`,`place_map_map_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2176 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `place` (
`place_id` int(11) NOT NULL AUTO_INCREMENT,
`place_account_id` int(11) NOT NULL DEFAULT '0',
`place_name` varchar(120) DEFAULT NULL,
`place_alias` varchar(255) DEFAULT NULL,
`place_description` text,
`place_address_line_one` varchar(100) DEFAULT NULL,
`place_address_line_two` varchar(100) DEFAULT NULL,
`place_address_line_three` varchar(100) DEFAULT NULL,
`place_address_town` varchar(100) DEFAULT NULL,
`place_address_county` int(11) NOT NULL DEFAULT '0',
`place_address_postcode` varchar(10) DEFAULT NULL,
`place_address_latitude` decimal(11,8) DEFAULT NULL,
`place_address_longitude` decimal(11,8) DEFAULT NULL,
`place_phone` varchar(20) DEFAULT NULL,
`place_email` varchar(255) DEFAULT NULL,
`place_website` varchar(120) DEFAULT NULL,
`place_flag_initial_email` tinyint(1) NOT NULL DEFAULT '0',
`place_audit_admin_id` int(11) NOT NULL DEFAULT '0',
`place_last_audit_datetime` datetime DEFAULT NULL,
`place_created_by_admin_id` int(11) NOT NULL DEFAULT '0',
`place_created` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`place_tried_google` int(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`place_id`),
KEY `place_account_id` (`place_account_id`),
KEY `place_address_county` (`place_address_county`),
KEY `place_alias` (`place_alias`),
KEY `place_audit_admin_id` (`place_audit_admin_id`),
KEY `place_created_by_admin_id` (`place_created_by_admin_id`),
FULLTEXT KEY `place_name` (`place_name`),
FULLTEXT KEY `place_keyword_search` (`place_name`,`place_address_town`),
FULLTEXT KEY `place_address_town` (`place_address_town`)
) ENGINE=InnoDB AUTO_INCREMENT=135167 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `place_category` (
`place_category_id` int(11) NOT NULL AUTO_INCREMENT,
`place_category_place_id` int(11) NOT NULL DEFAULT '0',
`place_category_category_id` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`place_category_id`),
UNIQUE KEY `place_category_place_id` (`place_category_place_id`,`place_category_category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=208987 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `category` (
`category_id` int(11) NOT NULL AUTO_INCREMENT,
`category_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=168 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `review` (
`review_id` int(11) NOT NULL AUTO_INCREMENT,
`review_user_id` int(11) NOT NULL DEFAULT '0',
`review_place_id` int(11) NOT NULL DEFAULT '0',
`review_map_item_id` int(11) NOT NULL DEFAULT '0',
`review_otm_item_name` varchar(156) DEFAULT NULL,
`review_headline` varchar(255) DEFAULT NULL,
`review_message` text,
`review_rating` tinyint(1) NOT NULL DEFAULT '0',
`review_datetime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`review_edited_datetime` datetime DEFAULT NULL,
`review_hidden` tinyint(1) NOT NULL DEFAULT '0',
`review_deleted` tinyint(1) NOT NULL DEFAULT '0',
`review_status` set('pending','published','hidden','deleted') NOT NULL,
PRIMARY KEY (`review_id`),
KEY `review_map_item_id` (`review_map_item_id`),
KEY `review_place_id` (`review_place_id`),
KEY `review_user_id` (`review_user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `map_price` (
`map_price_id` int(11) NOT NULL AUTO_INCREMENT,
`map_price_item_id` int(11) NOT NULL DEFAULT '0',
`map_price_label` varchar(50) DEFAULT NULL,
`map_price_value` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`map_price_id`),
KEY `map_price_item_id` (`map_price_item_id`)
) ENGINE=InnoDB AUTO_INCREMENT=5872 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
CREATE TABLE `county_list` (
`county_id` int(11) NOT NULL AUTO_INCREMENT,
`county_country_id` int(11) NOT NULL DEFAULT '0',
`county_name` varchar(120) DEFAULT NULL,
`county_alias` varchar(120) DEFAULT NULL,
PRIMARY KEY (`county_id`),
KEY `county_alias` (`county_alias`),
KEY `county_country_id` (`county_country_id`)
) ENGINE=InnoDB AUTO_INCREMENT=142 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
Look at these lines:
LEFT JOIN map_section_item ON map_section_item_item_id = map_item_id
| 1 | SIMPLE | map_section_item | index | NULL | map_section_item_section_id | 8 | NULL | 5330 | Using where; Using index; Using join buffer (Block Nested Loop) |
Notice "5330". That means it had to search about 5330 items to find the row it needed.
With a simple INDEX(map_section_item_item_id), it would go directly to the one (or few) row it needed. This would make the query run a lot faster.
Repeat for each other JOIN, at least for those with a "Rows" > 1.
Why LEFT? Is each "right" table optionally missing data?
A side issue: Don't prefix everything with the table name; it is too much clutter.
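The index suggestion above can be sketched as follows. The three tables are the ones whose EXPLAIN rows show "Using join buffer (Block Nested Loop)"; in each case the join column is either not indexed at all or is only the second column of a composite key. The index names are illustrative, not taken from the original schema.

```sql
-- map_section_item is joined on map_section_item_item_id, which is only the
-- second column of the existing composite key.
ALTER TABLE map_section_item ADD INDEX idx_msi_item (map_section_item_item_id);

-- map_item_flag has no secondary indexes at all.
ALTER TABLE map_item_flag ADD INDEX idx_mif_item (map_item_flag_item_id);

-- place_map is joined on place_map_map_id, the second column of
-- branch_map_branch_id.
ALTER TABLE place_map ADD INDEX idx_pm_map (place_map_map_id);
```

After adding them, re-run EXPLAIN and check whether those rows change from index/ALL to ref with a small row estimate.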
For MySQL, try using the STRAIGHT_JOIN clause...
SELECT STRAIGHT_JOIN map_item_name
FROM map_item
LEFT JOIN ...
STRAIGHT_JOIN tells MySQL to do the query in the order I've listed. This way it forces the map_item as the primary table and all the rest as lookup secondary tables...

how to convert a field to string in a mysql join

I am trying to join a field that is an int(13) to a field that is a varchar(50).
If I only use (a.id = b.id) the DESCRIBE says type: ref.
If I use (a.id = CONCAT(b.id)) the DESCRIBE says type: eq_ref. (where b.id is the integer)
The use of CONCAT to cast a field is ugly, so I tried to use CAST() or CONVERT().
If I use (a.id = CAST(b.id AS CHAR(50))) the DESCRIBE says type: ref.
How do I write a correct cast/convert that gives an eq_ref join?
UPDATE 1:
DESCRIBE SELECT.. with CONCAT
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
| 1 | SIMPLE | ext_icecat_prodmatch | ref | PRIMARY,our_article_id,product_id | our_article_id | 152 | const | 3016 | Using index condition; Using temporary |
| 1 | SIMPLE | ext_icecat_product | eq_ref | PRIMARY,product_id | PRIMARY | 4 | ext_icecat_prodmatch.product_id | 1 | |
| 1 | SIMPLE | ext_icecat_supplier | eq_ref | PRIMARY | PRIMARY | 4 | ext_icecat_product.supplier_id | 1 | |
| 1 | SIMPLE | products | eq_ref | PRIMARY | PRIMARY | 152 | ext_icecat_prodmatch.our_article_id | 1 | |
| 1 | SIMPLE | partner_product_saved | eq_ref | PRIMARY | PRIMARY | 155 | const,func | 1 | Using where |
| 1 | SIMPLE | category_names | eq_ref | PRIMARY | PRIMARY | 6 | products.category_id,const | 1 | Using where |
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
The Select:
SELECT
partner_product_saved.*,
ext_icecat_product.product_id,
CONCAT(ext_icecat_supplier.name, ' ', ext_icecat_product.name) AS export_product_name,
ext_icecat_product.catid_match AS category_id,
GROUP_CONCAT(ext_icecat_prodmatch.our_article_id) AS oais,
products.file_name,
category_names.category_path
FROM ext_icecat_product
LEFT JOIN ext_icecat_prodmatch USING (product_id)
LEFT JOIN ext_icecat_supplier USING (supplier_id)
LEFT JOIN products USING (our_article_id)
LEFT JOIN partner_product_saved ON (partner_product_saved.partner_id = 29 AND partner_product_saved.product_id = CONCAT(ext_icecat_product.product_id))
LEFT JOIN category_names ON (category_names.category_id = products.category_id AND category_names.language_id = 2)
WHERE ext_icecat_prodmatch.our_article_id = '0EF03850-D25A-1174-BCDC-EC67352010A6'
GROUP BY ext_icecat_product.product_id
ORDER BY NULL;
SHOW CREATE TABLE
CREATE TABLE `partner_product_saved` (
`partner_id` mediumint(8) NOT NULL,
`product_id` varchar(50) CHARACTER SET utf8 NOT NULL,
`product_name` varchar(100) CHARACTER SET utf8 NOT NULL,
`our_article_id` varchar(50) CHARACTER SET utf8 DEFAULT NULL,
`our_category_id` mediumint(8) DEFAULT NULL,
`manufacture_id` mediumint(8) DEFAULT NULL,
`manufacturer_partnr` varchar(255) COLLATE utf8_bin NOT NULL,
`manufacturer_upc` varchar(255) COLLATE utf8_bin NOT NULL,
`image` tinytext COLLATE utf8_bin NOT NULL,
`image_small` tinytext COLLATE utf8_bin NOT NULL,
`image_big` tinytext COLLATE utf8_bin NOT NULL,
`image_200` tinytext COLLATE utf8_bin NOT NULL,
`image_original` tinytext COLLATE utf8_bin NOT NULL,
`image_width` int(11) DEFAULT NULL,
`image_height` int(11) DEFAULT NULL,
`birth` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`last_updated` timestamp NULL DEFAULT NULL,
`saved` tinyint(3) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`partner_id`,`product_id`),
KEY `our_article_id` (`our_article_id`),
KEY `our_category_id` (`our_category_id`),
KEY `manufacture_id` (`manufacture_id`,`manufacturer_partnr`),
KEY `manufacturer_upc` (`manufacturer_upc`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE `ext_icecat_product` (
`product_id` int(13) NOT NULL,
`supplier_id` int(13) NOT NULL DEFAULT '0',
`prod_id` varchar(235) COLLATE utf8_bin NOT NULL DEFAULT '',
`prod_id_clean` varchar(255) CHARACTER SET utf8 NOT NULL,
`catid` int(13) NOT NULL DEFAULT '0',
`catid_match` varchar(50) CHARACTER SET utf8 NOT NULL,
`name` varchar(255) CHARACTER SET utf8 NOT NULL,
`name_clean` varchar(255) CHARACTER SET utf8 NOT NULL,
`low_pic` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '',
`high_pic` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '',
`thumb_pic` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`family_id` int(13) NOT NULL DEFAULT '0',
`low_pic_size` int(13) DEFAULT '0',
`high_pic_size` int(13) DEFAULT '0',
`thumb_pic_size` int(13) DEFAULT '0',
`import_date` datetime NOT NULL,
`release_date` datetime NOT NULL,
`updated` datetime NOT NULL,
`need_update` tinyint(1) NOT NULL DEFAULT '0',
`deleted` tinyint(1) NOT NULL DEFAULT '0',
`keyword` tinyint(1) NOT NULL DEFAULT '0',
`special_match` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`product_id`),
KEY `supplier_id` (`supplier_id`),
KEY `catid` (`catid`),
KEY `prod_id` (`prod_id`),
KEY `product_id` (`product_id`,`prod_id`,`supplier_id`),
KEY `release_Date` (`release_date`),
KEY `prod_id_clean` (`prod_id_clean`),
KEY `name_clean` (`name_clean`),
KEY `need_update` (`need_update`),
KEY `deleted` (`deleted`),
KEY `keyword` (`keyword`),
KEY `catid_2` (`catid`,`import_date`),
KEY `catid_match` (`catid_match`),
KEY `special_match` (`special_match`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
WHERE indexed_column = any_function(any_column) -- can use index.
WHERE non_indexed_column = any_function(indexed_column) -- cannot use index.
The difference between ref and eq_ref is minor. I think that eq_ref is where the optimizer decides that there cannot be more than one match, often because of UNIQUE.
WHERE ext_icecat_prodmatch.our_article_id = '0EF03850-D25A-1174-BCDC-EC67352010A6' -- is our_article_id INDEXed? or UNIQUE? Sounds like it is only an INDEX, so multiple rows might ensue. To make it eq_ref, you need UNIQUE. But only if the data supports such. The stats imply there might be 3016 rows with that article_id.
Do not use LEFT unless you need it. Note how the Optimizer turned LEFT JOIN ext_icecat_prodmatch USING (product_id) into JOIN and decided (rightly) to start with ext_icecat_prodmatch.
Back to other discussions...
AND partner_product_saved.product_id = CONCAT(ext_icecat_product.product_id))
can go one way, but not the other. That is, it can efficiently go from ext_icecat_product (eip) to partner_product_saved (pps), but not the other way. EXPLAIN indicates this with const,func in the ref column.
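The rule about which side of the comparison carries the function can be illustrated with the two tables from the question. This is a simplified sketch, not the original query: the aliases eip and pps and the reduced select list are assumptions made for illustration, and the actual plan should be confirmed with EXPLAIN.

```sql
-- Function applied to the integer column: the lookup into
-- partner_product_saved can still use its PRIMARY KEY (partner_id, product_id).
SELECT pps.product_name
FROM ext_icecat_product AS eip
JOIN partner_product_saved AS pps
  ON pps.partner_id = 29
 AND pps.product_id = CONCAT(eip.product_id);

-- Function applied to the indexed varchar column: the product_id part of that
-- PRIMARY KEY can no longer be used for the lookup. At best the optimizer uses
-- partner_id alone, or drives the join from partner_product_saved instead.
SELECT pps.product_name
FROM ext_icecat_product AS eip
JOIN partner_product_saved AS pps
  ON pps.partner_id = 29
 AND CAST(pps.product_id AS UNSIGNED) = eip.product_id;
```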

How to optimize this query in MySQL

I have these two tables (Moodle 2.8):
CREATE TABLE `mdl_course` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`category` bigint(10) NOT NULL DEFAULT '0',
`sortorder` bigint(10) NOT NULL DEFAULT '0',
`fullname` varchar(254) NOT NULL DEFAULT '',
`shortname` varchar(255) NOT NULL DEFAULT '',
`idnumber` varchar(100) NOT NULL DEFAULT '',
`summary` longtext,
`summaryformat` tinyint(2) NOT NULL DEFAULT '0',
`format` varchar(21) NOT NULL DEFAULT 'topics',
`showgrades` tinyint(2) NOT NULL DEFAULT '1',
`newsitems` mediumint(5) NOT NULL DEFAULT '1',
`startdate` bigint(10) NOT NULL DEFAULT '0',
`marker` bigint(10) NOT NULL DEFAULT '0',
`maxbytes` bigint(10) NOT NULL DEFAULT '0',
`legacyfiles` smallint(4) NOT NULL DEFAULT '0',
`showreports` smallint(4) NOT NULL DEFAULT '0',
`visible` tinyint(1) NOT NULL DEFAULT '1',
`visibleold` tinyint(1) NOT NULL DEFAULT '1',
`groupmode` smallint(4) NOT NULL DEFAULT '0',
`groupmodeforce` smallint(4) NOT NULL DEFAULT '0',
`defaultgroupingid` bigint(10) NOT NULL DEFAULT '0',
`lang` varchar(30) NOT NULL DEFAULT '',
`theme` varchar(50) NOT NULL DEFAULT '',
`timecreated` bigint(10) NOT NULL DEFAULT '0',
`timemodified` bigint(10) NOT NULL DEFAULT '0',
`requested` tinyint(1) NOT NULL DEFAULT '0',
`enablecompletion` tinyint(1) NOT NULL DEFAULT '0',
`completionnotify` tinyint(1) NOT NULL DEFAULT '0',
`cacherev` bigint(10) NOT NULL DEFAULT '0',
`calendartype` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `mdl_cour_cat_ix` (`category`),
KEY `mdl_cour_idn_ix` (`idnumber`),
KEY `mdl_cour_sho_ix` (`shortname`),
KEY `mdl_cour_sor_ix` (`sortorder`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `mdl_log` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`time` bigint(10) NOT NULL DEFAULT '0',
`userid` bigint(10) NOT NULL DEFAULT '0',
`ip` varchar(45) NOT NULL DEFAULT '',
`course` bigint(10) NOT NULL DEFAULT '0',
`module` varchar(20) NOT NULL DEFAULT '',
`cmid` bigint(10) NOT NULL DEFAULT '0',
`action` varchar(40) NOT NULL DEFAULT '',
`url` varchar(100) NOT NULL DEFAULT '',
`info` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `mdl_log_coumodact_ix` (`course`,`module`,`action`),
KEY `mdl_log_tim_ix` (`time`),
KEY `mdl_log_act_ix` (`action`),
KEY `mdl_log_usecou_ix` (`userid`,`course`),
KEY `mdl_log_cmi_ix` (`cmid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And this query:
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
INNER JOIN mdl_course c ON l.course = c.id AND c.category <> 0
WHERE
l.id > [some large id]
AND
l.time > [some unix timestamp]
ORDER BY l.id ASC
LIMIT 0,200
The mdl_log table has over 200 million records, and I need to export it to a file using PHP without the export dying in the attempt. The main problem is that executing this query is too slow. The main killer is the join to the mdl_course table; if I remove it, everything runs fast.
Here is the explain:
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
| 1 | SIMPLE | c | range | PRIMARY,mdl_cour_cat_ix | mdl_cour_cat_ix | 8 | NULL | 3152 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | l | ref | PRIMARY,mdl_log_coumodact_ix,mdl_log_tim_ix | mdl_log_coumodact_ix | 8 | xray2qasb.c.id | 618 | Using index condition; Using where |
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
Is there any way to remove usage of temporary and filesort? What do you propose here?
After some testing this query works fast as expected:
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
WHERE
l.id > 123456
AND
l.time > 1234
AND
EXISTS (SELECT * FROM mdl_course c WHERE l.course = c.id AND c.category <> 0 )
ORDER BY l.id ASC
LIMIT 0,200
Thanks to JamieD77 for his suggestion!
execution plan:
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
| 1 | PRIMARY | l | range | PRIMARY,mdl_log_tim_ix | PRIMARY | 8 | NULL | 99962199 | Using where |
| 2 | DEPENDENT SUBQUERY | c | eq_ref | PRIMARY,mdl_cour_cat_ix | PRIMARY | 8 | xray2qasb.l.course | 1 | Using where |
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
Try moving the category selection outside the JOIN. Here I put it in an IN() which the engine will cache on successive runs. I don't have 200M rows to test on, so YMMV.
DESCRIBE
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
WHERE
l.id > 1234567890
AND
l.time > 1234567890
AND
l.course IN (SELECT c.id FROM mdl_course c WHERE c.category > 0)
ORDER BY l.id ASC
LIMIT 0,200;
(In addition to using EXISTS...)
l.id > 123456 AND l.time > 1234
seems to beg for a 2-dimensional index.
99962199 -- the table is very big, correct?
Consider PARTITION BY RANGE on mdl_log on time. But...
Don't have more than about 50 partitions; other inefficiencies kick in then.
Partitioning probably won't help if id and time are more or less in lock-step. Typical case: id is AUTO_INCREMENT and time is approximately the time of the INSERT.
If that applies, consider:
PRIMARY KEY(time, id) -- see below
INDEX(id) -- Yes, this is sufficient for `id AUTO_INCREMENT`.
With those indexes, you could efficiently do
WHERE time > ...
ORDER BY time, id
which is probably what you really wanted.
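The index change described above could look like the sketch below. It assumes id is AUTO_INCREMENT and time grows roughly in step with it, as the answer supposes. The index name mdl_log_id_ix and the literal 1234 are placeholders. On a table of this size the ALTER rebuilds the whole table, so it is worth trying on a copy first.

```sql
-- Cluster the table by time, keeping a plain index on id so that the
-- AUTO_INCREMENT column is still the first column of some index.
ALTER TABLE mdl_log
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (`time`, id),
    ADD INDEX mdl_log_id_ix (id);

-- Range scan in primary-key order: no temporary table or filesort is needed
-- for this ORDER BY.
SELECT l.id, l.userid, l.course, l.time
FROM mdl_log AS l
WHERE l.time > 1234
ORDER BY l.time, l.id
LIMIT 200;
```

With (time, id) as the primary key, the existing mdl_log_tim_ix index on time alone becomes redundant.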

JOIN very slow when using RIGHT JOIN on this query

I'm having a problem with this query that takes several seconds to complete. I already tried many optimizations but I'm shooting blanks at this point.
The tables are the following (they are not fully normalized, especially the tracks table):
CREATE TABLE `tracks` (
`id` int(14) unsigned NOT NULL AUTO_INCREMENT,
`artist` varchar(200) NOT NULL,
`track` varchar(200) NOT NULL,
`album` varchar(200) NOT NULL,
`path` text NOT NULL,
`tags` text NOT NULL,
`priority` int(10) NOT NULL DEFAULT '0',
`lastplayed` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`lastrequested` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`usable` int(1) NOT NULL DEFAULT '0',
`accepter` varchar(200) NOT NULL DEFAULT '',
`lasteditor` varchar(200) NOT NULL DEFAULT '',
`hash` varchar(40) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `hash` (`hash`),
FULLTEXT KEY `searchindex` (`tags`,`artist`,`track`,`album`),
FULLTEXT KEY `artist` (`artist`,`track`,`album`,`tags`)
) ENGINE=MyISAM AUTO_INCREMENT=3336 DEFAULT CHARSET=utf8
CREATE TABLE `esong` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`hash` varchar(40) COLLATE utf8_bin NOT NULL,
`len` int(10) unsigned NOT NULL,
`meta` text COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `hash` (`hash`)
) ENGINE=InnoDB AUTO_INCREMENT=16032 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
CREATE TABLE `efave` (
`id` int(10) unsigned NOT NULL DEFAULT '0',
`inick` int(10) unsigned NOT NULL,
`isong` int(10) unsigned NOT NULL,
UNIQUE KEY `inick` (`inick`,`isong`),
KEY `isong` (`isong`),
CONSTRAINT `inick` FOREIGN KEY (`inick`) REFERENCES `enick` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `isong` FOREIGN KEY (`isong`) REFERENCES `esong` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `enick` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`nick` varchar(30) COLLATE utf8_bin NOT NULL,
`dta` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`dtb` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
KEY `nick` (`nick`)
) ENGINE=InnoDB AUTO_INCREMENT=488 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
and the query I'm trying to execute with a normal speed is the following
SELECT esong.meta, tracks.id
FROM tracks
RIGHT JOIN esong ON tracks.hash = esong.hash
JOIN efave ON efave.isong = esong.id
JOIN enick ON efave.inick = enick.id
WHERE enick.nick = lower('nickname');
If you change the RIGHT JOIN to a plain JOIN, it is fast.
The EXPLAIN gives me this result, it seems there is a small problem in the efave selection but I have no idea how to get that out
+----+-------------+--------+--------+---------------+---------+---------+-----------------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+--------+---------------+---------+---------+-----------------------+------+----------+--------------------------+
| 1 | SIMPLE | enick | ref | PRIMARY,nick | nick | 92 | const | 1 | 100.00 | Using where; Using index |
| 1 | SIMPLE | efave | ref | inick,isong | inick | 4 | radiosite.enick.id | 12 | 100.00 | Using index |
| 1 | SIMPLE | esong | eq_ref | PRIMARY | PRIMARY | 4 | radiosite.efave.isong | 1 | 100.00 | |
| 1 | SIMPLE | tracks | ALL | hash | NULL | NULL | NULL | 3210 | 100.00 | |
+----+-------------+--------+--------+---------------+---------+---------+-----------------------+------+----------+--------------------------+
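Your EXPLAIN looks mostly clean. The one thing that stands out to me is that esong.hash uses the utf8_bin collation, while the tracks table doesn't specify a collation, which means it is using the default for its character set (utf8_general_ci for utf8). The tracks row in your EXPLAIN is read with type ALL and no key, even though hash is listed in possible_keys, which would fit a collation mismatch on the join column. Try aligning the collations of the two hash columns and see how the join performs.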
Have you checked your execution plan? If not, run your query with the plan included. Your RIGHT JOIN may be doing an index scan instead of an index seek, or you may be missing indexes. Either way, you need to look at the execution plan so you can optimize the query properly. No one will really be able to tell you how to make it faster using a RIGHT JOIN (or a JOIN, for that matter) until you know what the real problem is. Here are some links:
For MySQL: http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
For SqlServer: http://www.sql-server-performance.com/2006/query-execution-plan-analysis/

More Odd MySQL Behavior - Query Optimization Help

We have a central login that we use to support multiple websites. To store our users' data we have an accounts table which stores each user account and then users tables for each site for site specific information. We also have a simple connections table which stores the connections between users.
We noticed that one query that is joining the tables on their primary key user_id is executing slowly. I'm hoping that some SQL expert out there can explain why it's using WHERE to search the users_site1 table and suggest how we can optimize it. Here is the slow query & the explain results:
mysql> explain select a.username,a.first_name,a.last_name,a.organization_name,a.organization,a.city,a.state,a.zip,a.country,a.profile_photo,a.facebook_id,a.twitter_id,u.reviews from accounts a join users_site1 u ON a.user_id=u.user_id where a.user_id IN (select cid2 from connections where cid1=10001006 AND type="MM" AND status="A") OR a.user_id IN (select cid1 from connections where cid2=10001006 AND type="MM" AND status="A") order by RAND() LIMIT 4;
+----+--------------------+-------------+--------+-------------------+---------+---------+-----------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------+--------+-------------------+---------+---------+-----------------------+-------+----------------------------------------------+
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 79783 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | a | eq_ref | PRIMARY | PRIMARY | 4 | exampledb.u.user_id | 1 | |
| 3 | DEPENDENT SUBQUERY | connections | ref | PRIMARY,cid1,cid2 | cid2 | 6 | const,const | 2 | Using where |
| 2 | DEPENDENT SUBQUERY | connections | ref | PRIMARY,cid1,cid2 | cid1 | 6 | const,const | 1 | Using where |
+----+--------------------+-------------+--------+-------------------+---------+---------+-----------------------+-------+----------------------------------------------+
4 rows in set (0.00 sec)
Here are the definitions for each table:
CREATE TABLE `accounts` (
`user_id` int(9) unsigned NOT NULL AUTO_INCREMENT,
`username` varchar(40) DEFAULT NULL,
`facebook_id` bigint(15) unsigned DEFAULT NULL,
`facebook_username` varchar(30) DEFAULT NULL,
`password` varchar(20) DEFAULT NULL,
`profile_photo` varchar(100) DEFAULT NULL,
`first_name` varchar(40) DEFAULT NULL,
`middle_name` varchar(40) DEFAULT NULL,
`last_name` varchar(40) DEFAULT NULL,
`suffix_name` char(3) DEFAULT NULL,
`organization_name` varchar(100) DEFAULT NULL,
`organization` tinyint(1) unsigned DEFAULT NULL,
`address` varchar(200) DEFAULT NULL,
`city` varchar(40) DEFAULT NULL,
`state` varchar(20) DEFAULT NULL,
`zip` varchar(10) DEFAULT NULL,
`province` varchar(40) DEFAULT NULL,
`country` int(3) DEFAULT NULL,
`latitude` decimal(11,7) DEFAULT NULL,
`longitude` decimal(12,7) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
`sex` char(1) DEFAULT NULL,
`birthday` date DEFAULT NULL,
`about_me` varchar(2000) DEFAULT NULL,
`activities` varchar(300) DEFAULT NULL,
`website` varchar(100) DEFAULT NULL,
`email` varchar(150) DEFAULT NULL,
`referrer` int(4) unsigned DEFAULT NULL,
`referredid` int(9) unsigned DEFAULT NULL,
`verify` int(6) DEFAULT NULL,
`status` char(1) DEFAULT 'R',
`created` datetime DEFAULT NULL,
`verified` datetime DEFAULT NULL,
`activated` datetime DEFAULT NULL,
`network` datetime DEFAULT NULL,
`deleted` datetime DEFAULT NULL,
`logins` int(6) unsigned DEFAULT '0',
`api_logins` int(6) unsigned DEFAULT '0',
`last_login` datetime DEFAULT NULL,
`last_update` datetime DEFAULT NULL,
`private` tinyint(1) unsigned DEFAULT NULL,
`ip` varchar(20) DEFAULT NULL,
PRIMARY KEY (`user_id`),
UNIQUE KEY `username` (`username`),
KEY `facebook_id` (`facebook_id`),
KEY `status` (`status`),
KEY `state` (`state`)
);
CREATE TABLE `users_site1` (
`user_id` int(9) unsigned NOT NULL,
`facebook_id` bigint(15) unsigned DEFAULT NULL,
`facebook_username` varchar(30) DEFAULT NULL,
`facebook_publish` tinyint(1) unsigned DEFAULT NULL,
`facebook_checkin` tinyint(1) unsigned DEFAULT NULL,
`facebook_offline` varchar(300) DEFAULT NULL,
`twitter_id` varchar(60) DEFAULT NULL,
`twitter_secret` varchar(50) DEFAULT NULL,
`twitter_username` varchar(20) DEFAULT NULL,
`type` char(1) DEFAULT 'M',
`referrer` int(4) unsigned DEFAULT NULL,
`referredid` int(9) unsigned DEFAULT NULL,
`session` varchar(60) DEFAULT NULL,
`api_session` varchar(60) DEFAULT NULL,
`status` char(1) DEFAULT 'R',
`created` datetime DEFAULT NULL,
`verified` datetime DEFAULT NULL,
`activated` datetime DEFAULT NULL,
`deleted` datetime DEFAULT NULL,
`logins` int(6) unsigned DEFAULT '0',
`api_logins` int(6) unsigned DEFAULT '0',
`last_login` datetime DEFAULT NULL,
`last_update` datetime DEFAULT NULL,
`ip` varchar(20) DEFAULT NULL,
PRIMARY KEY (`user_id`)
);
CREATE TABLE `connections` (
`cid1` int(9) unsigned NOT NULL DEFAULT '0',
`cid2` int(9) unsigned NOT NULL DEFAULT '0',
`cid3` int(9) unsigned NOT NULL DEFAULT '0',
`type` char(2) NOT NULL,
`status` char(1) NOT NULL,
`created` datetime DEFAULT NULL,
`updated` datetime DEFAULT NULL,
PRIMARY KEY (`cid1`,`cid2`,`type`,`cid3`),
KEY `cid1` (`cid1`,`type`),
KEY `cid2` (`cid2`,`type`)
);
Instead of WHERE a.user_id IN( ... ) OR a.user_id IN( ... ) you should use another join:
select
a.username,a.first_name,a.last_name,a.organization_name,a.organization,a.city,
a.state,a.zip,a.country,a.profile_photo,a.facebook_id,a.twitter_id,u.reviews
from accounts a
join users_site1 u ON a.user_id=u.user_id
join ( select cid2 as id from connections
where cid1=10001006 AND type="MM" AND status="A"
union
select cid1 as id from connections
where cid2=10001006 AND type="MM" AND status="A" ) c
on a.user_id = c.id
order by RAND() LIMIT 4;
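To sanity-check that the rewrite returns the same rows as the original IN ... OR IN ... form, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. The trimmed-down schema and the sample rows are made up for illustration, and ORDER BY RAND() LIMIT 4 is left out so the two result sets can be compared directly. It only checks equivalence of results; SQLite's planner says nothing about how MySQL will execute either form.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE users_site1 (user_id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE connections (
        cid1 INTEGER NOT NULL,
        cid2 INTEGER NOT NULL,
        cid3 INTEGER NOT NULL DEFAULT 0,
        type TEXT NOT NULL,
        status TEXT NOT NULL,
        PRIMARY KEY (cid1, cid2, type, cid3)
    );
""")

ME = 10001006
# Hypothetical sample data.
users = [(ME, "me"), (1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")]
conn.executemany("INSERT INTO accounts VALUES (?, ?)", users)
conn.executemany("INSERT INTO users_site1 VALUES (?, 'A')", [(uid,) for uid, _ in users])
conn.executemany(
    "INSERT INTO connections (cid1, cid2, type, status) VALUES (?, ?, ?, ?)",
    [
        (ME, 1, "MM", "A"),  # alice, linked in one direction
        (1, ME, "MM", "A"),  # alice again, linked in the other direction
        (2, ME, "MM", "A"),  # bob
        (ME, 3, "MM", "P"),  # carol: wrong status, must be excluded
        (4, ME, "XX", "A"),  # dave: wrong type, must be excluded
    ],
)

# Original form: two IN subqueries combined with OR.
in_or_rows = conn.execute("""
    SELECT a.username
    FROM accounts a
    JOIN users_site1 u ON a.user_id = u.user_id
    WHERE a.user_id IN (SELECT cid2 FROM connections
                        WHERE cid1 = ? AND type = 'MM' AND status = 'A')
       OR a.user_id IN (SELECT cid1 FROM connections
                        WHERE cid2 = ? AND type = 'MM' AND status = 'A')
    ORDER BY a.username
""", (ME, ME)).fetchall()

# Rewritten form: join against a UNION of both directions.
# UNION (not UNION ALL) removes duplicates, so alice appears only once.
union_rows = conn.execute("""
    SELECT a.username
    FROM accounts a
    JOIN users_site1 u ON a.user_id = u.user_id
    JOIN (SELECT cid2 AS id FROM connections
          WHERE cid1 = ? AND type = 'MM' AND status = 'A'
          UNION
          SELECT cid1 AS id FROM connections
          WHERE cid2 = ? AND type = 'MM' AND status = 'A') c
      ON a.user_id = c.id
    ORDER BY a.username
""", (ME, ME)).fetchall()

print(in_or_rows)
print(union_rows)
```

Note that the rewrite relies on UNION removing duplicates. With UNION ALL, a user connected in both directions would be joined twice and show up twice in the result.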
Have you tried removing ORDER BY RAND() and running it again?
My result is below:
+----+--------------------+-------------+----------------+-------------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------+----------------+-------------------+---------+---------+------------------+------+----------------------------------------------+
| 1 | PRIMARY | a | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where; Using join buffer |
| 3 | DEPENDENT SUBQUERY | connections | index_subquery | PRIMARY,cid1,cid2 | PRIMARY | 14 | func,const,const | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | connections | ref | PRIMARY,cid1,cid2 | PRIMARY | 14 | const,func,const | 1 | Using where |
+----+--------------------+-------------+----------------+-------------------+---------+---------+------------------+------+----------------------------------------------+
I am not a MySQL guru by any means, but I have been involved more than once in the optimization of high-performance applications, though I was more on the implementation end of the optimization process than on finding what needed to be optimized.
The first thing I see is that the subqueries seem efficient, but the way the main query is run with this WHERE clause: ... where a.user_id IN (select cid2 ...) or a.user_id IN (select cid1 from ...) is a performance killer, in my very humble opinion.
The first thing I would try in order to improve performance is join decomposition: split your request into 2 or even 3 queries. The code is less pretty, but the database will be able to work more efficiently. It is a myth that doing everything in one query is always better.
What can this bring you? Caching will be more efficient; if you are using MyISAM tables, the locking strategy is more efficient when you have fewer tables in your query; and you will reduce redundant row accesses. If you can get your main query (that would be the last one if you decompose) away from Using where; Using temporary; Using filesort, you will get a much faster response.
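As a rough illustration of join decomposition, here is a minimal Python sketch using the standard-library sqlite3 module as a stand-in for MySQL. The function names, the trimmed-down schema, and the sample rows are hypothetical. Picking the random ids in the application instead of with ORDER BY RAND() is my own assumption about one way to decompose, not something the query above requires.

```python
import random
import sqlite3


def connected_ids(conn, user_id):
    """Query 1: collect connected user ids from both directions."""
    rows = conn.execute(
        "SELECT cid2 FROM connections WHERE cid1 = ? AND type = 'MM' AND status = 'A' "
        "UNION "
        "SELECT cid1 FROM connections WHERE cid2 = ? AND type = 'MM' AND status = 'A'",
        (user_id, user_id),
    ).fetchall()
    return [row[0] for row in rows]


def random_connection_profiles(conn, user_id, limit=4, rng=random):
    """Query 2: fetch profiles for a random subset, by primary key only."""
    ids = connected_ids(conn, user_id)
    if not ids:
        return []
    # Random pick happens in the application, so the database needs
    # no temporary table or filesort for ORDER BY RAND().
    chosen = rng.sample(ids, min(limit, len(ids)))
    placeholders = ",".join("?" * len(chosen))
    return conn.execute(
        "SELECT a.user_id, a.username "
        "FROM accounts a JOIN users_site1 u ON a.user_id = u.user_id "
        f"WHERE a.user_id IN ({placeholders})",
        chosen,
    ).fetchall()


# Hypothetical sample data for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE users_site1 (user_id INTEGER PRIMARY KEY);
    CREATE TABLE connections (
        cid1 INTEGER NOT NULL,
        cid2 INTEGER NOT NULL,
        cid3 INTEGER NOT NULL DEFAULT 0,
        type TEXT NOT NULL,
        status TEXT NOT NULL,
        PRIMARY KEY (cid1, cid2, type, cid3)
    );
""")
ME = 10001006
for uid in [ME, 1, 2, 3, 4, 5, 6]:
    conn.execute("INSERT INTO accounts VALUES (?, ?)", (uid, f"user{uid}"))
    conn.execute("INSERT INTO users_site1 VALUES (?)", (uid,))
for other in (1, 2, 3):
    conn.execute(
        "INSERT INTO connections (cid1, cid2, type, status) VALUES (?, ?, 'MM', 'A')",
        (ME, other),
    )
for other in (4, 5, 6):
    conn.execute(
        "INSERT INTO connections (cid1, cid2, type, status) VALUES (?, ?, 'MM', 'A')",
        (other, ME),
    )

picked = random_connection_profiles(conn, ME, rng=random.Random(0))
print(picked)
```

One caveat with this sketch: if a chosen id has no row in users_site1, the second query returns fewer than limit rows. Whether that matters depends on how the two tables are kept in sync.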
Profile the different options you try with SHOW SESSION STATUS and FLUSH STATUS. You can also bypass the query cache to get a true comparison of the options by adding SQL_NO_CACHE to your query, i.e. SELECT SQL_NO_CACHE a.username ... etc.
Profiling and measuring the results is the only way you will be able to determine the performance gains. Unfortunately this step is often overlooked.
Good luck!