We have a central login that we use to support multiple websites. To store our users' data we have an accounts table which stores each user account and then users tables for each site for site specific information. We also have a simple connections table which stores the connections between users.
We noticed that one query that is joining the tables on their primary key user_id is executing slowly. I'm hoping that some SQL expert out there can explain why it's using WHERE to search the users_site1 table and suggest how we can optimize it. Here is the slow query & the explain results:
mysql> explain select a.username,a.first_name,a.last_name,a.organization_name,a.organization,a.city,a.state,a.zip,a.country,a.profile_photo,a.facebook_id,a.twitter_id,u.reviews from accounts a join users_site1 u ON a.user_id=u.user_id where a.user_id IN (select cid2 from connections where cid1=10001006 AND type="MM" AND status="A") OR a.user_id IN (select cid1 from connections where cid2=10001006 AND type="MM" AND status="A") order by RAND() LIMIT 4;
+----+--------------------+-------------+--------+-------------------+---------+---------+-----------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------+--------+-------------------+---------+---------+-----------------------+-------+----------------------------------------------+
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 79783 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | a | eq_ref | PRIMARY | PRIMARY | 4 | exampledb.u.user_id | 1 | |
| 3 | DEPENDENT SUBQUERY | connections | ref | PRIMARY,cid1,cid2 | cid2 | 6 | const,const | 2 | Using where |
| 2 | DEPENDENT SUBQUERY | connections | ref | PRIMARY,cid1,cid2 | cid1 | 6 | const,const | 1 | Using where |
+----+--------------------+-------------+--------+-------------------+---------+---------+-----------------------+-------+----------------------------------------------+
4 rows in set (0.00 sec)
Here are the definitions for each table:
CREATE TABLE `accounts` (
`user_id` int(9) unsigned NOT NULL AUTO_INCREMENT,
`username` varchar(40) DEFAULT NULL,
`facebook_id` bigint(15) unsigned DEFAULT NULL,
`facebook_username` varchar(30) DEFAULT NULL,
`password` varchar(20) DEFAULT NULL,
`profile_photo` varchar(100) DEFAULT NULL,
`first_name` varchar(40) DEFAULT NULL,
`middle_name` varchar(40) DEFAULT NULL,
`last_name` varchar(40) DEFAULT NULL,
`suffix_name` char(3) DEFAULT NULL,
`organization_name` varchar(100) DEFAULT NULL,
`organization` tinyint(1) unsigned DEFAULT NULL,
`address` varchar(200) DEFAULT NULL,
`city` varchar(40) DEFAULT NULL,
`state` varchar(20) DEFAULT NULL,
`zip` varchar(10) DEFAULT NULL,
`province` varchar(40) DEFAULT NULL,
`country` int(3) DEFAULT NULL,
`latitude` decimal(11,7) DEFAULT NULL,
`longitude` decimal(12,7) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
`sex` char(1) DEFAULT NULL,
`birthday` date DEFAULT NULL,
`about_me` varchar(2000) DEFAULT NULL,
`activities` varchar(300) DEFAULT NULL,
`website` varchar(100) DEFAULT NULL,
`email` varchar(150) DEFAULT NULL,
`referrer` int(4) unsigned DEFAULT NULL,
`referredid` int(9) unsigned DEFAULT NULL,
`verify` int(6) DEFAULT NULL,
`status` char(1) DEFAULT 'R',
`created` datetime DEFAULT NULL,
`verified` datetime DEFAULT NULL,
`activated` datetime DEFAULT NULL,
`network` datetime DEFAULT NULL,
`deleted` datetime DEFAULT NULL,
`logins` int(6) unsigned DEFAULT '0',
`api_logins` int(6) unsigned DEFAULT '0',
`last_login` datetime DEFAULT NULL,
`last_update` datetime DEFAULT NULL,
`private` tinyint(1) unsigned DEFAULT NULL,
`ip` varchar(20) DEFAULT NULL,
PRIMARY KEY (`user_id`),
UNIQUE KEY `username` (`username`),
KEY `facebook_id` (`facebook_id`),
KEY `status` (`status`),
KEY `state` (`state`)
);
CREATE TABLE `users_site1` (
`user_id` int(9) unsigned NOT NULL,
`facebook_id` bigint(15) unsigned DEFAULT NULL,
`facebook_username` varchar(30) DEFAULT NULL,
`facebook_publish` tinyint(1) unsigned DEFAULT NULL,
`facebook_checkin` tinyint(1) unsigned DEFAULT NULL,
`facebook_offline` varchar(300) DEFAULT NULL,
`twitter_id` varchar(60) DEFAULT NULL,
`twitter_secret` varchar(50) DEFAULT NULL,
`twitter_username` varchar(20) DEFAULT NULL,
`type` char(1) DEFAULT 'M',
`referrer` int(4) unsigned DEFAULT NULL,
`referredid` int(9) unsigned DEFAULT NULL,
`session` varchar(60) DEFAULT NULL,
`api_session` varchar(60) DEFAULT NULL,
`status` char(1) DEFAULT 'R',
`created` datetime DEFAULT NULL,
`verified` datetime DEFAULT NULL,
`activated` datetime DEFAULT NULL,
`deleted` datetime DEFAULT NULL,
`logins` int(6) unsigned DEFAULT '0',
`api_logins` int(6) unsigned DEFAULT '0',
`last_login` datetime DEFAULT NULL,
`last_update` datetime DEFAULT NULL,
`ip` varchar(20) DEFAULT NULL,
PRIMARY KEY (`user_id`)
);
CREATE TABLE `connections` (
`cid1` int(9) unsigned NOT NULL DEFAULT '0',
`cid2` int(9) unsigned NOT NULL DEFAULT '0',
`cid3` int(9) unsigned NOT NULL DEFAULT '0',
`type` char(2) NOT NULL,
`status` char(1) NOT NULL,
`created` datetime DEFAULT NULL,
`updated` datetime DEFAULT NULL,
PRIMARY KEY (`cid1`,`cid2`,`type`,`cid3`),
KEY `cid1` (`cid1`,`type`),
KEY `cid2` (`cid2`,`type`)
);
Instead of WHERE a.user_id IN ( ... ) OR a.user_id IN ( ... ), you should use another join:
select
a.username,a.first_name,a.last_name,a.organization_name,a.organization,a.city,
a.state,a.zip,a.country,a.profile_photo,a.facebook_id,a.twitter_id,u.reviews
from accounts a
join users_site1 u ON a.user_id=u.user_id
join ( select cid2 as id from connections
where cid1=10001006 AND type="MM" AND status="A"
union
select cid1 as id from connections
where cid2=10001006 AND type="MM" AND status="A" ) c
on a.user_id = c.id
order by RAND() LIMIT 4;
Have you tried removing ORDER BY RAND() and running the query again?
My result is below:
+----+--------------------+-------------+----------------+-------------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------+----------------+-------------------+---------+---------+------------------+------+----------------------------------------------+
| 1 | PRIMARY | a | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where; Using join buffer |
| 3 | DEPENDENT SUBQUERY | connections | index_subquery | PRIMARY,cid1,cid2 | PRIMARY | 14 | func,const,const | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | connections | ref | PRIMARY,cid1,cid2 | PRIMARY | 14 | const,func,const | 1 | Using where |
+----+--------------------+-------------+----------------+-------------------+---------+---------+------------------+------+----------------------------------------------+
I am not a MySQL guru by any means, but I have been involved more than once in optimizing high-performance applications, though I was more on the implementation end of the optimization process than on finding what needed to be optimized.
The first thing I see is that the subqueries seem efficient, but the way the first query is run with this WHERE clause: ... where a.user_id IN (select cid2 ...) or a.user_id IN (select cid1 from ...) is a performance killer, in my very humble opinion.
The first thing I would try is join decomposition: split your request into 2 or even 3 queries. The code is less pretty, but the database will be able to work more efficiently. It is a myth that doing everything in one query is always better.
What can this bring you? Caching will be more efficient; if you are using MyISAM tables, the locking strategy is more efficient when you have fewer tables in your query; and you will reduce redundant row accesses. If you can get your main query (that would be the last one if you decompose) away from Using where; Using temporary; Using filesort, you will get a much faster response.
Profile the different options you try with SHOW SESSION STATUS and FLUSH STATUS. You can also disable caching to get a true comparison of the options by adding SQL_NO_CACHE to your query, i.e. SELECT SQL_NO_CACHE a.username ... etc.
Profiling and measuring the results is the only way you will be able to determine the performance gains. Unfortunately this step is often overlooked.
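The decomposition and profiling steps described above could look roughly like the sketch below. It is only an illustration under assumptions: the ids in the second statement are made-up placeholders for whatever the first statement returns, choosing four of them at random is assumed to happen in application code, and only columns that appear in the posted table definitions are selected. SQL_NO_CACHE matters only on servers that still have a query cache (the query cache was removed in MySQL 8.0).

```sql
-- Reset the session status counters before each measurement.
FLUSH STATUS;

-- Query 1: fetch the connected user ids; each branch can use its own index.
SELECT SQL_NO_CACHE cid2 AS id
FROM connections
WHERE cid1 = 10001006 AND type = 'MM' AND status = 'A'
UNION
SELECT cid1 AS id
FROM connections
WHERE cid2 = 10001006 AND type = 'MM' AND status = 'A';

-- Query 2: the application picks up to 4 ids from the result above
-- and fetches the profile rows by primary key.
SELECT SQL_NO_CACHE a.username, a.first_name, a.last_name,
       a.profile_photo, u.twitter_id
FROM accounts a
JOIN users_site1 u ON u.user_id = a.user_id
WHERE a.user_id IN (10001007, 10001012, 10001020, 10001033);  -- hypothetical ids

-- Inspect how many rows the statements since FLUSH STATUS actually read.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Comparing the Handler_read counters between the original query and the decomposed version gives a rough measure of how much row access each one needs, independent of wall-clock noise.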
Good luck!
Related
I have this query and it takes 40 seconds. Is there a way to speed it up? Thank you.
SELECT *, last.Date
FROM constant.derogation der
LEFT JOIN variable.last last
ON der.code = last.code
WHERE 1 = 1
AND status != 'removed'
ORDER BY status;
+------+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | der | ALL | NULL | NULL | NULL | NULL | 318 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | last | ALL | NULL | NULL | NULL | NULL | 250950 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
This is the structure of both tables; both databases are on the same server.
I only need one value from the Last table.
DDL
| derogation | CREATE TABLE `derogation` (
`xxx` char(10) NOT NULL,
`xxx` varchar(100) DEFAULT NULL,
`code` char(17) DEFAULT NULL,
`xxx` varchar(20) DEFAULT NULL,
`xxx` char(6) DEFAULT NULL,
`xxx` varchar(50) DEFAULT NULL,
`xxx` varchar(200) DEFAULT NULL,
`xxxx` varchar(200) DEFAULT NULL,
`xxxx` varchar(100) DEFAULT NULL,
`xxxx` datetime DEFAULT NULL,
`xxx` varchar(100) DEFAULT NULL,
`xxxx` varchar(20) DEFAULT NULL,
`xxx` varchar(50) DEFAULT NULL,
`xxx` varchar(10) DEFAULT NULL,
`xxx` datetime DEFAULT NULL,
`xxx` varchar(10) DEFAULT NULL,
`xxx` datetime DEFAULT NULL,
`xxx` datetime DEFAULT NULL,
`status` varchar(20) DEFAULT NULL,
`xxx` varchar(1000) DEFAULT NULL,
KEY `code_index_derogation` (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
| Last | CREATE TABLE `Last` (
`code` char(17) DEFAULT NULL,
`xxx` decimal(10,2) DEFAULT NULL,
`Date` datetime DEFAULT NULL,
`xxxx` datetime DEFAULT NULL,
`xxxx` datetime DEFAULT NULL,
`xxxx` text DEFAULT NULL,
`xxxxx` datetime DEFAULT NULL,
`xxxx` char(6) DEFAULT NULL,
KEY `idx_Last_Code` (`code`),
KEY `idx_Last_xxx` (`xxx`),
KEY `code_index` (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
You can use temporary tables, staging tables, or CTEs, or you can add indexes on the columns of the tables.
But using temporary tables is always an excellent choice to accelerate the execution of the SQL query.
(The problem is subtle, and often overlooked in this forum.)
Don't mix collations. You are joining on code, but the CHARACTER SET and COLLATION are different between the two tables. Suggest using ALTER TABLE .. CONVERT TO .. to get them to be the same.
Other tips:
Use CHAR only for truly fixed-length columns. Use VARCHAR otherwise.
Have a PRIMARY KEY on every table.
Don't say LEFT JOIN if you really expect JOIN; it adds confusion to the reader.
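As a rough sketch of the collation fix suggested above, the statements below first check what the two code columns currently use and then convert the latin1 table. The schema and table spelling (variable.`Last`) and the target collation utf8_general_ci are assumptions taken from the posted query and DDL; CONVERT TO rewrites the whole table, so it is worth trying on a copy first.

```sql
-- Check which character set and collation each join column currently uses.
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE COLUMN_NAME = 'code'
  AND TABLE_NAME IN ('derogation', 'Last');

-- Convert the latin1 table so that it matches the utf8 table.
-- This converts every character column in the table, not just `code`.
ALTER TABLE variable.`Last`
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```

After the conversion, re-running EXPLAIN on the original SELECT should show whether the join on code can now use one of the existing indexes.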
I have a SQL query with 3 tables joined on a remote MySQL DB. Two of these tables are about 18 GB each (STEP_RESULT and meas_numericlimit), and the remote server creates a temporary table, which takes ages (about 25 min) to finish.
How can I optimize this query?
select
t1.UUT_NAME,
t1.STATION_NUM,
t1.START_DATE_TIME,
t3.LOW_LIMIT,
t3.DATA,
t3.HIGH_LIMIT,
t3.UNITS,
t2b.STEP_NAME
from
meas_numericlimit t3
inner join STEP_RESULT t2a on t3.ID = t2a.STEP_ID
inner join STEP_RESULT t2b on t2a.STEP_PARENT = t2b.STEP_ID
inner join uut_result t1 on t2b.UUT_RESULT = t1.ID
where
t1.UUT_NAME like 'Variable1-1%' and
t1.STATION_NUM = 'variable2' and
t2b.STEP_NAME = 'variable3' and
t2b.STEP_TYPE = 'constant'
Here is the output of SHOW TABLES and EXPLAIN:
+--------------------+
| Tables_in_spectrum |
+--------------------+
| cal_dates |
| calibrage |
| execution_time |
| meas_numericlimit |
| station_feature |
| step_callexe |
| step_graph |
| step_msgjnl |
| step_msgpopup |
| step_passfail |
| step_result |
| step_seqcall |
| step_stringvalue |
| syst_event |
| uptime |
| users |
| uut_result |
+--------------------+
and
+----+-------------+-------+--------+-------------------------+---------+---------+-------------------------+----------+--------------------------------+
| id | select_type | table | type   | possible_keys           | key     | key_len | ref                     | rows     | Extra                          |
+----+-------------+-------+--------+-------------------------+---------+---------+-------------------------+----------+--------------------------------+
| 1  | SIMPLE      | t2a   | ALL    | NULL                    | NULL    | NULL    | NULL                    | 48120004 |                                |
| 1  | SIMPLE      | t3    | eq_ref | PRIMARY                 | PRIMARY | 40      | spectrum.t2a.STEP_ID    | 1        |                                |
| 1  | SIMPLE      | t2b   | ALL    | NULL                    | NULL    | NULL    | NULL                    | 48120004 | Using where; Using join buffer |
| 1  | SIMPLE      | t1    | eq_ref | PRIMARY,FK_uut_result_1 | PRIMARY | 40      | spectrum.t2b.UUT_RESULT | 1        | Using where                    |
+----+-------------+-------+--------+-------------------------+---------+---------+-------------------------+----------+--------------------------------+
Here is the SHOW CREATE TABLE output:
CREATE TABLE `uut_result` (
`ID` varchar(38) NOT NULL DEFAULT '',
`STATION_NUM` varchar(255) DEFAULT NULL,
`SOFTVER_ODTGEN` varchar(10) DEFAULT NULL,
`HARDVER_ODTGEN` varchar(10) DEFAULT NULL,
`NEXT_CAL_DATE` date DEFAULT NULL,
`UUT_NAME` varchar(255) DEFAULT NULL,
`UUT_SERIAL_NUMBER` varchar(255) DEFAULT NULL,
`UUT_VERSION` varchar(255) DEFAULT NULL,
`USER_LOGIN_NAME` varchar(255) DEFAULT NULL,
`USER_LOGIN_LOGIN` varchar(255) NOT NULL DEFAULT '',
`START_DATE_TIME` datetime DEFAULT NULL,
`EXECUTION_TIME` float DEFAULT NULL,
`UUT_STATUS` varchar(255) DEFAULT NULL,
`UUT_ERROR_CODE` int(11) DEFAULT NULL,
`UUT_ERROR_MESSAGE` varchar(1023) DEFAULT NULL,
`PAT_NAME` varchar(255) NOT NULL DEFAULT '',
`PAT_VERSION` varchar(10) NOT NULL DEFAULT '',
`TEST_LEVEL` varchar(50) DEFAULT NULL,
`INTERFACE_ID` int(10) unsigned NOT NULL DEFAULT '0',
`EXECUTION_MODE` varchar(45) DEFAULT NULL,
`LOOP_MODE` varchar(45) DEFAULT NULL,
`STOP_ON_FAIL` tinyint(4) unsigned NOT NULL DEFAULT '0',
`EXECUTION_COMMENT` text,
PRIMARY KEY (`ID`),
KEY `FK_uut_result_1` (`STATION_NUM`)
) ENGINE=MyISAM DEFAULT CHARSET=latin;
and
CREATE TABLE `meas_numericlimit` (
`ID` varchar(38) NOT NULL DEFAULT '',
`STEP_RESULT` varchar(38) NOT NULL DEFAULT '',
`NAME` varchar(255) DEFAULT NULL,
`COMP_OPERATOR` varchar(30) DEFAULT NULL,
`HIGH_LIMIT` double DEFAULT NULL,
`LOW_LIMIT` double DEFAULT NULL,
`UNITS` varchar(255) DEFAULT NULL,
`DATA` double DEFAULT NULL,
`STATUS` varchar(255) DEFAULT NULL,
`FORMAT` varchar(15) DEFAULT NULL,
`NANDATA` int(11) DEFAULT '0',
PRIMARY KEY (`ID`),
KEY `FK_meas_numericlimit_1` (`STEP_RESULT`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
and
CREATE TABLE `step_result` (
`ID` varchar(38) NOT NULL DEFAULT '',
`UUT_RESULT` varchar(38) NOT NULL DEFAULT '',
`STEP_PARENT` varchar(38) DEFAULT NULL,
`STEP_NAME` varchar(255) DEFAULT NULL,
`STEP_ID` varchar(38) NOT NULL DEFAULT '',
`STEP_TYPE` varchar(255) DEFAULT NULL,
`STATUS` varchar(255) DEFAULT NULL,
`REPORT_TEXT` text,
`DIAG` text,
`ERROR_OCCURRED` tinyint(1) NOT NULL DEFAULT '0',
`ERROR_CODE` int(11) DEFAULT NULL,
`ERROR_MESSAGE` varchar(1023) DEFAULT NULL,
`MODULE_TIME` float DEFAULT NULL,
`TOTAL_TIME` float DEFAULT NULL,
`NUM_LOOPS` int(11) DEFAULT NULL,
`NUM_PASSED` int(11) DEFAULT NULL,
`NUM_FAILED` int(11) DEFAULT NULL,
`ENDING_LOOP_INDEX` int(11) DEFAULT NULL,
`LOOP_INDEX` int(11) DEFAULT NULL,
`INTERACTIVE_EXENUM` int(11) DEFAULT NULL,
`STEP_GROUP` varchar(30) DEFAULT NULL,
`STEP_INDEX` int(11) DEFAULT NULL,
`ORDER_NUMBER` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `FK_step_result_1` (`UUT_RESULT`),
KEY `IDX_step_parent` (`STEP_PARENT`)
) ENGINE=MyISAM DEFAULT CHARSET=latin
Thank you for your help
What is the value of join_buffer_size? It should not be more than about 1% of RAM. If it is much bigger, you run the risk of swapping, which is especially bad for performance.
One thing jumps out in the EXPLAIN: NULL | 48120004 saying that this is needed: INDEX(STEP_ID);
However, the SELECT and the EXPLAIN do not seem to match. Please double check.
uut_result needs INDEX(station_num, uut_name) -- in that order; replaces just (station_num).
What is varchar(38)? UUIDs are only 36. IPv6 needs 39.
UUIDs are terribly inefficient when the data is too big to be cached. More discussion: http://mysql.rjweb.org/doc.php/uuid
Lots of datatypes could (should) be shrunken -- this shrinkage will cut down on I/O, which will speed up queries. If you provide some sample values for some typical columns, I can give more advice.
For example, STATUS is (usually) a small number of distinct values. That could be represented as a 1-byte ENUM or a 1-byte TINYINT; but maybe your app has hundreds of different status values? If so, "normalizing" it may be the better answer.
DOUBLE takes 8 bytes; FLOAT takes only 4 bytes, but limits precision to only ~7 significant digits -- perhaps that is sufficient?
(Presumably you meant latin1, not latin?)
Also consider switching to InnoDB.
How much RAM do you have? How big (GB) are the tables?
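The two index suggestions above could be applied roughly as follows. This is a sketch, not a tested migration: the index names idx_step_id and idx_station_uut are made up for illustration, and building an index on a MyISAM table of this size will take a while and block writes while it runs.

```sql
-- Index for the joins on STEP_ID, which currently force full scans of step_result.
ALTER TABLE step_result
  ADD INDEX idx_step_id (STEP_ID);

-- Replace the single-column index with a composite one, in this column order:
-- STATION_NUM is compared with '=', UUT_NAME with a prefix LIKE.
ALTER TABLE uut_result
  DROP INDEX FK_uut_result_1,
  ADD INDEX idx_station_uut (STATION_NUM, UUT_NAME);

-- Afterwards, re-run EXPLAIN on the original SELECT and check that
-- the rows with type = ALL are gone.
```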
I am trying to join a field that is an int(13) to a field that is a varchar(50).
If I only use (a.id = b.id) the DESCRIBE says type: ref.
If I use (a.id = CONCAT(b.id)) the DESCRIBE says type: eq_ref. (where b.id is the integer)
The use of CONCAT to cast a field is ugly, so I tried to use CAST() or CONVERT().
If I use (a.id = CAST(b.id AS CHAR(50))) the DESCRIBE says type: ref.
How do I write a correct cast/convert that gives an eq_ref join?
UPDATE 1:
DESCRIBE SELECT.. with CONCAT
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
| 1 | SIMPLE | ext_icecat_prodmatch | ref | PRIMARY,our_article_id,product_id | our_article_id | 152 | const | 3016 | Using index condition; Using temporary |
| 1 | SIMPLE | ext_icecat_product | eq_ref | PRIMARY,product_id | PRIMARY | 4 | ext_icecat_prodmatch.product_id | 1 | |
| 1 | SIMPLE | ext_icecat_supplier | eq_ref | PRIMARY | PRIMARY | 4 | ext_icecat_product.supplier_id | 1 | |
| 1 | SIMPLE | products | eq_ref | PRIMARY | PRIMARY | 152 | ext_icecat_prodmatch.our_article_id | 1 | |
| 1 | SIMPLE | partner_product_saved | eq_ref | PRIMARY | PRIMARY | 155 | const,func | 1 | Using where |
| 1 | SIMPLE | category_names | eq_ref | PRIMARY | PRIMARY | 6 | products.category_id,const | 1 | Using where |
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
The Select:
SELECT
partner_product_saved.*,
ext_icecat_product.product_id,
CONCAT(ext_icecat_supplier.name, ' ', ext_icecat_product.name) AS export_product_name,
ext_icecat_product.catid_match AS category_id,
GROUP_CONCAT(ext_icecat_prodmatch.our_article_id) AS oais,
products.file_name,
category_names.category_path
FROM ext_icecat_product
LEFT JOIN ext_icecat_prodmatch USING (product_id)
LEFT JOIN ext_icecat_supplier USING (supplier_id)
LEFT JOIN products USING (our_article_id)
LEFT JOIN partner_product_saved ON (partner_product_saved.partner_id = 29 AND partner_product_saved.product_id = CONCAT(ext_icecat_product.product_id))
LEFT JOIN category_names ON (category_names.category_id = products.category_id AND category_names.language_id = 2)
WHERE ext_icecat_prodmatch.our_article_id = '0EF03850-D25A-1174-BCDC-EC67352010A6'
GROUP BY ext_icecat_product.product_id
ORDER BY NULL;
SHOW CREATE TABLE
CREATE TABLE `partner_product_saved` (
`partner_id` mediumint(8) NOT NULL,
`product_id` varchar(50) CHARACTER SET utf8 NOT NULL,
`product_name` varchar(100) CHARACTER SET utf8 NOT NULL,
`our_article_id` varchar(50) CHARACTER SET utf8 DEFAULT NULL,
`our_category_id` mediumint(8) DEFAULT NULL,
`manufacture_id` mediumint(8) DEFAULT NULL,
`manufacturer_partnr` varchar(255) COLLATE utf8_bin NOT NULL,
`manufacturer_upc` varchar(255) COLLATE utf8_bin NOT NULL,
`image` tinytext COLLATE utf8_bin NOT NULL,
`image_small` tinytext COLLATE utf8_bin NOT NULL,
`image_big` tinytext COLLATE utf8_bin NOT NULL,
`image_200` tinytext COLLATE utf8_bin NOT NULL,
`image_original` tinytext COLLATE utf8_bin NOT NULL,
`image_width` int(11) DEFAULT NULL,
`image_height` int(11) DEFAULT NULL,
`birth` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`last_updated` timestamp NULL DEFAULT NULL,
`saved` tinyint(3) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`partner_id`,`product_id`),
KEY `our_article_id` (`our_article_id`),
KEY `our_category_id` (`our_category_id`),
KEY `manufacture_id` (`manufacture_id`,`manufacturer_partnr`),
KEY `manufacturer_upc` (`manufacturer_upc`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE `ext_icecat_product` (
`product_id` int(13) NOT NULL,
`supplier_id` int(13) NOT NULL DEFAULT '0',
`prod_id` varchar(235) COLLATE utf8_bin NOT NULL DEFAULT '',
`prod_id_clean` varchar(255) CHARACTER SET utf8 NOT NULL,
`catid` int(13) NOT NULL DEFAULT '0',
`catid_match` varchar(50) CHARACTER SET utf8 NOT NULL,
`name` varchar(255) CHARACTER SET utf8 NOT NULL,
`name_clean` varchar(255) CHARACTER SET utf8 NOT NULL,
`low_pic` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '',
`high_pic` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '',
`thumb_pic` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`family_id` int(13) NOT NULL DEFAULT '0',
`low_pic_size` int(13) DEFAULT '0',
`high_pic_size` int(13) DEFAULT '0',
`thumb_pic_size` int(13) DEFAULT '0',
`import_date` datetime NOT NULL,
`release_date` datetime NOT NULL,
`updated` datetime NOT NULL,
`need_update` tinyint(1) NOT NULL DEFAULT '0',
`deleted` tinyint(1) NOT NULL DEFAULT '0',
`keyword` tinyint(1) NOT NULL DEFAULT '0',
`special_match` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`product_id`),
KEY `supplier_id` (`supplier_id`),
KEY `catid` (`catid`),
KEY `prod_id` (`prod_id`),
KEY `product_id` (`product_id`,`prod_id`,`supplier_id`),
KEY `release_Date` (`release_date`),
KEY `prod_id_clean` (`prod_id_clean`),
KEY `name_clean` (`name_clean`),
KEY `need_update` (`need_update`),
KEY `deleted` (`deleted`),
KEY `keyword` (`keyword`),
KEY `catid_2` (`catid`,`import_date`),
KEY `catid_match` (`catid_match`),
KEY `special_match` (`special_match`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
WHERE indexed_column = any_function(any_column) -- can use index.
WHERE non_indexed_column = any_function(indexed_column) -- cannot use index.
The difference between ref and eq_ref is minor. I think that eq_ref is where the optimizer decides that there cannot be more than one match, often because of UNIQUE.
WHERE ext_icecat_prodmatch.our_article_id = '0EF03850-D25A-1174-BCDC-EC67352010A6' -- is our_article_id INDEXed? or UNIQUE? Sounds like it is only an INDEX, so multiple rows might ensue. To make it eq_ref, you need UNIQUE. But only if the data supports such. The stats imply there might be 3016 rows with that article_id.
Do not use LEFT unless you need it. Note how the Optimizer turned LEFT JOIN ext_icecat_prodmatch USING (product_id) into JOIN and decided (rightly) to start with ext_icecat_prodmatch.
Back to other discussions...
AND partner_product_saved.product_id = CONCAT(ext_icecat_product.product_id))
can go one way, but not the other. That is, it can efficiently go from eip to pps, but not the other way. And EXPLAIN indicated such with const,func.
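For the original question about replacing CONCAT, one variant that may be worth testing is a CAST that also names the character set, so that the converted value has the same character set as the varchar(50) column. This is an untested sketch and not a confirmed fix: whether the optimizer then reports eq_ref depends on the server version and the collations involved, so the plan has to be checked with EXPLAIN. The aliases eip and pps are shorthand introduced here.

```sql
-- Compare this plan with the CONCAT(...) version from the question.
EXPLAIN
SELECT pps.*
FROM ext_icecat_product AS eip
LEFT JOIN partner_product_saved AS pps
       ON pps.partner_id = 29
      AND pps.product_id = CAST(eip.product_id AS CHAR(50) CHARACTER SET utf8);
```

The conversion is still applied to the integer side only, so the lookup into partner_product_saved can use its primary key (partner_id, product_id), as in the rule stated above.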
I am still new to SQL and I am trying to improve the performance of my query. I have been searching around and have come to the conclusion that using JOINS instead of so many WHERE INS would help improve my performance, but I am unsure of how I would convert my statement. This is my current statement.
SELECT stop_id, stop_name FROM stops WHERE stop_id IN (
SELECT DISTINCT stop_id FROM stop_times WHERE trip_id IN (
SELECT trip_id from trips WHERE route_id = <routeid> ));
It takes anywhere from 5-25 seconds to return the results which is unacceptable. I was hoping to get it below 1 second. If anyone was wondering the data is from a GTFS feed. The stops and trips tables have about ~10,000 rows each, while the stop_times table has ~900,000. I have created indexes at each of the columns I am using. Here is the output of EXPLAIN, and also what was used to create each table.
Thanks for any help and if you need any more info let me know!
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
| 1 | PRIMARY | stops | ALL | NULL | NULL | NULL | NULL | 6481 | Using where |
| 2 | DEPENDENT SUBQUERY | stop_times | index_subquery | stop_id | stop_id | 63 | func | 63 | Using where |
| 3 | DEPENDENT SUBQUERY | trips | unique_subquery | PRIMARY,route_id | PRIMARY | 62 | func | 1 | Using where |
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
| stops | CREATE TABLE `stops` (
`stop_id` varchar(20) NOT NULL,
`stop_code` varchar(50) DEFAULT NULL,
`stop_name` varchar(255) DEFAULT NULL,
`stop_desc` varchar(255) DEFAULT NULL,
`stop_lat` decimal(8,6) DEFAULT NULL,
`stop_lon` decimal(8,6) DEFAULT NULL,
`zone_id` int(11) DEFAULT NULL,
`stop_url` varchar(255) DEFAULT NULL,
`location_type` int(2) DEFAULT NULL,
`parent_station` int(11) DEFAULT NULL,
`wheelchair_boarding` int(2) DEFAULT NULL,
PRIMARY KEY (`stop_id`),
KEY `zone_id` (`zone_id`),
KEY `stop_lat` (`stop_lat`),
KEY `stop_lon` (`stop_lon`),
KEY `location_type` (`location_type`),
KEY `parent_station` (`parent_station`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
| stop_times | CREATE TABLE `stop_times` (
`trip_id` varchar(20) DEFAULT NULL,
`arrival_time` varchar(8) DEFAULT NULL,
`arrival_time_seconds` int(11) DEFAULT NULL,
`departure_time` varchar(8) DEFAULT NULL,
`departure_time_seconds` int(11) DEFAULT NULL,
`stop_id` varchar(20) DEFAULT NULL,
`stop_sequence` int(11) DEFAULT NULL,
`stop_headsign` varchar(50) DEFAULT NULL,
`pickup_type` int(2) DEFAULT NULL,
`drop_off_type` int(2) DEFAULT NULL,
`shape_dist_traveled` varchar(50) DEFAULT NULL,
KEY `trip_id` (`trip_id`),
KEY `arrival_time_seconds` (`arrival_time_seconds`),
KEY `departure_time_seconds` (`departure_time_seconds`),
KEY `stop_id` (`stop_id`),
KEY `stop_sequence` (`stop_sequence`),
KEY `pickup_type` (`pickup_type`),
KEY `drop_off_type` (`drop_off_type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
| trips | CREATE TABLE `trips` (
`route_id` varchar(20) DEFAULT NULL,
`service_id` varchar(20) DEFAULT NULL,
`trip_id` varchar(20) NOT NULL,
`trip_headsign` varchar(255) DEFAULT NULL,
`trip_short_name` varchar(255) DEFAULT NULL,
`direction_id` tinyint(1) DEFAULT NULL,
`block_id` int(11) DEFAULT NULL,
`shape_id` varchar(50) DEFAULT NULL,
PRIMARY KEY (`trip_id`),
KEY `route_id` (`route_id`),
KEY `service_id` (`service_id`),
KEY `direction_id` (`direction_id`),
KEY `block_id` (`block_id`),
KEY `shape_id` (`shape_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
You're right in thinking that JOINS are usually faster than WHERE IN subqueries.
Try this:
SELECT T3.stop_id, T3.stop_name
FROM trips AS T1
JOIN
stop_times AS T2
ON T1.trip_id=T2.trip_id AND route_id = <routeid>
JOIN stops AS T3
ON T2.stop_id=T3.stop_id
GROUP BY T3.stop_id, T3.stop_name
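An equivalent way to write the same join, shown here only as a sketch, uses SELECT DISTINCT instead of GROUP BY and moves the route filter into the WHERE clause. The value 'R1' is a hypothetical route id standing in for the placeholder in the question; prefixing the statement with EXPLAIN shows whether the DEPENDENT SUBQUERY rows have disappeared from the plan.

```sql
EXPLAIN
SELECT DISTINCT s.stop_id, s.stop_name
FROM trips AS t
JOIN stop_times AS st ON st.trip_id = t.trip_id   -- uses KEY trip_id on stop_times
JOIN stops AS s ON s.stop_id = st.stop_id         -- uses the PRIMARY KEY on stops
WHERE t.route_id = 'R1';                          -- hypothetical route id
```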
The basics of this query have been asked, and answered, many times before, but I'm still having trouble with performance. Here are the details:
I have the table, Products, that has 105724 rows.
I have an update table, _e360products, that has 51813 rows.
I am matching on an alphanumeric 10 character code, that is indexed (unique) on both tables.
I have tried:
SELECT _e360products.Product_Code, products.StockCode
FROM _e360products Left Join Products ON _e360products.Product_Code = Products.StockCode
WHERE products.StockCode IS NULL
and:
SELECT Product_Code
FROM _e360products
WHERE Product_code NOT IN (SELECT StockCode FROM Products)
and, just for a laugh, even:
SELECT Product_Code
FROM _e360products
WHERE (SELECT count(*) FROM Products WHERE StockCode = Product_code) = 0
None of these have returned results within 20 mins!
If I reverse the queries, i.e. getting unique rows from _e360products, I get results very quickly.
Does anyone have any ideas?
~~~~~ Update ~~~~~
Explain results are:
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
| 1 | SIMPLE | _e360products | index | NULL | Product_code | 12 | NULL | 50811 | Using index |
| 1 | SIMPLE | Products | eq_ref | stockcode | stockcode | 12 | plumbase_bkup._e360products.Product_code | 1 | Using where; Using index; Not exists |
+----+-------------+---------------+--------+---------------+--------------+---------+------------------------------------------+-------+--------------------------------------+
CREATE TABLE `_e360products` (
`Product_code` varchar(10) CHARACTER SET latin1 NOT NULL DEFAULT '',
`Manufacturers_code` varchar(255) DEFAULT '',
`Description` varchar(255) DEFAULT '',
`Supplier` varchar(255) DEFAULT '',
`Price` varchar(20) DEFAULT '',
`VAT` varchar(20) DEFAULT '',
`Analysis_code` varchar(20) DEFAULT NULL,
PRIMARY KEY (`Product_code`),
KEY `Product_code` (`Product_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `products` (
`productid` int(11) NOT NULL AUTO_INCREMENT,
`QPUM2` varchar(50) NOT NULL DEFAULT '1',
`NWID` varchar(50) NOT NULL DEFAULT '0',
`NHEI` varchar(50) NOT NULL DEFAULT '0',
`NLEN` varchar(50) NOT NULL DEFAULT '0',
`donotdisplayprice` tinyint(2) DEFAULT '0',
`productname` text,
`stockcode` varchar(10) NOT NULL DEFAULT '',
`analysiscode` varchar(50) DEFAULT '',
`usestockcontrol` int(11) DEFAULT '0',
`stockvalue` int(11) DEFAULT '0',
`stock_notification_level` int(11) DEFAULT '0',
`sectionid` int(11) DEFAULT '0',
`productprice` varchar(50) DEFAULT '',
`productprice_incvat` varchar(50) DEFAULT '',
`deleted` int(11) DEFAULT '0',
PRIMARY KEY (`productid`),
UNIQUE KEY `stockcode` (`stockcode`) USING BTREE,
KEY `deleted` (`deleted`),
KEY `allowordering` (`allowordering`)
) ENGINE=MyISAM AUTO_INCREMENT=147440 DEFAULT CHARSET=latin1;
Note: the Products table definition doesn't include ALL the fields, as there are quite a few...
Please provide a query execution plan (EXPLAIN); it seems your index is not used. Also show us the CREATE TABLE statements for both tables.
Typo? StockCode[add space here]IS
SELECT _e360products.Product_Code, products.StockCode
FROM _e360products Left Join Products ON _e360products.Product_Code = Products.StockCode
WHERE products.StockCode IS NULL
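For comparison, the same anti-join can also be written with NOT EXISTS. This is just a fourth formulation of the three queries in the question, with no claim that it runs faster on this data; table and column names follow the posted CREATE TABLE statements, and table-name case may matter on a case-sensitive filesystem.

```sql
-- Rows in _e360products that have no matching stockcode in products.
SELECT e.Product_code
FROM _e360products AS e
WHERE NOT EXISTS (
    SELECT 1
    FROM products AS p
    WHERE p.stockcode = e.Product_code
);
```

Running EXPLAIN on this form and on the LEFT JOIN ... IS NULL form side by side is an easy way to see whether the optimizer treats them the same.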