I has table with the same schema
CREATE TABLE `stock` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`currency` varchar(3) COLLATE utf8_unicode_ci NOT NULL,
`against` varchar(3) COLLATE utf8_unicode_ci NOT NULL,
`date` date NOT NULL,
`time` time NOT NULL,
`rate` double(8,4) NOT NULL,
`ask` double(8,4) NOT NULL,
`bid` double(8,4) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
KEY `stock_currency_index` (`currency`),
KEY `stock_against_index` (`against`),
KEY `stock_date_index` (`date`),
KEY `stock_time_index` (`time`),
KEY `created_at_index` (`created_at`)
) ENGINE=InnoDB AUTO_INCREMENT=244221 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
When i execute this query mysql has using index
mysql> explain select max(id) from stock group by currency;
+----+-------------+-------+------------+-------+----------------------+----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+----------------------+----------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | stock | NULL | range | stock_currency_index | stock_currency_index | 11 | NULL | 2 | 100.00 | Using index for group-by |
+----+-------------+-------+------------+-------+----------------------+----------------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
Also when i am executing this query mysql has using primary index too
mysql> explain select * from stock where id in (244221, 244222);
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | stock | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
BUT when i am combine these two queries PRIMARY index are not using... i am confused. What i am doing wrong?
mysql> explain select * from stock where id in (select max(id) from stock group by currency);
+----+-------------+-------+------------+-------+----------------------+----------------------+---------+------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+----------------------+----------------------+---------+------+--------+----------+--------------------------+
| 1 | PRIMARY | stock | NULL | ALL | NULL | NULL | NULL | NULL | 221800 | 100.00 | Using where |
| 2 | SUBQUERY | stock | NULL | range | stock_currency_index | stock_currency_index | 11 | NULL | 2 | 100.00 | Using index for group-by |
+----+-------------+-------+------------+-------+----------------------+----------------------+---------+------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
First, try rewriting the query as:
select s.*
from stock s join
(select max(id) as maxid
from stock
group by currency
) ss
on ss.maxid = s.id;
Second, I would be tempted to put an index on stock(currency, id) and to use:
select s.*
from stock s
where s.id = (select max(s2.id) from stock s2 where s2.currency = s.currency);
Do either of these perform better?
Related
Given this table:
CREATE TABLE `queue` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`type` int(10) unsigned NOT NULL,
`posted_on` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
`status` enum('pending','complete','error') NOT NULL DEFAULT 'pending',
`body` blob NOT NULL,
`process_id` int(10) unsigned DEFAULT NULL,
`acquired_on` datetime(6) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `acquiredon` (`acquired_on`),
KEY `type_status_processid_postedon` (`type`,`status`,`process_id`,`posted_on`) USING BTREE
);
When I do a select on this table, it makes proper/full use of the index:
EXPLAIN SELECT *
FROM `queue`
FORCE INDEX (`type_status_processid_postedon`)
WHERE type = 1
AND `status` = 'pending'
AND `process_id` IS NULL
ORDER BY `posted_on` ASC
LIMIT 1;
+----+-------------+-------+------------+------+--------------------------------+--------------------------------+---------+-------------------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+--------------------------------+--------------------------------+---------+-------------------+------+----------+-----------------------+
| 1 | SIMPLE | queue | NULL | ref | type_status_processid_postedon | type_status_processid_postedon | 10 | const,const,const | 1 | 100.00 | Using index condition |
+----+-------------+-------+------------+------+--------------------------------+--------------------------------+---------+-------------------+------+----------+-----------------------+
And yet, when I do the same query as an UPDATE, the index is not fully used.
EXPLAIN UPDATE `queue`
FORCE INDEX(`type_status_processid_postedon`)
SET `process_id` = 1
WHERE `type` = 1
AND `status` = 'pending'
AND `process_id` IS NULL
ORDER BY `posted_on` ASC
LIMIT 1;
+----+-------------+-------+------------+-------+--------------------------------+--------------------------------+---------+-------------------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+--------------------------------+--------------------------------+---------+-------------------+------+----------+-----------------------------+
| 1 | UPDATE | queue | NULL | range | type_status_processid_postedon | type_status_processid_postedon | 10 | const,const,const | 1 | 100.00 | Using where; Using filesort |
+----+-------------+-------+------------+-------+--------------------------------+--------------------------------+---------+-------------------+------+----------+-----------------------------+
The update does a filesort. What's going on here?
I have this query:
EXPLAIN EXTENDED
SELECT DISTINCT
PMS_STAGIONI.DINIZVAL,
PMS_STAGIONI.DFINEVAL,
PMS_DISPO.DDATA
FROM
PMS_DISPO JOIN PMS_STAGIONI
HAVING
PMS_DISPO.DDATA BETWEEN PMS_STAGIONI.DINIZVAL AND PMS_STAGIONI.DFINEVAL
The output of explain is:
+----+-------------+--------------+-------+---------------+------------------------------+---------+------+------+----------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+-------+---------------+------------------------------+---------+------+------+----------+--------------------------------+
| 1 | SIMPLE | PMS_STAGIONI | index | NULL | IDX_INIZFINEVAL_PMS_STAGIONI | 6 | NULL | 3 | 100.00 | Using index; Using temporary |
| 1 | SIMPLE | PMS_DISPO | index | NULL | IDX_DDATA_PMS_DISPO | 3 | NULL | 1199 | 100.00 | Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+------------------------------+---------+------+------+----------+--------------------------------+
My question is how to calculate the product of the join using explain. For example, in this case are performed 3597 (1199x3) scans or only 1199?
1)If I add "ORDER BY DDATA" lines scanned in the table "PMS_DISPO" become 1130.
2)If I use the "WHERE" clause instead of "HAVING" clause scan no longer uses the indexes. How is it possible?
3)If i want show PMS_STAGIONI.CSTAGIONI (primary key) explain show me that:
+----+-------------+--------------+-------+---------------+---------------------+---------+------+------+----------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+-------+---------------+---------------------+---------+------+------+----------+--------------------------------+
| 1 | SIMPLE | PMS_STAGIONI | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using temporary |
| 1 | SIMPLE | PMS_DISPO | index | NULL | IDX_DDATA_PMS_DISPO | 3 | NULL | 1130 | 100.00 | Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+---------------------+---------+------+------+----------+--------------------------------+
How can I force the use of the other index?
Thanks in advance.
Edit:
The structure of "PMS_DISPO" is:
CREATE TABLE IF NOT EXISTS `PMS_DISPO` (
`ID` int(11) NOT NULL AUTO_INCREMENT ,
`CPRENOTA` int(11) NOT NULL,
`DDATA` date NOT NULL,
`CCATRIS` int(4) NOT NULL,
`NQUANT` int(4) NOT NULL,
`CAZIENDA` int(4) NOT NULL,
`CAFFILIATO` int(4) NOT NULL,
PRIMARY KEY (`ID`),
KEY `IDX_DDATA_PMS_DISPO` (`DDATA`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1084 ;
And "PMS_STAGIONI" is:
CREATE TABLE IF NOT EXISTS `PMS_STAGIONI` (
`CSTAGIONE` int(11) NOT NULL,
`NVALIDI` tinyint(2) NOT NULL,
`BECCEZIONE` tinyint(1) NOT NULL,
`AGGSET` varchar(7) DEFAULT NULL,
`DINIZVAL` date NOT NULL,
`DFINEVAL` date NOT NULL,
`CAZIENDA` int(4) NOT NULL,
`ID` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`ID`),
KEY `CSTAGIONE` (`CSTAGIONE`),
KEY `IDX_INIZFINEVAL_PMS_STAGIONI` (`DINIZVAL`,`DFINEVAL`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;
A query of this sort would normally be written as follows, with indexes just as you have them...
SELECT DISTINCT s.dinizval
, s.dfineval
, d.ddata
FROM pms_dispo d
JOIN pms_stagioni s
ON d.ddata BETWEEN s.dinizval AND s.dfineval
This is the shortversion of my query:
SELECT product.* FROM product_list product
LEFT JOIN language_item language ON (product.title=language.languageVariable)
WHERE language.languageID = 1
ORDER BY language.languageValue ASC
When I use it, the query has 3 seconds. When I remove the order by the query has 0.3 seconds. Can you recommend a change to accelerate it?
product.title and language.languageVariable is a language variable like global.product.title1, and languageValue is the title like car, doll or something else.
CREATE TABLE `language_item` (
`languageItemID` int(10) UNSIGNED NOT NULL,
`languageID` int(10) UNSIGNED NOT NULL DEFAULT '0',
`languageVariable` varchar(255) NOT NULL DEFAULT '',
`languageValue` mediumtext NOT NULL,
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `language_item`
ADD PRIMARY KEY (`languageItemID`),
ADD UNIQUE KEY `languageVariable` (`languageVariable`,`languageID`),
ADD KEY `languageValue` (`languageValue`(300));
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | SIMPLE | product | NULL | ALL | PRIMARY,inactive,archive,productCategoryID | NULL | NULL | NULL | 1475 | 88.27 | Using where; Using temporary; Using filesort
1 | SIMPLE | language | NULL | ref | languageVariable | languageVariable | 767 | db.product.title | 136 | 1.00 | Using index condition
Here is the structur from language_item with the index:
CREATE TABLE `language_item` (
`languageItemID` int(10) UNSIGNED NOT NULL,
`languageID` int(10) UNSIGNED NOT NULL DEFAULT '0',
`languageVariable` varchar(255) NOT NULL DEFAULT '',
`languageValue` mediumtext NOT NULL,
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `language_item`
ADD PRIMARY KEY (`languageItemID`),
ADD UNIQUE KEY `languageVariable` (`languageVariable`,`languageID`),
ADD KEY `languageValue` (`languageValue`(300));
The Explain:
id | select_type | table | partitions | type | possible_keys | key |
key_len | ref | rows | filtered | Extra 1 | SIMPLE | product | NULL |
ALL | PRIMARY,inactive,archive,productCategoryID | NULL | NULL | NULL
| 1475 | 88.27 | Using where; Using temporary; Using filesort 1 |
SIMPLE | language | NULL | ref | languageVariable | languageVariable |
767 | db.product.title | 136 | 1.00 | Using index condition
TRy this:
SELECT d.* from (
SELECT product.*, language.languageValue AS lv
FROM product_list product
JOIN language_item language ON (product.title=language.languageVariable)
WHERE language.languageID = 1
) as d
ORDER BY d.lv ASC
I have a table with 18,310,298 records right now.
And next query
SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE
`event` = '11' AND
`obj_type` = '2' AND
`region` = 'us' AND
DATE(`date`) = DATE('20120213010502');
With next structure
CREATE TABLE `logs` (
`log_id` int(11) NOT NULL AUTO_INCREMENT,
`event` tinyint(4) NOT NULL,
`obj_type` tinyint(1) NOT NULL DEFAULT '0',
`obj_id` int(11) unsigned NOT NULL DEFAULT '0',
`region` varchar(3) NOT NULL DEFAULT '',
`date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`log_id`),
KEY `event` (`event`),
KEY `obj_type` (`obj_type`),
KEY `region` (`region`),
KEY `for_stat` (`event`,`obj_type`,`obj_id`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table' |
and MySQL explain show the next
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| 1 | SIMPLE | logs | ref | event,obj_type,region,for_stat | for_stat | 2 | const,const | 837216 | 100.00 | Using where; Using index |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
Running such query in daily peak usage time take about 5 seconds.
What can I do to make it faster ?
UPDATED: Regarding all comments I modified INDEX and take off DATE function in WHERE clause
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| logs | 0 | PRIMARY | 1 | log_id | A | 15379109 | NULL | NULL | | BTREE | |
| logs | 1 | event | 1 | event | A | 14 | NULL | NULL | | BTREE | |
| logs | 1 | obj_type | 1 | obj_type | A | 14 | NULL | NULL | | BTREE | |
| logs | 1 | region | 1 | region | A | 14 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 1 | event | A | 157 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 2 | obj_type | A | 157 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 3 | region | A | 157 | NULL | NULL | | BTREE | |
| logs | 1 | for_stat | 4 | date | A | 157 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
mysql> explain extended SELECT COUNT(obj_id) as cnt
-> FROM `common`.`logs`
-> WHERE `event`= '11' AND
-> `obj_type` = '2' AND
-> `region`= 'est' AND
-> date between '2012-11-25 00:00:00' and '2012-11-25 23:59:59';
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | logs | range | event,obj_type,region,for_stat | for_stat | 21 | NULL | 9674 | 75.01 | Using where |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
It seems it's running faster. Thanks everyone.
The EXPLAIN output shows that the query is using only the first two columns of the for_stat index.
This is because the query doesn't use obj_id in the WHERE clause. If you create a new key without obj_id (or modify the existing key to reorder the columns), more of the key can be used and you may see better performance:
KEY `for_stat2` (`event`,`obj_type`,`region`,`date`)
If it's still too slow, changing the last condition, where you use DATE(), as said by Salman and Sashi, might improve things.
#Joni already explained what is wrong with your index. For query, I assume that your example query selects all records for 2012-02-13 regardless of time. You can change the where clause to use >= and < instead of DATE cast:
SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE
`event` = 11 AND
`obj_type` = 2 AND
`region` = 'us' AND
`date` >= DATE('20120213010502') AND
`date` < DATE('20120213010502') + INTERVAL 1 DAY
The date function on the date column is making the full table scan.
Try this ::
SELECT COUNT(obj_id) as cnt
FROM
`common`.`logs`
WHERE
`event` = 11
AND
`obj_type` = 2
AND
`region` = 'us'
AND
`date` = DATE('20120213010502')
As logging (inserts) needs to be fast too, use as less indices as possible.
Evaluation may take long as that is admin, not necessarily needing indices.
CREATE TABLE `logs` (
`log_id` int(11) NOT NULL AUTO_INCREMENT,
`event` tinyint(4) NOT NULL,
`obj_type` tinyint(1) NOT NULL DEFAULT '0',
`obj_id` int(11) unsigned NOT NULL DEFAULT '0',
`region` varchar(3) NOT NULL DEFAULT '',
`date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`log_id`),
KEY `for_stat` (`event`,`obj_type`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table' |
And about the date search #SashiKant and #SalmanA already answered.
Is Mysql you should place index columns by collation count; less possible values in table - placed closer to the left.
Also you can try to change column region to enum() and try to search date with BETWEEN clause.
Mysql is not using third column in the index because it's usage takes more efforts then just filtering (it's a common thing in Mysql).
I have a couple of tables (products and suppliers) and want to find out which items are no longer listed in the suppliers table.
Table uc_products has the products. Table uc_supplier_csv has supplier stocks. uc_products.model joins against uc_suppliers.sku.
I am seeing very long queries when trying to identify the stock in the products table which are not referred to in the suppliers table. I only want to extract the nid of the entries which match; sid IS NULL is just so I can identify which items don't have a supplier.
For the first of the queries below, it takes the DB server (4GB ram / 2x 2.4GHz intel) an hour to get a result (507 rows). I didn't wait for the second query to finish.
How can I make this query more optimal? Is it due to the mismatched character sets?
I was thinking that the following would be the most efficient SQL to use:
SELECT nid, sid
FROM uc_products p
LEFT OUTER JOIN uc_supplier_csv c
ON p.model = c.sku
WHERE sid IS NULL ;
For this query, I get the following EXPLAIN result:
mysql> EXPLAIN SELECT nid, sid FROM uc_products p LEFT OUTER JOIN uc_supplier_csv c ON p.model = c.sku WHERE sid IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| 1 | SIMPLE | p | ALL | NULL | NULL | NULL | NULL | 6526 | |
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 126639 | Using where; Not exists |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
2 rows in set (0.00 sec)
I would have thought that the keys idx_sku and idx_model would be valid for use here, but they aren't. Is that because the tables' default charsets do not match? One is UTF-8 and one is latin1.
I also considered this form:
SELECT nid
FROM uc_products
WHERE model
NOT IN (
SELECT DISTINCT sku FROM uc_supplier_csv
) ;
EXPLAIN shows the following results for that query:
mysql> explain select nid from uc_products where model not in ( select sku from uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | uc_products | ALL | NULL | NULL | NULL | NULL | 6520 | Using where |
| 2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
And just so I don't miss anything out, here are a few more exciting details: the table sizes and stats, and the table structure :)
mysql> show table status where Name in ( 'uc_supplier_csv', 'uc_products' ) ;
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| uc_products | MyISAM | 10 | Dynamic | 6520 | 89 | 585796 | 281474976710655 | 232448 | 912 | NULL | 2009-04-24 11:03:15 | 2009-10-12 14:23:43 | 2009-04-24 11:03:16 | utf8_general_ci | NULL | | |
| uc_supplier_csv | MyISAM | 10 | Dynamic | 126639 | 26 | 3399704 | 281474976710655 | 5864448 | 0 | NULL | 2009-10-12 14:28:25 | 2009-10-12 14:28:25 | 2009-10-12 14:28:27 | latin1_swedish_ci | NULL | | |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
and
CREATE TABLE `uc_products` (
`vid` mediumint(9) NOT NULL default '0',
`nid` mediumint(9) NOT NULL default '0',
`model` varchar(255) NOT NULL default '',
`list_price` decimal(10,2) NOT NULL default '0.00',
`cost` decimal(10,2) NOT NULL default '0.00',
`sell_price` decimal(10,2) NOT NULL default '0.00',
`weight` float NOT NULL default '0',
`weight_units` varchar(255) NOT NULL default 'lb',
`length` float unsigned NOT NULL default '0',
`width` float unsigned NOT NULL default '0',
`height` float unsigned NOT NULL default '0',
`length_units` varchar(255) NOT NULL default 'in',
`pkg_qty` smallint(5) unsigned NOT NULL default '1',
`default_qty` smallint(5) unsigned NOT NULL default '1',
`unique_hash` varchar(32) NOT NULL,
`ordering` tinyint(2) NOT NULL default '0',
`shippable` tinyint(2) NOT NULL default '1',
PRIMARY KEY (`vid`),
KEY `idx_model` (`model`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
CREATE TABLE `uc_supplier_csv` (
`sid` int(10) unsigned NOT NULL default '0',
`sku` varchar(255) default NULL,
`stock` int(10) unsigned NOT NULL default '0',
`list_price` decimal(8,2) default '0.00',
KEY `idx_sku` (`sku`),
KEY `idx_stock` (`stock`),
KEY `idx_sku_stock` (`sku`,`stock`),
KEY `idx_sid` (`sid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
EDIT: Adding query plans for a couple of suggested queries from Martin below:
mysql> explain SELECT nid FROM uc_products p WHERE NOT EXISTS ( SELECT 1 FROM uc_supplier_csv c WHERE p.model = c.sku ) ;
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | p | ALL | NULL | NULL | NULL | NULL | 6526 | Using where |
| 2 | DEPENDENT SUBQUERY | c | index | NULL | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
mysql> explain SELECT nid FROM uc_products WHERE model NOT IN ( SELECT sku FROM uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | uc_products | ALL | NULL | NULL | NULL | NULL | 6526 | Using where |
| 2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
Perhaps try using NOT EXISTS rather than counts? For example:
SELECT nid
FROM uc_products p
WHERE NOT EXISTS (
SELECT 1
FROM uc_supplier_csv c
WHERE p.model = c.sku
)
SO user Quassnoi has a short article outlining some tests that suggest that this might also be worth a try:
SELECT nid
FROM uc_products
WHERE model NOT IN (
SELECT sku
FROM uc_supplier_csv
)
basically as per your original query, without the DISTINCTion.
Another one for you Chris, this time with help for the cross-encoding join:
SELECT nid
FROM uc_products p
WHERE NOT EXISTS (
SELECT 1
FROM uc_supplier_csv c
WHERE CONVERT( p.model USING latin1 ) = c.sku
)