I have two tables:
1st:
CREATE TABLE IF NOT EXISTS `tags` (
`i` int(10) NOT NULL AUTO_INCREMENT,
`id` int(10) NOT NULL,
`k` varchar(32) COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`i`),
KEY `k` (`k`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=1505426 ;
and 2nd:
CREATE TABLE IF NOT EXISTS `w` (
`id` int(7) NOT NULL,
`title` varchar(255) COLLATE utf8_bin NOT NULL,
`k1` varchar(24) COLLATE utf8_bin NOT NULL,
`k2` varchar(24) COLLATE utf8_bin NOT NULL,
`k3` varchar(24) COLLATE utf8_bin NOT NULL,
`k4` varchar(24) COLLATE utf8_bin NOT NULL,
`k5` varchar(24) COLLATE utf8_bin NOT NULL,
`k6` varchar(24) COLLATE utf8_bin NOT NULL,
`k7` varchar(24) COLLATE utf8_bin NOT NULL,
`k8` varchar(24) COLLATE utf8_bin NOT NULL,
`w` int(5) NOT NULL,
`h` int(5) NOT NULL,
`s` varchar(32) COLLATE utf8_bin NOT NULL,
`r` varchar(11) COLLATE utf8_bin NOT NULL,
`v` int(7) NOT NULL,
`c` varchar(32) COLLATE utf8_bin NOT NULL,
`c1` varchar(6) COLLATE utf8_bin NOT NULL,
`c2` varchar(6) COLLATE utf8_bin NOT NULL,
`c3` varchar(6) COLLATE utf8_bin NOT NULL,
`c4` varchar(6) COLLATE utf8_bin NOT NULL,
`c5` varchar(6) COLLATE utf8_bin NOT NULL,
`c6` varchar(6) COLLATE utf8_bin NOT NULL,
`m` varchar(4) COLLATE utf8_bin NOT NULL,
`t` int(10) NOT NULL,
`i` int(6) NOT NULL,
`o` int(6) NOT NULL,
`f` varchar(255) COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`id`),
KEY `keywords` (`k1`,`k2`,`k3`,`k4`,`k5`,`k6`,`k7`,`k8`),
KEY `category` (`c`),
KEY `color1` (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin ROW_FORMAT=DYNAMIC;
I'm trying to search for two words ($search1 and $search2) in the 'tags' table and find the matching ids in the 'w' table.
query:
SELECT w.id,w.title,w.k1,w.k2,w.k3,w.k4,w.k5,w.k6,w.k7,w.w,w.h,w.s,w.c1,w.c2,w.c3,w.c4,w.c5,w.c6,w.m,w.f
FROM tags,tags as tags2,w
WHERE tags.k ='$search1' AND tags2.k = '$search2'
and tags.id = tags2.id and tags2.id = w.id
ORDER BY `w`.`id` desc limit 24
The problem is that EXPLAIN shows "Using where; Using temporary; Using filesort".
I also believe there is a better way to build this kind of search system.
All of the data is actually in the 'w' table; I just have no idea how to search it properly.
I would be grateful for any help.
Edit: a bit more information about this:
table 'tags': 1505425 records, InnoDB = 65,9 MiB
table 'w': 398900 records, InnoDB = 140,3 MiB
After this (updated) query:
select `w`.`id` AS `id`,`w`.`title` AS `title`,`w`.`k1` AS `k1`,`w`.`k2` AS `k2`,`w`.`k3` AS `k3`,`w`.`k4` AS `k4`,`w`.`k5` AS `k5`,`w`.`k6` AS `k6`,`w`.`k7` AS `k7`,`w`.`k8` AS `k8`,`w`.`w` AS `w`,`w`.`h` AS `h`,`w`.`s` AS `s`,`w`.`c1` AS `c1`,`w`.`c2` AS `c2`,`w`.`c3` AS `c3`,`w`.`c4` AS `c4`,`w`.`c5` AS `c5`,`w`.`c6` AS `c6`,`w`.`m` AS `m`,`w`.`f` AS `f` from `tags` join `tags` `tags2` join `w` where ((`tags`.`k` = '$search1') and (`tags2`.`id` = `tags`.`id`) and (`w`.`id` = `tags`.`id`) and (`tags2`.`k` = '$search2')) order by `tags`.`i` desc limit 0,24;
MySQL still shows:
+----+-------------+------------+--------+---------------+---------+---------+-------------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+--------+---------------+---------+---------+-------------+------+----------+-----------------------------+
| 1 | SIMPLE | tags | ref | k,id | k | 98 | const | 902 | 100.00 | Using where; Using filesort |
| 1 | SIMPLE | tags2 | ref | k,id | id | 4 | db.tags.id | 4 | 100.00 | Using where |
| 1 | SIMPLE | w | eq_ref | PRIMARY | PRIMARY | 4 | db.tags2.id | 1 | 100.00 | Using where |
+----+-------------+------------+--------+---------------+---------+---------+-------------+------+----------+-----------------------------+
3 rows in set, 1 warning (0.00 sec)
There is one warning, and otherwise everything remains the same. Is there any way to get better performance? At the moment it works fine, but I think it will get slower and slower as the tables grow.
You're mixing storage engines; MySQL has a hard time joining tables across storage engines.
Before:
mysql> explain SELECT w.id,w.title,w.k1,w.k2,w.k3,w.k4,w.k5,w.k6,w.k7,w.w,w.h,w.s,w.c1,w.c2,w.c3,w.c4,w.c5,w.c6,w.m,w.f FROM tags,tags as tags2,w WHERE tags.k ='$search1' AND tags2.k = 'search2' and tags.id = tags2.id and tags2.id = w.id ORDER BY `w`.`id`desc limit 24;
+----+-------------+-------+--------+---------------+---------+---------+-------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-------------+------+----------------------------------------------+
| 1 | SIMPLE | tags | ref | k | k | 98 | const | 1 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | tags2 | ref | k | k | 98 | const | 1 | Using where |
| 1 | SIMPLE | w | eq_ref | PRIMARY | PRIMARY | 4 | tmp.tags.id | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+-------------+------+----------------------------------------------+
Change w to MyISAM:
mysql> alter table w engine MyISAM;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
After:
mysql> explain SELECT w.id,w.title,w.k1,w.k2,w.k3,w.k4,w.k5,w.k6,w.k7,w.w,w.h,w.s,w.c1,w.c2,w.c3,w.c4,w.c5,w.c6,w.m,w.f FROM tags,tags as tags2,w WHERE tags.k ='$search1' AND tags2.k = 'search2' and tags.id = tags2.id and tags2.id = w.id ORDER BY `w`.`id`desc limit 24;
+----+-------------+-------+--------+---------------+------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+------+---------+-------+------+-------------+
| 1 | SIMPLE | w | system | PRIMARY | NULL | NULL | NULL | 1 | |
| 1 | SIMPLE | tags | ref | k | k | 98 | const | 1 | Using where |
| 1 | SIMPLE | tags2 | ref | k | k | 98 | const | 1 | Using where |
+----+-------------+-------+--------+---------------+------+---------+-------+------+-------------+
3 rows in set (0.00 sec)
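To confirm which engine each table actually uses before and after a change like this, the table metadata can be queried. This is a minimal sketch, assuming both tables live in the currently selected schema. Note also that the EXPLAIN output above comes from a test table holding a single row (hence the `system` access type), so it is worth re-running EXPLAIN against the real data before drawing conclusions.

```sql
-- List the storage engine of the two tables involved in the join.
-- Assumes the current default database holds both `tags` and `w`.
-- TABLE_ROWS is only an estimate for InnoDB tables.
SELECT TABLE_NAME, ENGINE, TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('tags', 'w');
```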
Vagiz, your updated query seems OK:
select `w`.`id` AS `id`,`w`.`title` AS `title`,`w`.`k1` AS `k1`,`w`.`k2` AS `k2`,
`w`.`k3` AS `k3`,`w`.`k4` AS `k4`,`w`.`k5` AS `k5`,`w`.`k6` AS `k6`,
`w`.`k7` AS `k7`,`w`.`k8` AS `k8`,`w`.`w` AS `w`,`w`.`h` AS `h`,`w`.`s` AS `s`,
`w`.`c1` AS `c1`,`w`.`c2` AS `c2`,`w`.`c3` AS `c3`,`w`.`c4` AS `c4`,
`w`.`c5` AS `c5`,`w`.`c6` AS `c6`,`w`.`m` AS `m`,`w`.`f` AS `f`
from `tags` join `tags` `tags2` join `w` where ((`tags`.`k` = '$search1')
and (`tags2`.`id` = `tags`.`id`) and (`w`.`id` = `tags`.`id`)
and (`tags2`.`k` = '$search2')) order by `tags`.`i` desc limit 0,24;
If it produces the following EXPLAIN result:
+----+-------------+------------+--------+---------------+---------+---------+-------------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+--------+---------------+---------+---------+-------------+------+----------+-----------------------------+
| 1 | SIMPLE | tags | ref | k,id | k | 98 | const | 902 | 100.00 | Using where; Using filesort |
| 1 | SIMPLE | tags2 | ref | k,id | id | 4 | db.tags.id | 4 | 100.00 | Using where |
| 1 | SIMPLE | w | eq_ref | PRIMARY | PRIMARY | 4 | db.tags2.id | 1 | 100.00 | Using where |
+----+-------------+------------+--------+---------------+---------+---------+-------------+------+----------+-----------------------------+
3 rows in set, 1 warning (0.00 sec)
Don't worry about filesort and performance. From the MySQL Performance Blog:
The truth is, filesort is badly named. Anytime a sort can’t be
performed from an index, it’s a filesort. It has nothing to do with
files. Filesort should be called “sort.” It is quicksort at heart.
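If the sort ever does become a bottleneck, one option to try is a composite index on (k, id), so that the rows for one keyword are already stored in id order. This is a sketch under assumptions, not something tested against this data: the index name k_id is made up, ordering by id rather than i is assumed to be acceptable (as in the original query), and whether the optimizer actually drops the filesort depends on the MySQL version and the join order it picks.

```sql
-- Hypothetical index name. (k, id) lets both self-join lookups be
-- answered from the index and keeps ids sorted within one keyword.
ALTER TABLE tags ADD INDEX k_id (k, id);

-- Check the plan: ideally t1 is read first via k_id without "Using filesort".
EXPLAIN
SELECT w.id, w.title
FROM tags AS t1
JOIN tags AS t2 ON t2.id = t1.id AND t2.k = '$search2'
JOIN w ON w.id = t1.id
WHERE t1.k = '$search1'
ORDER BY t1.id DESC
LIMIT 24;
```

As in the question, $search1 and $search2 stand for values substituted by the application; binding them as query parameters is safer than interpolating them into the SQL string.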
Related
I have three tables that are concerned by this query
CREATE TABLE `tags` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`latName` varchar(191) NOT NULL,
`araName` varchar(191) NOT NULL,
`active` tinyint(1) NOT NULL DEFAULT 0,
`img_name` varchar(191) DEFAULT NULL,
`icon` varchar(191) DEFAULT NULL,
`rgba_color` varchar(191) DEFAULT NULL,
`color` varchar(191) DEFAULT NULL,
`overlay` varchar(191) DEFAULT NULL,
`position` int(11) NOT NULL,
`mdi_icon` varchar(191) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `tags_latname_unique` (`latName`),
UNIQUE KEY `tags_araname_unique` (`araName`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb3
CREATE TABLE `newspapers` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`latName` varchar(191) NOT NULL,
`araName` varchar(191) NOT NULL,
`img_name` varchar(191) DEFAULT NULL,
`active` tinyint(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
UNIQUE KEY `newspapers_latname_unique` (`latName`),
UNIQUE KEY `newspapers_araname_unique` (`araName`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=utf8mb3
CREATE TABLE `articles` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`newspaper_id` bigint(20) unsigned NOT NULL,
`tag_id` bigint(20) unsigned NOT NULL,
`seen` int(10) unsigned NOT NULL,
`link` varchar(1000) NOT NULL,
`title` varchar(191) NOT NULL,
`img_name` varchar(191) NOT NULL,
`date` datetime NOT NULL,
`paragraph` text NOT NULL,
`read_time` int(11) DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `articles_link_unique` (`link`),
UNIQUE KEY `articles_img_name_unique` (`img_name`),
KEY `articles_newspaper_id_foreign` (`newspaper_id`),
KEY `articles_tag_id_foreign` (`tag_id`),
CONSTRAINT `articles_newspaper_id_foreign` FOREIGN KEY (`newspaper_id`) REFERENCES `newspapers` (`id`),
CONSTRAINT `articles_tag_id_foreign` FOREIGN KEY (`tag_id`) REFERENCES `tags` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=47421 DEFAULT CHARSET=utf8mb3
Basically, I want to load the latest 5 articles (ordered by date) that have an active newspaper and active tag.
Right now articles table contains about 40k entries.
This is the query generated by Laravel's query builder
SELECT `articles`.*
FROM `articles`
INNER JOIN `tags` ON `tags`.`id` = `articles`.`tag_id`
AND `tags`.`active` = 1
INNER JOIN `newspapers` ON `newspapers`.`id` = `articles`.`newspaper_id`
AND `newspapers`.`active` = 1
ORDER BY `date` DESC
LIMIT 5;
It takes MySQL about 6 seconds to run the query; when I remove the ORDER BY clause, the query becomes very fast (0.001sec).
Here is the query explanation:
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| 1 | SIMPLE | newspapers | ALL | PRIMARY | NULL | NULL | NULL | 18 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | articles | ref | articles_newspaper_id_foreign,articles_tag_id_foreign | articles_newspaper_id_foreign | 8 | mouhim.newspapers.id | 1127 | |
| 1 | SIMPLE | tags | eq_ref | PRIMARY | PRIMARY | 8 | mouhim.articles.tag_id | 1 | Using where |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
I tried creating an index on the date column, but it didn't help.
For convenience, this is how I am using the Query Builder for this query:
Article::select("articles.*")
->join("tags", function ($join) {
$join->on("tags.id", "articles.tag_id")
->where("tags.active", 1);
})
->join("newspapers", function ($join) {
$join->on("newspapers.id", "articles.newspaper_id")
->where("newspapers.active", 1);
})
->orderBy("date", "desc")
->paginate(5)
At first I was using Eloquent (whereHas), but Eloquent generated a non-optimized query using WHERE EXISTS, so I had to go the joins way.
What can I do to improve execution time of this query?
Result of SHOW INDEXES FROM articles;
+----------+------------+-------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+----------+------------+-------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| articles | 0 | PRIMARY | 1 | id | A | 36072 | NULL | NULL | | BTREE | | | NO |
| articles | 0 | articles_link_unique | 1 | link | A | 36072 | NULL | NULL | | BTREE | | | NO |
| articles | 0 | articles_img_name_unique | 1 | img_name | A | 36072 | NULL | NULL | | BTREE | | | NO |
| articles | 1 | articles_newspaper_id_foreign | 1 | newspaper_id | A | 32 | NULL | NULL | | BTREE | | | NO |
| articles | 1 | articles_tag_id_foreign | 1 | tag_id | A | 12 | NULL | NULL | | BTREE | | | NO |
| articles | 1 | data | 1 | date | A | 36072 | NULL | NULL | | BTREE | | | NO |
+----------+------------+-------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
This query was suggested by Rick James as a solution
SELECT `articles`.*
FROM `articles`
WHERE EXISTS ( SELECT 1 FROM tags WHERE id = `articles`.`tag_id` and active = 1)
AND EXISTS ( SELECT 1 FROM newspapers WHERE id = `articles`.`newspaper_id` and active = 1)
ORDER BY `date` DESC
LIMIT 5;
Running EXPLAIN on this query yields the following result
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
| 1 | PRIMARY | newspapers | ALL | PRIMARY | NULL | NULL | NULL | 18 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | articles | ref | articles_newspaper_id_foreign,articles_tag_id_foreign | articles_newspaper_id_foreign | 8 | mouhim.newspapers.id | 1127 | |
| 1 | PRIMARY | tags | eq_ref | PRIMARY | PRIMARY | 8 | mouhim.articles.tag_id | 1 | Using where |
+------+-------------+------------+--------+-------------------------------------------------------+-------------------------------+---------+------------------------+------+----------------------------------------------+
Assuming you don't want dups, change to this; it is likely to be much faster:
SELECT `articles`.*
FROM `articles`
WHERE EXISTS ( SELECT 1 FROM tags
WHERE id = `articles`.`tag_id` AND active = 1 )
AND EXISTS ( SELECT 1 FROM newspapers
WHERE id = `articles`.`newspaper_id` AND active = 1 )
ORDER BY `date` DESC
LIMIT 5;
Also, have this index on articles:
INDEX(date)
(This is a rare use case for starting index with a column that will be used in a 'range'.)
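Whether the optimizer actually walks that index depends on the join order it picks. As a sketch for checking this, EXPLAIN can be run on a variant of the join query that forces articles to be read first with the STRAIGHT_JOIN modifier. This is meant as a diagnostic, assuming the tables are as shown in the question; it is not a claim that the forced order will be faster.

```sql
-- STRAIGHT_JOIN makes the server join the tables in the order written,
-- so `articles` is read first and an index on `date` becomes usable
-- for the ORDER BY.
EXPLAIN
SELECT STRAIGHT_JOIN articles.*
FROM articles
JOIN tags ON tags.id = articles.tag_id AND tags.active = 1
JOIN newspapers ON newspapers.id = articles.newspaper_id AND newspapers.active = 1
ORDER BY articles.`date` DESC
LIMIT 5;
```

If the first row of the plan shows articles with the date index as its key and no "Using filesort", the ORDER BY ... LIMIT 5 can stop after the first five matching rows.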
(Sorry, I don't speak 'Laravel'; maybe someone else can help with that part.)
PS. Having 3 UNIQUE keys on a table is highly unusual. It often indicates a problem with the schema design.
"each article has one and only one Tag associated with it"
Can multiple articles have the same Tag?
"when I remove the ORDER BY clause, the query becomes very fast (0.001sec)."
That is because you get whatever 5 rows are easy to return to you. Clearly the ORDER BY is part of the requirement. "Using temporary; Using filesort" says there was at least a sort. It will actually be a "file" sort -- because SELECT * includes a TEXT column. (There is a technique to avoid "file", but I don't think it is needed here.)
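The technique alluded to is presumably a "deferred join" (that reading is an assumption): sort only the ids, pick the five winners, and fetch the wide rows, including the TEXT column, afterwards. A sketch, using the active = 1 filters from the question:

```sql
SELECT a.*
FROM (
    -- Inner query: only narrow columns take part in the sort.
    SELECT a0.id
    FROM articles AS a0
    WHERE EXISTS ( SELECT 1 FROM tags
                   WHERE tags.id = a0.tag_id AND tags.active = 1 )
      AND EXISTS ( SELECT 1 FROM newspapers
                   WHERE newspapers.id = a0.newspaper_id AND newspapers.active = 1 )
    ORDER BY a0.`date` DESC
    LIMIT 5
) AS latest
-- Outer query: fetch the full rows for just those five ids.
JOIN articles AS a ON a.id = latest.id
ORDER BY a.`date` DESC;
```

The outer ORDER BY is repeated because the order of rows coming out of a derived table is not guaranteed to survive the join.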
I am not sure if the two queries are supposed to be the same, but they are not.
Anyway, for the second query I think this should be better:
Article::leftJoin('tags', 'articles.tag_id', '=', 'tags.id')
->where('tags.latName', $tag)
->orderBy("articles.date", "desc")
->select(['articles.*'])
->paginate(5);
The problem is probably that the subquery you created in whereIn is slowing it down, and whereIn itself may slow your query as well. Using join and where instead may ease this.
As for the first query, can you show how you did the index for date? :)
I've tried this a few different ways and am getting bad results.
The Core problem is that Member Search is scanning ALL members, ignoring indexes.
The main reason (from what I can tell) is this fragment
(Member.priv_profile = 3 OR MyFriend.status_id IN (1,2))
Either side of that OR fragment alone works fine: it gets an index, scans a few rows, and thus performs well.
I really don't want to split this query into two and do a UNION, but we might have to do so unless someone can come up with a good way of making this select "work" with the important OR.
mysql> ALTER TABLE `members` ADD INDEX A (is_active, last_name, first_name);
Query OK, 140019 rows affected (6.82 sec)
Records: 140019 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE `members` ADD INDEX B (is_active, last_name, first_name, priv_profile);
Query OK, 140019 rows affected (7.70 sec)
Records: 140019 Duplicates: 0 Warnings: 0
mysql> explain SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member`
LEFT JOIN `ao_prod`.`member_friends` AS `MyFriend` ON (`MyFriend`.`member_2_id` = `Member`.`id` AND member_1_id = '150365')
WHERE `Member`.`is_active` = '1' AND NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '') AND (`Member`.`priv_profile` = 3 OR `MyFriend`.`status_id` IN (1,2));
+----+-------------+----------+------+----------------------------------------------+-------------+---------+-------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+----------------------------------------------+-------------+---------+-------+--------+--------------------------+
| 1 | SIMPLE | Member | ALL | active_delete,scope,member_search_alerts,A,B | NULL | NULL | NULL | 140019 | Using where |
| 1 | SIMPLE | MyFriend | ref | member_1_id | member_1_id | 4 | const | 155 | Using where; Using index |
+----+-------------+----------+------+----------------------------------------------+-------------+---------+-------+--------+--------------------------+
2 rows in set (0.00 sec)
// without the "public profile" part
mysql> explain SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member`
LEFT JOIN `ao_prod`.`member_friends` AS `MyFriend` ON (`MyFriend`.`member_2_id` = `Member`.`id` AND member_1_id = '150365')
WHERE `Member`.`is_active` = '1' AND NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '') AND (`MyFriend`.`status_id` IN (1,2));
+----+-------------+----------+--------+------------------------------------------------------+-------------+---------+------------------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+------------------------------------------------------+-------------+---------+------------------------------+------+--------------------------+
| 1 | SIMPLE | MyFriend | range | member_1_id | member_1_id | 5 | NULL | 251 | Using where; Using index |
| 1 | SIMPLE | Member | eq_ref | PRIMARY,active_delete,scope,member_search_alerts,A,B | PRIMARY | 4 | ao_prod.MyFriend.member_2_id | 1 | Using where |
+----+-------------+----------+--------+------------------------------------------------------+-------------+---------+------------------------------+------+--------------------------+
2 rows in set (0.00 sec)
// without the "my connection" part
mysql> explain SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member`
LEFT JOIN `ao_prod`.`member_friends` AS `MyFriend` ON (`MyFriend`.`member_2_id` = `Member`.`id` AND member_1_id = '42983')
WHERE `Member`.`is_active` = '1' AND ( NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '')) AND (`Member`.`priv_profile` = 3);
+----+-------------+----------+------+----------------------------------------------+-------------+---------+-------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+----------------------------------------------+-------------+---------+-------------+------+-------------+
| 1 | SIMPLE | Member | ref | active_delete,scope,member_search_alerts,A,B | scope | 2 | const,const | 2007 | Using where |
| 1 | SIMPLE | MyFriend | ref | member_1_id | member_1_id | 4 | const | 252 | Using index |
+----+-------------+----------+------+----------------------------------------------+-------------+---------+-------------+------+-------------+
2 rows in set (0.01 sec)
// as a subquery vs. join (no workie)
mysql> explain SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member`
WHERE `Member`.`is_active` = '1' AND NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '') AND ( `Member`.`id` IN (
SELECT member_2_id FROM member_friends WHERE member_1_id = 150365 AND status_id IN (1,2)
));
+----+--------------------+----------------+-------+----------------------------------------------+-------------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------------+-------+----------------------------------------------+-------------+---------+------+--------+--------------------------+
| 1 | PRIMARY | Member | ALL | active_delete,scope,member_search_alerts,A,B | NULL | NULL | NULL | 140019 | Using where |
| 2 | DEPENDENT SUBQUERY | member_friends | range | member_1_id | member_1_id | 5 | NULL | 155 | Using where; Using index |
+----+--------------------+----------------+-------+----------------------------------------------+-------------+---------+------+--------+--------------------------+
2 rows in set (0.01 sec)
// sketch of the possible, ugly UNION
mysql> explain SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member` LEFT JOIN `ao_prod`.`member_friends` AS `MyFriend` ON (`MyFriend`.`member_2_id` = `Member`.`id` AND member_1_id = '42983') WHERE `Member`.`is_active` = '1' AND ( NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '')) AND (`MyFriend`.`status_id` IN (1,2))
-> UNION
-> SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member` WHERE `Member`.`is_active` = '1' AND ( NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '')) AND (`Member`.`priv_profile` = 3)
-> GROUP BY Member.id
-> ;
+----+--------------+------------+--------+------------------------------------------------------+-------------+---------+------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+--------+------------------------------------------------------+-------------+---------+------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | MyFriend | range | member_1_id | member_1_id | 5 | NULL | 251 | Using where; Using index |
| 1 | PRIMARY | Member | eq_ref | PRIMARY,active_delete,scope,member_search_alerts,A,B | PRIMARY | 4 | ao_prod.MyFriend.member_2_id | 1 | Using where |
| 2 | UNION | Member | ref | active_delete,scope,member_search_alerts,A,B | scope | 2 | const,const | 2007 | Using where; Using temporary; Using filesort |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+------------+--------+------------------------------------------------------+-------------+---------+------------------------------+------+----------------------------------------------+
4 rows in set (0.02 sec)
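Note that a UNION of two COUNT(*) queries does not produce one combined total. A sketch of a UNION that yields a single count, assuming the goal is the number of distinct visible members: each branch selects member ids, UNION removes members matched by both branches, and the outer query counts what is left. The alias visible_members is made up for illustration, and the schema prefix is omitted.

```sql
SELECT COUNT(*) AS `count`
FROM (
    -- Branch 1: members connected to member 150365
    SELECT Member.id
    FROM members AS Member
    JOIN member_friends AS MyFriend
      ON MyFriend.member_2_id = Member.id
     AND MyFriend.member_1_id = 150365
    WHERE Member.is_active = 1
      AND NOT (Member.first_name = '' AND Member.last_name = '')
      AND MyFriend.status_id IN (1, 2)
    UNION
    -- Branch 2: members with a public profile
    SELECT Member.id
    FROM members AS Member
    WHERE Member.is_active = 1
      AND NOT (Member.first_name = '' AND Member.last_name = '')
      AND Member.priv_profile = 3
) AS visible_members;
```

Unlike the LEFT JOIN version, this counts each member once even if member_friends holds more than one matching row for the same pair.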
// using index hinting to no avail
mysql> explain SELECT COUNT(*) AS `count`
FROM `ao_prod`.`members` AS `Member`
USE INDEX (A)
LEFT JOIN `ao_prod`.`member_friends` AS `MyFriend` ON (`MyFriend`.`member_2_id` = `Member`.`id` AND member_1_id = '150365')
WHERE `Member`.`is_active` = '1' AND NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '') AND (`Member`.`priv_profile` = 3 OR `MyFriend`.`status_id` IN (1,2));
+----+-------------+----------+------+---------------+-------------+---------+-------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-------+--------+--------------------------+
| 1 | SIMPLE | Member | ALL | A | NULL | NULL | NULL | 140245 | Using where |
| 1 | SIMPLE | MyFriend | ref | member_1_id | member_1_id | 4 | const | 181 | Using where; Using index |
+----+-------------+----------+------+---------------+-------------+---------+-------+--------+--------------------------+
2 rows in set (0.01 sec)
Here are create statements for the involved tables (full, ugly tables and all other indexes shown)
CREATE TABLE IF NOT EXISTS `member_friends` (
`id` varchar(36) NOT NULL,
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
`member_1_id` int(11) NOT NULL DEFAULT '0',
`member_2_id` int(11) NOT NULL DEFAULT '0',
`status_id` tinyint(3) NOT NULL DEFAULT '0',
`requested_by` tinyint(3) NOT NULL DEFAULT '0',
`requested` datetime DEFAULT NULL,
`accepted` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `member_1_id` (`member_1_id`,`status_id`,`member_2_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `members_fields`
ADD PRIMARY KEY (`id`), ADD KEY `key` (`key`), ADD KEY `member_key` (`member_id`,`key`);
CREATE TABLE IF NOT EXISTS `members` (
`id` int(11) NOT NULL,
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
`profile_updated` datetime NOT NULL,
`last_login` datetime DEFAULT NULL,
`is_active` tinyint(1) NOT NULL,
`email` varchar(256) NOT NULL DEFAULT '',
`password` varchar(40) NOT NULL,
`first_name` varchar(128) NOT NULL DEFAULT '',
`middle_name` varchar(128) NOT NULL,
`last_name` varchar(128) NOT NULL DEFAULT '',
`suffix` varchar(32) NOT NULL,
`company` varchar(128) NOT NULL,
`address` varchar(128) NOT NULL,
`address_2` varchar(128) NOT NULL,
`city` varchar(128) NOT NULL,
`state` varchar(5) NOT NULL,
`zip` varchar(16) NOT NULL,
`location_name` varchar(128) NOT NULL,
`image_url` varchar(256) NOT NULL,
`slug` varchar(64) NOT NULL,
`headline` varchar(256) NOT NULL,
`experience_level` varchar(64) NOT NULL,
`apply_job_states` varchar(256) NOT NULL COMMENT 'CSV list',
`apply_job_us` tinyint(1) NOT NULL DEFAULT '0',
`apply_job_ca` tinyint(1) NOT NULL DEFAULT '0',
`apply_job_traveling` tinyint(1) NOT NULL DEFAULT '0',
`apply_job_international` tinyint(1) NOT NULL DEFAULT '0',
`apply_job_fulltime` tinyint(1) NOT NULL DEFAULT '0',
`apply_job_parttime` tinyint(1) NOT NULL DEFAULT '0',
`apply_job_perdiem` tinyint(1) NOT NULL DEFAULT '0',
`contact_for_professional_opportunities` tinyint(1) NOT NULL DEFAULT '0',
`contact_for_job_inquiries` tinyint(1) NOT NULL DEFAULT '0',
`contact_for_new_ventures` tinyint(1) NOT NULL DEFAULT '0',
`contact_for_expertise_requests` tinyint(1) NOT NULL DEFAULT '0',
`country` varchar(2) NOT NULL,
`timezone` varchar(32) NOT NULL,
`phone` varchar(16) NOT NULL,
`fax` varchar(16) NOT NULL,
`birthday` varchar(5) NOT NULL COMMENT 'MM/DD (required)',
`birth_year` varchar(4) DEFAULT NULL COMMENT 'YYYY (optional)',
`corp_id` int(11) NOT NULL DEFAULT '0',
`is_deleted` tinyint(1) NOT NULL,
`url` varchar(256) DEFAULT NULL,
`emails` varchar(512) NOT NULL COMMENT 'JSON list of alternate emails',
`phones` varchar(512) NOT NULL COMMENT 'JSON list of alternate phones',
`lat` float NOT NULL,
`lon` float NOT NULL,
`facebook_id` varchar(32) NOT NULL,
`connect_id` int(11) NOT NULL,
`is_student` tinyint(1) NOT NULL DEFAULT '0',
`is_career_center_recruiter` tinyint(1) NOT NULL DEFAULT '0',
`is_continuing_education_portal_manager` tinyint(1) NOT NULL DEFAULT '0',
`is_manually_approved` tinyint(1) NOT NULL DEFAULT '0',
`is_employer` tinyint(1) NOT NULL DEFAULT '0',
`is_jobseeker` tinyint(1) NOT NULL DEFAULT '0',
`is_jobseeker_badge` tinyint(1) NOT NULL DEFAULT '0',
`is_contributor` tinyint(1) NOT NULL DEFAULT '0',
`priv_profile` tinyint(3) NOT NULL DEFAULT '1',
`priv_email` tinyint(3) NOT NULL DEFAULT '0',
`priv_phone` tinyint(3) NOT NULL DEFAULT '0',
`has_certification` tinyint(1) DEFAULT NULL,
`has_state_license` tinyint(1) DEFAULT NULL,
`job_title` varchar(64) NOT NULL,
`occupation_id` int(11) NOT NULL,
`occupation_other` varchar(64) NOT NULL,
`work_setting_id` int(11) NOT NULL,
`work_setting_other` varchar(64) NOT NULL,
`memberships_honors_awards` text NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1093688 ;
ALTER TABLE `members`
ADD PRIMARY KEY (`id`), ADD KEY `is_cc` (`is_career_center_recruiter`,`corp_id`), ADD KEY `is_ce` (`is_continuing_education_portal_manager`,`corp_id`), ADD KEY `corp_id` (`corp_id`), ADD KEY `active_delete` (`is_active`,`is_deleted`), ADD KEY `delete` (`is_deleted`), ADD KEY `email_pass` (`email`,`password`), ADD KEY `apply_job_states` (`apply_job_states`,`apply_job_us`,`apply_job_ca`), ADD KEY `experience_level` (`experience_level`), ADD KEY `latlon` (`lat`,`lon`), ADD KEY `location` (`state`,`zip`), ADD KEY `slug` (`slug`,`is_active`,`priv_profile`), ADD KEY `scope` (`is_active`,`priv_profile`,`state`), ADD KEY `member_search_alerts` (`is_active`,`is_jobseeker`,`profile_updated`,`priv_profile`,`apply_job_us`,`apply_job_ca`);
UPDATE: as requested, here are the optimizer settings
mysql> SELECT @@optimizer_switch\G
*************************** 1. row ***************************
@@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on
1 row in set (0.00 sec)
NOTE: this has been tested on
Server version: 5.6.20-68.0-56-log - Percona XtraDB Cluster (GPL), Release 25.7
Server version: 5.5.29-0ubuntu0.12.04.1
Server version: 5.1.72 - Source distribution
In this case, one of the tables was MyISAM and the other was InnoDB.
When I switched both to InnoDB, it magically changed from ALL to ref and from scanning all rows to a subset.
mysql> explain SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member` LEFT JOIN `ao_prod`.`member_friends` AS `MyFriend` ON (`MyFriend`.`member_2_id` = `Member`.`id` AND member_1_id = '150365') WHERE `Member`.`is_active` = '1' AND NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '') AND (`Member`.`priv_profile` = 3 OR `MyFriend`.`status_id` IN (1,2));
+----+-------------+----------+------+---------------+-------------+---------+-------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-------+--------+--------------------------+
| 1 | SIMPLE | Member | ALL | A | NULL | NULL | NULL | 140245 | Using where |
| 1 | SIMPLE | MyFriend | ref | member_1_id | member_1_id | 4 | const | 181 | Using where; Using index |
+----+-------------+----------+------+---------------+-------------+---------+-------+--------+--------------------------+
2 rows in set (0.00 sec)
mysql> ALTER TABLE `members` ENGINE = InnoDB;
Query OK, 140245 rows affected (1 min 8.10 sec)
Records: 140245 Duplicates: 0 Warnings: 0
mysql> explain SELECT COUNT(*) AS `count` FROM `ao_prod`.`members` AS `Member` LEFT JOIN `ao_prod`.`member_friends` AS `MyFriend` ON (`MyFriend`.`member_2_id` = `Member`.`id` AND member_1_id = '150365') WHERE `Member`.`is_active` = '1' AND NOT(`Member`.`first_name` = '' AND `Member`.`last_name` = '') AND (`Member`.`priv_profile` = 3 OR `MyFriend`.`status_id` IN (1,2));
+----+-------------+----------+------+---------------+-------------+---------+-------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-------+-------+--------------------------+
| 1 | SIMPLE | Member | ref | A | A | 1 | const | 53916 | Using where |
| 1 | SIMPLE | MyFriend | ref | member_1_id | member_1_id | 4 | const | 181 | Using where; Using index |
+----+-------------+----------+------+---------------+-------------+---------+-------+-------+--------------------------+
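To check for this situation before converting anything, the storage engine of each table in the join can be read from information_schema. This is only a minimal sketch, assuming the schema and table names from the query above (ao_prod, members, member_friends); adjust them to your setup. Note that the ALTER rebuilds the table, which took over a minute in the transcript above.

```sql
-- List the storage engine of both tables involved in the join.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'ao_prod'
  AND TABLE_NAME IN ('members', 'member_friends');

-- Convert whichever table is still MyISAM (here: members), then re-run EXPLAIN.
ALTER TABLE `ao_prod`.`members` ENGINE = InnoDB;
```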
Creating these two tables along with those two indices and running your first query actually uses index A for the members table:
+----+-------------+----------+------+---------------+-------------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-------+------+--------------------------+
| 1 | SIMPLE | Member | ref | A,B | A | 1 | const | 3199 | Using index condition |
| 1 | SIMPLE | MyFriend | ref | member_1_id | member_1_id | 4 | const | 2 | Using where; Using index |
+----+-------------+----------+------+---------------+-------------+---------+-------+------+--------------------------+
Tested on: 5.6.19-0ubuntu0.14.04.1
Also on SQLFiddle
I have two tables with matches and users.
I'm trying to find the way to get the top countries playing matches, and I have this SQL:
select
distinct(user.country),
count(*) as counter
from matches
left join user on matches.user_id = user.id
where
matches.`date` between '2014-01-01' and '2014-03-15'
group by user.country
order by counter DESC
limit 10
The problem is that EXPLAIN reports "Using where; Using temporary; Using filesort", and the query takes about 8 s on an Amazon RDS m3.medium instance (not a bad one!).
I have user.country indexed. Both tables are InnoDB.
Any ideas on how to improve it?
Tables:
CREATE TABLE `user` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`nick` varchar(32) DEFAULT NULL,
`email` varchar(128) DEFAULT NULL,
`password` varchar(40) DEFAULT NULL,
`country` char(2) DEFAULT '',
PRIMARY KEY (`id`),
KEY `country` (`country`)
) ENGINE=InnoDB AUTO_INCREMENT=254183 DEFAULT CHARSET=utf8;
CREATE TABLE `matches` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) unsigned DEFAULT NULL,
`date` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `date` (`date`)
) ENGINE=InnoDB AUTO_INCREMENT=2593195 DEFAULT CHARSET=utf8;
EXPLAIN gives:
+----+-------------+---------+--------+-----------------+---------+---------+----------------------------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+-----------------+---------+---------+----------------------------+---------+----------------------------------------------+
| 1 | SIMPLE | matches | ALL | date | NULL | NULL | NULL | 2386708 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | user | eq_ref | PRIMARY,country | PRIMARY | 4 | matches.user_id | 1 | NULL |
+----+-------------+---------+--------+-----------------+---------+---------+----------------------------+---------+----------------------------------------------+
EDIT: Changing to inner join:
+----+-------------+----------+-------+------------------------------+-----------------+---------+------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+------------------------------+-----------------+---------+------------------+--------+----------------------------------------------+
| 1 | SIMPLE | user | index | PRIMARY,country | country | 7 | NULL | 234262 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | matches | ref | user_id,date | user_id | 5 | user.id | 5 | Using where |
+----+-------------+----------+-------+------------------------------+-----------------+---------+------------------+--------+----------------------------------------------+
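One thing that might be worth trying here, offered as a sketch rather than a confirmed fix: DISTINCT is redundant when the query already groups by user.country, and a composite index on (date, user_id) could let MySQL resolve the date range and the join key from the index alone. The index name idx_date_user is hypothetical. The inner join drops matches that have no user row, which the original LEFT JOIN would have counted under a NULL country. Because the grouping column lives in the other table, the temporary table and filesort will most likely still appear in EXPLAIN; the aim is only to reduce the number of rows read from matches.

```sql
-- Hypothetical index name; covers the range filter and the join column.
ALTER TABLE matches ADD INDEX idx_date_user (`date`, user_id);

-- Same aggregation without the redundant DISTINCT.
SELECT u.country, COUNT(*) AS counter
FROM matches AS m
JOIN user AS u ON u.id = m.user_id
WHERE m.`date` BETWEEN '2014-01-01' AND '2014-03-15'
GROUP BY u.country
ORDER BY counter DESC
LIMIT 10;
```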
Any idea why the indexes on cli.clitype and c.cli are not being picked, even if I add a USE INDEX or FORCE INDEX hint to the SQL? The query below takes 4 seconds to fetch only 1634 rows. I'm using 5.5.25.log. Please suggest.
mysql> explain SELECT DATE(sr.`date`), v.company_name, c.cli, COUNT(*), c.charge, SUM(c.`charge`) FROM subscriptionrequest AS sr, cli AS c , vendor AS v WHERE sr.cli = c.cli AND sr.secretkey = v.secretkey AND sr.`date` BETWEEN'2012-03-12 00:00:00' AND '2012-10-13 00:00:00' and c.clitype = 'chargemo' GROUP BY DATE(sr.`date`), sr.secretkey,c.cli;
+----+-------------+-------+------+-----------------------------------------------+----------------+---------+----------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------------------------------------+----------------+---------+----------------------------+------+----------------------------------------------+
| 1 | SIMPLE | c | ALL | idx_cli | NULL | NULL | NULL | 115 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | sr | ref | idx_subreq_key,idx_subreq_cli,idx_subreq_date | idx_subreq_cli | 53 | crystal_du_sm.c.cli | 869 | Using where |
| 1 | SIMPLE | v | ref | secretkey_idx | secretkey_idx | 52 | crystal_du_sm.sr.secretkey | 1 | Using where |
+----+-------------+-------+------+-----------------------------------------------+----------------+---------+----------------------------+------+----------------------------------------------+
3 rows in set (0.00 sec)
mysql> show indexes from cli;
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| cli | 0 | PRIMARY | 1 | idcli | A | 115 | NULL | NULL | | BTREE | | |
| cli | 1 | idx_cli | 1 | cli | A | 115 | NULL | NULL | | BTREE | | |
| cli | 1 | cli_type_idx | 1 | clitype | A | 115 | NULL | NULL | YES | BTREE | | |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)
mysql> show create table cli;
| cli | CREATE TABLE `cli` (
`idcli` bigint(255) NOT NULL AUTO_INCREMENT,
`cli` varchar(256) NOT NULL,
`type` enum('SDMF','MDMF') NOT NULL DEFAULT 'SDMF',
`priority` enum('realtime','high','normal','low','ignore') NOT NULL DEFAULT 'normal',
`status` enum('active','inactive','suspended','deleted') NOT NULL DEFAULT 'active',
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`description` text,
`charge` float DEFAULT '0',
`clitype` enum('chargemo','freemo') DEFAULT 'freemo',
PRIMARY KEY (`idcli`),
KEY `idx_cli` (`cli`),
KEY `cli_type_idx` (`clitype`)
) ENGINE=InnoDB AUTO_INCREMENT=117 DEFAULT CHARSET=latin1 |
1 row in set (0.00 sec)
mysql> show create table vendor;
| vendor | CREATE TABLE `vendor` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(256) NOT NULL,
`company_name` varchar(256) DEFAULT NULL,
`phone_no` varchar(256) DEFAULT NULL,
`status` enum('active','inactive','suspended','deleted') DEFAULT 'active',
`mo` bigint(255) NOT NULL,
`mt` bigint(255) NOT NULL,
`used_mo` bigint(255) DEFAULT '0',
`used_mt` bigint(255) DEFAULT '0',
`start_time` timestamp NULL DEFAULT '0000-00-00 00:00:00',
`end_time` timestamp NULL DEFAULT '0000-00-00 00:00:00',
`secretkey` varchar(50) NOT NULL,
`callback_url` text,
`payment_callback_url` text,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`userid` int(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `secretkey_idx` (`secretkey`)
) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=latin1 |
1 row in set (0.00 sec)
| subscriptionrequest | CREATE TABLE `subscriptionrequest` (
`id` bigint(255) unsigned NOT NULL AUTO_INCREMENT,
`ipaddress` varchar(256) CHARACTER SET latin1 NOT NULL DEFAULT '0.0.0.0',
`message` text,
`msisdn` varchar(50) CHARACTER SET latin1 DEFAULT NULL,
`mode` varchar(50) CHARACTER SET latin1 DEFAULT NULL,
`cli` varchar(50) CHARACTER SET latin1 DEFAULT NULL,
`transactionid` varchar(100) DEFAULT NULL,
`secretkey` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`error_code` int(10) DEFAULT NULL,
`success` int(10) NOT NULL DEFAULT '0',
`status` enum('waiting','processing','completed','moexceeds','reject') DEFAULT 'waiting',
PRIMARY KEY (`id`),
KEY `idx_subreq_key` (`secretkey`),
KEY `idx_subreq_status` (`status`),
KEY `idx_subreq_transid` (`transactionid`),
KEY `idx_subreq_cli` (`cli`),
KEY `idx_subreq_date` (`date`)
) ENGINE=InnoDB AUTO_INCREMENT=1594161 DEFAULT CHARSET=utf8 |
FOR SETSUNA ---
mysql> explain SELECT DATE(sr.`date`) AS sr_date, v.company_name, c.cli,
-> COUNT(*) AS cnt, c.charge,
-> SUM(c.`charge`) AS charge_sum
-> FROM
-> subscriptionrequest AS sr
-> JOIN cli AS c ON sr.cli = c.cli
-> JOIN vendor AS v ON sr.secretkey = v.secretkey
-> WHERE
-> sr.`date` >= '2012-03-12' AND sr.`date` <= '2012-10-13'
-> AND c.clitype = 'chargemo'
-> GROUP BY DATE(sr.`date`), sr.secretkey, c.cli;
+----+-------------+-------+------+-----------------------------+----------------+---------+---------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------------------+----------------+---------+---------------------------+-------+---------------------------------+
| 1 | SIMPLE | v | ALL | secretkey_idx | NULL | NULL | NULL | 9 | Using temporary; Using filesort |
| 1 | SIMPLE | sr | ref | idx_subreq_key,cli_date_idx | idx_subreq_key | 103 | crystal_du_sm.v.secretkey | 88746 | Using where |
| 1 | SIMPLE | c | ref | idx_cli,cli_type_idx | idx_cli | 258 | crystal_du_sm.sr.cli | 1 | Using where |
+----+-------------+-------+------+-----------------------------+----------------+---------+---------------------------+-------+---------------------------------+
3 rows in set (0.00 sec)
--- 23/8/2012 ---
+----+-------------+-------+------+---------------------------------------+-----------------+---------+----------------------------+-------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------------------------------+-----------------+---------+----------------------------+-------+-----------------+
| 1 | SIMPLE | v | ALL | secretkey_idx | NULL | NULL | NULL | 9 | Using temporary |
| 1 | SIMPLE | sr | ref | idx_subreq_key,idx_date_cli_secretkey | idx_subreq_key | 103 | crystal_du_sm.v.secretkey | 88608 | Using where |
| 1 | SIMPLE | c | ref | idx_cli_clitype | idx_cli_clitype | 260 | crystal_du_sm.sr.cli,const | 1 | Using where |
+----+-------------+-------+------+---------------------------------------+-----------------+---------+----------------------------+-------+-----------------+
3 rows in set (0.00 sec)
A couple of general remarks:
Avoid using SQL keywords as column names (subscriptionrequest.date).
Use aliases on field names, especially when using functions.
I think this version is more readable:
SELECT DATE(sr.`date`) AS sr_date, v.company_name, c.cli,
COUNT(*) AS cnt, c.charge,
SUM(c.`charge`) AS charge_sum
FROM
subscriptionrequest AS sr
JOIN cli AS c ON sr.cli = c.cli
JOIN vendor AS v ON sr.secretkey = v.secretkey
WHERE
sr.`date` >= '2012-03-12' AND sr.`date` <= '2012-10-13'
AND c.clitype = 'chargemo'
GROUP BY DATE(sr.`date`), sr.secretkey, c.cli;
You will probably need to modify the subscriptionrequest table:
ALTER TABLE subscriptionrequest
  DROP INDEX `idx_subreq_cli`,
  DROP INDEX `idx_subreq_date`,
  ADD INDEX `cli_date` (`date`,`cli`);
This should help MySQL fetch the proper subset of records based on the date field, reducing the number of rows read from the subscriptionrequest table.
Edit #1
Schema Modifications & (Slight) Query Optimization:
ALTER TABLE subscriptionrequest DROP INDEX `cli_date`,
ADD INDEX `idx_date_cli_secretkey` (`date`,`secretkey`,`cli`);
ALTER TABLE `cli` DROP INDEX idx_cli, DROP INDEX cli_type_idx,
ADD INDEX `idx_cli_clitype` (cli,clitype);
EXPLAIN SELECT DATE(sr.`date`) AS sr_date, v.company_name, c.cli,
COUNT(*) AS cnt, c.charge, SUM(c.`charge`) AS charge_sum
FROM subscriptionrequest AS sr
JOIN cli AS c ON sr.cli = c.cli
JOIN vendor AS v ON sr.secretkey = v.secretkey
WHERE sr.`date` >= '2012-03-12' AND sr.`date` <= '2012-10-13'
AND c.clitype = 'chargemo'
GROUP BY DATE(sr.`date`), sr.secretkey, c.cli
ORDER BY NULL\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sr
type: index
possible_keys: idx_subreq_key,idx_date_cli_secretkey
key: idx_date_cli_secretkey
key_len: 160
ref: NULL
rows: 1
Extra: Using where; Using index; Using temporary
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: c
type: ref
possible_keys: idx_cli_clitype
key: idx_cli_clitype
key_len: 260
ref: so_12055859.sr.cli,const
rows: 1
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: v
type: ref
possible_keys: secretkey_idx
key: secretkey_idx
key_len: 52
ref: so_12055859.sr.secretkey
rows: 1
Extra: Using where
3 rows in set (0.01 sec)
I have a couple of tables (products and suppliers) and want to find out which items are no longer listed in the suppliers table.
Table uc_products has the products. Table uc_supplier_csv has supplier stocks. uc_products.model joins against uc_supplier_csv.sku.
I am seeing very long queries when trying to identify the stock in the products table which are not referred to in the suppliers table. I only want to extract the nid of the entries which match; sid IS NULL is just so I can identify which items don't have a supplier.
For the first of the queries below, it takes the DB server (4GB ram / 2x 2.4GHz intel) an hour to get a result (507 rows). I didn't wait for the second query to finish.
How can I make this query more optimal? Is it due to the mismatched character sets?
I was thinking that the following would be the most efficient SQL to use:
SELECT nid, sid
FROM uc_products p
LEFT OUTER JOIN uc_supplier_csv c
ON p.model = c.sku
WHERE sid IS NULL ;
For this query, I get the following EXPLAIN result:
mysql> EXPLAIN SELECT nid, sid FROM uc_products p LEFT OUTER JOIN uc_supplier_csv c ON p.model = c.sku WHERE sid IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| 1 | SIMPLE | p | ALL | NULL | NULL | NULL | NULL | 6526 | |
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 126639 | Using where; Not exists |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
2 rows in set (0.00 sec)
I would have thought that the keys idx_sku and idx_model would be valid for use here, but they aren't. Is that because the tables' default charsets do not match? One is UTF-8 and one is latin1.
I also considered this form:
SELECT nid
FROM uc_products
WHERE model
NOT IN (
SELECT DISTINCT sku FROM uc_supplier_csv
) ;
EXPLAIN shows the following results for that query:
mysql> explain select nid from uc_products where model not in ( select sku from uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | uc_products | ALL | NULL | NULL | NULL | NULL | 6520 | Using where |
| 2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
And just so I don't miss anything out, here are a few more exciting details: the table sizes and stats, and the table structure :)
mysql> show table status where Name in ( 'uc_supplier_csv', 'uc_products' ) ;
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| uc_products | MyISAM | 10 | Dynamic | 6520 | 89 | 585796 | 281474976710655 | 232448 | 912 | NULL | 2009-04-24 11:03:15 | 2009-10-12 14:23:43 | 2009-04-24 11:03:16 | utf8_general_ci | NULL | | |
| uc_supplier_csv | MyISAM | 10 | Dynamic | 126639 | 26 | 3399704 | 281474976710655 | 5864448 | 0 | NULL | 2009-10-12 14:28:25 | 2009-10-12 14:28:25 | 2009-10-12 14:28:27 | latin1_swedish_ci | NULL | | |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
and
CREATE TABLE `uc_products` (
`vid` mediumint(9) NOT NULL default '0',
`nid` mediumint(9) NOT NULL default '0',
`model` varchar(255) NOT NULL default '',
`list_price` decimal(10,2) NOT NULL default '0.00',
`cost` decimal(10,2) NOT NULL default '0.00',
`sell_price` decimal(10,2) NOT NULL default '0.00',
`weight` float NOT NULL default '0',
`weight_units` varchar(255) NOT NULL default 'lb',
`length` float unsigned NOT NULL default '0',
`width` float unsigned NOT NULL default '0',
`height` float unsigned NOT NULL default '0',
`length_units` varchar(255) NOT NULL default 'in',
`pkg_qty` smallint(5) unsigned NOT NULL default '1',
`default_qty` smallint(5) unsigned NOT NULL default '1',
`unique_hash` varchar(32) NOT NULL,
`ordering` tinyint(2) NOT NULL default '0',
`shippable` tinyint(2) NOT NULL default '1',
PRIMARY KEY (`vid`),
KEY `idx_model` (`model`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
CREATE TABLE `uc_supplier_csv` (
`sid` int(10) unsigned NOT NULL default '0',
`sku` varchar(255) default NULL,
`stock` int(10) unsigned NOT NULL default '0',
`list_price` decimal(8,2) default '0.00',
KEY `idx_sku` (`sku`),
KEY `idx_stock` (`stock`),
KEY `idx_sku_stock` (`sku`,`stock`),
KEY `idx_sid` (`sid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
EDIT: Adding query plans for a couple of suggested queries from Martin below:
mysql> explain SELECT nid FROM uc_products p WHERE NOT EXISTS ( SELECT 1 FROM uc_supplier_csv c WHERE p.model = c.sku ) ;
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | p | ALL | NULL | NULL | NULL | NULL | 6526 | Using where |
| 2 | DEPENDENT SUBQUERY | c | index | NULL | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
mysql> explain SELECT nid FROM uc_products WHERE model NOT IN ( SELECT sku FROM uc_supplier_csv ) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | uc_products | ALL | NULL | NULL | NULL | NULL | 6526 | Using where |
| 2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
Perhaps try using NOT EXISTS rather than counts? For example:
SELECT nid
FROM uc_products p
WHERE NOT EXISTS (
SELECT 1
FROM uc_supplier_csv c
WHERE p.model = c.sku
)
SO user Quassnoi has a short article outlining some tests that suggest that this might also be worth a try:
SELECT nid
FROM uc_products
WHERE model NOT IN (
SELECT sku
FROM uc_supplier_csv
)
basically as per your original query, without the DISTINCTion.
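One caveat with the NOT IN form, given that sku is declared as varchar(255) default NULL in the table definition above: if the subquery returns even one NULL, NOT IN can never evaluate to true, so the outer query returns no rows at all. A sketch of the same query with NULLs filtered out (whether your data actually contains NULL skus is an assumption you would need to check):

```sql
-- Exclude NULL skus so a single NULL cannot empty the whole result.
SELECT nid
FROM uc_products
WHERE model NOT IN (
    SELECT sku
    FROM uc_supplier_csv
    WHERE sku IS NOT NULL
);
```

The NOT EXISTS form above does not have this problem, because a NULL sku simply never matches p.model.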
Another one for you Chris, this time with help for the cross-encoding join:
SELECT nid
FROM uc_products p
WHERE NOT EXISTS (
SELECT 1
FROM uc_supplier_csv c
WHERE CONVERT( p.model USING latin1 ) = c.sku
)
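If altering the schema is an option, an alternative to converting on every comparison is to give both join columns the same character set, so the plain p.model = c.sku comparison should be able to use idx_sku. This is a sketch under assumptions: it takes utf8 and utf8_general_ci from the uc_products table status shown in the question, it rebuilds the indexes on sku, and it is worth taking a backup first.

```sql
-- Inspect the character set and collation of the two join columns.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND ((TABLE_NAME = 'uc_products' AND COLUMN_NAME = 'model')
    OR (TABLE_NAME = 'uc_supplier_csv' AND COLUMN_NAME = 'sku'));

-- Convert sku from latin1 to utf8 so it matches uc_products.model.
ALTER TABLE uc_supplier_csv
  MODIFY sku varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL;
```

After the change, re-running EXPLAIN on the LEFT OUTER JOIN query from the question would show whether idx_sku is now picked for table c.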