MYSQL LEFT JOIN returns unexpected results - mysql

I have two tables talk_comments and talk_comment_votes.
I run the following query to select commentId, numberOfUpvotes, whetherUserUpvoted, numberOfDownvotes, and whetherUserDownvoted, using LEFT JOINs to the same table.
SELECT c.id, COUNT(v1.id) as upvotes, COUNT(v2.id) as userUpvoted, COUNT(v3.id) as downvotes, COUNT(v4.id) as userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id
I have the following data in my talk_comment_votes table
So, according to the query, it should select the values 2, 2, 0, 1, 1 respectively. When I split the JOINs into separate queries, each one returns the expected result. But with the JOINs combined, it returns unexpected results.
Can I get some help on fixing this?
Thanks.

I ran a benchmark on queries based on the two answers from @spencer7593 and @RaymondNijland.
LEFT JOINS wins!
1. Using LEFT JOINS
SELECT c.id, COUNT(DISTINCT v1.id) as upvotes, COUNT(DISTINCT v2.id) as userUpvoted, COUNT(DISTINCT v3.id) as downvotes, COUNT(DISTINCT v4.id) as userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id
Time for 1000 queries: 0.55000805854797s
2. Using Sub Queries
SELECT c.id,c.user_id, c.time,c.body, c.reply_to,
(SELECT COUNT(v1.id) FROM talk_comment_votes v1 WHERE v1.comment_id = c.id AND v1.status = 1 LIMIT 1) as upvotes,
(SELECT COUNT(v2.id) FROM talk_comment_votes v2 WHERE v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 LIMIT 1) as clientUpvoted,
(SELECT COUNT(v3.id) FROM talk_comment_votes v3 WHERE v3.comment_id = c.id AND v3.status = 2 LIMIT 1) as downvotes,
(SELECT COUNT(v4.id) FROM talk_comment_votes v4 WHERE v4.comment_id = c.id AND v4.status = 2 AND v4.user_id = 1 LIMIT 1) as clientDownvoted
FROM talk_comments c
WHERE c.id = 2 GROUP BY c.id
Time for 1000 queries: 0.95499300956726s
3. Using SUM, IF
SELECT c.id
, SUM(IF(v.status = 1 ,1,0)) AS upvotes
, SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
, SUM(IF(v.status = 2 ,1,0)) AS downvotes
, SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
FROM talk_comments c
LEFT
JOIN talk_comment_votes v
ON v.comment_id = c.id
WHERE c.id = 2
GROUP BY c.id
Time for 1000 queries: 1.2266919612885s
Thank you for all the answers.

I'd use conditional aggregation: join to a single reference to talk_comment_votes, and then check the conditions in expressions.
SELECT c.id
, SUM(IF(v.status = 1 ,1,0)) AS upvotes
, SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
, SUM(IF(v.status = 2 ,1,0)) AS downvotes
, SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
FROM talk_comments c
LEFT
JOIN talk_comment_votes v
ON v.comment_id = c.id
WHERE c.id = 2
GROUP
BY c.id
This avoids the partial cross product that occurs when multiple rows are returned from v1, v2, v3 and v4.
The MySQL IF() expression could be replaced with a more ANSI standards-compliant CASE expression, e.g.
, SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END) AS upvotes
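To make the partial cross product concrete, here is a minimal sketch that runs both query shapes against the sample rows from the test case in this answer. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made for portability), so the portable CASE form is used instead of MySQL's IF().

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE talk_comments (id INTEGER PRIMARY KEY);
CREATE TABLE talk_comment_votes (
    id INTEGER PRIMARY KEY, comment_id INT, user_id INT,
    is_anonymous INT, status INT);
INSERT INTO talk_comments (id) VALUES (1), (2), (3);
INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, status)
VALUES (1, 2, 2, 0, 1), (2, 1, 1, 0, 1), (3, 2, 1, 0, 2),
       (4, 7, 1, 0, 2), (5, 1, 14, 1, 1), (6, 2, 14, 1, 1);
""")

# Four joins to the same table: comment 2 has two upvotes and one downvote,
# so the joined result has 2 x 1 rows and every plain COUNT sees both of them.
multi_join = conn.execute("""
SELECT c.id, COUNT(v1.id), COUNT(v2.id), COUNT(v3.id), COUNT(v4.id)
FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1
     AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON v3.comment_id = c.id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON v4.comment_id = c.id AND v4.status = 2
     AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2
GROUP BY c.id
""").fetchone()

# One join, with the conditions moved into the aggregated expressions.
conditional = conn.execute("""
SELECT c.id,
       SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0
                THEN 1 ELSE 0 END),
       SUM(CASE WHEN v.status = 2 THEN 1 ELSE 0 END),
       SUM(CASE WHEN v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0
                THEN 1 ELSE 0 END)
FROM talk_comments c
LEFT JOIN talk_comment_votes v ON v.comment_id = c.id
WHERE c.id = 2
GROUP BY c.id
""").fetchone()

print(multi_join)   # (2, 2, 0, 2, 2): the downvote counts are doubled
print(conditional)  # (2, 2, 0, 1, 1): the values the question expected
```

The downvote columns are inflated because the single downvote row is repeated once for each of the two upvote rows. COUNT(DISTINCT ...) hides that duplication, while conditional aggregation never creates it.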
FOLLOWUP
setup test case and observe execution plans and performance
populate tables
CREATE TABLE talk_comments (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
CREATE TABLE talk_comment_votes (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, comment_id INT UNSIGNED NOT NULL, user_id INT UNSIGNED NOT NULL, is_anonymous TINYINT(1) UNSIGNED NOT NULL, STATUS TINYINT UNSIGNED, time_ INT UNSIGNED);
CREATE INDEX talk_comment_votes_IX1 ON talk_comment_votes (comment_id, STATUS, user_id, is_anonymous) ;
INSERT INTO talk_comments (id) VALUES (1),(2),(3);
INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, STATUS, time_) VALUES (1,2,2,0,1,0),(2,1,1,0,1,0),(3,2,1,0,2,NULL),(4,7,1,0,2,NULL),(5,1,14,1,1,NULL),(6,2,14,1,1,NULL);
query execution plans
EXPLAIN
SELECT c.id, COUNT(DISTINCT v1.id) AS upvotes, COUNT(DISTINCT v2.id) AS userUpvoted, COUNT(DISTINCT v3.id) AS downvotes, COUNT(DISTINCT v4.id) AS userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id
;
EXPLAIN
SELECT c.id
, SUM(IF(v.status = 1 ,1,0)) AS upvotes
, SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
, SUM(IF(v.status = 2 ,1,0)) AS downvotes
, SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
FROM talk_comments c
LEFT
JOIN talk_comment_votes v
ON v.comment_id = c.id
WHERE c.id = 2
GROUP BY c.id
;
output from explain
-- id select_type table type possible_keys key key_len ref rows Extra
-- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ----------------------- ------ -------------
-- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
-- 1 SIMPLE v1 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 2 Using index
-- 1 SIMPLE v2 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index
-- 1 SIMPLE v3 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 1 Using index
-- 1 SIMPLE v4 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index
-- id select_type table type possible_keys key key_len ref rows Extra
-- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ------ ------ -------------
-- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
-- 1 SIMPLE v ref talk_comment_votes_IX1 talk_comment_votes_IX1 4 const 3 Using index
measured performance:
100 executions round 1 round 2 round 3
------------------------------------ ---------- ---------- ---------
multiple left join, count(distinct 0.123 secs 0.130 secs 0.125 secs
conditional aggregation sum(if 0.113 secs 0.114 secs 0.111 secs

Related

count for different column in union and displaying in same row

I am trying to get a count(*) for different column from a different table using union.
//tbl_churidar
order_id order_no_first order_no
--------------------------------------
1 C 1000
2 C 1001
3 C 1002
//tbl_anarkali
order_id order_no_first order_no
--------------------------------------
1 A 1003
2 A 1004
3 A 1005
//tbl_assign
assign_id order_id order_no_first
---------------------------------------
1 1 C
2 1 A
3 2 C
4 3 C
5 2 A
6 3 A
//tbl_unit_status
status_id assign_id status_status stitching_worker
-----------------------------------------------------------
1 1 Stitch AA
2 2 QC {null}
3 3 Stitch BB
4 4 Stitch BB
5 5 Stitch AA
6 6 Stitch CC
From tbl_unit_status, the rows where status_status = Stitch should be INNER JOINed with the other two tables to get the total count of churidar and anarkali orders that each stitching_worker has taken.
the required output is,
churidar anarkali stitching_worker
----------------------------------------
1 1 AA
2 0 BB
0 1 CC
I have tried to get the above output but got stuck. Below is my code,
SELECT churidar, anarkali, stitching_worker
FROM ((
SELECT count(*) AS churidar, NULL AS anarkali,
us.stitching_worker
FROM tbl_unit_status us
INNER JOIN tbl_assign a ON a.assign_id = us.assign_id
INNER JOIN tbl_churidar o ON
(o.order_id = a.order_id AND
o.order_no_first = a.order_no_first)
INNER JOIN tbl_contacts c ON c.contacts_id = o.contacts_id
LEFT JOIN tbl_title t ON t.title_id = c.title_id
WHERE us.status_status = "Stitch" AND
o.order_no_first = "C"
GROUP BY us.stitching_worker
)
UNION (
SELECT NULL AS churidar, count(*) AS anarkali,
us.stitching_worker
FROM tbl_unit_status us
INNER JOIN tbl_assign a ON a.assign_id = us.assign_id
INNER JOIN tbl_anarkali o ON (
o.order_id = a.order_id AND
o.order_no_first = a.order_no_first)
INNER JOIN tbl_contacts c ON c.contacts_id = o.contacts_id
LEFT JOIN tbl_title t ON t.title_id = c.title_id
WHERE us.status_status = "Stitch" AND
o.order_no_first = "A"
GROUP BY us.stitching_worker
)
) AS T1
the output for the above code is,
churidar anarkali stitching_worker
----------------------------------------
1 0 AA
{null} 1 AA
2 0 BB
0 1 CC
How do I get the required output? I have tried a lot. Please help me find the answer. Thank you.
If I understand correctly (which I may not), you don't need the first two tables. You can get the information you need from tbl_assign and just use aggregation:
select us.stitching_worker,
sum(a.order_no_first = 'C') as churidar,
sum(a.order_no_first = 'A') as anarkali
from tbl_unit_status us join
tbl_assign a
on us.assign_id = a.assign_id
where us.status_status = 'Stitch'
group by us.stitching_worker;
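As a quick check of this aggregation against the sample rows from the question, the sketch below loads tbl_assign and tbl_unit_status into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption); both evaluate a comparison such as a.order_no_first = 'C' to 0 or 1, which is what lets SUM() act as a conditional count. The grouping column is stitching_worker, as defined in the question's table, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_assign (
    assign_id INTEGER PRIMARY KEY, order_id INT, order_no_first TEXT);
CREATE TABLE tbl_unit_status (
    status_id INTEGER PRIMARY KEY, assign_id INT,
    status_status TEXT, stitching_worker TEXT);
INSERT INTO tbl_assign VALUES
    (1, 1, 'C'), (2, 1, 'A'), (3, 2, 'C'),
    (4, 3, 'C'), (5, 2, 'A'), (6, 3, 'A');
INSERT INTO tbl_unit_status VALUES
    (1, 1, 'Stitch', 'AA'), (2, 2, 'QC', NULL), (3, 3, 'Stitch', 'BB'),
    (4, 4, 'Stitch', 'BB'), (5, 5, 'Stitch', 'AA'), (6, 6, 'Stitch', 'CC');
""")

# Each comparison yields 1 or 0 per row, so SUM() counts the matching rows.
rows = conn.execute("""
SELECT SUM(a.order_no_first = 'C') AS churidar,
       SUM(a.order_no_first = 'A') AS anarkali,
       us.stitching_worker
FROM tbl_unit_status us
JOIN tbl_assign a ON a.assign_id = us.assign_id
WHERE us.status_status = 'Stitch'
GROUP BY us.stitching_worker
ORDER BY us.stitching_worker
""").fetchall()

for row in rows:
    print(row)
# (1, 1, 'AA')
# (2, 0, 'BB')
# (0, 1, 'CC')
```

Because every worker ends up in exactly one group, there is no second row with NULLs as in the UNION attempt.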

MySQL query too slow about 25 seconds

I have a MySQL query that takes about 25 seconds. There are not many rows (only about 200), and I don't understand why it takes so long.
Query:
SELECT *
, c.id c_id
FROM campaign c
JOIN campaign_category cc
ON c.campaign_type = cc.id
WHERE c.is_deleted = 0
AND c.status = 1
AND c.id NOT IN (SELECT campaign_id FROM user_reviews WHERE user_id = 4)
AND c.amt_req > (SELECT COUNT(id)
FROM reserved_reviews
WHERE camping_id = c.id
AND user_id != 4)
+ (SELECT COUNT(id)
FROM user_reviews
WHERE campaign_id = c.id)
Edit:
I tried with JOINs like this, but I got no results:
SELECT
*, `c`.`id` as `c_id`,COUNT(`ur`.`id`) as `total_reviewed`, COUNT(`rr`.`id`) as `total_reserved`
FROM
`campaign` `c`
JOIN `campaign_category` `cc` ON `c`.`campaign_type`=`cc`.`id`
JOIN `user_reviews` `ur` ON `ur`.`campaign_id`=`c`.`id`
JOIN `reserved_reviews` `rr` ON `rr`.`camping_id`=`c`.`id`
WHERE
`c`.`is_deleted` =0
AND
`c`.`status` = 1
AND
`ur`.`user_id` != 4
GROUP BY `c`.`id`
HAVING `c`.`amt_req` > COUNT(`ur`.`id`) + COUNT(`rr`.`id`)
Edit: Table structures: first image: user_reviews table; second image: campaign table; third image: reserved_reviews table.
http://imgur.com/GI4817B,SdnSxuz,truxHM6#0
You can improve this query with indexes:
SELECT *, c.id c_id
FROM campaign c JOIN
campaign_category cc
ON c.campaign_type = cc.id
WHERE c.is_deleted = 0 AND
c.status = 1 AND
c.id NOT IN (SELECT campaign_id FROM user_reviews WHERE user_id = 4) AND
c.amt_req > (SELECT COUNT(*)
FROM reserved_reviews
WHERE campaign_id = c.id AND user_id <> 4
) +
(SELECT COUNT(id)
FROM user_reviews
WHERE campaign_id = c.id
) ;
For the outer query and joins: campaign(status, is_deleted, id, amt_req) and campaign_category(id) (you should already have the latter if it is defined as a primary key).
Then: user_reviews(user_id, campaign_id), reserved_reviews(campaign_id, user_id), and user_reviews(campaign_id).
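The index list above can be written out as CREATE INDEX statements. The sketch below is only illustrative: the real table structures were posted as images, so the column lists, the index names (ix_...), and the sample rows are all hypothetical, and an in-memory SQLite database (via Python's sqlite3) stands in for MySQL. It follows this answer in naming the reserved_reviews column campaign_id, whereas the question's query spells it camping_id. It shows the shape of the indexes and confirms the query still returns the expected campaign; it does not measure any speedup.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign_category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE campaign (
    id INTEGER PRIMARY KEY, campaign_type INT,
    is_deleted INT, status INT, amt_req INT);
CREATE TABLE user_reviews (
    id INTEGER PRIMARY KEY, campaign_id INT NOT NULL, user_id INT NOT NULL);
CREATE TABLE reserved_reviews (
    id INTEGER PRIMARY KEY, campaign_id INT NOT NULL, user_id INT NOT NULL);

-- The composite indexes suggested in the answer (index names are made up).
CREATE INDEX ix_campaign_filter ON campaign (status, is_deleted, id, amt_req);
CREATE INDEX ix_ur_user_campaign ON user_reviews (user_id, campaign_id);
CREATE INDEX ix_rr_campaign_user ON reserved_reviews (campaign_id, user_id);
CREATE INDEX ix_ur_campaign ON user_reviews (campaign_id);

INSERT INTO campaign_category VALUES (1, 'general');
INSERT INTO campaign VALUES
    (1, 1, 0, 1, 3),   -- open slots left
    (2, 1, 0, 1, 1),   -- already full
    (3, 1, 0, 1, 5),   -- already reviewed by user 4
    (4, 1, 1, 1, 5);   -- deleted
INSERT INTO user_reviews (campaign_id, user_id) VALUES (1, 5), (2, 5), (3, 4);
INSERT INTO reserved_reviews (campaign_id, user_id) VALUES (1, 6), (1, 4);
""")

eligible = [row[0] for row in conn.execute("""
SELECT c.id
FROM campaign c
JOIN campaign_category cc ON c.campaign_type = cc.id
WHERE c.is_deleted = 0
  AND c.status = 1
  AND c.id NOT IN (SELECT campaign_id FROM user_reviews WHERE user_id = 4)
  AND c.amt_req > (SELECT COUNT(*) FROM reserved_reviews
                   WHERE campaign_id = c.id AND user_id <> 4)
                + (SELECT COUNT(id) FROM user_reviews
                   WHERE campaign_id = c.id)
ORDER BY c.id
""")]

print(eligible)  # [1]
```

Each correlated subquery filters on campaign_id (and user_id), so an index that leads with those columns lets it be answered by a lookup instead of a scan per campaign row.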

using joins together with aggregates, and retrieving rows when no aggregate exists

The following query on my MySQL tables returns rows from the purchaseorder table that have corresponding entries in the deliveryorder table. How do I construct this query so that I get rows from the purchaseorder table even if no corresponding rows exist in the deliveryorder table? If the users want to see sql table CREATE statements, I can post those, but I'm not posting now as it really makes the question too big.
SELECT
`purchaseorder`.`id` AS `po_id`,
`purchaseorder`.`order_quantity` AS `po_order_quantity`,
`purchaseorder`.`applicable_approved_unit_rate` AS `po_unit_rate`,
`purchaseorder`.`applicable_sales_tax_rate` AS `po_tax_rate`,
`purchaseorder`.`order_date` AS `po_order_date`,
`purchaseorder`.`remarks` AS `po_remarks`,
`purchaseorder`.`is_open` AS `po_is_open`,
`purchaseorder`.`is_active` AS `po_is_active`,
`purchaseorder`.`approved_rate_id` AS `po_app_rate_id`,
`supplier`.`name` AS `sup_name`,
SUM(`deliveryorder`.`quantity`) AS `total_ordered`
FROM `purchaseorder`
LEFT JOIN `deliveryorder` ON (`deliveryorder`.`purchase_order_id` = `purchaseorder`.`id`)
INNER JOIN `approvedrate` ON (`purchaseorder`.`approved_rate_id` = `approvedrate`.`id`)
INNER JOIN `supplier` ON (`approvedrate`.`supplier_id` = `supplier`.`id`)
WHERE (
`purchaseorder`.`is_active` = 1
AND `purchaseorder`.`is_open` = 1
AND `deliveryorder`.`is_active` = 1
AND `approvedrate`.`material_id` = 2
)
HAVING `purchaseorder`.`order_quantity` >= `total_ordered` + 1
You have an aggregating function but no GROUP BY clause, which is weird, but anyway - something like this? Oops - edited...
SELECT po.id po_id
, po.order_quantity po_order_quantity
, po.applicable_approved_unit_rate po_unit_rate
, po.applicable_sales_tax_rate po_tax_rate
, po.order_date po_order_date
, po.remarks po_remarks
, po.is_open po_is_open
, po.is_active po_is_active
, po.approved_rate_id po_app_rate_id
, s.name sup_name
, SUM(do.quantity) total_ordered
FROM purchaseorder po
LEFT
JOIN deliveryorder do
ON do.purchase_order_id = po.id
AND do.is_active = 1
LEFT
JOIN approvedrate ar
ON ar.id = po.approved_rate_id
AND ar.material_id = 2
LEFT
JOIN supplier s
ON s.id = ar.supplier_id
WHERE po.is_active = 1
AND po.is_open = 1
HAVING po.order_quantity >= total_ordered + 1
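The key change in this rewrite is moving the deliveryorder.is_active test from WHERE into the ON clause of the LEFT JOIN. The sketch below illustrates that difference on a trimmed-down, hypothetical version of the two tables, using an in-memory SQLite database through Python's sqlite3 as a stand-in for MySQL. The GROUP BY and the COALESCE around SUM() are additions of this sketch, not part of the answer above: without COALESCE, a purchase order with no deliveries has a NULL total, and the HAVING comparison would still discard it.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; tables are trimmed and hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchaseorder (
    id INTEGER PRIMARY KEY, order_quantity INT, is_active INT, is_open INT);
CREATE TABLE deliveryorder (
    id INTEGER PRIMARY KEY, purchase_order_id INT, quantity INT, is_active INT);
INSERT INTO purchaseorder VALUES (1, 10, 1, 1), (2, 5, 1, 1), (3, 6, 1, 1);
INSERT INTO deliveryorder (purchase_order_id, quantity, is_active) VALUES
    (1, 4, 1), (1, 3, 1),   -- order 1: 7 of 10 delivered
    (3, 6, 1);              -- order 3: fully delivered; order 2: none
""")

# Filter in WHERE: the NULL-extended row for order 2 fails d.is_active = 1,
# so the LEFT JOIN behaves like an INNER JOIN.
where_filter = conn.execute("""
SELECT po.id, SUM(d.quantity) AS total_ordered
FROM purchaseorder po
LEFT JOIN deliveryorder d ON d.purchase_order_id = po.id
WHERE po.is_active = 1 AND po.is_open = 1 AND d.is_active = 1
GROUP BY po.id, po.order_quantity
HAVING po.order_quantity >= SUM(d.quantity) + 1
ORDER BY po.id
""").fetchall()

# Filter in ON: order 2 survives the join, and COALESCE turns its NULL sum
# into 0 so the HAVING comparison can succeed.
on_filter = conn.execute("""
SELECT po.id, COALESCE(SUM(d.quantity), 0) AS total_ordered
FROM purchaseorder po
LEFT JOIN deliveryorder d
       ON d.purchase_order_id = po.id AND d.is_active = 1
WHERE po.is_active = 1 AND po.is_open = 1
GROUP BY po.id, po.order_quantity
HAVING po.order_quantity >= COALESCE(SUM(d.quantity), 0) + 1
ORDER BY po.id
""").fetchall()

print(where_filter)  # [(1, 7)]
print(on_filter)     # [(1, 7), (2, 0)]
```

With the condition in ON, the single query covers both cases that the two-query workaround handles separately: orders with deliveries and orders without any.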
I couldn't work out how to get the desired results all in one query, but ended up using the following two queries to fulfill my requirements: -
1st query
SELECT
pot.`id` AS `po_id`,
pot.`order_quantity` AS `po_order_quantity`,
pot.`applicable_approved_unit_rate` AS `po_unit_rate`,
pot.`applicable_sales_tax_rate` AS `po_tax_rate`,
pot.`is_open` AS `po_is_open`,
pot.`is_active` AS `po_is_active`,
st.`id` AS `sup_id`,
st.`name` AS `sup_name`,
SUM(dot.`quantity`) AS `total_ordered`
FROM `purchaseorder` pot
INNER JOIN `deliveryorder` dot ON (dot.`purchase_order_id` = pot.`id`)
INNER JOIN `approvedrate` art ON (pot.`approved_rate_id` = art.`id`)
INNER JOIN `supplier` st ON (art.`supplier_id` = st.`id`)
WHERE (
pot.`is_active` = 1
AND pot.`is_open` = 1
AND art.`material_id` = #materialid
AND art.`in_effect` = 1
AND art.`is_active` = 1
AND dot.`is_active` = 1
AND st.`is_active` = 1
)
HAVING pot.`order_quantity` >= `total_ordered` + #materialquantity
2nd query
SELECT
pot.`id` AS `po_id`,
pot.`order_quantity` AS `po_order_quantity`,
pot.`applicable_approved_unit_rate` AS `po_unit_rate`,
pot.`applicable_sales_tax_rate` AS `po_tax_rate`,
pot.`is_open` AS `po_is_open`,
pot.`is_active` AS `po_is_active`,
st.`id` AS `sup_id`,
st.`name` AS `sup_name`,
0 AS `total_ordered`
FROM `purchaseorder` pot
INNER JOIN `approvedrate` art ON (pot.`approved_rate_id` = art.`id`)
INNER JOIN `supplier` st ON (art.`supplier_id` = st.`id`)
WHERE (
pot.`is_active` = 1
AND pot.`is_open` = 1
AND art.`material_id` = #materialid
AND art.`in_effect` = 1
AND art.`is_active` = 1
AND st.`is_active` = 1
AND pot.`order_quantity` >= #materialquantity
AND pot.`id` NOT IN
(
SELECT dot.`purchase_order_id`
FROM `deliveryorder` dot
WHERE dot.is_active = 1
)
)

OR in left join

The billboards table has 140000 rows and the regions table has 1000 rows.
SELECT
r.id,
SUM(IF(bb.r1_id = r.id, 1, 0)) AS count,
SUM(IF(bb.r2_id = r.id, 1, 0)) AS count2
FROM
tmp_regions AS r
LEFT JOIN
tmp_billboards AS bb
ON (r.id = bb.r1_id OR r.id = bb.r2_id)
WHERE
bb.deleted = 0
AND
bb.x != 0
AND
bb.y != 0
GROUP BY r.id
ORDER BY r.capital DESC , r.other , r.name
Execution time is 8 seconds.
Explain
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE bb ref bb_r,bb_deleted,bb_x,bb_y,deleted_x_y,bb_r2 bb_deleted 1 const 66396 Using where; Using temporary; Using filesort
1 SIMPLE r ALL PRIMARY NULL NULL NULL 1000 Using where; Using join buffer
How can I change the OR in the join to improve performance?
Add indexes. The output of explain shows you which fields need them.
Assuming that tmp_regions (id) is the primary key, you could rewrite the query so that the OR is converted into two joins:
SELECT
r.id,
COALESCE(bb1.cnt, 0) AS count,
COALESCE(bb2.cnt, 0) AS count2
FROM
tmp_regions AS r
LEFT JOIN
( SELECT r1_id, COUNT(*) AS cnt
FROM tmp_billboards
WHERE deleted = 0
AND x <> 0
AND y <> 0
GROUP BY r1_id
) AS bb1
ON r.id = bb1.r1_id
LEFT JOIN
( SELECT r2_id, COUNT(*) AS cnt
FROM tmp_billboards
WHERE deleted = 0
AND x <> 0
AND y <> 0
GROUP BY r2_id
) AS bb2
ON r.id = bb2.r2_id
ORDER BY r.capital DESC , r.other , r.name ;
For efficiency, indexes on (deleted, r1_id, x, y) and (deleted, r2_id, x, y) would help to avoid table scans on tmp_billboards.
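The sketch below compares the original OR join with the two-derived-table rewrite on a handful of hypothetical rows, using an in-memory SQLite database through Python's sqlite3 as a stand-in for MySQL (so CASE replaces IF(), and the index names are made up). Results are ordered by r.id instead of capital, other, name to keep the sample table small. It checks that the counts agree, and it also shows one behavioural difference: the original WHERE clause filters on bb columns and therefore drops regions without any matching billboard, while the rewrite keeps them with zero counts.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tmp_regions (id INTEGER PRIMARY KEY);
CREATE TABLE tmp_billboards (
    id INTEGER PRIMARY KEY, r1_id INT, r2_id INT, deleted INT, x INT, y INT);
-- Indexes suggested in the answer (index names are made up).
CREATE INDEX ix_bb_r1 ON tmp_billboards (deleted, r1_id, x, y);
CREATE INDEX ix_bb_r2 ON tmp_billboards (deleted, r2_id, x, y);
INSERT INTO tmp_regions (id) VALUES (1), (2), (3), (4);
INSERT INTO tmp_billboards (r1_id, r2_id, deleted, x, y) VALUES
    (1, 2, 0, 1, 1), (1, 3, 0, 5, 5), (2, 1, 0, 2, 2),
    (2, 2, 1, 1, 1),   -- deleted, ignored
    (3, 3, 0, 0, 4);   -- x = 0, ignored
""")

# Original shape: one join with OR, filters on bb in WHERE.
original = conn.execute("""
SELECT r.id,
       SUM(CASE WHEN bb.r1_id = r.id THEN 1 ELSE 0 END) AS count,
       SUM(CASE WHEN bb.r2_id = r.id THEN 1 ELSE 0 END) AS count2
FROM tmp_regions AS r
LEFT JOIN tmp_billboards AS bb ON (r.id = bb.r1_id OR r.id = bb.r2_id)
WHERE bb.deleted = 0 AND bb.x != 0 AND bb.y != 0
GROUP BY r.id
ORDER BY r.id
""").fetchall()

# Rewrite: aggregate each column separately, then join the two small results.
rewritten = conn.execute("""
SELECT r.id, COALESCE(bb1.cnt, 0) AS count, COALESCE(bb2.cnt, 0) AS count2
FROM tmp_regions AS r
LEFT JOIN (SELECT r1_id, COUNT(*) AS cnt FROM tmp_billboards
           WHERE deleted = 0 AND x <> 0 AND y <> 0
           GROUP BY r1_id) AS bb1 ON r.id = bb1.r1_id
LEFT JOIN (SELECT r2_id, COUNT(*) AS cnt FROM tmp_billboards
           WHERE deleted = 0 AND x <> 0 AND y <> 0
           GROUP BY r2_id) AS bb2 ON r.id = bb2.r2_id
ORDER BY r.id
""").fetchall()

print(original)   # [(1, 2, 1), (2, 1, 1), (3, 0, 1)]
print(rewritten)  # [(1, 2, 1), (2, 1, 1), (3, 0, 1), (4, 0, 0)]
```

If regions without billboards should stay hidden, as in the original output, adding WHERE bb1.cnt IS NOT NULL OR bb2.cnt IS NOT NULL to the rewrite would restore that behaviour.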

When doing a UNION in mysql how can I do a where on the results

Hi I am doing a union over several tables. It's a little long but works!
(SELECT user_id,added_date,group_id,'joined',0,0,'' FROM group_members WHERE status = 1)
UNION
(SELECT user_id,added_date,object_id,'made a comment',0,0,'' FROM comments WHERE object_type = 11 AND status = 1)
UNION
(SELECT user_id,added_date,group_id,'made the event',1,group_calendar_id,title FROM group_calendars WHERE status = 1)
UNION
(SELECT comments.user_id,comments.added_date,group_calendars.group_id,'made a comment on the event',1,group_calendar_id,'' FROM group_calendars
INNER JOIN comments ON group_calendars.group_calendar_id = comments.object_id WHERE group_calendars.status = 1 AND comments.status = 1 AND object_type = 10
)
UNION
(SELECT user_id,pd.added_date,pd.object_id,'uploaded a photo',2,pd.photo_data_id,
(SELECT varchar_val FROM photo_data WHERE data_id = 1 AND photo_data.photo_id = photos.photos_id AND object_type = 3 AND object_id = pd.object_id)
FROM photo_data pd
INNER JOIN photos ON photos.photos_id = pd.photo_id
WHERE photos.photo_status = 1 AND pd.status = 1 AND pd.data_id = 0 AND pd.object_type = 3
)
UNION
(SELECT cp.user_id,cp.added_date,cp.object_id,'made a comment on the photo',2,pd.photo_data_id,
(SELECT varchar_val FROM photo_data WHERE data_id = 1 AND photo_data.photo_id = photos.photos_id AND object_type = 3 AND object_id = pd.object_id)
FROM comments cp
INNER JOIN photo_data pd ON pd.photo_data_id = cp.object_id
INNER JOIN photos ON photos.photos_id = pd.photo_id
WHERE cp.object_type = 8 AND cp.status = 1 AND pd.status = 1 AND pd.data_id = 0 AND photos.photo_status = 1 AND pd.object_type = 3
)
UNION
(SELECT user_id,added_date,group_id,'made a topic',3,forum_topic_id,title FROM forum_topics WHERE forum_categories_id = ".GROUP_FORUM_CATEGORY." AND group_id > 0 AND status = 1)
UNION
(SELECT forum_comments.user_id,forum_comments.added_date,group_id,'made a comment on the topic',3,forum_comments.forum_topic_id,title FROM forum_comments
INNER JOIN forum_topics ON forum_comments.forum_topic_id = forum_topics.forum_topic_id
WHERE forum_topics.forum_categories_id = 16 AND forum_topics.group_id > 0 AND forum_topics.status = 1 AND forum_comments.status = 1
)
This gets all the activity from a set of groups. My question is at the end I want to make sure that the group is active.
So at the end I want to do something like WHERE (SELECT COUNT(1) FROM groups g WHERE g.group_id = group_id AND status = 1) = 1
Is there any way of doing that?
I'd suggest storing it in a view or temporary table and then querying that. I know you will then have two calls, but it's actually faster in MySQL that way.
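A minimal sketch of the view approach, assuming trimmed-down, hypothetical versions of the tables and only two of the UNION branches. It runs on an in-memory SQLite database through Python's sqlite3 as a stand-in for MySQL; the view name group_activity and the column alias action are made up for illustration. The table name groups is quoted because GROUPS is a keyword in recent SQLite and MySQL versions (MySQL would use backticks). The sketch shows how the filter on active groups is applied once, on top of the combined result; it makes no claim about speed.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "groups" (group_id INTEGER PRIMARY KEY, status INT);
CREATE TABLE group_members (
    user_id INT, added_date TEXT, group_id INT, status INT);
CREATE TABLE comments (
    user_id INT, added_date TEXT, object_id INT, object_type INT, status INT);

INSERT INTO "groups" VALUES (1, 1), (2, 0);   -- group 2 is inactive
INSERT INTO group_members VALUES
    (10, '2020-01-01', 1, 1), (11, '2020-01-02', 2, 1);
INSERT INTO comments VALUES
    (12, '2020-01-03', 1, 11, 1), (13, '2020-01-04', 2, 11, 1);

-- The UNION is stored once; column names come from the first SELECT.
CREATE VIEW group_activity AS
SELECT user_id, added_date, group_id, 'joined' AS action
FROM group_members WHERE status = 1
UNION
SELECT user_id, added_date, object_id, 'made a comment'
FROM comments WHERE object_type = 11 AND status = 1;
""")

# The "is the group active" check is applied to the combined rows.
active_activity = conn.execute("""
SELECT a.user_id, a.added_date, a.group_id, a.action
FROM group_activity a
JOIN "groups" g ON g.group_id = a.group_id
WHERE g.status = 1
ORDER BY a.added_date
""").fetchall()

for row in active_activity:
    print(row)
# (10, '2020-01-01', 1, 'joined')
# (12, '2020-01-03', 1, 'made a comment')
```

Giving the UNION columns explicit names also avoids the ambiguity in the question's sketch, where g.group_id = group_id would compare the column with itself inside the subquery.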