wrong count on the multiple joins with the same table


I have 2 tables:
1 - coupons
2 - tractions
For each coupon there may be several rows in the tractions table.
I want a list of all coupons with a count of each coupon's tractions under different conditions:
SELECT `coupons`.`id` ,
count( tractions_all.id ) AS `all` ,
count( tractions_void.id ) AS void,
count( tractions_returny.id ) AS returny,
count( tractions_burned.id ) AS burned
FROM `coupons`
LEFT JOIN `tractions` AS `tractions_all`
ON `coupons`.`id` = `tractions_all`.`coupon_parent`
LEFT JOIN `tractions` AS `tractions_void`
ON `coupons`.`id` = `tractions_void`.`coupon_parent`
AND `tractions_void`.`expired` =1
LEFT JOIN `tractions` `tractions_returny`
ON `tractions_returny`.`coupon_parent` = `coupons`.`id`
AND `tractions_returny`.`expired` =11
LEFT JOIN `tractions` `tractions_burned`
ON `tractions_burned`.`coupon_parent` = `coupons`.`id`
AND `tractions_burned`.`expired` =0
AND '2014-02-12'
WHERE `coupons`.`parent` =0
GROUP BY `coupons`.`id`
Right now only one of my coupons has tractions: it has 2, and both are burned. The other coupons have no tractions at all.
Here is the result:
As you can see, the coupon with id=13 shows 4 tractions when it should show 2. What am I doing wrong? If I remove the last join it works fine and I get 2.

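You are aggregating along multiple dimensions at once, which produces a Cartesian product for each id: the 2 rows matched by tractions_all are each paired with the 2 rows matched by tractions_burned, so every COUNT sees 4 joined rows.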
If your data volume is not very large, the easiest way to fix this is using distinct:
SELECT `coupons`.`id` ,
count(distinct tractions_all.id ) AS `all` ,
count(distinct tractions_void.id ) AS void,
count(distinct tractions_returny.id ) AS returny,
count(distinct tractions_burned.id ) AS burned
If your data is large, then you will probably need to aggregate values as subqueries first and then do the joins.
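To make the row multiplication concrete, here is a minimal sketch in Python using the built-in sqlite3 module so it runs stand-alone. The two-column schema and the sample rows are assumptions modelled on the question, and only two of the four joins are reproduced. The last query is a different technique from the pre-aggregated subqueries mentioned above: it joins tractions once and counts conditionally.

```python
import sqlite3

# Hypothetical miniature of the coupons/tractions schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coupons   (id INTEGER PRIMARY KEY, parent INTEGER);
    CREATE TABLE tractions (id INTEGER PRIMARY KEY, coupon_parent INTEGER, expired INTEGER);
    INSERT INTO coupons   VALUES (13, 0), (14, 0);
    -- coupon 13 has two burned tractions (expired = 0); coupon 14 has none
    INSERT INTO tractions VALUES (1, 13, 0), (2, 13, 0);
""")

JOINS = """
    FROM coupons c
    LEFT JOIN tractions t_all    ON t_all.coupon_parent = c.id
    LEFT JOIN tractions t_burned ON t_burned.coupon_parent = c.id AND t_burned.expired = 0
    WHERE c.parent = 0
    GROUP BY c.id
    ORDER BY c.id
"""

# Plain COUNT sees 2 x 2 = 4 joined rows for coupon 13.
naive = conn.execute(
    "SELECT c.id, COUNT(t_all.id), COUNT(t_burned.id)" + JOINS
).fetchall()

# COUNT(DISTINCT ...) collapses the duplicated rows back to 2.
distinct = conn.execute(
    "SELECT c.id, COUNT(DISTINCT t_all.id), COUNT(DISTINCT t_burned.id)" + JOINS
).fetchall()

# Alternative: join tractions once and count conditionally, so no rows multiply.
single_join = conn.execute("""
    SELECT c.id,
           COUNT(t.id),
           COUNT(CASE WHEN t.expired = 0 THEN 1 END)
    FROM coupons c
    LEFT JOIN tractions t ON t.coupon_parent = c.id
    WHERE c.parent = 0
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(naive)        # [(13, 4, 4), (14, 0, 0)]
print(distinct)     # [(13, 2, 2), (14, 0, 0)]
print(single_join)  # [(13, 2, 2), (14, 0, 0)]
```

The SQL itself uses nothing SQLite-specific, so the same statements should carry over to MySQL.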

Related

How to limit record before group by for pagination?

I have this query that will LEFT JOIN and GROUP BY to get SUM of column.
SELECT
c.id,
SUM(
r.score
) AS score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
FROM comments AS c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
WHERE r.comment_id = c.id
GROUP BY c.id
DB Fiddle link:
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=fadba795d8426f91471fa4db83845b6f
The query works, but if the comments table is large (10K rows, for example) I need to implement pagination. How do I modify this query to limit the comments records before the GROUP BY?
In short:
Get the first 5 comments with a LIMIT of 5
Left join the rates table
Get the SUM with GROUP BY
For example, show the sums for the first 4 comments.
Thanks

You can use a subquery such as "select c.id from comments c limit N" in the FROM clause.
select c.id,
sum(r.score) as score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
from ( select c.id from comments c limit 2) c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
GROUP BY c.id;
You can add an ORDER BY inside the subquery to control which comments are selected (top N).
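A rough sketch of the same idea as a paging helper, again in Python with sqlite3 so it can be run directly. The table definitions, the sample rows and the function name page_of_sums are assumptions for illustration; only the query shape comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed column types; the question only shows the column names.
    CREATE TABLE comments (id INTEGER PRIMARY KEY);
    CREATE TABLE rates (comment_id INTEGER, score INTEGER, negative INTEGER, is_active TEXT);
    INSERT INTO comments VALUES (1), (2), (3), (4);
    INSERT INTO rates VALUES (1, 5, 1, '0'), (1, 3, 2, '1'), (3, 4, 7, '0');
""")

def page_of_sums(conn, page_size, offset):
    """Pick one page of comment ids first, then join and aggregate only those."""
    return conn.execute("""
        SELECT c.id,
               SUM(r.score) AS score_sum,
               SUM(CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END) AS negative_sum
        FROM (SELECT id FROM comments ORDER BY id LIMIT ? OFFSET ?) AS c
        LEFT JOIN rates AS r ON r.comment_id = c.id
        GROUP BY c.id
        ORDER BY c.id
    """, (page_size, offset)).fetchall()

first_page = page_of_sums(conn, 2, 0)
second_page = page_of_sums(conn, 2, 2)
print(first_page)   # [(1, 8, 1), (2, None, 0)]
print(second_page)  # [(3, 4, 7), (4, None, 0)]
```

Because the LEFT JOIN has no extra WHERE filter on rates, comments without any rates still show up on their page, with a NULL score_sum.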

Try the following:
SELECT
c.id,
SUM(
r.score
) AS score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
FROM comments AS c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
WHERE r.comment_id = c.id
GROUP BY c.id
ORDER BY c.id ASC
LIMIT 5
The rationale behind this query is that id is the primary key (hence indexed) of your comments table, and GROUP BY and ORDER BY are on that same column. Logically, LIMIT is applied after grouping, so the result is the first 5 groups; but because the groups can be produced in index order, MySQL can read comments through the index on id and stop once 5 groups are complete instead of aggregating every row first.
Give it a try! More details here: https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
You can confirm this by running EXPLAIN on the query.

Simplify slow MySQL query

This query calculates the columns free, plus, score and total from COUNT values computed in a subquery.
SELECT movie_title,movie_id,MAX(x.free_cnt) as free, MAX(x.plus_cnt) as plus,
(MAX(x.free_cnt) + (MAX(x.plus_cnt)*3)) AS score, (MAX(x.free_cnt) + MAX(x.plus_cnt)) AS total
FROM (
SELECT b.id as movie_id, b.movie_title as movie_title, COUNT(*) AS free_cnt, 0 as plus_cnt
FROM subtitles_request a1
LEFT JOIN movies b on a1.movie_id=b.id
JOIN users c on c.email=a1.email
WHERE c.subsc_status='0'
GROUP BY b.movie_title
UNION ALL
SELECT d.id as movie_id, d.movie_title as movie_title, 0 as free_cnt, COUNT(*) AS plus_cnt
FROM subtitles_request a2
LEFT JOIN movies d on a2.movie_id=d.id
JOIN users e on e.email=a2.email
WHERE e.subsc_status='1'
GROUP BY d.movie_title
) AS x
GROUP BY movie_title
ORDER BY total DESC
LIMIT 10
It performs slowly, and I'm wondering whether there is any way to simplify or change the query to speed it up. I can't calculate the free, plus, score and total columns outside the query because I need to order by them. I may also add a date filter later.
Is there any way to simplify this query?

Try this:
SELECT b.movie_title, x.movie_id, MAX( x.free_cnt ) AS free, MAX( x.plus_cnt ) AS plus,
( MAX( x.free_cnt ) + ( MAX( x.plus_cnt ) * 3 ) ) AS score, ( MAX( x.free_cnt ) + MAX( x.plus_cnt ) ) AS total
FROM ( SELECT a.movie_id,
SUM( IF( c.subsc_status = '0', 1, 0 ) ) AS free_cnt,
SUM( IF( c.subsc_status = '1', 1, 0 ) ) AS plus_cnt
FROM subtitles_request a
JOIN users c on c.email=a.email
WHERE c.subsc_status in ('0','1')
GROUP BY a.movie_id
) AS x
LEFT JOIN movies b on x.movie_id = b.id
GROUP BY movie_title, movie_id
ORDER BY total DESC
LIMIT 10
Maybe I've simplified a bit too much. Moreover, I'm not used to grouping on only some of the non-aggregate fields, hence I added movie_id to what is being grouped by and thus changing your query a bit (if two films had the same name, but different ID, then only one of the id's would be returned in your original query, but I guess (being a MySQL newbie, I really don't know) the counts would be for both of them taken together).
HTH,
Set

Well, I have checked your subquery:
SELECT b.id as movie_id, b.movie_title as movie_title, COUNT(*) AS free_cnt, 0 as plus_cnt
FROM subtitles_request a1
LEFT JOIN movies b on a1.movie_id=b.id
JOIN users c on c.email=a1.email
WHERE c.subsc_status='0'
GROUP BY b.movie_title
UNION ALL
SELECT d.id as movie_id, d.movie_title as movie_title, 0 as free_cnt, COUNT(*) AS plus_cnt
FROM subtitles_request a2
LEFT JOIN movies d on a2.movie_id=d.id
JOIN users e on e.email=a2.email
WHERE e.subsc_status='1'
GROUP BY d.movie_title
The two statements around "UNION ALL" can be replaced with a single statement using the condition c.subsc_status IN ('0','1'). Instead of the constant columns (0 AS free_cnt, COUNT(*) AS plus_cnt), you can put CASE WHEN inside the aggregate, for example COUNT(CASE WHEN c.subsc_status='0' THEN 1 END) AS free_cnt and COUNT(CASE WHEN c.subsc_status='1' THEN 1 END) AS plus_cnt. It's not a complicated SQL statement, so I wouldn't expect it to take long to run. Is there a lot of data?
As a matter of fact, I'm a newcomer too, though I have some experience with this. Please forgive me if it doesn't work.
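For reference, a small runnable sketch of the single-pass version both answers are aiming at, written in Python with sqlite3. The column types, the e-mail addresses and the movie titles are made-up sample data; the table and column names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, movie_title TEXT);
    CREATE TABLE users (email TEXT PRIMARY KEY, subsc_status TEXT);
    CREATE TABLE subtitles_request (movie_id INTEGER, email TEXT);
    INSERT INTO movies VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO users VALUES
        ('free1@example.com', '0'), ('free2@example.com', '0'), ('plus@example.com', '1');
    INSERT INTO subtitles_request VALUES
        (1, 'free1@example.com'), (1, 'free2@example.com'), (1, 'plus@example.com'),
        (2, 'plus@example.com');
""")

# One scan of subtitles_request: conditional sums replace the two UNION ALL branches.
rows = conn.execute("""
    SELECT m.movie_title, x.movie_id, x.free_cnt, x.plus_cnt,
           x.free_cnt + x.plus_cnt * 3 AS score,
           x.free_cnt + x.plus_cnt     AS total
    FROM (SELECT sr.movie_id,
                 SUM(CASE WHEN u.subsc_status = '0' THEN 1 ELSE 0 END) AS free_cnt,
                 SUM(CASE WHEN u.subsc_status = '1' THEN 1 ELSE 0 END) AS plus_cnt
          FROM subtitles_request sr
          JOIN users u ON u.email = sr.email
          WHERE u.subsc_status IN ('0', '1')
          GROUP BY sr.movie_id) AS x
    LEFT JOIN movies m ON m.id = x.movie_id
    ORDER BY total DESC, x.movie_id
    LIMIT 10
""").fetchall()

print(rows)  # [('Alpha', 1, 2, 1, 5, 3), ('Beta', 2, 0, 1, 3, 1)]
```

Since the derived table already has one row per movie_id, the outer MAX() and GROUP BY are not needed in this variant.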

Joining two columns in mysql

I want to add data from table b to table a, but unfortunately FULL OUTER JOIN does not work in MySQL. I have also tried UNION, but it throws errors because my statement has GROUP BY and ORDER BY clauses.
SELECT COUNT( ReviewedBy ) AS TotalReviews, OrganizationId, SUM( Rating ) AS TotalStars, COUNT( Rating ) AS TotalRatings, (
SUM( Rating ) / COUNT( Rating )
) AS AverageRating
FROM `tbl_reviews`
WHERE ReviewType = 'shopper'
AND ReviewFor = 'org'
AND OrganizationId
IN (
SELECT OrganizationId
FROM tbl_organizations
WHERE CategoryID =79
)
GROUP BY OrganizationId
ORDER BY AverageRating DESC
This is what I'm getting from the above statement:
I want organizationId 21 in the result, but it is missing because it has no rows in the tbl_reviews table.
How can I get the desired result?

You don't need a FULL, but a LEFT join:
SELECT COUNT( ReviewedBy ) AS TotalReviews, o.OrganizationId,
SUM( Rating ) AS TotalStars, COUNT( Rating ) AS TotalRatings,
(SUM( Rating ) / COUNT( Rating )) AS AverageRating
FROM tbl_organizations AS o
LEFT JOIN `tbl_reviews` AS r
ON o.OrganizationId = r.OrganizationId
AND ReviewType = 'shopper' -- conditions on inner table
AND ReviewFor = 'org' -- must be moved to ON
WHERE CategoryID =79
GROUP BY o.OrganizationId
ORDER BY AverageRating DESC
Why don't you use AVG instead of SUM/COUNT?
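The effect of moving the review conditions into ON can be checked with a small sketch like the following (Python with sqlite3; the two-organization data set is hypothetical, and AVG is used as suggested above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_organizations (OrganizationId INTEGER PRIMARY KEY, CategoryID INTEGER);
    CREATE TABLE tbl_reviews (OrganizationId INTEGER, ReviewType TEXT, ReviewFor TEXT, Rating INTEGER);
    INSERT INTO tbl_organizations VALUES (20, 79), (21, 79);
    -- organization 21 has no reviews at all
    INSERT INTO tbl_reviews VALUES (20, 'shopper', 'org', 4), (20, 'shopper', 'org', 2);
""")

# Review filters in ON: organization 21 is kept, with zero ratings and a NULL average.
filter_in_on = conn.execute("""
    SELECT o.OrganizationId, COUNT(r.Rating) AS TotalRatings, AVG(r.Rating) AS AverageRating
    FROM tbl_organizations AS o
    LEFT JOIN tbl_reviews AS r
           ON r.OrganizationId = o.OrganizationId
          AND r.ReviewType = 'shopper'
          AND r.ReviewFor = 'org'
    WHERE o.CategoryID = 79
    GROUP BY o.OrganizationId
    ORDER BY o.OrganizationId
""").fetchall()

# Same filters in WHERE: the NULL-extended row for 21 fails the test and is dropped.
filter_in_where = conn.execute("""
    SELECT o.OrganizationId, COUNT(r.Rating) AS TotalRatings, AVG(r.Rating) AS AverageRating
    FROM tbl_organizations AS o
    LEFT JOIN tbl_reviews AS r ON r.OrganizationId = o.OrganizationId
    WHERE o.CategoryID = 79
      AND r.ReviewType = 'shopper'
      AND r.ReviewFor = 'org'
    GROUP BY o.OrganizationId
    ORDER BY o.OrganizationId
""").fetchall()

print(filter_in_on)     # [(20, 2, 3.0), (21, 0, None)]
print(filter_in_where)  # [(20, 2, 3.0)]
```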

Have you tried:
FROM tbl_organizations
LEFT OUTER JOIN tbl_reviews
ON tbl_organizations.OrganizationId = tbl_reviews.OrganizationId
for your FROM clause? I don't think you need a full outer join in this case. A left outer join should do.

Mysql SUM Float give wrong value [duplicate]

I'm looking for help using sum() in my SQL query:
SELECT links.id,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
sum(conversions.value) as conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created desc;
I use DISTINCT because I'm doing "group by" and this ensures the same row is not counted more than once.
The problem is that SUM(conversions.value) counts the "value" for each row more than once (due to the group by)
I basically want to do SUM(conversions.value) for each DISTINCT conversions.id.
Is that possible?

I may be wrong but from what I understand
conversions.id is the primary key of your table conversions
stats.id is the primary key of your table stats
Thus for each conversions.id you have at most one links.id impacted.
Your request is a bit like taking the Cartesian product of 2 sets:
[clicks]
SELECT *
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
[conversions]
SELECT *
FROM links
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
and for each link, you get sizeof([clicks]) x sizeof([conversions]) lines
As you noted the number of unique conversions in your request can be obtained via a
count(distinct conversions.id) = sizeof([conversions])
this distinct manages to remove all the [clicks] lines in the cartesian product
but clearly
sum(conversions.value) = sum([conversions].value) * sizeof([clicks])
In your case, since
count(*) = sizeof([clicks]) x sizeof([conversions])
count(*) = sizeof([clicks]) x count(distinct conversions.id)
you have
sizeof([clicks]) = count(*)/count(distinct conversions.id)
so I would test your request with
SELECT links.id,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created desc;
Keep me posted !
Jerome

Jerome's solution is actually wrong and can produce incorrect results!
sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value
let's assume the following table
conversions
id value
1 5
1 5
1 5
2 2
3 1
the correct sum of value for distinct ids would be 8.
Jerome's formula produces:
sum(conversions.value) = 18
count(distinct conversions.id) = 3
count(*) = 5
18*3/5 = 10.8 != 8
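The arithmetic for the five-row example can be checked mechanically. This is a minimal sketch in Python with sqlite3; the table name joined is an assumption, standing for a join result in which id 1 happens to be repeated three times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The five rows from the example; id 1 appears three times.
    CREATE TABLE joined (id INTEGER, value INTEGER);
    INSERT INTO joined VALUES (1, 5), (1, 5), (1, 5), (2, 2), (3, 1);
""")

total, distinct_ids, row_count = conn.execute(
    "SELECT SUM(value), COUNT(DISTINCT id), COUNT(*) FROM joined"
).fetchone()

# The scaling formula sum * count(distinct id) / count(*), evaluated in floating point.
scaled = total * distinct_ids / row_count

# Sum of value taken once per distinct id.
(per_id_sum,) = conn.execute(
    "SELECT SUM(value) FROM (SELECT id, MAX(value) AS value FROM joined GROUP BY id) AS d"
).fetchone()

print(total, distinct_ids, row_count)  # 18 3 5
print(scaled)                          # 10.8
print(per_id_sum)                      # 8
```

The scaling only cancels out when every id is duplicated the same number of times, which is not the case for these rows.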

For an explanation of why you were seeing incorrect numbers, read this.
I think that Jerome has a handle on what is causing your error. Bryson's query would work, though having that subquery in the SELECT could be inefficient.

Use the following query:
SELECT links.id
, (
SELECT COUNT(*)
FROM stats
WHERE links.id = stats.parent_id
) AS clicks
, conversions.conversions
, conversions.conversion_value
FROM links
LEFT JOIN (
SELECT link_id
, COUNT(id) AS conversions
, SUM(conversions.value) AS conversion_value
FROM conversions
GROUP BY link_id
) AS conversions ON links.id = conversions.link_id
ORDER BY links.created DESC
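To see the difference on actual rows, here is a small sketch in Python with sqlite3 that runs the original double join next to a pre-aggregated query of the shape shown above. The column types and sample rows (3 clicks and 2 conversions on one link) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value REAL);
    INSERT INTO links VALUES (1, '2020-01-01'), (2, '2020-01-02');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO conversions VALUES (1, 1, 5.0), (2, 1, 2.5);
""")

# Double join: each conversion row is repeated once per click, so SUM is inflated 3x.
double_join = conn.execute("""
    SELECT links.id,
           COUNT(DISTINCT stats.id),
           COUNT(DISTINCT conversions.id),
           SUM(conversions.value)
    FROM links
    LEFT JOIN stats ON links.id = stats.parent_id
    LEFT JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC
""").fetchall()

# Pre-aggregated: conversions are summed per link before they ever meet the clicks.
pre_aggregated = conn.execute("""
    SELECT links.id,
           (SELECT COUNT(*) FROM stats WHERE stats.parent_id = links.id) AS clicks,
           c.conversions,
           c.conversion_value
    FROM links
    LEFT JOIN (SELECT link_id, COUNT(id) AS conversions, SUM(value) AS conversion_value
               FROM conversions
               GROUP BY link_id) AS c ON c.link_id = links.id
    ORDER BY links.created DESC
""").fetchall()

print(double_join)     # [(2, 0, 0, None), (1, 3, 2, 22.5)]
print(pre_aggregated)  # [(2, 0, None, None), (1, 3, 2, 7.5)]
```

Links without conversions come back with NULL in the last two columns; wrap them in COALESCE(..., 0) if zeros are preferred.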

I use a subquery to do this. It eliminates the problems with grouping.
So the query would be something like:
SELECT COUNT(DISTINCT conversions.id)
...
(SELECT SUM(conversions.value) FROM ....) AS Vals

How about something like this:
select t.id, max(t.clicks) clicks, count(t.conversions) conversions, sum(t.conversion_value) conversion_value
from (SELECT l.id id, l.created created,
count(DISTINCT s.id) clicks,
c.id conversions,
max(c.value) conversion_value
FROM links l
LEFT JOIN stats s ON l.id = s.parent_id
LEFT JOIN conversions c ON l.id = c.link_id
GROUP BY l.id, l.created, c.id) t
group by t.id, t.created
order by t.created

This will do the trick: group by conversion id first, then divide each conversion's sum by the number of duplicated rows for that conversion id.
SELECT a.id,
a.clicks,
SUM(a.conversion_value/a.conversions) AS conversion_value,
a.conversions
FROM (SELECT links.id,
COUNT(DISTINCT stats.id) AS clicks,
COUNT(conversions.id) AS conversions,
SUM(conversions.value) AS conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY conversions.id,links.id
ORDER BY links.created DESC) AS a
GROUP BY a.id

Select x.id, max(x.clicks) as clicks, sum(x.conversions) as conversions, sum(x.value) as conversion_value
FROM
(SELECT links.id,
links.created,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
conversions.value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id, links.created, conversions.id, conversions.value) x
GROUP BY x.id, x.created
ORDER BY x.created desc;
I believe this will give you the answer that you are looking for.

LEFT JOIN after GROUP BY?

I have a table of "Songs", "Songs_Tags" (relating songs with tags) and "Songs_Votes" (relating songs with boolean like/dislike).
I need to retrieve the songs with a GROUP_CONCAT() of its tags and also the number of likes (true) and dislikes (false).
My query is something like that:
SELECT
s.*,
GROUP_CONCAT(st.id_tag) AS tags_ids,
COUNT(CASE WHEN v.vote=1 THEN 1 ELSE NULL END) as votesUp,
COUNT(CASE WHEN v.vote=0 THEN 1 ELSE NULL END) as votesDown
FROM Songs s
LEFT JOIN Songs_Tags st ON (s.id = st.id_song)
LEFT JOIN Votes v ON (s.id=v.id_song)
GROUP BY s.id
ORDER BY id DESC
The problem is that when a song has more than 1 tag, its row is repeated once per tag in the join, so the COUNT() calls return inflated numbers.
The best solution I could think of would be to do the last LEFT JOIN after the GROUP BY (so that there is only one row per song at that point). Then I'd need another GROUP BY s.id.
Is there a way to accomplish that? Do I need to use a subquery?

There've been some good answers so far, but I would adopt a slightly different method quite similar to what you described originally
SELECT
songsWithTags.*,
COALESCE(SUM(v.vote),0) AS votesUp,
COALESCE(SUM(1-v.vote),0) AS votesDown
FROM (
SELECT
s.*,
COALESCE(GROUP_CONCAT(st.id_tag),'') AS tags_ids
FROM Songs s
LEFT JOIN Songs_Tags st
ON st.id_song = s.id
GROUP BY s.id
) AS songsWithTags
LEFT JOIN Votes v
ON songsWithTags.id = v.id_song
GROUP BY songsWithTags.id
ORDER BY songsWithTags.id DESC
In this query, the subquery is responsible for collapsing songs and tags into one row per song. That result is then joined to Votes. I also opted to simply sum the v.vote column, since you indicated it is 1 or 0: SUM(v.vote) adds up 1+1+1+0+0 = 3, so 3 out of 5 are upvotes, while SUM(1-v.vote) adds up 0+0+0+1+1 = 2, so 2 out of 5 are downvotes.
If you had an index on votes with the columns (id_song,vote) then that index would be used for this so it wouldn't even hit the table. Likewise if you had an index on Songs_Tags with (id_song,id_tag) then that table wouldn't be hit by the query.
Edit: added a solution using COUNT:
SELECT
songsWithTags.*,
COUNT(CASE WHEN v.vote=1 THEN 1 END) as votesUp,
COUNT(CASE WHEN v.vote=0 THEN 1 END) as votesDown
FROM (
SELECT
s.*,
COALESCE(GROUP_CONCAT(st.id_tag),'') AS tags_ids
FROM Songs s
LEFT JOIN Songs_Tags st
ON st.id_song = s.id
GROUP BY s.id
) AS songsWithTags
LEFT JOIN Votes v
ON songsWithTags.id = v.id_song
GROUP BY songsWithTags.id
ORDER BY songsWithTags.id DESC

Try this:
SELECT
s.*,
GROUP_CONCAT(DISTINCT st.id_tag) AS tags_ids,
COUNT(DISTINCT CASE WHEN v.vote=1 THEN id_vote ELSE NULL END) AS votesUp,
COUNT(DISTINCT CASE WHEN v.vote=0 THEN id_vote ELSE NULL END) AS votesDown
FROM Songs s
LEFT JOIN Songs_Tags st ON (s.id = st.id_song)
LEFT JOIN Votes v ON (s.id=v.id_song)
GROUP BY s.id
ORDER BY id DESC

Your code results in a mini-Cartesian product because you are doing two Joins in 1-to-many relationships and the 1 table is on the same side of both joins.
Convert to 2 subqueries with groupings and then Join:
SELECT
s.*,
COALESCE(st.tags_ids, '') AS tags_ids,
COALESCE(v.votesUp, 0) AS votesUp,
COALESCE(v.votesDown, 0) AS votesDown
FROM
Songs AS s
LEFT JOIN
( SELECT
id_song,
GROUP_CONCAT(id_tag) AS tags_ids
FROM Songs_Tags
GROUP BY id_song
) AS st
ON s.id = st.id_song
LEFT JOIN
( SELECT
id_song,
COUNT(CASE WHEN vote=1 THEN id_vote END) AS votesUp,
COUNT(CASE WHEN vote=0 THEN id_vote END) AS votesDown
FROM Votes
GROUP BY id_song
) AS v
ON s.id = v.id_song
ORDER BY s.id DESC
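A quick way to convince yourself that grouping before joining removes the mini-Cartesian product is a sketch like this one, in Python with sqlite3. The title column, the id_vote column and the sample rows are assumptions; SQLite's GROUP_CONCAT stands in for MySQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Songs (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Songs_Tags (id_song INTEGER, id_tag INTEGER);
    CREATE TABLE Votes (id_vote INTEGER PRIMARY KEY, id_song INTEGER, vote INTEGER);
    INSERT INTO Songs VALUES (1, 'first'), (2, 'second');
    INSERT INTO Songs_Tags VALUES (1, 10), (1, 11);
    INSERT INTO Votes VALUES (1, 1, 1), (2, 1, 1), (3, 1, 0);
""")

# Both one-to-many joins at once: 2 tags x 3 votes = 6 rows for song 1.
naive = conn.execute("""
    SELECT s.id,
           COUNT(CASE WHEN v.vote = 1 THEN 1 END) AS votesUp,
           COUNT(CASE WHEN v.vote = 0 THEN 1 END) AS votesDown
    FROM Songs s
    LEFT JOIN Songs_Tags st ON st.id_song = s.id
    LEFT JOIN Votes v ON v.id_song = s.id
    GROUP BY s.id
    ORDER BY s.id DESC
""").fetchall()

# Each child table is reduced to one row per song before it is joined.
grouped_first = conn.execute("""
    SELECT s.id,
           COALESCE(st.tags_ids, '') AS tags_ids,
           COALESCE(v.votesUp, 0)    AS votesUp,
           COALESCE(v.votesDown, 0)  AS votesDown
    FROM Songs AS s
    LEFT JOIN (SELECT id_song, GROUP_CONCAT(id_tag) AS tags_ids
               FROM Songs_Tags
               GROUP BY id_song) AS st ON st.id_song = s.id
    LEFT JOIN (SELECT id_song,
                      COUNT(CASE WHEN vote = 1 THEN id_vote END) AS votesUp,
                      COUNT(CASE WHEN vote = 0 THEN id_vote END) AS votesDown
               FROM Votes
               GROUP BY id_song) AS v ON v.id_song = s.id
    ORDER BY s.id DESC
""").fetchall()

print(naive)          # song 1 reports 4 up / 2 down instead of 2 up / 1 down
print(grouped_first)  # song 1 reports its two tags with 2 up / 1 down
```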
