MySQL: different behaviors depending on WHERE result

I'm modifying an existing project, and I want to combine 3 MySQL queries into one.
The 3 queries select the same data; only the WHERE clause changes.
Here is one of the queries, for example:
SELECT COUNT(seg.my_seg1) FROM (
SELECT COUNT(DISTINCT cp.conference_id) as my_seg1 FROM A.Account a
INNER JOIN A.ConferenceParticipant cp ON a.account_id = cp.user_id
INNER JOIN A.Conference cf ON cf.id = cp.conference_id
WHERE cf.`status` = 0
AND DATE_SUB(CURDATE(), INTERVAL 30 DAY) <= cf.creation_timestamp
GROUP BY a.account_id) as seg
WHERE seg.my_seg1 >= 30
The 2 other queries are exactly the same, except for the final WHERE:
WHERE seg.my_seg1 >= 11 AND seg.my_seg1 <= 30;
and:
WHERE seg.my_seg1 >= 30;
So my question is: how can I get 3 different values, one per WHERE condition, in the same query?

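Like this you'll get one virtual column per filter. Use SUM rather than COUNT here: COUNT counts every non-NULL value, including 0, so it would return the total number of rows in each column.
SELECT
SUM(IF(seg.my_seg1 >= 30, 1, 0)) AS res1,
SUM(IF(seg.my_seg1 >= 11 AND seg.my_seg1 < 30, 1, 0)) AS res2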
FROM (
SELECT COUNT(DISTINCT cp.conference_id) as my_seg1
FROM A.Account a
JOIN A.ConferenceParticipant cp ON a.account_id = cp.user_id
JOIN A.Conference cf ON cf.id = cp.conference_id
WHERE
cf.`status` = 0
AND DATE_SUB(CURDATE(), INTERVAL 30 DAY) <= cf.creation_timestamp
GROUP BY a.account_id
) AS seg
But you will have to revise your filters: you mention 3, but I only see 2 different ones.
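The difference between counting and summing a conditional expression can be checked on a few rows. This is a minimal sketch, not the poster's query: it uses Python's sqlite3 module instead of MySQL, a hypothetical `seg` table holding made-up per-account counts, `CASE WHEN` in place of MySQL's `IF()`, and an invented third bucket (`< 11`) purely for illustration.

```python
import sqlite3

# Hypothetical per-account conference counts standing in for the `seg` subquery.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seg (my_seg1 INTEGER)")
conn.executemany("INSERT INTO seg VALUES (?)", [(5,), (12,), (29,), (30,), (45,)])

row = conn.execute(
    """
    SELECT
        -- COUNT sees 1 or 0, both non-NULL, so it counts every row
        COUNT(CASE WHEN my_seg1 >= 30 THEN 1 ELSE 0 END) AS counted_all,
        -- SUM adds 1 only for rows that match the condition
        SUM(CASE WHEN my_seg1 >= 30 THEN 1 ELSE 0 END) AS res1,
        SUM(CASE WHEN my_seg1 >= 11 AND my_seg1 < 30 THEN 1 ELSE 0 END) AS res2,
        -- COUNT also works when non-matching rows yield NULL (no ELSE branch)
        COUNT(CASE WHEN my_seg1 < 11 THEN 1 END) AS res3
    FROM seg
    """
).fetchone()

print(row)
```

With these five rows, `counted_all` is the total row count, while `res1`, `res2` and `res3` each count only the rows in their own range.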

Related

If statement in mysql query with inner join

I'm currently showing users that have unfinished jobs, and based on the results I run a while loop and a switch/case statement to come up with the final results. I'm wondering if it is possible to move that logic into the MySQL query.
select
sum(cnt_jobs_unfinished = 0) cnt_users_no_unfinished_jobs,
sum(cnt_jobs_unfinished_30d > 0) cnt_users_unfinished_30d,
sum(cnt_jobs_unfinished_31_60d > 0) cnt_users_unfinished_31_60d,
sum(cnt_jobs_unfinished_61_90d > 0) cnt_users_unfinished_61_90d,
sum(cnt_jobs_unfinished_90d_more > 0) cnt_users_unfinished_90d_more
from (
select
u.user_id,
sum(l.job_id is null) cnt_jobs_unfinished,
sum(l.job_id is null and j.date >= curdate() - interval 30 day) cnt_jobs_unfinished_30d,
sum(
l.job_id is null
and j.date < curdate() - interval 30 day
and j.date >= curdate() - interval 60 day
) cnt_jobs_unfinished_31_60d,
sum(
l.job_id is null
and j.date < curdate() - interval 60 day
and j.date >= curdate() - interval 90 day
) cnt_jobs_unfinished_61_90d,
sum(
l.job_id is null
and j.date < curdate() - interval 90 day
) cnt_jobs_unfinished_90d_more
from users u
inner join scheduled_jobs j
on j.date <= curdate()
and j.user_id = u.user_id
left join last_update l
on l.job_id = j.job_id
group by u.user_id
) t
Here is the dbfiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=d2f217e074a391d8b5f769e08b1d2c87
As you can see, because one user has unfinished jobs both in the 61-90 day range and in the 90+ day range, the final table counts that user in both columns, which is a mistake. The correct result would be 61-90 days: 0 users and 90+ days: 1 user.
This is what I came up with (fiddle):
SELECT
sum(IF(t.nb_days IS NULL OR t.nb_days <= 0, 1, 0)) cnt_users_no_unfinished_jobs,
sum(IF(t.nb_days > 0 AND t.nb_days <= 30, 1, 0)) cnt_users_unfinished_30d,
sum(IF(t.nb_days > 30 AND t.nb_days <= 60, 1, 0)) cnt_users_unfinished_31_60d,
sum(IF(t.nb_days > 60 AND t.nb_days <= 90, 1, 0)) cnt_users_unfinished_61_90d,
sum(IF(t.nb_days > 90, 1, 0)) cnt_users_unfinished_90d_more
FROM users u
LEFT JOIN (
SELECT j.user_id, DATEDIFF(curdate(), MIN(j.date)) AS nb_days
FROM scheduled_jobs j
LEFT JOIN last_update l
ON l.job_id = j.job_id
WHERE l.job_id IS NULL
GROUP BY j.user_id
) AS t
ON u.user_id = t.user_id
Instead of counting how many jobs fall in each range for each user, I'm only looking at each user's oldest unfinished job and extracting the number of days since its deadline: DATEDIFF(curdate(), MIN(j.date)) AS nb_days.
This result is then LEFT JOIN-ed to users. That way, a user without any unfinished jobs still shows up in the cnt_users_no_unfinished_jobs column, thanks to the check for NULL values.
Finally, it's just a matter of counting how many nb_days values fall in each range.
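The "oldest unfinished job per user" idea can be tried on a tiny data set. The sketch below is an approximation under stated assumptions: it runs on SQLite through Python's sqlite3 rather than MySQL, replaces `DATEDIFF(curdate(), ...)` with a `julianday` difference against a fixed, made-up date so the result is reproducible, and uses invented sample rows. Only three of the five buckets are shown.

```python
import sqlite3

TODAY = "2024-06-30"  # fixed reference date (assumption), instead of CURDATE()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE scheduled_jobs (job_id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT);
    CREATE TABLE last_update (job_id INTEGER);
    INSERT INTO users VALUES (1), (2), (3);
    -- user 1: two unfinished jobs, 70 and 100 days old
    -- user 2: one job, already finished; user 3: no jobs at all
    INSERT INTO scheduled_jobs VALUES
        (10, 1, '2024-04-21'), (11, 1, '2024-03-22'), (12, 2, '2024-06-01');
    INSERT INTO last_update VALUES (12);
    """
)

row = conn.execute(
    """
    SELECT
        SUM(CASE WHEN t.nb_days IS NULL OR t.nb_days <= 0 THEN 1 ELSE 0 END),
        SUM(CASE WHEN t.nb_days > 60 AND t.nb_days <= 90 THEN 1 ELSE 0 END),
        SUM(CASE WHEN t.nb_days > 90 THEN 1 ELSE 0 END)
    FROM users u
    LEFT JOIN (
        -- age in days of each user's oldest unfinished job
        SELECT j.user_id, julianday(?) - julianday(MIN(j.date)) AS nb_days
        FROM scheduled_jobs j
        LEFT JOIN last_update l ON l.job_id = j.job_id
        WHERE l.job_id IS NULL
        GROUP BY j.user_id
    ) AS t ON u.user_id = t.user_id
    """,
    (TODAY,),
).fetchone()

print(row)  # (no unfinished jobs, 61-90 days, 90+ days)
```

User 1 has jobs in both the 61-90 and the 90+ range, but is counted once, in the 90+ bucket, because only the oldest job decides the bucket.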

MySql graph query multiple series aligned to same time x-axis

I have queries that I'm using to make a graph of earnings. But now people are able to earn from two different sources, so I want to separate this out into two lines on the same chart
This one for standard earnings:
SELECT DATE_FORMAT(earning_created, '%c/%e/%Y') AS day, SUM(earning_amount) AS earning_standard
FROM earnings
WHERE earning_account_id = ? AND earning_referral_id = 0 AND (earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
GROUP BY DATE(earning_created)
ORDER BY earning_created
And this one for referral earnings:
SELECT DATE_FORMAT(e.earning_created, '%c/%e/%Y') AS day, SUM(e.earning_amount) AS earning_referral
FROM earnings AS e
INNER JOIN referrals AS r
ON r.referral_id = e.earning_referral_id
WHERE e.earning_account_id = ? AND e.earning_referral_id > 0 AND (e.earning_created > DATE_SUB(now(), INTERVAL 90 DAY)) AND r.referral_type = 0
GROUP BY DATE(e.earning_created)
ORDER BY e.earning_created
How do I run the queries together, so that the output has two columns/series for the y-axis (earning_standard and earning_referral),
but with both aligned to the same day column/scale for the x-axis, substituting zero when there are no earnings for a specific series?
You'll need to use both of those queries as subqueries, joined back to the list of days:
-- DISTINCT keeps one output row per day rather than one per earnings row
SELECT DISTINCT DATE_FORMAT(earnings.earning_created, '%c/%e/%Y') AS day,
COALESCE(es.earning_standard, 0) AS earning_standard,
COALESCE(er.earning_referral, 0) AS earning_referral
FROM earnings
LEFT JOIN (SELECT DATE_FORMAT(earning_created, '%c/%e/%Y') AS day,
SUM(earning_amount) AS earning_standard
FROM earnings
WHERE earning_account_id = ?
AND earning_referral_id = 0
AND (earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
GROUP BY DATE(earning_created)) AS es
-- the select alias "day" is not visible in ON, so repeat the expression
ON (DATE_FORMAT(earnings.earning_created, '%c/%e/%Y') = es.day)
LEFT JOIN (SELECT DATE_FORMAT(e.earning_created, '%c/%e/%Y') AS day,
SUM(e.earning_amount) AS earning_referral
FROM earnings AS e
INNER JOIN referrals AS r
ON r.referral_id = e.earning_referral_id
WHERE e.earning_account_id = ?
AND e.earning_referral_id > 0
AND (e.earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND r.referral_type = 0
GROUP BY DATE(e.earning_created)) AS er
ON (DATE_FORMAT(earnings.earning_created, '%c/%e/%Y') = er.day)
WHERE earnings.earning_account_id = ?
-- same 90-day window as the subqueries, so older days do not show up as zero rows
AND earnings.earning_created > DATE_SUB(now(), INTERVAL 90 DAY)
ORDER BY day
where I'm assuming earning_account_id = ? is intended to be with a question mark because the language you're using to run the query is replacing it with the actual id before running the query.
SELECT
COALESCE(t1.amount,0) AS link_earnings,
COALESCE(t2.amount,0) AS publisher_referral_earnings,
COALESCE(t3.amount,0) AS advertiser_referral_earnings,
t1.day AS day
FROM
(
SELECT DATE_FORMAT(earning_created, '%c/%e/%Y') AS day, SUM(earning_amount) AS amount
FROM earnings
WHERE earning_referral_id = 0
AND (earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND earning_account_id = ?
GROUP BY DATE(earning_created)
) t1
LEFT JOIN
(
SELECT DATE_FORMAT(ep.earning_created, '%c/%e/%Y') AS day, SUM(ep.earning_amount * rp.referral_share) AS amount
FROM earnings AS ep
INNER JOIN referrals AS rp
ON ep.earning_referral_id = rp.referral_id
WHERE ep.earning_referral_id > 0
AND (ep.earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND ep.earning_account_id = ?
AND rp.referral_type = 0
GROUP BY DATE(ep.earning_created)
) t2
ON t1.day = t2.day
LEFT JOIN
(
SELECT DATE_FORMAT(ea.earning_created, '%c/%e/%Y') AS day, SUM(ea.earning_amount * ra.referral_share) AS amount
FROM earnings AS ea
INNER JOIN referrals AS ra
ON ea.earning_referral_id = ra.referral_id
WHERE ea.earning_referral_id > 0
AND (ea.earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND ea.earning_account_id = ?
AND ra.referral_type = 1
GROUP BY DATE(ea.earning_created)
) t3
ON t1.day = t3.day
ORDER BY day
Seems to run ok....
You can simply use an outer join to retain earnings even when there is no matching referral, and then conditionally sum depending on whether a referral exists or not:
SELECT DATE_FORMAT(e.earning_created, '%c/%e/%Y') AS day,
SUM(IF(r.referral_id IS NULL, e.earning_amount, 0)) earning_standard,
SUM(IF(r.referral_id IS NULL, 0, e.earning_amount)) earning_referral
FROM earnings e LEFT JOIN referrals r ON r.referral_id = e.earning_referral_id
WHERE e.earning_account_id = ?
AND e.earning_created > CURRENT_DATE - INTERVAL 90 DAY
AND (r.referral_id IS NULL OR r.referral_type = 0)
GROUP BY 1
ORDER BY 1
I've assumed here that earnings.earning_referral_id is never negative, though you can add an explicit test to filter such records if so desired.
I've also changed the filter on earnings.earning_created to be based on CURRENT_DATE rather than NOW(), so that any earnings created earlier than the current time of day on the first day of the series are still included. This is typically what one actually wants, but feel free to change it back if not.
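The outer join with conditional sums can be sketched on sample data as follows. This is only an illustration under assumptions: it uses SQLite via Python's sqlite3 instead of MySQL, `CASE WHEN` in place of `IF()`, made-up rows, and it leaves out the account filter and the 90-day window to stay short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE referrals (referral_id INTEGER PRIMARY KEY, referral_type INTEGER);
    CREATE TABLE earnings (
        earning_created TEXT, earning_amount INTEGER, earning_referral_id INTEGER
    );
    INSERT INTO referrals VALUES (1, 0), (2, 1);
    -- earning_referral_id = 0 means a standard earning (no matching referral row)
    INSERT INTO earnings VALUES
        ('2024-06-01 09:00:00', 10, 0),
        ('2024-06-01 15:00:00', 5, 1),
        ('2024-06-02 10:00:00', 7, 0),
        ('2024-06-03 11:00:00', 4, 1),
        ('2024-06-03 12:00:00', 100, 2);
    """
)

rows = conn.execute(
    """
    SELECT DATE(e.earning_created) AS day,
           SUM(CASE WHEN r.referral_id IS NULL THEN e.earning_amount ELSE 0 END)
               AS earning_standard,
           SUM(CASE WHEN r.referral_id IS NULL THEN 0 ELSE e.earning_amount END)
               AS earning_referral
    FROM earnings e
    LEFT JOIN referrals r ON r.referral_id = e.earning_referral_id
    -- keep standard earnings and type-0 referrals only
    WHERE r.referral_id IS NULL OR r.referral_type = 0
    GROUP BY DATE(e.earning_created)
    ORDER BY DATE(e.earning_created)
    """
).fetchall()

for day_row in rows:
    print(day_row)
```

Each day appears once with both series, and a series with no earnings on that day comes out as 0 without any COALESCE, because the conditional SUM adds 0 for the non-matching rows.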

MySQL using IF or CASE statement across joined tables

Hi all, here is a MySQL problem: take the results of a 2-table join, assess them conditionally, and output 2 values.
Here is the database structure.
The 1st table, gtpro, contains:
- a user ID (column name id)
- a samples-per-year number, i.e. 2, 4 or 12 times/year (column name labSamples__yr)
The 2nd table, labresults, contains:
- that same user ID (column name idgtpro)
- a date column for the sample dates, i.e. when the samples were provided (column name date)
This query returns an overview of all ids and when the last samples were submitted for each id:
SELECT a.id, a.labSamples__yr, max(b.date) as ndate from gtpro as a
join labresults as b on a.id = b.idgtpro group by a.id
The conditions I want to evaluate look like this:
a.labSamples__yr = 2 and ndate >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
a.labSamples__yr = 4 and ndate >= DATE_SUB(CURDATE(), INTERVAL 3 MONTH)
a.labSamples__yr = 12 and ndate >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH)
So if the number of samples per year is 2 and the last sample date was more than 6 months ago, I want to know the id and the latest sample date for that id.
I tried using CASE and IF statements but can't quite get it right. This was my latest attempt:
select id, ndate,
case when (labSamples__yr = 2 and ndate <= DATE_SUB(CURDATE(), INTERVAL 6 MONTH))is true
then
(SELECT id from gtpro as a join labresults as b on a.id = b.idgtpro where
labSamples__yr = 2 and max(b.date) <= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)) end as id
from (SELECT a.id, a.labSamples__yr, max(b.date) as ndate from gtpro as a
join labresults as b on a.id = b.idgtpro group by a.id) d
This tells me "invalid use of group function".
Desperate for a bit of help.
EDIT: I messed up some of the names in the code above, which I have now fixed.
If I understand your question correctly, you should be able to put the conditions in the where clause:
SELECT a.id, a.labSamples__yr, max(b.date) as ndate
from gtpro a join
labresults b
on a.id = b.idgtpro
where (a.labSamples__yr = 2 and b.date >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)) or
(a.labSamples__yr = 4 and b.date >= DATE_SUB(CURDATE(), INTERVAL 3 MONTH)) or
(a.labSamples__yr = 12 and b.date >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH))
group by a.id;
That fixes your syntax problem. But, if you want the id with the maximum date, try doing this:
select a.labSamples__yr, max(b.date) as ndate,
substring_index(group_concat(a.id order by b.date desc), ',', 1) as maxid
from gtpro a join
labresults b
on a.id = b.idgtpro
where (a.labSamples__yr = 2 and b.date >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)) or
(a.labSamples__yr = 4 and b.date >= DATE_SUB(CURDATE(), INTERVAL 3 MONTH)) or
(a.labSamples__yr = 12 and b.date >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH))
group by a.labSamples__yr;
Putting a.id in the group by is not going to give you the maximum id of anything.
Is this meant to be valid MySQL? I wasn't aware of "is true" being valid in a CASE statement. In fairness though I'm more familiar with Oracle and SQL Server but nevertheless... does any part of this statement work?
EDIT
Ok, here is what I have edited the code to be:
select id, ndate,
case when (labSamples__yr = 2 and ndate <= DATE_SUB(CURDATE(), INTERVAL 6 MONTH))
then
(SELECT a.id from bifipro as a join labresults as b on a.id = b.idBifipro
where a.id = d.id and a.labSamples__yr = 2
group by a.id
having max(b.date) <= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)) end as id
from (SELECT a.id, a.labSamples__yr, max(b.date) as ndate from bifipro as a
join labresults as b on a.id = b.idBifipro group by a.id) d
In your correlated subquery I have added the predicate a.id = d.id and moved the max(b.date) test into a HAVING clause, since an aggregate function is not allowed in WHERE (that is what "invalid use of group function" is complaining about).
I have also removed the text "is true" from your CASE expression (this may be incorrect, but I didn't think it should be there).
My answer, partly inspired by Tomas (SQL and syntax clarification): I got rid of the CASE altogether. It seems nice and clean to me, but I would like to hear any other suggestions.
select id, labSamples__yr, ndate from
(SELECT a.id, a.labSamples__yr, max(b.date) as ndate from gtpro as a
join labresults as b on a.id = b.idgtpro group by a.id)d
where (ndate <= DATE_SUB(CURDATE(), INTERVAL 6 MONTH) and labSamples__yr = 2)
or (ndate <= DATE_SUB(CURDATE(), INTERVAL 3 MONTH) and labSamples__yr = 4)
or (ndate <= DATE_SUB(CURDATE(), INTERVAL 1 MONTH) and labSamples__yr = 12)
Thanks for looking, but it would still be nice to see a solution using a CASE statement, for future reference.
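One possible CASE-based variant is to let a CASE expression pick the threshold for each sampling frequency and compare the latest sample date against it in HAVING. The sketch below is an assumption-laden illustration, not a tested MySQL answer: it runs on SQLite through Python's sqlite3, uses SQLite's `date(..., '-N months')` modifier instead of `DATE_SUB`, a fixed made-up reference date instead of `CURDATE()`, and invented sample rows.

```python
import sqlite3

TODAY = "2024-06-30"  # fixed reference date (assumption), instead of CURDATE()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gtpro (id INTEGER PRIMARY KEY, labSamples__yr INTEGER);
    CREATE TABLE labresults (idgtpro INTEGER, date TEXT);
    INSERT INTO gtpro VALUES (1, 2), (2, 4), (3, 12), (4, 2);
    INSERT INTO labresults VALUES
        (1, '2023-06-01'), (1, '2023-11-01'),  -- 2/yr, last sample over 6 months ago
        (2, '2024-05-01'),                     -- 4/yr, last sample within 3 months
        (3, '2024-05-15'),                     -- 12/yr, last sample over 1 month ago
        (4, '2024-03-01');                     -- 2/yr, last sample within 6 months
    """
)

overdue = conn.execute(
    """
    SELECT a.id, a.labSamples__yr, MAX(b.date) AS ndate
    FROM gtpro a
    JOIN labresults b ON a.id = b.idgtpro
    GROUP BY a.id, a.labSamples__yr
    -- the CASE picks the allowed gap; an unknown frequency yields NULL and is skipped
    HAVING MAX(b.date) <= date(?, CASE a.labSamples__yr
                                      WHEN 2 THEN '-6 months'
                                      WHEN 4 THEN '-3 months'
                                      WHEN 12 THEN '-1 months'
                                  END)
    ORDER BY a.id
    """,
    (TODAY,),
).fetchall()

print(overdue)
```

In MySQL the same idea would presumably put the CASE inside the interval, along the lines of DATE_SUB(CURDATE(), INTERVAL CASE ... END MONTH); that form is not verified here.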

sql gives no results because empty left join

So I have this query:
SELECT
AED.aId, CJ.*
FROM
AED
LEFT JOIN
Cronjob as CJ
ON CJ.aID = AED.aId
WHERE
AED.aStatus = '1'
AND
(
CJ.cjDatum < CURRENT_DATE - INTERVAL 14 DAY
AND
AED.aRegistratie > CURRENT_DATE - INTERVAL 10 YEAR
)
OR
(
CJ.cjStatus = '9'
OR
CJ.cjStatus = '2'
)
The problem is that the Cronjob table is empty, and even when it is empty I still want to get all the ids from AED with status 1.
I couldn't find anything useful, so I hope you guys can help!
You should move all the CJ criteria into the join's ON clause.
ON (
CJ.aID = AED.aId
AND (cjStatus in ('2','9') OR cjDatum < CURRENT_DATE - INTERVAL 14 DAY ))
Option two would be to leave them in WHERE, but make provisions for the case that cjStatus and friends are NULL (which they will be if no match is found).
OR cjStatus IS NULL
When there is no Cronjob row associated, a filter such as CJ.cjStatus = '9' does not evaluate to true, since CJ.cjStatus is NULL. That is what a LEFT JOIN does: it returns NULL fields when there is no corresponding row.
To filter on the table you LEFT JOIN with, the filter must go in the join's ON clause, like this:
SELECT AED.aId
, CJ.*
FROM AED
LEFT JOIN Cronjob as CJ
ON CJ.aID = AED.aId
AND (
(CJ.cjDatum < CURRENT_DATE - INTERVAL 14 DAY
AND AED.aRegistratie > CURRENT_DATE - INTERVAL 10 YEAR
)
OR (CJ.cjStatus = '9' OR CJ.cjStatus = '2')
)
WHERE AED.aStatus = '1'
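The effect of placing the Cronjob filter in WHERE versus ON can be reproduced with an empty Cronjob table. This is a minimal sketch using SQLite via Python's sqlite3 rather than MySQL, with made-up AED rows and only the cjStatus part of the filter; the NULL behaviour of LEFT JOIN is the same in both systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AED (aId INTEGER PRIMARY KEY, aStatus TEXT);
    CREATE TABLE Cronjob (aID INTEGER, cjStatus TEXT);
    -- Cronjob is deliberately left empty
    INSERT INTO AED VALUES (1, '1'), (2, '1'), (3, '0');
    """
)

# Filter in WHERE: CJ.cjStatus is NULL for every row, so nothing survives.
filter_in_where = conn.execute(
    """
    SELECT AED.aId FROM AED
    LEFT JOIN Cronjob AS CJ ON CJ.aID = AED.aId
    WHERE AED.aStatus = '1' AND CJ.cjStatus IN ('2', '9')
    ORDER BY AED.aId
    """
).fetchall()

# Filter in ON: it only decides which Cronjob rows match; AED rows are kept.
filter_in_on = conn.execute(
    """
    SELECT AED.aId FROM AED
    LEFT JOIN Cronjob AS CJ
        ON CJ.aID = AED.aId AND CJ.cjStatus IN ('2', '9')
    WHERE AED.aStatus = '1'
    ORDER BY AED.aId
    """
).fetchall()

print(filter_in_where, filter_in_on)
```

The first query returns no rows at all, while the second returns every AED row with status '1'.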
Add OR (CJ.aid is null) to the AND part of your condition:
SELECT
AED.aId, CJ.*
FROM
AED
LEFT JOIN
Cronjob as CJ
ON CJ.aID = AED.aId
WHERE
AED.aStatus = '1'
AND
(
(
CJ.cjDatum < CURRENT_DATE - INTERVAL 14 DAY
AND
AED.aRegistratie > CURRENT_DATE - INTERVAL 10 YEAR
)
OR (CJ.aid is null)
)
OR
(
CJ.cjStatus = '9'
OR
CJ.cjStatus = '2'
)

Multiple Left Joins have affected my SUM(TIMESTAMPDIFF calculation

I am currently trying to create a report for the amount of time in total some individuals on my PHPBB3 forum have booked in, for the last week. Initially, I had the following query that worked as expected:
SELECT forum_users.username, SUM(TIMESTAMPDIFF(SECOND, schedule_slots.time_starting, schedule_slots.time_finishing)) AS seconds
FROM forum_users
LEFT JOIN schedule_slots
ON forum_users.user_id = schedule_slots.user_id
AND schedule_slots.time_starting >= (CURDATE() - INTERVAL 1 WEEK)
AND schedule_slots.is_del = 0
AND schedule_slots.channel = 0
WHERE (forum_users.group_id = 8 OR forum_users.group_id = 5 OR forum_users.group_id = 14)
GROUP BY forum_users.username
ORDER BY upper(forum_users.username)
However, when I join another table, the timestamp difference ends up being calculated incorrectly (it's higher). Here's my newer, non-working statement:
SELECT forum_users.username, SUM(TIMESTAMPDIFF(SECOND, schedule_slots.time_starting, schedule_slots.time_finishing)) AS seconds, group_concat(DISTINCT forum_user_group.group_id) AS user_groups
FROM forum_users
LEFT JOIN schedule_slots
ON forum_users.user_id = schedule_slots.user_id
AND schedule_slots.time_starting >= (CURDATE() - INTERVAL 1 WEEK)
AND schedule_slots.is_del = 0
AND schedule_slots.channel = 0
LEFT JOIN forum_user_group
ON forum_user_group.user_id = forum_users.user_id
WHERE (forum_users.group_id = 8 OR forum_users.group_id = 5 OR forum_users.group_id = 14 OR forum_users.group_id = 12)
GROUP BY forum_users.username
ORDER BY upper(forum_users.username)
I'm drawing a blank on this one, and your help is greatly appreciated.
Each timestamp difference is calculated correctly, but the second LEFT JOIN repeats every schedule slot once per group the user is in, so the SUM is multiplied by the user's number of groups.
I would try:
SELECT forum_users.username, SUM(TIMESTAMPDIFF(SECOND, schedule_slots.time_starting, schedule_slots.time_finishing)) AS seconds,
(select group_concat(DISTINCT group_id) from forum_user_group WHERE forum_user_group.user_id = forum_users.user_id) AS user_groups
FROM forum_users
LEFT JOIN schedule_slots
ON forum_users.user_id = schedule_slots.user_id
AND schedule_slots.time_starting >= (CURDATE() - INTERVAL 1 WEEK)
AND schedule_slots.is_del = 0
AND schedule_slots.channel = 0
WHERE (forum_users.group_id = 8 OR forum_users.group_id = 5 OR forum_users.group_id = 14 OR forum_users.group_id = 12)
GROUP BY forum_users.username, user_groups
ORDER BY upper(forum_users.username)
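The row multiplication and the subquery fix can be observed on a single user. The sketch below makes several assumptions: it runs on SQLite via Python's sqlite3 instead of MySQL, replaces the TIMESTAMPDIFF expression with a hypothetical precomputed `seconds` column, drops the date and group filters, and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE forum_users (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE schedule_slots (user_id INTEGER, seconds INTEGER);
    CREATE TABLE forum_user_group (user_id INTEGER, group_id INTEGER);
    INSERT INTO forum_users VALUES (1, 'alice');
    INSERT INTO schedule_slots VALUES (1, 3600), (1, 1800);          -- 5400 s in total
    INSERT INTO forum_user_group VALUES (1, 5), (1, 8), (1, 14);     -- 3 groups
    """
)

# Joining both tables yields 2 slots x 3 groups = 6 rows for the user.
inflated = conn.execute(
    """
    SELECT u.username, SUM(s.seconds)
    FROM forum_users u
    LEFT JOIN schedule_slots s ON s.user_id = u.user_id
    LEFT JOIN forum_user_group g ON g.user_id = u.user_id
    GROUP BY u.username
    """
).fetchone()

# Fetching the groups in a correlated subquery keeps one row per slot.
fixed = conn.execute(
    """
    SELECT u.username, SUM(s.seconds),
           (SELECT group_concat(DISTINCT g.group_id)
            FROM forum_user_group g
            WHERE g.user_id = u.user_id) AS user_groups
    FROM forum_users u
    LEFT JOIN schedule_slots s ON s.user_id = u.user_id
    GROUP BY u.user_id, u.username
    """
).fetchone()

print(inflated, fixed)
```

The inflated total is exactly three times the real one, matching the user's three groups, while the subquery version keeps the real total and still lists all groups.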