If statement in mysql query with inner join - mysql

I'm currently showing users that have unfinished jobs. Based on the results, I run a while loop and a switch/case statement to arrive at the final numbers. I'm wondering whether it is possible to move that logic into the MySQL query.
select
sum(cnt_jobs_unfinished = 0) cnt_users_no_unfinished_jobs,
sum(cnt_jobs_unfinished_30d > 0) cnt_users_unfinished_30d,
sum(cnt_jobs_unfinished_31_60d > 0) cnt_users_unfinished_31_60d,
sum(cnt_jobs_unfinished_61_90d > 0) cnt_users_unfinished_61_90d,
sum(cnt_jobs_unfinished_90d_more > 0) cnt_users_unfinished_90d_more
from (
select
u.user_id,
sum(l.job_id is null) cnt_jobs_unfinished,
sum(l.job_id is null and j.date >= curdate() - interval 30 day) cnt_jobs_unfinished_30d,
sum(
l.job_id is null
and j.date < curdate() - interval 30 day
and j.date >= curdate() - interval 60 day
) cnt_jobs_unfinished_31_60d,
sum(
l.job_id is null
and j.date < curdate() - interval 60 day
and j.date >= curdate() - interval 90 day
) cnt_jobs_unfinished_61_90d,
sum(
l.job_id is null
and j.date < curdate() - interval 90 day
) cnt_jobs_unfinished_90d_more
from users u
inner join scheduled_jobs j
on j.date <= curdate()
and j.user_id = u.user_id
left join last_update l
on l.job_id = j.job_id
group by u.user_id
) t
Here is the dbfiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=d2f217e074a391d8b5f769e08b1d2c87
As you can see because a user got both an unfinished job between 61-90 days and 90+ days the final table shows both results which is a mistake. The correct one would be 61-90 days: 0 users and 90+: 1 user.

This is what I came up with (fiddle):
SELECT
sum(IF(t.nb_days IS NULL OR t.nb_days <= 0, 1, 0)) cnt_users_no_unfinished_jobs,
sum(IF(t.nb_days > 0 AND t.nb_days <= 30, 1, 0)) cnt_users_unfinished_30d,
sum(IF(t.nb_days > 30 AND t.nb_days <= 60, 1, 0)) cnt_users_unfinished_31_60d,
sum(IF(t.nb_days > 60 AND t.nb_days <= 90, 1, 0)) cnt_users_unfinished_61_90d,
sum(IF(t.nb_days > 90, 1, 0)) cnt_users_unfinished_90d_more
FROM users u
LEFT JOIN (
SELECT j.user_id, DATEDIFF(curdate(), MIN(j.date)) AS nb_days
FROM scheduled_jobs j
LEFT JOIN last_update l
ON l.job_id = j.job_id
WHERE l.job_id IS NULL
GROUP BY j.user_id
) AS t
ON u.user_id = t.user_id
Instead of counting how many jobs are in each range for each user, I'm only looking at their oldest unfinished job and extracting the number of days since the deadline: DATEDIFF(curdate(), MIN(j.date)) AS nb_days.
Then this result is LEFT JOIN-ed to users. That way, a user who doesn't have any unfinished jobs still shows up in the cnt_users_no_unfinished_jobs column, because the query checks for NULL values.
Finally, it's just a matter of counting how many nb_days values fall in each range.
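To see the "oldest unfinished job decides the bucket" idea on a small data set, here is a minimal sketch. It is not the original fiddle: it assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, so julianday() replaces DATEDIFF(), CASE replaces IF(), and a fixed TODAY value replaces CURDATE(). The sample rows are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, so julianday() replaces DATEDIFF()
# and a fixed date replaces CURDATE(). All sample rows are invented.
TODAY = "2024-06-30"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE scheduled_jobs (job_id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT);
    CREATE TABLE last_update (job_id INTEGER);
    INSERT INTO users VALUES (1), (2), (3);
    -- user 1: two unfinished jobs, 70 and 100 days old
    -- user 2: one job, already finished; user 3: no jobs at all
    INSERT INTO scheduled_jobs VALUES
        (10, 1, '2024-04-21'), (11, 1, '2024-03-22'), (20, 2, '2024-06-01');
    INSERT INTO last_update VALUES (20);
""")

row = conn.execute("""
    SELECT
        SUM(CASE WHEN t.nb_days IS NULL OR t.nb_days <= 0 THEN 1 ELSE 0 END),
        SUM(CASE WHEN t.nb_days > 0  AND t.nb_days <= 30 THEN 1 ELSE 0 END),
        SUM(CASE WHEN t.nb_days > 30 AND t.nb_days <= 60 THEN 1 ELSE 0 END),
        SUM(CASE WHEN t.nb_days > 60 AND t.nb_days <= 90 THEN 1 ELSE 0 END),
        SUM(CASE WHEN t.nb_days > 90 THEN 1 ELSE 0 END)
    FROM users u
    LEFT JOIN (
        -- age in days of each user's oldest unfinished job
        SELECT j.user_id,
               CAST(julianday(?) - julianday(MIN(j.date)) AS INTEGER) AS nb_days
        FROM scheduled_jobs j
        LEFT JOIN last_update l ON l.job_id = j.job_id
        WHERE l.job_id IS NULL
        GROUP BY j.user_id
    ) AS t ON u.user_id = t.user_id
""", (TODAY,)).fetchone()

print(row)  # (2, 0, 0, 0, 1): user 1 lands only in the 90+ bucket
```

User 1 has unfinished jobs in both the 61-90 and the 90+ range, yet is counted once, in the 90+ column, which is the behaviour the question asked for.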

Related

How can I display a value of NULL where the join did not find any existing values?

SELECT u.display_name Associate
, ROUND(SUM(CASE WHEN w.startdate BETWEEN NOW() AND NOW() + INTERVAL 30 DAY THEN w.timeworked END/3600)) '30 Days'
, ROUND(SUM(CASE WHEN w.startdate BETWEEN NOW() AND NOW() + INTERVAL 60 DAY THEN w.timeworked END/3600)) '60 Days'
, ROUND(SUM(CASE WHEN w.startdate BETWEEN NOW() AND NOW() + INTERVAL 90 DAY THEN w.timeworked END/3600)) '90 Days'
FROM worklog w
JOIN cwd_user u
ON u.user_name = w.author
JOIN cwd_membership m
ON m.directory_id = u.directory_id
AND m.lower_child_name = u.lower_user_name
WHERE m.membership_type = 'GROUP_USER'
AND m.lower_parent_name = 'atl_servicedesk_it_agents'
AND w.startdate BETWEEN NOW() AND DATE_ADD(NOW(), INTERVAL 90 DAY)
GROUP BY u.display_name
ORDER BY u.last_name;
So on my join u.user_name = w.author, I want to show every row where there is a u.user_name, even if there is no matching w.author. The display_name should still show up, but the values for 30, 60, and 90 days would be NULL. Ideally I want to change the NULL to 0 instead. Users don't appear in the worklog table unless they have logged work, so right now the query only returns two rows, for the two people who have. I still want to show everyone in m.lower_parent_name = 'atl_servicedesk_it_agents', so I can see who has not logged anything.
Does anyone have any ideas?
You can use LEFT JOIN, but you need to be careful about all the joins and the WHERE conditions:
SELECT u.display_name as Associate,
ROUND(SUM(CASE WHEN w.startdate BETWEEN NOW() AND NOW() + INTERVAL 30 DAY THEN w.timeworked END/3600)) as `30 Days`,
ROUND(SUM(CASE WHEN w.startdate BETWEEN NOW() AND NOW() + INTERVAL 60 DAY THEN w.timeworked END/3600)) as `60 Days`,
ROUND(SUM(CASE WHEN w.startdate BETWEEN NOW() AND NOW() + INTERVAL 90 DAY THEN w.timeworked END/3600)) as `90 Days`
FROM cwd_user u JOIN -- Not sure if this should be LEFT JOIN or not
cwd_membership m
ON m.directory_id = u.directory_id AND
m.lower_child_name = u.lower_user_name AND
m.membership_type = 'GROUP_USER' AND
m.lower_parent_name = 'atl_servicedesk_it_agents' LEFT JOIN
worklog w
ON u.user_name = w.author AND
w.startdate BETWEEN NOW() AND DATE_ADD(NOW(), INTERVAL 90 DAY)
GROUP BY u.display_name
ORDER BY u.last_name;
I'm not sure if the join to m should be an inner join or left join. It depends on whether you want filtering based on that table as well.
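The question also asked for 0 instead of NULL for users without any logged work. A minimal sketch of that part, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL and a cut-down, invented version of the cwd_user and worklog tables: wrapping the aggregate in COALESCE turns the NULL produced by the unmatched LEFT JOIN row into 0.

```python
import sqlite3

# Assumption: simplified, invented tables; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cwd_user (user_name TEXT, display_name TEXT);
    CREATE TABLE worklog (author TEXT, timeworked INTEGER);
    INSERT INTO cwd_user VALUES ('alice', 'Alice'), ('bob', 'Bob');
    -- only alice has logged work (values in seconds)
    INSERT INTO worklog VALUES ('alice', 3600), ('alice', 7200);
""")

rows = conn.execute("""
    SELECT u.display_name,
           SUM(w.timeworked)              AS raw_seconds,  -- NULL when nothing matched
           COALESCE(SUM(w.timeworked), 0) AS seconds       -- 0 when nothing matched
    FROM cwd_user u
    LEFT JOIN worklog w ON w.author = u.user_name
    GROUP BY u.display_name
    ORDER BY u.display_name
""").fetchall()

print(rows)  # [('Alice', 10800, 10800), ('Bob', None, 0)]
```

In the MySQL query the same wrapper would go around each ROUND(SUM(...)) expression.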

MySql graph query multiple series aligned to same time x-axis

I have queries that I'm using to make a graph of earnings. But now people are able to earn from two different sources, so I want to separate this out into two lines on the same chart
This one for standard earnings:
SELECT DATE_FORMAT(earning_created, '%c/%e/%Y') AS day, SUM(earning_amount) AS earning_standard
FROM earnings
WHERE earning_account_id = ? AND earning_referral_id = 0 AND (earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
GROUP BY DATE(earning_created)
ORDER BY earning_created
And this one for referral earnings:
SELECT DATE_FORMAT(e.earning_created, '%c/%e/%Y') AS day, SUM(e.earning_amount) AS earning_referral
FROM earnings AS e
INNER JOIN referrals AS r
ON r.referral_id = e.earning_referral_id
WHERE e.earning_account_id = ? AND e.earning_referral_id > 0 AND (e.earning_created > DATE_SUB(now(), INTERVAL 90 DAY)) AND r.referral_type = 0
GROUP BY DATE(e.earning_created)
ORDER BY e.earning_created
How do I get it to run the queries together, so that it outputs two columns/series for the y-axis: earning_standard and earning_referral.
But with them both aligned to the same day column/scale for the x-axis - substituting zero when there are no earnings for a specific series.
You'll need to use both of those queries as subqueries and join them on the day:
-- DISTINCT collapses the one-row-per-earning result to one row per day
SELECT DISTINCT DATE_FORMAT(earnings.earning_created, '%c/%e/%Y') AS day,
COALESCE(es.earning_standard, 0) AS earning_standard,
COALESCE(er.earning_referral, 0) AS earning_referral
FROM earnings
LEFT JOIN (SELECT DATE_FORMAT(earning_created, '%c/%e/%Y') AS day,
SUM(earning_amount) AS earning_standard
FROM earnings
WHERE earning_account_id = ?
AND earning_referral_id = 0
AND (earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
GROUP BY DATE(earning_created)) AS es
-- the SELECT alias `day` is not visible in ON, so repeat the expression
ON (DATE_FORMAT(earnings.earning_created, '%c/%e/%Y') = es.day)
LEFT JOIN (SELECT DATE_FORMAT(e.earning_created, '%c/%e/%Y') AS day,
SUM(e.earning_amount) AS earning_referral
FROM earnings AS e
INNER JOIN referrals AS r
ON r.referral_id = e.earning_referral_id
WHERE e.earning_account_id = ?
AND e.earning_referral_id > 0
AND (e.earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND r.referral_type = 0
GROUP BY DATE(e.earning_created)) AS er
ON (DATE_FORMAT(earnings.earning_created, '%c/%e/%Y') = er.day)
WHERE earnings.earning_account_id = ?
ORDER BY day
Here I'm assuming that the question mark in earning_account_id = ? is a placeholder that the language you're using replaces with the actual id before running the query.
SELECT
COALESCE(t1.amount,0) AS link_earnings,
COALESCE(t2.amount,0) AS publisher_referral_earnings,
COALESCE(t3.amount,0) AS advertiser_referral_earnings,
t1.day AS day
FROM
(
SELECT DATE_FORMAT(earning_created, '%c/%e/%Y') AS day, SUM(earning_amount) AS amount
FROM earnings
WHERE earning_referral_id = 0
AND (earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND earning_account_id = ?
GROUP BY DATE(earning_created)
) t1
LEFT JOIN
(
SELECT DATE_FORMAT(ep.earning_created, '%c/%e/%Y') AS day, (SUM(ep.earning_amount) * rp.referral_share) AS amount
FROM earnings AS ep
INNER JOIN referrals AS rp
ON ep.earning_referral_id = rp.referral_id
WHERE ep.earning_referral_id > 0
AND (ep.earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND ep.earning_account_id = ?
AND rp.referral_type = 0
GROUP BY DATE(ep.earning_created)
) t2
ON t1.day = t2.day
LEFT JOIN
(
SELECT DATE_FORMAT(ea.earning_created, '%c/%e/%Y') AS day, (SUM(ea.earning_amount) * ra.referral_share) AS amount
FROM earnings AS ea
INNER JOIN referrals AS ra
ON ea.earning_referral_id = ra.referral_id
WHERE ea.earning_referral_id > 0
AND (ea.earning_created > DATE_SUB(now(), INTERVAL 90 DAY))
AND ea.earning_account_id = ?
AND ra.referral_type = 1
GROUP BY DATE(ea.earning_created)
) t3
ON t1.day = t3.day
ORDER BY day
Seems to run ok....
You can simply use an outer join to retain earnings even when there is no matching referral, and then conditionally sum depending on whether a referral exists or not:
SELECT DATE_FORMAT(e.earning_created, '%c/%e/%Y') AS day,
SUM(IF(r.referral_id IS NULL, e.earning_amount, 0)) earning_standard,
SUM(IF(r.referral_id IS NULL, 0, e.earning_amount)) earning_referral
FROM earnings e LEFT JOIN referrals r ON r.referral_id = e.earning_referral_id
WHERE e.earning_account_id = ?
AND e.earning_created > CURRENT_DATE - INTERVAL 90 DAY
AND (r.referral_id IS NULL OR r.referral_type = 0)
GROUP BY 1
ORDER BY 1
I've assumed here that earnings.earning_referral_id is never negative, though you can add an explicit test to filter such records if so desired.
I've also changed the filter on earnings.earning_created to start from CURRENT_DATE rather than NOW(), so that any earnings created earlier than the current time of day on the first day of the series are still included. This is typically what one actually wants, but feel free to change it back if not.
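The outer-join approach can be tried on a few rows. This is a minimal sketch under stated assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, CASE replaces IF(), DATE() replaces DATE_FORMAT(), the account and 90-day filters are left out for brevity, and the sample rows are invented.

```python
import sqlite3

# Assumption: invented sample data; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE referrals (referral_id INTEGER, referral_type INTEGER);
    CREATE TABLE earnings (earning_created TEXT, earning_amount INTEGER,
                           earning_referral_id INTEGER);
    INSERT INTO referrals VALUES (7, 0), (8, 1);
    INSERT INTO earnings VALUES
        ('2024-01-01 09:00:00', 5, 0),   -- standard earning
        ('2024-01-01 15:00:00', 3, 7),   -- referral of type 0
        ('2024-01-02 10:00:00', 2, 0),   -- standard earning
        ('2024-01-02 11:00:00', 4, 8);   -- referral of type 1, filtered out
""")

rows = conn.execute("""
    SELECT DATE(e.earning_created) AS day,
           SUM(CASE WHEN r.referral_id IS NULL THEN e.earning_amount ELSE 0 END),
           SUM(CASE WHEN r.referral_id IS NULL THEN 0 ELSE e.earning_amount END)
    FROM earnings e
    LEFT JOIN referrals r ON r.referral_id = e.earning_referral_id
    WHERE r.referral_id IS NULL OR r.referral_type = 0
    GROUP BY 1
    ORDER BY 1
""").fetchall()

print(rows)  # [('2024-01-01', 5, 3), ('2024-01-02', 2, 0)]
```

Both series come out of a single pass over earnings, aligned on the same day column, and a day with no referral earnings gets 0 rather than a missing row.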

Convert NOT IN query to better performance

I'm using MySQL 5.0, and I need to fine-tune this query. Can anyone suggest what I could change to make it faster?
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
The first thing that I'd do is rewrite the subqueries as joins:
SELECT h.alert_master_id
FROM alert_appln_header h
JOIN schedule_config c
ON c.schedule_name = 'Purging_Config'
LEFT JOIN alert_details d
ON d.alert_master_id = h.alert_master_id
AND d.end_date IS NULL
AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
LEFT JOIN (
alert_sara_header s
JOIN alert_sara_lines l
ON l.alert_sara_master_id = s.sara_master_id
)
ON s.alert_master_id = h.alert_master_id
AND l.end_date IS NULL -- end_date is filtered on alert_sara_lines in the original query
AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
WHERE h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
AND d.alert_master_id IS NULL
AND s.alert_master_id IS NULL
GROUP BY h.alert_master_id
LIMIT 5000
If it's still slow after that, re-examine your indexing strategy. I'd suggest indexes over:
alert_appln_header(alert_master_id,created_date)
schedule_config(schedule_name)
alert_details(alert_master_id,end_date,created_date)
alert_sara_header(sara_master_id,alert_master_id,created_date)
alert_sara_lines(alert_sara_master_id,end_date)
OK, this may be just a shot in the dark, but I don't think you need that many DISTINCTs here.
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
-- removed distinct here --
SELECT alert_master_id FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
-- removed distinct here --
SELECT alert_master_id FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL)
AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
DISTINCT can be costly, so avoid it where it changes nothing. The outer WHERE clause only checks for ids that are NOT within some result, so it doesn't matter if some ids appear in that result more than once. (UNION also removes duplicates from the combined result on its own.)
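The claim that duplicates inside a NOT IN list do not change the outcome is easy to check. A minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL and stripped-down, invented versions of the three tables with only the id column:

```python
import sqlite3

# Assumption: invented one-column tables; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE alert_appln_header (alert_master_id INTEGER);
    CREATE TABLE alert_details (alert_master_id INTEGER);
    CREATE TABLE alert_sara_header (alert_master_id INTEGER);
    INSERT INTO alert_appln_header VALUES (1), (2), (3);
    INSERT INTO alert_details VALUES (2), (2);   -- id 2 appears twice
    INSERT INTO alert_sara_header VALUES (3);
""")

QUERY = """
    SELECT DISTINCT alert_master_id FROM alert_appln_header
    WHERE alert_master_id NOT IN (
        SELECT {d} alert_master_id FROM alert_details
        UNION
        SELECT {d} alert_master_id FROM alert_sara_header
    )
    ORDER BY alert_master_id
"""

with_distinct = conn.execute(QUERY.format(d="DISTINCT")).fetchall()
without_distinct = conn.execute(QUERY.format(d="")).fetchall()

print(with_distinct, without_distinct)  # [(1,)] [(1,)]
```

Whether dropping the inner DISTINCT actually speeds up the MySQL 5.0 query is a separate question that only EXPLAIN or a timing run on the real data can answer.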

MySQL : Different behaviors depending on WHERE result

I'm modifying an existing project and want to merge three MySQL queries into one.
The three queries select the same data; only the WHERE clause changes.
Here is one of them, for example:
SELECT COUNT(seg.my_seg1) FROM (
SELECT COUNT(DISTINCT cp.conference_id) as my_seg1 FROM A.Account a
INNER JOIN A.ConferenceParticipant cp ON a.account_id = cp.user_id
INNER JOIN A.Conference cf ON cf.id = cp.conference_id
WHERE cf.`status` = 0
AND DATE_SUB(CURDATE(), INTERVAL 30 DAY) <= cf.creation_timestamp
GROUP BY a.account_id) as seg
WHERE seg.my_seg1 >= 30
The two other queries are exactly the same, except for the final WHERE:
WHERE seg.my_seg1 >= 11 AND seg.my_seg1 <= 30;
and:
WHERE seg.my_seg1 >= 30;
So my question is: how can I get three different values, one per WHERE condition, from a single query?
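This gives you one column per range. Note the use of SUM rather than COUNT: COUNT counts every non-NULL value, including the 0 from the ELSE branch.
SELECT
SUM(IF(seg.my_seg1 >= 30, 1, 0)) AS res1,
SUM(IF(seg.my_seg1 >= 11 AND seg.my_seg1 < 30, 1, 0)) AS res2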
FROM (
SELECT COUNT(DISTINCT cp.conference_id) as my_seg1
FROM A.Account a
JOIN A.ConferenceParticipant cp ON a.account_id = cp.user_id
JOIN A.Conference cf ON cf.id = cp.conference_id
WHERE
cf.`status` = 0
AND DATE_SUB(CURDATE(), INTERVAL 30 DAY) <= cf.creation_timestamp
GROUP BY a.account_id
) AS seg
But you will have to revise your filters: you mention three queries, but I only see two distinct conditions.
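When folding several WHERE filters into conditional aggregates, the choice between COUNT and SUM matters, because COUNT counts every non-NULL value. A minimal sketch of the difference, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL, CASE in place of IF(), and an invented seg table holding three per-account counts:

```python
import sqlite3

# Assumption: invented per-account counts; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE seg (my_seg1 INTEGER);
    INSERT INTO seg VALUES (5), (15), (40);
""")

row = conn.execute("""
    SELECT
        COUNT(CASE WHEN my_seg1 >= 30 THEN 1 ELSE 0 END),  -- counts the 0s too
        SUM(CASE WHEN my_seg1 >= 30 THEN 1 ELSE 0 END),    -- adds up the 1s
        COUNT(CASE WHEN my_seg1 >= 30 THEN 1 END)          -- NULLs are skipped
    FROM seg
""").fetchone()

print(row)  # (3, 1, 1)
```

Only one account has 30 or more conferences, so the second and third forms give the intended answer; the first simply returns the number of rows.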

sql gives no results because empty left join

So I have this query:
SELECT
AED.aId, CJ.*
FROM
AED
LEFT JOIN
Cronjob as CJ
ON CJ.aID = AED.aId
WHERE
AED.aStatus = '1'
AND
(
CJ.cjDatum < CURRENT_DATE - INTERVAL 14 DAY
AND
AED.aRegistratie > CURRENT_DATE - INTERVAL 10 YEAR
)
OR
(
CJ.cjStatus = '9'
OR
CJ.cjStatus = '2'
)
The problem is that the Cronjob table is empty, and when it is empty I still want to get all the ids from AED with status 1.
I couldn't find anything useful, so I hope you can help!
You should move all the CJ criteria into the join's ON clause.
ON (
CJ.aID = AED.aId
AND (cjStatus in ('2','9') OR cjDatum < CURRENT_DATE - INTERVAL 14 DAY ))
Option two would be to leave them in WHERE, but make provisions for the case that cjStatus and friends are NULL (which they will be if no match is found).
OR cjStatus IS NULL
When there is no Cronjob row associated, a filter such as CJ.cjStatus = '9' does not evaluate to true, because CJ.cjStatus is NULL. That is how a LEFT JOIN works: it returns NULL for the right-hand columns when there is no matching row.
To filter on the table you are LEFT JOIN-ing to, put the filter in the join's ON clause, like this (the extra parentheses keep the OR from bypassing the aID match):
SELECT AED.aId
, CJ.*
FROM AED
LEFT JOIN Cronjob as CJ
ON CJ.aID = AED.aId
AND (
(CJ.cjDatum < CURRENT_DATE - INTERVAL 14 DAY
AND AED.aRegistratie > CURRENT_DATE - INTERVAL 10 YEAR
)
OR CJ.cjStatus IN ('9', '2')
)
WHERE AED.aStatus = '1'
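The effect of where the filter sits can be reproduced with an empty right-hand table. A minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL, a reduced set of columns, and invented AED rows:

```python
import sqlite3

# Assumption: reduced, invented schema; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AED (aId INTEGER, aStatus TEXT);
    CREATE TABLE Cronjob (aID INTEGER, cjStatus TEXT);  -- deliberately left empty
    INSERT INTO AED VALUES (1, '1'), (2, '1'), (3, '0');
""")

# Filter on the right-hand table in WHERE: unmatched rows have NULL there
# and are discarded, so the LEFT JOIN behaves like an INNER JOIN.
in_where = conn.execute("""
    SELECT AED.aId
    FROM AED
    LEFT JOIN Cronjob AS CJ ON CJ.aID = AED.aId
    WHERE AED.aStatus = '1' AND CJ.cjStatus IN ('2', '9')
    ORDER BY AED.aId
""").fetchall()

# Same filter moved into ON: it only limits which Cronjob rows are attached.
in_on = conn.execute("""
    SELECT AED.aId
    FROM AED
    LEFT JOIN Cronjob AS CJ
           ON CJ.aID = AED.aId AND CJ.cjStatus IN ('2', '9')
    WHERE AED.aStatus = '1'
    ORDER BY AED.aId
""").fetchall()

print(in_where, in_on)  # [] [(1,), (2,)]
```

With the filter in ON, every AED row with status 1 is returned even though Cronjob has no rows at all.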
Add OR (CJ.aid is null) to the AND part of your condition:
SELECT
AED.aId, CJ.*
FROM
AED
LEFT JOIN
Cronjob as CJ
ON CJ.aID = AED.aId
WHERE
AED.aStatus = '1'
AND
(
(
CJ.cjDatum < CURRENT_DATE - INTERVAL 14 DAY
AND
AED.aRegistratie > CURRENT_DATE - INTERVAL 10 YEAR
)
OR (CJ.aid is null)
)
OR
(
CJ.cjStatus = '9'
OR
CJ.cjStatus = '2'
)