MySQL shows records twice because of inner joining

In the query below, there are 13 Mentors but the result shows 26, and there are 5 School Supervisors but the result shows 10, which is wrong. This happens because the training has 2 evidence files: because of the 2 evidence rows, the Mentor and SchoolSupervisor values are doubled.
Please help me out.
Query:
select t.c_id,t.province,t.district,t.cohort,t.duration,t.venue,t.v_date,t.review_level, t.activity,
SUM(CASE WHEN pr.p_association = "Mentor" THEN 1 ELSE 0 END) as Mentor,
SUM(CASE WHEN pr.p_association = "School Supervisor" THEN 1 ELSE 0 END) as SchoolSupervisor,
(CASE WHEN count(file_id) > 0 THEN "Yes" ELSE "No" END) as evidence
FROM review_m t , review_attndnce ra
LEFT JOIN participant_registration AS pr ON pr.p_id = ra.p_id
LEFT JOIN review_files AS rf ON rf.training_id = ra.c_id
WHERE 1=1 AND t.c_id = ra.c_id
group by t.c_id, ra.c_id order by t.c_id desc

You may perform the aggregations in a separate subquery, and then join to it:
SELECT
t.c_id,
t.province,
t.district,
t.cohort,
t.duration,
t.venue,
t.v_date,
t.review_level,
t.activity,
pr.Mentor,
pr.SchoolSupervisor,
rf.evidence
FROM review_m t
INNER JOIN review_attndnce ra
ON t.c_id = ra.c_id
LEFT JOIN
(
SELECT
p_id,
COUNT(CASE WHEN p_association = 'Mentor' THEN 1 END) AS Mentor,
COUNT(CASE WHEN p_association = 'School Supervisor' THEN 1 END) AS SchoolSupervisor
FROM participant_registration
GROUP BY p_id
) pr
ON pr.p_id = ra.p_id
LEFT JOIN
(
SELECT
training_id,
CASE WHEN COUNT(file_id) > 0 THEN 'Yes' ELSE 'No' END AS evidence
FROM review_files
GROUP BY training_id
) rf
ON rf.training_id = ra.c_id
ORDER BY
t.c_id DESC;
Note that this also fixes another problem your query had, which was that you were selecting many columns which did not appear in the GROUP BY clause. Under this refactor, there is nothing wrong with your current select, because the aggregations take place in separate subqueries.
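To see why the counts double, here is a minimal sketch using Python's built-in sqlite3 module (SQLite rather than MySQL, and with made-up sample rows; both are assumptions for illustration). It aggregates per training (c_id) inside each derived table, which is a variation on the query above rather than a copy of it. The naive query joins each attendance row to both evidence files before counting, while the second query counts attendance and files separately and only then joins the two one-row-per-training results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE review_attndnce (c_id INTEGER, p_id INTEGER);
CREATE TABLE participant_registration (p_id INTEGER PRIMARY KEY, p_association TEXT);
CREATE TABLE review_files (file_id INTEGER PRIMARY KEY, training_id INTEGER);

-- Hypothetical data: one training (c_id = 1) with 3 mentors and 1 school supervisor.
INSERT INTO participant_registration VALUES
    (1, 'Mentor'), (2, 'Mentor'), (3, 'Mentor'), (4, 'School Supervisor');
INSERT INTO review_attndnce VALUES (1, 1), (1, 2), (1, 3), (1, 4);
-- Two evidence files for the same training.
INSERT INTO review_files VALUES (10, 1), (11, 1);
""")

# Joining the files before aggregating repeats every attendance row once per file.
naive = conn.execute("""
    SELECT ra.c_id,
           SUM(CASE WHEN pr.p_association = 'Mentor' THEN 1 ELSE 0 END) AS Mentor
    FROM review_attndnce ra
    LEFT JOIN participant_registration pr ON pr.p_id = ra.p_id
    LEFT JOIN review_files rf ON rf.training_id = ra.c_id
    GROUP BY ra.c_id
""").fetchall()

# Aggregating each side in its own derived table keeps one row per training.
fixed = conn.execute("""
    SELECT a.c_id, a.Mentor, a.SchoolSupervisor,
           CASE WHEN COALESCE(f.files, 0) > 0 THEN 'Yes' ELSE 'No' END AS evidence
    FROM (
        SELECT ra.c_id,
               SUM(CASE WHEN pr.p_association = 'Mentor' THEN 1 ELSE 0 END) AS Mentor,
               SUM(CASE WHEN pr.p_association = 'School Supervisor' THEN 1 ELSE 0 END) AS SchoolSupervisor
        FROM review_attndnce ra
        LEFT JOIN participant_registration pr ON pr.p_id = ra.p_id
        GROUP BY ra.c_id
    ) a
    LEFT JOIN (
        SELECT training_id, COUNT(file_id) AS files
        FROM review_files
        GROUP BY training_id
    ) f ON f.training_id = a.c_id
""").fetchall()

print(naive)  # [(1, 6)]: 3 mentors counted once per evidence file
print(fixed)  # [(1, 3, 1, 'Yes')]
```

The COALESCE is there because a training without any files has no row in the files subquery, so the LEFT JOIN yields NULL rather than 0.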

Try adding this to the WHERE part of your query:
AND pr.p_id IS NOT NULL AND rf.training_id IS NOT NULL

You can add pr.p_id to the GROUP BY to remove the duplicate records. Since there is currently no grouping on pr, there might be multiple records with the same p_id for the same ra row:
group by t.c_id, ra.c_id, pr.p_id order by t.c_id desc

Related

MySQL upgrade from left join to something similar as FULL join

Sorry for asking this here, but I need help and Google is not being nice.
I have the following table Products
SELECT
COUNT(CASE when core.kits.Location = core.suppliers.id THEN 1 END) as total,
COUNT(CASE when core.kits.cp = 1 THEN 1 END) as used,
core.suppliers.id, core.suppliers.name, core.suppliers.email,
core.suppliers.cperson, core.suppliers.adress, core.suppliers.phone
FROM core.kits
LEFT join core.suppliers on core.kits.Location = core.suppliers.id
WHERE core.suppliers.id is not null
AND banned=0
GROUP BY core.suppliers.id
ORDER BY name ASC
LIMIT 1000 OFFSET 0
but it does not give me all the suppliers, with zeros for the ones that do not appear in kits.
Then if I do
SELECT
COUNT(CASE when core.kits.Location = core.suppliers.id THEN 1 END) as total,
COUNT(CASE when core.kits.cp = 1 THEN 1 END) as used,
core.suppliers.id, core.suppliers.name, core.suppliers.email,
core.suppliers.cperson, core.suppliers.adress, core.suppliers.phone
FROM core.suppliers
LEFT join core.kits on core.suppliers.id = core.kits.Location
WHERE core.suppliers.id is not null
AND banned=0
GROUP BY core.suppliers.id
ORDER BY name ASC
LIMIT 1000 OFFSET 0
I get all suppliers and the correct numbers, but the query takes 8 seconds instead of 1s. Any ideas how I can get all the suppliers with the count of stocks in 1s?
cheers.
If you want all the suppliers, even those that do not appear in kits you should do a LEFT join of suppliers to kits:
SELECT COUNT(k.Location) AS total,
COUNT(CASE WHEN k.cp = 1 THEN 1 END) AS used,
s.id, s.name, s.email, s.cperson, s.adress, s.phone
FROM core.suppliers s LEFT JOIN core.kits k
ON k.Location = s.id
WHERE banned=0
GROUP BY s.id
ORDER BY s.name ASC
LIMIT 1000 OFFSET 0;
I assume that core.suppliers.id is the primary key of suppliers, so that the condition:
core.suppliers.id is not null
is not needed.
Also, if the column banned belongs to the table kits, then the condition should be moved to the ON clause:
ON k.Location = s.id AND k.banned=0
and the WHERE clause should be removed.
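The difference between filtering in WHERE and filtering in ON can be checked with a small sketch. It uses Python's built-in sqlite3 module instead of MySQL, drops the core. schema prefix, and assumes that banned is a column of kits; the sample rows are invented. With the filter in WHERE, the NULL-extended rows produced by the LEFT JOIN are discarded and suppliers without usable kits disappear; with the filter in ON, they stay and are counted as zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE kits (id INTEGER PRIMARY KEY, Location INTEGER, cp INTEGER, banned INTEGER);
INSERT INTO suppliers VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Crate');
-- Acme has two usable kits, Bolt has only a banned kit, Crate has none.
INSERT INTO kits VALUES (1, 1, 1, 0), (2, 1, 0, 0), (3, 2, 1, 1);
""")

BASE_QUERY = """
    SELECT s.name,
           COUNT(k.Location) AS total,
           COUNT(CASE WHEN k.cp = 1 THEN 1 END) AS used
    FROM suppliers s
    LEFT JOIN kits k ON k.Location = s.id
"""

# Filter on the kits column in WHERE: rows where k.* is NULL fail the test.
filter_in_where = conn.execute(
    BASE_QUERY + " WHERE k.banned = 0 GROUP BY s.id ORDER BY s.name"
).fetchall()

# Same filter appended to the ON clause: unmatched suppliers survive with NULLs.
filter_in_on = conn.execute(
    BASE_QUERY + " AND k.banned = 0 GROUP BY s.id ORDER BY s.name"
).fetchall()

print(filter_in_where)  # [('Acme', 2, 1)]
print(filter_in_on)     # [('Acme', 2, 1), ('Bolt', 0, 0), ('Crate', 0, 0)]
```

COUNT(k.Location) returns 0 for the unmatched suppliers because COUNT ignores NULL values.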

Improving the performance of sql joined count query

In my application the users can create campaigns for sending messages. When the campaign tries to send a message, one of the three things can happen:
The message is suppressed and not let through
The message can't reach the recipient and is considered failed
The message is successfully delivered
To keep track of this, I have the following table:
My problem is that when the application has processed a lot of messages (more than 10 million), the query I use for showing campaign statistics for the user slows down by a considerable margin (~ 15 seconds), even when there are only a few (~ 10) campaigns being displayed for the user.
Here is the query I'm using:
select `campaigns`.*, (select count(*) from `processed_messages`
where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'sent') as `messages_sent`,
(select count(*) from `processed_messages` where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'failed') as `messages_failed`,
(select count(*) from `processed_messages` where `campaigns`.`id` = `processed_messages`.`campaign_id` and `status` = 'supressed') as `messages_supressed`
from `campaigns` where `user_id` = 1 and `campaigns`.`deleted_at` is null order by `updated_at` desc;
So my question is: how can I make this query run faster? I believe there should be some way of not having to use sub-queries multiple times but I am not very experienced with MySQL syntax yet.
You should write this as a single join, using conditional aggregation:
SELECT
c.*,
COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm
ON c.id = pm.campaign_id
WHERE
c.user_id = 1 AND
c.deleted_at IS NULL
GROUP BY
c.id
ORDER BY
c.updated_at DESC;
It should be noted that at first glance, doing SELECT c.* appears to be a violation of the GROUP BY rules which say that only columns which appear in the GROUP BY clause can be selected. However, assuming that campaigns.id is the primary key column, then there is nothing wrong with selecting all columns from this table, provided that we aggregate by the primary key.
Edit:
If the above answer does not run on your MySQL server version, with an error message complaining about only full group by, then use this version:
SELECT c1.*, c2.messages_sent, c2.messages_failed, c2.messages_suppressed
FROM campaigns c1
INNER JOIN
(
SELECT
c.id,
COUNT(CASE WHEN pm.status = 'sent' THEN 1 END) AS messages_sent,
COUNT(CASE WHEN pm.status = 'failed' THEN 1 END) AS messages_failed,
COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END) AS messages_suppressed
FROM campaigns c
LEFT JOIN processed_messages pm
ON c.id = pm.campaign_id
WHERE
c.user_id = 1 AND
c.deleted_at IS NULL
GROUP BY
c.id
) c2
ON c1.id = c2.id
ORDER BY
c1.updated_at DESC;
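As a sanity check that conditional aggregation returns the same numbers as the three correlated subqueries, here is a small sketch using Python's built-in sqlite3 module (SQLite instead of MySQL, with a cut-down campaigns table and invented rows; all of that is assumed for illustration). It says nothing about speed on 10 million rows, it only shows that the two forms agree, including for a campaign with no messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaigns (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT);
CREATE TABLE processed_messages (
    id INTEGER PRIMARY KEY, campaign_id INTEGER, status TEXT
);
-- Campaign 1 has four messages, campaign 2 has none.
INSERT INTO campaigns VALUES (1, 1, 'spring'), (2, 1, 'summer');
INSERT INTO processed_messages (campaign_id, status) VALUES
    (1, 'sent'), (1, 'sent'), (1, 'failed'), (1, 'suppressed');
""")

# Original shape: one correlated subquery per status.
subqueries = conn.execute("""
    SELECT c.id,
           (SELECT COUNT(*) FROM processed_messages pm
             WHERE pm.campaign_id = c.id AND pm.status = 'sent'),
           (SELECT COUNT(*) FROM processed_messages pm
             WHERE pm.campaign_id = c.id AND pm.status = 'failed'),
           (SELECT COUNT(*) FROM processed_messages pm
             WHERE pm.campaign_id = c.id AND pm.status = 'suppressed')
    FROM campaigns c
    WHERE c.user_id = 1
    ORDER BY c.id
""").fetchall()

# Rewritten shape: a single join, with one conditional COUNT per status.
conditional = conn.execute("""
    SELECT c.id,
           COUNT(CASE WHEN pm.status = 'sent' THEN 1 END),
           COUNT(CASE WHEN pm.status = 'failed' THEN 1 END),
           COUNT(CASE WHEN pm.status = 'suppressed' THEN 1 END)
    FROM campaigns c
    LEFT JOIN processed_messages pm ON pm.campaign_id = c.id
    WHERE c.user_id = 1
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(subqueries)   # [(1, 2, 1, 1), (2, 0, 0, 0)]
print(conditional)  # same rows
```

The CASE expressions have no ELSE branch on purpose: a non-matching row yields NULL, and COUNT skips NULLs.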

MySQL query taking too much time

This query takes 1 minute to fetch results:
SELECT
`jp`.`id`,
`jp`.`title` AS game_title,
`jp`.`game_type`,
`jp`.`state_abb` AS game_state,
`jp`.`location` AS game_city,
`jp`.`zipcode` AS game_zipcode,
`jp`.`modified_on`,
`jp`.`posted_on`,
`jp`.`game_referal_amount`,
`jp`.`games_referal_amount_type`,
`jp`.`status`,
`jp`.`is_flaged`,
`u`.`id` AS employer_id,
`u`.`email` AS employer_email,
`u`.`name` AS employer_name,
`jf`.`name` AS game_function,
`jp`.`game_freeze_status`,
`jp`.`game_statistics`,
`jp`.`ats_value`,
`jp`.`integration_id`,
`u`.`account_manager_id`,
`jp`.`model_game`,
`jp`.`group_id`,
(CASE
WHEN jp.group_id != '0' THEN gm.group_name
ELSE 'NA'
END) AS group_name,
`jp`.`priority_game`,
(CASE
WHEN jp.country != 'US' THEN jp.country_name
ELSE ''
END) AS game_country,
IFNULL((CASE
WHEN
`jp`.`account_manager_id` IS NULL
OR `jp`.`account_manager_id` = 0
THEN
(SELECT
(CASE
WHEN
account_manager_id IS NULL
OR account_manager_id = 0
THEN
`u`.`account_manager_id`
ELSE account_manager_id
END) AS account_manager_id
FROM
user_user
WHERE
id = (SELECT
user_id
FROM
game_user_assigned
WHERE
game_id = `jp`.`id`
LIMIT 1))
ELSE `jp`.`account_manager_id`
END),
`u`.`account_manager_id`) AS acc,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '1'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS somewhatgame,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '2'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS verygamecommitted,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '3'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS notgame,
(SELECT
COUNT(joa.id) AS applicationcount
FROM
game_refer_to_member jrmm
INNER JOIN
game_refer jrr ON jrr.id = jrmm.rid
INNER JOIN
game_applied joa ON jrmm.id = joa.referred_by
WHERE
jrmm.STATUS = '1'
AND jrr.referby_user_id IN (SELECT
ab_testing_user_id
FROM
ab_testing)
AND joa.game_post_id = rpr.recach_limit_game_id
AND (rpr.recach_limit = 1
OR rpr.recach_limit = 2)) AS gamecount
FROM
(`game_post` AS jp)
JOIN
`user_info` AS u ON `jp`.`user_user_id` = `u`.`id`
JOIN
`game_functional` jf ON `jp`.`game_functional_id` = `jf`.`id`
LEFT JOIN
`group_musesm` gm ON `gm`.`group_id` = `jp`.`group_id`
LEFT JOIN
`recach_limit` rpr ON `jp`.`id` = `rpr`.`recach_limit_game_id`
WHERE
`jp`.`status` != '3'
GROUP BY `jp`.`id`
ORDER BY `posted_on` DESC
LIMIT 10
I would first suggest not nesting SELECT statements. A correlated subquery may be re-executed for every row of the query that contains it, so the cost multiplies with each level of nesting, and I see at least 3 levels of selects inside this query.
Add index
INDEX(status, posted_on)
Move LIMIT inside
Then, instead of saying
FROM (`game_post` AS jp)
say
FROM ( SELECT id FROM game_post
WHERE status != 3
ORDER BY posted_on DESC
LIMIT 10 ) AS ids
JOIN game_post AS jp USING(id)
(I am assuming that the PK of jp is (id)?)
That should efficiently use the new index to get the 10 ids needed. Then it will reach back into game_post to get the other columns.
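A rough sketch of the "LIMIT inside" pattern, run against SQLite through Python's built-in sqlite3 module rather than MySQL. The game_post table is reduced to four columns and filled with invented rows, and the index name is hypothetical. The derived table picks the 10 newest ids whose status is not 3, and the outer query then joins back to game_post for the remaining columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE game_post (
    id INTEGER PRIMARY KEY, title TEXT, status INTEGER, posted_on TEXT
);
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX idx_status_posted ON game_post (status, posted_on);
""")
# 30 sample rows; every fifth row has status 3 and must be skipped.
conn.executemany(
    "INSERT INTO game_post (id, title, status, posted_on) VALUES (?, ?, ?, ?)",
    [(i, f"game {i}", 3 if i % 5 == 0 else 1, f"2024-01-{i:02d}")
     for i in range(1, 31)],
)

rows = conn.execute("""
    SELECT jp.id, jp.title, jp.posted_on
    FROM (
        SELECT id FROM game_post
        WHERE status != 3
        ORDER BY posted_on DESC
        LIMIT 10
    ) AS ids
    JOIN game_post AS jp USING (id)
    ORDER BY jp.posted_on DESC
""").fetchall()

print([r[0] for r in rows])  # [29, 28, 27, 26, 24, 23, 22, 21, 19, 18]
```

The outer ORDER BY is repeated because the ordering inside a derived table is not guaranteed to survive the join.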
LEFT
Also, don't say LEFT unless you need it. It costs something to generate NULLs that you may not be needing.
Is GROUP BY necessary?
If you remove the GROUP BY, does it show dup ids? The above changes may have eliminated the need.
IN(SELECT) may optimize poorly
Change
AND jrr.referby_user_id IN ( SELECT ab_testing_user_id
FROM ab_testing )
to
AND EXISTS ( SELECT * FROM ab_testing
WHERE ab_testing_user_id = jrr.referby_user_id )
(This change may or may not help, depending on the version you are running.)
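The two forms are interchangeable in terms of results, which a small sketch can confirm. It uses Python's built-in sqlite3 module (SQLite, not MySQL) with cut-down game_refer and ab_testing tables and invented rows. Note that a duplicate user id in ab_testing does not duplicate output rows in either form, which is one reason to prefer IN or EXISTS over a plain JOIN here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE game_refer (id INTEGER PRIMARY KEY, referby_user_id INTEGER);
CREATE TABLE ab_testing (ab_testing_user_id INTEGER);
INSERT INTO game_refer VALUES (1, 100), (2, 200), (3, 300), (4, 100);
-- User 300 appears twice on purpose.
INSERT INTO ab_testing VALUES (100), (300), (300);
""")

with_in = conn.execute("""
    SELECT jrr.id
    FROM game_refer jrr
    WHERE jrr.referby_user_id IN (SELECT ab_testing_user_id FROM ab_testing)
    ORDER BY jrr.id
""").fetchall()

with_exists = conn.execute("""
    SELECT jrr.id
    FROM game_refer jrr
    WHERE EXISTS (SELECT * FROM ab_testing
                  WHERE ab_testing_user_id = jrr.referby_user_id)
    ORDER BY jrr.id
""").fetchall()

print(with_in)      # [(1,), (3,), (4,)]
print(with_exists)  # [(1,), (3,), (4,)]
```

Which form is faster depends on the optimizer, so comparing the EXPLAIN output of both on the real server is the only reliable check.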
More
Please provide EXPLAIN SELECT if you need further assistance.

Count, SUM, LEFT JOIN and GROUP BY in query not working right

I have tried a few things but I can't seem to figure out what's causing the problem.
When I remove the totalHours part, the query works fine. But with it, it displays the right number of hours but the wrong number of Jobs, Selected and Winners.
Could someone please tell me what I am doing wrong?
Thanks in advance.
Here is my query;
SELECT
crmCandidate.candidateID,
crmCandidate.candidateName,
COUNT(DISTINCT crmJoin.joinID) AS Jobs,
SUM(IF(crmJoin.joinExtra = 'select', 1, 0)) AS Selected,
SUM(IF(crmJoin.joinExtra = 'winner', 1, 0)) AS Winner,
ROUND(SUM(crmDays.total)) AS totalDays
FROM crmCandidate
LEFT JOIN crmJoin ON (crmJoin.joinChild = crmCandidate.candidateID)
LEFT JOIN crmJob ON (crmJob.jobID = crmJoin.joinParent)
LEFT JOIN crmDays ON (crmDays.dayCandidateID = crmJoin.joinChild)
WHERE
crmDays.dayJobID = crmJob.jobID AND
crmDays.dayCandidateID = crmCandidate.candidateID
GROUP BY
crmCandidate.candidateID
ORDER BY crmCandidate.candidateID DESC
LIMIT 100
Try this one:
SELECT
crmCandidate.candidateID,
crmCandidate.candidateName,
COUNT(DISTINCT crmJoin.joinID) AS Jobs,
Sum(Case When crmJoin.joinExtra = 'select' Then 1 else 0 end) as Selected,
Sum(Case When crmJoin.joinExtra = 'winner' Then 1 else 0 end) as winner,
ROUND(SUM(crmDays.total)) AS totalDays
FROM crmCandidate
LEFT JOIN crmJoin
ON crmJoin.joinChild = crmCandidate.candidateID
LEFT JOIN crmJob
ON crmJob.jobID = crmJoin.joinParent
Inner JOIN crmDays
On crmDays.dayCandidateID = crmCandidate.candidateID
AND crmDays.dayJobID = crmJob.jobID
GROUP BY crmCandidate.candidateID, crmCandidate.candidateName
ORDER BY candidateID DESC
LIMIT 100
The best thing to do is to aggregate the data before you do the join. You can probably do what you want with the following count(distinct) clauses:
COUNT(DISTINCT case when crmJoin.joinExtra = 'select' then crmJoin.JoinId end) AS Selected,
COUNT(DISTINCT case when crmJoin.joinExtra = 'winner' then crmJoin.JoinId end) AS Winner,
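A small sketch of why the COUNT(DISTINCT CASE ...) form survives the extra join, using Python's built-in sqlite3 module (SQLite rather than MySQL) with only the crmJoin and crmDays tables and made-up rows. Each crmJoin row is repeated once per matching crmDays row, so SUM over a 1/0 flag counts the repeats, while COUNT(DISTINCT ...) over joinID counts each job once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crmJoin (
    joinID INTEGER PRIMARY KEY, joinChild INTEGER, joinParent INTEGER, joinExtra TEXT
);
CREATE TABLE crmDays (dayCandidateID INTEGER, dayJobID INTEGER, total REAL);
-- Candidate 1 has two jobs: job 10 is 'select', job 20 is 'winner'.
INSERT INTO crmJoin VALUES (1, 1, 10, 'select'), (2, 1, 20, 'winner');
-- Job 10 has three day rows, job 20 has one.
INSERT INTO crmDays VALUES (1, 10, 1.0), (1, 10, 1.0), (1, 10, 0.5), (1, 20, 2.0);
""")

row = conn.execute("""
    SELECT
        SUM(CASE WHEN j.joinExtra = 'select' THEN 1 ELSE 0 END) AS selected_sum,
        COUNT(DISTINCT CASE WHEN j.joinExtra = 'select' THEN j.joinID END) AS selected_distinct,
        COUNT(DISTINCT CASE WHEN j.joinExtra = 'winner' THEN j.joinID END) AS winner_distinct,
        SUM(d.total) AS totalDays
    FROM crmJoin j
    JOIN crmDays d
      ON d.dayCandidateID = j.joinChild AND d.dayJobID = j.joinParent
    GROUP BY j.joinChild
""").fetchone()

print(row)  # (3, 1, 1, 4.5): the plain SUM counts the 'select' job three times
```

totalDays is still correct here because every crmDays row appears exactly once in the joined result; it would be inflated if another one-to-many join were added.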

MySQL sum of column value from derived table

This is my query:
SELECT usr.id,
count(DISTINCT sol.id) as 'Asked',
count(DISTINCT ans.id) as 'Answered',
sum(DISTINCT CASE ans.accepted WHEN 1 THEN 1 ELSE 0 end) as 'Accepted'
FROM tbl_users usr
LEFT JOIN tbl_solutions sol on sol.authorID = usr.id
LEFT JOIN tbl_solution_answers ans on ans.authorID = usr.id
group by usr.id, sol.authorID
My above query with the sum(DISTINCT CASE ans.accepted WHEN 1 THEN 1 ELSE 0 end) only ever returns 1 though I know that's not the case. I've tried adding a group clause on the ans.authorID but it has no effect.
How can I get the sum of all rows from the tbl_solution_answers ans table where the authorID is that of tbl_users.id and Accepted is 1?
SELECT usr.id,
count(DISTINCT sol.id) as 'Asked',
count(DISTINCT ans.id) as 'Answered',
count(DISTINCT case ans.accepted when 1 then ans.id end) as 'Accepted'
FROM tbl_users usr
LEFT JOIN tbl_solutions sol on sol.authorID = usr.id
LEFT JOIN tbl_solution_answers ans on ans.authorID = usr.id
group by usr.id, sol.authorID, ans.authorID
After many permutations, count(DISTINCT case ans.accepted when 1 then ans.id end) as 'Accepted' seems to work. Now if an authorID in tbl_solution_answers has 8 rows, all 8 are counted as Answered, and if, say, 3 of them are accepted, then 3 is returned as Accepted.
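The reason the original SUM(DISTINCT ...) never exceeded 1 is that the CASE expression only produces the values 0 and 1, and DISTINCT reduces them to that two-element set before summing. A minimal sketch with Python's built-in sqlite3 module (SQLite instead of MySQL, invented rows, and grouping by usr.id only, which is an assumed simplification) compares the three variants side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_users (id INTEGER PRIMARY KEY);
CREATE TABLE tbl_solutions (id INTEGER PRIMARY KEY, authorID INTEGER);
CREATE TABLE tbl_solution_answers (
    id INTEGER PRIMARY KEY, authorID INTEGER, accepted INTEGER
);
INSERT INTO tbl_users VALUES (1);
-- User 1 asked two questions.
INSERT INTO tbl_solutions VALUES (1, 1), (2, 1);
-- User 1 wrote eight answers, three of them accepted.
INSERT INTO tbl_solution_answers (authorID, accepted) VALUES
    (1, 1), (1, 1), (1, 1), (1, 0), (1, 0), (1, 0), (1, 0), (1, 0);
""")

row = conn.execute("""
    SELECT usr.id,
           COUNT(DISTINCT sol.id) AS Asked,
           COUNT(DISTINCT ans.id) AS Answered,
           SUM(DISTINCT CASE ans.accepted WHEN 1 THEN 1 ELSE 0 END) AS sum_distinct,
           SUM(CASE ans.accepted WHEN 1 THEN 1 ELSE 0 END) AS plain_sum,
           COUNT(DISTINCT CASE ans.accepted WHEN 1 THEN ans.id END) AS Accepted
    FROM tbl_users usr
    LEFT JOIN tbl_solutions sol ON sol.authorID = usr.id
    LEFT JOIN tbl_solution_answers ans ON ans.authorID = usr.id
    GROUP BY usr.id
""").fetchone()

print(row)  # (1, 2, 8, 1, 6, 3)
```

sum_distinct is 1 because the distinct values are just 0 and 1. plain_sum is 6 because each accepted answer is repeated once per question by the two joins. Only the COUNT(DISTINCT ...) over ans.id gives 3.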