MySQL - Counting rows and left join problem

I have 2 tables, campaigns and campaign_codes:
campaigns: id, partner_id, status
campaign_codes: id, campaign_id, code, status
I want to get a count of campaign_codes for every campaign, counting only codes WHERE campaign_codes.status equals 0, and including campaigns that have no campaign_codes records at all.
I have the following SQL, but of course the WHERE clause eliminates the campaigns that have no corresponding records in campaign_codes (I want those campaigns, with a count of zero, as well):
SELECT
c.id AS campaign_id,
COUNT(cc.id) AS code_count
FROM
campaigns c
LEFT JOIN campaign_codes cc on cc.campaign_id = c.id
WHERE c.partner_id = 4
AND cc.status = 0
GROUP BY c.id

I'd opt for something like:
SELECT
c.id AS campaign_id,
COUNT(cc.id) AS code_count
FROM
campaigns c
LEFT JOIN campaign_codes cc on cc.campaign_id = c.id
AND cc.status = 0 -- Having this clause in the WHERE, effectively makes this an INNER JOIN
WHERE c.partner_id = 4
GROUP BY c.id
Moving the AND into the join clause makes the condition part of the match itself: a code row either joins or it doesn't, and a campaign with no matching row in the 'right' table is still kept, with NULLs in the cc columns.
If the condition were in the WHERE, the comparison cc.status = 0 would evaluate to NULL (not true) for those rows, and they would be eliminated from the results.
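To see the difference on actual rows, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL (the behaviour of LEFT JOIN with ON versus WHERE is the same for this query). The sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory database with made-up rows; only the table and column names
# come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaigns (id INTEGER PRIMARY KEY, partner_id INTEGER, status INTEGER);
CREATE TABLE campaign_codes (id INTEGER PRIMARY KEY, campaign_id INTEGER,
                             code TEXT, status INTEGER);
INSERT INTO campaigns VALUES (1, 4, 1), (2, 4, 1), (3, 4, 1);
INSERT INTO campaign_codes VALUES
    (1, 1, 'A', 0), (2, 1, 'B', 0), (3, 1, 'C', 1), (4, 2, 'D', 1);
""")
# Campaign 1 has two status-0 codes, campaign 2 has only a status-1 code,
# and campaign 3 has no codes at all.

QUERY = """
SELECT c.id, COUNT(cc.id)
FROM campaigns c
LEFT JOIN campaign_codes cc ON cc.campaign_id = c.id {on_extra}
WHERE c.partner_id = 4 {where_extra}
GROUP BY c.id
ORDER BY c.id
"""

# Same query twice; the only difference is where the status filter lives.
filter_in_where = conn.execute(
    QUERY.format(on_extra="", where_extra="AND cc.status = 0")).fetchall()
filter_in_on = conn.execute(
    QUERY.format(on_extra="AND cc.status = 0", where_extra="")).fetchall()

print(filter_in_where)  # [(1, 2)]
print(filter_in_on)     # [(1, 2), (2, 0), (3, 0)]
```

With the filter in the WHERE, campaigns 2 and 3 disappear; with the filter in the ON clause they come back with a count of 0.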

SELECT
c.id AS campaign_id,
COUNT(cc.id) AS code_count
FROM
campaigns c
LEFT JOIN campaign_codes cc on cc.campaign_id = c.id
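AND c.partner_id = 4 -- note: in the ON clause this does not filter campaigns; campaigns of other partners are still returned, with a count of 0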
AND cc.status = 0
GROUP BY c.id

Related

Left join Count disturbing left join SUM

When I added a left join to get a count from another table, it multiplied the SUM from my other left-joined table by that count. I also can't use SUM(DISTINCT ...) here, since two amounts can be the same:
SELECT c.id as company_id, SUM(ct.amount) as total_billed, count(l.id) as load_count
FROM tbl_companies c
LEFT JOIN tbl_company_transactions ct ON c.id = ct.company_id
LEFT JOIN tbl_loads l ON c.id = l.company_id
GROUP BY c.id;
You need to pre-aggregate the data:
SELECT c.id as company_id, ct.total_billed,
l.load_count
FROM tbl_companies c LEFT JOIN
(SELECT ct.company_id, SUM(ct.amount) as total_billed
FROM tbl_company_transactions ct
GROUP BY ct.company_id
) ct
ON c.id = ct.company_id LEFT JOIN
(SELECT l.company_id, COUNT(*) as load_count
FROM tbl_loads l
GROUP BY l.company_id
) l
ON c.id = l.company_id;
As you have observed, the JOIN multiplies the number of rows and affects the aggregations.
You could isolate aggregate statistics and join results afterwards.
WITH
tranStats AS (
SELECT company_id, SUM(amount) AS total_billed
FROM tbl_company_transactions
GROUP BY company_id
),
loadStats AS (
SELECT company_id, COUNT(1) AS load_count
FROM tbl_loads
GROUP BY company_id
)
SELECT id, total_billed, load_count
FROM tbl_companies c
LEFT JOIN tranStats t ON t.company_id = c.id
LEFT JOIN loadStats l ON l.company_id = c.id
Pre-aggregating both tables, as in Gordon's answer, is more scalable, but for this specific query you only need one subquery. That may also offer a performance boost, since joins on pre-aggregated data may not be able to use indexes.
SELECT c.id as company_id, SUM(ct.amount) as total_billed, l.load_count
FROM tbl_companies c
LEFT JOIN tbl_company_transactions ct ON c.id = ct.company_id
LEFT JOIN (
SELECT company_id, count(*) as load_count
FROM tbl_loads
GROUP BY company_id
) l ON c.id = l.company_id
GROUP BY c.id;
The important thing to grasp is that if you need the result of an aggregate function like SUM() or COUNT(), you need to be careful when you perform more than one join that can match multiple rows per group: the joined rows multiply each other before the aggregate is computed.
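A small sketch of that multiplication effect, again run through Python's sqlite3 module instead of MySQL. The rows are invented for illustration: company 1 has two transactions (100 and 50) and three loads, company 2 has neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_companies (id INTEGER PRIMARY KEY);
CREATE TABLE tbl_company_transactions (id INTEGER PRIMARY KEY,
                                       company_id INTEGER, amount INTEGER);
CREATE TABLE tbl_loads (id INTEGER PRIMARY KEY, company_id INTEGER);
INSERT INTO tbl_companies VALUES (1), (2);
INSERT INTO tbl_company_transactions VALUES (1, 1, 100), (2, 1, 50);
INSERT INTO tbl_loads VALUES (1, 1), (2, 1), (3, 1);
""")

# Two one-to-many joins in one query: 2 transactions x 3 loads = 6 rows
# for company 1, so both aggregates are inflated.
naive = conn.execute("""
SELECT c.id, SUM(ct.amount), COUNT(l.id)
FROM tbl_companies c
LEFT JOIN tbl_company_transactions ct ON c.id = ct.company_id
LEFT JOIN tbl_loads l ON c.id = l.company_id
GROUP BY c.id
ORDER BY c.id
""").fetchall()

# Aggregate each child table on its own first, then join the one-row-per-
# company results.
pre_aggregated = conn.execute("""
SELECT c.id, ct.total_billed, l.load_count
FROM tbl_companies c
LEFT JOIN (SELECT company_id, SUM(amount) AS total_billed
           FROM tbl_company_transactions GROUP BY company_id) ct
       ON c.id = ct.company_id
LEFT JOIN (SELECT company_id, COUNT(*) AS load_count
           FROM tbl_loads GROUP BY company_id) l
       ON c.id = l.company_id
ORDER BY c.id
""").fetchall()

print(naive)           # [(1, 450, 6), (2, None, 0)]
print(pre_aggregated)  # [(1, 150, 3), (2, None, None)]
```

Note that in the pre-aggregated version a company with no loads gets NULL rather than 0 for load_count; wrapping the column in COALESCE(l.load_count, 0) would be one way to turn that into a zero.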

Returning all results of an outer query and getting a count of attached items

I'm struggling to write a query that returns all categories regardless of which filter I have applied, with a count that changes based on how many recipes match that filter.
The query works nicely if I don't apply any filters. The counts seem right, but as soon as I add something like this: where c.parent_id is not null and r.time_cook_minutes > 60, I am filtering out most of the categories instead of just getting a count of zero.
Here's an example query that I came up with that does not work the way I want it to:
select t.id, t.name, t.parent_id, a.cntr from categories as t,
(select c.id, count(*) as cntr from categories as c
inner join recipe_categories as rc on rc.category_id = c.id
inner join recipes as r on r.id = rc.recipe_id
where c.parent_id is not null and r.time_cook_minutes > 60
group by c.id) as a
where a.id = t.id
group by t.id
As you might imagine, this currently returns only the categories that have recipes in the filtered subset. What I'd like is to get all categories regardless of the filter, with a count of 0 for those that have no recipes matching it.
Any help with this would be greatly appreciated. If this question is not clear, let me know and I can elaborate.
No need for nested join if you move the condition into a regular outer join:
select t.id, t.name, t.parent_id, count(r.id)
from categories as t
left join recipe_categories as rc on rc.category_id = t.id
left join recipes as r on r.id = rc.recipe_id
and r.time_cook_minutes > 60
where t.parent_id is not null
group by 1, 2, 3
Notes:
Use left joins so you always get every category
Put r.time_cook_minutes > 60 in the left join's ON condition. Leaving it in the where clause cancels the effect of the left join
Simply use conditional aggregation, moving the WHERE clause into a CASE (or IF() for MySQL) statement wrapped in a SUM() of 1's and 0's (i.e., counts). Also, be sure to consistently use the explicit join, the current industry practice in SQL. While your derived table uses this form of join, the outer query uses implicit join matching IDs in WHERE clause.
select t.id, t.name, t.parent_id, a.cntr
from categories as t
inner join
(select c.id, sum(case when c.parent_id is not null and r.time_cook_minutes > 60
then 1
else 0
end) as cntr
from categories as c
inner join recipe_categories as rc on rc.category_id = c.id
inner join recipes as r on r.id = rc.recipe_id
group by c.id) as a
on a.id = t.id
group by t.id
I believe you want:
select c.id, c.name, c.parent_id, count(r.id)
from categories c left join
recipe_categories rc
on rc.category_id = c.id left join
recipes r
on r.id = rc.recipe_id and r.time_cook_minutes > 60
where c.parent_id is not null
group by c.id, c.name, c.parent_id;
Notes:
This uses left joins for all the joins.
It aggregates by all the non-aggregated columns.
It counts matching recipes rather than all rows.
The condition on recipes is moved to the on clause from the where clause.
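Here is a rough check of that left-join version against a few made-up rows, run with Python's sqlite3 module instead of MySQL. The column layout of recipe_categories and recipes is assumed from the queries above; the data is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
CREATE TABLE recipes (id INTEGER PRIMARY KEY, name TEXT, time_cook_minutes INTEGER);
CREATE TABLE recipe_categories (recipe_id INTEGER, category_id INTEGER);
INSERT INTO categories VALUES
    (1, 'Food', NULL), (2, 'Soup', 1), (3, 'Salad', 1), (4, 'Dessert', 1);
INSERT INTO recipes VALUES (1, 'Stew', 90), (2, 'Broth', 30), (3, 'Caesar', 10);
INSERT INTO recipe_categories VALUES (1, 2), (2, 2), (3, 3);
""")
# Soup has one slow recipe (90 min) and one fast one, Salad has only a fast
# recipe, Dessert has no recipes, and Food is a top-level category.

rows = conn.execute("""
SELECT c.id, c.name, COUNT(r.id)
FROM categories c
LEFT JOIN recipe_categories rc ON rc.category_id = c.id
LEFT JOIN recipes r ON r.id = rc.recipe_id AND r.time_cook_minutes > 60
WHERE c.parent_id IS NOT NULL
GROUP BY c.id, c.name
ORDER BY c.id
""").fetchall()

print(rows)  # [(2, 'Soup', 1), (3, 'Salad', 0), (4, 'Dessert', 0)]
```

COUNT(r.id) only counts rows where a recipe actually matched, which is why Salad and Dessert show 0 instead of dropping out or showing 1.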

choosing the sql statement which use less loading time

I have 2 SQL statements which produce the same result, and I'm wondering which one to choose.
Let's say I have 3 tables:
supplier
supplier_status
supplier_contact
statement 1)
SELECT a.*, b.status_name,
(SELECT c.name FROM contact c
WHERE c.supplier_id = a.supplier_id
ORDER BY c.contact_id DESC limit 1) AS contact_name
FROM supplier a LEFT JOIN supplier_status b
ON a.status_id = b.status_id
statement 2)
SELECT a.*, b.status_name, c.name AS contact_name
FROM supplier a LEFT JOIN supplier_status b
ON a.status_id = b.status_id
LEFT JOIN (SELECT name, supplier_id
FROM contact
ORDER BY contact_id DESC
) c ON a.supplier_id = c.supplier_id
GROUP BY a.supplier_id
Try this query:
SELECT a.*, b.status_name, c.name AS contact_name
FROM supplier a
LEFT JOIN supplier_status b ON a.status_id = b.status_id
LEFT JOIN contact c ON a.supplier_id = c.supplier_id
LEFT JOIN contact d ON c.supplier_id = d.supplier_id AND c.contact_id < d.contact_id
WHERE d.contact_id IS NULL
It's possible that it doesn't produce the same result as yours (I didn't test it) but if it does then all you have to do is to make sure the fields that appear in the JOIN conditions are indexed (they probably are FKs, so they already are).
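Since that query is untested, here is a quick sketch that runs the same self-join pattern on invented data, using Python's sqlite3 module rather than MySQL. The column lists for the three tables are assumptions inferred from the queries in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, status_id INTEGER);
CREATE TABLE supplier_status (status_id INTEGER PRIMARY KEY, status_name TEXT);
CREATE TABLE contact (contact_id INTEGER PRIMARY KEY, supplier_id INTEGER, name TEXT);
INSERT INTO supplier_status VALUES (1, 'active');
INSERT INTO supplier VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO contact VALUES (1, 1, 'Ann'), (2, 1, 'Bob'), (3, 2, 'Cy');
""")
# Supplier 1 has two contacts (Bob is the newest), supplier 2 has one,
# and supplier 3 has none.

# d looks for a *newer* contact of the same supplier; keeping only rows
# where no such d exists leaves the contact with the highest contact_id.
latest = conn.execute("""
SELECT a.supplier_id, b.status_name, c.name AS contact_name
FROM supplier a
LEFT JOIN supplier_status b ON a.status_id = b.status_id
LEFT JOIN contact c ON a.supplier_id = c.supplier_id
LEFT JOIN contact d ON c.supplier_id = d.supplier_id
                   AND c.contact_id < d.contact_id
WHERE d.contact_id IS NULL
ORDER BY a.supplier_id
""").fetchall()

print(latest)  # [(1, 'active', 'Bob'), (2, 'active', 'Cy'), (3, 'active', None)]
```

On this data the result has one row per supplier, including the supplier without contacts, which matches what statement 1 in the question is after.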

mySQL Sub Select needed

I have three tables, libraryitems, copies and loans.
A libraryitem hasMany copies, and a copy hasMany loans.
I'm trying to get only the latest loan entry for each copy. The query below returns all loans for a given copy.
SELECT
libraryitems.title,
copies.id,
copies.qruuid,
loans.id AS loanid,
loans.status,
loans.byname,
loans.byemail,
loans.createdAt
FROM copies
INNER JOIN libraryitems ON copies.libraryitemid = libraryitems.id AND libraryitems.deletedAt IS NULL
LEFT OUTER JOIN loans ON copies.id = loans.copyid
WHERE copies.libraryitemid = 1
ORDER BY copies.id ASC, loans.createdAt DESC
I know there needs to be a sub select of some description in here, but I'm struggling to get the correct syntax. How do I return only the latest row, i.e. the one with MAX(loans.createdAt), for each distinct copy? Just using group by copies.id returns the earliest, rather than the latest, entry.
The subquery gets the maximum createdAt for each copy (i.e. the latest loan), and the result is joined back to loans to fetch the other details.
SELECT
T.title,
T.id,
T.qruuid,
loans.id AS loanid,
loans.status,
loans.byname,
loans.byemail,
loans.createdAt
FROM
(
SELECT C.id, C.qruuid, L.title, MAX(LN.createdAt) as maxCreatedTime
FROM copies C
INNER JOIN libraryitems L ON C.libraryitemid = L.id
AND L.deletedAt IS NULL
LEFT OUTER JOIN loans LN ON C.id = LN.copyid
GROUP BY C.id, C.qruuid, L.title) T
JOIN loans ON T.id = loans.copyid
AND T.maxCreatedTime = loans.createdAt
A self left join on loans table will give you latest loan of a copy, you may join the query to the other tables to fetch the desired output.
select * from loans A
left outer join loans B
on A.copyid = B.copyid and A.createdAt < B.createdAt
where B.createdAt is null;
This is your query with one simple modification -- table aliases to make it clearer.
SELECT li.title, c.id, c.qruuid,
l.id AS loanid, l.status, l.byname, l.byemail, l.createdAt
FROM copies c INNER JOIN
libraryitems li
ON c.libraryitemid = li.id AND
li.deletedAt IS NULL LEFT JOIN
loans l
ON c.id = l.copyid
WHERE c.libraryitemid = 1
ORDER BY c.id ASC, l.createdAt DESC ;
With this as a beginning, let's think about what you need. You want the loan with the latest createdAt date for each c.id. You can get this information with a subquery:
select l.copyid, max(l.createdAt)
from loans l
group by l.copyid
Now, you just need to join this information back in:
SELECT li.title, c.id, c.qruuid,
l.id AS loanid, l.status, l.byname, l.byemail, l.createdAt
FROM copies c INNER JOIN
libraryitems li
ON c.libraryitemid = li.id AND
li.deletedAt IS NULL LEFT JOIN
(SELECT l2.copyid, max(l2.createdAt) as maxca
FROM loans l2
GROUP BY l2.copyid
) lmax
ON c.id = lmax.copyid LEFT JOIN
loans l
ON l.copyid = lmax.copyid and l.createdAt = lmax.maxca
WHERE c.libraryitemid = 1
ORDER BY c.id ASC, l.createdAt DESC ;
This should give you the most recent record. And the use of left join should keep all copies, even those that have never been lent.
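As a sanity check of the "find the max per copy, then join back to loans" idea, here is a minimal sketch on made-up rows, run with Python's sqlite3 module rather than MySQL. The table layouts are assumptions based on the columns used in the question, and the join order (max subquery first, then loans) is one possible arrangement, not necessarily the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE libraryitems (id INTEGER PRIMARY KEY, title TEXT, deletedAt TEXT);
CREATE TABLE copies (id INTEGER PRIMARY KEY, libraryitemid INTEGER, qruuid TEXT);
CREATE TABLE loans (id INTEGER PRIMARY KEY, copyid INTEGER,
                    status TEXT, createdAt TEXT);
INSERT INTO libraryitems VALUES (1, 'Dune', NULL);
INSERT INTO copies VALUES (1, 1, 'q1'), (2, 1, 'q2');
INSERT INTO loans VALUES
    (1, 1, 'out', '2020-01-01'),
    (2, 1, 'returned', '2020-02-01'),
    (3, 1, 'out', '2020-03-01');
""")
# Copy 1 has three loans (the newest is loan 3); copy 2 was never lent.

latest_loans = conn.execute("""
SELECT c.id, l.id AS loanid, l.status
FROM copies c
INNER JOIN libraryitems li
        ON c.libraryitemid = li.id AND li.deletedAt IS NULL
LEFT JOIN (SELECT copyid, MAX(createdAt) AS maxca
           FROM loans GROUP BY copyid) lmax
       ON lmax.copyid = c.id
LEFT JOIN loans l
       ON l.copyid = lmax.copyid AND l.createdAt = lmax.maxca
WHERE c.libraryitemid = 1
ORDER BY c.id
""").fetchall()

print(latest_loans)  # [(1, 3, 'out'), (2, None, None)]
```

If two loans for the same copy share exactly the same createdAt, this approach returns both of them, so a tie-breaker (for example on loans.id) would be needed in that case.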

Rows missing from mysql pivot query results

I have a MySQL query, shown below. It returns exactly the results I want for one row, but no other rows, where I expect 8 in my test data (there are 8 unique test ids). I was inspired by this answer but obviously messed up my implementation. Does anyone see where I'm going wrong?
SELECT
c.first_name,
c.last_name,
n.test_name,
e.doc_name,
e.email,
e.lab_id,
a.test_id,
a.date_req,
a.date_approved,
a.accepts_terms,
a.res_value,
a.reason,
a.test_type,
a.date_collected,
a.date_received,
k.kind_name,
sum(case when metabolite_name = "Creatinine" then t.res_val end) as Creatinine,
sum(case when metabolite_name = "Glucose" then t.res_val end) as Glucose,
sum(case when metabolite_name = "pH" then t.res_val end) as pH
FROM test_requisitions AS a
INNER JOIN personal_info AS c ON (a.user_id = c.user_id)
INNER JOIN test_types AS d ON (a.test_type = d.test_type)
INNER JOIN kinds AS k ON (k.id = d.kind_id)
INNER JOIN test_names AS n ON (d.name_id = n.id)
INNER JOIN docs AS e ON (a.doc_id = e.id)
INNER JOIN test_metabolites AS t ON (t.test_id = a.test_id)
RIGHT JOIN metabolites AS m ON (m.id = t.metabolite_id)
GROUP BY a.test_id
ORDER BY (a.date_approved IS NOT NULL),(a.res_value IS NOT NULL), a.date_req, c.last_name ASC;
Most of your joins are inner joins. The last is a right outer join. As written, the query keeps all the metabolites, but not necessarily all the tests.
I would suggest that you change them all to left outer joins, because you want to keep all the rows in the first table:
FROM test_requisitions AS a
LEFT JOIN personal_info AS c ON (a.user_id = c.user_id)
LEFT JOIN test_types AS d ON (a.test_type = d.test_type)
LEFT JOIN kinds AS k ON (k.id = d.kind_id)
LEFT JOIN test_names AS n ON (d.name_id = n.id)
LEFT JOIN docs AS e ON (a.doc_id = e.id)
LEFT JOIN test_metabolites AS t ON (t.test_id = a.test_id)
LEFT JOIN metabolites AS m ON (m.id = t.metabolite_id)
I would also suggest that your aliases be related to the table, so tr for test_requisition, pi for personal_info, and so on.