Table identifiers with LEFT JOIN - mysql

How does one join on, say, tableD.id = tableC.id AND tableD.id = tableE.id? Both tableD and tableE may have 0 rows, and I need to count them, i.e. SELECT COUNT(E.id). The problem is I don't know where to declare the table identifiers.
I've tried:
FROM tableB B, tableD D, tableE E ...
LEFT JOIN (tableC C, tableD D) ON ...
SELECT B.*, COUNT(C.id) AS cCount
FROM tableB B
LEFT JOIN (tableC C)
ON (B.id = C.id)
GROUP BY B.id

It is a little difficult to tell from your question what you're looking for, but I believe this is it:
SELECT B.*, COUNT(C.id) AS cCount
FROM tableB AS B
LEFT JOIN tableC AS C ON B.id = C.id
LEFT JOIN tableD AS D ON C.Id = D.Id
LEFT JOIN tableE AS E ON D.Id = E.Id
GROUP BY B.id
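If it helps to see why the chained LEFT JOINs cope with empty tables, here is a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows and the extra dCount/eCount columns are my own assumptions, not part of the question. COUNT(column) skips NULLs, so a table with no matching rows simply contributes 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableB (id INTEGER PRIMARY KEY);
    CREATE TABLE tableC (id INTEGER);
    CREATE TABLE tableD (id INTEGER);
    CREATE TABLE tableE (id INTEGER);
    INSERT INTO tableB VALUES (1), (2);
    INSERT INTO tableC VALUES (1);
    -- tableD and tableE are intentionally left empty
""")

# Each alias is declared right where its table is joined.
rows = conn.execute("""
    SELECT B.id,
           COUNT(C.id) AS cCount,
           COUNT(D.id) AS dCount,
           COUNT(E.id) AS eCount
    FROM tableB AS B
    LEFT JOIN tableC AS C ON B.id = C.id
    LEFT JOIN tableD AS D ON C.id = D.id
    LEFT JOIN tableE AS E ON D.id = E.id
    GROUP BY B.id
    ORDER BY B.id
""").fetchall()

print(rows)  # [(1, 1, 0, 0), (2, 0, 0, 0)]
```

Every row of tableB survives, and the empty tables show up as zero counts rather than removing rows.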

Related

What is the difference between left join (table1, table2) and left join table1 left join table2

What is the difference between the two sql queries?
select * from a
left join (b, c)
on a.id = b.uid and a.id = c.uid
select * from a
left join b on a.id = b.uid
left join c on a.id = c.uid
Let's have this data (a has two rows, b and c have one row each):
A    B    C
id   uid  uid
--   ---  ---
1    1    2
2
First, the second query:
select * from a
left join b on a.id = b.uid
left join c on a.id = c.uid
ID UID UID
-- ---- ----
1 1 NULL
2 NULL 2
This should come as no surprise - second column is joined from b and where there's no data in b, NULL is used (outer join); third column behaves the same, just for c.
The first query, rewritten with CROSS JOIN (which it is equivalent to) to be ANSI-compliant:
select * from a
left join (b CROSS JOIN c)
on a.id = b.uid and a.id = c.uid
ID UID UID
-- ---- ----
2 NULL NULL
1 NULL NULL
Why are they all NULLs?
First, the CROSS JOIN is performed, but that results in a resultset with just one row:
b.UID c.UID
----- -----
1 2
Then, the left join is performed, but there's no row in the result of the cross join that would have the same uid for both b and c, so no row can be matched for either row in a.
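The two result sets above can be reproduced with a short script. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than on MySQL, and it writes the (b, c) group as a derived table named bc with column aliases b_uid and c_uid (both names are mine), which should behave the same way as the parenthesized cross join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (uid INTEGER);
    CREATE TABLE c (uid INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1);
    INSERT INTO c VALUES (2);
""")

# Second query: b and c are each left-joined to a independently.
chained = conn.execute("""
    SELECT a.id, b.uid, c.uid
    FROM a
    LEFT JOIN b ON a.id = b.uid
    LEFT JOIN c ON a.id = c.uid
    ORDER BY a.id
""").fetchall()

# First query: b and c are cross-joined first, then left-joined to a.
grouped = conn.execute("""
    SELECT a.id, bc.b_uid, bc.c_uid
    FROM a
    LEFT JOIN (SELECT b.uid AS b_uid, c.uid AS c_uid
               FROM b CROSS JOIN c) AS bc
           ON a.id = bc.b_uid AND a.id = bc.c_uid
    ORDER BY a.id
""").fetchall()

print(chained)  # [(1, 1, None), (2, None, 2)]
print(grouped)  # [(1, None, None), (2, None, None)]
```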
select * from a
left join (b, c)
on a.id = b.uid and a.id = c.uid
is equivalent to
select * from a
left join (b cross join c)
on (a.id = b.uid and a.id = c.uid)
Here you can find the details
https://dev.mysql.com/doc/refman/5.7/en/join.html

Ordering clauses of a left join

I'm trying to join some tables with a query like the one below. Ideally I want the c.name that the b row refers to. If there is no matching b row, or the b row doesn't refer to c, then I just want the c.name that the a row refers to.
SELECT a.*, c.name
FROM a
LEFT JOIN b ON a.b_id = b.id
LEFT JOIN c ON (b.c_id IS NOT NULL AND b.c_id = c.id) OR a.c_id = c.id
However, MySQL always joins c on a.c_id = c.id and returns the less-favored c.name. Is it possible to avoid this, or is MySQL trying to get a full result set as quickly as it can?
Try this; it may help:
SELECT a.*, c.name
FROM a
LEFT JOIN b ON a.b_id = b.id
LEFT JOIN c ON (b.c_id IS NOT NULL OR b.c_id = c.id) OR a.c_id = c.id
I think this should help:
SELECT a.*, c.name
FROM a
LEFT JOIN b ON a.b_id = b.id
LEFT JOIN c ON c.id = COALESCE(b.c_id, a.c_id)
When b.c_id is NULL, then a.c_id will be used. Otherwise b.c_id will be used.
It's not about speed. OR will give you all possible result rows including both b.c_id and a.c_id mappings for each row in a.
If you're not familiar with COALESCE(), the long form of this is almost exactly like your query but using IF() instead of OR.
SELECT a.*, c.name
FROM a
LEFT JOIN b ON a.b_id = b.id
LEFT JOIN c ON IF(b.c_id IS NOT NULL, b.c_id = c.id, a.c_id = c.id)
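To make the difference between the OR condition and COALESCE() concrete, here is a small sketch. It runs on SQLite via Python's sqlite3 module as a stand-in for MySQL, and the sample rows and the names 'from_a' and 'from_b' are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, b_id INTEGER, c_id INTEGER);
    CREATE TABLE b (id INTEGER, c_id INTEGER);
    CREATE TABLE c (id INTEGER, name TEXT);
    INSERT INTO c VALUES (10, 'from_a'), (20, 'from_b');
    INSERT INTO b VALUES (1, 20), (2, NULL);
    -- a.id 1: b row points at c; a.id 2: b row has no c; a.id 3: no b row
    INSERT INTO a VALUES (1, 1, 10), (2, 2, 10), (3, NULL, 10);
""")

# The question's OR condition: both mappings match for a.id = 1.
with_or = conn.execute("""
    SELECT a.id, c.name
    FROM a
    LEFT JOIN b ON a.b_id = b.id
    LEFT JOIN c ON (b.c_id IS NOT NULL AND b.c_id = c.id) OR a.c_id = c.id
    ORDER BY a.id, c.name
""").fetchall()

# COALESCE picks exactly one key per row, preferring b.c_id.
with_coalesce = conn.execute("""
    SELECT a.id, c.name
    FROM a
    LEFT JOIN b ON a.b_id = b.id
    LEFT JOIN c ON c.id = COALESCE(b.c_id, a.c_id)
    ORDER BY a.id
""").fetchall()

print(with_or)        # [(1, 'from_a'), (1, 'from_b'), (2, 'from_a'), (3, 'from_a')]
print(with_coalesce)  # [(1, 'from_b'), (2, 'from_a'), (3, 'from_a')]
```

With OR, the row for a.id = 1 appears twice; with COALESCE() it appears once, with the preferred name.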

MySql strange performance

First of all, as an example, I have 3 tables: A, B, and C. Table A has a relation with table B, and table B has a relation with table C. I want to get the SUM of a field from table A, filtered by some fields from table C.
Table A has > 300k rows, Table B has > 4k rows, Table C has ~ 100 rows
My query looks like this:
SELECT SUM(a.hours) AS total
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
LEFT JOIN table_c c
ON b.table_c_id = c.id
WHERE a.customer_id = 1
AND c.title IN ('Title D','Title E')
Query execution time is ~7 sec, which is very slow. But the execution time of a query like the one below is ~0.0 sec.
SELECT a.hours
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
LEFT JOIN table_c c
ON b.table_c_id = c.id
WHERE a.customer_id = 1
AND c.title IN ('Title D','Title E')
Why is SUM so slow? What should I do?
Move your condition to the ON clause for the related table:
SELECT SUM(a.hours) AS total
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
LEFT JOIN table_c c
ON b.table_c_id = c.id
AND c.title IN ('Title D','Title E')
WHERE a.customer_id = 1
EDIT 1: Following @dnoeth's comment, I agree that we should probably use an inner join when joining table_c:
SELECT SUM(a.hours) AS total
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
INNER JOIN table_c c
ON b.table_c_id = c.id
AND c.title IN ('Title D','Title E')
WHERE a.customer_id = 1
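Separately from speed, it may be worth checking that the variants return the same total. The sketch below is an assumption-laden toy: SQLite via Python's sqlite3 instead of MySQL, three made-up rows, and a hypothetical helper named total. It suggests that a title filter in the ON clause of a LEFT JOIN no longer removes rows of table_a, while the WHERE version and the INNER JOIN version agree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, customer_id INTEGER,
                          table_b_id INTEGER, hours INTEGER);
    CREATE TABLE table_b (id INTEGER, table_c_id INTEGER);
    CREATE TABLE table_c (id INTEGER, title TEXT);
    INSERT INTO table_c VALUES (1, 'Title D'), (2, 'Title X');
    INSERT INTO table_b VALUES (1, 1), (2, 2);
    INSERT INTO table_a VALUES (1, 1, 1, 5), (2, 1, 2, 7), (3, 2, 1, 11);
""")

def total(sql):
    """Run a single-value aggregate query and return that value."""
    return conn.execute(sql).fetchone()[0]

base = """
    SELECT SUM(a.hours)
    FROM table_a a
    LEFT JOIN table_b b ON a.table_b_id = b.id
"""

# Original: the title filter lives in WHERE.
where_total = total(base + """
    LEFT JOIN table_c c ON b.table_c_id = c.id
    WHERE a.customer_id = 1 AND c.title IN ('Title D', 'Title E')
""")

# Filter moved into ON of a LEFT JOIN: unmatched rows of a are kept.
on_left_total = total(base + """
    LEFT JOIN table_c c ON b.table_c_id = c.id
                       AND c.title IN ('Title D', 'Title E')
    WHERE a.customer_id = 1
""")

# Filter in ON of an INNER JOIN: unmatched rows are dropped again.
on_inner_total = total(base + """
    INNER JOIN table_c c ON b.table_c_id = c.id
                        AND c.title IN ('Title D', 'Title E')
    WHERE a.customer_id = 1
""")

print(where_total, on_left_total, on_inner_total)  # 5 12 5
```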
Use LEFT JOIN in both places and add this composite index:
INDEX(customer_id, title)

Is left join commutative? What are its properties?

Assume tables TableA TableB TableC and TableD:
Is the following query:
TableA INNER JOIN TableB LEFT JOIN TableC LEFT JOIN TableD
(all joined to an id column) equivalent to:
TableA INNER JOIN TableB
INNER JOIN TableC
LEFT JOIN TableD
UNION
TableA INNER JOIN TableB
LEFT JOIN TableC ON TableB.c_id IS NULL
LEFT JOIN TableD
?
Note:
Or instead of union just do
TableA INNER JOIN TableB
INNER JOIN TableC
LEFT JOIN TableD
And then
TableA INNER JOIN TableB
LEFT JOIN TableC ON TableB.c_id IS NULL
LEFT JOIN TableD
and then combine the results
Update
Is
(A INNER JOIN B) LEFT JOIN C LEFT JOIN D
the same as:
A INNER JOIN (B LEFT JOIN C) LEFT JOIN D
?
Wikipedia:
"In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it."
Answer:
No, a left join is not commutative. An inner join is.
But that's not really what you are asking.
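For what it's worth, the commutativity claim is easy to check on a toy example. This sketch uses SQLite through Python's sqlite3 module as a stand-in for MySQL; the tables a and b and their rows are assumptions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id_a INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1), (3);
""")

def run(sql):
    """Return all rows of a query as a list of tuples."""
    return conn.execute(sql).fetchall()

# Swapping the operands of a LEFT JOIN changes which side is preserved.
a_left_b = run("SELECT a.id, b.id_a FROM a LEFT JOIN b ON a.id = b.id_a ORDER BY a.id")
b_left_a = run("SELECT a.id, b.id_a FROM b LEFT JOIN a ON a.id = b.id_a ORDER BY b.id_a")

# Swapping the operands of an INNER JOIN does not change the rows.
a_inner_b = run("SELECT a.id, b.id_a FROM a INNER JOIN b ON a.id = b.id_a")
b_inner_a = run("SELECT a.id, b.id_a FROM b INNER JOIN a ON a.id = b.id_a")

print(a_left_b)   # [(1, 1), (2, None)]
print(b_left_a)   # [(1, 1), (None, 3)]
print(a_inner_b)  # [(1, 1)]
print(b_inner_a)  # [(1, 1)]
```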
Is the following query:
TableA INNER JOIN TableB LEFT JOIN TableC LEFT JOIN TableD
(all joined to an id column) equivalent to:
TableA INNER JOIN TableB
INNER JOIN TableC
LEFT JOIN TableD
UNION
TableA INNER JOIN TableB
LEFT JOIN TableC ON TableB.c_id IS NULL
LEFT JOIN TableD
Answer:
Also no. Unions and joins don't really accomplish the same thing, generally speaking. In some cases you may be able to write them equivalently, but I don't think that holds for the general pseudo-SQL you are showing. The ON clause as written seems like it should not work (maybe there is something in MySQL that I do not know about?).
Here is a simplified set of queries that I do think would be equivalent.
SELECT *
FROM TableA a
LEFT JOIN
TableB b ON a.id = b.id_a
SELECT *
FROM TableA a
INNER JOIN
TableB b ON a.id = b.id_a
UNION
SELECT *
FROM TableA a
LEFT JOIN
TableB b ON a.id = b.id_a
WHERE b.id IS NULL
Edit 2:
Here's another example that is closer to yours but in essence the same.
SELECT *
FROM TableA a
INNER JOIN TableB b ON a.id = b.id_a
LEFT JOIN TableC c ON b.id = c.id_b
is the same as
SELECT *
FROM TableA a
INNER JOIN TableB b ON a.id = b.id_a
INNER JOIN TableC c ON b.id = c.id_b
UNION
SELECT *
FROM TableA a
INNER JOIN TableB b ON a.id = b.id_a
LEFT JOIN TableC c ON b.id = c.id_b
WHERE c.id IS NULL
But I still don't think I'm answering your real question.
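A quick way to convince yourself of the LEFT JOIN versus INNER JOIN plus UNION equivalence is to run both on sample data. The sketch below uses SQLite via Python's sqlite3 module in place of MySQL, with made-up rows. One assumption to keep in mind: UNION removes duplicate rows, so the two forms only agree when the joined rows are distinct, as they are here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, id_a INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (10, 1), (11, 1);
""")

left_rows = conn.execute("""
    SELECT a.id, b.id, b.id_a
    FROM a
    LEFT JOIN b ON a.id = b.id_a
""").fetchall()

# Matched rows from the inner join, plus the rows of a with no match in b.
union_rows = conn.execute("""
    SELECT a.id, b.id, b.id_a
    FROM a
    INNER JOIN b ON a.id = b.id_a
    UNION
    SELECT a.id, b.id, b.id_a
    FROM a
    LEFT JOIN b ON a.id = b.id_a
    WHERE b.id IS NULL
""").fetchall()

# Row order is not guaranteed, so compare as sets.
print(set(left_rows) == set(union_rows))  # True
```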

MySQL Multiple Aggregates

Without the third join D.cid = C.id, this query gives me the count of C. With the third join, the count is corrupted because unwanted tuples from D's join get counted as C rows. So how can I get the counts of C and D without the C count being affected? Is there a form of parentheses I can use to make sure I get the correct count?
SELECT A.*, B.*, COUNT(C.aid) AS cCount
FROM tableA A
LEFT JOIN tableC AS C ON A.id = C.aid
INNER JOIN tableB AS B ON A.id = B.aid
LEFT JOIN tableD AS D ON D.cid = C.id
GROUP BY A.id
I would have the counts from the other tables pre-aggregated on their own and then joined, something like:
SELECT
    A.*,
    B.*,
    COALESCE( PreAggC.CCount, 0 ) AS CCount,
    COALESCE( PreAggC.WithDCount, 0 ) AS WithDCount
FROM
    tableA A
    JOIN tableB B
        ON A.id = B.aid
    LEFT JOIN ( SELECT C.aid,
                       COUNT( DISTINCT C.id ) AS CCount,  -- distinct C rows per A
                       COUNT(*) AS WithDCount             -- C rows after joining D
                FROM tableC C
                LEFT JOIN tableD D
                    ON C.id = D.cid
                GROUP BY C.aid ) PreAggC
        ON A.id = PreAggC.aid
Now, do you really want to know how many entries actually have "D" records? I included both counts: the distinct "C" entries, and the overall row count after correlating with "D".
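Here is a small sketch of the effect, run on SQLite through Python's sqlite3 module as a stand-in for MySQL. The sample rows are made up, and I count D rows with COUNT(D.id) (aliased DCount) instead of COUNT(*) so that a C row without any D contributes 0; treat that substitution as my assumption, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER);
    CREATE TABLE tableB (id INTEGER, aid INTEGER);
    CREATE TABLE tableC (id INTEGER, aid INTEGER);
    CREATE TABLE tableD (id INTEGER, cid INTEGER);
    INSERT INTO tableA VALUES (1);
    INSERT INTO tableB VALUES (1, 1);
    INSERT INTO tableC VALUES (1, 1), (2, 1);
    INSERT INTO tableD VALUES (1, 1), (2, 1), (3, 1);
""")

# Joining D directly multiplies the C rows, inflating the C count.
naive = conn.execute("""
    SELECT A.id, COUNT(C.aid) AS cCount
    FROM tableA A
    LEFT JOIN tableC AS C ON A.id = C.aid
    INNER JOIN tableB AS B ON A.id = B.aid
    LEFT JOIN tableD AS D ON D.cid = C.id
    GROUP BY A.id
""").fetchall()

# Pre-aggregating C and D in a derived table keeps the counts separate.
pre_aggregated = conn.execute("""
    SELECT A.id,
           COALESCE(PreAggC.CCount, 0) AS CCount,
           COALESCE(PreAggC.DCount, 0) AS DCount
    FROM tableA A
    JOIN tableB B ON A.id = B.aid
    LEFT JOIN (SELECT C.aid,
                      COUNT(DISTINCT C.id) AS CCount,
                      COUNT(D.id) AS DCount
               FROM tableC C
               LEFT JOIN tableD D ON C.id = D.cid
               GROUP BY C.aid) PreAggC
           ON A.id = PreAggC.aid
""").fetchall()

print(naive)           # [(1, 4)]
print(pre_aggregated)  # [(1, 2, 3)]
```

There are only two C rows, yet the direct join reports 4; the pre-aggregated form reports 2 C rows and 3 D rows.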