Left join latest inspection date to establishments using MySQL

I have a table of establishments and I want to return a result set with the latest inspection date from the inspections table. Right now I have:
SELECT business_table.business_name, business_table.address, inspection_table.date
FROM business_table
LEFT JOIN inspection_table ON business_table.id = inspection_table.business_id
WHERE inspection_table.date = (
SELECT MAX(date)
FROM inspection_table)
The problem is that I get only one result, with the latest inspection date. I need all of the establishments returned. The query needs to be efficient because I have about 600K establishments and 3 million inspections.

You are using an outer join. When a business record has no matching inspection record, an empty inspection record is created and joined instead, so that outer-joined inspection record has NULL in every column. Then you have WHERE inspection_table.date = (...). This dismisses all outer-joined records again, because NULL never matches.
Use AND instead, to make the condition part of the ON clause:
SELECT
b.business_name,
b.address,
i.date
FROM business_table b
LEFT JOIN inspection_table i
ON i.business_id = b.id
AND i.date = (SELECT MAX(date) FROM inspection_table);
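To see the difference between filtering in WHERE and filtering in ON, here is a minimal sketch on two throwaway tables. The demo_ table names and the sample rows are made up for illustration and are not part of the question's schema.

```sql
-- Hypothetical miniature tables, only to show ON vs WHERE in an outer join.
CREATE TABLE demo_business (
  id INT PRIMARY KEY,
  business_name VARCHAR(100)
);
CREATE TABLE demo_inspection (
  id INT PRIMARY KEY,
  business_id INT,
  `date` DATE
);

INSERT INTO demo_business VALUES (1, 'Cafe A'), (2, 'Diner B');
INSERT INTO demo_inspection VALUES (10, 1, '2020-01-15');  -- Diner B has no inspection

-- Condition in WHERE: Diner B's outer-joined row has date = NULL,
-- NULL = '2020-01-15' is not true, so only Cafe A is returned.
SELECT b.business_name, i.date
FROM demo_business b
LEFT JOIN demo_inspection i ON i.business_id = b.id
WHERE i.date = '2020-01-15';

-- Condition in ON: the filter only decides which inspections get joined,
-- so Diner B is still returned, with NULL as its date.
SELECT b.business_name, i.date
FROM demo_business b
LEFT JOIN demo_inspection i
  ON i.business_id = b.id
 AND i.date = '2020-01-15';
```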

Ah, you are not looking for the latest inspection date at all. You are looking for the latest inspection date per business.
In your solution you are still using an outer join that doesn't work. Don't do that. Either use an outer join and use it properly or use an inner join if you are fine with that.
Here is how to get the latest inspection per business with an outer join (so you also show businesses that have had no inspection, yet):
SELECT
b.business_name,
b.address,
i.date,
...
FROM business_table b
LEFT JOIN inspection_table i
ON i.business_id = b.id
AND (i.business_id, i.date) IN
(
SELECT business_id, MAX(date)
FROM inspection_table
GROUP BY business_id
);
Same with joins:
SELECT
b.business_name,
b.address,
i.date,
...
FROM business_table b
LEFT JOIN
(
SELECT business_id, MAX(date) AS date
FROM inspection_table
GROUP BY business_id
) latest ON latest.business_id = b.id
LEFT JOIN inspection_table i
ON i.business_id = latest.business_id
AND i.date = latest.date;
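Since the question mentions 600K establishments and 3 million inspections, two things may be worth trying. This is only a sketch: the index name is made up, I am assuming no such index exists yet, and the window-function variant needs MySQL 8.0 or later.

```sql
-- Hypothetical index name. A composite index on (business_id, date) lets
-- MySQL find MAX(date) per business_id from the index alone.
CREATE INDEX idx_inspection_business_date
  ON inspection_table (business_id, date);

-- MySQL 8.0+ only: number each business's inspections, newest first,
-- and join just the row numbered 1.
SELECT
  b.business_name,
  b.address,
  i.date
FROM business_table b
LEFT JOIN
(
  SELECT business_id,
         date,
         ROW_NUMBER() OVER (PARTITION BY business_id ORDER BY date DESC) AS rn
  FROM inspection_table
) i
  ON i.business_id = b.id
 AND i.rn = 1;
```

One difference to be aware of: if a business has two inspections on its latest date, the MAX(date) queries above return both rows, while ROW_NUMBER() keeps exactly one of them.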

I ended up using the following:
SELECT business_table.business_name AS Name, business_table.address AS Address, business_table.city AS City, business_table.state AS Province, inspection_table.rating AS Rating, inspection_table.date AS "Inspected"
FROM business_table
LEFT JOIN inspection_table ON business_table.id = inspection_table.business_id
WHERE inspection_table.date = (
SELECT MAX(inspection_table.date)
FROM inspection_table
WHERE business_table.id = inspection_table.business_id)
ORDER BY inspection_table.date DESC
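Note that with the date condition in WHERE, establishments that have never been inspected are filtered out of this result. If they should be listed too, a possible variant (a sketch using the same columns, with table aliases b, i and i2 that I added) moves the correlated subquery into the ON clause:

```sql
SELECT b.business_name AS Name, b.address AS Address, b.city AS City,
       b.state AS Province, i.rating AS Rating, i.date AS Inspected
FROM business_table b
LEFT JOIN inspection_table i
  ON i.business_id = b.id
 -- keep only the newest inspection of this particular business
 AND i.date = (
       SELECT MAX(i2.date)
       FROM inspection_table i2
       WHERE i2.business_id = b.id)
ORDER BY i.date DESC;
```

In MySQL, ORDER BY ... DESC sorts NULL values last, so the uninspected establishments end up at the bottom of the list.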

Related

Is it possible to select from the result of a subquery in a join

So I have a large subquery and I would like to join on that subquery while using the result of the subquery in the join.
For example, I have a table called patient and one called appointment, and I would like to get the number of appointments per patient with given criteria.
Right now I am doing something like this:
SELECT
t1.*
FROM
(
SELECT
patient.name,
patient.id,
appointment.date
FROM
patient
LEFT JOIN appointment ON appointment.patient_id = patient.id
WHERE
/* a **lot** of filters, additional joins, etc*/
) t1
LEFT JOIN (
SELECT
COUNT(*) number_of_appointments,
patient.id
FROM
patient
LEFT JOIN appointment ON appointment.patient_id = patient.id
GROUP BY
patient.id
) t2 ON t1.id = t2.id
The problem is that this returns the number of appointments for each patient independent from the subquery above it. I tried writing the join as this:
LEFT JOIN (
SELECT
COUNT(*) number_of_appointments,
patient.id
FROM
t1
GROUP BY
patient.id
)
But obviously I'm getting an error saying that table t1 doesn't exist. Is there any way for me to do this cleanly without having to repeat all of the filters from t1 in t2?
Thanks!
Why not use window functions?
SELECT p.name, p.id, a.date,
COUNT(a.patient_id) OVER (PARTITION BY p.id) as num_appointments
FROM patient p LEFT JOIN
appointment a
ON a.patient_id = p.id
WHERE . . .
This provides the count based on the WHERE filtering. If you wanted a count of all appointments, then do the calculation before applying the WHERE:
SELECT p.name, p.id, a.date,
COALESCE(a.cnt, 0) as num_total_appointments,
COUNT(a.patient_id) OVER (PARTITION BY p.id) as num_matching_appointments
FROM patient p LEFT JOIN
(SELECT a.*,
COUNT(*) OVER (PARTITION BY a.patient_id) as cnt
FROM appointment a
) a
ON a.patient_id = p.id
WHERE . . .
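If you want to stay close to the original shape of the query (select from the result of t1 inside the join), a common table expression is one way to name the subquery once and use it twice. This is a sketch that assumes a database with CTE support (for MySQL that means 8.0 or later); the date filter is a made-up stand-in for the real filters.

```sql
WITH t1 AS (
  SELECT p.name, p.id, a.date
  FROM patient p
  LEFT JOIN appointment a ON a.patient_id = p.id
  WHERE a.date >= '2020-01-01'   -- hypothetical stand-in for the real filters
)
SELECT t1.*, t2.number_of_appointments
FROM t1
LEFT JOIN (
  -- counts the rows of t1 itself, so the filters are not repeated
  SELECT id, COUNT(*) AS number_of_appointments
  FROM t1
  GROUP BY id
) t2 ON t2.id = t1.id;
```

COUNT(*) counts rows of t1. If your filters can leave patients with no appointment in t1, COUNT(date) would report 0 for them instead of 1.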

MySQL right join two tables

I have three tables: hospitals, inspections, and issues. A hospital can have zero or more inspections. Inspections can have zero or more issues. I need to get a table that has all violations with the hospital that they were observed in as well as the date. Tables look like:
Business Table b
----------------------------
|id|name|address|city|state|
Inspection Table i
---------------------
|id|business_id|date|
Issue Table v
-----------------------------------
|id|business_id|inspection_id|desc|
What I need, ordered by i.date desc is:
Query result
--------------------------------
|b.name|b.address|i.date|v.desc|
There will be more than one issue per inspection so I need a row for each as above. That's what I am getting, but the latest inspection data is returned for every issue even though they were observed on different dates.
Here is what I have for my query:
SELECT b.business_name AS Name, b.address, b.city, b.state, i.date, v.desc
FROM business_table AS b
RIGHT JOIN inspection_table i ON i.business_id = b.id
RIGHT JOIN issue_table v ON v.inspection_id = i.id
ORDER BY i.date DESC
Try this
SELECT b.business_name AS Name, b.address, b.city, b.state, i.date, v.desc
FROM issue_table v
LEFT JOIN inspection_table i ON i.id = v.inspection_id
LEFT JOIN business_table AS b ON v.business_id = b.id
ORDER BY v.business_id, i.date DESC

LEFT OUTER JOIN get max() and include NULL values

I've managed to get the data out and include NULL values by using left outer join. This is my current query:
select s.user, a.id, a.datetime as date, a.total_time
from steam_accounts s
left outer join activity a on a.steam_id = s.id
where s.user_id = 1
This returns this:
Which is almost perfect. But now I need to filter the results with max(a.id) and include null values if there are no matches from the outer join.
Here's what I've tried:
select s.id, s.user, max(a.id), a.datetime as date, a.total_time
from steam_accounts s
left outer join activity a on a.steam_id = s.id
where s.user_id = "1"
Result:
All the null values disappeared. I only wanted to filter out the first two results from the previous query.
This is my desired result:
Any help is much appreciated. Thanks
Alas, MySQL doesn't have OUTER APPLY, and LATERAL joins are only available from MySQL 8.0.14 on, so on older versions this will be less efficient than it could have been. Something like this should produce what you want:
SELECT
s.id
,s.user
,ActivityIDs.MaxActivityID
,activity.datetime as date
,activity.total_time
FROM
steam_accounts s
LEFT JOIN
(
SELECT
a.steam_id
,max(a.id) AS MaxActivityID
FROM activity a
GROUP BY a.steam_id
) AS ActivityIDs
ON ActivityIDs.steam_id = s.id
LEFT JOIN activity ON
activity.id = ActivityIDs.MaxActivityID
WHERE
s.user_id = 1
For each steam_account we find one activity with max ID in the first LEFT JOIN. Then we fetch the rest of activity details using found ID in the second LEFT JOIN.
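On MySQL 8.0.14 or later, lateral derived tables are available, so the same idea can be written as a single lateral join. A sketch, untested against your data; the alias act is mine:

```sql
-- Requires MySQL 8.0.14+ (lateral derived tables).
SELECT
  s.id,
  s.user,
  a.id AS MaxActivityID,
  a.datetime AS date,
  a.total_time
FROM steam_accounts s
LEFT JOIN LATERAL (
  -- for the current steam account, pick the single activity with the highest id
  SELECT act.id, act.datetime, act.total_time
  FROM activity act
  WHERE act.steam_id = s.id
  ORDER BY act.id DESC
  LIMIT 1
) a ON TRUE
WHERE s.user_id = 1;
```

Accounts without any activity still come back, with NULL in the three activity columns, because of the LEFT JOIN ... ON TRUE.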
Use max(coalesce(a.id, 0))
An aggregate over rows where every value is NULL returns NULL, so COALESCE substitutes 0 for accounts that have no activity.
What I can think of would be using ROW_NUMBER() with partitioning functionality as in SQL Server or PostgreSQL. There's an example how to do this in MySQL here:
http://blog.sqlauthority.com/2014/03/09/mysql-reset-row-number-for-each-group-partition-by-row-number/.
Next, I'd partition your result set by user, sort it by date descending, and then take the records where ROW_NUMBER equals 1.
I've given a similar answer here using SQL Server functionality: https://stackoverflow.com/a/30952154/3680098
It should produce the result set you are after.
Use aggregation to calculate the maximum value. Then join in that record using another left join:
select s.user, a.id, a.datetime as date, a.total_time
from steam_accounts s left outer join
(select a.steam_id, max(a.id) as maxid
from activity a
group by a.steam_id
) amax
on amax.steam_id = s.id left outer join
activity a
on a.id = amax.maxid
where s.user_id = 1;

MySQL query and count from other table

I would like to get the data from one table, and count all results from other table, depending on the first table data, here is what I tried:
SELECT
cars.*, (
SELECT
COUNT(*)
FROM
uploads
WHERE
uploads.cid = cars.customer
) AS `count`,
FROM
`cars`
WHERE
customer = 11;
I don't really have an idea why it's not working, as I'm not a regular MySQL user/coder...
Could anyone direct me in the right direction with this one?
SELECT
c.*, COUNT(u.cid) AS count
FROM
cars c
LEFT JOIN
uploads u
ON
u.cid=c.customer
WHERE
c.customer = 11
GROUP BY c.customer;
Try it by joining both tables using LEFT JOIN
SELECT a.customer, COUNT(b.cid) totalCount
FROM cars a
LEFT JOIN uploads b
ON a.customer = b.cid
WHERE a.customer = 11
GROUP BY a.customer
Using COUNT(*) with a LEFT JOIN gives every record a minimum count of 1, because the NULL-extended row is counted too.
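A small sketch of that last point, on two throwaway tables (the demo_ names and the single sample row are made up for illustration):

```sql
-- Hypothetical miniature tables: customer 11 exists but has no uploads.
CREATE TABLE demo_cars (customer INT);
CREATE TABLE demo_uploads (cid INT);
INSERT INTO demo_cars VALUES (11);

SELECT c.customer,
       COUNT(*)     AS count_star,  -- 1: the NULL-extended row is still a row
       COUNT(u.cid) AS count_cid    -- 0: NULL values of u.cid are not counted
FROM demo_cars c
LEFT JOIN demo_uploads u ON u.cid = c.customer
WHERE c.customer = 11
GROUP BY c.customer;
```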
SELECT cars.*, COUNT(uploads.cid) AS uploaded
FROM cars
LEFT OUTER JOIN uploads ON uploads.cid = cars.customer
WHERE cars.customer = 11
GROUP BY cars.customer;
Try this :
SELECT customer, COUNT(cid) totalCount
FROM cars
INNER JOIN uploads
ON (customer = cid)
WHERE customer = 11
GROUP BY customer

MySQL inner join different results

I am trying to work out why the following two queries return different results:
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
INNER JOIN `tblinvoiceitems` it ON it.userid=i.userid
INNER JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
and
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
Obviously the difference is the inner join here, but I don't understand why the query with the inner join returns fewer results than the one without it. Since I didn't reference the other tables in the WHERE clause, I would have thought they should return the same results.
The final query I am working towards is
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
INNER JOIN `tblinvoiceitems` it ON it.userid=i.userid
INNER JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE cf.`fieldid` =5
AND cf.`value`
REGEXP '[A-Za-z]'
AND i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
But because of the different results that seem incorrect when I add the inner join (it removes some results that should be valid) it's not working at present, thanks.
An INNER JOIN only retrieves rows that have a match in both tables of the join.
Try a LEFT JOIN instead. It returns the rows that are in the first table even if they have no match in the second one:
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
LEFT JOIN `tblinvoiceitems` it ON it.userid=i.userid
LEFT JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
INNER JOIN means show only records where the same ID value exists in both tables.
LEFT JOIN means show all records from the left table (i.e. the one that comes first in the SQL statement) regardless of the existence of matching records in the right table.
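To check which invoices the inner joins are dropping, something like the following may help. It is a sketch against the tables from the question: it lists the invoices in the date range whose userid has no invoice item with a matching custom field value, which are exactly the rows the INNER JOIN version cannot return.

```sql
SELECT i.id, i.date
FROM `tblinvoices` i
WHERE i.`tax` = 0
  -- September has 30 days, so the range ends on the 30th here
  AND i.`date` BETWEEN '2012-07-01' AND '2012-09-30'
  AND NOT EXISTS (
        -- any item of the same user that also has a custom field value
        SELECT 1
        FROM `tblinvoiceitems` it
        INNER JOIN `tblcustomfieldsvalues` cf ON cf.relid = it.relid
        WHERE it.userid = i.userid);
```

Keep in mind that a WHERE condition on a column of an outer-joined table (such as cf.`fieldid` = 5 in the final query) discards the NULL-extended rows again, so a LEFT JOIN with that filter behaves like an INNER JOIN.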
Try LEFT Join instead of INNER JOIN
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
LEFT JOIN `tblinvoiceitems` it ON it.userid=i.userid
LEFT JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'