I am trying to work out why the following two queries return different results:
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
INNER JOIN `tblinvoiceitems` it ON it.userid=i.userid
INNER JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
and
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
Obviously the difference is the inner join, but I don't understand why the query with the inner join returns fewer results than the one without it. Since I don't reference the joined tables anywhere else in the query, I would have thought both should return the same results.
The final query I am working towards is
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
INNER JOIN `tblinvoiceitems` it ON it.userid=i.userid
INNER JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE cf.`fieldid` =5
AND cf.`value`
REGEXP '[A-Za-z]'
AND i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
But adding the inner join removes some results that should be valid, so it's not working at present. Thanks.
An INNER JOIN retrieves only rows that have a match in both tables of the join.
Try a LEFT JOIN instead. This returns rows that are in the first table but not necessarily in the second one:
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
LEFT JOIN `tblinvoiceitems` it ON it.userid=i.userid
LEFT JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
INNER JOIN means show only records where the same ID value exists in both tables.
LEFT JOIN means show all records from the left table (i.e. the one that comes first in the SQL statement) regardless of the existence of matching records in the right table.
Try LEFT JOIN instead of INNER JOIN:
SELECT DISTINCT i.id, i.date
FROM `tblinvoices` i
LEFT JOIN `tblinvoiceitems` it ON it.userid=i.userid
LEFT JOIN `tblcustomfieldsvalues` cf ON it.relid=cf.relid
WHERE i.`tax` = 0
AND i.`date` BETWEEN '2012-07-01' AND '2012-09-31'
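A small sketch of why the INNER JOIN drops invoices, using Python's standard-library sqlite3 module. The simplified tables and rows are assumptions, and SQLite stands in for MySQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, userid INTEGER);
    CREATE TABLE invoiceitems (id INTEGER PRIMARY KEY, userid INTEGER);
    -- Invoice 1 belongs to user 10, who has items.
    -- Invoice 2 belongs to user 20, who has no items at all.
    INSERT INTO invoices VALUES (1, 10), (2, 20);
    INSERT INTO invoiceitems VALUES (100, 10), (101, 10);
""")

# INNER JOIN keeps an invoice only if at least one item row matches.
inner_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT i.id
    FROM invoices i
    INNER JOIN invoiceitems it ON it.userid = i.userid
    ORDER BY i.id
""")]

# LEFT JOIN keeps every invoice, matched or not.
left_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT i.id
    FROM invoices i
    LEFT JOIN invoiceitems it ON it.userid = i.userid
    ORDER BY i.id
""")]

print(inner_ids)  # [1]
print(left_ids)   # [1, 2]
```

Even though no column of the joined table is selected or filtered, the INNER JOIN itself acts as a filter. Note that a WHERE condition on a column of an outer-joined table (such as cf.fieldid = 5 in the final query) discards the NULL-extended rows again, so such a condition would probably have to move into the ON clause to keep unmatched invoices.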
So I have a large subquery and I would like to join on that subquery while using the result of the subquery in the join.
For example, I have a table called patient and one called appointment, and I would like to get the number of appointments per patient with given criteria.
Right now I am doing something like this:
SELECT
t1.*
FROM
(
SELECT
patient.name,
patient.id,
appointment.date
FROM
patient
LEFT JOIN appointment ON appointment.patient_id = patient.id
WHERE
/* a **lot** of filters, additional joins, etc*/
) t1
LEFT JOIN (
SELECT
COUNT(*) number_of_appointments,
patient.id
FROM
patient
LEFT JOIN appointment ON appointment.patient_id = patient.id
GROUP BY
patient.id
) t2 ON t1.id = t2.id
The problem is that this returns the number of appointments for each patient independent from the subquery above it. I tried writing the join as this:
LEFT JOIN (
SELECT
COUNT(*) number_of_appointments,
patient.id
FROM
t1
GROUP BY
patient.id
)
But obviously I'm getting an error saying that table t1 doesn't exist. Is there any way for me to do this cleanly without having to repeat all of the filters from t1 in t2?
Thanks!
Why not use window functions?
SELECT p.name, p.id, a.date,
COUNT(a.patient_id) OVER (PARTITION BY p.id) as num_appointments
FROM patient p LEFT JOIN
appointment a
ON a.patient_id = p.id
WHERE . . .
This provides the count based on the WHERE filtering. If you wanted a count of all appointments, then do the calculation before applying the WHERE:
SELECT p.name, p.id, a.date,
COALESCE(a.cnt, 0) as num_total_appointments,
COUNT(a.patient_id) OVER (PARTITION BY p.id) as num_matching_appointments
FROM patient p LEFT JOIN
(SELECT a.*,
COUNT(*) OVER (PARTITION BY a.patient_id) as cnt
FROM appointment a
) a
ON a.patient_id = p.id
WHERE . . .
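To see that the window count reflects only the rows that survive the WHERE clause, here is a minimal sketch using Python's standard-library sqlite3 module. The sample rows, the date filter, and the `OR a.id IS NULL` guard (which keeps patients without appointments) are assumptions made for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE appointment (id INTEGER PRIMARY KEY, patient_id INTEGER, date TEXT);
    -- Ann has three appointments (one before 2020); Bob has none.
    INSERT INTO patient VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO appointment VALUES
        (1, 1, '2020-01-05'), (2, 1, '2020-02-10'), (3, 1, '2019-12-01');
""")

# The window function is evaluated after WHERE, so it counts only
# the appointments that pass the (hypothetical) date filter.
rows = conn.execute("""
    SELECT p.name, a.date,
           COUNT(a.patient_id) OVER (PARTITION BY p.id) AS num_appointments
    FROM patient p
    LEFT JOIN appointment a ON a.patient_id = p.id
    WHERE a.date >= '2020-01-01' OR a.id IS NULL
    ORDER BY p.id, a.date
""").fetchall()

for row in rows:
    print(row)
```

Ann's 2019 appointment is filtered out before the count is taken, so her rows report 2 rather than 3, and Bob still appears with a count of 0.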
I'm struggling to write a query that returns all categories regardless of the filter I have applied, with a count that changes based on how many recipes match that filter.
The query works nicely if I don't apply any filters: the counts seem right. But as soon as I add something like where c.parent_id is not null and r.time_cook_minutes > 60, most of the categories are filtered out instead of being returned with a count of zero.
Here's an example query that I came up with that does not work the way I want it to:
select t.id, t.name, t.parent_id, a.cntr from categories as t,
(select c.id, count(*) as cntr from categories as c
inner join recipe_categories as rc on rc.category_id = c.id
inner join recipes as r on r.id = rc.recipe_id
where c.parent_id is not null and r.time_cook_minutes > 60
group by c.id) as a
where a.id = t.id
group by t.id
so this currently, as you might imagine, returns only the counts of recipes that exist in this filter subset... what I'd like is to get all of them regardless of the filter with a count of 0 if they don't have any recipes under that filter.
any help with this would be greatly appreciated. If this question is not super clear let me know, and I can elaborate.
No need for nested join if you move the condition into a regular outer join:
select t.id, t.name, t.parent_id, count(r.id)
from categories as t
left join recipe_categories as rc on rc.category_id = t.id
left join recipes as r on r.id = rc.recipe_id
  and r.time_cook_minutes > 60
where t.parent_id is not null
group by 1, 2, 3
Notes:
Use left joins so you always get every category
Put r.time_cook_minutes > 60 in the left join condition. Leaving it in the where clause cancels the effect of the left join, because rows with no matching recipe have NULL there and fail the comparison.
Simply use conditional aggregation, moving the WHERE clause into a CASE (or IF() for MySQL) statement wrapped in a SUM() of 1's and 0's (i.e., counts). Also, be sure to consistently use the explicit join, the current industry practice in SQL. While your derived table uses this form of join, the outer query uses implicit join matching IDs in WHERE clause.
select t.id, t.name, t.parent_id, a.cntr
from categories as t
inner join
(select c.id, sum(case when c.parent_id is not null and r.time_cook_minutes > 60
then 1
else 0
end) as cntr
from categories as c
inner join recipe_categories as rc on rc.category_id = c.id
inner join recipes as r on r.id = rc.recipe_id
group by c.id) as a
on a.id = t.id
group by t.id
I believe you want:
select c.id, c.name, c.parent_id, count(r.id)
from categories c left join
recipe_categories rc
on rc.category_id = c.id left join
recipes r
on r.id = rc.recipe_id and r.time_cook_minutes > 60
where c.parent_id is not null
group by c.id, c.name, c.parent_id;
Notes:
This uses left joins for all the joins.
It aggregates by all the non-aggregated columns.
It counts matching recipes rather than all rows.
The condition on recipes is moved to the on clause from the where clause.
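The difference between filtering in the on clause and in the where clause can be checked with a small sketch. It uses Python's standard-library sqlite3 module with made-up sample data; the table and column names follow the question, and the rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    CREATE TABLE recipes (id INTEGER PRIMARY KEY, time_cook_minutes INTEGER);
    CREATE TABLE recipe_categories (recipe_id INTEGER, category_id INTEGER);
    INSERT INTO categories VALUES
        (1, 'Main', NULL), (2, 'Soups', 1), (3, 'Stews', 1), (4, 'Salads', 1);
    INSERT INTO recipes VALUES (1, 90), (2, 30), (3, 120);
    -- Salads only contains the 30-minute recipe.
    INSERT INTO recipe_categories VALUES (1, 2), (2, 2), (3, 3), (2, 4);
""")

# Filter in the ON clause: unmatched categories are kept with a count of 0.
filter_in_on = conn.execute("""
    SELECT c.id, c.name, COUNT(r.id)
    FROM categories c
    LEFT JOIN recipe_categories rc ON rc.category_id = c.id
    LEFT JOIN recipes r ON r.id = rc.recipe_id AND r.time_cook_minutes > 60
    WHERE c.parent_id IS NOT NULL
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# Same filter in the WHERE clause: the NULL-extended rows are discarded.
filter_in_where = conn.execute("""
    SELECT c.id, c.name, COUNT(r.id)
    FROM categories c
    LEFT JOIN recipe_categories rc ON rc.category_id = c.id
    LEFT JOIN recipes r ON r.id = rc.recipe_id
    WHERE c.parent_id IS NOT NULL AND r.time_cook_minutes > 60
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

print(filter_in_on)     # includes Salads with a count of 0
print(filter_in_where)  # Salads is missing
```

With the filter in the on clause, Salads comes back with a count of 0. With it in the where clause, Salads disappears from the result entirely.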
I have a table of establishments and I want to return a result set with the latest inspection date from the inspections table. Right now I have:
SELECT business_table.business_name, business_table.address, inspection_table.date
FROM business_table
LEFT JOIN inspection_table ON business_table.id = inspection_table.business_id
WHERE inspection_table.date = (
SELECT MAX(date)
FROM inspection_table)
The problem is I get only one result with the latest inspection date. I need all of the establishments returned. The query will need to be efficient because I have about 600K establishments and 3Million inspections.
You are using an outer join. When a business record has no matching inspection record, an empty inspection record is created and joined instead, so that outer-joined inspection record has NULL in every column. Then you have WHERE inspection_table.date = (...). This dismisses all outer-joined records again, because NULL never matches.
Use AND instead of WHERE, to make the condition part of the ON clause:
SELECT
b.business_name,
b.address,
i.date
FROM business_table b
LEFT JOIN inspection_table i
ON i.business_id = b.id
AND i.date = (SELECT MAX(date) FROM inspection_table);
Ah, you are not looking for the latest inspection date at all. You are looking for the latest inspection date per business.
In your solution you are still using an outer join that doesn't work. Don't do that. Either use an outer join and use it properly or use an inner join if you are fine with that.
Here is how to get the latest inspection per business with an outer join (so you also show businesses that have had no inspection, yet):
SELECT
b.business_name,
b.address,
i.date,
...
FROM business_table b
LEFT JOIN inspection_table i
ON i.business_id = b.id
AND (i.business_id, i.date) IN
(
SELECT business_id, MAX(date)
FROM inspection_table
GROUP BY business_id
);
Same with joins:
SELECT
b.business_name,
b.address,
i.date,
...
FROM business_table b
LEFT JOIN
(
SELECT business_id, MAX(date) AS date
FROM inspection_table
GROUP BY business_id
) latest ON latest.business_id = b.id
LEFT JOIN inspection_table i
ON i.business_id = latest.business_id
AND i.date = latest.date;
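The join variant can be tried on a few rows with the sketch below, which uses Python's standard-library sqlite3 module. The sample data and the rating column values are assumptions. The index on (business_id, date) is also an assumption: it is a common way to speed up a per-group MAX on large tables, but whether it helps should be measured on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business_table (id INTEGER PRIMARY KEY, business_name TEXT);
    CREATE TABLE inspection_table (
        id INTEGER PRIMARY KEY, business_id INTEGER, date TEXT, rating TEXT);
    -- Assumed index; likely useful for the GROUP BY / MAX and the join back.
    CREATE INDEX idx_inspection_business_date
        ON inspection_table (business_id, date);
    INSERT INTO business_table VALUES (1, 'Cafe A'), (2, 'Diner B'), (3, 'Bakery C');
    -- Cafe A has two inspections, Diner B one, Bakery C none.
    INSERT INTO inspection_table VALUES
        (1, 1, '2012-01-10', 'B'), (2, 1, '2012-06-15', 'A'), (3, 2, '2012-03-01', 'C');
""")

latest_rows = conn.execute("""
    SELECT b.business_name, i.date, i.rating
    FROM business_table b
    LEFT JOIN (
        SELECT business_id, MAX(date) AS date
        FROM inspection_table
        GROUP BY business_id
    ) latest ON latest.business_id = b.id
    LEFT JOIN inspection_table i
        ON i.business_id = latest.business_id
       AND i.date = latest.date
    ORDER BY b.id
""").fetchall()

for row in latest_rows:
    print(row)
```

Each business appears once with its most recent inspection, and Bakery C, which has never been inspected, is still listed with NULL values.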
I ended up using the following:
SELECT business_table.business_name AS Name, business_table.address AS Address, business_table.city AS City, business_table.state AS Province, inspection_table.rating AS Rating, inspection_table.date AS "Inspected"
FROM business_table
LEFT JOIN inspection_table ON business_table.id = inspection_table.business_id
WHERE inspection_table.date = (
SELECT MAX(inspection_table.date)
FROM inspection_table
WHERE business_table.id = inspection_table.business_id)
ORDER BY inspection_table.date DESC
Here is my SQL Fiddle: http://sqlfiddle.com/#!2/63178/1. What's wrong with my query?
SELECT DISTINCT curr.id,curr.curr_tittle, curr.curr_desc
FROM wp_curriculum curr LEFT JOIN (SELECT DISTINCT * FROM wp_curriculum_topic WHERE curr_topic IN (4,12)) AS A ON A.curr_id = curr.id ORDER BY A.id
If you are looking for rows that match in both tables, just replace LEFT JOIN with INNER JOIN. Otherwise your query is already showing the expected result for a LEFT JOIN.
SQL Query with INNER JOIN:
SELECT DISTINCT curr.id,curr.curr_tittle, curr.curr_desc FROM wp_curriculum curr INNER JOIN (SELECT DISTINCT * FROM wp_curriculum_topic WHERE curr_topic IN (4,12)) AS A ON curr.id = A.curr_id ORDER BY A.id
Your query works as expected. Could it be you are mixing ID and CURR_ID?
I have the following query which makes 2 inner joins. This works fine unless there are no entries for the account_id in the ratings table.
SELECT c.comment_id, a.account_id, a.first_name, a.second_name, a.points, a.image_url, c.body, c.creation_time, AVG(r.rating_overall)
FROM comments AS c
INNER JOIN accounts AS a
ON c.account_id=a.account_id
INNER JOIN ratings AS r
ON r.baker_id=a.account_id
WHERE c.blog_id = ?
GROUP BY c.comment_id, a.account_id, a.first_name, a.second_name, a.points, a.image_url, c.body, c.creation_time
ORDER BY c.creation_time DESC
How do I make this query return a result even if there are no entries in the ratings table? In other words, how do I produce AVG(r.rating_overall) = 0 whenever there are no ratings?
You should use a LEFT JOIN:
SELECT
...
FROM comments AS c
INNER JOIN accounts AS a
ON c.account_id=a.account_id
LEFT JOIN ratings AS r
ON r.baker_id=a.account_id
....
That will return all rows from the previous join, plus only the rows of the ratings table that match. If there's no match, all columns from the ratings table will be NULL.
You can learn more about joins in this visual explanation of SQL joins.
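Since AVG over no rows yields NULL rather than 0, the LEFT JOIN alone does not give the requested 0. One option is to wrap the aggregate in COALESCE, sketched here with Python's standard-library sqlite3 module. The reduced column list and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY, account_id INTEGER, blog_id INTEGER, body TEXT);
    CREATE TABLE ratings (baker_id INTEGER, rating_overall INTEGER);
    INSERT INTO accounts VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO comments VALUES (1, 1, 7, 'Nice'), (2, 2, 7, 'Yum');
    -- Only Ann has ratings; Bob has none.
    INSERT INTO ratings VALUES (1, 4), (1, 5);
""")

avg_rows = conn.execute("""
    SELECT c.comment_id, a.first_name,
           COALESCE(AVG(r.rating_overall), 0) AS avg_rating
    FROM comments AS c
    INNER JOIN accounts AS a ON c.account_id = a.account_id
    LEFT JOIN ratings AS r ON r.baker_id = a.account_id
    WHERE c.blog_id = ?
    GROUP BY c.comment_id, a.first_name
    ORDER BY c.comment_id
""", (7,)).fetchall()

for row in avg_rows:
    print(row)
```

Bob's comment is returned with an average of 0 instead of being dropped, while Ann's average is computed from her two ratings.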