Left Join when looking for no matches - mysql

I have 3 tables, and I am pretty sure I need a left join when joining the 3rd. The join between the first 2 tables, however, just needs to be a regular join, I think, and I'm not sure whether that requires some kind of nesting.
So the first 2 tables (once I set my conditions) should always work out to a 1 to 1 relationship. Then I need to join that to the 3rd table, but I need to know when there is no match (which means I need a left join here). Essentially that's all I need to know in this query: I want to filter on table 3 and only show rows where it is NULL. Furthermore, the table 3 records I'm checking for have a date field, and I only want to know that there is NO record for today's date in table 3.
At the end of the day I want to know when no record exists in table 3 for today's date for the primary key in table 1/2 (because of the way that first join works, even though there is a primary/foreign key relation, the primary key of table 1 is all that matters when matching on table 3).
Query:
SELECT
subscribers.*,
check_times.*
FROM subscribers, check_times
LEFT JOIN checks
ON checks.subscriber_id = check_times.subscriber_id
WHERE
subscribers.subscriber_id = check_times.subscriber_id
AND check_times.dow = 1
AND check_times.time <= '19:52'
AND checks.date = '2015-02-16'
AND checks.status IS NULL

Since you're referencing the LEFT JOINed table in the WHERE clause, the join effectively became an INNER JOIN. What you want is to put the filter on checks.date in the JOIN condition.
SELECT
subscribers.*,
check_times.*
FROM subscribers
INNER JOIN check_times
ON subscribers.subscriber_id = check_times.subscriber_id
LEFT JOIN checks
ON checks.subscriber_id = check_times.subscriber_id
AND checks.date = '2015-02-16'
WHERE
check_times.dow = 1
AND check_times.time <= '19:52'
AND checks.status IS NULL
Also, please refrain from using old-style (comma) joins. Read this for more info:
Bad habits to kick : using old-style JOINs
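To see the difference between filtering in WHERE and filtering in ON, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the sample rows are made up: subscriber 1 has a check for 2015-02-16, subscriber 2 only has one for the day before.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, and the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscribers (subscriber_id INTEGER PRIMARY KEY);
CREATE TABLE check_times (subscriber_id INTEGER, dow INTEGER, time TEXT);
CREATE TABLE checks (subscriber_id INTEGER, date TEXT, status TEXT);
INSERT INTO subscribers VALUES (1), (2);
INSERT INTO check_times VALUES (1, 1, '08:00'), (2, 1, '09:00');
INSERT INTO checks VALUES (1, '2015-02-16', 'ok'), (2, '2015-02-15', 'ok');
""")

# Date filter in WHERE: the NULL-extended rows can never satisfy it.
date_in_where = conn.execute("""
SELECT s.subscriber_id
FROM subscribers s
INNER JOIN check_times ct ON ct.subscriber_id = s.subscriber_id
LEFT JOIN checks c ON c.subscriber_id = ct.subscriber_id
WHERE ct.dow = 1 AND ct.time <= '19:52'
  AND c.date = '2015-02-16' AND c.status IS NULL
""").fetchall()

# Date filter in ON: subscribers without a check for that date survive
# the join with NULLs in the checks columns.
date_in_on = conn.execute("""
SELECT s.subscriber_id
FROM subscribers s
INNER JOIN check_times ct ON ct.subscriber_id = s.subscriber_id
LEFT JOIN checks c ON c.subscriber_id = ct.subscriber_id
                  AND c.date = '2015-02-16'
WHERE ct.dow = 1 AND ct.time <= '19:52'
  AND c.status IS NULL
""").fetchall()

print(date_in_where)  # no rows
print(date_in_on)     # only subscriber 2, who has no check for that date
```

With this data the first form returns nothing, while the second returns the subscriber who is missing a check for the date.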

Related

Select all records in MySQL unless record exist in another table

I have two MySQL tables, called match_rail and match_complete.
When a bill_number from match_rail is actioned, the record moves to the match_complete table and should no longer be displayed in the match_rail table.
The match_rail table is refreshed hourly. Therefore I need to make sure not to display the same bill_number if it already exists in the match_complete table.
Here is the query:
SELECT
mr.RAMP,
mr.ETA,
mr.BILL_NUMBER
-- few more columns
FROM
matchback_rail mr
JOIN
matchback_complete mc ON mr.BILL_NUMBER = mc.BILL_NUMBER
The above query gives me 0 records. It should give me all records except the ones that exist in both tables.
Not sure if I should be using a JOIN or LEFT JOIN.
Try this query:
SELECT
mr.RAMP,
mr.ETA,
mr.BILL_NUMBER
-- few more columns
FROM
matchback_rail mr
WHERE NOT EXISTS(SELECT 1 FROM matchback_complete
WHERE BILL_NUMBER = mr.BILL_NUMBER)
You want to use a LEFT JOIN; this gives all records in mr, even if there is nothing joined. Then use WHERE to keep only the rows that found no match in mc.
SELECT
mr.RAMP,
mr.ETA,
mr.BILL_NUMBER
-- few more columns
FROM
matchback_rail mr
LEFT JOIN
matchback_complete mc ON mr.BILL_NUMBER = mc.BILL_NUMBER
WHERE mc.BILL_NUMBER IS NULL
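A quick way to convince yourself that the NOT EXISTS form and the LEFT JOIN ... IS NULL form pick the same rows is to run both on a tiny data set. This sketch assumes SQLite (via Python's sqlite3) instead of MySQL, and three made-up bills of which only B2 has been completed.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the bills below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE matchback_rail (BILL_NUMBER TEXT, RAMP TEXT, ETA TEXT);
CREATE TABLE matchback_complete (BILL_NUMBER TEXT);
INSERT INTO matchback_rail VALUES
  ('B1', 'R1', '2015-01-01'),
  ('B2', 'R2', '2015-01-02'),
  ('B3', 'R3', '2015-01-03');
INSERT INTO matchback_complete VALUES ('B2');
""")

not_exists = conn.execute("""
SELECT mr.BILL_NUMBER
FROM matchback_rail mr
WHERE NOT EXISTS (SELECT 1 FROM matchback_complete mc
                  WHERE mc.BILL_NUMBER = mr.BILL_NUMBER)
ORDER BY mr.BILL_NUMBER
""").fetchall()

left_join_is_null = conn.execute("""
SELECT mr.BILL_NUMBER
FROM matchback_rail mr
LEFT JOIN matchback_complete mc ON mr.BILL_NUMBER = mc.BILL_NUMBER
WHERE mc.BILL_NUMBER IS NULL
ORDER BY mr.BILL_NUMBER
""").fetchall()

# The plain JOIN from the question keeps only bills present in both tables.
inner_join = conn.execute("""
SELECT mr.BILL_NUMBER
FROM matchback_rail mr
JOIN matchback_complete mc ON mr.BILL_NUMBER = mc.BILL_NUMBER
""").fetchall()

print(not_exists, left_join_is_null, inner_join)
```

On this sample both anti-join forms return B1 and B3, while the plain JOIN returns only B2, which is the opposite of what the question asks for.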

How to use left outer join in mysql to get my desirable results?

I have made an sql fiddle. Users with id 100 and 118 should have 0 records for assigned_scopes, assigned_qa, failed_qa and assigned_canvass. But it does not display that data; it just shows the record for the user with id 210. What I really need is to display all the users, with 0 in each column if they have nothing. How can I do this?
What I have tried since is in this fiddle. It works according to my requirements, but there is a problem with query optimization: its execution time is not what I want. With the second fiddle query it took more than 33 seconds to load the page (which is not good at all) because of the huge amount of data. The query in the first fiddle executed in 2 seconds even with the same data. How can I correct the first query to give the results of the second query while (hopefully) staying fast?
The "ten second explanation" for why most queries with LEFT JOIN do not display the rows you expect is that you've put one of the left joined tables into the WHERE clause, demanding a value. This automatically makes the left join behave like an inner join.
You either have to say WHERE leftjoinedtable.column = value OR leftjoinedtable.column IS NULL or, preferably, put the condition in the ON clause (it doesn't need a qualifying "OR xxx IS NULL", which makes your query simpler and more readable).
Move all the conditions in your WHERE clause out and into the respective ON conditions, so your query no longer has a WHERE clause at all. Throw your latter attempt away; it has a Cartesian join and subselects in the main select list. In a production system it could be generating millions more rows than needed, selecting extra data based on them, and then discarding them. That is a very difficult query style to optimise and usually unnecessary.
Edit: I had another look at your fiddle and noticed you also GROUP BY a column that could be null. When it is, those rows collapse into a single NULL group instead of showing up once per user.
In terms of how you should write queries, consider first which tables will always have all the values. In your case these are the user and teams tables. These should be inner joined together first; then the other tables, which may not always have a matching row, should be added with LEFT JOIN. RIGHT JOIN is seldom needed.
Alter the order of your tables so it's teams, inner join users, then left join the two tables you're counting stats on. When you group by, group on the user id from the users table, not the stats table.
I did this with your fiddles but couldn't find a way to save them as a new fiddle on the iPad, and couldn't copy the text, but here's a screenshot of a query that does what you require. I'll leave the typing as an exercise for the reader :) Don't forget to adjust the group by.
Note: now that the users table is inner joined, the = 15 filter could go in the WHERE clause. I'll leave the general advice though, to encourage and remind you that a) it works just as well in the ON clause and b) you should prefer putting these things in the ON clause, because it means left joins work as you expect.
Read a definition of LEFT JOIN ON. It returns the rows that INNER JOIN ON does, plus unmatched rows in the left table extended by nulls.
Re your 1st link query: if you want all the records from user, then it must be the leftmost table in your left joins. You must not remove rows that might have null values by putting tests that require non-null values in the WHERE clause. (Why do you?) The following returns rows only for users with role 15, per the 2nd link query; you don't explain why 15 is tested for in the 1st.
select u.user_id,
ut.name as team_name,
/* ... */
count(case when a.status = 2 AND a.qc_id = 0 and o.class_id= 3 then 1 else null end)
AS assigned_canvass
from am_user u
left join user_team ut
on u.user_team_id = ut.user_team_id
left join am_ts_assignment a
on u.user_id = a.tech_id
left join am_ts_order o
on o.assignment_id = a.assignment_id
where u.user_role_id = 15
group by u.user_id
order by u.user_id asc
Compose your queries incrementally and test as you add joins & columns.
(This is the same query I gave on your last post re essentially your 1st link query except it returns a row for every am_user instead of every am_ts_assignment, per each post. The where might have been an and in the first left join with am_user; either will do here.)
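Here is a cut-down sketch of the "zero for users with nothing" behaviour. It assumes SQLite (via Python's sqlite3) rather than MySQL, keeps only two of the tables with a hypothetical reduced set of columns, and uses made-up rows in which only user 210 has assignments.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema and rows are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE am_user (user_id INTEGER PRIMARY KEY, user_role_id INTEGER);
CREATE TABLE am_ts_assignment (
  assignment_id INTEGER PRIMARY KEY, tech_id INTEGER, status INTEGER);
INSERT INTO am_user VALUES (100, 15), (118, 15), (210, 15);
INSERT INTO am_ts_assignment VALUES (1, 210, 2), (2, 210, 2), (3, 210, 1);
""")

# Filtering the left-joined table in WHERE drops users without assignments.
filter_in_where = conn.execute("""
SELECT u.user_id, COUNT(a.assignment_id) AS assigned
FROM am_user u
LEFT JOIN am_ts_assignment a ON a.tech_id = u.user_id
WHERE a.status = 2
GROUP BY u.user_id
ORDER BY u.user_id
""").fetchall()

# Moving the test into a conditional COUNT keeps every user; COUNT ignores
# the NULLs produced by CASE, so users with nothing get 0.
conditional_count = conn.execute("""
SELECT u.user_id,
       COUNT(CASE WHEN a.status = 2 THEN 1 END) AS assigned
FROM am_user u
LEFT JOIN am_ts_assignment a ON a.tech_id = u.user_id
WHERE u.user_role_id = 15
GROUP BY u.user_id
ORDER BY u.user_id
""").fetchall()

print(filter_in_where)
print(conditional_count)
```

The WHERE on am_user is harmless because am_user is the leftmost table; the first query shows only user 210, while the second lists all three users with 0 for 100 and 118.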

Replace the id's with name using single query

SELECT team_with.participant1,team_with.participant2,team_with.participant3
FROM event,team_with
WHERE team_with.for_event_no=event.event_no AND
event.event_no=4 AND
team_with.participant1=9 OR
team_with.participant2=9 OR
team_with.participant3=9;
I have written this query and obtained the required ids in a row. I am not able to modify it so that, in place of these ids, the names connected to the ids are displayed.
The student_detail table consists of a PK (sam_id) and the attribute name.
The ids displayed by the present query are FKs connected to student_detail.sam_id.
It seems like a bad design to use multiple columns to store the different participants. Consider creating a separate row for each participant and storing them in a table of their own. Your joining logic would also be easier.
Also, please use explicit JOIN syntax: it makes the query clearer and easier to understand by separating the join logic from the conditions for data retrieval.
Remember that the AND operator has higher precedence than OR, so your event.event_no = 4 applies only to the participant1 condition, not to each participant condition. I believe this was a mistake, but you are the one to judge.
As to the query itself, you could put the OR conditions into a single join, or simply join the student_detail table three times.
SELECT
s1.name,
s2.name,
s3.name
FROM
event e
INNER JOIN team_with t ON t.for_event_no = e.event_no
LEFT JOIN student_detail s1 ON s1.sam_id = t.participant1
LEFT JOIN student_detail s2 ON s2.sam_id = t.participant2
LEFT JOIN student_detail s3 ON s3.sam_id = t.participant3
WHERE
e.event_no = 4
AND ( t.participant1=9 OR t.participant2=9 OR t.participant3=9 );
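The precedence point is easy to check on a few rows. This sketch assumes SQLite (via Python's sqlite3) in place of MySQL and three invented teams: participant 9 is in one team for event 4 and in one team for event 5.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the teams below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team_with (
  for_event_no INTEGER,
  participant1 INTEGER, participant2 INTEGER, participant3 INTEGER);
INSERT INTO team_with VALUES (4, 9, 2, 3), (5, 1, 9, 3), (4, 1, 2, 3);
""")

# Parsed as (event = 4 AND p1 = 9) OR p2 = 9 OR p3 = 9.
without_parentheses = conn.execute("""
SELECT for_event_no FROM team_with
WHERE for_event_no = 4 AND participant1 = 9
   OR participant2 = 9
   OR participant3 = 9
ORDER BY for_event_no
""").fetchall()

# Parentheses make the event filter apply to all three participant tests.
with_parentheses = conn.execute("""
SELECT for_event_no FROM team_with
WHERE for_event_no = 4
  AND (participant1 = 9 OR participant2 = 9 OR participant3 = 9)
ORDER BY for_event_no
""").fetchall()

print(without_parentheses)  # includes the event 5 team
print(with_parentheses)     # event 4 only
```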

Dependant SubQuery v Left Join

This query displays the correct result, but when doing an EXPLAIN, it is listed as a "DEPENDENT SUBQUERY", which I'm led to believe is bad?
SELECT Competition.CompetitionID, Competition.CompetitionName, Competition.CompetitionStartDate
FROM Competition
WHERE CompetitionID NOT
IN (
SELECT CompetitionID
FROM PicksPoints
WHERE UserID =1
)
I tried changing the query to this:
SELECT Competition.CompetitionID, Competition.CompetitionName, Competition.CompetitionStartDate
FROM Competition
LEFT JOIN PicksPoints ON Competition.CompetitionID = PicksPoints.CompetitionID
WHERE UserID =1
and PicksPoints.PicksPointsID is null
but it displays 0 rows. What is wrong with the above compared to the first query that actually does work?
The second query cannot produce rows. It says:
WHERE UserID =1
and PicksPoints.PicksPointsID is null
To make the problem clearer, I'll rewrite it with the table name spelled out:
WHERE PicksPoints.UserID =1
and PicksPoints.PicksPointsID is null
So, on one hand, you are asking for rows from PicksPoints where UserID = 1, but on the other you expect that row not to exist in the first place. Can you see the contradiction?
Outer joins are tricky that way! Usually you filter using columns from the "outer" table, for example Competition. But you do not wish to do so; you wish to filter on the left-joined table. Try rewriting it as follows:
SELECT Competition.CompetitionID, Competition.CompetitionName, Competition.CompetitionStartDate
FROM Competition
LEFT JOIN PicksPoints ON (Competition.CompetitionID = PicksPoints.CompetitionID AND PicksPoints.UserID = 1)
WHERE
PicksPoints.PicksPointsID is null
For more on this, read this nice post.
But, as an additional note, performance-wise you're in some trouble with either the subquery or the left join.
With the subquery you're in trouble because up to 5.6 (where some good work has been done), MySQL is very bad at optimizing inner queries, and your subquery is expected to execute multiple times.
With the LEFT JOIN you are in trouble since a LEFT JOIN dictates the order of the join from left to right. Yet your filtering is on the right table, which means you will not be able to use an index for filtering the UserID = 1 condition (or you would, and lose the index for the join).
These are two different queries. The first query looks for competitions that are not associated with user id 1 (via the PicksPoints table), while the second joins with those rows that are associated with user id 1 and that in addition have a null PicksPointsID.
The second query is coming out empty because you are joining against a table called PicksPoints and you are looking for rows in the join result that have PicksPointsID as null. This can only happen if
1. The second table had a row with a null PicksPointsID and a competition id that matched a competition id in the first table, or
2. All the columns in the second table's contribution to the join are null because there is a competition id in the first table that did not appear in the second.
Since PicksPointsID really sounds like a primary key, it's case 2 that is showing up. In those rows all the columns from PicksPoints are null, so UserID = 1 can never be true for them; your where clause (UserID=1 and PicksPoints.PicksPointsID is null) rejects every row and your result will be empty.
A plain left join should work for you
select c.CompetitionID, c.CompetitionName, c.CompetitionStartDate
from Competition c
left join PicksPoints p
on (c.CompetitionID = p.CompetitionID)
where p.UserID <> 1
Replacing the final where with an and (making a complex join clause) might also work. I'll leave it to you to analyze the plans for each query. :)
I'm not personally convinced of the need for the is null test. The article linked to by Shlomi Noach is excellent and you may find some tips in there to help you with this.
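To compare the variants discussed in this thread, here is a small sketch. It assumes SQLite (via Python's sqlite3) rather than MySQL, and made-up data in which user 1 has picks in competitions 1 and 2, user 2 has a pick in competition 2, and nobody has picked in competition 3.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Competition (CompetitionID INTEGER PRIMARY KEY, CompetitionName TEXT);
CREATE TABLE PicksPoints (
  PicksPointsID INTEGER PRIMARY KEY, CompetitionID INTEGER, UserID INTEGER);
INSERT INTO Competition VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO PicksPoints VALUES (1, 1, 1), (2, 2, 1), (3, 2, 2);
""")

not_in = conn.execute("""
SELECT c.CompetitionID FROM Competition c
WHERE c.CompetitionID NOT IN
      (SELECT CompetitionID FROM PicksPoints WHERE UserID = 1)
ORDER BY c.CompetitionID
""").fetchall()

# User filter in ON, IS NULL test in WHERE.
on_filter_is_null = conn.execute("""
SELECT c.CompetitionID FROM Competition c
LEFT JOIN PicksPoints p
  ON c.CompetitionID = p.CompetitionID AND p.UserID = 1
WHERE p.PicksPointsID IS NULL
ORDER BY c.CompetitionID
""").fetchall()

# Plain left join filtered with <> 1 and no IS NULL test.
where_not_equal = conn.execute("""
SELECT c.CompetitionID FROM Competition c
LEFT JOIN PicksPoints p ON c.CompetitionID = p.CompetitionID
WHERE p.UserID <> 1
ORDER BY c.CompetitionID
""").fetchall()

print(not_in, on_filter_is_null, where_not_equal)
```

On this sample the NOT IN query and the ON-filter query both return competition 3. The `<> 1` variant returns competition 2 instead: it keeps the row contributed by user 2 and drops the unmatched competition, because NULL <> 1 is not true. It is worth checking against your own data before dropping the IS NULL test.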

simple joins between 2 mysql tables returning all results every time.. Help!

I just imported a large amount of data into two tables. Let's call them shipments and returns.
When I try to do a simple join (left or inner) based on any criteria in these two tables, the query looks like it does a cross join, finding every combination instead of what the query should be pulling.
Each table has a PK id field, but there is no FK relationship between the two other than some shared fields.
I'm currently just trying to relate them on shipment_id.
I feel this has a simple answer. Am I missing a reference or something obvious that is causing this? Thanks!
Here's an example. This should return under 100 rows. Instead it returns hundreds of thousands.
SELECT r.*
FROM returns as r
left outer join shipments as s
on r.shipment_id = s.shipment_id
where r.date = '2011-06-20'
Here is a query that should work:
SELECT T0.*, T1.*
FROM shipments AS T0 LEFT JOIN returns AS T1 ON T0.shipment_id = T1.shipment_id
ORDER BY T0.shipment_id;
This join assumes a 1:1 relationship on shipment_id.
It would be nice if you included the query you were using
You need to specify what you are joining on, otherwise it will do a cartesian join:
SELECT r.*
FROM returns as r
LEFT JOIN shipments as s ON s.shipment_id = r.shipment_id
where r.date = '2011-06-20'
Josh,
I would be interested in seeing what would happen if you forced a join to a specific record or set of records instead of the whole table. Assuming there is a shipment with an id of 5 in your table, you could try:
SELECT r.* FROM returns as r
left join shipments as s
ON 5 = r.shipment_id
WHERE r.date = '2011-06-20'
While just a fancy where clause, it would at least prove that the join you are attempting will eventually work correctly. The issue is that your on clause is always returning true, no matter what the value is. This could be because it's not interpreting the shipment_id as an integer, but instead as a true/false variable where any value evaluates to true.
Original Rejected Solution:
No foreign key relationship should be needed in order to make the joins happen. I'm assuming the PK id fields are integers (or number, or whatever your RDBMS equivalent is)?
Can you paste a snippet of your SQL query?
Updating based on posted query:
I would add your explicit join criteria in order to rule out any funny business (my guess is since no criteria is specified, it's using 1=1, which always joins). So I would change your query to look like:
SELECT r.*
FROM returns as r
left join shipments as s ON
s.ShipId = r.ReturnId
where r.date = '2011-06-20'
The issue turned out to be very simple, just not readily apparent until going through all the columns. It turns out that the shipment ID was duplicated through every row as it hit the upper limit for the int datatype. This is why joins were returning every record.
After switching the datatype to bigint and reimporting, everything worked great. Thanks all for looking into it.
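For anyone hitting the same thing, here is a rough sketch of why a maxed-out INT column makes a join blow up. It assumes the import ran without strict SQL mode, in which case MySQL clips out-of-range values to the column maximum. SQLite (via Python's sqlite3) stands in for MySQL, the clipping is imitated by a hypothetical clamp_to_int helper, and the ids are made up.

```python
import sqlite3

INT_MAX = 2147483647  # upper limit of a signed 32-bit INT column


def clamp_to_int(value):
    """Hypothetical helper imitating an out-of-range insert into an INT
    column when strict mode is off: the value is clipped to the maximum."""
    return min(value, INT_MAX)


# Made-up shipment ids, all larger than INT_MAX.
source_ids = [3000000001, 3000000002, 3000000003]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shipments (id INTEGER PRIMARY KEY, shipment_id INTEGER);
CREATE TABLE returns (id INTEGER PRIMARY KEY, shipment_id INTEGER);
""")
for sid in source_ids:
    conn.execute("INSERT INTO shipments (shipment_id) VALUES (?)",
                 (clamp_to_int(sid),))
    conn.execute("INSERT INTO returns (shipment_id) VALUES (?)",
                 (clamp_to_int(sid),))

distinct_ids = conn.execute(
    "SELECT COUNT(DISTINCT shipment_id) FROM shipments").fetchone()[0]

# Every return now matches every shipment, so the join is effectively a
# cross join even though the ON clause is correct.
joined_rows = conn.execute("""
SELECT COUNT(*)
FROM returns r
LEFT JOIN shipments s ON s.shipment_id = r.shipment_id
""").fetchone()[0]

print(distinct_ids, joined_rows)
```

Three source ids collapse to one stored value, and the 3 x 3 join returns 9 rows. A quick COUNT(DISTINCT shipment_id) after an import is a cheap way to catch this.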