Dependant SubQuery v Left Join - mysql

This query displays the correct result but when doing an EXPLAIN, it lists it as a "Dependant SubQuery" which I'm led to believe is bad?
SELECT Competition.CompetitionID, Competition.CompetitionName, Competition.CompetitionStartDate
FROM Competition
WHERE CompetitionID NOT
IN (
SELECT CompetitionID
FROM PicksPoints
WHERE UserID =1
)
I tried changing the query to this:
SELECT Competition.CompetitionID, Competition.CompetitionName, Competition.CompetitionStartDate
FROM Competition
LEFT JOIN PicksPoints ON Competition.CompetitionID = PicksPoints.CompetitionID
WHERE UserID =1
and PicksPoints.PicksPointsID is null
but it displays 0 rows. What is wrong with the above compared to the first query that actually does work?

The seconds query cannot produce rows: it claims:
WHERE UserID =1
and PicksPoints.PicksPointsID is null
But to clarify, I rewrite as follows:
WHERE PicksPoints.UserID =1
and PicksPoints.PicksPointsID is null
So, on one hand, you are asking for rows on PicksPoints where UserId = 1, but then again you expect the row to not exist in the first place. Can you see the fail?
External joins are so tricky at that! Usually you filter using columns from the "outer" table, for example Competition. But you do not wish to do so; you wish to filter on the left-joined table. Try and rewrite as follows:
SELECT Competition.CompetitionID, Competition.CompetitionName, Competition.CompetitionStartDate
FROM Competition
LEFT JOIN PicksPoints ON (Competition.CompetitionID = PicksPoints.CompetitionID AND UserID = 1)
WHERE
PicksPoints.PicksPointsID is null
For more on this, read this nice post.
But, as an additional note, performance-wise you're in some trouble, using either subquery or the left join.
With subquery you're in trouble because up to 5.6 (where some good work has been done), MySQL is very bad with optimizing inner queries, and your subquery is expected to execute multiple times.
With the LEFT JOIN you are in trouble since a LEFT JOIN dictates the order of join from left to right. Yet your filtering is on the right table, which means you will not be able to use an index for filtering the USerID = 1 condition (or you would, and lose the index for the join).

These are two different queries. The first query looks for competitions associated with user id 1 (via the PicksPoints table), which the second joins with those rows that are associated with user id 1 that in addition have a null PicksPointsID.
The second query is coming out empty because you are joining against a table called PicksPoints and you are looking for rows in the join result that have PicksPointsID as null. This can only happen if
The second table had a row with a null PickPointsID and a competition id that matched a competition id in the first table, or
All the columns in the second table's contribution to the join are null because there is a competition id in the first table that did not appear in the second.
Since PicksPointsID really sounds like a primary key, it's case 2 that is showing up. So all the columns from PickPointsID are null, your where clause (UserID=1 and PicksPoints.PicksPointsID is null) will always be false and your result will be empty.
A plain left join should work for you
select c.CompetitionID, c.CompetitionName, c.CompetitionStartDate
from Competition c
left join PicksPoints p
on (c.CompetitionID = p.CompetitionID)
where p.UserID <> 1
Replacing the final where with an and (making a complex join clause) might also work. I'll leave it to you to analyze the plans for each query. :)
I'm not personally convinced of the need for the is null test. The article linked to by Shlomi Noach is excellent and you may find some tips in there to help you with this.

Related

How to use left outer join in mysql to get my desirable results?

I have made an sql fiddle. Users with id 100 and 118 should have 0 records for assigned_scopes, assigned_qa, failed_qa and assigned_canvass. But it does not display that data; it just shows the record for the user with id 210. What I really need is to display all the users, with 0 in each column if they have nothing. How can I do this?
What I have tried since is in this fiddle. It works according to my requirements but there is problem with query optimization; its execution time is not what I want. It took more than 33 seconds to load the page (which is not good at all) when I used the second fiddle query because of the huge data. The query in the first fiddle executed in 2 seconds even with huge data. How can correct the first query to give the results of the second query while (hopefully) staying fast?
The "ten second explanation" for why most queries with LEFT JOIN do not display rows that you expect is that you've put one of the left joined tables into the where clause, demanding a value. This automatically converts any left join into an inner join behaviour
You either have to say WHERE leftjoinedtable.column =value OR leftjoinedtable.column is null or preferentially put the statement in the ON condition (doesn't need a qualifying 'or xxx is null which makes your query simpler and more readable
Move all the statements in your where clause out, and into the respective ON conditions so your query no longer has a where clause section at all. Throw your latter attempt away; it has a Cartesian join and sub selects in the main selection list; in a production system it could be generating millions more rows than needed, selecting extra data based on them, and then discarding them. A very difficult query style to optimise and usually unnecessary
Edit.. Had another look at your fiddle, noticed you also GROUP BY a column that could be null.. When it is, those rows disappear from the output
In terms of how you should write queries, consider first which tables will always have all the values. In your case these are the user and teams tables. These should be inner joined together first, then other tables which may not always have a matching row should be added LEFT join. Right join is seldom needed
Alter the order of your tables so it's teams, inner join users, then left join the two tables you're counting stars on. When you group by, group on the users Id from the users table, not the stats table
I did this with your fiddles but couldn't find a way to save them as a new fiddle on the iPad, couldn't copy the text, but here's a screenshot of a query that does what you require, I'll leave the typing as an exercise for the reader :) , don't forget to adjust the group by
Note, now the users table is inner joined, the =15 filter could go in the where clause... I'll leave the general advice though, to encourage and remind you that a) it works just as well in the ON clause and b) to prefer putting these things in the ON because it means left joins work as you expect
Read a definition of left join on. It returns rows that inner join on does plus unmatched rows in the left table extended by nulls.
Re your 1st link query: If you want all the records from user then it must be the leftmost table in your left joins. You must not remove rows that might have null values by tests requiring non-null values in where--but, why would you?--ie, why do you? The following returns rows only for users with role 15, per the 2nd link query; you don't explain why 15 tested for in the 1st.
select u.user_id,
ut.name as team_name,
/* ... */
count(case when a.status = 2 AND a.qc_id = 0 and o.class_id= 3 then 1 else null end)
AS assigned_canvass
from am_user u
left join user_team ut
on u.user_team_id = ut.user_team_id
left join am_ts_assignment a
on u.user_id = a.tech_id
left join am_ts_order o
on o.assignment_id = a.assignment_id
where u.user_role_id = 15
group by u.user_id
order by u.user_id asc
Compose your queries incrementally and test as you add joins & columns.
(This is the same query I gave on your last post re essentially your 1st link query except it returns a row for every am_user instead of every am_ts_assignment, per each post. The where might have been an and in the first left join with am_user; either will do here.)

include null and zero in count() from related table

I would like to list in table (staging) the number of related records from table (studies).
So far this statement works well but returns only the rows where there are >0 related records:
SELECT staging.*,
COUNT(studies.PMID) AS refcount
FROM studies
LEFT JOIN staging
ON studies.rs_number = staging.rs
GROUP BY staging.idstaging;
How can I adjust this statement to list ALL rows in table (staging) including where there are zero or null related records from table (studies)?
Thank you
You have the tables in the wrong order in the LEFT JOIN:
SELECT staging.*, COUNT(studies.PMID) AS refcount
FROM staging LEFT JOIN
studies
ON studies.rs_number = staging.rs
GROUP BY staging.idstaging;
LEFT JOIN keeps everything in the first ("left") table and all matching rows in the second. If you want to keep everything in the staging table, then put it first.
And, in case anyone wants to complain about the use of staging.* with GROUP BY. This particular usage is (presumably) ANSI compliant because staging.idstaging is (presumably) a unique id in that table.

Left Join when looking for no matches

I have 3 tables and I am pretty sure I need to use a left join when joining the 3rd however the join between the first 2 tables I think just needs to be a regular join and I'm not sure if that requires some kind of nesting or not.
So the first 2 tables (once I set my conditions) should always work out to a 1 to 1 relationship. Then I need to join that to the 3rd table but I need to know if there is no match (which means i need a left join here). Essentially that's all I need to know in this query and actually want to filter table 3 and only show if it is NULL. Furthermore those NULL responses I want to see (or lack thereof) have a date field. I also want to only see that there is NO record for today's date in table 3.
At the end of the day I want to know when there is no record existing in table 3 for todays date for the primary key in table 1/2 (since the way that first join works even though there is a primary/foreign key relation, the primary key of table 1 is all that matters when matching on table 3.
Query:
SELECT
subscribers.*,
check_times.*
FROM subscribers, check_times
LEFT JOIN checks
ON checks.subscriber_id = check_times.subscriber_id
WHERE
subscribers.subscriber_id = check_times.subscriber_id
AND check_times.dow = 1
AND check_times.time <= '19:52'
AND checks.date = '2015-02-16'
AND checks.status IS NULL
Since you're referencing the table in the LEFT JOIN on the WHERE clause, it became an INNER JOIN. What you want is to put them in the JOIN condition.
SELECT
subscribers.*,
check_times.*
FROM subscribers,
INNER JOIN check_times
ON subscribers.subscriber_id = check_times.subscriber_id
LEFT JOIN checks
ON checks.subscriber_id = check_times.subscriber_id
AND checks.date = '2015-02-16'
WHERE
check_times.dow = 1
AND check_times.time <= '19:52'
AND checks.status IS NULL
Also, please refrain from using the old-style JOIN. Read on this for more info:
Bad habits to kick : using old-style JOINs

MySQL query for finding rows that are in one table but not another

Let's say I have about 25,000 records in two tables and the data in each should be the same. If I need to find any rows that are in table A but NOT in table B, what's the most efficient way to do this.
We've tried it as a subquery of one table and a NOT IN the result but this runs for over 10 minutes and almost crashes our site.
There must be a better way. Maybe a JOIN?
Hope LEFT OUTER JOIN will do the job
select t1.similar_ID
, case when t2.similar_ID is not null then 1 else 0 end as row_exists
from table1 t1
left outer join (select distinct similar_ID from table2) t2
on t1.similar_ID = t2.similar_ID // your WHERE goes here
I would suggest you read the following blog post, which goes into great detail on this question:
Which method is best to select values present in one table but missing
in another one?
And after a thorough analysis, arrives at the following conclusion:
However, these three methods [NOT IN, NOT EXISTS, LEFT JOIN]
generate three different plans which are executed by three different
pieces of code. The code that executes EXISTS predicate is about 30%
less efficient than those that execute index_subquery and LEFT JOIN
optimized to use Not exists method.
That’s why the best way to search for missing values in MySQL is using a LEFT JOIN / IS NULL or NOT IN rather than NOT
EXISTS.
If the performance you're seeing with NOT IN is not satisfactory, you won't improve this performance by switching to a LEFT JOIN / IS NULL or NOT EXISTS, and instead you'll need to take a different route to optimizing this query, such as adding indexes.
Use exixts and not exists function instead
Select * from A where not exists(select * from B);
Left join. From the mysql documentation
If there is no matching row for the right table in the ON or USING
part in a LEFT JOIN, a row with all columns set to NULL is used for
the right table. You can use this fact to find rows in a table that
have no counterpart in another table:
SELECT left_tbl.* FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id =
right_tbl.id WHERE right_tbl.id IS NULL;
This example finds all rows in left_tbl with an id value that is not
present in right_tbl (that is, all rows in left_tbl with no
corresponding row in right_tbl).

simple joins between 2 mysql tables returning all results every time.. Help!

I just imported a large amount of data into two tables. Let's call them shipments and returns.
When trying to do a simple join (left or inner) based on any criteria in these two tables. query looks like it tries to do a cross join or find every combination instead of what the query should be pulling.
each table has an PK id field, but there is not FK relationship between the two other than some shared field.
I'm currently just trying to related them on shipment_id.
I feel this is a simple answer. Am I missing a reference or something obvious that is causing this? Thanks!
here's an example. This should returned under 100 rows. This instead returns hundreds of thousands.
SELECT r.*
FROM returns as r
left outer join shipments as s
on r.shipment_id = s.shipment_id
where r.date = '2011-06-20'
Here is a query that should work:
SELECT T0.*, T1.*
FROM shipments AS T0 LEFT JOIN returns AS T1 ON T0.shipment_id = T1.shipment_id
ORDER BY T0.shipment_id;
This query join assumes 1:1 on the shipment_id
It would be nice if you included the query you were using
You need to specify what you are joining on, otherwise it will do a cartesian join:
SELECT r.*
FROM returns as r
LEFT JOIN shipments as s ON s.shipment_id = r.shipment_id
where r.date = '2011-06-20'
Josh,
I would be interested in seeing what would happen if you forced a join to a specific record or set of records instead of the whole table. Assuming there is a shipment with an id of 5 in your table, you could try:
SELECT r.* FROM returns as r
left join shipments as s
ON 5 = r.shipment_id
WHERE r.date = '2011-06-20'
While just a fancy where clause, it would at least prove that the join you are attempting will eventually work correctly. The issue is that your on clause is always returning true, no matter what the value is. This could be because it's not interpreting the shipment_id as an integer, but instead as a true/false variable where any value evaluates to true.
Original Rejected Solution:
No Foreign Key relationship should be needed in order to make the joins happen. The PK id fields I'm assuming are an integer (or number, or whatever your rdms equivalent is)?
Can you past a snippet of your sql query?
Updating based on posted query:
I would add your explicit join criteria in order to rule out any funny business (my guess is since no criteria is specified, it's using 1=1, which always joins). So I would change your query to look like:
SELECT r.*
FROM returns as r
left join shipments as s ON
s.ShipId = R.ReturnId
where r.date = '2011-06-20'
The issue turned out to be very simple, just not readily apparent until going through all the columns. It turns out that the shipment ID was duplicated through every row as it hit the upper limit for the int datatype. This is why joins were returning every record.
After switching the datatype to bigint and reimporting, everything worked great. Thanks all for looking into it.