SELECT * FROM A
JOIN B
ON B.ID = A.ID
AND B.Time = (SELECT max(Time)
FROM B B2
WHERE B2.ID = B.ID)
I am trying to join these two tables in MySQL. Ignore the fact that if ID were unique I wouldn't need to do this; I condensed the real problem to paint a simplified picture. I want to join table B on the max date for a given record. The procedure is run by an SSIS package, which reports that B2.ID is an unknown column. I do things like this frequently in MSSQL and am new to MySQL. Does anyone have any pointers or ideas?
I do this type of query differently, with an exclusion join instead of a subquery. You want to find the rows of B which have the max Time for a given ID; in other words, where no other row has a greater Time and the same ID.
SELECT A.*, B.*
FROM A JOIN B ON B.ID = A.ID
LEFT OUTER JOIN B AS B2 ON B.ID = B2.ID AND B.Time < B2.Time
WHERE B2.ID IS NULL
You can also use a derived table, which should perform better than using a correlated subquery.
SELECT A.*, B.*
FROM A JOIN B ON B.ID = A.ID
JOIN (SELECT ID, MAX(Time) AS Time FROM B GROUP BY ID) AS B2
ON (B.ID, B.Time) = (B2.ID, B2.Time)
P.S.: I've added the greatest-n-per-group tag. This type of SQL question comes up every week on Stack Overflow, so you can follow that tag to see dozens of similar questions and their answers.
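To see that the exclusion join and the derived-table join pick the same rows, here is a minimal sketch that runs both against a tiny in-memory database. It uses Python's sqlite3 module as a stand-in for MySQL, the sample rows and the Name and Val columns are made up for illustration, and the row comparison `(B.ID, B.Time) = (B2.ID, B2.Time)` is spelled out as two conditions so it does not rely on row-value support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, Name TEXT);
    CREATE TABLE B (ID INTEGER, Time TEXT, Val TEXT);
    INSERT INTO A VALUES (1, 'one'), (2, 'two');
    INSERT INTO B VALUES
        (1, '2020-01-01', 'old'), (1, '2020-03-01', 'new'),
        (2, '2020-02-01', 'only');
""")

# Exclusion join: keep the B row for which no later row with the same ID exists.
exclusion_join = """
    SELECT A.ID, B.Time, B.Val
    FROM A JOIN B ON B.ID = A.ID
    LEFT OUTER JOIN B AS B2 ON B.ID = B2.ID AND B.Time < B2.Time
    WHERE B2.ID IS NULL
    ORDER BY A.ID
"""

# Derived table: compute MAX(Time) per ID once, then join back to B.
derived_table = """
    SELECT A.ID, B.Time, B.Val
    FROM A JOIN B ON B.ID = A.ID
    JOIN (SELECT ID, MAX(Time) AS Time FROM B GROUP BY ID) AS B2
      ON B.ID = B2.ID AND B.Time = B2.Time
    ORDER BY A.ID
"""

exclusion_rows = conn.execute(exclusion_join).fetchall()
derived_rows = conn.execute(derived_table).fetchall()
print(exclusion_rows)
print(derived_rows)
```

Both queries return one row per ID here. If two rows of B tie on the max Time for an ID, both forms return both rows, so a tie-breaker column would be needed in that case.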
I'm trying to join a master dataset with two other datasets via LEFT JOINs. All of them share the same key field, so nothing special there.
One of the secondary datasets is the result of another query and therefore might not exist. Obviously, my JOIN statement fails when that table doesn't exist.
Below is a heavily simplified version of the code. The JOIN is used to exclude rows from table_a that exist in table_b or table_c (if those tables exist).
SELECT a.id, a.name
FROM table_a a
LEFT JOIN table_b b
ON a.id = b.id
LEFT JOIN table_c c
ON a.id = c.id
WHERE b.id IS NULL
AND c.id IS NULL;
I am not sure I understand your question correctly, but I think you would be better off with:
SELECT a.id,a.name
FROM table_a a
WHERE a.id NOT IN
(SELECT id FROM table_b)
AND a.id NOT IN
(SELECT id FROM table_c)
Most query optimizers should handle this query about as efficiently as the LEFT JOIN version, and I find it much more readable.
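One caveat when swapping the LEFT JOIN form for NOT IN: the two only agree while the subquery column contains no NULLs. The sketch below shows this with Python's sqlite3 module standing in for MySQL; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, name TEXT);
    CREATE TABLE table_b (id INTEGER);
    CREATE TABLE table_c (id INTEGER);
    INSERT INTO table_a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO table_b VALUES (1);
    INSERT INTO table_c VALUES (2);
""")

left_join = """
    SELECT a.id, a.name FROM table_a a
    LEFT JOIN table_b b ON a.id = b.id
    LEFT JOIN table_c c ON a.id = c.id
    WHERE b.id IS NULL AND c.id IS NULL
"""
not_in = """
    SELECT a.id, a.name FROM table_a a
    WHERE a.id NOT IN (SELECT id FROM table_b)
      AND a.id NOT IN (SELECT id FROM table_c)
"""

# With no NULL ids, both forms return the same rows.
before = (conn.execute(left_join).fetchall(), conn.execute(not_in).fetchall())

# A single NULL id in a subquery makes every NOT IN comparison unknown,
# so the NOT IN form returns nothing while the join form is unaffected.
conn.execute("INSERT INTO table_c VALUES (NULL)")
after = (conn.execute(left_join).fetchall(), conn.execute(not_in).fetchall())

print(before)
print(after)
```

If the id columns are declared NOT NULL this difference never shows up. Otherwise NOT EXISTS or the LEFT JOIN form is the safer choice.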
This is not a duplicate of this Q&A, because the question and answers there concern the table mentioned in the FROM clause, which mine doesn't.
Assuming the table in the FROM clause is always the same and I'm never going to change it, does it matter in which order I add my joins?
I am using a query builder built in-house. (Yes, I know there are existing ones out there, but that's out of scope for this question.)
I want to be able to set some of the joins at the beginning of my script and others later based on conditionals. The query builder adds them to the query from the top down. Will the SQL engine optimize the order of the joins anyway, regardless of their order in the query?
example:
SELECT a.col1, d.col2, c.col1, b.col3
FROM table1 A
INNER JOIN table2 B
ON B.a_id = A.id
LEFT JOIN table3 C
ON C.id = A.c_id
LEFT JOIN table4 D
ON D.id = C.d_id;
SELECT a.col1, d.col2, c.col1, b.col3
FROM table1 A
LEFT JOIN table4 D
ON D.id = C.d_id
INNER JOIN table2 B
ON B.a_id = A.id
LEFT JOIN table3 C
ON C.id = A.c_id;
Here you can see that I have declared the join for table4 D before the join for its dependent table (C) is declared in the script. Does this matter?
Simple answer: no, you can't reference a table object or alias before it has been declared.
MySQL will throw an error on the second query: Unknown column 'C.d_id' in 'on clause'
So yes, it matters: the compiler doesn't look ahead to see whether the alias is declared later. It resolves the names in the order written first, and only then works out which join method and order is best.
SQLFiddle
To address the question "Will the SQL engine optimize the order of the joins anyway, regardless of their order in the query?":
Yes it would optimize the order; but the "FROM" order can't include a reference to a table before it's been declared or the query will not compile. (See error above and link for example)
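Since the joins only need to be declared before they are referenced, a query builder could sort them by dependency before emitting the query. This is a minimal sketch of that idea in Python, not how any particular builder works: the `joins` registry, its shape, and the `order_joins` helper are all hypothetical, and it assumes each join records which aliases its ON clause uses.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: alias -> (JOIN clause, aliases referenced in its ON clause).
# Deliberately registered out of order, as in the second query above.
joins = {
    "D": ("LEFT JOIN table4 D ON D.id = C.d_id", {"C"}),
    "B": ("INNER JOIN table2 B ON B.a_id = A.id", {"A"}),
    "C": ("LEFT JOIN table3 C ON C.id = A.c_id", {"A"}),
}


def order_joins(joins, base_alias):
    """Return the JOIN clauses so that every alias is declared before it is used."""
    # The FROM table is always declared first, so it is not a dependency to sort.
    graph = {alias: deps - {base_alias} for alias, (_, deps) in joins.items()}
    unknown = set().union(*graph.values()) - set(joins)
    if unknown:
        raise ValueError(f"ON clause references undeclared aliases: {sorted(unknown)}")
    # static_order() yields each alias only after all the aliases it depends on.
    return [joins[alias][0] for alias in TopologicalSorter(graph).static_order()]


ordered = order_joins(joins, "A")
query = "SELECT A.col1, D.col2, C.col1, B.col3\nFROM table1 A\n" + "\n".join(ordered)
print(query)
```

The relative order of independent joins (B and C here) is left to the sorter. The only guarantee is that table3 C is emitted before table4 D, which is what the parser requires.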
I have created an SQL query that looks roughly like this:
select *
from A
where A.id in (select B.id1, B.id2 from B);
The main select should return those rows for which A.id matches either B.id1 or B.id2.
Clearly this doesn't work, because the subquery returns two columns while IN expects one. How can I overcome this problem?
One solution would be to write two sub-queries, one for B.id1 and one for B.id2, but as my real sub-query is much longer than in this example, I am looking for a more elegant solution.
I'm using MySQL.
EDIT 1
As long as the syntax is simpler than using two sub-queries, I have no problem with using joins.
EDIT 2
Thanks @NullSoulException. I tried the first solution and it works as expected!
Something like the below should do the trick.
select *
From table1 a , (select id1 , id2 from table2 ) b
where (a.id = b.id1) or (a.id = b.id2)
or you can JOIN with the same table twice by giving the joined tables an alias.
select * from table1 a
LEFT JOIN table2 b1 on a.id = b1.id1
LEFT JOIN table2 b2 on a.id = b2.id2
where b1.id1 is not null or b2.id2 is not null
Please test the above against your own datasets/tables.
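For a quick check of the "either column" match, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows are made up, and DISTINCT is an addition of mine: without it, a row of table1 that matches several rows of table2 comes back several times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id1 INTEGER, id2 INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 VALUES (1, 2), (2, 9), (9, 3);
""")

# One join with an OR condition: a row of table1 qualifies if its id
# appears in either id1 or id2 of some row in table2.
rows = conn.execute("""
    SELECT DISTINCT a.id
    FROM table1 a
    JOIN table2 b ON a.id = b.id1 OR a.id = b.id2
    ORDER BY a.id
""").fetchall()
print(rows)
```

Here id 2 matches two rows of table2 (once through id1, once through id2) but is returned once, and id 4 matches nothing and is left out.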
I'm trying to improve a (not so) simple query:
I need to retrieve every row from Table A.
Then join Table A with Table B so I get all the data I need.
At the same time, I need to add an extra column with the count() from Table C.
Something like:
SELECT a.*,
(SELECT Count(*)
FROM table_c c
WHERE c.a_id = a.id) AS counter,
b.*
FROM table_a a
LEFT JOIN table_b b
ON b.a_id = a.id
This works, but in reality I'm just running two queries, and I need to improve it so that it only runs one (if that's even possible).
Does anyone know how I can achieve that?
The simplest approach is likely to move the correlated sub-query into an uncorrelated sub-query (a derived table) and join to it.
NOTE: Many optimisers deal with correlated sub-queries extremely effectively. Your example query could be perfectly reasonable.
SELECT
a.*,
b.*,
COALESCE(c.row_count, 0) AS row_count
FROM
table_a a
LEFT JOIN
table_b b
ON b.a_id = a.id
LEFT JOIN
(
SELECT
a_id,
Count(*) row_count
FROM
table_c
GROUP BY
a_id
)
c
ON c.a_id = a.id
Another note: SQL is a declarative expression. It is not executed directly; it is translated into a plan using nested loops, hash joins, etc. Do not assume that having two queries is a bad thing. In this case my example may significantly reduce the number of reads compared to a single query that joins everything and then uses GROUP BY and COUNT(DISTINCT).
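To compare the correlated sub-query with the derived-table join, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows and the info column are made up. Note the COALESCE in the second query: the LEFT JOIN yields NULL rather than 0 for rows of table_a with no match in table_c.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER);
    CREATE TABLE table_b (a_id INTEGER, info TEXT);
    CREATE TABLE table_c (a_id INTEGER);
    INSERT INTO table_a VALUES (1), (2);
    INSERT INTO table_b VALUES (1, 'b1');
    INSERT INTO table_c VALUES (1), (1), (1);
""")

# Original form: the count is computed by a sub-query correlated on a.id.
correlated = """
    SELECT a.id, b.info,
           (SELECT COUNT(*) FROM table_c c WHERE c.a_id = a.id) AS counter
    FROM table_a a
    LEFT JOIN table_b b ON b.a_id = a.id
    ORDER BY a.id
"""

# Derived-table form: aggregate table_c once, then join the counts in.
derived = """
    SELECT a.id, b.info, COALESCE(c.row_count, 0) AS counter
    FROM table_a a
    LEFT JOIN table_b b ON b.a_id = a.id
    LEFT JOIN (SELECT a_id, COUNT(*) AS row_count
               FROM table_c GROUP BY a_id) c ON c.a_id = a.id
    ORDER BY a.id
"""

correlated_rows = conn.execute(correlated).fetchall()
derived_rows = conn.execute(derived).fetchall()
print(correlated_rows)
print(derived_rows)
```

Which form is faster depends on the engine and the data, so it is worth checking the plan with EXPLAIN on the real tables rather than assuming.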
Try this:
SELECT
tmp.*,
SUM(IF(c.a_id IS NULL,0,1)) as counter
FROM (
SELECT
a.id as aid,
b.id as bid,
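a.*, -- MySQL rejects a derived table with duplicate column names; list columns explicitly if a and b share any (e.g. id)
b.*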
FROM
table_a a
LEFT JOIN table_b b
ON b.a_id = a.id
) as tmp
LEFT JOIN table_c c
ON c.a_id = tmp.aid
GROUP BY
tmp.aid,
tmp.bid
There are two tables, tableA and tableB.
Both tables have a common column, id.
We want all records that exist in A but not in B,
and all records that exist in B but not in A.
Regards,
Chinta kiran
You can use the UNION operator as follows:
SELECT * FROM tablea
UNION
SELECT * FROM tableb
If you want to read more, see the documentation for the UNION operator.
This is best accomplished with a LEFT OUTER JOIN where the predicate (WHERE clause) ensures that the joined row is NULL; something like:
SELECT A.* FROM A LEFT OUTER JOIN B ON A.id = B.a_id WHERE B.a_id IS NULL;
I suggest you read the article A Visual Explanation of SQL Joins by Jeff Atwood of Coding Horror.
Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.
You are looking for the MINUS set operator:
"We want to get the results which are having all records in A but not in B"
Easy way:
SELECT A.*
FROM A
WHERE A.id not in (SELECT id FROM B)
With Full Outer Join
SELECT A.*
FROM A full outer join B on A.id = B.id
WHERE B.id is Null
The right way:
SELECT A.*
FROM A left outer join B on A.id = B.id
WHERE B.id is Null
Swap A and B to get all records that are in B but not in A.