Discrepancy obtaining values not in inner join by difference - mysql

I have table A and table B. I know table B has 7848 rows (count(*)) and I want to see which of those 7848 exist inside table A. As far as I know, INNER JOIN returns the values that appear in BOTH tables A and B, so I inner joined them like this:
SELECT *
FROM
TABLE1 AS A
INNER JOIN
TABLE2 AS B
ON A.field1 = B.field1
This query returns 1902 rows. Now I want to find out which rows of table B did NOT appear in table A, so I do this:
SELECT * FROM TABLE_B WHERE FIELD1 NOT IN (field1*1902....);
By difference I think I should be getting 5946 rows, since I found 1902 matching rows. What is weird is that this NOT IN statement returns 6175 rows, and if I add the two counts I get 8077, which is more than the 7848 rows count(*) told me table B has.
What can I possibly be doing wrong?
Thanks in advance.

A join behaves somewhat like a multiplication. If multiple rows in table A share the same field1, then each matching row in B is counted once per matching row in A.
Perhaps you want
SELECT * FROM TABLE_B B
WHERE EXISTS (SELECT field1 from TABLE_A A WHERE A.field1 = B.field1);
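To see the multiplication effect concretely, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The row values are made up for illustration; only the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (field1 INTEGER);
    CREATE TABLE table_b (field1 INTEGER);
    -- hypothetical data: the value 1 appears twice in table_a
    INSERT INTO table_a VALUES (1), (1), (2);
    INSERT INTO table_b VALUES (1), (2), (3), (4);
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

total_b = scalar("SELECT COUNT(*) FROM table_b")

# One output row per matching (a, b) pair, so duplicates in A inflate the count.
join_rows = scalar("""
    SELECT COUNT(*)
    FROM table_a a INNER JOIN table_b b ON a.field1 = b.field1
""")

# Each row of B is counted at most once, however many rows of A match it.
matched_b = scalar("""
    SELECT COUNT(*) FROM table_b b
    WHERE EXISTS (SELECT 1 FROM table_a a WHERE a.field1 = b.field1)
""")
unmatched_b = scalar("""
    SELECT COUNT(*) FROM table_b b
    WHERE NOT EXISTS (SELECT 1 FROM table_a a WHERE a.field1 = b.field1)
""")

print(join_rows, matched_b, unmatched_b, total_b)  # 3 2 2 4
```

Here the join count plus the unmatched count is 5, more than the 4 rows in table_b, which is the same kind of discrepancy as in the question. The EXISTS and NOT EXISTS counts always add up to the size of table_b.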

Try:
SELECT *
FROM
TABLE1 AS A
LEFT JOIN
TABLE2 AS B
ON A.field1 = B.field1
WHERE B.field1 IS NULL

The following query returns rows from table A that aren't in table B:
SELECT * FROM TABLE1 WHERE field1 NOT IN (SELECT field1 FROM TABLE2)
You can also replace the IN condition with NOT EXISTS, which may perform better:
SELECT * FROM TABLE1 A WHERE NOT EXISTS (SELECT 1 FROM TABLE2 B WHERE B.field1 = A.field1)

You might have some duplicated values in Table1 that are also present in Table2. Your first query will return those records multiple times.
You also need to be careful if you have NULL values: INNER JOIN never matches them, and NOT IN returns no rows at all when the subquery yields a NULL.
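The NULL caveat is easy to reproduce. The sketch below again uses Python's sqlite3 module in place of MySQL, with made-up data; the NULL handling of NOT IN shown here follows standard SQL three-valued logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 INTEGER);
    CREATE TABLE table2 (field1 INTEGER);
    -- hypothetical data: table2 contains one NULL
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (NULL);
""")

# "2 NOT IN (1, NULL)" evaluates to NULL rather than TRUE, so no row qualifies.
not_in_rows = conn.execute("""
    SELECT field1 FROM table1
    WHERE field1 NOT IN (SELECT field1 FROM table2)
    ORDER BY field1
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists_rows = conn.execute("""
    SELECT field1 FROM table1 a
    WHERE NOT EXISTS (SELECT 1 FROM table2 b WHERE b.field1 = a.field1)
    ORDER BY field1
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(2,), (3,)]
```

If the NOT IN form has to stay, adding WHERE field1 IS NOT NULL inside the subquery avoids the empty result.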

Related

More rows after Left Join in MySQL

I have two tables; table1 contains 22780 rows. Now I left join table1 with table2 (which doesn't contain any duplicates) and I get 23588 rows.
SELECT * FROM Table1
left join table2 ON CAST(Table1.Customer AS Int) = table2.Customer
Why do I get more rows now? I only need every row from table1 once.
Edit: found my issue, table 2 does contain duplicates. But is there any way to join every row only once and ignore any further matches?
As the comment suggests, the easiest way to handle this would probably be to do SELECT DISTINCT to remove duplicates from your result set:
SELECT DISTINCT
t1.col1,
t1.col2,
t1.Customer,
...
FROM Table1 t1
LEFT JOIN Table2 t2
ON CAST(t1.Customer AS Int) = t2.Customer
But there is another option here. We could also join to a subquery which removes duplicate customers. This would ensure that no record from the first table gets duplicated from matching to more than one record in the second table.
SELECT *
FROM Table1 t1
LEFT JOIN
(
SELECT DISTINCT Customer
FROM Table2
) t2
ON CAST(t1.Customer AS Int) = t2.Customer
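A small sketch of the difference between the plain join and the join against a de-duplicated subquery, run through Python's sqlite3 module. The data is hypothetical, and the Customer column of the first table is assumed to be stored as text, which is why the CAST is there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (customer TEXT);      -- assumed text, hence the CAST
    CREATE TABLE table2 (customer INTEGER);
    INSERT INTO table1 VALUES ('1'), ('2'), ('3');
    INSERT INTO table2 VALUES (1), (1), (2);  -- customer 1 is duplicated
""")

# Plain left join: the table1 row for customer 1 matches two table2 rows.
plain_join = conn.execute("""
    SELECT t1.customer, t2.customer
    FROM table1 t1
    LEFT JOIN table2 t2 ON CAST(t1.customer AS INTEGER) = t2.customer
""").fetchall()

# Joining to a DISTINCT subquery: each table1 row matches at most one row.
deduped_join = conn.execute("""
    SELECT t1.customer, t2.customer
    FROM table1 t1
    LEFT JOIN (SELECT DISTINCT customer FROM table2) t2
      ON CAST(t1.customer AS INTEGER) = t2.customer
""").fetchall()

print(len(plain_join))    # 4
print(len(deduped_join))  # 3
```

The subquery version works cleanly when only the join key is needed from the second table. If other columns are needed as well, you have to decide which of the duplicate rows to keep.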

Oracle Select Query based on Column value

I have two tables, let's say
Table A
columns id, name, address
Table B
columns id, age, import_date
The id in Table B is a foreign key referencing Table A.
Now I want to return results from A and B, but if a record is not in B I still want to see the record from A, so I use a left outer join:
Select * from A a left join B b
on a.id = b.id
Now even if there is no record in B, I still get the record from A.
Table B may contain duplicate ids, but each has a unique import_date.
Now I want the results such that if there are duplicate ids in table B, I get only the records where import_date is today.
I still want to get the records for ids that are not in B; the above condition should apply only when the id is present in table B.
I hope someone can help me with this.
Sample data
Table A
01|John|London
02|Matt|Glasgow
03|Rodger|Paris
Table B
02|22|31-AUG-2015
02|21|30-AUG-2015
02|23|29-AUG-2015
The query should return
01|John|London|null|null
02|Matt|Glasgow|22|31-AUG-2015
03|Rodger|Paris|null|null
You almost have the solution. Just add one more condition to the join:
Select a.id,a.name,a.address,b.age,b.import_date
from tablea a left join tableb b
on a.id=b.id and b.import_date=trunc(sysdate)
order by a.id; -- this line is optional
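The reason the date condition goes in the ON clause rather than in WHERE can be sketched with Python's sqlite3 module and the sample data from the question. Two assumptions: dates are stored as ISO text, and Oracle's trunc(sysdate) is replaced by a bound parameter holding a fixed "today" so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id TEXT, name TEXT, address TEXT);
    CREATE TABLE tableb (id TEXT, age INTEGER, import_date TEXT);
    INSERT INTO tablea VALUES
        ('01', 'John', 'London'), ('02', 'Matt', 'Glasgow'), ('03', 'Rodger', 'Paris');
    INSERT INTO tableb VALUES
        ('02', 22, '2015-08-31'), ('02', 21, '2015-08-30'), ('02', 23, '2015-08-29');
""")

today = "2015-08-31"  # stand-in for trunc(sysdate)

# Condition in ON: it only restricts which B rows may match; every A row survives.
on_rows = conn.execute("""
    SELECT a.id, a.name, a.address, b.age, b.import_date
    FROM tablea a
    LEFT JOIN tableb b ON a.id = b.id AND b.import_date = ?
    ORDER BY a.id
""", (today,)).fetchall()

# Condition in WHERE: applied after the join, so rows with NULL B columns are dropped.
where_rows = conn.execute("""
    SELECT a.id, a.name, a.address, b.age, b.import_date
    FROM tablea a
    LEFT JOIN tableb b ON a.id = b.id
    WHERE b.import_date = ?
    ORDER BY a.id
""", (today,)).fetchall()

for row in on_rows:
    print(row)
print(len(where_rows))  # 1
```

With the condition in ON, John and Rodger come back with NULLs for the B columns and Matt comes back once with the row dated today. With the condition in WHERE, only Matt's row is returned.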
SELECT *
FROM Table_A t1 LEFT OUTER JOIN Table_B t2 ON t1.id=t2.id UNION
SELECT *
FROM Table_A t1 LEFT OUTER JOIN Table_B t2 ON t1.id=t2.id
GROUP BY t2.import_date
HAVING t2.import_date=CURDATE();

3 tables and 2 left joins

Query 1:
SELECT sum(total_revenue_usd)
FROM table1 c
WHERE c.irt1_search_campaign_id IN (
SELECT assign_id
FROM table2 ga
LEFT JOIN table3 d
ON d.campaign_id = ga.assign_id
)
Query 2:
SELECT sum(total_revenue_usd)
FROM table1 c
LEFT JOIN table2 ga
ON c.irt1_search_campaign_id = ga.assign_id
LEFT JOIN table3 d
ON d.campaign_id = ga.assign_id
Query 1 gives me the correct result, whereas I need it in the second style without using 'in'. However, Query 2 doesn't give the same result.
How can I change the first query without using 'in'?
The reason is that this small query is part of a much larger query, and there are other conditions that won't work with 'in'.
You could try something along the lines of
SELECT sum(total_revenue_usd)
FROM table1 c
JOIN
(
SELECT DISTINCT ga.assign_id
FROM table2 ga
JOIN table3 d
ON d.campaign_id = ga.assign_id
) x
ON c.irt1_search_campaign_id = x.assign_id
The queries do very different things:
The first query sums the total_revenue_usd from table1 where irt1_search_campaign_id exists in table2 as assign_id. (The outer join to table3 is absolutely unnecessary, by the way, because it doesn't change whether a table2.assign_id exists or not.) As you look for existence in table2, you can of course replace IN with EXISTS.
The second query gets you combinations of table1, table2 and table3. So, in case there are two records in table2 for an entry in table1 and three records in table3 for each of the two table2 records, you will get six records for the one table1 record. Thus you sum its total_revenue_usd sixfold. This is not what you want. Don't join table1 with the other tables.
EDIT: Here is the query using an exists clause. As mentioned, outer joining table3 doesn't alter the results.
Select sum(total_revenue_usd)
from table1 c
where exists
(
select *
from table2 ga
-- left join table3 d on d.campaign_id = ga.assign_id
where ga.assign_id = c.irt1_search_campaign_id
);
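The sixfold summing described above can be reproduced with a small sketch in Python's sqlite3 module. The table and column names follow the question; the rows are made up so that one campaign has two table2 rows and three table3 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (irt1_search_campaign_id INTEGER, total_revenue_usd INTEGER);
    CREATE TABLE table2 (assign_id INTEGER);
    CREATE TABLE table3 (campaign_id INTEGER);
    -- hypothetical data
    INSERT INTO table1 VALUES (1, 100), (2, 50), (3, 7);
    INSERT INTO table2 VALUES (1), (1), (2);
    INSERT INTO table3 VALUES (1), (1), (1);
""")

# Query 2 from the question: the joins multiply the table1 rows before summing.
joined_sum = conn.execute("""
    SELECT SUM(total_revenue_usd)
    FROM table1 c
    LEFT JOIN table2 ga ON c.irt1_search_campaign_id = ga.assign_id
    LEFT JOIN table3 d ON d.campaign_id = ga.assign_id
""").fetchone()[0]

# Query 1 from the question, using IN.
in_sum = conn.execute("""
    SELECT SUM(total_revenue_usd)
    FROM table1 c
    WHERE c.irt1_search_campaign_id IN (SELECT assign_id FROM table2)
""").fetchone()[0]

# The EXISTS rewrite: each table1 row is counted at most once.
exists_sum = conn.execute("""
    SELECT SUM(total_revenue_usd)
    FROM table1 c
    WHERE EXISTS (SELECT 1 FROM table2 ga
                  WHERE ga.assign_id = c.irt1_search_campaign_id)
""").fetchone()[0]

print(joined_sum, in_sum, exists_sum)  # 657 150 150
```

In the joined version, campaign 1 contributes 6 x 100, and campaign 3 is included even though it has no table2 row, because the left join keeps it. IN and EXISTS agree on 150.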

Insert missing records from one table to another using mysql

I don't know why I am confused with this query.
I have two tables: Table A with 900 records and Table B with 800 records. Both tables need to contain the same data, but there is some mismatch.
I need to write a MySQL query to insert the missing 100 records from Table A into Table B.
In the end, both Table A and Table B should be identical.
I do not want to truncate all the entries first and then do an insert from the other table. Any help is appreciated.
Thank you.
It is also possible to use LEFT OUTER JOIN for that. This avoids the subquery overhead of John Woo's answer (the system might execute the subquery once for each record of the outer query), and it avoids the unnecessary work of overwriting the 800 already existing records, as user2340435's answer does:
INSERT INTO b
SELECT a.* FROM a
LEFT OUTER JOIN b ON b.id = a.id
WHERE b.id IS NULL;
The join first produces all rows from the A and B tables, including all columns from both tables, but for rows that exist in A and don't exist in B, all columns of the B table will be NULL.
Then it keeps only those latter rows (WHERE b.id IS NULL),
and finally it inserts these rows into the B table.
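The three steps above can be tried out with a short sketch in Python's sqlite3 module, used here instead of MySQL. The two-column layout (id, name) and the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    -- hypothetical data: b is missing ids 2 and 3
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO b VALUES (1, 'x');
""")

INSERT_MISSING = """
    INSERT INTO b
    SELECT a.* FROM a
    LEFT OUTER JOIN b ON b.id = a.id
    WHERE b.id IS NULL
"""

def count_b():
    """Return the current number of rows in b."""
    return conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]

before = count_b()
conn.execute(INSERT_MISSING)
after_first = count_b()

# Running it again inserts nothing, because no row of a is missing from b any more.
conn.execute(INSERT_MISSING)
after_second = count_b()

rows_b = conn.execute("SELECT id, name FROM b ORDER BY id").fetchall()
print(before, after_first, after_second)  # 1 3 3
print(rows_b)  # [(1, 'x'), (2, 'y'), (3, 'z')]
```

Note that this matches rows on id only. Rows that exist in both tables with the same id but different values in other columns are left untouched.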
I think you can use IN for this (this is a simplification of your query):
INSERT INTO table2 (id, name)
SELECT id, name
FROM table1
WHERE (id,name) NOT IN
(SELECT id, name
FROM table2);
SQLFiddle Demo
As you can see in the demonstration, table2 has only 1 record, but after executing the query, 2 records were inserted into table2.
If it's mysql and the tables are identical, then this should work:
REPLACE INTO table1 SELECT * FROM table2;
This will insert the missing records into Table1
INSERT INTO Table2
(Col1, Col2....)
(
SELECT Col1, Col2,... FROM Table1
EXCEPT
SELECT Col1, Col2,... FROM Table2
)
You can then run an update query to match the records that differ.
UPDATE Table2
SET
Col1= T1.Col1,
Col2= T1.Col2
FROM
Table1 T1
INNER JOIN
Table2 T2
ON
T1.Col1 = T2.Col1
Code also works when a group by and having clauses are used. Tested SQL 2012 (11.0.5058) Tab1 is source with new records, Tab 2 is the destination to be updated. Tab 2 also has an Identity column. (Yes folks, real world is not as neat and clean as the lab assignments)
INSERT INTO Tab2
SELECT a.T1,a.T2,a.T3,a.T4,a.Val1,a.Val2,a.Val3,a.Val4,-9,-9,-9,-9,MIN(hits) MinHit,MAX(hits) MaxHit,SUM(count) SumCnt, count(distinct(week)) WkCnt
FROM Tab1 a
LEFT OUTER JOIN Tab2 b ON b.t1 = a.t1 and b.t2 = a.t2 and b.t3 = a.t3 and b.t4 = a.t4 and b.val1 = a.val1 and b.val2 = a.val2 and b.val3 = a.val3 and b.val4 = a.val4
WHERE b.t1 IS NULL or b.Val1 is NULL
group by a.T1,a.T2,a.T3,a.T4,a.Val1,a.Val2,a.Val3,a.Val4 having MAX(returns)<4 and COUNT(distinct(week))>2 ;

Inserting distinct entries into the database

I have two tables with exactly the same fields. Table A contains 7160 records and table B 7130 records. Now I want to insert distinct records from table A into table B such that B does not end up with any duplicate entries. How should I go about doing this?
This basically selects records that are in A that are not in B. It would work, but you might have to tweak the field you use to uniquely identify a record. In this example I used field 'ID' but you might have to change that to A.field1 = B.field1 AND A.field2 = B.field2 etc.
INSERT INTO TABLEB
(
SELECT A.*
FROM TABLEA A
LEFT JOIN TABLEB B ON A.ID = B.ID
WHERE B.ID IS NULL
)
You can use a "union" query to combine the results from multiple tables into a single result set. "union" will only return distinct rows from all tables.
See this page for more info:
http://www.tutorialspoint.com/mysql/mysql-union-keyword.htm
insert into tableB (id)
select t1.id from tableA t1
where t1.id not in (select t2.id from tableB t2)
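One thing to watch with these anti-join inserts: if table A itself contains the same id more than once, every copy is missing from B at the time the SELECT runs, so every copy gets inserted. A minimal sketch with Python's sqlite3 module and made-up data, adding DISTINCT to guard against that:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER);
    CREATE TABLE tableb (id INTEGER);
    -- hypothetical data: ids 2 and 3 are duplicated inside tablea
    INSERT INTO tablea VALUES (1), (2), (2), (3), (3);
    INSERT INTO tableb VALUES (1);
""")

# How many rows the anti-join would insert without DISTINCT.
naive_candidates = conn.execute("""
    SELECT COUNT(*) FROM tablea a
    WHERE NOT EXISTS (SELECT 1 FROM tableb b WHERE b.id = a.id)
""").fetchone()[0]

# DISTINCT collapses the duplicates coming from tablea before inserting.
conn.execute("""
    INSERT INTO tableb (id)
    SELECT DISTINCT a.id FROM tablea a
    WHERE NOT EXISTS (SELECT 1 FROM tableb b WHERE b.id = a.id)
""")

rows_b = conn.execute("SELECT id FROM tableb ORDER BY id").fetchall()
print(naive_candidates)  # 4
print(rows_b)            # [(1,), (2,), (3,)]
```

A unique key on the identifying column(s) of table B would be a sturdier guarantee, since the database would then reject duplicates regardless of how the insert is written.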