Clearly, I am not a SQL guy, so I have to ask for help on the following rather simple task.
I have two SQL Server 2008 tables: t1 and t2 with many identical columns and a key column (entry_ID). T2 has rows that do not exist in t1 but should.
I want to merge those rows from t2 that do not exist in t1 but I also do not want any rows from t2 that already exist in t1. I would like the result set to fill a new t3.
I have looked at many solutions online but can't find the solution to the above scenario.
Thank you.
There are a number of ways to do it: you could use UNION ALL or an OUTER JOIN.
Assuming you are using Entry_ID to find identical records, and Entry_ID is unique within each table, here is an OUTER JOIN method:
This gets you your recordset: T1 and T2 merged:
SELECT
CASE
WHEN T1.Entry_ID IS NULL THEN 'T2'
WHEN T2.Entry_ID IS NULL THEN 'T1'
ELSE 'Both'
END SourceTable,
COALESCE(T1.Entry_ID,T2.Entry_ID) As Entry_ID,
COALESCE(T1.Col1, T2.Col1) As Col1,
COALESCE(T1.Col2, T2.Col2) As Col2,
COALESCE(T1.Col3, T2.Col3) As Col3,
COALESCE(T1.Col4, T2.Col4) As Col4
FROM T1 FULL OUTER JOIN T2
ON T1.Entry_ID = T2.Entry_ID
ORDER BY COALESCE(T1.Entry_ID,T2.Entry_ID)
This inserts it into T3:
INSERT INTO T3 (Entry_ID, Col1, Col2, Col3, Col4)
SELECT
COALESCE(T1.Entry_ID,T2.Entry_ID) As Entry_ID,
COALESCE(T1.Col1, T2.Col1) As Col1,
COALESCE(T1.Col2, T2.Col2) As Col2,
COALESCE(T1.Col3, T2.Col3) As Col3,
COALESCE(T1.Col4, T2.Col4) As Col4
FROM T1 FULL OUTER JOIN T2
ON T1.Entry_ID = T2.Entry_ID
Again, note that Entry_ID must be unique within each table, because it is used to match rows between the tables.
Also note the columns from the select line up with the column list in the insert statement - the order of the columns in the physical table doesn't matter, the INSERT and SELECT just have to line up.
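The UNION ALL route mentioned at the top of this answer can be sketched as follows. This is only an illustration, not the method above: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the tiny T1/T2/T3 tables with a single Col1 column are hypothetical.

```python
import sqlite3

# Hypothetical miniature tables; Col1 stands in for the shared columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (Entry_ID INTEGER PRIMARY KEY, Col1 TEXT);
    CREATE TABLE T2 (Entry_ID INTEGER PRIMARY KEY, Col1 TEXT);
    CREATE TABLE T3 (Entry_ID INTEGER PRIMARY KEY, Col1 TEXT);
    INSERT INTO T1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO T2 VALUES (2, 'b'), (3, 'c');
""")

# All of T1, plus only those T2 rows whose Entry_ID is not already in T1.
conn.execute("""
    INSERT INTO T3 (Entry_ID, Col1)
    SELECT Entry_ID, Col1 FROM T1
    UNION ALL
    SELECT T2.Entry_ID, T2.Col1
    FROM T2
    WHERE NOT EXISTS (SELECT 1 FROM T1 WHERE T1.Entry_ID = T2.Entry_ID)
""")

merged = conn.execute(
    "SELECT Entry_ID, Col1 FROM T3 ORDER BY Entry_ID"
).fetchall()
print(merged)
```

As with the join version, this relies on Entry_ID being unique within each table.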
Related
I have a simple MySQL query that unions two tables:
SELECT * FROM (
SELECT col1 AS col1A FROM table1
UNION
SELECT col1 AS col1B FROM table2
) AS t WHERE col1A <> col1B
I have a column called col1 in both tables and I need to select only rows that have a different value of that column so I select them as aliases. When I run this query I got:
Unknown column 'col1B' in 'where clause'
Table1 data:
col1
----
test
Table2 data:
col1
----
test
The query should return no rows, as each value in col1 in table1 is equal to each value in col1 in table2. Instead, it reports that col1B is unknown, although I select it as an alias.
I think you need to look up the appropriate usage of UNION. It returns all results from the first query combined with all results from the second query. The result is a single dataset with a single column (not col1A and col1B), named col1A in this case.
Assuming you're trying to get all records in table1 that don't exist in table2, you can use NOT EXISTS:
SELECT col1
FROM table1 t1
WHERE NOT EXISTS (
SELECT 1
FROM table2 t2
WHERE t1.col1 = t2.col1
)
Why Error 1054 is being returned by OP query
The error that's being returned is because the name assigned to a column from the result of a UNION is taken from the first SELECT.
You can observe this by running a simple example:
SELECT 1 AS one
UNION
SELECT 2 AS two
The resultset returned by that query will contain a single column, the name assigned to the column will be one, the column name from the first SELECT. This explains why you are getting the error from your query.
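If you want to check the column naming yourself, here is a minimal sketch. It runs the same statement through Python's built-in sqlite3 module, which is an assumption on my part as a quick stand-in for MySQL; SQLite also takes the result column name from the first SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT 1 AS one UNION SELECT 2 AS two")

# cursor.description holds one entry per result column; item 0 is its name.
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(column_names)  # a single column, named after the first SELECT's alias
print(rows)
```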
One way to return rows with no match
To return values of col1 from table1 which do not match any value in the col1 column from table2...
one option is to use an anti-join pattern...
SELECT t1.col1
FROM table1 t1
LEFT
JOIN table2 t2
ON t2.col1 = t1.col1
WHERE t2.col1 IS NULL
The LEFT JOIN operation returns all rows from table1, along with any "matching" rows found in table2. The "trick" is the predicate in the WHERE clause... any "matching" rows from table2 will have a non-NULL value in col1. So, if we exclude all of the rows where we found a match, we're left with rows from table1 that didn't have a match.
If we want to get rows from table2 that don't have a "matching" row in table1, we can do the same thing, just flipping the order of the tables.
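Here is a small runnable sketch of the anti-join in both directions. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the sample values ('only_in_1', 'only_in_2') are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT);
    CREATE TABLE table2 (col1 TEXT);
    INSERT INTO table1 VALUES ('test'), ('only_in_1');
    INSERT INTO table2 VALUES ('test'), ('only_in_2');
""")

# Anti-join template: keep rows of {left} that found no match in {right}.
ANTI_JOIN = """
    SELECT l.col1
    FROM {left} l
    LEFT JOIN {right} r ON r.col1 = l.col1
    WHERE r.col1 IS NULL
"""

only_in_table1 = conn.execute(
    ANTI_JOIN.format(left="table1", right="table2")
).fetchall()
# Same pattern with the order of the tables flipped.
only_in_table2 = conn.execute(
    ANTI_JOIN.format(left="table2", right="table1")
).fetchall()

print(only_in_table1, only_in_table2)
```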
If we combine the two sets, but only want a "distinct" list of "not matched" values, we can use the UNION set operator:
SELECT t1.col1
FROM table1 t1
LEFT
JOIN table2 t2
ON t2.col1 = t1.col1
WHERE t2.col1 IS NULL
UNION
SELECT s2.col1
FROM table2 s2
LEFT
JOIN table1 s1
ON s1.col1 = s2.col1
WHERE s1.col1 IS NULL
Finding out which table the non-matched value is from
Sometimes, we want to know which query returned the value; we can get that by including a literal value as a discriminator in each query.
SELECT 'table1' AS src
, t1.col1
FROM table1 t1
LEFT
JOIN table2 t2
ON t2.col1 = t1.col1
WHERE t2.col1 IS NULL
UNION
SELECT 'table2' AS src
, s2.col1
FROM table2 s2
LEFT
JOIN table1 s1
ON s1.col1 = s2.col1
WHERE s1.col1 IS NULL
ORDER BY 2
A different (usually less performant) approach to finding non-matching rows
An entirely different approach, returning an equivalent result, would be to do something like this:
SELECT q.col1
FROM ( SELECT 't1' AS src, t1.col1 FROM table1 t1
UNION
SELECT 't2' AS src, t2.col1 FROM table2 t2
) q
GROUP BY q.col1
HAVING COUNT(DISTINCT q.src) < 2
ORDER BY q.col1
(The inline view q will be "materialized" as a derived table, so this approach can be expensive for large sets, and this approach won't take advantage of indexes on col1 to perform the matching.) One other small difference between this and the anti-join approach: this will omit a col1 value of NULL if a NULL exists in both tables. Aside from that, the resultset is equivalent.
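The NULL difference between the two approaches can be seen in a small experiment. This is a sketch under assumptions: it uses Python's built-in sqlite3 module rather than MySQL, and the sample values are invented, with a NULL placed in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT);
    CREATE TABLE table2 (col1 TEXT);
    INSERT INTO table1 VALUES ('a'), ('b'), (NULL);
    INSERT INTO table2 VALUES ('b'), ('c'), (NULL);
""")

# Anti-join in both directions, combined with UNION.
anti_join = conn.execute("""
    SELECT t1.col1 FROM table1 t1
    LEFT JOIN table2 t2 ON t2.col1 = t1.col1
    WHERE t2.col1 IS NULL
    UNION
    SELECT s2.col1 FROM table2 s2
    LEFT JOIN table1 s1 ON s1.col1 = s2.col1
    WHERE s1.col1 IS NULL
""").fetchall()

# GROUP BY / HAVING approach over a tagged UNION.
group_by = conn.execute("""
    SELECT q.col1
    FROM ( SELECT 't1' AS src, t1.col1 AS col1 FROM table1 t1
           UNION
           SELECT 't2' AS src, t2.col1 AS col1 FROM table2 t2
         ) q
    GROUP BY q.col1
    HAVING COUNT(DISTINCT q.src) < 2
""").fetchall()

anti_join_values = {row[0] for row in anti_join}
group_by_values = {row[0] for row in group_by}
print(anti_join_values)  # NULL (None) is reported: NULL never equals NULL in the join
print(group_by_values)   # NULL is dropped: GROUP BY puts both NULLs in one group
```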
If I have 2 tables and want to find if they have the same data, what is the most straightforward way to do it in MySQL?
I have read about doing a correlated subquery and UNION ALL, but that query is about 2 pages (!) long and I cannot really follow what it is doing. There must be an easier way.
Even if it is e.g. make MySQL copy the table data to files and do a vimdiff (I am not sure that this is even possible -is it?- just thinking out loud).
UPDATE
I am interested only in the table data and not structure. This is to clarify due to an ambiguous comment I made
If you just want to tell whether the tables are identical or not as efficiently as possible, use this query:
SELECT 1 FROM (
SELECT * FROM table1
UNION ALL
SELECT * FROM table2
) t
GROUP BY col1, col2, col3
HAVING count(*) = 1
LIMIT 1
List all the columns in GROUP BY to compare the entire table.
If the result is an empty set, the two tables are identical.
If you want to see the differences, use this query:
SELECT MIN(tname) AS tname, col1, col2, col3 FROM (
SELECT 'table1' tname, col1, col2, col3 FROM table1
UNION ALL
SELECT 'table2' tname, col1, col2, col3 FROM table2
) t
GROUP BY col1, col2, col3
HAVING count(*) = 1
List the same columns in the inner SELECT as in the GROUP BY, plus a column to distinguish the two tables.
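A quick way to try the identical-or-not check is sketched below. It assumes Python's built-in sqlite3 module in place of MySQL, a hypothetical three-column layout, and no duplicate rows within either table (a row that appears twice in one table and never in the other would have a count of 2 and go unnoticed). The helper name first_difference is mine.

```python
import sqlite3

def first_difference(conn):
    """Return one row if table1 and table2 differ, or None if they match."""
    return conn.execute("""
        SELECT 1 FROM (
            SELECT col1, col2, col3 FROM table1
            UNION ALL
            SELECT col1, col2, col3 FROM table2
        ) t
        GROUP BY col1, col2, col3
        HAVING COUNT(*) = 1
        LIMIT 1
    """).fetchone()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT, col3 TEXT);
    INSERT INTO table1 VALUES (1, 'x', 'y'), (2, 'p', 'q');
    INSERT INTO table2 VALUES (1, 'x', 'y'), (2, 'p', 'q');
""")

identical_before = first_difference(conn) is None

# Change one value in table2 so the tables no longer match.
conn.execute("UPDATE table2 SET col3 = 'changed' WHERE col1 = 2")
identical_after = first_difference(conn) is None

print(identical_before, identical_after)
```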
Just throwing this out there, you could emulate a full outer join and then return the rows where just the right or the left side is null.
select t1.*
from table1 t1
LEFT OUTER JOIN table2 t2
ON t1.col1 = t2.col1
AND t1.col2 = t2.col2
AND ...
WHERE t2.id is null
UNION
select t2.*
from table2 t2
LEFT OUTER JOIN table1 t1
ON t2.col1 = t1.col1
AND t2.col2 = t1.col2
AND ...
WHERE t1.id is null
With the emulated FULL OUTER JOIN you can show every row that has no counterpart in the other table.
Use the following query:
SELECT c1 = cjoin AND c2 = cjoin equiv
FROM (SELECT COUNT(*) c1 FROM Table1) t1,
(SELECT COUNT(*) c2 FROM Table2) t2,
(SELECT COUNT(*) cjoin
FROM Table1 t1
JOIN Table2 t2
ON t1.col1 = t2.col1 AND t1.col2 = t2.col2 AND t1.col3 = t2.col3 ...) tjoin
Assuming the tables have a unique key, this will return equiv = 1 if the tables are equal. It doesn't show the differences, it's just a binary test.
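Here is a sketch of that binary test on toy data. It assumes Python's built-in sqlite3 module rather than MySQL, two hypothetical columns with col1 as the unique key, and slightly different alias names to keep the subqueries apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (col1 INTEGER PRIMARY KEY, col2 TEXT);
    CREATE TABLE Table2 (col1 INTEGER PRIMARY KEY, col2 TEXT);
    INSERT INTO Table1 VALUES (1, 'x'), (2, 'y');
    INSERT INTO Table2 VALUES (1, 'x'), (2, 'y');
""")

# Equal only if both row counts match the number of rows that join on all columns.
EQUIV = """
    SELECT c1 = cjoin AND c2 = cjoin AS equiv
    FROM (SELECT COUNT(*) AS c1 FROM Table1) n1,
         (SELECT COUNT(*) AS c2 FROM Table2) n2,
         (SELECT COUNT(*) AS cjoin
          FROM Table1 a
          JOIN Table2 b
            ON a.col1 = b.col1 AND a.col2 = b.col2) tjoin
"""

equiv_same = conn.execute(EQUIV).fetchone()[0]

# Add a row that exists only in Table2 and test again.
conn.execute("INSERT INTO Table2 VALUES (3, 'z')")
equiv_different = conn.execute(EQUIV).fetchone()[0]

print(equiv_same, equiv_different)
```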
I was reading SQL Cookbook from A.Molinaro, when I came across a solution.
It is based on a table
emp(empno,ename,job,mgr,hiredate,sal,comm,deptno)
and a view
V
which has the same columns but different rows. The columns mgr and comm might be NULL, other columns not.
The solution in the book is very long and it does not show all differences, although this was the stated problem in 3.7.
I came up with my own solution, which is shorter and shows all differences (meaning all rows which have different counts in the two tables).
select * from
# those which are contained in the (distinct) union of (col1,col2,...,coln, count) of both tables:
( select empno,ename,job,mgr,hiredate,comm,deptno, count(*) cnt from emp group by empno,ename,job,mgr,hiredate,comm,deptno
union
select empno,ename,job,mgr,hiredate,comm,deptno, count(*) cnt from V group by empno,ename,job,mgr,hiredate,comm,deptno
) as unionOfBoth
where (empno,ename,job,mgr,hiredate,comm,deptno,cnt)
not in
# those which are contained in the intersection of both tables with the equal number of counts:
( select e.empno,e.ename,e.job,e.mgr,e.hiredate,e.comm,e.deptno,e.cnt
from
(select empno, ename,job,mgr,hiredate,comm,deptno, count(*) cnt from emp group by empno,ename,job,mgr,hiredate,comm,deptno) e,
(select empno, ename,job,mgr,hiredate,comm,deptno, count(*) cnt from V group by empno,ename,job,mgr,hiredate,comm,deptno) v
where
e.empno = v.empno
and e.ename = v.ename
and e.job = v.job
and ifnull(e.mgr,0) = ifnull(v.mgr,0)
and e.hiredate = v.hiredate
and e.deptno = v.deptno
and ifnull(e.comm,0) = ifnull(v.comm,0)
and e.cnt = v.cnt
);
Basically, you count the distinct rows in both tables and do a union (not union all) to get the temporary table unionOfBoth. Then you remove those rows which both tables have in common.
Here two rows r1 from table t1 and r2 from table t2 are considered the same, if
(r1,count of r1 in t1) = (r2, count of r2 in t2), which is equivalent to r1=r2 (on all columns) and (count of r1 in t1) = (count of r2 in t2).
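The (row, count) idea can be illustrated outside SQL. The following is only a sketch of the same logic in plain Python, with two made-up row lists standing in for emp and V; it is not the query above.

```python
from collections import Counter

# Hypothetical rows; a tuple stands in for a full row of columns.
t1 = [("SMITH", 20), ("SMITH", 20), ("ALLEN", 30)]
t2 = [("SMITH", 20), ("ALLEN", 30), ("WARD", 30)]

# Pair each distinct row with how often it occurs in its table.
pairs_t1 = set(Counter(t1).items())
pairs_t2 = set(Counter(t2).items())

# Union of both sets of pairs, minus the pairs both tables have in common.
differences = (pairs_t1 | pairs_t2) - (pairs_t1 & pairs_t2)
print(differences)
```

ALLEN occurs once in both lists, so it drops out; SMITH shows up twice in the result because its counts differ (2 versus 1), and WARD shows up because it exists only in the second list.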
If the tables are small enough, you can export both tables as csv files and then copy one of the tables and paste them side-by-side with the other table. You can just go row by row and see if the outputs are the same that way.
I don't know why I am confused with this query.
I have two tables: Table A with 900 records and Table B with 800 records. Both tables need to contain the same data, but there is some mismatch.
I need to write a MySQL query to insert the missing 100 records from Table A into Table B.
In the end, both Table A and Table B should be identical.
I do not want to truncate all the entries first and then do an insert from the other table. Any help is appreciated.
Thank you.
It is also possible to use LEFT OUTER JOIN for that. This avoids the subquery overhead (the system might execute the subquery once for each record of the outer query) of John Woo's answer, and avoids the unnecessary work of overwriting the 800 already existing records, as in user2340435's answer:
INSERT INTO b
SELECT a.* FROM a
LEFT OUTER JOIN b ON b.id = a.id
WHERE b.id IS NULL;
This will first select all rows from A and B tables including all columns from both tables, but for rows which exist in A and don't exist in B all columns for B table will be NULL.
Then it filters only those latter rows (WHERE b.id IS NULL),
and at last it inserts all these rows into B table.
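Scaled down to three rows in A and two in B, the three steps can be tried like this. It is a sketch that assumes Python's built-in sqlite3 module instead of MySQL, and a hypothetical (id, name) layout for both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO b VALUES (1, 'x'), (2, 'y');
""")

rows_before = conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]

# Join a to b, keep only the a rows with no partner in b, insert those into b.
conn.execute("""
    INSERT INTO b
    SELECT a.* FROM a
    LEFT OUTER JOIN b ON b.id = a.id
    WHERE b.id IS NULL
""")

rows_after = conn.execute("SELECT id, name FROM b ORDER BY id").fetchall()
print(rows_before, rows_after)
```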
I think you can use IN for this. (This is a simplification of your query.)
INSERT INTO table2 (id, name)
SELECT id, name
FROM table1
WHERE (id,name) NOT IN
(SELECT id, name
FROM table2);
SQLFiddle Demo
As you can see in the demonstration, table2 had only 1 record, but after executing the query, 2 records were inserted into table2.
If it's mysql and the tables are identical, then this should work:
REPLACE INTO table1 SELECT * FROM table2;
This will insert the missing records into Table1
INSERT INTO Table2
(Col1, Col2....)
(
SELECT Col1, Col2,... FROM Table1
EXCEPT
SELECT Col1, Col2,... FROM Table2
)
You can then run an update query to match the records that differ.
UPDATE T2
SET
Col1 = T1.Col1,
Col2 = T1.Col2
FROM
Table1 T1
INNER JOIN
Table2 T2
ON
T1.Col1 = T2.Col1
Code also works when a group by and having clauses are used. Tested SQL 2012 (11.0.5058) Tab1 is source with new records, Tab 2 is the destination to be updated. Tab 2 also has an Identity column. (Yes folks, real world is not as neat and clean as the lab assignments)
INSERT INTO Tab2
SELECT a.T1,a.T2,a.T3,a.T4,a.Val1,a.Val2,a.Val3,a.Val4,-9,-9,-9,-9,MIN(hits) MinHit,MAX(hits) MaxHit,SUM(count) SumCnt, count(distinct(week)) WkCnt
FROM Tab1 a
LEFT OUTER JOIN Tab2 b ON b.t1 = a.t1 and b.t2 = a.t2 and b.t3 = a.t3 and b.t4 = a.t4 and b.val1 = a.val1 and b.val2 = a.val2 and b.val3 = a.val3 and b.val4 = a.val4
WHERE b.t1 IS NULL or b.Val1 is NULL
group by a.T1,a.T2,a.T3,a.T4,a.Val1,a.Val2,a.Val3,a.Val4 having MAX(returns)<4 and COUNT(distinct(week))>2 ;
I have two tables, Table1 and Table2. There are 10 fields in Table1 and 9 fields in Table2. There is one common column in both tables, AdateTime, which stores the Unix timestamp of user actions. I want to display records from both tables as a single result, sorted by AdateTime, with the most recent action displayed first. Sometimes there are many recent actions in Table1 but few in Table2, and vice versa. So I want to fetch a combined result set from both tables using a single query. I am using PHP and MySQL.
Try
SELECT t1.*, t2.*
FROM table1 t1 INNER JOIN table2 t2
ON t1.AdateTime = t2.AdateTime
ORDER BY t1.AdateTime
or (if tables are not related)
SELECT * FROM
(SELECT ADateTime, col1, col2, col3, col4 FROM table1
UNION
SELECT ADateTime, col1, col2, 1 AS col3, NULL AS col4 FROM table2) t2
ORDER BY ADateTime DESC
I would use UNION ALL with an inline view. So something like
select col1,col2,col3,col4,col5,col6,col7,col8,col9,AdateTime
from
(
select col1,col2,col3,col4,col5,col6,col7,col8,col9,AdateTime from Table1
UNION ALL
select col1,col2,col3,col4,col5,col6,col7,col8,null as col9,AdateTime from Table2
) t
order by t.Adatetime desc;
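Here is a cut-down sketch of that UNION ALL query. It assumes Python's built-in sqlite3 module rather than MySQL, and hypothetical columns: col1 exists in both tables, extra exists only in Table1 and is padded with NULL for Table2. The src tag column is also my addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (col1 TEXT, extra TEXT, AdateTime INTEGER);
    CREATE TABLE Table2 (col1 TEXT, AdateTime INTEGER);
    INSERT INTO Table1 VALUES ('login', 'web', 100), ('post', 'web', 300);
    INSERT INTO Table2 VALUES ('like', 200), ('share', 400);
""")

# Both branches must return the same number of columns, so Table2 pads
# the column it lacks with NULL. The outer query sorts newest first.
combined = conn.execute("""
    SELECT src, col1, extra, AdateTime
    FROM (
        SELECT 'Table1' AS src, col1, extra, AdateTime FROM Table1
        UNION ALL
        SELECT 'Table2' AS src, col1, NULL AS extra, AdateTime FROM Table2
    ) t
    ORDER BY t.AdateTime DESC
""").fetchall()

for row in combined:
    print(row)
```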
Yes, you can do it. You just need to join these two tables with a join condition. Only the rows for which the join condition matches will be displayed; you can then write the code for any further operation. Use ORDER BY AdateTime:
select t1.column_1234,t2.column_1234
from table1 t1, table2 t2
where t1.matching_column = t2.matching_column
order by t1.AdateTime;
t1.matching_column And t2.matching_column are the Primary And Foreign keys for these tables (Matching Column)
Say I've two tables - "Table1" and "Table2" in my MySQL database.
"id" primary key (auto_increment) in "Table1" is the reference key in "Table2" - "tab_id".
There could be zero or more "Table2" rows for one "Table1" row.
Now I'm trying to do a search on one of the columns in "Table2", say the "email" column, OR on one of the columns in "Table1", say "address", and print the "Table1" row values.
I see there are 3 possibilities:
1. Join
2. Sub-Query
3. Union
1 Join
SELECT *
FROM Table1 t1, Table2 t2
WHERE t1.id = t2.tab_id
AND (t1.address like '%str%' OR t2.email like '%str%');
-- This works fine, but when there are no rows in "Table2" relevant to "Table1", the JOIN will fail, hence the output is inconsistent.
2 Sub-Query
SELECT *
FROM Table1 t1
WHERE t1.address like '%str%'
OR t1.id IN (SELECT t2.tab_id
FROM Table2 t2
WHERE t2.email like '%str%');
-- This works fine, but when there are too many rows in "Table2" (say 5K) the query becomes very slow :(
3 Union
SELECT 'relevant_columns'
FROM Table1 t1, Table2 t2
WHERE t1.id = t2.tab_id
AND (t1.address like '%str%' OR t2.email like '%str%')
UNION
SELECT 'relevant_columns'
FROM Table1 t1
WHERE t1.address like '%str%'
ORDER BY relevant_column
-- This works fine, may be create a view with a similar UNION, does the job.
Now, my question what is the correct way ... is it okay to call a UNION always?
MySQL Engine: MyISAM
SELECT *
FROM Table1 t1
LEFT JOIN Table2 t2 ON t2.tab_id = t1.id
WHERE t1.address like '%str%'
OR t2.email like '%str%';
You need to do a LEFT JOIN. When you list two tables in the FROM clause as you did, it works as an INNER JOIN (or a CROSS JOIN if there is no WHERE clause), which means that the output shows only rows that have a match in both tables. With LEFT JOIN you say that you want all rows from the left table (t1) together with the matched rows from the right table (t2). If there is no match in t2, NULL is used.
You can use sub-query, but as you can see it is not the best choice
A UNION here does not give you any advantage. A UNION is useful for merging datasets that have the same columns.
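The difference between the comma-style join and the LEFT JOIN can be seen on a two-row example. This is a sketch assuming Python's built-in sqlite3 module instead of MySQL, with made-up addresses and emails; Table1 row 1 deliberately has no Table2 row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE Table2 (tab_id INTEGER, email TEXT);
    INSERT INTO Table1 VALUES (1, '12 str lane'), (2, 'elsewhere');
    INSERT INTO Table2 VALUES (2, 'str@example.com');
""")

# Comma join with a WHERE condition: behaves like an INNER JOIN.
inner_ids = [row[0] for row in conn.execute("""
    SELECT t1.id
    FROM Table1 t1, Table2 t2
    WHERE t1.id = t2.tab_id
      AND (t1.address LIKE '%str%' OR t2.email LIKE '%str%')
    ORDER BY t1.id
""")]

# LEFT JOIN: Table1 rows without a Table2 partner are kept (email is NULL).
left_ids = [row[0] for row in conn.execute("""
    SELECT t1.id
    FROM Table1 t1
    LEFT JOIN Table2 t2 ON t2.tab_id = t1.id
    WHERE t1.address LIKE '%str%' OR t2.email LIKE '%str%'
    ORDER BY t1.id
""")]

print(inner_ids)  # row 1 is lost: it has no Table2 row to join with
print(left_ids)   # row 1 is found through its address
```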
Edit
If you have issues with JOIN because some Table1 rows do not appear, then you need a LEFT JOIN. The fact that the query takes a long time is a separate problem. Those tables are not big at all, so I guess you need to do some index work on those tables.
If you want help about the union you need to tell me which are those relevant_columns, because they must have the same number of columns, same type and same sequence.
You might optimize the union without joins, depending on what you want to output when t2.email has a match. Here is an example
SELECT t1.id, t1.address, null as email
FROM Table1 t1
WHERE t1.address like '%str%'
union
SELECT t2.tab_id as id, null as address, t2.email
FROM Table2 t2
WHERE t2.email like '%str%';
SELECT *
FROM Table1 t1
LEFT JOIN Table2 t2
ON t1.id = t2.tab_id
WHERE t1.address like '%str%' OR t2.email like '%str%';