Using SQL Server 2008, suppose I have several tables with 3 common columns (not related):
TABLE1
col1 colSomeOther col2 colAnotherOne
TABLE2
col1 colSomeOther col2 colAnotherOne
TABLE3
col1 colSomeOther col2 colAnotherOne
I would like to create a view which merges col1 and col2 from the 3 tables above. Something like:
VIEW
col1 col2
where col1 contains ALL elements from table 1, 2 and 3, and col2 contains ALL elements from col2 in table 1, 2, 3.
Is this possible?
Yep. This is a "union"; multiple result sets of the same "signature" (number and type of data columns), concatenated one after the other. The query to do this is as simple as:
SELECT col1, col2 FROM TABLE1
UNION ALL
SELECT col1, col2 FROM TABLE2
UNION ALL
SELECT col1, col2 FROM TABLE3
If you want the query to "de-duplicate" the results, returning only unique rows, omit the "ALL" keywords from the unions. With the ALL keywords, it simply tacks on the results of each SELECT to the combined result set, including rows from Table2 that may have exactly the same data as Table1.
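To make the difference between UNION ALL and UNION concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for SQL Server, purely so the example is self-contained; the view name MergedView and the sample rows are invented for illustration and are not from the question.

```python
import sqlite3

def union_demo():
    """Return (rows from a UNION ALL view, rows from a plain UNION)."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
    conn.executescript("""
        CREATE TABLE TABLE1 (col1 INTEGER, colSomeOther TEXT, col2 TEXT, colAnotherOne TEXT);
        CREATE TABLE TABLE2 (col1 INTEGER, colSomeOther TEXT, col2 TEXT, colAnotherOne TEXT);
        CREATE TABLE TABLE3 (col1 INTEGER, colSomeOther TEXT, col2 TEXT, colAnotherOne TEXT);
        INSERT INTO TABLE1 VALUES (1, 'x', 'a', 'p'), (2, 'x', 'b', 'p');
        INSERT INTO TABLE2 VALUES (1, 'y', 'a', 'q');  -- same (col1, col2) as a TABLE1 row
        INSERT INTO TABLE3 VALUES (3, 'z', 'c', 'r');

        -- Hypothetical view name; keeps every row, duplicates included.
        CREATE VIEW MergedView AS
            SELECT col1, col2 FROM TABLE1
            UNION ALL
            SELECT col1, col2 FROM TABLE2
            UNION ALL
            SELECT col1, col2 FROM TABLE3;
    """)
    all_rows = conn.execute("SELECT col1, col2 FROM MergedView").fetchall()
    # Plain UNION removes duplicate (col1, col2) pairs.
    unique_rows = conn.execute("""
        SELECT col1, col2 FROM TABLE1
        UNION
        SELECT col1, col2 FROM TABLE2
        UNION
        SELECT col1, col2 FROM TABLE3
    """).fetchall()
    conn.close()
    return all_rows, unique_rows
```

With this sample data the view returns four rows, while the plain UNION returns three, because the pair (1, 'a') appears in both TABLE1 and TABLE2.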
I think you are asking for a UNION:
select col1, col2 from table1
UNION ALL
select col1, col2 from table2
UNION ALL
select col1, col2 from table3
Should work as long as col1 and col2 have compatible data types across all three tables.
If you want to eliminate duplicate rows then use UNION instead of UNION ALL.
Related
I have 22 tables where each has many columns. I want to select 10 columns conditioning on 4 column values using WHERE. For this, I have to repeat these 4 conditions and 10 columns for all 22 tables, which is inconvenient. Is there a more efficient way to do this?
If the 22 tables have the same structure and the conditions on the 4 columns are equal or similar, you can UNION the tables in a subquery and apply a single WHERE outside it.
Example:
select x.*
from
(
select col1, col2, col3, col4
from tab1
union all
select col1, col2, col3, col4
from tab2
union all
select col1, col2, col3, col4
from tab3
) x
where x.col1 = 'value'
and x.col2 = 'other value'
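The pattern above can be tried end to end with a small sketch. It assumes three tables instead of 22, uses Python's sqlite3 module as a stand-in engine, and the sample rows and the function name filter_once are made up for illustration.

```python
import sqlite3

def filter_once(value1, value2):
    """Union three same-shaped tables, then filter once on the combined rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tab1 (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
        CREATE TABLE tab2 (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
        CREATE TABLE tab3 (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
        INSERT INTO tab1 VALUES ('value', 'other value', 'a', 'b'),
                                ('value', 'nope',        'c', 'd');
        INSERT INTO tab2 VALUES ('value', 'other value', 'e', 'f');
        INSERT INTO tab3 VALUES ('x',     'other value', 'g', 'h');
    """)
    rows = conn.execute("""
        SELECT x.*
        FROM (
            SELECT col1, col2, col3, col4 FROM tab1
            UNION ALL
            SELECT col1, col2, col3, col4 FROM tab2
            UNION ALL
            SELECT col1, col2, col3, col4 FROM tab3
        ) AS x
        WHERE x.col1 = ?      -- the conditions are written only once
          AND x.col2 = ?
    """, (value1, value2)).fetchall()
    conn.close()
    return rows
```

Whether the engine pushes the outer WHERE down into each branch depends on its optimizer, so it is worth checking the query plan on large tables.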
I was having trouble determining a possible unique key in a poorly defined table. The table had 5000 rows. I selected distinct on the fields I thought might be a unique key.
select count(distinct col1, col2)
from tab1;
The result was 4980 records. Then I checked the 20 records and found that the values for col2 were null, but adding col3 should give me uniqueness.
select count(distinct col1, col2, col3)
from tab1;
The result was still 4980. What the? So I changed the query to this.
select col1, col2, col3, count(*)
from tab1
group by col1, col2, col3
having count(*) > 1;
With this I got zero rows, so col1, col2, and col3 are unique. So what was wrong with the first three column query? I tried this.
select count(distinct col1, coalesce(col2, ''), col3)
from tab1;
This returned 5000 records.
It is likely that the multiple fields are being concatenated together in one field in the engine, and concatenating col1, NULL, col3 is resulting in NULL and that is why it is acting this way. But, the result seems to break the NULL standards that MySQL seems to want to follow. Is this a MySQL bug?
The manual specifically says that COUNT(DISTINCT expr [,expr...])
Returns a count of the number of rows with different non-NULL expr values.
which is the behaviour you are seeing.
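If the goal is to count distinct combinations including those that contain NULL, one option is to count the rows of a SELECT DISTINCT derived table instead. The sketch below is only an illustration: it runs on SQLite through Python's sqlite3 module, not on MySQL, and as far as I know SQLite does not accept the multi-column COUNT(DISTINCT ...) form, so only the single-column NULL skipping and the derived-table alternative are shown. The sample rows are invented.

```python
import sqlite3

def distinct_counts():
    """Return (distinct non-NULL col2 values, distinct row combinations, duplicate groups)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tab1 (col1 INTEGER, col2 TEXT, col3 INTEGER);
        INSERT INTO tab1 VALUES
            (1, 'a',  10),
            (2, 'b',  20),
            (3, NULL, 30),
            (3, NULL, 31),
            (4, NULL, 40);
    """)
    # COUNT(DISTINCT expr) ignores NULLs, so the three NULL col2 rows are not counted.
    non_null = conn.execute(
        "SELECT COUNT(DISTINCT col2) FROM tab1").fetchone()[0]
    # SELECT DISTINCT treats NULLs as equal to each other and keeps those rows,
    # so counting its output includes combinations that contain NULL.
    combos = conn.execute(
        "SELECT COUNT(*) FROM (SELECT DISTINCT col1, col2, col3 FROM tab1) AS d"
    ).fetchone()[0]
    # The GROUP BY / HAVING check from the question: no output means the key is unique.
    dupes = conn.execute("""
        SELECT col1, col2, col3, COUNT(*)
        FROM tab1
        GROUP BY col1, col2, col3
        HAVING COUNT(*) > 1
    """).fetchall()
    conn.close()
    return non_null, combos, dupes
```

Unlike the coalesce(col2, '') workaround, the derived-table count does not merge NULL with a real empty string.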
I have two tables with different columns. The tables don't have an id column. They have the same number of rows. I want to merge them into a new table. I've tried to do it like this:
CREATE TABLE test_3_cut_dest as
SELECT * FROM test_3_cut_
UNION
SELECT * FROM test_3_cut
but got error:
each UNION query must have the same number of columns
How can I merge two tables with different numbers of columns without specifying the list of columns?
I don't think SELECT * can work here, because the two tables have different numbers of columns. The best way to do it is to list the columns and pad the missing ones:
SELECT A, B, C, D, E, F, G, H FROM test_3_cut_
UNION
SELECT A, B, NULL AS C, NULL AS D, NULL AS E, NULL AS ... FROM test_3_cut
I've written it this way assuming test_3_cut has fewer columns than test_3_cut_.
Union requires that all participating selects yield matching record sets, meaning same number of columns and compatible data types for each column.
As for your question, you can use placeholders wherever there is a mismatch. For instance:
SELECT Col1, Col2, 'DUMMYVALUE'
FROM Table1
UNION ALL
SELECT Col1, Col2, Col3
FROM Table2 ;
Now, if table1 has 3 textual columns and table2 has 6 textual columns, you can go with:
SELECT * , 'DUMMYVALUE1','DUMMYVALUE2','DUMMYVALUE3'
FROM Table1
UNION ALL
SELECT *
FROM Table2 ;
Union works like this:
Select column1 from table1
union
select column1 from table2
The number of columns must be the same in both SELECT queries.
You have to select the same set of columns from the two tables. If one of the tables has more columns, you can either select only the columns they share or select a fake value for the missing column from the table that has fewer.
Example
Table1
col1 | col2 | col3
Table2
col1 | col2
You can do this
select col1, col2 from Table1
union all
select col1, col2 from Table2
or this
select col1, col2, col3 from Table1
union all
select col1, col2, '' from Table2
I know this topic is already covered in other threads, but my problem is that I could not use a plain UNION (table 1 has 60 columns; table 2 has only 7). Is there another way than creating 53 empty columns for table 2?
Is it possible to generate the result in one query?
Thank you!
You can do this by replacing the nonexistent columns with NULLs, as below:
Select Col1, Col2, Col3, Col4, Col5 from Table1
Union
Select Col1, Col2, Col3, Null as Col4, Null as Col5 from Table2
Wherever a column does not exist in one of the tables, select NULL in its place.
It is possible, since you can add any number of arbitrary columns in a select:
select field1, field2, field3 from table1
union
select field4, null, field5 from table2
In the above example I used a constant null value as the 2nd field, but you can choose any value befitting the data type of the existing column in the other table.
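A small runnable sketch of the NULL-padding idea follows. It assumes a three-column Table1 and a two-column Table2, runs on SQLite via Python's sqlite3 module rather than on the asker's engine, and uses invented sample rows.

```python
import sqlite3

def padded_union():
    """Union a 3-column table with a 2-column table by padding the missing column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Table1 (col1 INTEGER, col2 TEXT, col3 TEXT);
        CREATE TABLE Table2 (col1 INTEGER, col2 TEXT);
        INSERT INTO Table1 VALUES (1, 'a', 'extra');
        INSERT INTO Table2 VALUES (2, 'b');
    """)
    rows = conn.execute("""
        SELECT col1, col2, col3 FROM Table1
        UNION ALL
        SELECT col1, col2, NULL AS col3 FROM Table2   -- placeholder for the missing column
    """).fetchall()
    conn.close()
    return rows
```

Rows coming from Table2 carry None (SQL NULL) in the third position. On stricter engines an untyped NULL sometimes needs an explicit cast to the other branch's column type.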
In a stored procedure, I need to INSERT the result of a long UNION into a temp table.
The WHERE clause is the same for all tables: an IN over a SELECT DISTINCT.
Simplified for readability, it goes like this:
INSERT INTO #MyTemp
SELECT col1, col2, col3 FROM tab1 WHERE col1 in (SELECT DISTINCT myId FROM TabIds) UNION
SELECT col1, col2, col3 FROM tab2 WHERE col1 in (SELECT DISTINCT myId FROM TabIds) UNION
SELECT col1, col2, col3 FROM tab3 WHERE col1 in (SELECT DISTINCT myId FROM TabIds) UNION
.
.
.
SELECT col1, col2, col3 FROM tab20 WHERE col1 in (SELECT DISTINCT myId FROM TabIds)
Although TabIds is a small temp table, typically 3-6 records long, this seems to be pretty inefficient.
Is there a better way to do this?
Summarizing my question:
Is there a way I can do SELECT DISTINCT myId FROM TabIds just once and assign it to a kind of array/list/set (not to another temp table) and just use that in the WHERE clauses, and if there is a way, does it really matter for such a small (3-6 recs) temp table?
I'm ignoring your requirement ("not to another temp table") because I don't believe it is well-founded. Try and see if this solution gives you better performance:
SELECT i = myId
INTO #x
FROM dbo.TabIds -- please always use schema prefix
GROUP BY myId;
CREATE UNIQUE CLUSTERED INDEX x ON #x(i);
INSERT INTO #MyTemp(col1, col2, col3)
SELECT col1, col2, col3
FROM
(
SELECT col1, col2, col3 FROM dbo.tab1 WHERE EXISTS -- likely better than IN
(SELECT 1 FROM #x WHERE i = tab1.col1)
UNION ALL
SELECT col1, col2, col3 FROM dbo.tab2 WHERE EXISTS
(SELECT 1 FROM #x WHERE i = tab2.col1)
UNION ALL
...
UNION ALL
SELECT col1, col2, col3 FROM dbo.tab20 WHERE EXISTS
(SELECT 1 FROM #x WHERE i = tab20.col1)
) AS x
GROUP BY col1, col2, col3; -- likely more efficient than `UNION` to remove dupes
Of course this will work best if col1 is indexed in all 20 tables, and if that index includes col2 and col3.
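The shape of that solution can be exercised on a small scale. The sketch below is an approximation only: it runs on SQLite through Python's sqlite3 module, so the T-SQL temp tables #x and #MyTemp become ordinary TEMP tables named x and MyTemp, the unique clustered index becomes an INTEGER PRIMARY KEY, only two source tables are used, and the sample data is invented. It says nothing about SQL Server performance.

```python
import sqlite3

def load_matching_rows():
    """Materialize the distinct ids once, filter each branch with EXISTS,
    and de-duplicate the combined rows with GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TabIds (myId INTEGER);
        INSERT INTO TabIds VALUES (1), (1), (3);
        CREATE TABLE tab1 (col1 INTEGER, col2 TEXT, col3 TEXT);
        CREATE TABLE tab2 (col1 INTEGER, col2 TEXT, col3 TEXT);
        INSERT INTO tab1 VALUES (1, 'a', 'b'), (2, 'c', 'd');
        INSERT INTO tab2 VALUES (1, 'a', 'b'), (3, 'e', 'f');

        -- De-duplicated ids with a unique key (plays the role of #x).
        CREATE TEMP TABLE x (i INTEGER PRIMARY KEY);
        INSERT INTO x (i) SELECT myId FROM TabIds GROUP BY myId;

        CREATE TEMP TABLE MyTemp (col1 INTEGER, col2 TEXT, col3 TEXT);
        INSERT INTO MyTemp (col1, col2, col3)
        SELECT col1, col2, col3
        FROM (
            SELECT col1, col2, col3 FROM tab1
            WHERE EXISTS (SELECT 1 FROM x WHERE i = tab1.col1)
            UNION ALL
            SELECT col1, col2, col3 FROM tab2
            WHERE EXISTS (SELECT 1 FROM x WHERE i = tab2.col1)
        ) AS u
        GROUP BY col1, col2, col3;   -- removes the duplicate (1, 'a', 'b')
    """)
    rows = conn.execute(
        "SELECT col1, col2, col3 FROM MyTemp ORDER BY col1").fetchall()
    conn.close()
    return rows
```

The row (1, 'a', 'b') exists in both source tables but lands in MyTemp once, and (2, 'c', 'd') is filtered out because 2 is not in TabIds.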
The reason I suggested a view is not because I thought it would make this code run faster. Just that you could create a view that generates this UNION for you, making this code simpler (and any other code that repeats this monotonous UNION). It was a suggestion for convenience, not for performance - though I need to make it clear that using a view does not magically make things slower. Sometimes it can, but that's a dangerous and illogical reason to avoid views.
Finally, I'd strongly consider normalization. Why are these 20 different tables in the first place, when they could all be in one single table?
CREATE TABLE dbo.Normal
(
SourceTableID INT,
col1 <data type>,
col2 <data type>,
col3 <data type>
);
-- indexes / constraints
INSERT dbo.Normal
SELECT 1, col1, col2, col3 FROM dbo.tab1
UNION ALL
SELECT 2, col1, col2, col3 FROM dbo.tab2
UNION ALL
...
UNION ALL
SELECT 20, col1, col2, col3 FROM dbo.tab20;
Now all your queries can simply reference this new table. If you will commonly look for only one of the sources (e.g. tab5), then indexing or partitioning on SourceTableID would be useful.
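As a rough illustration of the consolidated table, here is a sketch using three source tables and SQLite through Python's sqlite3 module. The column types, the index name IX_Normal_Source, and the sample rows are assumptions made for the example (the original leaves the data types open), and the schema prefix dbo. is dropped because SQLite has no equivalent.

```python
import sqlite3

def build_normalized():
    """Load several same-shaped tables into one table tagged with its source."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tab1 (col1 INTEGER, col2 TEXT, col3 TEXT);
        CREATE TABLE tab2 (col1 INTEGER, col2 TEXT, col3 TEXT);
        CREATE TABLE tab3 (col1 INTEGER, col2 TEXT, col3 TEXT);
        INSERT INTO tab1 VALUES (10, 'a', 'x'), (11, 'a2', 'x2');
        INSERT INTO tab2 VALUES (20, 'b', 'y');
        INSERT INTO tab3 VALUES (30, 'c', 'z');

        -- Column types are assumed for the sketch.
        CREATE TABLE Normal (
            SourceTableID INTEGER NOT NULL,
            col1 INTEGER,
            col2 TEXT,
            col3 TEXT
        );
        INSERT INTO Normal (SourceTableID, col1, col2, col3)
        SELECT 1, col1, col2, col3 FROM tab1
        UNION ALL
        SELECT 2, col1, col2, col3 FROM tab2
        UNION ALL
        SELECT 3, col1, col2, col3 FROM tab3;

        -- Hypothetical index name; helps queries that target a single source.
        CREATE INDEX IX_Normal_Source ON Normal (SourceTableID);
    """)
    total = conn.execute("SELECT COUNT(*) FROM Normal").fetchone()[0]
    from_tab2 = conn.execute(
        "SELECT col1, col2, col3 FROM Normal WHERE SourceTableID = ?", (2,)
    ).fetchall()
    conn.close()
    return total, from_tab2
```

Queries that used to repeat one SELECT per table now read from Normal, and the per-source case becomes a simple filter on SourceTableID.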
What you're doing, conceptually, is fine for one-offs and data loads. I hope this isn't part of a bigger pattern in production code, though.
What you're looking for is a Common Table Expression.
My T-SQL is a bit rusty, but with a CTE, your query would go something like:
WITH TabIds_CTE AS (SELECT DISTINCT myId FROM TabIds)
INSERT INTO #MyTemp
SELECT col1, col2, col3 FROM tab1 WHERE col1 IN (SELECT * FROM TabIds_CTE)
UNION ALL ...
I think the following might be better for small tables, but still - it's a horrible idea to leave it like this in some production process :)
INSERT INTO #MyTemp (col1,col2,col3)
select distinct
x.col1,x.col2,x.col3
from (
SELECT col1, col2, col3 FROM tab1 union all
SELECT col1, col2, col3 FROM tab2 union all
SELECT col1, col2, col3 FROM tab3 union all
-- ...
SELECT col1, col2, col3 FROM tab20
) x
join (
SELECT DISTINCT myId FROM TabIds
) y
on x.col1=y.myid