MySQL Query Help - Using != with ANY

I have the following query (MySQL):
SELECT col1, col2 FROM database1.table
WHERE col3 != ANY(SELECT col1 FROM database2.table)
ORDER BY this, that;
And I had hoped this would allow me to select col1 and col2 from a table in database1 where col3 (still in database1) does not equal anything from col1 in a table in database2.
Naturally, this won't work because SELECT col1 FROM database2.table returns more than one row: if col3 is equal to row 1, it is not equal to row 2, so the row is still returned.
Any thoughts on how to do this the right way?

Use NOT IN
SELECT col1, col2 FROM database1.table
WHERE col3 NOT IN(SELECT col1 FROM database2.table)
ORDER BY this, that;
But keep in mind that subqueries are often poorly optimized in MySQL (particularly in older versions), so if there are a lot of records in database1.table this could be slow. A faster way is to use a JOIN; there are a lot of examples on SO.
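To make the JOIN suggestion concrete, here is a minimal sketch of the usual LEFT JOIN ... IS NULL anti-join next to NOT IN. It runs on SQLite through Python's sqlite3 module purely as a stand-in for MySQL, and t1 and t2 are hypothetical names standing for database1.table and database2.table. It also shows one behavioural difference worth knowing: a NULL in the subquery makes NOT IN return no rows at all.

```python
import sqlite3

# SQLite is used only as a convenient stand-in for MySQL.
# t1 and t2 are hypothetical stand-ins for database1.table and database2.table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (col1 TEXT, col2 TEXT, col3 INTEGER);
    CREATE TABLE t2 (col1 INTEGER);
    INSERT INTO t1 VALUES ('a', 'x', 1), ('b', 'y', 2), ('c', 'z', 3);
    INSERT INTO t2 VALUES (1), (2);
""")

not_in_sql = """
    SELECT col1, col2 FROM t1
    WHERE col3 NOT IN (SELECT col1 FROM t2)
    ORDER BY col1
"""
# Anti-join: keep only the t1 rows for which no matching t2 row was found.
anti_join_sql = """
    SELECT t1.col1, t1.col2 FROM t1
    LEFT JOIN t2 ON t2.col1 = t1.col3
    WHERE t2.col1 IS NULL
    ORDER BY t1.col1
"""

not_in_rows = conn.execute(not_in_sql).fetchall()
anti_join_rows = conn.execute(anti_join_sql).fetchall()

# A NULL in the subquery makes NOT IN evaluate to "unknown" for every row.
conn.execute("INSERT INTO t2 VALUES (NULL)")
not_in_rows_with_null = conn.execute(not_in_sql).fetchall()
anti_join_rows_with_null = conn.execute(anti_join_sql).fetchall()

print(not_in_rows, anti_join_rows)
print(not_in_rows_with_null, anti_join_rows_with_null)
```

Whether the join is actually faster depends on the MySQL version and the indexes available, so it is worth comparing both forms with EXPLAIN on real data.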

WHERE col3 NOT IN (SELECT col1 FROM database2.table)

You can use the NOT IN operator for this:
SELECT col1, col2 FROM database1.table
WHERE col3 NOT IN(SELECT col1 FROM database2.table)
ORDER BY this, that;

Just use ALL instead of ANY:
SELECT col1, col2 FROM database1.table
WHERE col3 != ALL(SELECT col1 FROM database2.table)
ORDER BY this, that;
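The difference between the two quantifiers can be sketched outside the database. This is only an illustration in plain Python (the function names are hypothetical and NULL handling is ignored): != ANY is true as soon as one subquery row differs, while != ALL requires every row to differ, which is what NOT IN means.

```python
def not_equal_any(value, candidates):
    """Mimic `value != ANY (subquery)`: true if value differs from at least one row."""
    return any(value != c for c in candidates)


def not_equal_all(value, candidates):
    """Mimic `value != ALL (subquery)`: true only if value differs from every row."""
    return all(value != c for c in candidates)


subquery_rows = [1, 2]   # hypothetical result of SELECT col1 FROM database2.table
col3_values = [1, 2, 3]  # hypothetical col3 values in database1.table

kept_by_any = [v for v in col3_values if not_equal_any(v, subquery_rows)]
kept_by_all = [v for v in col3_values if not_equal_all(v, subquery_rows)]

print(kept_by_any)  # every value differs from at least one subquery row
print(kept_by_all)  # only values absent from the subquery survive
```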

Related

How to select two columns together distinctly in no particular order?

I want to select two columns together distinctly in no particular order in MySQL.
For example, the given table is below -
col1 col2 col3
--------------
a b val1
a c val2
b a val1
b c val3
c a val2
c b val3
I need to distinctly select col1 and col2 in no particular order.
col1 = a AND col2 = b
is equivalent to
col1 = b AND col2 = a
in my case, as col3 value will be same for both combinations of col1 and col2.
Expected result is below -
col1 col2 col3
--------------
a b val1
a c val2
b c val3
I want to eliminate duplicates actually.
Any help you can give would be greatly appreciated.
Thank you in advance.
Use greatest and least functions to create groups:
SELECT col1, col2, col3
FROM (
  SELECT *,
    ROW_NUMBER() OVER(
      PARTITION BY least(col1, col2), greatest(col1, col2)
      ORDER BY col1, col2 -- deterministic: keeps the row with the smaller col1
    ) AS rn
  FROM mytable
) t
WHERE rn = 1
See Demo.
As an alternative that may perform better, you can try the query below, which does not use window functions:
SELECT * FROM mytable M1
WHERE NOT EXISTS (SELECT 1 FROM mytable M2
WHERE M1.col1 = M2.col2
AND M1.col2 = M2.col1
AND M2.col1 < M2.col2)
Because it uses an EXISTS clause, it may perform faster than the query above. Here is the demo for both queries.
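A quick way to check the NOT EXISTS query against the sample data from the question is to run it on an in-memory SQLite database. This is only a sketch: SQLite stands in for MySQL, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

# SQLite stands in for MySQL; the table and rows come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT);
    INSERT INTO mytable VALUES
        ('a', 'b', 'val1'), ('a', 'c', 'val2'), ('b', 'a', 'val1'),
        ('b', 'c', 'val3'), ('c', 'a', 'val2'), ('c', 'b', 'val3');
""")

# Drop a row when its mirrored pair exists and that mirror is the one
# written with col1 < col2, so exactly one row per unordered pair survives.
rows = conn.execute("""
    SELECT col1, col2, col3 FROM mytable AS m1
    WHERE NOT EXISTS (
        SELECT 1 FROM mytable AS m2
        WHERE m1.col1 = m2.col2
          AND m1.col2 = m2.col1
          AND m2.col1 < m2.col2
    )
    ORDER BY col1, col2
""").fetchall()

for row in rows:
    print(row)
```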

Is there an optimized/better way to write this query?

I have the query below and want to know whether it can be rewritten in a better way:
SELECT COL1, COL2 FROM TABLE1 WHERE ID = 1 and COL4 = 1415 AND COL3 IN
(SELECT MAX(COL3) FROM TABLE1 WHERE ID = 1 and COL4 = 1415);
The question arises from the fact that filters ID and Col4 in where clause of subquery are same as the filters in the main query.
You can use:
SELECT COL1, COL2,MAX(COL3) as mx FROM TABLE1 WHERE ID = 1
and COL4 = 1415 having mx=MAX(COL3);
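This avoids the extra subquery. Be aware, though, that it mixes an aggregate with non-aggregated columns without a GROUP BY: MySQL rejects it when ONLY_FULL_GROUP_BY is enabled, and otherwise COL1 and COL2 are not guaranteed to come from the row that holds MAX(COL3). The HAVING condition compares MAX(COL3) with itself, so it filters nothing.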
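One other way to avoid repeating the filters, not taken from the answer above, is to sort by COL3 and keep the first row. The sketch below compares it with the original subquery form on made-up data, using SQLite through Python's sqlite3 as a stand-in for MySQL. Note the assumption: the two forms agree only when a single row holds the maximum, because the subquery form returns every tied row while LIMIT 1 returns just one.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, col1 TEXT, col2 TEXT, col3 INTEGER, col4 INTEGER);
    INSERT INTO table1 VALUES
        (1, 'p', 'q', 10, 1415),
        (1, 'r', 's', 30, 1415),
        (1, 't', 'u', 99, 2000),
        (2, 'v', 'w', 50, 1415);
""")

# Original form: the filters appear in both the outer query and the subquery.
subquery_rows = conn.execute("""
    SELECT col1, col2 FROM table1
    WHERE id = 1 AND col4 = 1415
      AND col3 IN (SELECT MAX(col3) FROM table1 WHERE id = 1 AND col4 = 1415)
""").fetchall()

# Alternative: filter once, sort by col3 descending, keep the top row.
top_row = conn.execute("""
    SELECT col1, col2 FROM table1
    WHERE id = 1 AND col4 = 1415
    ORDER BY col3 DESC
    LIMIT 1
""").fetchall()

print(subquery_rows, top_row)
```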

How to limit the result of UNION ALL query?

I have a query like this:
select col1, col2 from table1 where col1 = ?
union all
select col1, col2 from table2 where col2 = ?
Now I need to limit the result of the above query. If I put a limit clause after the second select, will only the result of the second select be limited, or the result of both selects?
Anyway, which approach is good for limiting the result of union all query?
One:
select col1, col2 from table1 where col1 = ?
union all
select col1, col2 from table2 where col2 = ?
limit ?,10
Two:
select * from
(
select col1, col2 from table1 where col1 = ?
union all
select col1, col2 from table2 where col2 = ?
) x
limit ?,10
According to MySQL manual:
To use an ORDER BY or LIMIT clause to sort or limit the entire UNION
result, parenthesize the individual SELECT statements and place the
ORDER BY or LIMIT after the last one.
Hence, you can use:
(select col1, col2 from table1 where col1 = ?)
union all
(select col1, col2 from table2 where col2 = ?)
LIMIT ?, 10
Using a subquery should also work, but it is unlikely to be more efficient than the query above.
The first is better from a performance perspective. The second materializes the subquery, which is additional overhead.
Note: You are using limit without an order by, so the results may not be consistent from one execution of the query to the next.
You should be using order by, which probably makes it irrelevant which version you use (because the order by needs to read and write the data anyway).
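As a rough check that the limit really applies to the combined result, the sketch below pages through a UNION ALL with an explicit ORDER BY. It uses the subquery form because that form is accepted by SQLite, which stands in for MySQL here; the rows and parameter values are made up.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 INTEGER);
    CREATE TABLE table2 (col1 INTEGER, col2 INTEGER);
    INSERT INTO table1 VALUES (1, 10), (1, 11), (1, 12);
    INSERT INTO table2 VALUES (5, 7), (6, 7), (7, 7);
""")

# Parameters, in order: table1 filter, table2 filter, page size, offset.
page = conn.execute("""
    SELECT col1, col2 FROM (
        SELECT col1, col2 FROM table1 WHERE col1 = ?
        UNION ALL
        SELECT col1, col2 FROM table2 WHERE col2 = ?
    ) AS x
    ORDER BY col1, col2
    LIMIT ? OFFSET ?
""", (1, 7, 4, 1)).fetchall()

# The page contains rows from both branches, so the limit was applied
# to the union as a whole rather than to the second SELECT only.
print(page)
```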

Similar WHERE clause in a long UNION statement in SQL Server 2008 R2

In a stored procedure, I need to INSERT the result of a long UNION into a temp table.
The WHERE clause is the same for all tables: an IN test against a SELECT DISTINCT.
Simplified for readability, it goes like this:
INSERT INTO #MyTemp
SELECT col1, col2, col3 FROM tab1 WHERE col1 in (SELECT DISTINCT myId FROM TabIds) UNION
SELECT col1, col2, col3 FROM tab2 WHERE col1 in (SELECT DISTINCT myId FROM TabIds) UNION
SELECT col1, col2, col3 FROM tab3 WHERE col1 in (SELECT DISTINCT myId FROM TabIds) UNION
.
.
.
SELECT col1, col2, col3 FROM tab20 WHERE col1 in (SELECT DISTINCT myId FROM TabIds)
Although TabIds is a small temp table, typically 3-6 records long, this seems to be pretty inefficient.
Is there a better way to do this?
Summarizing my question:
Is there a way I can do SELECT DISTINCT myId FROM TabIds just once and assign it to a kind of array/list/set (not to another temp table) and just use that in the WHERE clauses, and if there is a way, does it really matter for such a small (3-6 recs) temp table?
I'm ignoring your requirement ("not to another temp table") because I don't believe it is well-founded. Try and see if this solution gives you better performance:
SELECT i = myId
INTO #x
FROM dbo.TabIds -- please always use schema prefix
GROUP BY myId;
CREATE UNIQUE CLUSTERED INDEX x ON #x(i);
INSERT INTO #MyTemp(col1, col2, col3)
SELECT col1, col2, col3
FROM
(
SELECT col1, col2, col3 FROM dbo.tab1 WHERE EXISTS -- likely better than IN
(SELECT 1 FROM #x WHERE i = tab1.col1)
UNION ALL
SELECT col1, col2, col3 FROM dbo.tab2 WHERE EXISTS
(SELECT 1 FROM #x WHERE i = tab2.col1)
UNION ALL
...
UNION ALL
SELECT col1, col2, col3 FROM dbo.tab20 WHERE EXISTS
(SELECT 1 FROM #x WHERE i = tab20.col1)
) AS x
GROUP BY col1, col2, col3; -- likely more efficient than `UNION` to remove dupes
Of course this will work best if col1 is indexed in all 20 tables, and if that index includes col2 and col3.
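To see that the rewrite returns the same rows as the original UNION, here is a small sketch with two tables instead of twenty. It runs on SQLite through Python's sqlite3 rather than SQL Server, so it checks only that the results match and says nothing about performance; the rows are made up and the #x temp table is skipped for brevity.

```python
import sqlite3

# SQLite stands in for SQL Server; only result equivalence is checked here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE tab2 (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE TabIds (myId INTEGER);
    INSERT INTO tab1 VALUES (1, 'a', 'x'), (2, 'b', 'y'), (9, 'z', 'z');
    INSERT INTO tab2 VALUES (1, 'a', 'x'), (3, 'c', 'w');
    INSERT INTO TabIds VALUES (1), (1), (2), (3);
""")

# Original shape: UNION removes duplicates, IN repeats the subquery.
union_rows = conn.execute("""
    SELECT col1, col2, col3 FROM tab1 WHERE col1 IN (SELECT DISTINCT myId FROM TabIds)
    UNION
    SELECT col1, col2, col3 FROM tab2 WHERE col1 IN (SELECT DISTINCT myId FROM TabIds)
    ORDER BY col1
""").fetchall()

# Rewritten shape: UNION ALL + EXISTS, duplicates removed once by GROUP BY.
grouped_rows = conn.execute("""
    SELECT col1, col2, col3 FROM (
        SELECT col1, col2, col3 FROM tab1
        WHERE EXISTS (SELECT 1 FROM TabIds WHERE myId = tab1.col1)
        UNION ALL
        SELECT col1, col2, col3 FROM tab2
        WHERE EXISTS (SELECT 1 FROM TabIds WHERE myId = tab2.col1)
    ) AS x
    GROUP BY col1, col2, col3
    ORDER BY col1
""").fetchall()

print(union_rows == grouped_rows, grouped_rows)
```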
The reason I suggested a view is not because I thought it would make this code run faster. Just that you could create a view that generates this UNION for you, making this code simpler (and any other code that repeats this monotonous UNION). It was a suggestion for convenience, not for performance, though I need to make it clear that using a view does not magically make things slower. Sometimes it can, but that's a dangerous and illogical reason to avoid views.
Finally, I'd strongly consider normalization. Why are these 20 different tables in the first place, when they could all be in one single table?
CREATE TABLE dbo.Normal
(
SourceTableID INT,
col1 <data type>,
col2 <data type>,
col3 <data type>
);
-- indexes / constraints
INSERT dbo.Normal
SELECT 1, col1, col2, col3 FROM dbo.tab1
UNION ALL
SELECT 2, col1, col2, col3 FROM dbo.tab2
UNION ALL
...
UNION ALL
SELECT 20, col1, col2, col3 FROM dbo.tab20;
Now all your queries can simply reference this new table. If you will commonly look for only one of the sources (e.g. tab5), then indexing or partitioning on SourceTableID would be useful.
What you're doing, conceptually, is fine for one-offs and data loads. I hope this isn't part of a bigger pattern in production code, though.
What you're looking for is a Common Table Expression.
My T-SQL is a bit rusty, but with a CTE, your query would go something like:
WITH TabIds_CTE AS (SELECT DISTINCT myId FROM TabIds)
INSERT INTO #MyTemp
SELECT col1, col2, col3 FROM tab1 WHERE col1 IN (SELECT * FROM TabIds_CTE)
UNION ALL ...
I think the following might be better for small tables, but still, it's a horrible idea to leave it like this in some production process :)
INSERT INTO #MyTemp (col1,col2,col3)
select distinct
x.col1,x.col2,x.col3
from (
SELECT col1, col2, col3 FROM tab1 union all
SELECT col1, col2, col3 FROM tab2 union all
SELECT col1, col2, col3 FROM tab3 union all
-- ...
SELECT col1, col2, col3 FROM tab20
) x
join (
SELECT DISTINCT myId FROM TabIds
) y
on x.col1=y.myid

"Merging" columns from several tables

Using SQL Server 2008, suppose I have several tables with 3 common columns (not related):
TABLE1
col1 colSomeOther col2 colAnotherOne
TABLE2
col1 colSomeOther col2 colAnotherOne
TABLE3
col1 colSomeOther col2 colAnotherOne
I would like to create a view which merges col1 and 2 for the 3 tables above. Something like:
VIEW
col1 col2
where col1 contains ALL elements from table 1, 2 and 3, and col2 contains ALL elements from col2 in table 1, 2, 3.
Is this possible?
Yep. This is a "union"; multiple result sets of the same "signature" (number and type of data columns), concatenated one after the other. The query to do this is as simple as:
SELECT col1, col2 FROM TABLE1
UNION ALL
SELECT col1, col2 FROM TABLE2
UNION ALL
SELECT col1, col2 FROM TABLE3
If you want the query to "de-duplicate" the results, returning only unique rows, omit the "ALL" keywords from the unions. With the ALL keywords, it simply tacks on the results of each SELECT to the combined result set, including rows from Table2 that may have exactly the same data as Table1.
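The difference between the two variants is easy to see on a toy data set. The sketch below builds both as views, since the question asked for a view. It uses SQLite through Python's sqlite3 as a stand-in for SQL Server 2008, and the view names merged_all and merged_distinct as well as the rows are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server; view names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table3 (col1 INTEGER, col2 TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO table2 VALUES (1, 'a'), (3, 'c');
    INSERT INTO table3 VALUES (3, 'c');

    -- UNION ALL: every row from every table, duplicates included.
    CREATE VIEW merged_all AS
        SELECT col1, col2 FROM table1
        UNION ALL SELECT col1, col2 FROM table2
        UNION ALL SELECT col1, col2 FROM table3;

    -- UNION: same rows, but each distinct (col1, col2) pair appears once.
    CREATE VIEW merged_distinct AS
        SELECT col1, col2 FROM table1
        UNION SELECT col1, col2 FROM table2
        UNION SELECT col1, col2 FROM table3;
""")

all_rows = conn.execute(
    "SELECT col1, col2 FROM merged_all ORDER BY col1").fetchall()
distinct_rows = conn.execute(
    "SELECT col1, col2 FROM merged_distinct ORDER BY col1").fetchall()

print(len(all_rows), len(distinct_rows))
```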
I think you are asking for a UNION:
select col1, col2 from table1
UNION ALL
select col1, col2 from table2
UNION ALL
select col1, col2 from table3
Should work as long as col1 and col2 have compatible data types across all three tables.
If you want to eliminate duplicate rows then use UNION instead of UNION ALL.