MariaDB bug - where condition with "in select" containing null - mysql

I found a strange bug (I guess) in MariaDB.
Suppose you have a table table1 with col1 and another table table2 with col1, and you want to list all rows in table1 whose col1 values exist in table2.
We could code this as:
select *
from table1
where col1 in (
select col1 from table2
)
The result contains the expected rows if all data in col1 in table2 are not null.
However, if any values from table2 are null then it returns no rows.
This is unexpected to me and scary as I've used this clause many times.

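This is how NOT IN is defined to work in SQL: if any of the values in the list used by NOT IN are null, no rows match. With a plain IN, a null in the list does not remove rows that match a non-null value; it only makes the non-matching comparisons evaluate to unknown instead of false.
See "Law of the excluded fourth" section of https://en.wikipedia.org/wiki/Null_(SQL) for more info.
SQL databases that follow the standard's three-valued logic behave this way.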

I normally use exists, so I haven't struck this problem
select * from table1 t1
where exists (select t2.col1 from table2 t2
where t2.col1 = t1.col1)
Not Tested
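A quick way to check how IN, NOT IN, EXISTS and NOT EXISTS treat a NULL coming out of the subquery is a small in-memory experiment. The sketch below uses Python's sqlite3 module as a stand-in for MariaDB (an assumption; the three-valued logic involved is standard SQL), and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MariaDB (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER);
    CREATE TABLE table2 (col1 INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (NULL);   -- the subquery will return a NULL
""")

def run(sql):
    """Run a query and return the col1 values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

in_rows = run(
    "SELECT col1 FROM table1 WHERE col1 IN (SELECT col1 FROM table2)")
not_in_rows = run(
    "SELECT col1 FROM table1 WHERE col1 NOT IN (SELECT col1 FROM table2)")
exists_rows = run("""
    SELECT col1 FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.col1 = t1.col1)
""")
not_exists_rows = run("""
    SELECT col1 FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.col1 = t1.col1)
""")

print(in_rows, not_in_rows, exists_rows, not_exists_rows)
```

In this sketch the NULL only empties the NOT IN result. IN and EXISTS still return the matching row, and NOT EXISTS returns the two rows that have no match.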

Related

Alternative to except in MySQL

I must write a Query like this in MySQL:
SELECT *
FROM Tab1
EXCEPT
SELECT *
FROM Tab1
WHERE int_attribute_of_Tab1>0
but MySQL doesn't support the keyword EXCEPT.
Is there a standard way to use another operator that simulates EXCEPT in MySQL?
You could use NOT IN
SELECT *
FROM Tab1
WHERE id NOT IN (
SELECT id
FROM Tab1
WHERE int_attribute_of_Tab1>0
)
Try this
SELECT *
FROM Tab1
WHERE [....] NOT EXISTS
(SELECT *
FROM Tab1
WHERE int_attribute_of_Tab1>0)
A couple of definitions
SqlServer https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql EXCEPT
Returns any distinct values from the query to the left of the EXCEPT operator that are not also returned from the right query.
PLsql https://docs.oracle.com/cd/B19306_01/server.102/b14200/queries004.htm MINUS
statement combines results with the MINUS operator, which returns only unique rows returned by the first query but not by the second
A pedantic translation to mysql would be
SELECT distinct t1.*
FROM Tab1 as t1
left outer join
(SELECT *
FROM Tab1
WHERE int_attribute_of_Tab1>0) as t2 on t1.id = t2.id
where t2.id is null;
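This assumes there is an id column, and I wouldn't like to use distinct on a lot of columns.
You can use multiple NOT IN operators combined with AND operators for multiple columns. Note that this tests each column independently rather than whole rows, so it can exclude more rows than EXCEPT would.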
For example:
SELECT col1, col2 FROM table1 WHERE col1 NOT IN(SELECT col1 FROM table2) AND col2 NOT IN(SELECT col2 FROM table2)...;
Since MySQL 8.0.31, the EXCEPT operator is available in this DBMS. If you are able to upgrade your MySQL version, you can use the notation:
SELECT * FROM Tab1
EXCEPT
SELECT * FROM Tab1
WHERE int_attribute_of_Tab1>0
If Tab1 has a primary key (e.g. ID) then you could use a NOT EXISTS against the table itself, like this:
SELECT *
FROM Tab1 AS t1
WHERE NOT EXISTS (
SELECT 1
FROM Tab1 AS t2
WHERE t2.ID = t1.ID
AND t2.int_attribute_of_Tab1 > 0
)
But it's kinda pointless in this case.
And it's not what an EXCEPT/MINUS tries to do.
(excluding identical rows)
The question's query with the EXCEPT uses the same table twice.
So reversing that WHERE criteria on Tab1 would give the same results.
SELECT *
FROM Tab1
WHERE (int_attribute_of_Tab1 <= 0 OR int_attribute_of_Tab1 IS NULL)
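To see why the IS NULL branch matters when reversing the condition, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module (an assumption standing in for MySQL), and the Tab1 columns and rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tab1 (id INTEGER PRIMARY KEY, int_attribute_of_Tab1 INTEGER);
    INSERT INTO Tab1 VALUES (1, 5), (2, 0), (3, NULL), (4, -1);
""")

def ids(sql):
    """Return the sorted id values of the rows a query produces."""
    return sorted(row[0] for row in conn.execute(sql))

# Reference result: the EXCEPT query from the question.
except_ids = ids("""
    SELECT * FROM Tab1
    EXCEPT
    SELECT * FROM Tab1 WHERE int_attribute_of_Tab1 > 0
""")

# Reversed condition with the explicit NULL check.
with_null_check = ids("""
    SELECT * FROM Tab1
    WHERE (int_attribute_of_Tab1 <= 0 OR int_attribute_of_Tab1 IS NULL)
""")

# Reversed condition without the NULL check silently drops the NULL row.
without_null_check = ids(
    "SELECT * FROM Tab1 WHERE int_attribute_of_Tab1 <= 0")

print(except_ids, with_null_check, without_null_check)
```

With these sample rows the first two queries agree, while the version without IS NULL loses the row whose attribute is NULL, because NULL <= 0 evaluates to unknown.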
If it were 2 different tables then this
SELECT t1col1, t1col2, t1col3
FROM Table1
EXCEPT
SELECT t2col4, t2col5, t2col6
FROM Table2
WHERE int_attribute_of_Tab1 > 0
Could be replaced by comparing each selected column
SELECT DISTINCT t1col1, t1col2, t1col3
FROM Table1 AS t1
WHERE NOT EXISTS (
SELECT 1
FROM Table2 AS t2
WHERE t2.t2col4 = t1.t1col1
AND t2.t2col5 = t1.t1col2
AND t2.t2col6 = t1.t1col3
AND t2.int_attribute_of_Tab1 > 0
)
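As a rough check that the column-by-column NOT EXISTS gives the same rows as EXCEPT, here is a sketch on SQLite via Python's sqlite3 module (an assumption; SQLite happens to support EXCEPT directly). It reads the inner table as Table2, and the sample rows are made up and contain no NULLs, since NULL columns would never compare equal with =.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (t1col1 INTEGER, t1col2 INTEGER, t1col3 INTEGER);
    CREATE TABLE Table2 (t2col4 INTEGER, t2col5 INTEGER, t2col6 INTEGER,
                         int_attribute_of_Tab1 INTEGER);
    -- (1,1,1) appears twice to show the de-duplication.
    INSERT INTO Table1 VALUES (1, 1, 1), (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO Table2 VALUES (2, 2, 2, 5), (3, 3, 3, 0);
""")

except_rows = sorted(conn.execute("""
    SELECT t1col1, t1col2, t1col3 FROM Table1
    EXCEPT
    SELECT t2col4, t2col5, t2col6 FROM Table2
    WHERE int_attribute_of_Tab1 > 0
"""))

not_exists_rows = sorted(conn.execute("""
    SELECT DISTINCT t1col1, t1col2, t1col3
    FROM Table1 AS t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM Table2 AS t2
        WHERE t2.t2col4 = t1.t1col1
          AND t2.t2col5 = t1.t1col2
          AND t2.t2col6 = t1.t1col3
          AND t2.int_attribute_of_Tab1 > 0
    )
"""))

print(except_rows, not_exists_rows)
```

The DISTINCT is what makes the rewrite collapse the duplicated (1, 1, 1) row the way EXCEPT does.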

SQL: insert only new records

This must be very trivial but I can't seem to find the solution.
I work with two tables, both without any primary key.
I want to add all the records of the first table to the second table only if they don't exist.
Basically:
INSERT INTO Table2
SELECT Table1.*
FROM Table1
WHERE "the record to be added doesn't already exist in Table2"
You could do something like this. You would need to check each relevant column - I have just put in 2 as an example. With a Not Exists clause you can check if a record already existed across multiple columns. With a NOT IN you would only be able to check if a record already existed against one column.
INSERT INTO Table2
SELECT t1.*
FROM Table1 t1
WHERE NOT EXISTS
(
SELECT 1
FROM table2 t2 WHERE
t2.col1 = t1.col1 AND
t2.col2 = t1.col2
)
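A small sketch of how this behaves when it is run more than once, using SQLite through Python's sqlite3 module as a stand-in (an assumption) and two made-up columns:

```python
import sqlite3

# In-memory SQLite database; neither table has a primary key, as in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE Table2 (col1 INTEGER, col2 TEXT);
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO Table2 VALUES (1, 'a');          -- already present
""")

INSERT_NEW = """
    INSERT INTO Table2
    SELECT t1.*
    FROM Table1 t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM Table2 t2
        WHERE t2.col1 = t1.col1
          AND t2.col2 = t1.col2
    )
"""

def count_table2():
    """Number of rows currently in Table2."""
    return conn.execute("SELECT COUNT(*) FROM Table2").fetchone()[0]

conn.execute(INSERT_NEW)
count_after_first = count_table2()   # only (2, 'b') was new

conn.execute(INSERT_NEW)
count_after_second = count_table2()  # nothing new the second time

rows = sorted(conn.execute("SELECT col1, col2 FROM Table2"))
print(count_after_first, count_after_second, rows)
```

Running the statement a second time inserts nothing, since every Table1 row now has a match. Keep in mind that = never matches NULL values, so rows with NULL in a compared column would be inserted again on each run.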
You could make use of EXISTS:
INSERT INTO Table2
SELECT Table1.*
FROM Table1
WHERE NOT EXISTS(SELECT * FROM table2 WHERE <your expression to compare the two tables goes here>)
But I would advise you to consider using a unique index on your tables.
Just an idea - untested:
INSERT INTO Table2
SELECT *
FROM Table1
WHERE NOT EXISTS(SELECT * FROM Table2 WHERE Table2.Field1 = Table1.Field1 AND Table2.Field2 = Table1.Field2)
You must add every Field of both Tables in the WHERE clause of the NOT EXISTS Query
INSERT INTO X.TableX1
(ColumnX1,ColumnX2)
SELECT DISTINCT c.[ColumnY1],c.[ColumnY2]
FROM Y.Table2 c INNER JOIN Database.z.Table3 i ON c.ColumnX1= i.ColumnY1
WHERE NOT EXISTS
(
SELECT *
FROM X.TableX1 WHERE
X.TableX1.ColumnX1=c.ColumnY1
)
This joins two tables, filters the required data, and inserts only the new values into a third table on every run.

MySQL select same column name as alias in union not working

I have a simple MySQL query that unions two tables:
SELECT * FROM (
SELECT col1 AS col1A FROM table1
UNION
SELECT col1 AS col1B FROM table2
) AS t WHERE col1A <> col1B
I have a column called col1 in both tables, and I need to select only the rows where the values of that column differ, so I select them under aliases. When I run this query I get:
Unknown column 'col1B' in 'where clause'
Table1 data:
col1
----
test
Table2 data:
col1
----
test
The query should return no rows, as each value of col1 in table1 equals each value of col1 in table2. Instead it reports that col1B is unknown, although I selected it as an alias.
I think you need to look up the appropriate usage of UNION. It will return all results from the first query combined with all results from the second query. This results in a single dataset with a single column (not col1A and col1B), named col1A in this case.
Assuming you're trying to get all records in table1 that don't exist in table2, you can use NOT EXISTS:
SELECT col1
FROM table1 t1
WHERE NOT EXISTS (
SELECT 1
FROM table2 t2
WHERE t1.col1 = t2.col1
)
Why Error 1054 is being returned by OP query
The error that's being returned is because the name assigned to a column from the result of a UNION is taken from the first SELECT.
You can observe this by running a simple example:
SELECT 1 AS one
UNION
SELECT 2 AS two
The resultset returned by that query will contain a single column, the name assigned to the column will be one, the column name from the first SELECT. This explains why you are getting the error from your query.
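The same observation can be checked programmatically. The sketch below assumes SQLite via Python's sqlite3 module as a stand-in for MySQL; it reads the column names of the UNION result from the cursor metadata.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")

cur = conn.execute("SELECT 1 AS one UNION SELECT 2 AS two")

# cursor.description holds one entry per result column; item 0 is the name.
column_names = [col[0] for col in cur.description]
values = sorted(row[0] for row in cur)

print(column_names, values)
```

Only one column comes back, named after the alias in the first SELECT, so an alias from the second SELECT is not available to an outer WHERE clause.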
One way to return rows with no match
To return values of col1 from table1 which do not match any value in the col1 column from table2, one option is to use an anti-join pattern:
SELECT t1.col1
FROM table1 t1
LEFT JOIN table2 t2
ON t2.col1 = t1.col1
WHERE t2.col1 IS NULL
The LEFT JOIN operation returns all rows from table1, along with any "matching" rows found in table2. The "trick" is the predicate in the WHERE clause... any "matching" rows from table2 will have a non-NULL value in col1. So, if we exclude all of the rows where we found a match, we're left with rows from table1 that didn't have a match.
If we want to get rows from table2 that don't have a "matching" row in table1, we can do the same thing, just flipping the order of the tables.
If we combine the two sets, but only want a "distinct" list of "not matched" values, we can use the UNION set operator:
SELECT t1.col1
FROM table1 t1
LEFT JOIN table2 t2
ON t2.col1 = t1.col1
WHERE t2.col1 IS NULL
UNION
SELECT s2.col1
FROM table2 s2
LEFT JOIN table1 s1
ON s1.col1 = s2.col1
WHERE s1.col1 IS NULL
Finding out which table the non-matched value is from
Sometimes, we want to know which query returned the value; we can get that by including a literal value as a discriminator in each query.
SELECT 'table1' AS src
, t1.col1
FROM table1 t1
LEFT JOIN table2 t2
ON t2.col1 = t1.col1
WHERE t2.col1 IS NULL
UNION
SELECT 'table2' AS src
, s2.col1
FROM table2 s2
LEFT JOIN table1 s1
ON s1.col1 = s2.col1
WHERE s1.col1 IS NULL
ORDER BY 2
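Here is a minimal run of the two-sided anti-join with the src discriminator, assuming SQLite via Python's sqlite3 module as a stand-in for MySQL and made-up sample values:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT);
    CREATE TABLE table2 (col1 TEXT);
    INSERT INTO table1 VALUES ('a'), ('b');   -- 'a' exists only here
    INSERT INTO table2 VALUES ('b'), ('c');   -- 'c' exists only here
""")

rows = conn.execute("""
    SELECT 'table1' AS src, t1.col1
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.col1 = t1.col1
    WHERE t2.col1 IS NULL
    UNION
    SELECT 'table2' AS src, s2.col1
    FROM table2 s2
    LEFT JOIN table1 s1 ON s1.col1 = s2.col1
    WHERE s1.col1 IS NULL
    ORDER BY 2
""").fetchall()

print(rows)
```

Each unmatched value comes back tagged with the table it was found in, and the shared value 'b' is filtered out on both sides.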
A different (usually less performant) approach to finding non-matching rows
An entirely different approach, returning an equivalent result, would be to do something like this:
SELECT q.col1
FROM ( SELECT 't1' AS src, t1.col1 FROM table1 t1
UNION
SELECT 't2' AS src, t2.col1 FROM table2 t2
) q
GROUP BY q.col1
HAVING COUNT(DISTINCT q.src) < 2
ORDER BY q.col1
(The inline view q will be "materialized" as a derived table, so this approach can be expensive for large sets, and this approach won't take advantage of indexes on col1 to perform the matching.) One other small difference between this and the anti-join approach: this will omit a col1 value of NULL if a NULL exists in both tables. Aside from that, the resultset is equivalent.
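The NULL difference between the two approaches can be reproduced with a small sketch. It assumes SQLite via Python's sqlite3 module as a stand-in for MySQL, and sample data in which both tables contain a NULL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT);
    CREATE TABLE table2 (col1 TEXT);
    INSERT INTO table1 VALUES ('a'), ('b'), (NULL);
    INSERT INTO table2 VALUES ('b'), ('c'), (NULL);
""")

# Anti-join in both directions, combined with UNION.
anti_join = {row[0] for row in conn.execute("""
    SELECT t1.col1
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.col1 = t1.col1
    WHERE t2.col1 IS NULL
    UNION
    SELECT s2.col1
    FROM table2 s2
    LEFT JOIN table1 s1 ON s1.col1 = s2.col1
    WHERE s1.col1 IS NULL
""")}

# GROUP BY / HAVING over the tagged union of both tables.
group_by = {row[0] for row in conn.execute("""
    SELECT q.col1
    FROM ( SELECT 't1' AS src, t1.col1 FROM table1 t1
           UNION
           SELECT 't2' AS src, t2.col1 FROM table2 t2
         ) q
    GROUP BY q.col1
    HAVING COUNT(DISTINCT q.src) < 2
""")}

print(anti_join, group_by)
```

The anti-join keeps a NULL row because NULL never equals NULL in the join condition, while GROUP BY puts both NULLs into one group that is seen from both sources and therefore dropped.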

Merging two SQL Server tables conditionally into a third table

Clearly, I am not a SQL guy, so I have to ask for help on the following rather simple task.
I have two SQL Server 2008 tables: t1 and t2 with many identical columns and a key column (entry_ID). T2 has rows that do not exist in t1 but should.
I want to merge those rows from t2 that do not exist in t1 but I also do not want any rows from t2 that already exist in t1. I would like the result set to fill a new t3.
I have looked at many solutions online but can't find the solution to the above scenario.
Thank you.
There are a number of ways to do it: you could use UNION ALL or OUTER JOIN.
Assuming you are using Entry_ID to find identical records, and Entry_ID is unique within each table, here is an OUTER JOIN method:
This gets you your recordset: T1 and T2 merged:
SELECT
CASE
WHEN T1.Entry_ID IS NULL THEN 'T2'
WHEN T2.Entry_ID IS NULL THEN 'T1'
ELSE 'Both'
END SourceTable,
COALESCE(T1.Entry_ID,T2.Entry_ID) As Entry_ID,
COALESCE(T1.Col1, T2.Col1) As Col1,
COALESCE(T1.Col2, T2.Col2) As Col2,
COALESCE(T1.Col3, T2.Col3) As Col3,
COALESCE(T1.Col4, T2.Col4) As Col4
FROM T1 FULL OUTER JOIN T2
ON T1.Entry_ID = T2.Entry_ID
ORDER BY COALESCE(T1.Entry_ID,T2.Entry_ID)
This inserts it into T3:
INSERT INTO T3 (Entry_ID,Col1,Col2,Col3,Col4)
SELECT
COALESCE(T1.Entry_ID,T2.Entry_ID) As Entry_ID,
COALESCE(T1.Col1, T2.Col1) As Col1,
COALESCE(T1.Col2, T2.Col2) As Col2,
COALESCE(T1.Col3, T2.Col3) As Col3,
COALESCE(T1.Col4, T2.Col4) As Col4
FROM T1 FULL OUTER JOIN T2
ON T1.Entry_ID = T2.Entry_ID
Again, note that Entry_ID needs to be unique within each table, since it is used to match rows between the tables.
Also note the columns from the select line up with the column list in the insert statement - the order of the columns in the physical table doesn't matter, the INSERT and SELECT just have to line up.
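The UNION ALL route mentioned at the start isn't spelled out, so here is one possible reading of it as a sketch: take every row of T1, then add the T2 rows whose Entry_ID is not in T1. It runs on SQLite via Python's sqlite3 module (an assumption standing in for SQL Server), with a single made-up data column Col1 to keep it short.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (Entry_ID INTEGER, Col1 TEXT);
    CREATE TABLE T2 (Entry_ID INTEGER, Col1 TEXT);
    CREATE TABLE T3 (Entry_ID INTEGER, Col1 TEXT);
    INSERT INTO T1 VALUES (1, 'x'), (2, 'y');
    INSERT INTO T2 VALUES (2, 'y2'), (3, 'z');   -- Entry_ID 2 already exists in T1
""")

conn.execute("""
    INSERT INTO T3 (Entry_ID, Col1)
    SELECT Entry_ID, Col1
    FROM T1
    UNION ALL
    SELECT T2.Entry_ID, T2.Col1
    FROM T2
    WHERE NOT EXISTS (
        SELECT 1 FROM T1 WHERE T1.Entry_ID = T2.Entry_ID
    )
""")

merged = sorted(conn.execute("SELECT Entry_ID, Col1 FROM T3"))
print(merged)
```

For an Entry_ID present in both tables this keeps the T1 version of the row, which matches what COALESCE(T1.x, T2.x) does whenever the T1 column is not NULL.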

Error 1062: error in SQL syntax

I have three tables. t1,t2,t3. I want to get the common values between t2,t3 and select the whole record from t1. When I type my statement as in:
select * from db.t1 where col1
IN (
select col1 from t2 , t3
where t2.col1=t3.col1);
I get a syntax error. What is wrong?
Assuming you're testing that t2.col1=t3.col1 (instead of col1.t2=col1.t3, as shown), then one other problem is the ambiguity for col1 in the inner SELECT.
This:
where col1.t2=col1.t3);
Should be:
where t2.col1=t3.col1);
I'm not sure, but maybe you have to qualify and rename your encapsulated col1, since an unqualified col1 is ambiguous between t2 and t3:
select * from db.t1 where t1.col1
IN (
select t2.col1 as col1bis from t2 , t3
where t2.col1 = t3.col1);
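The ambiguity can be reproduced in a small sketch. It uses SQLite via Python's sqlite3 module as a stand-in (an assumption; MySQL words its error differently), with made-up rows in t1, t2 and t3.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (col1 INTEGER);
    CREATE TABLE t2 (col1 INTEGER);
    CREATE TABLE t3 (col1 INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1), (2);
    INSERT INTO t3 VALUES (2), (3);
""")

# Unqualified col1 in the inner SELECT could come from t2 or t3.
try:
    conn.execute("""
        SELECT * FROM t1 WHERE col1 IN (
            SELECT col1 FROM t2, t3
            WHERE t2.col1 = t3.col1)
    """)
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Qualifying the selected column removes the ambiguity.
rows = conn.execute("""
    SELECT * FROM t1 WHERE t1.col1 IN (
        SELECT t2.col1 FROM t2, t3
        WHERE t2.col1 = t3.col1)
""").fetchall()

print(error_message, rows)
```

Only the value common to t2 and t3 survives the inner join, so a single row of t1 is returned once the column is qualified.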