More efficient query than NOT IN (nested select) - mysql

I have two tables, table1 and table2. Their definitions are:
CREATE TABLE `table1` (
`table1_id` int(11) NOT NULL AUTO_INCREMENT,
`table1_name` VARCHAR(256),
PRIMARY KEY (`table1_id`)
);
CREATE TABLE `table2` (
`table2_id` int(11) NOT NULL AUTO_INCREMENT,
`table1_id` int(11) NOT NULL,
`table1_name` VARCHAR(256),
PRIMARY KEY (`table2_id`),
FOREIGN KEY (`table1_id`) REFERENCES `table1` (`table1_id`)
);
I want to know the number of rows in table1 that are NOT referenced in table2. That can be done with:
SELECT COUNT(t1.table1_id) FROM table1 t1
WHERE t1.table1_id NOT IN (SELECT t2.table1_id FROM table2 t2)
Is there a more efficient way of performing this query?

Upgrade to MySQL 5.6, which optimizes semi-joins against subqueries better.
See http://dev.mysql.com/doc/refman/5.6/en/subquery-optimization.html
Or else use an exclusion join:
SELECT COUNT(t1.table1_id) FROM table1 t1
LEFT OUTER JOIN table2 t2 USING (table1_id)
WHERE t2.table1_id IS NULL
Also, make sure table2.table1_id has an index on it.
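A minimal sketch of the exclusion join on throwaway data. The demo_t1 and demo_t2 tables and their rows are hypothetical stand-ins for the question's tables, and the ids assume default auto-increment settings on fresh tables.

```sql
-- Hypothetical demo tables mirroring the question's schema.
CREATE TABLE demo_t1 (
  table1_id   INT NOT NULL AUTO_INCREMENT,
  table1_name VARCHAR(256),
  PRIMARY KEY (table1_id)
);
CREATE TABLE demo_t2 (
  table2_id INT NOT NULL AUTO_INCREMENT,
  table1_id INT NOT NULL,
  PRIMARY KEY (table2_id),
  FOREIGN KEY (table1_id) REFERENCES demo_t1 (table1_id)
);

-- Invented sample rows: the three parents get ids 1, 2 and 3.
INSERT INTO demo_t1 (table1_name) VALUES ('a'), ('b'), ('c');
INSERT INTO demo_t2 (table1_id) VALUES (1), (1), (3);

-- Exclusion join: the outer join yields NULLs for parents without a child,
-- and the WHERE clause keeps only those rows.
SELECT COUNT(*) AS unreferenced
FROM demo_t1 t1
LEFT OUTER JOIN demo_t2 t2 USING (table1_id)
WHERE t2.table1_id IS NULL;
-- With the rows above only table1_id = 2 is unreferenced, so the count is 1.

-- List the indexes on the child table to check that table1_id is covered.
SHOW INDEX FROM demo_t2;
```

InnoDB adds an index on a foreign key column automatically when none exists, so SHOW INDEX should list one on table1_id here. On a table without the constraint the index would have to be created by hand.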

Try using EXISTS; it is generally more efficient than IN. For example, this counts the rows of table1 that ARE referenced in table2:
SELECT COUNT(t1.table1_id)
FROM table1 t1
WHERE EXISTS
( SELECT 1
FROM table2 t2
WHERE t2.table1_id <=> t1.table1_id
)
To count the rows that are NOT referenced, which is what the question asks for, use NOT EXISTS:
SELECT COUNT(t1.table1_id)
FROM table1 t1
WHERE NOT EXISTS
( SELECT 1
FROM table2 t2
WHERE t2.table1_id = t1.table1_id
)
EXISTS is generally faster because the subquery can stop at the first matching row: once a hit is found, the condition is proved true and the search ends. IN, by contrast, may collect all the results of the subquery before any further processing, which takes longer.
As @billkarwin noted in the comments, EXISTS runs as a dependent subquery. Here is the EXPLAIN output for my two queries and for the OP's query: http://sqlfiddle.com/#!2/53199d/5
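The comparison can also be repeated locally with EXPLAIN. This is only a sketch: demo_parent and demo_child are hypothetical tables shaped like the question's schema, and the plan the server reports depends on its version and on the data, so no output is shown.

```sql
-- Hypothetical demo tables shaped like the question's schema.
CREATE TABLE demo_parent (
  parent_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
);
CREATE TABLE demo_child (
  child_id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  parent_id INT NOT NULL,
  KEY idx_parent (parent_id)
);

-- Plan for the NOT IN form used in the question.
EXPLAIN
SELECT COUNT(*)
FROM demo_parent p
WHERE p.parent_id NOT IN (SELECT c.parent_id FROM demo_child c);

-- Plan for the NOT EXISTS form.
EXPLAIN
SELECT COUNT(*)
FROM demo_parent p
WHERE NOT EXISTS (SELECT 1 FROM demo_child c WHERE c.parent_id = p.parent_id);
```

The select_type and key columns of the output are the ones to compare: they show how the subquery is executed and whether the index on the child column is used.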

Related

Can I add a new auto-increment column to this UNION?

I’m trying to join two tables getting all the data from both tables.
I managed to UNION both tables, but I need to add an ID with auto-increment as the primary key for the new table that I’m creating.
I don't know how to do it and can’t find a way to add it to the query.
CREATE TABLE NEWTABLE
SELECT T1.TEXT as TEXT
[...]
FROM TABLE1 T1
LEFT JOIN TABLE2 T2
on T1.TEXT = T2.TEXT
UNION
SELECT T2.TEXT as TEXT
FROM TABLE1 T1
[...]
RIGHT JOIN TABLE2 T2
on T1.TEXT = T2.TEXT
You need to put the column definitions in the CREATE TABLE statement to add the id column. Then provide NULL as its value in the SELECT queries, aliased as id in the first SELECT, because CREATE TABLE ... SELECT matches columns by name rather than by position.
CREATE TABLE NEWTABLE (
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
text TEXT,
[...]
) AS
SELECT null AS id, T1.TEXT
[...]
FROM TABLE1 T1
LEFT JOIN TABLE2 T2
on T1.TEXT = T2.TEXT
UNION
SELECT null, T2.TEXT as TEXT
FROM TABLE1 T1
[...]
RIGHT JOIN TABLE2 T2
on T1.TEXT = T2.TEXT
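A small variant of the same idea, sketched on hypothetical tables (demo_a, demo_b and demo_new are invented for illustration). It assumes the id column can simply be left out of the SELECT list, so that AUTO_INCREMENT fills it while the remaining column is matched by name.

```sql
-- Hypothetical source tables with one overlapping value.
CREATE TABLE demo_a (txt VARCHAR(50));
CREATE TABLE demo_b (txt VARCHAR(50));
INSERT INTO demo_a VALUES ('x'), ('y');
INSERT INTO demo_b VALUES ('y'), ('z');

-- id exists only in the column list, so the server generates its values.
-- txt appears in both parts and is filled from the SELECT.
CREATE TABLE demo_new (
  id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  txt VARCHAR(50)
) AS
SELECT a.txt
FROM demo_a a
LEFT JOIN demo_b b ON a.txt = b.txt
UNION
SELECT b.txt
FROM demo_a a
RIGHT JOIN demo_b b ON a.txt = b.txt;

-- UNION removes the duplicate 'y', leaving three rows with distinct ids.
SELECT id, txt FROM demo_new ORDER BY id;
```

Which id each row receives is not guaranteed, since a UNION result has no defined order.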

Issue with all MySQL SELECT queries containing EXISTS subquery and LEFT JOIN with ON where ON has reference to external SELECT

Here is the problem. When I run any query of this form:
SELECT field1
FROM table1
WHERE EXISTS (SELECT table2.field2, table3.field3, table3.field4
FROM table2 LEFT JOIN table3 ON table3.field3 = table2.field2
AND table3.field4 = table1.field1
WHERE "some condition");
I get this error:
Unknown column 'table1.field1' in 'on clause'
On the other hand, this query
SELECT field1
FROM table1
WHERE EXISTS (SELECT table2.field2, table3.field3, table3.field4
FROM table2 LEFT JOIN table3 ON table3.field3 = table2.field2
WHERE "some condition"
AND table3.field4 = table1.field1);
works fine.
There are many variations: the join can be an inner join rather than an outer join, the check can be negative (NOT EXISTS), the WHERE clause can be dropped, and the field list can differ. The only critical parts are the EXISTS subquery and the reference to table1.field1 in the ON condition of the JOIN.
I tried it on several MySQL and MariaDB servers with the same result. I also searched for this exact issue online and here on SO, without success.
As suggested in one of the comments, I have updated the question with a real example.
Tables:
CREATE TABLE `sessions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`browser` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
)
CREATE TABLE `browsers` (
`id` int(11) NOT NULL DEFAULT '0',
`browser` varchar(255) DEFAULT NULL
)
To get all users who have used all browsers, I run this query:
select distinct user_id
from sessions as t1
where not exists (select t2.id, browsers.id
from sessions as t2 LEFT JOIN browsers ON t2.browser = browsers.browser
AND t2.user_id = t1.user_id
where browsers.id IS NULL);
Error message I get:
Error Code : 1054
Unknown column 't1.user_id' in 'on clause'
The desired output is, of course, a result set listing the users.
I know how to rewrite the query for this particular task, so that is not the problem. The problem is that I want to use this query pattern for other tasks as well, since it seems to be logical, valid SQL.
My question is: what am I doing wrong? And if this is a bug, how can I work around it while keeping the same query structure?
I think you have hit bug #96946: MySQL does not allow outer references in the ON clause of a JOIN.
If I am not mistaken, this is a rewrite of a double-nested NOT EXISTS query, and I think this statement will actually be accepted in MySQL:
SELECT DISTINCT user_id
FROM sessions AS s1
WHERE NOT EXISTS (SELECT *
FROM browsers AS b
WHERE NOT EXISTS (SELECT *
FROM sessions s2
WHERE s1.user_id = s2.user_id AND
s2.browser = b.browser
)
);
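To see the double NOT EXISTS at work, here is a sketch on invented data. The demo_browsers and demo_sessions tables follow the layout in the question, but the names and rows are hypothetical.

```sql
-- Hypothetical tables following the layout in the question.
CREATE TABLE demo_browsers (
  id      INT NOT NULL,
  browser VARCHAR(255)
);
CREATE TABLE demo_sessions (
  id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  user_id INT,
  browser VARCHAR(255)
);

INSERT INTO demo_browsers VALUES (1, 'firefox'), (2, 'chrome');
INSERT INTO demo_sessions (user_id, browser) VALUES
  (1, 'firefox'), (1, 'chrome'),   -- user 1 used every browser
  (2, 'firefox');                  -- user 2 never used chrome

-- "Users for whom no browser exists that they have not used."
SELECT DISTINCT s1.user_id
FROM demo_sessions AS s1
WHERE NOT EXISTS (SELECT *
                  FROM demo_browsers AS b
                  WHERE NOT EXISTS (SELECT *
                                    FROM demo_sessions AS s2
                                    WHERE s2.user_id = s1.user_id
                                      AND s2.browser = b.browser));
-- Returns only user_id 1.
```

The outer reference s1.user_id sits in a WHERE clause here, not in an ON clause, which is why this form avoids the error.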

Best way of deleting SQL rows how i want

How can I construct a SQL query that deletes the rows I want?
I have two tables.
Table 1.
ID: Some Random Not Significant To This Question Columns : DateTime : UserID
Table 2.
ID: Some Random Not Significant To This Question Columns : DateTime : UserID
The two tables are related by DateTime and UserID
Is there any way I can create a query that deletes rows from table 2 when no row in table1 has a matching DateTime and UserID?
Thanks
You can use a LEFT JOIN. Because table2 is aliased, the DELETE has to name the alias:
DELETE t2
FROM table2 t2 LEFT JOIN table1 t1 ON t1.`DateTime` = t2.`DateTime`
AND t1.`UserID` = t2.`UserID`
WHERE t1.`UserID` IS NULL
DELETE t2
FROM table2 t2
WHERE NOT EXISTS
(
SELECT NULL
FROM table1 t1
WHERE (t1.userId, t1.dateTime) = (t2.userId, t2.dateTime)
)
First of all: create a backup before you delete lots of records :)
The idea:
DELETE FROM
table1
WHERE
NOT EXISTS (SELECT 1 FROM table2 WHERE table1.referenceColumn = table2.referenceColumn)
You can check which records will be deleted by replacing the DELETE with SELECT *
And now the solution:
DELETE FROM
table2
WHERE
NOT EXISTS (
SELECT 1 FROM
table1
WHERE
table2.UserID = table1.UserID
AND table2.DateTime = table1.DateTime
)
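The preview-then-delete routine can be tried on throwaway data first. In this sketch demo_table1, demo_table2 and their rows are hypothetical, and the transaction assumes a transactional storage engine such as InnoDB.

```sql
-- Hypothetical tables with the two linking columns from the question.
CREATE TABLE demo_table1 (
  ID         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `DateTime` DATETIME,
  UserID     INT
);
CREATE TABLE demo_table2 (
  ID         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `DateTime` DATETIME,
  UserID     INT
);

INSERT INTO demo_table1 (`DateTime`, UserID) VALUES ('2020-01-01 10:00:00', 1);
INSERT INTO demo_table2 (`DateTime`, UserID) VALUES
  ('2020-01-01 10:00:00', 1),   -- has a match in demo_table1, must stay
  ('2020-01-02 11:00:00', 2);   -- no match, should be deleted

-- Step 1: preview the rows that the DELETE would remove.
SELECT *
FROM demo_table2
WHERE NOT EXISTS (SELECT 1
                  FROM demo_table1
                  WHERE demo_table1.UserID = demo_table2.UserID
                    AND demo_table1.`DateTime` = demo_table2.`DateTime`);
-- Shows only the UserID 2 row.

-- Step 2: run the same condition as a DELETE inside a transaction.
START TRANSACTION;
DELETE FROM demo_table2
WHERE NOT EXISTS (SELECT 1
                  FROM demo_table1
                  WHERE demo_table1.UserID = demo_table2.UserID
                    AND demo_table1.`DateTime` = demo_table2.`DateTime`);
SELECT COUNT(*) AS remaining FROM demo_table2;   -- 1 with the rows above
COMMIT;   -- or ROLLBACK if the count is not what was expected
```

Keeping the WHERE clause identical between the preview and the DELETE is the point of the exercise; only the first keyword changes.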

Select from MySQL table while ordering by IDs from another table

This might be something very simple to do. If so, I apologize. I'm still learning MySQL.
Say, I have two tables:
Table1:
`id` int autoincrement primary key
`Name` tinytext
`Phone` tinytext
`Date` etc.
and
Table2:
`id` int autoincrement primary key
`itmID` int
Each row in Table2 specifies the order in which elements should be selected from Table1. The itmID field in Table2 is linked to the id field in Table1.
Right now, to select elements from Table1, I do this:
SELECT * FROM `Table1`;
But how do you order them according to Table2, something like this?
SELECT * FROM `Table1` ORDER BY <itmID's in Table2> ASC;
If all ids of Table1 have an entry in Table2, use an INNER JOIN and order by Table2's own id, which defines the sequence:
SELECT * FROM Table1 t1
INNER JOIN Table2 t2 ON t1.id = t2.itmID
ORDER BY t2.id
If not all of them have an entry, use a LEFT JOIN instead:
SELECT * FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.id = t2.itmID
ORDER BY t2.id
Select from the first table, join it to the second, and order by the second. Something like:
SELECT *
FROM table1
LEFT JOIN table2 ON table1.id = table2.id
ORDER BY table2.itmID
Ryan's answer is almost right. Join on table2.itmID and order by table2.id:
SELECT *
FROM table1
INNER JOIN table2 on table1.id = table2.itmID
ORDER BY table2.id
http://dev.mysql.com/doc/refman/5.5/en/join.html
SELECT * FROM `Table1`
INNER JOIN `Table2` ON `Table1`.`id` = `Table2`.`itmID`
ORDER BY `Table2`.`id` ASC
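A short sketch with invented rows makes the ordering visible. The demo_items and demo_order tables are hypothetical stand-ins for Table1 and Table2, and the sketch assumes that the row sequence of Table2, that is its own id, is what defines the order.

```sql
-- Hypothetical stand-ins for Table1 and Table2.
CREATE TABLE demo_items (
  id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  Name TINYTEXT
);
CREATE TABLE demo_order (
  id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  itmID INT
);

-- On fresh tables the items get ids 1 to 4.
INSERT INTO demo_items (Name) VALUES ('alpha'), ('beta'), ('gamma'), ('delta');
-- Desired sequence: gamma, alpha, beta. 'delta' has no entry.
INSERT INTO demo_order (itmID) VALUES (3), (1), (2);

-- Only items that have an ordering entry, in that order.
SELECT i.*
FROM demo_items i
INNER JOIN demo_order o ON o.itmID = i.id
ORDER BY o.id;
-- gamma, alpha, beta

-- All items, with the unlisted ones pushed to the end.
SELECT i.*
FROM demo_items i
LEFT JOIN demo_order o ON o.itmID = i.id
ORDER BY o.id IS NULL, o.id;
-- gamma, alpha, beta, delta
```

Without the `o.id IS NULL` term, MySQL sorts NULLs first in ascending order, so the unlisted items would appear at the top.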

mysql left join multiple tables with one-to-many relation

I created a simple test case:
CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
)
CREATE TABLE `t2` (
`id2` int NOT NULL AUTO_INCREMENT,
`id1` int,
PRIMARY KEY (`id2`)
)
CREATE TABLE `t3` (
`id3` int NOT NULL AUTO_INCREMENT,
`id1` int,
PRIMARY KEY (`id3`)
)
insert into t1 (id) values (1);
insert into t2 (id1) values (1),(1);
insert into t3 (id1) values (1),(1),(1),(1);
I need to select all DISTINCT data from t1 left join t2 and all DISTINCT data from t1 left join t3, returning a total of 6 rows, 1 x (2 [from t2] + 4 [from t3]) = 6, but because of the nature of this join I get 8 rows, 1 [from t1] x 2 [from t2] x 4 [from t3] = 8.
select * from t1 left join t2 on (t1.id = t2.id1);
2 rows in set (0.00 sec)
select * from t1 left join t3 on (t1.id = t3.id1);
4 rows in set (0.00 sec)
select * from t1 left join t2 on (t1.id = t2.id1) left join t3 on (t1.id = t3.id1);
8 rows in set (0.00 sec)
select * from t1 left join t2 on (t1.id = t2.id1) union select * from t1 left join t3 on (t1.id = t3.id1);
4 rows in set (0.00 sec)
What query should I use to get just the 6 rows I need? Is it possible without subqueries, or do I need them? (Subqueries would make the big query where I need this more complicated.)
I need this for a big query that already gets data from 8 tables. I need data from 2 more tables to get everything in a single query, but when I join the 9th table, the returned data gets duplicated (in this simple test case, t3 plays the role of the 9th table and t2 the 8th).
I hope someone could show me the right path to follow.
Thank you.
UPDATE, SOLVED:
I really don't know how to do this test case in one select, but in my big query I solved it this way: because I was already using group_concat and GROUP BY, I split one value into multiple group_concat(DISTINCT ...) calls and concatenated them, like this:
-- instead of this
... group_concat(DISTINCT concat(val1, val2, val3)) ...
-- I did this
concat(group_concat(DISTINCT val1,val2), group_concat(DISTINCT val1,val3)) ...
The DISTINCT on each small group of values prevents all of those duplicates.
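Applied to the small test case, the same idea might look like the sketch below. It assumes the t1, t2 and t3 tables and rows from the question are loaded on fresh tables; the column aliases are invented, and this is not the author's actual big query.

```sql
-- Assumes t1, t2 and t3 from the question's test case already exist.
-- The double join still produces 8 intermediate rows, but DISTINCT inside
-- each GROUP_CONCAT collapses the repeats per child table.
SELECT t1.id,
       GROUP_CONCAT(DISTINCT t2.id2 ORDER BY t2.id2) AS t2_ids,
       GROUP_CONCAT(DISTINCT t3.id3 ORDER BY t3.id3) AS t3_ids
FROM t1
LEFT JOIN t2 ON t1.id = t2.id1
LEFT JOIN t3 ON t1.id = t3.id1
GROUP BY t1.id;
-- One row for id 1, with t2_ids = '1,2' and t3_ids = '1,2,3,4'.
```

This returns one row per t1 row rather than six separate rows, so it only fits cases where aggregated lists are acceptable.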
I'm not sure if this is the solution you're looking for:
select * from t1 left join t2 on (t1.id = t2.id1);
union all
select * from t1 left join t3 on (t1.id = t3.id1);
I think there is a small mistake in @nick rulez's query: the semicolon after the first SELECT ends the statement before the UNION ALL. Written like this, it really returns 6 rows:
(SELECT * FROM t1 LEFT JOIN t2 ON (t1.id = t2.id1))
UNION ALL
(SELECT * FROM t1 LEFT JOIN t3 ON (t1.id = t3.id1))
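The plain UNION in the question returned only 4 rows because UNION removes duplicates, and both rows from the t2 join, (1,1,1) and (1,2,1), also appear in the t3 join. A possible refinement, sketched here with an invented src column and child_id alias, tags each row with its origin so the two sources stay distinguishable. It assumes the t1, t2 and t3 test case from the question is loaded.

```sql
-- Assumes t1, t2 and t3 from the question's test case already exist.
-- The literal in the first column records which child table a row came from.
SELECT 't2' AS src, t1.id, t2.id2 AS child_id
FROM t1
LEFT JOIN t2 ON t1.id = t2.id1
UNION ALL
SELECT 't3' AS src, t1.id, t3.id3
FROM t1
LEFT JOIN t3 ON t1.id = t3.id1;
-- 6 rows: two tagged 't2' and four tagged 't3'.
```

With the tag in place even a plain UNION would keep all six rows, because no row from one branch can equal a row from the other.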