empty result when using a subquery select - mysql

I have two tables (table1, table2) in a database (both with type InnoDB). They both have a column "article". In table1 "article" is the primary index, in table2 "article" is defined as "unique". Both of those columns have data type varchar(32), also the same collation.
I am trying to get a list of all "article" values which are in table1, but NOT in table2.
table1 contains about 5000 rows, table2 contains about 3000 rows, so I should get at least 2000 "article" values as a result. My query looks like this:
SELECT article FROM table1
WHERE article NOT IN
(SELECT article FROM table2);
But this returns an empty result...
When I do it the other way around (i.e. select all "article"s from table2 which are not in table1), it works, that query returns around 700 values.
I suppose this must have to do with the different index/unique status of "article" in the two tables. But how can I modify the query to get it working?

Use a left join instead. It is faster with many values anyway:
SELECT t1.article
FROM table1 t1
LEFT JOIN table2 t2 ON t1.article = t2.article
WHERE t2.article IS NULL

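I just found a second solution myself (although the accepted answer works fine): adding a WHERE clause to the subquery makes the whole query work. I used a condition that every real value in table2 satisfies (i.e. WHERE article != ""). The likely reason this helps is that the condition also filters out NULLs: "article" in table2 is only "unique", so it can contain NULL, and a single NULL in the subquery result prevents NOT IN from ever being true. So the complete (working) query now looks like this: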
SELECT article FROM table1
WHERE article NOT IN
(SELECT article FROM table2 WHERE article != "");
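The empty result is consistent with standard SQL three-valued logic, assuming table2.article holds at least one NULL (the question does not confirm this). The sketch below uses Python's sqlite3 module as a stand-in for MySQL, with tiny made-up tables; the sample rows and the helper name `articles` are illustrative only.

```python
import sqlite3

# Hypothetical miniature versions of table1/table2 (SQLite stands in for MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (article VARCHAR(32) PRIMARY KEY);
    CREATE TABLE table2 (article VARCHAR(32) UNIQUE);
    INSERT INTO table1 VALUES ('a'), ('b'), ('c');
    INSERT INTO table2 VALUES ('a'), (NULL);  -- assumption: one NULL article
""")

def articles(sql):
    """Run a single-column query and return its values sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# 'b' NOT IN ('a', NULL) evaluates to NULL, not TRUE, so no row qualifies.
plain_not_in = articles(
    "SELECT article FROM table1 "
    "WHERE article NOT IN (SELECT article FROM table2)")

# Filtering NULLs out of the subquery restores the expected rows.
filtered_not_in = articles(
    "SELECT article FROM table1 "
    "WHERE article NOT IN (SELECT article FROM table2 WHERE article IS NOT NULL)")

# NOT EXISTS and the LEFT JOIN anti-join are not affected by the NULL.
not_exists = articles(
    "SELECT article FROM table1 t1 "
    "WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.article = t1.article)")
left_join = articles(
    "SELECT t1.article FROM table1 t1 "
    "LEFT JOIN table2 t2 ON t1.article = t2.article "
    "WHERE t2.article IS NULL")

print(plain_not_in, filtered_not_in, not_exists, left_join)
```

If this is the cause, `WHERE article IS NOT NULL` states the intent more directly than `article != ""`, which would also drop empty strings.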

Related

MySQL sub select and return multiple records from the sub select table

I don't know if this is possible, but can MySQL do a sub select and retrieve multiple records?
Here is my simplified query:
SELECT table1.*,
(
SELECT table2.*
FROM Table2 table2
WHERE table2.key_id = table1.key_id
)
FROM Table1 table1
Basically, Table2 has X amount of records that I need to pull back in the query and I don't want to have to run a secondary query (for instance get the results from Table1 and then loop over those results and then get all the results from Table2).
Thanks.
No. The subquery in the SELECT clause is called a scalar subquery. A scalar subquery has two important properties:
It can only retrieve one column.
It can only retrieve zero or one rows.
A scalar subquery -- as its name implies -- substitutes for a scalar value in an expression. If the subquery returns no rows, the value used in the expression is NULL.
In your case, you can use a LEFT JOIN instead:
SELECT t1.*, t2.*
FROM Table1 t1 LEFT JOIN
Table2 t2
ON t2.key_id = t1.key_id;
Note that table aliases are a good thing. However, they should make the query simpler, so repeating the table name is not a big win.
MySQL can do a subquery that returns multiple rows or multiple columns, but it's not valid to do that in a scalar context.
You're putting a subquery in a scalar context. In other words, in the select-list, a subquery must return one column and one row (or zero rows), because its result is used as a single value in each row of the result set.
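To illustrate the LEFT JOIN suggestion, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The columns `name`, `id` and `detail` and the sample rows are assumptions made for the example; only `key_id` comes from the question. The join returns one row per matching Table2 record, and the application then groups them per Table1 row without issuing a second query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas; only key_id appears in the original question.
    CREATE TABLE Table1 (key_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, key_id INTEGER, detail TEXT);
    INSERT INTO Table1 VALUES (1, 'first'), (2, 'second');
    INSERT INTO Table2 VALUES (10, 1, 'x'), (11, 1, 'y');
""")

# One query: each Table1 row repeats once per matching Table2 row,
# and appears once with NULLs when there is no match.
rows = conn.execute("""
    SELECT t1.key_id, t1.name, t2.detail
    FROM Table1 t1
    LEFT JOIN Table2 t2 ON t2.key_id = t1.key_id
    ORDER BY t1.key_id, t2.id
""").fetchall()

# Group the joined rows per Table1 record on the client side.
grouped = {}
for key_id, name, detail in rows:
    details = grouped.setdefault((key_id, name), [])
    if detail is not None:  # None marks "no Table2 row for this key"
        details.append(detail)

print(rows)
print(grouped)
```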

Query Two Tables - but not trying to JOIN?

I have two tables with almost identical columns. The first table contains the "current" state of a particular record and the second table contains all the previous states of that record (it's a history table). The second table has a FK to the first table.
I'd like to query both tables so I get the entire record's history, including its current state, in one result. I don't think a JOIN is what I'm trying to do as that "joins" multiple tables "horizontally" (one or more columns of one table combined with one or more columns of another table to produce a result that includes columns from both tables). Rather, I'm trying to "join"(???) the tables "vertically" (meaning, no columns are getting added to the result, just that the results from both tables are falling under the same columns in the result set).
Not exactly sure if what I'm expressing makes sense -- or if it's possible in MySQL.
To accomplish this, you could use a UNION between two SELECT statements. I would also suggest selecting from a derived table in the following manner so that you can sort by columns in your result set. Suppose we wanted to combine results from the following two queries:
SELECT FieldA, FieldB FROM table1;
SELECT FieldX, FieldY FROM table2;
We could join these with a UNION statement as follows:
SELECT Field1, Field2 FROM (
SELECT FieldA AS `Field1`, FieldB AS `Field2` FROM table1
UNION SELECT FieldX AS `Field1`, FieldY AS `Field2` FROM table2)
AS `derived_table`
ORDER BY Field1 ASC, Field2 DESC
In this example, I have selected from table1 and table2 fields which are similar, but not identically named, sharing the same data type. They are matched up using aliases (e.g., FieldA in table1 and FieldX in table2 both map to Field1 in the result set, etc.).
If each table has the same column names, field aliasing is not required, and the query becomes simpler.
Note: In MySQL it is necessary to name derived tables, even if the name given is not intended to be used.
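The derived-table UNION above can be tried out with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows are invented. It also shows a point worth keeping in mind for a history table: UNION removes duplicate rows, whereas UNION ALL keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample data; the column names follow the answer above.
    CREATE TABLE table1 (FieldA INTEGER, FieldB TEXT);
    CREATE TABLE table2 (FieldX INTEGER, FieldY TEXT);
    INSERT INTO table1 VALUES (2, 'current'), (1, 'current');
    INSERT INTO table2 VALUES (1, 'old'), (1, 'current');
""")

# Same shape as the query above; {op} is swapped between UNION and UNION ALL.
query = """
    SELECT Field1, Field2 FROM (
        SELECT FieldA AS Field1, FieldB AS Field2 FROM table1
        {op} SELECT FieldX AS Field1, FieldY AS Field2 FROM table2
    ) AS derived_table
    ORDER BY Field1 ASC, Field2 DESC
"""

union_rows = conn.execute(query.format(op="UNION")).fetchall()
union_all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()

print(union_rows)      # the duplicate (1, 'current') is collapsed
print(union_all_rows)  # every row from both tables is kept
```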
UNION.
Select colA, colB From TblA
UNION
Select colA, colB From TblB
You're after a left join on the first table. That will make the right-side id either a number (exists in both) or NULL (exists only in the left table).
You want
select lhs.* , rhs.id from lhs left join rhs using(Id)

Nested SELECT SQL Queries Workbench

Hi, I have this query, but it gives me the error "Operand should contain 1 column(s)" and I'm not sure why.
Select *,
(Select *
FROM InstrumentModel
WHERE InstrumentModel.InstrumentModelID=Instrument.InstrumentModelID)
FROM Instrument
Judging by your query, you want to get data from both the Instrument and InstrumentModel tables. A subquery in the select list cannot return all the columns of another table, which is what the "Operand should contain 1 column(s)" error is telling you. To fetch results from both tables by matching rows, use a join. You can also select particular fields as tableName.fieldName and put your matching condition in the WHERE clause.
For example:
select Instrument.x,InstrumentModel.y
from instrument,instrumentModel
where instrument.x=instrumentModel.y
You can use a join to select from 2 connected tables
select *
from Instrument i
join InstrumentModel m on m.InstrumentModelID = i.InstrumentModelID
When you use subqueries in the column list, they need to return exactly one value. You can read more in the documentation
As a user commented in the documentation, using subqueries like this can ruin your performance:
When the same subquery is used several times, MySQL does not use this fact to optimize the query, so be careful not to run into performance problems.
Example:
SELECT
col0,
(SELECT col1 FROM table1 WHERE table1.id = table0.id),
(SELECT col2 FROM table1 WHERE table1.id = table0.id)
FROM
table0
WHERE ...
The join of table0 with table1 is executed once for EACH subquery, leading to very bad performance for this kind of query.
Therefore you should rather join the tables, as described by the other answer.
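As a rough check that the join is a drop-in replacement, the sketch below runs both forms on invented data using Python's sqlite3 module as a stand-in for MySQL. It only demonstrates that the results match when table1.id is unique; it says nothing about how either engine optimizes the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample data; table and column names follow the example above.
    CREATE TABLE table0 (id INTEGER PRIMARY KEY, col0 TEXT);
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    INSERT INTO table0 VALUES (1, 'a'), (2, 'b');
    INSERT INTO table1 VALUES (1, 'a1', 'a2');
""")

# Two scalar subqueries, each looking up the same table1 row.
with_subqueries = conn.execute("""
    SELECT col0,
           (SELECT col1 FROM table1 WHERE table1.id = table0.id),
           (SELECT col2 FROM table1 WHERE table1.id = table0.id)
    FROM table0
    ORDER BY table0.id
""").fetchall()

# One LEFT JOIN fetching both columns at once; LEFT keeps table0 rows
# without a match, mirroring the NULL a scalar subquery would yield.
with_join = conn.execute("""
    SELECT table0.col0, table1.col1, table1.col2
    FROM table0
    LEFT JOIN table1 ON table1.id = table0.id
    ORDER BY table0.id
""").fetchall()

print(with_subqueries)
print(with_join)
```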

What's the SELECT list and the subquery that is ignored in an EXISTS statement?

This is a quote from http://dev.mysql.com/doc/refman/5.0/en/exists-and-not-exists-subqueries.html: "If a subquery returns any rows at all, EXISTS subquery is TRUE, and NOT EXISTS subquery is FALSE. For example:
SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);
Traditionally, an EXISTS subquery starts with SELECT *, but it could begin with SELECT 5 or SELECT column1 or anything at all. MySQL ignores the SELECT list in such a subquery, so it makes no difference."
What do the last two sentences mean? Can I have an example of why this is important? I've realized that regardless of what I use in my initial SELECT, I get all columns in my result sets. Is that what this is talking about?
These sentences are not about the SELECT column... part but ONLY about the ... EXISTS (SELECT *... part. The two sentences tell you that the following statements are equivalent:
SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);
SELECT column1 FROM t1 WHERE EXISTS (SELECT 5 FROM t2);
SELECT column1 FROM t1 WHERE EXISTS (SELECT 42 FROM t2);
SELECT column1 FROM t1 WHERE EXISTS (SELECT RAND() FROM t2);
SELECT column1 FROM t1 WHERE EXISTS (SELECT column1 FROM t2);
SELECT column1 FROM t1 WHERE EXISTS (SELECT NULL FROM t2);
SELECT column1 FROM t1 WHERE EXISTS (SELECT 1/0 FROM t2);
No more, no less.
Since your returned relation is specified in the first SELECT column1 part, the second SELECT has no influence on the returned rows.
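The equivalence can be checked with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL (both treat EXISTS as a pure row-existence test), and the table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample data for t1 and t2.
    CREATE TABLE t1 (column1 INTEGER);
    CREATE TABLE t2 (column1 INTEGER);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (7);
""")

# Try several select lists inside the EXISTS subquery.
select_lists = ["*", "5", "42", "column1", "NULL"]
results = {
    select_list: conn.execute(
        "SELECT column1 FROM t1 "
        f"WHERE EXISTS (SELECT {select_list} FROM t2) "
        "ORDER BY column1"
    ).fetchall()
    for select_list in select_lists
}

# Only the presence of rows in t2 matters: once t2 is empty,
# EXISTS is false and the outer query returns nothing.
conn.execute("DELETE FROM t2")
after_delete = conn.execute(
    "SELECT column1 FROM t1 WHERE EXISTS (SELECT NULL FROM t2)"
).fetchall()

print(results)
print(after_delete)
```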
The last two sentences are basically saying that, as long as the subquery returns something that is NOT an empty table, then what you put in that SELECT statement is moot as you are merely checking for existence. Example:
SELECT DISTINCT store_type FROM stores
WHERE EXISTS (SELECT 'poop' FROM cities_stores
WHERE cities_stores.store_type = stores.store_type);
The subquery produces 'poop' for every row it matches, but that value never reaches the result set. As long as there is at least one row in cities_stores whose store_type equals the outer SELECT's store_type, EXISTS is true and that store_type is returned.
Importance? In this example, say we own a lot of stores of varying types. All of a sudden we want to know all the different store types that we have in existence with a couple of caveats: the store type is a type found in a city (represented by cities_stores), and also NOT ones that are in the works, but the ones that actually are open (represented by our stores table). Well, we would use this query to get a list of all the store_types.
And the reason you are getting all the columns is because of the *, which means all columns in the table(s) you SELECTed FROM.
It says that if you have the following select:
SELECT a1,b1,c1,...,z1 FROM t1 WHERE EXISTS (SELECT a2,b2,c2,...,z2 FROM t2)
The column list a2,b2,c2,...,z2 is ignored, as its value is not necessary to decide if a row exists or not. This saves memory and computation time.
edit:
To make things clearer: EXISTS tests for non-emptiness of the query result. Since one doesn't need to know WHAT was returned by the query to know if it returned something or not, the select columns can be ignored.
When using EXISTS you only check for the existence of a row, independent of the values. The SELECT part is important if you use IN.
So the selected columns for EXISTS can be ignored, whereas for an IN subquery it is important.
I think you are confusing uncorrelated subqueries (which is what your code is), where the subquery can be evaluated independently of the "outer" query, that is, without the context of the superquery:
SELECT column1
FROM t1
WHERE EXISTS
( SELECT *
FROM t2
)
with correlated subqueries, where the subquery cannot stand on its own. The correlated-subquery code analogous to the above uncorrelated subquery code is:
SELECT column1
FROM t1
WHERE EXISTS
( SELECT *
FROM t2
WHERE t2.somecolumn = t1.somecolumn
)
and can usually be written also as:
SELECT column1
FROM t1
WHERE somecolumn IN
( SELECT somecolumn
FROM t2
)
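The difference between the three forms can be seen on a few invented rows. This sketch uses Python's sqlite3 module as a stand-in for MySQL; the helper `column1_values` and the sample data are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up sample data; column names follow the queries above.
    CREATE TABLE t1 (column1 TEXT, somecolumn INTEGER);
    CREATE TABLE t2 (somecolumn INTEGER);
    INSERT INTO t1 VALUES ('a', 1), ('b', 2), ('c', 3);
    INSERT INTO t2 VALUES (2), (3), (3);
""")

def column1_values(sql):
    """Run a query returning column1 and return the values sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# Uncorrelated: the subquery is the same for every t1 row, so the
# filter is all-or-nothing (here t2 is non-empty, so all rows pass).
uncorrelated = column1_values(
    "SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2)")

# Correlated: the subquery is evaluated in the context of each t1 row.
correlated = column1_values(
    "SELECT column1 FROM t1 WHERE EXISTS "
    "(SELECT * FROM t2 WHERE t2.somecolumn = t1.somecolumn)")

# The IN form gives the same rows as the correlated EXISTS here.
with_in = column1_values(
    "SELECT column1 FROM t1 WHERE somecolumn IN (SELECT somecolumn FROM t2)")

print(uncorrelated, correlated, with_in)
```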

Finding the intersection between two columns

I'm trying to find the (set) intersection between two columns in the same table in MySQL. I basically want to find the rows that have either a col1 element that is in the table's col2, or a col2 element that is in the table's col1.
Initially I tried:
SELECT * FROM table WHERE col1 IN (SELECT col2 FROM table)
which was syntactically valid, however the run-time is far too high. The number of rows in the table is ~300,000 and the two columns in question are not indexed. I assume the run time is either n^2 or n^3 depending on whether MySQL executes the subquery again for each element of the table or if it stores the result of the subquery temporarily.
Next I thought of taking the union of the two columns and removing distinct elements, because if an element shows up more than once in this union then it must have been present in both columns (assuming both columns contain only distinct elements).
Is there a more elegant (i.e. faster) way to find the set intersection between two columns of the same table?
SELECT t1.*
FROM table t1
INNER JOIN table t2
ON t1.col1 = t2.col2
Creating indexes on col1 and col2 would go a long way to help this query as well.
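If you only want the values, try the INTERSECT operator (note that MySQL only supports INTERSECT from version 8.0.31 onward; on older versions, use the join above):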
(SELECT col1 FROM table) INTERSECT (SELECT col2 FROM table)
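Both suggestions can be compared on a toy table. The sketch uses Python's sqlite3 module as a stand-in for MySQL; the table name `pairs`, the index names and the rows are assumptions for the example (`table` itself is a reserved word). SQLite does not accept parentheses around the operands of INTERSECT, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table standing in for the asker's table.
    CREATE TABLE pairs (col1 INTEGER, col2 INTEGER);
    INSERT INTO pairs VALUES (1, 2), (2, 3), (3, 9), (7, 8);
    -- Indexes on both columns, as the answer above recommends.
    CREATE INDEX idx_pairs_col1 ON pairs (col1);
    CREATE INDEX idx_pairs_col2 ON pairs (col2);
""")

# Self-join: full rows whose col1 value also occurs somewhere in col2.
join_rows = conn.execute("""
    SELECT t1.col1, t1.col2
    FROM pairs t1
    INNER JOIN pairs t2 ON t1.col1 = t2.col2
    ORDER BY t1.col1
""").fetchall()

# INTERSECT: only the distinct values present in both columns.
common_values = sorted(
    row[0] for row in conn.execute(
        "SELECT col1 FROM pairs INTERSECT SELECT col2 FROM pairs")
)

print(join_rows)
print(common_values)
```

Note that the join returns a row once per match, so duplicates in col2 would repeat rows unless DISTINCT is added, whereas INTERSECT always yields distinct values.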