MySQL nested queries execution - mysql

I am writing a nested MySQL query where a subquery returns more than one row, so the query cannot be executed.
Can anyone suggest a solution for this problem?
Thanks in advance.

An error about a subquery returning more than one value says to me that you're attempting a straight value comparison, like this:
WHERE col = (SELECT col2 FROM TABLE_2)
The solution depends on the data coming from the subquery - do you want the query to use all the values being returned? If yes, then change the equals sign for an IN:
WHERE col IN (SELECT col2 FROM TABLE_2)
Otherwise, you need to correct the subquery so it only ever returns one value. The MAX or MIN aggregate functions are a possibility: they'll return the highest or lowest value. It could just be a matter of correlating the subquery:
FROM TABLE_1 t1
WHERE t1.col = (SELECT MAX(t2.col2)
FROM TABLE_2 t2
WHERE t2.fk_col = t1.id) -- correlated example
As Tabhaza points out, a subquery generally doesn't return more than one column (though some databases support tuple matching), in which case you need to define a derived table/inline view and join to it.
Would've been nice to have more information on the issue you're having...
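A minimal sketch of the two fixes above (IN, and a correlated MAX), run through Python's sqlite3 module as a stand-in for MySQL. The table contents are invented for illustration. One caveat: SQLite does not raise the "more than one row" error for a scalar subquery (it quietly uses the first row), so only the fixes are shown here, not the failure itself.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id INTEGER PRIMARY KEY, col INTEGER);
CREATE TABLE table_2 (fk_col INTEGER, col2 INTEGER);
INSERT INTO table_1 VALUES (1, 10), (2, 7), (3, 99);
INSERT INTO table_2 VALUES (1, 5), (1, 10), (2, 20), (2, 7);
""")

# Fix 1: IN accepts every value the subquery returns.
in_rows = conn.execute("""
SELECT id FROM table_1
WHERE col IN (SELECT col2 FROM table_2)
ORDER BY id
""").fetchall()

# Fix 2: a correlated subquery with MAX yields at most one value per outer row.
max_rows = conn.execute("""
SELECT t1.id FROM table_1 t1
WHERE t1.col = (SELECT MAX(t2.col2)
                FROM table_2 t2
                WHERE t2.fk_col = t1.id)
ORDER BY t1.id
""").fetchall()

print(in_rows)   # rows whose col appears anywhere in table_2.col2
print(max_rows)  # rows whose col equals the largest related col2
```

With this data the two queries return different rows, which is the point of the question "do you want to use all the values being returned?": the answer changes the result, not just the syntax.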

Try joining to a derived table rather than doing a subquery; it will allow you to return multiple fields:
SELECT a.Field1, a.Field2, b.Field3, b.Field4
FROM table1 a INNER JOIN
(SELECT c.Field3, c.Field4, c.Key FROM table2 as c) as b ON a.Key = b.Key
WHERE ...
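A runnable sketch of the derived-table join above, using Python's sqlite3 module in place of MySQL. The rows are made up, and the join column is called k here (an assumption, not the original name) because KEY is a reserved word in MySQL and would need quoting.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (k INTEGER, field1 TEXT, field2 TEXT);
CREATE TABLE table2 (k INTEGER, field3 TEXT, field4 TEXT);
INSERT INTO table1 VALUES (1, 'a1', 'a2'), (2, 'b1', 'b2');
INSERT INTO table2 VALUES (1, 'x3', 'x4'), (3, 'z3', 'z4');
""")

# The derived table b exposes several columns at once,
# which a scalar subquery in the SELECT list cannot do.
joined = conn.execute("""
SELECT a.field1, a.field2, b.field3, b.field4
FROM table1 a
INNER JOIN (SELECT c.field3, c.field4, c.k FROM table2 AS c) AS b
        ON a.k = b.k
""").fetchall()

print(joined)
```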

this sounds like a logic problem, not a syntax problem.
why is the subquery returning more than one row?
why do you have that in a place that requires only one row?
you need to restructure something to fit these two things together. without any indication of your system, your query, or your intent, it is very hard to help further.

If the database says you are returning more than one row, you should listen to what it says and change your query so that it only returns one row.
This is a problem in your logic.
Change the query so that it only returns one row.
Think about why the query is returning more than one row, and determine how to get the query to return just the single row you need from that result.

Use a LIMIT clause on the subquery so it always returns a maximum of 1 row

You could add a LIMIT 1 to the subquery so the top query only considers the first result. You can also sort the results from the subquery before doing the LIMIT, to return the result with the highest/lowest X. But make sure that that's actually what you want to happen, as the multi-row subquery is often a symptom of an underlying problem.
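As a sketch of the ORDER BY plus LIMIT 1 idea, here is a small example using Python's sqlite3 module as a stand-in for MySQL. The customers and orders tables are hypothetical; the subquery is sorted first so that "the first result" is a deliberate choice (the largest total) rather than an arbitrary row.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; tables and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
INSERT INTO orders VALUES (1, 1, 30), (2, 1, 50), (3, 2, 20);
""")

# Without ORDER BY, LIMIT 1 would pick whichever row happens to come first.
top_totals = conn.execute("""
SELECT c.name,
       (SELECT o.total
        FROM orders o
        WHERE o.customer_id = c.id
        ORDER BY o.total DESC
        LIMIT 1) AS top_total
FROM customers c
ORDER BY c.id
""").fetchall()

print(top_totals)
```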

Related

Join Performances When Searching For NULL Value

I need to find a value that exists in LoyaltyTransactionBasketItemStores table but not in DimProductConsolidate table. I need the item code and its corresponding company. This is my query
SELECT
A.ProductReference, A.CompanyCode
FROM
(SELECT ProductReference, CompanyCode FROM dwhdb.LoyaltyTransactionsBasketItemsStores GROUP BY ProductReference) A
LEFT JOIN
(SELECT LoyaltyVariantArticleCode FROM dwhdb.DimProductConsolidate) B ON B.LoyaltyVariantArticleCode = A.ProductReference
WHERE
B.LoyaltyVariantArticleCode IS NULL
It is a pretty straightforward query. But when I run it, it takes over an hour and still does not finish. Then I used EXPLAIN and this is the result
But when I remove CompanyCode from my query, performance improves a lot. This is the EXPLAIN result
I want to know why this is happening, and whether there is any way to get ProductReference and its company with much better performance?
Your current query is rife with syntax and structural errors. I would use exists logic here:
SELECT a.ProductReference, a.CompanyCode
FROM dwhdb.LoyaltyTransactionsBasketItemsStores a
WHERE NOT EXISTS (SELECT 1 FROM dwhdb.DimProductConsolidate b
WHERE b.LoyaltyVariantArticleCode = a.ProductReference);
Your current query does a GROUP BY in the first subquery, yet it selects no aggregates, only other non-aggregate columns. On most other databases, and even on MySQL in strict mode, this syntax is not allowed. Also, there is no need to have 2 subqueries here. Rather, just select from the basket table and then assert that matching records do not exist in the other table.
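To make the NOT EXISTS pattern concrete, here is a small sketch using Python's sqlite3 module in place of MySQL. The table and column names are shortened, hypothetical stand-ins for the ones in the question, and the rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; names and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE basket_items (product_reference TEXT, company_code TEXT);
CREATE TABLE dim_product (article_code TEXT);
INSERT INTO basket_items VALUES ('P1', 'C1'), ('P2', 'C1'), ('P2', 'C2'), ('P3', 'C2');
INSERT INTO dim_product VALUES ('P1'), ('P3');
""")

# Keep only basket rows with no matching product in the dimension table.
missing = conn.execute("""
SELECT a.product_reference, a.company_code
FROM basket_items a
WHERE NOT EXISTS (SELECT 1
                  FROM dim_product b
                  WHERE b.article_code = a.product_reference)
ORDER BY a.product_reference, a.company_code
""").fetchall()

print(missing)
```

Each product and company pair comes back as-is, so no GROUP BY is needed just to carry the company code along. If the basket table holds the same pair many times, SELECT DISTINCT would collapse the repeats.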

MySQL aggregate function to filter nulls and conform with ONLY_FULL_GROUP_BY

I have a single record which joins to N other tables, and extracts a single column from each of them. I would like to put all N of those extracted columns in a single record.
After constructing the diagram below it seems like I can get to the second step easily, and then I should be able to use an aggregate function to filter out the NULL's. I have looked around for something like GROUP_COALESCE, but I couldn't find something which accomplishes this.
I have a fiddle here which unfortunately works, because MySQL will let you select columns which aren't in the GROUP BY without an aggregate at your own peril http://sqlfiddle.com/#!9/304992/1/0.
Is there a way I can make sure that it always selects the column from the record, if the record exists?
The end result should be one record per group, and each column would contain the value which was inside the only row successfully joined for that group.
If I followed you correctly, you can just use aggregate functions on the columns coming from the joined tables. Aggregate functions ignore null values, so, since you have two null values and one non-null value for each column and each group, this will return the expected output (while conforming to the ONLY_FULL_GROUP_BY option).
SELECT
group_table_id,
MAX(t1.v) t1_v,
MAX(t2.v) t2_v,
MAX(t3.v) t3_v
FROM group_table
LEFT JOIN t1 ON t1.group_id = group_table_id
LEFT JOIN t2 ON t2.group_id = group_table_id
LEFT JOIN t3 ON t3.group_id = group_table_id
GROUP BY group_table_id
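The NULL-skipping behaviour of MAX can be checked in isolation. This sketch uses Python's sqlite3 module rather than MySQL, and a hypothetical table named sparse that mimics the intermediate step from the question: several rows per group, each with only one column filled in.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sparse (group_id INTEGER, t1_v TEXT, t2_v TEXT, t3_v TEXT);
INSERT INTO sparse VALUES
    (1, 'a',  NULL, NULL),
    (1, NULL, 'b',  NULL),
    (1, NULL, NULL, 'c'),
    (2, 'd',  NULL, NULL),
    (2, NULL, NULL, 'f');
""")

# MAX skips NULLs, so each column collapses to its single non-NULL value.
# A column that is NULL in every row of the group stays NULL.
collapsed = conn.execute("""
SELECT group_id, MAX(t1_v), MAX(t2_v), MAX(t3_v)
FROM sparse
GROUP BY group_id
ORDER BY group_id
""").fetchall()

print(collapsed)
```

Every selected column is either in the GROUP BY or wrapped in an aggregate, which is what ONLY_FULL_GROUP_BY asks for.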

Is using correlated subquery better than join? (indexing perspective)

I read somewhere something like this:
Indexes will be used per queries.
So as you know, this is two queries:
SELECT m1.*, (SELECT 1 FROM mytable2 m2 WHERE col2 = ?) AS sth
FROM mytable1 m1 WHERE col1 = ?
Well query above can use two indexes: mytable1(col1), mytable2(col2). Because of being two separated queries.
Now take a look at this one: (the same as previous query, just uses join instead of subquery)
SELECT m1.*, m2.1 AS sth
FROM mytable1 m1
JOIN mytable2 m2 ON m2.col2 = ?
WHERE m1.col1 = ?
But this ^ query, is just one query. So it can use just one index. Is my understanding right? So using subquery is better for indexing, right?
You misunderstand. MySQL can use one index per table reference.
So in this case, it can use both indexes: mytable1(col1), mytable2(col2).
You can even use two different indexes from the same table, if you do a self-join or a UNION or a subquery. Each time you reference the table counts as a separate table reference.
SELECT m1.*, m2.1 AS sth
FROM mytable1 m1
JOIN mytable2 m2 ON m2.col2 = ?
WHERE m1.col1 = ?
Regardless of indexing, this is a strange query. You have no condition that relates mytable1 to mytable2, so you're doing a Cartesian product between the two tables. One or both tables may be selecting a single row, depending on your conditions for col1 and col2. But it's still a Cartesian product, so if the conditions on both tables return multiple rows, you'll get a result set with a lot of repetition.
This is too long for a comment.
The two queries are different, in multiple respects:
The first returns all rows in mytable1 that match the where condition, regardless of whether there is a match in the second table. The second only returns rows that match.
The first fails with an error if the subquery returns more than one row. The second returns multiple rows that match.
As a consequence, the first could return NULL for sth, the second cannot.
My advice is to first learn to write the query that meets your functional needs. Then worry about performance.
As for your question, both correlated subqueries and joins can make use of an index. The idea that correlated subqueries are always worse than joins is an old-wives' tale (no offense to old wives) that should be forgotten.
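The differences listed above can be seen on a tiny data set. This is only a sketch, run with Python's sqlite3 module instead of MySQL, with invented rows and an added id column. It covers the no-match case and the row multiplication of the join; it does not show the multi-row subquery error, because SQLite does not raise it.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable1 (id INTEGER PRIMARY KEY, col1 TEXT);
CREATE TABLE mytable2 (id INTEGER PRIMARY KEY, col2 TEXT);
INSERT INTO mytable1 VALUES (1, 'x'), (2, 'x');
INSERT INTO mytable2 VALUES (1, 'p'), (2, 'p');
""")

SUBQUERY_SQL = """
SELECT m1.id, (SELECT 1 FROM mytable2 m2 WHERE m2.col2 = ?) AS sth
FROM mytable1 m1
WHERE m1.col1 = ?
ORDER BY m1.id
"""
JOIN_SQL = """
SELECT m1.id, 1 AS sth
FROM mytable1 m1
JOIN mytable2 m2 ON m2.col2 = ?
WHERE m1.col1 = ?
"""

# No row in mytable2 matches 'zzz': the subquery form keeps the outer rows
# (with NULL for sth), while the join form returns nothing.
sub_no_match = conn.execute(SUBQUERY_SQL, ("zzz", "x")).fetchall()
join_no_match = conn.execute(JOIN_SQL, ("zzz", "x")).fetchall()

# Two rows match on each side: the join returns every combination.
join_many = conn.execute(JOIN_SQL, ("p", "x")).fetchall()

print(sub_no_match, join_no_match, len(join_many))
```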
Generally speaking, it all depends. At the end of the day, the database will create an execution plan, and depending on how it interprets your query, one might be better than the other. That said, a join is generally better.

only select the row if the field value is unique

I sort the rows on date. If I want to select every row that has a unique value in the last column, can I do this with sql?
So I would like to select the first row, second one, third one not, fourth one I do want to select, and so on.
What you want are not unique rows, but rather one per group. This can be done by taking the MIN(pk_artikel_Id) and GROUP BY fk_artikel_bron. This method uses an IN subquery to get the first pk_artikel_id and its associated fk_artikel_bron for each unique fk_artikel_bron and then uses that to get the remaining columns in the outer query.
SELECT * FROM tbl
WHERE pk_artikel_id IN
(SELECT MIN(pk_artikel_id) AS id FROM tbl GROUP BY fk_artikel_bron)
Although MySQL would permit you to add the rest of the columns in the SELECT list initially, avoiding the IN subquery, that isn't really portable to other RDBMS systems. This method is a little more generic.
It can also be done with a JOIN against the subquery, which may or may not be faster. Hard to say without benchmarking it.
SELECT *
FROM tbl
JOIN (
SELECT
fk_artikel_bron,
MIN(pk_artikel_id) AS id
FROM tbl
GROUP BY fk_artikel_bron) mins ON tbl.pk_artikel_id = mins.id
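Both variants above can be tried side by side. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, with an invented titel column and made-up rows, and checks that the IN form and the JOIN form pick the same row per fk_artikel_bron.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (pk_artikel_id INTEGER PRIMARY KEY, fk_artikel_bron INTEGER, titel TEXT);
INSERT INTO tbl VALUES (1, 10, 'a'), (2, 20, 'b'), (3, 10, 'c'), (4, 30, 'd'), (5, 20, 'e');
""")

# Variant 1: IN against the lowest id of each group.
via_in = conn.execute("""
SELECT pk_artikel_id FROM tbl
WHERE pk_artikel_id IN
      (SELECT MIN(pk_artikel_id) FROM tbl GROUP BY fk_artikel_bron)
ORDER BY pk_artikel_id
""").fetchall()

# Variant 2: JOIN against the same per-group minimum.
via_join = conn.execute("""
SELECT tbl.pk_artikel_id
FROM tbl
JOIN (SELECT fk_artikel_bron, MIN(pk_artikel_id) AS id
      FROM tbl
      GROUP BY fk_artikel_bron) mins ON tbl.pk_artikel_id = mins.id
ORDER BY tbl.pk_artikel_id
""").fetchall()

print(via_in, via_join)
```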
This is similar to Michael's answer, but does it with a self-join instead of a subquery. Try it out to see how it performs:
SELECT * from tbl t1
LEFT JOIN tbl t2
ON t2.fk_artikel_bron = t1.fk_artikel_bron
AND t2.pk_artikel_id < t1.pk_artikel_id
WHERE t2.pk_artikel_id IS NULL
If you have the right indexes, this type of join often outperforms subqueries (since derived tables don't use indexes).
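A quick check of the self-join form, again as a sketch with Python's sqlite3 module standing in for MySQL and invented rows. It says nothing about performance; it only confirms which rows the query keeps.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (pk_artikel_id INTEGER PRIMARY KEY, fk_artikel_bron INTEGER);
INSERT INTO tbl VALUES (1, 10), (2, 20), (3, 10), (4, 30), (5, 20);
""")

# A row survives only if no other row in the same group has a smaller id,
# i.e. the LEFT JOIN finds no partner and t2 is all NULL.
first_per_group = conn.execute("""
SELECT t1.pk_artikel_id
FROM tbl t1
LEFT JOIN tbl t2
       ON t2.fk_artikel_bron = t1.fk_artikel_bron
      AND t2.pk_artikel_id < t1.pk_artikel_id
WHERE t2.pk_artikel_id IS NULL
ORDER BY t1.pk_artikel_id
""").fetchall()

print(first_per_group)
```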
This non-standard, MySQL-only trick will select the first row encountered for each value of fk_artikel_bron.
select *
...
group by fk_artikel_bron
Like it or not, this query produces the output asked for.
Edited
I seem to be getting hammered here, so here's the disclaimer:
This only works for mysql 5+
Although the MySQL documentation says the row returned by this technique is not predictable (i.e. you could get any row as the "first" one encountered), in every case I've seen you get the first row in the order selected. So to get a predictable row in practice (this may stop working in future releases, but probably won't), select from an ordered result:
select * from (
select *
...
order by pk_artikel_id) x
group by fk_artikel_bron

What does it mean by select 1 from table?

I have seen many queries with something as follows.
Select 1
From table
What does this 1 mean, how will it be executed and, what will it return?
Also, in what type of scenarios, can this be used?
select 1 from table will return the constant 1 for every row of the table. It's useful when you want to cheaply determine if a record matches your where clause and/or join.
SELECT 1 FROM TABLE_NAME means "return 1 from the table". It is pretty unremarkable on its own, so normally it is used with WHERE and often EXISTS. As @gbn notes, this is not necessarily best practice, but it is common enough to be worth noting even if it isn't really meaningful. That said, I use it because others use it and it is immediately "more obvious". Of course, that might be a vicious chicken-and-egg issue, but I don't generally dwell on it.
SELECT * FROM TABLE1 T1 WHERE EXISTS (
SELECT 1 FROM TABLE2 T2 WHERE T1.ID= T2.ID
);
Basically, the above will return everything from table 1 which has a corresponding ID from table 2. (This is a contrived example, obviously, but I believe it conveys the idea. Personally, I would probably do the above as SELECT * FROM TABLE1 T1 WHERE ID IN (SELECT ID FROM TABLE2); as I view that as FAR more explicit to the reader unless there were a circumstantially compelling reason not to).
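For comparison, here is a sketch that runs both the EXISTS form and the IN form on the same made-up data, using Python's sqlite3 module in place of MySQL. The name column is an invented addition.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table2 (id INTEGER);
INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO table2 VALUES (2), (3), (3);
""")

# EXISTS only asks "is there at least one match?"; the selected 1 is never read.
via_exists = conn.execute("""
SELECT t1.id FROM table1 t1
WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t1.id = t2.id)
ORDER BY t1.id
""").fetchall()

# IN compares against the set of ids coming back from table2.
via_in = conn.execute("""
SELECT id FROM table1
WHERE id IN (SELECT id FROM table2)
ORDER BY id
""").fetchall()

print(via_exists, via_in)
```

Neither form duplicates the row for id 3 even though table2 contains it twice, which is one way they differ from a plain inner join.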
EDIT
There actually is one case which I forgot about until just now. In the case where you are trying to determine existence of a value in the database from an outside language, sometimes SELECT 1 FROM TABLE_NAME will be used. This does not offer significant benefit over selecting an individual column, but, depending on implementation, it may offer substantial gains over doing a SELECT *, simply because it is often the case that the more columns the DB returns to a language, the larger the data structure, which in turn means that more time will be taken.
If you mean something like
SELECT * FROM AnotherTable
WHERE EXISTS (SELECT 1 FROM table WHERE...)
then it's a myth that the 1 is better than
SELECT * FROM AnotherTable
WHERE EXISTS (SELECT * FROM table WHERE...)
The 1 or * in the EXISTS is ignored and you can write this as per Page 191 of the ANSI SQL 1992 Standard:
SELECT * FROM AnotherTable
WHERE EXISTS (SELECT 1/0 FROM table WHERE...)
It does what it says: it will always return the integer 1. It's used to check whether a record matching your where clause exists.
select 1 from table is used by some databases as a query to test a connection to see if it's alive, often used when retrieving or returning a connection to / from a connection pool.
The result is 1 for every record in the table.
To be slightly more specific, you would use this to do
SELECT 1 FROM MyUserTable WHERE user_id = 33487
instead of doing
SELECT * FROM MyUserTable WHERE user_id = 33487
because you don't care about looking at the results. Asking for the number 1 is very easy for the database (since it doesn't have to do any look-ups).
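From application code, that existence check might look like the sketch below. It uses Python's sqlite3 module as a stand-in for a MySQL driver; the helper name user_exists and the sample row are assumptions made for illustration, and LIMIT 1 is an addition so the database can stop at the first match.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyUserTable (user_id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO MyUserTable VALUES (33487, 'example user');
""")


def user_exists(connection, user_id):
    """Return True if a row with this user_id exists (hypothetical helper)."""
    row = connection.execute(
        "SELECT 1 FROM MyUserTable WHERE user_id = ? LIMIT 1",
        (user_id,),
    ).fetchone()
    # fetchone() gives None when no row matched; otherwise the tuple (1,).
    return row is not None


print(user_exists(conn, 33487))
print(user_exists(conn, 1))
```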
Although it is not widely known, a query can have a HAVING clause without a GROUP BY clause.
In such circumstances, the HAVING clause is applied to the entire set. Clearly, the SELECT clause cannot refer to any column, otherwise you would (correctly) get the error, "Column is invalid in select because it is not contained in the GROUP BY" etc.
Therefore, a literal value must be used (because SQL doesn't allow a resultset with zero columns -- why?!) and the literal value 1 (INTEGER) is commonly used: if the HAVING clause evaluates TRUE then the resultset will be one row with one column showing the value 1, otherwise you get the empty set.
Example: to find whether a column has more than one distinct value:
SELECT 1
FROM tableA
HAVING MIN(colA) < MAX(colA);
If you don't know whether there is any data in your table, you can use the following query:
SELECT cons_value FROM table_name;
For example:
SELECT 1 FROM employee;
It returns a single column with one row for each row in the table, and every row holds the same constant value (here, 1).
If there are no rows in your table, it returns nothing.
So this query tells you whether there is any data in the table, and the number of rows returned equals the number of rows in the table.
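This behaviour is easy to confirm. The sketch below uses Python's sqlite3 module rather than MySQL, with a hypothetical employee table and a second, empty table for contrast.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; tables and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE empty_table (id INTEGER PRIMARY KEY);
INSERT INTO employee VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
""")

# One row per employee, each containing only the constant 1.
ones = conn.execute("SELECT 1 FROM employee").fetchall()

# No rows in the table means no rows in the result.
nothing = conn.execute("SELECT 1 FROM empty_table").fetchall()

print(ones, nothing)
```

If only the number of rows matters, SELECT COUNT(*) returns it directly as a single value instead of one row per record.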
If you just want to check a true or false based on the WHERE clause, select 1 from table where condition is the cheapest way.
This means that you want the value 1 as output. Most of the time it is used in inner queries, because you want the outer query to depend on the result of the inner query. You don't always have to use 1; you can use some other specific value.
This will statically give you the value 1 as output.
I see it used a lot in SQL injection, such as:
www.urlxxxxx.com/xxxx.asp?id=99 union select 1,2,3,4,5,6,7,8,9 from database;
These numbers can be used to guess where the database exists, the column names of the database you specified, and the values in the tables.
it simple means that you are retrieving the number first column from table ,,,,means
select Emply_num,Empl_no From Employees ;
here you are using select 1 from Employees;
that means you are retrieving the Emply_num column.
Thanks
The reason is another one, at least for MySQL. This is from the MySQL manual
InnoDB computes index cardinality values for a table the first time that table is accessed after startup, instead of storing such values in the table. This step can take significant time on systems that partition the data into many tables. Since this overhead only applies to the initial table open operation, to “warm up” a table for later use, access it immediately after startup by issuing a statement such as SELECT 1 FROM tbl_name LIMIT 1
This is just used for convenience with IF EXISTS(). Otherwise you can go with
select * from [table_name]
In the case of 'IF EXISTS', we just need to know whether any row matching the specified condition exists; the content of the row doesn't matter.
select 1 from Users
The example above returns as many rows as there are users, each with 1 in a single column.