MySQL GROUP BY functional dependence in subquery - mysql

I'm writing a query to find duplicate rows in a table of people (including each duplicate):
SELECT *
FROM Person
WHERE CONCAT(firstName,lastName) IN (
SELECT CONCAT(firstName,lastName) AS name
FROM Person
GROUP BY CONCAT(firstName,lastName)
HAVING COUNT(*) > 1
)
When running this in MySQL 8.0.19 with ONLY_FULL_GROUP_BY enabled, it fails with the following error:
Query 1 ERROR: Expression #1 of HAVING clause is not in GROUP BY clause and contains nonaggregated column 'Person.firstName' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I can't figure out how to fix this. I tried changing COUNT(*) to COUNT(CONCAT(firstName,lastName)) but that didn't help.
What's odd is that a) it runs fine in MariaDB 10.2, with or without ONLY_FULL_GROUP_BY, and b) running the subquery by itself causes no issue.
What am I doing wrong? It almost seems like a bug in MySQL.
[edit]: I certainly appreciate alternative solutions to my query, however I'm really interested in an answer as to why my error is occurring.

try like below it will do the same that you tried
SELECT *
FROM Person
WHERE (firstName,lastName) IN (
SELECT firstName,lastName
FROM Person
GROUP BY firstName,lastName
HAVING COUNT(*) > 1
)

Do not merge fields:
SELECT *
FROM Person
WHERE (firstName,lastName) IN (
SELECT firstName,lastName AS name
FROM Person
GROUP BY firstName,lastName
HAVING COUNT(*) > 1
)
Or use ANY_VALUE() function:
SELECT *
FROM Person
WHERE CONCAT(firstName,lastName) IN (
SELECT ANY_VALUE(CONCAT(firstName,lastName)) AS name
FROM Person
GROUP BY CONCAT(firstName,lastName)
HAVING COUNT(*) > 1
)

I would write your query with exists logic:
SELECT p1.*
FROM Person p1
WHERE EXISTS (SELECT 1 FROM Person p2
WHERE p2.firstName = p1.firstName AND
p2.lastName = p1.lastName AND
p2.id <> p1.id);
This effectively says to select every person for whom we can find another, different, person (going by the primary key id column, or whatever the PK might be), with same first and last name.
The following index may speed up the above query:
CREATE INDEX idx ON Person (lastName, firstName);
This should allow the exists lookup to evaluate quickly. Note that on InnoDB, MySQL should automatically cover id by adding it to the end of the above two-column index.
Regarding your error, I can't help but wonder if perhaps the problem is that you did not use proper aliases in the subquery, leading MySQL to think that you are referring to the columns in the outer query. Try this version:
SELECT p1.*
FROM Person p1
WHERE CONCAT(firstName, lastName) IN (
SELECT CONCAT(p2.firstName, p2.lastName)
FROM Person p2
GROUP BY CONCAT(p2.firstName, p2.lastName)
HAVING COUNT(*) > 1
);

Related

How to use AVG() function after GROUP BY with CASE in MySQL [duplicate]

I am running this query on MySQL
SELECT ID FROM (
SELECT ID, msisdn
FROM (
SELECT * FROM TT2
)
);
and it is giving this error:
Every derived table must have its own alias.
What's causing this error?
Every derived table (AKA sub-query) must indeed have an alias. I.e. each query in brackets must be given an alias (AS whatever), which can the be used to refer to it in the rest of the outer query.
SELECT ID FROM (
SELECT ID, msisdn FROM (
SELECT * FROM TT2
) AS T
) AS T
In your case, of course, the entire query could be replaced with:
SELECT ID FROM TT2
I think it's asking you to do this:
SELECT ID
FROM (SELECT ID,
msisdn
FROM (SELECT * FROM TT2) as myalias
) as anotheralias;
But why would you write this query in the first place?
Here's a different example that can't be rewritten without aliases ( can't GROUP BY DISTINCT).
Imagine a table called purchases that records purchases made by customers at stores, i.e. it's a many to many table and the software needs to know which customers have made purchases at more than one store:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases)
GROUP BY customer_id HAVING 1 < SUM(1);
..will break with the error Every derived table must have its own alias. To fix:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases) AS custom
GROUP BY customer_id HAVING 1 < SUM(1);
( Note the AS custom alias).
I arrived here because I thought I should check in SO if there are adequate answers, after a syntax error that gave me this error, or if I could possibly post an answer myself.
OK, the answers here explain what this error is, so not much more to say, but nevertheless I will give my 2 cents, using my own words:
This error is caused by the fact that you basically generate a new table with your subquery for the FROM command.
That's what a derived table is, and as such, it needs to have an alias (actually a name reference to it).
Given the following hypothetical query:
SELECT id, key1
FROM (
SELECT t1.ID id, t2.key1 key1, t2.key2 key2, t2.key3 key3
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.key3 = 'some-value'
) AS tt
At the end, the whole subquery inside the FROM command will produce the table that is aliased as tt and it will have the following columns id, key1, key2, key3.
Then, with the initial SELECT, we finally select the id and key1 from that generated table (tt).

Select duplicate using partition by in MySQL?

I would like to select all duplicate from a table:
SELECT * FROM people HAVING (count(*) OVER (PARTITION BY name)) > 1;
Unfortunately I get the error:
Error Code: 3593. You cannot use the window function 'count' in this context.
One less elegant solution would be:
SELECT
*
FROM
people
WHERE
code IN (SELECT
name
FROM
people
GROUP BY name
HAVING COUNT(*) > 1);
How can I rewrite my first query to make it work?
If the code is identical then you can use exists :
select p.*
from people p
where exists (select 1 from people p1 where p1.name = p.name and p.code <> p1.code);
Use identity column instead if code column doesn't have a identity feature if entire table has no any identity column then your method works fine with following updated query :
SELECT p.*
FROM people p
WHERE name IN (SELECT name FROM people GROUP BY name HAVING COUNT(*) > 1);

MSQL error - #1248 - Every derived table must have its own alias [duplicate]

I am running this query on MySQL
SELECT ID FROM (
SELECT ID, msisdn
FROM (
SELECT * FROM TT2
)
);
and it is giving this error:
Every derived table must have its own alias.
What's causing this error?
Every derived table (AKA sub-query) must indeed have an alias. I.e. each query in brackets must be given an alias (AS whatever), which can the be used to refer to it in the rest of the outer query.
SELECT ID FROM (
SELECT ID, msisdn FROM (
SELECT * FROM TT2
) AS T
) AS T
In your case, of course, the entire query could be replaced with:
SELECT ID FROM TT2
I think it's asking you to do this:
SELECT ID
FROM (SELECT ID,
msisdn
FROM (SELECT * FROM TT2) as myalias
) as anotheralias;
But why would you write this query in the first place?
Here's a different example that can't be rewritten without aliases ( can't GROUP BY DISTINCT).
Imagine a table called purchases that records purchases made by customers at stores, i.e. it's a many to many table and the software needs to know which customers have made purchases at more than one store:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases)
GROUP BY customer_id HAVING 1 < SUM(1);
..will break with the error Every derived table must have its own alias. To fix:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases) AS custom
GROUP BY customer_id HAVING 1 < SUM(1);
( Note the AS custom alias).
I arrived here because I thought I should check in SO if there are adequate answers, after a syntax error that gave me this error, or if I could possibly post an answer myself.
OK, the answers here explain what this error is, so not much more to say, but nevertheless I will give my 2 cents, using my own words:
This error is caused by the fact that you basically generate a new table with your subquery for the FROM command.
That's what a derived table is, and as such, it needs to have an alias (actually a name reference to it).
Given the following hypothetical query:
SELECT id, key1
FROM (
SELECT t1.ID id, t2.key1 key1, t2.key2 key2, t2.key3 key3
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.key3 = 'some-value'
) AS tt
At the end, the whole subquery inside the FROM command will produce the table that is aliased as tt and it will have the following columns id, key1, key2, key3.
Then, with the initial SELECT, we finally select the id and key1 from that generated table (tt).

SQLite select all records and count

I have the following table:
CREATE TABLE sometable (my_id INTEGER PRIMARY KEY AUTOINCREMENT, name STRING, number STRING);
Running this query:
SELECT * FROM sometable;
Produces the following output:
1|someone|111
2|someone|222
3|monster|333
Along with these three fields I would also like to include a count representing the amount of times the same name exists in the table.
I've obviously tried:
SELECT my_id, name, count(name) FROM sometable GROUP BY name;
though that will not give me an individual result row for every record.
Ideally I would have the following output:
1|someone|111|2
2|someone|222|2
3|monster|333|1
Where the 4th column represents the amount of time this number exists.
Thanks for any help.
You can do this with a correlated subquery in the select clause:
Select st.*,
(SELECT count(*) from sometable st2 where st.name = st2.name) as NameCount
from sometable st;
You can also write this as a join to an aggregated subquery:
select st.*, stn.NameCount
from sometable st join
(select name, count(*) as NameCount
from sometable
group by name
) stn
on st.name = stn.name;
EDIT:
As for performance, the best way to find out is to try both and time them. The correlated subquery will work best when there is an index on sometable(name). Although aggregation is reputed to be slow in MySQL, sometimes this type of query gets surprisingly good results. The best answer is to test.
Select *, (SELECT count(my_id) from sometable) as total from sometable

What is the error "Every derived table must have its own alias" in MySQL?

I am running this query on MySQL
SELECT ID FROM (
SELECT ID, msisdn
FROM (
SELECT * FROM TT2
)
);
and it is giving this error:
Every derived table must have its own alias.
What's causing this error?
Every derived table (AKA sub-query) must indeed have an alias. I.e. each query in brackets must be given an alias (AS whatever), which can the be used to refer to it in the rest of the outer query.
SELECT ID FROM (
SELECT ID, msisdn FROM (
SELECT * FROM TT2
) AS T
) AS T
In your case, of course, the entire query could be replaced with:
SELECT ID FROM TT2
I think it's asking you to do this:
SELECT ID
FROM (SELECT ID,
msisdn
FROM (SELECT * FROM TT2) as myalias
) as anotheralias;
But why would you write this query in the first place?
Here's a different example that can't be rewritten without aliases ( can't GROUP BY DISTINCT).
Imagine a table called purchases that records purchases made by customers at stores, i.e. it's a many to many table and the software needs to know which customers have made purchases at more than one store:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases)
GROUP BY customer_id HAVING 1 < SUM(1);
..will break with the error Every derived table must have its own alias. To fix:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases) AS custom
GROUP BY customer_id HAVING 1 < SUM(1);
( Note the AS custom alias).
I arrived here because I thought I should check in SO if there are adequate answers, after a syntax error that gave me this error, or if I could possibly post an answer myself.
OK, the answers here explain what this error is, so not much more to say, but nevertheless I will give my 2 cents, using my own words:
This error is caused by the fact that you basically generate a new table with your subquery for the FROM command.
That's what a derived table is, and as such, it needs to have an alias (actually a name reference to it).
Given the following hypothetical query:
SELECT id, key1
FROM (
SELECT t1.ID id, t2.key1 key1, t2.key2 key2, t2.key3 key3
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.key3 = 'some-value'
) AS tt
At the end, the whole subquery inside the FROM command will produce the table that is aliased as tt and it will have the following columns id, key1, key2, key3.
Then, with the initial SELECT, we finally select the id and key1 from that generated table (tt).