Why does DISTINCT has to go first in MySQL? - mysql

I have a query that works when I do
SELECT DISTINCT(table.field.id), 1 FROM ...
but fails when I do
SELECT 1, DISTINCT(table.field.id) FROM ...
Is this a known behavior?
Why does the first one work while the second doesn't?

Unfortunately I'm not able to add a comment yet.
What #Gordon Linoff has written is exactly right.
You are getting error as DISTINCT in general works as part of SELECT clause or AGGREGATE function. It is used to return unique rows from a result set and it can be used to force unique column values within an aggregate function.
Examples: SELECT DISTINCT * ... COUNT(DISTINCT COLUMN) or SUM(DISTINCT COLUMN).
More information's about DISTINCT in popular DB engines:
PostgreSQL:https://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-DISTINCT
SQL Server: https://www.techonthenet.com/sql_server/distinct.php
Oracle: https://www.techonthenet.com/oracle/distinct.php
MySQL:
https://dev.mysql.com/doc/refman/5.7/en/distinct-optimization.html
https://dev.mysql.com/doc/refman/5.7/en/select.html

Related

How to find rows in SQL that end with the same string?

I have a question similar to the one found here: How to find rows in SQL that start with the same string (similar rows)?, and this solution works in MySQL 5.6 but not 5.7.
I have a database (t) with multiple columns, the important ones being id and filepath, and what I am trying to accomplish is retrieving all the file paths which have the same last 5 characters. The following works in MySQL5.6, and the second SELECT works fine in 5.7:
SELECT id, filepath FROM t
WHERE SUBSTRING(filepath, -5) IN
(
SELECT SUBSTRING(filepath, -5)
FROM t
GROUP BY SUBSTRING(filepath, -5)
HAVING COUNT(*) > 1
)
But when I try to run it on 5.7 I get the error
Expression #1 of HAVING clause is not in GROUP BY clause and contains
nonaggregated column 't.filepath' which is not functionally dependent on
columns in GROUP BY clause; this is incompatible with
sql_mode=only_full_group_by
Sample data:
id filepath
1 /Desktop/file1.txt
2 /Desktop/file2.txt
3 /Desktop/file1.txt
and I would want to return the rows with id 1 and 3. How can I fix this for MySQL5.7?
EDIT: Also can anybody point me in the right direction for the SQL to remove the duplicates? So I would want to remove the entry for id 3 but keep the entry for id 1 and 2.
Please read the mysql documentation on the subject GROUP BY and sql_mode only_full_group_by (like your error message says):
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
I think changing the inner query to this might fix the problem:
SELECT SUBSTRING(filepath, -5) AS fpath
FROM t
GROUP BY fpath
HAVING COUNT(fpath) > 1
Edit:
As to your question of why adding the "AS fpath" works:
Adding the alias "fpath" is just a clean way to do this. The point of ONLY_FULL_GROUP_BY is that each field you use in the SELECT, HAVING, or ORDER BY must also be in the GROUP BY.
So I added the fpath-alias for multiple reasons:
For performance: The query you wrote had SUBSTRING(filepath, -5) twice, which
is bad for performance. Mysql has to execute that SUBSTRING call twice,
while in my case it has to do it only once (per row).
To fix the group-by issue: You had COUNT() in the having, but "" was not in your GROUP BY statement (I'm not even sure whether that would be possible). You had to count "something", so since "fpath" was in your SELECT and in your GROUP BY, using that as your COUNT() would fix the problem.
I prefer not to put subqueries in an IN() predicate because MySQL tends to run the subquery many times.
You can write the query differently to put the subquery in the FROM clause as a derived table. That will make MySQL run the subquery just once.
SELECT id, filepath
FROM (
SELECT SUBSTRING(filepath, -5) AS suffix, COUNT(*) AS count
FROM t
GROUP BY suffix
HAVING count > 1
) AS t1
JOIN t AS t2 ON SUBSTRING(t2.filepath, -5) = t1.suffix
This is bound to do a table-scan though, so it's going to be a costly query. It can't use an index when doing a substring comparison like that.
To optimize this, you might create a virtual column with an index.
ALTER TABLE t
ADD COLUMN filepath_last VARCHAR(10) AS (SUBSTRING_INDEX(filepath, '/', -1)),
ADD KEY (filepath_last);
Then you can query it like this, and at least the subquery uses an index:
SELECT id, filepath
FROM (
SELECT filepath_last, COUNT(*) AS count
FROM t
GROUP BY filepath_last
HAVING count > 1
) AS t1
STRAIGHT_JOIN t AS t2 ON t2.filepath_last = t1.filepath_last
The solution that ended up working for me was found here: Disable ONLY_FULL_GROUP_BY
I ran SELECT ##sql_mode then SET ##sql_mode = followed by a string containing all the values returned by the first query except for only_full_group_by, but I'm still interested in how this is to be accomplished without changing the SQL settings.

TSQL WITH - second reference fails

SQL Server 2008 R2 - Query from 2014 SSMS but fails from code as well.
Strange - first reference to table B works, second fails with an 'Invalid object B' error. What am I doing wrong? GO's don't help.
WITH B as (SELECT BatchOutId, SettleMerchantCode, BatchDate, BatchStatusCode, BatchTransCnt, BatchTotAmt, BatchAdjustAmt, BatchAdjustCnt
FROM MAF01
GROUP BY BatchOutId, SettleMerchantCode, BatchDate, BatchStatusCode, BatchTransCnt, BatchTotAmt, BatchAdjustAmt, BatchAdjustCnt)
SELECT * FROM B ORDER BY BatchOutId DESC
SELECT * FROM B ORDER BY BatchOutId DESC
This is as expected.
CTEs are only in scope for the next statement. They are just named queries.
You would need to either
Repeat the definition of the CTE.
Move the definition out into a view or inline function.
Materialise the results into a temp table.
Depending on what you were expecting to happen.
A cte is only valid for one query, not an entire batch. So once you do the first SELECT * FROM B, that query is finished. The next query no longer has access to the cte that the first query used.
I know this has to be duplicate
You can have multiple CTE but only one statement
The two Select are two statements
If you use a #temp then you can have more than one statement

Can not Access Aliased Columns in Select Statement (MySQL)

I have a problem with Aliased Columns in MySQL!
My Query:
SELECT Price AS Pr, (Pr*10/100) FROM MyTable;
MySQL WorkBench Error: UnKnown Column 'Pr' in Field List !!!
I tested my query in W3Schools with no error !
I tested my query in W3Schools with no error!
This doesn't prove that your query is valid.
You can only use aliases in GROUP BY, ORDER BY or HAVING clauses. Your usage variant is not allowed, because the value of alias is not known when MySQL is selecting the 2-nd column.
I've got a suspicion that W3Schools uses MS Access to run user queries, and MS Access does allow such atrocity as referencing column aliases in a SELECT clause that are defined in the same SELECT clause.
The standard doesn't allow this and MySQL does follow standard in this particular case.
As for solution to your problem, I can see two options.
The more generic solution, which would run in probably any SQL product, would be to use a derived table:
SELECT
Pr,
(Pr * 10 / 100) AS SomethingElse
FROM
(
SELECT
SomeComplexExpression AS Pr
FROM MyTable
) AS sub
;
The other option would be to use a variable, which is MySQL-specific:
SELECT
#Pr := SomeComplexExpression AS Pr,
(#Pr * 10 / 100) AS SomethingElse
FROM MyTable
;
Finally, if you need to test/demonstrate if something can/cannot work in MySQL, I'd recommend using SQL Fiddle.

AS not working with COUNT(*) and UNION

So I have joined 3 queries together using UNIONs and want to count the number of lines in the result, but it's a bit weird. It actually works, and gives the correct answer, but it doesn't assign the "AS" part correctly.
SELECT COUNT(*) FROM (
(Long Select Statement)
UNION
(AnotherLong Select Statement)
UNION
(Even Longer Select Statement)
)
AS NoOfTweets";
The outcome is correct, but instead of assigning it to "NoOfTweets" it assigns it to "Count(*)". If I remove the "AS NoOfTweets" it stops working. If I remove some brackets it stops working. I'm running low on ideas after a long day! I can post the whole code if needs be but would rather not as it's quite long and I think that bit works.
Thanks in advance, Jack.
Edit: Fixed with:
SELECT COUNT(*) NoOfTweets FROM (
(Long Select Statement)
UNION
(AnotherLong Select Statement)
UNION
(Even Longer Select Statement)
)
AS NoOfTweets";
Thanks guys :)
You aren't putting it in the correct location. The beginning of your query should look like this:
SELECT COUNT(*) AS NoOfTweets
More on Column Alias
SELECT COUNT(*) NoOfTweets FROM
(Long Select Statement)
UNION
(AnotherLong Select Statement)
UNION
(Even Longer Select Statement)
or
SELECT COUNT(*) AS NoOfTweets FROM
(Long Select Statement)
UNION
(AnotherLong Select Statement)
UNION
(Even Longer Select Statement)
You have to use AS exactly after the item you are counting:
SELECT COUNT(*) AS `NoOfTweets`
FROM ( ... )
Also be careful with the " you have near the end. Or maybe it comes from a longer string.
The error is Every derived table must have its own alias which is something I didn't know, so thanks for the education :)
http://sqlfiddle.com/#!9/d30f4/4
Nice of MySQL to give an explanation - I tried with MS SQL on SQLFiddle and just got Incorrect syntax near ')'. which isn't so helpful!
So, your 'NoOfTweets' is the name given to the results column, and also to the 'derived table' which is required by the SQL engine but could be a different name ... it's not returned in the results. The point of naming a derived table is in case you wish to JOIN to other tables and reference the fields in the joins.

Checksum of SELECT results in MySQL

Trying to get a check sum of results of a SELECT statement, tried this
SELECT sum(crc32(column_one))
FROM database.table;
Which worked, but this did not work:
SELECT CONCAT(sum(crc32(column_one)),sum(crc32(column_two)))
FROM database.table;
Open to suggestions, main idea is to get a valid checksum for the SUM of the results of rows and columns from a SELECT statement.
The problem is that CONCAT and SUM are not compatible in this format.
CONCAT is designed to run once per row in your result set on the arguments as defined by that row.
SUM is an aggregate function, designed to run on a full result set.
CRC32 is of the same class of functions as CONCAT.
So, you've got functions nested in a way that just don't play nicely together.
You could try:
SELECT CONCAT(
(SELECT sum(crc32(column_one)) FROM database.table),
(SELECT sum(crc32(column_two)) FROM database.table)
);
or
SELECT sum(crc32(column_one)), sum(crc32(column_two))
FROM database.table;
and concatenate them with your client language.
SELECT SUM(CRC32(CONCAT(column_one, column_two)))
FROM database.table;
or
SELECT SUM(CRC32(column_one) + CRC32(column_two))
FROM database.table;