Omit aggregate function - mysql

According to w3schools group by needs an aggregate_function(column_name).
SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
What it will happen if you commit this aggregate_function?
Thanks

If you are looking for best practice, do not use GROUP BY unless you are using an aggregate function. MySQL is unique in that it allows you to specify additional columns without using aggregates. This is not found in other DBMS's.
MySQL extends the use of GROUP BY to permit selecting fields that are not mentioned in the GROUP BY clause.
The other effect that GROUP BY has is it's sorting property.
If you use GROUP BY, output rows are sorted according to the GROUP BY columns as if you had an ORDER BY for the same columns.
If you do not include any aggregate functions, it will most likely show the DISTINCT values of the grouped column.

Depending on MySQL version and ONLY_FULL_GROUP_BY mode, it either rejects the query or picks any value from each group (it is undefined which one it chooses).
Here is a thorough description on this: MySQL Handling of GROUP BY

Related

Explanation query with GROUP_BY and ONLY_FULL_GROUP_BY

I want to understand how queries works with ONLY_FULL_GROUP_BY enabled.
If i list all the columns of the table with a MIN() on one column, it works fine:
$query = "SELECT id, member_id, name, code, MIN(price) AS price, FROM tbl_product GROUP BY code";
But if I select everything I have an error:
$query = "SELECT *, MIN(price) AS price FROM tbl_product GROUP BY code";
Can you explain me the differences between both ?
It's about a bug that was fixed in MySQL 5.7.5. According to the manual 12.20.3 MySQL Handling of GROUP BY MySQL 5.7.5 and newer can detect functional dependence between the primary key and the rest of the columns of the table. Literally it says:
MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them.
ONLY_FULL_GROUP_BY is now the default option and works as specified by the SQL Standard.
The question is really why you would think that group by would work with select *.
What group by does is produce one row for each combination of values for the group by keys. That is by definition. Multiple rows become one.
The expressions allowed in the select are then:
The group by keys or expressions containing only those keys.
Summary functions on other values.
Combinations of summary functions with group by keys.
Any column in the select that is not in the group by could have multiple values among the original row. SQL does not allow this. Most databases do not allow this. MySQL no longer allows this by default.
Once upon a time it did, but the returned values were from indeterminate matching rows. That "functionality" (really a bug) has now been fixed.
Note: There is an exception to this -- allowed by the standard -- that allows aggregating by primary keys/unique keys and then select the rest of the columns. This is allowed because the primary key uniquely identifies the rest of the column values.

query on self defined variable [duplicate]

I need to use an alias in the WHERE clause, but It keeps telling me that its an unknown column. Is there any way to get around this issue? I need to select records that have a rating higher than x. Rating is calculated as the following alias:
sum(reviews.rev_rating)/count(reviews.rev_id) as avg_rating
You could use a HAVING clause, which can see the aliases, e.g.
HAVING avg_rating>5
but in a where clause you'll need to repeat your expression, e.g.
WHERE (sum(reviews.rev_rating)/count(reviews.rev_id))>5
BUT! Not all expressions will be allowed - using an aggregating function like SUM will not work, in which case you'll need to use a HAVING clause.
From the MySQL Manual:
It is not allowable to refer to a
column alias in a WHERE clause,
because the column value might not yet
be determined when the WHERE clause
is executed. See Section B.1.5.4,
“Problems with Column Aliases”.
I don't know if this works in mysql, but using sqlserver you can also just wrap it like:
select * from (
-- your original query
select .. sum(reviews.rev_rating)/count(reviews.rev_id) as avg_rating
from ...) Foo
where Foo.avg_rating ...
This question is quite old and one answer already gained 160 votes...
Still I would make this clear: The question is actually not about whether alias names can be used in the WHERE clause.
sum(reviews.rev_rating) / count(reviews.rev_id) as avg_rating
is an aggregation. In the WHERE clause we restrict records we want from the tables by looking at their values. sum(reviews.rev_rating) and count(reviews.rev_id), however, are not values we find in a record; they are values we only get after aggregating the records.
So WHERE is inappropriate. We need HAVING, as we want to restrict result rows after aggregation. It can't be
WHERE avg_rating > 10
nor
WHERE sum(reviews.rev_rating) / count(reviews.rev_id) > 10
hence.
HAVING sum(reviews.rev_rating) / count(reviews.rev_id) > 10
on the other hand is possible and complies with the SQL standard. Whereas
HAVING avg_rating > 10
is only possible in MySQL. It is not valid SQL according to the standard, as the SELECT clause is supposed to get executed after HAVING. From the MySQL docs:
Another MySQL extension to standard SQL permits references in the HAVING clause to aliased expressions in the select list.
The MySQL extension permits the use of an alias in the HAVING clause for the aggregated column
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
SELECT * FROM (SELECT customer_Id AS 'custId', gender, age FROM customer
WHERE gender = 'F') AS c
WHERE c.custId = 100;
If your query is static, you can define it as a view then you can use that alias in the where clause while querying the view.

For My SQL 'Group By', what is the criteria of picking one row from many rows?

For My SQL 'Group By', what is the criteria of picking one row from many rows? For example if I use group by user_id would it choose the row in some order or in some random way?
For example this table
id user_id message created_at
1 1 a 2016-08-25 07:00:15
2 2 c 2016-08-25 08:00:15
3 1 b 2016-08-25 09:46:15
4 2 d 2016-08-25 10:49:12
who will group by user_id find which row to take for user_id=1 row 1 or 3 because I could find any solution.
It will find the one specified in the aggregation (MAX(), MIN() etc.) statement, as you should only select grouped or aggregated columns when using GROUP BY.
Otherwise it is not determined which value will be chosen, it is pretty random.
Also see the MySQL manual:
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
MySQL 5.7.5 and up implements detection of functional dependence. If
the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default),
MySQL rejects queries for which the select list, HAVING condition, or
ORDER BY list refer to nonaggregated columns that are neither named in
the GROUP BY clause nor are functionally dependent on them.
So since MySQL 5.7 you explicitly have to enable an option so mysql can execute those queries.
Before MySQL 5.7 it allowed those queries but, as mentioned, chose the values of the nonaggegated and nongrouped fields randomly.
Group by works on a specific field. If you group by user_id and SELECT any other column then that column from that particular GROUP will be selected randomly.
That is why it is not recommended to SELECT the field which is not in GROUP BY clause.
who will group by user_id find which row to take for user_id=1 row 1
or 3 because i could find any solution.
Yes it will take other fields randomly.
If you have a query like
select user_id from yourtable group by user_id
then it does not matter from which record the values come from. However, if you have a query like
select user_id, created_at from yourtable group by user_id
where you have a field in the select list that is not subject of an aggregate function (max(), min(), etc), then as MySQL documentation on MySQL Handling of GROUP BY says:
In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want.
In reality, MySQL will pick the value for such fields from the 1st record it encounters while assembling the resultset.
Pls alo note that unless such fields are functionally dependent on the fields in the group by, the query is against all sql standards. In MySQL you can use the only_full_group_by sql mode setting (also part of the strict sql mode) to determine if MySQL accepts such queries at all. In the more recent versions of MySQL this qsl mode is turned on by default preventing you to run such queries without changing the settings.
The GROUP BY clause does not return rows from the database. It generates values using the rows filtered by the WHERE clause.
There are three types of columns that are valid in the expressions present in the SELECT clause of a query that contains a GROUP BY clause:
columns that also appear in the GROUP BY clause;
columns that are functionally dependent on the columns that appear in the GROUP BY clause;
any column can be used as argument of a GROUP BY aggregate function.
A GROUP BY query whose columns present in the SELECT clause do not follow the rules above is invalid SQL.
Up to version 5.7.5, MySQL allows invalid GROUP BY queries. It is explained in the documentation that for the columns that do not follow the rules above, "the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want."
Since version 5.7.5 MySQL rejects such invalid queries. Other RDBMSes (SQL Server, Oracle etc) do not allow them too because, well, they are invalid SQL.

using coalesce or if results in where clause mysql

I use some sql like this:
SELECT COALESCE(group.display,item.display) as display....
I would like to add in the WHERE clause:
WHERE display='1'
WHERE display is the result of the coalesce.
Similarly I'd like to be able to do the same with something like this:
IF(ISNULL(gd.group_main_image),p.main_image,gd.group_main_image) AS image
... WHERE image IS NOT NULL
How can I do this please?
You can't use aliases in the same level of query.
You must repeat yourself.
WHERE COALESCE(group.display,item.display) = '1'
EDIT
Well, I've been too restrictive. You can use alias in an having clause in MySql. You can't do that in other DBMS (Oracle, SQl Server). Generally it's also not permitted in ANSI SQL.
You can't use column aliases in WHERE clause
So:
WHERE COALESCE(group.display,item.display)='1'
OR:
HAVING display='1'
However, HAVING is performed after all the result-set is discovered, so basically this is more memory consuming
As described in Problems with Column Aliases:
An alias can be used in a query select list to give a column a different name. You can use the alias in GROUP BY, ORDER BY, or HAVING clauses to refer to the column:
SELECT SQRT(a*b) AS root FROM tbl_name
GROUP BY root HAVING root > 0;
SELECT id, COUNT(*) AS cnt FROM tbl_name
GROUP BY id HAVING cnt > 0;
SELECT id AS 'Customer identity' FROM tbl_name;
Standard SQL disallows references to column aliases in a WHERE clause. This restriction is imposed because when the WHERE clause is evaluated, the column value may not yet have been determined. For example, the following query is illegal:
SELECT id, COUNT(*) AS cnt FROM tbl_name
WHERE cnt > 0 GROUP BY id;
The WHERE clause determines which rows should be included in the GROUP BY clause, but it refers to the alias of a column value that is not known until after the rows have been selected, and grouped by the GROUP BY.

Can you use an alias in the WHERE clause in mysql?

I need to use an alias in the WHERE clause, but It keeps telling me that its an unknown column. Is there any way to get around this issue? I need to select records that have a rating higher than x. Rating is calculated as the following alias:
sum(reviews.rev_rating)/count(reviews.rev_id) as avg_rating
You could use a HAVING clause, which can see the aliases, e.g.
HAVING avg_rating>5
but in a where clause you'll need to repeat your expression, e.g.
WHERE (sum(reviews.rev_rating)/count(reviews.rev_id))>5
BUT! Not all expressions will be allowed - using an aggregating function like SUM will not work, in which case you'll need to use a HAVING clause.
From the MySQL Manual:
It is not allowable to refer to a
column alias in a WHERE clause,
because the column value might not yet
be determined when the WHERE clause
is executed. See Section B.1.5.4,
“Problems with Column Aliases”.
I don't know if this works in mysql, but using sqlserver you can also just wrap it like:
select * from (
-- your original query
select .. sum(reviews.rev_rating)/count(reviews.rev_id) as avg_rating
from ...) Foo
where Foo.avg_rating ...
This question is quite old and one answer already gained 160 votes...
Still I would make this clear: The question is actually not about whether alias names can be used in the WHERE clause.
sum(reviews.rev_rating) / count(reviews.rev_id) as avg_rating
is an aggregation. In the WHERE clause we restrict records we want from the tables by looking at their values. sum(reviews.rev_rating) and count(reviews.rev_id), however, are not values we find in a record; they are values we only get after aggregating the records.
So WHERE is inappropriate. We need HAVING, as we want to restrict result rows after aggregation. It can't be
WHERE avg_rating > 10
nor
WHERE sum(reviews.rev_rating) / count(reviews.rev_id) > 10
hence.
HAVING sum(reviews.rev_rating) / count(reviews.rev_id) > 10
on the other hand is possible and complies with the SQL standard. Whereas
HAVING avg_rating > 10
is only possible in MySQL. It is not valid SQL according to the standard, as the SELECT clause is supposed to get executed after HAVING. From the MySQL docs:
Another MySQL extension to standard SQL permits references in the HAVING clause to aliased expressions in the select list.
The MySQL extension permits the use of an alias in the HAVING clause for the aggregated column
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
SELECT * FROM (SELECT customer_Id AS 'custId', gender, age FROM customer
WHERE gender = 'F') AS c
WHERE c.custId = 100;
If your query is static, you can define it as a view then you can use that alias in the where clause while querying the view.