Example:
SELECT customer_id, address_id as addressID
FROM customer
WHERE addressID = 5
But using the HAVING clause works perfectly fine. So why don't aliases work in the WHERE clause?
MySQL permits aliases in HAVING, but this is not standard SQL (see here: https://dba.stackexchange.com/questions/50391/why-does-mysql-allow-having-to-use-select-aliases ). Please note that most other major RDBMSs, such as SQL Server and PostgreSQL, do not allow the use of aliases in WHERE or HAVING.
The reason you can't use aliases in WHERE (and HAVING) is that SELECT is actually evaluated after most other sub-clauses: https://stackoverflow.com/a/21693272/159145
A SELECT query is evaluated, conceptually, in the following order:
The FROM clause
The WHERE clause
The GROUP BY clause
The HAVING clause
The SELECT clause
The ORDER BY clause
So your query:
SELECT
customer_id,
address_id AS addressID
FROM
customer
WHERE
addressID = 5
Is evaluated in this order:
1: FROM
customer
2: WHERE
address_id = 5
3: SELECT
customer_id,
address_id AS addressID
As you can see, if the WHERE part referenced addressID instead of address_id, the query execution engine would complain because addressID is not defined at that point.
MySQL does permit references to (ordinary) aliases in HAVING through a neat non-standard trick: it partially evaluates the SELECT list before it evaluates HAVING, so by that point the evaluation engine can be sure the alias is valid. Most other RDBMS engines do not do this, which is why they reject aliases in HAVING even though they could otherwise support them. An alias still cannot be used in WHERE, because WHERE runs before any GROUP BY, and an alias for an aggregate is meaningless at that point. Consider:
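The evaluation order above can be mimicked with a toy model in Python. This is only a sketch of the logical order, not how any real engine executes queries; the run_query and project helpers and the sample rows are made up for illustration.

```python
# Toy model of the logical clause order: FROM -> WHERE -> SELECT.
# The rows and helper names are hypothetical, chosen only for illustration.
customer = [
    {"customer_id": 1, "address_id": 5},
    {"customer_id": 2, "address_id": 7},
    {"customer_id": 3, "address_id": 5},
]

def run_query(table, where, select):
    rows = list(table)                      # 1: FROM
    rows = [r for r in rows if where(r)]    # 2: WHERE sees only source columns
    return [select(r) for r in rows]        # 3: SELECT creates the aliases

def project(row):
    # Equivalent of: SELECT customer_id, address_id AS addressID
    return {"customer_id": row["customer_id"], "addressID": row["address_id"]}

# Filtering on the real column name works.
result = run_query(customer, lambda r: r["address_id"] == 5, project)
print(result)

# Filtering on the alias fails, because it does not exist until step 3.
try:
    run_query(customer, lambda r: r["addressID"] == 5, project)
    alias_visible = True
except KeyError:
    alias_visible = False
print(alias_visible)
```

In this model the alias only exists in the rows that the projection step returns, which is the same reason the WHERE clause cannot see addressID.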
SELECT
SUM( foo ) AS baz,
created
FROM
foo
WHERE
baz > 5 -- Meaningless: the GROUP BY hasn't been evaluated yet, so `baz` is unavailable
GROUP BY
created
MySQL explains this in their manual: https://dev.mysql.com/doc/refman/5.7/en/problems-with-alias.html
Standard SQL disallows references to column aliases in a WHERE clause. This restriction is imposed because when the WHERE clause is evaluated, the column value may not yet have been determined.
The WHERE clause determines which rows should be included in the GROUP BY clause, but it refers to the alias of a column value that is not known until after the rows have been selected, and grouped by the GROUP BY.
Related
I'm using MySQL and I have a table named employees.
I had an exercise in which I had to select the oldest person. I know the correct way to do that is with a subquery: SELECT name, dob FROM employees WHERE dob = (SELECT MIN(dob) FROM employees).
However, I did it like so: SELECT name, dob FROM employees HAVING dob = MIN(dob). This returns an empty set but doesn't throw any errors. So what does it do exactly? I've read that MySQL allows referring to columns from the SELECT clause in the HAVING clause, even without a GROUP BY clause. But why does it return an empty set?
When you use MIN (or another aggregate function) in the select columns or the HAVING clause, you cause an implicit GROUP BY () (that is, all rows are grouped together into a single result row).
And when grouping (whether all rows or with a specific GROUP BY), if you specify a column outside of an aggregate function (such as your dob =) that is not one of the things being aggregated on or something functionally dependent on it (for example, some other column in a table when you are grouping by the primary key for that table), one of two things will happen:
If you have enabled the ONLY_FULL_GROUP_BY sql_mode (which is the default in newer versions), you will receive an error:
In aggregated query without GROUP BY, expression ... contains nonaggregated column '...'; this is incompatible with sql_mode=only_full_group_by
If you have not enabled ONLY_FULL_GROUP_BY, a value from some arbitrary one of the grouped rows will be used. So it is possible your dob = MIN(dob) will be true (and it will definitely be true if all rows have the same dob), but you can't rely on it doing anything useful and should avoid doing this.
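For comparison, here is a small runnable sketch of the subquery form from the question. It assumes an in-memory SQLite database as a stand-in for MySQL, and the sample employees are invented; only the query itself comes from the question.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in; the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dob TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", "1980-04-02"), ("Bob", "1975-11-23"), ("Cy", "1990-01-15")],
)

# The scalar subquery is evaluated on its own, so the outer WHERE
# compares each row's dob against a single, well-defined value.
oldest = conn.execute(
    "SELECT name, dob FROM employees "
    "WHERE dob = (SELECT MIN(dob) FROM employees)"
).fetchall()
print(oldest)
```

Because the aggregate lives in its own subquery, the outer query is not grouped at all, so every row's dob is compared individually.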
I want to understand how queries work with ONLY_FULL_GROUP_BY enabled.
If I list all the columns of the table with a MIN() on one column, it works fine:
$query = "SELECT id, member_id, name, code, MIN(price) AS price FROM tbl_product GROUP BY code";
But if I select everything, I get an error:
$query = "SELECT *, MIN(price) AS price FROM tbl_product GROUP BY code";
Can you explain the difference between the two?
This is about a bug that was fixed in MySQL 5.7.5. According to the manual (section 12.20.3, MySQL Handling of GROUP BY), MySQL 5.7.5 and newer can detect functional dependence between the primary key and the rest of the columns of the table. It says:
MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them.
ONLY_FULL_GROUP_BY is now the default option and works as specified by the SQL Standard.
The question is really why you would think that group by would work with select *.
What group by does is produce one row for each combination of values for the group by keys. That is by definition. Multiple rows become one.
The expressions allowed in the select are then:
The group by keys or expressions containing only those keys.
Summary functions on other values.
Combinations of summary functions with group by keys.
Any column in the select that is not in the group by could have multiple values among the original rows. SQL does not allow this. Most databases do not allow this. MySQL no longer allows this by default.
Once upon a time it did, but the returned values were from indeterminate matching rows. That "functionality" (really a bug) has now been fixed.
Note: There is an exception to this, allowed by the standard, that permits aggregating by primary keys/unique keys and then selecting the rest of the columns. This is allowed because the primary key uniquely identifies the rest of the column values.
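A minimal sketch of what this means in practice, assuming an in-memory SQLite database and invented product rows. The first query selects only the group key and an aggregate; the second joins back to the table to recover the full cheapest row per code, which is one common way to get the other columns deterministically (it is not taken from the answers above).

```python
import sqlite3

# SQLite stand-in with made-up rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_product ("
    "id INTEGER PRIMARY KEY, member_id INTEGER, name TEXT, code TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO tbl_product VALUES (?, ?, ?, ?, ?)",
    [(1, 10, "pen", "A", 2.5), (2, 10, "pen XL", "A", 1.5), (3, 11, "ink", "B", 4.0)],
)

# Only the group key and an aggregate: one well-defined row per code.
cheapest = conn.execute(
    "SELECT code, MIN(price) AS price FROM tbl_product "
    "GROUP BY code ORDER BY code"
).fetchall()
print(cheapest)

# To get the remaining columns, join the per-code minimum back to the table.
cheapest_rows = conn.execute(
    "SELECT p.id, p.name, p.code, p.price "
    "FROM tbl_product AS p "
    "JOIN (SELECT code, MIN(price) AS price FROM tbl_product GROUP BY code) AS m "
    "ON m.code = p.code AND m.price = p.price "
    "ORDER BY p.code"
).fetchall()
print(cheapest_rows)
```

Note that the join returns more than one row for a code if several products tie on the minimum price.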
I need to use an alias in the WHERE clause, but it keeps telling me that it's an unknown column. Is there any way to get around this issue? I need to select records that have a rating higher than x. Rating is calculated as the following alias:
sum(reviews.rev_rating)/count(reviews.rev_id) as avg_rating
You could use a HAVING clause, which can see the aliases, e.g.
HAVING avg_rating>5
but in a where clause you'll need to repeat your expression, e.g.
WHERE (sum(reviews.rev_rating)/count(reviews.rev_id))>5
BUT! Not all expressions will be allowed - using an aggregating function like SUM will not work, in which case you'll need to use a HAVING clause.
From the MySQL Manual:
It is not allowable to refer to a column alias in a WHERE clause, because the column value might not yet be determined when the WHERE clause is executed. See Section B.1.5.4, “Problems with Column Aliases”.
I don't know if this works in MySQL, but using SQL Server you can also just wrap it like:
select * from (
-- your original query
select .. sum(reviews.rev_rating)/count(reviews.rev_id) as avg_rating
from ...) Foo
where Foo.avg_rating ...
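The wrapping approach can be tried out with a small sketch. It assumes an in-memory SQLite database rather than SQL Server or MySQL; the item_id column and the sample reviews are hypothetical, and the multiplication by 1.0 is only there to avoid SQLite's integer division.

```python
import sqlite3

# SQLite stand-in; item_id and the rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (rev_id INTEGER PRIMARY KEY, item_id INTEGER, rev_rating INTEGER)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [(1, 1, 8), (2, 1, 6), (3, 2, 3), (4, 2, 5), (5, 3, 9)],
)

# The inner query defines avg_rating; to the outer query it is an
# ordinary column of the derived table Foo, so WHERE can filter on it.
wrapped = conn.execute(
    "SELECT * FROM ("
    "  SELECT item_id, SUM(rev_rating) * 1.0 / COUNT(rev_id) AS avg_rating"
    "  FROM reviews GROUP BY item_id"
    ") AS Foo "
    "WHERE Foo.avg_rating > 5 "
    "ORDER BY Foo.item_id"
).fetchall()
print(wrapped)
```

The outer WHERE is legal here because the inner query, including its SELECT list, is fully evaluated before the outer query starts filtering.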
This question is quite old and one answer already gained 160 votes...
Still I would make this clear: The question is actually not about whether alias names can be used in the WHERE clause.
sum(reviews.rev_rating) / count(reviews.rev_id) as avg_rating
is an aggregation. In the WHERE clause we restrict records we want from the tables by looking at their values. sum(reviews.rev_rating) and count(reviews.rev_id), however, are not values we find in a record; they are values we only get after aggregating the records.
So WHERE is inappropriate. We need HAVING, as we want to restrict result rows after aggregation. It can't be
WHERE avg_rating > 10
nor
WHERE sum(reviews.rev_rating) / count(reviews.rev_id) > 10
hence.
HAVING sum(reviews.rev_rating) / count(reviews.rev_id) > 10
on the other hand is possible and complies with the SQL standard. Whereas
HAVING avg_rating > 10
is only possible in MySQL. It is not valid SQL according to the standard, as the SELECT clause is supposed to get executed after HAVING. From the MySQL docs:
Another MySQL extension to standard SQL permits references in the HAVING clause to aliased expressions in the select list.
The MySQL extension permits the use of an alias in the HAVING clause for the aggregated column
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
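As a rough illustration of the standard-compliant form, the sketch below repeats the aggregate expression in HAVING instead of using the alias. It assumes an in-memory SQLite database; the item_id grouping column and the sample rows are invented, and the 1.0 factor only avoids SQLite's integer division.

```python
import sqlite3

# SQLite stand-in; item_id and the rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (rev_id INTEGER PRIMARY KEY, item_id INTEGER, rev_rating INTEGER)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [(1, 1, 8), (2, 1, 6), (3, 2, 3), (4, 2, 5), (5, 3, 9)],
)

# HAVING filters groups after aggregation, so aggregate functions are
# allowed here; repeating the expression avoids relying on the alias.
well_rated = conn.execute(
    "SELECT item_id, SUM(rev_rating) * 1.0 / COUNT(rev_id) AS avg_rating "
    "FROM reviews "
    "GROUP BY item_id "
    "HAVING SUM(rev_rating) * 1.0 / COUNT(rev_id) > 5 "
    "ORDER BY item_id"
).fetchall()
print(well_rated)
```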
SELECT * FROM (SELECT customer_Id AS 'custId', gender, age FROM customer
WHERE gender = 'F') AS c
WHERE c.custId = 100;
If your query is static, you can define it as a view, and then use the alias in the WHERE clause when querying the view.
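A minimal sketch of the view approach, assuming an in-memory SQLite database; the view name female_customers and the sample rows are hypothetical.

```python
import sqlite3

# SQLite stand-in; the view name and the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_Id INTEGER, gender TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(100, "F", 34), (101, "M", 41), (102, "F", 29)],
)

# The alias custId becomes a real column name of the view.
conn.execute(
    "CREATE VIEW female_customers AS "
    "SELECT customer_Id AS custId, gender, age FROM customer WHERE gender = 'F'"
)

# Queries against the view can use custId in WHERE like any other column.
matches = conn.execute(
    "SELECT * FROM female_customers WHERE custId = 100"
).fetchall()
print(matches)
```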
I use some sql like this:
SELECT COALESCE(group.display,item.display) as display....
I would like to add in the WHERE clause:
WHERE display='1'
WHERE display is the result of the coalesce.
Similarly I'd like to be able to do the same with something like this:
IF(ISNULL(gd.group_main_image),p.main_image,gd.group_main_image) AS image
... WHERE image IS NOT NULL
How can I do this please?
You can't use aliases in the same level of query.
You must repeat yourself.
WHERE COALESCE(group.display,item.display) = '1'
EDIT
Well, I've been too restrictive. You can use an alias in a HAVING clause in MySQL. You can't do that in other DBMSs (Oracle, SQL Server). Generally it's also not permitted in ANSI SQL.
You can't use column aliases in WHERE clause
So:
WHERE COALESCE(group.display,item.display)='1'
OR:
HAVING display='1'
However, HAVING is applied only after the whole result set has been built, so this is likely to consume more memory than filtering in WHERE.
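The "repeat the expression" option can be sketched as follows. This assumes an in-memory SQLite database and, for brevity, a single hypothetical items table whose group_display and item_display columns stand in for the group.display and item.display columns of the joined tables in the question.

```python
import sqlite3

# SQLite stand-in; the items table and its rows are assumptions that
# replace the joined tables from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, group_display TEXT, item_display TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, None, "1"), (2, "0", "1"), (3, "1", "0"), (4, None, None)],
)

# The alias is defined in SELECT, but WHERE repeats the full expression.
visible = conn.execute(
    "SELECT id, COALESCE(group_display, item_display) AS display "
    "FROM items "
    "WHERE COALESCE(group_display, item_display) = '1' "
    "ORDER BY id"
).fetchall()
print(visible)
```

Filtering in WHERE this way removes rows before any later processing, which is the advantage over the HAVING form mentioned above.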
As described in Problems with Column Aliases:
An alias can be used in a query select list to give a column a different name. You can use the alias in GROUP BY, ORDER BY, or HAVING clauses to refer to the column:
SELECT SQRT(a*b) AS root FROM tbl_name
GROUP BY root HAVING root > 0;
SELECT id, COUNT(*) AS cnt FROM tbl_name
GROUP BY id HAVING cnt > 0;
SELECT id AS 'Customer identity' FROM tbl_name;
Standard SQL disallows references to column aliases in a WHERE clause. This restriction is imposed because when the WHERE clause is evaluated, the column value may not yet have been determined. For example, the following query is illegal:
SELECT id, COUNT(*) AS cnt FROM tbl_name
WHERE cnt > 0 GROUP BY id;
The WHERE clause determines which rows should be included in the GROUP BY clause, but it refers to the alias of a column value that is not known until after the rows have been selected, and grouped by the GROUP BY.