I am working with the employees database in SQL, and I would like to fetch the department(s) with the maximum number of employees.
SELECT (COUNT(emp_no)) AS emp_count, dept_no
FROM
dept_emp
GROUP BY dept_no
HAVING COUNT(emp_no) = (SELECT MAX(COUNT(emp_no)) FROM dept_emp)
ORDER BY emp_count DESC
This is what I have so far, but it fails with the error 'Invalid use of group function'. I also tried another approach, creating a table first and then using the HAVING clause, but what would be the correct code for the approach above?
You don't need HAVING at all. Without the HAVING clause, the query returns every department with its number of employees, and the department with the most employees comes first. If you want only that one, add LIMIT 1 at the end of the query.
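A minimal sketch of that suggestion, run against an in-memory SQLite database from Python so it can be tried without a MySQL server. The sample rows are made up; only the dept_emp(emp_no, dept_no) shape comes from the question.

```python
import sqlite3

# Hypothetical sample data: d001 has 3 employees, d002 has 2, d003 has 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT)")
conn.executemany(
    "INSERT INTO dept_emp VALUES (?, ?)",
    [(1, "d001"), (2, "d001"), (3, "d001"), (4, "d002"), (5, "d002"), (6, "d003")],
)

# Count per department, largest first, keep only the first row.
top_department = conn.execute(
    """
    SELECT COUNT(emp_no) AS emp_count, dept_no
    FROM dept_emp
    GROUP BY dept_no
    ORDER BY emp_count DESC
    LIMIT 1
    """
).fetchone()
print(top_department)  # (3, 'd001')
```

Note that LIMIT 1 returns a single row even when two departments tie for the maximum, so this only fits if one row is enough.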
You can't apply an aggregate function to another aggregate; MAX(COUNT()) is invalid. Compute the counts in a derived table and take the MAX of that:
SELECT (COUNT(emp_no)) AS emp_count, dept_no
FROM dept_emp
GROUP BY dept_no
HAVING COUNT(emp_no) = (
SELECT MAX(count_result) FROM (SELECT COUNT(emp_no) AS count_result FROM dept_emp GROUP BY dept_no) AS count_table
)
ORDER BY emp_count DESC
Side notes:
I think the subquery may be missing a WHERE: it is uncorrelated and runs against the unfiltered dept_emp table, so its result is always the same.
I think MAX(COUNT()) is unnecessary in the subquery, since you can order by the count and limit to one row, for example SELECT COUNT(emp_no) FROM dept_emp GROUP BY dept_no ORDER BY COUNT(emp_no) DESC LIMIT 1
Avoid subqueries where you can, since they can be slow in some databases. If you are curious, prepend EXPLAIN to the SQL statement and see what MySQL does with it.
Edit: If you provide the output of SHOW CREATE TABLE table_name for the tables that are involved in employee counting, I can give you the WHERE that you have to write in the subquery
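For what it's worth, here is a sketch of the derived-table form, assuming the inner query is grouped by dept_no so that MAX sees one count per department. It runs on SQLite via Python; the rows are invented so that two departments tie.

```python
import sqlite3

# Hypothetical sample data: d001 and d002 tie with 2 employees each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT)")
conn.executemany(
    "INSERT INTO dept_emp VALUES (?, ?)",
    [(1, "d001"), (2, "d001"), (3, "d002"), (4, "d002"), (5, "d003")],
)

# The inner query produces one count per department; MAX picks the largest,
# and HAVING keeps every department whose count equals it.
largest_departments = conn.execute(
    """
    SELECT COUNT(emp_no) AS emp_count, dept_no
    FROM dept_emp
    GROUP BY dept_no
    HAVING COUNT(emp_no) = (
        SELECT MAX(count_result)
        FROM (
            SELECT COUNT(emp_no) AS count_result
            FROM dept_emp
            GROUP BY dept_no
        ) AS count_table
    )
    ORDER BY dept_no
    """
).fetchall()
print(largest_departments)  # [(2, 'd001'), (2, 'd002')]
```

Unlike ORDER BY ... LIMIT 1, this keeps all tied departments.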
select avg(select count(aid)
from athlete
group by codepays)
I get a "more than one row" error.
How would I go about getting the average of the results from my first select?
You need to use a table expression (subquery).
For example:
select avg(cnt)
from (
select count(aid) as cnt
from athlete
group by codepays
) x
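A quick way to try the table-expression form is an in-memory SQLite database from Python. This is only a sketch: the athlete rows below are made up, and only the aid and codepays columns come from the question.

```python
import sqlite3

# Hypothetical sample data: 3 athletes for FRA, 1 for USA.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE athlete (aid INTEGER, codepays TEXT)")
conn.executemany(
    "INSERT INTO athlete VALUES (?, ?)",
    [(1, "FRA"), (2, "FRA"), (3, "FRA"), (4, "USA")],
)

# The inner query yields one count per country; the outer query averages them.
average_per_country = conn.execute(
    """
    SELECT AVG(cnt)
    FROM (
        SELECT COUNT(aid) AS cnt
        FROM athlete
        GROUP BY codepays
    ) x
    """
).fetchone()[0]
print(average_per_country)  # 2.0
```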
You can do this using division:
select count(aid) / count(distinct codepays)
from athlete;
No subqueries are necessary.
(Although the arithmetic needs to be tweaked if codepays can actually be NULL.)
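The sketch below compares the division with the subquery form on made-up rows, using SQLite from Python. Two assumptions are mine: the `* 1.0` factor, added because SQLite (like several other databases, though not MySQL) truncates integer division, and the NULL row, added to show the caveat above.

```python
import sqlite3

# Hypothetical sample data: 3 athletes for FRA, 2 for USA.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE athlete (aid INTEGER, codepays TEXT)")
conn.executemany(
    "INSERT INTO athlete VALUES (?, ?)",
    [(1, "FRA"), (2, "FRA"), (3, "FRA"), (4, "USA"), (5, "USA")],
)

# "* 1.0" forces floating-point division where integer / integer truncates.
BY_DIVISION = "SELECT COUNT(aid) * 1.0 / COUNT(DISTINCT codepays) FROM athlete"
BY_SUBQUERY = (
    "SELECT AVG(cnt) FROM "
    "(SELECT COUNT(aid) AS cnt FROM athlete GROUP BY codepays) x"
)

division_before = conn.execute(BY_DIVISION).fetchone()[0]
subquery_before = conn.execute(BY_SUBQUERY).fetchone()[0]
print(division_before, subquery_before)  # 2.5 2.5

# An athlete with a NULL country: GROUP BY gives NULL its own group,
# but COUNT(DISTINCT codepays) ignores it, so the two forms diverge.
conn.execute("INSERT INTO athlete VALUES (6, NULL)")
division_after = conn.execute(BY_DIVISION).fetchone()[0]
subquery_after = conn.execute(BY_SUBQUERY).fetchone()[0]
print(division_after, subquery_after)  # 3.0 2.0
```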
I have the following two tables:
1. Lecturers (LectID, Fname, Lname, degree).
2. Lecturers_Specialization (LectID, Expertise).
I want to find the lecturer with the most specializations.
When I try this, it does not work:
SELECT
L.LectID,
Fname,
Lname
FROM Lecturers L,
Lecturers_Specialization S
WHERE L.LectID = S.LectID
AND COUNT(S.Expertise) >= ALL (SELECT
COUNT(Expertise)
FROM Lecturers_Specialization
GROUP BY LectID);
But when I try this, it works:
SELECT
L.LectID,
Fname,
Lname
FROM Lecturers L,
Lecturers_Specialization S
WHERE L.LectID = S.LectID
GROUP BY L.LectID,
Fname,
Lname
HAVING COUNT(S.Expertise) >= ALL (SELECT
COUNT(Expertise)
FROM Lecturers_Specialization
GROUP BY LectID);
What is the reason? Thanks.
The WHERE clause introduces a condition on individual rows; the HAVING clause introduces a condition on aggregations, i.e. on results where a single value, such as a count, average, min, max, or sum, has been produced from multiple rows. Your query calls for the second kind of condition (a condition on an aggregation), hence HAVING works correctly.
As a rule of thumb, use WHERE before GROUP BY and HAVING after GROUP BY. It is a rather primitive rule, but it is useful in more than 90% of cases.
While you're at it, you may want to re-write your query using ANSI version of the join:
SELECT L.LectID, Fname, Lname
FROM Lecturers L
JOIN Lecturers_Specialization S ON L.LectID=S.LectID
GROUP BY L.LectID, Fname, Lname
HAVING COUNT(S.Expertise)>=ALL
(SELECT COUNT(Expertise) FROM Lecturers_Specialization GROUP BY LectID)
This would eliminate WHERE that was used as a theta join condition.
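To see the difference in practice, here is a small sketch on SQLite from Python. SQLite does not support `>= ALL`, so I swapped in an equivalent comparison against MAX over a derived table; the lecturer names and expertise values are invented.

```python
import sqlite3

# Hypothetical sample data: lecturer 1 has two specializations, lecturer 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Lecturers (LectID INTEGER, Fname TEXT, Lname TEXT, degree TEXT);
    CREATE TABLE Lecturers_Specialization (LectID INTEGER, Expertise TEXT);
    INSERT INTO Lecturers VALUES (1, 'Ada', 'Smith', 'PhD'), (2, 'Ben', 'Jones', 'MSc');
    INSERT INTO Lecturers_Specialization VALUES
        (1, 'Databases'), (1, 'Networks'), (2, 'Algebra');
    """
)

# An aggregate in WHERE is rejected: no groups exist yet when WHERE is evaluated.
try:
    conn.execute(
        """
        SELECT L.LectID
        FROM Lecturers L
        JOIN Lecturers_Specialization S ON L.LectID = S.LectID
        WHERE COUNT(S.Expertise) >= 2
        """
    )
    where_rejected = False
except sqlite3.OperationalError:
    where_rejected = True

# The same kind of condition in HAVING is applied to each group after GROUP BY.
# MAX over a derived table stands in for ">= ALL (...)".
top_lecturers = conn.execute(
    """
    SELECT L.LectID, Fname, Lname
    FROM Lecturers L
    JOIN Lecturers_Specialization S ON L.LectID = S.LectID
    GROUP BY L.LectID, Fname, Lname
    HAVING COUNT(S.Expertise) = (
        SELECT MAX(cnt)
        FROM (
            SELECT COUNT(Expertise) AS cnt
            FROM Lecturers_Specialization
            GROUP BY LectID
        )
    )
    """
).fetchall()
print(where_rejected, top_lecturers)  # True [(1, 'Ada', 'Smith')]
```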
First we should know the logical order in which the clauses are evaluated:
FROM > WHERE > GROUP BY > HAVING > SELECT > DISTINCT > ORDER BY.
Since the WHERE clause is evaluated before the GROUP BY clause, WHERE cannot filter grouped records.
"HAVING is the same as the WHERE clause, but it is applied to grouped records."
First the WHERE clause fetches the records that match its condition, then the GROUP BY clause groups them, and then the HAVING clause keeps the groups that match the HAVING condition.
HAVING operates on aggregates. Since COUNT is an aggregate function, you can't use it in a WHERE clause.
Here's some reading from MSDN on aggregate functions.
The WHERE clause can be used with SELECT, UPDATE, and DELETE statements, whereas HAVING can be used only with a SELECT statement.
WHERE filters rows before aggregation (GROUP BY), whereas HAVING filters groups after the aggregations are performed.
An aggregate function cannot be used in a WHERE clause unless it is in a subquery contained in a HAVING clause, whereas aggregate functions can be used directly in a HAVING clause.
Source
Didn't see an example of both in one query. So this example might help.
/**
INTERNATIONAL_ORDERS - table of orders by company by location by day
companyId, country, city, total, date
**/
SELECT country, city, sum(total) totalCityOrders
FROM INTERNATIONAL_ORDERS with (nolock)
WHERE companyId = 884501253109
GROUP BY country, city
HAVING country = 'MX'
ORDER BY sum(total) DESC
This filters the table first by the companyId, then groups it (by country and city) and additionally filters it down to just city aggregations of Mexico. The companyId was not needed in the aggregation but we were able to use WHERE to filter out just the rows we wanted before using GROUP BY.
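Here is a runnable sketch of the same idea on SQLite from Python, with invented order rows and without the SQL Server specific `with (nolock)` hint. It also shows that, because country is a plain grouping column and not an aggregate, the filter could equally go in WHERE.

```python
import sqlite3

# Hypothetical sample data; the last row belongs to a different company.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE INTERNATIONAL_ORDERS "
    "(companyId INTEGER, country TEXT, city TEXT, total INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO INTERNATIONAL_ORDERS VALUES (?, ?, ?, ?, ?)",
    [
        (884501253109, "MX", "Monterrey", 100, "2020-01-01"),
        (884501253109, "MX", "Monterrey", 50, "2020-01-02"),
        (884501253109, "MX", "Cancun", 30, "2020-01-01"),
        (884501253109, "US", "Austin", 500, "2020-01-01"),
        (1, "MX", "Cancun", 999, "2020-01-01"),
    ],
)

# WHERE drops other companies' rows before grouping; HAVING drops non-MX groups after.
with_having = conn.execute(
    """
    SELECT country, city, SUM(total) AS totalCityOrders
    FROM INTERNATIONAL_ORDERS
    WHERE companyId = 884501253109
    GROUP BY country, city
    HAVING country = 'MX'
    ORDER BY SUM(total) DESC
    """
).fetchall()

# The country filter does not involve an aggregate, so it also works in WHERE.
with_where_only = conn.execute(
    """
    SELECT country, city, SUM(total) AS totalCityOrders
    FROM INTERNATIONAL_ORDERS
    WHERE companyId = 884501253109 AND country = 'MX'
    GROUP BY country, city
    ORDER BY SUM(total) DESC
    """
).fetchall()
print(with_having)  # [('MX', 'Monterrey', 150), ('MX', 'Cancun', 30)]
```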
You cannot use aggregate functions in a WHERE clause, because WHERE goes through the table record by record and keeps each record that matches the given condition, so no aggregate value exists yet at that point. The HAVING clause works on the result set produced after the rows have been grouped.
Example query:
select empName, sum(Bonus)
from employees
group by empName
having sum(Bonus) > 5000;
The grouped result set is computed first, and then the HAVING clause is applied to it, so aggregate functions can be used there.
1. Aggregate functions (e.g. MIN, MAX, AVG) can be used in a HAVING clause but not in a WHERE clause.
2. The WHERE clause eliminates records tuple by tuple; the HAVING clause eliminates entire groups from the collection of groups.
HAVING is mostly used when you have groups of data, and WHERE is used when you have data in rows.
The WHERE clause is used to eliminate tuples in a relation, and the HAVING clause is used to eliminate groups in a relation.
The HAVING clause is used with aggregate functions such as MIN, MAX, COUNT, and SUM. Use a GROUP BY clause before the HAVING clause to avoid errors.
Both WHERE and HAVING are used to filter data.
With a WHERE clause, the filtering happens before the data is pulled for any further operation.
SELECT name, age
FROM employees
WHERE age > 30;
Here the WHERE clause filters rows before the SELECT operation is performed.
SELECT department, avg(age) avg_age
FROM employees
GROUP BY department
HAVING avg(age) > 35;
HAVING filters the data after the aggregation is performed: the average is computed for each group first, and then the HAVING clause filters the resulting groups. (Referring to the alias avg_age in HAVING works in MySQL but not in every database, so the aggregate is repeated here.)
SELECT SUM(SELECT type.value
FROM type,item_type
WHERE item_type.type_id = type.id AND item_type.type_id IN (4,7)
GROUP BY type.id)
What's wrong with this query? I would like to sum all rows coming from the internal query.
You can't use the SUM() function with subqueries. According to the manual, SUM() returns the total of a numeric column. What you wrote amounts to something like SELECT 1+50+30+10, with no table to select the values from. The syntax is:
SELECT SUM(column) FROM table
Take a look at:
http://www.w3schools.com/sql/sql_func_sum.asp
The correct way is
SELECT t.id, SUM(t.value)
FROM type as t
INNER JOIN item_type as it
ON it.type_id = t.id
WHERE it.type_id IN (4,7)
GROUP BY t.id
Consider using JOIN syntax instead of listing multiple tables in FROM; see: SQL left join vs multiple tables on FROM line?
You should learn to use proper join syntax and table aliases:
SELECT SUM(t.value)
FROM type t JOIN
item_type it
ON it.type_id = t.id
WHERE it.type_id IN (4,7);
If you want one row for each type.id, then you need a GROUP BY.
Your query doesn't work because subqueries are not allowed as arguments to aggregation functions. Even if they were, the context would be for a scalar subquery and your subquery is likely to return more than one row.
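A small sketch of both variants (one overall total, and one row per type.id), run on SQLite from Python. The rows and the item_id column are hypothetical; only type(id, value) and item_type.type_id come from the question.

```python
import sqlite3

# Hypothetical sample data: type 4 is linked to two items, type 7 to one, type 9 to one.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE type (id INTEGER, value INTEGER);
    CREATE TABLE item_type (item_id INTEGER, type_id INTEGER);
    INSERT INTO type VALUES (4, 10), (7, 5), (9, 100);
    INSERT INTO item_type VALUES (1, 4), (2, 4), (3, 7), (4, 9);
    """
)

# One overall total: SUM is applied directly to the joined rows, no subquery argument.
overall_total = conn.execute(
    """
    SELECT SUM(t.value)
    FROM type t
    JOIN item_type it ON it.type_id = t.id
    WHERE it.type_id IN (4, 7)
    """
).fetchone()[0]

# One row per type.id: the same join with a GROUP BY.
per_type = conn.execute(
    """
    SELECT t.id, SUM(t.value)
    FROM type t
    JOIN item_type it ON it.type_id = t.id
    WHERE it.type_id IN (4, 7)
    GROUP BY t.id
    ORDER BY t.id
    """
).fetchall()
print(overall_total, per_type)  # 25 [(4, 20), (7, 5)]
```

Each type.value is counted once per matching item_type row, which is why type 4 contributes 20 here.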
Just use SUM without the inner query:
SELECT type.id, SUM(type.value)
FROM type,item_type
WHERE item_type.type_id = type.id AND item_type.type_id IN (4,7)
GROUP BY type.id
Why can't I write queries like
select *
from STAFF
having Salary > avg(salary);
in SQL?
PS: to find staff members with a salary > avg salary, I need to write
select *
from STAFF
having Salary > (select avg(Salary) from STAFF);
Is there any other way to do this?
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
The functions SUM(),AVG(),MIN(),MAX(),COUNT(),etc are called aggregate functions. Read more here.
Example using WHERE clause :
select *
from staff
where salary > (select avg(salary) from staff)
See example in SQL Fiddle.
Example using HAVING clause :
select deptid,COUNT(*) as TotalCount
from staff
group by deptid
having count(*) >= 2
See example in SQL Fiddle.
Where can we use having clause:
The HAVING clause specifies a search condition for a group or an aggregate. HAVING can be used only with the SELECT statement. HAVING is typically used with a GROUP BY clause. When GROUP BY is not used, HAVING behaves like a WHERE clause.
Read more here.
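Both examples above can be tried with the sketch below, which uses SQLite from Python. The staff rows and the name column are made up; salary and deptid come from the queries in this thread.

```python
import sqlite3

# Hypothetical sample data: average salary is 2000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, deptid INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("A", 1, 1000), ("B", 1, 3000), ("C", 2, 2000)],
)

# Row-level filter: the aggregate lives in a scalar subquery, so WHERE is fine.
above_average = conn.execute(
    "SELECT name FROM staff WHERE salary > (SELECT AVG(salary) FROM staff)"
).fetchall()

# Group-level filter: the aggregate is compared per group, so it goes in HAVING.
departments_with_two_or_more = conn.execute(
    """
    SELECT deptid, COUNT(*) AS TotalCount
    FROM staff
    GROUP BY deptid
    HAVING COUNT(*) >= 2
    """
).fetchall()
print(above_average, departments_with_two_or_more)  # [('B',)] [(1, 2)]
```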