SQL HAVING clause with aggregate functions? - mysql

I am trying to find the employees of a company whose salary is greater than the average salary of all employees. I don't want the average salary to appear in the final result, so I omit it from the SELECT statement. These are the things I've tried:
SELECT employee.lastname,employee.firstname,employee.salary FROM employee
GROUP BY employee.salary
HAVING employee.salary > avg(employee.salary);
This returns an empty result table.
However, the following surprisingly returns all the employees of the company, despite the '=' comparison:
SELECT employee.lastname,employee.firstname,employee.salary FROM employee
GROUP BY employee.salary
HAVING employee.salary = avg(employee.salary);
This returns an empty table again:
SELECT employee.lastname,employee.firstname,employee.salary FROM employee
WHERE (SELECT avg(employee.salary) FROM employee
GROUP BY employee.salary
HAVING employee.salary > AVG(employee.salary));
To conclude, I would appreciate some insight into the correct use of HAVING with an aggregate function, and into why these snippets return an empty table.

When you GROUP BY employee.salary then the average salary of each group is equal to employee.salary because all the salaries of the group are equal.
So the condition:
employee.salary > avg(employee.salary)
is always FALSE and you get no rows,
and the condition:
employee.salary = avg(employee.salary)
is always TRUE and the result is to get all the groups returned.
The correct code to get what you want is:
SELECT employee.lastname, employee.firstname, employee.salary
FROM employee
WHERE employee.salary > (SELECT avg(employee.salary) FROM employee);
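
The grouping effect described above can be reproduced on a small data set. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; the four sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lastname TEXT, firstname TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Adams", "Ann", 1000), ("Brown", "Bob", 2000),
     ("Clark", "Cy", 2000), ("Davis", "Di", 5000)],  # made-up sample rows
)

# Each group holds a single distinct salary, so AVG(salary) equals that salary.
per_group = conn.execute(
    "SELECT salary, AVG(salary) FROM employee GROUP BY salary ORDER BY salary"
).fetchall()

# Consequently 'salary > AVG(salary)' can never be true inside a group.
greater = conn.execute(
    "SELECT salary FROM employee GROUP BY salary HAVING salary > AVG(salary)"
).fetchall()

# The ungrouped scalar subquery compares every row with the overall average (2500).
above_avg = conn.execute(
    "SELECT lastname, salary FROM employee "
    "WHERE salary > (SELECT AVG(salary) FROM employee)"
).fetchall()

print(per_group, greater, above_avg)
```

In this sample only Davis earns more than the overall average, and the grouped HAVING query returns no rows at all.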

Your last query has misplaced brackets, which leads to a syntax error. Remove the opening bracket ( before avg(.. and the closing bracket ) before the semicolon, and compare the salary with a bracketed subquery instead:
SELECT employee.lastname, employee.firstname, employee.salary
FROM employee
WHERE employee.salary > (SELECT avg(employee.salary) FROM employee);

Try this
SELECT lastname, firstname, salary
FROM employee
WHERE salary > (SELECT AVG(salary) FROM employee)
ORDER BY salary DESC
The sub-query for the average doesn't need a GROUP BY: when only aggregate functions are used in the SELECT or HAVING clause, the whole table is treated as a single group.
Or to use something more fancy:
SELECT lastname, firstname, salary
FROM
(
SELECT lastname, firstname, salary
, AVG(salary) OVER () AS avg_salary
FROM employee
) q
WHERE salary > avg_salary

You can deal with two sets/tables, one record-level and the other aggregated, even if they're the same set:
select e.lastname , e.firstname , e.salary
FROM employee e, (
select avg(a.salary) avg_salary
from employee a
) av
where 1=1
and e.salary > av.avg_salary
;

You have aggregated by employee.salary. So, in this query:
HAVING employee.salary > avg(employee.salary);
Each row before the HAVING has exactly one salary value. The average of a single value -- no matter how many are in the group -- is that value. Because a value cannot be bigger than itself, no rows are returned.
This clause:
HAVING employee.salary = avg(employee.salary);
is exactly the same thing, except all rows with non-NULL salaries match this condition. Hence, all rows are returned.
As others have mentioned, the more typical solution is a subquery:
select e.*
from employee e
where e.salary > (select avg(e2.salary) from employee e2);
Note the use of table aliases. These are highly recommended.
A more modern solution would use window functions:
select . . . -- select the columns you want
from (select e.*, avg(e.salary) over () as avg_salary
from employee e
) e
where e.salary > avg_salary;
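
To see what the window function contributes, the sketch below runs the same pattern on three invented rows. It assumes Python's built-in sqlite3 module with a bundled SQLite of version 3.25 or later (the first release with window functions) rather than MySQL 8.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (lastname TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Adams", 1000), ("Brown", 2000), ("Clark", 6000)],  # made-up sample rows
)

# AVG(salary) OVER () attaches the overall average (3000) to every row
# without collapsing the rows the way GROUP BY would.
with_avg = conn.execute(
    "SELECT lastname, salary, AVG(salary) OVER () AS avg_salary "
    "FROM employee ORDER BY salary"
).fetchall()

# The outer query can then filter on the computed column.
above = conn.execute(
    "SELECT lastname, salary FROM ("
    "  SELECT e.*, AVG(e.salary) OVER () AS avg_salary FROM employee e"
    ") q WHERE salary > avg_salary"
).fetchall()

print(with_avg, above)
```

The derived table is needed because a window function result cannot be referenced in the WHERE clause of the same query level.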

Related

Why does my HAVING clause return nothing?

The problem is to find the second highest salary in the Employee table.
However, my HAVING clause returns nothing, and I have no clue why. My logic is
to group by salary and, in the HAVING clause, keep only the groups
whose salary != the maximum salary.
This way I thought I would exclude the highest salary from the grouping, and
then display only the first record, which I thought would be the second highest salary.
SELECT salary
FROM Employee
GROUP BY salary
HAVING salary != MAX(salary)
ORDER BY salary desc
LIMIT 1
You don't need GROUP BY, ORDER BY or LIMIT at all; you can just take the highest salary that is smaller than the maximum:
SELECT MAX(salary)
FROM employee
WHERE salary < (SELECT MAX(salary) FROM employee);
Grouping and ordering should be avoided when they are not required, because they add work (typically a sort or a temporary table) that can make the query slow when the table contains very many rows.
Use a subquery to get the max salary:
SELECT salary
FROM Employee
WHERE salary != (SELECT MAX(salary) FROM Employee)
ORDER BY salary desc
LIMIT 1
Grouping is not required.
You can use a join too:
SELECT MAX(a.salary)
FROM Employee a
JOIN Employee b ON b.salary > a.salary
This works because the highest salary doesn't have a row to join to and so is excluded from the result.
It trades efficiency for brevity, but unless you have millions of employees (unlikely), it will execute fast enough.
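
The three approaches can be compared side by side. A minimal sketch, assuming Python's built-in sqlite3 module instead of MySQL and four invented salaries with a tie at the top:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?)",
    [(100,), (200,), (300,), (300,)],  # made-up sample rows
)

# Original attempt: inside each salary group MAX(salary) is the group's own
# salary, so 'salary != MAX(salary)' filters out every group.
grouped = conn.execute(
    "SELECT salary FROM Employee GROUP BY salary "
    "HAVING salary != MAX(salary) ORDER BY salary DESC LIMIT 1"
).fetchall()

# Highest salary strictly below the overall maximum.
via_subquery = conn.execute(
    "SELECT MAX(salary) FROM Employee "
    "WHERE salary < (SELECT MAX(salary) FROM Employee)"
).fetchone()[0]

# Self-join: rows holding the top salary find no higher row and drop out.
via_self_join = conn.execute(
    "SELECT MAX(a.salary) FROM Employee a "
    "JOIN Employee b ON b.salary > a.salary"
).fetchone()[0]

print(grouped, via_subquery, via_self_join)
```

Both working variants return 200 here, and the duplicated top salary of 300 does not affect the result.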

Count the average issue sql

I have a table with workers and their salaries.
I am trying to count how many workers have a salary greater than the average.
I know how to show the average, and I know how to count how many workers the company has,
but I failed to answer the question. This is what I tried, but I get an error:
SELECT COUNT(workers_id) FROM flight_company.workers
WHERE Salary > AVG(Salary);
If you are running MySQL 8.0, use window functions:
select avg_salary, count(*) no_workers_above_average
from (select salary, avg(salary) over() avg_salary from flight_company.workers) t
where salary > avg_salary
group by avg_salary
In earlier versions, one option is a join with an aggregate query:
select a.avg_salary, count(*) no_workers_above_average
from flight_company.workers w
inner join (select avg(salary) avg_salary from flight_company.workers) a
where w.salary > a.avg_salary
group by a.avg_salary
You can use a subquery:
select count(w.workers_id)
from flight_company.workers w
where salary > (select avg(salary) from flight_company.workers);
You can do it with window function avg() (MySql 8.0+) and aggregation:
select sum(t.flag)
from (select salary > avg(salary) over () flag from workers) t
This one worked for me:
SELECT COUNT(workers_id) AS Num_Above_Average FROM flight_company.workers WHERE Salary > (SELECT AVG(salary) FROM flight_company.workers)
How do I add the average salary column to this?
thanks
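
One way to return the count together with the average is the join with an aggregate query shown earlier in this thread. The sketch below is an illustration only: it assumes Python's built-in sqlite3 module rather than MySQL, drops the flight_company schema prefix because the in-memory database has no such schema, and uses invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workers (workers_id INTEGER, salary REAL)")
conn.executemany(
    "INSERT INTO workers VALUES (?, ?)",
    [(1, 1000), (2, 2000), (3, 4000), (4, 5000)],  # made-up sample rows
)

# An aggregate is not allowed directly in WHERE; SQLite rejects it as well.
try:
    conn.execute("SELECT COUNT(workers_id) FROM workers WHERE salary > AVG(salary)")
    rejected = False
except sqlite3.Error:
    rejected = True

# Join every worker to the one-row average, then count the rows above it.
row = conn.execute(
    "SELECT a.avg_salary, COUNT(*) AS num_above_average "
    "FROM workers w "
    "CROSS JOIN (SELECT AVG(salary) AS avg_salary FROM workers) a "
    "WHERE w.salary > a.avg_salary "
    "GROUP BY a.avg_salary"
).fetchone()

print(rejected, row)
```

Note that this form returns no row at all when no worker is above the average, for example when all salaries are equal.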

Invalid use of group function

SELECT department_id, MIN(salary)
FROM employees
WHERE department_id
HAVING AVG(salary) >= (SELECT MAX(AVG(salary))
FROM employees
GROUP BY department_id);
Why does it give me the "invalid use of group function" error?
If you want the minimum salary from departments whose average salary is the largest, then change the having clause:
HAVING AVG(salary) >= (SELECT AVG(salary)
FROM employees
GROUP BY department_id
ORDER BY AVG(salary) DESC
LIMIT 1
);
Subquery: you cannot nest aggregates as in MAX(AVG()). I don't know what you want to achieve, but if you want the maximum of the department averages, you should do it with:
SELECT MAX(avg_salary)
FROM (SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id) AS a;
or easier:
-- MySQL syntax; SQL Server would use SELECT TOP 1 instead of LIMIT 1
SELECT AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
ORDER BY AVG(salary) DESC
LIMIT 1;
Main query: the WHERE department_id clause should either specify which values department_id should have, or be dropped.
Main query: GROUP BY department_id is missing.
But to me this query does not make much sense. I think it would be better if you explained what you want to calculate; it is probably achievable much more easily.
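
Putting the suggestions from both answers together, a possible version of the full query looks like the sketch below. It assumes the goal is the minimum salary of the department with the highest average, uses Python's built-in sqlite3 module instead of MySQL, and runs on invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    # made-up rows: department averages are 2000, 5000 and 2000
    [(10, 1000), (10, 3000), (20, 4000), (20, 6000), (30, 2000)],
)

# Maximum of the per-department averages via a derived table,
# which replaces the invalid nested MAX(AVG(salary)).
max_avg = conn.execute(
    "SELECT MAX(avg_salary) FROM ("
    "  SELECT department_id, AVG(salary) AS avg_salary"
    "  FROM employees GROUP BY department_id"
    ") AS a"
).fetchone()[0]

# Minimum salary of every department whose average reaches that maximum.
result = conn.execute(
    "SELECT department_id, MIN(salary) FROM employees "
    "GROUP BY department_id "
    "HAVING AVG(salary) >= ("
    "  SELECT AVG(salary) FROM employees"
    "  GROUP BY department_id ORDER BY AVG(salary) DESC LIMIT 1"
    ")"
).fetchall()

print(max_avg, result)
```

With these rows only department 20 qualifies, and its minimum salary is 4000.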

Find second highest salary from a table with three columns Id, name, salary

How can I find the second highest salary in a table with three columns (id, name, salary) using a SELF JOIN? I already have an answer that uses a nested query, but I want to know how to write it with a SELF JOIN.
Why join if you can do it using a SELECT statement only?
Try this:
SELECT DISTINCT salary FROM myTable ORDER BY salary DESC LIMIT 1,1 ;
If you must use self join, you can do something like this...
SELECT x.val
FROM my_table x
JOIN my_table y
ON y.val >= x.val
GROUP
BY x.val HAVING COUNT(DISTINCT y.val) = ?
Find second highest salary from a table with three columns Id, name, salary:
SELECT id,NAME,salary FROM high
WHERE salary = (SELECT DISTINCT(salary) FROM high AS e1
WHERE (SELECT COUNT(DISTINCT(salary))=2 FROM high AS e2
WHERE e1.salary <=e2.salary))
ORDER BY NAME;
look at the sqlfiddle
Another interesting answer:
SELECT id,NAME,salary
FROM high
WHERE salary = (SELECT DISTINCT(salary)
FROM high AS e1
WHERE id = (SELECT COUNT(DISTINCT(salary))
FROM high AS e2
WHERE e1.salary <= e2.salary))
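
For comparison, the DISTINCT/LIMIT approach and the self-join approach can be run on the same data. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL, the rows in the table high are invented, LIMIT 1 OFFSET 1 is used as the equivalent of MySQL's LIMIT 1,1, and the ? placeholder of the self-join is bound to 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE high (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO high VALUES (?, ?, ?)",
    # made-up sample rows with duplicate salaries
    [(1, "a", 100), (2, "b", 300), (3, "c", 200), (4, "d", 300), (5, "e", 200)],
)

# Skip the highest distinct salary and take the next one.
via_limit = conn.execute(
    "SELECT DISTINCT salary FROM high ORDER BY salary DESC LIMIT 1 OFFSET 1"
).fetchone()[0]

# Self-join: a salary is the n-th highest when exactly n distinct
# salaries are greater than or equal to it.
via_self_join = conn.execute(
    "SELECT x.salary FROM high x "
    "JOIN high y ON y.salary >= x.salary "
    "GROUP BY x.salary HAVING COUNT(DISTINCT y.salary) = ?",
    (2,),
).fetchall()

# All rows that earn the second highest salary.
holders = conn.execute(
    "SELECT id, name, salary FROM high WHERE salary = ("
    "  SELECT DISTINCT salary FROM high ORDER BY salary DESC LIMIT 1 OFFSET 1"
    ") ORDER BY id"
).fetchall()

print(via_limit, via_self_join, holders)
```

Binding a different number to the placeholder gives the n-th highest salary with the same self-join.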

Question regarding writing SQL query

Consider I have two tables/columns:
Employee - > EmpId, DeptNo, EmpName, Salary
Department -> DeptNo, DeptName
Write a query to get the names of the employees who have the maximum salary in each of the departments.
I have tried this:
Select max(salary),empname
from Employee
where deptno = (select deptno
from department
where deptname in('isd','it','sales')
Is it correct? Actually, it's an interview question.
This is an example of the groupwise maximum pattern in MySQL. One way to do it would be:
SELECT e.salary, e.empname, d.deptname
FROM Employee AS e
JOIN (
-- highest salary per department
SELECT max(salary) AS max_sal, deptno
FROM Employee
GROUP BY deptno
) AS d_max ON (e.salary = d_max.max_sal AND e.deptno = d_max.deptno)
JOIN Department AS d ON (d.deptno = e.deptno)
Note that it will return more than one row for a department if more than one employee has the maximum salary in that department.
Personally I'd use a CTE and row_number for a question like this. For example:
with myCTE as
(
select e.empName, e.salary, d.deptName,
row_number() over (partition by e.deptNo order by e.salary desc) as rn
from Employee as e
inner join Department as d
on d.DeptNo=e.DeptNo
)
select m.empName, m.deptName, m.salary
from myCTE as m
where m.rn=1
In the case of ties (two employees in the same department have the same maximum salary), this is non-deterministic (it will just return one of them). If you want to return both of them, change row_number to dense_rank.
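
The difference between the two ranking functions shows up as soon as a department has a tie. A minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) in place of the original database; the sample rows and the helper top_earners are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DeptNo INTEGER, DeptName TEXT);
CREATE TABLE Employee (EmpId INTEGER, DeptNo INTEGER, EmpName TEXT, Salary INTEGER);
-- made-up sample rows: Ann and Bob tie for the top salary in 'it'
INSERT INTO Department VALUES (1, 'it'), (2, 'sales');
INSERT INTO Employee VALUES
  (1, 1, 'Ann', 5000), (2, 1, 'Bob', 5000), (3, 1, 'Cy', 3000), (4, 2, 'Di', 4000);
""")

def top_earners(ranking):
    # ranking names the window function to use: 'row_number' or 'dense_rank'
    sql = f"""
    WITH ranked AS (
      SELECT e.EmpName, d.DeptName, e.Salary,
             {ranking}() OVER (PARTITION BY e.DeptNo ORDER BY e.Salary DESC) AS rn
      FROM Employee AS e
      JOIN Department AS d ON d.DeptNo = e.DeptNo
    )
    SELECT EmpName, DeptName, Salary FROM ranked WHERE rn = 1
    ORDER BY DeptName, EmpName
    """
    return conn.execute(sql).fetchall()

one_per_dept = top_earners("row_number")  # exactly one row per department
with_ties = top_earners("dense_rank")     # every employee tied for the maximum

print(one_per_dept, with_ties)
```

With row_number the 'it' department yields either Ann or Bob, so only the number of rows is predictable; dense_rank returns both.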