Here is the schema:
Employee (name, sex, salary, deptName)
where name is the primary key.
SELECT deptname
FROM employee
WHERE sex = 'm'
GROUP BY deptName HAVING avg(salary) >
(SELECT avg(salary)
FROM employee)
I want to understand the HAVING avg(salary) part. What does that part actually do,
given that salary is not in the SELECT clause?
SELECT deptname
FROM employee
WHERE sex = 'm'
GROUP BY deptName
This part gives me the groups of deptName, just one column and nothing else. I am wondering how HAVING avg(salary) works: is it taking the average over all employees in the table, or what?
Can anyone tell me?
Thanks
WHERE filters records before they are grouped; whereas HAVING filters the results after they have been grouped. Expressions, using functions or operators, can be used in either clause (although aggregate functions like AVG() cannot be used in the WHERE clause as the records would not have been grouped when that clause is evaluated).
Thus your query filters a list of departments for those where the average salary of that department's male workers is greater than the overall (company) average salary.
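To see the WHERE, GROUP BY and HAVING steps in action, here is a small sketch that runs the query from the question against an in-memory SQLite database. The sample rows and the function name are made up for illustration, and ORDER BY is added only to make the output order stable.

```python
import sqlite3

def departments_above_company_average(rows):
    """Departments whose male employees' average salary beats the company-wide average."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (name TEXT PRIMARY KEY, sex TEXT, salary REAL, deptName TEXT)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT deptName
        FROM employee
        WHERE sex = 'm'                            -- 1. keep only male employees
        GROUP BY deptName                          -- 2. one group per department
        HAVING AVG(salary) >                       -- 3. keep groups whose own average...
               (SELECT AVG(salary) FROM employee)  -- ...beats the average over ALL rows
        ORDER BY deptName
        """
    ).fetchall()
    conn.close()
    return [dept for (dept,) in result]

# Hypothetical sample data: (name, sex, salary, deptName)
sample_rows = [
    ("Al", "m", 100, "Sales"),
    ("Bea", "f", 300, "Sales"),
    ("Cy", "m", 400, "IT"),
    ("Di", "f", 200, "IT"),
    ("Ed", "m", 250, "HR"),
]
print(departments_above_company_average(sample_rows))
```

With this data the company-wide average is 250, and the male averages are 100 (Sales), 400 (IT) and 250 (HR), so only IT passes the HAVING filter. HR is dropped because 250 is not strictly greater than 250.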
SELECT AVG(salary)
FROM employee
The query above first gives you the average salary of all employees.
Your query then returns only the departments whose average salary is greater than that overall average.
The having clause works like a where condition for the 'group by deptName' clause. All rows are grouped by value of deptName column. For each group, the average is calculated on the values of salary for that particular group.
Therefore, a group's row is returned only if the average salary for that particular 'deptName' group is greater than the average salary of all employees.
HAVING is like a WHERE clause, but for aggregate functions like AVG.
So your query computes the average for every deptName. In your query,
having avg(salary) > (select avg(salary) from employee)
the subquery supplies the average to compare with. You could also compare against a fixed value,
like
having avg(salary) > 25
which selects only the departments whose average is > 25.
Related
I can't work out the logic in SQL for grouping by department, finding the average salary of each department, and then filtering the rows of the table down to those whose salary is greater than the average salary of their department.
Department | Salary
A | 100
B | 200
A | 200
B | 50
So the average of group A is 150
and the average of group B is 125.
My query should return:
Department | Salary
B | 200
A | 200
Please have a look at how grouping works in SQL. This query will find each department and its average salary:
SELECT department, AVG(salary) salary FROM yourtable
GROUP BY department;
To find the rows whose salary is higher than their department's average, you can join the whole table to this "average result" and keep only the entries that have a higher salary:
SELECT y.department, y.salary FROM yourtable y
JOIN (SELECT department, AVG(salary) salary FROM yourtable
GROUP BY department) average
ON y.department = average.department
WHERE y.salary > average.salary
ORDER BY y.department;
The ORDER BY clause makes department A appear before department B. In your description, it's sorted the other way. If you want to change this, you can write ORDER BY y.department DESC;
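As a quick check, the join query can be run against the four sample rows from the question. This is only a sketch using an in-memory SQLite database; the function name is made up, and the table name yourtable is the placeholder used in the answer.

```python
import sqlite3

def rows_above_department_average(rows):
    """Rows whose salary is higher than the average salary of their own department."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (department TEXT, salary REAL)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT y.department, y.salary
        FROM yourtable y
        JOIN (SELECT department, AVG(salary) AS salary
              FROM yourtable
              GROUP BY department) average      -- one row per department
          ON y.department = average.department
        WHERE y.salary > average.salary         -- compare each row with its group average
        ORDER BY y.department
        """
    ).fetchall()
    conn.close()
    return result

# The sample data from the question
question_rows = [("A", 100), ("B", 200), ("A", 200), ("B", 50)]
print(rows_above_department_average(question_rows))
```

The averages are 150 for A and 125 for B, so the rows (A, 200) and (B, 200) are returned.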
A last note: If there are NULL values in the salary column, they will not be considered by the average function. So if you have 10 NULL values, one row with a salary of 100 and one with a salary of 50, the average will be 75 and "ignore" the NULL values. If you don't want this, you need to replace the NULL values with the value you want. For example, you could write COALESCE(salary,0) within your query if you want to replace all NULL values with zero when calculating your average salary.
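The NULL behaviour described above can be checked with a small sketch on an in-memory SQLite database. The function name and the one-column table are assumptions made for this illustration only.

```python
import sqlite3

def average_with_and_without_nulls(salaries):
    """Return (AVG ignoring NULLs, AVG with NULLs counted as 0)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (salary REAL)")
    conn.executemany("INSERT INTO yourtable VALUES (?)", [(s,) for s in salaries])
    row = conn.execute(
        "SELECT AVG(salary), AVG(COALESCE(salary, 0)) FROM yourtable"
    ).fetchone()
    conn.close()
    return row

# 10 NULL salaries, one salary of 100 and one of 50, as in the note above
salaries = [None] * 10 + [100, 50]
ignoring_nulls, nulls_as_zero = average_with_and_without_nulls(salaries)
print(ignoring_nulls, nulls_as_zero)
```

AVG(salary) divides 150 by the 2 non-NULL rows and gives 75, while AVG(COALESCE(salary, 0)) divides 150 by all 12 rows and gives 12.5.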
Say I have a table called Employee with attributes Name, Salary, Department.
I know that this will work:
SELECT Department, AVG(Salary)
FROM EMPLOYEE
GROUP BY Department;
Would it be incorrect to discard 'Department' from the SELECT clause like so:
SELECT AVG(Salary)
FROM EMPLOYEE
GROUP BY Department;
Or would it still work?
Both queries will work and produce output. However, the second query returns only one column (the average salary), so you won't be able to trace each value back to its department from the second query alone, e.g.:
Query 1 output:
dept | salary
1 | 5000
2 | 6000
Query 2 output:
salary
5000
6000
No, it is not incorrect. You can use the second query too. But you won't be able to see which department each average belongs to, because the column you grouped by is not in the output.
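Here is a small sketch that runs both queries side by side on an in-memory SQLite database. The sample rows and the function name are invented for illustration, and ORDER BY Department is added to both queries only so that the two outputs line up.

```python
import sqlite3

def compare_group_by_queries(rows):
    """Run the GROUP BY query with and without Department in the SELECT list."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EMPLOYEE (Name TEXT, Salary REAL, Department TEXT)")
    conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", rows)
    with_department = conn.execute(
        "SELECT Department, AVG(Salary) FROM EMPLOYEE "
        "GROUP BY Department ORDER BY Department"
    ).fetchall()
    without_department = conn.execute(
        "SELECT AVG(Salary) FROM EMPLOYEE "
        "GROUP BY Department ORDER BY Department"
    ).fetchall()
    conn.close()
    return with_department, without_department

# Hypothetical sample data: (Name, Salary, Department)
employee_rows = [("Ann", 4000, "1"), ("Bob", 6000, "1"), ("Cat", 6000, "2")]
with_department, without_department = compare_group_by_queries(employee_rows)
print(with_department)
print(without_department)
```

Both queries compute the same averages (5000 and 6000 here); the second one simply drops the label that says which department each average belongs to.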
My table is as below:
WORKS ( emp-name, comp-name, salary)
I want to find the comp-name which pays the lowest total salary to its employees.
I tried the query below, but it gives the SUM of salaries for every comp-name:
SELECT comp-name, MIN(sal) as lowest
FROM
(
SELECT comp-name, SUM(salary) as sal from WORKS group by comp-name
)tmt group by comp-name;
How do I find only the one company which pays the lowest total salary?
You can use LIMIT to get only the one company with the lowest total salary. You also need to sort in ascending order:
SELECT comp-name,
SUM(salary) as sal
from WORKS
group by comp-name
Order by sal ASC
LIMIT 1
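A runnable sketch of this approach on an in-memory SQLite database follows. One assumption to note: column names containing a hyphen, such as comp-name, have to be quoted, otherwise the database parses them as comp minus name. The sketch uses double quotes, which SQLite accepts; the sample rows and function name are made up.

```python
import sqlite3

def company_with_lowest_total_salary(rows):
    """Return (company, total salary) for the company with the smallest salary sum."""
    conn = sqlite3.connect(":memory:")
    # Hyphenated identifiers must be quoted to be treated as single names.
    conn.execute('CREATE TABLE WORKS ("emp-name" TEXT, "comp-name" TEXT, salary REAL)')
    conn.executemany("INSERT INTO WORKS VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT "comp-name", SUM(salary) AS sal
        FROM WORKS
        GROUP BY "comp-name"   -- one total per company
        ORDER BY sal ASC       -- smallest total first
        LIMIT 1                -- keep only that first row
        """
    ).fetchone()
    conn.close()
    return result

# Hypothetical sample data: (emp-name, comp-name, salary)
works_rows = [
    ("Ann", "Acme", 100), ("Bob", "Acme", 200),
    ("Cat", "Globex", 150),
    ("Dan", "Initech", 120), ("Eve", "Initech", 90),
]
print(company_with_lowest_total_salary(works_rows))
```

The totals are 300 (Acme), 150 (Globex) and 210 (Initech), so Globex is returned. If two companies tie for the lowest total, LIMIT 1 still returns just one of them.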
SELECT *
FROM Employees Emp1
WHERE (n) = ( SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employees Emp2
WHERE Emp2.Salary >= Emp1.Salary )
I think what matters most is the subquery. It returns the number of distinct salaries that are greater than or equal to the current Emp1.Salary. The value returned is equal to the employee's salary rank.
Assume that you're the employee with the third greatest salary, 10000. The subquery counts the distinct salaries greater than yours, which is 2, plus one more for your own salary of 10000 (2+1=3). Your own salary is included because >= is used in the WHERE clause.
Having said that, it makes perfect sense that the entire query selects employees based on their salary rank.
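The ranking logic can be tried out with a short sketch on an in-memory SQLite database, with (n) passed in as a bound parameter. The sample rows and the function name are invented for illustration.

```python
import sqlite3

def employees_with_salary_rank(rows, n):
    """Employees whose salary is the n-th highest distinct salary."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employees (Name TEXT, Salary REAL)")
    conn.executemany("INSERT INTO Employees VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT Emp1.Name, Emp1.Salary
        FROM Employees Emp1
        WHERE ? = (SELECT COUNT(DISTINCT Emp2.Salary)   -- rank of Emp1's salary
                   FROM Employees Emp2
                   WHERE Emp2.Salary >= Emp1.Salary)
        ORDER BY Emp1.Name
        """,
        (n,),
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: (Name, Salary)
employee_salaries = [
    ("Ann", 20000), ("Bob", 15000), ("Cat", 15000), ("Dan", 10000), ("Eve", 5000),
]
print(employees_with_salary_rank(employee_salaries, 3))
```

For Dan the subquery sees the distinct salaries 20000, 15000 and 10000, so the count is 3 and he is returned for n = 3. Because the count is over DISTINCT salaries, Bob and Cat share rank 2 and are both returned for n = 2.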
Second highest salary in a table in SQL
select * from emp e where
2 =(select count(distinct sal) from emp where e.sal<=sal)
I am unable to understand this query. Can anyone help me?
Let's analyze the inner query: it counts all the distinct salaries that are greater than or equal to a given salary, which is the salary of the current row in the outer query.
So, for EVERY row, you are counting the distinct salaries greater than or equal to that row's salary, and you finally select the rows where the count is 2. That is exactly the second highest salary, because the second highest salary has just 2 salaries greater than or equal to itself: the highest salary of all, and itself.
Tremendously inefficient, because for every row you re-scan the entire table, but funny :)
This is a very inefficient way of doing it.
It uses a correlated subquery. Conceptually, for each row in the outer query it does a self join back into the emp table and counts the number of distinct salary values greater than or equal to the current salary. If this is 2 then this salary is the 2nd highest salary in the table.
This is known as a "triangular join", and as the number of rows increases the work required grows quadratically.
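We can get the nth highest salary from a table with the SQL below, which uses the LIMIT offset, count syntax as in MySQL. Replace N-1 with the actual number, because LIMIT takes a literal value rather than an expression. The subquery finds the nth distinct salary and the outer query returns every employee who earns it:
SELECT salary, name, id
FROM emp_salary
WHERE salary = (SELECT DISTINCT salary
                FROM emp_salary
                ORDER BY salary DESC
                LIMIT N-1, 1);
For example, to find the second highest salary:
SELECT salary, name, id
FROM emp_salary
WHERE salary = (SELECT DISTINCT salary
                FROM emp_salary
                ORDER BY salary DESC
                LIMIT 1, 1);
We can get the nth lowest salary the same way, sorting in ascending order:
SELECT salary, name, id
FROM emp_salary
WHERE salary = (SELECT DISTINCT salary
                FROM emp_salary
                ORDER BY salary ASC
                LIMIT N-1, 1);
For example, to find the second lowest salary:
SELECT salary, name, id
FROM emp_salary
WHERE salary = (SELECT DISTINCT salary
                FROM emp_salary
                ORDER BY salary ASC
                LIMIT 1, 1);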
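Below is a hedged sketch of the nth highest / nth lowest lookup on an in-memory SQLite database. It is a variation rather than a copy of the queries above: it writes the offset as LIMIT 1 OFFSET n-1 (equivalent to LIMIT n-1, 1), computes n-1 in Python, and looks the salary up in a subquery so that name and id always come from rows that really have that salary. The sample rows and the function name are made up.

```python
import sqlite3

def nth_salary_rows(rows, n, highest=True):
    """Rows (id, name, salary) earning the n-th highest (or n-th lowest) distinct salary."""
    direction = "DESC" if highest else "ASC"  # sort direction cannot be a bound parameter
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp_salary (id INTEGER, name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO emp_salary VALUES (?, ?, ?)", rows)
    result = conn.execute(
        f"""
        SELECT id, name, salary
        FROM emp_salary
        WHERE salary = (SELECT DISTINCT salary
                        FROM emp_salary
                        ORDER BY salary {direction}
                        LIMIT 1 OFFSET ?)   -- skip n-1 distinct salaries, take the next one
        ORDER BY id
        """,
        (n - 1,),
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: (id, name, salary)
emp_rows = [(1, "Ann", 300), (2, "Bob", 200), (3, "Cat", 200), (4, "Dan", 100), (5, "Eve", 400)]
print(nth_salary_rows(emp_rows, 2))                 # second highest
print(nth_salary_rows(emp_rows, 2, highest=False))  # second lowest
```

With this data the distinct salaries are 400, 300, 200 and 100, so the second highest is 300 (Ann) and the second lowest is 200 (Bob and Cat). If n is larger than the number of distinct salaries, the subquery yields no value and no rows are returned.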