SELECT *
FROM employee A
WHERE n-1 = (SELECT COUNT(*)
FROM employee B
WHERE B.salary > A.salary);
So I'm trying to get the nth highest salary from the employee table. This code works exactly as I want it to, but I don't understand it,
particularly the 3rd line, "WHERE n-1 = (SELECT COUNT(*)".
I understand how the count function works, but I don't get what happens when you supply a number and state in the WHERE clause that it equals the result of the count function.
It looks like an example from a tutorial or book. Where it says n-1, you would substitute an integer value.
For example, if you want the 4th highest salary, you'd substitute 4-1, or 3.
So the query would be:
SELECT *
FROM employee A
WHERE 3 = (SELECT COUNT(*)
FROM employee B
WHERE B.salary > A.salary);
The subquery that returns the count is a correlated subquery. So it searches for rows with a greater salary relative to A.salary, the salary of the respective row currently being evaluated in the outer query. This means it will run the subquery many times, once for each row of the outer query. That's usually what a correlated subquery does. It has to do that, because the result of the subquery may be different for each row of the outer query.
So this subquery will return the count of employees whose salary is greater than the salary of the respective row in the outer query. If that count is 3, then there are exactly three employees with a greater salary than the employee represented by the row A. Therefore that employee has the 4th highest salary.
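To make the row-by-row evaluation concrete, here is a minimal sketch that runs the same correlated subquery against a small in-memory table. The sample rows, the name column and the nth_highest helper are made up for illustration, and SQLite (through Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 900), ("Bob", 800), ("Cid", 700), ("Dee", 600), ("Eve", 500)],
)

# For every outer row A, count the rows B that earn strictly more.
higher_counts = conn.execute(
    """
    SELECT A.name, A.salary,
           (SELECT COUNT(*) FROM employee B WHERE B.salary > A.salary) AS higher
    FROM employee A
    ORDER BY A.salary DESC
    """
).fetchall()


def nth_highest(n):
    """Return the rows that have exactly n - 1 higher salaries above them."""
    return conn.execute(
        """
        SELECT A.name, A.salary
        FROM employee A
        WHERE ? = (SELECT COUNT(*) FROM employee B WHERE B.salary > A.salary)
        """,
        (n - 1,),
    ).fetchall()


fourth = nth_highest(4)
print(higher_counts)
print(fourth)
```

With this data, Dee is the only row whose count of higher salaries is 3, so she is the one returned for n = 4.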
In MySQL 8.0, you can use window functions, so another way to get this result without using a correlated subquery is the following:
SELECT *
FROM (
SELECT *, RANK() OVER (ORDER BY salary DESC) AS `rank`
FROM employee
) AS t
WHERE `rank` = 4;
The problem is to find the Second Highest Salary from the employees table.
However, my HAVING clause returns nothing, and I have no clue why. My logic is
that I group by salary, and the condition in the HAVING clause keeps a group
only if its salary != the maximum salary.
This way I thought I excluded the highest salary from the grouping, and
then I display only the first record, which I thought would be the 2nd highest salary.
SELECT salary
FROM Employee
GROUP BY salary
HAVING salary != MAX(salary)
ORDER BY salary desc
LIMIT 1
You don't need GROUP BY, ORDER BY or LIMIT at all. You can just take the highest salary that is smaller than the maximum:
SELECT MAX(salary)
FROM employee
WHERE salary < (SELECT MAX(salary) FROM employee);
Grouping and ordering take extra execution time, so avoid them when they are not required. On a table with many rows they can make the query noticeably slower.
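As for why the HAVING version from the question comes back empty: the query groups by salary, so MAX(salary) is evaluated inside each group, where it always equals that group's own salary, and the condition is false for every group. A small sketch of both queries, assuming a made-up Employee table and using SQLite through Python's sqlite3 module as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300), (4, 300)],
)

# The query from the question: inside each salary group,
# MAX(salary) is that group's salary, so no group passes the HAVING test.
having_rows = conn.execute(
    """
    SELECT salary
    FROM Employee
    GROUP BY salary
    HAVING salary != MAX(salary)
    ORDER BY salary DESC
    LIMIT 1
    """
).fetchall()

# The suggested fix: the table-wide maximum comes from a separate subquery.
second_highest = conn.execute(
    """
    SELECT MAX(salary)
    FROM Employee
    WHERE salary < (SELECT MAX(salary) FROM Employee)
    """
).fetchone()[0]

print(having_rows)
print(second_highest)
```

The first query returns no rows, while the second returns 200 even though the top salary of 300 appears twice.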
Use a subquery to get the max salary:
SELECT salary
FROM Employee
WHERE salary != (SELECT MAX(salary) FROM Employee)
ORDER BY salary desc
LIMIT 1
Grouping is not required.
You can use a join too:
SELECT MAX(a.salary)
FROM Employee a
JOIN Employee b ON b.salary > a.salary
This works because the highest salary doesn't have a row to join to and so is excluded from the result.
It trades efficiency for brevity, but unless you have millions of employees (unlikely), it will execute fast enough.
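Here is a quick sketch of what the self-join produces, using a made-up three-row Employee table and SQLite (via Python's sqlite3 module) in place of MySQL. Listing the joined pairs shows that the top salary never appears on the a side:

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300)],
)

# Every (a, b) pair where b earns more than a.
# 300 has no partner, so it never shows up as a.salary.
join_pairs = conn.execute(
    """
    SELECT a.salary, b.salary
    FROM Employee a
    JOIN Employee b ON b.salary > a.salary
    ORDER BY a.salary, b.salary
    """
).fetchall()

# The largest a.salary that survives the join is the second highest salary.
second_highest = conn.execute(
    """
    SELECT MAX(a.salary)
    FROM Employee a
    JOIN Employee b ON b.salary > a.salary
    """
).fetchone()[0]

print(join_pairs)
print(second_highest)
```

The number of joined pairs grows roughly with the square of the row count, which is where the efficiency cost comes from.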
I am working with the employees table in SQL and I would like to fetch the department with the maximum number of employees.
SELECT (COUNT(emp_no)) AS emp_count, dept_no
FROM
dept_emp
GROUP BY dept_no
HAVING COUNT(emp_no) = (SELECT MAX(COUNT(emp_no)) FROM dept_emp)
ORDER BY emp_count DESC
So far this is what I have got but this results in an error saying 'Invalid use of group function'. There is another approach I followed by making a table first and then using the having clause but what would be the correct code in the above approach?
You don't need the HAVING clause at all. Without it, the query returns every department with its number of employees, and because of the ORDER BY, the department with the most employees comes first. If you want only that one, add LIMIT 1 at the end of the query.
You cannot apply an aggregate function to another aggregate function; MAX(COUNT()) is invalid.
SELECT COUNT(emp_no) AS emp_count, dept_no
FROM dept_emp
GROUP BY dept_no
HAVING COUNT(emp_no) = (
SELECT MAX(count_result)
FROM (SELECT COUNT(emp_no) AS count_result FROM dept_emp GROUP BY dept_no) AS count_table
)
ORDER BY emp_count DESC
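For illustration, here is a small runnable sketch of the derived-table approach. It assumes the inner count has to be grouped per department (GROUP BY dept_no) so that MAX picks the largest department size. The sample rows are made up, and SQLite (through Python's sqlite3 module) stands in for MySQL; it also rejects the nested aggregate, although with a different error message.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT)")
conn.executemany(
    "INSERT INTO dept_emp VALUES (?, ?)",
    [(1, "d001"), (2, "d001"), (3, "d001"),
     (4, "d002"), (5, "d002"), (6, "d002"),
     (7, "d003")],
)

# Nesting one aggregate directly inside another is rejected.
try:
    conn.execute("SELECT MAX(COUNT(emp_no)) FROM dept_emp").fetchall()
    nested_rejected = False
except sqlite3.OperationalError:
    nested_rejected = True

# Count per department in a derived table, then take the MAX of those counts.
largest_departments = conn.execute(
    """
    SELECT COUNT(emp_no) AS emp_count, dept_no
    FROM dept_emp
    GROUP BY dept_no
    HAVING COUNT(emp_no) = (
        SELECT MAX(count_result)
        FROM (SELECT COUNT(emp_no) AS count_result
              FROM dept_emp
              GROUP BY dept_no) AS count_table
    )
    ORDER BY emp_count DESC, dept_no
    """
).fetchall()

print(nested_rejected)
print(largest_departments)
```

Unlike ORDER BY ... LIMIT 1, this form keeps every department that ties for the largest head count.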
Side notes:
I think there is a missing WHERE in the subquery, as the result will always be the same: we are taking the MAX of a COUNT over an unfiltered dept_emp table.
I think the MAX(COUNT()) is unnecessary in the subquery, since you can just order by the count and limit to one row, for example SELECT COUNT(id) FROM foo ORDER BY COUNT(id) DESC LIMIT 1
Avoid subqueries if you can, since they can be slow to execute. If you are curious, prepend EXPLAIN to the SQL statement and see what MySQL does with it.
Edit: If you provide the output of SHOW CREATE TABLE table_name for the tables that are involved in employee counting, I can give you the WHERE that you have to write in the subquery
I want to understand the following query:
SELECT DISTINCT salary
FROM employees a
WHERE 3 >= (
SELECT COUNT(DISTINCT salary)
FROM employees b
WHERE b.salary <= a.salary
)
ORDER BY a.salary DESC;
Start with the inner SELECT (a correlated sub-query). Such a query is executed for each row in the outer query. So what does it do?
Return the number of unique salaries that are less than or equal to the current employee's salary.
SELECT COUNT(DISTINCT salary)
FROM employees b
WHERE b.salary <= a.salary
So, given that number for the current row of the outer select, what does that do? Return the unique salaries (in order) where the number returned from the sub-query is less than or equal to 3.
SELECT DISTINCT salary
FROM employees a
WHERE 3 >= (some number)
ORDER BY a.salary DESC;
Putting it all together, we fetch:
Unique salaries, in descending order, where each salary is one of the 3 lowest.
I think that this query should return the 3 lowest distinct salaries!
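That reading can be checked with a small sketch. The rows below are made up, and SQLite (through Python's sqlite3 module) is used in place of MySQL:

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("a", 100), ("b", 100), ("c", 200), ("d", 300), ("e", 400), ("f", 500)],
)

# Keep a salary only if at most 3 distinct salaries are <= it,
# which means it is one of the 3 lowest distinct salaries.
lowest_three = conn.execute(
    """
    SELECT DISTINCT salary
    FROM employees a
    WHERE 3 >= (
        SELECT COUNT(DISTINCT salary)
        FROM employees b
        WHERE b.salary <= a.salary
    )
    ORDER BY a.salary DESC
    """
).fetchall()

print(lowest_three)
```

Changing the comparison to b.salary >= a.salary would count the salaries at or above the current one instead, which selects the 3 highest distinct salaries.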
Here is the example from this thread:
select (
select distinct Salary from Employee order by Salary Desc limit 1 offset 1
)as second;
The select (...) as second looks confusing to me because I've never seen a query used in place of a column name in the SELECT list.
Does anyone have ideas about how to understand a nested SELECT like this? Are there any tutorials about this feature?
That's a subquery in the SELECT list of a query.
To get there, let's look at some other examples
SELECT t.id
, 'bar' AS foo
FROM mytable t
WHERE ...
LIMIT ...
'bar' is just a string literal that gets returned in every row (in a column named foo) in the resultset from the query.
Also, MySQL allows us to run a query without a FROM clause
SELECT 'fee' AS fum
We can also put a subquery in the SELECT list of a query. For example:
SELECT t.id
, (SELECT r.id FROM resorts r ORDER BY r.id ASC LIMIT 1) AS id
FROM mytable t
WHERE ...
LIMIT ...
The query pattern you asked about is a SELECT statement without a FROM clause, and the only expression being returned is the result from a subquery.
For example:
SELECT e.salary
FROM Employee e
GROUP BY e.salary
ORDER BY e.salary DESC
LIMIT 4,1
If this query runs, it will return one column, and will return either one or zero rows. (No more than one.) This satisfies the requirements for a subquery used in a SELECT list of another query.
SELECT ( subquery ) AS alias
With that, the outer query executes. There's no FROM clause, so MySQL returns one row. The resultset is going to consist of one column, with a name of "alias".
For each row returned by the outer query, MySQL will execute the subquery in the SELECT list. If the subquery returns a row, the value of the expression in the SELECT list of the subquery is assigned to the "alias" column of the resultset. If the execution of the subquery doesn't return a row, then MySQL assigns a NULL to the "alias" column.
Why can't I write queries like
select *
from STAFF
having Salary > avg(salary);
in SQL?
PS: to find staff members with a salary > avg salary, I need to write
select *
from STAFF
having Salary > (select avg(Salary) from STAFF);
Is there any other way to do this?
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
The functions SUM(), AVG(), MIN(), MAX(), COUNT(), etc. are called aggregate functions.
Example using WHERE clause:
select *
from staff
where salary > (select avg(salary) from staff)
Example using HAVING clause:
select deptid,COUNT(*) as TotalCount
from staff
group by deptid
having count(*) >= 2
Where can we use the HAVING clause:
The HAVING clause specifies a search condition for a group or an aggregate. HAVING can be used only with the SELECT statement. HAVING is typically used with a GROUP BY clause. When GROUP BY is not used, HAVING behaves like a WHERE clause.
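Both examples can be tried on a tiny data set. The sketch below assumes a staff table with name, deptid and salary columns (the rows are made up) and uses SQLite through Python's sqlite3 module as a stand-in for the database in the question:

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, deptid INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Ann", 1, 1000), ("Bob", 1, 2000), ("Cid", 2, 3000),
     ("Dee", 2, 4000), ("Eve", 3, 5000)],
)

# WHERE filters individual rows; the average comes from a subquery.
above_average = conn.execute(
    """
    SELECT name, salary
    FROM staff
    WHERE salary > (SELECT AVG(salary) FROM staff)
    ORDER BY name
    """
).fetchall()

# HAVING filters whole groups after GROUP BY has built them.
big_departments = conn.execute(
    """
    SELECT deptid, COUNT(*) AS TotalCount
    FROM staff
    GROUP BY deptid
    HAVING COUNT(*) >= 2
    ORDER BY deptid
    """
).fetchall()

print(above_average)
print(big_departments)
```

The average salary here is 3000, so only the two rows above it pass the WHERE filter, and only the departments with at least two members pass the HAVING filter.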