Understanding some SQL subquery examples - mysql

I've been self-learning SQL from resources on the internet. I have two SQL queries that I would like to understand.
Write an SQL query to fetch three max salaries from a table.
SELECT distinct Salary from worker a WHERE 3 >= (SELECT count(distinct Salary) from worker b WHERE a.Salary <= b.Salary) order by a.Salary desc;
Write an SQL query to fetch three min salaries from a table.
SELECT distinct Salary from worker a WHERE 3 >= (SELECT count(distinct Salary) from worker b WHERE a.Salary >= b.Salary) order by a.Salary desc;
As you can see these two are similar. The part I don't particularly understand is this:
(a.Salary >= b.Salary) or (a.Salary <= b.Salary)
I don't understand its logic here. What is it doing here?

In your query, for each salary the subquery counts how many distinct salaries are greater than or equal to it (for the max query) or less than or equal to it (for the min query). If that count is more than 3, the current salary is not shown. The conditions (a.Salary <= b.Salary) and (a.Salary >= b.Salary) decide which salaries are counted against the current one.
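To make the counting concrete, here is a minimal sketch that runs the max-salary subquery from the question and prints the count it produces for each salary. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample salaries are made up for illustration; only the table name worker and the column Salary come from the question.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE worker (Salary INTEGER)")
conn.executemany(
    "INSERT INTO worker VALUES (?)",
    [(100,), (200,), (200,), (300,), (400,), (500,)],
)

# For each distinct salary, count the distinct salaries that are >= it.
# A count of 1 means "highest", 2 means "second highest", and so on.
rank_rows = conn.execute("""
    SELECT DISTINCT a.Salary,
           (SELECT COUNT(DISTINCT b.Salary)
              FROM worker b
             WHERE a.Salary <= b.Salary) AS salaries_at_or_above
      FROM worker a
     ORDER BY a.Salary DESC
""").fetchall()

# Keeping only the rows whose count is at most 3 gives the three highest salaries.
top_three = [salary for salary, count in rank_rows if count <= 3]

for salary, count in rank_rows:
    print(salary, count)
print(top_three)
```

Flipping the comparison to a.Salary >= b.Salary makes the count run the other way (1 for the lowest salary), which is why the same WHERE 3 >= (...) filter then returns the three lowest salaries.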
But to get the 3 min/max salaries I would do this. For max:
SELECT salary
FROM worker
ORDER BY salary DESC
LIMIT 3;
For min:
SELECT salary
FROM worker
ORDER BY salary ASC
LIMIT 3;
If one worker can have more than one record and you need to show the max total, this would be:
SELECT SUM(salary) total_salary
FROM worker
GROUP BY worker_id
ORDER BY total_salary DESC
LIMIT 3;
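As a rough check of the grouped version, the sketch below sums each worker's records and keeps the three largest totals. It assumes the columns worker_id and salary mentioned in the answer, uses invented rows, and runs on SQLite through Python's sqlite3 module rather than on MySQL.

```python
import sqlite3

# Hypothetical rows: workers 1 and 3 each have two salary records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE worker (worker_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO worker VALUES (?, ?)",
    [(1, 100), (1, 150), (2, 300), (3, 50), (3, 60), (4, 200)],
)

# Sum per worker, then sort by the summed value (not by the raw salary column).
top_totals = conn.execute("""
    SELECT worker_id, SUM(salary) AS total_salary
      FROM worker
     GROUP BY worker_id
     ORDER BY total_salary DESC
     LIMIT 3
""").fetchall()

print(top_totals)
```

Sorting by the alias total_salary matters: after GROUP BY, a bare salary column no longer identifies a single value per group.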

Related

How does these 2 sql queries work - to find two minimum and maximum salaries

Can anyone explain how these two queries work?
Q) Write a query to retrieve two minimum and maximum salaries from the EmployeePosition table.
A)To retrieve two minimum salaries, you can write a query as below:
SELECT DISTINCT Salary
FROM EmployeePosition E1
WHERE 2 >= (SELECT COUNT(DISTINCT Salary )
FROM EmployeePosition E2
WHERE E1.Salary >= E2.Salary
) ORDER BY E1.Salary DESC;
To retrieve two maximum salaries, you can write a query as below:
SELECT DISTINCT Salary
FROM EmployeePosition E1
WHERE 2 >= (SELECT COUNT(DISTINCT Salary)
FROM EmployeePosition E2
WHERE E1.Salary <= E2.Salary
)
ORDER BY E1.Salary DESC;
Reference table
Is there any alternative SQL query to get the same result?
The question is different from what you have asked:
Q21. Write a query to find the Nth highest salary from the table
without using TOP/limit keyword.
For the second highest salary (N = 2), this can be done with the DENSE_RANK window function, supported in MySQL 8.x:
WITH max_salary AS
(
SELECT *,
DENSE_RANK() OVER (ORDER BY Salary Desc) AS Rnk
FROM EmployeePosition
)
SELECT max_salary.*
FROM max_salary
WHERE Rnk=2;
MySQL DENSE_RANK Function assigns a rank to each row within a partition or result set (in your case it is a result set) with no gaps in ranking values.
Meaning the same salary will have the same rank.
For example using the data on the linked question:
create table EmployeePosition (
EmpID int,
EmpPosition varchar(25),
DateOfJoining date ,
Salary int );
insert into EmployeePosition values
(1,'Manager','2022-05-01',500000),
(2,'Executive','2022-05-02',75000),
(3,'Manager','2022-05-01',90000),
(2,'Lead','2022-05-02',85000),
(1,'Executive','2022-05-01',300000),
(3,'Manager','2022-05-01',500000);
SELECT *,
DENSE_RANK() OVER (ORDER BY Salary Desc) AS Rnk
FROM EmployeePosition
Result:
EmpID  EmpPosition  DateOfJoining  Salary  Rnk
1      Manager      2022-05-01     500000  1
3      Manager      2022-05-01     500000  1
1      Executive    2022-05-01     300000  2
3      Manager      2022-05-01     90000   3
2      Lead         2022-05-02     85000   4
2      Executive    2022-05-02     75000   5
As you can see, each salary is assigned a rank. Two rows have a salary of 500000 with rank 1, so the second highest value is 300000, which is what the WHERE Rnk=2 filter selects.
The above main query could be written differently:
select EmpID,EmpPosition,DateOfJoining,Salary
from ( SELECT *,
DENSE_RANK() OVER (ORDER BY Salary Desc) AS Rnk
FROM EmployeePosition
) as tbl
WHERE Rnk=2;
https://dbfiddle.uk/Meh2AloO
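For readers without a MySQL 8 instance at hand, the same DENSE_RANK query can be tried locally. This is only a sketch: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), the helper name nth_highest is made up, and the rows are the sample data from this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EmployeePosition (
        EmpID INTEGER, EmpPosition TEXT, DateOfJoining TEXT, Salary INTEGER)
""")
conn.executemany("INSERT INTO EmployeePosition VALUES (?, ?, ?, ?)", [
    (1, "Manager", "2022-05-01", 500000),
    (2, "Executive", "2022-05-02", 75000),
    (3, "Manager", "2022-05-01", 90000),
    (2, "Lead", "2022-05-02", 85000),
    (1, "Executive", "2022-05-01", 300000),
    (3, "Manager", "2022-05-01", 500000),
])

def nth_highest(n):
    """Return every row whose salary has dense rank n (hypothetical helper)."""
    return conn.execute("""
        SELECT EmpID, EmpPosition, Salary
          FROM (SELECT *, DENSE_RANK() OVER (ORDER BY Salary DESC) AS Rnk
                  FROM EmployeePosition) AS tbl
         WHERE Rnk = ?
         ORDER BY EmpID
    """, (n,)).fetchall()

print(nth_highest(1))  # both rows tied on the top salary
print(nth_highest(2))  # the single row with the second highest salary
```

Because tied salaries share a rank, asking for rank 1 returns both 500000 rows, and rank 2 is still the next distinct value.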
Can you please explain the sql queries in the question?
Let's explain the example below:
SELECT DISTINCT Salary
FROM EmployeePosition E1
WHERE 2 >= ( SELECT COUNT(DISTINCT Salary )
FROM EmployeePosition E2
WHERE E1.Salary <= E2.Salary
)
ORDER BY E1.Salary DESC;
This is known as a correlated subquery: an inner query (subquery) that references the outer query. Because it depends on the current outer row, it is evaluated once for each row the outer query processes.
For each record processed by the outer query, the inner query returns how many distinct salaries are greater than or equal to the current salary. The row is kept when that count is at most 2, so only the two highest salaries pass the filter; the second highest salary is the one for which the inner query returns exactly 2.
I think your answer would be:
By the inner selection, I mean:
(SELECT COUNT(DISTINCT Salary)
FROM EmployeePosition E2
WHERE E1.Salary <= E2.Salary
)
For every row, the system counts how many distinct salaries in the table are greater than or equal to that row's salary (or, when the condition uses >=, less than or equal to it). In other words, the inner query is evaluated once per row and returns that count.
Imagine the rows hold distinct values sorted in descending order. With the condition E1.Salary <= E2.Salary, the highest salary gets a count of 1, because the only salary greater than or equal to it is itself.
In the same way, the lowest salary gets a count equal to the total number of distinct salaries, because every salary is greater than or equal to it. With >= the counts run the other way round.
So when the system checks for this logic:
WHERE 2 >= (SELECT COUNT(DISTINCT Salary)
FROM EmployeePosition E2
WHERE E1.Salary <= E2.Salary
)
it keeps the rows for which at most 2 distinct salaries are greater than or equal to the row's salary, and the outer SELECT returns those salary values.
I think that is really time consuming especially when dealing with large databases. As an alternative you can use this code:
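SELECT
Salary, Rnk
FROM (
SELECT
Salary,
-- RANK is a reserved word in MySQL 8, so the alias is Rnk
-- use DENSE_RANK() instead if tied salaries should share a rank
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS Rnk
FROM EmployeePosition
) X
WHERE Rnk <= 2;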

Need to understand the solution to this query

We define an employee's total earnings to be their monthly salary × months worked, and the maximum total earnings to be the maximum total earnings for any employee in the Employee table. Write a query to find the maximum total earnings for all employees as well as the total number of employees who have maximum total earnings. Then print these values as space-separated integers.
This is the link to the question for better understanding: https://www.hackerrank.com/challenges/earnings-of-employees/problem
I am a beginner in SQL and couldn't understand the solution which was given
1.SELECT (months*salary) as earnings,
2.COUNT(*) FROM Employee
3. GROUP BY earnings
4. ORDER BY earnings DESC
5.LIMIT 1;
I understood the first step, where we give months*salary the alias earnings.
In the 2nd step we count the number of employees in the Employee table. I didn't understand why we are using GROUP BY here. The 4th and 5th steps are also clear: ORDER BY ranks earnings from highest to lowest, and LIMIT 1 fetches the highest value. But why GROUP BY?
Ignore 1, 2, 3, 4, 5. These are just numbers I used for better clarity.
You have split the query erroneously.
It must be:
SELECT (months*salary) as earnings, -- 2 and 4
COUNT(*) -- 4
FROM Employee -- 1
GROUP BY earnings -- 3
ORDER BY earnings DESC -- 5
LIMIT 1; -- 6
Step 1 - table Employee is used as data source
Step 2 - the value of the expression (months*salary) for each record in the table is calculated
Step 3 - the records which have the same value of the expression from (2) are treated as a group
Step 4 - for each group the value of the expression from (2) is put into the output buffer, and the number of records in the group is calculated and added to the output buffer
Step 5 - the rows in the output buffer are sorted by the expression from (2) in descending order
Step 6 - the first row from the buffer (i.e. the one with the greatest value of the expression from (2)) is returned.
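The steps above can be watched in action with a small sketch. It assumes an Employee table with months and salary columns as in the question, adds a hypothetical name column and invented rows, and runs on SQLite through Python's sqlite3 module rather than on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, months INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", [
    ("A", 12, 1000),  # earnings 12000
    ("B", 6, 2000),   # earnings 12000
    ("C", 10, 900),   # earnings 9000
    ("D", 3, 3000),   # earnings 9000
    ("E", 1, 500),    # earnings 500
])

# Steps 1-5: one output row per distinct earnings value with its head count,
# sorted from the highest earnings to the lowest.
grouped = conn.execute("""
    SELECT months * salary AS earnings, COUNT(*) AS employees
      FROM Employee
     GROUP BY months * salary
     ORDER BY earnings DESC
""").fetchall()

# Step 6: LIMIT 1 keeps only the first (highest) group.
top = conn.execute("""
    SELECT months * salary AS earnings, COUNT(*) AS employees
      FROM Employee
     GROUP BY months * salary
     ORDER BY earnings DESC
     LIMIT 1
""").fetchone()

print(grouped)
print(top)
```

Without the GROUP BY there would be no per-earnings groups for COUNT(*) to count, so the second number could not mean "employees who share the maximum".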
Step 3: GROUP BY earnings groups together rows with the same earnings value. If, for example, three employees have earnings of $3,000, they are grouped together. GROUP BY is also needed here because the query selects a non-aggregated expression (earnings) alongside the aggregate function COUNT(*). Without it, MySQL either raises an error (with ONLY_FULL_GROUP_BY enabled, the default in MySQL 5.7 and later) or collapses the whole table into a single row.
Step 4: ORDER BY earnings DESC was used to order the GROUPED EARNINGS in DESCENDING order. Meaning, from HIGHEST EARNINGS down to the LOWEST EARNINGS.
Step 5: LIMIT 1 limits the returned row count to only 1.
Hope this helps! :)
GROUP BY aggregates your results: when several rows have the same "earnings" value, they are collapsed into a single row in the output.
SELECT TOP 1 (months * salary), COUNT( * )
FROM Employee GROUP BY (months * salary)
ORDER BY (months * salary) DESC;
Use the query above for MS SQL Server.
For MySQL, use:
select (months * salary) as earnings, -- an employee's total earnings: monthly salary times months worked
count(*) -- number of employees in each group
from Employee
group by earnings -- group by earnings to count the employees who share each earnings value
order by earnings desc -- order from the highest earnings to the lowest
limit 1; -- keep the highest earnings and the number of employees who have them
select (salary * months) as max_earning -- for selecting max_earning
,count(*) -- for counting total employees with max_earning
from employee -- table from which data is fetched
group by (salary * months) -- grouping rows with the same earnings
order by max_earning desc -- highest earnings first (GROUP BY ... DESC was removed in MySQL 8.0)
limit 1; -- selecting the 1 record with max_earnings.
This code is for MySQL.
Try this query:
SELECT months*salary,COUNT(*) FROM Employee GROUP BY months*salary ORDER BY months*salary DESC LIMIT 1;
SELECT
MAX(salary * months), COUNT(*)
FROM
Employee
GROUP BY
(salary * months)
ORDER BY
(salary* months) desc limit 1;
select max(salary*months) ,count(*)
from employee
where (salary*months) in (
select max(salary*months) from employee
)
this is implemented in mysql

How to find second highest salary in mysql

How do I find the second highest salary in MySQL?
I want all the records that have the second highest salary.
Table : Employee
ID  salary  emp_name
1   400     A
2   800     B
3   300     C
4   400     D
4   400     C
MySQL query:
SELECT * FROM employee ORDER by salary DESC LIMIT 1,2
This returns two records, but I do not know in advance how many records have the second highest salary.
Try this:
SELECT emp_name,salary
FROM Employee
WHERE salary = (SELECT DISTINCT salary FROM Employee as emp1
WHERE (SELECT COUNT(DISTINCT salary)=2 FROM Employee as emp2
WHERE emp1.salary <= emp2.salary))
ORDER BY emp_name
SELECT sal
FROM emp
ORDER BY sal DESC
LIMIT 1, 1;
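This returns only the second row in descending order, which is the second highest salary as long as no salary above it is duplicated (add DISTINCT to handle ties).
If you need the 3rd, 4th, or Nth value, set the first LIMIT argument to N-1, e.g. for the 4th salary: LIMIT 3, 1;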
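Putting the N-1 offset together with the original question (all records that share the second highest salary), a possible sketch looks like this. It uses the Employee rows from the question, adds DISTINCT so that ties do not shift the offset, and runs on SQLite through Python's sqlite3 module as a stand-in for MySQL. The two helper function names are made up, and LIMIT 1 OFFSET n-1 is the long form of LIMIT n-1, 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER, salary INTEGER, emp_name TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", [
    (1, 400, "A"), (2, 800, "B"), (3, 300, "C"), (4, 400, "D"), (4, 400, "C"),
])

def nth_highest_salary(n):
    """Return the nth highest distinct salary, or None if there are fewer than n."""
    row = conn.execute(
        "SELECT DISTINCT salary FROM Employee ORDER BY salary DESC LIMIT 1 OFFSET ?",
        (n - 1,),
    ).fetchone()
    return row[0] if row else None

def employees_with_nth_highest(n):
    """Return the names of every employee paid the nth highest distinct salary."""
    rows = conn.execute("""
        SELECT emp_name
          FROM Employee
         WHERE salary = (SELECT DISTINCT salary
                           FROM Employee
                          ORDER BY salary DESC
                          LIMIT 1 OFFSET ?)
         ORDER BY emp_name
    """, (n - 1,)).fetchall()
    return [name for (name,) in rows]

print(nth_highest_salary(2))
print(employees_with_nth_highest(2))
```

The outer query is what answers "how many records": it returns however many employees happen to share that salary, without the caller needing to know the number in advance.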
1st one with limit-
SELECT SALARY FROM tbl_name ORDER BY SALARY DESC LIMIT 1,1
2nd one without limit-
SELECT MAX(SALARY) FROM tbl_name WHERE SALARY < (SELECT MAX(SALARY) FROM tbl_name)
SELECT salary FROM employee GROUP BY salary ORDER BY salary DESC LIMIT 1, 1
The query above first groups by the salary column (to get distinct values), sorts the result in descending order, and then applies LIMIT. LIMIT accepts two parameters: the first is the offset (which starts from 0) and the second is how many records we want.
If you want the third highest salary, just change it to LIMIT 2,1, and so on.
Use this to find 2nd highest salary.
SELECT * FROM tablename ORDER BY salary DESC LIMIT 1,1
The first 1 in LIMIT is the number of rows to skip, and the second 1 is the number of rows to return.
For the third highest, skip two rows; the same pattern finds the nth salary.
SELECT * FROM tablename ORDER BY salary DESC LIMIT 2,1
select salary
from (
select salary
from Employee
order by salary desc
limit 2) as minimumTwoSalary
order by minimumTwoSalary.salary
limit 1;
Using GROUP BY with ORDER BY will give you the second highest salary even if two employees have the same salary:
SELECT salary FROM employee GROUP BY salary ORDER by salary DESC LIMIT 1,1
Try this:
SELECT MAX(SALARY)
FROM tbl_name
WHERE
SALARY NOT IN
(
SELECT MAX(SALARY)
FROM tbl_name
)
Use the query below to get the 2nd or nth highest salary. DENSE_RANK() assigns a rank to each row within a partition or result set, with no gaps in the ranking values.
The rank of a row is one more than the number of distinct rank values that come before it.
SELECT salary as highest_salary FROM
(SELECT salary, DENSE_RANK() OVER( ORDER BY salary DESC) row_num FROM Employee) with_dense_rank
WHERE row_num = 2;
select max(salary) from Employee where salary != (select max(salary) from Employee)
Explanation: -
In the above query, the subquery uses the MAX function to find the maximum salary, and the WHERE clause filters that maximum out of the salary column, so the outer MAX returns the second highest salary.
Another example of the same approach is below. I believe it is the easiest method to follow logically, but maybe not the fastest because of the subquery.
select max(salary) from Employee where salary < (select max(salary) from Employee)
So, I will go with @Renuka Kulkarni's approach, with one small difference in the query: because several employees can have the same salary, it is important to select DISTINCT values.
SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT 1, 1;
We can find Second Highest Salary in many ways.
1.Normal Where Condition Using MAX() Function
SELECT MAX(salary) AS SecondHighestSalary
FROM Employee WHERE salary<(SELECT MAX(salary) FROM Employee);
2.Using LIMIT
SELECT salary AS SecondHighestSalary FROM Employee ORDER BY salary DESC
LIMIT 1,1;
3.Using Self JOIN
SELECT MAX(e2.salary) AS SecondHighestSalary
FROM Employee e1, Employee e2 WHERE e1.salary>e2.salary;
4.Using NOT IN Keyword
SELECT MAX(salary) AS SecondHighestSalary
FROM Employee
WHERE salary NOT IN (SELECT MAX(salary) FROM Employee);
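The four approaches agree when all salaries are different, but they can diverge when the top salary is tied. The sketch below illustrates this with invented data (two employees on 800), running on SQLite through Python's sqlite3 module as a stand-in for MySQL; the dictionary keys are just labels chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (salary INTEGER)")
# Hypothetical data with a tie for the highest salary.
conn.executemany("INSERT INTO Employee VALUES (?)", [(800,), (800,), (400,), (300,)])

queries = {
    # 1. MAX below the overall MAX
    "max_below_max": "SELECT MAX(salary) FROM Employee "
                     "WHERE salary < (SELECT MAX(salary) FROM Employee)",
    # 2. LIMIT with an offset of one row, as written in the list above
    "limit": "SELECT salary FROM Employee ORDER BY salary DESC LIMIT 1 OFFSET 1",
    # 2b. the same with DISTINCT, so tied rows count once
    "limit_distinct": "SELECT DISTINCT salary FROM Employee "
                      "ORDER BY salary DESC LIMIT 1 OFFSET 1",
    # 3. self join: the largest salary that is smaller than some other salary
    "self_join": "SELECT MAX(e2.salary) FROM Employee e1, Employee e2 "
                 "WHERE e1.salary > e2.salary",
    # 4. NOT IN the overall MAX
    "not_in": "SELECT MAX(salary) FROM Employee "
              "WHERE salary NOT IN (SELECT MAX(salary) FROM Employee)",
}

results = {name: conn.execute(sql).fetchone()[0] for name, sql in queries.items()}
for name, value in results.items():
    print(name, value)
```

With this data the plain LIMIT version returns 800 (the second row, not the second distinct value), while the other variants return 400.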
SELECT * FROM employee ORDER by salary DESC LIMIT 1,1;

Using multiple aggregate functions in one SQL query

My table is as below
WORKS ( emp-name, comp-name, salary)
I want to find the comp-name which pays the lowest total salary to its employees.
I tried the query below, but it gives the SUM of salaries for every comp-name:
SELECT comp-name, MIN(sal) as lowest
FROM
(
SELECT comp-name, SUM(salary) as sal from WORKS group by comp-name
)tmt group by comp-name;
How do I find only the one company which pays the lowest total salary?
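You can use LIMIT to get only the one company with the lowest total salary; you also need to sort in ascending order. (Column names containing a hyphen, such as comp-name, must be quoted with backticks in MySQL.)
SELECT `comp-name`,
SUM(salary) as sal
from WORKS
group by `comp-name`
Order by sal ASC
LIMIT 1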
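A quick way to convince yourself that one grouping level plus ORDER BY and LIMIT is enough is to run it on a few rows. This sketch uses SQLite through Python's sqlite3 module instead of MySQL, renames the columns to emp_name and comp_name (an assumption made here to avoid quoting the hyphenated names), and uses invented company data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WORKS (emp_name TEXT, comp_name TEXT, salary INTEGER)")
# Hypothetical rows: Acme pays 300 in total, Globex 200, Initech 400.
conn.executemany("INSERT INTO WORKS VALUES (?, ?, ?)", [
    ("Ann", "Acme", 100),
    ("Bob", "Acme", 200),
    ("Cy", "Globex", 150),
    ("Di", "Globex", 50),
    ("Ed", "Initech", 400),
])

# Total per company, smallest total first, keep only the first row.
lowest = conn.execute("""
    SELECT comp_name, SUM(salary) AS sal
      FROM WORKS
     GROUP BY comp_name
     ORDER BY sal ASC
     LIMIT 1
""").fetchone()

print(lowest)
```

If two companies tie for the lowest total, LIMIT 1 returns only one of them, so a tie would need a different filter.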

MYSQL query to find the all employees with nth highest salary

The two tables are employee_salary and employee
employee_salary
salary_id emp_id salary
Employee
emp_id | first_name | last_name | gender | email | mobile | dept_id | is_active
Query to get all the employees who have the nth highest salary, where n = 1, 2, 3, ... (any positive integer):
SELECT a.salary, b.first_name
FROM employee_salary a
JOIN employee b
ON a.emp_id = b.emp_id
WHERE a.salary = (
SELECT salary
FROM employee_salary
GROUP BY salary
DESC
LIMIT 1 OFFSET N-1
)
My Questions:
1) Is there a better, more optimized way to query this?
2) Is using LIMIT a good option?
3) There are more options to calculate the nth highest salary; which is the best, and what should be used when?
One option, using a correlated subquery:
SELECT *
FROM employee_salary t1
WHERE ( N ) = ( SELECT COUNT( t2.salary )
FROM employee_salary t2
WHERE t2.salary >= t1.salary
)
Using Rank Method
SELECT salary
FROM
(
SELECT @rn := @rn + 1 rn,
a.salary
FROM tableName a, (SELECT @rn := 0) b
GROUP BY salary DESC
) sub
WHERE sub.rn = N
You have asked what seems like a reasonable question. There are different ways of doing things in SQL and sometimes some methods are better than others. The ranking problem is just one of many, many examples. The "answer" to your question is that, in general, order by is going to perform better than group by in MySQL. Although even that depends on the particular data and what you consider to be "better".
The specific issues with the question are that you have three different queries that return three different things.
The first returns all employees with a "dense rank" that is the same. That terminology is used purposely because it corresponds to the ANSI dense_rank() function, which MySQL did not support before version 8.0. So, if your salaries are 100, 100, and 10, it will return two rows with a ranking of 1 and one with a ranking of 2.
The second returns different results if there are ties. If the salaries are 100, 100, 10, this version will return no rows with a ranking of 1, two rows with a ranking of 2, and one row with a ranking of 3.
The third returns an entirely different result set, which is just the salaries and the ranking of the salaries.
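The difference on ties is easy to reproduce. The following sketch loads the salaries 100, 100, 10 into an employee_salary table and runs the correlated COUNT query both without and with DISTINCT. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and the helper name employees_at_rank is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_salary (salary_id INTEGER, emp_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee_salary VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 2, 100), (3, 3, 10)],
)

def employees_at_rank(n, distinct):
    """Rows whose count of salaries >= their own equals n (hypothetical helper)."""
    agg = "COUNT(DISTINCT t2.salary)" if distinct else "COUNT(t2.salary)"
    return conn.execute(f"""
        SELECT t1.emp_id, t1.salary
          FROM employee_salary t1
         WHERE ? = (SELECT {agg}
                      FROM employee_salary t2
                     WHERE t2.salary >= t1.salary)
         ORDER BY t1.emp_id
    """, (n,)).fetchall()

# Plain COUNT: the tied rows both count each other, so nothing lands on rank 1.
print(employees_at_rank(1, distinct=False))
print(employees_at_rank(2, distinct=False))
print(employees_at_rank(3, distinct=False))

# COUNT(DISTINCT ...): behaves like a dense rank.
print(employees_at_rank(1, distinct=True))
print(employees_at_rank(2, distinct=True))
```

So which query is "right" depends on whether tied salaries should share a rank and whether the following rank should be skipped.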
My comment was directed at trying the queries on your data. In fact, you should decide what you actually want, both from a functional and a performance perspective.
LIMIT requires the SQL to skim through all records between 0 and N and therefore requires increasing time the further back in your ranking you want to look. However, IMO that problem cannot be solved better.
As Gordon Linoff suggested: Run your option against your data set, using the commonly used ranks (which ranks are queried often, which are not? The result might be fast on rank 1 but terrible on rank 100).
Execute and analyze the Query Execution Plan and create indexes accordingly (for example on the salary column) and retest your queries.
Other options:
Option 4:
You could build a ranking table which serves as a cache. The execution plan of your LIMIT query shows (see sqlfiddle here) that MySQL already creates a temporary table to solve the query.
Pros: Easy and fast
Cons: Forces you to regenerate the ranking table each time the data changes
Option 5:
You could reconsider how you define "ranks".
If we have the following salaries:
100'000
100'000
80'000
Is the employee Nr 3 considered to be of rank 3 or 2?
Are 1 and 2 on the same rank (rank 1), but 3 is on rank 3?
If you define rank = order, you can greatly simplify the query to
SELECT a.salary, b.first_name
FROM employee_salary a, employee b
WHERE a.emp_id = b.emp_id
order by salary desc
LIMIT 1 OFFSET 4
demo: http://sqlfiddle.com/#!2/e7321d/1/0
try this,
SELECT * FROM one as A WHERE ( n ) = ( SELECT COUNT(DISTINCT(b.salary)) FROM one as B WHERE
B.salary >= A.salary )
Suppose the emp_salary table has the records below:
If you want to select all employees with the nth (N = 1, 2, 3, etc.) highest or lowest salary, use the SQL below, changing the comparison operator to >= for highest or <= for lowest:
SELECT DISTINCT(a.salary),
a.id,
a.name
FROM emp_salary a
WHERE N = (SELECT COUNT( DISTINCT(b.salary)) FROM emp_salary b
WHERE b.salary >= a.salary
);
For example, if you want to select all employees with 2nd highest salary, use below sql:
SELECT DISTINCT(a.salary),
a.id,
a.name
FROM emp_salary a
WHERE 2 = (SELECT COUNT( DISTINCT(b.salary)) FROM emp_salary b
WHERE b.salary >= a.salary
);
But if you want to display only second highest salary(only single record), use the below sql:
SELECT DISTINCT(a.salary),
a.id,
a.name
FROM emp_salary a
WHERE 2 = (SELECT COUNT( DISTINCT(b.salary)) FROM emp_salary b
WHERE b.salary >= a.salary
) limit 1;