MySQL: Can anyone please explain this query?

I want to understand the following query:
SELECT DISTINCT salary
FROM employees a
WHERE 3 >= (
SELECT COUNT(DISTINCT salary)
FROM employees b
WHERE b.salary <= a.salary
)
ORDER BY a.salary DESC;

Start with the inner SELECT, which is a correlated subquery: it is evaluated once for each row of the outer query. So what does it do?
It returns the number of distinct salaries that are less than or equal to the current employee's salary.
SELECT COUNT(DISTINCT salary)
FROM employees b
WHERE b.salary <= a.salary
Given that number for the current row of the outer SELECT, what does the outer query do? It returns the distinct salaries (in descending order) for which the number returned by the subquery is less than or equal to 3.
SELECT DISTINCT salary
FROM employees a
WHERE 3 >= (some number)
ORDER BY a.salary DESC;
Putting it all together, the query fetches:
The distinct salaries that are among the three lowest, sorted from highest to lowest.

I think this query returns the 3 lowest distinct salaries.
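To check the reading above, here is a minimal sketch that runs the same query through Python's sqlite3 module. The sample rows are made up for illustration, and SQLite stands in for MySQL, which is an assumption; for this query the two should behave the same.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("ann", 100), ("bob", 200), ("cat", 200),
     ("dan", 300), ("eve", 400), ("fay", 500)],
)

query = """
SELECT DISTINCT salary
FROM employees a
WHERE 3 >= (
    -- how many distinct salaries are at or below this row's salary?
    SELECT COUNT(DISTINCT salary)
    FROM employees b
    WHERE b.salary <= a.salary
)
ORDER BY a.salary DESC
"""
lowest_three = [row[0] for row in conn.execute(query)]
print(lowest_three)  # [300, 200, 100]
```

The duplicate 200 counts once, so 300 is still the third-lowest distinct salary and the 400 and 500 rows are filtered out.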

Related

Can someone explain this SQL statement? Has to do with count function

SELECT *
FROM employee A
WHERE n-1 = (SELECT COUNT(*)
FROM employee B
WHERE B.salary > A.salary)
So I'm trying to get the nth highest salary from the employee table. This code works exactly as I want it to, but I don't understand it, particularly the 3rd line, "WHERE n-1 = (SELECT count(*)".
I understand how the COUNT function works, but I don't get what happens when you put a number on one side and state that it equals the result of the count.
It looks like an example from a tutorial or book. Where it says n-1, you would substitute an integer value.
For example, if you want the 4th highest salary, you'd substitute 4-1, or 3.
So the query would be:
SELECT *
FROM employee A
WHERE 3 = (SELECT COUNT(*)
FROM employee B
WHERE B.salary > A.salary);
The subquery that returns the count is a correlated subquery. So it searches for rows with a greater salary relative to A.salary, the salary of the respective row currently being evaluated in the outer query. This means it will run the subquery many times, once for each row of the outer query. That's usually what a correlated subquery does. It has to do that, because the result of the subquery may be different for each row of the outer query.
So this subquery will return the count of employees whose salary is greater than the salary of the respective row in the outer query. If that count is 3, then there are exactly three employees with a greater salary than the employee represented by the row A. Therefore that employee has the 4th highest salary.
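As a rough illustration of that reasoning, the sketch below runs the query against a small made-up table using Python's sqlite3 module. SQLite standing in for MySQL, the sample rows, and the helper name nth_highest are all assumptions for the example.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("ann", 500), ("bob", 400), ("cat", 300), ("dan", 200), ("eve", 100)],
)

def nth_highest(conn, n):
    """Return the rows that have exactly n-1 employees earning more."""
    sql = """
    SELECT name, salary
    FROM employee A
    WHERE ? = (SELECT COUNT(*)
               FROM employee B
               WHERE B.salary > A.salary)
    """
    # The bound parameter plays the role of "n-1" in the original query.
    return conn.execute(sql, (n - 1,)).fetchall()

fourth = nth_highest(conn, 4)
print(fourth)  # [('dan', 200)]
```

For dan, three employees (ann, bob, cat) earn more, so the count is 3 and the row passes the filter when n is 4.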
In MySQL 8.0, you can use window functions, so another way to get this result without using a correlated subquery is the following:
SELECT *
FROM (
SELECT *, RANK() OVER (ORDER BY salary DESC) AS `rank`
FROM employee
) AS t
WHERE `rank` = 4;

Why does my HAVING clause return nothing?

The problem is to find the second highest salary in the Employee table.
However, my HAVING clause returns nothing, and I have no clue why. My logic:
group by salary, and use the HAVING clause to keep a group only if its
salary != the maximum salary.
That way I thought I would exclude the highest salary from the grouping, so
the first record displayed would be the 2nd highest salary.
SELECT salary
FROM Employee
GROUP BY salary
HAVING salary != MAX(salary)
ORDER BY salary desc
LIMIT 1
You don't need GROUP BY, ORDER BY or LIMIT at all. You can just take the highest salary that is smaller than the maximum:
SELECT MAX(salary)
FROM employee
WHERE salary < (SELECT MAX(salary) FROM employee);
Grouping and ordering add work, so avoid them when they are not required. On a table with very many rows they can make the query noticeably slower.
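As for why the original HAVING returns nothing: after GROUP BY salary, every group contains a single salary value, so MAX(salary) is evaluated per group and always equals that group's salary. The condition salary != MAX(salary) is therefore never true. A small sketch with Python's sqlite3 module shows this; the sample rows are invented and SQLite stands in for MySQL.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300), (4, 300)],
)

# MAX(salary) is computed per group, and each group holds one salary value.
per_group = conn.execute(
    "SELECT salary, MAX(salary) FROM Employee GROUP BY salary ORDER BY salary"
).fetchall()
print(per_group)  # [(100, 100), (200, 200), (300, 300)]

# So the HAVING condition from the question can never be true.
having_rows = conn.execute(
    "SELECT salary FROM Employee GROUP BY salary HAVING salary != MAX(salary)"
).fetchall()
print(having_rows)  # []

# Comparing against the table-wide maximum gives the second highest salary.
second = conn.execute(
    "SELECT MAX(salary) FROM Employee "
    "WHERE salary < (SELECT MAX(salary) FROM Employee)"
).fetchone()[0]
print(second)  # 200
```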
Use a subquery to get the max salary:
SELECT salary
FROM Employee
WHERE salary != (SELECT MAX(salary) FROM Employee)
ORDER BY salary desc
LIMIT 1
Grouping is not required.
You can use a join too:
SELECT MAX(a.salary)
FROM Employee a
JOIN Employee b ON b.salary > a.salary
This works because the highest salary doesn't have a row to join to and so is excluded from the result.
It trades efficiency for brevity, but unless you have millions of employees (unlikely), it will execute fast enough.
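To make the join version concrete, here is a small sketch using Python's sqlite3 module, with invented rows and SQLite standing in for MySQL. It also counts the joined pairs, which hints at where the extra cost comes from.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300), (4, 300)],
)

# Rows with the top salary find no partner b with a higher salary,
# so they drop out of the inner join before MAX is taken.
second = conn.execute(
    "SELECT MAX(a.salary) FROM Employee a "
    "JOIN Employee b ON b.salary > a.salary"
).fetchone()[0]
print(second)  # 200

# The join pairs each row with every higher-paid row before aggregating.
pair_count = conn.execute(
    "SELECT COUNT(*) FROM Employee a JOIN Employee b ON b.salary > a.salary"
).fetchone()[0]
print(pair_count)  # 5
```

With 4 rows the join already produces 5 intermediate pairs, and that number grows roughly with the square of the row count.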

How to handle SQL subqueries with sums

I am practicing queries on an example database in MySQL.
I have an employee table with a primary key of emp_id.
I have a works_with table with a composite key of emp_id and client_id. It also has a column of total_sales.
I am trying to write a query that returns the name of any employee who has sold over 100,000 total.
I was able to return the employee id and total for sums over 100,000 like so:
SELECT SUM(total_sales) AS total_sales, emp_id
FROM works_with
WHERE total_sales > 100000
GROUP BY emp_id;
But I am unsure how to use this to also get employee name. I have tried nested queries but with no luck. For example when I try this:
SELECT first_name, last_name
FROM employee
WHERE emp_id IN (
SELECT SUM(total_sales) AS total_sales, emp_id
FROM works_with WHERE total_sales > 100000
GROUP BY emp_id
)
I get Error 1241: Operand should contain 1 column(s). I believe this is because I am selecting two columns in the nested query? So how would I handle this problem?
Just join:
select sum(w.total_sales) as total_sales, e.first_name, e.last_name
from works_with w
inner join employee e on e.emp_id = w.emp_id
group by e.emp_id
having sum(w.total_sales) > 100000;
Note that I used a HAVING clause rather than a WHERE clause: presumably, you want to sum all sales of each employee, and filter on that result. Your original query sums only the individual values that are greater than 100000.
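The difference between the two filters can be seen on a few made-up rows. This is a minimal sketch with Python's sqlite3 module; SQLite stands in for MySQL and the sales figures are invented.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE works_with (emp_id INTEGER, client_id INTEGER, "
    "total_sales INTEGER, PRIMARY KEY (emp_id, client_id))"
)
conn.executemany(
    "INSERT INTO works_with VALUES (?, ?, ?)",
    [(1, 10, 60000), (1, 11, 70000),    # employee 1: 130000 in total
     (2, 10, 150000), (2, 12, 20000),   # employee 2: 170000 in total
     (3, 13, 30000)],                   # employee 3: 30000 in total
)

# WHERE filters individual rows before they are summed.
where_rows = conn.execute(
    "SELECT emp_id, SUM(total_sales) FROM works_with "
    "WHERE total_sales > 100000 GROUP BY emp_id ORDER BY emp_id"
).fetchall()
print(where_rows)  # [(2, 150000)]

# HAVING filters the groups after the sum has been computed.
having_rows = conn.execute(
    "SELECT emp_id, SUM(total_sales) FROM works_with "
    "GROUP BY emp_id HAVING SUM(total_sales) > 100000 ORDER BY emp_id"
).fetchall()
print(having_rows)  # [(1, 130000), (2, 170000)]
```

Employee 1 has no single sale above 100000, so the WHERE version misses them entirely, and it also under-reports employee 2.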
Adding to GMB's solution.
Take your existing Select and wrap it in a Derived Table/CTE:
SELECT e.first_name, e.last_name, big_sales.total_sales
FROM employee as e
join
(
SELECT SUM(total_sales) AS total_sales, emp_id
FROM works_with
GROUP BY emp_id
HAVING SUM(total_sales) > 100000
) as big_sales
on e.emp_id = big_sales.emp_id
Now you can show the total_sales plus employee details. Additionally this should be more efficient, because you aggregate & filter before the join.
If you only need to show the employee you can use a SubQuery (like the one you tried), but it must return a single column, i.e. remove the SUM from the Select list:
SELECT first_name, last_name
FROM employee
WHERE emp_id IN (
SELECT emp_id -- , SUM(total_sales) AS total_sales
FROM works_with
GROUP BY emp_id
HAVING SUM(total_sales) > 100000
)
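A quick way to try the single-column subquery is the sketch below, using Python's sqlite3 module. SQLite stands in for MySQL, and the employee names and sales figures are invented for the example.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, "
    "first_name TEXT, last_name TEXT)"
)
conn.execute(
    "CREATE TABLE works_with (emp_id INTEGER, client_id INTEGER, "
    "total_sales INTEGER, PRIMARY KEY (emp_id, client_id))"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Bob", "Ray"), (3, "Cy", "Fox")],
)
conn.executemany(
    "INSERT INTO works_with VALUES (?, ?, ?)",
    [(1, 10, 60000), (1, 11, 70000), (2, 10, 150000), (3, 13, 30000)],
)

# The subquery returns exactly one column, so IN has a single operand to match.
names = conn.execute(
    """
    SELECT first_name, last_name
    FROM employee
    WHERE emp_id IN (
        SELECT emp_id
        FROM works_with
        GROUP BY emp_id
        HAVING SUM(total_sales) > 100000
    )
    ORDER BY emp_id
    """
).fetchall()
print(names)  # [('Ann', 'Lee'), ('Bob', 'Ray')]
```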

MySQL - Group and total, but return all rows in each group

I'm trying to write a query that finds each time the same person occurs in my table between a specific date range. It then groups this person and totals their spending for a specific range. If their spending habits are greater than X amount, then return each and every row for this person between date range specified. Not just the grouped total amount. This is what I have so far:
SELECT member_id,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50
This is retrieving the correct total and returning members spending over $50, but not each and every row. Just the total for each member and their grand total. I'm currently querying the whole table, I didn't add in the date ranges yet.
JOIN this subquery with the original table:
SELECT si1.*
FROM sold_items AS si1
JOIN (SELECT member_id
FROM sold_items
GROUP BY member_id
HAVING SUM(amount) > 50) AS si2
ON si1.member_id = si2.member_id
The general rule is that the subquery groups by the same column(s) that it's selecting, and then you join that with the original table using the same columns.
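Here is a minimal sketch of that join-back pattern using Python's sqlite3 module. The rows are made up, the id column is an assumption added so the output can be ordered, and SQLite stands in for MySQL.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sold_items (id INTEGER PRIMARY KEY, "
    "member_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sold_items VALUES (?, ?, ?)",
    [(1, 1, 30), (2, 1, 25),             # member 1 spends 55
     (3, 2, 10), (4, 2, 15), (5, 2, 10), # member 2 spends 35
     (6, 3, 60)],                        # member 3 spends 60
)

# The derived table picks the qualifying members; the join brings back
# every individual row that belongs to them.
rows = conn.execute(
    """
    SELECT si1.*
    FROM sold_items AS si1
    JOIN (SELECT member_id
          FROM sold_items
          GROUP BY member_id
          HAVING SUM(amount) > 50) AS si2
      ON si1.member_id = si2.member_id
    ORDER BY si1.id
    """
).fetchall()
print(rows)  # [(1, 1, 30), (2, 1, 25), (6, 3, 60)]
```

A date range would go in a WHERE clause inside the derived table and, if the detail rows should be limited too, on the outer query as well.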
SELECT member_id, amount
FROM sold_items si
INNER JOIN (SELECT member_id,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50) spenders USING (member_id)
The query you have already built can be used as a derived table to join with. If member_id is not indexed, this will become slow as the table grows.
The word spenders is a table alias, you can use any valid alias in its stead.
Several syntaxes will get the result you are looking for. Here is one that uses an implicit inner join, so that every returned row has a member_id from the list produced by the GROUP BY, with that member's total repeated on each of their rows:
SELECT si.*, gb.total from sold_items as si, (SELECT member_id as mid,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50) as gb where gb.mid=si.member_id;
I think that this might help:
SELECT
member_id,
SUM(amount) AS amount_value,
'TOTAL' as amount_type
FROM
`sold_items`
GROUP BY
member_id
HAVING
SUM(amount) > 50
UNION ALL
SELECT
`sold_items`.member_id,
amount AS amount_value,
'DETAILED' as amount_type
FROM
`sold_items`
INNER JOIN
(
SELECT
A.member_id,
SUM(amount) AS total
FROM
`sold_items` A
GROUP BY
member_id
HAVING
total <= 50
) AS A
ON `sold_items`.member_id = A.member_id
Results of the above query should be like the following:
member_id  amount_value  amount_type
====================================
1          55            TOTAL
2          10            DETAILED
2          15            DETAILED
2          10            DETAILED
The amount_type column distinguishes the two groups of members: those shown as a single total and those shown row by row.
You could use a subquery with EXISTS as an alternative:
select *
from sold_items t1
where exists (
select * from sold_items t2
where t1.member_id=t2.member_id
group by member_id
having sum(amount)>50
)
ref: http://dev.mysql.com/doc/refman/5.7/en/exists-and-not-exists-subqueries.html
In case you need to group by multiple columns, you can build a composite identifier with CONCAT and match it against a GROUP BY subquery:
select id, `key`, language, `group`
from translation
-- query all key-language entries by composite identifier...
where concat(`key`, '_', language) in (
-- by lookup of all key-language combinations...
select concat(`key`, '_', language)
from translation
group by `key`, language
-- that occur more than once
having count(*) > 1
)
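The composite-identifier idea can be tried out with the sketch below, which uses Python's sqlite3 module. SQLite stands in for MySQL, so the || operator replaces CONCAT (an assumption of this example, not part of the answer), and the translation rows are invented.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
# SQLite concatenates with || where the MySQL query above uses CONCAT().
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translation (id INTEGER PRIMARY KEY, "
    "`key` TEXT, language TEXT, `group` TEXT)"
)
conn.executemany(
    "INSERT INTO translation VALUES (?, ?, ?, ?)",
    [(1, "hello", "en", "ui"), (2, "hello", "en", "ui"),
     (3, "hello", "de", "ui"), (4, "bye", "en", "ui")],
)

duplicates = conn.execute(
    """
    SELECT id, `key`, language, `group`
    FROM translation
    WHERE `key` || '_' || language IN (
        -- key-language combinations that occur more than once
        SELECT `key` || '_' || language
        FROM translation
        GROUP BY `key`, language
        HAVING COUNT(*) > 1
    )
    ORDER BY id
    """
).fetchall()
print(duplicates)  # [(1, 'hello', 'en', 'ui'), (2, 'hello', 'en', 'ui')]
```

Only the hello/en pair appears twice, so both of its rows come back while hello/de and bye/en are left out.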

Find second highest salary from a table with three columns Id, name, salary

How do I find the second highest salary from a table with three columns (id, name, salary) using a SELF JOIN? I got an answer via a nested query, but I want to know how to frame it using a SELF JOIN.
Why join if you can do it with a plain SELECT statement?
Try this:
SELECT DISTINCT salary FROM myTable ORDER BY salary DESC LIMIT 1,1 ;
If you must use a self join, you can do something like this, where ? is the rank you want (2 for the second highest):
SELECT x.val
FROM my_table x
JOIN my_table y
ON y.val >= x.val
GROUP BY x.val
HAVING COUNT(DISTINCT y.val) = ?
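Both suggestions can be compared on a small made-up table. The sketch below uses Python's sqlite3 module, with SQLite standing in for MySQL; the column name val follows the self-join answer, and the rows are invented.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT, val INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "a", 100), (2, "b", 200), (3, "c", 300), (4, "d", 300), (5, "e", 400)],
)

# Self join: for each x.val, count the distinct values at or above it.
# A value with exactly 2 such values (itself and one higher) is second highest.
self_join = conn.execute(
    """
    SELECT x.val
    FROM my_table x
    JOIN my_table y ON y.val >= x.val
    GROUP BY x.val
    HAVING COUNT(DISTINCT y.val) = ?
    """,
    (2,),
).fetchall()
print(self_join)  # [(300,)]

# Plain SELECT: skip one distinct value, then take one.
limit_form = conn.execute(
    "SELECT DISTINCT val FROM my_table ORDER BY val DESC LIMIT 1,1"
).fetchall()
print(limit_form)  # [(300,)]
```

Both forms treat the two rows with 300 as a single salary level, so they agree even when there are ties.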
Find second highest salary from a table with three columns Id, name, salary:
SELECT id,NAME,salary FROM high
WHERE salary = (SELECT DISTINCT(salary) FROM high AS e1
WHERE (SELECT COUNT(DISTINCT(salary))=2 FROM high AS e2
WHERE e1.salary <=e2.salary))
ORDER BY NAME;
Look at the sqlfiddle.
Another interesting answer:
SELECT id,NAME,salary
FROM high
WHERE salary = (SELECT DISTINCT(salary)
FROM high AS e1
WHERE 2 = (SELECT COUNT(DISTINCT(salary))
FROM high AS e2
WHERE e1.salary <= e2.salary))