I have two tables, emp and expenditure.
Emp:
ID, NAME
Expenditure:
ID, EMP_ID, AMOUNT
Each emp has a limit of 100 that he/she can spend. We want to check which emp has expenditure > 100.
Output attributes needed: Emp name, exp id, amount
My query:
SELECT E.NAME,
EXP.ID,
EXP.AMOUNT
FROM EMP E
INNER JOIN expenditure EXP ON E.ID = EXP.EMP_ID
WHERE E.ID in
(SELECT EMP_ID
FROM
(SELECT EMP_ID,
SUM(AMOUNT) AS TOTAL
FROM expenditure
GROUP BY EMP_ID
HAVING SUM(AMOUNT) > 100.00
ORDER BY TOTAL DESC) SUBQ)
ORDER BY EXP.AMOUNT desc;
Is it possible to optimize this?
Just like code, SQL queries can be written in many different ways. Run an execution plan on your SQL.
Below is a more concise version, although it may not be any better optimised than your current code. Use execution plans to analyse performance.
SELECT E.NAME,
E.ID, -- THIS IS EMPLOYEE ID NOT EXPENDITURE ID
EXP.EMP_SPENT
FROM EMP E
JOIN (SELECT EMP_ID, sum(AMOUNT) as EMP_SPENT FROM expenditure GROUP BY EMP_ID) EXP
ON E.ID = EXP.EMP_ID
WHERE EXP.EMP_SPENT > 100;
Additionally...
I noticed that the question is a bit ambiguous. The original query returns every expense line of each employee whose total expenses exceed 100, whereas the text asks which employees have gone above 100. These are two different questions. My answer above addresses the latter: it does NOT list every expense of employees who spent over 100, only the employees who went over 100 together with their total expenditure.
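If you do want every expense line of the over-limit employees (the first reading), a window function can replace the nested IN subquery. This is only a sketch, assuming a database with window-function support (MySQL 8.0+, for example); the sample rows and the emp_total alias are made up for illustration, and only the execution plan can tell you whether it is actually faster on your data.

```sql
-- Hypothetical sample data, only so the sketch can be run in a scratch database.
CREATE TABLE emp (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE expenditure (id INT PRIMARY KEY, emp_id INT, amount DECIMAL(10,2));
INSERT INTO emp VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO expenditure VALUES (1, 1, 60.00), (2, 1, 70.00), (3, 2, 40.00);

-- Attach each employee's total to every expense row, then filter on that total.
SELECT t.name, t.exp_id, t.amount
FROM (
    SELECT e.name,
           x.id AS exp_id,
           x.amount,
           SUM(x.amount) OVER (PARTITION BY x.emp_id) AS emp_total
    FROM emp e
    JOIN expenditure x ON x.emp_id = e.id
) t
WHERE t.emp_total > 100
ORDER BY t.amount DESC;
```

With the sample rows, only Ann's two expenses come back, because her total (130) is above the limit while Bob's (40) is not.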
You can use a simple aggregation with a HAVING clause, such as:
SELECT e.name, e.id, SUM(amount) AS total
FROM emp e
JOIN expenditure ep
ON e.id = ep.emp_id
GROUP BY e.name, e.id
HAVING SUM(amount) > 100
but it is not logical to return a non-aggregated column (such as the expenditure ID) alongside aggregated ones in the same result.
I want to print the name and salary amount of the employee who has the highest salary. So far that works, but if several records share the highest salary, I want to print all of them. There are two tables given:
EMPLOYEE TABLE :-
SALARY TABLE:-
My query is:
SELECT E.NAME, S.AMOUNT
FROM `salary` S,
employee E
WHERE S.EMPLOYEE_ID = E.ID
and S.AMOUNT = (SELECT max(`AMOUNT`)
FROM `salary`)
Is there a better way to solve this?
What you're trying to achieve is "with ties" functionality. Unfortunately MySQL doesn't support it (the documentation shows nothing that can be added to the LIMIT part of the query), so you have no option other than finding the max salary first and filtering the records afterwards.
So, your solution is fine for that case.
Alternatively, if you're on version 8 or newer, you can move the subquery into a WITH clause:
with max_sal as (
select max(amount) ms from salary
)
SELECT E.NAME, S.AMOUNT
FROM salary S
JOIN employee E
ON S.EMPLOYEE_ID = E.ID
JOIN max_sal ms
ON S.AMOUNT = ms.ms
or search for it in the join directly
SELECT E.NAME, S.AMOUNT
FROM salary S
JOIN employee E
ON S.EMPLOYEE_ID = E.ID
JOIN (select max(amount) ms from salary) ms
ON S.AMOUNT = ms.ms
But I'm sure it won't get you any better performance.
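On version 8 and newer, a window function is another way to emulate "with ties". The following is a sketch rather than a recommendation: it assumes the column names used in the question's query (employee.ID, employee.NAME, salary.EMPLOYEE_ID, salary.AMOUNT), and the sample rows and the rnk alias are made up for illustration.

```sql
-- Hypothetical sample data, only so the sketch can be run in a scratch database.
CREATE TABLE employee (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE salary (employee_id INT, amount DECIMAL(10,2));
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO salary VALUES (1, 5000.00), (2, 5000.00), (3, 3000.00);

-- RANK() gives every row that shares the top amount the rank 1.
SELECT ranked.name, ranked.amount
FROM (
    SELECT e.name,
           s.amount,
           RANK() OVER (ORDER BY s.amount DESC) AS rnk
    FROM salary s
    JOIN employee e ON e.id = s.employee_id
) ranked
WHERE ranked.rnk = 1;
```

With the sample rows, both Ann and Bob are returned because they tie on the highest amount.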
I like solving them with a join:
WITH M AS (SELECT MAX(amount) AS amount FROM salary)
SELECT E.NAME, S.AMOUNT
FROM M
JOIN salary S USING (amount)
JOIN employee E ON E.ID = S.EMPLOYEE_ID
but your solution is perfectly fine.
I have this problem for a school example where I have to get the total salary for coaches and participants in March (done below), and then sum those to get the total salary due in March for all employees, which I just want to add to the end of the Total Salary column.
This is what I have so far:
(SELECT Coach.name AS Name, COUNT(*) AS 'Shows Attended In March',
dailySalary AS 'Daily Salary', sum(dailySalary) AS 'Total Salary'
FROM Coach, TVShow, CoachInShow
WHERE monthname(dateOfShow)='March' AND
Coach.idCoach=CoachInShow.idCoach AND
TVShow.idShow = CoachInShow.idShow
GROUP BY Coach.name, Coach.dailySalary)
UNION
(SELECT Participant.name AS Name, COUNT(*) AS 'Shows Attended In March',
dailySalary AS 'Daily Salary', sum(dailySalary) AS 'Total Salary'
FROM Participant, TVShow, Contender, ContenderInShow
WHERE monthname(dateOfShow)='March' AND
Participant.idContender = Contender.idContender AND
Contender.idContender = ContenderInShow.idContender AND
ContenderInShow.idShow = TVShow.idShow
GROUP BY Participant.name, Participant.dailySalary);
I tried using GROUP BY ... WITH ROLLUP on the whole thing, but it doesn't add up only the Total Salary column. I've spent a while on this and am kind of stumped.
I pasted the data here for what I'm working with: https://www.db-fiddle.com/f/gPKVQrZCMkvHUqViAUzCqZ/0 http://sqlfiddle.com/#!9/535f6d/1
Put the UNION into a subquery. In the main query, sum all the counts and total salaries, and use WITH ROLLUP to get the grand total.
You don't need dailySalary in the GROUP BY clause, since it's functionally dependent on the ID.
SELECT name AS Name, SUM(count) AS `Shows Attended in March`, SUM(totalSalary) AS `Total Salary`
FROM (
SELECT Coach.name, COUNT(*) AS count, SUM(dailySalary) AS totalSalary
FROM Coach
JOIN CoachInShow ON Coach.idCoach=CoachInShow.idCoach
JOIN TVShow ON TVShow.idShow = CoachInShow.idShow
WHERE monthname(dateOfShow)='March'
GROUP BY Coach.idCoach
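UNION ALL -- plain UNION would drop a row if a coach and a participant had identical name, count and total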
SELECT Participant.name, COUNT(*) AS count, SUM(dailySalary) AS totalSalary
FROM Participant
JOIN Contender ON Participant.idContender = Contender.idContender
JOIN ContenderInShow ON Contender.idContender = ContenderInShow.idContender
JOIN TVShow ON ContenderInShow.idShow = TVShow.idShow
WHERE monthname(dateOfShow)='March'
GROUP BY Participant.idParticipant
) AS x
GROUP BY Name
WITH ROLLUP
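One detail of WITH ROLLUP: the grand-total row comes back with NULL in the grouping column. A minimal sketch of labelling that row is shown here on a made-up table (march_pay and its rows are hypothetical, not part of the schema above), assuming no real name is NULL:

```sql
-- Hypothetical table standing in for the result of the UNION subquery.
CREATE TABLE march_pay (name VARCHAR(50), total_salary DECIMAL(10,2));
INSERT INTO march_pay VALUES ('Ann', 300.00), ('Bob', 150.00);

-- The ROLLUP super-aggregate row has name = NULL; COALESCE turns it into a label.
SELECT COALESCE(name, 'Grand total') AS label,
       SUM(total_salary) AS total_salary
FROM march_pay
GROUP BY name WITH ROLLUP;
```

The same COALESCE can be wrapped around the name column in the outer query above.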
I am trying to understand why it is a popular belief that avoiding a GROUP BY is always beneficial. My problem statement is: from an employee table where department_id is a foreign key, find the departments where the maximum employee salary is 40000.
1. The GROUP BY approach:
select d.department_name , e.max_salary
from department d
join ( select department_id, max(salary) as max_salary
from emp
group by 1
having max_salary = 40000 ) e
on (d.department_id = e.department_id)
2. The LEFT JOIN approach:
select d.department_name, inner_q.salary
from department d
join
(select e.department_id , e.salary
from emp e
left join emp e_inner
on (e.department_id = e_inner.department_id and e.salary < e_inner.salary)
where e_inner.department_id is null and e.salary = 40000 ) inner_q
on (d.department_id = inner_q.department_id)
Unfortunately explain plan does not make much sense to me. Any help in explaining which one should perform better and why would be much appreciated.
You are working too hard.
SELECT department_name, MAX(salary) AS max_salary
FROM emp
GROUP BY department_name
HAVING max_salary >= 40000
That will be faster than any version with subqueries.
This will make it run faster: INDEX(department_name, salary)
(Perhaps you want >= 40000, not = 40000?)
This version will make a single pass over the entire table (or INDEX, if you add that "covering" index), gathering the max salary for each department. Then it will throw away results that fail the HAVING clause; delivering the rest.
I would have no qualms about running this GROUP BY on a table of 10K rows. A million-row table would take a noticeable, but small, amount of time.
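In the question as posted, department_name lives in the department table rather than in emp, so the single-pass GROUP BY needs one join. A sketch under that assumption follows; the sample rows, the emp.id column and the index name are made up, and it keeps the question's = 40000 condition.

```sql
-- Hypothetical sample data, only so the sketch can be run in a scratch database.
CREATE TABLE department (department_id INT PRIMARY KEY, department_name VARCHAR(50));
CREATE TABLE emp (id INT PRIMARY KEY, department_id INT, salary INT);
INSERT INTO department VALUES (10, 'Sales'), (20, 'IT');
INSERT INTO emp VALUES (1, 10, 40000), (2, 10, 30000), (3, 20, 50000);

-- An index on the grouping key plus the aggregated column lets the MAX be read from the index.
CREATE INDEX idx_emp_dept_salary ON emp (department_id, salary);

SELECT d.department_name, MAX(e.salary) AS max_salary
FROM emp e
JOIN department d ON d.department_id = e.department_id
GROUP BY d.department_id, d.department_name
HAVING MAX(e.salary) = 40000;
```

With the sample rows, only Sales qualifies, since its highest salary is exactly 40000.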
I have a problem with extracting data from db. My code looks like:
SELECT MAX(maximum)as Number FROM
(
SELECT department_name, COUNT(employees.employee_id) AS maximum
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY department_name
)t
and the result is:
Number
1 46
and this is the number of maximum employees in one of the departments.
The problem is that I want an additional column with the name of the department in which there are 46 employees. I tried something like:
select department_name, count(employees.employee_id)
from departments, employees
where departments.department_id=employees.department_id
group by department_name
having count(employees.employee_id) =
( SELECT MAX(maxx)FROM
(SELECT department_name, COUNT(employees.employee_id) AS maxx
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY department_name
);
but it doesn't work. Please help!
Try this:
SELECT t.department_name, MAX(t.maximum) as Number FROM (
SELECT
department_name,
COUNT(employees.employee_id) AS maximum
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY department_name
) t
How about this: just order your results by maximum in descending order with LIMIT 1. Also, use modern join syntax in the query:
SELECT d.department_name,
COUNT(e.employee_id) AS maximum
FROM departments d
LEFT JOIN employees e
ON (d.department_id=e.department_id)
GROUP BY d.department_name
ORDER BY maximum DESC
LIMIT 1
This will give you the name of the department with the highest employee count.
You could also tack on an additional JOIN to the first query you posted as such:
SELECT TOP 1 d.department_name, MAX(maximum) as Number FROM
(
SELECT departments.department_id AS department_id, COUNT(employees.employee_id) AS maximum
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY departments.department_id
)t
JOIN departments d ON d.department_id = t.department_id
GROUP BY d.department_name
ORDER BY Number DESC
I tried this with a similar data model in my domain and it works quite nicely.
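Note that LIMIT 1 and TOP 1 return a single row even when several departments tie for the highest count. The HAVING approach from the question keeps the ties; what it lacks is an alias on the derived table, which MySQL and several other databases require. A sketch of that variant, with made-up sample rows and the hypothetical names cnt and per_department:

```sql
-- Hypothetical sample data, only so the sketch can be run in a scratch database.
CREATE TABLE departments (department_id INT PRIMARY KEY, department_name VARCHAR(50));
CREATE TABLE employees (employee_id INT PRIMARY KEY, department_id INT);
INSERT INTO departments VALUES (10, 'Sales'), (20, 'IT'), (30, 'HR');
INSERT INTO employees VALUES (1, 10), (2, 10), (3, 20), (4, 20), (5, 30);

SELECT d.department_name, COUNT(e.employee_id) AS employee_count
FROM departments d
JOIN employees e ON e.department_id = d.department_id
GROUP BY d.department_id, d.department_name
HAVING COUNT(e.employee_id) = (
    SELECT MAX(cnt)
    FROM (
        SELECT COUNT(*) AS cnt
        FROM employees
        WHERE department_id IS NOT NULL
        GROUP BY department_id
    ) per_department   -- the derived table must have an alias
);
```

With the sample rows, both Sales and IT are returned, each with two employees.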
I'm getting a problem when trying to run this query:
Select
c.cname as custName,
count(distinct o.orderID) as No_of_orders,
avg(count(distinct o.orderID)) as avg_order_amt
From Customer c
Inner Join Order_ o
On o.customerID = c.customerID
Group by cname;
This is an error message: #1111 (HY000) - Invalid use of group function
I just want to select each customer, find how many orders each customer has, and average the total number of orders for each customer. I think it might have a problem with too many aggregates in query.
The issue is that you need to have two separate groupings if you want to calculate the average over a count, so this expression isn't valid:
avg(count(distinct o.orderID))
Now it's hard to understand exactly what you mean, but it sounds as if you just want to use avg(o.amount) instead.
[edit] I see your addition now, so while the error is still the same, the solution will be slightly more complex. The last value you need, the average number of orders per customer, is not a value to calculate per customer. You'd need analytical functions for that, but that might be quite tricky in MySQL. I'd recommend writing a separate query for that; otherwise you would have a very complex query that returns the same number for each row anyway.
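For what it's worth, MySQL 8.0 added common table expressions and window functions, which make the two groupings fairly compact. The following is a sketch assuming that version and the table and column names from the question; the sample rows and the order_counts and avg_orders_per_customer names are made up for illustration.

```sql
-- Hypothetical sample data, only so the sketch can be run in a scratch database.
CREATE TABLE Customer (customerID INT PRIMARY KEY, cname VARCHAR(50));
CREATE TABLE Order_ (orderID INT PRIMARY KEY, customerID INT);
INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Order_ VALUES (100, 1), (101, 1), (102, 1), (103, 2);

-- First grouping: one row per customer with its order count.
WITH order_counts AS (
    SELECT c.customerID, c.cname, COUNT(DISTINCT o.orderID) AS no_of_orders
    FROM Customer c
    JOIN Order_ o ON o.customerID = c.customerID
    GROUP BY c.customerID, c.cname
)
-- Second "grouping": an average over all of those rows, repeated on each row.
SELECT cname AS custName,
       no_of_orders,
       AVG(no_of_orders) OVER () AS avg_orders_per_customer
FROM order_counts;
```

As noted, the average is the same number on every row, so a separate query remains a perfectly reasonable choice.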
select c.cname, o.customerID, count(*), avg(order_total)
from Order_ o join Customer c using(customerID)
group by 1,2
This will calculate the number of orders and average order total (substitute the real column name for order_total) for each customer.
how many orders each customer has,
average the total number of orders.
SELECT
c1.cname AS custName,
c1.No_of_orders,
c2.avg_order_amt
FROM (
SELECT
c.customerID,
c.cname,
COUNT(DISTINCT o.orderID) AS No_of_orders
FROM
Customer c
JOIN Order_ o ON o.customerID = c.customerID
GROUP BY c.customerID, c.cname
) c1
CROSS JOIN (SELECT AVG(No_of_orders) AS avg_order_amt FROM (
SELECT
c.customerID,
COUNT(DISTINCT o.orderID) AS No_of_orders
FROM
Customer c
JOIN Order_ o ON o.customerID = c.customerID
GROUP BY c.customerID
) counts) c2