SQL Subtract two different averages in one single column - mysql

I have a database for an airport.
I have one table named "employees" and another table named "certified".
The "employees" table holds the names of all employees (pilots and other staff), the staff code
named "eid", and their salaries.
The "certified" table lists the pilots and specifies which aircraft each of them can fly.
WHAT I WANT: I want the difference between the average salary of pilots and the average salary of all employees (pilots and other staff).
I use this code:
Select AVG(salary) - (Select AVG(salary) from employees) as subtract
from employees as e,
certified as c
where e.eid = c.eid
but I have a problem:
As you can see in the "certified" table, the "eid" of some pilots is repeated, which skews the average.
How can I count each salary only once in the calculation?
Thank you in advance.

You can try something like the following, using the Switch function in MS Access:
select
AVG(salary) - AVG( switch(IsPilot =1,salary,IsPilot=0,NULL)) as substract
from
(
select
distinct
e.eid,
e.salary,
Switch( c.eid is not null, 1 ,c.eid is NULL, 0) as IsPilot
from
employees e left join certified c
on e.eid=c.eid )T
The corresponding statement in MySQL is
select
AVG(salary) - AVG( case when IsPilot =1 then salary when IsPilot=0 then NULL end) as substract
from
(
select
distinct
e.eid,
e.salary,
case when c.eid is not null then 1 when c.eid is NULL then 0 end as IsPilot
from
employees e left join certified c
on e.eid=c.eid )T

Use conditional aggregation:
SELECT AVG(e.salary) - AVG(CASE WHEN c.eid IS NOT NULL THEN e.salary END) as difference
FROM employees AS e
LEFT JOIN ( SELECT DISTINCT eid
FROM certified ) AS c ON e.eid=c.eid
The first AVG calculates the average salary over all rows; the second only over those rows (those employees) that have matching row(s) in the certified table (all other rows yield NULL from the CASE, so they are not counted).
The query assumes that the eid column is unique (perhaps the primary key) in the employees table.
PS. The two AVGs in the subtraction may need to be swapped, depending on which difference you want.
In MS Access this code may be
SELECT AVG(e.salary) - AVG(IIF(c.eid IS NOT NULL, e.salary, NULL)) as difference
FROM employees AS e
LEFT JOIN ( SELECT DISTINCT eid
FROM certified ) AS c ON e.eid=c.eid
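To see why the duplicated eid values matter, here is a minimal sketch that runs the conditional-aggregation query against made-up data. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable anywhere); the sample rows and the `aid` column are hypothetical.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (eid INTEGER PRIMARY KEY, salary REAL);
    CREATE TABLE certified (eid INTEGER, aid INTEGER);
    INSERT INTO employees VALUES (1, 100), (2, 200), (3, 300);
    -- Pilot 1 is certified for three aircraft, pilot 2 for one.
    INSERT INTO certified VALUES (1, 10), (1, 11), (1, 12), (2, 10);
""")

# Plain join: pilot 1's salary is counted three times.
naive_pilot_avg = conn.execute("""
    SELECT AVG(e.salary)
    FROM employees AS e
    JOIN certified AS c ON e.eid = c.eid
""").fetchone()[0]

# Conditional aggregation over de-duplicated pilot ids.
overall_avg, pilot_avg = conn.execute("""
    SELECT AVG(e.salary),
           AVG(CASE WHEN c.eid IS NOT NULL THEN e.salary END)
    FROM employees AS e
    LEFT JOIN (SELECT DISTINCT eid FROM certified) AS c ON e.eid = c.eid
""").fetchone()

difference = overall_avg - pilot_avg
print(naive_pilot_avg, pilot_avg, difference)
```

With this data the plain join reports a pilot average of 125, while the de-duplicated version reports 150 (pilots 1 and 2 counted once each), so the difference from the overall average of 200 is 50.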

Related

Reduce number of subqueries

I've 2 tables emp and expenditure.
Emp:
ID, NAME
Expenditure:
ID, EMP_ID, AMOUNT
Each emp has a limit of 100 that he/she can spend. We want to check which emp has expenditure > 100.
Output attributes needed: Emp name, exp id, amount
My query:
SELECT E.NAME,
EXP.ID,
EXP.AMOUNT
FROM EMP E
INNER JOIN expenditure EXP ON E.ID = EXP.EMP_ID
WHERE E.ID in
(SELECT EMP_ID
FROM
(SELECT EMP_ID,
SUM(AMOUNT) AS TOTAL
FROM expenditure
GROUP BY EMP_ID
HAVING SUM(AMOUNT) > 100.00
ORDER BY TOTAL DESC) SUBQ)
ORDER BY EXP.AMOUNT desc;
Is it possible to optimize this?
Just like code, SQL queries can be written in many different ways. Run an execution plan on your SQL.
Below is a more concise version, although it may not be any better optimised than your current code. Use execution plans to analyse performance.
SELECT E.NAME,
E.ID, -- THIS IS EMPLOYEE ID NOT EXPENDITURE ID
EXP.EMP_SPENT
FROM EMP E
JOIN (SELECT EMP_ID, sum(AMOUNT) as EMP_SPENT FROM expenditure GROUP BY EMP_ID) EXP
ON E.ID = EXP.EMP_ID
WHERE EXP.EMP_SPENT > 100;
Additionally...
I noticed that the question is a bit confusing. The original query spits out "every line" of expense for each employee whose "total" expenses are above 100. The text, however, asks "which employee has gone above 100". These are two different questions. My answer above is for the latter, "who has gone above". It will NOT list all expenses of employees who have spent over 100, only the employees who have gone above 100 and their total expenditure.
You can use a simple aggregation with a HAVING clause, such as
SELECT e.name, e.id, SUM(amount) AS total
FROM emp e
JOIN expenditure ep
ON e.id = ep.emp_id
GROUP BY e.name, e.id
HAVING SUM(amount) > 100
but it's not logical to have a non-aggregated column alongside aggregated ones in the result.
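The two readings of the question (totals per employee versus every expense line of an over-limit employee) can be compared side by side. This is a rough sketch with invented rows, run through Python's sqlite3 module as a stand-in for the real database; the names `totals` and `lines` and the sample data are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE expenditure (id INTEGER PRIMARY KEY, emp_id INTEGER, amount REAL);
    INSERT INTO emp VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO expenditure VALUES (1, 1, 60), (2, 1, 70), (3, 2, 40);
""")

# Reading 1: who went over the limit, with their total.
totals = conn.execute("""
    SELECT e.name, e.id, SUM(ep.amount) AS total
    FROM emp e
    JOIN expenditure ep ON e.id = ep.emp_id
    GROUP BY e.name, e.id
    HAVING SUM(ep.amount) > 100
""").fetchall()

# Reading 2: every expense line of the employees who went over the limit.
lines = conn.execute("""
    SELECT e.name, ep.id, ep.amount
    FROM emp e
    JOIN expenditure ep ON e.id = ep.emp_id
    JOIN (SELECT emp_id
          FROM expenditure
          GROUP BY emp_id
          HAVING SUM(amount) > 100) o ON o.emp_id = e.id
    ORDER BY ep.amount DESC
""").fetchall()

print(totals)
print(lines)
```

The second query replaces the nested IN subqueries of the original with a single derived table, which is one way to cut the subquery count while still returning each expense row.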

Find max salary and name of employee, if multiple records then print all

I want to print the name and salary amount of the employee who has the highest salary. So far so good, but if there are multiple such records, then print all of them. There are two tables given:
EMPLOYEE TABLE :-
SALARY TABLE:-
my query is: -
SELECT E.NAME, S.AMOUNT
FROM `salary` S,
employee E
WHERE S.EMPLOYEE_ID = E.ID
and S.AMOUNT = (SELECT max(`AMOUNT`)
FROM `salary`)
is there any better way to find out the solution ?
What you're trying to achieve is "with ties" functionality. Unfortunately MySQL doesn't support that (the docs show nothing that can be added to the "LIMIT" part of the query), so you have no option other than finding the max salary first and filtering the records afterwards.
So, your solution is fine for that case.
Alternatively, if you're on version 8 or newer, you may move the subquery to the WITH clause
with max_sal as (
select max(amount) ms from salary
)
SELECT E.NAME, S.AMOUNT
FROM salary S
JOIN employee E
ON S.EMPLOYEE_ID = E.ID
JOIN max_sal ms
ON S.AMOUNT = ms.ms
or search for it in the join directly
SELECT E.NAME, S.AMOUNT
FROM salary S
JOIN employee E
ON S.EMPLOYEE_ID = E.ID
JOIN (select max(amount) ms from salary) ms
ON S.AMOUNT = ms.ms
But I'm sure it won't get you any better performance
I like solving them with a join:
WITH M AS (SELECT MAX(amount) AS amount FROM salary)
SELECT E.NAME, S.AMOUNT
FROM M
JOIN salary S USING (AMOUNT)
JOIN employee E ON E.ID = S.EMPLOYEE_ID
but your solution is perfectly fine.
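For what it's worth, here is a small sketch of the "find the max first, then filter" idea on made-up data, contrasted with ORDER BY ... LIMIT 1, which drops ties. It runs on Python's sqlite3 module as a stand-in for MySQL 8; the rows and variable names are hypothetical.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8 here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE salary (id INTEGER PRIMARY KEY, employee_id INTEGER, amount INTEGER);
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Bob and Cy tie for the highest salary.
    INSERT INTO salary VALUES (1, 1, 5000), (2, 2, 7000), (3, 3, 7000);
""")

# Max computed once in a CTE, then used as a join filter: keeps all ties.
top_earners = conn.execute("""
    WITH max_sal AS (SELECT MAX(amount) AS ms FROM salary)
    SELECT e.name, s.amount
    FROM salary s
    JOIN employee e ON s.employee_id = e.id
    JOIN max_sal m ON s.amount = m.ms
    ORDER BY e.name
""").fetchall()

# ORDER BY ... LIMIT 1 returns a single row even when salaries tie.
limit_one = conn.execute("""
    SELECT e.name, s.amount
    FROM salary s
    JOIN employee e ON s.employee_id = e.id
    ORDER BY s.amount DESC
    LIMIT 1
""").fetchall()

print(top_earners)
print(limit_one)
```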

Left Join or Group By for Finding Maximum Salary in Mysql

I am trying to understand why it is a popular belief that avoiding a GROUP BY is always beneficial. My problem statement is: from an employee table where department_id is a foreign key, find the departments where the maximum employee salary is 40000.
1 the group by approach :
select d.department_name , e.max_salary
from department d
join ( select department_id, max(salary) as max_salary
from emp
group by 1
having max_salary = 40000 ) e
on (d.department_id = e.department_id)
2 Now the left join approach :
select d.department_name, inner_q.salary
from department d
join
(select e.department_id , e.salary
from emp e
left join emp e_inner
on (e.department_id = e_inner.department_id and e.salary < e_inner.salary)
where e_inner.department_id is null and e.salary = 40000 ) inner_q
on (d.department_id = inner_q.department_id)
Unfortunately explain plan does not make much sense to me. Any help in explaining which one should perform better and why would be much appreciated.
You are working too hard.
SELECT department_id, MAX(salary) AS max_salary
FROM emp
GROUP BY department_id
HAVING max_salary >= 40000
That will be faster than any version with subqueries. (Per the question, emp holds department_id rather than department_name; join to department if you need the name.)
This will make it run faster: INDEX(department_id, salary)
(Perhaps you want >= 40000, not = 40000?)
This version will make a single pass over the entire table (or INDEX, if you add that "covering" index), gathering the max salary for each department. Then it will throw away results that fail the HAVING clause; delivering the rest.
I would have no qualms about running this GROUP BY on a table of 10K rows. A million-row table would take a noticeable, but small, amount of time.
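Performance aside, the two approaches from the question do not always return the same rows. The sketch below runs both on invented data (Python's sqlite3 module standing in for MySQL; the rows and the `emp_id` column are assumptions) where two employees in one department tie at 40000.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, department_id INTEGER, salary INTEGER);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'IT');
    -- Two Sales employees tie at 40000; IT tops out at 50000.
    INSERT INTO emp VALUES (1, 1, 40000), (2, 1, 40000), (3, 1, 30000),
                           (4, 2, 50000), (5, 2, 40000);
""")

# Approach 1: GROUP BY with HAVING, one row per department.
group_by_rows = conn.execute("""
    SELECT d.department_name, e.max_salary
    FROM department d
    JOIN (SELECT department_id, MAX(salary) AS max_salary
          FROM emp
          GROUP BY department_id
          HAVING MAX(salary) = 40000) e
      ON d.department_id = e.department_id
""").fetchall()

# Approach 2: self LEFT JOIN keeping rows with no higher-paid colleague.
left_join_rows = conn.execute("""
    SELECT d.department_name, q.salary
    FROM department d
    JOIN (SELECT e.department_id, e.salary
          FROM emp e
          LEFT JOIN emp e_inner
            ON e.department_id = e_inner.department_id
           AND e.salary < e_inner.salary
          WHERE e_inner.department_id IS NULL
            AND e.salary = 40000) q
      ON d.department_id = q.department_id
""").fetchall()

print(group_by_rows)
print(left_join_rows)
```

The GROUP BY version yields Sales once, while the LEFT JOIN version yields Sales twice, once per tied employee, so the latter would need a DISTINCT to match.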

Mysql query to count data from multiple tables and display 0 for null values

My tables are:
Department(Dept_id,Dept_name#)
Employee(Emp_id#,Emp_Name,Address,Phone,Email,Dept_name)
From the above tables to show the following details (Dept_Id,Dept_name,Total Employees).
I use the following query:
SELECT dept_id,department.dept_name,count(emp_id)"Total"
FROM department,employee_details
WHERE department.dept_name=employee_details.dept_name
GROUP BY dept_id;
With the above query I don't get every dept_name; I only get the dept_name and dept_id of departments whose emp_id values are counted. How can I get all dept_name and dept_id values, with the result shown as 0 where Count(emp_id)=0?
Use LEFT JOIN instead of implicit INNER JOIN and add department.dept_name in your GROUP BY clause.
SELECT d.dept_id
, d.dept_name
, COUNT(ed.emp_id) AS Total
FROM department d
LEFT JOIN employee_details ed ON d.dept_name = ed.dept_name
GROUP BY d.dept_id
, d.dept_name
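A quick way to convince yourself that the LEFT JOIN yields 0 for empty departments is to run it on a few made-up rows. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the data is hypothetical. It also shows why COUNT(ed.emp_id) is used rather than COUNT(*).

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee_details (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_name TEXT);
    INSERT INTO department VALUES (1, 'HR'), (2, 'IT'), (3, 'Legal');
    -- Legal has no employees.
    INSERT INTO employee_details VALUES (1, 'Ann', 'HR'), (2, 'Bob', 'HR'), (3, 'Cy', 'IT');
""")

# COUNT(column) skips the NULLs produced by the LEFT JOIN, giving 0.
counts = conn.execute("""
    SELECT d.dept_id, d.dept_name, COUNT(ed.emp_id) AS Total
    FROM department d
    LEFT JOIN employee_details ed ON d.dept_name = ed.dept_name
    GROUP BY d.dept_id, d.dept_name
    ORDER BY d.dept_id
""").fetchall()

# COUNT(*) counts the NULL-padded row too, so empty departments show 1.
star_counts = conn.execute("""
    SELECT d.dept_id, d.dept_name, COUNT(*) AS Total
    FROM department d
    LEFT JOIN employee_details ed ON d.dept_name = ed.dept_name
    GROUP BY d.dept_id, d.dept_name
    ORDER BY d.dept_id
""").fetchall()

print(counts)
print(star_counts)
```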

SELECT MAX from Counting SQL

I have a problem with extracting data from db. My code looks like:
SELECT MAX(maximum)as Number FROM
(
SELECT department_name, COUNT(employees.employee_id) AS maximum
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY department_name
)t
and the result is:
Number
1 46
and this is the number of maximum employees in one of the departments.
The problem is that I want an additional column with the name of the department in which there are 46 employees. I tried something like:
select department_name, count(employees.employee_id)
from departments, employees
where departments.department_id=employees.department_id
group by department_name
having count(employees.employee_id) =
( SELECT MAX(maxx)FROM
(SELECT department_name, COUNT(employees.employee_id) AS maxx
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY department_name
);
but it doesn't work. Please help!
try this
SELECT t.department_name, MAX(t.maximum) as Number FROM (
SELECT
department_name,
COUNT(employees.employee_id) AS maximum
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY department_name
) t
How about this: just order your results by maximum in descending order with LIMIT 1. Also, use modern join syntax:
SELECT d.department_name,
COUNT(e.employee_id) AS maximum
FROM departments d
LEFT JOIN employees e
ON (d.department_id=e.department_id)
GROUP BY d.department_name
ORDER BY maximum DESC
LIMIT 1
This will give you the name of the department that has the highest count of employees.
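Here is a small sketch of the ORDER BY ... LIMIT 1 approach on invented data, together with a HAVING variant that would keep every department if several share the top head-count. It runs on Python's sqlite3 module as a stand-in for the real database; the rows and variable names are assumptions.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'Shipping'), (2, 'Sales'), (3, 'IT');
    INSERT INTO employees VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3);
""")

# Largest department via ORDER BY ... LIMIT 1 (returns one row only).
top = conn.execute("""
    SELECT d.department_name, COUNT(e.employee_id) AS maximum
    FROM departments d
    LEFT JOIN employees e ON d.department_id = e.department_id
    GROUP BY d.department_name
    ORDER BY maximum DESC
    LIMIT 1
""").fetchone()

# Tie-safe variant: keep every department whose count equals the max count.
all_top = conn.execute("""
    SELECT d.department_name, COUNT(e.employee_id) AS maximum
    FROM departments d
    JOIN employees e ON d.department_id = e.department_id
    GROUP BY d.department_name
    HAVING COUNT(e.employee_id) = (
        SELECT MAX(n)
        FROM (SELECT COUNT(*) AS n FROM employees GROUP BY department_id) t
    )
""").fetchall()

print(top)
print(all_top)
```

Note that the derived table in the HAVING variant carries an alias (`t`), which MySQL requires for every derived table.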
You could also tack an additional JOIN onto the first query you posted, like this (TOP 1 is SQL Server syntax; in MySQL, drop it and add LIMIT 1 at the end instead):
SELECT TOP 1 d.department_name, MAX(maximum) as Number FROM
(
SELECT departments.department_id AS department_id, COUNT(employees.employee_id) AS maximum
FROM departments, employees
WHERE departments.department_id=employees.department_id
GROUP BY departments.department_id
)t
JOIN departments d ON d.department_id = t.department_id
GROUP BY d.department_name
ORDER BY Number DESC
I tried this with a similar data model in my domain and it works quite nicely.