Left outer join vs subquery to include departments with no employees

Left outer join vs subquery to include departments with no employees - mysql

Lets say I have the following database model:
And the question is as follows:
List ALL department names and the total number of employees in the department. The total number of employees column should be renamed as "total_emps". Order the list from the department with the least number of employees to the most number of employees. Note: You need to include a department in the list even when the department does not currently have any employee assigned to it.
This was my attempt:
SELECT Department.deptname
(SELECT COUNT(*)
FROM Department
WHERE Department.empno = Employee.empno ) AS total_emps
FROM Department
I'm pretty sure my solution is not correct as it won't include departments with no employees. How do you use a left inner join to solve this problem?

The query as you were trying to write it is:
(table creates modified from shree.pat18's sqlfiddle to this sqlfiddle)
create table department (deptno int, deptname varchar(20));
insert into department values (1, 'a'),(2, 'b'),(3, 'c');
create table employee (empno int, deptno int);
insert into employee values (1,1),(2,1),(3,3);
SELECT d.deptname,
(SELECT COUNT(*)
FROM EMPLOYEE e
WHERE d.deptno = e.deptno ) AS total_emps
FROM DEPARTMENT d
ORDER BY total_emps ASC;
(You were counting from DEPARTMENT instead of EMPLOYEE and comparing empno instead of deptno. And you left out a comma.)
(You were asked for every department's name and employee count so this returns that. In practice we would include a presumably unique deptno if deptname was not unique.)
I'm pretty sure my solution is not correct as it won't include
departments with no employees.
Even your answer's version of the query (with the missing comma added) has an outer select that returns a count for every department no matter what the subselect returns. So I don't know why/how you thought it wouldn't.
If you want to use LEFT (OUTER) JOIN then the DEPARTMENT rows with no employees get extended by NULL. But COUNT of a column only counts non-NULL rows.
SELECT d.deptname, COUNT(e.empno) AS total_emps
FROM DEPARTMENT d
LEFT JOIN EMPLOYEE e
ON d.deptno = e.deptno
GROUP BY d.deptno
ORDER BY total_emps ASC;
(Nb the LEFT JOIN version uses more concepts: LEFT JOIN extending by NULL, GROUP BY, and COUNT's NULL behaviour for non-*.)

First off, it's a left outer join. Now, for your query, you want to join the 2 tables based on deptno, then also group by deptno (or deptname, since that is as likely to be unique) to ensure that any aggregation we do is done for each unique department in the table. Finally, the counting is done with the count function, leading to this query:
select d.deptname, count(e.empno) as total_emps
from department d
left join employee e on d.deptno = e.deptno
group by d.deptname
SQLFiddle
Note that since we want all records from department regardless of whether there are matching records in employee or not, department must appear at the left side of the join. We could have done the same thing using a right outer join by swapping the positions of the 2 tables in the join.

Related

Having trouble a query and specifically with joins

The code below is completely wrong and does not work at all. Im basically trying to look through my tables and compile a list of DeptName and the total student number for a department where a department has more than 40 students.
Im confused about joins in general and if someone could explain and show where im going wrong. im sure there is also other problems so any help with them would help
So basically one department is connected to one module, and a student is enrolled in a module. A student cannot take a module outside of their department. So each student should have one module that connects to one department
All of the ID fields in other tables are foreign keys as you can guess and changing the tables is not what I want to do here I just want to do this query as this stands
Relevant tables columns
Table Department DeptID, DeptName, Faculty, Address
Table Modules ModuleID, ModuleName, DeptID, Programme
Table Students StudentID,StudentName,DoB,Address,StudyType,`
Table Enrolments EID,StudentID,ModuleID,Semester,Year
SELECT Department.DeptName, COUNT(Student.StudentID) AS 'No of Students' FROM Department LEFT JOIN Module ON Department.DeptID= Module.DeptID LEFT JOIN Enrolment ON Module.ModuleID= Enrolment.StudentID LEFT JOIN Student.StudentID
GROUP BY(Department.DeptID)
HAVING COUNT(Student.StudentID)>=40
I have not included every table here as there are quite a lot.
But unless i've got this completely wrong you don't need to access a ModuleID in a staff table for the module they teach or something not relevant to this at all. As no student or Dept details are in there.
If that is the case i will fix it very quickly.

SELECT Department.DeptName, COUNT(Student.StudentID) AS 'No of Students'
FROM Department
LEFT JOIN Module
ON Department.DeptID= Module.DeptID
LEFT JOIN Enrolment
-- problem #1:
ON Module.ModuleID= Enrolment.StudentID
-- problem #2:
LEFT JOIN Student.StudentID
-- problem #3:
GROUP BY(Department.DeptID)
HAVING COUNT(Student.StudentID)>=40
You're joining these two tables using the wrong field. Generally when the modeling is done correctly, you should use USING instead of ON for joins
The right side of any JOIN operator has to be a table, not a column.
You have to group by every column in the select clause that is not part of an aggregate function like COUNT. I recommend that you select the DeptID instead of the name, then use the result of this query to look up the name in a subsequent select.
Note : Following code is untested.
WITH bigDepts AS (
SELECT DeptId, COUNT(StudentID) AS StudentCount
FROM Department
JOIN Module
USING ( DeptID )
JOIN Enrolment
USING ( ModuleID )
JOIN Student
USING ( StudentID )
GROUP BY DeptID
HAVING COUNT(StudentID)>=40
)
SELECT DeptID, DeptName, StudentCount
FROM Department
JOIN bigDepts
USING ( DeptID )

Instead of left join you need to use inner join since you need to select related rows only from those three tables.
Groupy by and having clause seems fine. Since you need departments with more than 40 students instead of >= please use COUNT(e.StudentID)>40
SELECT d.DeptName, COUNT(e.StudentID) AS 'No of Students' FROM Department d INNER JOIN Module m ON d.DeptID= m.DeptID inner JOIN Enrolment e ON m.ModuleID= e.StudentID LEFT JOIN Student.StudentID
GROUP BY(d.DeptName)
HAVING COUNT(e.StudentID)>40

So your join clause was a bit iffy to students as you wrote it, and presumably these should all be inner joins.
I've reformatted your query using aliases to make it easier to read.
Since you're counting the number of rows per DeptName you can simply do count(*), likewise in your having you are after counts greater than 40 only. Without seeing your schemas and data it's not possible to know if you might have duplicate Students, if that's the case and you want distinct students count can amend to count(distinct s.studentId)
select d.DeptName, Count(*) as 'No of Students'
from Department d
join Module m on m.DeptId=d.DeptId
join Enrolment e on e.StudentId=m.ModuleId
join Students s on s.StudentId=e.studentId
group by(d.DeptName)
having Count(*)>40
Also, looking at your join conditions, is the Enrolement table relevant?
select d.DeptName, Count(*) as 'No of Students'
from Department d
join Module m on m.DeptId=d.DeptId
join Students s on s.StudentId=m.moduleId
group by(d.DeptName)
having Count(*)>40

Trying to get a row count in a subquery

I have two tables, one is departments and the other is employees. The department id is a foreign key in the employees table. The employee table has a name and a flag saying if the person is part-time. I can have zero or more employees in a department. I'm trying to figure out out to get a list of all departments where a department has at least one employee and if it does have at least one employee, that all the employees are part time. I think this has to be some kind of subquery to get this. Here's what I have so far:
SELECT dept.name
,dept.id
,employee.deptid
,count(employee.is_parttime)
FROM employee
,dept
WHERE dept.id = employee.deptid
AND employee.is_parttime = 1
GROUP BY employee.is_parttime
I would really appreciate any help at this point.

You must join (properly) the tables and group by department with a condition in the HAVING clause:
select d.name, d.id, count(e.id) total
from dept d inner join employee e
on d.id = e.deptid
group by d.name, d.id
having total = sum(e.is_parttime)
The inner join returns only departments with at least 1 employee.
The column is_parttime (I guess) is a flag with values 0 or 1 so by summing it the result is the number of employees that are part time in the department and this number is compared to the total number of employees of the department.

As a preliminary aside, I recommend expressing joins with the JOIN keyword, and segregating join conditions from filter conditions. Doing so would make the original query look like so:
select dept.name, dept.id, employee.deptid, count(employee.is_parttime)
from employee
join dept on dept.id = employee.deptid
where employee.is_parttime = 1
group by employee.is_parttime
It doesn't make much practical difference for inner joins, but it does make the structure of the data and the logic of the query a bit clearer. On the other hand, it does make a difference for outer joins, and there is value in consistency.
As for the actual question, yes, one can rewrite the original query using a subquery or an inline view to produce the requested result. (An "inline view" is technically what one should call an embedded query used as a table in the FROM clause, but some people lump these in with subqueries.)
Example using a subquery
select dept.name, dept.id
from dept
where dept.id in (
select deptid
from employee
group by deptid
having count(*) == sum(is_parttime)
)
Example using an inline view
select dept.name, dept.id
from dept
join (
select deptid
from employee
group by deptid
having count(*) == sum(is_parttime)
) pt_dept
on dept.id = pt_dept.deptid
In each case, the subquery / inline view does most of the work. It aggregates employees by department, then filters the groups (HAVING clause) to select only those in which the part-time employee count is the same as the total count. Naturally, departments without any employees will not be represented. If a list of department IDs would suffice for a list of departments, then that's actually all you need. To get the department names too, however, you need to combine that with data from the dept table, as demonstrated in the two example queries.

JOIN subquery and different reault

I was writing an exercise about "Write a query to find the names (first_name, last_name) of the employees who are not supervisors"
I write it on my own and when i check the result or both, mine has less rows than the other.
I was using the JOIN function and the other doesn't.
I want help to know why two results are so different.
Thanks
the one i use join
SELECT
first_name, last_name
FROM
employees AS E
JOIN
departments AS D ON E.department_id = D.department_id
WHERE
NOT EXISTS( SELECT
0
FROM
departments
WHERE
E.manager_id = D.manager_id)
order by last_name;
the one doesn't use join
SELECT
b.first_name, b.last_name
FROM
employees b
WHERE
NOT EXISTS( SELECT
0
FROM
employees a
WHERE
a.manager_id = b.employee_id);

The big problem is that the NOT EXISTS subquery is referencing rows from the joined table. The manager_id column is qualified with D., and that's a reference to the joined table, not the table in the FROM clause of the subquery.
E.manager_id = D.manager_id
^^^^^^^ ^
We also suspect that an employee's supervisor is recorded in the employee row, as a reference to another row in the the employee table. But we don't have the schema definition or any example data, so we're just guessing.
It seems like there would be a supervisor_id in the employee table...
SELECT e.first_name
, e.last_name
, e.department_id
, d.department_id
FROM employees e
WHERE NOT EXISTS
( SELECT 1
FROM employees s
WHERE s.id = e.supervisor_id
)
ORDER
BY e.last_name
, e.first_name
It's also possible that some rows in employee have a value in the department_id column that don't have a matching row in the department table. If there is no matching row in department, the inner join will prevent the row from employee from being returned.
We can use an outer join when we want to return rows even when no matching row is found in the joined table. If we want to involve the departments table, because a "supervisor" is defined to be an employee that is the manager of a department, we can employ an anti-join pattern...
SELECT e.first_name
, e.last_name
FROM employees e
LEFT
JOIN departments d
ON d.manager_id = e.employee_id
WHERE d.manager_id IS NULL
ORDER
BY e.last_name
, e.first_name
Again, without a schema and some sample data, we're just guessing.

This query can be written using only Employees table, but in your query you have joined the employees table with department table. A query should be written with the minimal amount of tables that suffice your expected output, joining unnecessary tables may result in wrong out puts.
In this case you are joining Employees with department here what if there is no Department_ID in employee table for some employees, so those data will be dropped in the join and result won't be the expected.

SQL Complex Query w/ Inner Join

I have 2 tables, 1 called Employee and 1 called Salary. Employee table consists of Emp_Name, Emp_Address, Emp_ID & Salary table consists of Salary_Details and Emp_ID. > Can you write down a query for retrieving the Salary_Details of 1 of the employee based on last name using Inner Join?

I am not sure what you are looking for, but this might help you:
SELECT * FROM Employee e
INNER JOIN Salary s ON e.Emp_ID = s.Emp_ID
WHERE e.Emp_Name = 'EMPLOYEENAME'
That will give you back all fields from Employee and Salary for an Employee with the name = 'EMPLOYEENAME' (which you can exchange then).

You can adjust the columns returned as needed depending on your app...
SELECT e.Emp_Name, e.Emp_ID, s.Salary_Details
FROM Employee e
INNER JOIN Salary s USING (Emp_ID)
WHERE e.Emp_Name = 'Smith';
The USING keyword is kind of obscure and works only if the join column is named identically in both tables. The previous answer with ON instead of USING will work in all cases. I like USING as a personal preference.

Select data from one table which is connected to another 2 tables

I have 3 tables: Emplyees, Jobs and Departments
What I'm trying to achieve is to get the number of emplyees from one department.
I tried something:
SELECT count(Emplyees.id) FROM Emplyees
INNER JOIN Job ON (Job.id = Emplyees.job_id)
INNER JOIN Department ON (Department.id = 2)
but it returns the number of emplyees from all departments.
Any advice please?

An EXISTS clause will allow you to limit by the existence of something without having to worry about whether or not an employee also has other jobs, which will keep your count easy to figure.
Also, since the only thing you need from the department is the id, you can leave that table out and just filter by the dept_id field of the Job table.
SELECT count(id)
FROM Employees
WHERE EXISTS (
SELECT 1
FROM Job
WHERE id = Employees.job_id
AND dept_id = 2
)

Use WHERE clause to filter out department,where clause will apply for the whole result set returned by joins,while condition in on clause will filter result from joined table only
SELECT count(e.id)
FROM Emplyees e
INNER JOIN Job j ON (j.id = e.job_id)
INNER JOIN Department d ON (j.dept_id =d.id )
WHERE d.id = 2
And also use DISTINCT in count so if any employee has applied on multiple jobs that belong to same department will be counted as 1 i.e COUNT(DISTINCT e.id)

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008