Making sense of Inner-Join with Sub-query in SQL Server 2008

I have 2 tables called 'table123' and 'table246'.
'table123' columns: 'ID', 'Dept_ID', 'First_Name', 'Surname', 'Salary', 'Address'.
'table246' columns: 'Dept_ID', 'Dept_Name'.
I want to find the Average Salary for each 'Dept_Name'. So I tried using the query below, which is an Equi-Join with a Sub-query:
SELECT Dept_Name, alt.Avg_Salary AS Avg_Salary
FROM table123 a, table246 b,
(SELECT Dept_ID, AVG(Salary) Avg_Salary
FROM table123
GROUP BY Dept_ID) alt
WHERE a.Dept_ID = alt.Dept_ID
AND a.Salary = alt.Avg_Salary
AND a.Dept_ID = b.Dept_ID;
However, when I run the above query, it gives the desired 2 column names 'Dept_Name' and 'Avg_Salary', but with no data in it (just a blank table).
What am I doing wrong in the code, which is causing this blank result table?
Also, is there an alternative method of getting the same result using an Inner-Join? The Equi-Join is quite confusing.

Never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
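Your query returns no rows because of the salary equality condition in the WHERE clause: it keeps only employees whose salary is exactly equal to their department's average, and apparently no one's is. Based on what you are selecting, the subquery on its own is the query you want: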
SELECT Dept_ID, AVG(Salary) as Avg_Salary
FROM table123
GROUP BY Dept_ID;
Presumably, the other table brings in the name, so:
SELECT b.Dept_Name, AVG(a.Salary) as Avg_Salary
FROM table123 a JOIN
table246 b
ON a.Dept_ID = b.Dept_Id
GROUP BY b.Dept_Name;
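To see both behaviours side by side, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows and column types are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table246 (Dept_ID INTEGER PRIMARY KEY, Dept_Name TEXT);
CREATE TABLE table123 (ID INTEGER PRIMARY KEY, Dept_ID INTEGER,
                       First_Name TEXT, Salary REAL);
INSERT INTO table246 VALUES (1, 'Sales'), (2, 'IT');
-- Hypothetical rows: nobody earns exactly the department average.
INSERT INTO table123 VALUES
    (1, 1, 'Ann', 1000), (2, 1, 'Bob', 2000),
    (3, 2, 'Cy', 3000), (4, 2, 'Di', 4000), (5, 2, 'Ed', 8000);
""")

# Same logic as the question's query: the salary filter discards every row.
filtered = conn.execute("""
SELECT b.Dept_Name, alt.Avg_Salary
FROM table123 a
JOIN table246 b ON a.Dept_ID = b.Dept_ID
JOIN (SELECT Dept_ID, AVG(Salary) AS Avg_Salary
      FROM table123
      GROUP BY Dept_ID) alt ON a.Dept_ID = alt.Dept_ID
WHERE a.Salary = alt.Avg_Salary
""").fetchall()

# Join first, then aggregate: one row per department name.
averages = conn.execute("""
SELECT b.Dept_Name, AVG(a.Salary) AS Avg_Salary
FROM table123 a
JOIN table246 b ON a.Dept_ID = b.Dept_ID
GROUP BY b.Dept_Name
ORDER BY b.Dept_Name
""").fetchall()

print(filtered)   # []
print(averages)   # [('IT', 5000.0), ('Sales', 1500.0)]
```

With this data the department averages are 1500 and 5000, and no employee earns exactly either amount, which is why the filtered version comes back empty.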

What about:
SELECT b.Dept_Name, alt.Avg_Salary
FROM table246 b
INNER JOIN (SELECT Dept_ID, AVG(Salary) AS Avg_Salary
FROM table123
GROUP BY Dept_ID) alt ON b.Dept_ID = alt.Dept_ID;

Go through this; maybe it will help you:
http://www.databasejournal.com/features/mysql/article.php/3835506/Fetching-Data-from-Multiple-Tables-using-Joins.htm

Related

Is CTE better for optimization than sub-queries in sql/mysql?

I'm giving one example
-- sub-query
SELECT p.first_name, p.last_name,
d.department_count, s.total_sales
FROM persons as p
INNER JOIN
(
SELECT department_id,
COUNT(people) as department_count
FROM department as d
WHERE department_type = 'sales'
GROUP BY department_id
) as d ON d.department_id = p.department_id
LEFT OUTER JOIN
(
SELECT person_id,
SUM(sales) as total_sales
FROM orders
WHERE orders.department_id = d.department_id
GROUP BY person_id
) as s ON s.person_id = p.person_id
-- cte
WITH deps as
(
SELECT department_id,
COUNT(people) as department_count
FROM department as d
WHERE department_type = 'sales'
GROUP BY department_id
), sales as
(
SELECT person_id,
SUM(sales) as total_sales
FROM orders
WHERE orders.department_id = d.department_id
GROUP BY person_id
)
SELECT p.first_name, p.last_name,
d.department_count, s.total_sales
FROM persons as p
INNER JOIN deps as d
ON d.department_id = p.department_id
LEFT OUTER JOIN sales as s
ON s.person_id = p.person_id
I'd also like an answer for the general case. It may sometimes depend on the dataset and the objective, but which one is usually better for optimization and performance when running the query? Also, if one of these forms takes fewer lines than the other, does that make the execution faster?
Both examples you show will be executed by MySQL using temporary tables. That is, the result of each subquery or CTE will be stored in a temporary table that lives for the duration of the query and is automatically dropped when the query ends.
Temporary tables are used for other types of queries in MySQL. You can read more about them here: https://dev.mysql.com/doc/refman/8.0/en/internal-temporary-tables.html
Temporary tables are often associated with performance overhead. It takes time for the temporary table to be created and filled with rows from the result of the subquery or CTE. This is unavoidable.
If you can run a different query to get the result you want without creating a temporary table, that's almost always better for performance. But in the examples you show, I don't think it's possible to do in a single query.
Almost every general rule about performance has exceptions, so you really need to be careful to evaluate performance on a case by case basis. Performance optimization is a complex subject.
These indexes may help:
orders: INDEX(department_id, person_id)
p: INDEX(department_id, first_name, last_name, person_id)
s: INDEX(person_id, total_sales)
d: INDEX(department_type, department_id)
Typically, COUNT(*) is better than COUNT(col).
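Apart from speed, the two forms are not interchangeable: COUNT(col) counts only rows where col is not NULL, while COUNT(*) counts every row. A small sketch with Python's sqlite3 module (SQLite standing in for MySQL; the rows and column types are invented for illustration) shows the difference. It does not measure performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (department_id INTEGER, people TEXT);
-- Hypothetical rows: one of the three has a NULL in the counted column.
INSERT INTO department VALUES (1, 'Ann'), (1, NULL), (1, 'Bob');
""")

# COUNT(*) counts rows; COUNT(people) skips the row where people IS NULL.
count_star, count_col = conn.execute(
    "SELECT COUNT(*), COUNT(people) FROM department"
).fetchone()

print(count_star, count_col)  # 3 2
```

So switching between the two is only safe when the column cannot be NULL, or when skipping NULLs is what you actually want.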

Difference between Equi-Join and Inner-Join in SQL

I have 2 tables called 'table123' and 'table246'.
'table123' columns: 'ID', 'Dept_ID', 'First_Name', 'Surname', 'Salary', 'Address'.
'table246' columns: 'Dept_ID', 'Dept_Name'.
I want to find the list of employees with the lowest salary per department. Two of the ways I can do this are an Equi-Join and an Inner-Join. I've been told that either can be used to produce the desired result.
The queries I used:
Equi-Join:
SELECT First_Name, b.Dept_Name, alt.Min_Salary AS Min_Salary
FROM table123 a, table246 b,
(SELECT Dept_ID, MIN(Salary)Min_Salary
FROM table123
GROUP BY Dept_ID)alt
WHERE a.Dept_ID = b.Dept_ID
AND a.salary = alt.Min_Salary
AND a.Dept_ID = alt.Dept_ID;
Inner-Join:
SELECT MIN(Salary)Min_Salary, Dept_Name
FROM table123 a, table246 b
INNER JOIN (SELECT First_Name, MIN(Salary)
FROM table123
GROUP BY Dept_ID)alt
ON b.Dept_ID = alt.Dept_ID;
The Equi-Join statement gives me the desired table, containing the columns 'First_Name', 'Dept_Name' & 'Min_Salary', with all relevant data.
However, the Inner-Join statement doesn't run because the First_Name column needs to be included in the aggregate function or GROUP BY clause. This really confuses me, as I don't know how to go about fixing it. How can I adjust the Inner-Join query, so as to give the same result as the Equi-Join query?
Try this:
SELECT a.First_Name, b.Dept_Name, alt.Min_Salary AS Min_Salary
FROM table123 a
INNER JOIN table246 b
ON a.Dept_ID = b.Dept_ID
INNER JOIN (
SELECT Dept_ID, MIN(Salary) Min_Salary
FROM table123
GROUP BY Dept_ID
) alt
ON b.Dept_ID = alt.Dept_ID
WHERE a.Salary = alt.Min_Salary;
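As a quick check of the query above, here is a sketch that runs it with Python's sqlite3 module (SQLite as a stand-in for the asker's database). The sample rows and column types are assumptions; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table246 (Dept_ID INTEGER PRIMARY KEY, Dept_Name TEXT);
CREATE TABLE table123 (ID INTEGER PRIMARY KEY, Dept_ID INTEGER,
                       First_Name TEXT, Salary INTEGER);
INSERT INTO table246 VALUES (1, 'Sales'), (2, 'IT');
-- Hypothetical rows: Cy and Di tie for the lowest salary in IT.
INSERT INTO table123 VALUES
    (1, 1, 'Ann', 1000), (2, 1, 'Bob', 2000),
    (3, 2, 'Cy', 3000), (4, 2, 'Di', 3000), (5, 2, 'Ed', 8000);
""")

lowest_paid = conn.execute("""
SELECT a.First_Name, b.Dept_Name, alt.Min_Salary AS Min_Salary
FROM table123 a
INNER JOIN table246 b
    ON a.Dept_ID = b.Dept_ID
INNER JOIN (
    SELECT Dept_ID, MIN(Salary) AS Min_Salary
    FROM table123
    GROUP BY Dept_ID
) alt
    ON b.Dept_ID = alt.Dept_ID
WHERE a.Salary = alt.Min_Salary
ORDER BY b.Dept_Name, a.First_Name
""").fetchall()

print(lowest_paid)
# [('Cy', 'IT', 3000), ('Di', 'IT', 3000), ('Ann', 'Sales', 1000)]
```

Note that employees who tie for the minimum in a department are all returned, which matches "the list of employees with the lowest salary per department".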

Converting a Join Query into a Sub Query

I'm attempting to write a subquery that would produce the same results as the join query shown below.
SELECT Department_to_major.DNAME
FROM Department_to_major
INNER JOIN Course
ON Department_to_major.Dcode = Course.OFFERING_DEPT
WHERE Course.COURSE_NAME LIKE '%INTRO%'
GROUP BY Department_to_major.DNAME
However, each attempt has produced errors.
Is there a way to write this as a sub query?
Hi, you can use the query below:
SELECT DNAME FROM Department_to_major WHERE
Dcode IN (SELECT OFFERING_DEPT FROM Course
WHERE COURSE_NAME LIKE '%INTRO%')
You have used a GROUP BY clause, but there is no aggregate function in the query. Does your query work as written?
Here is a way to use a subquery:
SELECT DISTINCT dm.DNAME
FROM Department_to_major dm
WHERE EXISTS (SELECT 1
FROM Course c
WHERE dm.Dcode = c.OFFERING_DEPT AND
c.COURSE_NAME LIKE '%INTRO%'
);
I assume the GROUP BY is to prevent duplicates in the output; SELECT DISTINCT does the same thing.
That said, storing the department code and name in Department_to_major is not a good data structure, because the department name is (presumably) repeated multiple times. I would expect you to have just a Departments table, with one row per department.
Then the query would look like:
SELECT d.DNAME
FROM Departments d
WHERE EXISTS (SELECT 1
FROM Course c
WHERE d.Dcode = c.OFFERING_DEPT AND
c.COURSE_NAME LIKE '%INTRO%'
);
And the SELECT DISTINCT/GROUP BY is unnecessary.
Try the query below. I am assuming that you used the GROUP BY clause to make the DNAME values unique.
SELECT DISTINCT(DNAME)
FROM Department_to_major
WHERE Dcode IN (SELECT OFFERING_DEPT
FROM Course
WHERE COURSE_NAME LIKE '%INTRO%');
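To compare the variants in this thread, here is a sketch using Python's sqlite3 module (SQLite standing in for the asker's database; the rows and column types are invented). It runs the original join with GROUP BY, the IN version, and the EXISTS version, plus the join without GROUP BY to show where the duplicates come from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department_to_major (Dcode TEXT, DNAME TEXT);
CREATE TABLE Course (COURSE_NAME TEXT, OFFERING_DEPT TEXT);
-- Hypothetical rows: CS offers two INTRO courses, Math offers none.
INSERT INTO Department_to_major VALUES
    ('CS', 'Computer Science'), ('MA', 'Math'), ('AR', 'Art');
INSERT INTO Course VALUES
    ('INTRO TO PROGRAMMING', 'CS'), ('INTRO TO DATABASES', 'CS'),
    ('CALCULUS', 'MA'), ('INTRO TO PAINTING', 'AR');
""")

join_plain = conn.execute("""
SELECT dm.DNAME
FROM Department_to_major dm
INNER JOIN Course c ON dm.Dcode = c.OFFERING_DEPT
WHERE c.COURSE_NAME LIKE '%INTRO%'
""").fetchall()

join_grouped = conn.execute("""
SELECT dm.DNAME
FROM Department_to_major dm
INNER JOIN Course c ON dm.Dcode = c.OFFERING_DEPT
WHERE c.COURSE_NAME LIKE '%INTRO%'
GROUP BY dm.DNAME
ORDER BY dm.DNAME
""").fetchall()

with_in = conn.execute("""
SELECT DISTINCT DNAME
FROM Department_to_major
WHERE Dcode IN (SELECT OFFERING_DEPT FROM Course
                WHERE COURSE_NAME LIKE '%INTRO%')
ORDER BY DNAME
""").fetchall()

with_exists = conn.execute("""
SELECT DISTINCT dm.DNAME
FROM Department_to_major dm
WHERE EXISTS (SELECT 1 FROM Course c
              WHERE dm.Dcode = c.OFFERING_DEPT
                AND c.COURSE_NAME LIKE '%INTRO%')
ORDER BY dm.DNAME
""").fetchall()

print(len(join_plain))  # 3: Computer Science appears once per matching course
print(join_grouped)     # [('Art',), ('Computer Science',)]
print(with_in == join_grouped, with_exists == join_grouped)  # True True
```

The plain join returns one row per matching course, which is what the GROUP BY was hiding. The IN and EXISTS forms return at most one row per row of Department_to_major, so they only need DISTINCT if the department name itself is repeated in that table.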

Which one of it is a better query and why?

I have to get the names of the departments and the number of employees in each. test is my schema.
I came up with two queries that give me the same result:
First
SELECT Department.Departmentname,
(
SELECT COUNT(*)
FROM test.Employee
WHERE Employee.Departmentid = Department.idDepartment
) AS NumberOfEmployees
FROM test.Department;
Second
SELECT Department.Departmentname AS Name, COUNT(Employee.idEmployee) AS Employee_COUNT
FROM test.Department
LEFT JOIN test.Employee
ON Employee.Departmentid = Department.idDepartment
GROUP BY Employee.Departmentid ;
Which of the two is the better and more efficient way to get the required result? Any other solution is welcome.
Please explain why a particular solution is better.
My preference for expressing the logic is the second query, which I would write as:
SELECT d.Departmentname AS Name, COUNT(e.idEmployee) AS Employee_COUNT
FROM test.Department d LEFT JOIN
test.Employee e
ON e.Departmentid = d.idDepartment
GROUP BY d.Departmentname;
Note the use of table aliases and the fact that the GROUP BY uses the same columns as the SELECT. However, in MySQL, this query will not use an index on DepartmentName for the group by. That means that the GROUP BY is doing a file sort, a relatively expensive operation.
When you write the query like this:
SELECT d.Departmentname,
(SELECT COUNT(*)
FROM test.Employee e
WHERE e.Departmentid = d.idDepartment
) AS NumberOfEmployees
FROM test.Department d;
No explicit group by is needed. With an index on Employee(DepartmentId) this will use the index for the count(*), so this version would normally perform better in MySQL.
The difference in performance is probably negligible until you start having thousands or tens of thousands of rows.
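To confirm that the two forms return the same counts, including for a department with no employees, here is a sketch using Python's sqlite3 module. SQLite stands in for MySQL, so it says nothing about MySQL's execution plan or timing. The test schema prefix is dropped, and the rows and column types are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (idDepartment INTEGER PRIMARY KEY, Departmentname TEXT);
CREATE TABLE Employee (idEmployee INTEGER PRIMARY KEY, Departmentid INTEGER);
-- Hypothetical rows: Legal has no employees.
INSERT INTO Department VALUES (1, 'Sales'), (2, 'IT'), (3, 'Legal');
INSERT INTO Employee VALUES (1, 1), (2, 1), (3, 2);
-- The index the answer refers to.
CREATE INDEX idx_employee_dept ON Employee (Departmentid);
""")

# Correlated subquery: one COUNT(*) per department row.
by_subquery = conn.execute("""
SELECT d.Departmentname,
       (SELECT COUNT(*)
        FROM Employee e
        WHERE e.Departmentid = d.idDepartment) AS NumberOfEmployees
FROM Department d
ORDER BY d.Departmentname
""").fetchall()

# LEFT JOIN + GROUP BY: COUNT(e.idEmployee) ignores the NULLs from unmatched rows.
by_join = conn.execute("""
SELECT d.Departmentname AS Name, COUNT(e.idEmployee) AS Employee_COUNT
FROM Department d
LEFT JOIN Employee e ON e.Departmentid = d.idDepartment
GROUP BY d.Departmentname
ORDER BY d.Departmentname
""").fetchall()

print(by_subquery)  # [('IT', 1), ('Legal', 0), ('Sales', 2)]
print(by_join)      # [('IT', 1), ('Legal', 0), ('Sales', 2)]
```

Both versions report 0 for the empty department. In the join version that depends on counting a column from Employee rather than using COUNT(*), which would count the unmatched row as 1.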

SQL query to join two tables for the final output

Below are the two tables, customer and department. I am struggling very hard to get the output.
I want to write a query that shows only the name of the department that has the maximum number of employees.
Answer should be like this ...
Please help me to write the query.
I would suggest using a sub-query, and then selecting from that query. With the outer select, you order by the number of employees and then limit it to 1. This will give you the top department, but also has the flexibility to be modified to give you a list of the x-number of top departments.
SELECT Dep_Name FROM (
SELECT
d.Dep_Name, COUNT(*) AS `count`
FROM
Departments d
JOIN Employees e ON e.Dep_id = d.Dep_id
GROUP BY
d.Dep_id
) AS q
ORDER BY `count` DESC
LIMIT 1
UPDATE
Per a comment by @Dems, you can actually handle this without a sub-query:
SELECT
d.Dep_Name
FROM
Departments d
JOIN Employees e ON e.Dep_id = d.Dep_id
GROUP BY
d.Dep_id
ORDER BY
COUNT(*) DESC
LIMIT 1
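Both versions above can be tried with a sketch like the following, using Python's sqlite3 module (SQLite also supports LIMIT). The rows and column types are invented, and the count alias is renamed to the hypothetical emp_count to avoid quoting a keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Departments (Dep_id INTEGER PRIMARY KEY, Dep_Name TEXT);
CREATE TABLE Employees (Emp_id INTEGER PRIMARY KEY, Dep_id INTEGER);
-- Hypothetical rows: Eng has three employees, HR has one.
INSERT INTO Departments VALUES (1, 'HR'), (2, 'Eng');
INSERT INTO Employees VALUES (1, 1), (2, 2), (3, 2), (4, 2);
""")

# Version with a sub-query: count per department, then pick the largest.
top_with_subquery = conn.execute("""
SELECT Dep_Name FROM (
    SELECT d.Dep_Name, COUNT(*) AS emp_count
    FROM Departments d
    JOIN Employees e ON e.Dep_id = d.Dep_id
    GROUP BY d.Dep_id
) AS q
ORDER BY emp_count DESC
LIMIT 1
""").fetchall()

# Version without a sub-query: order by the aggregate directly.
top_direct = conn.execute("""
SELECT d.Dep_Name
FROM Departments d
JOIN Employees e ON e.Dep_id = d.Dep_id
GROUP BY d.Dep_id
ORDER BY COUNT(*) DESC
LIMIT 1
""").fetchall()

print(top_with_subquery)  # [('Eng',)]
print(top_direct)         # [('Eng',)]
```

One thing to keep in mind: if two departments tie for the highest count, LIMIT 1 returns only one of them.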
SELECT *
FROM
(
SELECT
d.dep_id,
d.dep_name,
count(c.cus_id) cusCount
FROM
cus c,
dep d
WHERE
c.dep_id = d.dep_id
GROUP BY
d.dep_id,d.dep_name
ORDER BY
cusCount desc)
WHERE
ROWNUM = 1;
I created cus and dep tables in an Oracle 10g database and tested my code successfully.
What database are you using, and could you post your code?
The error message shows that your ORDER BY clause is wrong.