Semantics of multiple joins - mysql

What actually happens when we use cascaded join statements like this?
select student.name, count(teacher.id)
from student
left join course on student.course_id = course.id
left join teacher on student.teacher_id = teacher.id
group by student.name;
When I used only the first left join, it returned 30 rows, and using only the second left join returned 20 rows. But using both together returns 600 rows. What is actually happening? Is the result of the first left join used in the second? I don't understand the semantics. Please help me understand it.

Since you don't have any join conditions between teacher and course, you're getting a full cross-product between each of the other two joins. Since one join returns 20 rows and the other returns 30 rows, the 3-way join returns 20 x 30 = 600 rows. It's equivalent to:
SELECT t1.name, count(t2.id)
FROM (SELECT student.name
FROM student
LEFT JOIN course ON student.course_id = course.id) AS t1
CROSS JOIN
(SELECT teacher.id
FROM student
LEFT JOIN teacher ON student.teacher_id = teacher.id) AS t2
GROUP BY t1.name
Notice that the CROSS JOIN of the two subqueries has no ON condition.
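To see the multiplication on a small scale, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The schema is hypothetical (course and teacher each carry a student_id column, which is not the asker's layout); it is chosen only so that several rows on each side match the same student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical layout: many courses and many teachers per student.
    CREATE TABLE course  (id INTEGER PRIMARY KEY, student_id INTEGER);
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, student_id INTEGER);
    INSERT INTO student VALUES (1, 'Ann');
    INSERT INTO course  VALUES (10, 1), (11, 1), (12, 1);
    INSERT INTO teacher VALUES (20, 1), (21, 1);
""")

def row_count(sql):
    """Return the number of rows a query produces."""
    return len(conn.execute(sql).fetchall())

only_course = row_count(
    "SELECT * FROM student s LEFT JOIN course c ON c.student_id = s.id")
only_teacher = row_count(
    "SELECT * FROM student s LEFT JOIN teacher t ON t.student_id = s.id")
both = row_count("""
    SELECT * FROM student s
    LEFT JOIN course  c ON c.student_id = s.id
    LEFT JOIN teacher t ON t.student_id = s.id
""")

# Each course row is paired with each teacher row for the same student.
print(only_course, only_teacher, both)  # 3 2 6

# COUNT(DISTINCT ...) is one way to undo the double counting.
distinct_teachers = conn.execute("""
    SELECT COUNT(DISTINCT t.id) FROM student s
    LEFT JOIN course  c ON c.student_id = s.id
    LEFT JOIN teacher t ON t.student_id = s.id
""").fetchone()[0]
print(distinct_teachers)  # 2
```

With 3 matching courses and 2 matching teachers, the combined join yields 3 x 2 = 6 rows, which is the same effect as the 600 rows in the question.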
The correct way to structure this database is as follows:
student table: id (PK), name
course table: id (PK), name, fee, credits
student_course table: id (PK), student_id (FK), course_id (FK), unique key on (student_id, course_id)
Then to get the name of each student and the average course fee, you would do:
SELECT s.name, AVG(c.fee) AS avg_fee
FROM student AS s
LEFT JOIN student_course AS sc ON s.id = sc.student_id
LEFT JOIN course AS c ON sc.course_id = c.id
GROUP BY s.id, s.name;
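The schema and query above can be tried out with a small sketch like the following. It uses Python's sqlite3 module instead of MySQL, and the sample rows are invented. Grouping by student is what makes AVG produce one value per student rather than one value for the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, name TEXT, fee REAL, credits INTEGER);
    CREATE TABLE student_course (
        id INTEGER PRIMARY KEY,
        student_id INTEGER REFERENCES student(id),
        course_id  INTEGER REFERENCES course(id),
        UNIQUE (student_id, course_id)
    );
    -- Sample data is invented for illustration.
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO course  VALUES (1, 'Math', 100.0, 3), (2, 'Art', 50.0, 2);
    INSERT INTO student_course (student_id, course_id) VALUES (1, 1), (1, 2);
""")

rows = conn.execute("""
    SELECT s.name, AVG(c.fee) AS avg_fee
    FROM student AS s
    LEFT JOIN student_course AS sc ON s.id = sc.student_id
    LEFT JOIN course AS c ON sc.course_id = c.id
    GROUP BY s.id, s.name
    ORDER BY s.id
""").fetchall()

print(rows)  # [('Ann', 75.0), ('Bob', None)]
```

Bob has no enrolments, so the LEFT JOINs keep his row and AVG returns NULL for him.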

All MySQL joins are graphically explained here. Take a look and choose the correct join type for each joined table.

Related

How can I join 4 tables?

I have 4 tables. Three of them form a many-to-many relationship:
Student(StudID,GroupId,Name,....)
Lesson(LessID,LessonName,Mark)
StudentLesson(StudID,LessID)
and the relationship between Student and Group is one-to-many:
Student(StudID,Name,....)
Group(GroupId,GroupNumber)
What I want is to select Name, LessonName, Mark and GroupNumber:
select S.Name, L.LessonName, L.Mark, G.GroupNumber from Student s
join StudentLesson SL on SL.StudId = S.StudId
join Lesson L on SL.LessID = L.LessID
Join Group G on G.GroupId = S.GroupId
I think the error is in the line Join Group G on G.GroupId=S.GroupId, because when I omit it, the many-to-many part works, but with the one-to-many join it doesn't.
group is a reserved word, so it needs to be quoted. In MySQL, you can use backticks:
select S.Name, L.LessonName, L.Mark, G.GroupNumber
from Student S
join StudentLesson SL on SL.StudId = S.StudId
join Lesson L on SL.LessID = L.LessID
Join `Group` G on G.GroupId = S.GroupId
Based on the comments: the query is fine; you lack data that matches the results you're after. Either:
There are no students with a GroupId, or
There are no students with a GroupId matching a GroupId in the Group table.
To prove this, you could simply make the last join a LEFT JOIN, provided you have no WHERE clause that filters on Group.
FROM:
select S.Name,L.LessonName,L.Mark,G.GroupNumber from Student s
join StudentLesson SL on SL.StudId=S.StudId
join Lesson L on SL.LessID =L.LessID
Join Group G on G.GroupId=S.GroupId
TO:
SELECT S.Name, L.LessonName, L.Mark, G.GroupNumber
FROM Student S
INNER JOIN StudentLesson SL ON SL.StudId = S.StudId
INNER JOIN Lesson L ON SL.LessID = L.LessID
LEFT JOIN `Group` G ON G.GroupId = S.GroupId
This will show you all students with lessons, plus the GroupNumber where the GroupIds match; but I'm betting they will all be NULL.
So, are you after all students regardless of whether they have lessons or groups? If so, your inner joins should be left joins. If you're only after students that have lessons and belong to groups, then they all need to be inner joins. It just depends on what you're after!
A left join says: include all records from the prior joins, and only those records from this table (Group in the example) that match.
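The diagnosis above can be reproduced with a minimal sketch. It uses Python's sqlite3 module rather than MySQL (so "Group" is quoted with double quotes instead of backticks), and the rows are invented so that the student's GroupId has no match in the Group table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudID INTEGER PRIMARY KEY, GroupId INTEGER, Name TEXT);
    CREATE TABLE Lesson (LessID INTEGER PRIMARY KEY, LessonName TEXT, Mark INTEGER);
    CREATE TABLE StudentLesson (StudID INTEGER, LessID INTEGER);
    CREATE TABLE "Group" (GroupId INTEGER PRIMARY KEY, GroupNumber INTEGER);
    -- Invented data: the student's GroupId (7) does not exist in the Group table.
    INSERT INTO Student VALUES (1, 7, 'Ann');
    INSERT INTO Lesson VALUES (1, 'Math', 90);
    INSERT INTO StudentLesson VALUES (1, 1);
    INSERT INTO "Group" VALUES (3, 101);
""")

# Same query twice; only the type of the last join changes.
query = """
    SELECT S.Name, L.LessonName, L.Mark, G.GroupNumber
    FROM Student S
    INNER JOIN StudentLesson SL ON SL.StudID = S.StudID
    INNER JOIN Lesson L ON SL.LessID = L.LessID
    {join_type} JOIN "Group" G ON G.GroupId = S.GroupId
"""

inner_rows = conn.execute(query.format(join_type="INNER")).fetchall()
left_rows = conn.execute(query.format(join_type="LEFT")).fetchall()

print(inner_rows)  # []
print(left_rows)   # [('Ann', 'Math', 90, None)]
```

The inner join drops the student entirely, while the left join keeps the row and fills GroupNumber with NULL, which is the signal that the data, not the query, is the problem.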

SQL Left Join a Table on a Left Joined Table

I am currently trying to left join a table onto a left-joined table, as follows.
I have the tables:
accounts (id, vorname, nachname)
projektkurse (id, accounts_id, projektwochen_id)
projektkurs_einzel (id, projektkurse_id)
projektkurs_einzel_zeiten (id, date, shift, projektkurs_einzel_id)
Now I want to get every account and the number of times it has an entry in projektkurs_einzel_zeiten, counting only unique entries: the same date and shift appearing multiple times should not count as multiple entries. The result should also be limited by the column projektwochen_id from the table projektkurse, which should match a certain value, for example 8.
Some accounts don't have any entries in projektkurse, projektkurs_einzel and projektkurs_einzel_zeiten, which is why my first thought was to use LEFT JOIN like this:
SELECT accounts.id, accounts.vorname, accounts.nachname, COUNT(DISTINCT projektkurs_einzel_zeiten.date, projektkurs_einzel_zeiten.shift) AS T
FROM accounts
LEFT JOIN projektkurse on accounts.id = projektkurse.creator_id
LEFT JOIN projektkurs_einzel on projektkurse.id = projektkurs_einzel.projektkurs_id
LEFT JOIN projektkurs_einzel_zeiten ON projektkurs_einzel.id = projektkurs_einzel_zeiten.projektkurs_einzel_id
WHERE projektkurse.projektwochen_id = 8
GROUP BY accounts.id
This query does not achieve exactly what I want. It only returns accounts that have at least one entry in projektkurse, even if they have none in projektkurs_einzel and projektkurs_einzel_zeiten. The count is obviously 0 for them, but the accounts that have no entries in projektkurse are ignored completely.
How can I also show the accounts that don't have entries in any other table, with a count of 0 as well?
I would recommend writing the query like this:
SELECT a.id, a.vorname, a.nachname,
COUNT(DISTINCT pez.date, pez.shift) AS T
FROM accounts a LEFT JOIN
projektkurse pk
ON a.id = pk.creator_id AND
pk.projektwochen_id = 8 LEFT JOIN
projektkurs_einzel pe
ON pk.id = pe.projektkurs_id LEFT JOIN
projektkurs_einzel_zeiten pez
ON pe.id = pez.projektkurs_einzel_id
GROUP BY a.id, a.vorname, a.nachname;
Notes:
Your problem is fixed by moving the WHERE condition to the ON clause. Your WHERE turns the outer join into an inner join, because NULL values do not match.
Table aliases make the query easier to write and to read.
It is a best practice to include all unaggregated columns in the GROUP BY. However, assuming that id is unique, your formulation is okay (due to something called "functional dependencies").
You should not use a left-joined table's column in the WHERE condition; this makes the join work as an inner join.
You should move the WHERE condition for a left-joined table into the corresponding ON clause:
SELECT accounts.id, accounts.vorname, accounts.nachname, COUNT(DISTINCT projektkurs_einzel_zeiten.date, projektkurs_einzel_zeiten.shift) AS T
FROM accounts
LEFT JOIN projektkurse on accounts.id = projektkurse.creator_id
AND projektkurse.projektwochen_id = 8
LEFT JOIN projektkurs_einzel on projektkurse.id = projektkurs_einzel.projektkurs_id
LEFT JOIN projektkurs_einzel_zeiten ON projektkurs_einzel.id = projektkurs_einzel_zeiten.projektkurs_einzel_id
GROUP BY accounts.id
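The difference between filtering in WHERE and filtering in ON can be seen in a small sketch. It uses Python's sqlite3 module rather than MySQL, with cut-down, hypothetical versions of two of the tables and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down, hypothetical versions of two of the tables.
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, vorname TEXT);
    CREATE TABLE projektkurse (id INTEGER PRIMARY KEY, creator_id INTEGER,
                               projektwochen_id INTEGER);
    INSERT INTO accounts VALUES (1, 'Anna'), (2, 'Ben'), (3, 'Cara');
    -- Anna has a course in week 8, Ben only in week 7, Cara has none.
    INSERT INTO projektkurse VALUES (1, 1, 8), (2, 2, 7);
""")

# Filter in WHERE: rows where pk.* is NULL fail the test and are dropped.
filter_in_where = conn.execute("""
    SELECT a.vorname, COUNT(pk.id)
    FROM accounts a
    LEFT JOIN projektkurse pk ON a.id = pk.creator_id
    WHERE pk.projektwochen_id = 8
    GROUP BY a.id, a.vorname
    ORDER BY a.id
""").fetchall()

# Filter in ON: non-matching accounts survive with NULLs, so COUNT gives 0.
filter_in_on = conn.execute("""
    SELECT a.vorname, COUNT(pk.id)
    FROM accounts a
    LEFT JOIN projektkurse pk ON a.id = pk.creator_id
                             AND pk.projektwochen_id = 8
    GROUP BY a.id, a.vorname
    ORDER BY a.id
""").fetchall()

print(filter_in_where)  # [('Anna', 1)]
print(filter_in_on)     # [('Anna', 1), ('Ben', 0), ('Cara', 0)]
```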

Join queries are not working as expected when trying to compare a count result with a value

I'm learning SQL from a book and I'm trying to do some exercises on join queries. The only problem I'm facing is that none of my join queries work, even though they look fine to me. The tables are:
students(student_id,student_names,student_age)
courses_students(course_id,student_id)
courses(course_id,course_schedule,course_room,teacher_id)
teachers(teacher_id,teacher_names)
The query is "which courses have more than 5 students enrolled?"
Here is what I've done:
SELECT course_name,
count
(SELECT count(*)
FROM courses) AS COUNT
FROM students,
courses,
courses_students
WHERE students.student_id=courses_students.student_id,
courses.course_id=courses_students.course_id
AND COUNT > 5
The other one is: what are the names of students enrolled in at least 2 courses scheduled for the same hours?
My query:
SELECT student_name,
schedule
FROM students,
courses,
courses_students
WHERE students.student_id=courses_students.student_id,
courses.course_id=courses_students.course_id
AND COUNT > 2
In this inner query:
(SELECT count(*)
FROM courses) AS COUNT
you need to narrow down what is included in the COUNT. As it is, it is selecting all items in the courses table. The inner query does not know about the restrictions in the outer query. Try adding a where clause in this inner query. You might need to add table aliases to uniquely refer to the correct courses table, so there is no ambiguity whether it is referring to the courses table in the inner query or the outer query.
And, as noted in other answers, this is not the best way to structure joins.
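MySQL does accept comma-separated tables with the join conditions in the WHERE clause, but the conditions must be combined with AND, not separated by a comma, and a condition on an aggregate belongs in HAVING rather than WHERE. Explicit joins are easier to read:
SELECT courses.course_name,
COUNT(*) AS student_count
FROM students
INNER JOIN courses_students ON students.student_id = courses_students.student_id
INNER JOIN courses ON courses.course_id = courses_students.course_id
GROUP BY courses.course_name
HAVING COUNT(*) > 5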
You need to aggregate by course and then assert the number of students:
SELECT
c.course_name,
COUNT(*) AS cnt
FROM courses c
INNER JOIN courses_students cs
ON c.course_id = cs.course_id
INNER JOIN students s
ON cs.student_id = s.student_id
GROUP BY
c.course_name
HAVING
COUNT(*) > 5;
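Here is a minimal sketch of the GROUP BY / HAVING approach, run through Python's sqlite3 module rather than MySQL. The course_name column is an assumption (the schema in the question does not list it), and the enrolment rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- course_name is assumed here; the schema in the question does not list it.
    CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_name TEXT);
    CREATE TABLE courses_students (course_id INTEGER, student_id INTEGER);
    INSERT INTO courses VALUES (1, 'Algebra'), (2, 'Biology');
""")
# Invented enrolments: 6 students in Algebra, 2 in Biology.
conn.executemany("INSERT INTO courses_students VALUES (?, ?)",
                 [(1, s) for s in range(1, 7)] + [(2, 1), (2, 2)])

rows = conn.execute("""
    SELECT c.course_name, COUNT(*) AS cnt
    FROM courses c
    INNER JOIN courses_students cs ON c.course_id = cs.course_id
    GROUP BY c.course_id, c.course_name
    HAVING COUNT(*) > 5
""").fetchall()

print(rows)  # [('Algebra', 6)]
```

The join to students is left out in this sketch because the enrolment table alone is enough to count students per course.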
Stack Overflow is not really a site for doing homework, but as you have already made an attempt at solving the task, here is a solution for question number one:
SELECT cs.course_id
FROM courses_students cs
INNER JOIN students s
ON cs.student_id = s.student_id
GROUP BY cs.course_id
HAVING count(*) > 5
Read about the GROUP BY and HAVING clause - nice way to solve some problems.
Question number 2 could be solved like this:
SELECT student_names
FROM students s
INNER JOIN courses_students cs
ON cs.student_id = s.student_id
INNER JOIN (
SELECT course_id
FROM courses c
GROUP BY course_schedule
HAVING count(*) > 1
) sub
ON sub.course_id = cs.course_id
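The INNER JOIN with the subquery selects courses that are scheduled at the same time (having the same course_schedule). Note that selecting course_id while grouping by course_schedule only runs in MySQL when ONLY_FULL_GROUP_BY is disabled, and it then returns a single, arbitrarily chosen course per schedule.
As the other tables are "connected" with INNER JOINs, we finally get just the subset of students who are participating in one of those courses.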
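The subquery approach finds schedules shared by several courses, but it does not check that one and the same student is enrolled in two of them. A stricter reading of the question can be sketched by grouping on student and schedule together. This is an alternative formulation, not the answer's query, shown with Python's sqlite3 module and invented data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_names TEXT);
    CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_schedule TEXT);
    CREATE TABLE courses_students (course_id INTEGER, student_id INTEGER);
    -- Invented data: courses 1 and 2 clash; course 3 is at another time.
    INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO courses VALUES (1, 'Mon 09:00'), (2, 'Mon 09:00'), (3, 'Tue 10:00');
    -- Ann takes both clashing courses, Bob takes only one of them.
    INSERT INTO courses_students VALUES (1, 1), (2, 1), (1, 2), (3, 2);
""")

# One group per (student, schedule); keep groups holding 2 or more courses.
rows = conn.execute("""
    SELECT DISTINCT s.student_names
    FROM students s
    INNER JOIN courses_students cs ON cs.student_id = s.student_id
    INNER JOIN courses c ON c.course_id = cs.course_id
    GROUP BY s.student_id, s.student_names, c.course_schedule
    HAVING COUNT(*) > 1
""").fetchall()

print(rows)  # [('Ann',)]
```

Bob is enrolled in a course that clashes with another course, but he does not take both, so only Ann is returned.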

SQL - Selecting items from table based on data in other tables?

In the above relational schema, how would I do the following?:
List the names of all students that have a higher GPA than the minimum required GPA for the major they have applied for.
A couple of joins should do the trick:
SELECT s.*
FROM Student s
JOIN Apply a ON s.sId = a.sId
JOIN MinimumGPA m on m.major = a.major
WHERE s.gpa > m.mingpa
You can use JOIN and NATURAL JOIN (the latter is not required, but I like it).
With JOIN you fuse 2 tables by giving 2 columns that need to be equal (you specify them in the WHERE). NATURAL JOIN does the same, but assumes you have 1 or more columns with the same name (those are the ones NATURAL JOIN uses, as if they were declared in the WHERE).
So first you fuse MinimumGPA and Apply (they have 2 columns with the same name, so NATURAL JOIN applies):
Select * FROM MinimumGPA NATURAL JOIN Apply
Then, since you were asked for the NAMES of the students, you fuse that new table (giving it a name; in this case I used MinimumGPAApply, but you could name it "the dogtable" if you wanted) with Student. Since the column names aren't the same, you use JOIN and specify the columns in the WHERE, and you also add the GPA condition:
Select sName from Student JOIN (The first query) As MinimumGPAApply WHERE Student.sId = MinimumGPAApply.sID AND Student.GPA > MinimumGPAApply.minGPA
So in the end you end up with something like this:
Select sName from Student JOIN (Select * FROM MinimumGPA NATURAL JOIN Apply) As MinimumGPAApply WHERE Student.sId = MinimumGPAApply.sID AND Student.GPA > MinimumGPAApply.minGPA
select s.sName,s.gpa as studentGpa ,mg.mingpa as mingpaRequired from student s
inner join apply a on s.sid=a.sid
inner join major m on m.major=a.major
inner join minimumGPA mg on mg.major=m.major
where mg.mingpa<s.gpa

I am unsure: Is this an anti-join?

I am working on the first problem of the "Using Null" section of the well-known SQLZoo tutorial: http://sqlzoo.net/wiki/Using_Null
The question is:
List the teachers who have NULL for their department.
The corresponding SQL query would be:
SELECT t.name
FROM teacher t
WHERE t.dept IS NULL
Is this a type of anti-join? Specifically, is this a left-anti-join?
This isn't a join at all.
The statement is filtering only records for teachers who don't have an assigned department.
Set Difference
The set difference of teachers and departments, teacher \ department would be a kind of "anti-join"
SELECT
t.name
FROM teacher t
LEFT JOIN department d ON d.id = t.dept_id
WHERE d.id IS NULL
At first glance, this statement does what your statement does; if the foreign key reference were enforced, it would be guaranteed to do exactly that. However, one use for this statement would be to retrieve teachers who are assigned to departments that have since been deleted (e.g. if the English Lit Dept. & English as 2nd Lang Dept. were reorganized as the English Dept.)
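The difference between the plain NULL filter and the anti-join only shows up when the foreign key is not enforced. A minimal sketch, using Python's sqlite3 module rather than MySQL and invented rows (one teacher points at a department id that no longer exists):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No FOREIGN KEY constraint, so dept_id may point at a missing department.
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'English');
    -- Invented rows: Cole has no department, Dana points at a deleted one (99).
    INSERT INTO teacher VALUES (1, 'Baker', 1), (2, 'Cole', NULL), (3, 'Dana', 99);
""")

# Plain filter: only teachers whose dept_id is NULL.
null_filter = conn.execute(
    "SELECT t.name FROM teacher t WHERE t.dept_id IS NULL ORDER BY t.id"
).fetchall()

# Anti-join: teachers for whom no department row can be found.
anti_join = conn.execute("""
    SELECT t.name
    FROM teacher t
    LEFT JOIN department d ON d.id = t.dept_id
    WHERE d.id IS NULL
    ORDER BY t.id
""").fetchall()

print(null_filter)  # [('Cole',)]
print(anti_join)    # [('Cole',), ('Dana',)]
```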
Symmetric Difference
Another "anti-join" would be the symmetric difference, which selects elements from both sets ONLY if they cannot be joined, i.e
(teacher \ department) U (department \ teacher)
I can't think of a motivating example using teachers and departments, but one way to write the symmetric difference on databases that support the FULL OUTER JOIN would be:
SELECT
t.name
FROM teacher t
FULL OUTER JOIN department d ON d.id = t.dept_id
WHERE d.id IS NULL OR t.id IS NULL
For MySQL, this statement would have to be written as the union of two statements.
SELECT
t.name teacher_name, d.name department_name
FROM teacher t
LEFT JOIN department d ON d.id = t.dept_id
WHERE d.id IS NULL
UNION ALL
SELECT
t.name teacher_name, d.name department_name
FROM department d
LEFT JOIN teacher t ON t.dept_id = d.id
WHERE t.id IS NULL
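As a rough check of the union-of-two-anti-joins idea, here is a sketch using Python's sqlite3 module rather than MySQL. The rows are invented; the second branch starts from department so that departments without any teacher can appear in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    -- Invented rows: History has no teachers, Cole has no department.
    INSERT INTO department VALUES (1, 'English'), (2, 'History');
    INSERT INTO teacher VALUES (1, 'Baker', 1), (2, 'Cole', NULL);
""")

# teacher minus department: teachers with no matching department.
teachers_without_department = """
    SELECT t.name AS teacher_name, d.name AS department_name
    FROM teacher t
    LEFT JOIN department d ON d.id = t.dept_id
    WHERE d.id IS NULL
"""
# department minus teacher: departments with no matching teacher.
departments_without_teacher = """
    SELECT t.name AS teacher_name, d.name AS department_name
    FROM department d
    LEFT JOIN teacher t ON t.dept_id = d.id
    WHERE t.id IS NULL
"""

rows = conn.execute(
    teachers_without_department + " UNION ALL " + departments_without_teacher
).fetchall()

# Compare as a set because UNION ALL does not promise a row order.
result = set(rows)
print(result == {("Cole", None), (None, "History")})  # True
```

Baker and English can be joined to each other, so neither appears; only the unmatched rows from each side remain.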
Looking through one of my projects, I found this one use of symmetric difference:
Context:
I have three tables: users, users_gameplay_summary, users_transactions_summary. I needed to email those users who created their accounts in the past 7 days AND who either have transacted but have not played, or have played but have not transacted.
To get the list, I have this query (note: this was written for PostgreSQL and won't work on MySQL, but it illustrates the symmetric difference use case):
SELECT
COALESCE(g.user_id, t.user_id) user_id
FROM users_gameplay_summary g
FULL OUTER JOIN users_transactions_summary t ON t.user_id = g.user_id
WHERE COALESCE(g.user_id, t.user_id) IN (
SELECT user_id
FROM users
WHERE created_at > CURRENT_DATE - '7 day'::interval)
AND (g.user_id IS NULL OR t.user_id IS NULL)
Not exactly; you're not actually joining anything here.
In the case of a left anti-join you would have access to the department name as well (although it would be NULL).
Your SQL code would be a correct answer to the question you gave, though.
A left anti join would be:
SELECT t.name
FROM teacher t
LEFT JOIN dept d ON d.id = t.dept
WHERE d.id IS NULL
To solve this problem of listing teachers without assigned departments, you don't need a JOIN between teacher and dept tables.
dept table is basically a dictionary table that you join to, to translate ids to corresponding names.
teacher table has a dept column which normally could have a FOREIGN KEY constraint to id column in dept table.
Your query is not an ANTI-JOIN. This is a simple projection and selection query using one table.
SELECT t.name
FROM teacher t
WHERE t.dept IS NULL
For an ANTI-JOIN you would at least need a JOIN operation between more than one table at first.
Normally an ANTI-JOIN could look like:
Using LEFT JOIN
SELECT *
FROM table1 t1
LEFT JOIN table2 t2
ON t1.join_column = t2.join_column
WHERE t2.join_column IS NULL
Using NOT EXISTS
SELECT *
FROM table1 t1
WHERE NOT EXISTS (
SELECT 1
FROM table2 t2
WHERE t1.join_column = t2.join_column
)
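To compare the two forms side by side, here is a minimal sketch using Python's sqlite3 module rather than MySQL. The table and column names follow the generic example above; the id column and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, join_column INTEGER);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, join_column INTEGER);
    -- Invented data: only join_column = 2 has no partner in table2.
    INSERT INTO table1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO table2 VALUES (1, 1), (2, 3), (3, 3);
""")

left_join_form = conn.execute("""
    SELECT t1.id
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.join_column = t2.join_column
    WHERE t2.join_column IS NULL
    ORDER BY t1.id
""").fetchall()

not_exists_form = conn.execute("""
    SELECT t1.id
    FROM table1 t1
    WHERE NOT EXISTS (
        SELECT 1 FROM table2 t2 WHERE t1.join_column = t2.join_column
    )
    ORDER BY t1.id
""").fetchall()

print(left_join_form, not_exists_form)  # [(2,)] [(2,)]
```

Both forms return only the table1 row that has no partner in table2, even though join_column = 3 matches two rows there.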