Can't get rid of repeats in mysql query - mysql

I'm completing some queries for my databases class and I've run into one that I can't seem to get.
It says: "For the student with ID "20084" (or any other value) show the total number of credits for the courses taken. Don't display the tot_creds value from the student table, you should use SQL aggregation on courses taken by the student."
I have looked for some answers online but none of them truly convinced me and basically all of them gave me different results.
I've got this on my own:
select sum(credits)
from (course join section using(course_id)) join (takes join student using(ID))
using (course_id, sec_id, semester, year)
where student.ID = 20084;
The problem I have is that the student has repeated a couple courses and the query returns the credits of those repeated courses as well. I've tried putting distinct in front of sum(credits) but the answer is the same.
Is there anything I'm missing?
Thanks.

This should return what you are looking for. In the sub-select I used MAX() on credits, even though the value is the same for every repeated row, so that each course collapses to a single row. The sub-select therefore returns one row per course, and the outer SUM() no longer counts the repeated courses.
SELECT
SUM(`a`.`credits`) as `credits`
FROM (
SELECT
MAX(`course`.`credits`) AS `credits`
FROM `student`
JOIN `takes`
ON `takes`.`ID` = `student`.`ID`
JOIN `section`
USING (`course_id`,`sec_id`,`semester`,`year`)
JOIN `course`
ON `course`.`course_id` = `section`.`course_id`
WHERE `student`.`ID` = 20084
GROUP BY `section`.`course_id`
) AS `a`
To test the sub-query, try running this query. It should return a list of all courses taken, with just one row per course:
SELECT
`section`.`course_id`,
`section`.`sec_id`,
`section`.`semester`,
`section`.`year`,
MAX(`course`.`credits`) AS `credits`
FROM `student`
JOIN `takes`
ON `takes`.`ID` = `student`.`ID`
JOIN `section`
USING (`course_id`,`sec_id`,`semester`,`year`)
JOIN `course`
ON `course`.`course_id` = `section`.`course_id`
WHERE `student`.`ID` = 20084
GROUP BY `section`.`course_id`
ORDER BY `takes`.`course_id`,`takes`.`grade` DESC
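The effect of collapsing repeats before summing can be checked on a tiny data set. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a cut-down version of the schema (only `course` and `takes`), and made-up rows in which student 20084 repeats one course.

```python
import sqlite3

# Hypothetical miniature of the schema; only the columns needed here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_id TEXT PRIMARY KEY, credits INTEGER);
    CREATE TABLE takes  (ID TEXT, course_id TEXT, sec_id TEXT,
                         semester TEXT, year INTEGER, grade TEXT);
    INSERT INTO course VALUES ('CS-101', 4), ('CS-190', 3);
    -- Student 20084 takes CS-101 twice (a repeat) and CS-190 once.
    INSERT INTO takes VALUES
        ('20084', 'CS-101', '1', 'Fall',   2009, 'F'),
        ('20084', 'CS-101', '1', 'Spring', 2010, 'B'),
        ('20084', 'CS-190', '2', 'Spring', 2010, 'A');
""")

# Naive join: one row per enrolment, so the repeated course is summed twice.
naive_total = conn.execute("""
    SELECT SUM(c.credits)
    FROM takes t JOIN course c ON c.course_id = t.course_id
    WHERE t.ID = ?
""", ("20084",)).fetchone()[0]

# Collapse to one row per course first, then sum the collapsed rows.
dedup_total = conn.execute("""
    SELECT SUM(credits)
    FROM (
        SELECT c.course_id, MAX(c.credits) AS credits
        FROM takes t JOIN course c ON c.course_id = t.course_id
        WHERE t.ID = ?
        GROUP BY c.course_id
    ) AS per_course
""", ("20084",)).fetchone()[0]

print(naive_total, dedup_total)  # 11 7
```

Note that `SELECT DISTINCT SUM(credits)` only de-duplicates the single output row, which is why it changes nothing, and `SUM(DISTINCT credits)` would wrongly merge two different courses that happen to carry the same number of credits.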

Related

Join mysql table with distinct value from another table

I encountered a problem on a database I am working with. I have a table of counsels which may hold repeating values, but there is an enrolment number field which is unique and can be used to fetch them. However, I want to join from a cases_counsel table on the "first" unique value of the counsel table that matches that column on the cases counsel table.
I want to list the cases belonging to a particular counsel using the enrolment_number as the counsel_id on the cp_cases_counsel table. That means I want to pick just a distinct value of a counsel, then use it to join the cp_cases_counsel table and also return the count for such.
However, I keep getting duplicates. This was the MySQL query I tried:
SELECT T.suitno, T.counsel_id, COUNT(*) as total from cp_cases_counsel T
INNER JOIN (SELECT
enrolment_number as id, MIN(counsel)
FROM
cp_counsel
GROUP BY
enrolment_number
) A
ON A.id = T.counsel_id
GROUP BY T.suitno, T.counsel_id
and
SELECT enrolment_number as id, MIN(counsel) as counsel, COUNT(*) as total FROM cp_counsel
JOIN cp_cases_counsel ON cp_cases_counsel.counsel_id = cp_counsel.enrolment_number
GROUP BY enrolment_number
For the second query, it joins twice and I get about double what I am supposed to get.
The columns that you want in the results are counsel (actually only one of all its values) from cp_counsel and counsel_id from cp_cases_counsel, so you must group by them and select them:
SELECT a.counsel, t.counsel_id, COUNT(*) AS total
FROM cp_cases_counsel t
INNER JOIN (
SELECT enrolment_number, MIN(counsel) AS counsel
FROM cp_counsel
GROUP BY enrolment_number
) a ON a.enrolment_number = t.counsel_id
GROUP BY a.counsel, t.counsel_id;
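To see why de-duplicating cp_counsel before the join matters, here is a small sketch using Python's sqlite3 module in place of MySQL. The rows are invented, and it assumes the same enrolment_number can appear more than once in cp_counsel, which is what the doubled counts in the question suggest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cp_counsel (enrolment_number TEXT, counsel TEXT);
    CREATE TABLE cp_cases_counsel (suitno TEXT, counsel_id TEXT);
    -- Made-up rows: the counsel with enrolment number E1 is stored twice.
    INSERT INTO cp_counsel VALUES
        ('E1', 'A. Bello'), ('E1', 'A. Bello'), ('E2', 'C. Obi');
    INSERT INTO cp_cases_counsel VALUES ('S1', 'E1'), ('S2', 'E1'), ('S3', 'E2');
""")

# Joining the raw table: every duplicate counsel row multiplies the case rows.
direct = conn.execute("""
    SELECT c.enrolment_number, COUNT(*)
    FROM cp_counsel c
    JOIN cp_cases_counsel t ON t.counsel_id = c.enrolment_number
    GROUP BY c.enrolment_number
    ORDER BY c.enrolment_number
""").fetchall()

# Joining a derived table with one row per enrolment_number.
deduped = conn.execute("""
    SELECT a.counsel, t.counsel_id, COUNT(*) AS total
    FROM cp_cases_counsel t
    JOIN (
        SELECT enrolment_number, MIN(counsel) AS counsel
        FROM cp_counsel
        GROUP BY enrolment_number
    ) a ON a.enrolment_number = t.counsel_id
    GROUP BY a.counsel, t.counsel_id
    ORDER BY t.counsel_id
""").fetchall()

print(direct)   # [('E1', 4), ('E2', 1)]
print(deduped)  # [('A. Bello', 'E1', 2), ('C. Obi', 'E2', 1)]
```

With the raw table, E1's two cases are counted once per duplicate counsel row (2 x 2 = 4); with the derived table each case is counted once.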

MySQL order by keyword not working with insert into table from the inner join of two tables

This is my MySQL query, and the ORDER BY keyword isn't working here. I cannot figure out what's wrong with it.
insert into leaderboard
select student.student_name as name , sum(marks) as total
from marks inner join student on student.student_id = marks.student_id
group by marks.student_id order by total desc;
[Image: leaderboard table output]
Your current insert is not far off, though as a matter of practice, you should always explicitly list out the target columns for insertion, i.e. use this version:
INSERT INTO leaderboard (name, total) -- or whatever the column names are called
SELECT s.student_name, SUM(m.marks)
FROM marks m
INNER JOIN student s ON s.student_id = m.student_id
GROUP BY s.student_id;
Regarding the order you do or don't perceive in the leaderboard table, appreciate that SQL tables are modeled after unordered sets of data. That is, there is not really any inherent order in a SQL table. If you want to view your data in a certain order, then use an ORDER BY clause when you query (not when you insert):
SELECT name, total
FROM leaderboard
ORDER BY total DESC;
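The same idea as a runnable sketch, using Python's sqlite3 module rather than MySQL and invented student names and marks: the insert carries no ORDER BY, and the ranking is produced only when the leaderboard is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT);
    CREATE TABLE marks (student_id INTEGER, marks INTEGER);
    CREATE TABLE leaderboard (name TEXT, total INTEGER);
    -- Made-up sample data.
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO marks VALUES (1, 40), (1, 30), (2, 90), (3, 55);

    -- Populate the leaderboard; no ORDER BY, because stored order is not guaranteed.
    INSERT INTO leaderboard (name, total)
    SELECT s.student_name, SUM(m.marks)
    FROM marks m
    JOIN student s ON s.student_id = m.student_id
    GROUP BY s.student_id;
""")

# Ask for the order at read time.
ranked = conn.execute(
    "SELECT name, total FROM leaderboard ORDER BY total DESC"
).fetchall()
print(ranked)  # [('Bob', 90), ('Ann', 70), ('Cy', 55)]
```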

Having trouble a query and specifically with joins

The code below is completely wrong and does not work at all. I'm basically trying to look through my tables and compile a list of DeptName and the total number of students for each department that has more than 40 students.
I'm confused about joins in general, so I'd appreciate it if someone could explain and show where I'm going wrong. I'm sure there are other problems too, so any help with those would be welcome.
So basically one department is connected to one module, and a student is enrolled in a module. A student cannot take a module outside of their department, so each student should have one module that connects to one department.
All of the ID fields in other tables are foreign keys, as you can guess. Changing the tables is not what I want to do here; I just want to write this query against the schema as it stands.
Relevant tables and columns:
Table Department: DeptID, DeptName, Faculty, Address
Table Modules: ModuleID, ModuleName, DeptID, Programme
Table Students: StudentID, StudentName, DoB, Address, StudyType
Table Enrolments: EID, StudentID, ModuleID, Semester, Year
SELECT Department.DeptName, COUNT(Student.StudentID) AS 'No of Students' FROM Department LEFT JOIN Module ON Department.DeptID= Module.DeptID LEFT JOIN Enrolment ON Module.ModuleID= Enrolment.StudentID LEFT JOIN Student.StudentID
GROUP BY(Department.DeptID)
HAVING COUNT(Student.StudentID)>=40
I have not included every table here as there are quite a lot.
But unless I've got this completely wrong, you don't need to access something like the ModuleID in a staff table for the module they teach, since that is not relevant here: no student or department details are in there.
If that is the case I will fix it very quickly.
SELECT Department.DeptName, COUNT(Student.StudentID) AS 'No of Students'
FROM Department
LEFT JOIN Module
ON Department.DeptID= Module.DeptID
LEFT JOIN Enrolment
-- problem #1:
ON Module.ModuleID= Enrolment.StudentID
-- problem #2:
LEFT JOIN Student.StudentID
-- problem #3:
GROUP BY(Department.DeptID)
HAVING COUNT(Student.StudentID)>=40
Problem #1: You're joining these two tables on the wrong field: Module.ModuleID is compared with Enrolment.StudentID instead of Enrolment.ModuleID. When the key columns have the same name in both tables, you can use USING instead of ON for the join.
Problem #2: The right side of any JOIN operator has to be a table, not a column.
Problem #3: You have to group by every column in the select clause that is not part of an aggregate function like COUNT. I recommend that you select the DeptID instead of the name, then use the result of this query to look up the name in a subsequent select.
Note: the following code is untested. It uses a common table expression (WITH), which requires MySQL 8.0 or later.
WITH bigDepts AS (
SELECT DeptId, COUNT(StudentID) AS StudentCount
FROM Department
JOIN Module
USING ( DeptID )
JOIN Enrolment
USING ( ModuleID )
JOIN Student
USING ( StudentID )
GROUP BY DeptID
HAVING COUNT(StudentID)>=40
)
SELECT DeptID, DeptName, StudentCount
FROM Department
JOIN bigDepts
USING ( DeptID )
Instead of LEFT JOIN you need to use INNER JOIN, since you only need to select related rows from those three tables.
The GROUP BY and HAVING clauses seem fine. Since you need departments with more than 40 students, use COUNT(e.StudentID) > 40 instead of >=. The Enrolment join should also compare ModuleID with ModuleID, and the trailing join to Student.StudentID can be dropped:
SELECT d.DeptName, COUNT(e.StudentID) AS 'No of Students' FROM Department d INNER JOIN Module m ON d.DeptID = m.DeptID INNER JOIN Enrolment e ON m.ModuleID = e.ModuleID
GROUP BY d.DeptName
HAVING COUNT(e.StudentID) > 40
Your join to the students table was a bit iffy as you wrote it, and presumably these should all be inner joins.
I've reformatted your query using aliases to make it easier to read.
Since you're counting the number of rows per DeptName, you can simply use count(*); likewise, in your HAVING you are after counts greater than 40 only. Without seeing your schemas and data it's not possible to know whether you might have duplicate students. If you do, and you want to count distinct students, change the count to count(distinct s.studentId).
select d.DeptName, Count(*) as 'No of Students'
from Department d
join Module m on m.DeptId=d.DeptId
join Enrolment e on e.ModuleId=m.ModuleId
join Students s on s.StudentId=e.studentId
group by(d.DeptName)
having Count(*)>40
Also, looking at your join conditions, is the Enrolment table relevant?
select d.DeptName, Count(*) as 'No of Students'
from Department d
join Module m on m.DeptId=d.DeptId
join Students s on s.StudentId=m.moduleId
group by(d.DeptName)
having Count(*)>40
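The difference between counting rows and counting distinct students, mentioned in the answers above, can be tried out on a toy data set. This is a sketch under assumptions: Python's sqlite3 module stands in for MySQL, the table names follow the singular forms used in the queries, the rows are invented, and the threshold is passed as a parameter so the sample can stay small (the real query would use 40).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (DeptID INTEGER PRIMARY KEY, DeptName TEXT);
    CREATE TABLE Module (ModuleID INTEGER PRIMARY KEY, DeptID INTEGER);
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
    CREATE TABLE Enrolment (EID INTEGER PRIMARY KEY, StudentID INTEGER, ModuleID INTEGER);
    INSERT INTO Department VALUES (1, 'Physics'), (2, 'History');
    INSERT INTO Module VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO Student VALUES (100, 'Ann'), (101, 'Bob'), (102, 'Cy'), (103, 'Di');
    -- Ann and Bob take both Physics modules, Cy takes one; Di takes History.
    INSERT INTO Enrolment (StudentID, ModuleID) VALUES
        (100, 10), (100, 11), (101, 10), (101, 11), (102, 10), (103, 20);
""")

def big_departments(min_students):
    """Departments with more than min_students distinct enrolled students."""
    return conn.execute("""
        SELECT d.DeptName,
               COUNT(DISTINCT e.StudentID) AS students,
               COUNT(*)                    AS enrolments
        FROM Department d
        JOIN Module m    ON m.DeptID = d.DeptID
        JOIN Enrolment e ON e.ModuleID = m.ModuleID
        GROUP BY d.DeptID, d.DeptName
        HAVING COUNT(DISTINCT e.StudentID) > ?
        ORDER BY d.DeptName
    """, (min_students,)).fetchall()

print(big_departments(2))  # [('Physics', 3, 5)]
```

Physics has five enrolment rows but only three distinct students, so `COUNT(*)` and `COUNT(DISTINCT e.StudentID)` give different answers as soon as a student takes more than one module in the same department.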

Slow SQL query with LEFT JOIN

I have already read similar questions, but it does not help me.
I have query
SELECT `login`,
`photo`,
`username`,
`user`.`id`,
`name`,
`msg_info`
FROM `user`
LEFT JOIN `friends`
ON `friends`.`child` = `user`.`fb_id`
WHERE `friends`.`parent` = '1111'
ORDER BY `msg_info` DESC
which takes 0.7411 seconds (and sometimes more).
It returns 158 rows in total (OK, I can limit it, but the query is still slow).
The friends and user tables each have more than 200,000 rows.
What can I do to make the query faster?
Thank you!
As the comments pointed out, your left join is really not different than the following inner join query:
SELECT
login,
photo,
username,
u.id,
name,
msg_info
FROM user u
INNER JOIN friends f
ON f.child = u.fb_id
WHERE
f.parent = '1111'
ORDER BY
msg_info DESC;
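The claim that this LEFT JOIN behaves like an INNER JOIN can be checked on a few rows. The sketch below uses Python's sqlite3 module instead of MySQL, a trimmed-down version of the two tables, and made-up data; it also shows what changes if the filter is moved into the ON clause of a LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, fb_id TEXT, login TEXT);
    CREATE TABLE friends (parent TEXT, child TEXT, msg_info INTEGER);
    -- Made-up rows: ann and bob are friends of 1111, cy is a friend of 2222.
    INSERT INTO user VALUES (1, 'fb1', 'ann'), (2, 'fb2', 'bob'), (3, 'fb3', 'cy');
    INSERT INTO friends VALUES ('1111', 'fb1', 5), ('1111', 'fb2', 9), ('2222', 'fb3', 1);
""")

# LEFT JOIN, but the WHERE clause filters on the right-hand table.
left_rows = conn.execute("""
    SELECT u.login, f.msg_info
    FROM user u
    LEFT JOIN friends f ON f.child = u.fb_id
    WHERE f.parent = '1111'
    ORDER BY f.msg_info DESC
""").fetchall()

# Same query written as an INNER JOIN.
inner_rows = conn.execute("""
    SELECT u.login, f.msg_info
    FROM user u
    INNER JOIN friends f ON f.child = u.fb_id
    WHERE f.parent = '1111'
    ORDER BY f.msg_info DESC
""").fetchall()

# Filter moved into ON: a LEFT JOIN now keeps users without a matching friend row.
on_rows = conn.execute("""
    SELECT u.login, f.msg_info
    FROM user u
    LEFT JOIN friends f ON f.child = u.fb_id AND f.parent = '1111'
    ORDER BY u.id
""").fetchall()

print(left_rows)   # [('bob', 9), ('ann', 5)]
print(inner_rows)  # [('bob', 9), ('ann', 5)]
print(on_rows)     # [('ann', 5), ('bob', 9), ('cy', None)]
```

The WHERE condition rejects the NULL-extended rows a LEFT JOIN would otherwise keep, so the first two queries return the same result.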
We can try adding an index to the friends table on (parent, child, name, msg_info, ...). I am not sure which other columns belong to friends, but the basic idea is to create an index on parent, to speed up the WHERE clause, and hopefully take advantage of low cardinality on the parent column. Then, we include the child column to speed up the join. We also include all the other columns in the select clause to let the index cover the other columns we need.
CREATE INDEX idx ON friends (parent, child, name, msg_info, ...);
As @MrVimes suggested, sometimes adding a condition to the JOIN clause can make a big difference:
SELECT login, photo, username, u.id, name, msg_info
FROM user u
INNER JOIN friends f ON f.child = u.fb_id AND f.parent = '1111'
ORDER BY msg_info DESC;
Assuming, of course, all your PK and FKs are properly defined and indexed.

MySQL Compare Result in WHERE clause

I imagine I'm missing something pretty obvious here.
I'm trying to display a list of 'bookings' where the total charges is higher than the total payments for the booking. The charges and payments are stored in separate tables linked using foreign keys.
My query so far is:
SELECT `booking`.`id`,
SUM(`booking_charge`.`amount`) AS `charges`,
SUM(`booking_payment`.`amount`) AS `payments`
FROM `booking`
LEFT JOIN `booking_charge` ON `booking`.`id` = `booking_charge`.`booking_id`
LEFT JOIN `booking_payment` ON `booking`.`id` = `booking_payment`.`booking_id`
WHERE `charges` > `payments` ///this is the incorrect part
GROUP BY `booking`.`id`
My tables look something like this:
Booking (ID)
Booking_Charge (Booking_ID, Amount)
Booking_Payment (Booking_ID, Amount)
MySQL doesn't seem to like comparing the results from these two tables, I'm not sure what I'm missing but I'm sure it's something which would be possible.
Try HAVING instead of WHERE, like this:
SELECT `booking`.`id`,
SUM(`booking_charge`.`amount`) AS `charges`,
SUM(`booking_payment`.`amount`) AS `payments`
FROM `booking`
LEFT JOIN `booking_charge` ON `booking`.`id` = `booking_charge`.`booking_id`
LEFT JOIN `booking_payment` ON `booking`.`id` = `booking_payment`.`booking_id`
GROUP BY `booking`.`id`
HAVING `charges` > `payments`
One of the problems with the query is the cross join between rows from `_charge` and rows from `_payment`. It's a semi-Cartesian join. Each row returned from `_charge` will be matched with each row returned from `_payment`, for a given `booking_id`.
Consider a simple example:
Let's put a single row in `_charge` for $40 for a particular `booking_id`.
And put two rows into `_payment` for $20 each, for the same `booking_id`.
The query would return total charges of $80 (= 2 x $40). If there were instead five rows in `_payment` for $10 each, the query would return total charges of $200 (= 5 x $40).
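The arithmetic of this example can be reproduced directly. The sketch below loads exactly the rows described (one $40 charge, two $20 payments) into an in-memory SQLite database through Python's sqlite3 module, used here as a stand-in for MySQL, and runs the two-LEFT-JOIN query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE booking (id INTEGER PRIMARY KEY);
    CREATE TABLE booking_charge (booking_id INTEGER, amount INTEGER);
    CREATE TABLE booking_payment (booking_id INTEGER, amount INTEGER);
    INSERT INTO booking VALUES (1);
    INSERT INTO booking_charge VALUES (1, 40);             -- one $40 charge
    INSERT INTO booking_payment VALUES (1, 20), (1, 20);   -- two $20 payments
""")

# The charge row is paired with each payment row, so it is summed twice.
row = conn.execute("""
    SELECT b.id, SUM(c.amount) AS charges, SUM(p.amount) AS payments, COUNT(*) AS joined_rows
    FROM booking b
    LEFT JOIN booking_charge c  ON c.booking_id = b.id
    LEFT JOIN booking_payment p ON p.booking_id = b.id
    GROUP BY b.id
""").fetchone()

print(row)  # (1, 80, 40, 2)
```

The booking is fully paid, yet the query reports charges of 80 against payments of 40.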
There are a couple of approaches to addressing that issue. One approach is to do the aggregation in an inline view, and return the total of the charges and payments as a single row for each booking_id, and then join those to the booking table. With at most one row per booking_id, the cross join doesn't give rise to the problem of "duplicating" rows from _charge and/or _payment.
For example:
SELECT b.id
, IFNULL(c.amt,0) AS charges
, IFNULL(p.amt,0) AS payments
FROM booking b
LEFT
JOIN ( SELECT bc.booking_id
, SUM(bc.amount) AS amt
FROM booking_charge bc
GROUP BY bc.booking_id
) c
ON c.booking_id = b.id
LEFT
JOIN ( SELECT bp.booking_id
, SUM(bp.amount) AS amt
FROM booking_payment bp
GROUP BY bp.booking_id
) p
ON p.booking_id = b.id
WHERE IFNULL(c.amt,0) > IFNULL(p.amt,0)
We could make use of a HAVING clause, in place of the WHERE.
The query in this answer is not the only way to get the result, nor is it the most efficient. There are other query patterns that will return an equivalent result.
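As an illustration of the inline-view approach, here is a sketch that runs both patterns side by side. It relies on Python's sqlite3 module rather than MySQL and on invented rows: booking 1 is fully paid in two instalments, booking 2 has no payments, and booking 3 is partly paid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE booking (id INTEGER PRIMARY KEY);
    CREATE TABLE booking_charge (booking_id INTEGER, amount INTEGER);
    CREATE TABLE booking_payment (booking_id INTEGER, amount INTEGER);
    INSERT INTO booking VALUES (1), (2), (3);
    INSERT INTO booking_charge VALUES (1, 40), (2, 50), (3, 100);
    INSERT INTO booking_payment VALUES (1, 20), (1, 20), (3, 30), (3, 30);
""")

# Aggregate each child table to one row per booking before joining.
outstanding = conn.execute("""
    SELECT b.id, IFNULL(c.amt, 0) AS charges, IFNULL(p.amt, 0) AS payments
    FROM booking b
    LEFT JOIN (SELECT booking_id, SUM(amount) AS amt
               FROM booking_charge GROUP BY booking_id) c ON c.booking_id = b.id
    LEFT JOIN (SELECT booking_id, SUM(amount) AS amt
               FROM booking_payment GROUP BY booking_id) p ON p.booking_id = b.id
    WHERE IFNULL(c.amt, 0) > IFNULL(p.amt, 0)
    ORDER BY b.id
""").fetchall()

# For comparison: join first, aggregate afterwards, filter with HAVING.
naive_ids = [r[0] for r in conn.execute("""
    SELECT b.id
    FROM booking b
    LEFT JOIN booking_charge c  ON c.booking_id = b.id
    LEFT JOIN booking_payment p ON p.booking_id = b.id
    GROUP BY b.id
    HAVING SUM(c.amount) > SUM(p.amount)
    ORDER BY b.id
""")]

print(outstanding)  # [(2, 50, 0), (3, 100, 60)]
print(naive_ids)    # [1, 3]
```

The join-then-aggregate version reports the fully paid booking 1 because its charge is doubled, and it misses booking 2 because comparing a sum with NULL is not true; the pre-aggregated version avoids both problems.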