For a homework assignment, I have to write a MySQL query to calculate the GPA of every student in the database table. I broke the problem down into 3 parts: (1) calculating the number of grade points earned by each student, (2) calculating the number of credits taken, and then (3) dividing grade points by credits. Here are the queries I've written for steps 1 and 2:
1 Find credits taken:
SELECT ID, SUM( credits ) AS credits_taken
FROM takes
NATURAL JOIN course
GROUP BY ID
2 Find grade points earned:
SELECT ID, SUM( credits * ( SELECT points FROM gradepoint WHERE letter = grade ) ) AS tot_grade_points
FROM takes NATURAL JOIN course
GROUP BY ID
I manually evaluated each query and they return the correct results. But I can't figure out how to return (tot_grade_points / credits_taken) for each student. Here is what I have tried:
SELECT ID, GPA
FROM student AS S NATURAL JOIN
(SELECT ID,( 'credits_taken' / SUM( credits * ( SELECT points FROM gradepoint WHERE letter = grade ) )) AS GPA
FROM takes AS T1 NATURAL JOIN course
WHERE S.ID = T1.ID
AND EXISTS (
SELECT ID, SUM( credits ) AS 'credits_taken'
FROM takes AS T2 NATURAL JOIN course
WHERE S.ID = T2.ID
GROUP BY ID
)
GROUP BY ID) Z
GROUP BY ID
But this gives me the error "Unknown column 'S.ID' in 'where clause'". From what I've read, you can't reference a table alias from inside a subquery that is used in a join. Does anyone have another way of computing these two subqueries and returning the results bound to the student ID?
The 'takes' table maps student IDs to information about the courses they've taken, most importantly the course_id and grade. The 'course' table contains the 'credits' field, the number of credits the course is worth.
EDIT
Here are the relevant table structures:
takes:
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| ID | varchar(5) | NO | PRI | | |
| course_id | varchar(8) | NO | PRI | | |
| sec_id | varchar(8) | NO | PRI | | |
| semester | varchar(6) | NO | PRI | | |
| year | decimal(4,0) | NO | PRI | 0 | |
| grade | varchar(2) | YES | | NULL | |
+-----------+--------------+------+-----+---------+-------+
course:
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| course_id | varchar(8) | NO | PRI | | |
| title | varchar(50) | YES | | NULL | |
| dept_name | varchar(20) | YES | MUL | NULL | |
| credits | decimal(2,0) | YES | | NULL | |
+-----------+--------------+------+-----+---------+-------+
I would try:
SELECT takes.sec_id,
SUM( course.credits * gradepoint.points ) / SUM( course.credits ) AS GPA
FROM takes
JOIN gradepoint ON takes.grade = gradepoint.letter
JOIN course ON takes.course_id = course.course_id
GROUP BY takes.sec_id
Since your table structure description is incomplete, I had to guess the gradepoint schema, and I assumed that sec_id identifies a student in the takes table. If another column does that, just replace it in both the SELECT and GROUP BY parts of the query. Maybe it is ID, but a column name like that is usually used for primary keys. Or maybe there are no primary keys defined at all, which is bad practice anyway. You would also need to join the student table if you wanted any student info other than the ID, such as the name.
I would also recommend using the JOIN ... ON ... syntax instead of NATURAL JOIN. Not only is it more readable, it also gives you more flexibility: for example, see how gradepoint is joined here instead of being read through a costly dependent subquery.
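As a rough check of the join-based approach, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python (SQLite stands in for MySQL here). It assumes that ID is the student column in takes and that gradepoint has the columns letter and points; the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_id TEXT PRIMARY KEY, credits INTEGER);
-- Assumed schema: the question does not show gradepoint.
CREATE TABLE gradepoint (letter TEXT PRIMARY KEY, points REAL);
CREATE TABLE takes (ID TEXT, course_id TEXT, grade TEXT);

-- Invented sample data.
INSERT INTO course VALUES ('CS-101', 4), ('BIO-101', 3), ('MU-199', 2);
INSERT INTO gradepoint VALUES ('A', 4.0), ('B', 3.0), ('C', 2.0);
INSERT INTO takes VALUES
    ('00128', 'CS-101', 'A'),
    ('00128', 'BIO-101', 'B'),
    ('12345', 'CS-101', 'C'),
    ('12345', 'MU-199', 'A');
""")

# Credit-weighted grade points divided by credits, grouped per student.
GPA_QUERY = """
SELECT t.ID,
       SUM(c.credits * g.points) / SUM(c.credits) AS GPA
FROM takes AS t
JOIN gradepoint AS g ON t.grade = g.letter
JOIN course AS c ON t.course_id = c.course_id
GROUP BY t.ID
ORDER BY t.ID
"""

gpa_by_student = dict(conn.execute(GPA_QUERY).fetchall())
for student_id, gpa in gpa_by_student.items():
    print(student_id, round(gpa, 2))
```

Because gradepoint is inner-joined, rows in takes with a NULL grade drop out of both sums, so ungraded courses do not count toward the credits either.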
I want to delete the rows that have null values in these columns.
How can I delete them?
SELECT employee.Name,
`department`.NUM,
SALARY
FROM employee
LEFT JOIN `department` ON employee.ID = `department`.ID
ORDER BY NUM;
+--------------------+-------+----------+
| Name | NUM | SALARY |
+--------------------+-------+----------+
| Gallegos | NULL | NULL |
| Lara | NULL | NULL |
| Kent | NULL | NULL |
| Lena | NULL | NULL |
| Flores | NULL | NULL |
| Alexandra | NULL | NULL |
| Hodge | 8001 | 973.45 |
+--------------------+-------+----------+
Should be like this
+--------------------+-------+----------+
| Name | NUM | SALARY |
+--------------------+-------+----------+
| | | |
| Hodge | 8001 | 973.45 |
+--------------------+-------+----------+
You are asking to delete, but to me it seems more like removing nulls from the result of a SELECT statement. If so, use:
SELECT employee.Name,
`department`.NUM,
SALARY
FROM employee
LEFT JOIN `department` ON employee.ID = `department`.ID
WHERE (`department`.NUM IS NOT NULL AND SALARY IS NOT NULL)
ORDER BY NUM;
Note: The parentheses are not required, but it's good practice to enclose grouped conditions for better readability.
The above query excludes a row when either column is null, for example when NUM is not null but SALARY is null, and vice versa.
If by deleting you mean that you don't want to see rows with null values in your result, you can use INNER JOIN instead of LEFT JOIN.
Use INNER JOIN when you want to return only records that have a match on both sides, and LEFT JOIN when you need all records from the "left" table, whether or not they have a match in the "right" table.
You can learn more here.
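To make the difference concrete, here is a small sketch using Python's sqlite3 module with an in-memory database (SQLite as a stand-in for MySQL). The rows are invented, and it assumes SALARY lives in the department table, which the question does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ID INTEGER PRIMARY KEY, Name TEXT);
-- Assumption: SALARY is a column of department.
CREATE TABLE department (ID INTEGER, NUM INTEGER, SALARY REAL);

INSERT INTO employee VALUES (1, 'Gallegos'), (2, 'Lara'), (3, 'Hodge');
INSERT INTO department VALUES (3, 8001, 973.45);
""")

# LEFT JOIN keeps every employee; unmatched ones get NULL (None) columns.
left_rows = conn.execute("""
    SELECT e.Name, d.NUM, d.SALARY
    FROM employee AS e
    LEFT JOIN department AS d ON e.ID = d.ID
    ORDER BY e.ID
""").fetchall()

# INNER JOIN keeps only employees that have a matching department row.
inner_rows = conn.execute("""
    SELECT e.Name, d.NUM, d.SALARY
    FROM employee AS e
    INNER JOIN department AS d ON e.ID = d.ID
    ORDER BY e.ID
""").fetchall()

print(left_rows)
print(inner_rows)
```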
I want to get the (last row) average air_temperature from all stations that have the specified county_number.
Therefore, my solution would be something like
SELECT AVG(air_temperature)
FROM weather
WHERE station_id IN (
SELECT station_id
FROM stations
WHERE county_number = 25
)
ORDER
BY id DESC
LIMIT 1;
Clearly, this does not give the correct result: AVG() collapses all matching rows into one, so it averages every air_temperature ever recorded for those stations, and ORDER BY ... LIMIT 1 has nothing left to pick from.
Back to the problem: I want the average air_temperature over the last inserted row of each station that has the specified county_number.
Table weather
+------------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| station_id | char(20) | YES | MUL | NULL | |
| timestamp | timestamp | YES | | NULL | |
| air_temperature | float | YES | | NULL | |
+------------------+-------------+------+-----+---------+----------------+
Table stations
+---------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------+-------+
| station_id | char(20) | NO | PRI | NULL | |
| county_number | int(10) | YES | | NULL | |
+---------------+-------------+------+-----+---------+-------+
Tables are minimized
I would recommend doing this with a join and some filtering:
select avg(w.air_temperature)
from weather w join
stations s
on w.station_id = s.station_id
where s.county_number = 25 and
w.timestamp = (select max(w2.timestamp) from weather w2 where w2.station_id = w.station_id)
You can get the last inserted row by checking the max(timestamp):
SELECT
AVG(w.air_temperature)
FROM weather w
INNER JOIN (
SELECT station_id, max(timestamp) maxtimestamp FROM weather GROUP BY station_id
) t
ON w.station_id = t.station_id AND w.timestamp = t.maxtimestamp
WHERE
w.station_id IN (SELECT station_id FROM stations WHERE county_number = 25)
UPDATE: I just noticed that your timestamp column is nullable and you are talking about the "last inserted row". That is the one with the greatest ID. Hence:
As of MySQL 8 you can use window functions in order to read the table only once:
select avg(air_temperature)
from
(
select air_temperature, id, max(id) over (partition by station_id) as max_id
from weather
where station_id in (select station_id from stations where county_number = 25)
) analyzed
where id = max_id;
In older versions you must read the table twice:
select avg(air_temperature)
from weather
where (station_id, id) in
(
select station_id, max(id)
from weather
where station_id in (select station_id from stations where county_number = 25)
group by station_id
);
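The "latest row per station, then average" idea can be tried out with a short sketch like the one below. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, joins on the greatest id per station (one more way to write the same filter), and runs on invented sample readings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stations (station_id TEXT PRIMARY KEY, county_number INTEGER);
CREATE TABLE weather (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    station_id TEXT,
    air_temperature REAL
);

-- Invented sample data: stations A and B are in county 25, C is not.
INSERT INTO stations VALUES ('A', 25), ('B', 25), ('C', 7);
INSERT INTO weather (station_id, air_temperature) VALUES
    ('A', 10.0), ('B', 20.0), ('C', 99.0),
    ('A', 12.0), ('B', 16.0);
""")

# The derived table picks the greatest id per station in the county;
# the outer query averages only those rows.
LATEST_AVG_QUERY = """
SELECT AVG(w.air_temperature)
FROM weather AS w
JOIN (
    SELECT station_id, MAX(id) AS max_id
    FROM weather
    WHERE station_id IN (
        SELECT station_id FROM stations WHERE county_number = ?
    )
    GROUP BY station_id
) AS latest ON w.id = latest.max_id
"""

(avg_latest,) = conn.execute(LATEST_AVG_QUERY, (25,)).fetchone()
print(avg_latest)  # latest readings are 12.0 (A) and 16.0 (B)
```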
Working on a project with students/grades/etc, I need to update the top 3 students every once in a while. I came up with the query below. However, I am having trouble getting their rank/order. I know how to do that in a simple query, but in a more complex one, it is not working.
I am getting all of the other columns correctly, and, with all the methods I tried to get the order by, I sometimes got 0 (like the current state of the code), sometimes values that are just wrong (1, 11, 10), etc.
NOTE: I have checked various questions (including the question below), but I just couldn't figure out how to place them in my query.
What is the best way to generate ranks in MYSQL?
Summary:
GOAL:
- Get the sum of each student's marks from marks and divide it by the number of entries in the table (again marks). Students are from a given grade.
- Use sum(mark) to rank these students.
- Get the top three.
- Place the top three students from that grade in the TopStudents table, with their average marks (as sum) and their id's.
TABLES:
Students table contains info about student including id:
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | int (20) unsigned | NO | PRI | NULL | auto_increment |
| name |varchar(20) unsigned | NO | | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
Marks Table has marks of each student on each exam
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id |int (20) unsigned | NO | PRI | NULL | auto_increment |
| idStudent |int (20) unsigned | NO | FOR | NULL | |
| mark |tinyInt (3) unsigned | NO | | NULL | |
| idExam |int (20) unsigned | NO | FOR | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
Grade Table has grade id and name:
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | int (20) unsigned | NO | PRI | NULL | auto_increment |
| name |varchar(20) unsigned | NO | | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
Class Table has the classes for each grade. It references the Grade table:
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | int (20) unsigned | NO | PRI | NULL | auto_increment |
| name |varchar(20) unsigned | NO | | NULL | |
| idGrade | int (20) unsigned | NO | FOR | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
And finally, the infamous TopStudents Table:
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| id | int (20) unsigned | NO | PRI | NULL | auto_increment |
| idStudent | int (20) unsigned | NO | FOR | NULL | |
| sumMarks | int (20) unsigned | NO | | NULL | |
| rank |tinyInt (1) unsigned | NO | | NULL | |
| date |date unsigned | NO | | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
ATTEMPTS:
Attempt 1: ERROR: all ranks are 0
INSERT INTO topStudents(`date`, idStudent, `sum`, `order`)
SELECT
'2018-10-10' AS DATE,
student.id AS idStudent,
AVG(marks.mark)
@n = @n + 1 AS `order`
FROM
marks
INNER JOIN student ON student.id = marks.idStudent
INNER JOIN class ON class.id = marks.idClass
INNER JOIN grade ON class.idGrade = grade.id
WHERE
grade.id = 2
GROUP BY
marks.idStudent
ORDER BY
SUM(mark)
DESC
LIMIT 3
Attempt 2: ranks returned: 1, 11, 10
SET @n := 0;
INSERT INTO topStudents(`date`, idStudent, `sum`, `rank`)
SELECT
'2018-10-10' AS DATE,
tbl.idStudent AS idStudent,
AVG(tbl.mark) AS `sum`,
rnk AS `rank`
FROM (SELECT student.id AS idStudent, SUM(mark) AS mark FROM
marks
INNER JOIN student ON student.id = marks.idStudent
INNER JOIN class ON class.id = marks.idClass
INNER JOIN grade ON class.idGrade = grade.id
WHERE
grade.id = 2
GROUP BY
marks.idStudent
ORDER BY
SUM(mark)
DESC
LIMIT 3) AS tbl, (SELECT @n = @n + 1) AS rnk
In more recent versions of MySQL, you need to use a derived table for the ordering, before assigning the ranks:
INSERT INTO topStudents (`date`, idStudent, `sum`, `order`)
SELECT date, idStudent, `sum`, (@n := @n + 1) AS `order`
FROM (SELECT '2018-10-10' AS DATE, s.id AS idStudent,
SUM(m.mark) / (SELECT COUNT(*) FROM marks m2 WHERE m2.idStudent = m.idStudent) AS `sum`
FROM marks m JOIN
student s
ON s.id = m.idStudent JOIN
class c
ON c.id = m.idClass JOIN
grade g
ON c.idGrade = g.id
WHERE g.id = 2
GROUP BY m.idStudent
ORDER BY SUM(mark) DESC
LIMIT 3
) sm CROSS JOIN
(SELECT @n := 0) params;
I am almost certain that the calculation for sum is incorrect, and that you really intend avg(mark). However, this is the logic you have in your question.
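The underlying idea (sort and limit first, number the rows afterwards) can be sketched outside MySQL as well. The example below is only an illustration under stated assumptions: it uses Python's sqlite3 module with an in-memory database, ranks by AVG(mark), numbers the already-sorted rows in Python rather than with a user variable, and uses an invented, simplified topStudents table (idStudent, avgMark, rank_no) with invented marks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE marks (idStudent INTEGER, mark INTEGER);
-- Simplified, hypothetical target table for this sketch.
CREATE TABLE topStudents (idStudent INTEGER, avgMark REAL, rank_no INTEGER);

-- Invented sample marks for four students.
INSERT INTO marks VALUES
    (1, 90), (1, 80),
    (2, 70), (2, 60),
    (3, 100), (3, 95),
    (4, 50), (4, 40);
""")

# Step 1: aggregate, sort and cut to the top three.
top_three = conn.execute("""
    SELECT idStudent, AVG(mark) AS avgMark
    FROM marks
    GROUP BY idStudent
    ORDER BY avgMark DESC
    LIMIT 3
""").fetchall()

# Step 2: number the rows only after they are in their final order.
ranked = [
    (student_id, avg_mark, position)
    for position, (student_id, avg_mark) in enumerate(top_three, start=1)
]
conn.executemany("INSERT INTO topStudents VALUES (?, ?, ?)", ranked)

result = conn.execute(
    "SELECT idStudent, avgMark, rank_no FROM topStudents ORDER BY rank_no"
).fetchall()
print(result)
```

In MySQL 8, ROW_NUMBER() OVER (ORDER BY ...) is another way to number the sorted rows without user variables.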
I'm trying to modify an existing query to work with a conditional JOIN.
I have three tables: invoices, companies and clients.
The application logic is this: first there's a bill of quantities which gets created for each order. Then, I create an invoice for that bill of quantities. Sometimes the client that makes the order is different from the client that gets billed (example: client A from company A+ is making an order, but client B from company B+ is getting billed for that order). For this scenario I have the invoice_as_company_id and invoice_as_client_id columns.
Right now I have a query that gets all the invoices and it looks like this:
SELECT i.*, co.name AS company, cl.name AS client
FROM invoices i
LEFT JOIN companies co ON i.invoice_company_id = co.company_id
LEFT JOIN clients cl ON i.invoice_client_id = cl.client_id
ORDER BY i.invoice_date DESC
LIMIT 10
So I would like to modify this query like this:
if invoice_as_company_id is null, then use the
invoice_company_id field in the companies table join
if invoice_as_client_id is null, then use the invoice_client_id field
in the clients table join
The database tables are below.
Invoices
+-----------------------+--------------+
| invoice_id | int(10) |
| invoice_date | date |
| invoice_number | int(11) |
| invoice_amount | decimal(5,2) |
| invoice_company_id | int(11) |
| invoice_client_id | int(11) |
| invoice_as_company_id | int(11) |
| invoice_as_client_id | int(11) |
| date_added | int(11) |
+-----------------------+--------------+
Companies
+--------------+--------------+
| company_id | int(10) |
| company_name | varchar(255) |
| date_added | int(11) |
+--------------+--------------+
Clients
+-------------+--------------+
| client_id | int(10) |
| client_name | varchar(255) |
| date_added | int(11) |
+-------------+--------------+
LEFT JOIN companies co
ON co.company_id = IFNULL(i.invoice_as_company_id, i.invoice_company_id)
And if you need a more specific test (for example, treating 0 the same as NULL):
LEFT JOIN companies co
ON co.company_id = IF(i.invoice_as_company_id IS NULL OR i.invoice_as_company_id = 0, i.invoice_company_id, i.invoice_as_company_id)
The same applies to your second case, the clients join. Performance may suffer on large tables...
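Here is a minimal sketch of the fallback join, run with Python's sqlite3 module against an in-memory database (SQLite as a stand-in for MySQL). It uses COALESCE with NULLIF, which both engines support, in place of IFNULL/IF; the table is cut down to the relevant columns and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (company_id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE invoices (
    invoice_id INTEGER PRIMARY KEY,
    invoice_company_id INTEGER,
    invoice_as_company_id INTEGER
);

-- Invented sample data.
INSERT INTO companies VALUES (1, 'A+'), (2, 'B+');
INSERT INTO invoices VALUES
    (10, 1, NULL),  -- no override: billed to the ordering company
    (11, 1, 2),     -- override: billed to company 2
    (12, 2, 0);     -- 0 is treated like NULL in this sketch
""")

# NULLIF turns 0 into NULL, then COALESCE falls back to invoice_company_id.
rows = conn.execute("""
    SELECT i.invoice_id, co.company_name
    FROM invoices AS i
    LEFT JOIN companies AS co
      ON co.company_id = COALESCE(NULLIF(i.invoice_as_company_id, 0),
                                  i.invoice_company_id)
    ORDER BY i.invoice_id
""").fetchall()
print(rows)
```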
Suppose, we have a table:
SELECT * FROM users_to_courses;
+---------+-----------+------------+---------+
| user_id | course_id | pass_date | file_id |
+---------+-----------+------------+---------+
| 1 | 1 | 2014-01-01 | 1 |
| 1 | 1 | 2014-01-01 | 2 |
| 1 | 1 | 2014-02-01 | 3 |
| 1 | 1 | 2014-02-01 | 4 |
+---------+-----------+------------+---------+
Schema:
CREATE TABLE `users_to_courses` (
`user_id` int(10) unsigned NOT NULL,
`course_id` int(10) unsigned NOT NULL,
`pass_date` date NOT NULL,
`file_id` int(10) unsigned NOT NULL,
PRIMARY KEY (`user_id`, `course_id`, `pass_date`, `file_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
One user can pass a certain course multiple times, and each time several certificates can be generated. user_id and course_id link to the users and courses tables; file_id links to the files table, where info about the certificate files is stored.
In our example, user #1 has passed course #1 twice, and each time 2 certificates were issued: 4 records in total.
How can I get this data: for user_id=1, for every course, get MAX(pass_date) and all the files attached to that date? So far I could only get this:
SELECT
users_to_courses.course_id,
MAX(users_to_courses.pass_date) AS max_passed_date,
GROUP_CONCAT(users_to_courses.file_id SEPARATOR ',') AS files
FROM
users_to_courses
WHERE
users_to_courses.user_id=1
GROUP BY
users_to_courses.course_id;
+-----------+-----------------+---------+
| course_id | max_passed_date | files |
+-----------+-----------------+---------+
| 1 | 2014-02-01 | 1,2,3,4 |
+-----------+-----------------+---------+
I need this:
+-----------+-----------------+---------+
| course_id | max_passed_date | files |
+-----------+-----------------+---------+
| 1 | 2014-02-01 | 3,4 |
+-----------+-----------------+---------+
I think, this requires a compound GROUP BY.
Try the query below. The derived table first gets the max date for each user and course, and the outer query then joins only the records that match that date. You can use the same query for more than one user by removing the user_id filter and adding utc.user_id to the GROUP BY.
SELECT
utc.course_id,
mdt.maxDate AS max_passed_date,
GROUP_CONCAT(utc.file_id SEPARATOR ',') AS files
FROM
users_to_courses utc
join
(SELECT MAX(pass_date) AS maxDate, course_id cId, user_id uId
FROM users_to_courses GROUP BY user_id, course_id) AS mdt
ON
mdt.uId = utc.user_id
AND
mdt.cId = utc.course_id
AND
mdt.maxDate = utc.pass_date
WHERE
utc.user_id=1
GROUP BY
utc.course_id;
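To check the result on the sample rows from the question, here is a small sketch using Python's sqlite3 module with an in-memory database (SQLite as a stand-in for MySQL). It is a variant of the query above that filters with a correlated subquery instead of a derived table, and it uses SQLite's GROUP_CONCAT(expr, ',') form, since SQLite does not accept MySQL's SEPARATOR keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users_to_courses (
    user_id INTEGER,
    course_id INTEGER,
    pass_date TEXT,
    file_id INTEGER,
    PRIMARY KEY (user_id, course_id, pass_date, file_id)
);

-- The four sample rows from the question.
INSERT INTO users_to_courses VALUES
    (1, 1, '2014-01-01', 1),
    (1, 1, '2014-01-01', 2),
    (1, 1, '2014-02-01', 3),
    (1, 1, '2014-02-01', 4);
""")

# Keep only rows whose pass_date is the latest for that user and course,
# then concatenate the file ids of the surviving rows.
rows = conn.execute("""
    SELECT utc.course_id,
           utc.pass_date AS max_passed_date,
           GROUP_CONCAT(utc.file_id, ',') AS files
    FROM users_to_courses AS utc
    WHERE utc.user_id = ?
      AND utc.pass_date = (
          SELECT MAX(u2.pass_date)
          FROM users_to_courses AS u2
          WHERE u2.user_id = utc.user_id
            AND u2.course_id = utc.course_id
      )
    GROUP BY utc.course_id, utc.pass_date
""", (1,)).fetchall()

course_id, max_passed_date, files = rows[0]
# Concatenation order is not guaranteed, so sort before comparing.
file_ids = sorted(int(f) for f in files.split(","))
print(course_id, max_passed_date, file_ids)
```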