SQL Incorrect SUMS from multiple JOINS - mysql

I'm trying to sum multiple tables using Joins and Sums in MySQL and not having much success.
My Tables (Unnecessary Columns Removed)
Students
idStudent studentname studentyear
1 foobar 11
2 barfoo 11
3 thing 8
Athletics_Results
idResult idStudent points
1 1 14
2 1 11
3 3 7
4 2 9
Team_Results
idTeamResults year points
1 11 9
2 8 8
3 7 14
So let me explain about the tables, because I admit they're poorly named and designed.
Students holds the basic info about each student, including their year and name. Each student has a unique ID.
Athletics_Results stores the results from athletics events. The idStudent column is a foreign key that references idStudent in the Students table. So student foobar (idStudent 1) has scored 14 and 11 points in the example.
Team_Results stores results from events that more than one student took part in. It just stores the year group and points.
The Aim
I want to be able to produce a sum of points for each year - combined from both athletics_results and team_results. EG:
year points
7 14 <-- No results in a_r, just 14 points in t_r
8 15 <-- 7 points in a_r (idResult 3) and 8 in t_r
11 43 <-- 14, 11, 9 points in a_r and 9 in t_r
What I've tried
For testing purposes, I've not tried combining the a_r scores and t_r scores yet but left them as two columns so I can see what's going on.
The first query I tried:
SELECT students.studentyear as syear, SUM(athletics_results.points) as score, SUM(team_results.points) as team_score
FROM students
JOIN team_results ON students.studentyear = team_results.year
JOIN athletics_results ON students.idStudent = athletics_results.idStudent
GROUP BY syear;
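This gave different rows for each year (as desired) but had incorrect SUMS. I learnt this was because the two joins multiply rows: each team_results row is repeated once per matching athletics_results row, so the sums are inflated unless the tables are aggregated before joining.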
I then created this code:
SELECT studentyear as sYear, teamPoints, AthleticsPoints
FROM students st
JOIN (SELECT year, SUM(tm.points) as teamPoints
FROM team_results tm
GROUP BY year) tr ON st.studentyear = tr.year
JOIN (SELECT idStudent, SUM(atr.points) as AthleticsPoints
FROM athletics_results atr
) ar ON st.idStudent = ar.idStudent
Which gave correct SUMS but only returned one year group row (e.g the scores for Year 11).
EDIT - SQLFiddle here: http://sqlfiddle.com/#!9/dbc16/. This is with my actual test data which is a bigger sample than the data I posted here.

http://sqlfiddle.com/#!9/ad111/7
SELECT tr.`year`, COALESCE(tr.points,0)+COALESCE(SUM(ar.points),0)
FROM Team_Results tr
LEFT JOIN Students s
ON tr.`year`=s.studentyear
LEFT JOIN Athletics_Results ar
ON s.idStudent = ar.idStudent
GROUP BY tr.year
Following your comment and the fiddle you provided, check http://sqlfiddle.com/#!9/dbc16/3
SELECT tr.`year`, COALESCE(tr.points,0)+COALESCE(SUM(ar.points),0)
FROM (
SELECT `year`, SUM(points) as points
FROM Team_Results
GROUP BY `year`) tr
LEFT JOIN Students s
ON tr.`year`=s.studentyear
LEFT JOIN Athletics_Results ar
ON s.idStudent = ar.idStudent
GROUP BY tr.year
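To see why the original double join inflates the sums and why pre-aggregating helps, here is a minimal sketch on the sample data from the question. It runs in SQLite through Python's sqlite3 module purely as a stand-in for MySQL (an assumption for the sake of a runnable example); the column aliases are also mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (idStudent INTEGER PRIMARY KEY, studentname TEXT, studentyear INTEGER);
CREATE TABLE Athletics_Results (idResult INTEGER PRIMARY KEY, idStudent INTEGER, points INTEGER);
CREATE TABLE Team_Results (idTeamResults INTEGER PRIMARY KEY, year INTEGER, points INTEGER);
INSERT INTO Students VALUES (1,'foobar',11),(2,'barfoo',11),(3,'thing',8);
INSERT INTO Athletics_Results VALUES (1,1,14),(2,1,11),(3,3,7),(4,2,9);
INSERT INTO Team_Results VALUES (1,11,9),(2,8,8),(3,7,14);
""")

# Naive double join: every team row is repeated once per matching athletics row,
# and year 7 disappears because no student is in that year.
naive = conn.execute("""
    SELECT s.studentyear, SUM(ar.points) AS score, SUM(tr.points) AS team_score
    FROM Students s
    JOIN Team_Results tr ON s.studentyear = tr.year
    JOIN Athletics_Results ar ON s.idStudent = ar.idStudent
    GROUP BY s.studentyear
    ORDER BY s.studentyear
""").fetchall()

# Collapse Team_Results to one row per year first, then LEFT JOIN outwards from it.
fixed = conn.execute("""
    SELECT tr.year, tr.points + COALESCE(SUM(ar.points), 0) AS total
    FROM (SELECT year, SUM(points) AS points FROM Team_Results GROUP BY year) tr
    LEFT JOIN Students s ON tr.year = s.studentyear
    LEFT JOIN Athletics_Results ar ON s.idStudent = ar.idStudent
    GROUP BY tr.year, tr.points
    ORDER BY tr.year
""").fetchall()

print(naive)  # [(8, 7, 8), (11, 34, 27)]  <- year 11 team score is 27 instead of 9
print(fixed)  # [(7, 14), (8, 15), (11, 43)]
```

Note that this approach only lists years that appear in Team_Results; a year with athletics points but no team result would be missing.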

Try this http://sqlfiddle.com/#!9/2bfb1/1/0
SELECT
year, SUM(points)
FROM
((SELECT
a.year, SUM(b.points) AS points
FROM
student a
JOIN at_result b ON b.student_id = a.id
GROUP BY a.year) UNION ALL (SELECT
a.year, SUM(a.points) AS points
FROM
t_result a
GROUP BY a.year)) c
GROUP BY year;
On your data I get:
year points
7 14
8 15
11 43
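A small sketch of the union approach on the question's own table names, again using SQLite via Python's sqlite3 as an assumed stand-in for MySQL. The helper name totals is hypothetical. It also shows why UNION ALL is the safer operator here: plain UNION removes duplicate rows, so a year whose athletics total happens to equal its team total would lose one of the two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (idStudent INTEGER PRIMARY KEY, studentname TEXT, studentyear INTEGER);
CREATE TABLE Athletics_Results (idResult INTEGER PRIMARY KEY, idStudent INTEGER, points INTEGER);
CREATE TABLE Team_Results (idTeamResults INTEGER PRIMARY KEY, year INTEGER, points INTEGER);
INSERT INTO Students VALUES (1,'foobar',11),(2,'barfoo',11),(3,'thing',8);
INSERT INTO Athletics_Results VALUES (1,1,14),(2,1,11),(3,3,7),(4,2,9);
INSERT INTO Team_Results VALUES (1,11,9),(2,8,8),(3,7,14);
""")

def totals(set_op):
    """Sum per-year points from both sources, combined with UNION or UNION ALL."""
    return conn.execute(f"""
        SELECT year, SUM(points) FROM (
            SELECT s.studentyear AS year, SUM(ar.points) AS points
            FROM Students s
            JOIN Athletics_Results ar ON ar.idStudent = s.idStudent
            GROUP BY s.studentyear
            {set_op}
            SELECT year, SUM(points) FROM Team_Results GROUP BY year
        ) AS c
        GROUP BY year
        ORDER BY year
    """).fetchall()

sample = totals("UNION ALL")
print(sample)  # [(7, 14), (8, 15), (11, 43)]

# Make the year 8 team total equal to the year 8 athletics total (7 points each).
conn.execute("UPDATE Team_Results SET points = 7 WHERE year = 8")
with_union = totals("UNION")          # the two identical (8, 7) rows collapse into one
with_union_all = totals("UNION ALL")  # both rows are kept
print(with_union)      # [(7, 14), (8, 7), (11, 43)]
print(with_union_all)  # [(7, 14), (8, 14), (11, 43)]
```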

Can be done in multiple ways. My first thought is:
SELECT idStudent, year, SUM(points) AS totalPoints FROM (
SELECT a.idStudent, c.year, b.points+c.points AS points
FROM students a
INNER JOIN Athletics_Results b ON a.idStudent=b.idStudent
INNER JOIN Team_Results c ON a.studentyear=c.year) d
GROUP BY idStudent,year

Related

SQL: How to join two tables and extract the data by timestamp?

I'm using mysql. I have two tables, one is about movie type, and the other is about movie rating with timestamps. I want to join these two tables together with movie id to count the average rating for each type of movie. I'm trying to extract only the movie types which have at least 10 ratings per film and the ratings made in December, and order by the highest to lowest average rating.
Table 'types'
movieId type
1 Drama
2 Adventure
3 Comedy
... ...
Table 'ratings'
movieId rating timestamp
1 1 851786086
2 1.5 1114306148
1 2 1228946388
3 2 850723898
1 2.5 1167422234
2 2.5 1291654669
1 3 851345204
2 3 944978286
3 3 965088579
3 3 1012598088
1 3.5 1291598726
1 4 1291779829
1 4 850021197
2 4 945362514
1 4.5 1072836909
1 5 881166397
1 5 944892273
2 5 1012598088
... ... ...
Expected result (at least 10 ratings, counting only ratings given in December):
type Avg_Rating
Drama 3.45
I tried the query below, but it fails to execute (the original table has around 10 thousand rows).
Where should I adjust my query?
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
WHERE R.timestamp LIKE (
SELECT FROM_UNIXTIME(R.timestamp,'%M') AS Month FROM ratings
GROUP BY Month
HAVING Month = 'December')
GROUP BY T.type
HAVING COUNT(R.rating) >=10
ORDER BY AVG(R.rating) DESC;
I see two problems:
timestamp LIKE - what's that supposed to do?
and
an inner query with GROUP BY but without any aggregation. Perhaps you meant WHERE? Either way, you don't need it at all: just check for December directly on the timestamp, without LIKE and without a subquery.
SELECT DISTINCT T.type, AVG(R.rating) FROM
types AS T INNER JOIN ratings AS R 
ON T.movieId = R.movieId
WHERE FROM_UNIXTIME(R.timestamp,'%M') = 'December'
GROUP BY T.type
HAVING COUNT(R.rating) >=10
ORDER BY AVG(R.rating) DESC;
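As a rough check of the "filter on the month directly" idea, here is a sketch that loads the sample rows from the question into SQLite through Python's sqlite3 (an assumed stand-in for MySQL). SQLite has no FROM_UNIXTIME, so strftime('%m', ..., 'unixepoch') plays that role here and months are evaluated in UTC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE types (movieId INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE ratings (movieId INTEGER, rating REAL, timestamp INTEGER);
INSERT INTO types VALUES (1,'Drama'),(2,'Adventure'),(3,'Comedy');
INSERT INTO ratings VALUES
  (1,1,851786086),(2,1.5,1114306148),(1,2,1228946388),(3,2,850723898),
  (1,2.5,1167422234),(2,2.5,1291654669),(1,3,851345204),(2,3,944978286),
  (3,3,965088579),(3,3,1012598088),(1,3.5,1291598726),(1,4,1291779829),
  (1,4,850021197),(2,4,945362514),(1,4.5,1072836909),(1,5,881166397),
  (1,5,944892273),(2,5,1012598088);
""")

# Keep only December ratings (WHERE), then keep only types with at least
# 10 such ratings (HAVING), ordered from highest to lowest average.
rows = conn.execute("""
    SELECT t.type, AVG(r.rating) AS avg_rating
    FROM types t
    JOIN ratings r ON t.movieId = r.movieId
    WHERE strftime('%m', r.timestamp, 'unixepoch') = '12'
    GROUP BY t.type
    HAVING COUNT(r.rating) >= 10
    ORDER BY avg_rating DESC
""").fetchall()

for movie_type, avg_rating in rows:
    print(movie_type, round(avg_rating, 2))  # Drama 3.45
```

On this sample only Drama has ten December ratings, which matches the expected result in the question.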
You can try the following query.
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
GROUP BY T.type
HAVING
COUNT(R.rating) >= 10 -- have 10 or more rating records
AND SUM(MONTH(FROM_UNIXTIME(R.timestamp)) = 12) > 0 -- have at least one rating in December
ORDER BY AVG(R.rating) DESC;

compare mysql numeric values group_concat of two columns with join

I have 3 tables
users
user_id nationality
1 Egyptian
2 Palestinian
3 French
centers
id center_name
1 q
12 y
5 x
23 z
centers_users
student_id center_id
1 12
2 5
3 5
1 23
2 12
What I expect:
nationality center_names count_of_users_from_this_country
Egyptian y,z 10
Palestinian x,y 33
French x,q 7
I have tried many MySQL queries but I cannot get the result I want.
The last query I ran:
SELECT * from (SELECT (LENGTH(GROUP_CONCAT(DISTINCT user_id))-LENGTH(REPLACE(GROUP_CONCAT(DISTINCT user_id), ',', ''))) as ss,GROUP_CONCAT( DISTINCT user_id) ,nationality from user where user_id in(SELECT student_id FROM `centers_users`) GROUP by nationality)a
But it only returns the count per nationality.
When I join with centers I get duplicated rows, because I cannot put an "ON" condition on the GROUP_CONCAT result.
How can I implement this?
Thanks.
I think you want to join the tables and aggregate:
select u.nationality,
group_concat(distinct c.center_name) as center_names,
count(distinct u.user_id) as users_from_this_country
from users u join
centers_users uc
on u.user_id = uc.student_id join
centers c
on c.id = uc.center_id
group by u.nationality;
You may be able to use count(*) for users_from_this_country. It depends on how you want to count a user who is in multiple centers: count(*) counts that user once per center, while count(distinct user_id) counts them once.
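Here is a minimal sketch of the join-and-aggregate idea on the three sample tables, run in SQLite through Python's sqlite3 as an assumed stand-in for MySQL. The extra enrolments column is my own addition, included only to show how COUNT(*) differs from COUNT(DISTINCT ...). The order of names inside GROUP_CONCAT is not guaranteed, so the sketch turns them into sets before comparing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, nationality TEXT);
CREATE TABLE centers (id INTEGER PRIMARY KEY, center_name TEXT);
CREATE TABLE centers_users (student_id INTEGER, center_id INTEGER);
INSERT INTO users VALUES (1,'Egyptian'),(2,'Palestinian'),(3,'French');
INSERT INTO centers VALUES (1,'q'),(12,'y'),(5,'x'),(23,'z');
INSERT INTO centers_users VALUES (1,12),(2,5),(3,5),(1,23),(2,12);
""")

rows = conn.execute("""
    SELECT u.nationality,
           GROUP_CONCAT(DISTINCT c.center_name) AS center_names,
           COUNT(DISTINCT u.user_id)            AS users_from_this_country,
           COUNT(*)                             AS enrolments
    FROM users u
    JOIN centers_users uc ON u.user_id = uc.student_id
    JOIN centers c        ON c.id = uc.center_id
    GROUP BY u.nationality
""").fetchall()

# Map nationality -> (set of center names, distinct users, joined rows).
summary = {
    nationality: (set(names.split(",")), n_users, n_rows)
    for nationality, names, n_users, n_rows in rows
}
print(summary["Egyptian"])  # ({'y', 'z'}, 1, 2) with the set printed in arbitrary order
```

With the sample rows each nationality has a single user, so the distinct count is 1 everywhere while COUNT(*) reflects the number of center memberships.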

MySQL: Select Most Recent Response By Max Date from JOINing Tables

I have a survey table that compiles non-unique records whenever a person responds to a survey, so they can be in there multiple times -- I'm trying to figure out how to bring back just the row with the most recent date.
Here's the person table:
ID First Last Employer
1 Jerry Seinfeld NBC
2 Elaine Benes Pendant Publishing
3 George Costanza Kruger Industrial Smoothing
4 Cosmo Kramer Kramerica Industries
And here's the survey table:
ID Survey Response Date
1 9 Yes 4/14/15
1 9 No 8/9/15
2 9 No 10/13/15
3 9 No 6/19/15
3 9 Yes 2/3/15
3 8 IQ 7/27/15
4 9 Yes 5/12/15
If the IDs duplicate and the survey number is 9, I only want returned the row with the most recent date.
Here's what I've been trying:
SELECT p.id, p.first, p.last, p.employer, s.response, s.date
FROM person p
LEFT JOIN
(SELECT s1.id, s1.survey, s1.response, s1.date, MAX(s1.date)
FROM survey s1
WHERE s1.survey = 9
GROUP BY s1.id) AS s ON s.id = p.id
ORDER BY s.date;
But whenever I do that the max date and the actual date for the row don't match sometimes -- so the MAX function is working correctly but only with regards to the ID, not with regards to giving me that row. But I have to group on the ID in order to properly match the two tables and that's where I'm getting stuck.
And when I try something like this I get the Invalid use of group function error:
SELECT p.id, p.first, p.last, p.employer, s.response, s.date
FROM person p
LEFT JOIN
(SELECT s1.id, s1.survey, s1.response, s1.date, MAX(s1.date)
FROM survey s1
WHERE s1.survey = 9 AND MAX(s1.date) = s1.date
GROUP BY s1.id) AS s ON s.id = p.id
ORDER BY s.date;
My desired result looks like this:
ID First Last Employer Response Date
4 Cosmo Kramer Kramerica Industries Yes 5/12/15
3 George Costanza Kruger Industrial Smoothing No 6/19/15
1 Jerry Seinfeld NBC No 8/9/15
2 Elaine Benes Pendant Publishing No 10/13/15
Here is a tested query:
select `person`.*,`survey`.`Response`,`survey`.`Date`
from `survey`
inner join
(
SELECT ID,max(`Date`) as `d`
FROM `survey`
WHERE `Survey`=9
group by ID
) as `t` on `t`.`ID` = `survey`.`ID` and
`t`.`d` = `survey`.`Date`
inner join `person` on `person`.`ID` = `survey`.`ID`
where `survey`.`Survey` = 9
order by `survey`.`Date`
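A sketch of the "join back to the per-ID maximum" pattern on the sample data, using SQLite through Python's sqlite3 as an assumed stand-in for MySQL. I store the dates as ISO strings (2015-04-14 and so on) so that MAX compares them chronologically; that storage format is my assumption, since the question does not say what type the Date column has. Note that in the sample data George's most recent survey 9 row is the 6/19/15 one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (ID INTEGER PRIMARY KEY, First TEXT, Last TEXT, Employer TEXT);
CREATE TABLE survey (ID INTEGER, Survey INTEGER, Response TEXT, Date TEXT);
INSERT INTO person VALUES
  (1,'Jerry','Seinfeld','NBC'),
  (2,'Elaine','Benes','Pendant Publishing'),
  (3,'George','Costanza','Kruger Industrial Smoothing'),
  (4,'Cosmo','Kramer','Kramerica Industries');
INSERT INTO survey VALUES
  (1,9,'Yes','2015-04-14'),(1,9,'No','2015-08-09'),(2,9,'No','2015-10-13'),
  (3,9,'No','2015-06-19'),(3,9,'Yes','2015-02-03'),(3,8,'IQ','2015-07-27'),
  (4,9,'Yes','2015-05-12');
""")

# Step 1 (derived table t): latest survey-9 date per person.
# Step 2: join back to survey to fetch the whole row carrying that date.
rows = conn.execute("""
    SELECT p.ID, p.First, p.Last, s.Response, s.Date
    FROM survey s
    JOIN (SELECT sv.ID, MAX(sv.Date) AS d
          FROM survey sv
          WHERE sv.Survey = 9
          GROUP BY sv.ID) t
      ON t.ID = s.ID AND t.d = s.Date
    JOIN person p ON p.ID = s.ID
    WHERE s.Survey = 9
    ORDER BY s.Date
""").fetchall()

for row in rows:
    print(row)
# (4, 'Cosmo', 'Kramer', 'Yes', '2015-05-12')
# (3, 'George', 'Costanza', 'No', '2015-06-19')
# (1, 'Jerry', 'Seinfeld', 'No', '2015-08-09')
# (2, 'Elaine', 'Benes', 'No', '2015-10-13')
```

The outer filter on Survey = 9 guards against a row from another survey that happens to share the same ID and date.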

Get product total sales per month, with 0 in the gaps

I have been stuck on a problem with a SQL query. What I'm trying to achieve is to get each product in the store and show how many of them have been sold each month. However, there are some months where a product was not sold, which means those months are not displayed.
For instance, this is the result I'm getting right now
Article Month Sold
CN140027 6 312
CN140027 7 293
CN140027 12 122
CN140186 1 10
CN140186 4 2
While I want to get something more like this
Article Month Sold
CN140027 6 312
CN140027 7 293
CN140027 8 0
CN140027 9 0
CN140027 10 0
CN140027 11 0
CN140027 12 122
CN140186 1 10
CN140186 2 0
CN140186 3 0
CN140186 4 2
And here is the query I'm using at the moment
SELECT k.artikelnr, Months.datefield as `Months`, IFNULL(SUM(k.menge),0) as `Quantity`
FROM store_shop_korb as k LEFT OUTER JOIN office_calendar AS Months
ON Months.datefield = month(k.date_insert)
WHERE k.date_insert BETWEEN "2014-12-01" AND "2015-12-31"
group by k.artikelnr, Months.datefield
What am I missing? Or what am I doing wrong? Any help is really appreciated.
Thanks in advance.
EDIT:
Additional information:
office_calendar is the calendar table. It only contains the months as rows, from 1 to 12.
Additionally, I'm taking the article/product ID from a table called 'store_shop_korb', which contains all the lines of each order placed (so it contains the article ID, its price, the quantity for each order..)
This works for me:
SELECT k.artikelnr, c.datefield AS `Month`, COALESCE(s.Quantity, 0) AS Sold
FROM (
SELECT artikelnr
FROM store_shop_korb
GROUP BY artikelnr
) k
CROSS JOIN office_calendar c
LEFT JOIN (
SELECT artikelnr, MONTH(date_insert) AS monthfield, SUM(menge) AS Quantity
FROM store_shop_korb
GROUP BY artikelnr, MONTH(date_insert)
) s ON k.artikelnr = s.artikelnr AND c.datefield = s.monthfield
ORDER BY k.artikelnr, c.datefield
If you have a table of articles, you can use it in the place of subquery k. I'm basically normalizing on the fly.
Explanation:
There are basically 3 sets of data that get joined. The first is a distinct set of articles (k), the second is a distinct set of months (c). These two are joined without restriction, meaning you get the Cartesian product (every article x every month). This result is then left-joined to the sales per month (s), so that months without sales are kept and show 0.
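The three steps of that explanation can be sketched as follows, using SQLite through Python's sqlite3 as an assumed stand-in for MySQL. The quantities come from the question's sample output, but the exact order dates are made up for illustration, and strftime('%m', ...) replaces MySQL's MONTH().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE office_calendar (datefield INTEGER PRIMARY KEY);
CREATE TABLE store_shop_korb (artikelnr TEXT, menge INTEGER, date_insert TEXT);
""")
conn.executemany("INSERT INTO office_calendar VALUES (?)", [(m,) for m in range(1, 13)])
conn.executemany(
    "INSERT INTO store_shop_korb VALUES (?, ?, ?)",
    [  # hypothetical order lines; only the month and quantity matter here
        ("CN140027", 312, "2015-06-10"),
        ("CN140027", 293, "2015-07-03"),
        ("CN140027", 122, "2015-12-01"),
        ("CN140186", 10, "2015-01-15"),
        ("CN140186", 2, "2015-04-20"),
    ],
)

rows = conn.execute("""
    SELECT k.artikelnr, c.datefield AS month, COALESCE(s.quantity, 0) AS sold
    FROM (SELECT DISTINCT artikelnr FROM store_shop_korb) k   -- every article
    CROSS JOIN office_calendar c                              -- x every month
    LEFT JOIN (
        SELECT artikelnr,
               CAST(strftime('%m', date_insert) AS INTEGER) AS monthfield,
               SUM(menge) AS quantity
        FROM store_shop_korb
        GROUP BY artikelnr, strftime('%m', date_insert)
    ) s ON s.artikelnr = k.artikelnr AND s.monthfield = c.datefield
    ORDER BY k.artikelnr, c.datefield
""").fetchall()

print(len(rows))  # 24 (2 articles x 12 months)
print(rows[5])    # ('CN140027', 6, 312)
print(rows[7])    # ('CN140027', 8, 0)
```

Unlike the output the question asks for, this returns all twelve months for every article, not only the months between an article's first and last sale.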
Add another condition; I think it will solve your problem. It has to go in a HAVING clause, because aggregate functions are not allowed in WHERE:
SELECT k.artikelnr, Months.datefield as `Months`, IFNULL(SUM(k.menge),0) as `Quantity`
FROM store_shop_korb as k LEFT OUTER JOIN office_calendar AS Months
ON Months.datefield = month(k.date_insert)
WHERE k.date_insert BETWEEN "2014-12-01" AND "2015-12-31"
group by k.artikelnr, Months.datefield
HAVING IFNULL(SUM(k.menge),0)>0
I have tried this in MSAccess and it seems to work OK
SELECT PRODUCT, CALENDAR.MONTH, A
FROM CALENDAR LEFT JOIN (
SELECT PRODUCT, MONTH(SALEDTE) AS M, SUM(SALEAMOUNT) AS A
FROM SALES
WHERE SALEDTE BETWEEN #1/1/2015# AND #12/31/2015#
GROUP BY PRODUCT, MONTH(SALEDTE) ) AS X
ON X.M = CALENDAR.MONTH
If you already have a calendar table then use this.
SELECT B.Article,
A.Month,
COALESCE(c.Sold, 0)
FROM (SELECT DISTINCT Months.datefield AS Month -- assuming this is the month field
FROM office_calendar AS Months) A
CROSS JOIN (SELECT DISTINCT article
FROM Yourtable) B
LEFT OUTER JOIN Yourtable C
ON a.month = c.Month
AND b.Article = c.Article
Else you need a months table. Try this.
SELECT *
FROM (SELECT 1 AS month UNION
SELECT 2 UNION
SELECT 3 UNION
SELECT 4 UNION
SELECT 5 UNION
SELECT 6 UNION
SELECT 7 UNION
SELECT 8 UNION
SELECT 9 UNION
SELECT 10 UNION
SELECT 11 UNION
SELECT 12) A
CROSS JOIN (SELECT DISTINCT article
FROM Yourtable) B
LEFT OUTER JOIN Yourtable C
ON a.month = c.Month
AND b.Article = c.Article

SQL: Fetch rows having a column (group by column) being the MAX value

I would like to know how to retrieve rows matching the maximum value for a column.
SCHEMA
assignments:
id student_id subject_id
1 10 1
2 10 2
3 20 1
4 30 3
5 30 3
6 40 2
students:
id name
10 A
20 B
30 C
subjects:
id name
1 Math
2 Science
3 English
Queries:
Provide the SQL for:
1. Display the names of the students who have taken most number of assignments
2. Display the names of the subjects which have been taken the most number of times
Results:
1.
A
C
2.
Math
English
Thanks !
The previous answer is not quite right - you won't get the instances where two rows tie with the same count. Try this - the second query will be easy to replicate once you understand the concept.
SELECT a.student_id, s.name, COUNT(a.subject_id) as taken_subjects
FROM assignments a
INNER JOIN students s ON a.student_id = s.id
GROUP BY a.student_id, s.name
HAVING COUNT(a.subject_id) = (SELECT COUNT(*) FROM assignments GROUP BY student_id ORDER BY COUNT(*) DESC LIMIT 1)
Alternate query:
SELECT a.subject_id, s.name, COUNT(a.subject_id) AS times_taken
FROM assignments a
INNER JOIN subjects s ON a.subject_id = s.id
GROUP BY a.subject_id, s.name
HAVING COUNT(a.subject_id) = (SELECT MAX(cnt)
FROM (SELECT COUNT(*) AS cnt FROM assignments GROUP BY subject_id) t)
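To check that ties are kept, here is a sketch of the student query on the sample schema, run in SQLite through Python's sqlite3 as an assumed stand-in for MySQL. It computes the maximum per-student count in a derived table (aliased t, a name of my choosing) and keeps every student whose count equals it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subjects (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE assignments (id INTEGER PRIMARY KEY, student_id INTEGER, subject_id INTEGER);
INSERT INTO students VALUES (10,'A'),(20,'B'),(30,'C');
INSERT INTO subjects VALUES (1,'Math'),(2,'Science'),(3,'English');
INSERT INTO assignments VALUES (1,10,1),(2,10,2),(3,20,1),(4,30,3),(5,30,3),(6,40,2);
""")

# Keep every student whose assignment count equals the highest count overall.
top_students = [
    name
    for (name,) in conn.execute("""
        SELECT s.name
        FROM assignments a
        JOIN students s ON a.student_id = s.id
        GROUP BY a.student_id, s.name
        HAVING COUNT(*) = (SELECT MAX(cnt)
                           FROM (SELECT COUNT(*) AS cnt
                                 FROM assignments
                                 GROUP BY student_id) t)
        ORDER BY s.name
    """)
]
print(top_students)  # ['A', 'C']
```

Student 40 has an assignment but no row in students, so the inner join drops it; that does not matter here because its count is below the maximum.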