MySQL Greatest N Results with Join Tables - mysql

I've seen the numerous posts and great articles on here about selecting the top n results per group, but I am struggling to apply them to my data set. Most of the examples focus on data sets that don't need additional joins.
I've been trying to apply the examples from http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/ to my query without much success.
Three tables exist: Person, Credit and Media.
Person links to Credit, and Credit links to Media.
The query below should return the top 5 medias per person, but it doesn't. Where have I gone wrong?
SELECT
p.id AS person_id,
c.id AS credit_id,
m.id AS media_id, m.rating_average
FROM person p
INNER JOIN credit c ON c.person_id = p.id
INNER JOIN media m ON m.id = c.media_id
where (
select count(*) from media as m2
inner JOIN credit c2 on m2.id=c2.media_id
where c2.person_id = c.person_id and m2.rating_average >= m.rating_average
) <= 5
Clarification:
Top Medias are calculated from those with the highest rating_average.
Update:
SQLFiddle http://sqlfiddle.com/#!9/eb0fd
Desired output for the top 3 medias (m) per person (p). Obviously I would like to be able to do this for the top 5 medias, but this is only test data.
p m c rating_average
1 9 27 9
1 7 28 8
1 1 1 8
2 1 5 8
2 4 8 8
2 7 29 8
3 4 10 8
3 3 9 6
3 5 11 5
4 3 13 6
4 5 14 5
4 6 15 3
5 4 16 8
5 5 17 5
5 6 18 3
6 6 19 3
7 7 20 8
8 9 23 9
8 1 21 8
8 8 22 0
9 1 24 8
9 7 26 8
9 5 25 5

I think I solved it :)
First, here is one solution based on the way you started. There is a catch, though: I couldn't make it show exactly 3 rows (or whatever number you choose; I picked 3 as an example) for each person_id. The problem is that the solution counts how many rows have a rating_average greater than the current row's. So if 5 rows share the same top value, you can either show all 5 or none of them, and neither is good. Here is how you do it (in this version, if 4 rows share the top value, all of them are shown, since I think hiding the data makes no sense):
SELECT t1.person_id, t1.credit_id, t1.media_id, t1.rating_average
FROM (SELECT p.id AS person_id, c.id AS credit_id, m.id AS media_id,
m.rating_average AS rating_average
FROM person p
INNER JOIN credit c ON c.person_id = p.id
INNER JOIN media m ON m.id = c.media_id) as t1
WHERE (SELECT COUNT(*)
FROM (SELECT p.id AS person_id, c.id AS credit_id, m.id AS media_id,
m.rating_average AS rating_average
FROM person p
INNER JOIN credit c ON c.person_id = p.id
INNER JOIN media m ON m.id = c.media_id) AS t2
WHERE t2.person_id = t1.person_id AND t2.rating_average > t1.rating_average) < 3
ORDER BY person_id ASC, rating_average DESC
Important: this solution shows exactly 3 rows for each person only if no rating_average value repeats. Here is the Fiddle http://sqlfiddle.com/#!9/eb0fd/64; you can see the problem where person_id is 1!
After that I played with it a little more and made it work just as you wanted in the question, I think. Here is the code for that:
SET @num := 0, @person := 0;
SELECT person_id, credit_id, media_id, rating_average, rowNumber
FROM (SELECT t1.person_id, t1.credit_id, t1.media_id, t1.rating_average,
@num := if(@person = t1.person_id, @num + 1, 1) AS rowNumber,
@person := t1.person_id
FROM (SELECT p.id AS person_id, c.id AS credit_id, m.id AS media_id,
m.rating_average AS rating_average
FROM person p
INNER JOIN credit c ON c.person_id = p.id
INNER JOIN media m ON m.id = c.media_id
ORDER BY p.id ASC, m.rating_average DESC) as t1) as t2
WHERE rowNumber <= 3
Here is the Fiddle for that http://sqlfiddle.com/#!9/eb0fd/65 ...
GL!
P. S. sorry for my English hope you could understand what i was talking about...
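On MySQL 8.0 or later, the same per-person row numbering can be sketched with the ROW_NUMBER() window function instead of user variables, whose evaluation order inside a SELECT the MySQL manual describes as undefined. The rows below are a tiny made-up data set (not the data from the fiddle); only the table and column names come from the question, and using m.id as a tie-breaker is an assumption.

```sql
-- Hypothetical minimal schema; column names follow the question.
CREATE TABLE person (id INT PRIMARY KEY);
CREATE TABLE media  (id INT PRIMARY KEY, rating_average INT);
CREATE TABLE credit (id INT PRIMARY KEY, person_id INT, media_id INT);

-- Made-up test rows: person 1 has five credits, person 2 has two.
INSERT INTO person VALUES (1), (2);
INSERT INTO media  VALUES (1, 8), (2, 5), (3, 9), (4, 8), (5, 3);
INSERT INTO credit VALUES
  (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4), (5, 1, 5),
  (6, 2, 2), (7, 2, 5);

-- Number each person's medias from best to worst rating, then keep the first 3.
SELECT person_id, credit_id, media_id, rating_average
FROM (
  SELECT p.id AS person_id, c.id AS credit_id, m.id AS media_id, m.rating_average,
         ROW_NUMBER() OVER (PARTITION BY p.id
                            ORDER BY m.rating_average DESC, m.id) AS rn
  FROM person p
  INNER JOIN credit c ON c.person_id = p.id
  INNER JOIN media m ON m.id = c.media_id
) ranked
WHERE rn <= 3
ORDER BY person_id, rn;
```

Replacing ROW_NUMBER() with RANK() and ordering by rating_average DESC alone would keep every tied row, which matches the behavior of the first, count-based query.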

Related

How to display only the rows where number of unique values in one column equals value in second column

(Using mysql 5.0)
I have this table:
opportunity main_user_id certificate_id required_certificates
1 491 1 2
1 341 1 2
1 161 1 2
1 161 2 2
1 205 2 2
1 578 2 2
2 161 2 2
2 466 3 2
2 466 2 2
2 156 2 2
2 668 2 2
3 222 5 1
3 123 5 1
3 875 5 1
3 348 5 1
I need to display only the rows where the number of distinct values in certificate_id equals the value in required_certificates.
The opportunity_id column has ids from 0 to 15, and main_user_ids repeat (hence I can't use GROUP BY).
The table is basically a list of users matched for a particular job opportunity who have the required certificates. All I need to do now is show only the ones who have both of the required certificates, not one OR the other.
My current sql statement:
select op_main.id as opportunity_id, u.id as main_user_id, c.id as certificate_id, required2.required as required_certificates
from opportunities as op_main
join opportunity_certificates as oc on oc.opportunity_id = op_main.id
join certificates as c on c.id = oc.certificate_id and oc.is_required
join user_certificates as uc on uc.certificate_id = c.id
join users as u on u.id = uc.user_id
join (
select id as op_id, (
select count(distinct c.id)
from opportunities as op
join opportunity_certificates as oc on oc.opportunity_id = op.id
join certificates as c on c.id = oc.certificate_id and oc.is_required
join user_certificates as uc on uc.certificate_id = c.id
join users as u on u.id = uc.user_id
where uc.certificate_id = oc.certificate_id and oc.is_required and op.id = op_id
) as required from opportunities
) as required2 on required2.op_id = op_main.id
where uc.certificate_id = oc.certificate_id and oc.is_required and op_id = op_main.id
based on the table above the output would be:
opportunity main_user_id
1 161
2 466
3 222
3 123
3 875
3 348
I spent many hours trying to work it out. If someone is keen on helping me, I can send you the database. Thanks.
It is quite simple with window functions (MySQL 8 and above):
WITH cte AS (
SELECT *, COUNT(DISTINCT certificate_id) OVER(PARTITION BY user_id) AS cnt
FROM (
-- your query with joins
) sub
)
SELECT *
FROM cte
WHERE cnt = required_certificates;
DBFiddle Demo
It turns out that MySQL 8.0 doesn't support COUNT(DISTINCT ...) OVER, so the query above fails with the error below; in the demo I used a subquery with DISTINCT instead.
ER_NOT_SUPPORTED_YET: This version of MySQL doesn't yet support '(DISTINCT ..)'
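A sketch that avoids window functions altogether, and so should also run on MySQL 5.0, groups by opportunity and user together and compares the distinct certificate count in HAVING. The table name matched_users is hypothetical: it stands in for the result of the joined query in the question, loaded here with the sample rows shown there.

```sql
-- Hypothetical table holding the question's sample result set.
CREATE TABLE matched_users (
  opportunity INT,
  main_user_id INT,
  certificate_id INT,
  required_certificates INT
);

INSERT INTO matched_users VALUES
  (1, 491, 1, 2), (1, 341, 1, 2), (1, 161, 1, 2), (1, 161, 2, 2),
  (1, 205, 2, 2), (1, 578, 2, 2),
  (2, 161, 2, 2), (2, 466, 3, 2), (2, 466, 2, 2), (2, 156, 2, 2), (2, 668, 2, 2),
  (3, 222, 5, 1), (3, 123, 5, 1), (3, 875, 5, 1), (3, 348, 5, 1);

-- One group per (opportunity, user); keep the group only when the user
-- holds as many distinct certificates as the opportunity requires.
SELECT opportunity, main_user_id
FROM matched_users
GROUP BY opportunity, main_user_id
HAVING COUNT(DISTINCT certificate_id) = MAX(required_certificates)
ORDER BY opportunity, main_user_id;
```

On these rows the query returns user 161 for opportunity 1, user 466 for opportunity 2, and all four users for opportunity 3, the same set as the expected output in the question. Grouping by both columns sidesteps the concern that main_user_id repeats across opportunities.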

MySQL: join three tables and get counts

I am so totally lost.
Working on a database for my co-rec team. Players, Matches, Available players for a match, chosen players for a match, etc.....
The first major step I'd like is to be able to combine Players, Matches and Available to get a list of matches with the number of Women and number of Men available.
Here are my tables:
Players (id, Gender, Name, ....)
id Gender Name
1 M David
2 M Alberto
3 F Alison
4 F Karen
5 F Callie
6 M Stephan
Matches (id, ...)
id
1
2
3
Available (id, matchID, playerID)
id matchID PlayerID
1 1 1
2 1 8
3 1 11
... ... ...
16 2 1
17 2 2
18 2 15
... ... ...
26 3 6
27 3 7
28 3 18
Desired Result
Match Women Men Total
1 5 10 15
2 4 6 10
3 6 10 16
... ... ... ...
Here's the closest I've got (just this morning):
select m.id, p.gender
from matches m
inner join available a on m.id = a.matchid
inner join players p on p.id = a.playerid
Morning clarity:
select m.id,
sum(case when p.gender="Male" then 1 else 0 end) "Males",
sum(case when p.gender="Female" then 1 else 0 end) "Females",
count(p.gender) "Total"
from matches m
inner join available a on m.id = a.matchid
inner join players p on p.id = a.playerid
group by m.id
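A self-contained sketch of the same conditional aggregation follows. It assumes gender is stored as 'M'/'F', as in the sample Players table (the query above compares against "Male"/"Female" instead), and the availability rows are made up because the question elides most of them.

```sql
CREATE TABLE players   (id INT PRIMARY KEY, gender CHAR(1), name VARCHAR(50));
CREATE TABLE matches   (id INT PRIMARY KEY);
CREATE TABLE available (id INT PRIMARY KEY, matchID INT, playerID INT);

INSERT INTO players VALUES
  (1, 'M', 'David'), (2, 'M', 'Alberto'), (3, 'F', 'Alison'),
  (4, 'F', 'Karen'), (5, 'F', 'Callie'),  (6, 'M', 'Stephan');
INSERT INTO matches VALUES (1), (2), (3);

-- Made-up availability rows, for illustration only.
INSERT INTO available VALUES
  (1, 1, 1), (2, 1, 3), (3, 1, 4),
  (4, 2, 2), (5, 2, 5),
  (6, 3, 6);

-- One output row per match; each CASE contributes 1 only for the matching gender.
SELECT m.id AS match_id,
       SUM(CASE WHEN p.gender = 'F' THEN 1 ELSE 0 END) AS women,
       SUM(CASE WHEN p.gender = 'M' THEN 1 ELSE 0 END) AS men,
       COUNT(*) AS total
FROM matches m
INNER JOIN available a ON a.matchID = m.id
INNER JOIN players p ON p.id = a.playerID
GROUP BY m.id
ORDER BY m.id;
```

Because of the inner joins, a match with no available players does not appear at all; a LEFT JOIN from matches would be needed to list it with zero counts.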

SQL Incorrect SUMS from multiple JOINS

I'm trying to sum multiple tables using Joins and Sums in MySQL and not having much success.
My Tables (Unnecessary Columns Removed)
Students
idStudent studentname studentyear
1 foobar 11
2 barfoo 11
3 thing 8
Athletics_Results
idResult idStudent points
1 1 14
2 1 11
3 3 7
4 2 9
Team_Results
idTeamResults year points
1 11 9
2 8 8
3 7 14
So let me explain about the tables, because I admit they're poorly named and designed.
Students holds the basic info about each student, including their year and name. Each student has a unique ID.
Athletics_Results stores the results from athletics events. The idStudent column is a foreign key that references idStudent in the Students table. So student foobar (idStudent 1) has scored 14 and 11 points in the example.
Team_Results stores results from events that more than one student took part in. It just stores the year group and points.
The Aim
I want to be able to produce a sum of points for each year - combined from both athletics_results and team_results. EG:
year points
7 14 <-- No results in a_r, just 14 points in t_r
8 15 <-- 7 points in a_r (idResult 3) and 8 in t_r
11 43 <-- 14, 11, 9 points in a_r and 9 in t_r
What I've tried
For testing purposes, I've not tried combining the a_r scores and t_r scores yet but left them as two columns so I can see what's going on.
The first query I tried:
SELECT students.studentyear as syear, SUM(athletics_results.points) as score, SUM(team_results.points) as team_score
FROM students
JOIN team_results ON students.studentyear = team_results.year
JOIN athletics_results ON students.idStudent = athletics_results.idStudent
GROUP BY syear;
This gave different rows for each year (as desired) but had incorrect SUMS. I learnt this was due to not grouping the joins.
I then created this code:
SELECT studentyear as sYear, teamPoints, AthleticsPoints
FROM students st
JOIN (SELECT year, SUM(tm.points) as teamPoints
FROM team_results tm
GROUP BY year) tr ON st.studentyear = tr.year
JOIN (SELECT idStudent, SUM(atr.points) as AthleticsPoints
FROM athletics_results atr
) ar ON st.idStudent = ar.idStudent
Which gave correct SUMS but only returned one year group row (e.g the scores for Year 11).
EDIT - SQLFiddle here: http://sqlfiddle.com/#!9/dbc16/. This is with my actual test data which is a bigger sample than the data I posted here.
http://sqlfiddle.com/#!9/ad111/7
SELECT tr.`year`, COALESCE(tr.points,0)+COALESCE(SUM(ar.points),0)
FROM Team_Results tr
LEFT JOIN Students s
ON tr.`year`=s.studentyear
LEFT JOIN Athletics_Results ar
ON s.idStudent = ar.idStudent
GROUP BY tr.year
According to your comment and fiddle provided
check http://sqlfiddle.com/#!9/dbc16/3
SELECT tr.`year`, COALESCE(tr.points,0)+COALESCE(SUM(ar.points),0)
FROM (
SELECT `year`, SUM(points) as points
FROM Team_Results
GROUP BY `year`) tr
LEFT JOIN Students s
ON tr.`year`=s.studentyear
LEFT JOIN Athletics_Results ar
ON s.idStudent = ar.idStudent
GROUP BY tr.year
Try this http://sqlfiddle.com/#!9/2bfb1/1/0
SELECT
year, SUM(points)
FROM
((SELECT
a.year, SUM(b.points) AS points
FROM
student a
JOIN at_result b ON b.student_id = a.id
GROUP BY a.year) UNION ALL (SELECT
a.year, SUM(a.points) AS points
FROM
t_result a
GROUP BY a.year)) c
GROUP BY year;
On your data I get:
year points
7 14
8 15
11 43
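The same idea can be sketched against the table and column names from the question, using its sample rows. It stacks the individual point rows with UNION ALL and sums once; UNION ALL is a deliberate choice here, since plain UNION removes duplicate rows and would drop one of two identical (year, points) pairs, such as the 9 athletics points and the 9 team points for year 11.

```sql
CREATE TABLE Students (idStudent INT PRIMARY KEY, studentname VARCHAR(50), studentyear INT);
CREATE TABLE Athletics_Results (idResult INT PRIMARY KEY, idStudent INT, points INT);
CREATE TABLE Team_Results (idTeamResults INT PRIMARY KEY, `year` INT, points INT);

INSERT INTO Students VALUES (1, 'foobar', 11), (2, 'barfoo', 11), (3, 'thing', 8);
INSERT INTO Athletics_Results VALUES (1, 1, 14), (2, 1, 11), (3, 3, 7), (4, 2, 9);
INSERT INTO Team_Results VALUES (1, 11, 9), (2, 8, 8), (3, 7, 14);

-- Stack both point sources as (year, points) rows, then aggregate once,
-- so neither source is multiplied by a join against the other.
SELECT `year`, SUM(points) AS points
FROM (
  SELECT s.studentyear AS `year`, ar.points
  FROM Athletics_Results ar
  INNER JOIN Students s ON s.idStudent = ar.idStudent
  UNION ALL
  SELECT tr.`year`, tr.points
  FROM Team_Results tr
) all_points
GROUP BY `year`
ORDER BY `year`;
```

On the sample rows this gives 14 for year 7, 15 for year 8 and 43 for year 11.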
Can be done in multiple ways. My first thought is:
SELECT idStudent, year, SUM(points) AS totalPoints FROM (
SELECT a.idStudent, c.year, b.points+c.points AS points
FROM students a
INNER JOIN Athletics_Results b ON a.idStudent=b.idStudent
INNER JOIN Team_Results c ON a.studentyear=c.year) d
GROUP BY idStudent,year

Query Results Not Accurate

I have the following query which provides me with accurate results:
SELECT t.id
FROM titles t
ORDER BY t.id
My results are:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
My second query also provides me with accurate results:
SELECT t.id
FROM titles t
JOIN subscriptions s
ON t.id = s.title
WHERE s.user=2
Results:
10
11
14
So what I am trying to do is receive all the results from the first query that don't show up in the second query, so I run this:
SELECT t.id
FROM titles t
ORDER BY t.id NOT IN
(
SELECT t.id
FROM titles t
JOIN subscriptions s
ON t.id = s.title
WHERE s.user=2
);
But my results end up as this:
14
11
10
13
12
9
8
7
6
5
4
3
2
1
What am I doing wrong here? And why is the order reversed in my last query?
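NOT IN should be part of the WHERE condition, not the ORDER BY clause. Inside ORDER BY, MySQL evaluates t.id NOT IN (...) to 0 or 1 for each row and sorts by that value, so no rows are filtered out and the subscribed titles (value 0) sort first; the order within each group is not guaranteed. Move the condition to WHERE: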
SELECT
t.id
FROM
titles t
WHERE
t.id NOT IN
(
SELECT t.id
FROM titles t
JOIN subscriptions s
ON t.id = s.title
WHERE s.user=2
)
ORDER BY
t.id
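An equivalent filter can be sketched with NOT EXISTS, which some prefer because NOT IN returns no rows at all if its subquery ever yields a NULL. The subscriptions.id column and the extra row for user 1 are assumptions added for illustration; the other names and values come from the question.

```sql
CREATE TABLE titles (id INT PRIMARY KEY);
CREATE TABLE subscriptions (id INT PRIMARY KEY, title INT, `user` INT);

INSERT INTO titles VALUES
  (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14);
-- User 2 is subscribed to titles 10, 11 and 14; the user 1 row is made up.
INSERT INTO subscriptions VALUES (1, 10, 2), (2, 11, 2), (3, 14, 2), (4, 3, 1);

-- Keep each title for which user 2 has no subscription row.
SELECT t.id
FROM titles t
WHERE NOT EXISTS (
  SELECT 1
  FROM subscriptions s
  WHERE s.title = t.id
    AND s.`user` = 2
)
ORDER BY t.id;
```

This returns ids 1 through 9, 12 and 13 in ascending order.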

Inner join on a distinct set of parameters

I've tried a few of the similar SO questions, but I can't seem to figure it out.
On the first inner join, I only want to bring in the DISTINCT combinations of the functions columns code and serial_id, so that my SUM selects count each distinct combination once. I.e., there are multiple rows with the same func.code and func.serial_id, and I only want one of them.
SELECT
sl.imp_id,
lat.version,
SUM(IF(lat.status = 'P',1,0)) AS powered,
SUM(IF(lat.status = 'F',1,0)) AS functional
FROM slots sl
INNER JOIN functions func ON sl.id = func.slot_id
INNER JOIN latest_status lat ON lat.code = func.code
AND lat.serial_id = func.serial_id
WHERE sl.id=55
GROUP BY sl.imp_id, lat.version
EDIT 2 - sample data explanation -------------------
slots - id, imp_id, name
functions - id, slot_id, code, serial_id
latest_status - id, code, serial_id, version, status
**slots**
id imp_id name
1 5 'the name'
2 5 'another name'
3 5 'name!'
4 5 'name!!'
5 5 'name!!!'
6 5 'testing'
7 5 'hi'
8 5 'test'
**functions**
id slot_id code serial_id
1 1 11HRK 10
2 2 22RMJ 11
3 3 26OLL 01
4 4 22RMJ 00
6 6 11HRK 10
7 7 11HRK 10
8 8 22RMJ 00
**latest_status**
id code serial_id version status
1 11HRK 10 1 F
1 11HRK 10 2 P
3 22RMJ 11 1 P
4 22RMJ 11 2 F
5 26OLL 01 1 F
6 26OLL 01 2 P
7 22RMJ 00 1 F
8 22RMJ 00 2 F
After running the query, the result should look like this:
imp_id version powered functional
5 1 1 3
5 2 2 2
The function table gets rolled up based on the code, serial_id. 1 row per code, serial_id.
It then gets joined onto the latest_status table based on the serial_id and code, which is a one (functions) to many (latest_status) relationship, so two rows come out of this, one for each version.
How about using DISTINCT?
SELECT
SUM(IF(lat.status = 'P',1,0)) AS powered,
SUM(IF(lat.status = 'F',1,0)) AS functional
FROM slots sl
INNER JOIN (Select DISTINCT id1, code, serial_id from functions) f On sl.rid = f.id1
INNER JOIN latest_status lat ON lat.code = f.code
AND lat.serial_id = f.serial_id
WHERE sl.id=55
GROUP BY sl.imp_id, lat.version
If you want only the distinct code and serial_id, you need to group by those, not by imp_id and version, and end up with something like:
SELECT
SUM(IF(lat.status = 'P',1,0)) AS powered,
SUM(IF(lat.status = 'F',1,0)) AS functional
FROM slots sl
INNER JOIN functions func ON sl.rid = func.id1
INNER JOIN latest_status lat ON lat.code = func.code
AND lat.serial_id = func.serial_id
WHERE sl.id=55
GROUP BY func.code, func.serial_id
However, this could all be rubbish without more data on what some of those other columns are; they don't seem to be the ones you wanted to group by.
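Against the sample data from the question's second edit, one way to get a single row per distinct (code, serial_id) before counting is to deduplicate in a derived table and only then join latest_status. This is a sketch under assumptions: it uses the slot_id column from the sample schema (not the rid/id1 columns in the answers), leaves out the latest_status id column, and drops the sl.id = 55 filter because no such slot exists in the sample.

```sql
CREATE TABLE slots (id INT PRIMARY KEY, imp_id INT, name VARCHAR(50));
CREATE TABLE `functions` (id INT PRIMARY KEY, slot_id INT, code VARCHAR(10), serial_id CHAR(2));
CREATE TABLE latest_status (code VARCHAR(10), serial_id CHAR(2), version INT, status CHAR(1));

INSERT INTO slots VALUES
  (1, 5, 'the name'), (2, 5, 'another name'), (3, 5, 'name!'), (4, 5, 'name!!'),
  (5, 5, 'name!!!'), (6, 5, 'testing'), (7, 5, 'hi'), (8, 5, 'test');
INSERT INTO `functions` VALUES
  (1, 1, '11HRK', '10'), (2, 2, '22RMJ', '11'), (3, 3, '26OLL', '01'),
  (4, 4, '22RMJ', '00'), (6, 6, '11HRK', '10'), (7, 7, '11HRK', '10'),
  (8, 8, '22RMJ', '00');
INSERT INTO latest_status VALUES
  ('11HRK', '10', 1, 'F'), ('11HRK', '10', 2, 'P'),
  ('22RMJ', '11', 1, 'P'), ('22RMJ', '11', 2, 'F'),
  ('26OLL', '01', 1, 'F'), ('26OLL', '01', 2, 'P'),
  ('22RMJ', '00', 1, 'F'), ('22RMJ', '00', 2, 'F');

-- Collapse functions to one row per (imp_id, code, serial_id) first,
-- so repeated functions are counted once per version.
SELECT f.imp_id,
       lat.version,
       SUM(CASE WHEN lat.status = 'P' THEN 1 ELSE 0 END) AS powered,
       SUM(CASE WHEN lat.status = 'F' THEN 1 ELSE 0 END) AS functional
FROM (
  SELECT DISTINCT sl.imp_id, func.code, func.serial_id
  FROM slots sl
  INNER JOIN `functions` func ON func.slot_id = sl.id
) f
INNER JOIN latest_status lat
        ON lat.code = f.code
       AND lat.serial_id = f.serial_id
GROUP BY f.imp_id, lat.version
ORDER BY f.imp_id, lat.version;
```

On the sample rows this yields (5, 1, 1, 3) and (5, 2, 2, 2), which matches the result the question asks for.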