Trouble using group by to get a max value across two tables - mysql

I have been trying to solve a problem for days and I am not making any progress. Basically, I have two tables, players and matches. Each player in players has a unique player_id, as well as a group_id that identifies which group he/she belongs to. Each match in matches has the player_ids of two players, first_player and second_player, who are always from the same group. first_score is the score of first_player and second_score is the score of second_player. A match is won by whoever scores more. Here are the two tables:
create table players (
player_id integer not null unique,
group_id integer not null
);
create table matches (
match_id integer not null unique,
first_player integer not null,
second_player integer not null,
first_score integer not null,
second_score integer not null
);
Now what I am trying to do is to get the players with the most wins from each group, their group ID as well as the number of wins. So, for example, if there are three groups, the result would be something like:
Group Player Wins
1 24 23
2 13 25
3 34 20
Here's what I have right now
SELECT p1.group_id AS Group, p1.player_id AS Player, COUNT(*) AS Wins
FROM players p1, matches m1
WHERE (m1.first_player = p1.player_id AND m1.first_score > m1.second_score)
OR (m1.second_player = p1.player_id AND m1.second_score > m1.first_score)
GROUP BY p1.group_id
HAVING COUNT(*) >= (
SELECT COUNT(*)
FROM players p2, matches m2
WHERE p2.group_id = p1.group_id AND
((m2.first_player = p2.player_id AND m2.first_score > m2.second_score)
OR (m2.second_player = p2.player_id AND m2.second_score > m2.first_score))
)
My idea is to only select players whose wins are greater than, or equal to, the wins of all other players in his group. There is some syntactic problem with my query. I think I am using GROUP BY incorrectly as well.
There is also the issue of a tie in the number of wins, where I should just get the player with the least player_id. But I haven't even gotten to that point yet. I would really appreciate your help, thanks!
EDIT 1
I have some sample data that I am running my query against.
SELECT * FROM players gives me this:
Player_ID Group_ID
100 1
200 1
300 1
400 2
500 2
600 3
700 3
SELECT * FROM matches gives me this:
match_id first_player second_player first_score second_score
1 100 200 10 20
2 200 300 30 20
3 400 500 30 10
4 500 400 20 20
5 600 700 20 10
So, the query should return:
Group Player Wins
1 200 2
2 400 1
3 600 1
Running the query as is returns the following error:
ERROR: column "p1.player_id" must appear in the GROUP BY clause or be used in an aggregate function
Now I understand that I have to specify player_id in the GROUP BY clause if I want to use it in the SELECT (or HAVING) statement, but I do not wish to group by player ID, only by the group ID.
Even if I do add p1.player_id to GROUP BY in my outer query, I get...the correct answer actually. But I am a bit confused. Doesn't Group By aggregate the table according to that column? Logically speaking, I only want to group by p1.group_id.
Also, if I were to have multiple players in a group with the highest number of wins, how can I just keep the one with the lowest player_id?
Edit 2
If I change the matches table so that Group 1 has two players with 1 win each, the query omits Group 1 from the result altogether.
So, if my matches table is:
match_id first_player second_player first_score second_score
1 100 200 10 20
2 200 300 10* 20
3 400 500 30 10
4 500 400 20 20
5 600 700 20 10
I would expect the result to be
Group Player Wins
1 200 1
1 300 1
2 400 1
3 600 1
However, I get the following:
Group Player Wins
2 400 1
3 600 1
Note that the desired result is
Group Player Wins
1 200 1
2 400 1
3 600 1
Since I wish to only take the player with the least player_id in the case of a draw.

WITH first_players AS (
SELECT group_id,player_id,SUM(first_score) AS scores FROM players p LEFT JOIN matches m ON p.player_id=m.first_player GROUP BY group_id,player_id
),
second_players AS (
SELECT group_id,player_id,SUM(second_score) AS scores FROM players p LEFT JOIN matches m ON p.player_id=m.second_player GROUP BY group_id,player_id
),
all_players AS (
WITH al AS (
SELECT group_id, player_id, scores FROM first_players
UNION ALL
SELECT group_id, player_id, scores FROM second_players
)
SELECT group_id, player_id,COALESCE(SUM(scores),0) AS scores FROM al GROUP BY group_id, player_id
),
players_rank AS (
SELECT *,
-- score_rank already breaks ties within a group by the lowest player_id
ROW_NUMBER() OVER(PARTITION BY group_id ORDER BY scores DESC, player_id ASC) AS score_rank
FROM all_players
)
SELECT group_id, player_id AS winner_id FROM players_rank WHERE score_rank=1 ORDER BY group_id
Results
group_id winner_id
1 45
2 20
3 40

Try something like this:
with cte as
(
select p.Group_ID, t1.winplayer, t1.numberofwin,
row_number() over(partition by p.Group_ID order by t1.numberofwin desc, t1.winplayer) rn
from players p join
(
-- count wins per player; draws are excluded so they are not credited to second_player
SELECT count(*) as numberofwin,
case when first_score > second_score then first_player
else second_player end as winplayer
FROM matches
where first_score <> second_score
group by case when first_score > second_score then first_player
else second_player end
) t1 on p.Player_ID = t1.winplayer
) select * from cte where rn=1

It works when you add player_id to the GROUP BY because each player belongs to exactly one group. Grouping by (group_id, player_id) therefore produces the same rows as grouping by player_id alone: one row per player, with that player's group attached. That is why, logically, you can add player_id to the GROUP BY.
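To make that concrete, here is a minimal sketch that counts wins per (group_id, player_id) and then keeps the top player per group, breaking ties by the lowest player_id. It is run through Python's sqlite3 module purely so it is self-contained; that choice is my assumption (the question targets a different server), and it needs an SQLite build with window functions (3.25 or later). The CTE names win_counts and ranked are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (player_id INTEGER NOT NULL UNIQUE, group_id INTEGER NOT NULL);
CREATE TABLE matches (
    match_id INTEGER NOT NULL UNIQUE,
    first_player INTEGER NOT NULL,
    second_player INTEGER NOT NULL,
    first_score INTEGER NOT NULL,
    second_score INTEGER NOT NULL
);
-- Sample data from EDIT 1 of the question.
INSERT INTO players VALUES (100,1),(200,1),(300,1),(400,2),(500,2),(600,3),(700,3);
INSERT INTO matches VALUES
    (1,100,200,10,20),(2,200,300,30,20),(3,400,500,30,10),
    (4,500,400,20,20),(5,600,700,20,10);
""")

QUERY = """
WITH win_counts AS (
    -- One row per player; the LEFT JOIN keeps players with zero wins.
    SELECT p.group_id, p.player_id, COUNT(m.match_id) AS wins
    FROM players p
    LEFT JOIN matches m
      ON (m.first_player = p.player_id AND m.first_score > m.second_score)
      OR (m.second_player = p.player_id AND m.second_score > m.first_score)
    GROUP BY p.group_id, p.player_id
),
ranked AS (
    -- Rank inside each group: most wins first, lowest player_id on ties.
    SELECT group_id, player_id, wins,
           ROW_NUMBER() OVER (PARTITION BY group_id
                              ORDER BY wins DESC, player_id ASC) AS rn
    FROM win_counts
)
SELECT group_id, player_id, wins FROM ranked WHERE rn = 1 ORDER BY group_id
"""

top_players = conn.execute(QUERY).fetchall()
print(top_players)  # [(1, 200, 2), (2, 400, 1), (3, 600, 1)]
```

The draw in match 4 counts for nobody because both comparisons are strict.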

Related

Select multiple tables with only unique users and ordered by latest id

I have 2 tables, first one is called members:
id name show
1 John 1
2 Wil 1
3 George 1
4 Chris 1
Second is called score:
id user_id score
1 1 90
2 1 70
3 2 55
4 3 30
5 3 40
6 3 100
7 4 30
user_id from score is the id of members.
What I want is to show a score list with one row per members.id, using each member's latest score.id and ordered by score.score.
I use the following code:
SELECT members.id, members.show, score.id, score.user_id, score.score FROM members
INNER JOIN score ON score.user_id = members.id
WHERE members.show = '1'
GROUP BY score.user_id
ORDER BY score.score DESC, score.id DESC
The output is not ordered by the latest score.id, but it does show only unique user_id's:
id user_id score
1 1 90
3 2 55
4 3 30
7 4 30
It should be like:
id user_id score
6 3 100
2 1 70
3 2 55
7 4 30
I hope you can help me
You could use:
with cte as (
select id,
user_id,
score,
row_number() over(partition by user_id order by id desc) as row_num
from score
) select cte.id,user_id,score
from cte
inner join members m on cte.user_id=m.id
where row_num=1
order by score desc;
If your MySQL server doesn't support window functions, use:
select s.id,s.user_id,s.score
from score s
inner join members m on s.user_id=m.id
where s.id in (select max(id) as id
from score
group by user_id
)
order by score desc;
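For anyone who wants to check the "latest row per user" idea against the sample data, here is a small sketch using Python's sqlite3 module. Running it on SQLite instead of MySQL is my assumption, made only to keep the example self-contained. It also applies the show = 1 filter from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, "show" INTEGER);
CREATE TABLE score (id INTEGER PRIMARY KEY, user_id INTEGER, score INTEGER);
-- Sample data from the question.
INSERT INTO members VALUES (1,'John',1),(2,'Wil',1),(3,'George',1),(4,'Chris',1);
INSERT INTO score VALUES (1,1,90),(2,1,70),(3,2,55),(4,3,30),(5,3,40),(6,3,100),(7,4,30);
""")

latest_scores = conn.execute("""
    SELECT s.id, s.user_id, s.score
    FROM score s
    JOIN members m ON m.id = s.user_id
    WHERE m."show" = 1
      -- keep only each user's most recent score row
      AND s.id IN (SELECT MAX(id) FROM score GROUP BY user_id)
    ORDER BY s.score DESC, s.id DESC
""").fetchall()

print(latest_scores)  # [(6, 3, 100), (2, 1, 70), (3, 2, 55), (7, 4, 30)]
```

This matches the output the question asks for.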

PHP SQL order by multiple rows

I have a table called 'scorelist' with the following results:
ID USER_ID SCORE SEASON
-----------------------------
1 1 35 3
2 1 45 2
3 2 80 3
4 2 85 1
5 3 65 2
I want to make a score list where I show the scores of the users but only of their last played season.
Result should be:
ID USER_ID SCORE SEASON
-----------------------------
3 2 80 3
5 3 65 2
1 1 35 3
I use the following code:
SELECT * FROM scorelist
WHERE season = (
SELECT season FROM scorelist ORDER BY season DESC LIMIT 1
)
GROUP BY user_id
ORDER BY score DESC;
But then I only get the results of season 3, so a lot of users are not shown.
I also tried:
SELECT * FROM scorelist group by user_id ORDER BY score DESC, season DESC
But this is also not working.
I hope you can help me.
The subquery gets the latest season for each user. If you join to that you get your desired results
SELECT s1.*
FROM scorelist s1
JOIN
(
SELECT user_id, max(season) AS season
FROM scorelist
GROUP BY user_id
) s2 ON s1.user_id = s2.user_id AND s1.season = s2.season
ORDER BY s1.score DESC
Since MySQL 8.0 you can use the row_number window function to solve this problem:
WITH ordered_scorelist AS (
SELECT
scorelist.*,
row_number() over (partition by USER_ID order by SEASON DESC) rn
FROM scorelist
) SELECT
USER_ID, SCORE, SEASON
FROM ordered_scorelist
WHERE rn = 1
ORDER BY SCORE DESC;
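A quick way to sanity-check the "join to each user's latest season" approach is to run it on the question's data. The sketch below does that through Python's sqlite3 module; using SQLite here is my assumption for the sake of a self-contained example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scorelist (id INTEGER PRIMARY KEY, user_id INTEGER, score INTEGER, season INTEGER);
-- Sample data from the question.
INSERT INTO scorelist VALUES (1,1,35,3),(2,1,45,2),(3,2,80,3),(4,2,85,1),(5,3,65,2);
""")

last_season_scores = conn.execute("""
    SELECT s1.id, s1.user_id, s1.score, s1.season
    FROM scorelist s1
    JOIN (
        -- latest season played by each user
        SELECT user_id, MAX(season) AS season
        FROM scorelist
        GROUP BY user_id
    ) s2 ON s1.user_id = s2.user_id AND s1.season = s2.season
    ORDER BY s1.score DESC
""").fetchall()

print(last_season_scores)  # [(3, 2, 80, 3), (5, 3, 65, 2), (1, 1, 35, 3)]
```

Note that user 1's last season is 3, so the row with score 35 is the one returned.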

SQL command to keep top 5 high scores only

I have a MySQL database with, e.g., the following table:
world level player_id score
-----------------------------------------
1 1 1 100
1 1 2 123
1 1 3 130
1 1 4 200
1 1 5 90
1 2 8 234
.
.
.
For each unique (world, level, player_id) triple I want to record the top five scores only, as new scores come in.
My thoughts are to do the following: First insert the new score record, e.g.
REPLACE INTO Highscores (world, level, player_id, score) VALUES (1, 1, 6, 500)
Then keep only those records of the same (world, level) with the top 5 scores, e.g.
DELETE FROM Highscores WHERE world=1 AND level=1 AND score < (SELECT min(score) FROM (SELECT score FROM Highscores ORDER BY score DESC LIMIT 5) AS Highscores);
But I was wondering if there was some other way to do this, perhaps with a single line of SQL, which might be more efficient?
On tied scores:
I assume that the last record in the table was added last, so in the case of ties, I want to keep the last record and remove the earlier record.
world level player_id score
-----------------------------------------
1 1 1 100
1 1 2 200
1 1 3 100
1 1 4 100
1 1 5 100
1 1 8 200
Here, e.g. the row with player_id=8 would be kept, but the row with player_id=2 would be removed. player_id=1, 3, 4, 5 would be kept too.
Update
In the end, by introducing an AUTO_INCREMENT unique tableid as primary key, I settled for the following approach:
REPLACE INTO Highscores (world, level, player_id, score) VALUES (1, 1, 6, 500);
DELETE FROM Highscores
WHERE world=1 AND level=1 AND tableid NOT IN
(SELECT tableid FROM (SELECT tableid FROM Highscores WHERE world=1 AND level=1
AND score >=
(SELECT min(score) FROM
(SELECT score FROM Highscores WHERE world=1 AND level=1 ORDER BY score DESC LIMIT 5)
AS d)) AS c)
This may not be quite what you are looking for, but it should work. If you create the database with 5 'dummy' values, such as a negative player_id and a 0 score, this query will select the lowest, and replace it with the new score:
UPDATE Highscores
INNER JOIN
(SELECT player_id
FROM Highscores
ORDER BY score ASC, player_id ASC LIMIT 1) rpid
ON Highscores.player_id = rpid.player_id
SET Highscores.player_id = 6, Highscores.score = 300
Just replace the last line with the values that you need.
Note that this solution is based on the assumption that player_id is always set to increment, and only used once.
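Another way to look at the insert-then-trim idea from the question is sketched below with Python's sqlite3 module. Several things here are assumptions on my part: SQLite stands in for MySQL (MySQL would reject both the LIMIT inside the IN subquery and a subquery that reads the table being deleted from, unless it is wrapped in a derived table), a plain INSERT is used instead of REPLACE, record_score is a hypothetical helper name, and ties are read as "the most recently inserted row wins", using the auto-increment tableid from the update above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE highscores (
        tableid   INTEGER PRIMARY KEY AUTOINCREMENT,
        world     INTEGER NOT NULL,
        level     INTEGER NOT NULL,
        player_id INTEGER NOT NULL,
        score     INTEGER NOT NULL
    )
""")

def record_score(conn, world, level, player_id, score, keep=5):
    """Insert a score, then trim that (world, level) board to its top `keep` rows."""
    conn.execute(
        "INSERT INTO highscores (world, level, player_id, score) VALUES (?, ?, ?, ?)",
        (world, level, player_id, score),
    )
    conn.execute(
        """
        DELETE FROM highscores
        WHERE world = ? AND level = ?
          AND tableid NOT IN (
              SELECT tableid FROM highscores
              WHERE world = ? AND level = ?
              -- highest score first; on ties the later row (larger tableid) survives
              ORDER BY score DESC, tableid DESC
              LIMIT ?
          )
        """,
        (world, level, world, level, keep),
    )

# Rows from the question's table, followed by the new score of 500.
for row in [(1, 1, 1, 100), (1, 1, 2, 123), (1, 1, 3, 130), (1, 1, 4, 200),
            (1, 1, 5, 90), (1, 2, 8, 234), (1, 1, 6, 500)]:
    record_score(conn, *row)

board = conn.execute(
    "SELECT player_id, score FROM highscores "
    "WHERE world = 1 AND level = 1 ORDER BY score DESC"
).fetchall()
print(board)  # [(6, 500), (4, 200), (3, 130), (2, 123), (1, 100)]
```

The score of 90 is dropped once the sixth row arrives, and the row for level 2 is left alone.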

MySQL: Greatest n per group with joins and conditions

Table Structure
I have tables similar to the following:
venues
The following table describes a list of businesses
id name
50 Nando's
60 KFC
rewards
The table describes a number of rewards, the venue it corresponds to and the number of points needed to redeem the reward.
id venue_id name points
1 50 5% off 10
2 50 10% off 20
3 50 11% off 30
4 50 15% off 40
5 50 20% off 50
6 50 30% off 50
7 60 30% off 70
8 60 60% off 100
9 60 65% off 120
10 60 70% off 130
11 60 80% off 140
points_data
The table describes the number of points remaining a user has for each venue.
venue_id points_remaining
50 30
60 90
Note that this table is actually computed within SQL like so:
select * from (
select venue_id, (total_points - points_redeemed) as points_remaining
from (
select venue_id, sum(total_points) as total_points, sum(points_redeemed) as points_redeemed
from (
(
select venue_id, sum(points) as total_points, 0 as points_redeemed
from check_ins
group by venue_id
)
UNION
(
select venue_id, 0 as total_points, sum(points) as points_redeemed
from reward_redemptions rr
join rewards r on rr.reward_id = r.id
group by venue_id
)
) a
group by venue_id
) b
GROUP BY venue_id
) points_data
but for this question you can probably just ignore that massive query and assume the table is just called points_data.
Desired Output
I want to get a single query that gets:
The top 2 rewards the user is eligible for at each venue
The lowest 2 rewards the user is not yet eligible for at each venue
So for the above data, the output would be:
id venue_id name points
2 50 10% off 20
3 50 11% off 30
4 50 15% off 40
5 50 20% off 50
7 60 30% off 70
8 60 60% off 100
9 60 65% off 120
What I got so far
The best solution I found so far is first getting the points_data, and then using code (i.e. PHP) to dynamically write the following:
(
select * from rewards
where venue_id = 50
and points > 30
ORDER BY points asc
LIMIT 2
)
union all
(
select * from rewards
where venue_id = 50
and points <= 30
ORDER BY points desc
LIMIT 2
)
UNION ALL
(
select * from rewards
where venue_id = 60
and points <= 90
ORDER BY points desc
LIMIT 2
)
UNION ALL
(
select * from rewards
where venue_id = 60
and points > 90
ORDER BY points asc
LIMIT 2
)
ORDER BY venue_id, points asc;
However, I feel the query can get too long and inefficient. For example, if a user has points in 400 venues, that is 800 subqueries.
I tried also doing a join like so, but can't really get better than:
select * from points_data
INNER JOIN rewards on rewards.venue_id = points_data.venue_id
where points > points_remaining;
which is far from what I want.
Correlated subqueries counting the number of higher or lower rewards to determine the top or bottom entries are one way.
SELECT r1.*
FROM rewards r1
INNER JOIN points_data pd1
ON pd1.venue_id = r1.venue_id
WHERE r1.points <= pd1.points_remaining
AND (SELECT count(*)
FROM rewards r2
WHERE r2.venue_id = r1.venue_id
AND r2.points <= pd1.points_remaining
AND (r2.points > r1.points
OR r2.points = r1.points
AND r2.id > r1.id)) < 2
OR r1.points > pd1.points_remaining
AND (SELECT count(*)
FROM rewards r2
WHERE r2.venue_id = r1.venue_id
AND r2.points > pd1.points_remaining
AND (r2.points < r1.points
OR r2.points = r1.points
AND r2.id < r1.id)) < 2
ORDER BY r1.venue_id,
r1.points;
Since MySQL 8.0 a solution using the row_number() window function would be an alternative. But I suppose you are on a lower version.
SELECT x.id,
x.venue_id,
x.name,
x.points
FROM (SELECT r.id,
r.venue_id,
r.name,
r.points,
pd.points_remaining,
row_number() OVER (PARTITION BY r.venue_id,
r.points <= pd.points_remaining
ORDER BY r.points DESC) rntop,
row_number() OVER (PARTITION BY r.venue_id,
r.points > pd.points_remaining
ORDER BY r.points ASC) rnbottom
FROM rewards r
INNER JOIN points_data pd
ON pd.venue_id = r.venue_id) x
WHERE x.points <= x.points_remaining
AND x.rntop <= 2
OR x.points > x.points_remaining
AND x.rnbottom <= 2
ORDER BY x.venue_id,
x.points;
The tricky part here is to also partition each venue's rewards into two subsets: the rewards the user has enough points to redeem and the ones they do not. Because logical expressions in MySQL evaluate to 0 or 1 (in a non-Boolean context), the respective expressions can be used directly in PARTITION BY.
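Here is a small runnable sketch of that partition-by-a-comparison trick on the question's data. It goes through Python's sqlite3 module, which is my assumption for portability (comparisons also evaluate to 0 or 1 in SQLite; window functions need SQLite 3.25 or later). The id tie-break in each ORDER BY is my addition so that the two 50-point rewards are ranked deterministically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rewards (id INTEGER PRIMARY KEY, venue_id INTEGER, name TEXT, points INTEGER);
CREATE TABLE points_data (venue_id INTEGER PRIMARY KEY, points_remaining INTEGER);
-- Sample data from the question.
INSERT INTO rewards VALUES
    (1,50,'5% off',10),(2,50,'10% off',20),(3,50,'11% off',30),(4,50,'15% off',40),
    (5,50,'20% off',50),(6,50,'30% off',50),(7,60,'30% off',70),(8,60,'60% off',100),
    (9,60,'65% off',120),(10,60,'70% off',130),(11,60,'80% off',140);
INSERT INTO points_data VALUES (50,30),(60,90);
""")

rows = conn.execute("""
    SELECT x.id
    FROM (
        SELECT r.id, r.venue_id, r.points,
               r.points <= pd.points_remaining AS eligible,
               -- best rewards first within each (venue, eligible) subset
               ROW_NUMBER() OVER (PARTITION BY r.venue_id, r.points <= pd.points_remaining
                                  ORDER BY r.points DESC, r.id DESC) AS rn_top,
               -- cheapest rewards first within each (venue, eligible) subset
               ROW_NUMBER() OVER (PARTITION BY r.venue_id, r.points <= pd.points_remaining
                                  ORDER BY r.points ASC, r.id ASC) AS rn_bottom
        FROM rewards r
        JOIN points_data pd ON pd.venue_id = r.venue_id
    ) x
    WHERE (x.eligible = 1 AND x.rn_top <= 2)
       OR (x.eligible = 0 AND x.rn_bottom <= 2)
    ORDER BY x.venue_id, x.points, x.id
""").fetchall()

reward_ids = [row[0] for row in rows]
print(reward_ids)  # [2, 3, 4, 5, 7, 8, 9]
```

The ids line up with the desired output in the question.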

mySQL - winning streak

Hi, I am trying to figure out a way of finding the largest winning streak for each member in my table. When the table was built, this was never planned, which is why I'm seeking help on how I can achieve this.
My structure is as follows:
id player_id opponant_id won loss timestamp
If it is a person's own game, player_id is their id. If they are being challenged by someone, their id is the opponant_id, and won/loss (1 or 0) are relative to player_id.
I want to find the greatest winning streak for each user.
Does anyone have any ideas on how to do this with the current table structure?
regards
EDIT
here is some test data, where id 3 is the player in question:
id player_id won loss timestamp
1 6 0 1 2012-03-14 13:31:00
13 3 0 1 2012-03-15 13:10:40
17 3 0 1 2012-03-15 13:29:56
19 4 0 1 2012-03-15 13:37:36
51 3 1 0 2012-03-16 13:20:05
53 6 0 1 2012-03-16 13:32:38
81 3 0 1 2012-03-21 13:14:49
89 4 1 0 2012-03-21 14:01:28
91 5 0 1 2012-03-22 13:14:20
Give this a try. Edited to take into account loss rows
SELECT
d.player_id,
MAX(d.winStreak) AS maxWinStreak
FROM (
SELECT
@cUser := 0,
@winStreak := 0
) v, (
SELECT
player_id,
won,
timestamp,
@winStreak := IF(won=1,IF(@cUser=player_id,@winStreak+1,1),0) AS winStreak,
@cUser := player_id
FROM (
(
-- Get results where player == player_id
SELECT
player_id,
won,
timestamp
FROM matchTable
) UNION ALL (
-- Get results where player == opponant_id (loss=1 is good)
SELECT
opponant_id,
loss,
timestamp
FROM matchTable
)
) m
ORDER BY
player_id ASC,
timestamp ASC
) d
GROUP BY d.player_id
This works by selecting all wins and losses and counting the win streak as it goes through the rows. The subquery is then grouped by player_id, and the maximum winStreak seen along the way is output per player.
It seemed to work nicely against my test dataset anyway :)
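If window functions are available, the same running count can be done without user variables. This is a different technique from the query above (a gaps-and-islands grouping), sketched here with Python's sqlite3 module as an assumption for portability. The games table name and its six rows are invented for illustration; only the column names follow the question. It assumes a player never has two results with the same timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (id INTEGER PRIMARY KEY, player_id INTEGER, opponant_id INTEGER,
                    won INTEGER, loss INTEGER, timestamp TEXT);
-- Made-up rows: won/loss are always relative to player_id.
INSERT INTO games VALUES
    (1, 3, 4, 1, 0, '2012-03-14 10:00:00'),
    (2, 3, 5, 1, 0, '2012-03-15 10:00:00'),
    (3, 4, 3, 1, 0, '2012-03-16 10:00:00'),
    (4, 5, 3, 0, 1, '2012-03-17 10:00:00'),
    (5, 3, 4, 1, 0, '2012-03-18 10:00:00'),
    (6, 3, 5, 1, 0, '2012-03-19 10:00:00');
""")

streaks = conn.execute("""
    WITH results AS (
        -- every game seen from both sides; the opponent won when loss = 1
        SELECT player_id, won, timestamp FROM games
        UNION ALL
        SELECT opponant_id, loss, timestamp FROM games
    ),
    marked AS (
        -- rows between two losses share the same running loss count
        SELECT player_id, won,
               SUM(1 - won) OVER (PARTITION BY player_id ORDER BY timestamp) AS losses_so_far
        FROM results
    ),
    runs AS (
        -- wins inside one such block form one streak
        SELECT player_id, losses_so_far, SUM(won) AS streak
        FROM marked
        GROUP BY player_id, losses_so_far
    )
    SELECT player_id, MAX(streak) AS max_win_streak
    FROM runs
    GROUP BY player_id
    ORDER BY player_id
""").fetchall()

print(streaks)  # [(3, 3), (4, 1), (5, 0)]
```

Player 3's results in time order are win, win, loss, win, win, win, so the longest streak is 3.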
To do this more efficiently I would restructure, i.e.
matches (
matchID,
winningPlayerID,
timeStamp
)
players (
playerID
-- player name etc
)
matchesHasPlayers (
matchID,
playerID
)
Which would lead to an inner query of
SELECT
matches.matchID,
matchesHasPlayers.playerID,
IF(matches.winningPlayerID=matchesHasPlayers.playerID,1,0) AS won,
matches.timestamp
FROM matches
INNER JOIN matchesHasPlayers
ON matchesHasPlayers.matchID = matches.matchID
ORDER BY matches.timestamp
resulting in
SELECT
d.player_id,
MAX(d.winStreak) AS maxWinStreak
FROM (
SELECT
@cUser := 0,
@winStreak := 0
) v, (
SELECT
matchesHasPlayers.playerID AS player_id,
matches.timestamp,
@winStreak := IF(matches.winningPlayerID=matchesHasPlayers.playerID,IF(@cUser=matchesHasPlayers.playerID,@winStreak+1,1),0) AS winStreak,
@cUser := matchesHasPlayers.playerID
FROM matches
INNER JOIN matchesHasPlayers
ON matchesHasPlayers.matchID = matches.matchID
ORDER BY
matchesHasPlayers.playerID ASC,
matches.timestamp ASC
) d
GROUP BY d.player_id
SELECT * FROM
(
SELECT player_id, won, loss, timestamp
FROM games
WHERE player_id = 123
UNION
SELECT opponant_id as player_id, loss as won, won as loss, timestamp
FROM games
WHERE opponant_id = 123
) AS results -- MySQL requires an alias for a derived table
ORDER BY timestamp
That will give you all the results for one player ordered by timestamp. You would then need to loop over those results and count consecutive wins, or else concatenate them all into a string and use string functions to find the longest run of 1s in that string. The code will vary depending on the language you want to use, but logically those are the two choices.
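As an illustration of those two choices, here is a minimal Python sketch. Both function names are made up, and the input is assumed to be the won column (0 or 1) of the query result, already ordered by timestamp.

```python
def longest_win_streak(results):
    """Loop version: longest run of consecutive wins in a time-ordered list of 0/1."""
    best = current = 0
    for won in results:
        # extend the current streak on a win, reset it on a loss
        current = current + 1 if won else 0
        best = max(best, current)
    return best


def longest_win_streak_str(results):
    """String version: join the flags and measure the longest block of 1s."""
    flags = "".join("1" if won else "0" for won in results)
    return max(len(run) for run in flags.split("0"))


# Player 3's rows from the test data in the question (ids 13, 17, 51, 81).
player_3 = [0, 0, 1, 0]
print(longest_win_streak(player_3))      # 1
print(longest_win_streak_str(player_3))  # 1
```

Both versions return 0 for a player with no games or no wins.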