I have a table that keeps track of the scores of people playing my game:
userID | game_level | date_of_attempt     | score
1      | 1          | 2014-02-07 19:29:00 | 2
1      | 2          | 2014-02-08 19:00:00 | 0
2      | 1          | 2014-03-03 11:11:04 | 4
...    | ...        | ...                 | ...
I am trying to write a query that, for a given user, will tell me their cumulative score for each game_level as well as the average of the last 20 scores they have obtained on a particular game_level (by sorting on date_of_attempt).
For example:
userID | game_level | sum of scores on game level | average of last 20 level scores
1      | 1          | 26                           | 4.5
1      | 2          | 152                          | 13
Is it possible to do such a thing in a single query? I often need to perform the query for multiple game_levels, and I use a long subquery to work out which levels are needed, which makes me think a single query would be better.
MySQL versions before 8.0 do not support analytic (window) functions, so obtaining the average is trickier than it would be in some other RDBMS. Here I use user-defined variables to obtain the groupwise rank and then test on the result to average only over the 20 most recent records:
SELECT   userID, game_level, SUM(score),
         -- rnk is 0-based, so rnk < 20 keeps the 20 most recent attempts
         AVG(CASE WHEN rnk < 20 THEN score END) AS avg
FROM (
  SELECT   t.userID, t.game_level, t.score,
           -- restart the rank whenever the (userID, game_level) group changes
           @rank := (CASE
             WHEN t.userID = @userID
              AND t.game_level = @game_level
             THEN @rank + 1
             ELSE 0
           END) AS rnk,
           @userID := t.userID AS prev_userID,
           @game_level := t.game_level AS prev_game_level
  FROM     my_table t,
           (SELECT @rank := NULL, @userID := NULL, @game_level := NULL) init
  ORDER BY t.userID, t.game_level, t.date_of_attempt DESC
) x
GROUP BY userID, game_level
See How to select the first/least/max row per group in SQL for further information.
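For readers on MySQL 8.0 or later, where window functions are available, the same result can be sketched without user variables. This is only an illustration, not the method used in the answer above: the SQL is run through Python's built-in sqlite3 module (SQLite 3.25+ assumed) so that the example is self-contained, the sample rows are made up, and the last 2 attempts are averaged instead of 20 to keep the data small. The names `QUERY`, `level_stats`, `last_n`, `total_score` and `avg_last_n` are hypothetical.

```python
import sqlite3

# Rank each attempt within its (userID, game_level) group, newest first,
# then aggregate: SUM over all rows, AVG over only the newest `last_n` rows.
QUERY = """
SELECT userID, game_level,
       SUM(score) AS total_score,
       AVG(CASE WHEN rn <= :last_n THEN score END) AS avg_last_n
FROM (
    SELECT userID, game_level, score,
           ROW_NUMBER() OVER (PARTITION BY userID, game_level
                              ORDER BY date_of_attempt DESC) AS rn
    FROM my_table
    WHERE userID = :user_id
) ranked
GROUP BY userID, game_level
ORDER BY game_level
"""


def level_stats(conn, user_id, last_n=20):
    """Return (userID, game_level, total_score, avg_last_n) rows for one user."""
    return conn.execute(QUERY, {"user_id": user_id, "last_n": last_n}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table "
    "(userID INT, game_level INT, date_of_attempt TEXT, score INT)"
)
# Made-up sample data, loosely modelled on the table in the question.
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", [
    (1, 1, "2014-02-07 19:29:00", 2),
    (1, 1, "2014-02-08 10:00:00", 4),
    (1, 1, "2014-02-09 10:00:00", 6),
    (1, 2, "2014-02-08 19:00:00", 0),
    (2, 1, "2014-03-03 11:11:04", 4),
])

stats = level_stats(conn, user_id=1, last_n=2)
print(stats)
```

The SELECT itself uses only standard window-function syntax, so it should carry over to MySQL 8.0+ with the placeholders adapted to whichever client library is in use.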
Related
I'm trying to figure out how to Select a specific number of rows from a MySQL table based on WHERE clause. I have a table with 10 dummy users, I want to get 2 previous and 2 next users of specific user with their ranks.
user_id | points
==================
10 200
4 130
2 540
13 230
15 900
11 300
3 600
17 110
20 140
1 430
5 800
I managed to add a column for ranking, like this:
user_id | points | rank
===========================
15 900 1
5 800 2
3 600 3
2 540 4
1 430 5
11 300 6
13 230 7
10 200 8
20 140 9
4 130 10
17 110 11
But the problem is that I want only 5 rows. Suppose I'm retrieving data for user with user_id = 11. The output should look like this:
user_id | points | rank
===========================
2 540 4
1 430 5
11 300 6
13 230 7
10 200 8
where user_id = 11 is in the centre with 2 rows above and 2 below. I have tried nesting UNIONS and SELECT statements but nothing seems to work properly.
Here's a suggestion if you're on MySQL 8+:
WITH cte AS (
SELECT user_id, points,
ROW_NUMBER() OVER (ORDER BY points DESC) AS Rnk
FROM mytable)
SELECT cte2.user_id,
cte2.points,
cte2.Rnk
FROM cte cte1
JOIN cte cte2
ON cte1.user_id=11
AND cte2.Rnk >= cte1.Rnk-2
AND cte2.Rnk <= cte1.Rnk+2
This uses a common table expression (cte) and then a self join, with user_id=11 as the base row, to get the rows whose Rnk is within -2 and +2 of that row's Rnk.
Demo fiddle
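As a quick check of the CTE approach against the sample data from the question, here is a minimal sketch that runs essentially the same query through Python's built-in sqlite3 module (SQLite 3.25+ assumed, standing in for MySQL 8+). The two differences from the query above are my own additions: the user id is passed as a bound parameter, and an ORDER BY is added so the output order is guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user_id INT, points INT)")
# The 11 rows listed in the question.
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [
    (10, 200), (4, 130), (2, 540), (13, 230), (15, 900), (11, 300),
    (3, 600), (17, 110), (20, 140), (1, 430), (5, 800),
])

QUERY = """
WITH cte AS (
    SELECT user_id, points,
           ROW_NUMBER() OVER (ORDER BY points DESC) AS Rnk
    FROM mytable
)
SELECT cte2.user_id, cte2.points, cte2.Rnk
FROM cte AS cte1
JOIN cte AS cte2
  ON cte1.user_id = ?                                  -- the base user
 AND cte2.Rnk BETWEEN cte1.Rnk - 2 AND cte1.Rnk + 2    -- two above, two below
ORDER BY cte2.Rnk
"""

neighbours = conn.execute(QUERY, (11,)).fetchall()
for row in neighbours:
    print(row)
```

For user_id = 11 this yields the five rows with ranks 4 to 8 shown in the question. A user near the top or bottom of the table simply gets fewer neighbours on that side.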
Since you're on an older MySQL version, here's what I can suggest:
SET @uid := 11;
SET @Rnk := (SELECT Rnk
             FROM
               (SELECT user_id, points,
                       @r := @r+1 AS Rnk
                FROM mytable
                CROSS JOIN (SELECT @r := 0) r
                ORDER BY points DESC) v
             WHERE user_id = @uid);
SELECT user_id, points, Rnk
FROM
  (SELECT user_id, points,
          @r := @r+1 AS Rnk
   FROM mytable
   CROSS JOIN (SELECT @r := 0) r
   ORDER BY points DESC) v
WHERE Rnk >= @Rnk-2
  AND Rnk <= @Rnk+2;
If you will only use user_id as the base, the only part you need to change is the SET @uid. The remaining queries just fulfil your condition of getting two positions above and below the rank retrieved for that user_id. The base query in SET @Rnk is the same as the base query for the last one. The idea is to assign the @Rnk variable the Rnk position of user_id=11, then use it in the WHERE condition of the last query.
I'm not aware if there's any online fiddle still using MySQL 5.1 but here's probably the closest version to it, MySQL 5.5 demo fiddle.
I'm using MariaDB and trying to see if I can pull the original rankings for each row of a table based on the create date.
For example, imagine a scores table that has different scores for different users and categories (lower score is better in this case).
id | leaderboardId | userId | score | submittedAt         | rankAtSubmit
9  | 15            | 555    | 50.5  | 2022-01-20 01:00:00 | 2
8  | 15            | 999    | 58.0  | 2022-01-19 01:00:00 | 3
7  | 15            | 999    | 59.1  | 2022-01-15 01:00:00 | 3
6  | 15            | 123    | 49.0  | 2022-01-12 01:00:00 | 1
5  | 15            | 222    | 51.0  | 2022-01-10 01:00:00 | 1
4  | 14            | 222    | 87.0  | 2022-01-09 01:00:00 | 1
5  | 15            | 555    | 51.0  | 2022-01-04 01:00:00 | 1
The "rankAtSubmit" column is what I'm trying to generate here if possible.
I want to take the best/smallest score of each user+leaderboard and determine what the rank of that score was when it was submitted.
My attempt at this failed because in MySQL you cannot reference outer-level columns more than one level deep in a subquery, so referencing t.submittedAt in the following query results in an error:
SELECT *, (
SELECT ranking FROM (
SELECT id, RANK() OVER (PARTITION BY leaderboardId ORDER BY score ASC) ranking
FROM scores x
WHERE x.submittedAt <= t.submittedAt
GROUP BY userId, leaderboardId
) ranks
WHERE ranks.id = t.id
) rankAtSubmit
FROM scores t
Instead of using RANK(), I was able to accomplish this with a single subquery that counts the number of users who have a lower score that was submitted no later than the given score.
SELECT id, userId, score, leaderboardId, submittedAt,
(
SELECT COUNT(DISTINCT userId) + 1
FROM scores t2
WHERE t2.userId <> t.userId AND
t2.leaderboardId = t.leaderboardId AND
t2.score < t.score AND
t2.submittedAt <= t.submittedAt
) AS rankAtSubmit
FROM scores t
What I understand from your question is you want to know the minimum and maximum rank of each user.
Here is the code
SELECT userId, leaderboardId, score, min(rankAtSubmit),max(rankAtSubmit)
FROM scores
group BY userId,
leaderboardId,
score
I'm tracking number of steps/day. I want to get the average steps/day using the 5 best days out of a 7 day period. My end goal is going to be to get an average for the best 5 out of 7 days for a total of 16 weeks.
Here's my sqlfiddle - http://sqlfiddle.com/#!9/5e69bdf/2
Here is the query I'm currently using but I've discovered the result is not correct. It's taking the average of 7 days instead of selecting the 5 days that had the most steps. It's outputting 14,122 as an average instead of 11,606 based on my data as posted in the sqlfiddle.
SELECT SUM(a.steps) as StepsTotal, AVG(a.steps) AS AVGSteps
FROM (SELECT * FROM activities
JOIN Courses
WHERE activities.encodedid=? AND activities.activitydate BETWEEN
DATE_ADD(Courses.Startsemester, INTERVAL $y DAY) AND
DATE_ADD(Courses.Startsemester, INTERVAL $x DAY)
ORDER BY activities.steps DESC LIMIT 5
) a
GROUP BY a.encodedid
Here's the same query with the values filled in for testing:
SELECT SUM(a.steps) as StepsTotal, AVG(a.steps) AS AVGSteps
FROM (SELECT * FROM activities
JOIN Courses
WHERE activities.encodedid='42XPC3' AND activities.activitydate BETWEEN
DATE_ADD(Courses.Startsemester, INTERVAL 0 DAY) AND
DATE_ADD(Courses.Startsemester, INTERVAL 6 DAY)
ORDER BY activities.steps DESC LIMIT 5
) a
GROUP BY a.encodedid
As @SloanThrasher pointed out, the query is not working because you have multiple rows for the same course in the Courses table, which end up being joined to the activities table. Thus the output of the subquery gives the top value (16058) 3 times plus the second highest value (11218) twice, for a total of 70610 and an average of 14122. You can work around this by modifying the query as follows:
SELECT SUM(a.steps) as StepsTotal, AVG(a.steps) AS AVGSteps
FROM (SELECT * FROM activities
JOIN (SELECT DISTINCT Startsemester FROM Courses) c
WHERE activities.encodedid='42XPC3' AND activities.activitydate BETWEEN
DATE_ADD(c.Startsemester, INTERVAL 0 DAY) AND
DATE_ADD(c.Startsemester, INTERVAL 6 DAY)
ORDER BY CAST(activities.steps AS UNSIGNED) DESC LIMIT 5
) a
GROUP BY a.encodedid
Now, since there are actually only 3 days with activity (2018-07-16, 2018-07-17 and 2018-07-18) between the start of semester and 6 days later (2018-07-12 and 2018-07-18), this gives a total of 37553 (16058+11218+10277) and an average of 12517.7.
StepsTotal AVGSteps
37553 12517.666666666666
Ideally, you probably also want to add a constraint on the Course chosen from Courses e.g. change
(SELECT DISTINCT Startsemester FROM Courses)
to
(SELECT DISTINCT Startsemester FROM Courses WHERE CourseNumber='PHED1164')
Try this query:
SELECT @rn := 1, @weekAndYear := 0;
SELECT weekDayAndYear,
       SUM(steps),
       AVG(steps)
FROM (
  SELECT @weekAndYear weekAndYearLag,
         CASE WHEN @weekAndYear = YEAR(activitydate) * 100 + WEEK(activitydate)
              THEN @rn := @rn + 1 ELSE @rn := 1 END rn,
         @weekAndYear := YEAR(activitydate) * 100 + WEEK(activitydate) weekDayAndYear,
         steps,
         lightly_act_min,
         fairly_act_min,
         sed_act_min,
         vact_min,
         encodedid,
         activitydate,
         username
  FROM activities
  ORDER BY YEAR(activitydate) * 100 + WEEK(activitydate), CAST(steps AS UNSIGNED) DESC
) a WHERE rn <= 5
GROUP BY weekDayAndYear
Demo
With the additional variables I imitate SQL Server's ROW_NUMBER function, numbering the days within each week from 1 to 7. This way I can filter the best 5 days and easily get an average by grouping on the column weekDayAndYear, which has the same format as the variable: yyyyww (I used an integer to avoid casting to varchar).
Consider the following:
DROP TABLE IF EXISTS my_table;
CREATE TABLE `my_table`
(id SERIAL PRIMARY KEY
,steps INT NOT NULL
);
insert into my_table (steps) values
(9),(5),(7),(7),(7),(8),(4);
select prev
     , sum(steps) total
from (
  select steps
       , case when @prev = grp
              then @j:=@j+1 else @j:=1 end j
       , @prev:=grp prev
  from (SELECT steps
             , case when mod(@i,3)=0
                    then @grp := @grp+1 else @grp:=@grp end grp -- a 3 day week
             , @i:=@i+1 i
        from my_table
           , (select @i:=0,@grp:=0) vars
        order
           by id) x
     , (select @prev:= null, @j:=0) vars
  order by grp,steps desc,i) a
where j <=2 -- top 2 (out of 3)
group by prev;
+------+-------+
| prev | total |
+------+-------+
| 1 | 16 |
| 2 | 15 |
| 3 | 4 |
+------+-------+
http://sqlfiddle.com/#!9/ee46d7/11
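Where window functions are available (MySQL 8.0+ or MariaDB 10.2+), the "top 2 out of every 3 rows" idea above can be sketched with ROW_NUMBER instead of variables. This is an illustrative alternative, not the answer's method. It is run here through Python's built-in sqlite3 module (SQLite 3.25+ assumed) on the same seven step values, and it assumes the ids are consecutive starting at 1 so that `(id - 1) / 3 + 1` can serve as the "week" number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY auto-assigns ids 1..7 in insertion order.
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, steps INT NOT NULL)")
conn.executemany("INSERT INTO my_table (steps) VALUES (?)",
                 [(9,), (5,), (7,), (7,), (7,), (8,), (4,)])

QUERY = """
SELECT grp, SUM(steps) AS total
FROM (
    SELECT steps, grp,
           -- number the days within each group, best day first
           ROW_NUMBER() OVER (PARTITION BY grp ORDER BY steps DESC, id) AS j
    FROM (
        -- a 3 day week; SQLite's / truncates on integers,
        -- in MySQL the equivalent would be (id - 1) DIV 3 + 1
        SELECT id, steps, (id - 1) / 3 + 1 AS grp
        FROM my_table
    ) weeks
) ranked
WHERE j <= 2          -- top 2 (out of 3)
GROUP BY grp
ORDER BY grp
"""

totals = conn.execute(QUERY).fetchall()
print(totals)
```

The totals agree with the result table above (16, 15 and 4). Swapping SUM for AVG and using a 7-row group with `j <= 5` gives the best-5-of-7 average asked for in the question.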
[Aim]
We would like to find out how often an event "A" occurred before time "X". More concretely, given the dataset below, we want to find out the count of the prior purchases.
[Context]
DBMS: MySQL 5.6
We have following dataset:
user | date
1 | 2015-06-01 17:00:00
2 | 2015-06-02 18:00:00
1 | 2015-06-03 19:00:00
[Desired output]
user | date | purchase count
1 | 2015-06-01 17:00:00 | 1
2 | 2015-06-02 18:00:00 | 1
1 | 2015-06-03 19:00:00 | 2
[Already tried]
We managed to get the count on a specific day using an inner join on the table itself.
[Problem(s)]
- How to do this in a single query?
This can be done using user-defined variables, which is faster, as already mentioned in the previous answer.
It needs an incrementing variable for each group, based on some ordering; for the given data set, that is user and date.
Here is how you can achieve it:
select
user,
date,
purchase_count
from (
select *,
@rn:= if(@prev_user=user,@rn+1,1) as purchase_count,
@prev_user:=user
from test,(select @rn:=0,@prev_user:=null)x
order by user,date
)x
order by date;
Change the table name test to your actual table name
http://sqlfiddle.com/#!9/32232/12
Probably the most efficient way is to use variables:
select t.*,
       (@rn := if(@u = user, @rn + 1,
                  if(@u := user, 1, 1)
                 )
       ) as purchase_count
from table t cross join
     (select @rn := 0, @u := '') params
order by user, date;
You can also do this with correlated subqueries, but this is probably faster.
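The correlated-subquery alternative mentioned above can be sketched as follows. It is a minimal illustration using the three rows from the question, run through Python's built-in sqlite3 module so that it is self-contained. The table name `test` is borrowed from the other answer, and the alias `p` is hypothetical. In MySQL the identifiers would be quoted with backticks rather than double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE test ("user" INT, "date" TEXT)')
conn.executemany("INSERT INTO test VALUES (?, ?)", [
    (1, "2015-06-01 17:00:00"),
    (2, "2015-06-02 18:00:00"),
    (1, "2015-06-03 19:00:00"),
])

# For every purchase, count the same user's purchases up to and including it.
QUERY = """
SELECT t."user", t."date",
       (SELECT COUNT(*)
        FROM test AS p
        WHERE p."user" = t."user"
          AND p."date" <= t."date") AS purchase_count
FROM test AS t
ORDER BY t."date"
"""

counts = conn.execute(QUERY).fetchall()
for row in counts:
    print(row)
```

The subquery is re-evaluated for every row, which is why the variable-based versions are expected to be faster on large tables. An index on (user, date) would likely help the correlated form.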
Hi, I am trying to figure out a way of finding the largest winning streak for each member in my table. When the table was built this was never in the plans, which is why I'm seeking help on how I can achieve this.
My structure is as follows:
id player_id opponant_id won loss timestamp
If it is a persons game, the player id is their id. If they are being challenged by someone, their id is the opponant id and the won loss (1 or 0) is in relation to the player_id.
I want to find the greatest winning streak for each user.
Does anyone have any ideas on how to do this with the current table structure?
regards
EDIT
here is some test data, where id 3 is the player in question:
id player_id won loss timestamp
1 6 0 1 2012-03-14 13:31:00
13 3 0 1 2012-03-15 13:10:40
17 3 0 1 2012-03-15 13:29:56
19 4 0 1 2012-03-15 13:37:36
51 3 1 0 2012-03-16 13:20:05
53 6 0 1 2012-03-16 13:32:38
81 3 0 1 2012-03-21 13:14:49
89 4 1 0 2012-03-21 14:01:28
91 5 0 1 2012-03-22 13:14:20
Give this a try. Edited to take into account loss rows
SELECT
d.player_id,
MAX(d.winStreak) AS maxWinStreak
FROM (
SELECT
@cUser := 0,
@winStreak := 0
) v, (
SELECT
player_id,
won,
timestamp,
@winStreak := IF(won=1,IF(@cUser=player_id,@winStreak+1,1),0) AS winStreak,
@cUser := player_id
FROM (
(
-- Get results where player == player_id
SELECT
player_id,
won,
timestamp
FROM matchTable
) UNION (
-- Get results where player == opponant_id (loss=1 is good)
SELECT
opponant_id,
loss,
timestamp
FROM matchTable
)
) m
ORDER BY
player_id ASC,
timestamp ASC
) d
GROUP BY d.player_id
This works by selecting all wins/losses and counting the win streak as it goes through them. The subquery is then grouped by player_id, and the maximum winStreak calculated along the way is output per player.
It seemed to work nicely against my test dataset anyway :)
To do this more efficiently I would restructure, i.e.
matches (
matchID,
winningPlayerID,
timeStamp
)
players (
playerID
-- player name etc
)
matchesHasPlayers (
matchID,
playerID
)
Which would lead to an inner query of
SELECT
matches.matchID,
matchesHasPlayers.playerID,
IF(matches.winningPlayerID=matchesHasPlayers.playerID,1,0) AS won,
matches.timestamp
FROM matches
INNER JOIN matchesHasPlayers
ON matchesHasPlayers.matchID = matches.matchID
ORDER BY matches.timestamp
resulting in
SELECT
d.playerID,
MAX(d.winStreak) AS maxWinStreak
FROM (
SELECT
@cUser := 0,
@winStreak := 0
) v, (
SELECT
matchesHasPlayers.playerID,
matches.timestamp,
@winStreak := IF(matches.winningPlayerID=matchesHasPlayers.playerID,IF(@cUser=matchesHasPlayers.playerID,@winStreak+1,1),0) AS winStreak,
@cUser := matchesHasPlayers.playerID
FROM matches
INNER JOIN matchesHasPlayers
ON matchesHasPlayers.matchID = matches.matchID
ORDER BY
matchesHasPlayers.playerID ASC,
matches.timestamp ASC
) d
GROUP BY d.playerID
SELECT * FROM
(
SELECT player_id, won, loss, timestamp
FROM games
WHERE player_id = 123
UNION
SELECT opponant_id as player_id, loss as won, won as loss, timestamp
FROM games
WHERE opponant_id = 123
) AS results
ORDER BY timestamp
That will give you all the results for one player ordered by timestamp. Then you would need to loop those results and count winning records or else concatenate them all into a string and then use string functions to find your highest 11111 set in that string. That code will vary depending on the language you want to use, but logically those are the two choices.
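Both options described above (looping over the ordered results, or joining them into a string and looking for the longest run of 1s) can be sketched in a few lines. The example below uses Python purely as an illustration of the application-side step. The function names are hypothetical, and the input is assumed to be the `won` column of the query result, already ordered by timestamp.

```python
from itertools import groupby


def longest_win_streak(results):
    """Loop variant: results is an iterable of 1 (win) / 0 (loss) in time order."""
    # groupby splits the sequence into runs of equal values; keep only runs of wins.
    return max((sum(1 for _ in run) for won, run in groupby(results) if won),
               default=0)


def longest_win_streak_str(results):
    """String variant: concatenate to e.g. '0010' and find the longest run of 1s."""
    encoded = "".join(str(r) for r in results)
    return max(len(run) for run in encoded.split("0"))


# Player 3 in the question's test data: loss, loss, win, loss.
player_3 = [0, 0, 1, 0]
print(longest_win_streak(player_3))
print(longest_win_streak_str(player_3))
```

Both functions return 0 for a player with no wins or no games at all. The loop variant avoids building an intermediate string, which may matter for long histories.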