MySQL update table only with the highest values of another table - mysql

This is for a game site.
A game is recorded only if the player's score is greater than their previous score.
Table of all games (over 10,000 players):
CREATE TABLE games (
PlayerID INT UNSIGNED,
Date TIMESTAMP(12),
Score BIGINT UNSIGNED DEFAULT 0,
#...other data
);
Once a month, I update the best table with the top records, and afterwards I delete all rows from games.
Table of best players (top 50):
CREATE TABLE best (
#...same as games, without final other data
PlayerID INT UNSIGNED,
Date TIMESTAMP(12),
Score BIGINT UNSIGNED DEFAULT 0
);
So I add the 50 best players from the games table to the best table:
INSERT INTO best (PlayerID, Date, Score)
SELECT PlayerID, Date, Score FROM games ORDER BY Score DESC LIMIT 50;
After that (and this is where I have a problem), I try to keep only the top 50 in best. At this point best contains 100 rows.
What I have to do:
Do not store the same PlayerID more than once.
For each player, delete the worse Score.
At the end, keep only the top 50.
->
+----------+---------+
| PlayerID | Score |
+----------+---------+
| 25 | 20000 | New
| 25 | 25000 | Old best
| 40 | 10000 | Old best
| 57 | 80000 | New best
| 57 | 45000 | Old
| 80 | 35000 | New best
+----------+---------+
In the end I have to retain only 50 rows (the ones marked "best" in my example).
I tried many things, but I have not succeeded in achieving the expected result.
I am using PHP, so if it can be done simply with intermediate storage in an array, that's fine too.
Speed is not a priority because the operation runs only once a month.

The following SQL returns the top 50 scores:
SELECT `PlayerId`, max(`Score`) MaxScore
FROM (
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
) t
GROUP BY `PlayerId`
ORDER BY `MaxScore` DESC
LIMIT 50
You can use the result to overwrite the best table. For this you also need the corresponding Date field, which is missing so far. The next query also returns a maxDate field, the date on which each high score was achieved.
SELECT t2.`PlayerId`, max(t2.`Date`) maxDate, top.`MaxScore`
FROM
(
SELECT `PlayerId`, max(`Score`) MaxScore
FROM (
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
) t1
GROUP BY `PlayerId`
ORDER BY `MaxScore` DESC
LIMIT 50
) top
LEFT JOIN (
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
) t2 ON t2.`PlayerId` = top.`PlayerId` AND t2.`Score` = top.`MaxScore`
GROUP BY t2.`PlayerId`
ORDER BY top.`MaxScore` DESC
To transfer the new top 50 high scores into the best table you can use a temporary table such as tmp_best. Insert the top scores into the empty tmp_best table (substitute the SELECT query from above):
INSERT INTO tmp_best (`PlayerId`, `Date`, `Score`)
SELECT ...
After this, empty the best table and copy the rows from tmp_best into best.
Here is an alternative solution with simpler SQL. The difference from the solution above is that a temporary table tmp_all holds the combined data from the start. Before running the following SQL you have to create tmp_all, which can copy the structure of games or best.
DELETE FROM tmp_all;
INSERT INTO tmp_all
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
;
DELETE FROM best;
INSERT INTO best (`PlayerId`, `Date`, `Score`)
SELECT t2.`PlayerId`, max(t2.`Date`) maxDate, top.`MaxScore`
FROM
(
SELECT `PlayerId`, max(`Score`) MaxScore
FROM tmp_all t1
GROUP BY `PlayerId`
ORDER BY `MaxScore` DESC
LIMIT 50
) top
LEFT JOIN tmp_all t2 ON t2.`PlayerId` = top.`PlayerId` AND t2.`Score` = top.`MaxScore`
GROUP BY t2.`PlayerId`
ORDER BY top.`MaxScore` DESC
;
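The tmp_all approach above can be tried end to end with a small script. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, the helper name rebuild_best is hypothetical, and the dates are invented sample values. The scores are the ones from the example table in the question.

```python
import sqlite3

def rebuild_best(conn, limit=50):
    """Rebuild `best` from the combined rows of `games` and `best`.

    Hypothetical helper; SQLite stands in for MySQL here.
    """
    cur = conn.cursor()
    # Stage the combined rows once, as the tmp_all approach does.
    cur.execute("DROP TABLE IF EXISTS tmp_all")
    cur.execute(
        "CREATE TEMP TABLE tmp_all AS "
        "SELECT PlayerID, Date, Score FROM games "
        "UNION "
        "SELECT PlayerID, Date, Score FROM best"
    )
    # One row per player: the highest score and the latest date it was reached.
    top = cur.execute(
        """
        SELECT t.PlayerID, MAX(t.Date) AS Date, mx.MaxScore
        FROM (
            SELECT PlayerID, MAX(Score) AS MaxScore
            FROM tmp_all
            GROUP BY PlayerID
            ORDER BY MaxScore DESC
            LIMIT ?
        ) AS mx
        JOIN tmp_all AS t
          ON t.PlayerID = mx.PlayerID AND t.Score = mx.MaxScore
        GROUP BY t.PlayerID, mx.MaxScore
        ORDER BY mx.MaxScore DESC
        """,
        (limit,),
    ).fetchall()
    # Overwrite best with the merged result.
    cur.execute("DELETE FROM best")
    cur.executemany(
        "INSERT INTO best (PlayerID, Date, Score) VALUES (?, ?, ?)", top
    )
    conn.commit()
    return top

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (PlayerID INTEGER, Date TEXT, Score INTEGER);
CREATE TABLE best  (PlayerID INTEGER, Date TEXT, Score INTEGER);
-- invented dates; scores taken from the question's example
INSERT INTO best  VALUES (25, '2024-01-05', 25000), (40, '2024-01-09', 10000),
                         (57, '2024-01-12', 45000);
INSERT INTO games VALUES (25, '2024-02-03', 20000), (57, '2024-02-10', 80000),
                         (80, '2024-02-14', 35000);
""")
new_best = rebuild_best(conn)
for row in new_best:
    print(row)
```

With this sample data each player ends up in best exactly once, carrying the higher of the old and new scores.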

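(SELECT PlayerID, Date, Score FROM games ORDER BY Score DESC LIMIT 50)
UNION
(SELECT PlayerID, Date, Score FROM best)
This gives you the candidates for the all-time best 50: the current top 50 of games plus the existing rows of best. The parentheses are needed because ORDER BY and LIMIT apply to the first SELECT only. Then, as suggested by @ethrbunny, erase the best table and populate it again from this result. You can use a TEMPORARY TABLE for the intermediate rows.
Note that UNION only removes rows that are identical in every column, so a player with different scores in the two tables still appears twice. Keep the highest Score per PlayerID (GROUP BY PlayerID with MAX(Score)) and apply LIMIT 50 before writing back to best.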
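Whether UNION on its own removes repeated players can be checked with a small experiment. This is a sketch that assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the two-column tables are a simplification of the question's schema, and the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (PlayerID INTEGER, Score INTEGER);
CREATE TABLE best  (PlayerID INTEGER, Score INTEGER);
INSERT INTO games VALUES (25, 20000), (57, 80000);
INSERT INTO best  VALUES (25, 25000), (57, 80000);
""")

# UNION drops only rows that are identical in every selected column.
union_rows = conn.execute(
    "SELECT PlayerID, Score FROM games "
    "UNION "
    "SELECT PlayerID, Score FROM best "
    "ORDER BY PlayerID, Score"
).fetchall()

# Grouping by player is what actually yields one row per PlayerID.
per_player = conn.execute(
    "SELECT PlayerID, MAX(Score) FROM ("
    "  SELECT PlayerID, Score FROM games"
    "  UNION ALL"
    "  SELECT PlayerID, Score FROM best"
    ") AS u GROUP BY PlayerID ORDER BY PlayerID"
).fetchall()

print(union_rows)
print(per_player)
```

Player 25 has two different scores and therefore survives the UNION twice, while the two identical rows for player 57 collapse into one.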

Related

AVG with LIMIT and GROUP BY

I'm looking to make a SQL query, but I can't do it... and I can't find an example like mine.
I have a simple table People with 3 columns, 7 records :
For each team, I'd like to get the average points of the 2 best people.
My Query:
SELECT team
, (SELECT AVG(point)
FROM People t2
WHERE t1.team = t2.team
ORDER
BY point DESC
LIMIT 2) as avg
FROM People t1
GROUP
BY team
Current result: (average on all people of each team)
Apparently the LIMIT has no effect in this subquery: AVG() collapses each team's rows into a single value first, so "ORDER BY point DESC LIMIT 2" is applied to one row and changes nothing.
Expected result:
I want the average points of the 2 best people (those with the highest points) in each team, not the average over all people in each team.
How can I do that? Any ideas are welcome.
I'm on MySQL Database
Link of Fiddle : http://sqlfiddle.com/#!9/8c80ef/1
Thanks !
You can try this.
Compute an order number for each row with a subquery, ordered by point descending within the team.
Then keep only the top 2 rows of each team. To get a different top N, just change the number in the WHERE clause.
CREATE TABLE `People` (
`id` int(11) NOT NULL,
`name` varchar(20) NOT NULL,
`team` varchar(20) NOT NULL,
`point` int(4) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `People` (`id`, `name`, `team`, `point`) VALUES
(1, 'Luc', 'Jupiter', 10),
(2, 'Marie', 'Saturn', 0),
(3, 'Hubert', 'Saturn', 0),
(4, 'Albert', 'Jupiter', 50),
(5, 'Lucy', 'Jupiter', 50),
(6, 'William', 'Saturn', 20),
(7, 'Zeus', 'Saturn', 40);
ALTER TABLE `People`
ADD PRIMARY KEY (`id`);
ALTER TABLE `People`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=8;
Query 1:
SELECT t1.team, AVG(t1.point) AS avg_point
FROM People t1
WHERE (
    -- number of teammates that rank ahead of this row:
    -- higher point, or equal point with a lower id as tie-breaker
    SELECT COUNT(*)
    FROM People t2
    WHERE t2.team = t1.team
      AND (t2.point > t1.point
           OR (t2.point = t1.point AND t2.id < t1.id))
) < 2 -- to get a different top N, change this number
GROUP BY t1.team
Results:
| team    | avg_point |
|---------|-----------|
| Jupiter | 50        |
| Saturn  | 30        |
This is a pain in MySQL. If you want the two highest point values, you can do:
SELECT p.team, AVG(p.point)
FROM People p
WHERE p.point >= (SELECT DISTINCT p2.point
                  FROM People p2
                  WHERE p2.team = p.team
                  ORDER BY p2.point DESC
                  LIMIT 1, 1 -- get the second one
                 )
GROUP BY p.team;
Ties make this tricky, and your question isn't clear on what to do about them.
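One way to make the tie handling explicit is to rank each row by counting the teammates that sort ahead of it, using id as a tie-breaker. The sketch below assumes SQLite via Python's sqlite3 module rather than MySQL, reuses the People rows from this thread, and introduces TOP_N and top2_avg as illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (id INTEGER PRIMARY KEY, name TEXT, team TEXT, point INTEGER);
INSERT INTO People VALUES
 (1, 'Luc', 'Jupiter', 10), (2, 'Marie', 'Saturn', 0), (3, 'Hubert', 'Saturn', 0),
 (4, 'Albert', 'Jupiter', 50), (5, 'Lucy', 'Jupiter', 50),
 (6, 'William', 'Saturn', 20), (7, 'Zeus', 'Saturn', 40);
""")

TOP_N = 2  # how many people per team enter the average

top2_avg = conn.execute(
    """
    SELECT p.team, AVG(p.point)
    FROM People AS p
    WHERE (
        -- teammates ranked ahead: higher point, or same point and lower id
        SELECT COUNT(*)
        FROM People AS q
        WHERE q.team = p.team
          AND (q.point > p.point OR (q.point = p.point AND q.id < p.id))
    ) < ?
    GROUP BY p.team
    ORDER BY p.team
    """,
    (TOP_N,),
).fetchall()

print(top2_avg)
```

Because of the id tie-breaker, exactly TOP_N rows per team are averaged even when several people share the same point value.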

MySQL Aggregate Row Result Count

Is there a simple way in MySQL to return the number of aggregate result rows?
For example:
SELECT `name`, SUM(`points`)
FROM `goals`
GROUP BY `name`
HAVING SUM(`points`) > 10
If I were looking for number of unique names, if possible, how may I achieve this?
For example, if a return data set is:
Player1 | 11
Player2 | 15
Player3 | 17
Is there a way to return the number of results, which would be three (3)?
Here's one option using a subquery:
SELECT COUNT(*)
FROM (
SELECT `name`, SUM(`points`)
FROM `goals`
GROUP BY `name`
HAVING SUM(`points`) > 10
) t
select count(*) as num_of_records
from
(
SELECT name
FROM goals
GROUP BY name
HAVING SUM(points) > 10
) tmp
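Both answers wrap the grouped query in an outer COUNT(*). A minimal sketch of that pattern, assuming SQLite via Python's sqlite3 module as a stand-in for MySQL and invented goals rows whose sums match the question's example (a fourth player stays below the threshold):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE goals (name TEXT, points INTEGER);
INSERT INTO goals VALUES
 ('Player1', 5), ('Player1', 6),
 ('Player2', 15),
 ('Player3', 10), ('Player3', 7),
 ('Player4', 4);
""")

# The inner query yields one row per qualifying name; the outer query counts them.
num_groups = conn.execute(
    """
    SELECT COUNT(*)
    FROM (
        SELECT name
        FROM goals
        GROUP BY name
        HAVING SUM(points) > 10
    ) AS t
    """
).fetchone()[0]

print(num_groups)
```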

Listing the average from the top 5 scores of each player

After reading several answers to similar problems, I still can't work out how to achieve the following:
Having a list of player scores, I would like to get the top n scores of each player (the scores table only has the player id and a value). The final purpose is to aggregate the scores with the AVG() function.
Also note that the n bound is just a limit; a player may have less than n scores, in which case all of them should be computed.
Once the results are calculated, joining with the player table will allow to expand each player id into printable information.
In MySQL you need user variables to accomplish this:
select
idplayer, Score
from
(
select
idplayer, T.Score,
@r := IF(@g=idplayer,@r+1,1) RowNum,
@g := idplayer
from (select @g:=null) initvars
CROSS JOIN
(
SELECT s.Score,
s.idplayer
FROM scores s
ORDER BY idplayer, s.score DESC
) T
) U
WHERE RowNum <= 3
Test it at sqlfiddle:
create table scores( idplayer int, score int);
insert into scores values
(1,5), (1,7), (1,18), (1,27), (2,6);
Results:
| IDPLAYER | SCORE |
--------------------
| 1 | 27 |
| 1 | 18 |
| 1 | 7 |
| 2 | 6 |
Start from here:
drop table if exists scores;
create table scores (playerid integer, score integer);
insert into scores values
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
(2,1),(2,2),(2,3),(2,4);
select p1.playerid, p1.score
from scores p1, scores p2
where p1.playerid = p2.playerid
and
p1.score >=
ifnull((select score
from scores
where playerid=p1.playerid
order by score desc limit 4,1
),0)
group by p1.playerid,p1.score;
which will give you the desired list of top scores.
I'm not 100% sure this will work in mysql. However, the following captures the idea as a correlated subquery:
select p.*,
(select sum(score)
from (select score
from scores s
where s.playerid = p.playerid
order by score desc
limit 5
) s2
) as summax5
from players p
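If doing the row numbering in application code is acceptable, the same idea (scan the scores ordered by player and score, keep the first n of each player) can be sketched on the client side. This assumes Python with the built-in sqlite3 module instead of MySQL; top_n_average is a hypothetical helper name, and the sample rows are the ones from the second answer above.

```python
import sqlite3
from itertools import groupby, islice

def top_n_average(conn, n=5):
    """Average of each player's n highest scores (fewer if the player has fewer)."""
    rows = conn.execute(
        "SELECT playerid, score FROM scores ORDER BY playerid, score DESC"
    )
    result = {}
    # Rows arrive grouped by player with the best scores first.
    for player, group in groupby(rows, key=lambda r: r[0]):
        top = [score for _, score in islice(group, n)]
        result[player] = sum(top) / len(top)
    return result

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (playerid INTEGER, score INTEGER);
INSERT INTO scores VALUES
 (1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),
 (2,1),(2,2),(2,3),(2,4);
""")
averages = top_n_average(conn, n=5)
print(averages)
```

Player 2 has only four scores, so all of them are averaged, which matches the requirement that n is an upper bound.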

Retrieving Nth subquery for INSERT

Abstract
From a table holding various posts of users to a forum, another table shall be daily updated with the top 20 posters. Posts are stored in posts, daily high-scores are held in hiscore.
Tables
posts:
post_id(PK:INT) | user_id(INT) | ... | timestamp(TIMESTAMP)
hiscore:
user_id(INT) | rank(INT)
Query
TRUNCATE TABLE `hiscore` ;
INSERT INTO `hiscore` (`user_id`,`rank`)
(
SELECT `user_id`, ???
FROM `posts`
WHERE `timestamp` BETWEEN blah AND blah
GROUP BY `user_id`
ORDER BY COUNT(`post_id`) DESC
LIMIT 20
)
The actual question
What is to be inserted in the above query instead of ??? to account for the rank?
Is there a variable like @NTH_SUBQUERY that'll substitute for 5 on the fifth run of the SELECT subquery?
UPDATE: The table hiscore is supposed to only hold the top 20 posters. I know the table structure can be optimized. The focus of the answers should be on how to determine the current retrieved row of the sub-query.
INSERT INTO `hiscore` (`user_id`,`rank`)
(
SELECT `user_id`, @rank := @rank + 1
FROM `posts`, (SELECT @rank := 0) r
WHERE `timestamp` BETWEEN blah AND blah
GROUP BY `user_id`
ORDER BY COUNT(`post_id`) DESC
LIMIT 20
)
You seem too fond of TRUNCATE. For your case, consider keeping one set of rows per day:
hiscore:
the_date (DATE) | user_id(INT) | rank(INT)
and build a key on (the_date, rank).
Insertion:
set @pos=0;
insert into hiscore
select curdate(), user_id, @pos:=@pos+1
from ...
To keep the table size manageable, you can delete old rows once every few months.
Or you can set an auto_increment on rank:
create table hiscore
(
the_date date not null,
rank int(3) not null auto_increment,
user_id int(10) not null,
primary key (the_date, rank)
);
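So the rank is assigned automatically as rows are inserted, which matches the order by number of daily posts descending as long as the rows are inserted in that order. Note that an AUTO_INCREMENT column in second position of a composite primary key, restarting for each the_date, works only with the MyISAM engine; InnoDB rejects this table definition.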
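If the position counter can live in application code, the rank can also be assigned while copying the rows, without user variables or AUTO_INCREMENT. The following sketch makes several assumptions: it uses Python's sqlite3 module in place of MySQL, refresh_hiscore is a hypothetical helper, the posts rows are invented, and ties in post count are broken by user_id.

```python
import sqlite3

def refresh_hiscore(conn, start, end, limit=20):
    """Replace hiscore with the top posters in [start, end], ranked from 1."""
    top = conn.execute(
        """
        SELECT user_id, COUNT(post_id) AS n
        FROM posts
        WHERE "timestamp" BETWEEN ? AND ?
        GROUP BY user_id
        ORDER BY n DESC, user_id
        LIMIT ?
        """,
        (start, end, limit),
    ).fetchall()
    conn.execute("DELETE FROM hiscore")
    # enumerate() supplies the position of each row in the ordered result.
    conn.executemany(
        'INSERT INTO hiscore (user_id, "rank") VALUES (?, ?)',
        [(user_id, pos) for pos, (user_id, _n) in enumerate(top, start=1)],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER, "timestamp" TEXT);
CREATE TABLE hiscore (user_id INTEGER, "rank" INTEGER);
INSERT INTO posts (user_id, "timestamp") VALUES
 (1, '2024-03-01 09:00:00'), (1, '2024-03-01 10:00:00'), (1, '2024-03-01 11:00:00'),
 (2, '2024-03-01 12:00:00'),
 (3, '2024-03-01 13:00:00'), (3, '2024-03-01 14:00:00'),
 (4, '2024-02-28 08:00:00'), (4, '2024-02-28 09:00:00');
""")
refresh_hiscore(conn, '2024-03-01 00:00:00', '2024-03-01 23:59:59')
hiscore_rows = conn.execute(
    'SELECT user_id, "rank" FROM hiscore ORDER BY "rank"'
).fetchall()
print(hiscore_rows)
```

User 4 posted outside the time window and is therefore left out of the ranking.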

Finding a users maximum score and the associated details

I have a table in which users store scores and other information about each score (for example notes on the score, or time taken etc). I want a MySQL query that finds each user's personal best score and its associated notes, time etc.
What I have tried to use is something like this:
SELECT *, MAX(score) FROM table GROUP BY (user)
The problem with this is that whilst you can extract the user's personal best from that query [MAX(score)], the returned notes, times etc. are not associated with the maximum score, but with a different score (specifically the one contained in *). Is there a way I can write a query that selects what I want? Or will I have to do it manually in PHP?
I'm assuming that you only want one result per player, even if they have scored the same maximum score more than once. I am also assuming that you want each player's first time that they got their personal best in the case that there are repeats.
There are a few ways of doing this. Here's a way that is MySQL specific:
SELECT user, scoredate, score, notes FROM (
SELECT *, @prev <> user AS is_best, @prev := user
FROM table1, (SELECT @prev := -1) AS vars
ORDER BY user, score DESC, scoredate
) AS T1
WHERE is_best
Here's a more general way that uses ordinary SQL:
SELECT T3.* FROM table1 AS T3
JOIN (
SELECT T1.user, T1.score, MIN(scoredate) AS scoredate
FROM table1 AS T1
JOIN (SELECT user, MAX(score) AS score FROM table1 GROUP BY user) AS T2
ON T1.user = T2.user AND T1.score = T2.score
GROUP BY T1.user
) AS T4
ON T3.user = T4.user AND T3.score = T4.score AND T3.scoredate = T4.scoredate
Result:
1, '2010-01-01 17:00:00', 50, 'Much better'
2, '2010-01-01 14:00:00', 100, 'Perfect score'
Test data I used to test this:
CREATE TABLE table1 (user INT NOT NULL, scoredate DATETIME NOT NULL, score INT NOT NULL, notes NVARCHAR(100) NOT NULL);
INSERT INTO table1 (user, scoredate, score, notes) VALUES
(1, '2010-01-01 12:00:00', 10, 'First attempt'),
(1, '2010-01-01 17:00:00', 50, 'Much better'),
(1, '2010-01-01 22:00:00', 30, 'Time for bed'),
(2, '2010-01-01 14:00:00', 100, 'Perfect score'),
(2, '2010-01-01 16:00:00', 100, 'This is too easy');
You can join with a sub query, as in the following example:
SELECT t.*,
sub_t.max_score
FROM table t
JOIN (SELECT MAX(score) as max_score,
user
FROM table
GROUP BY user) sub_t ON (sub_t.user = t.user AND
sub_t.max_score = t.score);
The above query can be explained as follows. It starts with:
SELECT t.* FROM table t;
... This by itself will obviously list all the contents of the table. The goal is to keep only the rows that represent a maximum score of a particular user. Therefore if we had the data below:
+------------------------+
| user | score | notes |
+------+-------+---------+
| 1 | 10 | note a |
| 1 | 15 | note b |
| 1 | 20 | note c |
| 2 | 8 | note d |
| 2 | 12 | note e |
| 2 | 5 | note f |
+------+-------+---------+
...We would have wanted to keep just the "note c" and "note e" rows.
To find the rows that we want to keep, we can simply use:
SELECT MAX(score), user FROM table GROUP BY user;
Note that we cannot get the notes attribute from the above query because, as you have already noticed, columns that are neither wrapped in an aggregate function such as MAX() nor listed in the GROUP BY clause do not return the values you expect. For further reading on this topic, you may want to check:
Debunking GROUP BY Myths
How does MySQL decide which id to return in group by clause?
Why does MySql allow “group by” queries WITHOUT aggregate functions?
Now we only need to keep the rows from the first query that match the second query. We can do this with an INNER JOIN:
...
JOIN (SELECT MAX(score) as max_score,
user
FROM table
GROUP BY user) sub_t ON (sub_t.user = t.user AND
sub_t.max_score = t.score);
The sub query is given the name sub_t. It is the set of all the users with the personal best score. The ON clause of the JOIN applies the restriction to the relevant fields. Remember that we only want to keep rows that are part of this subquery.
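The join against the per-user maximum can be tried with the table1 test data from the earlier answer. This is a sketch assuming SQLite via Python's sqlite3 module as a stand-in for MySQL; the names all_best and first_best are illustrative. It also shows that the join by itself returns every row that ties for the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (user INTEGER, scoredate TEXT, score INTEGER, notes TEXT);
INSERT INTO table1 (user, scoredate, score, notes) VALUES
 (1, '2010-01-01 12:00:00', 10, 'First attempt'),
 (1, '2010-01-01 17:00:00', 50, 'Much better'),
 (1, '2010-01-01 22:00:00', 30, 'Time for bed'),
 (2, '2010-01-01 14:00:00', 100, 'Perfect score'),
 (2, '2010-01-01 16:00:00', 100, 'This is too easy');
""")

# Keep only the rows whose score equals the user's maximum score.
all_best = conn.execute(
    """
    SELECT t.user, t.scoredate, t.score, t.notes
    FROM table1 AS t
    JOIN (SELECT user, MAX(score) AS max_score
          FROM table1
          GROUP BY user) AS sub_t
      ON sub_t.user = t.user AND sub_t.max_score = t.score
    ORDER BY t.user, t.scoredate
    """
).fetchall()

# Ties come back as several rows; keep the earliest one per user.
first_best = {}
for row in all_best:
    first_best.setdefault(row[0], row)

for row in first_best.values():
    print(row)
```

User 2 reached the maximum twice, so the join returns two rows for that user, and the final loop keeps the earlier one.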
SELECT *
FROM table t
WHERE t.score = (SELECT MAX(t2.score)
FROM table t2
WHERE t2.user = t.user)
Side note: It is better to specify the fields than use SELECT *