MySQL sum ranking, group by genre, limit 10

I have MySQL tables like these and I would like to calculate the TOP10 for each genre:
rankings_2016 (trackId, genreId, ranking, timestamp)
genres (genreId, genreName)
tracks (trackId, trackName, genreId)
artists (artistId, artistName)
artists_tracks (artistId, trackId)
I need two top-10 lists per genre: one of tracks and one of artists.
A track or an artist can belong to up to 2 genres, and rankings can tie. To illustrate, this is the output I expect with LIMIT 2:
genreId | trackId | ranking
---------------------------------
0       | 1111    | 100
0       | 2222    | 99
1       | 1111    | 100
1       | 2222    | 99
genreId | artistId | ranking
---------------------------------
0       | 1111     | 100
0       | 2222     | 99
1       | 1111     | 100
1       | 2222     | 99
The only solution I have found is to store every result in a table and then apply LIMIT 10 on the page, but that table is too large for my database (I have limited resources).
For the tracks I wrote this:
SELECT trackId, genreId, @newRank := SUM(ranking) as ranking
FROM rankings_2016
WHERE timestamp >= ( select unix_timestamp('2016-01-01') )
AND timestamp <= ( select unix_timestamp('2016-12-31') )
GROUP BY trackId, genreId
For the artists:
SELECT artistId, genreId, @newRank := SUM(a1.ranking) as ranking
FROM rankings_2016 a1
LEFT JOIN artists_tracks a2
ON a1.trackId = a2.trackId
WHERE timestamp >= ( select unix_timestamp('2016-01-01') )
AND timestamp <= ( select unix_timestamp('2016-12-31') )
GROUP BY artistId, genreId
Thanks in advance for your hints.
UPDATE
The general approach (including the accepted answer) requires good indexes and a performant server.
In my case the ARTISTS query failed with error 500 until I increased the CPU.
In general, replacing LEFT JOIN with INNER JOIN saves 1 second.

Consider a correlated count subquery to rank order the rankings by Artist / Track / Genre groupings. Then use this rank calculated column in outer query to filter for top 10 per grouping:
Artist Ranking (top 10 rankings per artist and genre)
SELECT main.artistId, main.genreId, main.ranking
FROM
(
SELECT a.artistId, r.genreId, r.ranking,
(SELECT COUNT(*) FROM rankings_2016 subr
LEFT JOIN artists_tracks suba ON subr.trackId = suba.trackId
WHERE suba.artistId = a.artistId
AND subr.genreId = r.genreId
AND subr.ranking >= r.ranking) AS rn
FROM rankings_2016 r
LEFT JOIN artists_tracks a ON r.trackId = a.trackId
WHERE r.timestamp BETWEEN ( select unix_timestamp('2016-01-01') )
AND ( select unix_timestamp('2016-12-31') )
) AS main
WHERE main.rn <= 10
Track Ranking (top 10 rankings per track and genre)
SELECT main.trackId, main.genreId, main.ranking
FROM
(
SELECT r.trackId, r.genreId, r.ranking,
(SELECT COUNT(*) FROM rankings_2016 subr
WHERE subr.genreId = r.genreId
AND subr.trackId = r.trackId
AND subr.ranking >= r.ranking) AS rn
FROM rankings_2016 r
WHERE r.timestamp BETWEEN ( select unix_timestamp('2016-01-01') )
AND ( select unix_timestamp('2016-12-31') )
) AS main
WHERE main.rn <= 10
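If the goal is the top N tracks per genre by summed ranking, as the question describes, one option on MySQL 8.0+ is to number the grouped sums with ROW_NUMBER(). The sketch below is not the query above: it swaps the correlated count for a window function and runs it through Python's built-in sqlite3 module (SQLite 3.25+ assumed) so that it is self-contained. The sample rows and the limit of 2 are invented for illustration, and the timestamp filter is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rankings_2016 (trackId INT, genreId INT, ranking INT, timestamp INT)"
)
# Hypothetical sample rows: track 1111 has two rows in genre 0 that sum to 100.
conn.executemany(
    "INSERT INTO rankings_2016 VALUES (?, ?, ?, ?)",
    [
        (1111, 0, 60, 1), (1111, 0, 40, 2),
        (2222, 0, 99, 1),
        (3333, 0, 10, 1),
        (1111, 1, 100, 1),
        (2222, 1, 99, 1),
        (3333, 1, 5, 1),
    ],
)

TOP_N = 2  # the question wants 10; 2 keeps the output short

query = """
SELECT genreId, trackId, total
FROM (
    SELECT genreId, trackId, total,
           ROW_NUMBER() OVER (PARTITION BY genreId
                              ORDER BY total DESC, trackId) AS rn
    FROM (
        -- one summed ranking per (genre, track)
        SELECT genreId, trackId, SUM(ranking) AS total
        FROM rankings_2016
        GROUP BY genreId, trackId
    ) AS sums
) AS ranked
WHERE rn <= ?
ORDER BY genreId, rn
"""
top_tracks = conn.execute(query, (TOP_N,)).fetchall()
print(top_tracks)
```

The artist version would join artists_tracks inside the innermost SELECT and group by artistId instead of trackId.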

Related

Get distinct values from groups in MySQL

I want to get the id of the row with the lowest points for each team (the team field).
My query works, but I need to make sure it is good enough for a large table.
I need to simplify and optimize it.
Query:
SELECT T.id from teams as T
INNER JOIN (
SELECT MIN(T1.points) AS P FROM teams AS T1
GROUP BY T1.team LIMIT 5
) TJOIN ON T.points IN (TJOIN.P)
GROUP BY T.team
ORDER BY T.points ASC LIMIT 5
Table teams
id  team (foreign_key)  points (indexed)
--  ------------------  ----------------
1   a                   100
2   a                   101
3   b                   106
4   c                   105
5   c                   102
Result
id
1
5
3

I believe the query you are looking for is:
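SELECT MIN(T.id) AS id
FROM teams AS T
INNER JOIN (
SELECT team, MIN(points) AS min_points
FROM teams
GROUP BY team
-- order before limiting, otherwise the 5 teams kept are arbitrary
ORDER BY min_points ASC
LIMIT 5
) TJOIN
ON T.team = TJOIN.team
AND T.points = TJOIN.min_points
GROUP BY T.team
-- aggregate here so the query also runs with ONLY_FULL_GROUP_BY enabled
ORDER BY MIN(T.points) ASC
LIMIT 5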
You need to join based on both the column being grouped by and the min value. Consider the result of your query if multiple teams had a score of 100.
Another way of doing this is to use ROW_NUMBER():
SELECT id
FROM (
SELECT id, points, ROW_NUMBER() OVER (PARTITION BY team ORDER BY points ASC, id ASC) rn
FROM teams
) t
WHERE rn = 1
ORDER BY points ASC
LIMIT 5
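To check the ROW_NUMBER() version against the sample table from the question, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25+ assumed for window functions; on MySQL this needs 8.0 or later). Only the harness around the query is new.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (id INTEGER PRIMARY KEY, team TEXT, points INTEGER)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO teams VALUES (?, ?, ?)",
    [(1, "a", 100), (2, "a", 101), (3, "b", 106), (4, "c", 105), (5, "c", 102)],
)

query = """
SELECT id
FROM (
    SELECT id, points,
           ROW_NUMBER() OVER (PARTITION BY team
                              ORDER BY points ASC, id ASC) AS rn
    FROM teams
) AS t
WHERE rn = 1          -- keep the lowest-points row of each team
ORDER BY points ASC
LIMIT 5
"""
lowest_ids = [row[0] for row in conn.execute(query)]
print(lowest_ids)  # [1, 5, 3]
```

The id ASC tie-breaker makes the choice deterministic when two rows of the same team share the lowest points.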

Exclude users with zero score from ranking

I've come across this great solution to rank users based on their score in mysql.
SELECT d.*, c.ranks
FROM
(
SELECT Score, @rank := @rank + 1 Ranks
FROM
(
SELECT DISTINCT Score
FROM tableName a
ORDER BY score DESC
) t, (SELECT @rank := 0) r
) c
INNER JOIN tableName d
ON c.score = d.score
However, I would like to know if there is a way to exclude users with 0 or without score from the ranking, but still return these users in the results.
So for example
KEY  username  password  score  Ranks
1    Anna      123       8      1
2    Bobby     345       5      2
3    Helen     678       5      2
4    Jon       567       -2     3
5    Arthur    ddd       -9     4
4    Chris     444       0
5    Liz       eee       0

Since you want to SELECT all of your users, start with that:
SELECT
user.*
FROM user
Now, we want to add in a table of ranked users, so we'll start to add in complexity. We're aiming to get a temporary table of ranked users, so we'll LEFT JOIN as to not filter out any non-ranked users.
SELECT
user.*,
ranked_user.score,
ranked_user.rank
FROM user
LEFT JOIN(
-- subquery that ranks the users goes here
) AS ranked_user ON ranked_user.user_id = user.user_id
Then we'll have to figure out the section for the subquery where the ranks are determined. You have most of it already, I'm just going to add in an IF statement to only assign a rank if they have a score. Altogether, you get this:
SELECT
user.*,
ranked_user.score,
ranked_user.rank
FROM user
LEFT JOIN(
SELECT
score,
user_id,
IF(score = 0 OR score IS NULL, NULL, @rank := @rank + 1) AS rank
FROM(
SELECT
DISTINCT score,
user_id
FROM stat
ORDER BY score DESC
) t, (SELECT @rank := 0) r
) ranked_user ON ranked_user.user_id = user.user_id
Here's how that IF statement works:
IF(score = 0 OR score IS NULL,
# if score is 0 or missing
NULL,
# set the value to NULL
@rank := @rank + 1)
# otherwise, calculate a rank
AS rank
# call this value "rank"
Just a side note: I'd change user.* and actually list each column of user that you want. That's considered best practice.
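For comparison, here is a sketch that produces the ranks from the question's example (tied scores share a rank, zero scores get none) without user variables. It swaps the counter for a correlated COUNT(DISTINCT ...) subquery, so it is a different technique from the query above. It runs on Python's built-in sqlite3 module; the single users table with a score column is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table layout; the answer above splits this into user and stat.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO users (username, score) VALUES (?, ?)",
    [("Anna", 8), ("Bobby", 5), ("Helen", 5), ("Jon", -2),
     ("Arthur", -9), ("Chris", 0), ("Liz", 0)],
)

query = """
SELECT username, score, rnk
FROM (
    SELECT u.username, u.score,
           CASE
               WHEN u.score IS NULL OR u.score = 0 THEN NULL
               -- rank = 1 + number of distinct non-zero scores above this one
               ELSE 1 + (SELECT COUNT(DISTINCT s.score)
                         FROM users AS s
                         WHERE s.score <> 0 AND s.score > u.score)
           END AS rnk
    FROM users AS u
) AS ranked
ORDER BY rnk IS NULL, rnk, username   -- unranked users go last
"""
ranked_users = conn.execute(query).fetchall()
for row in ranked_users:
    print(row)
```

The correlated subquery runs once per user, so on a large table an index on score (or DENSE_RANK() on MySQL 8.0+) is likely to matter.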

How to write an extensible SQL query to find the users who log in continuously for n days

I have a table (Oracle or MySQL) that stores the dates on which users log in.
How can I write SQL (or something else) to find the users who have logged in on n consecutive days?
For example:
userID | logindate
1000 2014-01-10
1000 2014-01-11
1000 2014-02-01
1000 2014-02-02
1001 2014-02-01
1001 2014-02-02
1001 2014-02-03
1001 2014-02-04
1001 2014-02-05
1002 2014-02-01
1002 2014-02-03
1002 2014-02-05
.....
We can see that user 1000 logged in on two consecutive days (twice) in 2014, user 1001 logged in on 5 consecutive days, and user 1002 never logged in on consecutive days.
The SQL should be extensible, meaning I can pick any n, modify the query slightly or pass a new parameter, and get the expected results.
Thank you!

As we don't know which DBMS you are using (you named both MySQL and Oracle), here are two solutions that do the same thing: number the rows in date order and subtract that row number, in days, from the login date (so if the 6th record is 2014-02-12 and the 7th is 2014-02-13, both result in 2014-02-06). Consecutive days therefore share the same group day. We group by user and that group day and count the days, then group by user to find the longest series.
Here is a solution for a dbms with analytic window functions (e.g. Oracle):
select userid, max(days)
from
(
select userid, groupday, count(*) as days
from
(
select
userid, logindate - row_number() over (partition by userid order by logindate) as groupday
from mytable
)
group by userid, groupday
)
group by userid
--having max(days) >= 3
And here is a MySQL query (untested, because I don't have MySQL available):
select
userid, max(days)
from
(
select
userid, date_add(logindate, interval -rn day) as groupday, count(*) as days
from
(
select
userid, logindate,
-- alias is rn because ROW_NUMBER is a reserved word in MySQL 8.0
@row_num := @row_num + 1 as rn
from mytable
cross join (select @row_num := 0) r
order by userid, logindate
) numbered -- MySQL requires an alias on every derived table
group by userid, groupday
) series
group by userid
-- having max(days) >= 3
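To see the row-number subtraction at work on the sample data, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25+ assumed for window functions). julianday() stands in for the date arithmetic of Oracle and MySQL, and the helper name users_with_streak is made up for this example. It assumes at most one row per user per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (userid INTEGER, logindate TEXT)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1000, "2014-01-10"), (1000, "2014-01-11"),
        (1000, "2014-02-01"), (1000, "2014-02-02"),
        (1001, "2014-02-01"), (1001, "2014-02-02"), (1001, "2014-02-03"),
        (1001, "2014-02-04"), (1001, "2014-02-05"),
        (1002, "2014-02-01"), (1002, "2014-02-03"), (1002, "2014-02-05"),
    ],
)


def users_with_streak(conn, n):
    """Return (userid, longest_streak) for users whose longest streak is >= n days."""
    query = """
    SELECT userid, MAX(days) AS longest
    FROM (
        SELECT userid, groupday, COUNT(*) AS days
        FROM (
            -- day number minus row number is constant within a run of consecutive days
            SELECT userid,
                   CAST(julianday(logindate) AS INTEGER)
                     - ROW_NUMBER() OVER (PARTITION BY userid
                                          ORDER BY logindate) AS groupday
            FROM mytable
        ) AS numbered
        GROUP BY userid, groupday
    ) AS series
    GROUP BY userid
    HAVING MAX(days) >= ?
    ORDER BY userid
    """
    return conn.execute(query, (n,)).fetchall()


print(users_with_streak(conn, 3))  # [(1001, 5)]
```

Here n is passed as a bound parameter, which covers the "extensible" requirement without editing the SQL.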

I think the following query will give you a very extensible parametrization:
select z.userid, count(*) continuous_login_days
from
(
with max_dates as
( -- Get max date for every user ID
select t.userid, max(t.logindate) max_date
from test t
group by t.userid
),
ranks as
( -- Get ranks for login dates per user
select t.*,
row_number() over
(partition by t.userid order by t.logindate desc) rnk
from test t
)
-- So here, we select continuous days by checking if rank inside group
-- (per user ID) matches login date compared to max date
select r.userid, r.logindate, r.rnk, m.max_date
from ranks r, max_dates m
where m.userid = r.userid
and r.logindate + r.rnk - 1 = m.max_date -- here is the key
) z
-- Then we only group by user ID to get the number of continuous days
group by z.userid
;
Here is the result:
USERID CONTINUOUS_LOGIN_DAYS
1 1000 2
2 1001 5
3 1002 1
So you can just choose by querying field CONTINUOUS_LOGIN_DAYS.
EDIT : If you want to choose from all ranges (not only the last one), my query structure no longer works because it relied on the last range. But here is a workaround:
with w as
( -- Parameter
select 2 nb_cont_days from dual
)
select *
from
(
select t.*,
-- Get number of days around
(select count(*) from test t2
where t2.userid = t.userid
and t2.logindate between t.logindate - nb_cont_days + 1
and t.logindate) m1,
-- Get also number of days more in the past, and in the future
(select count(*) from test t2
where t2.userid = t.userid
and t2.logindate between t.logindate - nb_cont_days
and t.logindate + 1) m2,
w.nb_cont_days
from w, test t
) x
-- If these 2 fields match, then we have what we want
where x.m1 = x.nb_cont_days
and x.m2 = x.nb_cont_days
order by 1, 2
You just have to change the parameter in the WITH clause, so you can even create a function from this query to call it with this parameter.

SELECT userID,count(userID) as numOfDays FROM LOGINTABLE WHERE logindate between '2014-01-01' AND '2014-02-28'
GROUP BY userID
In this case you can check the login days per user, in a specific period

Get User Rank Based on Sum with Pagination

There are several rank posts out there but I have yet to see one dealing with when the results are paginated and when the ranking criteria (in this case: points) is equal to the previous user. I have tried a few of the pre-existing examples but none have worked.
I have a table called "users" with the column "id". I also have a table called "points" with the columns "user_id" and "amount".
I need:
1.) Users with duplicate sum of points to have the same rank
Points Table
user_id amount
1 10
2 20
1 5
3 20
3 -5
4 5
Rank should be
rank user_id total
1 2 20
2 1 15
2 3 15
3 4 5
2.) Needs to maintain the ranking from one page to another so the rank has to be gathered in the query and not the resulting PHP.
3.) Display ALL users not just ones with rows in the points table because some users have 0 points and I want to display them last.
Right now I'm just listing the users in order of their points but their rank is not gathered because it wasn't working.
$getfanspoints = mysql_query("SELECT DISTINCT id,
(SELECT SUM(amount) AS points FROM points WHERE points.user_id = users.id) AS points
FROM users
ORDER BY points DESC LIMIT $offset, $fans_limit", $conn);
I've read these solutions and none have worked.
[Roland's Blog][1]
[How to get rank based on SUM's][2]
[MySQL, get users rank][3]
[How to get rank using mysql query][4]
and a few others whose link I can't find right now.
Any suggestions?
[EDIT]
I used ypercube's bottom answer.

SELECT COUNT(*) AS rank
, t.user_id
, t.total
FROM
( SELECT user_id
, SUM(amount) AS total
FROM points
GROUP BY user_id
) AS t
JOIN
( SELECT DISTINCT
SUM(amount) AS total
FROM points
GROUP BY user_id
) AS dt
ON
t.total <= dt.total
GROUP BY t.user_id
ORDER BY rank
, user_id
But the above may be really slow with a big table and frequently awarded points. It may be better to run just this query and calculate the ranks in your application code:
SELECT users.id AS user_id
, SUM(amount) AS total
FROM
users
LEFT JOIN
points
ON points.user_id = users.id
GROUP BY users.id
ORDER BY total DESC
, user_id
This will work, too (edited, to work with the users table and with OFFSET):
SELECT *
FROM
( SELECT
@rank := @rank + (@t <> total) AS rank
, user_id
, @t := total AS total
FROM
( SELECT users.id AS user_id
, COALESCE(SUM(amount),0) AS total
FROM users
LEFT JOIN points
ON users.id = points.user_id
GROUP BY users.id
) AS o
CROSS JOIN
( SELECT @rank := 0, @t := -999999
) AS dummy
) AS dummy
ORDER BY total DESC
, user_id
) tmp
LIMIT x OFFSET y
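On MySQL 8.0+ the same ranking can be expressed with DENSE_RANK() instead of user variables. The sketch below swaps in that window function and runs it on Python's built-in sqlite3 module (SQLite 3.25+ assumed). User 5, who has no rows in points, and the helper name ranked_page are additions for illustration. The window function is evaluated before LIMIT/OFFSET, so ranks carry over from page to page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE points (user_id INTEGER, amount INTEGER)")
# Users 1-4 come from the question; user 5 is hypothetical and has no points.
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,), (3,), (4,), (5,)])
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(1, 10), (2, 20), (1, 5), (3, 20), (3, -5), (4, 5)],
)


def ranked_page(conn, limit, offset):
    """Return one page of (rank, user_id, total); tied totals share a rank."""
    query = """
    SELECT DENSE_RANK() OVER (ORDER BY total DESC) AS rnk, user_id, total
    FROM (
        -- LEFT JOIN keeps users without points; COALESCE gives them a total of 0
        SELECT u.id AS user_id, COALESCE(SUM(p.amount), 0) AS total
        FROM users AS u
        LEFT JOIN points AS p ON p.user_id = u.id
        GROUP BY u.id
    ) AS totals
    ORDER BY total DESC, user_id
    LIMIT ? OFFSET ?
    """
    return conn.execute(query, (limit, offset)).fetchall()


print(ranked_page(conn, 2, 0))  # [(1, 2, 20), (2, 1, 15)]
print(ranked_page(conn, 2, 2))  # [(2, 3, 15), (3, 4, 5)]
```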

Selecting sum of single column depending on two columns on same table

I have a table very similar to the one below. p1 and p2 on the table refer to id of player on an another table.
id score p1 p2 date
-- ----- -- -- ----
1 12 1 2 2011.10.21
2 23 3 4 2011.10.22
3 21 1 3 2011.10.23
4 35 5 1 2011.10.24
5 11 2 3 2011.10.25
What I want to do is get the player id (p1 or p2) with the highest score. My solution is something like select sum(score), but I can't form a query because a player may appear in either the p1 or the p2 column.
A bigger problem arises when I want to sort scores from highest to lowest. How can I sum and sort a score if I need to group by two separate columns? The result I want is similar to this output:
pID score times_played
--- ----- ------------
1 68 3
3 55 3
5 35 1
2 23 2
4 23 1
Is my database design flawed? If there is a smarter way, I'd like to know. Do I need separate single queries whose results I merge in PHP or something?
Any help would be appreciated.
Cheers.
PS: I couldn't think of a good title. Feel free to edit.

You can put the players in one column like so:
select id, score, p1 as player, date from yourtable
union all
select id, score, p2 as player, date from yourtable
You now have players in one column. You can do this to get the score sum for all players
select sum(score), player from (
select id, score, p1 as player, date from yourtable
union all
select id, score, p2 as player, date from yourtable
) t group by player
Now, you say that you also want to know how many times the player played and sort them in descending order:
select sum(score), player, count(*) as timesPlayed from (
select id, score, p1 as player, date from yourtable
union all
select id, score, p2 as player, date from yourtable
) t group by player order by sum(score) desc
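Here is the UNION ALL approach run against the sample rows from the question, as a minimal sketch on Python's built-in sqlite3 module. The aliases unpivoted and total_score, and the secondary sort on times_played, are choices made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER, score INTEGER, p1 INTEGER, p2 INTEGER, date TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 12, 1, 2, "2011.10.21"),
        (2, 23, 3, 4, "2011.10.22"),
        (3, 21, 1, 3, "2011.10.23"),
        (4, 35, 5, 1, "2011.10.24"),
        (5, 11, 2, 3, "2011.10.25"),
    ],
)

query = """
SELECT player AS pID, SUM(score) AS total_score, COUNT(*) AS times_played
FROM (
    -- each game contributes one row per participant
    SELECT score, p1 AS player FROM yourtable
    UNION ALL
    SELECT score, p2 AS player FROM yourtable
) AS unpivoted
GROUP BY player
ORDER BY total_score DESC, times_played DESC
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

With the sample rows this reproduces the table in the question, from player 1 (68 points over 3 games) down to player 4 (23 points over 1 game).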

Try this to get the players of the game with the highest score (disregarding ties); yourtable stands in for your table name:
select id, p1, p2
from yourtable t1
join (select max(score) as MaxS from yourtable) xx on xx.MaxS = t1.score
limit 1
To get each player's total score, try this:
select Player as pID, sum(Tot) as Score, sum(Cnt) as TimesPlayed
from
(
select p1 as Player, sum(score) as Tot, count(*) as Cnt
from yourtable
group by p1
union all
select p2, sum(score), count(*)
from yourtable
group by p2
) xx
group by xx.Player
order by Score desc

Alternatively to using UNION (ALL) on the table, you could try something like this:
SELECT
CASE p.PlayerNumber WHEN 1 THEN t.p1 ELSE t.p2 END AS pID,
SUM(t.score) AS score,
COUNT(*) AS times_played
FROM atable t
CROSS JOIN (SELECT 1 AS PlayerNumber UNION ALL SELECT 2) p
GROUP BY
pID /* this is probably MySQL-specific; most, if not all, other major
database systems would require repeating the entire pID expression here, i.e.
GROUP BY
CASE p.PlayerNumber WHEN 1 THEN t.p1 ELSE t.p2 END
*/
ORDER BY
score DESC,
times_played DESC /* this is based on your result set;
you might want to omit it or change it to ASC */
UPDATE, in an answer to a question in the comments: joining the result set to the user table:
SELECT
`user`.*, /* you should probably specify
the necessary columns explicitly */
totals.score,
totals.times_played
FROM `user`
INNER JOIN (
SELECT
CASE p.PlayerNumber WHEN 1 THEN t.p1 ELSE t.p2 END AS pID,
SUM(t.score) AS score,
COUNT(*) AS times_played
FROM atable t
CROSS JOIN (SELECT 1 AS PlayerNumber UNION ALL SELECT 2) p
GROUP BY
pID
) totals ON user.id = totals.pID
ORDER BY
totals.score DESC,
totals.times_played DESC
