FIND_IN_SET too slow with GROUP_CONCAT (Dense Rank in MySQL) - mysql

I have a query that calculates dense ranks based on the value of a column:
SELECT id,
score1,
FIND_IN_SET
(
score1,
(
SELECT GROUP_CONCAT(score1 ORDER BY score1 DESC) FROM scores
)
) as rank
FROM score_news;
This is what the query results look like:
+----+--------+------+
| id | score1 | rank |
+----+--------+------+
| 1 | 15 | 1 |
| 2 | 15 | 1 |
| 3 | 14 | 3 |
| 4 | 13 | 4 |
+----+--------+------+
The query takes N times longer when the number of scores increases N times. Is there any way I can optimize this? My table has on the order of 10^6 rows.
NOTE: I have already tried a technique using mysql user variables but I get inconsistent results when I run it on a large set. On investigation I found this in the MySQL docs:
The order of evaluation for user variables is undefined and may change
based on the elements contained within a given query. In SELECT @a, @a
:= @a+1 ..., you might think that MySQL will evaluate @a first and
then do an assignment second, but changing the query (for example, by
adding a GROUP BY, HAVING, or ORDER BY clause) may change the order of
evaluation...The general rule is never to assign a value to a user
variable in one part of a statement and use the same variable in some
other part of the same statement. You might get the results you
expect, but this is not guaranteed.
My attempt with user variables:
SELECT
a.id,
@prev := @curr as prev,
@curr := a.score1 as curr,
@rank := IF(@rank = 0, @rank + 1, IF(@prev > @curr, @rank+@ties, @rank)) AS rank,
@ties := IF(@prev = @curr, @ties+1, 1) AS ties
FROM
scores a,
(
SELECT
@curr := null,
@prev := null,
@rank := 0,
@ties := 1,
@total := count(*)
FROM scores
WHERE score1 is not null
) b
WHERE
score1 is not null
ORDER BY
score1 DESC

The solution with variables could work, but you need to first order the result set, and only then work with the variable assignments:
SELECT a.id,
@rank := IF(@curr = a.score1, @rank, @rank + @ties) AS rank,
@ties := IF(@curr = a.score1, @ties + 1, 1) AS ties,
@curr := a.score1 AS curr
FROM (SELECT * FROM scores WHERE score1 is NOT NULL ORDER BY score1 DESC) a,
(SELECT @curr := null, @rank := 0, @ties := 1) b
NB: I placed the curr column last in the select clause to save one variable.
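To make the single-pass logic easier to follow, here is a minimal Python sketch of the same walk over the sorted scores. The function name rank_with_ties is made up for illustration, and the local variables curr, rank and ties play the roles of the three user variables; it is not a replacement for the query, only a way to trace it.

```python
def rank_with_ties(scores):
    """Walk the scores from highest to lowest, as the sorted subquery does."""
    ordered = sorted(scores, reverse=True)
    curr, rank, ties = None, 0, 1  # same starting values as the init subquery
    ranked = []
    for score in ordered:
        if score == curr:
            ties += 1          # same score: rank stays, one more tie
        else:
            rank += ties       # new score: jump past all tied rows
            ties = 1
        curr = score           # remember the score for the next row
        ranked.append((score, rank))
    return ranked


result = rank_with_ties([15, 15, 14, 13])
print(result)
```

With the four scores from the question, the tied 15s share rank 1 and the next score gets rank 3, which is the gap pattern shown in the question's sample output.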

You can also use the following query to get your dense rank without using user-defined variables:
SELECT
a.*,
(SELECT
COUNT(DISTINCT score1)
FROM
scores b
WHERE a.`score1` < b.score1) + 1 rank
FROM
scores a
ORDER BY score1 DESC
An index on the score1 column might help.
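To compare the two subquery-based rankings side by side, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here (an assumption made only so the example runs without a server), and the output column is called score_rank instead of rank because RANK is a reserved word in MySQL 8.0. COUNT(DISTINCT ...) produces the dense rank, while COUNT(*) produces the rank with gaps that the question's sample output shows.

```python
import sqlite3

# In-memory SQLite table standing in for the MySQL `scores` table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, score1 INTEGER)")
conn.executemany(
    "INSERT INTO scores (id, score1) VALUES (?, ?)",
    [(1, 15), (2, 15), (3, 14), (4, 13)],
)

# Correlated subquery: count the higher scores, then add 1.
template = """
    SELECT a.id, a.score1,
           (SELECT {agg}
              FROM scores b
             WHERE b.score1 > a.score1) + 1 AS score_rank
      FROM scores a
     ORDER BY a.score1 DESC, a.id
"""

# Dense rank counts each higher score value once.
dense_rows = conn.execute(template.format(agg="COUNT(DISTINCT b.score1)")).fetchall()
# Rank with gaps counts every row that has a higher score.
gap_rows = conn.execute(template.format(agg="COUNT(*)")).fetchall()

print(dense_rows)
print(gap_rows)
```

Both variants run the subquery once per row, so an index on score1 matters a lot for large tables.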

Related

mySql rank sorting from-to

I got a table like this:
users:
| username | statistics |
----------- ---------------------------------------
0 | peter200 | { "gamesWon": 4, "gamesPlayed" : 4} |
1 | eminem33 | { "gamesWon": 7, "gamesPlayed" : 20} |
Note: (statistics = "JSON")
And I'd like to create a rank-list.
So the user with the highest number in statistics.gamesWon gets rank 1, and so on.
What I've got so far is something like this (which works exactly as I'd hoped):
SELECT username, statistics, @rank := @rank + 1 AS rank
FROM users, (SELECT @rank := 0) r
WHERE JSON_EXTRACT(statistics, '$.gamesWon')
ORDER BY JSON_EXTRACT(statistics, '$.gamesWon') DESC
So to my question: now I'd like to update the query above to return only a specific rank range (let's say from rank 2 to rank 10).
Adding AND rank > 2 AND rank < 10 to the WHERE clause does not seem to work. Any help would be really appreciated.
You need to wrap it with subquery:
SELECT *
FROM (SELECT username, statistics, @rank := @rank + 1 AS rank
FROM users, (SELECT @rank := 0) r
WHERE JSON_EXTRACT(statistics, '$.gamesWon')
ORDER BY JSON_EXTRACT(statistics, '$.gamesWon') DESC
) s
WHERE rank > 2 AND rank < 10
You don't actually need the subquery. You can use limit with offset:
SELECT username, statistics, @rank := @rank + 1 AS rank
FROM users CROSS JOIN (SELECT @rank := 0) r
WHERE JSON_EXTRACT(statistics, '$.gamesWon')
ORDER BY JSON_EXTRACT(statistics, '$.gamesWon') DESC
LIMIT 1, 9

Select recent n number of entries of all users from table

I have the table below and want to select only the last 2 entries for each user.
Source table:
-------------------------------------
UserId | QuizId(AID)|quizendtime(AID)|
--------------------------------------
1 10 2016-5-12
2 10 2016-5-12
1 11 2016-6-12
2 12 2016-8-12
3 12 2016-8-12
2 13 2016-8-12
1 14 2016-9-12
3 14 2016-9-12
3 11 2016-6-12
The expected output looks like this (it should list only the 2 most recent quiz entries for each user):
-------------------------------------
UserId | QuizId(AID)|quizendtime(AID)|
--------------------------------------
1 14 2016-9-12
1 11 2016-6-12
2 13 2016-8-12
2 12 2016-8-12
3 14 2016-9-12
3 12 2016-8-12
Any ideas on how to produce this output?
Using MySQL user defined variables you can accomplish this:
SELECT
t.UserId,
t.`QuizId(AID)`,
t.`quizendtime(AID)`
FROM
(
SELECT
*,
IF(@sameUser = UserId, @a := @a + 1 , @a := 1) row_number,
@sameUser := UserId
FROM your_table
CROSS JOIN (SELECT @a := 1, @sameUser := 0) var
ORDER BY UserId , `quizendtime(AID)` DESC
) AS t
WHERE t.row_number <= 2
Note: If you want at most x number of entries for each user then change the condition in where clause like below:
WHERE t.row_number <= x
Explanation:
SELECT
*,
IF(@sameUser = UserId, @a := @a + 1 , @a := 1) row_number,
@sameUser := UserId
FROM your_table
CROSS JOIN (SELECT @a := 1, @sameUser := 0) var
ORDER BY UserId , `quizendtime(AID)` DESC;
This query sorts all the data in ascending order of userId and descending order of quizendtime(AID).
Now take a walk on this (multi) sorted data.
Every time you see a new userId assign a row_number (1). If you see the same user again then just increase the row_number.
Finally, keeping only the records with row_number <= 2 leaves at most the two latest entries for each user.
EDIT: As Gordon pointed out, MySQL does not guarantee that expressions using user-defined variables are always evaluated in the same order, so the query above is slightly modified:
SELECT
t.UserId,
t.`QuizId(AID)`,
t.`quizendtime(AID)`
FROM
(
SELECT
*,
IF (
@sameUser = UserId,
@a := @a + 1,
IF(@sameUser := UserId, @a := 1, @a:= 1)
)AS row_number
FROM your_table
CROSS JOIN (SELECT @a := 1, @sameUser := 0) var
ORDER BY UserId , `quizendtime(AID)` DESC
) AS t
WHERE t.row_number <= 2;
User-defined variables are the key to the solution. But, it is very important to have all the variable assignments in a single expression. MySQL does not guarantee the order of evaluation of expressions in a select -- and, in fact, sometimes processes them in different orders.
select t.*
from (select t.*,
(@rn := if(@u = UserId, @rn + 1,
if(@u := UserId, 1, 1)
)
) as rn
from t cross join
(select @u := -1, @rn := 0) params
order by UserId, quizendtime desc
) t
where rn <= 2;
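The "sort, then number the rows within each user" idea can be traced outside the database as well. Below is a minimal Python sketch using the rows from the question; the function name latest_per_user is made up for illustration, and the dates are rewritten as ISO strings (an assumption) so that string order matches date order. Ties on the date are broken by the higher QuizId, which is also an assumption since the question does not say.

```python
from itertools import groupby, islice
from operator import itemgetter

# (UserId, QuizId, quizendtime) rows from the question, dates as ISO strings.
rows = [
    (1, 10, "2016-05-12"), (2, 10, "2016-05-12"), (1, 11, "2016-06-12"),
    (2, 12, "2016-08-12"), (3, 12, "2016-08-12"), (2, 13, "2016-08-12"),
    (1, 14, "2016-09-12"), (3, 14, "2016-09-12"), (3, 11, "2016-06-12"),
]


def latest_per_user(rows, n=2):
    """Return at most n most recent rows for each user."""
    # Newest first (ties broken by higher QuizId), then a stable sort by user.
    ordered = sorted(rows, key=lambda r: (r[2], r[1]), reverse=True)
    ordered.sort(key=itemgetter(0))
    result = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        result.extend(islice(group, n))  # keep the first n rows of each user
    return result


top_two = latest_per_user(rows)
for row in top_two:
    print(row)
```

Changing n gives the "at most x entries per user" variant mentioned above.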

mysql row number count down and dynamic number of row

I believe this can be solved with a temp table or a stored procedure, but I'd like to know whether it can be done in a single SQL statement.
Goal: List all rows with a countdown per year; the number of rows differs from year to year. Rows can be ordered by date.
Desired result:
|-Count Down-|-Date-------|
| 3 | 2013-01-01 | <- Start with number of Row of each year
| 2 | 2013-03-15 |
| 1 | 2013-06-07 |
| 5 | 2014-01-01 | <- Start with number of Row of each year
| 4 | 2014-03-17 |
| 3 | 2014-07-11 |
| 2 | 2014-08-05 |
| 1 | 2014-11-12 |
SQL:
Select @row_number:=@row_number-1 AS CountDown, Date
FROM table JOIN
(Select @row_number:=COUNT(*), year(date) FROM table GROUP BY year(date))
Is there any solution for that?
The subquery that gets the count by year needs to return the year, so you can join it with the main table to get the starting number for the countdown. And you need to detect when the year changes, so you need another variable for that.
SELECT @row_number := IF(YEAR(d.Date) = @prevYear, @row_number-1, y.c) AS CountDown,
d.Date, @prevYear := YEAR(d.Date)
FROM (SELECT Date
FROM Table1
ORDER BY Date) AS d
JOIN
(Select count(*) AS c, year(date) AS year
FROM Table1
GROUP BY year(date)) AS y
ON YEAR(d.Date) = y.year
CROSS JOIN (SELECT @prevYear := NULL) AS x
You can do the count down using variables (or correlated subqueries). The following does the count, but the returned data is not in the order you specify:
select (@rn := if(@y = year(date), @rn + 1,
if(@y := year(date), 1, 1)
)
) as CountDown, t1.*
from table1 t1 cross join
(select @y := 0, @rn := 0) vars
order by date desc;
That is easily fixed with another subquery:
select t.*
from (select (@rn := if(@y = year(date), @rn + 1,
if(@y := year(date), 1, 1)
)
) as CountDown, t1.*
from table1 t1 cross join
(select @y := 0, @rn := 0) vars
order by date desc
) t
order by date;
Note the complicated expression for assigning CountDown. This expression is setting both variables (@y and @rn) in a single expression. MySQL does not guarantee the order of evaluation of expressions in a select. If you assign these in different expressions, then they might be executed in the wrong order.
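As a cross-check of the expected numbers, here is a minimal Python sketch of the countdown: count the rows per year first, then walk the dates in order and hand out the remaining count. The function name countdown_by_year is made up for illustration, and the dates are the ones from the question's desired result, written as ISO strings.

```python
from collections import Counter

# Dates from the question's desired result, as ISO strings.
dates = [
    "2013-01-01", "2013-03-15", "2013-06-07",
    "2014-01-01", "2014-03-17", "2014-07-11", "2014-08-05", "2014-11-12",
]


def countdown_by_year(dates):
    """Pair each date with a counter that starts at its year's row count."""
    ordered = sorted(dates)
    remaining = Counter(d[:4] for d in ordered)  # rows per year
    result = []
    for d in ordered:
        year = d[:4]
        result.append((remaining[year], d))
        remaining[year] -= 1  # next row of the same year counts one lower
    return result


countdown = countdown_by_year(dates)
for count, d in countdown:
    print(count, d)
```

The per-year Counter corresponds to the GROUP BY year(date) subquery, and the decrement corresponds to the variable update in the outer select.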

How to get correct position on ties in mysql rankings

This is my code. It handles ties, but it does not skip positions after a tie:
SELECT `item`, (`totalrate` / `nrrates`),
@rank_count := @rank_count + (totalrate/nrrates < @prev_value) rank,
@prev_value := totalrate/nrrates avg
FROM table, (SELECT @prev_value := NULL, @rank_count := 1) init
ORDER BY avg DESC
Here is the output I get
item (`totalrate` / `nrrates`) rank avg
Virginia 10.0000 1 10
Ana 9.7500 2 9.75
Angeie 9.72 3 9.72
Carel 9.666666666 4 9.66
sammy 9.666666666 4 9.66
Oda 9.500000000 5 9.5
I want
item (`totalrate` / `nrrates`) rank avg
Virginia 10.0000 1 10
Ana 9.7500 2 9.75
Angeie 9.72 3 9.72
Carel 9.666666666 4 9.66
sammy 9.666666666 4 9.66
Oda 9.500000000 6 9.5
That is, position 5 should be skipped.
I would like to merge my code with the following query, which does skip positions on ties
(taken from the post
MySQL Rank in the Case of Ties):
SELECT t1.name, (SELECT COUNT(*) FROM table_1 t2 WHERE t2.score > t1.score) +1
AS rnk
FROM table_1 t1
How would I modify my code so that it skips positions the way the query above does? It looks simple, but I haven't figured it out.
Thanks
On ties, keep the current rank; when the average changes, use the running row number as the next rank.
The following should help:
SELECT `item`, @curr_avg := ( `totalrate` / `nrrates` )
, case when @prev_avg = @curr_avg then @rank := @rank
else @rank := ( @cur_row + 1 )
end as rank
, @cur_row := ( @cur_row + 1 ) as cur_row
, @prev_avg := @curr_avg avg
FROM table
, ( SELECT @prev_avg := 0, @curr_avg := 0
, @rank := 0, @cur_row := 0 ) init
ORDER BY avg DESC
Similar examples:
To display top 4 rows using rank
Mysql Query for Rank (RowNumber) and Groupings
Update a field with an incrementing value that resets based on
field
Here's another alternative. First, the averages are calculated. If they are already available in a table, it would be even easier (as can be seen in the fiddle demo). Anyways, the rank is based on the logic of counting how many items have a lesser average than the current item.
SELECT
A1.`item`,
A1.avg,
COUNT(A2.`item`) avg_rank
FROM
(
SELECT `item`, (`totalrate` / `nrrates`),
@prev_value := totalrate/nrrates avg
FROM table, (SELECT @prev_value := NULL, @rank_count := 1) init
) A1 -- alias for the inline view
INNER JOIN
(
SELECT `item`, (`totalrate` / `nrrates`),
@prev_value := totalrate/nrrates avg
FROM table, (SELECT @prev_value := NULL, @rank_count := 1) init
) A2 -- alias for the inline view
ON A2.avg < A1.avg
GROUP BY A1.id, A1.avg
ORDER BY A1.avg;

MySQL get rank of duplicate values based on their create dates

I'm looking for a way to rank the duplicate entries in my table based on their create dates. The oldest date gets rank 1, and so on for the next duplicates. It should look like this:
id number create_date rank
1 1 02/03 1
2 1 02/04 2
3 3 02/03 1
4 4 02/03 1
5 4 02/04 2
6 4 02/05 3
I tried searching for this, but I either can't follow how the solutions are implemented or they don't do what I want. I hope someone can help me with this.
select
t.*,
@rank := if(@prevDate = create_date, @rank, @rank + 1) as rank,
@prevDate := create_date
from
your_table t
, (select @rank := 0, @prevDate := null) var_init
order by create_date, id
Explanation:
Here
, (select @rank := 0, @prevDate := null) var_init
the variables are initialized. It's the same as writing
set @rank = 0;
set @prevDate = null;
select ... /*without the cross join*/;
Then the order of the columns in the select clause is important. First we check with this line
@rank := if(@prevDate = create_date, @rank, @rank + 1) as rank,
if the current row has the same date as the previous row. The @prevDate variable holds the value of the previous row. If yes, the @rank variable stays the same; if not, it's incremented.
In the next line
@prevDate := create_date
we set the @prevDate variable to the value of the current row. That's why the order of the columns in the select clause is important.
Finally, since we're checking with the previous row, if the dates differ, the order by clause is important.
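One thing worth checking: the query compares only create_date, so it numbers the distinct dates across the whole table. On the sample data that happens to give the expected ranks, but if the question means "rank within each number" (my reading, so treat it as an assumption), the rows have to be grouped by number first. A minimal Python sketch of that per-number ranking follows; the function name rank_duplicates is made up for illustration, and the "MM/DD" strings are compared as text, which only works when all dates fall in the same year.

```python
from itertools import groupby
from operator import itemgetter

# (id, number, create_date) rows from the question, "MM/DD" as given.
rows = [
    (1, 1, "02/03"), (2, 1, "02/04"), (3, 3, "02/03"),
    (4, 4, "02/03"), (5, 4, "02/04"), (6, 4, "02/05"),
]


def rank_duplicates(rows):
    """Rank rows within each `number`, oldest create_date first."""
    # Sort by number, then date, then id so that groupby sees each number once.
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[0]))
    ranked = []
    for _, group in groupby(ordered, key=itemgetter(1)):
        for rank, (row_id, number, created) in enumerate(group, start=1):
            ranked.append((row_id, number, created, rank))
    return ranked


ranked = rank_duplicates(rows)
for row in ranked:
    print(row)
```

In the SQL version this would mean ordering by number, create_date and resetting the rank whenever number changes.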