MySQL get rank of duplicate values based on their create dates - mysql

I'm looking for a way to rank the duplicate entries in my table by their create dates. The oldest entry should get rank 1, the next oldest rank 2, and so on. The result should look like this:
id  number  create_date  rank
1   1       02/03        1
2   1       02/04        2
3   3       02/03        1
4   4       02/03        1
5   4       02/04        2
6   4       02/05        3
I searched for this, but either I couldn't follow how the solutions work or they don't do quite what I want. I hope someone can help me with this.

select
t.*,
@rank := if(@prevDate = create_date, @rank, @rank + 1) as rank,
@prevDate := create_date
from
your_table t
, (select @rank := 0, @prevDate := null) var_init
order by create_date, id
Explanation:
Here
, (select @rank := 0, @prevDate := null) var_init
the variables are initialized. It's the same as writing
set @rank = 0;
set @prevDate = null;
select ... /* without the cross join */;
Then the order of the columns in the select clause is important. First, with this line
@rank := if(@prevDate = create_date, @rank, @rank + 1) as rank,
we check whether the current row has the same date as the previous row. @prevDate holds the value from the previous row. If the dates match, @rank stays the same; if not, it is incremented.
In the next line
@prevDate := create_date
we set @prevDate to the value of the current row. That's why the order of the columns in the select clause is important.
Finally, since each row is compared with the previous row to see whether the dates differ, the order by clause is important.
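If MySQL 8.0 or later is available, the ranking described in the question can also be sketched with a window function instead of user variables. This is only a sketch under assumptions: your_table and its columns are the placeholder names used above, and the alias rnk is chosen because RANK is a reserved word from MySQL 8.0.2 on. It numbers the rows within each number value from oldest to newest.

```sql
-- Assumes MySQL 8.0+ and the placeholder table your_table(id, number, create_date).
SELECT t.*,
       ROW_NUMBER() OVER (
           PARTITION BY t.`number`          -- restart the count for each duplicate group
           ORDER BY t.create_date, t.id     -- oldest first; id breaks ties on equal dates
       ) AS rnk
FROM your_table AS t
ORDER BY t.id;
```

Unlike the variable-based query, this does not depend on the order in which the select expressions are evaluated.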

Related

FIND_IN_SET too slow with GROUP_CONCAT (Dense Rank in MySQL)

I have a query that calculates dense ranks based on the value of a column :
SELECT id,
score1,
FIND_IN_SET
(
score1,
(
SELECT GROUP_CONCAT(score1 ORDER BY score1 DESC) FROM scores
)
) as rank
FROM score_news;
This is what the query results look like:
+----+--------+------+
| id | score1 | rank |
+----+--------+------+
| 1 | 15 | 1 |
| 2 | 15 | 1 |
| 3 | 14 | 3 |
| 4 | 13 | 4 |
+----+--------+------+
The query takes N times longer when the number of scores increases N-fold. Is there any way I can optimize this? My table size is on the order of 10^6.
NOTE: I have already tried a technique using mysql user variables but I get inconsistent results when I run it on a large set. On investigation I found this in the MySQL docs:
The order of evaluation for user variables is undefined and may change
based on the elements contained within a given query. In SELECT @a, @a
:= @a+1 ..., you might think that MySQL will evaluate @a first and
then do an assignment second, but changing the query (for example, by
adding a GROUP BY, HAVING, or ORDER BY clause) may change the order of
evaluation...The general rule is never to assign a value to a user
variable in one part of a statement and use the same variable in some
other part of the same statement. You might get the results you
expect, but this is not guaranteed.
My attempt with user variables :
SELECT
a.id,
@prev := @curr as prev,
@curr := a.score1 as curr,
@rank := IF(@rank = 0, @rank + 1, IF(@prev > @curr, @rank+@ties, @rank)) AS rank,
@ties := IF(@prev = @curr, @ties+1, 1) AS ties
FROM
scores a,
(
SELECT
@curr := null,
@prev := null,
@rank := 0,
@ties := 1,
@total := count(*)
FROM scores
WHERE score1 is not null
) b
WHERE
score1 is not null
ORDER BY
score1 DESC
The solution with variables could work, but you need to first order the result set, and only then work with the variable assignments:
SELECT a.id,
@rank := IF(@curr = a.score1, @rank, @rank + @ties) AS rank,
@ties := IF(@curr = a.score1, @ties + 1, 1) AS ties,
@curr := a.score1 AS curr
FROM (SELECT * FROM scores WHERE score1 is NOT NULL ORDER BY score1 DESC) a,
(SELECT @curr := null, @rank := 0, @ties := 1) b
NB: I placed the curr column last in the select clause to save one variable.
You can also use the following query to get your dense rank without using user-defined variables:
SELECT
a.*,
(SELECT
COUNT(DISTINCT score1)
FROM
scores b
WHERE a.`score1` < b.score1) + 1 rank
FROM
scores a
ORDER BY score1 DESC
An index on the score1 column might help.
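On MySQL 8.0 or later, both kinds of rank discussed here are available as window functions, which avoids the GROUP_CONCAT string as well as user variables. The following is a sketch under that version assumption, using the scores table and score1 column from the question; the aliases rnk and dense_rnk are made up for illustration.

```sql
-- Assumes MySQL 8.0+; RANK and DENSE_RANK are reserved words there, hence the aliases.
SELECT id,
       score1,
       RANK()       OVER (ORDER BY score1 DESC) AS rnk,        -- leaves gaps after ties
       DENSE_RANK() OVER (ORDER BY score1 DESC) AS dense_rnk   -- no gaps after ties
FROM scores
WHERE score1 IS NOT NULL;
```

For the sample scores 15, 15, 14, 13, RANK() yields 1, 1, 3, 4 (the output shown in the question), while DENSE_RANK() yields 1, 1, 2, 3.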

Select recent n number of entries of all users from table

I have the table below and want to select only the last 2 entries for each user.
Source table:
----------------------------------------
UserId | QuizId(AID) | quizendtime(AID)
----------------------------------------
1      | 10          | 2016-5-12
2      | 10          | 2016-5-12
1      | 11          | 2016-6-12
2      | 12          | 2016-8-12
3      | 12          | 2016-8-12
2      | 13          | 2016-8-12
1      | 14          | 2016-9-12
3      | 14          | 2016-9-12
3      | 11          | 2016-6-12
The expected output should list only the 2 most recent QuizId entries for each user:
----------------------------------------
UserId | QuizId(AID) | quizendtime(AID)
----------------------------------------
1      | 14          | 2016-9-12
1      | 11          | 2016-6-12
2      | 13          | 2016-8-12
2      | 12          | 2016-8-12
3      | 14          | 2016-9-12
3      | 12          | 2016-8-12
Any ideas on how to produce this output?
Using MySQL user defined variables you can accomplish this:
SELECT
t.UserId,
t.`QuizId(AID)`,
t.`quizendtime(AID)`
FROM
(
SELECT
*,
IF(@sameUser = UserId, @a := @a + 1 , @a := 1) row_number,
@sameUser := UserId
FROM your_table
CROSS JOIN (SELECT @a := 1, @sameUser := 0) var
ORDER BY UserId , `quizendtime(AID)` DESC
) AS t
WHERE t.row_number <= 2
Note: If you want at most x number of entries for each user then change the condition in where clause like below:
WHERE t.row_number <= x
Explanation:
SELECT
*,
IF(@sameUser = UserId, @a := @a + 1 , @a := 1) row_number,
@sameUser := UserId
FROM your_table
CROSS JOIN (SELECT @a := 1, @sameUser := 0) var
ORDER BY UserId , `quizendtime(AID)` DESC;
This query sorts the data by UserId in ascending order and by quizendtime(AID) in descending order.
Now walk through this sorted data.
Every time you see a new UserId, assign it row_number 1. If you see the same user again, increment row_number.
Finally, keeping only the records with row_number <= 2 yields at most the two latest entries for each user.
EDIT: As Gordon pointed out, MySQL does not guarantee the evaluation order of expressions that use user-defined variables, so the query above is slightly modified to do all assignments in a single expression:
SELECT
t.UserId,
t.`QuizId(AID)`,
t.`quizendtime(AID)`
FROM
(
SELECT
*,
IF (
@sameUser = UserId,
@a := @a + 1,
IF(@sameUser := UserId, @a := 1, @a:= 1)
)AS row_number
FROM your_table
CROSS JOIN (SELECT @a := 1, @sameUser := 0) var
ORDER BY UserId , `quizendtime(AID)` DESC
) AS t
WHERE t.row_number <= 2;
User-defined variables are the key to the solution. But, it is very important to have all the variable assignments in a single expression. MySQL does not guarantee the order of evaluation of expressions in a select -- and, in fact, sometimes processes them in different orders.
select t.*
from (select t.*,
(@rn := if(@u = UserId, @rn + 1,
if(@u := UserId, 1, 1)
)
) as rn
from t cross join
(select @u := -1, @rn := 0) params
order by UserId, quizendtime desc
) t
where rn <= 2;
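For readers on MySQL 8.0 or later, the same "latest two per user" selection can be sketched with ROW_NUMBER(), which sidesteps the evaluation-order concern entirely. The assumptions here are the version, the placeholder table name your_table, and that `quizendtime(AID)` is stored as a DATE or DATETIME so that it sorts chronologically. The second sort key on `QuizId(AID)` is an added tiebreaker for rows with the same end time, not something the question specifies.

```sql
-- Assumes MySQL 8.0+ and a date-typed `quizendtime(AID)` column.
SELECT UserId, `QuizId(AID)`, `quizendtime(AID)`
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY UserId                               -- number rows per user
               ORDER BY `quizendtime(AID)` DESC, `QuizId(AID)` DESC
           ) AS rn
    FROM your_table AS t
) AS ranked
WHERE rn <= 2                                                    -- keep the two newest rows
ORDER BY UserId, `quizendtime(AID)` DESC;
```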

MySQL Query get the last N rows per Group

Suppose that I have a database which contains the following columns:
VehicleID|timestamp|lat|lon|
The same VehicleId can appear multiple times, each time with a different timestamp, so (VehicleId, Timestamp) is the primary key.
Now I would like to get the last N measurements per VehicleId, or the first N measurements per VehicleId.
How can I list the last N tuples per VehicleId according to an ordering column (in our case, timestamp)?
Example:
|VehicleId|Timestamp|
1|1
1|2
1|3
2|1
2|2
2|3
5|5
5|6
5|7
In MySQL, this is most easily done using variables:
select t.*
from (select t.*,
(@rn := if(@v = VehicleId, @rn + 1,
if(@v := VehicleId, 1, 1)
)
) as rn
from `table` t cross join
(select @v := -1, @rn := 0) params
order by VehicleId, timestamp desc
) t
where rn <= 3;
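A variable-free alternative that also works on older MySQL versions is a correlated subquery that counts, for each row, how many rows of the same vehicle are newer. This is a sketch, with your_table as a placeholder table name and N fixed at 3. It relies on (VehicleId, timestamp) being the primary key, as stated in the question, so there are no ties within a vehicle.

```sql
-- Keeps a row when fewer than 3 rows of the same vehicle have a later timestamp.
SELECT t.*
FROM your_table AS t
WHERE (
    SELECT COUNT(*)
    FROM your_table AS newer
    WHERE newer.VehicleId = t.VehicleId
      AND newer.`timestamp` > t.`timestamp`
) < 3
ORDER BY t.VehicleId, t.`timestamp` DESC;
```

Flipping the comparison to `<` would give the first N measurements instead. The subquery runs once per row, so on large tables it will likely be slower than the single-pass variable approach; the primary key on (VehicleId, timestamp) should help.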

How to get correct position on ties in mysql rankings

This is my code. It handles ties, but it does not skip positions after a tie:
SELECT `item`, (`totalrate` / `nrrates`),
@rank_count := @rank_count + (totalrate/nrrates < @prev_value) rank,
@prev_value := totalrate/nrrates avg
FROM `table`, (SELECT @prev_value := NULL, @rank_count := 1) init
ORDER BY avg DESC
Here is the output I get:
item      (`totalrate` / `nrrates`)  rank  avg
Virginia  10.0000                    1     10
Ana       9.7500                     2     9.75
Angeie    9.72                       3     9.72
Carel     9.666666666                4     9.66
sammy     9.666666666                4     9.66
Oda       9.500000000                5     9.5
I want
item      (`totalrate` / `nrrates`)  rank  avg
Virginia  10.0000                    1     10
Ana       9.7500                     2     9.75
Angeie    9.72                       3     9.72
Carel     9.666666666                4     9.66
sammy     9.666666666                4     9.66
Oda       9.500000000                6     9.5
That is, position 5 should be skipped.
I would like to combine my query with the following one, which does skip positions on ties (taken from the post "MySQL Rank in the Case of Ties"):
SELECT t1.name, (SELECT COUNT(*) FROM table_1 t2 WHERE t2.score > t1.score) +1
AS rnk
FROM table_1 t1
How would I modify my code so that it skips positions the way the query above does? It looks simple, but I haven't figured it out.
Thanks
On ties, you may want to skip positions: keep a running row counter and, whenever the average changes, use the current row number as the next rank.
The following should help:
SELECT `item`, @curr_avg := ( `totalrate` / `nrrates` )
, case when @prev_avg = @curr_avg then @rank := @rank
else @rank := ( @cur_row + 1 )
end as rank
, @cur_row := ( @cur_row + 1 ) as cur_row
, @prev_avg := @curr_avg avg
FROM `table`
, ( SELECT @prev_avg := 0, @curr_avg := 0
, @rank := 0, @cur_row := 0 ) init
ORDER BY avg DESC
Similar examples:
To display top 4 rows using rank
Mysql Query for Rank (RowNumber) and Groupings
Update a field with an incrementing value that resets based on field
Here's another alternative. First, the averages are calculated; if they are already available in a table, it is even easier (as can be seen in the fiddle demo). The rank is then based on counting how many items have a lesser average than the current item.
SELECT
A1.`item`,
A1.avg,
COUNT(A2.`item`) avg_rank
FROM
(
SELECT `item`, (`totalrate` / `nrrates`),
@prev_value := totalrate/nrrates avg
FROM `table`, (SELECT @prev_value := NULL, @rank_count := 1) init
) A1 -- alias for the inline view
INNER JOIN
(
SELECT `item`, (`totalrate` / `nrrates`),
@prev_value := totalrate/nrrates avg
FROM `table`, (SELECT @prev_value := NULL, @rank_count := 1) init
) A2 -- alias for the inline view
ON A2.avg < A1.avg
GROUP BY A1.`item`, A1.avg
ORDER BY A1.avg;
SQL Fiddle demo
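The merge the question asks for can be sketched directly: apply the counting subquery from the "MySQL Rank in the Case of Ties" snippet to the computed average instead of a stored score. This is a sketch rather than a tested answer; the aliases avg_rate and rnk are made up for illustration, and `table` is the question's placeholder table name (backticked because TABLE is a reserved word).

```sql
-- Rank = 1 + number of items with a strictly higher average, so ties share a
-- rank and the following position is skipped (1, 2, 3, 4, 4, 6 for the sample data).
SELECT t1.`item`,
       t1.totalrate / t1.nrrates AS avg_rate,
       (SELECT COUNT(*)
        FROM `table` AS t2
        WHERE t2.totalrate / t2.nrrates > t1.totalrate / t1.nrrates) + 1 AS rnk
FROM `table` AS t1
ORDER BY avg_rate DESC;
```

No user variables are involved, so the result does not depend on evaluation order, at the cost of one subquery execution per row.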

ranking results of mysql query using AVG

I have a query that ranks results in MySQL:
SET @rank := 0;
SELECT Name, Score, @rank := @rank + 1
FROM Results
ORDER BY Score
This works fine until I try to base the ranking on the average score:
SET @rank := 0;
SELECT Name, AVG(Score) as AvScore, @rank := @rank + 1
FROM Results
ORDER BY AvScore
If I run this I get just the one record back because of the AVG. However, if I add a GROUP BY on Name so that I can get the averages listed for everyone, this has the effect of messing up the correct rankings.
I know the answer's probably staring me in the face but I can't quite get it. How can I output a ranking for each name based on their average result?
You need to use a sub-query:
SET @rank := 0;
SELECT a.name,
a.avscore,
@rank := @rank + 1
FROM (SELECT name,
Avg(score) AS AvScore
FROM results
GROUP BY name) a
ORDER BY a.avscore
You have to order first and then select rank from a derived table:
SELECT Name, AvScore, @rank := @rank + 1
FROM (
SELECT Name, AVG(Score) AS AvScore FROM Results
GROUP BY Name ORDER BY AVG(Score)
) t1, (SELECT @rank := 0) t2;
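On MySQL 8.0 or later, the counter variable can be replaced by a window function applied to the aggregated derived table. A minimal sketch, assuming that version and the Results(Name, Score) table from the question; the alias score_rank is hypothetical.

```sql
-- Assumes MySQL 8.0+. The derived table computes one average per name,
-- then ROW_NUMBER() numbers those averages in ascending order.
SELECT a.Name,
       a.AvScore,
       ROW_NUMBER() OVER (ORDER BY a.AvScore) AS score_rank
FROM (SELECT Name, AVG(Score) AS AvScore
      FROM Results
      GROUP BY Name) AS a
ORDER BY a.AvScore;
```

ROW_NUMBER() mirrors the `@rank + 1` counter, giving each name a distinct position; swapping in RANK() would give equal averages the same position.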