I'd like to solve this LeetCode problem, https://leetcode.com/problems/rank-scores/, using MySQL following this example in the docs (https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_rank):
SELECT score, RANK() OVER w as 'rank'
FROM scores
WINDOW w AS (ORDER BY score DESC);
I've created a test database (using the Django ORM) in which this works fine:
mysql> SELECT score, RANK() OVER w as 'rank' FROM scores WINDOW w AS (ORDER BY score DESC);
+-------+------+
| score | rank |
+-------+------+
| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 3 |
| 3.65 | 4 |
| 3.65 | 4 |
| 3.50 | 6 |
+-------+------+
6 rows in set (0.00 sec)
However, if I enter this in LeetCode I get a syntax error:
Any idea what the problem is here? Perhaps RANK() is a new function which the MySQL version on LeetCode doesn't have yet?
RANK() is not supported in MySQL 5.7.21; window functions such as RANK() are only available from MySQL 8.0 onward. You can try the query below instead:
SELECT Score,
(SELECT count(1) FROM (SELECT distinct Score s FROM Scores) tmp WHERE s >= Score) 'rank'
FROM Scores
ORDER BY Score desc
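Note that the window-function query in the question and the subquery above do not number ties the same way: RANK() leaves gaps after ties, while counting distinct scores gives consecutive ranks. A minimal Python sketch of the difference, using the six sample scores as an in-memory stand-in for the table (the function names are made up for illustration):

```python
# Sample scores from the question; a plain list stands in for the table.
scores = [3.50, 3.65, 4.00, 3.85, 4.00, 3.65]

def rank_with_gaps(values):
    """Mimics RANK(): 1 + the number of rows with a strictly higher value."""
    return [(v, 1 + sum(other > v for other in values))
            for v in sorted(values, reverse=True)]

def dense_rank(values):
    """Mimics the subquery: the number of distinct values >= the current one."""
    distinct = set(values)
    return [(v, sum(d >= v for d in distinct))
            for v in sorted(values, reverse=True)]

gapped = [r for _, r in rank_with_gaps(scores)]
dense = [r for _, r in dense_rank(scores)]
print(gapped)  # [1, 1, 3, 4, 4, 6]
print(dense)   # [1, 1, 2, 3, 3, 4]
```

The first list matches the RANK() output shown in the question; the second has no holes between ranks.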
Similar to this question, I have the following table in MySQL 8.0.15:
CREATE TABLE golf_scores (id INT PRIMARY KEY AUTO_INCREMENT, person TEXT, score INT, age INT);
INSERT INTO golf_scores (person, score, age) VALUES ('Angela', 40, 25),('Angela', 45, 25),('Angela', 55, 25),('Peter',45, 32),('Peter',55,32),('Rachel', 65, 35),('Rachel',75,35),('Jeff',75, 16);
SELECT * FROM golf_scores;
+----+--------+-------+------+
| id | person | score | age |
+----+--------+-------+------+
| 1 | Angela | 40 | 25 |
| 2 | Angela | 45 | 25 |
| 3 | Angela | 55 | 25 |
| 4 | Peter | 45 | 32 |
| 5 | Peter | 55 | 32 |
| 6 | Rachel | 65 | 35 |
| 7 | Rachel | 75 | 35 |
| 8 | Jeff | 75 | 16 |
+----+--------+-------+------+
We want to select the following "best" 3 rows:
+----+--------+-------+------+
| id | person | score | age |
+----+--------+-------+------+
| 1 | Angela | 40 | 25 |
| 4 | Peter | 45 | 32 |
| 6 | Rachel | 65 | 35 |
+----+--------+-------+------+
In other words, the lowest 3 golf scores without having duplicates by person, and also the other columns from that row. I'm not worried about ties; I'd still just like three results.
The query SELECT person, MIN(score) as min_score FROM golf_scores GROUP BY person ORDER BY min_score LIMIT 3; gives the right rows, but is limited to the columns person and score. When I try to modify it like this:
SELECT id, person, MIN(score) as min_score, age FROM golf_scores GROUP BY person ORDER BY min_score LIMIT 3;
I get this error:
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'records.golf_scores.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I also tried eliminating duplicate names with SELECT id, DISTINCT person, score, age FROM golf_scores ORDER BY score LIMIT 3 but I get an error
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'DISTINCT person, score FROM golf_scores ORDER BY score LIMIT 3' at line 1
How can I get the desired output in MySQL?
Use row_number():
select t.*
from (select t.*, row_number() over (partition by person order by score) as seqnum
from golf_scores t
) t
where seqnum = 1
order by score asc
limit 3;
In older versions, you can do this by using a correlated subquery and id:
select gs.*
from golf_scores gs
where gs.id = (select gs2.id
from golf_scores gs2
where gs2.person = gs.person
order by gs2.score asc
limit 1
)
order by score asc
limit 3;
This may also be the fastest way with an index on golf_scores(person, score, id).
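To see what both queries compute, here is a rough Python sketch of the same idea: keep each person's lowest-scoring row, then take the three lowest of those. The tuple layout and the function name are assumptions for illustration, and ties are broken by id, which the row_number() query itself does not guarantee.

```python
from itertools import groupby
from operator import itemgetter

# (id, person, score, age) rows copied from the question.
rows = [
    (1, "Angela", 40, 25), (2, "Angela", 45, 25), (3, "Angela", 55, 25),
    (4, "Peter", 45, 32), (5, "Peter", 55, 32),
    (6, "Rachel", 65, 35), (7, "Rachel", 75, 35), (8, "Jeff", 75, 16),
]

def best_per_person(rows, limit=3):
    # Sort by (person, score, id) so the first row of each person group
    # is the one that would get seqnum = 1.
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[0]))
    firsts = [next(group) for _, group in groupby(ordered, key=itemgetter(1))]
    # Outer "order by score asc limit 3".
    return sorted(firsts, key=lambda r: (r[2], r[0]))[:limit]

best = best_per_person(rows)
for row in best:
    print(row)
```

With the sample data this yields the rows with ids 1, 4 and 6, which are the three rows the question asks for.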
Here's one way:
SELECT x.*
FROM golf_scores x
JOIN
( SELECT MIN(id) id FROM
( SELECT a.*
FROM golf_scores a
JOIN
( SELECT person, MIN(score) score FROM golf_scores GROUP BY person ) b
ON b.person = a.person
AND b.score = a.score
) n
GROUP
BY person
, score
) y
ON y.id = x.id
ORDER
BY x.score LIMIT 3;
I have a question about SQL. I have a table that looks like this.
+----+-------+
| Id | Score |
+----+-------+
| 1 | 3.50 |
| 2 | 3.65 |
| 3 | 4.00 |
| 4 | 3.85 |
| 5 | 4.00 |
| 6 | 3.65 |
+----+-------+
The table is called 'Scores' and after ranking the score here, it will look like this,
+-------+------+
| Score | Rank |
+-------+------+
| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 2 |
| 3.65 | 3 |
| 3.65 | 3 |
| 3.50 | 4 |
+-------+------+
Here is a sample answer but I am confused about the part after WHERE.
select
s.Score,
(select count(distinct Score) from Scores where Score >= s.Score)
Rank
from Scores s
order by s.Score Desc;
The condition Score >= s.Score looks like the Score column being compared with itself, and this part confuses me. How does it work? Thank you!
E.
One way to understand this is to just run the query for each row of your sample data. Starting with the first row of the expected output, we see that the score is 4.00. The correlated subquery in the select clause:
(select count(distinct Score) from Scores where Score >= s.Score)
will return a count of 1, because there is only one distinct score that is greater than or equal to 4.00 (4.00 itself, even though it appears in two rows). The same holds for the second record, which also has a score of 4.00. For the score 3.85, the subquery finds a distinct count of 2, because two distinct scores are greater than or equal to 3.85, namely 3.85 and 4.00. You can apply this logic across the whole table to convince yourself of how the query works.
+-------+------+
| Score | Rank |
+-------+------+
| 4.00 | 1 | <-- 1 score >= 4.00
| 4.00 | 1 | <-- 1 score >= 4.00
| 3.85 | 2 | <-- 2 scores >= 3.85
| 3.65 | 3 | <-- 3 scores >= 3.65
| 3.65 | 3 | <-- 3 scores >= 3.65
| 3.50 | 4 | <-- 4 scores >= 3.50
+-------+------+
This is known as a dependent (correlated) subquery, and it can be quite inefficient. A dependent subquery "depends" on a value from the outer query, so it is evaluated once for every row the outer query produces, using that row's values. In this case each result row already has a specific value of s.Score.
The unqualified Score inside the dependent subquery refers to the subquery's own Scores table, not to the row of the outer query.
It may be more clear with an additional alias:
select
s.Score,
(select count(distinct other_scores.Score)
from Scores other_scores
where other_scores.Score >= s.Score) Rank -- value of s.Score is known
-- and placed directly into dependent subquery
from Scores s
order by s.Score Desc;
"Modern" SQL dialects (including MySQL 8.0+) provide "RANK" and "DENSE_RANK" Window Functions to answer these sorts of queries. Window Functions, where applicable, are often much faster than dependent queries because the Query Planner can optimize at a higher level: these functions also have a tendency to tame otherwise gnarly SQL.
The MySQL 8+ SQL Syntax that ought to do the trick:
select
s.Score,
DENSE_RANK() over w AS `Rank` -- RANK is a reserved word in MySQL 8.0, so the alias must be quoted
from Scores s
window w as (order by Score desc)
There are also various workarounds to emulate ROW_NUMBER / Window Functions for older versions of MySQL.
Because it is a dependent subquery. The subquery needs to be re-evaluated for each row of the outer query. If you are familiar with Python, you can think of it like this:
from collections import namedtuple

ScoreTuple = namedtuple('ScoreTuple', ['Id', 'Score'])
Scores = [ScoreTuple(1, 3.50),
          ScoreTuple(2, 3.65),
          ScoreTuple(3, 4.00),
          ScoreTuple(4, 3.85),
          ScoreTuple(5, 4.00),
          ScoreTuple(6, 3.65)]
Rank = []
for s in Scores:  # each row from the outer query
    rank = len(set([innerScore.Score                   # SELECT COUNT(DISTINCT Score)
                    for innerScore in Scores           # FROM Scores
                    if innerScore.Score >= s.Score]))  # WHERE Score >= s.Score
    Rank.append(rank)
I have a table that looks like:
+-------------+--------+------------+
| Employee ID | Salary | Grievances |
+-------------+--------+------------+
| 101 | 70,000 | 12 |
| 102 | 90,000 | 100 |
| ... | ... | ... |
+-------------+--------+------------+
And I want to find all employees who are in the top-ten for salary, but the bottom-five for grievances. I (think I) know how to do this in SQL Server using ROW_NUMBER, but how to do it in MySQL? I've seen the goto question on doing this, but it doesn't really apply to a multiple column ordering.
If I understand correctly, you can do this with a self-join:
select s.*
from (select t.*
from t
order by salary desc
limit 10
) s join
(select t.*
from t
order by grievances asc
limit 5
) g
on s.employeeid = g.employeeid;
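As a rough illustration of what this join computes, the Python sketch below builds the two limited lists and keeps the employees that appear in both. Only the first two rows come from the question; the other rows, the field names and the function name are hypothetical, and the limits are shrunk so the small sample gives a visible result.

```python
# Rows 101 and 102 are from the question; the rest are made-up sample data.
employees = [
    {"employee_id": 101, "salary": 70000, "grievances": 12},
    {"employee_id": 102, "salary": 90000, "grievances": 100},
    {"employee_id": 103, "salary": 85000, "grievances": 3},
    {"employee_id": 104, "salary": 60000, "grievances": 1},
    {"employee_id": 105, "salary": 95000, "grievances": 7},
    {"employee_id": 106, "salary": 50000, "grievances": 0},
]

def top_salary_low_grievances(employees, top_n=10, bottom_n=5):
    # Subquery s: ORDER BY salary DESC LIMIT top_n
    top_paid = sorted(employees, key=lambda e: e["salary"], reverse=True)[:top_n]
    # Subquery g: ORDER BY grievances ASC LIMIT bottom_n
    fewest = sorted(employees, key=lambda e: e["grievances"])[:bottom_n]
    fewest_ids = {e["employee_id"] for e in fewest}
    # The join on employee id keeps only rows present in both lists.
    return [e for e in top_paid if e["employee_id"] in fewest_ids]

matches = top_salary_low_grievances(employees, top_n=3, bottom_n=3)
match_ids = [e["employee_id"] for e in matches]
print(match_ids)  # [103]
```

As with LIMIT in the SQL version, ties at the cut-off are resolved arbitrarily unless a tie-breaking column is added to the ordering.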
Below are the problem statement and the code I am currently using. Are there any smart ideas for improving the query's performance? I am using MySQL. Thanks.
Write a SQL query to rank scores. If there is a tie between two scores, both should have the same ranking. Note that after a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no "holes" between ranks.
+----+-------+
| Id | Score |
+----+-------+
| 1 | 3.50 |
| 2 | 3.65 |
| 3 | 4.00 |
| 4 | 3.85 |
| 5 | 4.00 |
| 6 | 3.65 |
+----+-------+
For example, given the above Scores table, your query should generate the following report (order by highest score):
+-------+------+
| Score | Rank |
+-------+------+
| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 2 |
| 3.65 | 3 |
| 3.65 | 3 |
| 3.50 | 4 |
+-------+------+
SELECT
s.score, scores_and_ranks.rank
FROM
Scores s
JOIN
(
SELECT
score_primary.score, COUNT(DISTINCT score_higher.score) + 1 AS rank
FROM
Scores score_primary
LEFT JOIN Scores score_higher
ON score_higher.score > score_primary.score
GROUP BY score_primary.score
) scores_and_ranks
ON s.score = scores_and_ranks.score
ORDER BY rank ASC;
BTW, post issue from Gordon's code.
BTW, tried sgeddes's code, but met with new issues,
New issue from Gordon's code,
thanks in advance,
Lin
User defined variables are probably faster than what you are doing. However, you need to be careful when using them. In particular, you cannot assign a variable in one expression and use it in another -- I mean, you can, but the expressions can be evaluated in any order so your code may not do what you intend.
So, you need to do all the work in a single expression:
select s.*,
(@rn := if(@s = score, @rn,
if(@s := score, @rn + 1, @rn + 1)
)
) as rank
from scores s cross join
(select @rn := 0, @s := 0) params
order by score desc;
One option is to use user-defined variables:
select score,
@rnk:=if(@prevScore=score,@rnk,@rnk+1) rnk,
@prevScore:=score
from scores
join (select @rnk:=0, @prevScore:=0) t
order by score desc
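The user-variable technique amounts to a single pass over the rows in descending score order, carrying the previous score and the current rank along. A minimal Python sketch of that idea (the function name is made up, and the list stands in for the Scores table):

```python
scores = [3.50, 3.65, 4.00, 3.85, 4.00, 3.65]

def dense_rank_single_pass(values):
    ranked = []
    rank = 0      # plays the role of the rank variable
    prev = None   # plays the role of the previous-score variable
    for score in sorted(values, reverse=True):  # ORDER BY score DESC
        if score != prev:
            rank += 1  # new score value: move to the next rank
        ranked.append((score, rank))
        prev = score
    return ranked

ranked = dense_rank_single_pass(scores)
for score, rank in ranked:
    print(score, rank)
```

Unlike this loop, MySQL does not guarantee the order in which expressions that read and assign user variables are evaluated, which is why the first answer insists on doing everything in one expression.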
I have a MySQL table which contains statistics about function usage for a program. What I retrieve from it basically looks like this (top 15 total here) :
SELECT function_id, data_timer, SUM( data_counter ) total
FROM data
GROUP BY function_id
ORDER BY total DESC
+-------------+------------+-------+
| function_id | data_timer | total |
+-------------+------------+-------+
| 56 | 567 | 4389 |
| 23 | 7880 | 1267 |
| 7 | 145 | 812 |
| ... | ... | ... |
+-------------+------------+-------+
Since those results are used in a website module where the user can select which column will be used to ORDER BY as well as between ASC and DESC, I needed to retrieve the rank of each row of the results.
With the help of this question, I was able to assign a rank to each row of the results :
SET @rank = 0;
SELECT @rank:=@rank+1 AS rank, function_id, data_timer, SUM( data_counter ) total
FROM data
WHERE client_id = 2
GROUP BY function_id
ORDER BY total DESC
+------+-------------+------------+-------+
| rank | function_id | data_timer | total |
+------+-------------+------------+-------+
| 1 | 56 | 567 | 4389 |
| 2 | 23 | 7880 | 1267 |
| 3 | 7 | 145 | 812 |
| ... | ... | ... | ... |
+------+-------------+------------+-------+
I am now having some difficulties trying to invert this table, meaning I would like to have the results sorted with the least used function first. Something like this (supposing there are 76 functions) :
+------+-------------+------------+-------+
| rank | function_id | data_timer | total |
+------+-------------+------------+-------+
| 76 | 44 | 346 | 1 |
| 75 | 2 | 3980 | 4 |
| 74 | 13 | 612 | 7 |
| ... | ... | ... | ... |
+------+-------------+------------+-------+
Here is my SQL query attempt :
SELECT rank, function_id, data_timer, total
FROM
(
SET @rank = 0;
SELECT @rank:=@rank+1 AS rank, function_id, data_timer, SUM( data_counter ) total
FROM data
WHERE client_id = 2
GROUP BY function_id
ORDER BY total DESC
)
ORDER BY rank DESC
It keeps giving me this error:
#1064 - You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near
'SET @rank = 0' at line 4
Since I'm not too skilled with SQL, I guess I'm missing something obvious.
Any help will be gladly appreciated, thanks!
You are trying to run a SET statement inside your sub-query. This won't work. Move the assignment outside of your sub-query and give the derived table an alias (MySQL requires one), and it should run.
SET @rank = 0;
SELECT rank, function_id, data_timer, total
FROM
(
SELECT @rank:=@rank+1 AS rank, function_id, data_timer, SUM( data_counter ) total
FROM data
WHERE client_id = 2
GROUP BY function_id
ORDER BY total DESC
) ranked
ORDER BY rank DESC
Another option is to initialize your @rank variable in your query instead of a separate statement:
SELECT rank, function_id, data_timer, total
FROM
(
SELECT @rank:=@rank+1 AS rank, function_id, data_timer, SUM( data_counter ) total
FROM data,
(SELECT @rank := 0 ) r
WHERE client_id = 2
GROUP BY function_id
ORDER BY total DESC
) r
ORDER BY rank DESC
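The logic of the query, rank in one order and then display in the reverse order, can be sketched in Python as follows. The totals are the six values shown in the question's tables, treated here as if they were the complete result set (so the ranks run to 6 rather than 76); the function name is made up.

```python
# function_id -> total, using the values visible in the question's tables.
totals = {56: 4389, 23: 1267, 7: 812, 13: 7, 2: 4, 44: 1}

def ranked_least_used_first(totals):
    # Inner query: number the rows in descending order of total.
    by_total_desc = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    ranked = [(rank, function_id, total)
              for rank, (function_id, total) in enumerate(by_total_desc, start=1)]
    # Outer query: ORDER BY rank DESC, so the least used function comes first.
    return sorted(ranked, key=lambda r: r[0], reverse=True)

result = ranked_least_used_first(totals)
for rank, function_id, total in result:
    print(rank, function_id, total)
```

The point is that the rank is fixed by the inner ordering before the outer sort is applied. The Python version makes that order explicit, whereas the SQL version relies on the variable being incremented in the inner ORDER BY order, which MySQL does not strictly guarantee.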