MySQL Ranking Query on 2 columns

Table
id user_id rank_solo lp
1 1 15 45
2 2 7 79
3 3 17 15
How can I write a ranking query that sorts on rank_solo (which ranges from 0 to 28) and, when two rows have the same rank_solo, uses lp (0-100) to break the tie?
(If lp is also equal, still assign distinct rankings so that there are no ties.)
The query should give me the ranking of one given user_id. How does this perform on 5m+ rows?
So
User_id 1 would have ranking 2
User_id 2 would have ranking 3
User_id 3 would have ranking 1

You can get the ranking using variables:
select t.*, (@rn := @rn + 1) as ranking
from t cross join
(select @rn := 0) params
order by rank_solo desc, lp;
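The question also asks for the ranking of one given user. A minimal sketch of one way to get it, by counting the rows that sort ahead of that user's row. It runs against Python's built-in sqlite3 as a stand-in for MySQL; the table name t follows the answer above, and the helper ranking_of is hypothetical. It assumes that a higher rank_solo ranks first, then a higher lp, with id as the last tie-breaker (the question does not state the direction for lp).

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "rank_solo INTEGER, lp INTEGER)"
)
conn.executemany(
    "INSERT INTO t (id, user_id, rank_solo, lp) VALUES (?, ?, ?, ?)",
    [(1, 1, 15, 45), (2, 2, 7, 79), (3, 3, 17, 15)],
)

# Ranking = 1 + number of rows that sort ahead of this user's row.
# Assumed order: rank_solo DESC, then lp DESC, then id ASC as final tie-breaker.
RANK_OF_USER = """
SELECT (
    SELECT 1 + COUNT(*)
    FROM t AS other
    WHERE other.rank_solo > me.rank_solo
       OR (other.rank_solo = me.rank_solo AND other.lp > me.lp)
       OR (other.rank_solo = me.rank_solo AND other.lp = me.lp
           AND other.id < me.id)
) AS ranking
FROM t AS me
WHERE me.user_id = ?
"""


def ranking_of(user_id):
    """Return the ranking of user_id, or None if the user has no row."""
    row = conn.execute(RANK_OF_USER, (user_id,)).fetchone()
    return row[0] if row else None


print([ranking_of(u) for u in (1, 2, 3)])
```

On the sample data this should reproduce the rankings listed in the question. On a large table the count is likely to benefit from an index that starts with rank_solo, lp, but that is worth checking with EXPLAIN on real data.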

You can use ORDER BY to sort your query:
SELECT *
FROM `Table`
ORDER BY rank_solo, lp

I'm not sure I quite understand what you're saying. With that many rows, create an index on the fields you're using to do your selects. For example, in MySQL client use:
create index RANKINGS on mytablename(rank_solo,lp,user_id);
Depending on what you use in your query to select the data, you may change the index or add another index with a different field combination. This has improved performance on my tables by a factor of 10 or more.
As for the query, if you're selecting a specific user then could you not just use:
select rank_solo from table where user_id={user id}
If you want the highest ranking individual, you could:
select * from yourtable order by rank_solo,lp limit 1
Remove the limit 1 to list them all.
If I've misunderstood, please comment.

An alternative would be to use a 2nd table.
table2 would have the following fields:
rank (auto_increment)
user_id
rank_solo
lp
Because the rank field is auto_increment, it fills in automatically as rows are inserted, with values beginning with "1".
Once the 2nd table is ready, just do this when you want to update the rankings (truncate rather than delete, so that the auto_increment counter restarts at 1):
truncate table table2;
insert into table2 (user_id, rank_solo, lp) select user_id,rank_solo,lp from table1 order by rank_solo,lp;
It may not be "elegant" but it gets the job done. Plus, if you create an index on both tables, this query would be very quick since the fields are numeric.

MySQL grouping with detail

I have a table that looks like this...
user_id, match_id, points_won
1 14 10
1 8 12
1 12 80
2 8 10
3 14 20
3 2 25
I want to write a MYSQL script that pulls back the most points a user has won in a single match and includes the match_id in the results - in other words...
user_id, match_id, max_points_won
1 12 80
2 8 10
3 2 25
Of course if I didn't need the match_id I could just do...
select user_id, max(points_won)
from table
group by user_id
But as soon as I add match_id to the "select" and "group by" I have a row for every match, and if I only add the match_id to the "select" (and not the "group by") then it won't correctly relate to the points_won.
Ideally I don't want to do the following either because it doesn't feel particularly safe (e.g. if the user has won the same amount of points on multiple matches)...
SELECT t.user_id, max(t.points_won) max_points_won
, (select t2.match_id
from table t2
where t2.user_id = t.user_id
and t2.points_won = max_points_won) as 'match_of_points_maximum'
FROM table t
GROUP BY t.user_id
Are there any more elegant options for this problem?
This is harder than it needs to be in MySQL. One method is a bit of a hack but it works in most circumstances. That is the group_concat()/substring_index() trick:
select user_id, max(points_won),
substring_index(group_concat(match_id order by points_won desc), ',', 1)
from table
group by user_id;
The group_concat() concatenates together all the match_ids, ordered by the points descending. The substring_index() then takes the first one.
Two important caveats:
The resulting expression has a type of string, regardless of the internal type.
The group_concat() uses an internal buffer, whose length -- by default -- is 1,024 characters. This default length can be changed.
You can use the query:
select user_id, max(points_won)
from table
group by user_id
as a derived table. Joining this to the original table gets you what you want:
select t1.user_id, t1.match_id, t2.max_points_won
from table as t1
join (
select user_id, max(points_won) as max_points_won
from table
group by user_id
) as t2 on t1.user_id = t2.user_id and t1.points_won = t2.max_points_won
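A small sketch of how the derived-table join behaves, including the tie case the question worries about. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the table is named matches here (a hypothetical name, since table is a reserved word). The extra tied row for user 2 is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (user_id INTEGER, match_id INTEGER, points_won INTEGER)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [(1, 14, 10), (1, 8, 12), (1, 12, 80), (2, 8, 10), (3, 14, 20), (3, 2, 25)],
)

# Join every row to its user's maximum; only rows equal to the maximum survive.
BEST_MATCH = """
SELECT t1.user_id, t1.match_id, t2.max_points_won
FROM matches AS t1
JOIN (
    SELECT user_id, MAX(points_won) AS max_points_won
    FROM matches
    GROUP BY user_id
) AS t2 ON t1.user_id = t2.user_id AND t1.points_won = t2.max_points_won
ORDER BY t1.user_id, t1.match_id
"""

best = conn.execute(BEST_MATCH).fetchall()

# Hypothetical extra row: user 2 wins the same 10 points in a second match.
conn.execute("INSERT INTO matches VALUES (2, 9, 10)")
with_tie = conn.execute(BEST_MATCH).fetchall()

print(best)
print(with_tie)
```

Without ties the join returns one row per user. With a tie it returns one row per tied match, so if exactly one row per user is needed, some extra rule (for example the lowest match_id) still has to pick among them.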
I think you can optimize your query by adding LIMIT 1 to the inner query.
SELECT t.user_id, max(t.points_won) max_points_won
, (select t2.match_id
from table t2
where t2.user_id = t.user_id
and t2.points_won = max_points_won limit 1) as 'match_of_points_maximum'
FROM table t
GROUP BY t.user_id
EDIT: only for PostgreSQL, SQL Server and Oracle (MySQL gained window functions in version 8.0)
You could use row_number:
SELECT USER_ID, MATCH_ID, POINTS_WON
FROM
(
SELECT user_id, match_id, points_won, row_number() over (partition by user_id order by points_won desc) rn
from table
) q
where q.rn = 1
For a similar function, have a look at Gordon Linoff's answer or at this article.
In your example, you partition the result set per user, then order by points_won desc so that the highest number of points comes first.

MySQL : Group By Clause Not Using Index when used with Case

I'm using MySQL.
I can't change the DB structure, so that's not an option, sadly.
THE ISSUE:
When I use GROUP BY with CASE (as needed in my situation), MySQL uses
filesort and the delay is huge (approx. 2-3 minutes):
http://sqlfiddle.com/#!9/f97d8/11/0
But when I don't use CASE and just GROUP BY group_id, MySQL readily uses
the index and the result is fast:
http://sqlfiddle.com/#!9/f97d8/12/0
Scenario: DETAILED
Table msgs, containing records of sent messages, with fields:
id,
user_id, (the guy who sent the message)
type, (0 means it's a group msg; all msgs sent to a group are marked with its group_id. So if 5 msgs were sent under group_id = 5, the table has 5 records with group_id = 5 and type = 0. For type > 0, group_id is NULL, because all other types are individual msgs sent to a single recipient)
group_id (if type=0, will contain group_id, else NULL)
Table contains approx 10 million records for user id 50001 and with different types (i.e group as well as individual msgs)
Now the QUERY:
SELECT
msgs.*
FROM
msgs
INNER JOIN accounts
ON (
msgs.user_id = accounts.id
)
WHERE 1
AND msgs.user_id IN (50111)
AND msgs.type IN (0, 1, 5, 7)
GROUP BY CASE `msgs`.`type` WHEN 0 THEN `msgs`.`group_id` ELSE `msgs`.`id` END
ORDER BY `msgs`.`group_id` DESC
LIMIT 100
I HAVE to get the summary in a single QUERY,
so msgs sent to a group, let's say group 5 (which has 5 records in this table), will be shown as 1 record in the summary (I may show a COUNT later, but that's not an issue).
The individual msgs have NULL as group_id, so I can't just put 'GROUP BY group_id', because that would group all individual msgs into a single record, which is not acceptable.
Sample output can be something like:
id owner_id, type group_id COUNT
1 50001 0 2 5
1 50001 1 NULL 1
1 50001 4 NULL 1
1 50001 0 7 5
1 50001 5 NULL 1
1 50001 5 NULL 1
1 50001 5 NULL 1
1 50001 0 10 5
Now the problem is that the GROUP BY condition with CASE (which I currently think I need, because I only want to group by group_id if type=0) causes a lot of delay because it doesn't use the indexes, which it does if I don't use CASE (i.e. just group by group_id). Please view the SQLFiddles above to see the EXPLAIN results.
Can anyone please advise how to get this optimized?
UPDATE
I tried a workaround that does sort of work (it drops the INITIAL queries to 1 sec). It uses a UNION: instead of one huge resultset that forces MySQL to write to disk for the filesort, it limits the resultset of the group msgs and of the individual msgs separately (see the query below).
-- The first part of the union retrieves group msgs (those that have type 0 and need to be grouped by group_id). It applies a limit to cap the out-of-control result set
-- The second query retrieves individual msgs (those with type != 0, grouped by msgs.id; not necessary, but just to be safe from duplicate entries due to joins). It applies a limit to cap the out-of-control result set
-- The outer query combines the two to retrieve the desired resultset
Here's the query:
SELECT
*
FROM
(
(
SELECT
msgs.id as reference_id, user_id, type, group_id
FROM
msgs
INNER JOIN accounts
ON (msgs.user_id = accounts.id)
WHERE 1
AND accounts.id IN (50111 ) AND type = 0
GROUP BY msgs.group_id
ORDER BY msgs.id DESC
LIMIT 40
)
UNION
ALL
(
SELECT
msgs.id as reference_id, user_id, type, group_id
FROM
msgs
INNER JOIN accounts
ON (
msgs.user_id = accounts.id
)
WHERE 1
AND msgs.type != 0
AND accounts.id IN (50111)
GROUP BY msgs.id
ORDER BY msgs.id
LIMIT 40
)
) AS temp
ORDER BY reference_id
LIMIT 20,20
But it has a lot of caveats:
- I need to handle the limit in the inner queries as well. Let's say 20 recs per page, and I'm on page 4. For the inner queries I need to apply LIMIT 0,80, since I'm not sure how many records each of the two parts contributed to the previous 3 pages. So, as the records per page and the number of pages grow, my query grows heavier. With, say, 1k recs per page on page 100 or 1K, the load gets much heavier and the time increases steeply
- I need to handle ordering in the inner queries and then apply it again on the resultset prepared by the union, and conditions need to be applied to both inner queries separately (but that's not much of an issue)
- I can't use calc_found_rows, so I will need to get the count using separate queries
The main issue is the first one. The higher I go with the pagination, the heavier it gets
Would this run faster?
SELECT id, user_id, type, group_id
FROM
( SELECT id, user_id, type, group_id, IFNULL(group_id, id) AS foo
FROM msgs
WHERE user_id IN (50111)
AND type IN (0, 1, 5, 7)
) AS x
GROUP BY foo
ORDER BY `group_id` DESC
LIMIT 100
It needs INDEX(user_id, type).
Does this give the 'correct' answer?
SELECT DISTINCT *
FROM msgs
WHERE user_id IN (50111)
AND type IN (0, 1, 5, 7)
GROUP BY IFNULL(group_id, id)
ORDER BY `group_id` DESC
LIMIT 100
(It needs the same index)

in sql, how to make a rank column based on the value of another column?

Say I wish to create a table like the following:
user score rank
a 100 2
b 200 1
c 50 3
d 50 3
How exactly do I create a rank column that updates whenever a new record with a score is entered?
For a small table, the easiest way is a correlated subquery:
select t.*,
(select 1 + count(*)
from t t2
where t2.score > t.score
) as rank
from t
order by score desc;
Note: this implements "rank" as per the rank() window function available in most databases.
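A runnable sketch of the correlated subquery on the question's sample data, using Python's built-in sqlite3 as a stand-in. The column names user_name and score_rank are renamed for this sketch (RANK is a reserved word in MySQL 8.0 and later), and the extra row for user e is made up to show that the rank is computed at query time rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (user_name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 100), ("b", 200), ("c", 50), ("d", 50)],
)

# Rank = 1 + number of rows with a strictly higher score, so ties share a rank.
RANKED = """
SELECT t.user_name, t.score,
       (SELECT 1 + COUNT(*)
        FROM t AS t2
        WHERE t2.score > t.score) AS score_rank
FROM t
ORDER BY t.score DESC, t.user_name
"""

before = conn.execute(RANKED).fetchall()

# Hypothetical new entry: the ranks shift on the next query, nothing is updated.
conn.execute("INSERT INTO t VALUES ('e', 150)")
after = conn.execute(RANKED).fetchall()

print(before)
print(after)
```

Because nothing is stored, there is no rank column to keep in sync when a new score arrives; the cost is that the subquery runs for every row, which is why the answer recommends it for small tables.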

How do I do a dynamic UNION query in MySQL?

mytable has an auto-incrementing id column which is an integer, and for all intents and purposes in this case you can safely assume that the higher ID represents a more recent value. mytable also has an indexed column called group_id which is a foreign key to the groups table.
I want a quick and dirty query to select the 5 most recent rows for each group_id from mytable.
If there were only three groups, this would be easy, as I could do this:
SELECT * FROM `mytable` WHERE `group_id` = 1 ORDER BY `id` DESC LIMIT 5
UNION ALL
SELECT * FROM `mytable` WHERE `group_id` = 2 ORDER BY `id` DESC LIMIT 5
UNION ALL
SELECT * FROM `mytable` WHERE `group_id` = 3 ORDER BY `id` DESC LIMIT 5
However, there is not a fixed number of groups. Groups are determined by the what's in the groups table, so there is an indeterminate number of them.
My thoughts so far:
I could grab a CURSOR on the groups table and build a new SQL query string, then EXECUTE it. However, that seems really messy and I'm hoping there's a better way of doing it.
I could grab a CURSOR on the groups table and insert things into a temporary table, then select from that. However, that also seems really messy.
I don't know if I could just grab a CURSOR and then start returning rows directly from there. Is there perhaps something similar to SQL Server's @table type variables?
What I'm hoping most of all is that I'm overthinking this and there is a way to do this in a SELECT statement.
Getting the n most recent rows per group is best handled by window functions in other RDBMSs (SQL Server, PostgreSQL, Oracle etc.). Unfortunately MySQL has no window functions (before version 8.0), so as an alternative you can use user-defined variables to assign a rank to the rows that belong to the same group. In this case ORDER BY group_id, id desc is important to order the results properly per group:
SELECT c.*
FROM (
SELECT *,
@r:= CASE WHEN @g = group_id THEN @r + 1 ELSE 1 END rownum,
@g:=group_id
FROM mytable
CROSS JOIN(SELECT @g:=NULL ,@r:=0) t
ORDER BY group_id,id desc
) c
WHERE c.rownum <=5
The above query will give you the 5 most recent rows for each group_id. If you want more than 5 rows, just change the filter of the outer query to your desired number: WHERE c.rownum <= n
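For comparison, a sketch of the window-function form of the same idea, which is also available in MySQL from version 8.0. It is run here against Python's built-in sqlite3 (assuming the bundled SQLite is 3.25 or newer, where window functions exist). The rows are made up, and n is 2 instead of 5 to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, group_id INTEGER)")
# Hypothetical data: ids 1..8 are assigned in insertion order.
conn.executemany(
    "INSERT INTO mytable (group_id) VALUES (?)",
    [(g,) for g in (1, 1, 1, 2, 2, 1, 2, 2)],
)

# Number the rows inside each group, newest (highest id) first,
# then keep only the first n rows of every group.
TOP_N_PER_GROUP = """
SELECT c.id, c.group_id
FROM (
    SELECT id, group_id,
           ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id DESC) AS rownum
    FROM mytable
) AS c
WHERE c.rownum <= ?
ORDER BY c.group_id, c.id DESC
"""

top2 = conn.execute(TOP_N_PER_GROUP, (2,)).fetchall()
print(top2)
```

The numbering no longer depends on the evaluation order of user variables, and n is passed as an ordinary query parameter.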

MySQL Query to find row duplicates based on condition with limit

I have two tables:
Members:
id username
Trips:
id member_id flag_status created
("YES" or "NO")
I can do a query like this:
SELECT
Trip.id, Trip.member_id, Trip.flag_status
FROM
trips Trip
WHERE
Trip.member_id = 1711
ORDER BY
Trip.created DESC
LIMIT
3
Which CAN give results like this:
id member_id flag_status
8 1711 YES
9 1711 YES
10 1711 YES
My goal is to know if the member's last three trips all had a flag_status = "YES", if any of the three != "YES", then I don't want it to count.
I also want to be able to remove the WHERE Trip.member_id = 1711 clause, and have it run for all my members, and give me the total number of members whose last 3 trips all have flag_status = "YES"
Any ideas?
Thanks!
http://sqlfiddle.com/#!2/28b2d
In that sqlfiddle, when the correct query i'm seeking runs, I should see results such as:
COUNT(Member.id)
2
The two members that should qualify are members 1 and 3. Member 5 fails because one of his trips has flag_status = "NO"
You could use the GROUP_CONCAT function to obtain a list of all the statuses for each member, ordered by id in descending order (most recent first):
SELECT
member_id,
GROUP_CONCAT(flag_status ORDER BY id DESC) as status
FROM
trips
GROUP BY
member_id
HAVING
SUBSTRING_INDEX(status, ',', 3) NOT LIKE '%NO%'
Then, using SUBSTRING_INDEX, you can extract only the last three status flags and exclude the members whose flags contain a NO. Please see fiddle here. I'm assuming that all of your rows are ordered by ID, but since you have a created date it would be better to use:
GROUP_CONCAT(flag_status ORDER BY created DESC) as status
as Raymond suggested. Then, you could also return just the count of the rows returned using something like:
SELECT COUNT(*)
FROM (
...the query above...
) as q
Although I like the simplicity of fthiella's solution, I'm wary of a solution that depends so much on data representation. To avoid depending on it, you can do something like this:
SELECT COUNT(*) FROM (
SELECT member_id FROM (
SELECT
flag_status,
@flag_index := IF(member_id = @member, @flag_index + 1, 1) flag_index,
@member := member_id member_id
FROM trips, (SELECT @member := 0, @flag_index := 1) init
ORDER BY member_id, id DESC
) x
WHERE flag_index <= 3
GROUP BY member_id
HAVING SUM(flag_status = 'NO') = 0
) x
Fiddle here. Note I've slightly modified the fiddle to remove one of the users.
The process basically ranks the trips of each member by id desc and then keeps only the last 3 of them. Then it makes sure that none of the fetched trips has a NO in flag_status. Finally, all the matching members are counted.
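The same "keep the last 3 trips, then check for a NO" logic can be sketched without user variables, using a correlated count of newer trips. This runs on Python's built-in sqlite3 as a stand-in for MySQL. The rows below are made up (the fiddle's data is not reproduced here), and the sketch assumes that a higher id means a more recent trip and that members with fewer than 3 trips should not count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (id INTEGER PRIMARY KEY, member_id INTEGER, flag_status TEXT)"
)
# Hypothetical data: members 1 and 3 qualify, 5 has a NO, 7 has only one trip.
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        (1, 1, "NO"), (2, 1, "YES"), (3, 1, "YES"), (4, 1, "YES"),
        (5, 3, "YES"), (6, 3, "YES"), (7, 3, "YES"),
        (8, 5, "YES"), (9, 5, "NO"), (10, 5, "YES"),
        (11, 7, "YES"),
    ],
)

# A trip is among a member's last 3 if fewer than 3 newer trips exist for
# that member. Keep members with exactly 3 such trips and no non-YES flag.
QUALIFYING = """
SELECT t.member_id
FROM trips AS t
WHERE (SELECT COUNT(*)
       FROM trips AS newer
       WHERE newer.member_id = t.member_id
         AND newer.id > t.id) < 3
GROUP BY t.member_id
HAVING COUNT(*) = 3 AND SUM(t.flag_status <> 'YES') = 0
ORDER BY t.member_id
"""

qualifying = [row[0] for row in conn.execute(QUALIFYING)]
total = conn.execute("SELECT COUNT(*) FROM (" + QUALIFYING + ") AS q").fetchone()[0]
print(qualifying, total)
```

If the created column is a more reliable measure of recency than id, the comparison newer.id > t.id would be swapped for one on created.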