Input data:
npost_id  mid  like_count
7         t4   3
21        t11  2
30        t16  2
31        t16  2
32        t18  2
I want the npost_id of the post that received the most likes for each person (mid).
I need to pick exactly one row per mid that satisfies these conditions: it has MAX(like_count) for that mid, mid values can be duplicated across rows, and npost_id is the primary key.
Here's what I've tried:
SELECT npost_id, mid, like_count
FROM feed
WHERE (mid, like_count) IN (SELECT mid, MAX(like_count)
FROM feed
GROUP BY mid)
I can't think of anything other than that query.
In MySQL 8.0, one option to retrieve only one row for each combination of <mid, like_count> is to use the ROW_NUMBER window function, which assigns a sequential number to the rows within each combination of <mid, like_count> (a partition). To get only one row for each partition, it's sufficient to filter out rows whose row number is greater than 1 (the rows that repeat a <mid, like_count> combination).
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY mid, like_count ORDER BY npost_id) AS rn
FROM tab
)
SELECT npost_id, mid, like_count
FROM cte
WHERE rn = 1
In MySQL 5.7, you can instead group by each distinct combination of <mid, like_count> and take the smallest npost_id in each group (provided you are willing to accept any npost_id value from the partition).
SELECT MIN(npost_id) AS npost_id,
mid,
like_count
FROM tab
GROUP BY mid, like_count
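If the goal is strictly one row per mid holding that mid's highest like_count, one possible variant combines the question's MAX(like_count) subquery with the MIN(npost_id) tie-break. This is only a sketch: it runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made for illustration), and it uses the table name feed from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feed (npost_id INTEGER PRIMARY KEY, mid TEXT, like_count INTEGER)"
)
conn.executemany(
    "INSERT INTO feed VALUES (?, ?, ?)",
    [(7, "t4", 3), (21, "t11", 2), (30, "t16", 2), (31, "t16", 2), (32, "t18", 2)],
)

query = """
SELECT MIN(f.npost_id) AS npost_id, f.mid, f.like_count
FROM feed AS f
JOIN (SELECT mid, MAX(like_count) AS max_likes   -- highest like_count per mid
      FROM feed
      GROUP BY mid) AS m
  ON f.mid = m.mid AND f.like_count = m.max_likes
GROUP BY f.mid, f.like_count                     -- collapse ties to one row
ORDER BY npost_id
"""
top_posts = conn.execute(query).fetchall()
print(top_posts)
```

For the sample data, t16 has two posts tied at 2 likes, and the MIN(npost_id) tie-break keeps post 30.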
In the following, I am querying the same table 2 times. The second query is a nested query inside left join but queries the same table. The only difference is the addition of the aggregation function count, the result of which is used by the outer query. Is there a better way to approach this?
select sm.student_id, sm.marks, smarks.d as d_marks from student_marks as sm
left join(
select m.student_id, count(distinct m.marks) as d from student_marks as m group by m.student_id
) as smarks on smarks.student_id = sm.student_id;
Is it possible to do this in a single query without using a left join?
Yes, there is an alternative approach, which is to use window functions. There's no way of doing COUNT(DISTINCT ...) in a window function, but you can get the same result by using DENSE_RANK() twice, once sorting what you want a distinct count of in ascending order and once in descending order, adding the two ranks together and then taking one away:
SELECT sm.student_id,
sm.marks,
DENSE_RANK() OVER(PARTITION BY sm.student_id ORDER BY sm.marks DESC) +
DENSE_RANK() OVER(PARTITION BY sm.student_id ORDER BY sm.marks ASC) - 1 AS d_marks
FROM student_marks AS sm
N.B. this is not guaranteed to perform any better just because you are referencing a table one fewer times.
To explain the DENSE_RANK() trick, consider a simple data set:
marks  dense_rank ASC  dense_rank DESC
1      1               3
1      1               3
2      2               2
3      3               1
The two ranks added together will always be one more than the number of distinct values in the set (i.e. 1+3, 2+2, and 3+1 all equal 4), so we just need to take one off the result, and this gives us our distinct count of items in the set without actually using COUNT(DISTINCT ...), which isn't allowed in a window function (as noted in the restrictions).
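The arithmetic can be checked against a plain COUNT(DISTINCT ...) with GROUP BY. The sketch below runs both through Python's sqlite3 module, which is an assumption standing in for MySQL and needs a bundled SQLite of version 3.25 or later for window functions. Student 1 holds the simple data set above, and student 2 is a hypothetical extra row set added for contrast.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); requires SQLite >= 3.25.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_marks (student_id INTEGER, marks INTEGER)")
conn.executemany(
    "INSERT INTO student_marks VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (1, 3),   # the simple data set: marks 1, 1, 2, 3
     (2, 5), (2, 5)],                  # hypothetical second student
)

# Distinct count via the two DENSE_RANK() calls; DISTINCT collapses the
# per-row output to one row per student.
windowed = conn.execute("""
SELECT DISTINCT sm.student_id,
       DENSE_RANK() OVER (PARTITION BY sm.student_id ORDER BY sm.marks DESC)
     + DENSE_RANK() OVER (PARTITION BY sm.student_id ORDER BY sm.marks ASC)
     - 1 AS d_marks
FROM student_marks AS sm
ORDER BY sm.student_id
""").fetchall()

# Reference result using an ordinary aggregate.
grouped = conn.execute("""
SELECT student_id, COUNT(DISTINCT marks)
FROM student_marks
GROUP BY student_id
ORDER BY student_id
""").fetchall()

print(windowed, grouped)
```

Both queries should agree as long as marks contains no NULLs.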
ADDENDUM
If marks is nullable (which I had assumed it would not be) and you don't want null rows included in the count, then, as noted in the comments, this wouldn't quite work as it is. You'd need to remove the null group from the total, which can be done by appending the following to the expression:
- MAX(CASE WHEN sm.marks IS NULL THEN 1 ELSE 0 END) OVER(PARTITION BY sm.student_id)
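A small check of the NULL adjustment, again using Python's sqlite3 module as a stand-in for MySQL (an assumption; SQLite 3.25 or later is needed). The rows are hypothetical: student 1 has one NULL mark and two distinct non-NULL marks, and student 2 has no NULLs.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); requires SQLite >= 3.25.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_marks (student_id INTEGER, marks INTEGER)")
conn.executemany(
    "INSERT INTO student_marks VALUES (?, ?)",
    [(1, None), (1, 1), (1, 2),   # hypothetical: one NULL, two distinct marks
     (2, 4), (2, 4)],             # hypothetical: no NULLs, one distinct mark
)

# DENSE_RANK() treats NULL as one more distinct value, so subtract 1 for
# every partition that contains at least one NULL.
adjusted = conn.execute("""
SELECT DISTINCT sm.student_id,
       DENSE_RANK() OVER (PARTITION BY sm.student_id ORDER BY sm.marks DESC)
     + DENSE_RANK() OVER (PARTITION BY sm.student_id ORDER BY sm.marks ASC)
     - 1
     - MAX(CASE WHEN sm.marks IS NULL THEN 1 ELSE 0 END)
           OVER (PARTITION BY sm.student_id) AS d_marks
FROM student_marks AS sm
ORDER BY sm.student_id
""").fetchall()

print(adjusted)
```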
I have a table that looks like this...
user_id, match_id, points_won
1 14 10
1 8 12
1 12 80
2 8 10
3 14 20
3 2 25
I want to write a MySQL query that pulls back the most points a user has won in a single match and includes the match_id in the results. In other words...
user_id, match_id, max_points_won
1 12 80
2 8 10
3 2 25
Of course if I didn't need the match_id I could just do...
select user_id, max(points_won)
from table
group by user_id
But as soon as I add match_id to the "select" and "group by" I have a row for every match, and if I only add the match_id to the "select" (and not the "group by") then it won't correctly relate to the points_won.
Ideally I don't want to do the following either because it doesn't feel particularly safe (e.g. if the user has won the same amount of points on multiple matches)...
SELECT t.user_id, max(t.points_won) max_points_won
, (select t2.match_id
from table t2
where t2.user_id = t.user_id
and t2.points_won = max_points_won) as 'match_of_points_maximum'
FROM table t
GROUP BY t.user_id
Are there any more elegant options for this problem?
This is harder than it needs to be in MySQL. One method is a bit of a hack but it works in most circumstances. That is the group_concat()/substring_index() trick:
select user_id, max(points_won),
substring_index(group_concat(match_id order by points_won desc), ',', 1)
from table
group by user_id;
The group_concat() concatenates together all the match_ids, ordered by the points descending. The substring_index() then takes the first one.
Two important caveats:
The resulting expression has a type of string, regardless of the internal type.
The group_concat() uses an internal buffer, whose length -- by default -- is 1,024 characters. This default length can be changed.
You can use the query:
select user_id, max(points_won)
from table
group by user_id
as a derived table. Joining this to the original table gets you what you want:
select t1.user_id, t1.match_id, t2.max_points_won
from table as t1
join (
select user_id, max(points_won) as max_points_won
from table
group by user_id
) as t2 on t1.user_id = t2.user_id and t1.points_won = t2.max_points_won
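As a quick check, the join can be run against the sample rows from the question. This sketch uses Python's sqlite3 module in place of MySQL (an assumption for illustration), and the table is named matches here (a hypothetical name, since table itself is a reserved word).

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (user_id INTEGER, match_id INTEGER, points_won INTEGER)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [(1, 14, 10), (1, 8, 12), (1, 12, 80), (2, 8, 10), (3, 14, 20), (3, 2, 25)],
)

# Join each row to its user's maximum; only rows equal to the maximum survive.
best_matches = conn.execute("""
SELECT t1.user_id, t1.match_id, t2.max_points_won
FROM matches AS t1
JOIN (SELECT user_id, MAX(points_won) AS max_points_won
      FROM matches
      GROUP BY user_id) AS t2
  ON t1.user_id = t2.user_id AND t1.points_won = t2.max_points_won
ORDER BY t1.user_id
""").fetchall()

print(best_matches)
```

Note that if a user reached the same maximum in several matches, this join would return one row per such match.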
I think you can optimize your query by adding LIMIT 1 to the inner query.
SELECT t.user_id, max(t.points_won) max_points_won
, (select t2.match_id
from table t2
where t2.user_id = t.user_id
and t2.points_won = max_points_won limit 1) as 'match_of_points_maximum'
FROM table t
GROUP BY t.user_id
EDIT: this works in PostgreSQL, SQL Server, and Oracle, and in MySQL only from version 8.0 onwards, which added window functions.
You could use row_number:
SELECT USER_ID, MATCH_ID, POINTS_WON
FROM
(
SELECT user_id, match_id, points_won, row_number() over (partition by user_id order by points_won desc) rn
from table
) q
where q.rn = 1
For a similar function, have a look at Gordon Linoff's answer or at this article.
In this query, the result set is partitioned by user, and each partition is ordered by points_won desc, so that the row with the most points won comes first.
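When a user has the same maximum in several matches, row_number() picks one of them arbitrarily unless the ordering is made deterministic. The sketch below adds match_id as a tie-breaker, which is an assumption about what "safe" means here. It runs through Python's sqlite3 module as a stand-in for the databases named above (SQLite 3.25 or later), uses a hypothetical table name matches, and adds one hypothetical tied row for user 2.

```python
import sqlite3

# In-memory SQLite stands in for the target database (assumption); needs >= 3.25.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (user_id INTEGER, match_id INTEGER, points_won INTEGER)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [(1, 14, 10), (1, 8, 12), (1, 12, 80), (2, 8, 10), (3, 14, 20), (3, 2, 25),
     (2, 9, 10)],  # hypothetical extra row: user 2 ties their own maximum
)

# rn = 1 marks the best match per user; ties on points_won are broken by
# the lowest match_id so the result is deterministic.
top_rows = conn.execute("""
SELECT user_id, match_id, points_won
FROM (
    SELECT user_id, match_id, points_won,
           ROW_NUMBER() OVER (PARTITION BY user_id
                              ORDER BY points_won DESC, match_id ASC) AS rn
    FROM matches
) AS q
WHERE q.rn = 1
ORDER BY user_id
""").fetchall()

print(top_rows)
```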
I got a question in my SQL homework about selecting the maximum values from the same table for rows that have different class "Letters".
For example:
ID Student Group Avg(value)
-------------------------------------
1 stud1 A 9
2 stud2 A 9.5
3 stud3 B 8
4 stud4 B 8.5
What my query should do is show stud2 and stud4, the maximum from their respective groups.
I managed to do it in the end, but it took a lot of characters, so I thought that maybe there's a shorter way to do it. Any ideas? I first searched for the id of the stud that has the max avg(value) in group A, intersected it with the id of the stud that has the max avg(value) in group B, put everything into one big select, and then used those intersected IDs in another query that shows some other things about those IDs. But as I said, it looked far too long, and I thought that maybe there's a shorter way.
Try this (I renamed group to grp and avg to avg_val as those are reserved keywords):
select t1.*
from your_table t1
inner join (
select grp, max(avg_val) avg_val
from your_table
group by grp
) t2 on t1.grp = t2.grp
and t1.avg_val = t2.avg_val;
It finds maximum avg value per group and joins it with original table to get the corresponding students.
Please note that if multiple students have the same avg as the max value of their group, all of those students will be returned.
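The tie behaviour can be seen with a small experiment. This sketch runs the query through Python's sqlite3 module as a stand-in for the actual database (an assumption), first on the four rows from the question and then again after adding a hypothetical stud5 who ties the maximum of group B.

```python
import sqlite3

# In-memory SQLite stands in for the actual database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (id INTEGER, student TEXT, grp TEXT, avg_val REAL)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [(1, "stud1", "A", 9.0), (2, "stud2", "A", 9.5),
     (3, "stud3", "B", 8.0), (4, "stud4", "B", 8.5)],
)

QUERY = """
SELECT t1.student
FROM your_table AS t1
INNER JOIN (SELECT grp, MAX(avg_val) AS avg_val
            FROM your_table
            GROUP BY grp) AS t2
  ON t1.grp = t2.grp AND t1.avg_val = t2.avg_val
ORDER BY t1.id
"""

def top_students():
    # Flatten the single-column rows into a list of student names.
    return [row[0] for row in conn.execute(QUERY)]

without_tie = top_students()

# Hypothetical extra student who ties the maximum of group B.
conn.execute("INSERT INTO your_table VALUES (5, 'stud5', 'B', 8.5)")
with_tie = top_students()

print(without_tie, with_tie)
```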
Table
id user_id rank_solo lp
1 1 15 45
2 2 7 79
3 3 17 15
How can I write a ranking query that sorts on rank_solo (which ranges from 0 to 28) and, when two rows have the same rank_solo, uses lp (0-100) to break the tie?
(If lp is also equal, still assign distinct rankings so that there are no ties.)
The query should give me the ranking of a given user_id. How does this perform on 5m+ rows?
So
User_id 1 would have ranking 2
User_id 2 would have ranking 3
User_id 3 would have ranking 1
You can get the ranking using variables:
select t.*, (@rn := @rn + 1) as ranking
from t cross join
(select @rn := 0) params
order by rank_solo desc, lp;
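To get the ranking of one given user_id without numbering every row, one possible alternative (not the variable technique above) is to count how many rows sort ahead of that user. The sketch below makes several assumptions: a higher rank_solo ranks first, a higher lp wins when rank_solo is equal, and a lower user_id wins a complete tie. It runs through Python's sqlite3 module as a stand-in for MySQL, and ranking_of is a hypothetical helper name.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "rank_solo INTEGER, lp INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 1, 15, 45), (2, 2, 7, 79), (3, 3, 17, 15)],
)

def ranking_of(user_id):
    """Hypothetical helper: 1 + the number of rows that sort ahead of the user."""
    row = conn.execute("""
    SELECT COUNT(*) + 1
    FROM t AS other
    JOIN t AS me ON me.user_id = ?
    WHERE other.rank_solo > me.rank_solo
       OR (other.rank_solo = me.rank_solo AND other.lp > me.lp)
       OR (other.rank_solo = me.rank_solo AND other.lp = me.lp
           AND other.user_id < me.user_id)   -- assumed tie-breaker
    """, (user_id,)).fetchone()
    return row[0]

rankings = {uid: ranking_of(uid) for uid in (1, 2, 3)}
print(rankings)
```

This sketch does not validate its input: a user_id that is not in the table also comes back as 1. How it performs on millions of rows depends on indexing and is not measured here.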
You can use ORDER BY to sort your query:
SELECT *
FROM `Table`
ORDER BY rank_solo, lp
I'm not sure I quite understand what you're saying. With that many rows, create an index on the fields you're using to do your selects. For example, in the MySQL client use:
create index RANKINGS on mytablename(rank_solo,lp,user_id);
Depending on what you use in your query to select the data, you may change the index or add another index with a different field combination. This has improved performance on my tables by a factor of 10 or more.
As for the query, if you're selecting a specific user then could you not just use:
select rank_solo from table where user_id={user id}
If you want the highest ranking individual, you could:
select * from yourtable order by rank_solo,lp limit 1
Remove the limit 1 to list them all.
If I've misunderstood, please comment.
An alternative would be to use a 2nd table.
table2 would have the following fields:
rank (auto_increment)
user_id
rank_solo
lp
With the rank field as auto increment, it is filled in automatically as rows are inserted, with values beginning at 1.
Once the 2nd table is ready, just do this when you want to update the rankings (truncate rather than delete, so that the auto-increment counter restarts at 1):
truncate table table2;
insert into table2 (user_id, rank_solo, lp) select user_id, rank_solo, lp from table1 order by rank_solo, lp;
It may not be "elegant" but it gets the job done. Plus, if you create an index on both tables, this query would be very quick since the fields are numeric.
I have table that looks like this:
id rank
a 2
a 1
b 4
b 3
c 7
d 1
d 1
e 9
I need to get all the distinct rank values on one column and count of all the unique id's that have reached equal or higher rank than in the first column.
So the result I need would be something like this:
rank count
1 5
2 4
3 3
4 3
7 2
9 1
I've been able to make a table with all the unique id's with their max rank:
SELECT
MAX(rank) AS 'TopRank',
id
FROM myTable
GROUP BY id
I'm also able to get all the distinct rank values and count how many id's have reached exactly that rank:
SELECT
DISTINCT TopRank AS 'rank',
COUNT(id) AS 'count of id'
FROM
(SELECT
MAX(rank) AS 'TopRank',
id
FROM myTable
GROUP BY id) tableDerp
GROUP BY TopRank
ORDER BY TopRank ASC
But I don't know how to get count of id's where the rank is equal OR HIGHER than the rank in column 1. Trying SUM(CASE WHEN TopRank > TopRank THEN 1 END) naturally gives me nothing. So how can I get the count of id's where the TopRank is higher or equal to each distinct rank value? Or am I looking in the wrong way and should try something like running totals instead? I tried to look for similar questions but I think I'm completely on a wrong trail here since I couldn't find any and this seems a pretty simple problem that I'm just overthinking somehow. Any help much appreciated.
One approach is to use a correlated subquery. Just get the list of ranks and then use a correlated subquery to get the count you are looking for:
SELECT r.rank,
(SELECT COUNT(DISTINCT t2.id)
FROM myTable t2
WHERE t2.rank >= r.rank
) as cnt
FROM (SELECT DISTINCT rank FROM myTable) r;
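Running the correlated subquery against the sample rows reproduces the expected result table. This sketch uses Python's sqlite3 module as a stand-in for the actual database (an assumption). It quotes the rank column as a precaution, since rank is also a function name in newer SQL dialects, and it adds an ORDER BY that the query above does not have, so that the output order is deterministic.

```python
import sqlite3

# In-memory SQLite stands in for the actual database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE myTable (id TEXT, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("a", 2), ("a", 1), ("b", 4), ("b", 3),
     ("c", 7), ("d", 1), ("d", 1), ("e", 9)],
)

# For every distinct rank, count the distinct ids that have some row at
# that rank or higher.
counts = conn.execute("""
SELECT r."rank",
       (SELECT COUNT(DISTINCT t2.id)
        FROM myTable AS t2
        WHERE t2."rank" >= r."rank") AS cnt
FROM (SELECT DISTINCT "rank" FROM myTable) AS r
ORDER BY r."rank"
""").fetchall()

print(counts)
```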