MySQL query to not show duplicate records - mysql

I have a leaderboard of high scores and need to hide duplicate records so the leaderboard is more accurate.
Table: highscores
+----+---------+------+-------+-------+
| id | name    | time | moves | score |
+----+---------+------+-------+-------+
| 1  | person1 | 33   | 22    | 245   |
| 2  | person1 | 83   | 31    | 186   |
+----+---------+------+-------+-------+
My query is:
SELECT * FROM highscores ORDER BY score DESC LIMIT 100
How can I change the query to show only the highest score for each duplicated name, without breaking the descending order?
This seems to be working:
SELECT * FROM highscores GROUP BY name ORDER BY score DESC LIMIT 100

Use MySQL's non-standard GROUP BY extension:
SELECT * FROM (
SELECT * FROM highscores
ORDER by Score DESC) x
GROUP BY name
ORDER by Score DESC
LIMIT 100
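This works only because of a MySQL extension: when ONLY_FULL_GROUP_BY is disabled and not all non-aggregated columns are listed in the GROUP BY, MySQL returns one row for each unique combination of the grouped columns, in practice often the first row encountered. The documentation does not guarantee which row is chosen, and with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) the query is rejected, so do not rely on it.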
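A more portable way to get one row per name is to compute each player's best score and join back to the table. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the third row (person2) is hypothetical sample data that is not in the question. The SQL itself is plain standard SQL and should run unchanged on MySQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE highscores "
    "(id INTEGER, name TEXT, time INTEGER, moves INTEGER, score INTEGER)"
)
conn.executemany(
    "INSERT INTO highscores VALUES (?, ?, ?, ?, ?)",
    [
        (1, "person1", 33, 22, 245),
        (2, "person1", 83, 31, 186),
        (3, "person2", 50, 25, 200),  # hypothetical extra player
    ],
)

# Find each player's best score, then join back to fetch the full row.
query = """
SELECT h.id, h.name, h.time, h.moves, h.score
FROM highscores h
JOIN (
    SELECT name, MAX(score) AS best
    FROM highscores
    GROUP BY name
) b ON h.name = b.name AND h.score = b.best
ORDER BY h.score DESC
LIMIT 100
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a player reaches the same best score more than once, this query returns each of those rows, so an extra tie-breaker may be needed in that case.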

Related

Why can't I use this code to select the max row

I'm trying to select the row that contains the largest number and have accomplished it using this fairly simple query:
SELECT s1.score, name
FROM scores s1
JOIN (
SELECT id, MAX(score) score
FROM scores
GROUP BY id
) s2 ON s1.score = s2.score
All it does (if I'm not wrong) is check whether the score field is equal to MAX(score). So why can't we do it with a single SELECT statement? Something like this:
SELECT id, score
FROM scores
GROUP BY id
HAVING MAX(score) = score
*The code above does not work. I want to ask why, because it is essentially doing the same thing as the previous code I posted.
Also here's the data I'm working with:
The problem in your second query is that the GROUP BY clause requires every non-aggregated column the query references to be listed in it. In your case you are dealing with three expressions (namely "id", "score" and "MAX(score)") and you're listing only one (namely "id") in the GROUP BY clause.
Fixing that would require you to add the non-aggregated "score" field inside your GROUP BY clause, as follows:
SELECT id, score
FROM scores
GROUP BY id, score
HAVING MAX(score) = score
Though this would lead to a wrong aggregation and output, because it would attempt to get the maximum score for each combination of (id, score).
If you instead remove the "score" field from both the SELECT and GROUP BY clauses, to solve the non-aggregated columns issue, as follows:
SELECT id
FROM scores
GROUP BY id
HAVING MAX(score) = score
Then the HAVING clause would complain, because it references the "score" field, which is found neither in the SELECT clause nor in the GROUP BY clause.
There's really no way for you to use that kind of notation, as it either violates the ONLY_FULL_GROUP_BY mode or returns the wrong output.
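If the goal is simply the row or rows holding the largest score, a single statement with a scalar subquery avoids GROUP BY altogether. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are assumptions chosen for illustration, since the question's data is not shown.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 20)],  # hypothetical sample rows
)

# Compare every row against the overall maximum computed once by a scalar subquery.
query = """
SELECT id, score
FROM scores
WHERE score = (SELECT MAX(score) FROM scores)
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Ties are kept here: every row whose score equals the maximum is returned.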
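This returns every row whose score equals the maximum (window functions need MySQL 8.0 or later). RANK() is used rather than ROW_NUMBER() so that tied rows all get rank 1:
WITH CTE AS (
SELECT *, RANK() OVER (ORDER BY score DESC) AS RN
FROM scores
)
SELECT * FROM CTE
WHERE CTE.RN = 1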
Here's what your queries return
DROP table if exists t;
create table t
(id int,score int);
insert into t values
(1,10),(2,20),(3,20);
SELECT s1.id,s1.score
FROM t s1
JOIN (
SELECT id, MAX(score) score
FROM t
GROUP BY id
) s2 ON s1.score = s2.score ;
+------+-------+
| id | score |
+------+-------+
| 1 | 10 |
| 2 | 20 |
| 2 | 20 |
| 3 | 20 |
| 3 | 20 |
+------+-------+
5 rows in set (0.001 sec)
SELECT id, score,max(score)
FROM t
GROUP BY id
HAVING MAX(score) = score
+------+-------+------------+
| id | score | max(score) |
+------+-------+------------+
| 1 | 10 | 10 |
| 2 | 20 | 20 |
| 3 | 20 | 20 |
+------+-------+------------+
3 rows in set (0.001 sec)
Neither result seems to be what you are looking for. You could clarify by posting sample data and desired outcome.

Get the total score of players from only a recent number of matches in MySql

My MySQL table named match_info roughly looks like this:
match_no | player_id | score
1 | 1 | 50
1 | 2 | 12
2 | 1 | 10
2 | 2 | 14
3 | 1 | 11
3 | 2 | 30
The actual table contains a lot more data than this, but for the sake of example let's say this is the whole table. I want to find the total score of each player over a given number of recent matches. For example, say I need to count only the 2 most recent matches. Then only matches 3 and 2 should be counted, and my output should be:
player_id | score
2 | 44
1 | 21
How can I do this?
I tried the following -
SELECT player_id,SUM(score) as total_score
FROM match_info
where match_no IN
(select match_no
from match_info
ORDER BY match_no DESC LIMIT 2)
group by player_id
order by total_score desc;
However, I encountered this error:
This version of MariaDB doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
So I can't use limit inside of "IN" sub-query. What alternative method may I use?
Try this query; it returns the result you expect:
SELECT m1.player_id, SUM(m1.score) AS total_score
FROM match_info m1
INNER JOIN (
SELECT DISTINCT match_no
FROM match_info
ORDER BY match_no DESC
LIMIT 2
) m2 ON m1.match_no = m2.match_no
GROUP BY m1.player_id
ORDER BY total_score DESC
You can also join against a temporary table holding the values you want to compare, which may give the optimizer more room than an IN subquery.
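The derived-table join can be checked against the sample data from the question. This is a small sketch that uses Python's built-in sqlite3 module as a stand-in for MariaDB/MySQL (an assumption for the demo). The SQL is standard and should behave the same way on both.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MariaDB/MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE match_info (match_no INTEGER, player_id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO match_info VALUES (?, ?, ?)",
    [(1, 1, 50), (1, 2, 12), (2, 1, 10), (2, 2, 14), (3, 1, 11), (3, 2, 30)],
)

# LIMIT is allowed in a derived table in the FROM clause, unlike inside IN (...).
query = """
SELECT m1.player_id, SUM(m1.score) AS total_score
FROM match_info m1
JOIN (
    SELECT DISTINCT match_no
    FROM match_info
    ORDER BY match_no DESC
    LIMIT 2
) m2 ON m1.match_no = m2.match_no
GROUP BY m1.player_id
ORDER BY total_score DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```

DISTINCT matters in the derived table: without it, LIMIT 2 would pick two rows of the same match rather than the two most recent matches.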

Mysql order by top two then id

I want to show the two top-voted posts first, then the others sorted by id.
This is table
+----+-------+--------------+--------+
| Id | Name | Post | Votes |
+====+=======+==============+========+
| 1 | John | John's msg | -6 |
| 2 |Joseph |Joseph's msg | 8 |
| 3 | Ivan | Ivan's msg | 3 |
| 4 |Natalie|Natalie's msg | 10 |
+----+-------+--------------+--------+
After query result should be:
+----+-------+--------------+--------+
| Id | Name | Post | Votes |
+====+=======+==============+========+
| 4 |Natalie|Natalie's msg | 10 |
| 2 |Joseph |Joseph's msg | 8 |
-----------------------------------------------
| 1 | John | John's msg | -6 |
| 3 | Ivan | Ivan's msg | 3 |
+----+-------+--------------+--------+
I have one solution, but I feel there is a better and faster way to do it.
I run two queries, one to get the top 2 and a second to get the others:
SELECT * FROM table order by Votes desc LIMIT 2
SELECT * FROM table order by Id desc
Then in PHP I display the first query's rows as they are, and when displaying the second query's rows I skip the entries already returned by the first query so they don't appear twice.
Can this be done in a single query that selects the two top-voted posts first, then the others?
You would have to use subqueries or a union, meaning a single outer query that contains multiple queries inside. I would simply retrieve the IDs from the first query and add an Id NOT IN (...) condition to the WHERE clause of the second query, thus filtering out the posts retrieved by the first query:
SELECT * FROM table WHERE Id NOT IN (...) ORDER BY Id DESC
With a union, the query would look as follows:
(SELECT table.*, 1 as o FROM table order by Votes desc LIMIT 2)
UNION
(SELECT table.*, 0 FROM table
WHERE Id NOT IN (SELECT Id FROM (SELECT Id FROM table order by Votes desc LIMIT 2) top2))
ORDER BY o DESC, if(o=1,Votes,Id) DESC
As you can see, it wraps several queries into one and needs a more complicated ordering as well, because the order of the rows returned by a union is not guaranteed. The extra derived table (top2) is there because MySQL rejects LIMIT directly inside an IN subquery.
Two simple queries seem a lot more efficient to me in this particular case.
There could be different ways to write a query that returns the rows in the order you want. My solution is this:
select
table.*
from
table left join (select id from table order by votes desc limit 2) l
on table.id = l.id
order by
case when l.id is not null then votes end desc,
table.id
The subquery returns the ids of the two rows with the most votes. The left join matches only those two rows; for every other row l.id is NULL.
The ORDER BY sorts by votes descending when the row is one of the top two (l.id is not null). When l.id is NULL the CASE expression yields NULL, which MySQL sorts last in descending order, so those rows go to the bottom and are ordered by id.
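The left-join ordering can be tried on the sample posts from the question. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name posts is a hypothetical replacement for the placeholder name table, which is a reserved word. SQLite, like MySQL, places NULLs last in a descending sort.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER, name TEXT, post TEXT, votes INTEGER)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        (1, "John", "John's msg", -6),
        (2, "Joseph", "Joseph's msg", 8),
        (3, "Ivan", "Ivan's msg", 3),
        (4, "Natalie", "Natalie's msg", 10),
    ],
)

# top2 holds the ids of the two most-voted posts; all other rows get NULL from the join.
query = """
SELECT p.id, p.name, p.votes
FROM posts p
LEFT JOIN (
    SELECT id FROM posts ORDER BY votes DESC LIMIT 2
) top2 ON p.id = top2.id
ORDER BY CASE WHEN top2.id IS NOT NULL THEN p.votes END DESC, p.id
"""
rows = conn.execute(query).fetchall()
ids = [row[0] for row in rows]
print(ids)
```

The two top-voted posts come first in vote order, followed by the remaining posts in ascending id order, which matches the expected result in the question.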

Show all grouped results and sort

I have a table like this one:
| B | 1 |
| C | 2 |
| B | 2 |
| A | 2 |
| C | 3 |
| A | 2 |
I would like to fetch it, but sorted and grouped. That is, I would like it grouped by the letter, but sorted by the highest sum of the group. Also, I want to show all entries within the group:
| C | 3 |
| C | 2 |
| A | 2 |
| A | 2 |
| B | 2 |
| B | 1 |
The order is that way because C has 3 and 2. 3+2=5, which is higher than 2+2=4 for A which in turn is higher than 2+1=3 for B.
I need to show all "grouped" letters because there are other columns that are distinct all of which I need shown.
EDIT:
Thanks for the quick reply. I have the audacity, however, to inquire further.
I have this query:
SELECT * FROM `ip_log` WHERE `IP` IN
(SELECT `IP` FROM `ip_log` GROUP BY `IP` HAVING COUNT(DISTINCT `uid`) > 1)
GROUP BY `uid` ORDER BY `IP`
The letters in the description above correspond to IP (I need the rows grouped by IP address) and the numbers to timestamp (I need them sorted by the sum, or just used as the sorting parameter). Should I create a temporary table and then use the solution below?
select t.Letter, t.Value
from MyTable t
inner join (
select Letter, sum(Value) as ValueSum
from MyTable
group by Letter
) ts on t.Letter = ts.Letter
order by ts.ValueSum desc, t.Letter, t.Value desc
If your table's columns are letter and number, the way I would go about doing this is the following:
SELECT
letter,
GROUP_CONCAT(number ORDER BY number DESC),
SUM(number) AS total
FROM table
GROUP BY letter
ORDER BY total desc
What you will get, based on your example is the following:
| C | 3,2 | 5
| A | 2,2 | 4
| B | 2,1 | 3
You can then process that data to get the actual information you want/need.
If you still want the data in the format you originally requested, you need a subquery. You can't sort by an aggregate (the SUM of the number column) that is not calculated in the same query, so compute it in a subquery and join it back to the original table (disclaimer: untested query):
SELECT
letter,
number
FROM table
JOIN (SELECT letter AS ltr, SUM(number) AS total FROM table GROUP BY letter) AS totals
ON table.letter = totals.ltr
ORDER BY totals.total desc, letter desc, number desc
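The join-on-group-totals idea can be verified against the six sample rows from the question. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table name my_table and the column names letter and value are assumptions, since the question does not name them.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (letter TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("B", 1), ("C", 2), ("B", 2), ("A", 2), ("C", 3), ("A", 2)],
)

# Attach each group's sum to every row of the group, then sort by that sum.
query = """
SELECT t.letter, t.value
FROM my_table t
JOIN (
    SELECT letter, SUM(value) AS value_sum
    FROM my_table
    GROUP BY letter
) ts ON t.letter = ts.letter
ORDER BY ts.value_sum DESC, t.letter, t.value DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Sorting by letter after the sum keeps the rows of a group together even when two groups happen to have the same total.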

Sort data before using GROUP BY?

I have read that grouping happens before ordering. Is there any way to order first and group afterwards without having to wrap my whole query in another query just to do this?
Let's say I have this data:
id | user_id | date_recorded
1 | 1 | 2011-11-07
2 | 1 | 2011-11-05
3 | 1 | 2011-11-06
4 | 2 | 2011-11-03
5 | 2 | 2011-11-06
Normally, I'd have to do this query in order to get what I want:
SELECT
*
FROM (
SELECT * FROM table ORDER BY date_recorded DESC
) t1
GROUP BY t1.user_id
But I'm wondering if there's a better solution.
Your question is somewhat unclear, but I suspect what you really want is not GROUP BY aggregation at all, but rather ordering by user ID first, then by date descending within each user:
SELECT
id,
user_id,
date_recorded
FROM tbl
ORDER BY user_id ASC, date_recorded DESC
Here is the result. Note that the rows are reordered by date_recorded within each user compared with your original example:
id | user_id | date_recorded
1 | 1 | 2011-11-07
3 | 1 | 2011-11-06
2 | 1 | 2011-11-05
5 | 2 | 2011-11-06
4 | 2 | 2011-11-03
Update
To retrieve the full latest record per user_id, a JOIN is needed. The subquery (mx) locates the latest date_recorded per user_id, and that result is joined to the full table to retrieve the remaining columns.
SELECT
mx.user_id,
mx.maxdate,
t.id
FROM (
SELECT
user_id,
MAX(date_recorded) AS maxdate
FROM tbl
GROUP BY user_id
) mx JOIN tbl t ON mx.user_id = t.user_id AND mx.maxdate = t.date_recorded
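The latest-record-per-user join can be run against the sample data from the question. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, with dates stored as ISO-8601 text so that MAX() compares them correctly (both are assumptions for the demo).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, user_id INTEGER, date_recorded TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, 1, "2011-11-07"),
        (2, 1, "2011-11-05"),
        (3, 1, "2011-11-06"),
        (4, 2, "2011-11-03"),
        (5, 2, "2011-11-06"),
    ],
)

# mx finds the latest date per user; the join pulls the full row for that date.
query = """
SELECT t.id, t.user_id, t.date_recorded
FROM tbl t
JOIN (
    SELECT user_id, MAX(date_recorded) AS maxdate
    FROM tbl
    GROUP BY user_id
) mx ON mx.user_id = t.user_id AND mx.maxdate = t.date_recorded
ORDER BY t.user_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a user has two records on the same latest date, both rows are returned, so a further tie-breaker (for example the highest id) may be needed.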
I just use this technique: put the ORDER BY inside the GROUP_CONCAT call, so the ordering is applied within each group, then take the first element with SUBSTRING_INDEX:
SELECT SUBSTRING_INDEX(group_concat(cast(id as char)
ORDER BY date_recorded desc),',',1),
user_id,
SUBSTRING_INDEX(group_concat(cast(`date_recorded` as char)
ORDER BY `date_recorded` desc),',',1)
FROM data
GROUP BY user_id