Remove list of repetitions in a list SQL - mysql

I have a table for a game; as more people play, the same names repeat. I can remove exact duplicates with no problem, but this case is different: I want to remove the other rows while keeping the one with the highest score. Is it possible to do this?
for example
Name Level Score
Green 99 797,000
Green 99 819,000
Green 99 970,000
Green 99 890,000
I want to keep row 3 and remove the others.

In the example you show, you can get the top row this way:
SELECT * FROM `thistable` ORDER BY score DESC LIMIT 1
But I assume you have more than one Name in the table. If you want the highest score for a specific name:
SELECT * FROM `thistable` WHERE Name = 'Green' ORDER BY Score DESC LIMIT 1
If you want results for multiple names, but the row with the highest score for each one, it's a bit more complex:
SELECT * FROM `thistable`
JOIN (SELECT Name, MAX(Score) AS Score FROM `thistable` GROUP BY Name) AS x
USING (Name, Score);
This type of problem is commonly tagged greatest-n-per-group and it's a frequent question.
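Since the question asks to delete the lower-scoring rows rather than just select the best one, here is a minimal sketch of both steps. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption, made only so the example is self-contained). The table name thistable comes from the answer above, and the 'Blue' rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thistable (Name TEXT, Level INTEGER, Score INTEGER)")
conn.executemany(
    "INSERT INTO thistable VALUES (?, ?, ?)",
    [
        ("Green", 99, 797000),
        ("Green", 99, 819000),
        ("Green", 99, 970000),
        ("Green", 99, 890000),
        ("Blue", 42, 120000),  # hypothetical second player
        ("Blue", 42, 95000),
    ],
)

# Greatest-per-group: join each row against the maximum score for its name.
best_rows = conn.execute(
    """
    SELECT t.Name, t.Level, t.Score
    FROM thistable AS t
    JOIN (SELECT Name, MAX(Score) AS Score FROM thistable GROUP BY Name) AS x
      ON t.Name = x.Name AND t.Score = x.Score
    ORDER BY t.Name
    """
).fetchall()

# Delete every row whose score is below the maximum for its name.
conn.execute(
    """
    DELETE FROM thistable
    WHERE Score < (SELECT MAX(t2.Score) FROM thistable AS t2
                   WHERE t2.Name = thistable.Name)
    """
)
remaining = conn.execute(
    "SELECT Name, Level, Score FROM thistable ORDER BY Name"
).fetchall()

print(best_rows)
print(remaining)
```

In MySQL itself, a DELETE cannot select from its own target table in a subquery, so the same idea is usually written there as a multi-table DELETE joined to the derived table of per-name maxima. Rows that tie for the highest score are all kept by this approach.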
PS: Don't take the downvotes seriously. Stack Overflow has become a pretty nasty place. There will always be someone who thinks you didn't ask a valid question, or didn't ask it "correctly."

Related

Mysql SUM CASE with unique IDs only

Easiest explained through an example.
A father has children who win races.
How many of a father's offspring have won a race, and how many races in total have a father's offspring won? (winners and wins)
I can easily figure out the total number of wins, but sometimes a child wins more than one race, so to figure out winners I need to count each child only once if it has won, not every time it has won.
In the below extract from a query I cannot use Distinct, so this doesn't work
SUM(CASE WHEN r.finish = '1' AND DISTINCT h.runnersid THEN 1 ELSE 0 END ) AS winners,
This also won't work
SUM(SELECT DISTINCT r.runnersid FROM runs r WHERE r.finish='1') AS winners
This works when I need to find the total amount of wins.
SUM(CASE WHEN r.finish = '1' THEN 1 ELSE 0 END ) AS wins,
Here is a sqlfiddle http://sqlfiddle.com/#!2/e9a81/1
Let's take this step by step.
You have two pieces of information you are looking for: who has won a race, and how many races they have won.
Taking the first one, you can select a distinct runnersid where they have a first place finish:
SELECT DISTINCT runnersid
FROM runs
WHERE finish = 1;
For the second one, you can select every runnersid where they have a first place finish, count the number of rows returned, and group by runnersid to get the total wins for each:
SELECT runnersid, COUNT(*) AS numWins
FROM runs
WHERE finish = 1
GROUP BY runnersid;
The second one actually has everything you want. You don't need to do anything with that first query, but I used it to help demonstrate the thought process I take when trying to accomplish a task like this.
Here is the SQL Fiddle example.
EDIT
As you've seen, you don't really need the SUM here. Because finish represents a place in the race, you don't want to SUM that value, but you want to COUNT the number of wins.
EDIT2
An additional edit based on OP's requirements. The above does not match what OP needs, but I left it in as a reference for any future readers. What OP really needs, as I understand it now, is the number of children each father has that have won a race. I will again explain my thought process step by step.
First I wrote a simple query that pulls all of the winning father-son pairs. I was able to use GROUP BY to get the distinct winning pairs:
SELECT father, name
FROM runs
WHERE finish = 1
GROUP BY father, name;
Once I had done that, I used it as a subquery with the COUNT(*) function to get the number of winners for each father (this means I have to group by father):
SELECT father, COUNT(*) AS numWinningChildren
FROM(SELECT father, name
FROM runs
WHERE finish = 1
GROUP BY father, name) t
GROUP BY father;
If you just need the fathers with winning children, you are done. If you want to see all fathers, I would write one query to select all fathers, join it with our result set above, and replace any values where numWinningChildren is null, with 0.
I'll leave that part to you to challenge yourself a bit. Also because SQL Fiddle is down at the moment and I can't test what I was thinking, but I was able to test those above with success.
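For readers who want to check their attempt at that last step, here is one possible sketch, not necessarily what the answerer had in mind. It runs against an in-memory SQLite database through Python's sqlite3 module instead of MySQL (an assumption for the sake of a runnable example), and the sample rows are invented. The list of all fathers is taken from the runs table itself, since no separate fathers table is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (father TEXT, name TEXT, finish INTEGER)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [
        ("jack", "ann", 1),
        ("jack", "ann", 1),  # ann wins twice but counts as one winner
        ("jack", "bob", 1),
        ("jack", "cid", 3),
        ("pete", "dan", 2),  # pete has no winning children
    ],
)

# All fathers LEFT JOINed to the per-father winner counts;
# COALESCE turns the NULL for fathers without winners into 0.
rows = conn.execute(
    """
    SELECT f.father,
           COALESCE(w.numWinningChildren, 0) AS numWinningChildren
    FROM (SELECT DISTINCT father FROM runs) AS f
    LEFT JOIN (SELECT father, COUNT(*) AS numWinningChildren
               FROM (SELECT father, name FROM runs
                     WHERE finish = 1
                     GROUP BY father, name) AS t
               GROUP BY father) AS w
      ON w.father = f.father
    ORDER BY f.father
    """
).fetchall()

print(rows)
```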
I think you want the father name along with the count of the wins by his sons.
select father, count(distinct(id)) wins
from runs where father = 'jack' and finish = 1
group by father
sqlfiddle
I am not sure if this is what you are looking for
Select user_id, sum(case when finish='1' then 1 else 0 end) as total
From table
Group by user_id

Get entry with max value in MySQL

I've got a MySQL database with lots of entries of highscores for a game. I would like to get the "personal best" entry with the max value of score.
I found a solution that I thought worked, until I got more names in my database; then it returned completely different results.
My code so far:
SELECT name, score, date, version, mode, custom
FROM highscore
WHERE score =
(SELECT MAX(score)
FROM highscore
WHERE name = 'jonte' && gamename = 'game1')
For a lot of values, this actually returns the correct value as such:
JONTE 240 2014-04-28 02:52:33 1 0 2053
It worked fine with a few hundred entries, some with different names. But when I added new entries and swapped name to 'gabbes', for the new names I instead get a list of multiple entries. I don't see the logic here as the entries in the database seem quite identical with some differences in data.
JONTE 176 2014-04-28 11:03:46 1 0 63
GABBES 176 2014-04-28 11:09:12 1 0 3087
The above has two entries, but sometimes it may also return 10-20 entries in a row.
Any help?
If you want the high score for each person (i.e. personal best) you can do this...
SELECT name, max(score)
FROM highscore
WHERE gamename = 'game1'
GROUP BY name
Alternatively, you can do this...
SELECT name, score, date, version, mode, custom
FROM highscore h1
WHERE score =
(SELECT MAX(score)
FROM highscore h2
WHERE name = h1.name AND gamename = 'game1')
NOTE: In your SQL, your subquery is missing the name = h1.name predicate.
Note however, that this second option will give multiple rows for the same person if they recorded the same high score multiple times.
The multiple entries are returned because multiple entries have the same high score. You can add LIMIT 1 to get only a single entry. You can choose which entry to return with the ORDER BY clause.
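To make the difference concrete, the sketch below runs the original (uncorrelated) query and the correlated version side by side. It uses Python's sqlite3 module with an in-memory database rather than MySQL, the highscore table is cut down to three columns, and the rows are invented to mirror the JONTE/GABBES output in the question. The extra gamename filter on the outer query is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE highscore (name TEXT, score INTEGER, gamename TEXT)")
conn.executemany(
    "INSERT INTO highscore VALUES (?, ?, ?)",
    [
        ("jonte", 240, "game1"),
        ("jonte", 176, "game1"),
        ("gabbes", 176, "game1"),
        ("gabbes", 150, "game1"),
    ],
)

# Original shape: the outer query is not restricted by name, so every row
# that happens to share gabbes' best score is returned.
uncorrelated = conn.execute(
    """
    SELECT name, score FROM highscore
    WHERE score = (SELECT MAX(score) FROM highscore
                   WHERE name = 'gabbes' AND gamename = 'game1')
    ORDER BY name
    """
).fetchall()

# Correlated shape: the subquery computes the maximum per outer row's name.
personal_best = conn.execute(
    """
    SELECT h1.name, h1.score FROM highscore AS h1
    WHERE h1.gamename = 'game1'
      AND h1.score = (SELECT MAX(h2.score) FROM highscore AS h2
                      WHERE h2.name = h1.name AND h2.gamename = 'game1')
    ORDER BY h1.name
    """
).fetchall()

print(uncorrelated)
print(personal_best)
```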

Sum Of mysql table record

I have a table named tbl_Question and a column named INT_MARK which has different marks for different questions. Like this:
VH_QUESTION INT_MARK
----------- --------
Q1 2
Q2 4
My question is: How to get a random set of 20 questions whose total sum of marks is 50?
select VH_QUESTION, sum(INT_MARK) from tbl_Question
group by VH_QUESTION
having sum(INT_MARK) > 50
order by rand() limit 1
I think this question may help you - seems a very similar problem.
If that doesn't work, I'd try to divide the problem in two: first, you generate the combinations of your questions. Then, you filter them by their sum of points.
I couldn't find, however, how to produce all combinations of the table. I don't know how difficult that would be.
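Outside SQL, the two-step idea (generate combinations, then filter by sum) can be sketched in a few lines of Python. This is a brute-force illustration only: the marks dictionary and the pick_questions helper are hypothetical, and the number of combinations grows far too quickly for picking 20 questions out of a large table.

```python
import itertools
import random

# Hypothetical marks table: question label -> mark.
marks = {"Q1": 2, "Q2": 4, "Q3": 1, "Q4": 3, "Q5": 5, "Q6": 2}


def pick_questions(marks, count, target, rng=random):
    """Return a random tuple of `count` questions whose marks sum to `target`,
    or None if no such combination exists."""
    # Step 1: enumerate all combinations; step 2: keep those with the right sum.
    candidates = [
        combo
        for combo in itertools.combinations(sorted(marks), count)
        if sum(marks[q] for q in combo) == target
    ]
    return rng.choice(candidates) if candidates else None


chosen = pick_questions(marks, count=3, target=10)
print(chosen)
```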
select VH_QUESTION, sum(INT_MARK) from tbl_Question
group by VH_QUESTION
having sum(INT_MARK) >= 50
order by rand() limit 20
Quick answer
SELECT * ,SUM(INT_MARK) as total_mark FROM tbl_Question
GROUP BY VH_QUESTION
HAVING total_mark="50"
ORDER BY RAND()
LIMIT 5
It returns zero rows when no answer is possible, but each time it finds one, the questions are random.
You could check the benchmark to see if you can have a faster query for large tables.

MySQL query for items where average price is less than X?

I'm stumped with how to do the following purely in MySQL, and I've resorted to taking my result set and manipulating it in ruby afterwards, which doesn't seem ideal.
Here's the question. With a dataset of 'items' like:
id state_id price issue_date listed
1 5 450 2011 1
1 5 455 2011 1
1 5 490 2011 1
1 5 510 2012 0
1 5 525 2012 1
...
I'm trying to get something like:
SELECT * FROM items
WHERE ([some conditions], e.g. issue_date >= 2011 and listed=1)
AND state_id = 5
GROUP BY id
HAVING AVG(price) <= 500
ORDER BY price DESC
LIMIT 25
Essentially I want to grab a "group" of items whose average price falls under a certain threshold. I know that my above example "group by" and "having" are not correct since it's just going to give the AVG(price) of that one item, which doesn't really make sense. I'm just trying to illustrate my desired result.
The important thing here is I want all of the individual items in my result set, I don't just want to see one row with the average price, total, etc.
Currently I'm just doing the above query without the HAVING AVG(price) and adding up the individual items one-by-one (in ruby) until I reach the desired average. It would be really great if I could figure out how to do this in SQL. Using subqueries or something clever like joining the table onto itself are certainly acceptable solutions if they work well! Thanks!
UPDATE: In response to Tudor's answer below, here are some clarifications. There is always going to be a target quantity in addition to the target average. And we would always sort the results by price low to high, and by date.
So if we did have 10 items that were all priced at $5 and we wanted to find 5 items with an average < $6, we'd simply return the first 5 items. We wouldn't return the first one only, and we wouldn't return the first 3 grouped with the last 2. That's essentially how my code in ruby is working right now.
I would do almost an inverse of what Jasper provided... Start your query with your criteria to explicitly limit the few items that MAY qualify instead of getting all items and running a sub-select on each entry. Could pose as a larger performance hit... could be wrong, but here's my offering..
select
i2.*
from
( SELECT i.id
FROM items i
WHERE
i.issue_date > 2011
AND i.listed = 1
AND i.state_id = 5
GROUP BY
i.id
HAVING
AVG( i.price) <= 500 ) PreQualify
JOIN items i2
on PreQualify.id = i2.id
AND i2.issue_date > 2011
AND i2.listed = 1
AND i2.state_id = 5
order by
i2.price desc
limit
25
Not sure of the order by, especially if you wanted grouping by item... In addition, I would ensure an index on (state_id, Listed, id, issue_date)
CLARIFICATION per comments
I think I AM correct on it. Don't confuse "HAVING" clause with "WHERE". WHERE says DO or DONT include based on certain conditions. HAVING means after all the where clauses and grouping is done, the result set will "POTENTIALLY" accept the answer. THEN the HAVING is checked, and if IT STILL qualifies, includes in the result set, otherwise throws it out. Try the following from the INNER query alone... Do once WITHOUT the HAVING clause, then again WITH the HAVING clause...
SELECT i.id, avg( i.price )
FROM items i
WHERE i.issue_date > 2011
AND i.listed = 1
AND i.state_id = 5
GROUP BY
i.id
HAVING
AVG( i.price) <= 500
As you get more into writing queries, try the parts individually to see what you are getting vs what you are thinking... You'll find how / why certain things work. In addition, you are now talking in your updated question about getting multiple IDs and prices at apparent low and high range... yet you are also applying a limit. If you had 20 items, and each had 10 qualifying records, your limit of 25 would show all of the first item and 5 into the second... which is NOT what I think you want... you may want 25 of each qualified "id". That would wrap this query into yet another level...
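The "run it once without HAVING, then again with HAVING" experiment can be reproduced with a small script. The sketch below uses Python's sqlite3 module and an in-memory database instead of MySQL, and the items rows are invented so that one id passes the average test, one fails it, and one is removed earlier by the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER, state_id INTEGER, price INTEGER,"
    " issue_date INTEGER, listed INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [
        (1, 5, 450, 2012, 1),
        (1, 5, 490, 2012, 1),
        (1, 5, 510, 2012, 0),  # not listed: dropped by WHERE before averaging
        (2, 5, 520, 2012, 1),
        (2, 5, 600, 2012, 1),
        (3, 5, 400, 2010, 1),  # too old: dropped by WHERE
    ],
)

base = (
    "SELECT id, AVG(price) FROM items "
    "WHERE issue_date > 2011 AND listed = 1 AND state_id = 5 "
    "GROUP BY id"
)

# WHERE filters rows first, then GROUP BY forms one group per id.
without_having = conn.execute(base + " ORDER BY id").fetchall()

# HAVING then filters whole groups by their aggregate value.
with_having = conn.execute(
    base + " HAVING AVG(price) <= 500 ORDER BY id"
).fetchall()

print(without_having)
print(with_having)
```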
What MySQL does makes perfect sense. What you want to do does not make sense:
if you have, let's say, 4 items, each with a price of 5, and you put HAVING AVG(price) <= 7, what you are saying is that the query should return ALL the combinations, like:
{1} - since item with id 1, can be a group by itself
{1,2}
{1,3}
{1,4}
{1,2,3}
{1,2,4}
...
and so on?
Your algorithm for computing the average in ruby is also not valid. If you have items with values 5, 1, 7, 10 and seek an average value of less than 7, the element with value 10 can only be returned in a group that also contains the element with value 1. But by your algorithm (if I understood correctly), the element with value 1 is returned in the first group.
Update
What you want is something like the Knapsack problem and your approach is using some kind of Greedy Algorithm to solve it. I don't think there are straight, easy and correct ways to implement that in SQL.
After a google search, I found this article which tries to solve the knapsack problem with AI written in SQL.
By considering your item price as a weight, having the number of items and the desired average, you could compute the maximum value that can be entered in the 'knapsack' by multiplying desired_cost with number_of_items
I'm not entirely sure from your question, but I think this is a solution to your problem:
SELECT * FROM items
WHERE issue_date > 2011 AND listed = 1 -- plus any other conditions
AND state_id = 5
AND id IN (SELECT id
FROM items
GROUP BY id
HAVING AVG(price) <= 500)
ORDER BY price DESC
LIMIT 25
note: This is off the top of my head and I haven't done complex SQL in a while, so it might be wrong. I think this or something like it should work, though.

In MySQL, can I have a table returning the ten last rated games by rating?

The actual question is a little more complex than that, so here goes.
I have a website which reviews games. Ratings/reviews are posted for each game, and so I have a MySQL database to handle it all.
Thing is, I'd really like a page that showed what score (out of 10) meant what, and to illustrate it would have the game that was last reviewed as an example. I can always do it without, but this would be cooler.
So the query should return something like this (but running from 10 to 0):
|---------------*----------------*-----------------*-----------------|
* game.gameName | game.gameImage | review.ourScore | review.postedOn *
|---------------*----------------*-----------------*-----------------|
| Top Game | img | 10 | (unix timestamp)|
| NearlyTop Game| img | 9 | (unix timestamp)|
| Great Game | img | 8 | (unix timestamp)|
|---------------*----------------*-----------------*-----------------|
The information is in two tables, game and review. I think you'd use MAX() to find out the last timestamp and corresponding game information, but as far as complex queries go, I'm in way over my head.
Of course this could be done with 10 simple SELECTs but I'm sure there must be a way to do this in one query.
Thanks for any help.
Here is an ugly solution I found:
This query simply gets the IDs and scores of the reviews that you want to look at. I have included it so that you can understand what the trick is, without getting distracted by other stuff:
SELECT * FROM
(SELECT reviewID, ourScore FROM review ORDER BY postedOn DESC) as `r`
GROUP BY ourScore
ORDER BY ourScore DESC;
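This exploits MySQL's 'GROUP BY' behavior. When the grouping is done, if the source rows have different values for the non-grouped columns, then in practice the value of the topmost source row is used. MySQL does not guarantee this (the documentation says the server is free to choose any value from the group), and the query is rejected outright when the ONLY_FULL_GROUP_BY SQL mode is enabled. So if you had rows in this order: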
reviewId Score
1 3
0 3
2 3
Then after you group by score, the reviewId is 1 because that row was on the top:
reviewId Score
1 3
So we want to put the most recent review on the top before we do the group by. Since ordering is always done after grouping in a single SELECT statement, I had to make a subquery to accomplish this. Now we just dress up this query a little bit to get all the fields you wanted:
SELECT `r`.*, game.gameName, game.gameImage FROM
(SELECT reviewID, ourScore, postedOn, gameID FROM review ORDER BY postedOn DESC) as `r`
JOIN game ON `r`.gameID = game.gameID
GROUP BY ourScore
ORDER BY ourScore DESC;
That should work.
SELECT DISTINCT game.gameName, game.gameImage, review.ourScore FROM game
LEFT JOIN review
ON game.ID = review.gameID
ORDER BY review.postedOn
LIMIT 10
Or something like that. Check how to use DISTINCT first, since I'm not sure of the syntax, and you may have to add DESC or ASC to the ORDER BY depending on what you want.
Well..
SELECT game.gameName, game.gameImage, review.ourScore
FROM game
LEFT JOIN review ON game.gameID = review.gameID
GROUP BY review.ourScore DESC
LIMIT 10
returns a list of games grouped by each individual score. But this isn't what I want, I want the game that is last posted - this is why the timestamp is important. With that query, MySQL returns the first result it can find.
I think this would work:
select g.gameName, g.gameImage, r.ourScore, r.postedOn
from game g, review r
where g.gameId = r.gameId
and r.postedOn = (select max(sr.postedOn)
from review sr where sr.ourScore = r.ourScore)
group by r.ourScore
order by r.ourScore desc;
Edit: above SQL was corrected after David Grayson's comment. I think this query is pretty easy to understand but probably performs poorly compared with his solution.
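As a rough check of the correlated-subquery approach, here is a small sketch run through Python's sqlite3 module with an in-memory database instead of MySQL. The table and column names follow the question, the rows are invented, and the GROUP BY is left out on the assumption that no two reviews with the same score share a timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE game (gameID INTEGER PRIMARY KEY, gameName TEXT, gameImage TEXT);
    CREATE TABLE review (reviewID INTEGER PRIMARY KEY, gameID INTEGER,
                         ourScore INTEGER, postedOn INTEGER);
    INSERT INTO game VALUES (1, 'Old Top Game', 'a.png'),
                            (2, 'New Top Game', 'b.png'),
                            (3, 'Nearly Top Game', 'c.png');
    INSERT INTO review VALUES (1, 1, 10, 100),
                              (2, 2, 10, 200),
                              (3, 3, 9, 150);
    """
)

# For each score, keep only the review whose timestamp is the latest
# among all reviews with that same score.
latest_per_score = conn.execute(
    """
    SELECT g.gameName, r.ourScore, r.postedOn
    FROM game AS g
    JOIN review AS r ON g.gameID = r.gameID
    WHERE r.postedOn = (SELECT MAX(sr.postedOn) FROM review AS sr
                        WHERE sr.ourScore = r.ourScore)
    ORDER BY r.ourScore DESC
    """
).fetchall()

print(latest_per_score)
```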