ORDER BY and GROUP BY those results in a single query - mysql

I am trying to query a dataset from a single table, which contains quiz answers/entries from multiple users. I want to pull out the highest scoring entry from each individual user.
My data looks like the following:
ID  TP_ID              quiz_id  name       num_questions  correct  incorrect  percent  created_at
1   10154312970149546  1        Joe        3              2        1          67       2015-09-20 22:47:10
2   10154312970149546  1        Joe        3              3        0          100      2015-09-21 20:15:20
3   125564674465289    1        Test User  3              1        2          33       2015-09-23 08:07:18
4   10153627558393996  1        Bob        3              3        0          100      2015-09-23 11:27:02
My query looks like the following:
SELECT * FROM `entries`
WHERE `TP_ID` IN('10153627558393996', '10154312970149546')
GROUP BY `TP_ID`
ORDER BY `correct` DESC
In my mind, what that should do is get the two users from the IN clause, order them by the number of correct answers and then group them together, so I should be left with the 2 highest scores from those two users.
In reality it's giving me two results, but the one from Joe gives me the lower of the two values (2), with Bob first with a score of 3. Swapping to ASC ordering keeps the scores the same but places Joe first.
So, how could I achieve what I need?

You're after the groupwise maximum, which can be obtained by joining the grouped results back to the table:
SELECT * FROM entries NATURAL JOIN (
SELECT TP_ID, MAX(correct) correct
FROM entries
WHERE TP_ID IN ('10153627558393996', '10154312970149546')
GROUP BY TP_ID
) t
Of course, if a user has multiple records with the maximal score, it will return all of them; should you only want some subset, you'll need to express the logic for determining which.
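To see the groupwise maximum in action, here is a minimal sketch that runs the same join against the sample rows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the NATURAL JOIN behaves the same way here), and it only loads the columns the query needs.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (ID INTEGER, TP_ID TEXT, name TEXT, correct INTEGER)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [
        (1, "10154312970149546", "Joe", 2),
        (2, "10154312970149546", "Joe", 3),
        (3, "125564674465289", "Test User", 1),
        (4, "10153627558393996", "Bob", 3),
    ],
)

# The subquery finds each user's best score; the NATURAL JOIN (on the shared
# columns TP_ID and correct) pulls back the full rows that carry that score.
best_rows = conn.execute(
    """
    SELECT ID, name, correct
    FROM entries NATURAL JOIN (
        SELECT TP_ID, MAX(correct) AS correct
        FROM entries
        WHERE TP_ID IN ('10153627558393996', '10154312970149546')
        GROUP BY TP_ID
    ) t
    ORDER BY ID
    """
).fetchall()
print(best_rows)
```

Joe's row with 3 correct answers is returned rather than the one with 2, because the join matches on the maximum score instead of relying on whichever row GROUP BY happens to keep.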

MySQL is quite lax about GROUP BY clauses (when the ONLY_FULL_GROUP_BY SQL mode is disabled), but as a rule of thumb you should follow the rule that other DBMSs enforce:
In a GROUP BY query, each selected column should either be part of the GROUP BY clause or be wrapped in an aggregate function.
For your query I would suggest:
SELECT `TP_ID`,`name`,max(`correct`) FROM `entries`
WHERE `TP_ID` IN('10153627558393996', '10154312970149546')
GROUP BY `TP_ID`,`name`
Since your table seems quite denormalized, the name part of the GROUP BY could be omitted, but it might be necessary in other cases.
ORDER BY only specifies the order in which results are returned; it has no effect on which rows are returned. So you need to apply the MAX() function to get the highest number of correct answers.

Related

what does this sql query do? SELECT column_1 FROM table_1,table_2;

SELECT column_1 FROM table_1,table_2;
When I ran this on my database it returned huge number of rows with duplicate column_1 values. I could not understand why I got these results. Please explain what this query does.
It gives you the cross product (Cartesian product) of table_1 and table_2.
In more layman's terms, it means that for each record in Table A, you get every record from Table B (all possible combinations).
TableA with 3 records and Table B with 3 records gives 9 total records in the result:
TableA-1/B-1
TableA-1/B-2
TableA-1/B-3
TableA-2/B-1
TableA-2/B-2
TableA-2/B-3
TableA-3/B-1
TableA-3/B-2
TableA-3/B-3
Often used as a basis for Cartesian Queries (which themselves are the means to generate, say, a list of future dates based on a recurrence schedule: give me all possible results for the next 6 months, then restrict that set to those whose factor matches my day of the week)
This is a 'valid' way of cross joining two tables, but it is not the preferred way; an explicit CROSS JOIN would be much clearer. A join with an ON condition would then help limit the results.
Imagine I have a table with 3 friends named Jhon, Ana and Nick, and another table with 2 T-shirts, a red one and a yellow one, and I want to know which T-shirt belongs to whom.
With tableA being Friends and tableB being Tshirts, the query returns:
1|JHON | t-shirt_YELLOW
2|JHON | t-shirt_RED
3|ANA | t-shirt_YELLOW
4|ANA | t-shirt_RED
5|NICK | t-shirt_YELLOW
6|NICK | t-shirt_RED
As you can see, this join has no relational logic between Friends and Tshirts, so evaluating every possible combination produces what you call duplicates.
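The friends and T-shirts example can be reproduced with a small sketch. It assumes Python's sqlite3 module in place of MySQL, and the table and column names (friends, tshirts, name, color) are made up for illustration. It compares the comma syntax with an explicit CROSS JOIN.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (name TEXT)")
conn.execute("CREATE TABLE tshirts (color TEXT)")
conn.executemany("INSERT INTO friends VALUES (?)", [("Jhon",), ("Ana",), ("Nick",)])
conn.executemany("INSERT INTO tshirts VALUES (?)", [("yellow",), ("red",)])

# Comma-separated tables with no join condition: every friend is paired
# with every T-shirt.
implicit = conn.execute("SELECT name, color FROM friends, tshirts").fetchall()

# The same result written as an explicit CROSS JOIN, which states the intent.
explicit = conn.execute(
    "SELECT name, color FROM friends CROSS JOIN tshirts"
).fetchall()

print(len(implicit))  # 3 friends x 2 T-shirts
```

Both queries return the same six pairs; selecting only name from either one would show each friend twice, which is where the apparent duplicates come from.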

SQL: Repeated records by grouping some columns

I have a data like,
ID      Name     ItemA  ItemB  ItemC
OXZ234  Adam     4      4      5
OXZ234  Adam     1      2      3
OXZ345  Tarzen   6      7      8
OXDER2  William  9      8      2
OXDER2  William  0      8      0
I need to find how much food each person eats. For example, from the first two records I can say that Adam (ID OXZ234) ate ItemA-5, ItemB-6 and ItemC-8. For a small amount of data this kind of manual calculation is manageable, but I have a million records like this. So initially I need to find the records that have the same ID and name and differ only in the item counts.
I have tried the query to find duplicate records by grouping all columns like below,
select ID,Name,ItemA,ItemB,ItemC, COUNT(*)
from DATA_REFRESH
group by ID,Name,ItemA,ItemB,ItemC
having COUNT(*) > 1
But now I have to identify records whose item columns differ.
So the expected output is like,
OXZ234 Adam 2
OXDER2 William 2
OXZ345 Tarzen 1
Any suggestion would be helpful!
You want SUM
select ID,
Name,
sum(ItemA) as ItA,
sum(ItemB) as ItB,
sum(ItemC) as ItC,
count(ID) as Occurrences -- Counts the number of entries per person
from DATA_REFRESH
group by ID,Name
having count(ID) >1 -- restricts this so only those with more than one entry appear
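As a quick check of the SUM approach against the sample data, here is a sketch that assumes Python's sqlite3 module as a stand-in for the asker's database. The HAVING clause is left out so that single-entry people such as Tarzen also appear, matching the expected output in the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA_REFRESH "
    "(ID TEXT, Name TEXT, ItemA INTEGER, ItemB INTEGER, ItemC INTEGER)"
)
conn.executemany(
    "INSERT INTO DATA_REFRESH VALUES (?, ?, ?, ?, ?)",
    [
        ("OXZ234", "Adam", 4, 4, 5),
        ("OXZ234", "Adam", 1, 2, 3),
        ("OXZ345", "Tarzen", 6, 7, 8),
        ("OXDER2", "William", 9, 8, 2),
        ("OXDER2", "William", 0, 8, 0),
    ],
)

# One output row per person: item totals plus the number of source rows.
totals = conn.execute(
    """
    SELECT ID, Name, SUM(ItemA), SUM(ItemB), SUM(ItemC), COUNT(*)
    FROM DATA_REFRESH
    GROUP BY ID, Name
    ORDER BY COUNT(*) DESC, ID
    """
).fetchall()
for row in totals:
    print(row)
```

Adam's totals come out as 5, 6 and 8, the same figures worked out by hand in the question.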
You can use a simpler query without the HAVING clause:
select ID,Name,COUNT(*)
from DATA_REFRESH
group by ID,Name order by COUNT(*) desc ;
Or simply try this:
select ID,Name,COUNT(*)
from Sample_Check
group by ID,Name
having COUNT(*) > 1

select records in given ids sorting order

I have a table, let's say Students,
with 5 records whose ids are 1 to 5. I want to select the records so that the result comes back in a given order of the id column.
The id column should come back as 5,2,1,3,4.
Is there any way to do this other than separate db calls for each id?
Can it be done in a single db call?
I guess if you really want a hard-coded order, you could do something like this:
order by case id
when 5 then 0
when 2 then 1
when 1 then 2
when 3 then 3
when 4 then 4
else 999
end
Or more simply (as @Strawberry points out in the comments):
order BY FIELD(id,4,3,1,2,5) desc
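If the wanted order lives in application code, the CASE expression can be generated rather than typed by hand. This is a minimal sketch under a few assumptions: Python's sqlite3 module stands in for MySQL, the students table and the wanted list are made up for illustration, and the ids are bound as parameters.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(i, f"student{i}") for i in range(1, 6)],
)

wanted = [5, 2, 1, 3, 4]  # hypothetical order supplied by the application

# One "WHEN ? THEN <position>" branch per id; unlisted ids would sort last.
case_sql = " ".join(f"WHEN ? THEN {pos}" for pos in range(len(wanted)))
placeholders = ",".join("?" for _ in wanted)
query = (
    f"SELECT id FROM students WHERE id IN ({placeholders}) "
    f"ORDER BY CASE id {case_sql} ELSE {len(wanted)} END"
)

# The ids are bound twice: once for the IN list, once for the CASE branches.
ordered_ids = [row[0] for row in conn.execute(query, wanted + wanted)]
print(ordered_ids)
```

Only the positions are interpolated into the SQL text; the ids themselves go through placeholders, so the query stays a single database call.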

MySql order by specific ID values

Is it possible to sort in MySQL by "order by" using a predefined set of column values (ID) like order by (ID=1,5,4,3) so I would get records 1, 5, 4, 3 in that order out?
UPDATE: Why I need this...
I want my records to change sort randomly every 5 minutes. I have a cron task to update the table to put different, random sort order in it.
There is just one problem! PAGINATION.
I will have visitors who come to my page, and I will give them the first 20 results. They will wait 6 minutes, go to page 2 and have the wrong results as the sort order has already changed.
So I thought that if I put all the IDs into a session on page 2, we get the correct records even if the sorting had already changed.
Is there any other better way to do this?
You can use ORDER BY and FIELD function.
See http://lists.mysql.com/mysql/209784
SELECT * FROM table ORDER BY FIELD(ID,1,5,4,3)
It uses the FIELD() function, which "Returns the index (position) of str in the str1, str2, str3, ... list. Returns 0 if str is not found" according to the documentation. So you are actually sorting the result set by the return value of this function, which is the index of the field value in the given set.
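One consequence of "returns 0 if not found" is that rows whose id is not in the list sort before the listed ones. The sketch below illustrates this. It assumes Python's sqlite3 module, and because SQLite has no FIELD function, a simplified Python stand-in is registered under that name; it mimics the documented behaviour but is not MySQL's implementation.

```python
import sqlite3


def field(needle, *candidates):
    """Simplified stand-in for MySQL's FIELD(): 1-based position, or 0."""
    if needle is None:
        return 0  # a NULL needle never matches
    for position, candidate in enumerate(candidates, start=1):
        if needle == candidate:
            return position
    return 0


conn = sqlite3.connect(":memory:")
conn.create_function("FIELD", -1, field)  # -1 = any number of arguments
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 7)])

# Without a WHERE clause, ids 2 and 6 get position 0 and come first.
all_rows = [
    r[0]
    for r in conn.execute("SELECT id FROM items ORDER BY FIELD(id, 1, 5, 4, 3), id")
]

# Restricting to the listed ids gives exactly the requested order.
listed_only = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM items WHERE id IN (1, 5, 4, 3) "
        "ORDER BY FIELD(id, 1, 5, 4, 3)"
    )
]
print(all_rows, listed_only)
```

So in practice the ORDER BY FIELD(...) is usually paired with a matching WHERE id IN (...) filter, or the unlisted rows are pushed to the end by some other sort key.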
You should be able to use CASE for this:
ORDER BY CASE id
WHEN 1 THEN 1
WHEN 5 THEN 2
WHEN 4 THEN 3
WHEN 3 THEN 4
ELSE 5
END
On the official documentation for mysql about ORDER BY, someone has posted that you can use FIELD for this matter, like this:
SELECT * FROM table ORDER BY FIELD(id,1,5,4,3)
This is untested code that in theory should work.
SELECT * FROM table ORDER BY id='8' DESC, id='5' DESC, id='4' DESC, id='3' DESC
If I had 10 records, for example, IDs 8, 5, 4 and 3 would appear first, followed by the other records.
Normal order
1
2
3
4
5
6
7
8
9
10
With this ordering
8
5
4
3
1
2
6
7
9
10
There's another way to solve this. Add a separate table, something like this:
CREATE TABLE `new_order` (
`my_order` BIGINT(20) UNSIGNED NOT NULL,
`my_number` BIGINT(20) NOT NULL,
PRIMARY KEY (`my_order`),
UNIQUE KEY `my_number` (`my_number`)
) ENGINE=INNODB;
This table will now be used to define your own order mechanism.
Add your values in there:
my_order | my_number
---------+----------
1 | 1
2 | 5
3 | 4
4 | 3
...and then modify your SQL statement while joining this new table.
SELECT *
FROM your_table AS T1
INNER JOIN new_order AS T2 on T1.id = T2.my_number
WHERE ....whatever...
ORDER BY T2.my_order;
This solution is slightly more complex than the others, but with it you don't have to change your SELECT statement whenever your ordering criteria change; you just change the data in the order table.
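Here is a small sketch of the order-table idea, assuming Python's sqlite3 module as a stand-in for MySQL and a made-up label column in your_table. It runs the same SELECT twice and changes only the rows in new_order in between.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY, label TEXT)")
conn.execute(
    "CREATE TABLE new_order ("
    "my_order INTEGER PRIMARY KEY, my_number INTEGER NOT NULL UNIQUE)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 6)]
)
conn.executemany("INSERT INTO new_order VALUES (?, ?)", [(1, 1), (2, 5), (3, 4), (4, 3)])

query = (
    "SELECT T1.id FROM your_table AS T1 "
    "INNER JOIN new_order AS T2 ON T1.id = T2.my_number "
    "ORDER BY T2.my_order"
)
before = [r[0] for r in conn.execute(query)]

# Reorder by rewriting the data in new_order; the SELECT itself is untouched.
conn.execute("DELETE FROM new_order")
conn.executemany("INSERT INTO new_order VALUES (?, ?)", [(1, 3), (2, 4), (3, 5), (4, 1)])
after = [r[0] for r in conn.execute(query)]

print(before, after)
```

Note that the INNER JOIN also acts as a filter: id 2 has no row in new_order, so it never appears in the result.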
If you need a single id to come first in the result, test for that id in a CASE expression.
select id,name
from products
order by case when id=5 then -1 else id end
If you need to start with a sequence of multiple ids, specify a collection, similar to what you would use with an IN statement.
select id,name
from products
order by case when id in (30,20,10) then -1 else id end,id
If you want a single id to come last in the result, order by a CASE expression first. (E.g. you want the "Other" option last and the rest of the city list in alphabetical order.)
select id,city
from city
order by case
when id = 2 then 1 else 0
end, city ASC
For example, with 5 cities, this query shows the cities in alphabetical order with the "Other" option last in the dropdown.
In my table "Other" is stored under id 2, which is why the query above uses "when id = 2".
record in DB table:
Bangalore - id:1
Other - id:2
Mumbai - id:3
Pune - id:4
Ambala - id:5
my output:
Ambala
Bangalore
Mumbai
Pune
Other
SELECT * FROM table ORDER BY FIELD(columnname,1,2) ASC -- or DESC

GROUP BY does not remove duplicates

I have coded a watchlist system. In the overview of a user's watchlist they see a list of records; however, the list shows duplicates, even though the database holds only the exact, correct number of rows.
I've tried GROUP BY watch.watch_id and GROUP BY rec.record_id, but none of the groupings I've tried seems to remove the duplicates. I'm not sure what I'm doing wrong.
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.user_id = 1
GROUP BY watch.watch_id
LIMIT 0, 25
The watchlist table looks like this:
+----------+---------+-----------+------------+
| watch_id | user_id | record_id | watch_date |
+----------+---------+-----------+------------+
| 13 | 1 | 22 | 1314038274 |
| 14 | 1 | 25 | 1314038995 |
+----------+---------+-----------+------------+
GROUP BY does not "remove duplicates". GROUP BY allows for aggregation. If all you want is to combine duplicated rows, use SELECT DISTINCT.
If you need to combine rows that are duplicate in some columns, use GROUP BY, but you need to specify what to do with the other columns. You can either omit them (by not listing them in the SELECT clause) or aggregate them (using functions like SUM, MIN, and AVG). For example:
SELECT watch.watch_id, COUNT(rec.street_number), MAX(watch.watch_date)
... GROUP by watch.watch_id
EDIT
The OP asked for some clarification.
Consider the "view" -- all the data put together by the FROMs and JOINs and the WHEREs -- call that V. There are two things you might want to do.
First, you might have completely duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 3
3 4 5
Then simply use DISTINCT
SELECT DISTINCT * FROM V;
a b c
- - -
1 2 3
3 4 5
Or, you might have partially duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 6
3 4 5
Those first two rows are "the same" in some sense, but clearly different in another sense (in particular, they would not be combined by SELECT DISTINCT). You have to decide how to combine them. You could discard column c as unimportant:
SELECT DISTINCT a,b FROM V;
a b
- -
1 2
3 4
Or you could perform some kind of aggregation on them. You could add them up:
SELECT a,b, SUM(c) "tot" FROM V GROUP BY a,b;
a b tot
- - ---
1 2 9
3 4 5
You could pick the smallest value:
SELECT a,b, MIN(c) "first" FROM V GROUP BY a,b;
a b first
- - -----
1 2 3
3 4 5
Or you could take the mean (AVG), the standard deviation (STD), and any of a bunch of other functions that take a bunch of values for c and combine them into one.
What isn't really an option is just doing nothing. If you just list the ungrouped columns, the DBMS will either throw an error (Oracle does that -- the right choice, imo) or pick one value more or less at random (MySQL). But as Dr. Peart said, "When you choose not to decide, you still have made a choice."
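The options above can be run side by side on the partially duplicated example. This sketch assumes Python's sqlite3 module as a stand-in for the DBMS and stores the "view" V as a plain table.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE V (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO V VALUES (?, ?, ?)", [(1, 2, 3), (1, 2, 6), (3, 4, 5)])

# DISTINCT over all columns keeps both (1, 2, ...) rows: they differ in c.
distinct_all = conn.execute("SELECT DISTINCT a, b, c FROM V ORDER BY a, c").fetchall()

# Dropping c makes the first two rows identical, so DISTINCT merges them.
distinct_ab = conn.execute("SELECT DISTINCT a, b FROM V ORDER BY a").fetchall()

# GROUP BY merges them too, but says explicitly how c is combined.
summed = conn.execute(
    "SELECT a, b, SUM(c) FROM V GROUP BY a, b ORDER BY a"
).fetchall()
smallest = conn.execute(
    "SELECT a, b, MIN(c) FROM V GROUP BY a, b ORDER BY a"
).fetchall()

print(distinct_all, distinct_ab, summed, smallest)
```

Every variant that collapses the first two rows has to say what happens to c, either by leaving it out or by aggregating it.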
While SELECT DISTINCT may indeed work in your case, it's important to note why what you have is not working.
You're selecting fields that are outside of the GROUP BY. Although MySQL allows this, the exact values it returns for the non-GROUP BY fields are undefined.
If you wanted to do this with a GROUP BY try something more like the following:
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.watch_id IN (
SELECT watch_id FROM watchlist WHERE user_id = 1
GROUP BY watch_id)
LIMIT 0, 25
I would never recommend using SELECT DISTINCT; it can be really slow on big datasets.
Try using things like EXISTS.
You are grouping by watch.watch_id and you have two results, which have different watch IDs, so naturally they would not be grouped.
Also, from the results displayed, they have different records. That looks like a perfectly valid, expected result. If you are trying to select only distinct values, then you don't want to GROUP; you want to select distinct values.
SELECT DISTINCT ...
If you say your watchlist table is unique, then one (or both) of the other tables either (a) has duplicates, or (b) is not unique by the key you are using.
To suppress duplicates in your results, either use DISTINCT as @Laykes says, or try
GROUP BY watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
It sort of sounds like you expect all 3 tables to be unique by their keys, though. If that is the case, you are simply masking some other problem with your SQL by trying to retrieve distinct values.
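To check whether a joined table is the real source of the duplicates, it can help to look for repeated join keys directly. The sketch below assumes Python's sqlite3 module as a stand-in for MySQL; the records rows, including the deliberately duplicated record_id 22 and the city values, are hypothetical data invented to reproduce the symptom.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE watchlist (watch_id INTEGER, user_id INTEGER, record_id INTEGER)")
conn.execute("CREATE TABLE records (record_id INTEGER, city TEXT)")
conn.executemany("INSERT INTO watchlist VALUES (?, ?, ?)", [(13, 1, 22), (14, 1, 25)])
# Hypothetical data: record_id 22 is stored twice, so it is not a unique key.
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(22, "Springfield"), (22, "Springfield"), (25, "Shelbyville")],
)

# Two watchlist rows go in, three rows come out: the join multiplies
# watch_id 13 by the two matching records rows.
joined = conn.execute(
    """
    SELECT w.watch_id, r.city
    FROM watchlist w
    LEFT OUTER JOIN records r ON r.record_id = w.record_id
    WHERE w.user_id = 1
    ORDER BY w.watch_id
    """
).fetchall()

# Diagnostic: list join keys that occur more than once in the joined table.
dupe_keys = conn.execute(
    "SELECT record_id, COUNT(*) FROM records GROUP BY record_id HAVING COUNT(*) > 1"
).fetchall()

print(joined, dupe_keys)
```

If the diagnostic query returns rows, DISTINCT or a wider GROUP BY would hide the symptom, but the non-unique key is the thing to fix.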