Sql query max, group by - mysql

I am trying to get all students grouped by class_id, student_id, teacher_id.
What I mean is this:
Select id,class_id, student_id,teacher_id, max(active)
FROM student_classes
GROUP BY class_id, student_id, teacher_id
But this is what I get
Actually what I want as a result is:
114 137 1 47 1
108 138 2 49 0
113 197 3 47 1
So basically the problem is at the third row. Instead of having id = 113 I get ID=111.
What should I do in this case? Can you please help me with the query?

As mentioned in the comments, MySQL allows something against the SQL standard, letting you include a non-aggregated column (in this case id) in the select list of a query that includes a group by. As far as I know, it will arbitrarily pick one row in each grouping and display the id value from that row.
If you have a specific rule about which id value you want to see, you need to express that in your query.
By the way, your desired output appears to have multiple typos (e.g. 197, which doesn't appear in your data at all).
From your comment (which you should edit into your original question), and your desired output, I think the rule you want for the id column is:
If there are any rows with active=1 in the group, choose the maximum id value from those rows
If all rows in the group have active=0, choose the minimum id value. (You didn't say this specifically; I'm assuming it based on the presence of 108 on the second row of your desired output.)
I think that this query will produce those results. (And also eliminate the non-standard MySQL behavior.)
SELECT
COALESCE(
MAX(CASE WHEN active=1 THEN id ELSE NULL END),
MIN(id)
) AS some_id,
class_id, student_id, teacher_id, max(active)
FROM student_classes
GROUP BY class_id, student_id, teacher_id
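To see the rule in action, here is a minimal sketch that runs the same conditional aggregation against a few made-up rows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are hypothetical (the real data is not shown in the question); only the ids 114 and 108 are taken from the desired output.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_classes "
    "(id INTEGER, class_id INTEGER, student_id INTEGER, "
    "teacher_id INTEGER, active INTEGER)"
)
# Hypothetical rows: one group with active rows, one group with none.
conn.executemany(
    "INSERT INTO student_classes VALUES (?, ?, ?, ?, ?)",
    [
        (111, 137, 1, 47, 0),
        (112, 137, 1, 47, 1),
        (114, 137, 1, 47, 1),
        (108, 138, 2, 49, 0),
        (109, 138, 2, 49, 0),
    ],
)

query = """
SELECT
  COALESCE(
    MAX(CASE WHEN active = 1 THEN id END),  -- largest id among active rows
    MIN(id)                                 -- fallback: smallest id in the group
  ) AS some_id,
  class_id, student_id, teacher_id, MAX(active) AS max_active
FROM student_classes
GROUP BY class_id, student_id, teacher_id
ORDER BY class_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The first group has active rows, so the largest active id is chosen; the second group has none, so the CASE expression yields only NULLs and COALESCE falls back to MIN(id).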

MySQL versions 5.5 and 5.6 run the query as you wrote it, but the result is not actually correct. Version 5.7 and higher (where ONLY_FULL_GROUP_BY is enabled by default) throw an error like "SELECT list is not in GROUP BY clause and contains nonaggregated column 'student_classes.id'..."
So it seems your DB version is old, and this code may work the way you wanted:
select
min(x.id) as id,
x.class_id,
x.student_id,
x.teacher_id,
x.active
from student_classes x
inner join (select
class_id,
student_id,
teacher_id,
max(active) max_active
from student_classes
group by class_id, student_id, teacher_id
) y
on x.class_id = y.class_id and
x.student_id = y.student_id and
x.teacher_id = y.teacher_id and
x.active = y.max_active
group by x.class_id, x.student_id, x.teacher_id, x.active
order by id, class_id, student_id
;

You don't actually want an aggregation; you want to pick particular rows. The rule for picking a row is: per class_id, student_id, teacher_id, get the row with the maximum active and, in case of a tie, the lowest id. This is a ranking of rows.
As of MySQL 8 you can use a window function like ROW_NUMBER to rank rows:
select *
from
(
select
sc.*,
row_number() over (partition by class_id, student_id, teacher_id
order by active desc, id) as rn
from student_classes sc
) with_wanted_id
where rn = 1;
In older versions you could use NOT EXISTS to exclude rows for which a better row exists:
select *
from student_classes sc1
where not exists
(
select null
from student_classes sc2
where sc2.class_id = sc1.class_id
and sc2.student_id = sc1.student_id
and sc2.teacher_id = sc1.teacher_id
and
(
sc2.active > sc1.active
or
(sc2.active = sc1.active and sc2.id < sc1.id)
)
);
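As a quick check of the "no better row exists" idea, here is a small sketch that runs the NOT EXISTS query on made-up rows. It assumes Python's built-in sqlite3 module instead of MySQL, and the sample data is hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_classes "
    "(id INTEGER, class_id INTEGER, student_id INTEGER, "
    "teacher_id INTEGER, active INTEGER)"
)
# Hypothetical rows, not the asker's real data.
conn.executemany(
    "INSERT INTO student_classes VALUES (?, ?, ?, ?, ?)",
    [
        (111, 137, 1, 47, 0),
        (112, 137, 1, 47, 1),
        (114, 137, 1, 47, 1),
        (108, 138, 2, 49, 0),
        (109, 138, 2, 49, 0),
    ],
)

query = """
SELECT sc1.id, sc1.class_id, sc1.student_id, sc1.teacher_id, sc1.active
FROM student_classes sc1
WHERE NOT EXISTS (
  SELECT NULL
  FROM student_classes sc2
  WHERE sc2.class_id = sc1.class_id
    AND sc2.student_id = sc1.student_id
    AND sc2.teacher_id = sc1.teacher_id
    AND (
      sc2.active > sc1.active                             -- a more active row
      OR (sc2.active = sc1.active AND sc2.id < sc1.id)    -- same active, lower id
    )
)
ORDER BY sc1.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each group keeps exactly one row: 112 is the lowest id among the active rows of its group, and 108 is the lowest id in a group where nothing is active.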

Related

Count of different values with the same id

I have ID_employe in one column (the same ID can appear on more than 2 rows) and ID_job in another column.
I need MySQL to find equal values in the first column and then check whether the second column always holds the same value for them.
If there is any difference, I need the number of ID_employe values that have differing ID_job values.
So example:
from this example I need SQL to give me result: 2
(because ID_employe 1 and 3 has different ID_job)
Thank you very much!
With EXISTS:
select count(distinct t.ID_employe) counter
from tablename t
where exists (select 1 from tablename where ID_employe = t.ID_employe and ID_job <> t.ID_job)
You can use a having clause and compare the minimum and maximum id_job per id_employee to find those that have at least two different jobs. Then you can count them in another level of aggregation:
select count(*) cnt
from (
select id_employee
from mytable
group by id_employee
having min(id_job) <> max(id_job)
) t
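Both approaches can be compared side by side on a tiny data set. The sketch below assumes Python's built-in sqlite3 module rather than MySQL, and the rows are invented so that employees 1 and 3 have differing jobs, as in the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id_employee INTEGER, id_job INTEGER)")
# Hypothetical rows: employees 1 and 3 have more than one distinct job.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 20), (2, 20), (3, 30), (3, 31), (3, 30)],
)

# HAVING variant: a group has differing jobs when MIN and MAX disagree.
having_count = conn.execute(
    """
    SELECT COUNT(*) AS cnt
    FROM (
      SELECT id_employee
      FROM mytable
      GROUP BY id_employee
      HAVING MIN(id_job) <> MAX(id_job)
    ) t
    """
).fetchone()[0]

# EXISTS variant: count employees that have another row with a different job.
exists_count = conn.execute(
    """
    SELECT COUNT(DISTINCT t.id_employee) AS counter
    FROM mytable t
    WHERE EXISTS (
      SELECT 1 FROM mytable
      WHERE id_employee = t.id_employee AND id_job <> t.id_job
    )
    """
).fetchone()[0]

print(having_count, exists_count)
```

On this data both queries report 2, which is the answer the question asks for.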

Query for getting top 5 candidate in every group in single table

I have a table that stores student marks in each subject, and I need a query that returns the top 5 students with the highest marks in every subject.
Here is a sample table:
My expected output looks something like:
The top five students in PCM, ART and PCB based on their marks. If two or more students have the same marks, those records also need to be in the list, and all of this should come from a single query.
Original Answer
Technically, what you want to accomplish is not possible using a single SQL query. Had you only wanted one student per subject you could have achieved that using GROUP BY, but in your case it won't work.
The only way I can think of to get 5 students for each subject would be to write x queries, one for each subject and use UNION to glue them together. Such query will return a maximum of 5x rows.
Since you want to get the top 5 students based on the mark, you will have to use an ORDER BY clause, which, in combination with the UNION clauses will cause an error. To avoid that, you will have to use subqueries, so that UNION and ORDER BY clauses are not on the same level.
Query:
-- Select the 5 students with the highest mark in the `PCM` subject.
(
SELECT *
FROM student
WHERE subject = 'PCM'
ORDER BY studentMarks DESC
LIMIT 5
)
UNION
(
SELECT *
FROM student
WHERE subject = 'PCB'
ORDER BY studentMarks DESC
LIMIT 5
)
UNION
(
SELECT *
FROM student
WHERE subject = 'ART'
ORDER BY studentMarks DESC
LIMIT 5
);
Check out this SQLFiddle to evaluate the result yourself.
Updated Answer
This update allows returning more than 5 students when several students share the same grade in a particular subject.
Instead of using LIMIT 5 to get the top 5 rows, we use LIMIT 4,1 to get the fifth highest grade and then select all students whose grade in that subject is greater than or equal to it. If a subject has fewer than 5 students, however, LIMIT 4,1 returns no row and the subquery evaluates to NULL. In that case we want essentially every student, so we fall back to the minimum grade.
To achieve this, you will need to repeat the following piece of code once for each subject you have and join the pieces together using UNION. This solution is therefore only practical for a handful of subjects; beyond that the query becomes unmaintainable.
Code:
-- Select the students with the top 5 highest marks in the `x` subject.
SELECT *
FROM student
WHERE studentMarks >= (
-- If there are less than 5 students in the subject return them all.
IFNULL (
(
-- Get the fifth highest grade.
SELECT studentMarks
FROM student
WHERE subject = 'x'
ORDER BY studentMarks DESC
LIMIT 4,1
), (
-- Get the lowest grade.
SELECT MIN(studentMarks)
FROM student
WHERE subject = 'x'
)
)
) AND subject = 'x';
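Here is a small sketch of the threshold idea with ties and with a subject that has fewer than five students. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (SQLite also accepts the `LIMIT offset, count` form and IFNULL); the student rows and the helper name top5 are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student "
    "(studentID INTEGER, studentName TEXT, studentMarks INTEGER, subject TEXT)"
)
# Made-up marks: seven PCM students (with ties) and only two ART students.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", 90, "PCM"), (2, "Bob", 85, "PCM"), (3, "Cy", 85, "PCM"),
        (4, "Dee", 80, "PCM"), (5, "Eve", 75, "PCM"), (6, "Fay", 75, "PCM"),
        (7, "Gus", 70, "PCM"), (8, "Hal", 60, "ART"), (9, "Ivy", 55, "ART"),
    ],
)

TOP5_WITH_TIES = """
SELECT studentName, studentMarks
FROM student
WHERE subject = :subject
  AND studentMarks >= IFNULL(
        -- fifth highest mark in the subject (NULL if fewer than 5 rows)
        (SELECT studentMarks FROM student WHERE subject = :subject
         ORDER BY studentMarks DESC LIMIT 4, 1),
        -- fallback: lowest mark, so every student qualifies
        (SELECT MIN(studentMarks) FROM student WHERE subject = :subject)
      )
ORDER BY studentMarks DESC, studentName
"""


def top5(subject):
    """Hypothetical helper: run the threshold query for one subject."""
    return conn.execute(TOP5_WITH_TIES, {"subject": subject}).fetchall()


pcm = top5("PCM")
art = top5("ART")
print(pcm)
print(art)
```

PCM returns six rows because two students tie on the fifth highest mark, while ART returns both of its students through the MIN fallback.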
Check out this SQLFiddle to evaluate the result yourself.
Alternative:
After some research I found an alternative, simpler query that will yield the same result as the one presented above based on the data you have provided without the need of "hardcoding" every subject in its own query.
In the following solution, we define a couple of variables that help us control the data:
one to cache the subject of the previous row and
one to save an incremental value that differentiates the rows having the same subject.
Query:
-- Select the students having the top 5 marks in each subject.
SELECT studentID, studentName, studentMarks, subject FROM
(
-- Use an incremented value to differentiate rows with the same subject.
SELECT *, (@n := if(@s = subject, @n +1, 1)) as n, @s:= subject
FROM student
CROSS JOIN (SELECT @n := 0, @s:= NULL) AS b
) AS a
WHERE n <= 5
ORDER BY subject, studentMarks DESC;
Check out this SQLFiddle to evaluate the result yourself.
Ideas were taken from the following threads:
Get top n records for each group of grouped results
How to SELECT the newest four items per category?
Select X items from every type
Getting the latest n records for each group
The query below produces almost what I wanted; it may help others in the future.
SELECT a.studentId, a.studentName, a.StudentMarks,a.subject FROM testquery AS a WHERE
(SELECT COUNT(*) FROM testquery AS b
WHERE b.subject = a.subject AND b.StudentMarks >= a.StudentMarks) <= 2
ORDER BY a.subject ASC, a.StudentMarks DESC

Mysql latest record for distinct column

I have a table: invoice
inv_id cus_id due_amt paid total_due
1 71 300 0 300
2 71 200 0 500
3 71 NULL 125 375
4 72 50 0 50
5 72 150 0 200
I want the result
cus_id total_due
71 375
72 200
That is, I want the total_due of each unique customer; in other words, I need the latest invoice details of each customer.
What I tried:
SELECT cus_id, total_due FROM invoice GROUP BY cus_id ORDER BY inv_id DESC
But this does not give the required result.
Can someone please help me?
Try this Query :
SELECT `cus_id` as CustId, (SELECT `total_due` FROM invoice WHERE cus_id = CustId ORDER BY `inv_id` DESC LIMIT 1) as total_due FROM invoice GROUP BY cus_id
Create a subquery to get the most recent total_due of each customer:
SELECT cus_id, (select total_due from invoice where inv_id=max(a.inv_id)) as total_due FROM invoice a GROUP BY cus_id ORDER BY inv_id DESC
Try this sample query
SELECT i1.cus_id,i1.total_due FROM invoice as i1
LEFT JOIN invoice AS i2 ON i1.cus_id=i2.cus_id AND i1.inv_id<i2.inv_id
WHERE i2.inv_id IS NULL
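To check this self-join against the invoice rows from the question, here is a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the data is exactly the table posted in the question, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice "
    "(inv_id INTEGER, cus_id INTEGER, due_amt INTEGER, paid INTEGER, total_due INTEGER)"
)
# Rows copied from the question.
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?, ?, ?)",
    [
        (1, 71, 300, 0, 300),
        (2, 71, 200, 0, 500),
        (3, 71, None, 125, 375),
        (4, 72, 50, 0, 50),
        (5, 72, 150, 0, 200),
    ],
)

query = """
SELECT i1.cus_id, i1.total_due
FROM invoice AS i1
LEFT JOIN invoice AS i2
  ON i1.cus_id = i2.cus_id AND i1.inv_id < i2.inv_id  -- any later invoice?
WHERE i2.inv_id IS NULL                               -- keep rows with none
ORDER BY i1.cus_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

A row survives the WHERE clause only when no invoice with a higher inv_id exists for the same customer, which is why each customer's latest invoice is the one returned.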
Just give a row number based on the group of cus_id and in the descending order of inv_id. Then select the rows having row number 1.
Query
select t1.cus_id, t1.total_due from (
select cus_id, total_due, (
case cus_id when @a
then @b := @b + 1
else @b := 1 and @a := cus_id end
) as rn
from your_table_name t,
(select @b := 0, @a := '') r
order by cus_id, inv_id desc
) t1
where t1.rn = 1
order by t1.cus_id;
The query:
SELECT cus_id, total_due FROM invoice GROUP BY cus_id ORDER BY inv_id DESC
is invalid SQL because of the total_due column in the SELECT clause.
A query with GROUP BY is allowed to contain in the SELECT clause:
expressions that are also present in the GROUP BY clause;
expressions that use aggregate functions (aka "GROUP BY" functions);
columns that are functionally dependent on columns that are present in the GROUP BY clause.
The expression total_due is none of the above.
Before version 5.7.5, MySQL used to accept such invalid queries. However, the server was free to return indeterminate values for the invalid expressions. Since version 5.7.5, MySQL rejects such queries by default (other RDBMSes have rejected them for a long time).
Why is such a query invalid?
Because a GROUP BY query does not return rows from the table. It creates the rows it returns. For each row it puts in the result set it uses a group of rows from the table. All rows in the group have the same values for the expressions present in the GROUP BY clause but they may have distinct values in the other expressions that appear in the SELECT clause.
What's the correct solution for this particular question?
I answered this question many times before on StackOverflow. Take a look at this answer, this answer, this answer or this answer and apply to your query what you learn from there.

MySQL grouping with detail

I have a table that looks like this...
user_id, match_id, points_won
1 14 10
1 8 12
1 12 80
2 8 10
3 14 20
3 2 25
I want to write a MYSQL script that pulls back the most points a user has won in a single match and includes the match_id in the results - in other words...
user_id, match_id, max_points_won
1 12 80
2 8 10
3 2 25
Of course if I didn't need the match_id I could just do...
select user_id, max(points_won)
from table
group by user_id
But as soon as I add match_id to the "select" and "group by" I have a row for every match, and if I only add the match_id to the "select" (and not the "group by") then it won't correctly relate to the points_won.
Ideally I don't want to do the following either because it doesn't feel particularly safe (e.g. if the user has won the same amount of points on multiple matches)...
SELECT t.user_id, max(t.points_won) max_points_won
, (select t2.match_id
from table t2
where t2.user_id = t.user_id
and t2.points_won = max_points_won) as 'match_of_points_maximum'
FROM table t
GROUP BY t.user_id
Are there any more elegant options for this problem?
This is harder than it needs to be in MySQL. One method is a bit of a hack but it works in most circumstances. That is the group_concat()/substring_index() trick:
select user_id, max(points_won),
substring_index(group_concat(match_id order by points_won desc), ',', 1)
from table
group by user_id;
The group_concat() concatenates together all the match_ids, ordered by the points descending. The substring_index() then takes the first one.
Two important caveats:
The resulting expression has a type of string, regardless of the internal type.
The group_concat() uses an internal buffer, whose length -- by default -- is 1,024 characters. This default length can be changed.
You can use the query:
select user_id, max(points_won)
from table
group by user_id
as a derived table. Joining this to the original table gets you what you want:
select t1.user_id, t1.match_id, t2.max_points_won
from table as t1
join (
select user_id, max(points_won) as max_points_won
from table
group by user_id
) as t2 on t1.user_id = t2.user_id and t1.points_won = t2.max_points_won
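Here is the derived-table join run against the rows from the question, as a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the table name match_points is hypothetical (the question only calls it "table", which is a reserved word).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
# match_points is a made-up name for the table in the question.
conn.execute(
    "CREATE TABLE match_points (user_id INTEGER, match_id INTEGER, points_won INTEGER)"
)
# Rows copied from the question.
conn.executemany(
    "INSERT INTO match_points VALUES (?, ?, ?)",
    [(1, 14, 10), (1, 8, 12), (1, 12, 80), (2, 8, 10), (3, 14, 20), (3, 2, 25)],
)

query = """
SELECT t1.user_id, t1.match_id, t2.max_points_won
FROM match_points AS t1
JOIN (
  -- best score per user
  SELECT user_id, MAX(points_won) AS max_points_won
  FROM match_points
  GROUP BY user_id
) AS t2
  ON t1.user_id = t2.user_id AND t1.points_won = t2.max_points_won
ORDER BY t1.user_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Note that if a user reached the same maximum in several matches, this join would return one row per such match.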
I think you can optimize your query by adding LIMIT 1 to the inner query.
SELECT t.user_id, max(t.points_won) max_points_won
, (select t2.match_id
from table t2
where t2.user_id = t.user_id
and t2.points_won = max_points_won limit 1) as 'match_of_points_maximum'
FROM table t
GROUP BY t.user_id
EDIT: only for PostgreSQL, SQL Server, Oracle (and MySQL 8 or later)
You could use row_number :
SELECT USER_ID, MATCH_ID, POINTS_WON
FROM
(
SELECT user_id, match_id, points_won, row_number() over (partition by user_id order by points_won desc) rn
from table
) q
where q.rn = 1
For a similar function, have a look at Gordon Linoff's answer or at this article.
In your example, you partition the result set per user and then order each partition by points_won desc, so that the row with the highest points comes first.
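A small sketch of the row_number approach, including a tie, could look like this. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for the databases named above; the table name match_points, the extra row for user 2, and the match_id tie-breaker are my additions for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
# match_points is a made-up name for the table in the question.
conn.execute(
    "CREATE TABLE match_points (user_id INTEGER, match_id INTEGER, points_won INTEGER)"
)
# Rows from the question, plus one hypothetical tie for user 2 (match 9).
conn.executemany(
    "INSERT INTO match_points VALUES (?, ?, ?)",
    [
        (1, 14, 10), (1, 8, 12), (1, 12, 80),
        (2, 8, 10), (2, 9, 10),
        (3, 14, 20), (3, 2, 25),
    ],
)

query = """
SELECT user_id, match_id, points_won
FROM (
  SELECT user_id, match_id, points_won,
         ROW_NUMBER() OVER (
           PARTITION BY user_id
           ORDER BY points_won DESC, match_id  -- match_id breaks ties (assumption)
         ) AS rn
  FROM match_points
) q
WHERE q.rn = 1
ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because row_number assigns distinct numbers even to tied rows, each user gets exactly one row; without the extra ORDER BY column, which of the tied matches is returned would be unspecified.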

MySQL wrong results with GROUP BY and ORDER BY

I have a table user_comission_configuration_history and I need to select the latest commission configurations for a user_id.
Tuples:
I have tried many queries, but the results are wrong. My latest SQL:
SELECT *
FROM(
SELECT * FROM user_comission_configuration_history
ORDER BY on_date DESC
) AS ordered_history
WHERE user_id = 408002
GROUP BY comission_id
The result of above query is:
But, the correct result is:
id user_id comission_id value type on_date
24 408002 12 0,01 PERCENTUAL 2014-07-23 10:45:42
23 408002 4 0,03 CURRENCY 2014-07-23 10:45:41
21 408002 6 0,015 PERCENTUAL 2014-07-23 10:45:18
What is wrong in my SQL?
This is your query:
SELECT *
FROM (SELECT *
FROM user_comission_configuration_history
ORDER BY on_date DESC
) AS ordered_history
WHERE user_id = 408002
GROUP BY comission_id;
One major problem with your query is that it uses a MySQL extension to group by that MySQL explicitly warns against. The extension is the use of other columns in the select that are not in the group by or in aggregation functions. The warning (here) is:
MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. This means
that the preceding query is legal in MySQL. You can use this feature
to get better performance by avoiding unnecessary column sorting and
grouping. However, this is useful primarily when all values in each
nonaggregated column not named in the GROUP BY are the same for each
group. The server is free to choose any value from each group, so
unless they are the same, the values chosen are indeterminate.
So, the values returned in the columns are indeterminate.
Here is a pretty efficient way to get what you want (with "comission" spelled correctly in English):
SELECT *
FROM user_commission_configuration_history cch
WHERE NOT EXISTS (select 1
from user_commission_configuration_history cch2
where cch2.user_id = cch.user_id and
cch2.commission_id = cch.commission_id and
cch2.on_date > cch.on_date
) AND
cch.user_id = 408002;
Here's one way to do what you're trying to do. It gets the max date for each user_ID and commissionID and then joins this back to the base table to limit the results to just the max date for each commissionID.
SELECT *
FROM user_comission_configuration_history A
INNER JOIN (
SELECT User_ID, Comission_Id, max(on_Date) mOn_Date
FROM user_comission_configuration_history
GROUP BY User_ID, Comission_Id
) B
on B.User_ID = A.User_Id
and B.Comission_Id = A.Comission_ID
and B.mOn_Date = A.on_date
WHERE A.user_id = 408002
ORDER BY A.on_Date desc;