Greatest 'n' per group by month - mysql

I have a MySQL table with the date, name and rating of a person. I need to build a query that shows the best person of each month. The query below gives me the maximum rating of the month, but the wrong name/id of the person.
SELECT DATE_FORMAT(date,'%m.%Y') as date2, MAX(rating), name FROM test GROUP BY date2
Here's sqlfiddle with sample table: http://sqlfiddle.com/#!2/4dd54b/9
I read several greatest-n-per-group topics, but those queries didn't work. I suppose it's because of grouping by DATE_FORMAT, so here I ask.

The easiest way is to use the substring_index()/group_concat() trick:
SELECT DATE_FORMAT(date, '%m.%Y') as date2, MAX(rating),
substring_index(group_concat(name order by rating desc), ',', 1) as name
FROM test
GROUP BY date2;

A faster solution might look like this - although removal of the DATE_FORMAT function altogether will speed things up even further...
SELECT x.*
FROM test x
JOIN
( SELECT DATE_FORMAT(date,'%Y-%m') dt
, MAX(rating) max_rating
FROM test
GROUP
BY DATE_FORMAT(date,'%Y-%m')
) y
ON y.dt = DATE_FORMAT(x.date,'%Y-%m')
AND y.max_rating = x.rating;
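To see the join-on-aggregate idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up for illustration, and SQLite's strftime('%Y-%m', ...) is assumed to play the role of MySQL's DATE_FORMAT(date, '%Y-%m').

```python
import sqlite3

# Hypothetical sample rows (date, name, rating); not the asker's data.
rows = [
    ("2014-01-05", "ann", 7),
    ("2014-01-20", "bob", 9),
    ("2014-02-03", "ann", 8),
    ("2014-02-14", "cid", 6),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (date TEXT, name TEXT, rating INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)

# Inner query: best rating per month. Outer join: pull the full row(s)
# that carry that rating in that month.
best_per_month = conn.execute(
    """
    SELECT y.month, x.name, x.rating
    FROM test AS x
    JOIN (
        SELECT strftime('%Y-%m', date) AS month, MAX(rating) AS max_rating
        FROM test
        GROUP BY strftime('%Y-%m', date)
    ) AS y
      ON y.month = strftime('%Y-%m', x.date)
     AND y.max_rating = x.rating
    ORDER BY y.month
    """
).fetchall()

print(best_per_month)
```

Note that if two people tie for the best rating in a month, this approach returns both rows.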

Related

MySQL double averaging with double grouping

I have the following data in a MySQL table called test
I run the following SQL query
SELECT user_id,
group_id,
sum(value) as total_present,
avg(value)*100 as attendance_percentage
FROM test t
WHERE t.session_date BETWEEN '2017-10-01' AND '2017-10-15'
GROUP BY user_id
This gives me percentages for each user_id like this:
If you look at the output example above, user_id 1 and 2 are in the same group_id. Is there a way to group my query further and take the average per group_id? For the example above, the group_id 3 percentage should be 70.83335.

Khalid's answer is OK, but consider that you are averaging percentages computed over different numbers of rows. user_id = 2 has more values than user_id = 1, so his percentage should weigh more.
For example, if user_id = 3 attended only once, with 100% attendance, that would distort the average.
You should do:
SELECT group_id, avg(value)*100 AS group_attendance_percentage
FROM yourTable
GROUP BY group_id
In this case the AVG() is 71.42 instead of 70.83.
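The difference between the two averages can be reproduced with a small sketch. The rows below are hypothetical (the asker's data is not shown); they were picked so that user 2 has more sessions than user 1 and the two results land near the figures quoted above. Python's sqlite3 is used here in place of MySQL.

```python
import sqlite3

# Hypothetical attendance rows: (user_id, group_id, value), value is 1 or 0.
rows = [
    (1, 3, 1), (1, 3, 1), (1, 3, 0),             # user 1: 2 of 3 sessions
    (2, 3, 1), (2, 3, 1), (2, 3, 1), (2, 3, 0),  # user 2: 3 of 4 sessions
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (user_id INTEGER, group_id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)

# Average of per-user percentages: every user counts equally.
avg_of_averages = conn.execute(
    """
    SELECT group_id, AVG(pct)
    FROM (
        SELECT user_id, group_id, AVG(value) * 100 AS pct
        FROM test
        GROUP BY user_id, group_id
    ) AS per_user
    GROUP BY group_id
    """
).fetchone()

# Pooled average: every session counts equally, so users with more
# sessions weigh more.
pooled_average = conn.execute(
    "SELECT group_id, AVG(value) * 100 FROM test GROUP BY group_id"
).fetchone()

print(round(avg_of_averages[1], 2), round(pooled_average[1], 2))
```

Which of the two is "right" depends on whether you want each user or each session to carry equal weight.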

You can, by applying a further aggregation on top of your query:
SELECT t.group_id, avg(t.attendance_percentage) AS group_attendance_percentage
FROM (
SELECT user_id, group_id, sum(value) AS total_present, avg(value)*100 AS attendance_percentage
FROM test
WHERE session_date BETWEEN '2017-10-01' AND '2017-10-15'
GROUP BY user_id, group_id
) t
GROUP BY t.group_id

How to get record with earliest date per group

I need to get the earliest date from the following table based on column 'ItemNo'.
ItemNo PO_number Date
110913 PO-8048 9/15/2015
110913 PO-8036 9/30/2015
110652 PO-1011 10/19/2015
110652 PO-1011 10/10/2015
110009 PO-1016 7/1/2015
110009 PO-1087 6/20/2015
110888 PO-7171 4/1/2015
The query result should look like this.
ItemNo PO_number Date
110913 PO-8048 9/15/2015
110652 PO-1011 10/10/2015
110009 PO-1087 6/20/2015
110888 PO-7171 4/1/2015
Any help would be greatly appreciated.

There are a couple of different ways you could approach this. One reasonable approach would be something like:
with min_rec as
(
select t.ItemNo, t.PO_number, t.Date, row_number() over(partition by t.ItemNo order by t.Date asc) as rn
from your_table t
)
select m.ItemNo, m.PO_number, m.Date
from min_rec m
where m.rn = 1;
Leveraging a CROSS APPLY would be another approach that would work as well, though in this particular case it likely wouldn't be a better performing approach (though as always, it depends):
select distinct c.ItemNo, c.PO_number, c.Date
from your_table t
cross apply (
select top 1 i.ItemNo, i.PO_number, i.Date
from your_table i
where i.ItemNo = t.ItemNo
order by i.Date asc) c;
And naturally, you could simply use a self-joining subquery (I'll skip the example on that one).
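For completeness, one possible shape of the self-joining subquery approach is sketched below. It runs against SQLite through Python's sqlite3 rather than SQL Server, and it assumes the dates from the question are stored in ISO format (YYYY-MM-DD) so that MIN() orders them correctly.

```python
import sqlite3

# Rows from the question, with dates rewritten in ISO format (an assumption).
rows = [
    (110913, "PO-8048", "2015-09-15"),
    (110913, "PO-8036", "2015-09-30"),
    (110652, "PO-1011", "2015-10-19"),
    (110652, "PO-1011", "2015-10-10"),
    (110009, "PO-1016", "2015-07-01"),
    (110009, "PO-1087", "2015-06-20"),
    (110888, "PO-7171", "2015-04-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (ItemNo INTEGER, PO_number TEXT, Date TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

# Subquery: earliest date per item. Join back to fetch the matching row.
earliest = conn.execute(
    """
    SELECT t.ItemNo, t.PO_number, t.Date
    FROM your_table AS t
    JOIN (
        SELECT ItemNo, MIN(Date) AS min_date
        FROM your_table
        GROUP BY ItemNo
    ) AS m
      ON m.ItemNo = t.ItemNo
     AND m.min_date = t.Date
    ORDER BY t.ItemNo
    """
).fetchall()

for row in earliest:
    print(row)
```

Unlike the row_number() version, this returns more than one row for an item if several rows share the same earliest date.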

invalid use of group function error for extracting date with max and min

I'm trying to execute a query like this:
SELECT MAX(counter), MIN(counter) ,
my_date IN (SELECT my_date FROM my_table WHERE counter = MAX(counter) ) AS max_Date ,
my_date IN (SELECT my_date FROM my_table WHERE counter = MIN(counter) ) AS min_Date
FROM my_table;
and it's giving me the "invalid use of group function" error. What I want to do is find the date for the maximum counter and the date for the minimum counter. Any help would be really appreciated, thanks.

You're trying to use the result of aggregate functions (max()/min()) on a row-by-row basis, but those results are not available until the DB has scanned the entire table.
In other words, it's a chicken and egg problem. You need to count chickens, but the eggs that will produce the chickens haven't even been laid yet.
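That's why there are HAVING clauses, which let you filter on the results of aggregate functions once the groups have been formed. Here, though, there is no GROUP BY, so a bare HAVING counter = MIN(counter) collapses the table into a single group and compares against an arbitrary row (and is rejected when ONLY_FULL_GROUP_BY is enabled). Computing the aggregate in a scalar subquery is more reliable.
Try this for the subqueries:
SELECT my_date FROM my_table WHERE counter = (SELECT MIN(counter) FROM my_table)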

You can get the dates where the largest and smallest counter values appear using a trick with group_concat() and substring_index():
SELECT MAX(counter), MIN(counter) ,
substring_index(group_concat(my_date order by counter desc), ',', 1) as max_date,
substring_index(group_concat(my_date order by counter), ',', 1) as min_date
FROM my_table;
Note: You probably want to format the date first to your liking.
You can also do this with a join.
The problem with your query is:
where counter = min(counter)
You can't use aggregation functions in the WHERE clause. Here, both counter and MIN(counter) refer to the table in the subquery. You could possibly do this using aliases, but why bother? There are other ways to write the query.

You need a subselect to get the max and min counters, and then join back against the table twice to get the other values from those rows.
SELECT MaxCounter, MinCounter, a.my_date AS max_date, b.my_date AS min_date
FROM (SELECT MAX(counter) AS MaxCounter, MIN(counter) AS MinCounter
FROM my_table) Sub1
INNER JOIN my_table a ON Sub1.MaxCounter = a.counter
INNER JOIN my_table b ON Sub1.MinCounter = b.counter
Note that this does assume that counter is unique!
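As a quick check of the join-back idea, here is a small sketch run against SQLite via Python's sqlite3. The three rows are hypothetical, and each join condition compares the aggregate with the counter column of the joined row.

```python
import sqlite3

# Hypothetical rows: (my_date, counter), with unique counter values.
rows = [
    ("2020-01-01", 5),
    ("2020-01-02", 12),
    ("2020-01-03", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (my_date TEXT, counter INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

# Compute both aggregates once, then join back twice to find the dates
# of the rows that hold the max and the min counter.
result = conn.execute(
    """
    SELECT s.max_counter, s.min_counter,
           a.my_date AS max_date, b.my_date AS min_date
    FROM (
        SELECT MAX(counter) AS max_counter, MIN(counter) AS min_counter
        FROM my_table
    ) AS s
    JOIN my_table AS a ON a.counter = s.max_counter
    JOIN my_table AS b ON b.counter = s.min_counter
    """
).fetchone()

print(result)
```

If counter were not unique, the joins would produce one row per combination of matching dates.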

ORDER BY date and time BEFORE GROUP BY name in mysql

I have a table like this:
name | date | time
tom | 2011-07-04 | 01:09:52
tom | 2011-07-04 | 01:09:52
mad | 2011-07-04 | 02:10:53
mad | 2009-06-03 | 00:01:01
I want the oldest name first:
SELECT *
ORDER BY date ASC, time ASC
GROUP BY name
(-> doesn't work!)
Now it should give me mad first (it has the earlier date), then tom.
But GROUP BY name ORDER BY date ASC, time ASC gives me the newer mad first, because it groups before it sorts!
Again: the problem is that I can't sort by date and time before I group, because GROUP BY must come before ORDER BY!

Another method:
SELECT *
FROM (
SELECT * FROM table_name
ORDER BY date ASC, time ASC
) AS sub
GROUP BY name
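This relies on GROUP BY returning the first matching row it hits for each group. If that first hit happens to be the one you want, everything works as expected, but MySQL does not guarantee which row is chosen, and the query is rejected when ONLY_FULL_GROUP_BY is enabled.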
I prefer this method as the subquery makes logical sense rather than peppering it with other conditions.

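As I am not allowed to comment on user1908688's answer, here is a hint for MariaDB users (MariaDB ignores an ORDER BY in a FROM subquery unless it has a LIMIT, hence the huge LIMIT below):
SELECT *
FROM (
SELECT * FROM table_name
ORDER BY date ASC, time ASC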
LIMIT 18446744073709551615
) AS sub
GROUP BY sub.name
https://mariadb.com/kb/en/mariadb/why-is-order-by-in-a-from-subquery-ignored/

I think this is what you are seeking :
SELECT name, min(date)
FROM myTable
GROUP BY name
ORDER BY min(date)
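To take the time into account as well, build a MySQL datetime via STR_TO_DATE. Use CONCAT() to join the two columns (in MySQL, + is numeric addition, not string concatenation) and %H for 24-hour times:
STR_TO_DATE(CONCAT(date, ' ', time), '%Y-%m-%d %H:%i:%s')
So:
SELECT name, min(STR_TO_DATE(CONCAT(date, ' ', time), '%Y-%m-%d %H:%i:%s'))
FROM myTable
GROUP BY name
ORDER BY min(STR_TO_DATE(CONCAT(date, ' ', time), '%Y-%m-%d %H:%i:%s'))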

This worked for me:
SELECT *
FROM your_table
WHERE id IN (
SELECT MAX(id)
FROM your_table
GROUP BY name
);

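Use a subselect (TIMESTAMP(date, time) combines the two columns into one datetime; date + time would be numeric addition in MySQL):
select name, date, time
from mytable main
where timestamp(date, time) = (select min(timestamp(date, time)) from mytable where name = main.name)
order by timestamp(date, time);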

If you want the max date for each name, you can use this query:
SELECT name,MAX(date) FROM table group by name ORDER BY name
where date may be a date or a datetime string. It returns the max date value for each name.

Another way to solve this would be with a LEFT JOIN, which could be more efficient. I'll first start with an example that considers only the date field, as probably it is more common to store date + time in one datetime column, and I also want to keep the query simple so it's easier to understand.
So, with this particular example, if you want to show the oldest record based on the date column, and assuming that your table name is called people you can use the following query:
SELECT p.* FROM people p
LEFT JOIN people p2 ON p.name = p2.name AND p.date > p2.date
WHERE p2.date is NULL
GROUP BY p.name
What the LEFT JOIN does, is when the p.date column is at its minimum value, there will be no p2.date with a smaller value on the left join and therefore the corresponding p2.date will be NULL. So, by adding WHERE p2.date is NULL, we make sure to show only the records with the oldest date.
And similarly, if you want to show the newest record instead, you can just change the comparison operator in the LEFT JOIN:
SELECT p.* FROM people p
LEFT JOIN people p2 ON p.name = p2.name AND p.date < p2.date
WHERE p2.date is NULL
GROUP BY p.name
Now, for this particular example where date+time are separate columns, you would need to add them in some way if you want to query based on the datetime of two columns combined, for example:
SELECT p.* FROM people p
LEFT JOIN people p2 ON p.name = p2.name AND p.date + INTERVAL TIME_TO_SEC(p.time) SECOND > p2.date + INTERVAL TIME_TO_SEC(p2.time) SECOND
WHERE p2.date is NULL
GROUP BY p.name
You can read more about this (and see some other ways to accomplish it) on the MySQL manual page The Rows Holding the Group-wise Maximum of a Certain Column.
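Here is a small runnable sketch of the LEFT JOIN approach, using the rows from the question. It runs on SQLite through Python's sqlite3, so two things are assumptions of the sketch rather than part of the answer: the date and time columns are combined with string concatenation (||) instead of INTERVAL arithmetic, and duplicates are collapsed by grouping on all selected columns.

```python
import sqlite3

# Rows from the question (tom appears twice with identical values).
rows = [
    ("tom", "2011-07-04", "01:09:52"),
    ("tom", "2011-07-04", "01:09:52"),
    ("mad", "2011-07-04", "02:10:53"),
    ("mad", "2009-06-03", "00:01:01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, date TEXT, time TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

# p2 matches only rows of the same name that are strictly older than p.
# Where no such row exists, p2.* is NULL, so p is the oldest for that name.
oldest = conn.execute(
    """
    SELECT p.name, p.date, p.time
    FROM people AS p
    LEFT JOIN people AS p2
      ON p.name = p2.name
     AND (p.date || ' ' || p.time) > (p2.date || ' ' || p2.time)
    WHERE p2.name IS NULL
    GROUP BY p.name, p.date, p.time
    ORDER BY p.date, p.time
    """
).fetchall()

print(oldest)
```

Flipping the comparison to < gives the newest row per name instead.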

I had a different variation on this question where I only had a single DATETIME field and needed a limit after a group by or distinct after sorting descending based on the datetime field, but this is what helped me:
select distinct (column) from
(select column from database.table
order by date_column DESC) as hist limit 10
In this instance with the split fields, if you can sort on a concat, then you might be able to get away with something like:
select name,date,time from
(select name,date,time from table order by concat(date,' ',time) ASC)
as sorted
Then, if you wanted to limit the result, you would simply add your limit clause to the end:
select name,date,time from
(select name,date,time from table order by concat(date,' ',time) ASC)
as sorted limit 10

In Oracle, this works for me:
SELECT name, min(date), min(time)
FROM table_name
GROUP BY name

This works for me in MySQL:
select * from (SELECT number,max(date_added) as datea FROM sms_chat group by number) as sup order by datea desc

This is not an exact answer, but it might help people who are trying to solve a problem by ordering rows before a GROUP BY in MySQL.
I came to this thread when I wanted to find the latest row per group (that is, order by date descending, but get only one result for each value of a particular column, which is what grouping by that column does).
Another approach to this kind of problem is to use aggregation.
We can let the query run as usual, sorted ascending, and add a new field such as max(doc) as latest_doc, which gives the latest date for each group of the same column.
Now suppose you want the value of some other column, for which a max aggregation does not make sense.
In general, to get the values of such a column, you can use the GROUP_CONCAT aggregate with a unique separator that cannot occur in that column, like GROUP_CONCAT(string SEPARATOR ' ') as new_column, and split/explode the new_column field when you read it.
Again, this might not suit everyone. I did it, and liked it as well, because I had already written a few functions and couldn't run subqueries. I am working with the CodeIgniter framework for PHP.
I'm not sure about the complexity either; maybe someone can shed some light on that.
Regards :)

MySQL Query GROUP BY day / month / year

Is it possible to make a simple query to count how many records I have in a determined period of time like a year, month, or day, having a TIMESTAMP field, like:
SELECT COUNT(id)
FROM stats
WHERE record_date.YEAR = 2009
GROUP BY record_date.YEAR
Or even:
SELECT COUNT(id)
FROM stats
GROUP BY record_date.YEAR, record_date.MONTH
To have a monthly statistic.
Thanks!

GROUP BY YEAR(record_date), MONTH(record_date)
Check out the date and time functions in MySQL.

GROUP BY DATE_FORMAT(record_date, '%Y%m')
Note (primarily to potential downvoters): at present, this may not be as efficient as the other suggestions. Still, I leave it as an alternative, and as one that can serve as a baseline for seeing how much faster the other solutions are. (You can't really tell fast from slow until you see the difference.) Also, as time goes on, MySQL's optimiser could change so that this solution becomes comparable in efficiency with most others.

try this one
SELECT COUNT(id)
FROM stats
GROUP BY EXTRACT(YEAR_MONTH FROM record_date)
The EXTRACT(unit FROM date) function is better here, as it groups on a single expression and returns a numeric value.
Comparing numbers when grouping is faster than comparing the strings returned by DATE_FORMAT. In general, prefer functions or fields that return non-string values in SQL comparison contexts (WHERE, HAVING, ORDER BY, GROUP BY).

I tried using the WHERE clause from the question; I thought it was correct since nobody had corrected it, but I was wrong. After some searching I found the right form for the WHERE clause, so the code becomes:
SELECT COUNT(id)
FROM stats
WHERE YEAR(record_date) = 2009
GROUP BY MONTH(record_date)

If your search is over several years, and you still want to group monthly, I suggest:
version #1:
SELECT SQL_NO_CACHE YEAR(record_date), MONTH(record_date), COUNT(*)
FROM stats
GROUP BY DATE_FORMAT(record_date, '%Y%m')
version #2 (more efficient):
SELECT SQL_NO_CACHE YEAR(record_date), MONTH(record_date), COUNT(*)
FROM stats
GROUP BY YEAR(record_date)*100 + MONTH(record_date)
I compared these versions on a big table with 1,357,918 rows (innodb),
and the 2nd version appears to have better results.
version1 (average of 10 executes): 1.404 seconds
version2 (average of 10 executes): 0.780 seconds
(The SQL_NO_CACHE keyword was added to prevent MySQL from caching the queries.)

If you want to filter records for a particular year (e.g. 2000) then optimize the WHERE clause like this:
SELECT MONTH(date_column), COUNT(*)
FROM date_table
WHERE date_column >= '2000-01-01' AND date_column < '2001-01-01'
GROUP BY MONTH(date_column)
-- average 0.016 sec.
Instead of:
WHERE YEAR(date_column) = 2000
-- average 0.132 sec.
The results were generated against a table containing 300k rows and index on date column.
As for the GROUP BY clause, I tested the three variants against the above mentioned table; here are the results:
SELECT YEAR(date_column), MONTH(date_column), COUNT(*)
FROM date_table
GROUP BY YEAR(date_column), MONTH(date_column)
-- codelogic
-- average 0.250 sec.
SELECT YEAR(date_column), MONTH(date_column), COUNT(*)
FROM date_table
GROUP BY DATE_FORMAT(date_column, '%Y%m')
-- Andriy M
-- average 0.468 sec.
SELECT YEAR(date_column), MONTH(date_column), COUNT(*)
FROM date_table
GROUP BY EXTRACT(YEAR_MONTH FROM date_column)
-- fu-chi
-- average 0.203 sec.
The last one is the winner.
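The shape of the range-filtered monthly count can be tried out with the sketch below. It says nothing about the timings above; it only shows that the half-open range keeps exactly the rows of one year. It uses SQLite through Python's sqlite3 with made-up rows, and strftime('%m', ...) is assumed as the stand-in for MySQL's MONTH().

```python
import sqlite3

# Hypothetical timestamps around the boundaries of the year 2000.
rows = [
    ("1999-12-31 23:59:59",),
    ("2000-01-01 00:00:00",),
    ("2000-01-15 10:00:00",),
    ("2000-03-02 08:30:00",),
    ("2001-01-01 00:00:00",),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE date_table (date_column TEXT)")
conn.execute("CREATE INDEX idx_date ON date_table (date_column)")
conn.executemany("INSERT INTO date_table VALUES (?)", rows)

# The WHERE clause compares the bare column against constants, so an
# index on date_column can be used; the function only appears in GROUP BY.
monthly_counts = conn.execute(
    """
    SELECT CAST(strftime('%m', date_column) AS INTEGER) AS month, COUNT(*)
    FROM date_table
    WHERE date_column >= '2000-01-01' AND date_column < '2001-01-01'
    GROUP BY month
    ORDER BY month
    """
).fetchall()

print(monthly_counts)
```

The lower bound is inclusive and the upper bound exclusive, so midnight on 2001-01-01 falls outside the range.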

You can do this simply with MySQL's DATE_FORMAT() function in the GROUP BY. You may want to add extra columns for clarity in some cases, for example where records span several years and the same month occurs in different years. There are many options for customising this. Here is a sample query:
SELECT
COUNT(id),
DATE_FORMAT(record_date, '%Y-%m-%d') AS DAY,
DATE_FORMAT(record_date, '%Y-%m') AS MONTH,
DATE_FORMAT(record_date, '%Y') AS YEAR
FROM
stats
WHERE
YEAR(record_date) = 2009
GROUP BY
DATE_FORMAT(record_date, '%Y-%m-%d');

If you want to group by day of the month in MySQL (note that this merges the same day number across different months), use the code below:
SELECT COUNT(id)
FROM stats
GROUP BY DAYOFMONTH(record_date)
Hope this saves some time for the ones who are going to find this thread.

A complete and simple solution; the active GROUP BY is a similarly performing but shorter and more flexible alternative to the commented-out one:
SELECT COUNT(*) FROM stats
-- GROUP BY YEAR(record_date), MONTH(record_date), DAYOFMONTH(record_date)
GROUP BY DATE_FORMAT(record_date, '%Y-%m-%d')

If you want monthly statistics, with row counts per month of each year ordered by latest month first, then try this:
SELECT count(id),
YEAR(record_date),
MONTH(record_date)
FROM `table`
GROUP BY YEAR(record_date),
MONTH(record_date)
ORDER BY YEAR(record_date) DESC,
MONTH(record_date) DESC

The following query worked for me in Oracle Database 12c Release 12.1.0.1.0
SELECT COUNT(*)
FROM stats
GROUP BY
extract(MONTH FROM TIMESTAMP),
extract(YEAR FROM TIMESTAMP);

I prefer to optimize the one year group selection like so:
SELECT COUNT(*)
FROM stats
WHERE record_date >= :year
AND record_date < :year + INTERVAL 1 YEAR;
This way you can just bind the year in once, e.g. '2009', with a named parameter and don't need to worry about adding '-01-01' or passing in '2010' separately.
Also, as presumably we are just counting rows and id is never NULL, I prefer COUNT(*) to COUNT(id).

try it
GROUP BY YEAR(record_date), MONTH(record_date)

I wanted to get similar data per day, after experimenting a bit, this is the fastest I could find for my scenario
SELECT COUNT(id)
FROM stats
GROUP BY record_date DIV 1000000;
If you want it per month, add two more zeroes (DIV 100000000).
I would not recommend this from a "make the code readable" perspective, and it might also break in different versions. But in our case this took less than half the time compared to some other, clearer queries that I tested.
This is a MySQL answer (as MySQL is tagged in the question) and well documented in the manual https://dev.mysql.com/doc/refman/8.0/en/date-and-time-type-conversion.html

.... group by to_char(date, 'YYYY') --> 1989
.... group by to_char(date, 'MM') --> 05
.... group by to_char(date, 'DD') --> 23
.... group by to_char(date, 'MON') --> MAY
.... group by to_char(date, 'YY') --> 89

Here's one more approach. This uses [MySQL's LAST_DAY() function][1] to map each timestamp to its month. It also is capable of filtering by year with an efficient range-scan if there's an index on record_date.
SELECT LAST_DAY(record_date) month_ending, COUNT(*) record_count
FROM stats
WHERE record_date >= '2000-01-01'
AND record_date < '2000-01-01' + INTERVAL 1 YEAR
GROUP BY LAST_DAY(record_date)
If you want your results by day, use DATE(record_date) instead.
If you want your results by calendar quarter, use YEAR(record_date), QUARTER(record_date).
Here's a writeup. https://www.plumislandmedia.net/mysql/sql-reporting-time-intervals/
[1]: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_last-day

Or you can use a GROUP BY clause like this (FORMAT() with these pattern strings is SQL Server syntax; in MySQL, use DATE_FORMAT() instead):
-- to get data by month and year:
SELECT FORMAT(TIMESTAMP_COLUMN, 'MMMM yy') AS Month, COUNT(ID) FROM TABLE_NAME GROUP BY FORMAT(TIMESTAMP_COLUMN, 'MMMM yy')
If you want to fetch records by date, change the format in both the SELECT and the GROUP BY to 'dd-MM-yy' or 'dd-MMMM-yyyy'.
