Grouping MySQL datetime into intervals irrespective of timezone - mysql

This question has been asked before but I am facing a slightly different problem.
I have a table which logs events and stores their timestamps (as datetime). I need to be able to break time up into chunks and get the number of events that occurred in each interval. The interval can be custom (say, from 5 minutes to 1 hour and even beyond).
The obvious solution is to convert the datetime to a Unix timestamp with unix_timestamp(), divide it by the number of seconds in the interval, take the floor, and multiply it back by the number of seconds. Finally, convert the result back to datetime format with from_unixtime().
This works fine for small intervals.
select
from_unixtime(floor(unix_timestamp(event.timestamp)/300)*300) as start_time,
count(*) as total
from event
where timestamp>='2012-08-03 00:00:00'
group by start_time;
This gives the correct output
+---------------------+-------+
| start_time | total |
+---------------------+-------+
| 2012-08-03 00:00:00 | 11 |
| 2012-08-03 00:05:00 | 4 |
| 2012-08-03 00:10:00 | 4 |
| 2012-08-03 00:15:00 | 7 |
| 2012-08-03 00:20:00 | 8 |
| 2012-08-03 00:25:00 | 1 |
| 2012-08-03 00:30:00 | 1 |
| 2012-08-03 00:35:00 | 3 |
| 2012-08-03 00:40:00 | 3 |
| 2012-08-03 00:45:00 | 5 |
~~~~~OUTPUT SNIPPED~~~~~~~~~~~~
But if I increase the interval to, say, 1 hour (3600 sec), the boundaries are off:
mysql> select from_unixtime(floor(unix_timestamp(event.timestamp)/3600)*3600) as start_time, count(*) as total from event where timestamp>='2012-08-03 00:00:00' group by start_time;
+---------------------+-------+
| start_time | total |
+---------------------+-------+
| 2012-08-02 23:30:00 | 35 |
| 2012-08-03 00:30:00 | 30 |
| 2012-08-03 01:30:00 | 12 |
| 2012-08-03 02:30:00 | 18 |
| 2012-08-03 03:30:00 | 12 |
| 2012-08-03 04:30:00 | 4 |
| 2012-08-03 05:30:00 | 3 |
| 2012-08-03 06:30:00 | 13 |
| 2012-08-03 07:30:00 | 269 |
| 2012-08-03 08:30:00 | 681 |
| 2012-08-03 09:30:00 | 1523 |
| 2012-08-03 10:30:00 | 911 |
+---------------------+-------+
As far as I can tell, the boundaries are wrong because unix_timestamp() interprets the datetime in my local timezone (GMT+05:30) and converts it to UTC before returning the numeric value.
So a value like 2012-08-03 00:00:00 is actually treated as 2012-08-02 18:30:00 UTC. Dividing and flooring aligns the result to the top of an hour in UTC, but from_unixtime() then converts it back to GMT+05:30, so the intervals begin at 30 minutes past the hour. The 5-minute case only works because the 5:30 offset is an exact multiple of 5 minutes.
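The effect described above can be reproduced outside MySQL. The following is a minimal Python sketch, assuming a fixed UTC+05:30 offset; the helper name `floor_epoch_bucket` is hypothetical and only mimics the unix_timestamp()/from_unixtime() round trip.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the session time zone is a fixed UTC+05:30 offset (no DST).
IST = timezone(timedelta(hours=5, minutes=30))

def floor_epoch_bucket(local_dt, interval_seconds):
    """Mimic from_unixtime(floor(unix_timestamp(x) / n) * n) for a fixed-offset zone."""
    # unix_timestamp(): interpret the naive value as local time, get UTC epoch seconds
    epoch = int(local_dt.replace(tzinfo=IST).timestamp())
    # Flooring aligns the boundary to multiples of the interval in UTC
    floored = (epoch // interval_seconds) * interval_seconds
    # from_unixtime(): convert the floored epoch back to local time
    return datetime.fromtimestamp(floored, tz=IST).replace(tzinfo=None)

event = datetime(2012, 8, 3, 0, 10, 0)
five_min = floor_epoch_bucket(event, 300)
one_hour = floor_epoch_bucket(event, 3600)
print(five_min)  # 2012-08-03 00:10:00
print(one_hour)  # 2012-08-02 23:30:00
```

The 5-minute bucket lands on a clean local boundary, while the 1-hour bucket starts at 23:30, matching the first row of the hourly output above.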
How do I ensure the query works correctly irrespective of the timezone? I use MySQL 5.1.52, so to_seconds() is not available.
EDIT:
The query should also fire correctly irrespective of the interval (can be hours, minutes, days). A generic solution would be appreciated

You can use TIMESTAMPDIFF to group by intervals of time:
For a specified interval of hours, you can use:
SELECT '2012-08-03 00:00:00' +
INTERVAL FLOOR(TIMESTAMPDIFF(HOUR, '2012-08-03 00:00:00', timestamp) / <n>) * <n> HOUR AS start_time,
COUNT(*) AS total
FROM event
WHERE timestamp >= '2012-08-03 00:00:00'
GROUP BY start_time
Replace the occurrences of 2012-08-03 00:00:00 with your minimum input date.
<n> is your specified interval in hours (every 2 hours, 3 hours, etc.), and you can do the same for minutes:
SELECT '2012-08-03 00:00:00' +
INTERVAL FLOOR(TIMESTAMPDIFF(MINUTE, '2012-08-03 00:00:00', timestamp) / <n>) * <n> MINUTE AS start_time,
COUNT(*) AS total
FROM event
WHERE timestamp >= '2012-08-03 00:00:00'
GROUP BY start_time
Where <n> is your specified interval in minutes (every 45 minutes, 90 minutes, etc).
Be sure you're passing in your minimum input date (in this example 2012-08-03 00:00:00) as the second parameter to TIMESTAMPDIFF.
EDIT: If you don't want to worry about which interval unit to pick in the TIMESTAMPDIFF function, then of course just do the interval by seconds (300 = 5 minutes, 3600 = 1 hour, 7200 = 2 hours, etc.)
SELECT '2012-08-03 00:00:00' +
INTERVAL FLOOR(TIMESTAMPDIFF(SECOND, '2012-08-03 00:00:00', timestamp) / <n>) * <n> SECOND AS start_time,
COUNT(*) AS total
FROM event
WHERE timestamp >= '2012-08-03 00:00:00'
GROUP BY start_time
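The arithmetic behind the TIMESTAMPDIFF approach can be sketched in Python. This is an illustration only; `bucket_start` is a hypothetical helper, and it assumes every timestamp is at or after the base date, as the WHERE clause guarantees.

```python
from datetime import datetime, timedelta

def bucket_start(ts, base, interval_seconds):
    """Start of the bucket containing ts, counted from base in fixed-size steps."""
    # Equivalent of TIMESTAMPDIFF(SECOND, base, ts); no time zone conversion is involved
    elapsed = int((ts - base).total_seconds())
    # Equivalent of base + INTERVAL FLOOR(elapsed / n) * n SECOND
    return base + timedelta(seconds=(elapsed // interval_seconds) * interval_seconds)

base = datetime(2012, 8, 3, 0, 0, 0)
hourly = bucket_start(datetime(2012, 8, 3, 0, 47, 12), base, 3600)
ninety_min = bucket_start(datetime(2012, 8, 3, 10, 15, 0), base, 5400)
print(hourly)      # 2012-08-03 00:00:00
print(ninety_min)  # 2012-08-03 09:00:00
```

Because the difference is taken between two datetimes in the same zone, the session time zone never enters the calculation, and the buckets are anchored at the base date for any interval length.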
EDIT2: To address your comment pertaining to reducing the number of areas in the statement where you have to pass in your minimum parameter date, you can use:
SELECT b.mindate +
INTERVAL FLOOR(TIMESTAMPDIFF(SECOND, b.mindate, timestamp) / <n>) * <n> SECOND AS start_time,
COUNT(*) AS total
FROM event
JOIN (SELECT '2012-08-03 00:00:00' AS mindate) b ON timestamp >= b.mindate
GROUP BY start_time
And simply pass in your minimum datetime parameter once into the join subselect.
You can even make a second column in the join subselect for your seconds interval (e.g. 3600) and name the column something like secinterval... then change the <n>'s to b.secinterval, so you only have to pass in your minimum date parameter AND interval one time each.

the easier method would be:
Method1
select date(timestamp) as date_timestamp, hour(timestamp) as hour_timestamp, count(*) as total
from event
where timestamp>='2012-08-03 00:00:00'
group by date_timestamp, hour_timestamp
If you would like to keep your original approach:
Method2
select from_unixtime(floor((unix_timestamp(event.timestamp)-1800)/3600)*3600+1800) as start_time,
count(*) as total
from event
where timestamp>='2012-08-03 00:00:00'
group by start_time;
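Method2 hard-codes the half-hour part of a UTC+05:30 offset. A more general form of the same idea, shown here as a hedged Python sketch, shifts the epoch by the full UTC offset before flooring and shifts it back afterwards. Both function names are hypothetical, and the sketch assumes a fixed offset with no daylight saving changes.

```python
def local_bucket_epoch(epoch, interval_seconds, utc_offset_seconds):
    """Floor an epoch to a bucket boundary aligned to local time, not UTC."""
    shifted = epoch + utc_offset_seconds          # move onto the local clock
    floored = (shifted // interval_seconds) * interval_seconds
    return floored - utc_offset_seconds           # move back to a real epoch

def method2_epoch(epoch):
    """Method2's arithmetic, with the 1800-second shift applied to the Unix timestamp."""
    return ((epoch - 1800) // 3600) * 3600 + 1800

OFFSET = 5 * 3600 + 1800  # UTC+05:30, an assumption taken from the question

# For hourly buckets the two forms agree for every epoch in the sample range.
sample = range(1343932200, 1343932200 + 2 * 86400, 137)
agree = all(local_bucket_epoch(e, 3600, OFFSET) == method2_epoch(e) for e in sample)
print(agree)  # True
```

The general form also works for intervals such as 2 hours or 1 day, where the half-hour shortcut alone would not align the buckets to local midnight.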
EDIT1
The first method also lets the user choose a different interval, as long as it divides an hour evenly.
For example, to group the log into 15-minute buckets:
select date(timestamp) as date_timestamp,
hour(timestamp) as hour_timestamp,
floor(minute(timestamp) / 15) * 15 as minute_timestamp,
count(*) as total
from event
group by date_timestamp, hour_timestamp, minute_timestamp

Related

Add seconds since Time '08:05' if timestamp is older

I have this working SQL code (MySQL) that returns the number of seconds during which MState had a given value, e.g. 4:
SELECT SUM(Seconds_In_State) From
(SELECT
Time_Stamp,
MState,
(-1 * TIMESTAMPDIFF(SECOND, LEAD(Time_Stamp) OVER(ORDER BY Time_Stamp), Time_Stamp))
AS Seconds_In_State
FROM Mstate
WHERE DATE(`Time_Stamp`) = CURDATE()
AND TIME(`Time_Stamp`) >= '08:05' /*ShiftStart input tag*/
ORDER BY Time_Stamp) AS T
Where MState = 4;
The result does not include the 300 seconds from 08:05 to 08:10 during which MState was 4, because I only look at timestamps >= 08:05 and that state began at 08:00.
The query finds, for each timestamp from 08:05 onwards where MState is 4, the number of seconds until the next timestamp, and sums them. A row with MState 4 and timestamp 08:00 is excluded entirely.
So if the last MState before 08:05 has value 4, I would like the seconds from 08:05 to the next timestamp (08:10 here) to be added to Seconds_In_State.
Data
|Time_Stamp | MState |
|------------------------|------- |
|2021-04-23 07:50:00 | 3 |
|2021-04-23 08:00:00 | 4 |
|2021-04-23 08:10:00 | 1 |
|2021-04-23 08:22:00 | 2 |
|2021-04-23 08:30:00 | 3 |
|2021-04-23 08:40:00 | 4 |
|2021-04-23 08:50:00 | 1 |
|2021-04-23 09:01:00 | 2 |
|2021-04-23 09:10:00 | 3 |
Result from current code:
|SUM(Seconds_In_State) |
|600 |
Result I would like to get:
|SUM(Seconds_In_State) |
|900 |
You would seem to want a window function and a comparison to enforce the timeframe. Compute the next timestamp with lead() before filtering on the shift start, then clamp the start of each interval to 08:05:
SELECT SUM(TIMESTAMPDIFF(SECOND,
GREATEST(Time_Stamp, addtime(cast(curdate() as datetime), '08:05')),
Next_Time_Stamp
)) AS Seconds_In_State
FROM (SELECT s.*,
LEAD(Time_Stamp) OVER (ORDER BY Time_Stamp) AS Next_Time_Stamp
FROM Mstate s
WHERE DATE(Time_Stamp) = CURDATE()
) s
WHERE MState = 4 AND
Next_Time_Stamp > addtime(cast(curdate() as datetime), '08:05') /*ShiftStart input tag*/
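One way to check the intended arithmetic against the sample data is a small Python sketch. It pairs each row with the next one, clamps the interval start to the shift start, and sums the seconds for the requested state. The function name `seconds_in_state` is hypothetical, and the rows are the sample data from the question.

```python
from datetime import datetime

# (Time_Stamp, MState) rows from the question, already ordered by time
rows = [
    (datetime(2021, 4, 23, 7, 50), 3),
    (datetime(2021, 4, 23, 8, 0), 4),
    (datetime(2021, 4, 23, 8, 10), 1),
    (datetime(2021, 4, 23, 8, 22), 2),
    (datetime(2021, 4, 23, 8, 30), 3),
    (datetime(2021, 4, 23, 8, 40), 4),
    (datetime(2021, 4, 23, 8, 50), 1),
    (datetime(2021, 4, 23, 9, 1), 2),
    (datetime(2021, 4, 23, 9, 10), 3),
]

def seconds_in_state(rows, state, shift_start):
    """Sum the seconds spent in `state`, counting only time at or after shift_start."""
    total = 0
    # Pair each row with its successor; the last row has no known end and is skipped
    for (ts, mstate), (next_ts, _) in zip(rows, rows[1:]):
        if mstate != state or next_ts <= shift_start:
            continue
        start = max(ts, shift_start)  # clamp an interval that began before the shift
        total += int((next_ts - start).total_seconds())
    return total

total = seconds_in_state(rows, 4, datetime(2021, 4, 23, 8, 5))
print(total)  # 900
```

The 08:00 row contributes 300 seconds (08:05 to 08:10) and the 08:40 row contributes 600 seconds, which gives the 900 the question asks for.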

How to select count today and tomorrow data less than specific time group by day?

I have a table like the one below.
I want to select a count grouped by day.
But each day's data should run from 7:00:00 until 6:59:59 the next day (24 hours).
For example
Day 1 data between '2019/06/01 7:00:00' and '2019/06/02 06:59:59'
Day 2 data between '2019/06/02 7:00:00' and '2019/06/03 06:59:59'
How can I code the where condition?
id | create_date | judge |
-----+---------------------+---------+
1 | 2019-06-02 8:00:00 | ok |
2 | 2019-06-02 9:00:00 | ok |
3 | 2019-06-02 10:00:00 | ok |
4 | 2019-06-02 11:00:00 | ok |
5 | 2019-06-02 15:00:00 | ok |
6 | 2019-06-03 4:00:00 | ok |
7 | 2019-06-03 5:00:00 | ok |
8 | 2019-06-03 8:00:00 | ok |
9 | 2019-06-03 9:00:00 | ok |
10 | 2019-06-03 9:00:00 | fail |
I've tried below but the result is not as expected.
SELECT COUNT(*),DAY(create_date)
FROM mytable
WHERE judge = 'ok' and MONTH(create_date) = '6' and YEAR(create_date) = '2019' and TIME(create_date) > '07:00:00'
Group by DAY(create_date) order by DAY(create_date) ASC
Expected results
COUNT(*) | DAY(create_date) |
-----------+---------------------+
7 | 2 | (from id 1 to 7)
2 | 3 | (from id 8 and 9)
You could subtract seven hours from each date, truncate them to show the date only and then group them:
SELECT DATE(DATE_SUB(create_date, INTERVAL 7 HOUR)), COUNT(*)
FROM mytable
-- Where clause if you need it...
GROUP BY DATE(DATE_SUB(create_date, INTERVAL 7 HOUR))
Just subtract 7 hours for the aggregation and the date/time comparisons:
SELECT DATE(create_date - interval 7 hour) as dte, COUNT(*)
FROM mytable
WHERE judge = 'ok' and
create_date >= '2019-06-01 07:00:00' AND
create_date < '2019-07-01 07:00:00'
GROUP BY DATE(create_date - interval 7 hour)
ORDER BY dte;
Try this:
SELECT
CAST(DATE_SUB(create_date, INTERVAL 7 HOUR) AS DATE),
COUNT(*)
FROM YOUR_TABLE
GROUP BY CAST(DATE_SUB(create_date, INTERVAL 7 HOUR) AS DATE)
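The shift-then-truncate idea in these answers can be checked against the sample rows with a short Python sketch. The function name `count_by_shifted_day` is hypothetical; the rows are the ones from the question.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# (id, create_date, judge) rows from the question
rows = [
    (1, datetime(2019, 6, 2, 8), "ok"),
    (2, datetime(2019, 6, 2, 9), "ok"),
    (3, datetime(2019, 6, 2, 10), "ok"),
    (4, datetime(2019, 6, 2, 11), "ok"),
    (5, datetime(2019, 6, 2, 15), "ok"),
    (6, datetime(2019, 6, 3, 4), "ok"),
    (7, datetime(2019, 6, 3, 5), "ok"),
    (8, datetime(2019, 6, 3, 8), "ok"),
    (9, datetime(2019, 6, 3, 9), "ok"),
    (10, datetime(2019, 6, 3, 9), "fail"),
]

def count_by_shifted_day(rows, day_start_hour=7):
    """Count 'ok' rows per business day, where a day starts at day_start_hour."""
    counts = Counter()
    for _id, created, judge in rows:
        if judge == "ok":
            # Subtracting the start hour maps 07:00..06:59 next day onto one calendar date
            counts[(created - timedelta(hours=day_start_hour)).date()] += 1
    return dict(counts)

counts = count_by_shifted_day(rows)
print(counts[date(2019, 6, 2)], counts[date(2019, 6, 3)])  # 7 2
```

Rows 6 and 7 (04:00 and 05:00 on June 3) fall back onto June 2 after the shift, which reproduces the expected counts of 7 and 2.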

MySQL - Select rows where 1 hour before datetime column

I have these rows
id | start_time |
1 | 2018-06-15 02:00:00 |
2 | 2018-06-15 02:45:00 |
3 | 2018-06-15 03:45:00 |
I want to select rows when the current time is 1 hour before the start_time. So if the time is 2018-06-15 01:00:00 then the first row should be returned.
How do I do this? I've tried the query below, but I don't know how to subtract 1 hour from start_time.
SELECT *
FROM table1
WHERE DATE_FORMAT(start_time, '%Y-%m-%d %H') <= DATE_FORMAT(NOW(), '%Y-%m-%d %H');
To subtract hours, use the DATE_SUB function.
In your case:
SELECT *
FROM table1
WHERE DATE_SUB(start_time, INTERVAL 1 HOUR) <= NOW();
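The comparison can be illustrated in Python with the sample rows. This is a sketch under one reading of the question, namely that a row qualifies once the current time has reached start_time minus one hour; `rows_due` is a hypothetical name.

```python
from datetime import datetime, timedelta

# (id, start_time) rows from the question
rows = [
    (1, datetime(2018, 6, 15, 2, 0)),
    (2, datetime(2018, 6, 15, 2, 45)),
    (3, datetime(2018, 6, 15, 3, 45)),
]

def rows_due(rows, now, lead=timedelta(hours=1)):
    """Return ids whose start_time minus the lead time is at or before `now`."""
    return [row_id for row_id, start in rows if start - lead <= now]

at_one = rows_due(rows, datetime(2018, 6, 15, 1, 0))
print(at_one)  # [1]
```

At 01:00 only the first row qualifies, as the question expects. Rows whose start_time has already passed would also match, so an upper bound on start_time may be needed depending on the use case.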

MySQL: return order amounts per hour with one column per day of month

I have a mysql db which I use to return amounts of orders by hour in a specific day. I use this SELECT statement for that.
select
hour(datains),sum(valore)
from
ordini
where (stato=10 or stato = 1 ) and DATE(datains) = DATE_SUB(CONCAT(CURDATE(), ' 00:00:00'), INTERVAL 0 DAY)
group by hour(datains)
order by
id DESC
It returns:
+--------------+---------------+
| hour datains | valore |
| 12 | 34 |
| 11 | 56 |
| 10 | 134 |
+-------------------------------
Now I need to have columns for a certain number of days, like this.
+--------------+---------------+--------------+--------------+
| hour datains | 01-01-2014 | 02-01-2014 | 03-01-2014 |
| 12 | 34 | 34 | 77 |
| 11 | 56 | 0 | 128 |
| 10 | 134 | 66 | 12 |
+------------------------------+-----------------------------+
Is this possible?
It seems you have a table ordini with columns datains, valore, and stato.
Perhaps you can try this query to generate hour-by-hour aggregates for three days' worth of recent sales, not including today.
SELECT DATE_FORMAT(datains, '%Y-%m-%d %H:00') AS hour,
SUM(valore) AS valore
FROM ordini
WHERE (stato = 1 OR stato = 10)
AND datains >= CURRENT_DATE() - INTERVAL 3 DAY
AND datains < CURRENT_DATE
GROUP BY DATE_FORMAT(datains, '%Y-%m-%d %H:00')
ORDER BY DATE_FORMAT(datains, '%Y-%m-%d %H:00')
This will give you a result set with one row for each hour of the three days, for example:
2014-01-01 10:00 456
2014-01-01 11:00 123
2014-01-02 10:00 28
2014-01-02 11:00 350
2014-01-02 12:00 100
2014-01-02 13:00 17
2014-01-03 10:00 321
2014-01-03 11:00 432
2014-01-03 12:00 88
2014-01-03 13:00 12
That's the data summary you have requested, but formatted row-by-row. Your next step is to figure out an appropriate technique to pivot that result set, formatting it so some rows become columns.
It happens that I have just written a post on this very topic. It is here:
http://www.plumislandmedia.net/mysql/sql-reporting-time-intervals/
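If the pivot is done in application code rather than in SQL, it could look roughly like the following Python sketch. This is not the technique from the linked post; `pivot_by_hour` is a hypothetical helper, and the input is the row-by-row example result shown above.

```python
from collections import defaultdict

# (hour label, valore) rows, as produced by the hourly aggregate query
rows = [
    ("2014-01-01 10:00", 456), ("2014-01-01 11:00", 123),
    ("2014-01-02 10:00", 28), ("2014-01-02 11:00", 350),
    ("2014-01-02 12:00", 100), ("2014-01-02 13:00", 17),
    ("2014-01-03 10:00", 321), ("2014-01-03 11:00", 432),
    ("2014-01-03 12:00", 88), ("2014-01-03 13:00", 12),
]

def pivot_by_hour(rows):
    """Turn (day-hour, total) rows into one entry per hour with one column per day."""
    days = sorted({label[:10] for label, _ in rows})
    # Every hour starts with a zero for each day, so missing hours show as 0
    table = defaultdict(lambda: dict.fromkeys(days, 0))
    for label, total in rows:
        day, hour = label[:10], int(label[11:13])
        table[hour][day] += total
    return days, dict(table)

days, table = pivot_by_hour(rows)
for hour in sorted(table, reverse=True):
    print(hour, [table[hour][day] for day in days])
```

Hours with no orders on a given day come out as 0, which matches the layout requested in the question.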

Timezone issue on Simple temperature trend using average for last hour in mysql

I would like to use Highcharts to show how the temperature behaved over the last hour.
I record 20 to 30 temperature values each hour.
For the last hour, I would like to extract 4 to 6 average values (one per 10- or 15-minute period) and plot them. I may change that to 3 values (one per 20 minutes) to get something smoother.
I have values like these, for example:
mysql> SELECT date,valeur FROM temperature
+---------------------+--------+
| date | valeur |
+---------------------+--------+
| 2013-09-26 11:30:40 | 25.2 |
| 2013-09-26 11:33:19 | 25.4 |
| 2013-09-26 11:34:12 | 25.5 |
| 2013-09-26 11:38:37 | 25.4 |
| 2013-09-26 11:39:30 | 25.4 |
| 2013-09-26 11:40:23 | 25.4 |
| 2013-09-26 11:43:02 | 25.4 |
| 2013-09-26 11:45:41 | 25.3 |
| 2013-09-26 11:47:33 | 25.3 |
| 2013-09-26 11:51:07 | 25.4 |
| 2013-09-26 11:51:52 | 25.3 |
...
I tried to extract with this command :
SELECT ROUND(UNIX_TIMESTAMP(date)/(15 * 60)) AS timekey, ROUND(AVG(valeur),1) AS a FROM temperature WHERE date >= (now() - INTERVAL 1 HOUR) GROUP BY timekey ORDER BY DATE;
But I don't get any output. If I change the interval to 5 hours, I get 16 values :
[1534861, 24.600000]
[1534862, 24.600000]
[1534863, 24.600000]
[1534864, 24.700000]
[1534865, 24.700000]
[1534866, 24.600000]
[1534867, 24.600000]
[1534868, 24.600000]
[1534869, 24.600000]
[1534870, 24.600000]
[1534871, 24.700000]
[1534872, 24.700000]
[1534873, 24.700000]
[1534874, 24.800000]
[1534875, 25.000000]
[1534876, 25.200000]
Any idea how to correct this MySQL query?
Thank you all
Greg
edit - See selected answer : the code was good, but the timezone wasn't !
I am guessing your issue is most likely a timezone difference of 1 hour.
If you get no values for the last hour but 16 values for the last 5 hours (4 hours' worth of 15-minute buckets), it sounds like there are no rows that NOW() considers to be within the last hour. If you are certain there are, check the timezone of the data against the timezone of NOW().
Perhaps try using SYSDATE(). Quote from the manual:
In addition, the SET TIMESTAMP statement affects the value returned by
NOW() but not by SYSDATE(). This means that timestamp settings in the
binary log have no effect on invocations of SYSDATE(). Setting the
timestamp to a nonzero value causes each subsequent invocation of
NOW() to return that value. Setting the timestamp to zero cancels this
effect so that NOW() once again returns the current date and time.
See the description for SYSDATE() for additional information about the
differences between the two functions.
This should work (note the floor() usage and the parentheses around 15 * 60):
SELECT from_unixtime(floor(unix_timestamp(date) / (15 * 60)) * (15 * 60)) AS tstamp,
round(avg(valeur),1) AS a
FROM temperature
WHERE date >= (now() - INTERVAL 1 HOUR)
GROUP BY 1
ORDER BY 1
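Operator precedence matters in this kind of expression: `x / 15 * 60` evaluates as `(x / 15) * 60`, not `x / (15 * 60)`, because division and multiplication have equal precedence and associate left to right. A small Python sketch of the difference follows; both function names are hypothetical and the epoch value is an arbitrary example.

```python
import math

def bucket_unparenthesized(epoch):
    """floor(epoch / 15 * 60) * 15 * 60: divides by 15, then multiplies by 60."""
    return math.floor(epoch / 15 * 60) * 15 * 60

def bucket_15_min(epoch, interval=15 * 60):
    """floor(epoch / (15 * 60)) * (15 * 60): the start of the 15-minute bucket."""
    return (epoch // interval) * interval

epoch = 1380195040  # arbitrary example value
right = bucket_15_min(epoch)
wrong = bucket_unparenthesized(epoch)
print(right)           # 1380195000
print(wrong == right)  # False
```

With the parentheses, the result is the nearest multiple of 900 seconds at or below the input. Without them, the value is scaled up by several orders of magnitude and no longer represents a bucket start.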