How to group datetime into intervals of 3 hours in MySQL

I have the following table structure:
start                 count
2022-08-02 22:13:35   20
2022-08-03 04:27:20   10
2022-08-03 09:21:48   10
2022-08-03 14:25:48   10
2022-08-03 14:35:07   10
2022-08-03 15:16:09   10
2022-08-04 07:09:07   20
2022-08-04 10:35:45   10
2022-08-04 14:42:49   10
I want to group the start column into 3-hour intervals and sum the count, as follows:
interval   count
01h-03h    400
03h-06h    78
...        ...
...        ...
20h-23h    100
23h-01h    64
I have the following query but am not sure how to proceed from here:
select hour(start), sum(count) from `table`
GROUP BY hour(start)

To do this you need to be able to take any DATETIME value and truncate it to the most recent three-hour boundary. For example you need to take 2022-09-06 19:35:20 and convert it to 2022-09-06 18:00:00.
Do that with an expression like this:
DATE(start) + INTERVAL (HOUR(start) - MOD (HOUR(start), 3)) HOUR
This truncates the value to midnight with DATE(), then adds back the number of whole hours up to the most recent three-hour boundary.
So a query might look like this:
SELECT DATE(start) + INTERVAL (HOUR(start) - MOD(HOUR(start), 3)) HOUR,
SUM(count)
FROM `table`
GROUP BY DATE(start) + INTERVAL (HOUR(start) - MOD(HOUR(start), 3)) HOUR
The trick to aggregating database rows over blocks of time is, generally, to come up with the appropriate way of truncating the DATETIME or TIMESTAMP values.
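The truncation step can be checked outside the database. The following is a minimal Python sketch of the same arithmetic, not part of the original answer; the function name `truncate_to_block` is made up for illustration.

```python
from datetime import datetime, timedelta


def truncate_to_block(ts: datetime, hours: int = 3) -> datetime:
    """Round ts down to the most recent boundary of an `hours`-hour block."""
    # Equivalent of DATE(start): drop the time-of-day part.
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # Equivalent of HOUR(start) - MOD(HOUR(start), 3): whole hours to add back.
    return midnight + timedelta(hours=ts.hour - ts.hour % hours)


# The example value from the answer text.
print(truncate_to_block(datetime(2022, 9, 6, 19, 35, 20)))  # 2022-09-06 18:00:00
```

Every timestamp inside the same three-hour block maps to the same value, which is what makes the expression usable as a GROUP BY key.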
And if you want to aggregate by 3-hour intervals, with all days gathered into a single result of eight rows, do this:
SELECT HOUR(start) - MOD(HOUR(start), 3),
SUM(count)
FROM `table`
GROUP BY HOUR(start) - MOD(HOUR(start), 3)
Again, you use an expression to truncate each value to the appropriate block of time.
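To see what the hour-of-day grouping produces for the rows in the question, here is a small Python sketch of the same bucketing. It is only an illustration: the label format (`00h-03h`, ..., `21h-24h`) and the helper names are assumptions, and the buckets start at midnight rather than at 01h as in the asker's expected output.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows copied from the question: (start, count).
rows = [
    ("2022-08-02 22:13:35", 20),
    ("2022-08-03 04:27:20", 10),
    ("2022-08-03 09:21:48", 10),
    ("2022-08-03 14:25:48", 10),
    ("2022-08-03 14:35:07", 10),
    ("2022-08-03 15:16:09", 10),
    ("2022-08-04 07:09:07", 20),
    ("2022-08-04 10:35:45", 10),
    ("2022-08-04 14:42:49", 10),
]


def bucket_label(hour: int, width: int = 3) -> str:
    """Label of the width-hour bucket containing `hour` (hypothetical format)."""
    lo = hour - hour % width  # same as HOUR(start) - MOD(HOUR(start), 3)
    return f"{lo:02d}h-{lo + width:02d}h"


def sum_by_bucket(rows):
    """Sum the counts per hour-of-day bucket, ignoring the date."""
    totals = defaultdict(int)
    for start, count in rows:
        hour = datetime.strptime(start, "%Y-%m-%d %H:%M:%S").hour
        totals[bucket_label(hour)] += count
    return dict(sorted(totals.items()))


totals = sum_by_bucket(rows)
for label, total in totals.items():
    print(label, total)
```

Buckets with no rows are simply absent from the result, which is also how the SQL GROUP BY behaves.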

select hr_range,sum(res.cnt)
from (
select hour(start) hr,
case when hour(start) between 0 and 2 then '00h-02h'
when hour(start) between 3 and 5 then '03h-05h'
when hour(start) between 6 and 8 then '06h-08h'
....
when hour(start) between 21 and 23 then '21h-23h'
end as hr_range,
sum(count) as cnt from `table`
GROUP BY hour(start)
)res
group by hr_range
I think this is one way to solve the issue.

Related

Query that accounts for changes in the hour of a timestamp

The problem is as follows. I have a table that contains the following columns:
machine   timestamp          speed (meters/minute)
C1        22/9/2020, 16:45   15
C1        22/9/2020, 16:55   5
C1        22/9/2020, 17:20   19
What I want to know is the distance travelled in each hour for each machine, so I need to subtract the timestamp of the current row from the timestamp of the next row, and then multiply by the speed of the first row (e.g: 16:55 - 16:45 = 10 minutes -> 10 * 15 = 150 meters between 16:45 and 16:55).
I was able to do that by using a logic similar to the one below (it is not all the same query):
-- 1st: get the timestamp of the next row
lead(query."timestamp") OVER (PARTITION BY query.id ORDER BY query."timestamp") AS lead_timestamp
-- 2nd: get the duration
query."lead_timestamp" - query."timestamp" AS duration
-- 3rd: calculate the distance
query."duration" * query."speed" AS distance
-- 4th: group by hour
GROUP BY date_trunc('hour', CAST(query."timestamp" AS timestamp))
It works almost 100% fine. I get a table similar to the one below:
machine   timestamp          distance (meters)
C1        22/9/2020, 16:00   275
C1        22/9/2020, 17:00   ...
But as you can see, as I group data per hour, the total meters for the hour 16:00 are not correct because there wasn't a timestamp after the timestamp equal to "22/9/2020, 16:55" that forced the grouping to end at "22/9/2020, 16:59". So, in the end, I added to the hour 16:00 part of the duration for the hour 17:00 (those 20 minutes were added to hour 16:00).
I am not sure how to solve this problem but I have looked into UNION to add an 'artificial' row whenever there's a transition of hour between timestamps, before even starting subtracting values to calculate the duration. But it seems rather complicated since I would have to do it for each machine and I don't know how many rows it would be.
Can you help me? Thanks! If I was not clear pls ask for more info!
This could be of help:
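WITH cte as (
select 'c1' as machine, '2020-09-22 16:46' as timestamp, 15 as speed
union all
select 'c1', '2020-09-22 16:56', 5
union all
select 'c1', '2020-09-22 17:20', 19)
select
machine,
timestamp,
speed,
-- order the window explicitly so that lead() really returns the next row in time
lead(timestamp) over (partition by machine order by timestamp) nextTime,
TIMEDIFF( lead(timestamp) over (partition by machine order by timestamp), timestamp) diffTime,
-- total whole minutes between the rows; MINUTE() alone would drop the hours part
TIMESTAMPDIFF(MINUTE, timestamp, lead(timestamp) over (partition by machine order by timestamp))*speed meters
from cte;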
output:
machine   timestamp          speed   nextTime           diffTime          meters
c1        2020-09-22 16:46   15      2020-09-22 16:56   00:10:00.000000   150
c1        2020-09-22 16:56   5       2020-09-22 17:20   00:24:00.000000   120
c1        2020-09-22 17:20   19
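The output above still assigns the whole 16:56 to 17:20 interval to a single row, so it does not yet split the distance at the hour boundary the question asks about. One way to reason about that split is sketched below in Python, using the timestamps from the question. This is an illustration under assumptions, not the answerer's method: the function name `distance_per_hour` is hypothetical, the readings belong to one machine and are already sorted, and the last reading contributes nothing because it has no successor.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def distance_per_hour(readings):
    """readings: time-sorted list of (timestamp, speed in meters/minute)
    for a single machine. Returns {start_of_hour: meters}."""
    totals = defaultdict(float)
    for (start, speed), (end, _) in zip(readings, readings[1:]):
        cursor = start
        while cursor < end:
            hour_start = cursor.replace(minute=0, second=0, microsecond=0)
            # Stop at the next full hour or at the next reading, whichever comes first.
            segment_end = min(end, hour_start + timedelta(hours=1))
            minutes = (segment_end - cursor).total_seconds() / 60
            totals[hour_start] += minutes * speed
            cursor = segment_end
    return dict(totals)


readings = [
    (datetime(2020, 9, 22, 16, 45), 15),
    (datetime(2020, 9, 22, 16, 55), 5),
    (datetime(2020, 9, 22, 17, 20), 19),
]
per_hour = distance_per_hour(readings)
for hour, meters in sorted(per_hour.items()):
    print(hour, meters)
```

With the question's data, the 25 minutes at 5 meters/minute are divided into 5 minutes for the 16:00 hour and 20 minutes for the 17:00 hour, instead of all 125 meters landing in 16:00.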

Daterange histogram with SQL (MySQL)

I have a table with date ranges like the following:
start               | end
2020-07-25 20:37:00 | 2020-07-25 20:44:00
2020-07-25 21:37:00 | 2020-07-25 22:44:00
2020-07-26 07:11:00 | 2020-07-27 10:50:00
...
At the end, I want a histogram which shows, for every hour of a day, how many date ranges overlap that hour.
So the resulting histogram consists of 24 bars.
How do I do this in SQL for MySQL? (Side note: I'm using TypeORM, but I'm able to write plain SQL statements)
I only found solutions calculating and grouping by the length of the individual intervals with TIMESTAMPDIFF, but that's not what I want to achieve.
In future I may want to show the same histogram not per hour but per minute of a day or per day of a month and so on. But I assume that's simple to do once I get the idea of the query :)
One method is the brute force method, shown here for a single day (2020-07-25):
with recursive hours as (
select 0 as hh
union all
select hh + 1
from hours
where hh < 23
)
select h.hh, count(t.start)
from hours h left join
t
-- a range overlaps an hour if it starts before the hour ends and ends on or after the hour starts
on t.start < '2020-07-25' + interval (h.hh + 1) hour and
t.end >= '2020-07-25' + interval h.hh hour
group by h.hh
order by h.hh;
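For the 24-bar histogram across all dates, the counting rule can be sketched in Python as below. This is a rough illustration under stated assumptions: the function name `hourly_histogram` is hypothetical, a range is counted at most once per hour of the day even if it spans several days, and a range ending exactly on an hour boundary counts as touching that hour.

```python
from datetime import datetime, timedelta


def hourly_histogram(ranges):
    """Count, for each hour of the day (0-23), how many ranges touch it."""
    counts = [0] * 24
    for start, end in ranges:
        touched = set()
        cursor = start.replace(minute=0, second=0, microsecond=0)
        # Walk the range one clock hour at a time; 24 distinct hours is the maximum.
        while cursor <= end and len(touched) < 24:
            touched.add(cursor.hour)
            cursor += timedelta(hours=1)
        for hour in touched:
            counts[hour] += 1
    return counts


# The three visible rows from the question.
ranges = [
    (datetime(2020, 7, 25, 20, 37), datetime(2020, 7, 25, 20, 44)),
    (datetime(2020, 7, 25, 21, 37), datetime(2020, 7, 25, 22, 44)),
    (datetime(2020, 7, 26, 7, 11), datetime(2020, 7, 27, 10, 50)),
]
histogram = hourly_histogram(ranges)
for hour, n in enumerate(histogram):
    print(f"{hour:02d}h {n}")
```

Switching to per-minute or per-day bars would mean changing the step size and the key taken from `cursor`, which matches the asker's expectation that the idea generalizes.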

SQL query to retrieve latest 2 week records

I have a database in which CreatedDate is stored as Unix epoch time, along with some other info. I want a query that retrieves the latest two weeks of records, counted back from the last record.
Below is part of the data:
ID User Ranking CreatedDate
-------------------------------------------------------
1 B.Sisko 1 1461136714
2 B.Sisko 2 1461123378
3 B.Sisko 3 1461123378
4 B.Sisko 3 1461600137
5 K.Janeway 4 1461602181
6 K.Janeway 4 1461603096
7 J.Picard 4 1461603096
The last record's CreatedDate is 25 Apr 2016, so I want the records from 12 Apr to 25 Apr.
I am not sure how to compare the values to get the latest data. Any suggestions?
The simplest method is probably to just subtract two weeks from today's date/time:
where CreatedDate >= UNIX_TIMESTAMP() - 14*24*60*60
Another approach is to convert the value to a date/time:
where from_unixtime(CreatedDate) >= date_sub(now(), interval 2 week)
The advantage of this approach is that it is easier to align to days. So, if you want two weeks starting at midnight:
where from_unixtime(CreatedDate) >= date_sub(curdate(), interval 2 week)
The disadvantage is that the function on the column prevents the use of indices on that column.
EDIT:
This is definitely not how your question was phrased. But in that case, you should use:
select t.*
from t cross join
(select from_unixtime(max(CreatedDate)) as maxcd from t) m
where from_unixtime(CreatedDate) >= date_sub(maxcd, interval 2 week);
It may seem odd, but you need to execute two queries: one to find the maximum date and knock off 14 days, and then another that uses the result as a condition to requery the table. I used ID_NUM since ID is a reserved word in Oracle and likely other RDBMS as well.
SELECT ID_NUM, USER, RANKING,
-- 18000 seconds = 5 hours, subtracted as a fixed offset before converting to a date
TO_DATE('19700101000000', 'YYYYMMDDHH24MISS')+((CreatedDate-18000)
/(60*60*24)) GOOD_DATE
FROM MY_TABLE
WHERE
-- a column alias such as GOOD_DATE cannot be referenced in WHERE, so compare the raw epoch seconds
CreatedDate >=
(SELECT MAX(CreatedDate) - 14*60*60*24
FROM MY_TABLE)
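The "two weeks back from the newest record" rule is plain arithmetic on epoch seconds, which the Python sketch below illustrates with the rows from the question. The function name `latest_two_weeks` and the extra row with ID 0 are hypothetical; the extra row exists only to show something being filtered out.

```python
TWO_WEEKS = 14 * 24 * 60 * 60  # two weeks in seconds


def latest_two_weeks(rows):
    """rows: list of dicts with a 'CreatedDate' field in epoch seconds.
    Keeps rows no older than two weeks before the newest CreatedDate."""
    if not rows:
        return []
    cutoff = max(r["CreatedDate"] for r in rows) - TWO_WEEKS
    return [r for r in rows if r["CreatedDate"] >= cutoff]


rows = [
    {"ID": 0, "User": "hypothetical old row", "CreatedDate": 1460000000},
    {"ID": 1, "User": "B.Sisko", "CreatedDate": 1461136714},
    {"ID": 2, "User": "B.Sisko", "CreatedDate": 1461123378},
    {"ID": 3, "User": "B.Sisko", "CreatedDate": 1461123378},
    {"ID": 4, "User": "B.Sisko", "CreatedDate": 1461600137},
    {"ID": 5, "User": "K.Janeway", "CreatedDate": 1461602181},
    {"ID": 6, "User": "K.Janeway", "CreatedDate": 1461603096},
    {"ID": 7, "User": "J.Picard", "CreatedDate": 1461603096},
]
recent = latest_two_weeks(rows)
print([r["ID"] for r in recent])
```

Comparing the raw integers, rather than converting each value to a date first, is also what lets a database use an index on the column.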

Get average day or week values

I have statistical data like this:
time val1
1424166578 51
1424166877 55
1424167178 57
1424167477 57
time is a Unix timestamp. There is one record every 5 minutes excluding nights and Sundays. This continues over several weeks.
Now I want to get these values for an average day and an average week. The result should include values for every 5 minutes like normal but for average past days or weeks.
The result should look like this:
time val1
0 43.423
300 46.635
600 51.887
...
So time could be a timestamp with relative time since day or week start. Perhaps it is better to use DATETIME... not sure.
If I use GROUP BY FROM_UNIXTIME(time, '%Y%m%d') for example I get one value for the whole day. But I want all average values for all days.
You seem to be interested in grouping by five-minute time-of-day slots instead of by date. This is fairly straightforward:
SELECT
HOUR(FROM_UNIXTIME(time)) AS HH,
(MINUTE(FROM_UNIXTIME(time)) DIV 5) * 5 AS MM,
AVG(val1) AS VAL
FROM your_table
WHERE time > UNIX_TIMESTAMP(CURRENT_TIMESTAMP - INTERVAL 7 DAY)
GROUP BY HH, MM
The following result shows how each time is clamped to its five-minute slot:
time FROM_UNIXTIME(time) HH MM
1424166578 2015-02-17 14:49:38 14 45
1424166877 2015-02-17 14:54:37 14 50
1424167178 2015-02-17 14:59:38 14 55
1424167477 2015-02-17 15:04:37 15 00
I would approach this as:
select date(from_unixtime(time)) as day, avg(val1)
from `table` t
group by date(from_unixtime(time))
order by day;
Although you can use the format argument, I think of that more for converting the value to a string than to a date/time.
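The asker's desired output uses seconds since the start of the day (0, 300, 600, ...) as the key. A minimal Python sketch of that variant follows. It is an illustration only: `average_day` is a hypothetical name, the time of day is taken in UTC (the clamped table above evidently uses a different session time zone), and the fifth sample is invented to show two days being averaged into one slot.

```python
from collections import defaultdict


def average_day(samples, slot_seconds=300):
    """samples: iterable of (unix_time, value).
    Returns {seconds_since_midnight_utc: mean value}, one key per slot."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, value in samples:
        # Time of day in UTC, rounded down to the start of its slot.
        slot = (t % 86400) // slot_seconds * slot_seconds
        sums[slot] += value
        counts[slot] += 1
    return {slot: sums[slot] / counts[slot] for slot in sorted(sums)}


samples = [
    (1424166578, 51),
    (1424166877, 55),
    (1424167178, 57),
    (1424167477, 57),
    (1424166578 + 86400, 53),  # hypothetical reading one day later, same slot as the first
]
profile = average_day(samples)
for slot, mean in profile.items():
    print(slot, mean)
```

For an average week, the same idea applies with a key relative to the start of the week instead of the start of the day.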

MySQL: date difference in months (even if 'month' < 30 days should be count as 1)

I would like to compare 2 dates: '2012-05-05', '2012-06-04' and receive 1 as a result (difference between May and June).
What I got:
SELECT TIMESTAMPDIFF(MONTH, '2012-05-05', '2012-06-04') as difference
-- output: 0
I'm looking for a query for which I will receive 1 as a result (dates are from 2 different months; not important if difference is in fact < 30 days).
I've tried:
SELECT TIMESTAMPDIFF(MONTH, DATE_FORMAT('2012-05-05','%Y-%m'), DATE_FORMAT('2012-06-04','%Y-%m') ) as difference
-- output: NULL
also:
SELECT DATEDIFF( DATE_FORMAT('2012-05-05','%Y-%m'), DATE_FORMAT('2012-06-04','%Y-%m') ) as difference
-- output: NULL
Do you have other ideas?
I don't know if there are ways of doing it with a built-in function, but you can use a simple CASE expression. Obviously this can be improved.
CASE
WHEN DAYS <=30 THEN 1
WHEN DAYS BETWEEN 31 and 60 THEN 2
--ELSE ....
END as MONTH_DIFF
Also found this solution here:
SELECT 12 * (YEAR(DateOfService)
- YEAR(BirthDate))
+ (MONTH(DateOfService)
- MONTH(BirthDate)) AS months
FROM `table`
The MySQL documentation describes the function PERIOD_DIFF, which returns the number of months between periods P1 and P2. P1 and P2 should be in the format YYMM or YYYYMM. The result is P1 minus P2, so put the later period first to get a positive number:
SELECT PERIOD_DIFF('201206', '201205') as difference
If you are starting from a datetime value, use DATE_FORMAT, such as:
SELECT PERIOD_DIFF(DATE_FORMAT(NOW(), '%Y%m'),DATE_FORMAT(YOURCOLUMN, '%Y%m')) AS difference
FROM YOURTABLE;
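Both the YEAR/MONTH formula and PERIOD_DIFF count calendar-month boundaries crossed and ignore the day of the month. A short Python sketch of that arithmetic, with a hypothetical function name, makes the behaviour easy to check:

```python
from datetime import date


def calendar_month_diff(earlier: date, later: date) -> int:
    """Calendar months between two dates, ignoring the day of the month."""
    return 12 * (later.year - earlier.year) + (later.month - earlier.month)


# The dates from the question: May to June counts as 1, although it is under 30 days.
print(calendar_month_diff(date(2012, 5, 5), date(2012, 6, 4)))
```

Two dates in the same month give 0, and 31 December to 1 January gives 1, which is the behaviour the asker describes.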