Select data between 2 dates and average hourly output - mysql

I have a series of temperature readings, gathered every minute and stored in a MySQL database. I want to write a query that selects all temperatures for the last 24 hours, then groups and averages them by hour. Is this possible with a single SQL command?
I think it's a case of selecting all records between the two date/times, but that's the point where I get stuck.
Data Example:
ID Temp Timestamp
3922 22 2015-11-17 14:12:23
3923 22 2015-11-17 14:13:23
3924 22.05 2015-11-17 14:14:23
3925 22.05 2015-11-17 14:15:23
Needed output / Result
Temp Time
22 2015-11-17 14:00:00
23 2015-11-17 15:00:00
23 2015-11-17 16:00:00
22.05 2015-11-17 17:00:00
I hope you can help as I am totally lost with SQL commands.

Try this query
SELECT
AVG(Temp) AS Temp,
DATE_FORMAT(Timestamp, '%Y-%m-%d %H:00:00') AS Time
FROM `Table` -- placeholder name; TABLE is a reserved word, so quote it or use your real table name
WHERE DATE_FORMAT(Timestamp, '%Y-%m-%d %H:00:00') >= DATE_FORMAT(DATE_ADD(NOW(), INTERVAL -1 DAY), '%Y-%m-%d %H:00:00')
GROUP BY DATE_FORMAT(Timestamp, '%Y-%m-%d %H:00:00')

Yes, this is quite feasible in SQL. It is easiest to get the hour out as:
select hour(timestamp), avg(temp)
from t
where timestamp >= date_sub(now(), interval 1 day)
group by hour(timestamp);
This isn't perfect, because the first and last hour boundary is from a different day. So, this is more like the output you want:
select date_add(date(timestamp), interval hour(timestamp) hour) as `time`,
avg(temp)
from t
where timestamp >= date_sub(now(), interval 1 day)
group by date_add(date(timestamp), interval hour(timestamp) hour);
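To see why grouping on the bare hour mixes days while grouping on the truncated timestamp does not, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere); SQLite's strftime(...) plays the role of DATE_FORMAT(...), and the four sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for portability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (temp REAL, ts TEXT)")
conn.executemany(
    "INSERT INTO t (temp, ts) VALUES (?, ?)",
    [
        (22.0, "2015-11-16 14:12:23"),  # yesterday, 14:xx
        (24.0, "2015-11-16 14:13:23"),  # yesterday, 14:xx
        (20.0, "2015-11-17 14:14:23"),  # today, 14:xx
        (21.0, "2015-11-17 15:15:23"),  # today, 15:xx
    ],
)

# Group on the timestamp truncated to the start of its hour (date + hour).
hourly = conn.execute(
    """
    SELECT strftime('%Y-%m-%d %H:00:00', ts) AS hour_start, AVG(temp)
    FROM t
    GROUP BY hour_start
    ORDER BY hour_start
    """
).fetchall()

# Group on the hour number only: both days' 14:xx rows land in one bucket.
by_hour_only = conn.execute(
    "SELECT strftime('%H', ts) AS h, AVG(temp) FROM t GROUP BY h ORDER BY h"
).fetchall()

print(hourly)
print(by_hour_only)
```

The first result keeps yesterday's and today's 14:00 buckets apart; the second averages them together, which is the boundary problem described above.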

Related

How to know the date interval covered by WEEK function

I want to add two more columns that show, respectively, the first and the last day of each grouped week.
Currently, I'm summarizing a query by week. When I run this query, it gives me the following results:
SELECT
pc.date,
CONCAT(YEAR(pc.date), '/', WEEK(pc.date)) as year_week
FROM pc
GROUP BY CONCAT(YEAR(pc.date), '/', WEEK(pc.date))
ORDER BY pc.date
date year_week
2020-09-02 2020/35
2020-09-07 2020/36
2020-09-17 2020/37
2020-09-23 2020/38
2020-09-28 2020/39
2020-10-10 2020/40
2020-10-11 2020/41
2020-10-21 2020/42
2020-10-28 2020/43
How can I find the first and last day of grouped week?
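You can use the DAYOFWEEK function (1 = Sunday, 7 = Saturday), which lines up with the Sunday-based weeks that WEEK() uses in its default mode 0. WEEKDAY (0 = Monday) with an extra -1 gives the same result for Monday to Saturday, but puts a Sunday into the previous week. Demo:
select
date_add(dt, interval 1-DAYOFWEEK(dt) day) FirstDayOfWeek,
date_add(dt, interval 7-DAYOFWEEK(dt) day) LastDayOfWeek,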
week(dt) wk
from (
select '2020-09-02' dt union all
select '2020-09-07' union all
select '2020-09-17'
) t
Returns
FirstDayOfWeek LastDayOfWeek wk
2020-08-30 2020-09-05 35
2020-09-06 2020-09-12 36
2020-09-13 2020-09-19 37
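As a cross-check of the week arithmetic, here is a minimal Python sketch. It assumes weeks run Sunday to Saturday, which is what WEEK() does in its default mode; the helper name sunday_week_bounds is made up for illustration.

```python
from datetime import date, timedelta

def sunday_week_bounds(d):
    """Return (first_day, last_day) of the Sunday-to-Saturday week containing d."""
    # isoweekday(): Monday=1 ... Sunday=7, so "% 7" maps Sunday to 0.
    days_since_sunday = d.isoweekday() % 7
    first = d - timedelta(days=days_since_sunday)
    return first, first + timedelta(days=6)

for d in (date(2020, 9, 2), date(2020, 10, 10), date(2020, 10, 11)):
    print(d, sunday_week_bounds(d))
```

2020-10-11 is a Sunday, so it starts its own week. That matches the jump from 2020/40 to 2020/41 between 2020-10-10 and 2020-10-11 in the question's data.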

MySQL date grouping select statement

I have a MySQL 8.0.23 table called events with the following schema:
eventUID - integer NOT NULL
camtimestamp - MySQL datetime stamp
direction - string - either "In" or "Out"
propUID - integer
In a single SELECT statement, I am trying to determine by hour for the last 24 hours how many cars are "In" and how many are "Out". Here is what I am trying (does not yet have the 24 hour limit built in).
select camtimestamp,count(*) from events where direction ="In" and propUID = 7 group by year(camtimestamp),month(camtimestamp),day(camtimestamp),hour(camtimestamp);
And this is a sample of what I am getting.
2022-02-14 22:02:40 38
2022-02-14 21:56:56 15
2022-02-14 20:55:30 47
2022-02-14 19:59:18 51
2022-02-14 18:59:50 36
2022-02-14 17:52:04 10
2022-02-14 16:58:01 16
2022-02-14 15:59:00 36
2022-02-14 14:58:52 44
I also have a table called datehourlist with which I can join in my SELECT.
Sample data:
2019-05-01 00:00:00
2019-05-01 01:00:00
2019-05-01 02:00:00
2019-05-01 03:00:00
2019-05-01 04:00:00
2019-05-01 05:00:00
2019-05-01 06:00:00
2019-05-01 07:00:00
2019-05-01 08:00:00
Also:
mysql> select min(datehour) from datehourlist;
+---------------------+
| min(datehour) |
+---------------------+
| 2019-05-01 00:00:00 |
+---------------------+
1 row in set (0.02 sec)
mysql> select max(datehour) from datehourlist;
+---------------------+
| max(datehour) |
+---------------------+
| 2040-12-31 00:00:00 |
+---------------------+
1 row in set (0.02 sec)
datehourlist has every hour in it from May 1, 2019 until December 31, 2040.
This is a sample of what I really want from this:
Column 1 below is a rounded grouped timestamp (vs col 1 above being a non-rounded, actual timestamp)
Column 1 below does not skip an hour if there is no data from that hour.
Column 2 below is the "In" count for that hour.
Column 3 below is the "Out" count for that hour.
2019-05-02 06:00:00 5 10
2019-05-02 07:00:00 127 10
2019-05-02 08:00:00 0 0
2019-05-02 09:00:00 115 10
2019-05-02 10:00:00 71 10
2019-05-02 11:00:00 147 10
2019-05-02 12:00:00 140 10
What SELECT statement should I use to get the output I need?
Also, how would I optimize that SELECT statement?
The events table has 500k rows and grows by hundreds every day.
Thank you in advance for your help.
Thank you for a great solution so quickly.
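SELECT dhl.datehour datehour,
COALESCE(SUM(ev.direction = 'In'), 0) `In`,
COALESCE(SUM(ev.direction = 'Out'), 0) `Out`
FROM datehourlist dhl
-- The event filters live in ON, not WHERE, so hours with no events survive the LEFT JOIN.
-- The range comparison also lets MySQL use an index on events (propUID, camtimestamp), if one exists.
LEFT JOIN events ev
ON ev.camtimestamp >= dhl.datehour
AND ev.camtimestamp < dhl.datehour + INTERVAL 1 HOUR
AND ev.propUID = 7
WHERE dhl.datehour >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND dhl.datehour < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
GROUP BY dhl.datehour;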
First, we need a trunc_to_hour() function that takes an arbitrary DATETIME or TIMESTAMP value and gives back the beginning of its hour. That is this.
DATE_FORMAT(camtimestamp, '%Y-%m-%d %H:00:00')
Second, we need a WHERE expression that can handle the most recent 24 hours. That is this.
WHERE camtimestamp >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND camtimestamp < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
For the example timestamp 2021-03-14 16:04:30 this gives the following.
WHERE camtimestamp >= '2021-03-13 16:00:00'
AND camtimestamp < '2021-03-14 16:00:00'
That is, it chooses records for the most recent 24 complete clock hours. You may have to adjust this WHERE expression if you want the hours to date.
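The same window arithmetic, sketched in Python for anyone who wants to check the boundaries by hand. The function name is made up; the input is the example timestamp used above.

```python
from datetime import datetime, timedelta

def last_24_complete_hours(now):
    """Return (start, end) of the 24 complete clock hours before `now`."""
    # Truncate to the start of the current hour, like DATE_FORMAT(..., '%H:00:00').
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=24)
    return start, end  # use as: start <= camtimestamp < end

start, end = last_24_complete_hours(datetime(2021, 3, 14, 16, 4, 30))
print(start, end)
```

The upper bound is exclusive, so the partially elapsed current hour is left out. If you want "hours to date", use `now` itself as the upper bound instead of `end`.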
Third, we need conditional sums (for In and Out).
It happens that the expression direction = 'In' gives 1 when direction is In, 0 when direction is some other string (like Out), and NULL if direction itself is NULL. So
SUM(direction='In')
counts the rows meeting that criterion.
Fourth, when the SUM is null, we want to show zero. Like this.
COALESCE(SUM(direction='In'),0)
Fifth, we can put it together something like this:
SELECT DATE_FORMAT(ev.camtimestamp, '%Y-%m-%d %H:00:00') datehour,
COALESCE(SUM(ev.direction = 'In'), 0) `In`,
COALESCE(SUM(ev.direction = 'Out'), 0) `Out`
FROM events ev
WHERE ev.camtimestamp >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND ev.camtimestamp < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
AND ev.propUID = 7
GROUP BY DATE_FORMAT(ev.camtimestamp, '%Y-%m-%d %H:00:00')
That gives you your result set. But it still might be missing some hours if there are no records for those hours.
So, sixth, we can join that to your pre-existing hourly calendar table like this:
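SELECT dhl.datehour datehour,
COALESCE(SUM(ev.direction = 'In'), 0) `In`,
COALESCE(SUM(ev.direction = 'Out'), 0) `Out`
FROM datehourlist dhl
-- The event filters live in ON, not WHERE, so hours with no events survive the LEFT JOIN.
-- The range comparison also lets MySQL use an index on events (propUID, camtimestamp), if one exists.
LEFT JOIN events ev
ON ev.camtimestamp >= dhl.datehour
AND ev.camtimestamp < dhl.datehour + INTERVAL 1 HOUR
AND ev.propUID = 7
WHERE dhl.datehour >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND dhl.datehour < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
GROUP BY dhl.datehour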
And that should do it. (Not debugged.)
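One thing worth checking when joining to a calendar table: a condition on the events table placed in WHERE is applied after the LEFT JOIN and throws away the NULL-padded rows, so empty hours disappear again. A small sketch of the difference, using Python's built-in sqlite3 as a stand-in for MySQL (an assumption made so it runs anywhere) and made-up sample rows:

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE datehourlist (datehour TEXT);
    CREATE TABLE events (camtimestamp TEXT, direction TEXT, propUID INTEGER);
    INSERT INTO datehourlist VALUES
        ('2019-05-02 07:00:00'), ('2019-05-02 08:00:00'), ('2019-05-02 09:00:00');
    INSERT INTO events VALUES
        ('2019-05-02 07:10:00', 'In', 7),
        ('2019-05-02 07:40:00', 'Out', 7),
        ('2019-05-02 09:05:00', 'In', 7),
        ('2019-05-02 08:30:00', 'In', 3);  -- other property, must not be counted
    """
)

# Shared part: join each calendar hour to the events inside that hour.
SELECT_PART = """
    SELECT dhl.datehour,
           COALESCE(SUM(ev.direction = 'In'), 0),
           COALESCE(SUM(ev.direction = 'Out'), 0)
    FROM datehourlist dhl
    LEFT JOIN events ev
      ON ev.camtimestamp >= dhl.datehour
     AND ev.camtimestamp < datetime(dhl.datehour, '+1 hour')
"""
TAIL = " GROUP BY dhl.datehour ORDER BY dhl.datehour"

# Filter in WHERE: applied after the join, so the 08:00 row is discarded.
filter_in_where = conn.execute(SELECT_PART + " WHERE ev.propUID = 7" + TAIL).fetchall()

# Filter in ON: applied while matching, so 08:00 stays with zero counts.
filter_in_on = conn.execute(SELECT_PART + " AND ev.propUID = 7" + TAIL).fetchall()

print(filter_in_where)
print(filter_in_on)
```

With the filter in WHERE the 08:00 hour is missing from the result; with the filter in ON it comes back as 0 / 0, which is the behaviour the question asks for.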

MySQL if 2 hours between two dates and times

This is a tough one I'm trying to figure out.
This is my table:
task_reminders
- id
- date
- time
I want to SELECT all rows that have a date and time 3 hours after the current (NOW) date and time (UTC). It is tough because the date and time columns are separate.
Examples:
For example, if the current date and time is 2019-01-20 08:30:00, I want to select all rows that have a date and time that is 3 hours after that time (only counting hours).
2019-01-20 11:50:00 this would work
2019-01-20 11:10:00 this would work too
2019-01-20 10:00:00 would NOT work
2019-01-20 12:00:00 would NOT work
Another example: If the current date and time is 2019-01-19 11:20:00, these would and would not work:
2019-01-20 02:50:00 this would work
2019-01-20 02:30:00 this would work too
2019-01-20 01:10:00 would NOT work
2019-01-20 03:45:00 would NOT work
It is kind of hard because the DATE and TIME are separate in my database. How would I do this? Thank you!
Since you only want the hours to differ by 3, there are two cases to check:
HOUR(NOW()) < 21, in which case the date should be the same and HOUR(time) = HOUR(NOW()) + 3; or
HOUR(NOW()) >= 21, in which case the date should be one day after CURDATE() and HOUR(time) = HOUR(NOW()) - 21
So your query would be:
SELECT *
FROM task_reminders
WHERE (HOUR(`time`) = HOUR(NOW()) + 3 AND `date` = CURDATE()) OR
(HOUR(`time`) = HOUR(NOW()) - 21 AND `date` = CURDATE() + INTERVAL 1 DAY)
Note
For time-of-day values, HOUR(`time`) is always between 0 and 23, so at most one of the two branches can ever match. That is why the query does not need to test HOUR(NOW()) against 21 explicitly.
I'm assuming your date and time columns are of datatype DATE and TIME respectively.
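If it helps to reason about the two branches, the rule is equivalent to "add 3 hours to now, then compare the date and the hour only". A minimal Python sketch of that rule (the function name is made up, and the 23:20 case is my own example for the day wrap, not one from the question):

```python
from datetime import date, datetime, time, timedelta

def is_three_hours_ahead(row_date, row_time, now):
    """True if the row falls in the clock hour that is 3 hours after `now`."""
    target = now + timedelta(hours=3)  # handles the wrap past midnight
    return row_date == target.date() and row_time.hour == target.hour

now = datetime(2019, 1, 20, 8, 30)
print(is_three_hours_ahead(date(2019, 1, 20), time(11, 50), now))
print(is_three_hours_ahead(date(2019, 1, 20), time(12, 0), now))

late = datetime(2019, 1, 19, 23, 20)  # hypothetical "now" close to midnight
print(is_three_hours_ahead(date(2019, 1, 20), time(2, 50), late))
```

With now = 2019-01-20 08:30:00 this accepts 11:50 and 11:10 and rejects 10:00 and 12:00, as in the first example of the question.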
Check out this query. It compares the date and hour but ignores minutes and seconds, as you want (`edate` and `etime` stand for your date and time columns):
SELECT *
FROM task_reminders
WHERE DATE_FORMAT(CAST(CONCAT(`edate`, ' ', `etime`) AS DATETIME), '%Y-%m-%d %H') =
DATE_FORMAT(DATE_ADD(NOW(), INTERVAL 3 HOUR) , '%Y-%m-%d %H')
Please check sqlfiddle at http://www.sqlfiddle.com/#!9/9bec7a/1
Use this query and replace the table and variable names
$result = mysqli_query($con,"SELECT sent_date FROM invitations WHERE email='$email' AND uid='$session_uid' AND `sent_date` < SUBDATE( CURRENT_DATE, INTERVAL 3 HOUR)");

I want to calculate time between 2 Dates of a day with multiple rows in mysql

I'm using this query to calculate the login time of a user on the app for the whole day and previous 5 days
Select
sec_to_time(sum(time_to_sec(TIMEDIFF((IFNULL(logoff_time, ADDTIME(now(), '05:00:00'))),login_time)))) as online_time
from tb_sessions
WHERE
(login_time BETWEEN DATE(DATE_ADD(now(), INTERVAL (-6) DAY))
AND
ADDTIME(now(), '5:00:00')) AND user_id = 30982
AND TIME(`login_time`) between "00:00:00" AND "23:59:59"
group by DATE(login_time)
Now I have a new requirement:
Calculate time from 07:00:00 to 23:59:59
My Table: tb_sessions
id | user_id | login_time | logoff_time
1 3098 2017-06-10 06:30:00 2017-06-10 07:45:00
2 3098 2017-06-10 07:45:01 2017-06-10 08:30:00
With the above query, the total online time is 02:00:00.
But I only want the time from 7:00 to 8:30, so the total should be 1:30:00.
I made some changes to the query using CASE expressions, but without success.
You can check my query on the below link:
http://sqlfiddle.com/#!9/4620af/12
You could use greatest to take the latest of the dates login_time and 7:00 on the same day, and then use greatest again to exclude negative time differences (when also logoff time is before 7:00):
Select date(login_time) date,
time_format(sec_to_time(sum(greatest(0, time_to_sec(timediff(
ifnull(logoff_time, now()),
greatest(login_time, date_add(date(login_time), interval 7 hour))
))))), '%H:%i:%s') online
from tb_sessions
where login_time between date(date_add(now(), interval (-3) day)) and now()
and user_id = 3098
and time(login_time) between "00:00:00" and "23:59:59"
group by date(login_time)
See it run on sqlfiddle
Explanation
The inner greatest call looks like this:
greatest(login_time, date_add(date(login_time), interval 7 hour))
The second argument takes the date part of login_time, which corresponds to midnight of that day, and adds 7 hours to it, so it represents 7:00 on that day. greatest returns the later of these two timestamps. If the first argument represents a time later than 7:00, it is returned. If not, the second argument (i.e. 7:00) is returned.
The outer greatest call looks like this:
greatest(0, time_to_sec(timediff(....)))
This will make sure the time difference is not negative. Take this example record:
login_time | logoff_time
----------------+----------------
2017-06-01 6:30 | 2017-06-01 6:45
In this case the innermost greatest will return 2017-06-01 7:00, because 6:30 is too early. But that will make timediff() return a negative time interval: -15 minutes. What we really want is 0, because there is no time the user was logged on after 7:00. This is what greatest will do: greatest(0, -15) = 0, so the negative value will be eliminated and will not influence the sum.
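The two clamps are easy to check by hand in a few lines of Python. This is only a sketch of the arithmetic, not of the query itself; the function name is made up, and the sessions are the two rows from the question plus the 6:30 to 6:45 example above.

```python
from datetime import datetime, time, timedelta

def seconds_online_after(login, logoff, cutoff=time(7, 0)):
    """Seconds of a session that fall at or after `cutoff` on the login day."""
    day_cutoff = datetime.combine(login.date(), cutoff)
    effective_start = max(login, day_cutoff)  # inner GREATEST
    # Outer GREATEST: a session that ends before the cutoff contributes 0.
    return max(0, int((logoff - effective_start).total_seconds()))

sessions = [
    (datetime(2017, 6, 10, 6, 30, 0), datetime(2017, 6, 10, 7, 45, 0)),
    (datetime(2017, 6, 10, 7, 45, 1), datetime(2017, 6, 10, 8, 30, 0)),
]
total = sum(seconds_online_after(login, logoff) for login, logoff in sessions)
print(timedelta(seconds=total))  # 1:29:59
```

The total comes out one second short of 1:30:00 because the second session in the sample data starts at 07:45:01, not 07:45:00.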
Condition on login_time
I left the condition time(login_time) between "00:00:00" and "23:59:59" there, but it really does not do anything, since that is true for all times (unless they are null, but then they would not pass the first condition either).
Edit after New Requirements
In comments you asked how to group by each day when a user doesn't log off on the same day but stays online until 1 or 2 days later.
In that case you need a helper table that lists all the days you want to see in the output. This could, for instance, be seven records for the last 7 days.
Then you join your table with it so that each user session is matched to every reference date it overlaps. The calculation of the online time has to take into account that the log-off time might fall after midnight, i.e. on a later day.
Here is the updated query:
select ref_date date,
time_format(sec_to_time(sum(greatest(0, time_to_sec(timediff(
least(ifnull(logoff_time, now()), date_add(ref_date, interval 1 day ), now()),
greatest(login_time, date_add(ref_date, interval 7 hour))
))))), '%H:%i:%s') online
from ( select date(date_add(now(), interval (-6) DAY)) as ref_date union all
select date(date_add(now(), interval (-5) DAY)) union all
select date(date_add(now(), interval (-4) DAY)) union all
select date(date_add(now(), interval (-3) DAY)) union all
select date(date_add(now(), interval (-2) DAY)) union all
select date(date_add(now(), interval (-1) DAY)) union all
select date(now())
) ref
inner join tb_sessions
on login_time < date_add(ref_date, interval 1 day)
and ifnull(logoff_time, now()) > date_add(ref_date, interval 7 hour)
where user_id = 3098
group by ref_date
See it run on sqlfiddle.

SQL Hourly Data

The query below retrieves weather data from a MySQL database and groups it by hour.
select hour(datetime) AS hour
, avg(Temperature) as AVGT
from Database.minute
WHERE DATETIME
BETWEEN (CURDATE() + INTERVAL (SELECT hour(NOW())) hour - INTERVAL 23 hour)
AND ((CURDATE() + INTERVAL (SELECT hour(NOW())) hour))
group by hour
order by (CURDATE() + INTERVAL (SELECT hour(NOW())) hour - INTERVAL 23 hour)
Output is as follows:
hour AVGT
19 11.730
20 11.970
21 11.970
22 11.760
23 11.660
0 11.700
1 11.830
2 12.370
3 12.770
4 12.840
5 12.840
6 12.540
7 12.500
8 12.030
9 12.100
10 12.300
11 12.060
12 11.090
13 10.920
14 10.920
15 10.820
16 10.760
17 10.690
18 10.560
The time is now 18:15. All of the above output is correct apart from the value for hour '18'. Instead of the average of the readings between 18:00 and 18:15, it shows only the reading taken at 18:00, i.e. it ignores the data between 18:01 and 18:14.
How can I modify the above query to include data in the current hour (18:00 to Now)?
Thanks
Why don't you simply try
SELECT Hour(datetime) AS hour,
Avg(temperature) AS AVGT
FROM DATABASE.minute
WHERE datetime BETWEEN ( Curdate() + INTERVAL (SELECT Hour(Now())) hour -
INTERVAL 23 hour ) AND Now()
GROUP BY hour
ORDER BY ( Curdate() + INTERVAL (SELECT Hour(Now())) hour - INTERVAL 23 hour )
I agree with @Ankur's answer (your filter criterion should not cut records off at the start of the current hour, but at the current time); however, your date/time operations are very strange:
You don't need a subquery (SELECT Hour(NOW())) to obtain HOUR(NOW());
You can express ( Curdate() + INTERVAL (SELECT Hour(NOW())) hour - INTERVAL 23 hour ) more simply:
CURDATE() + INTERVAL HOUR(NOW()) - 23 HOUR
Or, in my view, more clearly:
DATE_FORMAT(NOW() - INTERVAL 23 HOUR, '%Y-%m-%d %H:00:00')
Your ORDER BY clause is a constant and therefore achieves nothing: did you mean to order by hour?
Therefore:
SELECT HOUR(datetime) AS hour,
AVG(Temperature) AS AVGT
FROM Database.minute
WHERE datetime BETWEEN
DATE_FORMAT(NOW() - INTERVAL 23 HOUR, '%Y-%m-%d %H:00:00')
AND NOW()
GROUP BY hour
ORDER BY hour
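To make the effect of the upper bound concrete, here is a small Python sketch of the two windows. The date and the readings are made up for illustration; only the 18:15 clock time comes from the question.

```python
from datetime import datetime, timedelta

def window(now, include_current_hour):
    """Bounds of the BETWEEN filter, with or without the partial current hour."""
    hour_start = now.replace(minute=0, second=0, microsecond=0)
    lower = hour_start - timedelta(hours=23)
    upper = now if include_current_hour else hour_start
    return lower, upper

now = datetime(2024, 1, 1, 18, 15)  # hypothetical date
readings = [now.replace(minute=m) for m in (0, 5, 10, 14)]  # readings in hour 18

def kept(bounds):
    lower, upper = bounds
    return [r for r in readings if lower <= r <= upper]  # BETWEEN is inclusive

print(len(kept(window(now, include_current_hour=False))))
print(len(kept(window(now, include_current_hour=True))))
```

With the upper bound truncated to 18:00:00, only the reading stamped exactly 18:00 passes the inclusive BETWEEN, which is why hour 18 showed a single value. With NOW() as the upper bound, all readings up to 18:15 are averaged.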