MySQL date grouping select statement - mysql

I have a MySQL 8.0.23 table called events with the following schema:
eventUID - integer NOT NULL
camtimestamp - MySQL datetime stamp
direction - string - either "In" or "Out"
propUID - integer
In a single SELECT statement, I am trying to determine by hour for the last 24 hours how many cars are "In" and how many are "Out". Here is what I am trying (does not yet have the 24 hour limit built in).
select camtimestamp,count(*) from events where direction ="In" and propUID = 7 group by year(camtimestamp),month(camtimestamp),day(camtimestamp),hour(camtimestamp);
And this is a sample of what I am getting.
2022-02-14 22:02:40 38
2022-02-14 21:56:56 15
2022-02-14 20:55:30 47
2022-02-14 19:59:18 51
2022-02-14 18:59:50 36
2022-02-14 17:52:04 10
2022-02-14 16:58:01 16
2022-02-14 15:59:00 36
2022-02-14 14:58:52 44
I also have a table called datehourlist with which I can join in my SELECT.
Sample data:
2019-05-01 00:00:00
2019-05-01 01:00:00
2019-05-01 02:00:00
2019-05-01 03:00:00
2019-05-01 04:00:00
2019-05-01 05:00:00
2019-05-01 06:00:00
2019-05-01 07:00:00
2019-05-01 08:00:00
Also:
mysql> select min(datehour) from datehourlist;
+---------------------+
| min(datehour) |
+---------------------+
| 2019-05-01 00:00:00 |
+---------------------+
1 row in set (0.02 sec)
mysql> select max(datehour) from datehourlist;
+---------------------+
| max(datehour) |
+---------------------+
| 2040-12-31 00:00:00 |
+---------------------+
1 row in set (0.02 sec)
datehourlist has every hour in it from May 1, 2019 until December 31, 2040.
This is a sample of what I really want from this:
Column 1 below is a rounded grouped timestamp (vs col 1 above being a non-rounded, actual timestamp)
Column 1 below does not skip an hour if there is no data from that hour.
Column 2 below is the "In" count for that hour.
Column 3 below is the "Out" count for that hour.
2019-05-02 06:00:00 5 10
2019-05-02 07:00:00 127 10
2019-05-02 08:00:00 0 0
2019-05-02 09:00:00 115 10
2019-05-02 10:00:00 71 10
2019-05-02 11:00:00 147 10
2019-05-02 12:00:00 140 10
What SELECT statement should I use to get the output I need?
Also, how would I optimize that SELECT statement?
Within events, I have 500k events and growing by 100s everyday.
Thank you in advance for your help.
Thank you for a great solution so quickly.
SELECT dhl.datehour datehour,
COALESCE(SUM(ev.direction = 'In'), 0) `In`,
COALESCE(SUM(ev.direction = 'Out'), 0) `Out`
FROM datehourlist dhl
LEFT JOIN events ev
ON DATE_FORMAT(ev.camtimestamp, '%Y-%m-%d %H:00:00') = dhl.datehour
AND ev.propUID = 7
WHERE dhl.datehour >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND dhl.datehour < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
GROUP BY dhl.datehour;

First, we need a trunc_to_hour() function that takes an arbitrary DATETIME or TIMESTAMP value and gives back the beginning of its hour. That is this.
DATE_FORMAT(camtimestamp, '%Y-%m-%d %H:00:00')
Second, we need a WHERE expression that can handle the most recent 24 hours. That is this.
WHERE camtimestamp >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND camtimestamp < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
For the example timestamp 2021-03-14 16:04:30 this gives the following.
WHERE camtimestamp >= '2021-03-13 16:00:00'
AND camtimestamp < '2021-03-14 16:00:00'
That is, it chooses records for the most recent 24 complete clock hours. You may have to adjust this WHERE expression if you want the hours to date.
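As a quick check of those boundaries, the sketch below evaluates the same expressions for a fixed instant instead of NOW(). The user variable @now is only a stand-in introduced here for illustration. The second statement shows one possible "hours to date" variant, assuming that means 23 complete hours plus the current partial hour.

```sql
SET @now = '2021-03-14 16:04:30';  -- hypothetical stand-in for NOW()

-- Last 24 complete clock hours: start inclusive, end exclusive
SELECT DATE_FORMAT(@now, '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR AS range_start,
       DATE_FORMAT(@now, '%Y-%m-%d %H:00:00') AS range_end;
-- range_start = 2021-03-13 16:00:00, range_end = 2021-03-14 16:00:00

-- Hours to date (assumed meaning): 23 complete hours plus the current partial hour
SELECT DATE_FORMAT(@now, '%Y-%m-%d %H:00:00') - INTERVAL 23 HOUR AS range_start,
       @now AS range_end;
-- range_start = 2021-03-13 17:00:00, range_end = 2021-03-14 16:04:30
```

With the second variant, the upper bound in the WHERE clause would become camtimestamp <= NOW() rather than a comparison against the truncated hour.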
Third, we need conditional sums (for In and Out).
It happens that the expression direction = 'In' gives 1 when direction is In, 0 when direction is some other string (like Out), and NULL if direction itself is NULL. So
SUM(direction='In')
counts the rows meeting that criterion.
Fourth, when the SUM is null, we want to show zero. Like this.
COALESCE(SUM(direction='In'),0)
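To see both behaviours in isolation, here is a small sketch that runs against inline sample rows rather than the events table. The derived-table alias s and the sample values are made up for illustration.

```sql
-- Boolean expressions evaluate to 1, 0 or NULL, and SUM skips the NULLs
SELECT SUM(direction = 'In')  AS in_count,
       SUM(direction = 'Out') AS out_count
FROM (
    SELECT 'In' AS direction UNION ALL
    SELECT 'Out'             UNION ALL
    SELECT 'In'              UNION ALL
    SELECT NULL
) AS s;
-- in_count = 2, out_count = 1

-- With no rows at all, SUM returns NULL, which COALESCE turns into 0
SELECT SUM(direction = 'In')              AS raw_sum,
       COALESCE(SUM(direction = 'In'), 0) AS safe_sum
FROM (SELECT 'In' AS direction) AS s
WHERE 1 = 0;
-- raw_sum = NULL, safe_sum = 0
```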
Fifth, we can put it together something like this:
SELECT DATE_FORMAT(ev.camtimestamp, '%Y-%m-%d %H:00:00') datehour,
COALESCE(SUM(ev.direction = 'In'), 0) `In`,
COALESCE(SUM(ev.direction = 'Out'), 0) `Out`
FROM events ev
WHERE ev.camtimestamp >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND ev.camtimestamp < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
AND ev.propUID = 7
GROUP BY DATE_FORMAT(ev.camtimestamp, '%Y-%m-%d %H:00:00')
That gives you your result set. But it still might be missing some hours if there are no records for those hours.
So, sixth, we can join that to your pre-existing hourly calendar table like this:
SELECT dhl.datehour datehour,
COALESCE(SUM(ev.direction = 'In'), 0) `In`,
COALESCE(SUM(ev.direction = 'Out'), 0) `Out`
FROM datehourlist dhl
LEFT JOIN events ev
ON DATE_FORMAT(ev.camtimestamp, '%Y-%m-%d %H:00:00') = dhl.datehour
-- Event filters go in ON; in WHERE they would discard the empty hours the LEFT JOIN keeps.
AND ev.propUID = 7
-- The 24-hour window is applied to the calendar table, so every hour appears.
WHERE dhl.datehour >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
AND dhl.datehour < DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
GROUP BY dhl.datehour
And that should do it. (Not debugged.)
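On the optimization part of the question, one possible sketch is to aggregate events first, using a plain range condition that an index can serve, and only then join the small result to the calendar table. The index name idx_events_prop_time_dir and the aliases h, in_count and out_count are hypothetical, and whether the optimizer actually uses the index should be confirmed with EXPLAIN on the real data.

```sql
-- Hypothetical index: equality column first, then the range column,
-- then the counted column so the subquery can be answered from the index.
CREATE INDEX idx_events_prop_time_dir
    ON events (propUID, camtimestamp, direction);

SELECT dhl.datehour,
       COALESCE(h.in_count, 0)  AS `In`,
       COALESCE(h.out_count, 0) AS `Out`
FROM datehourlist dhl
LEFT JOIN (
    -- At most 24 rows: one per hour that has events for this property
    SELECT CAST(DATE_FORMAT(camtimestamp, '%Y-%m-%d %H:00:00') AS DATETIME) AS datehour,
           SUM(direction = 'In')  AS in_count,
           SUM(direction = 'Out') AS out_count
    FROM events
    WHERE propUID = 7
      AND camtimestamp >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
      AND camtimestamp <  DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
    GROUP BY datehour
) h ON h.datehour = dhl.datehour
-- The window is applied to the calendar table so empty hours are kept
WHERE dhl.datehour >= DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') - INTERVAL 24 HOUR
  AND dhl.datehour <  DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00')
ORDER BY dhl.datehour;
```

The camtimestamp column is compared directly, without wrapping it in a function, which is what lets a range scan on the index apply.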

Related

Select data between a given interval of time from two columns MySQL

I need to retrieve data from a table from a given interval of time. My table is like this -
| id | Start Time | End Time |
|----|------------|----------|
| 1 | 06:30:00 | 07:00:00 |
| 2 | 06:45:00 | 07:15:00 |
| 3 | 13:15:00 | 14:00:00 |
| 4 | 09:30:00 | 10:15:00 |
Given interval of time - (05:00:00 - 10:00:00)
My Expectation -
| id | Start Time | End Time |
|----|------------|----------|
| 1 | 06:30:00 | 07:00:00 |
| 2 | 06:45:00 | 07:15:00 |
| 4 | 09:30:00 | 10:15:00 |
I need id 4 to be returned as well, because its start time falls within the given interval.
So what will be the query?
so far I can imagine this -
To get all the values between the given interval of time - (05:00:00 - 10:00:00) use:
select *
from tbl
where Start_Time between '05:00:00' and '10:00:00'
or End_Time between '05:00:00' and '10:00:00';
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ba828afe1a3ca9f21d6c104ea7dcfdff
Using comparison operator
SELECT * FROM tbl WHERE
(`start_time`>= '05:00:00' AND `start_time` <= '10:00:00') OR
(`end_time`>= '05:00:00' AND `end_time` <= '10:00:00')
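If the goal is every row that overlaps the interval at all, a single overlap test is another option. This is a sketch under that assumption, run against the sample rows from the question through a MySQL 8 common table expression named tbl. Unlike the two queries above, it would also return a row such as 04:00:00 to 11:00:00 that spans the whole interval.

```sql
WITH tbl (id, Start_Time, End_Time) AS (
    SELECT 1, TIME '06:30:00', TIME '07:00:00' UNION ALL
    SELECT 2, TIME '06:45:00', TIME '07:15:00' UNION ALL
    SELECT 3, TIME '13:15:00', TIME '14:00:00' UNION ALL
    SELECT 4, TIME '09:30:00', TIME '10:15:00'
)
SELECT *
FROM tbl
-- Two ranges overlap when each one starts before the other ends
WHERE Start_Time <= '10:00:00'
  AND End_Time   >= '05:00:00';
-- returns ids 1, 2 and 4
```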

How to know the date interval covered by WEEK function

I want to add two more columns that show, respectively, the first and the last day of each grouped week.
Currently, I'm summarizing a query by week. When I run this query, it returns the following results:
SELECT
pc.date,
CONCAT(YEAR(pc.date), '/', WEEK(pc.date)) as year_week
FROM pc
GROUP BY CONCAT(YEAR(pc.date), '/', WEEK(pc.date))
ORDER BY pc.date
| date | year_week |
|------------|-----------|
| 2020-09-02 | 2020/35 |
| 2020-09-07 | 2020/36 |
| 2020-09-17 | 2020/37 |
| 2020-09-23 | 2020/38 |
| 2020-09-28 | 2020/39 |
| 2020-10-10 | 2020/40 |
| 2020-10-11 | 2020/41 |
| 2020-10-21 | 2020/42 |
| 2020-10-28 | 2020/43 |
How can I find the first and last day of grouped week?
You can use the WEEKDAY function. Demo:
select
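-- MOD keeps a Sunday date in its own week (WEEK() in its default mode starts weeks on Sunday)
date_add(dt, interval -MOD(WEEKDAY(dt) + 1, 7) day) FirstDayOfWeek,
date_add(date_add(dt, interval -MOD(WEEKDAY(dt) + 1, 7) day), interval 6 day) LastDayOfWeek,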
week(dt) wk
from (
select '2020-09-02' dt union all
select '2020-09-07' union all
select '2020-09-17'
) t
Returns
FirstDayOfWeek LastDayOfWeek wk
2020-08-30 2020-09-05 35
2020-09-06 2020-09-12 36
2020-09-13 2020-09-19 37
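Applied to the pc table from the question, the same idea might look like the sketch below. It uses DAYOFWEEK (1 = Sunday) in place of WEEKDAY, and it assumes that pc.date is a DATE column and that WEEK() runs in its default mode, where weeks start on Sunday. The aliases first_day and last_day are made up for illustration.

```sql
SELECT CONCAT(YEAR(pc.date), '/', WEEK(pc.date)) AS year_week,
       -- Step back to the Sunday that starts the week
       MIN(pc.date - INTERVAL (DAYOFWEEK(pc.date) - 1) DAY) AS first_day,
       -- Step forward to the Saturday that ends it
       MIN(pc.date + INTERVAL (7 - DAYOFWEEK(pc.date)) DAY) AS last_day
FROM pc
GROUP BY year_week
ORDER BY first_day;
```

Every date in one year/week group belongs to the same Sunday-to-Saturday week, so MIN only serves to satisfy the GROUP BY. A week that spans New Year is split into two groups (the last week of one year and week 0 of the next), and both report the same first and last day.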

mysql query for grabbing multiple date ranges

I seem to be having a bit of trouble coming up with a query to achieve what I want. I have a table like the following:
| Date(TIMESTAMP) | Count |
|---------------------|-------|
| 2016-02-01 01:00:00 | 52 |
| 2016-01-05 11:30:00 | 14 |
| 2016-02-01 04:20:00 | 36 |
| ... | ... |
The table has about 40,000 rows. What I would like to do is grab the totals for multiple date ranges so I end up with the following...
| Period | Total |
|------------|-------|
| All | 10245 |
| Past year | 1401 |
| Past month | 104 |
| Past week | 26 |
Currently I am running through a loop in my PHP script and doing an individual query for each date range I'm looking for. Actually there are about 10 queries I'm doing per loop to grab different stats but for the example I'm simplifying it. This takes forever and I am hoping there is a more elegant way to do this, however I've spent quite a bit of time now trying different things and researching and have gotten nowhere. I understand how to use CASE to group but not when a record may need to be in multiple bins. Any help?
Try this UNION query:
SELECT 'All', COUNT(*) AS Total FROM yourTable
UNION
SELECT 'Past year', COUNT(*) AS Total
FROM yourTable
WHERE DATE(TIMESTAMP) > DATE_ADD(NOW(), INTERVAL -1 YEAR)
UNION
SELECT 'Past month', COUNT(*) AS Total
FROM yourTable
WHERE DATE(TIMESTAMP) > DATE_ADD(NOW(), INTERVAL -1 MONTH)
UNION
SELECT 'Past week', COUNT(*) AS Total
FROM yourTable
WHERE DATE(TIMESTAMP) > DATE_ADD(NOW(), INTERVAL -1 WEEK)
First, get familiar with the functions that return the first day of the year, of the month, and of the week.
Then compose your SQL using count, filtering between the first and last date of each period.
ref:
MySQL Select First Day of Year and Month
month: https://stackoverflow.com/a/19259159/1258492
week: https://stackoverflow.com/a/11831133/1258492
select 'All' as period, count(1) from
tbl
union
select 'Past Year' as period, count(1) from
tbl
where timestamp between
MAKEDATE(year(now())-1,1) and
last_day(MAKEDATE(year(now())-1,1) + interval 11 month)
union
select 'Past Month' as period, count(1) from
tbl
where timestamp between
LAST_DAY(NOW() - INTERVAL 2 MONTH) + INTERVAL 1 DAY and
LAST_DAY(NOW() - INTERVAL 1 MONTH)
union
select 'Past Week' as period, count(1) from
tbl
where timestamp between
adddate(curdate(), INTERVAL 1-DAYOFWEEK(curdate())-7 DAY) and
adddate(curdate(), INTERVAL 7-DAYOFWEEK(curdate())-7 DAY) ;
You may use subqueries. Use one subquery per time breakdown like so:
SELECT everything, `past year`
FROM
(
SELECT sum(c) AS everything
FROM reports
) t1,
(
SELECT sum(c) AS `past year`
FROM reports
WHERE d >= DATE_ADD(CURDATE(), INTERVAL -1 YEAR)
) t2
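Since one record may belong to several bins, another option is conditional aggregation, which reads the table once and returns every total as a column of a single row. This is a sketch that reuses the placeholder names from the subquery example (table reports, value column c, date column d). The output aliases are made up, and replacing c with 1 would count rows instead of summing values.

```sql
SELECT SUM(c) AS everything,
       -- Each CASE contributes the value only when the row falls inside that period
       SUM(CASE WHEN d >= CURDATE() - INTERVAL 1 YEAR  THEN c ELSE 0 END) AS past_year,
       SUM(CASE WHEN d >= CURDATE() - INTERVAL 1 MONTH THEN c ELSE 0 END) AS past_month,
       SUM(CASE WHEN d >= CURDATE() - INTERVAL 1 WEEK  THEN c ELSE 0 END) AS past_week
FROM reports;
```

A row from yesterday is added to all four sums, which is the "multiple bins" behaviour that a GROUP BY on a CASE expression cannot give.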

Select data between 2 dates and average hourly output

I have a series of temperature data which is gathered every minute and put into a MySQL database. I want to write a query to select all temperatures for the last 24 hours, then group and average them by hour. Is this possible via a SQL command?
I think it's a case of selecting all records between the two date/times, but that's the point where I get stuck.
Data Example:
ID Temp Timestamp
3922 22 2015-11-17 14:12:23
3923 22 2015-11-17 14:13:23
3924 22.05 2015-11-17 14:14:23
3925 22.05 2015-11-17 14:15:23
Needed output / Result
Temp Time
22 2015-11-17 14:00:00
23 2015-11-17 15:00:00
23 2015-11-17 16:00:00
22.05 2015-11-17 17:00:00
I hope you can help as I am totally lost with SQL commands.
Try this query
SELECT
AVG(Temp) AS Temp,
DATE_FORMAT(Timestamp, '%Y-%m-%d %H:00:00') AS Time
FROM Table
WHERE DATE_FORMAT(Timestamp, '%Y-%m-%d %H:00:00') >= DATE_FORMAT(DATE_ADD(NOW(), INTERVAL -1 DAY), '%Y-%m-%d %H:00:00')
GROUP BY DATE_FORMAT(Timestamp, '%Y-%m-%d %H:00:00')
Yes, this is quite feasible in SQL. It is easiest to get the hour out as:
select hour(timestamp), avg(temp)
from t
where timestamp >= date_sub(now(), interval 1 day)
group by hour(timestamp);
This isn't perfect, because the first and last hour boundary is from a different day. So, this is more like the output you want:
select date_add(date(timestamp), interval hour(timestamp) hour) as `time`,
avg(temp)
from t
where timestamp >= date_sub(now(), interval 1 day)
group by date_add(date(timestamp), interval hour(timestamp) hour);

I have an sql query for finding a nearest free time after specific date. How can I make it to also look for a date BEFORE this date?

This is a follow up to my previous question A query for finding the nearest free time slot in mysql - why it doesn't work?
Basically I have a table:
id | start_time | duration
1 | 2015-10-21 19:41:35 | 15
2 | 2015-10-21 19:41:50 | 15
3 | 2015-10-21 19:42:05 | 15
4 | 2015-10-21 19:42:35 | 15
etc.
and it contains the event start_time and its duration. I asked for help with finding the nearest time slot in which the event can be placed between the existing events. @Richard came up with a perfect answer https://stackoverflow.com/a/33689786/3766930 and suggested a query:
SELECT (a.start_time + INTERVAL a.duration SECOND) AS free_after FROM notes a
WHERE
NOT EXISTS ( SELECT 1 FROM notes b WHERE b.start_time
BETWEEN (a.start_time + INTERVAL a.duration SECOND) AND
(a.start_time + INTERVAL a.duration SECOND) + INTERVAL 15 SECOND - INTERVAL 1 MICROSECOND) AND
(a.start_time + INTERVAL a.duration SECOND) BETWEEN '2015-10-21 19:41:30' AND '2015-10-21 19:43:50'
which works great.
Now I was wondering if there's a possibility of finding a most suitable date not only between the existing dates, but also right before them.
For example: I would set a begin_date as 2015-10-21 16:00:00 and end_date as 2015-10-21 21:00:00. Currently the result of @Richard's query would be 2015-10-21 19:42:20. But is there a way of creating a query that in this result will return 2015-10-21 19:41:20 as the closest one to the first date that is already in database?
A simple solution for this would be to use date_sub with an order by statement, limiting the results to show only 1 record.
This is the result:
SELECT date_sub(start_time, interval duration second) as free_before FROM `notes` where start_time>'2015-10-21 16:00:00' order by start_time asc limit 1
Bonus for you
Using the previous solution @Richard provided. Putting it all together to show all free times in 1 table could result in the following:
select * from (SELECT date_sub(start_time, interval duration second) as free_times FROM `notes` where start_time>'2015-10-21 16:00:00' order by start_time asc limit 1) a
union
(SELECT (a.start_time + INTERVAL a.duration SECOND) AS free_times FROM notes a
WHERE
NOT EXISTS ( SELECT 1 FROM notes b WHERE b.start_time
BETWEEN (a.start_time + INTERVAL a.duration SECOND) AND
(a.start_time + INTERVAL a.duration SECOND) + INTERVAL 15 SECOND - INTERVAL 1 MICROSECOND) AND
(a.start_time + INTERVAL a.duration SECOND) BETWEEN '2015-10-21 19:41:30' AND '2015-10-21 19:43:50')
Edit:
I will only write my part of the query. The other part was working correctly and only if you want me to change it I will (never fix something that ain't broke)
If you want interval of 10 seconds ->
SELECT date_sub(start_time, interval 10 second) as free_times FROM `notes` where start_time>'2015-10-21 16:00:00' order by start_time asc limit 1
If you want interval of 15 seconds ->
SELECT date_sub(start_time, interval 15 second) as free_times FROM `notes` where start_time>'2015-10-21 16:00:00' order by start_time asc limit 1
In this case you'll have to change your start_time and the duration accordingly.
Would this work for you?
Given:
select * from notes;
+----+---------------------+----------+
| id | start_time | duration |
+----+---------------------+----------+
| 1 | 2015-11-17 10:10:10 | 15 |
| 2 | 2015-11-17 10:20:40 | 15 |
| 3 | 2015-11-17 10:30:00 | 15 |
+----+---------------------+----------+
This result:
select (start_time - interval 15 second) as earlier_date
from notes
where start_time > '2015-11-17 10:15:00'
AND start_time < '2015-11-17 10:25:00'
order by start_time
limit 1;
+---------------------+
| earlier_date |
+---------------------+
| 2015-11-17 10:20:25 |
+---------------------+
Important: This sample doesn't pay any attention to entries that might fall immediately in front of the search window (because your example didn't include any). This query as-is will create overlaps if there are impinging entries just prior to the search window.
Take your base table and add a fake row whose start time is the minimum start time minus 15 seconds.
So instead of notes, use a subquery like this:
SELECT
MIN(start_time) - INTERVAL 15 SECOND AS start_time,
0 AS duration
FROM notes
UNION ALL
SELECT start_time, duration
FROM notes
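Plugged into the slot-finding query from the question, that fake row could be used as in the sketch below. It assumes the 15-second slot length and the 16:00:00 to 21:00:00 search window from the question. The inner NOT EXISTS still reads the real notes table, because the fake row itself must not count as an occupied slot.

```sql
SELECT (a.start_time + INTERVAL a.duration SECOND) AS free_after
FROM (
    -- Fake zero-length event 15 seconds before the earliest real one
    SELECT MIN(start_time) - INTERVAL 15 SECOND AS start_time,
           0 AS duration
    FROM notes
    UNION ALL
    SELECT start_time, duration
    FROM notes
) a
WHERE NOT EXISTS (
        -- No real event may start inside the candidate 15-second slot
        SELECT 1
        FROM notes b
        WHERE b.start_time BETWEEN (a.start_time + INTERVAL a.duration SECOND)
                               AND (a.start_time + INTERVAL a.duration SECOND)
                                   + INTERVAL 15 SECOND - INTERVAL 1 MICROSECOND
      )
  AND (a.start_time + INTERVAL a.duration SECOND)
      BETWEEN '2015-10-21 16:00:00' AND '2015-10-21 21:00:00'
ORDER BY free_after
LIMIT 1;
```

For the four sample rows shown in the question, the earliest candidate is the fake row's slot at 2015-10-21 19:41:20, which is the value the question asks for.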