Looking for a pattern of temperatures
I'm trying to produce a list of results from a simple table that records temperature and time every 5 minutes. The table has only two columns, 'temp' and 'ttime'. The time is recorded as a MySQL timestamp.
What I need to do is check for any patterns where the temperature goes over 40 for more than two hours within a 24 hour period, using just the data from the same table. There are thousands of rows of data.
Quick sample of data:
temp  ttime
35    2022-08-14 12:05:00
40    2022-08-14 12:10:00
41    2022-08-14 12:15:00
37    2022-08-14 12:20:00
Not sure how to even start something like this.
I'd try something like
SELECT * FROM data d1
WHERE d1.temp >= 40
-- every reading in the preceding two hours must also be at least 40
AND 40 <= ALL (
SELECT d2.temp FROM data d2
WHERE d2.ttime BETWEEN
date_add(d1.ttime, INTERVAL -2 HOUR)
AND d1.ttime
);
to get every record that ends a two-hour stretch in which the temperature never drops below 40 degrees (can't verify at the moment)
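The same check can also be done outside SQL. Here is a minimal Python sketch, assuming the rows are fetched ordered by ttime and that "over 40 for more than two hours" means one unbroken run of readings above 40. The function name hot_spells and the sample data are made up for illustration, and gaps in the data are not handled:

```python
from datetime import datetime, timedelta


def hot_spells(readings, threshold=40, min_duration=timedelta(hours=2)):
    """Return (start, end) pairs for runs of consecutive readings above
    `threshold` that last longer than `min_duration`.

    `readings` is an iterable of (ttime, temp) tuples sorted by ttime.
    """
    spells = []
    start = last = None
    for ttime, temp in readings:
        if temp > threshold:
            if start is None:
                start = ttime  # a new run begins here
            last = ttime
        else:
            if start is not None and last - start > min_duration:
                spells.append((start, last))
            start = last = None
    # Close a run that is still open at the end of the data.
    if start is not None and last - start > min_duration:
        spells.append((start, last))
    return spells


# Hypothetical data: one reading every 5 minutes from 12:00 to 16:55,
# above 40 between 13:00 and 15:30.
base = datetime(2022, 8, 14, 12, 0)
hot_from = datetime(2022, 8, 14, 13, 0)
hot_until = datetime(2022, 8, 14, 15, 30)
readings = []
for i in range(60):
    t = base + timedelta(minutes=5 * i)
    readings.append((t, 42 if hot_from <= t <= hot_until else 35))

spells = hot_spells(readings)
```

Each returned pair is the first and last reading of a qualifying run, so restricting the result to a given 24 hour period is a matter of filtering on those two timestamps.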
Related
The problem is as follows. I have a table that contains the following columns:
machine  timestamp         speed (meters/minute)
C1       22/9/2020, 16:45  15
C1       22/9/2020, 16:55  5
C1       22/9/2020, 17:20  19
What I want to know is the distance travelled in each hour for each machine, so I need to subtract the timestamp of the current row from the timestamp of the next row, and then multiply by the speed of the first row (e.g: 16:55 - 16:45 = 10 minutes -> 10 * 15 = 150 meters between 16:45 and 16:55).
I was able to do that by using a logic similar to the one below (it is not all the same query):
-- 1st: get the timestamp of the next row
lead(query."timestamp") OVER (PARTITION BY query.id ORDER BY query."timestamp") AS lead_timestamp
-- 2nd: get the duration
query."lead_timestamp" - query."timestamp" AS duration
-- 3rd: calculate the distance
query."duration" * query."speed" AS distance
-- 4th: group by hour
GROUP BY date_trunc('hour', CAST(query."timestamp" AS timestamp))
It works almost 100% fine. I get a table similar to the one below:
machine  timestamp         distance (meters)
C1       22/9/2020, 16:00  275
C1       22/9/2020, 17:00  ...
But as you can see, because I group the data per hour, the total for the 16:00 hour is not correct: there is no row between "22/9/2020, 16:55" and 17:00 that would force the interval to end at "22/9/2020, 16:59", so the whole interval from 16:55 to 17:20 is counted in the 16:00 hour. In the end, the 20 minutes that belong to the 17:00 hour were added to the 16:00 hour.
I am not sure how to solve this problem but I have looked into UNION to add an 'artificial' row whenever there's a transition of hour between timestamps, before even starting subtracting values to calculate the duration. But it seems rather complicated since I would have to do it for each machine and I don't know how many rows it would be.
Can you help me? Thanks! If I was not clear pls ask for more info!
This could be of help:
WITH cte as (
select 'c1' as machine, '2020-09-22 16:46' as timestamp, 15 as speed
union all
select 'c1', '2020-09-22 16:56', 5
union all
select 'c1', '2020-09-22 17:20', 19)
select
machine,
timestamp,
speed,
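lead(timestamp) over (partition by machine order by timestamp) nextTime,
TIMEDIFF( lead(timestamp) over (partition by machine order by timestamp), timestamp) diffTime,
-- TIMESTAMPDIFF returns the whole gap in minutes, also when it is an hour or longer
TIMESTAMPDIFF(MINUTE, timestamp, lead(timestamp) over (partition by machine order by timestamp))*speed meters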
from cte;
output:
machine  timestamp         speed  nextTime          diffTime         meters
c1       2020-09-22 16:46  15     2020-09-22 16:56  00:10:00.000000  150
c1       2020-09-22 16:56  5      2020-09-22 17:20  00:24:00.000000  120
c1       2020-09-22 17:20  19
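Note that the hour-boundary problem from the question is a separate step: an interval such as 16:55 to 17:20 has to be cut at 17:00 before grouping by hour. Here is a rough Python sketch of that splitting step, assuming one machine's rows sorted by time and speed in meters per minute. distance_per_hour is a hypothetical helper, not part of either query:

```python
from collections import defaultdict
from datetime import datetime, timedelta


def distance_per_hour(samples):
    """samples: list of (timestamp, speed in m/min) for one machine,
    sorted by timestamp. Returns {hour_start: meters}, with every
    interval split at the hour boundaries it crosses."""
    totals = defaultdict(float)
    for (start, speed), (end, _next_speed) in zip(samples, samples[1:]):
        cursor = start
        while cursor < end:
            hour_start = cursor.replace(minute=0, second=0, microsecond=0)
            # The slice ends at the next full hour or at the next sample,
            # whichever comes first.
            slice_end = min(end, hour_start + timedelta(hours=1))
            minutes = (slice_end - cursor).total_seconds() / 60
            totals[hour_start] += minutes * speed
            cursor = slice_end
    return dict(totals)


# Sample rows from the question.
samples = [
    (datetime(2020, 9, 22, 16, 45), 15),
    (datetime(2020, 9, 22, 16, 55), 5),
    (datetime(2020, 9, 22, 17, 20), 19),
]
per_hour = distance_per_hour(samples)
```

With the question's sample rows this gives 175 meters for the 16:00 hour and 100 meters for the 17:00 hour, which is still 275 meters in total.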
I'm measuring different kinds of events daily and get records that look like this:
id measurement_date value
111 2020-12-01 21:30:00 100
111 2020-12-02 22:00:12 110
111 2020-12-03 21:35:17 80
114 2020-12-02 21:47:56 780
114 2020-12-04 21:55:47 700
....
Then I have a query that transforms the data to get the difference between 2 measurements.
I run this transform on different windows of time (1 day, 7 days, 1 month).
For 1 day it is quite straightforward: either I have a measurement, or, if it is missing, I have no intermediary data to compensate with and therefore place a 0.
Here is the query I use:
SELECT ft.id,
(ft.value - ft2.value) as progression
FROM feed_table ft
JOIN feed_table ft2 on ft2.id = ft.id
AND date_format(ft2.measurement_date, '%Y-%m-%d') = date_format(date_sub(CURDATE(), interval 1 day), '%Y-%m-%d')
WHERE date_format(ft.measurement_date, '%Y-%m-%d') = date_format(CURDATE(), '%Y-%m-%d')
However for longer windows, like 7 days, for example, I would like to make use of the intermediary data if they exist.
Let's say I am measuring on the 7 days window between today 2020-12-08 and 7 days before 2020-12-01, but I only have the following measurements which are neither today nor 7 days ago but are still inside the 7 days window:
id measurement_date value
111 2020-12-02 21:30:00 200
111 2020-12-06 21:30:00 300
Then the query above with a 7D interval and the right settings should return :
id progression
111 100
(max date value - min date value in the 7 days window)
I was thinking of aggregating by user_id and using the min and max dates in the HAVING clause, but then my self-join wouldn't work anymore...
Any idea?
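To make the intended result concrete, here is a small Python sketch of the "latest value minus earliest value inside the window" logic. It assumes rows of (id, measurement_date, value) and a window that ends at the end of 2020-12-08; the function name progression is illustrative only:

```python
from datetime import datetime, timedelta


def progression(rows, window_end, days=7):
    """rows: iterable of (id, measurement_date, value).
    Returns {id: latest value - earliest value} for the measurements
    that fall inside the window of `days` days ending at `window_end`."""
    window_start = window_end - timedelta(days=days)
    first, last = {}, {}
    for id_, when, value in rows:
        if not (window_start <= when <= window_end):
            continue
        if id_ not in first or when < first[id_][0]:
            first[id_] = (when, value)
        if id_ not in last or when > last[id_][0]:
            last[id_] = (when, value)
    return {id_: last[id_][1] - first[id_][1] for id_ in first}


# The two measurements from the example above.
rows = [
    (111, datetime(2020, 12, 2, 21, 30), 200),
    (111, datetime(2020, 12, 6, 21, 30), 300),
]
result = progression(rows, window_end=datetime(2020, 12, 8, 23, 59, 59))
```

An id with a single measurement in the window comes out as 0, which matches what the 1 day case does when data is missing.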
How can I select all rows from a table where a date column is within a specific range of dates, at a given period (e.g. every 14 days)?
The table has a date column with most every date represented, possibly multiple times. The range is defined by a start date and an end date. The period is a number of days. For example:
Start: 2016-01-01 (friday)
End: 2016-12-31 (saturday)
period: 14 (days)
For the above, the query should return rows for every other Friday in 2016. That is, it should return the rows for the following dates:
2016-01-01
2016-01-15
2016-01-29
2016-02-12
2016-02-26
2016-03-11
2016-03-25
2016-04-08
2016-04-22
2016-05-06
2016-05-20
2016-06-03
2016-06-17
2016-07-01
2016-07-15
2016-07-29
2016-08-12
2016-08-26
2016-09-09
2016-09-23
2016-10-07
2016-10-21
2016-11-04
2016-11-18
2016-12-02
2016-12-16
2016-12-30
Currently, this is done in a stored procedure where a loop fills a temp table with the target dates, which is later joined on. However, I am trying to rewrite this code to step away from stored procedures.
What would be the best way to get the desired rows without using the stored procedure & a temp table? Keep in mind that (one of) the table(s) is quite large at around 1M records indexed on date, so any calculated values might impact the performance severely.
Alternatively, I could calculate all dates in the interval in PHP/RoR and use a massive IN clause, but hopefully there is a better solution.
Try this:
table_name1 is your table
date1 the date field
"2022-01-02" the start (it appears twice in the query)
"2022-01-10" the end
3 the interval
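SELECT date1
FROM table_name1
WHERE date1 BETWEEN "2022-01-02" AND "2022-01-10"
AND DATEDIFF(date1, "2022-01-02") % 3 = 0;
Use DATEDIFF rather than subtracting the dates directly: plain subtraction treats the dates as YYYYMMDD numbers, which gives wrong day counts across month boundaries. DATEDIFF is available in MySQL 5.6.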
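The modulo idea can be sanity-checked outside the database. Here is a short Python sketch using the 2016 example from the question; on_period is a hypothetical helper name:

```python
from datetime import date


def on_period(d, start, end, period_days):
    """True if d lies in [start, end] and is a whole number of
    periods after start."""
    return start <= d <= end and (d - start).days % period_days == 0


start, end = date(2016, 1, 1), date(2016, 12, 31)

# Walk over every day in the range and keep those that fall on the period.
matches = [
    date.fromordinal(n)
    for n in range(start.toordinal(), end.toordinal() + 1)
    if on_period(date.fromordinal(n), start, end, 14)
]
```

For the 2016 range this yields the 27 dates listed in the question, from 2016-01-01 to 2016-12-30.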
I have statistical data like this:
time val1
1424166578 51
1424166877 55
1424167178 57
1424167477 57
time is a Unix timestamp. There is one record every 5 minutes, excluding nights and Sundays. This continues over several weeks.
Now I want to get these values for an average day and an average week. The result should include values for every 5 minutes like normal but for average past days or weeks.
The result should look like this:
time val1
0 43.423
300 46.635
600 51.887
...
So time could be a timestamp with relative time since day or week start. Perhaps it is better to use DATETIME... not sure.
If I use GROUP BY FROM_UNIXTIME(time, '%Y%m%d') for example I get one value for the whole day. But I want all average values for all days.
You seem to be interested in grouping by five-minute time-of-day intervals instead of by date. This is fairly straightforward:
SELECT
HOUR(FROM_UNIXTIME(time)) AS HH,
(MINUTE(FROM_UNIXTIME(time)) DIV 5) * 5 AS MM,
AVG(val1) AS VAL
FROM your_table
WHERE time > UNIX_TIMESTAMP(CURRENT_TIMESTAMP - INTERVAL 7 DAY)
GROUP BY HH, MM
The following result shows how each time is clamped to its five-minute bucket:
time FROM_UNIXTIME(time) HH MM
1424166578 2015-02-17 14:49:38 14 45
1424166877 2015-02-17 14:54:37 14 50
1424167178 2015-02-17 14:59:38 14 55
1424167477 2015-02-17 15:04:37 15 00
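The same bucketing can be sketched in Python to produce the "seconds since day start" layout the question asks for. This is only an illustration: average_day is a made-up name, and the buckets are computed on UTC days, whereas the table above shows the server's local time:

```python
from collections import defaultdict


def average_day(rows, bucket_seconds=300):
    """rows: iterable of (unix_time, val1).
    Returns {seconds since midnight UTC, rounded down to the bucket:
    mean of val1 over all days}."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket -> [running total, count]
    for ts, val in rows:
        seconds_into_day = ts % 86400  # local time would need an offset here
        bucket = seconds_into_day - seconds_into_day % bucket_seconds
        sums[bucket][0] += val
        sums[bucket][1] += 1
    return {b: total / n for b, (total, n) in sorted(sums.items())}


# Two readings from the question plus a hypothetical reading one day later.
rows = [
    (1424166578, 51),
    (1424166877, 55),
    (1424166578 + 86400, 55),
]
averages = average_day(rows)
```

For an average week instead of an average day, the same approach works with a seven-day modulus, as long as the offset of the week start is accounted for.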
I would approach this as:
select date(from_unixtime(time)) as day, avg(val1) as avg_val1
from your_table t
group by date(from_unixtime(time))
order by day;
Although you can use the format argument, I think of that more for converting the value to a string than to a date/time.
I am creating a REST API for a booking calendar, and right now I am trying to figure out the most efficient way of writing a query that returns all timestamps between two dates at a 15 minute interval. If I supply 2013-09-21 and 2013-09-22 I would like to get:
2013-09-21 00:15:00
2013-09-21 00:30:00
2013-09-21 00:45:00
2013-09-21 01:00:00
2013-09-21 01:15:00
2013-09-21 01:30:00
...
2013-09-22 23:15:00
2013-09-22 23:30:00
2013-09-22 23:45:00
I would then use this query as a subquery and apply some conditions on it to remove timeslots outside working hours (which are not constant), booked timeslots, etc.
I have seen a lot of blog posts where the author creates a "calendar table" which stores all these timestamps, but that seems like a waste to me since that data doesn't need to be stored.
Any suggestions on how I could do this or a better way to fetch/store the data?
Here is a process that generates 95 rows, incrementing a date variable as it goes, and then left joins this "solid" table of generated dates to the table with the dated entries.
select str_to_date('2010-01-01', '%Y-%m-%d') into @ad;
select * from
(select (@ad := date_add(@ad, INTERVAL 15 MINUTE)) as solid_date from wp_posts limit 95) solid
left join
wp_posts
on solid.solid_date = post_date
I've no idea how to generate an arbitrary number of rows in MySQL, so I'm just selecting from a table with more than 95 rows (24 hours * 4 appointments per hour, less one at midnight) -- my WordPress posts table. There is nothing stopping you from making just such a table, with a single column holding a single incrementing integer, if there are no better ways to do it (I'm an Oracle guru, not a MySQL one). Maybe there isn't one: How do I make a row generator in MySQL?
Where you see wp_posts, substitute the name of your appointments table. Where you see the date, substitute your start date.
The query above produces a list of dates starting 15 minutes after midnight on the chosen day (in my example 2010-01-01).
You can add a WHERE appointments.primary_key_column_here IS NULL if you want to find free slots only.
Note that you didn't include midnight in your spec. If you want midnight on the first day, start the date in the variable 15 minutes earlier and limit yourself to 96 rows. If you want midnight on the end day, just limit yourself to 96 rows.
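If generating the slots in application code is an option, the row generator is not needed at all. Here is a minimal Python sketch, assuming the end of the range is exclusive and that midnight of the first day is wanted; time_slots is a hypothetical helper name:

```python
from datetime import datetime, timedelta


def time_slots(start, end, step=timedelta(minutes=15)):
    """Yield every slot from start (inclusive) to end (exclusive)."""
    current = start
    while current < end:
        yield current
        current += step


# Both days from the question: 2013-09-21 00:00 up to 2013-09-22 23:45.
slots = list(time_slots(datetime(2013, 9, 21), datetime(2013, 9, 23)))
```

To skip midnight on the first day, as in the list in the question, start the range at 00:15 instead.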