MariaDB version 10.4.10.
I have a stock scraper script that fetches stock data every hour and inserts it into a MySQL database. I want a way to get price difference for each stock between, for example:
stocks fetched at 2020-03-25 07:00 and 2020-03-25 19:00 (12 hours)
stocks fetched at 2020-03-25 07:00 and 2020-03-26 07:00 (24 hours)
stocks fetched at 2020-03-25 08:00 and 2020-03-25 20:00 (12 hours)
stocks fetched at 2020-03-25 08:00 and 2020-03-26 08:00 (24 hours)
etc
The database structure looks something like this:
stocks( time_fetched DATETIME, name VARCHAR, price INT )
Some sample data:
**time_fetched name price**
2020-03-25 07:00:00 stock_A 10
2020-03-25 07:00:00 stock_B 14
2020-03-25 08:00:00 stock_A 12
2020-03-25 08:00:00 stock_B 20
...
2020-03-25 19:00:00 stock_A 28
2020-03-25 19:00:00 stock_B 32
2020-03-25 20:00:00 stock_A 40
2020-03-25 20:00:00 stock_B 36
...
2020-03-26 07:00:00 stock_A 12
2020-03-26 07:00:00 stock_B 16
2020-03-26 08:00:00 stock_A 18
2020-03-26 08:00:00 stock_B 16
Expected result:
**time_fetched name current_price price_12h_ago price_24h_ago**
2020-03-25 19:00:00 stock_A 28 10 NULL
2020-03-25 19:00:00 stock_B 32 14 NULL
2020-03-25 20:00:00 stock_A 40 12 NULL
2020-03-25 20:00:00 stock_B 36 20 NULL
2020-03-26 07:00:00 stock_A 12 28 10
2020-03-26 07:00:00 stock_B 16 32 14
2020-03-26 08:00:00 stock_A 18 40 12
2020-03-26 08:00:00 stock_B 16 36 20
Currently I am using SQL similar to this:
WITH prices AS (
SELECT time_fetched, name, price,
LAG(price, 12) OVER(PARTITION BY name ORDER BY time_fetched) AS price_12h_ago,
LAG(price, 24) OVER(PARTITION BY name ORDER BY time_fetched) AS price_24h_ago
FROM stocks
)
SELECT time_fetched, name, price AS current_price, price_12h_ago, price_24h_ago
FROM prices
This works as long as every stock has price data for every hour. In reality, there are sometimes gaps between hours, and price data for some hours and some stocks is missing from the stocks table.
This means that the above code, which fetches the price from 12 rows before the current one, does not always return the price from 12 hours before the current row.
So I need a way to get the price difference based on the actual time difference.
Hope this makes any sense to anyone out there :)
You can use a RANGE frame in the window clause. If your times are precise:
SELECT time_fetched, name, price,
MIN(price) OVER (PARTITION BY name
ORDER BY time_fetched
RANGE BETWEEN INTERVAL 12 HOUR PRECEDING AND INTERVAL 12 HOUR PRECEDING
) as price_12h_ago,
MIN(price) OVER (PARTITION BY name
ORDER BY time_fetched
RANGE BETWEEN INTERVAL 24 HOUR PRECEDING AND INTERVAL 24 HOUR PRECEDING
) as price_24h_ago
FROM stocks;
Unless the minutes and seconds of time_fetched are always exactly 0, you may want a broader range than just an instant. For instance:
SELECT time_fetched, name, price,
MIN(price) OVER (PARTITION BY name
ORDER BY time_fetched
RANGE BETWEEN INTERVAL '12:05' HOUR_MINUTE PRECEDING AND INTERVAL '11:55' HOUR_MINUTE PRECEDING
) as price_12h_ago,
MIN(price) OVER (PARTITION BY name
ORDER BY time_fetched
RANGE BETWEEN INTERVAL '24:05' HOUR_MINUTE PRECEDING AND INTERVAL '23:55' HOUR_MINUTE PRECEDING
) as price_24h_ago
FROM stocks;
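Where a RANGE frame with time intervals is not available, the same lookup can be written as a self-join on the exact timestamp. The following is only a sketch under assumptions: it runs the join through Python's sqlite3 module so it can be executed anywhere, the function name prices_with_history is made up for illustration, and it assumes time_fetched always falls exactly on the hour. In MySQL or MariaDB the join condition would presumably be written as `h12.time_fetched = s.time_fetched - INTERVAL 12 HOUR` instead of SQLite's datetime() modifier.

```python
import sqlite3


def prices_with_history(rows):
    """rows: iterable of (time_fetched, name, price), timestamps as 'YYYY-MM-DD HH:MM:SS'.

    Returns (time_fetched, name, current_price, price_12h_ago, price_24h_ago) tuples.
    A missing row 12 or 24 hours earlier yields None rather than a shifted value.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE stocks (time_fetched TEXT, name TEXT, price INTEGER)")
    con.executemany("INSERT INTO stocks VALUES (?, ?, ?)", rows)
    query = """
        SELECT s.time_fetched, s.name, s.price AS current_price,
               h12.price AS price_12h_ago,
               h24.price AS price_24h_ago
        FROM stocks s
        -- join on the actual timestamp, not on a row offset
        LEFT JOIN stocks h12
               ON h12.name = s.name
              AND h12.time_fetched = datetime(s.time_fetched, '-12 hours')
        LEFT JOIN stocks h24
               ON h24.name = s.name
              AND h24.time_fetched = datetime(s.time_fetched, '-24 hours')
        ORDER BY s.time_fetched, s.name
    """
    result = con.execute(query).fetchall()
    con.close()
    return result
```

Because the join matches on the timestamp itself, a gap in the hourly data produces NULL for that lookup instead of silently returning the price from a different hour.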
MySQL version 8.0
I want to calculate the time difference between two datetime columns
and get the rows where duration >= 12:00:00,
which I would normally do like this:
select id
, start_time
, end_time
, timediff(end_time, start_time) as duration
from table;
which gives something like this:
id start_time end_time duration
0 1 2020-06-01 01:00:00 2020-06-01 14:00:00 13:00:00
1 2 2020-06-01 01:00:00 2020-06-01 18:00:00 17:00:00
2 3 2020-06-01 19:00:00 2020-06-02 10:00:00 15:00:00
3 4 2020-06-02 04:00:00 2020-06-02 16:00:00 12:00:00
For the duration column I don't want the time between 00:00:00 and 04:00:00 to count towards the duration. So for the first row duration = 10:00:00: 01:00:00~14:00:00 is 13 hours, minus the 3 hours that fall between 01:00:00 and 04:00:00.
The same goes for the second row: we subtract 3 hours from the duration.
So my desired output would be:
id start_time end_time duration
0 1 2020-06-01 01:00:00 2020-06-01 14:00:00 10:00:00
1 2 2020-06-01 01:00:00 2020-06-01 18:00:00 14:00:00
2 3 2020-06-01 19:00:00 2020-06-02 10:00:00 11:00:00
3 4 2020-06-02 04:00:00 2020-06-02 16:00:00 12:00:00
There are lots of rows where times include minutes and seconds too.
Thanks in advance!
I've grabbed all rows where duration >= 12:00:00.
Then separated data into 4 regions depending on their start_time.
a_region = 00~04
b_region = 04~12
c_region = 12~16
d_region = 16~24
For a_region, the part of the duration that falls inside 00~04 is 04:00:00 - start_time, which is the time to subtract from the duration:
compensation = 04:00:00 - start_time
compensated_time = duration - compensation.
b_region needs no compensation: a row that starts there only reaches 00~04 after it has already accumulated a duration of 12:00:00.
For c_region,
compensation = 16:00:00 - start_time
compensated_time = duration - compensation
For d_region since we've grabbed duration >= 12:00:00
it will pass all of 00~04 therefore
compensated_time = duration - 04:00:00.
I solved it using Python but above is the logic I've used.
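As an illustration only (this is not the code referred to above), here is a minimal Python sketch that computes the excluded time directly as the overlap between the row's interval and the 00:00~04:00 window of each day it touches, instead of going through the four regions. The function name duration_excluding_window and its parameters are made up for this example.

```python
from datetime import datetime, timedelta


def duration_excluding_window(start, end, skip_from=0, skip_to=4):
    """Return end - start, minus any time inside [skip_from, skip_to) o'clock of each day."""
    total = end - start
    excluded = timedelta(0)
    # Walk over every calendar day the interval touches.
    day = datetime(start.year, start.month, start.day)
    while day <= end:
        win_start = day + timedelta(hours=skip_from)
        win_end = day + timedelta(hours=skip_to)
        # Overlap of [start, end] with this day's excluded window (may be negative).
        overlap = min(end, win_end) - max(start, win_start)
        if overlap > timedelta(0):
            excluded += overlap
        day += timedelta(days=1)
    return total - excluded
```

Since it works on datetime values, rows with minutes and seconds are handled the same way, and rows can then be filtered on the returned timedelta being at least 12 hours.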
One option uses greatest():
select id
, start_time
, end_time
, timediff(
greatest(
end_time,
date_format(end_time, '%Y-%m-%d 04:00:00')
),
greatest(
start_time,
date_format(start_time, '%Y-%m-%d 04:00:00')
)
) as duration
from table;
My table has fields that represent the start and end of a working period as datetimes.
I need to find related entries that add up to at least 14 hours over a sliding period of 24 hours.
I think a window function might save me, but MariaDB (which I use) doesn't yet implement RANGE time intervals in window functions.
Here is some example data:
id starting_hour ending_hour
-- ------------------- -------------------
1 2018-09-02 06:00:00 2018-09-02 08:30:00
2 2018-09-03 08:30:00 2018-09-03 10:00:00
4 2018-09-03 11:00:00 2018-09-03 15:00:00
5 2018-09-02 15:30:00 2018-09-02 16:00:00
6 2018-09-02 16:15:00 2018-09-02 17:00:00
7 2018-09-20 00:00:00 2018-09-20 08:00:00
8 2018-09-19 10:00:00 2018-09-19 12:00:00
9 2018-09-19 12:00:00 2018-09-19 16:00:00
10 2018-10-08 12:00:00 2018-10-08 14:00:00
11 2018-10-29 09:00:00 2018-10-29 10:00:00
So how can I find the rows whose durations sum to 14 hours or more within a 24-hour window?
Thanks
Edit:
SELECT
id,
starting_hour,
ending_hour,
TIMEDIFF (ending_hour, starting_hour) AS duree,
(
SELECT SUM(TIMEDIFF(LEAST(ending_hour, DATE_ADD(a.starting_hour, INTERVAL 24 HOUR)), starting_hour)) / 10000
FROM `table` b
WHERE b.starting_hour BETWEEN a.starting_hour AND DATE_ADD(a.starting_hour, INTERVAL 24 HOUR)
) AS duration
FROM
`table` a
HAVING duration >= 14
ORDER BY starting_hour ASC
;
This returns Id 8, but I want the whole period (e.g. Id 8, Id 9 and Id 7).
EDIT2:
The expected results are the ranges of working time that fall within a 24-hour window and whose sum is 14 hours or more.
EDIT 3:
In fact, under MySQL 8 this seems to work:
SELECT * FROM (
SELECT
*,
SEC_TO_TIME(
SUM(TIME_TO_SEC(TIMEDIFF(hs.`ending_hour`, hs.`starting_hour`)))
OVER (ORDER BY hs.starting_hour
RANGE BETWEEN INTERVAL '12' HOUR PRECEDING AND INTERVAL '12' HOUR FOLLOWING)
) AS tot
FROM
`table` hs
WHERE hs.`starting_hour` > DATE_SUB(NOW(), INTERVAL 50 DAY) AND hs.`ending_hour` <= NOW()
ORDER BY hs.`starting_hour` ASC
) t1
HAVING tot >= '14:00:00'
;
Is there a way to do it under MariaDB 10.2 without a window function, or at least without a RANGE window frame?
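To make the expected result concrete, here is a small sketch of the sliding-window logic done on the client side in Python rather than in SQL. It is an assumption about what "window" means: each row's starting_hour opens a 24-hour window, every working period is clipped to that window, and the window qualifies when the clipped durations add up to at least 14 hours. The function name find_busy_windows is made up for this example, and the nested loop is quadratic, so it is only meant for modest row counts.

```python
from datetime import datetime, timedelta


def find_busy_windows(shifts, window=timedelta(hours=24), minimum=timedelta(hours=14)):
    """shifts: list of (id, starting_hour, ending_hour) with datetime values.

    Returns one list of ids per qualifying window, ordered by starting_hour.
    """
    ordered = sorted(shifts, key=lambda s: s[1])
    groups = []
    for _, win_start, _ in ordered:
        win_end = win_start + window
        total = timedelta(0)
        ids = []
        for shift_id, start, end in ordered:
            # Part of this working period that lies inside the window.
            overlap = min(end, win_end) - max(start, win_start)
            if overlap > timedelta(0):
                total += overlap
                ids.append(shift_id)
        if total >= minimum:
            groups.append(ids)
    return groups
```

With the example data, the only window that reaches 14 hours is the one opened by Id 8, which also covers Id 9 and Id 7.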
Suppose I have 5 records for a sales table.
ID Name datetime_col
1 ABC 2016-09-15 02:07:56
2 HSJ 2016-09-31 11:45:45
3 JSD 2016-11-26 07:09:56
4 JUH 2016-12-31 12:00:00
5 IGY 2017-01-13 14:00:07
I want to find how many records there are in the sales table for each hour between 2016-09-15 and 2017-01-13.
The result should look like this:
Hour sales_at_this_hour
2016-09-15 01:00:00 0
2016-09-15 02:00:00 1
2016-09-15 03:00:00 0
...
...
2017-01-13 01:00:00 0
2017-01-13 02:00:00 0
2017-01-13 03:00:00 0
....
2017-01-13 14:00:00 1
Then I want to find the average of sales_at_this_hour using MySQL.
EDIT: sorry, I did not fully understand the question at first.
Use DATE_FORMAT (with %H for the 24-hour clock, so that AM and PM hours are not merged):
select
DATE_FORMAT(datetime_col, '%Y-%m-%d %H:00:00') as date,
count(id) as count
from table_name
group by date;
This returns only the hours that have at least one sale (not exactly what you asked for):
datetime_col count
2016-02-04 05:00:00 5
2016-02-04 07:00:00 1
2016-02-04 08:00:00 5
2016-02-04 10:00:00 10
2016-02-04 11:00:00 1
Provide start_date and end_date, then use DATEDIFF to get the total number of days for the average calculation.
set @start_date = '2016-01-01', @end_date = '2017-01-01';
select
DATE_FORMAT(group_by_date.datetime, '%H:00:00') as hour,
-- total sales in this hour of the day, divided by the number of days in the range
SUM(group_by_date.count) / DATEDIFF(@end_date, @start_date) as average
from (
select
DATE_FORMAT(created_dtm, '%Y-%m-%d %H:00:00') as datetime,
count(id) as count
from table_name
where created_dtm > @start_date
and created_dtm < @end_date
group by datetime
) group_by_date
group by hour;
For each hour,
average sale count per day = total sale count / total days
hour average
01:00:00 0.03841209
02:00:00 0.01653005
03:00:00 0.0306716
04:00:00 0.01147541
05:00:00 0.01179831
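A GROUP BY on the sales table can only return hours that contain at least one row, so the zero hours asked for in the question have to come from a generated series of hours. As a rough sketch of that idea outside the database (the function names hourly_sales and average_per_hour are made up for this example), the series can be built in Python and the counts filled in:

```python
from collections import Counter
from datetime import datetime, timedelta


def _floor_to_hour(ts):
    return ts.replace(minute=0, second=0, microsecond=0)


def hourly_sales(timestamps, range_start, range_end):
    """Return [(hour_start, count)] for every hour from range_start to range_end inclusive."""
    counts = Counter(_floor_to_hour(ts) for ts in timestamps)
    hour = _floor_to_hour(range_start)
    last = _floor_to_hour(range_end)
    series = []
    while hour <= last:
        # Hours without any sale get an explicit 0.
        series.append((hour, counts.get(hour, 0)))
        hour += timedelta(hours=1)
    return series


def average_per_hour(series):
    """Average of sales_at_this_hour over all hours in the series, empty hours included."""
    if not series:
        return 0.0
    return sum(count for _, count in series) / len(series)
```

The same effect can be obtained in SQL by left-joining the sales table onto a table or subquery that lists every hour in the range.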
I've been working on a MySQL query that sorts data into weeks but I just can't figure out how to do it.
I would like to sort the data into weeks for the current and last 11 weeks. Each week will run from Monday 00:00:00 to Sunday 23:59:59.
(Taking today's date as 2014-12-04)...
Week 1: 2014-12-01 > 2014-12-07 - (Last Monday 00:00:00 to next Sunday 23:59:59)
Week 2: 2014-11-24 > 2014-11-30 - (Monday before last 00:00:00 to last Sunday 23:59:59)
Week 3: 2014-11-17 > 2014-11-23 - (Monday before before last 00:00:00 to last last Sunday 23:59:59)
And so on...
For each week the value field data will be totalled.
I need the data returned to be in the format:
datetime: The first date (Always a Monday) of that week.
value: The total of all the values in that week.
For example, the returned data:
Week 1: 2014-12-01 : Totalled value=11
Week 2: 2014-11-24 : Totalled value=3
Week 3: 2014-11-17 : Totalled value=9
Week 4: 2014-11-10 : Totalled value=7
Table_1 data:
table1id datetime value
1 2014-09-01 06:00:00 4
2 2014-09-04 17:00:00 6
3 2014-09-09 18:00:00 9
4 2014-09-15 07:00:00 4
5 2014-09-20 10:00:00 2
6 2014-09-25 10:00:00 3
7 2014-09-30 09:00:00 8
8 2014-10-01 14:00:00 5
9 2014-10-05 10:00:00 7
10 2014-10-09 18:00:00 3
11 2014-10-15 05:00:00 4
12 2014-10-20 07:00:00 8
13 2014-10-24 16:00:00 9
14 2014-10-29 15:00:00 5
15 2014-10-31 16:00:00 7
16 2014-11-05 09:00:00 2
17 2014-11-10 08:00:00 4
18 2014-11-15 16:00:00 3
19 2014-11-20 10:00:00 9
20 2014-11-25 10:00:00 2
21 2014-11-30 10:00:00 1
22 2014-12-01 15:00:00 7
23 2014-12-04 18:00:00 2
I 'could' just pull all the data unsorted for the date range using PHP and sort it from there but I'd rather the MySQL server do it.
Any suggestions would be greatly appreciated. :-)
Based on generating days from a date range, you can do something like this:
select mondays.week, mondays.day, sum(value)
from
(select a.a + 1 as week,
-- Monday of the current week, shifted back by a.a weeks
curdate() - INTERVAL WEEKDAY(curdate()) DAY - INTERVAL (7 * a.a) DAY as day
from (select 0 as a union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6 union all select 7
union all select 8 union all select 9 union all select 10 union all select 11) as a
) as mondays,
Table_1
-- half-open range, so a row at the next Monday 00:00:00 is not counted twice
where Table_1.datetime >= mondays.day
and Table_1.datetime < mondays.day + INTERVAL 7 DAY
group by mondays.week, mondays.day;
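For comparison, the week bucketing itself can be sketched in a few lines of Python. This is only an illustration of the logic (the function name weekly_totals is made up here): each row is mapped to the Monday of its week, and the totals are reported for the current week and the 11 weeks before it, with empty weeks kept as 0.

```python
from datetime import date, datetime, timedelta


def weekly_totals(rows, today, weeks=12):
    """rows: iterable of (datetime, value). Returns [(week_number, monday, total)].

    Week 1 is the week containing `today`; weeks run Monday to Sunday.
    """
    this_monday = today - timedelta(days=today.weekday())  # weekday(): Monday == 0
    mondays = [this_monday - timedelta(weeks=i) for i in range(weeks)]
    totals = {monday: 0 for monday in mondays}
    for when, value in rows:
        monday = when.date() - timedelta(days=when.weekday())
        if monday in totals:  # ignore rows outside the reported weeks
            totals[monday] += value
    return [(i + 1, monday, totals[monday]) for i, monday in enumerate(mondays)]
```

Unlike the inner join in the query, this keeps weeks without any rows in the output, which may or may not be what is wanted.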
I have a Rates table that records the rate of a process
DateTime Rate
2013-11-25 05:00:00 22
2013-11-25 06:00:00 78
2013-11-25 07:00:00 33
2013-11-25 07:10:00 56
2013-11-25 08:30:00 12
and a Downtime table that records time periods where the above data may not be valid
StartDateTime EndDateTime
2013-11-25 04:59:00 2013-11-25 05:10:00
2013-11-25 07:00:00 2013-11-25 07:15:00
How can I get the following output where any Rate value recorded between any period in the Downtime table is replaced by a fixed value e.g. 50?
DateTime Rate
2013-11-25 05:00:00 50
2013-11-25 06:00:00 78
2013-11-25 07:00:00 50
2013-11-25 07:10:00 50
2013-11-25 08:30:00 12
This should do the trick:
SELECT r.datetime, if(d.startDatetime IS NULL, r.rate, 50) rate
FROM rates r
LEFT JOIN downtime d
ON r.datetime BETWEEN d.startDatetime AND d.endDatetime
Fiddle here.
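One thing to watch with the join: if two downtime periods overlap, a rate that falls inside both is returned twice. A variant that avoids this uses EXISTS instead of a join. The sketch below runs it through Python's sqlite3 module purely so it can be tried without a server; the function name rates_with_downtime and the column names recorded_at, start_datetime and end_datetime are chosen for this example and are not the original schema.

```python
import sqlite3


def rates_with_downtime(rates, downtime, fallback=50):
    """rates: (recorded_at, rate) pairs; downtime: (start, end) pairs; timestamps as text.

    Returns (recorded_at, rate) with the rate replaced by `fallback` inside any downtime.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE rates (recorded_at TEXT, rate INTEGER)")
    con.execute("CREATE TABLE downtime (start_datetime TEXT, end_datetime TEXT)")
    con.executemany("INSERT INTO rates VALUES (?, ?)", rates)
    con.executemany("INSERT INTO downtime VALUES (?, ?)", downtime)
    query = """
        SELECT r.recorded_at,
               CASE WHEN EXISTS (
                        SELECT 1 FROM downtime d
                        WHERE r.recorded_at BETWEEN d.start_datetime AND d.end_datetime
                    )
                    THEN ? ELSE r.rate END AS rate
        FROM rates r
        ORDER BY r.recorded_at
    """
    # EXISTS yields one output row per rate, however many periods match.
    result = con.execute(query, (fallback,)).fetchall()
    con.close()
    return result
```

The CASE ... WHEN EXISTS form is standard SQL, so the query text should carry over to MySQL with the real table and column names substituted.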