Calculating a Moving Average MySQL? - mysql
Good Day,
I am using the following code to calculate the 9-day moving average.
SELECT SUM(close)
FROM tbl
WHERE date <= '2002-07-05'
AND name_id = 2
ORDER BY date DESC
LIMIT 9
But it does not work, because the SUM is calculated over all matching rows before the LIMIT is applied. In other words, it sums every close on or before that date, not just the last 9.
So I need to calculate the SUM from the rows returned by the select, rather than calculate it directly.
I.e., select the SUM from the SELECT...
How would I go about doing this? Is it very costly, or is there a better way?
If you want the moving average for each date, then try this:
SELECT date, SUM(close),
(select avg(close) from tbl t2 where t2.name_id = t.name_id and t2.date <= t.date and datediff(t.date, t2.date) < 9
) as mvgAvg
FROM tbl t
WHERE date <= '2002-07-05' and
name_id = 2
GROUP BY date
ORDER BY date DESC
It uses a correlated subquery to average the closes from each date and the 8 calendar days before it. That is exactly 9 values only when there is one row per day.
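Starting with MySQL 8, you should use window functions for this. With the window RANGE clause, you can define a logical window over an interval, which is very powerful. Note that because the window below is ordered by date DESC, PRECEDING refers to later dates, so each row's frame spans that date and the 9 calendar days after it; order the window by date ascending if you want a trailing average. Something like this: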
SELECT
date,
close,
AVG(close) OVER (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING) AS mvgAvg
FROM tbl
WHERE date <= DATE '2002-07-05'
AND name_id = 2
ORDER BY date DESC
For example:
WITH t (date, `close`) AS (
SELECT DATE '2020-01-01', 50 UNION ALL
SELECT DATE '2020-01-03', 54 UNION ALL
SELECT DATE '2020-01-05', 51 UNION ALL
SELECT DATE '2020-01-12', 49 UNION ALL
SELECT DATE '2020-01-13', 59 UNION ALL
SELECT DATE '2020-01-15', 30 UNION ALL
SELECT DATE '2020-01-17', 35 UNION ALL
SELECT DATE '2020-01-18', 39 UNION ALL
SELECT DATE '2020-01-19', 47 UNION ALL
SELECT DATE '2020-01-26', 50
)
SELECT
date,
`close`,
COUNT(*) OVER w AS c,
SUM(`close`) OVER w AS s,
AVG(`close`) OVER w AS a
FROM t
WINDOW w AS (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
ORDER BY date DESC
Leading to:
date |close|c|s |a |
----------|-----|-|---|-------|
2020-01-26| 50|1| 50|50.0000|
2020-01-19| 47|2| 97|48.5000|
2020-01-18| 39|3|136|45.3333|
2020-01-17| 35|4|171|42.7500|
2020-01-15| 30|4|151|37.7500|
2020-01-13| 59|5|210|42.0000|
2020-01-12| 49|6|259|43.1667|
2020-01-05| 51|3|159|53.0000|
2020-01-03| 54|3|154|51.3333|
2020-01-01| 50|3|155|51.6667|
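As a cross-check on the output above, here is a minimal Python sketch (not MySQL, and not part of the original answer) that recomputes the c, s and a columns from the same sample rows. It assumes the frame for each row is "this date through 9 calendar days later", which is how the descending window ordering behaves; the function name `range_window` is made up for illustration.

```python
from datetime import date, timedelta

# Same sample rows as the CTE above: (date, close)
rows = [
    (date(2020, 1, 1), 50), (date(2020, 1, 3), 54), (date(2020, 1, 5), 51),
    (date(2020, 1, 12), 49), (date(2020, 1, 13), 59), (date(2020, 1, 15), 30),
    (date(2020, 1, 17), 35), (date(2020, 1, 18), 39), (date(2020, 1, 19), 47),
    (date(2020, 1, 26), 50),
]

def range_window(rows, days=9):
    """For each row (newest first), aggregate closes dated d .. d + days."""
    out = []
    for d, close in sorted(rows, reverse=True):
        frame = [c for other, c in rows if d <= other <= d + timedelta(days=days)]
        out.append((d, close, len(frame), sum(frame), round(sum(frame) / len(frame), 4)))
    return out

result = range_window(rows)
for line in result:
    print(line)
```

The counts vary from 1 to 6 because the frame is defined by calendar days, not by a fixed number of rows.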
Use something like
SELECT
sum(close) as sum,
avg(close) as average
FROM (
SELECT
(close)
FROM
tbl
WHERE
date <= '2002-07-05'
AND name_id = 2
ORDER BY
date DESC
LIMIT 9 ) temp
The inner query returns only the 9 most recent matching rows (sorted by date descending, then limited), and the outer query sums and averages just those rows.
Your original query doesn't work because the SUM is computed over all matching rows first and the LIMIT clause is applied afterwards to the single aggregated row, so you get the sum of every matching row.
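The difference between the two queries can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: SQLite also applies LIMIT after aggregation here), and the 12 close values are made up.

```python
import sqlite3

# Hypothetical sample data: 12 daily closes for name_id = 2.
closes = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date TEXT, name_id INTEGER, close REAL)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, 2, ?)",
    [(f"2002-06-{day:02d}", c) for day, c in enumerate(closes, start=1)],
)

# Aggregate first, LIMIT afterwards: the single result row sums all 12 closes.
naive_sum = conn.execute(
    "SELECT SUM(close) FROM tbl "
    "WHERE date <= '2002-07-05' AND name_id = 2 "
    "ORDER BY date DESC LIMIT 9"
).fetchone()[0]

# LIMIT inside a derived table: only the 9 most recent rows reach SUM/AVG.
last9_sum, last9_avg = conn.execute(
    "SELECT SUM(close), AVG(close) FROM ("
    "  SELECT close FROM tbl"
    "  WHERE date <= '2002-07-05' AND name_id = 2"
    "  ORDER BY date DESC LIMIT 9"
    ") temp"
).fetchone()

print(naive_sum, last9_sum, last9_avg)
```

The first number covers every row, while the second and third cover only the last 9 closes.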
Another technique is to create a helper table of integers:
CREATE TABLE `tinyint_asc` (
`value` tinyint(3) unsigned NOT NULL default '0',
PRIMARY KEY (value)
) ;
INSERT INTO `tinyint_asc` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100),(101),(102),(103),(104),(105),(106),(107),(108),(109),(110),(111),(112),(113),(114),(115),(116),(117),(118),(119),(120),(121),(122),(123),(124),(125),(126),(127),(128),(129),(130),(131),(132),(133),(134),(135),(136),(137),(138),(139),(140),(141),(142),(143),(144),(145),(146),(147),(148),(149),(150),(151),(152),(153),(154),(155),(156),(157),(158),(159),(160),(161),(162),(163),(164),(165),(166),(167),(168),(169),(170),(171),(172),(173),(174),(175),(176),(177),(178),(179),(180),(181),(182),(183),(184),(185),(186),(187),(188),(189),(190),(191),(192),(193),(194),(195),(196),(197),(198),(199),(200),(201),(202),(203),(204),(205),(206),(207),(208),(209),(210),(211),(212),(213),(214),(215),(216),(217),(218),(219),(220),(221),(222),(223),(224),(225),(226),(227),(228),(229),(230),(231),(232),(233),(234),(235),(236),(237),(238),(239),(240),(241),(242),(243),(244),(245),(246),(247),(248),(249),(250),(251),(252),(253),(254),(255);
You can then use it like this (myvalue stands for your value column, e.g. close):
select
date_add(tbl.date, interval tinyint_asc.value day) as mydate,
count(*),
sum(myvalue)
from tbl
inner join tinyint_asc on tinyint_asc.value < 30 -- offsets 0..29, for a 30 day moving average
where date( date_add(tbl.date, interval tinyint_asc.value day ) ) between '2016-01-01' and current_date()
group by mydate
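The idea behind the join is that every source row is copied onto its own date and the following days, so grouping by the shifted date collects a trailing window. A rough Python sketch of that fan-out, with made-up rows and a hypothetical helper name `moving_sums`:

```python
from collections import defaultdict
from datetime import date, timedelta

def moving_sums(rows, window_days):
    """Spread each (date, value) row over window_days target dates, then group."""
    totals = defaultdict(lambda: [0, 0])  # target date -> [count, sum]
    for d, value in rows:
        for offset in range(window_days):  # plays the role of tinyint_asc.value
            target = d + timedelta(days=offset)
            totals[target][0] += 1
            totals[target][1] += value
    return {target: tuple(pair) for target, pair in totals.items()}

# Hypothetical data with a gap on 2016-01-03
rows = [(date(2016, 1, 1), 10), (date(2016, 1, 2), 20), (date(2016, 1, 4), 40)]
result = moving_sums(rows, window_days=3)
for target in sorted(result):
    print(target, result[target])
```

Dividing the sum by the count gives the average over the rows that actually exist in the window; dates without a source row still receive a value from earlier rows.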
This query is fast:
select date, name_id,
case @i when name_id then @i:=name_id else (@i:=name_id)
and (@n:=0)
and (@a0:=0) and (@a1:=0) and (@a2:=0) and (@a3:=0) and (@a4:=0) and (@a5:=0) and (@a6:=0) and (@a7:=0) and (@a8:=0)
end as a,
case @n when 9 then @n:=9 else @n:=@n+1 end as n,
@a0:=@a1,@a1:=@a2,@a2:=@a3,@a3:=@a4,@a4:=@a5,@a5:=@a6,@a6:=@a7,@a7:=@a8,@a8:=close,
(@a0+@a1+@a2+@a3+@a4+@a5+@a6+@a7+@a8)/@n as av
from tbl,
(select @i:=0, @n:=0,
@a0:=0, @a1:=0, @a2:=0, @a3:=0, @a4:=0, @a5:=0, @a6:=0, @a7:=0, @a8:=0) a
where name_id=2
order by name_id, date
If you need an average over 50 or 100 values, it's tedious to write, but
worth the effort. The speed is close to the ordered select.
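The nine user variables act as a fixed-size buffer that shifts by one on every row. The same single-pass idea can be sketched in Python with a bounded deque; this is only an illustration for one series (the SQL additionally resets the buffer when name_id changes), and `rolling_average` is a made-up name.

```python
from collections import deque

def rolling_average(closes, size=9):
    """Average of the last `size` values seen so far, in one ordered pass."""
    window = deque(maxlen=size)  # the oldest value drops out automatically
    averages = []
    for close in closes:
        window.append(close)
        averages.append(sum(window) / len(window))
    return averages

print(rolling_average([1, 2, 3, 4], size=3))
```

As in the query, the first rows are averaged over fewer values until the buffer is full.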
Related
How to find the maximum number of time range collisions in MySQL
I have a time range entity with start and end datetime columns. I need to find the maximum number of occurrences (count) overlapping the same time slot. In the example above, the count is 4. https://www.db-fiddle.com/f/pcq1MjQeqSEMDdyGxkFsR5/0 I probably need some kind of recursive query, but I don't know how to start.
For MySQL 5.x:
SELECT SUM(points2.weight) max_weight
FROM (
  SELECT start dt FROM slots
  UNION DISTINCT
  SELECT `end` FROM slots
) points1
JOIN (
  SELECT dt, SUM(weight) weight
  FROM (
    SELECT start dt, 1 weight FROM slots
    UNION ALL
    SELECT `end`, -1 FROM slots
  ) points
  GROUP BY dt
) points2 ON points1.dt >= points2.dt
GROUP BY points1.dt
ORDER BY max_weight DESC
LIMIT 1
https://dbfiddle.uk/f0b56Q4X (step-by-step, with comments)
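The query treats every start as +1 and every end as -1 and takes the largest running total. A minimal Python sketch of the same sweep, using made-up integer times instead of datetimes and a hypothetical function name `max_overlap`:

```python
def max_overlap(slots):
    """Largest number of slots open at the same time; slots are (start, end)."""
    events = []
    for start, end in slots:
        events.append((start, 1))   # a slot opens
        events.append((end, -1))    # a slot closes
    # At equal times, -1 sorts before +1, so touching slots do not count as overlapping.
    events.sort()
    best = running = 0
    for _, weight in events:
        running += weight
        best = max(best, running)
    return best

# Hypothetical slots: four of them are open together between 4 and 5
print(max_overlap([(1, 5), (2, 6), (3, 7), (4, 8), (8, 9)]))
```

Like the SQL, which sums all weights at the same timestamp before comparing, a slot that ends exactly when another starts is not counted as a collision.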
Getting total average between dates
I have a table named sales with the following format.
sale_id    |user_id      |sale_date          |sale_cost|
-----------|-------------|-------------------|---------|
j847bv-6ggd|bd48ta36-cn5x|2017-01-10 15:43:12|       30|
vf87x2-15gr|bd48ta36-cn5x|2017-01-05 13:41:16|       60|
3gfd7f-2cdd|8g4f5ccf-1fet|2017-01-15 14:10:12|      100|
4bgfd5-12vn|8g4f5ccf-1fet|2017-01-20 19:47:14|       20|
b58e32-bf87|8g4f5ccf-1fet|2017-01-20 17:35:13|       15|
bg87db-127g|gr4gg1f4-3gbb|2017-01-20 12:26:15|       80|
How could I get the average amount that a user (user_id) spends within the first X days after their first purchase? I don't want an average for every user, but a total average for all. I know that I can get the average using select avg(sale_cost), but I'm unsure how to find the average for a date period.
You can find the average of each user's total within a 10-day range from their initial sale date like this:
select avg(sale_cost)
from (
  select sum(t.sale_cost) sale_cost
  from your_table t
  join (
    select user_id, min(sale_date) start_date, date_add(min(sale_date), interval 10 day) end_date
    from your_table
    group by user_id
  ) t2 on t.user_id = t2.user_id
  and t.sale_date between t2.start_date and t2.end_date
  group by t.user_id
) t;
It finds the first sale_date and the date 10 days after it for each user. Then it joins that with the table to get the total for each user within that range, and finally averages those totals.
If you want to find the average between the overall first sale_date (not per user) and 10 days from it, use:
select avg(sale_cost)
from (
  select sum(t.sale_cost) sale_cost
  from your_table t
  join (
    select min(sale_date) start_date, date_add(min(sale_date), interval 10 day) end_date
    from your_table
  ) t2 on t.sale_date between t2.start_date and t2.end_date
  group by t.user_id
) t;
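To make the first (per-user) query concrete, here is a small Python sketch that applies the same steps to the six sample rows from the question. It is only an illustration of the logic, not a replacement for the SQL; the function name `average_first_days` is made up.

```python
from datetime import datetime, timedelta

# Sample rows from the question: (user_id, sale_date, sale_cost)
sales = [
    ("bd48ta36-cn5x", "2017-01-10 15:43:12", 30),
    ("bd48ta36-cn5x", "2017-01-05 13:41:16", 60),
    ("8g4f5ccf-1fet", "2017-01-15 14:10:12", 100),
    ("8g4f5ccf-1fet", "2017-01-20 19:47:14", 20),
    ("8g4f5ccf-1fet", "2017-01-20 17:35:13", 15),
    ("gr4gg1f4-3gbb", "2017-01-20 12:26:15", 80),
]

def average_first_days(sales, days=10):
    parsed = [(u, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), cost) for u, ts, cost in sales]
    # Step 1: first sale per user (the inner GROUP BY user_id)
    first = {}
    for user, ts, _ in parsed:
        first[user] = min(first.get(user, ts), ts)
    # Step 2: total per user within [first sale, first sale + days]
    totals = {}
    for user, ts, cost in parsed:
        if ts <= first[user] + timedelta(days=days):
            totals[user] = totals.get(user, 0) + cost
    # Step 3: average of the per-user totals
    return sum(totals.values()) / len(totals)

print(average_first_days(sales, days=10))
```

With a 10-day window the per-user totals are 90, 135 and 80, so the result is their mean.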
The BETWEEN operator comes in handy whenever you need to check ranges:
SELECT column_name(s) FROM table_name WHERE column_name BETWEEN value1 AND value2;
In this case value1 and value2 will be replaced by your dates, using:
'2011-01-01 00:00:00' AND '2011-01-31 23:59:59'
or
sale_date AND DATE_ADD(OrderDate,INTERVAL 10 DAY)
The first way is faster. Also note that the BETWEEN bounds are inclusive.
MySQL Select where column greater than or equal to closest past date from given date
TABLE
Table:
Id|Date    |
--|--------|
 1|01-10-15|
 2|01-01-16|
 3|01-03-16|
 4|01-06-16|
 5|01-08-16|
Given two dates, startdate 01-02-16 and enddate 01-05-16, I need to get the data from the table such that it returns all data between the closest past date from startdate and the closest future date from enddate, including those two dates. So the result will look like this.
Result:
Id|Date    |
--|--------|
 2|01-01-16|
 3|01-03-16|
 4|01-06-16|
What I am doing
What I am doing now is fetching the whole data set and removing from the array any results earlier than the closest past date to startdate or later than the closest future date to enddate.
What I want
What I want is to do this in the query itself, so that I don't have to fetch the whole table each time.
If your column's type is date, a UNION can do it:
-- closest past date relative to the start date
(select * from yourtable where `date` <= '2016-01-02' order by `date` desc limit 1)
union
-- closest future date relative to the end date
(select * from yourtable where `date` >= '2016-01-05' order by `date` asc limit 1)
union
-- rows between the two dates
(select * from yourtable where `date` between '2016-01-02' and '2016-01-05')
Imagine your date is in YYYY-mm-dd format. Each branch is wrapped in parentheses so that its own ORDER BY and LIMIT apply to that branch only:
-- get rows within the dates
(SELECT * FROM tab WHERE ymd BETWEEN :start_date AND :end_date)
-- get one row closest to start date
UNION
(SELECT * FROM tab WHERE ymd < :start_date ORDER BY ymd DESC LIMIT 1)
-- get one row closest to end date
UNION
(SELECT * FROM tab WHERE ymd > :end_date ORDER BY ymd LIMIT 1)
Try this:
Select * From dTable
Where `Date` Between (Select Max(t1.Date) From dTable t1 Where t1.Date < startdate)
And (Select Min(t2.Date) From dTable t2 Where t2.Date > enddate)
If Date is a string, STR_TO_DATE and DATEDIFF can be used here.
SELECT id, Date
FROM tab
where STR_TO_DATE(Date, '%d-%m-%y') BETWEEN('2016-02-01')AND('2016-05-01')
or id = (SELECT id FROM tab
  where STR_TO_DATE(Date, '%d-%m-%y') > '2016-05-01'
  ORDER BY DATEDIFF(STR_TO_DATE(Date, '%d-%m-%y'), '2016-05-01') Limit 1)
or id = (SELECT id FROM tab
  where STR_TO_DATE(Date, '%d-%m-%y') < '2016-02-01'
  ORDER BY DATEDIFF('2016-02-01', STR_TO_DATE(Date, '%d-%m-%y')) Limit 1)
MySQL - get min/max of consecutive events in a series of rows
I have a table that looks like this: http://sqlfiddle.com/#!9/152d2/1/0
CREATE TABLE Table1 (
  id int,
  value decimal(10,5),
  dt datetime,
  threshold_id int
);
Current Query:
SELECT sensors_id, DATE_FORMAT(datetime, '%Y-%m-%d'), MIN(value), MAX(value)
FROM Readings
WHERE datetime < "2015-11-18 00:00:00"
AND datetime > "2015-10-18 00:00:00"
AND sensors_id = 9
GROUP BY DATE_FORMAT(datetime, '%Y-%m-%d')
ORDER BY datetime DESC
What I'm trying to do is to return the min/max value in each group, where threshold_id IS NOT NULL. Therefore, the example should return something like:
min_value|max_value|start_date         |end_date           |
---------|---------|-------------------|-------------------|
        9|     10.5|2015-07-29 10:52:31|2015-07-29 10:57:31|
      8.5|      9.5|2015-07-29 11:03:31|2015-07-29 11:05:31|
I can't work out how to do this grouping. I need to return the min/max for each group of consecutive rows where the threshold_id IS NOT NULL.
Use user variables to compare the current value to the previous value and increment a column you can group by. Tested on my machine.
SELECT MIN(value),MAX(value),MIN(dt),MAX(dt)
FROM (
  SELECT id,value,dt,
  CASE WHEN COALESCE(threshold_id,'')=@last_ci THEN @n ELSE @n:=@n+1 END AS g,
  @last_ci := COALESCE(threshold_id,'') As th
  FROM Table1, (SELECT @n:=0) r
  ORDER BY id
) s
WHERE th!=''
GROUP BY g
For MySQL 8 this could be rewritten as below. Use a CTE to get two different row-number sequences and GROUP BY the difference between them.
WITH cte as (
  SELECT *,
  ROW_NUMBER() OVER (ORDER BY id)as rn,
  ROW_NUMBER() OVER (PARTITION BY threshold_id ORDER BY id)as rnn
  FROM Table1
  ORDER BY id
)
SELECT MIN(value),MAX(value),MIN(dt),MAX(dt)
FROM cte
WHERE threshold_id IS NOT NULL
GROUP BY rn-rnn
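Both queries solve a "gaps and islands" problem: consecutive rows with a threshold form one island, and each island is aggregated separately. A rough Python sketch of that grouping follows. The rows are made up to mirror the shape of the expected output in the question (the real data lives in the fiddle), and unlike the user-variable query this sketch does not start a new island when threshold_id changes from one non-null value to another.

```python
from itertools import groupby

# Hypothetical rows sorted by id: (id, value, dt, threshold_id)
rows = [
    (1, 8.0, "2015-07-29 10:50:31", None),
    (2, 9.0, "2015-07-29 10:52:31", 1),
    (3, 10.5, "2015-07-29 10:57:31", 1),
    (4, 7.0, "2015-07-29 11:00:31", None),
    (5, 8.5, "2015-07-29 11:03:31", 1),
    (6, 9.5, "2015-07-29 11:05:31", 1),
]

def islands(rows):
    """Min/max value and first/last dt for each run of rows with a threshold."""
    result = []
    # groupby only merges adjacent rows, which is exactly the "consecutive" rule
    for has_threshold, run in groupby(rows, key=lambda r: r[3] is not None):
        if not has_threshold:
            continue
        run = list(run)
        values = [r[1] for r in run]
        dts = [r[2] for r in run]
        result.append((min(values), max(values), min(dts), max(dts)))
    return result

for island in islands(rows):
    print(island)
```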
Your sample data only includes a single day's worth, so you only get a single row back (assuming you want to group by day):
SELECT DAYOFYEAR(dt) `day`, MIN(`value`) min_value, MAX(`value`) max_value
FROM Table1
GROUP BY `day`
ORDER BY `day` ASC
specific status on consecutive days
I have a MySQL table ATT which has EMP_ID,ATT_DATE,ATT_STATUS with ATT_STATUS with different values 1-Present,2-Absent,3-Weekly-off. I want to find out those EMP_ID's which have status 2 consecutively for 10 days in a given date range. Please help
Please have a try with this:
SELECT EMP_ID
FROM (
  SELECT
  IF((@prevDate!=(q.ATT_DATE - INTERVAL 1 DAY)) OR (@prevEmp!=q.EMP_ID) OR (q.ATT_STATUS != 2), @rownum:=@rownum+1, @rownum:=@rownum) AS rownumber,
  @prevDate:=q.ATT_DATE,
  @prevEmp:=q.EMP_ID,
  q.*
  FROM (
    SELECT EMP_ID, ATT_DATE, ATT_STATUS
    FROM org_tb_dailyattendance, (SELECT @rownum:=0, @prevDate:='', @prevEmp:=0) vars
    WHERE ATT_DATE BETWEEN '2013-01-01' AND '2013-02-15'
    ORDER BY EMP_ID, ATT_DATE, ATT_STATUS
  ) q
) sq
GROUP BY EMP_ID, rownumber
HAVING COUNT(*) >= 10
The logic is to first sort the table by employee id and date. Then introduce a rownumber which increases only if the days are not consecutive, or the employee id differs from the previous one, or the status is not 2. Then I just grouped by this rownumber and checked whether there are at least 10 rows in each group. Those should be the employees who were absent for 10 days or more.
Have you tried something like this? Note that it only counts the status 2 days per employee; it does not check that they are consecutive.
SELECT EMP_ID, count(*) as consecutive_count, min(ATT_DATE)
FROM ATT
WHERE ATT_STATUS = 2
GROUP BY EMP_ID
HAVING consecutive_count >= 10