Calculating total of days between multiple date ranges - mysql
I am having trouble working out how to calculate the total number of days covered by several date ranges in MySQL.
I need to count the days across all the ranges, counting a day only once when ranges overlap.
Data example:
from | to
2021/08/28 | 2021/09/29
2021/08/29 | 2021/09/01
2021/09/01 | 2021/09/01
Date ranges example and output
Dates 2021-08-28 2021-08-29 2021-08-30 2021-08-31 2021-09-01 2021-09-02 2021-09-03 2021-09-04
Range1 |--------------------|
Range2 |--------------------|
Range3 |--------------------|
Total Days: 6
Dates 2021-08-28 2021-08-29 2021-08-30 2021-08-31 2021-09-01 2021-09-02 2021-09-03 2021-09-04
Range1 |--------------------|
Range2 |--------------------------------------------|
Range3 |--------|
Total Days: 5
Possibly the simplest method is a recursive CTE:
with recursive dates as (
select `from`, `to`
from t
union all
select `from` + interval 1 day, `to`
from dates
where `from` < `to`
)
select count(distinct `from`)
from dates;
Note that from and to are really bad names for columns because they are SQL keywords.
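To see what the recursive CTE computes, here is a minimal Python sketch of the same idea: expand every range into its individual days and count the distinct ones. The function name `count_distinct_days` is made up for this illustration, and the ranges are the sample rows from the question, treated as inclusive on both ends.

```python
from datetime import date, timedelta

def count_distinct_days(ranges):
    """Count calendar days covered by at least one inclusive (start, end) range."""
    covered = set()
    for start, end in ranges:
        day = start
        while day <= end:              # same stepping as `from` + interval 1 day
            covered.add(day)
            day += timedelta(days=1)
    return len(covered)                # same role as count(distinct `from`)

# Sample rows from the question (hypothetical variable name).
ranges = [
    (date(2021, 8, 28), date(2021, 9, 29)),
    (date(2021, 8, 29), date(2021, 9, 1)),
    (date(2021, 9, 1), date(2021, 9, 1)),
]
total_days = count_distinct_days(ranges)
print(total_days)
```

With this sample the first range contains the other two, so the result is just the length of the first range.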
EDIT:
In MySQL 5.7, you can use a tally table -- a table of numbers.
Assuming your original table has enough rows for the widest time span, you can use:
select count(distinct `from` + interval (n - 1) day)
from t cross join
(select (@rn := @rn + 1) as n
from t cross join
(select @rn := 0) params
) n
on `from` + interval (n - 1) day <= `to`;
If your table is really big, you might want a limit for the widest time period.
Related
SQL get consecutive starting and end date with specific period
I have a hotel_availablities table something like this.
date | availability
2021-01-15 | y
2021-01-16 | y
2021-01-17 | y
2021-01-18 | n
2021-01-19 | n
2021-01-20 | y
2021-01-21 | n
2021-01-22 | y
2021-01-23 | y
I wanted to get the possible available date ranges where the period of stay is 2 days.
date range
2021-01-15 : 2021-01-16
2021-01-16 : 2021-01-17
2021-01-22 : 2021-01-23
If the period of stay was 3 days, I would get the results below:
date range
2021-01-15 : 2021-01-18
How can I achieve this result with SQL?
This is a gaps and islands problem. Assuming you are using MySQL 8+, we can use the difference in row numbers method here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY date) rn1,
ROW_NUMBER() OVER (PARTITION BY availability ORDER BY date) rn2
FROM yourTable
)
SELECT MIN(date) AS start_date, MAX(date) AS end_date, COUNT(*) AS cnt
FROM cte
WHERE availability = 'y'
GROUP BY rn1 - rn2
HAVING COUNT(*) >= 2; -- but change to COUNT(*) >= 3, e.g. for three days in a row
Note that my query does not give the exact output you expect, but maybe this would be enough for your requirement. If you wanted to break out each island larger than 2 days in terms of pairs of 2 days at a time, you might have to also bring in a calendar table here.
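The grouping trick can be hard to picture, so here is a small Python sketch of a closely related variant. Instead of subtracting two row numbers, it subtracts each available date's position from its day ordinal, which is also constant within a run of consecutive dates. It assumes one row per date, and the names `rows` and `available_islands` are made up for the illustration.

```python
from datetime import date
from itertools import groupby

# (date, availability) rows from the question's sample table.
rows = [
    (date(2021, 1, 15), "y"), (date(2021, 1, 16), "y"), (date(2021, 1, 17), "y"),
    (date(2021, 1, 18), "n"), (date(2021, 1, 19), "n"), (date(2021, 1, 20), "y"),
    (date(2021, 1, 21), "n"), (date(2021, 1, 22), "y"), (date(2021, 1, 23), "y"),
]

def available_islands(rows, min_days):
    """Return (start, end, length) for each run of consecutive available dates."""
    available = sorted(d for d, flag in rows if flag == "y")
    islands = []
    # Within a run of consecutive dates, ordinal minus position stays constant.
    for _, group in groupby(enumerate(available), key=lambda p: p[1].toordinal() - p[0]):
        days = [d for _, d in group]
        if len(days) >= min_days:
            islands.append((days[0], days[-1], len(days)))
    return islands

for start, end, length in available_islands(rows, 2):
    print(start, end, length)
```

Like the SQL above, this reports whole islands rather than every 2-day pair inside them.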
Assuming you have a row for each date, you can use a single window function and no aggregation. That window function is a count of 'y' in the current row and the next n - 1 days:
select date, date + interval <n - 1> day
from (select t.*,
sum(availability = 'y') over (order by date rows between current row and <n - 1> following) as num_y
from t
) t
where num_y = <n>;
You can achieve that with the query below. First I number the rows with row_number(), then I use lead() to get the next consecutive date. The second parameter of lead() determines how many consecutive dates are considered.
WITH t AS (
SELECT date, ROW_NUMBER() OVER (ORDER BY date) rownumber
FROM hotel_availablities
WHERE availability = 'y'
),
t2 AS (
SELECT date StartDate,
LEAD(date, 1) OVER (PARTITION BY DATE_ADD(date, INTERVAL -rownumber DAY) ORDER BY date) EndDate
FROM t
)
SELECT CONCAT(startdate, ' - ', enddate) daterange
FROM t2
WHERE enddate IS NOT NULL
SQL/MySQL: split a quantity value into multiple rows by date
I have a table with three columns: planning_start_date - planning_end_date - quantity.
For example I have this data:
planning_start_date | planning_end_date | quantity
2019-03-01 | 2019-03-31 | 1500
I need to split the value 1500 into multiple rows with the average per day, so 1500 / 31 days = 48,38 per day.
The expected result should be:
date | daily_qty
2019-03-01 | 48,38
2019-03-02 | 48,38
2019-03-03 | 48,38
...
2019-03-31 | 48,38
Anyone with some suggestions?
Should you decide to upgrade to MySQL 8.0, here's a recursive CTE that will generate a list of all the days between planning_start_date and planning_end_date along with the required daily quantity:
WITH RECURSIVE cte AS (
SELECT planning_start_date AS date,
planning_end_date,
quantity / (DATEDIFF(planning_end_date, planning_start_date) + 1) AS daily_qty
FROM test
UNION ALL
SELECT date + INTERVAL 1 DAY, planning_end_date, daily_qty
FROM cte
WHERE date < planning_end_date
)
SELECT `date`, daily_qty
FROM cte
ORDER BY `date`
In MySQL 8+, you can use a recursive CTE like this:
with recursive cte(dte, planning_end_date, quantity, days) as (
select planning_start_date as dte, planning_end_date, quantity,
datediff(planning_end_date, planning_start_date) + 1 as days
from t
union all
select dte + interval 1 day as dte, planning_end_date, quantity, days
from cte
where dte < planning_end_date
)
select dte, quantity / days
from cte;
In earlier versions, you want a numbers table of some sort. For instance, if your table has enough rows, you can just use it. Because n starts at 1, the offset is n - 1 so that the start date itself is included:
select (planning_start_date + interval (n.n - 1) day),
quantity / (datediff(planning_end_date, planning_start_date) + 1)
from t join
(select (@rn := @rn + 1) as n
from t cross join
(select @rn := 0) params
) n
on planning_start_date + interval (n.n - 1) day <= planning_end_date;
You can use any table that is large enough for n.
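As a cross-check of the arithmetic, here is a short Python sketch of the same split: one row per day from the start date to the end date inclusive, each carrying quantity divided by the number of days. The name `split_daily` is made up for the illustration, and the input is the sample row from the question.

```python
from datetime import date, timedelta

def split_daily(start, end, quantity):
    """Return one (day, daily_qty) pair per day in the inclusive range."""
    days = (end - start).days + 1      # same as DATEDIFF(end, start) + 1
    daily_qty = quantity / days
    return [(start + timedelta(days=i), daily_qty) for i in range(days)]

daily_rows = split_daily(date(2019, 3, 1), date(2019, 3, 31), 1500)
for day, qty in daily_rows[:3]:
    print(day, qty)
```

The offsets run from 0 to days - 1, which is the same reason a 1-based tally table needs n - 1.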
Find number of "active" rows each month for multiple months in one query
I have a MySQL database with each row containing an activate and a deactivate date. This refers to the period of time when the object the row represents was active.
activate | deactivate | id
2015-03-01 | 2015-05-10 | 1
2013-02-04 | 2014-08-23 | 2
I want to find the number of rows that were active at any time during each month. Ex.
Jan: 4
Feb: 2
Mar: 1
etc...
I figured out how to do this for a single month, but I'm struggling with how to do it for all 12 months in a year in a single query. The reason I would like it in a single query is for performance, as information is used immediately and caching wouldn't make sense in this scenario.
Here's the code I have for a month at a time. It checks that the activate date comes before the end of the month in question and that the deactivate date was not before the beginning of the period in question.
SELECT * from tblName
WHERE activate <= DATE_SUB(NOW(), INTERVAL 1 MONTH)
AND deactivate >= DATE_SUB(NOW(), INTERVAL 2 MONTH)
If anybody has any idea how to change this and do grouping such that I can do this for an indefinite number of months I'd appreciate it. I'm at a loss as to how to group.
If you have a table of months that you care about, you can do:
select m.*,
(select count(*)
from table t
where t.activate_date <= m.month_end and
t.deactivate_date >= m.month_start
) as Actives
from months m;
If you don't have such a table handy, you can create one on the fly:
select m.*,
(select count(*)
from table t
where t.activate_date <= m.month_end and
t.deactivate_date >= m.month_start
) as Actives
from (select date('2015-01-01') as month_start, date('2015-01-31') as month_end union all
select date('2015-02-01') as month_start, date('2015-02-28') as month_end union all
select date('2015-03-01') as month_start, date('2015-03-31') as month_end union all
select date('2015-04-01') as month_start, date('2015-04-30') as month_end
) m;
EDIT:
A potentially faster way is to calculate a cumulative sum of activations and deactivations and then take the maximum per month:
select year(d), month(d), max(cumes)
from (select d, (@s := @s + inc) as cumes
from (select activate_date as d, 1 as inc from table t union all
select deactivate_date, -1 as inc from table t
) t cross join
(select @s := 0) param
order by d
) s
group by year(d), month(d);
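The core of the first query is the overlap test: a row is active in a month if it was activated on or before the month's last day and deactivated on or after its first day. Here is a minimal Python sketch of that test applied to every month of a year. The name `active_per_month` is made up for the illustration, and the periods are the two sample rows from the question.

```python
import calendar
from datetime import date

def active_per_month(periods, year):
    """Map month number -> count of periods overlapping that month of `year`."""
    counts = {}
    for month in range(1, 13):
        month_start = date(year, month, 1)
        last_day = calendar.monthrange(year, month)[1]   # handles leap years
        month_end = date(year, month, last_day)
        counts[month] = sum(
            1 for activate, deactivate in periods
            if activate <= month_end and deactivate >= month_start
        )
    return counts

periods = [
    (date(2015, 3, 1), date(2015, 5, 10)),
    (date(2013, 2, 4), date(2014, 8, 23)),
]
counts_2015 = active_per_month(periods, 2015)
print(counts_2015)
```

Computing the month boundaries instead of hard-coding them avoids having to list each month by hand, which is what a months table gives you in SQL.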
Calculating a Moving Average MySQL?
Good Day,
I am using the following code to calculate the 9 Day Moving average.
SELECT SUM(close)
FROM tbl
WHERE date <= '2002-07-05' AND name_id = 2
ORDER BY date DESC
LIMIT 9
But it does not work because it first calculates all of the returned fields before the limit is called. In other words it will calculate all the closes before or equal to that date, and not just the last 9.
So I need to calculate the SUM from the returned select, rather than calculate it straight. IE. Select the SUM from the SELECT...
Now how would I go about doing this and is it very costly or is there a better way?
If you want the moving average for each date, then try this:
SELECT date, SUM(close),
(select avg(close)
from tbl t2
where t2.name_id = t.name_id and
datediff(t.date, t2.date) between 0 and 8
) as mvgAvg
FROM tbl t
WHERE date <= '2002-07-05' and name_id = 2
GROUP BY date
ORDER BY date DESC
It uses a correlated subquery to average the closes from the 9 calendar days ending on each date, which is 9 values when there is one row per day.
Starting from MySQL 8, you should use window functions for this. Using the window RANGE clause, you can create a logical window over an interval, which is very powerful. Something like this:
SELECT date, close,
AVG (close) OVER (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
FROM tbl
WHERE date <= DATE '2002-07-05' AND name_id = 2
ORDER BY date DESC
For example:
WITH t (date, `close`) AS (
SELECT DATE '2020-01-01', 50 UNION ALL
SELECT DATE '2020-01-03', 54 UNION ALL
SELECT DATE '2020-01-05', 51 UNION ALL
SELECT DATE '2020-01-12', 49 UNION ALL
SELECT DATE '2020-01-13', 59 UNION ALL
SELECT DATE '2020-01-15', 30 UNION ALL
SELECT DATE '2020-01-17', 35 UNION ALL
SELECT DATE '2020-01-18', 39 UNION ALL
SELECT DATE '2020-01-19', 47 UNION ALL
SELECT DATE '2020-01-26', 50
)
SELECT date, `close`,
COUNT(*) OVER w AS c,
SUM(`close`) OVER w AS s,
AVG(`close`) OVER w AS a
FROM t
WINDOW w AS (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
ORDER BY date DESC
Leading to:
date      |close|c|s  |a      |
----------|-----|-|---|-------|
2020-01-26|   50|1| 50|50.0000|
2020-01-19|   47|2| 97|48.5000|
2020-01-18|   39|3|136|45.3333|
2020-01-17|   35|4|171|42.7500|
2020-01-15|   30|4|151|37.7500|
2020-01-13|   59|5|210|42.0000|
2020-01-12|   49|6|259|43.1667|
2020-01-05|   51|3|159|53.0000|
2020-01-03|   54|3|154|51.3333|
2020-01-01|   50|3|155|51.6667|
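To make the frame explicit: because the window is ordered by date descending, the rows "preceding" the current row are the later dates, so each frame holds the rows dated from the current date up to 9 days after it, inclusive. The Python sketch below recomputes the example under that reading. The names `closes` and `range_window` are made up for the illustration.

```python
from datetime import date, timedelta

# Sample (date, close) rows from the example above.
closes = [
    (date(2020, 1, 1), 50), (date(2020, 1, 3), 54), (date(2020, 1, 5), 51),
    (date(2020, 1, 12), 49), (date(2020, 1, 13), 59), (date(2020, 1, 15), 30),
    (date(2020, 1, 17), 35), (date(2020, 1, 18), 39), (date(2020, 1, 19), 47),
    (date(2020, 1, 26), 50),
]

def range_window(rows, days=9):
    """Return (date, count, sum, avg) per row, newest first."""
    out = []
    for d, _ in sorted(rows, reverse=True):
        upper = d + timedelta(days=days)
        # Frame: every row dated between d and d + days, both ends included.
        frame = [c for d2, c in rows if d <= d2 <= upper]
        out.append((d, len(frame), sum(frame), sum(frame) / len(frame)))
    return out

for row in range_window(closes):
    print(row)
```

The counts and sums match the c and s columns of the result table; the averages match up to rounding.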
Use something like
SELECT sum(close) as sum, avg(close) as average
FROM (
SELECT (close)
FROM tbl
WHERE date <= '2002-07-05' AND name_id = 2
ORDER BY date DESC
LIMIT 9
) temp
The inner query returns the 9 most recent filtered rows, and the outer query then sums and averages just those rows.
The query you gave doesn't work because the sum is calculated first and the LIMIT clause is applied only after the sum has already been calculated, giving you the sum of all the matching rows.
Another technique is to build a table of numbers:
CREATE TABLE `tinyint_asc` (
`value` tinyint(3) unsigned NOT NULL default '0',
PRIMARY KEY (value)
);
INSERT INTO `tinyint_asc` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100),(101),(102),(103),(104),(105),(106),(107),(108),(109),(110),(111),(112),(113),(114),(115),(116),(117),(118),(119),(120),(121),(122),(123),(124),(125),(126),(127),(128),(129),(130),(131),(132),(133),(134),(135),(136),(137),(138),(139),(140),(141),(142),(143),(144),(145),(146),(147),(148),(149),(150),(151),(152),(153),(154),(155),(156),(157),(158),(159),(160),(161),(162),(163),(164),(165),(166),(167),(168),(169),(170),(171),(172),(173),(174),(175),(176),(177),(178),(179),(180),(181),(182),(183),(184),(185),(186),(187),(188),(189),(190),(191),(192),(193),(194),(195),(196),(197),(198),(199),(200),(201),(202),(203),(204),(205),(206),(207),(208),(209),(210),(211),(212),(213),(214),(215),(216),(217),(218),(219),(220),(221),(222),(223),(224),(225),(226),(227),(228),(229),(230),(231),(232),(233),(234),(235),(236),(237),(238),(239),(240),(241),(242),(243),(244),(245),(246),(247),(248),(249),(250),(251),(252),(253),(254),(255);
Afterwards you can use it like this (the original referenced o.created_at with no alias o defined; tbl.date is assumed here):
select date_add(tbl.date, interval tinyint_asc.value day) as mydate, count(*), sum(myvalue)
from tbl inner join
tinyint_asc on tinyint_asc.value <= 30 -- for a 30 day moving average
where date(date_add(tbl.date, interval tinyint_asc.value day)) between '2016-01-01' and current_date()
group by mydate
This query is fast:
select date, name_id,
case @i when name_id then @i:=name_id
else (@i:=name_id) and (@n:=0) and (@a0:=0) and (@a1:=0) and (@a2:=0) and (@a3:=0) and (@a4:=0) and (@a5:=0) and (@a6:=0) and (@a7:=0) and (@a8:=0)
end as a,
case @n when 9 then @n:=9 else @n:=@n+1 end as n,
@a0:=@a1,@a1:=@a2,@a2:=@a3,@a3:=@a4,@a4:=@a5,@a5:=@a6,@a6:=@a7,@a7:=@a8,@a8:=close,
(@a0+@a1+@a2+@a3+@a4+@a5+@a6+@a7+@a8)/@n as av
from tbl,
(select @i:=0, @n:=0, @a0:=0, @a1:=0, @a2:=0, @a3:=0, @a4:=0, @a5:=0, @a6:=0, @a7:=0, @a8:=0) a
where name_id=2
order by name_id, date
If you need an average over 50 or 100 values, it's tedious to write, but worth the effort. The speed is close to the ordered select.
Find max of continuous streak and the current streak from datetime
I have the following data of a particular user -
Table temp -
time_stamp
2015-07-19 10:52:00
2015-07-18 10:49:00
2015-07-12 10:43:00
2015-06-08 12:32:00
2015-06-07 11:33:00
2015-06-06 10:05:00
2015-06-05 04:17:00
2015-04-14 04:11:00
2014-04-02 23:19:00
So the output for the query should be -
Maximum streak = 4, Current streak = 2
Max streak = 4 because of these -
2015-06-08 12:32:00
2015-06-07 11:33:00
2015-06-06 10:05:00
2015-06-05 04:17:00
And current streak is 2 because of these (Assuming today's date is 2015-07-19)-
2015-07-19 10:52:00
2015-07-18 10:49:00
EDIT: I want a simple SQL query for MYSQL
For the max streak you can use the query below; I have used the same query to calculate a max streak myself. This may help you:
SELECT *
FROM (
SELECT t.*,
IF(@prev + INTERVAL 1 DAY = t.d, @c := @c + 1, @c := 1) AS streak,
@prev := t.d
FROM (
SELECT date AS d, COUNT(*) AS n
FROM table_name
GROUP BY date
) AS t
INNER JOIN (SELECT @prev := NULL, @c := 1) AS vars
) AS t
ORDER BY streak DESC
LIMIT 1;
A general approach to gaps and islands queries is to tag each row with its rank in the data and with its position in the full list of dates. Rows in the same cluster all share the same combined value. Here the day offset falls by one as the rank rises by one, so their sum is constant within a streak.
Caveats: I don't know if this query will be efficient. MySQL does allow correlated scalar subqueries, and DATEDIFF gives the day interval. The query also assumes at most one row per user per day.
select user_id, max(time_stamp), count(*)
from (
select t.user_id, t.time_stamp,
(select count(*)
from T as t2
where t2.user_id = t.user_id and t2.time_stamp <= t.time_stamp
) as rnk,
datediff(current_date, t.time_stamp) as days
from T as t
) as data
group by user_id, days + rnk
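For comparison, here is a minimal Python sketch that computes both numbers the question asks for. It makes two assumptions that are not stated in the thread: several timestamps on the same day count once, and the current streak must include today, so it is 0 if there is no row for today. The names `streaks` and `stamps` are made up for the illustration.

```python
from datetime import date, datetime, timedelta

def streaks(timestamps, today):
    """Return (max_streak, current_streak) in days for a list of datetimes."""
    days = sorted({ts.date() for ts in timestamps})   # one entry per calendar day
    max_streak = run = 0
    prev = None
    for day in days:
        # Extend the run if this day directly follows the previous one.
        if prev is not None and day - prev == timedelta(days=1):
            run += 1
        else:
            run = 1
        max_streak = max(max_streak, run)
        prev = day
    # Current streak: walk backwards from today while days are present.
    day_set = set(days)
    current = 0
    day = today
    while day in day_set:
        current += 1
        day -= timedelta(days=1)
    return max_streak, current

# Sample rows from the question.
stamps = [
    datetime(2015, 7, 19, 10, 52), datetime(2015, 7, 18, 10, 49),
    datetime(2015, 7, 12, 10, 43), datetime(2015, 6, 8, 12, 32),
    datetime(2015, 6, 7, 11, 33), datetime(2015, 6, 6, 10, 5),
    datetime(2015, 6, 5, 4, 17), datetime(2015, 4, 14, 4, 11),
    datetime(2014, 4, 2, 23, 19),
]
print(streaks(stamps, date(2015, 7, 19)))
```

With today taken as 2015-07-19, this gives a maximum streak of 4 and a current streak of 2, matching the expected output in the question.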