MySQL moving average over past 14 days - mysql

In my MySQL database I have the following view (originally combining 2 data tables) called MYDATA.
TABLE MYDATA
user  myDate      items
17    2020-01-01  1.0
22    2020-01-01  6.0
17    2020-01-02  3.2
17    2020-01-04  4.0
17    2020-01-08  1.0
17    2020-01-09  6.2
22    2020-01-09  4.0
17    2020-01-10  5.3
As you can see, NOT all dates (column myDate) contain items. For a selected user (e.g. user 17) I need to calculate the moving average of sold items (column items) over the past 14 days ("this" day included) for ALL DATES (i.e. including 2020-01-03, which is not in the MYDATA table). So basically I want to obtain the following:
user  myDate      result
17    2020-01-01  (avg last 14 days)
17    2020-01-02  (avg last 14 days)
17    2020-01-03  (avg last 14 days)
...   ...         ...
17    2020-12-30  (avg last 14 days)
17    2020-12-31  (avg last 14 days)
Feel free to play with it in SQLFiddle: http://sqlfiddle.com/#!9/02cc94/1
If needed I have a table "calendar" containing all the year's dates as well.
TABLE CALENDAR
myDate
2020-01-01
2020-01-02
2020-01-03
2020-01-04
2020-01-05
2020-01-06
How can I proceed please? Thanks for any help. I've been stuck on this issue for months.

Try:
SELECT t2.user, t1.myDate, SUM(t2.items) / 14 AS avg_items
FROM calendar t1
JOIN test t2 ON t2.myDate BETWEEN t1.myDate - INTERVAL 13 DAY AND t1.myDate
GROUP BY t2.user, t1.myDate
ORDER BY 1, 2
fiddle (the MySQL 8-specific construct, a CTE, is used only to generate the calendar table).
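One caveat with the query above: an inner JOIN returns a calendar date only when at least one row falls inside its 14-day window, so dates with an empty window disappear. A LEFT JOIN with the user filter placed in the ON clause keeps every date. The following is a minimal sketch of that variant, run through Python's sqlite3 module purely so that it is self-contained. The table name mydata, the January-only calendar and SQLite's date(..., '-13 days') call are assumptions of this sketch; in MySQL the lower bound would stay t1.myDate - INTERVAL 13 DAY.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")

# Sample rows copied from the MYDATA table in the question.
conn.execute("CREATE TABLE mydata (user INTEGER, myDate TEXT, items REAL)")
conn.executemany(
    "INSERT INTO mydata VALUES (?, ?, ?)",
    [
        (17, "2020-01-01", 1.0), (22, "2020-01-01", 6.0),
        (17, "2020-01-02", 3.2), (17, "2020-01-04", 4.0),
        (17, "2020-01-08", 1.0), (17, "2020-01-09", 6.2),
        (22, "2020-01-09", 4.0), (17, "2020-01-10", 5.3),
    ],
)

# Assumption: a calendar limited to January 2020 keeps the sketch short.
conn.execute("CREATE TABLE calendar (myDate TEXT)")
conn.executemany(
    "INSERT INTO calendar VALUES (?)",
    [((date(2020, 1, 1) + timedelta(days=i)).isoformat(),) for i in range(31)],
)

# LEFT JOIN keeps every calendar date; the user filter lives in the ON
# clause so that dates with no matching rows still produce a result row.
# Days without sales count as zero because the sum is divided by 14.
QUERY = """
SELECT c.myDate,
       COALESCE(SUM(d.items), 0) / 14.0 AS avg_items
FROM calendar c
LEFT JOIN mydata d
       ON d.user = ?
      AND d.myDate BETWEEN date(c.myDate, '-13 days') AND c.myDate
GROUP BY c.myDate
ORDER BY c.myDate
"""

rows = conn.execute(QUERY, (17,)).fetchall()
for my_date, avg_items in rows[:5]:
    print(my_date, round(avg_items, 3))
```

Dividing by 14 treats missing days as zero sales, which matches the original query. If the average should cover only days that have data, AVG(d.items) would be the alternative.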

Related

Change Data Capture Redshift

I have a table
DAY 1
ID  amount  DATE
1   10      12-02-2020
2   15      12-02-2020
3   20      12-02-2020
4   25      12-02-2020
I summed the amount on day one, which comes to 70.
The next day, some rows are UPDATED and some are APPENDED.
The new table looks like this:
DAY 2
ID  amount  DATE
1   10      12-02-2020
2   20      13-02-2020
3   20      12-02-2020
4   25      12-02-2020
5   30      13-02-2020
6   35      14-02-2020
As you can see, ID 2 has an updated amount of 20 (previously 15),
and there is new data for dates 13 and 14 on IDs 5 and 6.
Can I run a query that processes only the changed data and adds it to the
previous sum?
So like 30 + 35 + 5 (as ID 2 only increased by 5 from its last value)
total = 70
The main goal is to process only the changed data.
This will very much depend on how the historical data is provided.
This example requires an additional Day column in the historical data table AND a MySQL version that supports LAG() (e.g. MySQL v8+ OR MariaDB 10.3+). Let's assume the historical data table can look like this:
ID  Amount  Date        Day
1   10      2020-02-12  1
2   15      2020-02-12  1
3   20      2020-02-12  1
4   25      2020-02-12  1
1   10      2020-02-12  2
2   20      2020-02-13  2
3   20      2020-02-12  2
4   25      2020-02-12  2
5   30      2020-02-13  2
6   35      2020-02-14  2
.. then maybe a query like this:
SELECT Day,
       SUM(amount) AS Total,
       SUM(amount) - LAG(SUM(amount)) OVER (ORDER BY Day) AS diff
FROM historical_data
GROUP BY Day
ORDER BY Day;
OR (for MariaDB):
SELECT Day, Total,
       Total - LAG(Total) OVER (ORDER BY Day) AS Diff
FROM
  (SELECT Day,
          SUM(amount) AS Total
   FROM historical_data
   GROUP BY Day) A;
This will return a result like:
Day  Total  diff
1    70
2    140    70
I was following an example from this site on how to use LAG() to get the value from the row above and subtract it from the SUM(amount) value for that day.
Here's a demo fiddle of the experiment.
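Where LAG() is not available, the same day-over-day difference can be obtained by joining the per-day totals to themselves. The sketch below is one possible formulation, not the method from the fiddle. It runs through Python's sqlite3 module only to stay self-contained, and it assumes the Day values are consecutive integers (an assumption of this sketch, not something the question guarantees).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE historical_data (ID INTEGER, amount INTEGER, Date TEXT, Day INTEGER)"
)

# Rows copied from the assumed historical data table above.
conn.executemany(
    "INSERT INTO historical_data VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2020-02-12", 1), (2, 15, "2020-02-12", 1),
        (3, 20, "2020-02-12", 1), (4, 25, "2020-02-12", 1),
        (1, 10, "2020-02-12", 2), (2, 20, "2020-02-13", 2),
        (3, 20, "2020-02-12", 2), (4, 25, "2020-02-12", 2),
        (5, 30, "2020-02-13", 2), (6, 35, "2020-02-14", 2),
    ],
)

# Each derived table holds one total per Day; the LEFT JOIN pairs a day
# with the day before it, so the first day gets a NULL difference.
QUERY = """
SELECT cur.Day, cur.Total, cur.Total - prev.Total AS diff
FROM (SELECT Day, SUM(amount) AS Total
      FROM historical_data GROUP BY Day) cur
LEFT JOIN (SELECT Day, SUM(amount) AS Total
           FROM historical_data GROUP BY Day) prev
       ON prev.Day = cur.Day - 1
ORDER BY cur.Day
"""

result = conn.execute(QUERY).fetchall()
print(result)  # [(1, 70, None), (2, 140, 70)]
```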

How to create a SQL query that calculates monthly growth in population

I want to create a SQL query that counts the number of babies born in month A, then counts the babies born in month B, but the second record should hold the sum of month A plus B. For example:
Month | Number
--------|---------
Jan | 5
Feb | 7 <- 2 babies were born here, but the 5 of the previous month are added
Mar | 13 <- 6 babies were born here, but the 7 of the two previous months are added
Can somebody please help me with this? Is it possible to do something like this?
I have a straightforward table with babyID, BirthDate, etc.
Thank you very much
Consider using a correlated subquery that calculates a running count. The outer query is an aggregate GROUP BY query, and the inner query is an aggregate evaluated once per group:
Using the following sample data:
babyID Birthdate
1 2015-01-01
2 2015-01-15
3 2015-01-20
4 2015-02-01
5 2015-02-03
6 2015-02-21
7 2015-03-11
8 2015-03-21
9 2015-03-27
10 2015-03-30
11 2015-03-31
SQL Query
SELECT MonthName(BirthDate) AS BirthMonth, Count(*) AS BabyCount,
       (SELECT Count(*) FROM BabyTable t2
        WHERE Month(t2.BirthDate) <= Month(BabyTable.BirthDate)) AS RunningCount
FROM BabyTable
GROUP BY Month(BirthDate)
Output
BirthMonth  BabyCount  RunningCount
January     3          3
February    3          6
March       5          11
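The query above compares month numbers only, so it fits data from a single year. A possible variant keys the running count on year and month together. The sketch below runs through Python's sqlite3 module to stay self-contained. The strftime('%Y-%m', ...) call is SQLite's, and using DATE_FORMAT(BirthDate, '%Y-%m') as the MySQL counterpart is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BabyTable (babyID INTEGER, BirthDate TEXT)")

# Sample data copied from the answer above.
conn.executemany(
    "INSERT INTO BabyTable VALUES (?, ?)",
    [
        (1, "2015-01-01"), (2, "2015-01-15"), (3, "2015-01-20"),
        (4, "2015-02-01"), (5, "2015-02-03"), (6, "2015-02-21"),
        (7, "2015-03-11"), (8, "2015-03-21"), (9, "2015-03-27"),
        (10, "2015-03-30"), (11, "2015-03-31"),
    ],
)

# Inner derived table: births per year-month.
# Correlated subquery: all births up to and including that year-month.
QUERY = """
SELECT m.ym AS BirthMonth,
       m.n  AS BabyCount,
       (SELECT COUNT(*) FROM BabyTable t2
        WHERE strftime('%Y-%m', t2.BirthDate) <= m.ym) AS RunningCount
FROM (SELECT strftime('%Y-%m', BirthDate) AS ym, COUNT(*) AS n
      FROM BabyTable
      GROUP BY strftime('%Y-%m', BirthDate)) m
ORDER BY m.ym
"""

running = conn.execute(QUERY).fetchall()
for row in running:
    print(row)
```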

schedule after 0:00 hour not working before first interval

I have a table with 5 schedule intervals:
id  time_int  progr_comment    A  B  C  D
1   05:30:00  Good Morning     1  4  2  17
2   06:50:00  Have a nice day  1  4  2  17
3   17:00:00  Welcome Home     1  4  4  23
4   18:30:00  Good Evening     1  4  2  22
5   22:00:00  Good Night       1  4  2  20
For each interval I compare time_int with NOW() and take the variables (A, B, C, D) from the row where time_int <= NOW(); my program then builds a MySQL query from them. The problem comes after 0:00: the query finds nothing until NOW() passes the first interval (e.g. 05:30:00).
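One way to look at the problem: between midnight and the first interval, the interval still in effect is the last one of the previous day. A possible approach, sketched here under that assumption, is to prefer rows with time_int <= now and otherwise fall back to the latest time_int. The sketch uses Python's sqlite3 module to stay self-contained. The table name schedule and the helper current_interval are hypothetical, and in MySQL the :now parameter could be replaced by CURTIME().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schedule (id INTEGER, time_int TEXT, progr_comment TEXT,"
    " A INTEGER, B INTEGER, C INTEGER, D INTEGER)"
)

# Rows copied from the table in the question.
conn.executemany(
    "INSERT INTO schedule VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "05:30:00", "Good Morning", 1, 4, 2, 17),
        (2, "06:50:00", "Have a nice day", 1, 4, 2, 17),
        (3, "17:00:00", "Welcome Home", 1, 4, 4, 23),
        (4, "18:30:00", "Good Evening", 1, 4, 2, 22),
        (5, "22:00:00", "Good Night", 1, 4, 2, 20),
    ],
)

# Rows that have already started today sort first (comparison yields 1),
# latest start first. If none has started yet, every row has flag 0 and
# the latest interval of the day (the one still running since yesterday)
# is returned instead.
QUERY = """
SELECT progr_comment, A, B, C, D
FROM schedule
ORDER BY (time_int <= :now) DESC, time_int DESC
LIMIT 1
"""


def current_interval(now):
    """Return the schedule row in effect at the 'HH:MM:SS' time given."""
    return conn.execute(QUERY, {"now": now}).fetchone()


print(current_interval("03:00:00"))  # ('Good Night', 1, 4, 2, 20)
print(current_interval("06:00:00"))  # ('Good Morning', 1, 4, 2, 17)
```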

Select rows from last existing 12 months

There's a DATETIME column called time. How could I select all rows that fall within the last existing 12 months (NOT within the last year from today)? Not every month might have a row, and months may have more than one row.
For example, out of this table (ORDER BY time DESC), rows with ids 2 to 17 would be selected.
id time
-- ----
17 2015-04-01
16 2015-04-01
15 2015-03-01
14 2015-02-01
13 2015-01-01
12 2014-12-01
11 2014-11-01
10 2014-10-01
9 2013-12-01
8 2013-11-01
7 2013-10-01
6 2013-09-01
5 2013-09-01
4 2013-09-01
3 2013-09-01
2 2013-08-01
1 2013-07-01
Another way to put this:
Take the table above and group by month/year, so we get:
2015-04
2015-03
2015-02
2015-01
2014-12
2014-11
2014-10
2013-12
2013-11
2013-10
2013-09
2013-08
2013-07
Now take the 12 most recent months from this list, which is everything except 2013-07.
2015-04
2015-03
2015-02
2015-01
2014-12
2014-11
2014-10
2013-12
2013-11
2013-10
2013-09
2013-08
And select everything from those months.
I guess I could do this with multiple queries or subqueries but is there another way to do this?
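If your time field is only month-precision, you could do it with a pretty simple subselect. MySQL does not accept LIMIT directly inside an IN (...) subquery, so the limited subselect is joined as a derived table here (Table is backquoted because it is a reserved word):
SELECT t1.*
FROM `Table` t1
JOIN (
  SELECT DISTINCT time FROM `Table` ORDER BY time DESC LIMIT 12
) recent ON recent.time = t1.time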
If your timestamps are full-precision, you could do the same thing, but you'd need to do some date manipulation to round the dates to the month for comparison.
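The month rounding mentioned above can be sketched as follows. This is a minimal illustration run through Python's sqlite3 module so that it is self-contained. The table name events is hypothetical, strftime('%Y-%m', time) is SQLite's way of truncating to the month, and DATE_FORMAT(time, '%Y-%m') as the MySQL counterpart is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, time TEXT)")

# Rows copied from the table in the question.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (17, "2015-04-01"), (16, "2015-04-01"), (15, "2015-03-01"),
        (14, "2015-02-01"), (13, "2015-01-01"), (12, "2014-12-01"),
        (11, "2014-11-01"), (10, "2014-10-01"), (9, "2013-12-01"),
        (8, "2013-11-01"), (7, "2013-10-01"), (6, "2013-09-01"),
        (5, "2013-09-01"), (4, "2013-09-01"), (3, "2013-09-01"),
        (2, "2013-08-01"), (1, "2013-07-01"),
    ],
)

# Derived table: the 12 most recent year-month values that actually occur.
# Outer query: every row whose year-month is one of those 12.
QUERY = """
SELECT t.id
FROM events t
JOIN (SELECT DISTINCT strftime('%Y-%m', time) AS ym
      FROM events
      ORDER BY ym DESC
      LIMIT 12) recent
  ON recent.ym = strftime('%Y-%m', t.time)
ORDER BY t.id
"""

selected_ids = [row[0] for row in conn.execute(QUERY)]
print(selected_ids)
```

With the sample data, only id 1 (2013-07, the thirteenth month) is left out.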

Find first and last business day in R or Mysql

I'm looking to get a list of the first and last business days of the month.
It's basically a list of business days:
2009-01-03
2009-01-04
2009-01-05
...
I just want to get a list of the first and last days, basically the max and min day (date) for each year-month combination.
Any suggestions?
Your question states that you already have a list of business days and that you need a way of finding the minimum and maximum for each year-month combination.
You can use ddply in package plyr to do this. I also make use of package lubridate because it has some convenience functions to extract the year and month from a date.
Create some data:
library(lubridate)
x <- sample(seq(as.Date("2011-01-01"), by="1 day", length.out=365), 100)
df <- data.frame(date=x, year=year(x), month=month(x))
Now extract the min and max for each month:
library(plyr)
ddply(df, .(year, month), summarize, first=min(date), last=max(date))
   year month      first       last
1  2011     1 2011-01-03 2011-01-30
2  2011     2 2011-02-03 2011-02-19
3  2011     3 2011-03-06 2011-03-29
4  2011     4 2011-04-09 2011-04-30
5  2011     5 2011-05-01 2011-05-29
6  2011     6 2011-06-04 2011-06-28
7  2011     7 2011-07-02 2011-07-29
8  2011     8 2011-08-10 2011-08-30
9  2011     9 2011-09-01 2011-09-28
10 2011    10 2011-10-07 2011-10-31
11 2011    11 2011-11-01 2011-11-28
12 2011    12 2011-12-01 2011-12-30
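Since the question also mentions MySQL, the same min/max per year-month grouping can be expressed in SQL. The sketch below is only an illustration, run through Python's sqlite3 module to stay self-contained. The table name business_days, the column name day and the sample dates are all made up for this sketch, and grouping on YEAR(day), MONTH(day) as the MySQL counterpart of strftime('%Y-%m', day) is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE business_days (day TEXT)")

# Hypothetical business days, invented for illustration only.
conn.executemany(
    "INSERT INTO business_days VALUES (?)",
    [
        ("2009-01-05",), ("2009-01-06",), ("2009-01-30",),
        ("2009-02-02",), ("2009-02-13",), ("2009-02-27",),
    ],
)

# One row per year-month with its earliest and latest listed day.
QUERY = """
SELECT strftime('%Y-%m', day) AS ym,
       MIN(day) AS first_day,
       MAX(day) AS last_day
FROM business_days
GROUP BY strftime('%Y-%m', day)
ORDER BY ym
"""

first_last = conn.execute(QUERY).fetchall()
for row in first_last:
    print(row)
```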