I'm trying to write a query that returns a cumulative count per week so I can plot a rising trend line. My table has records with a completion date (completed), and I'd like the query to produce a running total for each week. For example, if 10 records were completed in week 1, 15 in week 2, and 5 in week 3, the desired result would be:
Week 1 totals: 10
Week 2 totals: 25
Week 3 totals: 30
Sample data:
id status sched
12 Successful 2017-04-04 00:00:00.000
15 Successful 2017-06-20 19:30:00.000
18 Successful 2017-10-17 18:00:00.000
26 Successful 2017-04-05 00:00:00.000
29 Successful 2017-06-16 00:00:00.000
30 Successful 2017-04-06 00:00:00.000
31 Successful 2017-04-07 00:00:00.000
32 Successful 2017-04-06 00:00:00.000
34 Successful 2017-10-18 18:00:00.000
35 Successful 2017-06-13 00:00:00.000
This is the query I'm using to successfully generate data BY WEEK without any rollups. I tried adding "WITH ROLLUP" but it only gave the grand total at the end, not week by week.
select DATE_FORMAT(completed,'%d/%m/%Y') AS nd , wk, count(*)as totals
from
(
select id, completed, yearweek(completed)as wk from w10_upgrades
where status = 'Successful' and type = 'Normal'
and yearweek(completed) is not null
) x
GROUP BY wk
ORDER BY wk;
Desired output:
wk totals
201714 10
201715 25 (output would = week 201714 + 201715)
201716 55 (output would = week 201714 + 201715 + 201716)
etc.
Any direction is appreciated. I can't find anything related to this.
Final result
SET @runtot:=0;
SELECT
DATE_FORMAT(completed,'%d/%m/%Y') AS niceDate,
wk,
(@runtot := @runtot + c) AS rt
FROM
(SELECT
completed,
yearweek(completed)as wk,
COUNT(*) AS c
FROM `w10_upgrades`
where status = 'Successful' and type = 'Normal'
and yearweek(completed) is not null
GROUP BY wk
ORDER BY wk) x
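As a sanity check on the arithmetic the running-total query is meant to produce, here is a minimal Python sketch. The weekly counts are hypothetical values taken from the example in the question, not real query output. On MySQL 8.0+ a window function such as SUM(c) OVER (ORDER BY wk) should give the same cumulative result without user variables, though that has not been tested against this table.

```python
from itertools import accumulate

# Hypothetical per-week completion counts keyed by a YEARWEEK-style integer
# (values taken from the example in the question, not from a real table).
weekly_counts = {201714: 10, 201715: 15, 201716: 5}


def running_totals(counts):
    """Return (week, cumulative_total) pairs in ascending week order."""
    weeks = sorted(counts)
    totals = accumulate(counts[w] for w in weeks)
    return list(zip(weeks, totals))


for wk, rt in running_totals(weekly_counts):
    print(wk, rt)
```

Each week's value is the sum of its own count and all earlier weeks, which is what the rt column is expected to contain.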
Related
I have a MySQL requirement to select data from a table based on a start date and end date and group it by weekly also selecting the data in reverse order by date. Assume that, I have chosen the start date as 1st November and the end date as 04 December. Now, I would like to fetch the data as 04 December to 28 November, 27 November to 20 November, 19 November to 12 November and so on and sum the value count for that week.
Given an example table,
id value created_at
1 10 2021-10-11
2 13 2021-10-17
3 11 2021-10-25
4 8 2021-11-01
5 1 2021-11-10
6 4 2021-11-18
7 34 2021-11-25
8 17 2021-12-04
Now the result should be like 2021-12-04 to 2021-11-28 as one week, following the same in reverse order and summing the column value data for that week. I have tried in the query to add the interval of 7 days after the end date but it didn't work.
SELECT count(value) AS total, MIN(R.created_at)
FROM data_table AS D
WHERE D.created_at BETWEEN '2021-11-01' AND '2021-12-04' - INTERVAL 7 DAY ORDER BY D.created_at;
The last week may also have fewer than 7 days.
Expected output:
end_interval start_interval total
2021-12-04 2021-11-27 17
2021-11-27 2021-11-20 34
2021-11-20 2021-11-13 4
2021-11-13 2021-11-06 1
2021-11-06 2021-10-30 8
2021-10-30 2021-10-25 11
Note that the last week is only 5 days depending upon the selected from and end dates.
One option to address this problem is to:
1. generate a calendar of all your intervals, from the last date back to the first date, with a split of your choice, using a recursive query
2. join the calendar back to the original table
3. cap start_interval at your start_date value
4. aggregate the values for each interval
You can have three variables to be set, to customize your date intervals and position:
SET @start_date = DATE('2021-10-25');
SET @end_date = DATE('2021-12-04');
SET @interval_days = 7;
Then use the following query, as already described:
WITH RECURSIVE cte AS (
SELECT @end_date AS end_interval,
DATE_SUB(@end_date, INTERVAL @interval_days DAY) AS start_interval
UNION ALL
SELECT start_interval AS end_interval,
GREATEST(DATE(@start_date), DATE_SUB(start_interval, INTERVAL @interval_days DAY)) AS start_interval
FROM cte
WHERE start_interval > @start_date
)
)
SELECT end_interval, start_interval, SUM(_value) AS total
FROM cte
LEFT JOIN tab
ON tab.created_at BETWEEN start_interval AND end_interval
GROUP BY end_interval, start_interval
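To make the interval logic easier to follow, here is a rough Python sketch of the same idea: walk backwards from the end date in fixed steps, cap the last interval at the start date, and sum the values that fall inside each interval. The function names and the rows list are illustrative assumptions built from the sample table in the question. Like BETWEEN, the comparison is inclusive on both ends, so a row dated exactly on a boundary would be counted in two adjacent intervals.

```python
from datetime import date, timedelta

# Sample rows from the question: (created_at, value).
rows = [
    (date(2021, 10, 11), 10), (date(2021, 10, 17), 13),
    (date(2021, 10, 25), 11), (date(2021, 11, 1), 8),
    (date(2021, 11, 10), 1), (date(2021, 11, 18), 4),
    (date(2021, 11, 25), 34), (date(2021, 12, 4), 17),
]


def backward_intervals(start_date, end_date, interval_days=7):
    """Yield (end_interval, start_interval) pairs walking back from end_date.

    The start of the last interval is capped at start_date, so it may be
    shorter than interval_days.
    """
    step = timedelta(days=interval_days)
    end = end_date
    while end > start_date:
        start = max(start_date, end - step)
        yield end, start
        end = start


def totals_per_interval(rows, start_date, end_date, interval_days=7):
    """Sum values per interval; both bounds are inclusive, as with BETWEEN."""
    return [
        (end, start, sum(v for d, v in rows if start <= d <= end))
        for end, start in backward_intervals(start_date, end_date, interval_days)
    ]


for end, start, total in totals_per_interval(rows, date(2021, 10, 25), date(2021, 12, 4)):
    print(end, start, total)
```

One difference from the SQL: an interval with no matching rows yields 0 here, whereas SUM over a LEFT JOIN with no matches yields NULL.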
Update: this can be done with Python.
I have a table like this:
event_id vendor_id start_date end_date
1 100 2021-01-01 2021-01-31
2 101 2021-01-15 2021-02-15
3 102 2021-02-01 2021-02-31
4 103 2021-02-01 2021-03-31
5 104 2021-03-01 2021-03-31
6 105 2021-03-01 2021-04-31
7 100 2021-04-01 2021-04-31
I would like an output like this: the number of events per month. If an event spans two or more months, it must be counted in each of those months. For example, the event in the second row (event_id=2) takes place in both January and February, so it should be included in the totals for both months.
output:
month total_event
2021-01 2 ---->> event_id=(1,2)
2021-02 3 ---->> event_id=(2,3,4)
2021-03 3 ---->> event_id=(4,5,6)
2021-04 2 ---->> event_id=(6,7)
Note: the "---->> event_id=" part is only there to make the example easier to follow. I don't need it; I just need month and total_event.
I tried this query:
select date_format(start_date,'%Y-%m') as month,count(event_id) as total_event
group by date_format(start_date,'%Y-%m')
month total_event
2021-01 2
2021-02 2
2021-03 2
2021-04 1
But it counts only by start_date, so events that continue into later months are missing from those months' counts.
Idea
Get the list of valid months from the table.
Calculate the event counts by joining the event table with those months.
MySQL 8.0+
We can generate the list of valid months with a recursive CTE.
Here is the full SQL, assuming your event table is named c:
WITH RECURSIVE all_dates(dt) AS (
-- anchor
SELECT MIN(c.`start_date`) AS dt FROM c
UNION ALL
-- recursion with stop condition
SELECT dt + INTERVAL 1 MONTH
FROM all_dates WHERE dt + INTERVAL 1 MONTH <= (SELECT MAX(c.end_date) FROM c)
)
SELECT LEFT(dt, 7) AS `month`, COUNT(d.dt) AS total_event, GROUP_CONCAT(DISTINCT c.`event_id`) AS event_ids FROM all_dates d
INNER JOIN c ON LEFT(d.dt, 7) >= LEFT(c.start_date, 7) AND LEFT(d.dt, 7) <= LEFT(c.end_date, 7)
GROUP BY LEFT(dt, 7);
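The overlap condition in the join (a month counts for an event when it lies between the event's start month and end month, inclusive) can be checked with a small Python sketch. Months are represented as (year, month) tuples, an assumption made here so that the impossible calendar dates in the sample data (such as 2021-02-31) do not matter. All names are illustrative.

```python
# Sample events from the question as (event_id, start_month, end_month),
# where each month is a (year, month) tuple.
events = [
    (1, (2021, 1), (2021, 1)),
    (2, (2021, 1), (2021, 2)),
    (3, (2021, 2), (2021, 2)),
    (4, (2021, 2), (2021, 3)),
    (5, (2021, 3), (2021, 3)),
    (6, (2021, 3), (2021, 4)),
    (7, (2021, 4), (2021, 4)),
]


def month_range(first, last):
    """Yield (year, month) tuples from first to last, inclusive."""
    y, m = first
    while (y, m) <= last:
        yield (y, m)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)


def events_per_month(events):
    """Count, for each month, the events whose span covers that month."""
    first = min(start for _, start, _ in events)
    last = max(end for _, _, end in events)
    return {
        f"{y}-{m:02d}": sum(1 for _, s, e in events if s <= (y, m) <= e)
        for y, m in month_range(first, last)
    }


print(events_per_month(events))
```

An event spanning several months contributes one count to each of them, which is the behaviour the question asks for.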
Changing the question because of a misunderstanding in use case.
Amazon Redshift Query for the following problem statement.
The data structure:
id - primary key
acc_id - id unique to a loan account (this id is the same for all EMIs of a particular loan account; it may be repeated 6 or 12 times depending on the loan tenure, which can be 6 or 12 months)
status - PAID or UNPAID (an unpaid EMI is followed only by unpaid EMIs)
s_id - a scheduling id, with consecutive numbers for a particular loan id
due_date - the due date for that particular EMI
principal - the amount that is due
The table:
id acc_id status s_id due_date principal
9999957 10003 PAID 102 2018-07-02 12:00:00 4205
9999958 10003 UNPAID 103 2018-08-02 12:00:00 4100
9999959 10003 UNPAID 104 2018-09-02 12:00:00 4266
9999960 10003 UNPAID 105 2018-10-02 12:00:00 4286
9999962 10004 PAID 106 2018-07-02 12:00:00 3200
9999963 10004 PAID 107 2018-08-02 12:00:00 3100
9999964 10004 UNPAID 108 2018-09-02 12:00:00 3266
9999965 10004 UNPAID 109 2018-10-02 12:00:00 3286
The use case -
The unpaid amount becomes delinquent (overdue) after the due_date.
So I need to calculate the delinquent amount at the end of every month, from the first due_date (2nd July in this case) to the last due_date (assume it to be 2nd November, which is the current month).
I also need to calculate days past due at the end of that month.
Illustration from the above data:
From the sample data provided, no EMI is due at the end of July so amount delinquent is 0
But at the end of August, id 9999958 is due: as of 31st August the amount delinquent is 4100 and days past due is 29 (31st August minus 2nd August).
The catch: I need to calculate this for the loan (acc_id) and not the emi.
To explain further: a first EMI will be 29 days past due in the first month and 59 days past due in the second month, while the second EMI will be 29 days past due in the second month. But I need this at loan level (acc_id).
Continuing the same example for 30th September: acc_id 10003 has been due since 2nd August, so as of 30th September the due amount is 8366 (4100 + 4266) and DPD (days_past_due) is 59 (29 + 30).
Also, acc_id 10004 is due 3100 and its DPD is 28 (30th September minus 2nd September).
The final output would be something like this:
Month_End DPD_Band Amount
2018/08/31 0-29 4100
2018/08/31 30-59 0
2018/08/31 60-89 0
2018/08/31 90+ 0
2018/09/30 0-29 3100
2018/09/30 30-59 8366
2018/09/30 60-89 0
2018/09/30 90+ 0
Query attempt: DPD bands can be created with CASE statements on delinquent days. The main help I need is first generating the month-end dates and then finding the portfolio-level amounts, as explained above, for the different delinquent-day bands.
Edited to be Redshift compatible after the OP clarified which RDBMS is in use. (MySQL would need a different answer.)
The following creates one record for each month between your first record, and the end of last month.
It then joins onto your unpaid records, and the aggregation chooses which bracket to put the results into.
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
SUM(CASE WHEN days_past_due >= 00 AND days_past_due < 30 THEN principal ELSE 0 END) AS dpd_00_29,
SUM(CASE WHEN days_past_due >= 30 AND days_past_due < 60 THEN principal ELSE 0 END) AS dpd_30_59,
SUM(CASE WHEN days_past_due >= 60 AND days_past_due < 90 THEN principal ELSE 0 END) AS dpd_60_89,
SUM(CASE WHEN days_past_due >= 90 THEN principal ELSE 0 END) AS dpd_90plus
FROM
monthly_delinquents
GROUP BY
month_end_date
ORDER BY
month_end_date
That said, pivoting results like this is normally a bad idea. What happens when something is a year past due? It just sits in the 90plus category and never moves. And if you want to add more bands, you need to change this query and any other query you ever write that depends on it.
Instead, you could normalise your output...
WITH
first_month AS
(
SELECT LAST_DAY(MIN(due_date)) AS end_date FROM yourTable
),
months AS
(
SELECT
LAST_DAY(ADD_MONTHS(first_month.end_date, s.id)) AS end_date
FROM
first_month
CROSS JOIN
generate_series(
1,
DATEDIFF(month, (SELECT end_date FROM first_month), CURRENT_DATE)
)
AS s(id)
),
monthly_delinquents AS
(
SELECT
yourTable.*,
months.end_date AS month_end_date,
DATEDIFF(DAY, yourTable.due_date, months.end_date) AS days_past_due
FROM
months
LEFT JOIN
yourTable
ON yourTable.status = 'UNPAID'
AND yourTable.due_date < months.end_date
)
SELECT
month_end_date,
(days_past_due / 30) * 30 AS days_past_due_band,
SUM(principal) AS total_principal,
COUNT(*) AS total_rows
FROM
monthly_delinquents
GROUP BY
month_end_date,
(days_past_due / 30) * 30
ORDER BY
month_end_date,
(days_past_due / 30) * 30
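The banding expression (days_past_due / 30) * 30 relies on integer division to round each value down to the start of its 30-day band. A minimal Python sketch of that step, using the unpaid rows from the question and two month-end dates chosen for illustration, is shown below. Note that it follows the query above and bands each EMI row separately; it does not implement the loan-level (acc_id) roll-up the question describes. Times of day are dropped for simplicity.

```python
from collections import defaultdict
from datetime import date

# Unpaid rows from the question as (acc_id, due_date, principal).
unpaid = [
    (10003, date(2018, 8, 2), 4100),
    (10003, date(2018, 9, 2), 4266),
    (10003, date(2018, 10, 2), 4286),
    (10004, date(2018, 9, 2), 3266),
    (10004, date(2018, 10, 2), 3286),
]

# Month-end dates chosen for illustration.
month_ends = [date(2018, 8, 31), date(2018, 9, 30)]


def band_totals(rows, month_ends):
    """Sum principal per (month_end, band_start), one band per 30 days."""
    totals = defaultdict(int)
    for month_end in month_ends:
        for _acc_id, due, principal in rows:
            if due < month_end:  # already past due at this month end
                days_past_due = (month_end - due).days
                band_start = days_past_due // 30 * 30  # 0, 30, 60, ...
                totals[(month_end, band_start)] += principal
    return dict(totals)


for key, amount in sorted(band_totals(unpaid, month_ends).items()):
    print(key, amount)
```

Because the band is computed rather than hard-coded, an item a year past due simply lands in band 360 without any change to the code.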
id modid userid timemodified FROM_UNIXTIME(timemodified,'%d-%m-%Y')
410 32 46 1438971403 03-08-2015
411 32 46 1438971403 03-08-2015
412 66 977 1438971403 07-08-2015
412 66 977 1438971403 07-08-2015
413 67 34 1438971423 07-08-2015
414 68 16 1438971424 07-08-2015
415 132 23 1438972154 07-08-2015
416 134 2 1438972465 08-08-2015
417 115 2 1438996430 08-08-2015
418 130 977 1438996869 08-08-2015
I wrote the query below to fetch the rows from the last 4 weeks, calculated from today's date. Now I want to show the users for each of the 4 weeks individually (week1, week2, week3 and week4), either column-wise or row-wise, whichever is best.
In detail, I need to separate that query's data into week1 to week4, like:
Week4 : No user
Week3 : 2 users (2,977)
Week2 : 4 users (16, 23, 34, 977)
Week1 : 1 user (46)
SET @unix_four_weeks_ago = UNIX_TIMESTAMP(curdate()) - 2419200;
SELECT *,FROM_UNIXTIME(timemodified,'%d-%m-%Y') FROM mod_users WHERE timemodified >= @unix_four_weeks_ago
My guess is that you want to split the user count per week according to the timemodified column. I would use the WEEK() function to do that. Because timemodified holds a Unix timestamp, it has to be converted with FROM_UNIXTIME() first.
The following SQL would add a weeknumber column to identify the week number:
SELECT WEEK(FROM_UNIXTIME(timemodified)) weeknumber, dates.*
FROM dates
Then, if you want to get the distinct user count, you can simply use the following SQL:
SELECT WEEK(FROM_UNIXTIME(timemodified)) weeknumber, COUNT(DISTINCT(userid)) users_count
FROM dates
GROUP BY weeknumber
You can also add a WHERE clause to get only certain weeks as you wish. So, to get the last 4 weeks from the 23-08-2015, I would do:
SELECT WEEK(FROM_UNIXTIME(timemodified)) weeknumber, COUNT(DISTINCT(userid)) users_count
FROM dates
WHERE WEEK(FROM_UNIXTIME(timemodified)) <= WEEK('2015-08-23')
AND WEEK(FROM_UNIXTIME(timemodified)) > (WEEK('2015-08-23') - 4)
GROUP BY weeknumber
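For comparison, here is a minimal Python sketch of the same grouping: convert each Unix timestamp to a date, derive a week key, and collect the distinct users per week. The rows are hypothetical, and the sketch uses ISO week numbers in UTC, which is an assumption; MySQL's WEEK() in its default mode numbers weeks differently, so the week keys may not match the SQL output.

```python
from collections import defaultdict
from datetime import datetime, timezone


def ts(year, month, day):
    """Build a Unix timestamp for midnight UTC on the given day."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Hypothetical (userid, timemodified) rows, not the exact data from the question.
rows = [
    (46, ts(2015, 8, 3)), (46, ts(2015, 8, 3)),
    (977, ts(2015, 8, 7)), (34, ts(2015, 8, 7)),
    (2, ts(2015, 8, 12)),
]


def users_per_week(rows):
    """Return {(iso_year, iso_week): sorted distinct user ids}."""
    weeks = defaultdict(set)
    for userid, timemodified in rows:
        iso = datetime.fromtimestamp(timemodified, tz=timezone.utc).isocalendar()
        weeks[(iso[0], iso[1])].add(userid)
    return {key: sorted(users) for key, users in sorted(weeks.items())}


for week, users in users_per_week(rows).items():
    print(week, len(users), users)
```

Keying on (year, week) rather than the week number alone avoids mixing weeks from different years, which a bare week-number comparison cannot distinguish.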
Let's hope I assumed correctly. :-)
I'm trying to find the number of orders that were open for each week. An order that is open for multiple weeks should be included in each week's count that it was open. The data looks something like below
id open_dt close_dt
1 2014-01-01 07:00:00 2014-01-01 07:00:00
2 2014-01-01 07:00:00 2014-01-02 07:00:00
3 2014-01-02 07:00:00 2014-01-09 07:00:00
4 2014-01-08 07:00:00 NULL
A NULL close_dt means the order is still open, so it should appear in every week since it was opened.
My query looks like the one below, but it isn't returning the numbers I'm expecting:
SELECT YEAR(open_dt) AS year, WEEK(open_dt) AS week, count(*) 'open'
FROM table
WHERE open_dt >= week(open_dt)
OR
(
close_dt > week(open_dt)
OR close_dt IS NULL
)
GROUP BY YEAR(open_dt), WEEK(open_dt)
I'm trying to get results like below:
year week open
2014 1 3
2014 2 2
2014 3 1
...
Appreciate any tips or guidance.
This is a case where it helps to have a calendar table or list of weeks. Let me assume that you have at least one order opened in each week:
select yw.y, yw.w, count(t.open_dt) as "Open"
from (select distinct year(open_dt) as y, week(open_dt) as w,
year(open_dt) * 100 + week(open_dt) as yw
from table t
) yw left outer join
table t
on yw.yw >= year(open_dt)*100 + week(open_dt) and
(yw.yw <= year(close_dt)*100 + week(close_dt) or close_dt is null)
group by yw.y, yw.w
order by yw.y, yw.w;
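The counting rule behind that join (an order is open in a week if it was opened in or before that week and was either never closed or closed in or after that week) can be sketched in Python. Unlike the query, this sketch builds the list of weeks itself up to an as_of date, which is an assumed parameter, so weeks with no newly opened orders still appear. Weeks start on Monday here, which is also an assumption and differs from MySQL's default WEEK() mode.

```python
from datetime import date, timedelta

# Sample orders from the question as (open_date, close_date or None).
orders = [
    (date(2014, 1, 1), date(2014, 1, 1)),
    (date(2014, 1, 1), date(2014, 1, 2)),
    (date(2014, 1, 2), date(2014, 1, 9)),
    (date(2014, 1, 8), None),
]


def week_start(d):
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def open_per_week(orders, as_of):
    """Return (week_start, open_count) for every week up to as_of."""
    week = week_start(min(opened for opened, _ in orders))
    last = week_start(as_of)
    result = []
    while week <= last:
        count = sum(
            1
            for opened, closed in orders
            if week_start(opened) <= week
            and (closed is None or week_start(closed) >= week)
        )
        result.append((week, count))
        week += timedelta(days=7)
    return result


for week, count in open_per_week(orders, as_of=date(2014, 1, 15)):
    print(week, count)
```

With the sample data this reproduces the 3, 2, 1 pattern from the expected results, with the still-open order carried into the final week.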