MySQL last_value window function on a datetime column - mysql

Below is the data set I am working with:
customer_id, event_date, status, credit_limit
1, 2019-1-1, C, 1000
1, 2019-1-5, F, 1000
1, 2019-3-10, [NULL], 1000
1, 2019-3-10, [NULL], 1000
1, 2019-8-27, L, 1000
2, 2019-1-1, L, 2000
2, 2019-1-5, [NULL], 2500
2, 2019-3-10, [NULL], 2500
3, 2019-1-1, S, 5000
3, 2019-1-5, [NULL], 6000
3, 2019-3-10, B, 5000
4, 2019-3-10, B, 10000
I am trying to solve for the following:
For each customer_id, show account status at month end for the year 2019
I have tried using the window function last_value(), but it does not give me the latest date in a month. Here is my query:
with cte1 as
(select customer_id, status,
event_date,
last_value(date_format(event_date, '%Y-%m-%d')) over ( partition by customer_id, event_date
order by event_date) as l_v
from cust_acct ca
where event_date between "2019-01-01 00:00:00" and "2019-12-31 23:59:59")
select * from cte1
It returns:
Customer_id, Status, Event_date, L_v
1, C, 2019-01-01 00:00:00, 2019-01-01
1, F, 2019-01-05 00:00:00, 2019-01-05
1, [NULL], 2019-03-10 00:00:00, 2019-03-10
1, [NULL], 2019-03-10 00:00:00, 2019-03-10
1, L, 2019-08-27 00:00:00, 2019-08-27
2, L, 2019-01-01 00:00:00, 2019-01-01
2, [NULL], 2019-01-05 00:00:00, 2019-01-05
2, [NULL], 2019-03-10 00:00:00, 2019-03-10
3, S, 2019-01-01 00:00:00, 2019-01-01
3, [NULL], 2019-01-05 00:00:00, 2019-01-05
3, B, 2019-03-10 00:00:00, 2019-03-10
4, B, 2019-03-10 00:00:00, 2019-03-10
Customer_id 1, for month 2019-01, should have a last_value of '2019-01-05' in the column l_v. Why is the query showing both dates in January in column l_v?

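LAST_VALUE() is not the proper window function in this case. Your query partitions by customer_id, event_date, so every partition holds a single date, and with an ORDER BY the default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) ends at the current row anyway, so LAST_VALUE() just returns the current row's date.
Partition by month instead; LAST_VALUE() can then be used only if you extend the window frame: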
WITH cte1 AS (
SELECT customer_id, status, event_date,
LAST_VALUE(DATE(event_date)) OVER (
PARTITION BY customer_id, DATE_FORMAT(event_date, '%Y-%m')
ORDER BY event_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS l_v
FROM cust_acct ca
WHERE event_date BETWEEN '2019-01-01 00:00:00' AND '2019-12-31 23:59:59'
)
SELECT * FROM cte1;
You should use FIRST_VALUE():
WITH cte1 AS (
SELECT customer_id, status, event_date,
FIRST_VALUE(DATE(event_date)) OVER (
PARTITION BY customer_id, DATE_FORMAT(event_date, '%Y-%m')
ORDER BY event_date DESC
) AS l_v
FROM cust_acct ca
WHERE event_date BETWEEN '2019-01-01 00:00:00' AND '2019-12-31 23:59:59'
)
SELECT * FROM cte1;
or better MAX():
WITH cte1 AS (
SELECT customer_id, status, event_date,
MAX(DATE(event_date)) OVER (
PARTITION BY customer_id, DATE_FORMAT(event_date, '%Y-%m')
) AS l_v
FROM cust_acct ca
WHERE event_date BETWEEN '2019-01-01 00:00:00' AND '2019-12-31 23:59:59'
)
SELECT * FROM cte1;
See the demo.
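To see the frame behaviour side by side, here is a minimal sketch that runs the three variants against a few rows from the question. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, with strftime('%Y-%m', ...) in place of DATE_FORMAT; both engines use the same default frame when ORDER BY is present.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8; a few rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust_acct (customer_id INT, event_date TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO cust_acct VALUES (?, ?, ?)",
    [
        (1, "2019-01-01", "C"),
        (1, "2019-01-05", "F"),
        (1, "2019-03-10", None),
        (1, "2019-08-27", "L"),
        (2, "2019-01-01", "L"),
        (2, "2019-01-05", None),
    ],
)

query = """
SELECT customer_id, event_date,
       -- default frame: stops at the current row
       LAST_VALUE(event_date) OVER (
           PARTITION BY customer_id, strftime('%Y-%m', event_date)
           ORDER BY event_date
       ) AS default_frame,
       -- explicit frame: covers the whole month partition
       LAST_VALUE(event_date) OVER (
           PARTITION BY customer_id, strftime('%Y-%m', event_date)
           ORDER BY event_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS full_frame,
       -- no ORDER BY: the aggregate sees the whole partition
       MAX(event_date) OVER (
           PARTITION BY customer_id, strftime('%Y-%m', event_date)
       ) AS month_max
FROM cust_acct
ORDER BY customer_id, event_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For customer 1 on 2019-01-01, default_frame stays at 2019-01-01, while full_frame and month_max both give 2019-01-05.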

Related

count by hours in between with start and end time data

In table, data is in Timestamp format, but I shared it in Time(start_at), Time(end_at) format.
Table structure:
id, start_at, end_at
1, 03:00:00, 06:00:00
2, 02:00:00, 05:00:00
3, 01:00:00, 08:00:00
4, 08:00:00, 13:00:00
5, 09:00:00, 21:00:00
6, 13:00:00, 16:00:00
6, 15:00:00, 19:00:00
For result we need to count ids which were active in between the start_at, end_at time.
hours, count
0, 0
1, 1
2, 2
3, 3
4, 3
5, 2
6, 1
7, 1
8, 1
9, 2
10, 2
11, 2
12, 2
13, 3
14, 2
15, 3
16, 2
17, 2
18, 2
19, 1
20, 1
21, 0
22, 0
23, 0
Either
WITH RECURSIVE
cte AS (
SELECT 0 `hour`
UNION ALL
SELECT `hour` + 1 FROM cte WHERE `hour` < 23
)
SELECT cte.`hour`, COUNT(test.id) `count`
FROM cte
LEFT JOIN test ON cte.`hour` >= HOUR(test.start_at)
AND cte.`hour` < HOUR(test.end_at)
GROUP BY 1
ORDER BY 1;
or
WITH RECURSIVE
cte AS (
SELECT CAST('00:00:00' AS TIME) `hour`
UNION ALL
SELECT `hour` + INTERVAL 1 HOUR FROM cte WHERE `hour` < '23:00:00'
)
SELECT cte.`hour`, COUNT(test.id) `count`
FROM cte
LEFT JOIN test ON cte.`hour` >= test.start_at
AND cte.`hour` < test.end_at
GROUP BY 1
ORDER BY 1;
The 1st query returns a numeric value for the hours column whereas the 2nd one returns this column in time format. Select the variant which suits you.
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=5a77b6e3158be06c7a551cb7e64673de
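A minimal sketch of the numeric-hour variant, run on the sample rows from the question. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL; SQLite has no HOUR(), so the hour is taken from the first two characters of the time string, and the CTE column is named hr here purely for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8; times stored as 'HH:MM:SS' text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INT, start_at TEXT, end_at TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, "03:00:00", "06:00:00"),
        (2, "02:00:00", "05:00:00"),
        (3, "01:00:00", "08:00:00"),
        (4, "08:00:00", "13:00:00"),
        (5, "09:00:00", "21:00:00"),
        (6, "13:00:00", "16:00:00"),
        (6, "15:00:00", "19:00:00"),
    ],
)

query = """
WITH RECURSIVE cte(hr) AS (
    SELECT 0
    UNION ALL
    SELECT hr + 1 FROM cte WHERE hr < 23
)
SELECT cte.hr, COUNT(test.id) AS cnt
FROM cte
LEFT JOIN test
       ON cte.hr >= CAST(substr(test.start_at, 1, 2) AS INTEGER)  -- stand-in for HOUR(start_at)
      AND cte.hr <  CAST(substr(test.end_at, 1, 2) AS INTEGER)    -- stand-in for HOUR(end_at)
GROUP BY cte.hr
ORDER BY cte.hr
"""
counts = dict(conn.execute(query).fetchall())
print(counts)
```

Because the comparison is half-open (start <= hour < end), a row that ends at 13:00:00 is not counted for hour 13, so that hour comes out as 2 with this data. The LEFT JOIN keeps hours with no active rows, which is why COUNT(test.id) yields 0 for them.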

Get active users by month

Using MySQL, I'm trying to get the number of active users I have in any given month. I have a table with ActivationDate and TerminationDate columns, and if the month being counted is after the ActivationDate and TerminationDate is null, then the user is active and should be counted. I would like to summarize these amounts by month. I'm thinking I could just sum each side and calculate the total but breaking that down won't give me a running total. I've tried with window functions, but I don't have enough experience with them to know exactly what I'm doing wrong and I'm not certain how to ask the right question.
So for instance, if I have the following data...
UserId ActivationDate TerminationDate
1 2020-01-01 null
2 2020-01-15 null
3 2020-01-20 2020-01-30
4 2020-02-01 null
5 2020-02-14 2020-02-27
6 2020-02-15 2020-02-28
7 2020-03-02 null
8 2020-03-05 null
9 2020-03-20 2020-03-21
I would like my results to be similar to:
2020-01 2 (there are 2 active users, since one signed up but cancelled before the end of the month)
2020-02 3 (2 from the previous month, plus 1 that signed up this month and is still active)
2020-03 5 (3 from previous, 2 new, 1 cancellation)
You can unpivot, then aggregate and sum. In MySQL 8.0.14 or higher, you can use a lateral join:
select date_format(x.dt, '%Y-%m-01') as dt_month,
sum(sum(cnt)) over(order by date_format(x.dt, '%Y-%m-01')) as cnt_active_users
from mytable t
cross join lateral (
select t.activationdate as dt, 1 as cnt
union all select t.terminationdate, -1
) x
where x.dt is not null
group by dt_month
order by dt_month
In earlier 8.x versions:
select date_format(x.dt, '%Y-%m-01') as dt_month,
sum(sum(cnt)) over(order by date_format(x.dt, '%Y-%m-01')) as cnt_active_users
from (
select activationdate as dt, 1 as cnt from mytable
union all select terminationdate, -1 from mytable
) x
where x.dt is not null
group by dt_month
order by dt_month
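The sum(sum(cnt)) over(...) expression can be read as two steps: a net change per month (+1 per activation, -1 per termination), then a running total of those net changes. Here is a minimal sketch that spells the two steps out, assuming SQLite via Python's sqlite3 as a stand-in for MySQL and strftime in place of date_format; the CTE names events and monthly are made up for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (userid INT, activationdate TEXT, terminationdate TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", None),
        (2, "2020-01-15", None),
        (3, "2020-01-20", "2020-01-30"),
        (4, "2020-02-01", None),
        (5, "2020-02-14", "2020-02-27"),
        (6, "2020-02-15", "2020-02-28"),
        (7, "2020-03-02", None),
        (8, "2020-03-05", None),
        (9, "2020-03-20", "2020-03-21"),
    ],
)

query = """
WITH events AS (
    -- unpivot: one +1 row per activation, one -1 row per termination
    SELECT activationdate AS dt, 1 AS cnt FROM mytable
    UNION ALL
    SELECT terminationdate, -1 FROM mytable
),
monthly AS (
    -- step 1: net change in active users per month
    SELECT strftime('%Y-%m', dt) AS dt_month, SUM(cnt) AS net_change
    FROM events
    WHERE dt IS NOT NULL
    GROUP BY dt_month
)
-- step 2: running total of the monthly net changes
SELECT dt_month, SUM(net_change) OVER (ORDER BY dt_month) AS cnt_active_users
FROM monthly
ORDER BY dt_month
"""
active = conn.execute(query).fetchall()
print(active)
```

With the sample data this gives 2, 3 and 5 active users for 2020-01, 2020-02 and 2020-03. Months with no activation or termination at all produce no row in this form.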
You don't say what version of MySQL. If you're using 8.0, this should work:
create table userdates (
UserId int not null,
ActivationDate date not null,
TerminationDate date null
);
insert into userdates (UserId, ActivationDate, TerminationDate)
values
(1, cast("2020-01-01" as date), null )
, (2, cast("2020-01-15" as date), null )
, (3, cast("2020-01-20" as date), cast("2020-01-30" as date))
, (4, cast("2020-02-01" as date), null )
, (5, cast("2020-02-14" as date), cast("2020-02-27" as date))
, (6, cast("2020-02-15" as date), cast("2020-02-28" as date))
, (7, cast("2020-03-02" as date), null )
, (8, cast("2020-03-05" as date), null )
, (9, cast("2020-03-20" as date), cast("2020-03-21" as date))
, (10, cast("2020-07-20" as date), null)
, (11, cast("2019-09-12" as date), cast("2019-09-14" as date));
WITH RECURSIVE d (dt)
AS (
SELECT cast("2019-01-01" as date)
UNION ALL
SELECT date_add(dt, interval 1 month)
FROM d
WHERE dt < cast("2020-12-01" as date)
)
select d.dt
, count(distinct ud.UserId) as UserCount
from userdates ud
right outer join d on d.dt >= date_format(ud.ActivationDate, '%Y-%m-01')
-- count only users not yet terminated at the end of the month
and (ud.TerminationDate is null or ud.TerminationDate > last_day(d.dt))
group by d.dt;

Cumulative sum grouped by year, month and day in a JSON object

Let's say I have a table orders with the following rows:
ID Cost Date (timestamp)
1 100 2020-06-30 21:18:53.328386+00
2 45 2020-06-30 11:18:53.328386+00
3 200 2020-05-29 21:32:56.620174+00
4 20 2020-06-28 21:32:56.620174+00
And I need a query that returns exactly this:
Month Year Costs
5 2020 {"1": 0, "2": 0, ..., "29": 200, "30": 200, "31": 200}
6 2020 {"1": 0, "2": 0, ..., "28": 20, "29": 20, "30": 165}
Please note that the column Costs has to be a json with the key being the day in the month and the value being the cumulative sum of all previous days in that month.
I know this is probably not a task that Postgres should be doing, but I'm just curious to see what the solution to it is (even if it's not the most efficient in production environments).
You can use two levels of aggregation and json_object_agg():
select date_month, date_year, json_object_agg(date_day, cnt) costs
from (
select
extract(month from date) date_month,
extract(year from date) date_year,
extract(day from date) date_day,
sum(sum(cost)) over(
partition by extract(month from date), extract(year from date)
order by extract(day from date)
) cnt
from mytable
group by 1, 2, 3
) t
group by date_month, date_year
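A query of this shape only produces keys for days that actually have orders, so keys such as "1": 0 from the desired output need a separate fill step. Here is a rough sketch of both parts, assuming SQLite via Python's sqlite3 instead of Postgres: the nested sum per day plus the running total happen in SQL, and the missing days are filled in Python. The column name order_date and the helper fill_month are made up for this example, and the timestamps are truncated to whole seconds.

```python
import calendar
import json
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres; order_date is a hypothetical column name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INT, cost INT, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 100, "2020-06-30 21:18:53"),
        (2, 45, "2020-06-30 11:18:53"),
        (3, 200, "2020-05-29 21:32:56"),
        (4, 20, "2020-06-28 21:32:56"),
    ],
)

running_sql = """
WITH daily AS (
    -- first level: total cost per calendar day
    SELECT CAST(strftime('%Y', order_date) AS INTEGER) AS yr,
           CAST(strftime('%m', order_date) AS INTEGER) AS mon,
           CAST(strftime('%d', order_date) AS INTEGER) AS dy,
           SUM(cost) AS day_cost
    FROM orders
    GROUP BY yr, mon, dy
)
-- second level: running total within each month
SELECT yr, mon, dy,
       SUM(day_cost) OVER (PARTITION BY yr, mon ORDER BY dy) AS running
FROM daily
ORDER BY yr, mon, dy
"""

costs_by_month = {}
for yr, mon, dy, running in conn.execute(running_sql):
    costs_by_month.setdefault((yr, mon), {})[dy] = running


def fill_month(yr, mon, by_day):
    """Carry the last running total forward over days without orders."""
    filled, last = {}, 0
    for day in range(1, calendar.monthrange(yr, mon)[1] + 1):
        last = by_day.get(day, last)
        filled[str(day)] = last
    return json.dumps(filled)


result = {key: fill_month(*key, by_day) for key, by_day in costs_by_month.items()}
print(result[(2020, 6)])
```

In Postgres itself the fill would more likely be done by joining against a generated list of days (for example with generate_series) before aggregating; the Python loop here only illustrates the idea.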

SQL query: difference of prices for the same month and year across two tables, showing zero if there is no data

There are two different tables. I need to subtract the prices for the same month and year; if there is no data, just show zero for that particular month and year. Right now it subtracts row by row, irrespective of month and year.
Table1
Price tran_date
60 2018-01-01
40 2018-02-08
50 2018-12-28
40 2019-03-01
80 2019-04-11
Table2
Price post_date
30 2018-01-15
30 2018-02-02
30 2018-11-01
10 2019-01-08
60 2019-04-29
40 2019-10-01
Expected Answer:
Sum(price) Year
30 January 2018
10 February 2018
30 November 2018
50 December 2018
-10 January 2019
40 March 2019
20 April 2019
40 October 2019
Actual Answer:
Sum(Price) Year
30 January 2018
10 February 2018
10 December 2018
30 March 2019
20 April 2019
-40 October 2019
SQL query for table1:
Select sum(price) from table1 where date(tran_date)
between '2018-01-01' and '2019-12-31'
group by month(tran_date),year(tran_date)
SQL query for table2:
Select sum(price) from table2 where date(post_date)
between '2018-01-01' and '2019-12-31'
group by month(post_date),year(post_date)
It should not subtract the 1st row of table2 from the 1st row of table1; it should subtract within the same month and year. If there is no data, just show zero for that particular month and year.
Please help. Thanks in advance.
It seems you want the absolute difference; try adding abs().
Sample:
select date_year, date_month,
abs(sum(price))
from ((select date_year, date_month, price from
(values (60, '2018', '01'),
(40, '2018', '02'),
(50, '2018', '12'),
(40, '2019', '03'),
(80, '2019', '04') ) table1 (price, date_year, date_month)
) union all
(select date_year, date_month, - price from (
values (30, '2018', '01'),
(30, '2018', '02'),
(30, '2018', '11'),
(10, '2019', '01'),
(60, '2019', '04'),
(40, '2019', '10')
) table2 (price, date_year, date_month)
)
) t
group by date_year, date_month
order by date_year, date_month
see the fiddle
https://www.db-fiddle.com/f/qVQYB2KXSTbJNEkSH1oGuG/0
Is this what you want?
select year(dte), month(dte),
greatest( sum(price), 0)
from ((select tran_date as dte, price from table1
) union all
(select post_date, - price from table2
)
) t
group by year(dte), month(dte);
It seems very strange to not subtract the values. I suspect you might just want:
select year(dte), month(dte),
sum(price)
from ((select tran_date as dte, price from table1
) union all
(select post_date, - price from table2
)
) t
group by year(dte), month(dte)
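To check how the UNION ALL with negated prices behaves on the question's data, here is a minimal sketch that runs the plain sum(price) variant. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL, with strftime('%Y-%m', ...) replacing the year()/month() pair.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (price INT, tran_date TEXT)")
conn.execute("CREATE TABLE table2 (price INT, post_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(60, "2018-01-01"), (40, "2018-02-08"), (50, "2018-12-28"),
     (40, "2019-03-01"), (80, "2019-04-11")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(30, "2018-01-15"), (30, "2018-02-02"), (30, "2018-11-01"),
     (10, "2019-01-08"), (60, "2019-04-29"), (40, "2019-10-01")],
)

query = """
SELECT strftime('%Y-%m', dte) AS year_month, SUM(price) AS diff
FROM (
    SELECT tran_date AS dte, price FROM table1
    UNION ALL
    SELECT post_date, -price FROM table2   -- table2 prices count as negative
) AS t
GROUP BY year_month
ORDER BY year_month
"""
diffs = conn.execute(query).fetchall()
for year_month, diff in diffs:
    print(year_month, diff)
```

Months that exist in only one table still get a row, because the grouping happens after the two tables are stacked rather than after a row-by-row pairing. A month that appears only in table2 comes out negative (for example 2018-11 gives -30), which is where abs() or greatest(..., 0) from the other variants would change the result.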

MySQL statement - select max from a group

Hi, I have the following table, and I want to select the max(count(*)) of plugged for each month. sqlfiddle.com/#!2/13036/1
select * from broadcast
profile, plugged, company, tstamp
1, 2, 1, 2013-10-01 08:20:00
1, 3, 1, 2013-10-01 08:20:00
2, 1, 1, 2013-10-01 08:20:00
2, 3, 1, 2013-10-01 08:20:00
3, 1, 1, 2013-10-01 08:20:00
3, 1, 1, 2013-09-01 08:20:00
so if I do something like the following:
select plugged,
count(*),
extract(month from tstamp),
extract(year from tstamp)
from broadcast
where company=1
group by plugged,
extract(month from tstamp),
extract(year from tstamp)
order by count(*) desc;
output:
plugged, count(*), extract(month from tstamp), extract(year from tstamp)
3, 2, 10, 2013
1, 2, 10, 2013
2, 1, 10, 2013
1, 1, 9, 2013
desired output:
plugged, count(*), extract(month from tstamp), extract(year from tstamp)
3, 2, 10, 2013
1, 2, 10, 2013
1, 1, 9, 2013
which is right... but I only want the max(count(*)) (for example first row only in this case). There may be scenarios where there are 2 rows with the max count, but for each MONTH/YEAR I only want to return the max count row(s)... do I need an inner select statement or something?
Try this:
select plugged, max(counts) counts, month, year
from
(select plugged, count(*) as counts, extract(month from tstamp) month, extract(year from tstamp) year
from broadcast
where company=1
group by plugged, month, year
order by counts desc) as x;
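The question asks for the row or rows with the highest count within each month/year, ties included. One way to sketch that is to rank the grouped counts per month and keep rank 1. This is an alternative technique, not the query above, and it assumes an engine with window functions (MySQL 8.0+); it is run here on SQLite via Python's sqlite3, with strftime standing in for extract(). The CTE names counts and ranked are made up for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a window-capable MySQL; data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE broadcast (profile INT, plugged INT, company INT, tstamp TEXT)")
conn.executemany(
    "INSERT INTO broadcast VALUES (?, ?, ?, ?)",
    [
        (1, 2, 1, "2013-10-01 08:20:00"),
        (1, 3, 1, "2013-10-01 08:20:00"),
        (2, 1, 1, "2013-10-01 08:20:00"),
        (2, 3, 1, "2013-10-01 08:20:00"),
        (3, 1, 1, "2013-10-01 08:20:00"),
        (3, 1, 1, "2013-09-01 08:20:00"),
    ],
)

query = """
WITH counts AS (
    -- rows per plugged value and month
    SELECT plugged,
           strftime('%Y', tstamp) AS yr,
           strftime('%m', tstamp) AS mon,
           COUNT(*) AS cnt
    FROM broadcast
    WHERE company = 1
    GROUP BY plugged, yr, mon
),
ranked AS (
    -- RANK() gives every tied top count the value 1
    SELECT plugged, yr, mon, cnt,
           RANK() OVER (PARTITION BY yr, mon ORDER BY cnt DESC) AS rnk
    FROM counts
)
SELECT plugged, cnt, mon, yr
FROM ranked
WHERE rnk = 1
ORDER BY yr DESC, mon DESC, plugged DESC
"""
top_rows = conn.execute(query).fetchall()
for row in top_rows:
    print(row)
```

RANK() is used instead of ROW_NUMBER() so that both plugged values tied at 2 for October 2013 are kept, matching the desired output in the question.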