Get monthly counts on multiple dates - mysql

I have a table that looks like this
id  date registered  date cancelled
1   2021-01-01       2021-03-02
2   2021-01-05       2021-01-21
3   2021-02-04       2021-02-25
4   2021-02-16       2021-03-26
How do I write a MySQL query that gives me the counts of registered and cancelled users for each month?
I can do it for just one of the dates, but I don't know how to combine both dates.
For example, for a single date I would do this:
SELECT date_format(`users`.`dateregistered`,_utf8'%Y-%m') AS `DateRegistered`, count(0) AS `Registration Count`
FROM `users`
GROUP BY date_format(`users`.`dateregistered`,_utf8'%Y-%m')
But I want something like this:
Date     Registered Count  Cancelled Count
2021-01  2                 1
2021-02  2                 1
2021-03  0                 2
Please let me know if you have any ideas.

You can join the distinct months appearing in date registered and date cancelled to the table and use conditional aggregation:
SELECT t.Date,
SUM(t.Date = date_format(dateregistered, '%Y-%m')) `Registered Count`,
SUM(t.Date = date_format(datecancelled, '%Y-%m')) `Cancelled Count`
FROM (
SELECT date_format(dateregistered, '%Y-%m') Date FROM users
UNION
SELECT date_format(datecancelled, '%Y-%m') FROM users
) t INNER JOIN users u
ON t.Date IN (date_format(dateregistered, '%Y-%m'), date_format(datecancelled, '%Y-%m'))
GROUP BY t.Date
See the demo.
Results:
Date     Registered Count  Cancelled Count
2021-01  2                 1
2021-02  2                 1
2021-03  0                 2
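For comparison, the same report can also be produced by stacking the two date columns into one list of events and counting each kind per month. The sketch below is only an illustration under assumptions: it runs in Python against an in-memory SQLite database rather than MySQL, so strftime stands in for date_format, and the events subquery and its kind column are names made up for this example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; strftime replaces date_format.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER, dateregistered TEXT, datecancelled TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", "2021-03-02"),
        (2, "2021-01-05", "2021-01-21"),
        (3, "2021-02-04", "2021-02-25"),
        (4, "2021-02-16", "2021-03-26"),
    ],
)

# One row per event (hypothetical "events" subquery), then count each kind per month.
query = """
SELECT month,
       SUM(kind = 'registered') AS registered_count,
       SUM(kind = 'cancelled')  AS cancelled_count
FROM (
    SELECT strftime('%Y-%m', dateregistered) AS month, 'registered' AS kind FROM users
    UNION ALL
    SELECT strftime('%Y-%m', datecancelled), 'cancelled' FROM users
) AS events
GROUP BY month
ORDER BY month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

UNION ALL matters here because every event must be kept. Users who never cancelled would have a NULL cancellation date and would need filtering out of the second branch; the sample data has none.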


Creating an overdraft statement

I'm stuck on how to build a statement that shows the daily overdraft for a particular council.
I have the following tables: councils, users, markets, market_transactions, user_deposits.
Market transactions run daily and reduce the user's account balance. When the account balance falls below 0, the user is in overdraft (negative). When a user makes a deposit, their account balance increases.
The tables below show how transactions and deposits are stored.
If I reverse today's transactions I can get the account balance a user had yesterday, but I can't work out how to formulate a query that gets the daily OD amount.
USERS
user_id  name   account_bal
1        Wells  -5
2        James  100
3        Joy    10
4        Mumbi  -300
DEPOSITS
id  user_id  amount  date
1   1        5       2021-04-26
2   3        10      2021-04-26
3   3        5       2021-04-25
4   4        5       2021-04-25
TRANSACTIONS
id  user_id  amount_tendered  date
1   1        5                2021-04-27
2   2        10               2021-04-26
3   3        15               2021-04-26
4   4        50               2021-04-25
The relationships are as follows:
COUNCILS
council_id  name
1           a
2           b
3           c
MARKETS
market_id  name  council_id
1          x     3
2          y     1
3          z     2
MARKET_USER_LINK
id  market_id  user_id
1   1          3
2   2          2
3   3          1
I'm running this SQL query to get the total amount each user has spent, which I then subtract from the user's current account balance.
I don't know if I can use this to work out the account balance for each day.
SELECT u.user_id, total_spent, total_deposits,m.council_id
FROM users u
JOIN market_user_link ul ON ul.user_id= u.user_id
LEFT JOIN markets m ON ul.market_id =m.market_id
LEFT JOIN councils c ON m.council_id =c.council_id
LEFT JOIN (
SELECT user_id, SUM(amount_tendered) AS total_spent
FROM transactions
WHERE DATE(date) BETWEEN DATE('2021-02-01') AND DATE(NOW())
GROUP BY user_id
) t ON t.user_id= u.user_id
ORDER BY user_id, total_spent ASC
// looks like this when run
| user_id | total_spent | council_id |
|-------------|----------------|------------|
| 1 | 50.00 | 1 |
| 2 | 2.00 | 3 |
I was hoping to reverse the transactions and deposits to get the account balance for a given day, and then sum the balances of users whose balance is below 0, but I could not get this to work.
The goal is a query that shows the daily overdraft for a particular council (summing only the account balances of users whose balance is below 0).
Expected Result
date        council_id  o_d_amount
2021-04-24  1           -300.00
2021-04-24  2           -60.00
2021-04-24  3           -900.00
2021-04-25  1           -600.00
2021-04-25  2           -100.00
2021-04-25  3           -1200.00
This is actually not that hard, but the way you asked makes it hard to follow.
Also, your expected result should match the data you provided.
Edited: the previous solution was wrong. It counted withdrawals and deposits more than once when a user had more than one event on the same date.
Start by computing the total exchanged on each day, like this:
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d
group by user_id, date
order by user_id, date;
What follows gets the state of the account only on days that had any deposits or withdrawals.
To get results for all days (and not just those with account movement), change the cross join part to use a table with all the dates you want (see, for example, "Get all dates between two dates in SQL Server"), but I digress...
select dates.date, c.council_id, u.name username
, u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
, u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
from councils c
inner join markets m on c.council_id=m.council_id
inner join market_user_link mul on m.market_id=mul.market_id
inner join users u on mul.user_id=u.user_id
left join (
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d group by user_id, date
) e on u.user_id=e.user_id -- exchange on each day
cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates -- all days that had a transaction
group by dates.date, c.council_id, u.name, u.account_bal
order by dates.date desc, c.council_id, u.name;
From there you can rearrange to get the result you want.
select date, council_id
, sum(case when amount_on_start_of_day<0 then amount_on_start_of_day else 0 end) o_d_amount_start
, sum(case when amount_on_end_of_day<0 then amount_on_end_of_day else 0 end) o_d_amount_end
from (
select dates.date, c.council_id, u.name username
, u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
, u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
from councils c
inner join markets m on c.council_id=m.council_id
inner join market_user_link mul on m.market_id=mul.market_id
inner join users u on mul.user_id=u.user_id
left join (
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d group by user_id, date
) e on u.user_id=e.user_id -- exchange on each day
cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates -- all days that had a transaction
group by dates.date, c.council_id, u.name, u.account_bal
) result
group by date, council_id
order by date;
You can check it on https://www.db-fiddle.com/f/msScT6B5F7FjU2aQXVr2da/6
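The reversal idea behind this query (current balance minus every movement that happened after a given day) can be sketched outside SQL. This is a minimal Python illustration using the question's sample rows, not the answer's actual code; the function names and the user_council mapping are made up for the example, and user 4 is left out of the mapping because the question's link table has no row for that user.

```python
from collections import defaultdict

# Sample data from the question. ISO date strings compare correctly as text.
current_balance = {1: -5, 2: 100, 3: 10, 4: -300}
deposits = [(1, "2021-04-26", 5), (3, "2021-04-26", 10),
            (3, "2021-04-25", 5), (4, "2021-04-25", 5)]
transactions = [(1, "2021-04-27", 5), (2, "2021-04-26", 10),
                (3, "2021-04-26", 15), (4, "2021-04-25", 50)]
# Hypothetical helper mapping, derived from market_user_link joined to markets.
user_council = {3: 3, 2: 1, 1: 2}

# Net movement per (user, date): deposits add, transactions subtract.
net = defaultdict(int)
for user, day, amount in deposits:
    net[(user, day)] += amount
for user, day, amount in transactions:
    net[(user, day)] -= amount


def balance_at_end_of(user, day):
    """Undo every movement dated after `day`, starting from the current balance."""
    later = sum(v for (u, d), v in net.items() if u == user and d > day)
    return current_balance[user] - later


def overdraft_by_council(day):
    """Sum only the negative end-of-day balances per council."""
    totals = defaultdict(int)
    for user, council in user_council.items():
        totals[council] += min(balance_at_end_of(user, day), 0)
    return dict(totals)


print(balance_at_end_of(3, "2021-04-25"))
print(overdraft_by_council("2021-04-25"))
```

Using d >= day instead of d > day in balance_at_end_of would give the balance at the start of the day, which mirrors the two CASE expressions in the query.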
Basically, the query maps users to councils, calculates periods of overdraft for each user, then aggregates over councils. I assume that the starting balance is dated at the start of the month, '2021-04-01' (it could be an ending balance as well, see below); change it as needed. I also assume that a negative starting balance counts as an overdraft. For simplicity and easier debugging, the query is divided into a number of steps.
with uc as (
select distinct m.council_id, mul.user_id
from markets m
join market_user_link mul on m.market_id = mul.market_id
),
user_running_total as (
select user_id, date,
coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
sum(sum(s)) over(partition by user_id order by date) rt
from (
select user_id, date, -amount_tendered s
from transactions
union all
select user_id, date, amount
from deposits
union all
select user_id, se.d, se.s
from users
cross join lateral (
select date(NOW() + interval 1 day) d, 0 s
union all
select '2021-04-01' d, account_bal
) se
) t
group by user_id, date
),
user_overdraft as (
select user_id, date, nxt, least(rt, 0) ovd
from user_running_total
where date <= date(NOW())
),
dates as (
select date
from user_overdraft
union
select nxt
from user_overdraft
),
council__overdraft as (
select uc.council_id, d.date, sum(uo.ovd) total_overdraft, lag(sum(uo.ovd), 1, sum(uo.ovd) - 1) over(partition by uc.council_id order by d.date) prev_ovd
from uc
cross join dates d
join user_overdraft uo on uc.user_id = uo.user_id and d.date between uo.date and uo.nxt
group by uc.council_id, d.date
)
select council_id, date, total_overdraft
from council__overdraft
where total_overdraft <> prev_ovd
order by date, council_id
Really, council__overdraft is quite usable on its own; the last step just compacts the output by excluding intermediate dates on which the overdraft does not change.
With the following sample data:
users
user_id name account_bal
1 Wells -5
2 James 100
3 Joy 10
4 Mumbi -300
deposits, ordered by date, extra row added for the last date
id user_id amount date
3 3 5 2021-04-25
4 4 5 2021-04-25
1 1 5 2021-04-26
2 3 10 2021-04-26
5 3 73 2021-05-06
transactions, ordered by date (note the added row, to illustrate the running total in action)
id user_id amount_tendered date
5 4 50 2021-04-25
2 2 10 2021-04-26
3 3 15 2021-04-26
1 1 5 2021-04-27
4 3 17 2021-04-27
councils
council_id name
1 a
2 b
3 c
markets
market_id name council_id
1 x 3
2 y 1
3 z 2
market_user_link
id market_id user_id
1 1 3
2 2 2
3 3 1
4 3 4
the query output is
council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -305
3 2021-04-01 0
2 2021-04-25 -350
2 2021-04-26 -345
2 2021-04-27 -350
3 2021-04-27 -7
3 2021-05-06 0
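To see where the council 3 rows come from, here is a rough Python sketch of the forward running total for user 3 (the only user linked to council 3 in the sample data). It is an illustration of the idea, not a translation of the CTEs; the variable names are invented for the example, and the per-day net amounts are worked out by hand from the sample deposits and transactions.

```python
from itertools import accumulate

# (date, net movement) for user 3. The first entry is the starting balance,
# assumed to be dated 2021-04-01 as in the answer.
movements = [
    ("2021-04-01", 10),   # starting balance
    ("2021-04-25", 5),    # deposit 5
    ("2021-04-26", -5),   # deposit 10, transaction 15
    ("2021-04-27", -17),  # transaction 17
    ("2021-05-06", 73),   # deposit 73
]

# Running total, like sum(sum(s)) over (partition by user_id order by date).
running = list(accumulate(amount for _, amount in movements))

# least(rt, 0): only negative balances count as overdraft.
overdraft = [min(total, 0) for total in running]

# Keep a date only when the overdraft differs from the previous one,
# like the final "where total_overdraft <> prev_ovd" filter.
changes = [
    (day, ovd)
    for i, ((day, _), ovd) in enumerate(zip(movements, overdraft))
    if i == 0 or ovd != overdraft[i - 1]
]
print(running)
print(changes)
```

The first date is always kept, which is what the lag default of sum(uo.ovd) - 1 achieves in the query.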
Alternatively, provided the users table holds a closing (NOW()) balance, replace the user_running_total CTE with the following code:
user_running_total as (
select user_id, date,
coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
coalesce(sum(sum(s)) over(partition by user_id order by date desc
rows between unbounded preceding and 1 preceding), sum(s)) rt
from (
select user_id, date, amount_tendered s
from transactions
union all
select user_id, date, -amount
from deposits
union all
select user_id, se.d, se.s
from users
cross join lateral (
select date(NOW() + interval 1 day) d, account_bal s
union all
select '2021-04-01' d, 0
) se
) t
where DATE(date) between date '2021-04-01' and date(NOW() + interval 1 day)
group by user_id, date
),
This way the query starts with the closing balance, dated the day after now, and rolls out a running total in reverse order back to '2021-04-01' as the starting date.
Output
council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -260
3 2021-04-01 -46
2 2021-04-25 -305
3 2021-04-25 -41
2 2021-04-26 -300
3 2021-04-26 -46
2 2021-04-27 -305
3 2021-04-27 -63
3 2021-05-06 0
db-fiddle both versions
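The reverse variant can be sketched the same way for user 3: start from the closing balance and undo each day's movements while walking backwards. This is only an illustrative Python sketch with invented names; signs are flipped relative to the forward version (transactions are added back, deposits are subtracted), matching the replacement CTE.

```python
# Closing balance of user 3, taken from the users table.
closing_balance = 10

# (date, amount needed to undo that day's movements), newest first.
reversed_moves = [
    ("2021-05-06", -73),  # undo deposit 73
    ("2021-04-27", 17),   # undo transaction 17
    ("2021-04-26", 5),    # undo transaction 15 and deposit 10
    ("2021-04-25", -5),   # undo deposit 5
    ("2021-04-01", 0),    # start date, no movement
]

balance = closing_balance
end_of_day = {}
for day, undo in reversed_moves:
    end_of_day[day] = balance  # balance after this day's movements
    balance += undo            # step back to the previous day's close

overdraft = {day: min(value, 0) for day, value in end_of_day.items()}
print(overdraft)
```

Each date receives the sum of everything that comes after it, which is the role of the frame "rows between unbounded preceding and 1 preceding" over the descending date order. The values agree with the council 3 rows of the second output.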

Get occupancy per every 15-minute slot

We have a room where we can only have XX number of people inside due to current limitations. They come at different times and stay for a different length of time.
I'm trying to get a sum of people currently inside for each 15-min period for a specific date. The server is MySQL 8.0.21 deployed on AWS RDS.
MySQL 8.0 Table: Booking
ID  Name  PartySize  Date        BookedFrom           BookedTo
1   John  2          2021-01-01  2021-01-01 08:30:00  2021-01-01 10:00:00
2   Mary  4          2021-01-01  2021-01-01 09:00:00  2021-01-01 11:00:00
3   Nick  3          2021-01-01  2021-01-01 10:30:00  2021-01-01 12:30:00
I also have a "helper table" with one row for each 15-minute slot of the 24-hour day.
MySQL Table: Timeslot
ID  Time
1   00:00:00
2   00:15:00
3   00:30:00
35  08:30:00
37  09:00:00
38  09:15:00
For example, when I run the query below, I get the correct count (6 people) for 09:15. What is the most efficient way to get this result for each 15-minute slot? Please note that while the BookedTo (datetime field) value may be past midnight, I will always be making date-specific queries only.
SELECT
t.id, b.date, t.time, SUM(b.partysize) AS total
FROM
booking b,
timeslot t
WHERE
b.date = '2021-01-01'
AND t.time = '09:15:00'
AND b.bookedfrom <= '2021-01-01 09:15:00'
AND b.bookedto >= '2021-01-01 09:15:00'
Looking for this output for all times (including zeros):
Slot_ID  Date        Time      Total
33       2021-01-01  08:00:00  0
34       2021-01-01  08:15:00  0
35       2021-01-01  08:30:00  2
36       2021-01-01  08:45:00  2
37       2021-01-01  09:00:00  6
38       2021-01-01  09:15:00  6
SELECT
t.id as slot_id,
coalesce(b.date, '2021-01-01') as date,
t.time,
coalesce(sum(b.partysize),0) as total
FROM
timeslot t
LEFT JOIN booking b
ON t.time >= TIME(b.bookedfrom) AND t.time < TIME(b.bookedto) AND b.date = '2021-01-01'
WHERE
t.time BETWEEN '08:00:00' AND '17:00:00'
GROUP BY
t.id,
b.date,
t.time
Now, some of your other requirements are confusing, but basically this works because multiple rows of timeslot match a single row of booking through the time range in the join condition.
The confusing part: you say it's only for 8am to 5pm, but also that bookings might extend to the next day. Does that mean a booking can start at 4pm and finish at 9am the next day? In that case you might need to change AND b.date = '2021-01-01' to something like AND (DATE(b.bookedfrom) = '2021-01-01' OR DATE(b.bookedto) = '2021-01-01').
Use a CTE that returns the specific date for which you want the results, which may not be the same as the column Date in Booking, and CROSS join it to Timeslot.
Then LEFT join the result to Booking and aggregate:
WITH cte(Date) AS (SELECT '2021-01-01')
SELECT t.ID, t.time, c.Date,
COALESCE(SUM(b.PartySize), 0) Total
FROM cte c CROSS JOIN Timeslot t
LEFT JOIN Booking b
ON b.BookedFrom <= CONCAT(c.Date, ' ', t.time)
AND b.BookedTo >= CONCAT(c.Date, ' ', ADDTIME(t.time, '00:15:00'))
WHERE t.time BETWEEN '08:00:00' AND '17:00:00'
GROUP BY t.ID, c.Date, t.time
Since BookedFrom and BookedTo may not contain the same date, it is not safe to compare only the time parts of the 2 columns to the column time of Timeslot.
This is why all these conditions in the ON clause are needed.
See the demo.
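The overlap test in the ON clause (a booking counts for a slot when it starts at or before the slot start and ends at or after the slot end) can be checked with a small Python sketch on the question's sample bookings. The occupancy function and its arguments are hypothetical names for this illustration only.

```python
from datetime import datetime, timedelta

# (party size, booked from, booked to), from the question's Booking table.
bookings = [
    (2, datetime(2021, 1, 1, 8, 30), datetime(2021, 1, 1, 10, 0)),
    (4, datetime(2021, 1, 1, 9, 0), datetime(2021, 1, 1, 11, 0)),
    (3, datetime(2021, 1, 1, 10, 30), datetime(2021, 1, 1, 12, 30)),
]
SLOT = timedelta(minutes=15)


def occupancy(first_slot, last_slot):
    """Total party size per 15-minute slot, zeros included."""
    totals = {}
    start = first_slot
    while start <= last_slot:
        totals[start.strftime("%H:%M")] = sum(
            size
            for size, booked_from, booked_to in bookings
            # Same condition as the ON clause: the booking covers the whole slot.
            if booked_from <= start and booked_to >= start + SLOT
        )
        start += SLOT
    return totals


totals = occupancy(datetime(2021, 1, 1, 8, 0), datetime(2021, 1, 1, 9, 15))
print(totals)
```

Because full datetimes are compared, a booking that runs past midnight is still handled correctly, which is the reason given above for not comparing only the time parts.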
This query works well. If you want all dates for all slots, you also need a date table (ideally cross joined with timeslot, so that you get every combination of dates and timeslots).
Use an inner join if you only want matching dates and timeslots.
SELECT t.id as slot_id
, b.date
, t.time as slot
, sum(ifnull(party_size,0)) as total
FROM test.timeslot t
LEFT JOIN test.booking b
ON t.time BETWEEN time(b.booked_from) AND time(b.booked_to)
GROUP BY t.id
, b.date
, t.time;
for all timeslots and selected dates:
https://www.db-fiddle.com/f/gLt2Fs8HTDUakMahZHxcTi/0
for matching timeslots and dates:
SELECT t.id as slot_id
, b.date
, t.time as slot
, sum(ifnull(party_size,0)) as total
FROM test.timeslot t
JOIN test.booking b
ON t.time BETWEEN time(b.booked_from) AND time(b.booked_to)
GROUP BY t.id
, b.date
, t.time;

How to find whether the same customers who ordered this month also ordered the next month?

I have an orders table
Order_id User_id Order_date
1 32 2020-07-19
2 24 2020-07-21
3 27 2020-07-27
4 24 2020-08-14
5 32 2020-08-18
6 32 2020-08-19
7 58 2020-08-20
Now I want to find how many of the users who ordered in the first month also ordered in the next month. In this case, user_ids 32, 24 and 27 ordered in the 7th month, but only 24 and 32 ordered in the next month.
I want the result to be like :
Date Retained_Users Total_users
2020-07 Null 3
2020-08 2 3
I'm lost here. Can someone please help me with this?
In MySQL 8.0, you can do this with window functions:
select
order_month,
count(distinct case when cnt_orders_last_month > 0 then user_id end) retained_users,
count(distinct user_id) total_users
from (
select
user_id,
date_format(order_date, '%Y-%m-01') as order_month,
count(*) over(
partition by user_id
order by date(date_format(order_date, '%Y-%m-01'))
range between interval 1 month preceding and interval 1 day preceding
) cnt_orders_last_month
from mytable
) t
group by order_month
The logic lies in the range specification of the window function: it orders records by month and counts how many orders the customer placed in the previous month. Then all that is left to do is aggregate and count distinct users.
Demo on DB Fiddle
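As a cross-check of what the query computes, the same retention figures can be derived with plain sets in Python. This is a sketch on the question's sample rows, with invented helper names; like the SQL above, it reports 0 rather than NULL for a month with no preceding data.

```python
from collections import defaultdict

# (order_id, user_id, order_date) from the question.
orders = [
    (1, 32, "2020-07-19"), (2, 24, "2020-07-21"), (3, 27, "2020-07-27"),
    (4, 24, "2020-08-14"), (5, 32, "2020-08-18"), (6, 32, "2020-08-19"),
    (7, 58, "2020-08-20"),
]

# Distinct users per month, keyed by "YYYY-MM".
users_by_month = defaultdict(set)
for _, user_id, order_date in orders:
    users_by_month[order_date[:7]].add(user_id)


def previous_month(year_month):
    """Return the "YYYY-MM" key of the month before `year_month`."""
    year, month = map(int, year_month.split("-"))
    return f"{year - 1}-12" if month == 1 else f"{year}-{month - 1:02d}"


report = []
for month in sorted(users_by_month):
    current = users_by_month[month]
    earlier = users_by_month.get(previous_month(month), set())
    report.append((month, len(current & earlier), len(current)))

print(report)
```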

MySQL: Count two date columns and group by day

I need to draw a line chart that will visualize both the orders and pickups for each day between certain dates. The order and pickup dates are stored in unixtime. My table looks something like this:
id order_date pickup_date
-------------------------------
1 1472749664 1472133376
2 1472551372 1472567548
3 1472652545 1472901368
4 1473154659 1473512323
5 1473512923 1475229824
6 1475586643 1475652635
What I am after is something like this
date orders pickups
-------------------------------
01-09-2016 1 0
02-09-2016 4 1
03-09-2016 3 2
04-09-2016 7 1
05-09-2016 0 0
06-09-2016 1 1
07-09-2016 6 3
08-09-2016 0 0
09-09-2016 3 5
10-09-2016 2 4
I know I can count based on one column, for example:
SELECT
COUNT(id) AS orders,
FROM_UNIXTIME(order_date, '%d-%m-%Y') AS date
FROM orders
GROUP BY date
But I am not sure how to count two columns and group them for each day.
You could use a query like this:
SELECT sum(orders) as orders, sum(pickups) as pickups, date
FROM (
SELECT
COUNT(id) AS orders, 0 as pickups,
FROM_UNIXTIME(`order_date`, '%d-%m-%Y') AS date
FROM orders
GROUP BY order_date
UNION ALL -- a plain UNION would drop identical (orders, pickups, date) rows and undercount
SELECT
0 AS orders, COUNT(id) as pickups,
FROM_UNIXTIME(`pickup_date`, '%d-%m-%Y') AS date
FROM orders
GROUP BY pickup_date ) ut
GROUP BY date
Here is a fiddle.
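The shape of the result (one row per day, with one count per date column) can be illustrated in Python with two counters. This is a sketch on the first three sample rows only; it converts the Unix timestamps in UTC, which is an assumption, since FROM_UNIXTIME in MySQL uses the session time zone.

```python
from collections import Counter
from datetime import datetime, timezone

# (id, order_date, pickup_date) as Unix timestamps, first rows of the question.
rows = [
    (1, 1472749664, 1472133376),
    (2, 1472551372, 1472567548),
    (3, 1472652545, 1472901368),
]


def day(ts):
    """Format a Unix timestamp as dd-mm-YYYY (assumption: UTC)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%d-%m-%Y")


orders = Counter(day(order_ts) for _, order_ts, _ in rows)
pickups = Counter(day(pickup_ts) for _, _, pickup_ts in rows)

# One entry per day that appears in either column, in chronological order.
all_days = sorted(
    set(orders) | set(pickups),
    key=lambda d: datetime.strptime(d, "%d-%m-%Y"),
)
report = {d: (orders[d], pickups[d]) for d in all_days}
for d, (n_orders, n_pickups) in report.items():
    print(d, n_orders, n_pickups)
```

Days with no orders and no pickups do not appear in either approach; filling those gaps needs a separate list of calendar dates.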

MySQL Query for count of same column separately based on condition

We have an order table with fields as below e.g.
Timestamp PaymentID OrderID
341231231 6 1
342131231 12 2
123123123 18 3
123123122 14 4
123123143 12 5
433453454 6 6
445456456 18 7
What we want is a month-wise report of the order count for each payment type, with payments grouped together. For example, PaymentIDs 6 and 18 come under type C, so their counts should be added together,
and all other PaymentIDs come under type P.
The output we want looks like this:
Year Month C_Orders P_Orders
2015 01 0 4
2015 02 4 3
2015 03 1 0
2015 04 2 1
We tried two queries, but both give incorrect output:
select SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(co.timestamp),'+00:00','+5:30'),1,4) as year,SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(co.timestamp),'+00:00','+5:30'),6,2) as month, co.payment_id, count(co.payment_id) as c_orders,co1.payment_id, count(co1.payment_id) as p_orders from
orders as co, orders as co1
WHERE co.payment_id in (6,18)
AND co1.payment_id not in (6,18)
GROUP BY year,month
and
select SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(co.timestamp),'+00:00','+5:30'),1,4) as year,SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(co.timestamp),'+00:00','+5:30'),6,2) as month, 'COD', count(co.payment_id) as cod_orders
from
orders as co
WHERE co.timestamp >= UNIX_TIMESTAMP(CONVERT_TZ('2014-01-01 00:00:00','+00:00','+5:30')) AND co.timestamp <= UNIX_TIMESTAMP(CONVERT_TZ('2020-12-31 23:59:59','+00:00','+5:30')) AND co.is_parent_order = 'N' AND co.status IN ('C','G','E','P') AND co.payment_id in (6,18)
GROUP BY year,month
union
select SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(co.timestamp),'+00:00','+5:30'),1,4) as year,SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(co.timestamp),'+00:00','+5:30'),6,2) as month, 'PREPAID', count(co.payment_id) as prepaid_orders
from
orders as co
WHERE co.timestamp >= UNIX_TIMESTAMP(CONVERT_TZ('2014-01-01 00:00:00','+00:00','+5:30')) AND co.timestamp <= UNIX_TIMESTAMP(CONVERT_TZ('2020-12-31 23:59:59','+00:00','+5:30')) AND co.is_parent_order = 'N' AND co.status IN ('C','G','E','P') AND co.payment_id not in (6,18)
GROUP BY year,month
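Use a CASE expression to only sum/count the values when a condition is met. Here year and month are derived in a subquery with the expressions from the question:
select year, month,
sum(case when payment_id in (6,18) then 1 else 0 end) as c_orders,
count(case when payment_id not in (6,18) then payment_id end) as p_orders
from (
select SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(timestamp),'+00:00','+5:30'),1,4) as year,
SUBSTRING(CONVERT_TZ(FROM_UNIXTIME(timestamp),'+00:00','+5:30'),6,2) as month,
payment_id
from orders
) o
group by year, month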
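To try the conditional aggregation on the question's sample rows, here is a small runnable sketch. It makes several assumptions: it uses Python with an in-memory SQLite database instead of MySQL, strftime with 'unixepoch' replaces FROM_UNIXTIME, the +5:30 time zone shift is left out (months are taken in UTC), and the column names are lowercased versions of the question's headers.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; months are computed in UTC.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (timestamp INTEGER, payment_id INTEGER, order_id INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (341231231, 6, 1), (342131231, 12, 2), (123123123, 18, 3),
        (123123122, 14, 4), (123123143, 12, 5), (433453454, 6, 6),
        (445456456, 18, 7),
    ],
)

# Each CASE contributes 1 only for rows of its payment type, so one pass
# over the table yields both counts per month.
query = """
SELECT strftime('%Y', timestamp, 'unixepoch') AS year,
       strftime('%m', timestamp, 'unixepoch') AS month,
       SUM(CASE WHEN payment_id IN (6, 18) THEN 1 ELSE 0 END)     AS c_orders,
       SUM(CASE WHEN payment_id NOT IN (6, 18) THEN 1 ELSE 0 END) AS p_orders
FROM orders
GROUP BY year, month
ORDER BY year, month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Months in which one type has no orders still appear, with a 0 in that column, as long as the other type has at least one order in that month.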