Create date_from and date_to columns in mysql - mysql

I've got a table named 'T1' which I want to transpose so that it has date_from and date_to columns. The table records who manages a particular company, so I want to know from when until when a user was responsible for a company. I can do it easily in BigQuery with the following query, but I'm struggling to do the same in MySQL.
WITH T1 AS ( SELECT 9 as rating, 'company1' as cid, 100 as user, '2017-08-20' AS created UNION ALL
SELECT 9 as rating, 'company1' as cid, 101 as user, '2017-08-22' AS created UNION ALL
SELECT 10 as rating, 'company1' as cid, 101 as user, '2017-08-21' AS created
)
SELECT cid, rating, user, CAST(created as DATE) as date_from,
CAST(COALESCE(MIN(CAST(created as DATE)) OVER(PARTITION BY cid, rating ORDER BY CAST(created as DATE) DESC ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),
DATE_ADD(current_date(), INTERVAL 1 DAY)) as DATE) AS date_to
FROM T1
The original table format:
rating cid user created
9 company1 100 2017-08-20
9 company1 101 2017-08-22
10 company1 101 2017-08-21
The final table should have the following format:
cid rating user date_from date_to
1 company1 9 101 2017-08-22 2018-02-24
2 company1 9 100 2017-08-20 2017-08-22
3 company1 10 101 2017-08-21 2018-02-24
Thank you!

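You really need lead(), which is not available in MySQL before version 8.0 (and which would make the BigQuery query simpler). One method uses a correlated subquery, matching on both cid and rating and falling back to tomorrow's date when there is no later row:
select t1.*, t1.created as date_from,
coalesce((select min(tt1.created)
from t1 tt1
where tt1.cid = t1.cid and tt1.rating = t1.rating and tt1.created > t1.created
), curdate() + interval 1 day) as date_to
from t1;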
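For comparison, here is a minimal sketch of the lead() version. It runs the SQL through Python's built-in sqlite3 module (SQLite 3.25+ is assumed for window functions) as a stand-in for MySQL 8.0, and `OPEN_END` is a hypothetical placeholder for "tomorrow", which in MySQL would be `CURDATE() + INTERVAL 1 DAY`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (rating INTEGER, cid TEXT, user INTEGER, created TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?, ?)",
    [
        (9, "company1", 100, "2017-08-20"),
        (9, "company1", 101, "2017-08-22"),
        (10, "company1", 101, "2017-08-21"),
    ],
)

# Assumption: a fixed date stands in for "tomorrow" so the output is reproducible.
OPEN_END = "2018-02-24"

# LEAD() looks at the next row within the same (cid, rating) group;
# its third argument is the default used when there is no next row.
rows = conn.execute(
    """
    SELECT cid, rating, user, created AS date_from,
           LEAD(created, 1, ?) OVER (PARTITION BY cid, rating
                                     ORDER BY created) AS date_to
    FROM t1
    ORDER BY rating, date_from DESC
    """,
    (OPEN_END,),
).fetchall()

for row in rows:
    print(row)
```

With the sample rows from the question this reproduces the three rows of the requested final table.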

Related

Get all transaction details of a user for their 2nd month of transaction

Trying to get the 2nd transaction month details for all the customers
Date User_id amount
2021-11-01 1 100
2021-11-21 1 200
2021-12-20 2 110
2022-01-20 2 200
2022-02-04 1 50
2022-02-21 1 100
2022-03-22 2 200
For every customer, get all the records from their 2nd month with transactions (a user can have multiple transactions in a month, or even in a day).
Expected Output
Date User_id amount
2022-02-04 1 50
2022-02-21 1 100
2022-01-20 2 200
You can use dense_rank:
select Date, User_id, amount from
(select *, dense_rank() over(partition by User_id order by year(Date), month(date)) r
from table_name) t
where r = 2;
If dense_rank is an option you can:
with cte1 as (
select *, extract(year_month from date) as yyyymm
from t
), cte2 as (
select *, dense_rank() over (partition by user_id order by yyyymm) as dr
from cte1
)
select *
from cte2
where dr = 2
Note that it is possible to write the above using one cte.
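A quick way to check the dense_rank idea against the sample data is the sketch below. It uses Python's sqlite3 as a stand-in for MySQL 8.0, so `strftime('%Y-%m', date)` replaces `extract(year_month from date)`, and the table name `tx` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (date TEXT, user_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?)",
    [
        ("2021-11-01", 1, 100), ("2021-11-21", 1, 200), ("2021-12-20", 2, 110),
        ("2022-01-20", 2, 200), ("2022-02-04", 1, 50), ("2022-02-21", 1, 100),
        ("2022-03-22", 2, 200),
    ],
)

# Rank each user's rows by calendar month; all rows in the same month share
# a rank, so dr = 2 selects every row of the user's second active month.
second_month_rows = conn.execute(
    """
    SELECT date, user_id, amount
    FROM (
        SELECT *, DENSE_RANK() OVER (PARTITION BY user_id
                                     ORDER BY strftime('%Y-%m', date)) AS dr
        FROM tx
    ) AS ranked
    WHERE dr = 2
    ORDER BY user_id, date
    """
).fetchall()

for row in second_month_rows:
    print(row)
```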

Creating an overdraft statement

I'm currently stuck on how to create a query that shows daily overdraft totals for a particular council.
I have the following tables: councils, users, markets, market_transactions, user_deposits.
market_transactions run daily, reducing the user's account balance. When the account balance drops below 0 the user goes into overdraft (negative). When users make a deposit, their account balance increases.
I have put the following tables to show how transactions and deposits are stored.
If I reverse today's transactions I'm able to get the account balance a user had yesterday, but formulating a query to get the daily OD amount is where the problem is.
USERS
user_id name account_bal
1 Wells -5
2 James 100
3 Joy 10
4 Mumbi -300
DEPOSITS
id user_id amount date
1 1 5 2021-04-26
2 3 10 2021-04-26
3 3 5 2021-04-25
4 4 5 2021-04-25
TRANSACTIONS
id user_id amount_tendered date
1 1 5 2021-04-27
2 2 10 2021-04-26
3 3 15 2021-04-26
4 4 50 2021-04-25
The Relationships are as follows,
COUNCILS
council_id name
1 a
2 b
3 c
MARKETS
market_id name council_id
1 x 3
2 y 1
3 z 2
MARKET_USER_LINK
id market_id user_id
1 1 3
2 2 2
3 3 1
I'm running this SQL query to get the total amount users have spent, so that I can subtract it from the current account balance.
I don't know if I can use this to figure out the account_balance for each day.
SELECT u.user_id, total_spent, total_deposits,m.council_id
FROM users u
JOIN market_user_link ul ON ul.user_id= u.user_id
LEFT JOIN markets m ON ul.market_id =m.market_id
LEFT JOIN councils c ON m.council_id =c.council_id
LEFT JOIN (
SELECT user_id, SUM(amount_tendered) AS total_spent
FROM transactions
WHERE DATE(date) BETWEEN DATE('2021-02-01') AND DATE(NOW())
GROUP BY user_id
) t ON t.user_id= u.user_id
ORDER BY user_id, total_spent ASC
// looks like this when run
| user_id | total_spent | council_id |
|-------------|----------------|------------|
| 1 | 50.00 | 1 |
| 2 | 2.00 | 3 |
I was hoping to reverse transactions and deposits done to get the account balance for a day then get the sum of users with an account balance < 0... But this has just failed to work.
The goal is to produce a query that shows daily overdraft (Only SUM the total account balance of users with account balance below 0 ) for a particular council.
Expected Result
date council_id o_d_amount
2021-04-24 1 -300.00
2021-04-24 2 -60.00
2021-04-24 3 -900.00
2021-04-25 1 -600.00
2021-04-25 2 -100.00
2021-04-25 3 -1200.00
This is actually not that hard, but the way you asked makes it hard to follow.
Also, your expected result should match the data you provided.
Edited: The previous solution was wrong. It counted withdrawals and deposits more than once if there was more than one event for the same user and date.
Start by having the total exchanged on each day, like
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d
group by user_id, date
order by user_id, date;
What follows gets the state of the account only on days that had any deposits or withdrawals.
To get the results for all days (and not just those with account movement) you just have to change the cross join part to use a table with all the dates you want (like Get all dates between two dates in SQL Server), but I digress...
select dates.date, c.council_id, u.name username
, u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
, u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
from councils c
inner join markets m on c.council_id=m.council_id
inner join market_user_link mul on m.market_id=mul.market_id
inner join users u on mul.user_id=u.user_id
left join (
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d group by user_id, date
) e on u.user_id=e.user_id -- exchange on each day
cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates -- all days that had a transaction
group by dates.date, c.council_id, u.name, u.account_bal
order by dates.date desc, c.council_id, u.name;
From there you can rearrange to get the result you want.
select date, council_id
, sum(case when amount_on_start_of_day<0 then amount_on_start_of_day else 0 end) o_d_amount_start
, sum(case when amount_on_end_of_day<0 then amount_on_end_of_day else 0 end) o_d_amount_end
from (
select dates.date, c.council_id, u.name username
, u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
, u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
from councils c
inner join markets m on c.council_id=m.council_id
inner join market_user_link mul on m.market_id=mul.market_id
inner join users u on mul.user_id=u.user_id
left join (
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d group by user_id, date
) e on u.user_id=e.user_id -- exchange on each day
cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates -- all days that had a transaction
group by dates.date, c.council_id, u.name, u.account_bal
) result
group by date, council_id
order by date;
You can check it on https://www.db-fiddle.com/f/msScT6B5F7FjU2aQXVr2da/6
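To see the "work backwards from the current balance" idea in isolation, here is a small sketch on the question's sample data. It is only an illustration under a few assumptions: it runs through Python's sqlite3 rather than MySQL, it skips the council mapping and sums over all users, and the names `moves`, `days`, `balances` and `delta` are made up for the example. The end-of-day balance for a date is the current balance minus every movement that happened after that date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, name TEXT, account_bal INTEGER);
CREATE TABLE deposits (id INTEGER, user_id INTEGER, amount INTEGER, date TEXT);
CREATE TABLE transactions (id INTEGER, user_id INTEGER, amount_tendered INTEGER, date TEXT);
INSERT INTO users VALUES (1,'Wells',-5),(2,'James',100),(3,'Joy',10),(4,'Mumbi',-300);
INSERT INTO deposits VALUES (1,1,5,'2021-04-26'),(2,3,10,'2021-04-26'),
                            (3,3,5,'2021-04-25'),(4,4,5,'2021-04-25');
INSERT INTO transactions VALUES (1,1,5,'2021-04-27'),(2,2,10,'2021-04-26'),
                                (3,3,15,'2021-04-26'),(4,4,50,'2021-04-25');
""")

daily_overdraft = conn.execute("""
WITH moves AS (              -- deposits add, transactions subtract
    SELECT user_id, date, amount AS delta FROM deposits
    UNION ALL
    SELECT user_id, date, -amount_tendered FROM transactions
),
days AS (SELECT DISTINCT date FROM moves),
balances AS (                -- undo every movement later than the given day
    SELECT d.date AS date, u.user_id AS user_id,
           u.account_bal
             - COALESCE(SUM(CASE WHEN m.date > d.date THEN m.delta END), 0)
             AS end_of_day
    FROM users u
    CROSS JOIN days d
    LEFT JOIN moves m ON m.user_id = u.user_id
    GROUP BY d.date, u.user_id, u.account_bal
)
SELECT date, SUM(MIN(end_of_day, 0)) AS o_d_amount_end  -- only negative balances count
FROM balances
GROUP BY date
ORDER BY date
""").fetchall()

for row in daily_overdraft:
    print(row)
```

In MySQL the two-argument `MIN(x, 0)` would be written `LEAST(x, 0)`; grouping by council instead of summing over everyone only needs the extra joins shown in the queries above.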
Basically the query maps users to councils, calculates periods of overdraft for users, then aggregates over councils. I assume that the starting balance is dated at the start of the month, '2021-04-01' (it could be an ending balance as well, see below); change it as needed. I also assume that a negative starting balance counts as an overdraft. For simplicity and debugging, the query is divided into a number of steps.
with uc as (
select distinct m.council_id, mul.user_id
from markets m
join market_user_link mul on m.market_id = mul.market_id
),
user_running_total as (
select user_id, date,
coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
sum(sum(s)) over(partition by user_id order by date) rt
from (
select user_id, date, -amount_tendered s
from transactions
union all
select user_id, date, amount
from deposits
union all
select user_id, se.d, se.s
from users
cross join lateral (
select date(NOW() + interval 1 day) d, 0 s
union all
select '2021-04-01' d, account_bal
) se
) t
group by user_id, date
),
user_overdraft as (
select user_id, date, nxt, least(rt, 0) ovd
from user_running_total
where date <= date(NOW())
),
dates as (
select date
from user_overdraft
union
select nxt
from user_overdraft
),
council__overdraft as (
select uc.council_id, d.date, sum(uo.ovd) total_overdraft, lag(sum(uo.ovd), 1, sum(uo.ovd) - 1) over(partition by uc.council_id order by d.date) prev_ovd
from uc
cross join dates d
join user_overdraft uo on uc.user_id = uo.user_id and d.date between uo.date and uo.nxt
group by uc.council_id, d.date
)
select council_id, date, total_overdraft
from council__overdraft
where total_overdraft <> prev_ovd
order by date, council_id
Really, council__overdraft is quite usable as it is; the last step just compacts the output by excluding intermediate dates on which the overdraft did not change.
With the following sample data:
users
user_id name account_bal
1 Wells -5
2 James 100
3 Joy 10
4 Mumbi -300
deposits, ordered by date, extra row added for the last date
id user_id amount date
3 3 5 2021-04-25
4 4 5 2021-04-25
1 1 5 2021-04-26
2 3 10 2021-04-26
5 3 73 2021-05-06
transactions, ordered by date (note the added row, to illustrate the running total in action)
id user_id amount_tendered date
5 4 50 2021-04-25
2 2 10 2021-04-26
3 3 15 2021-04-26
1 1 5 2021-04-27
4 3 17 2021-04-27
councils
council_id name
1 a
2 b
3 c
markets
market_id name council_id
1 x 3
2 y 1
3 z 2
market_user_link
id market_id user_id
1 1 3
2 2 2
3 3 1
4 3 4
the query output is
council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -305
3 2021-04-01 0
2 2021-04-25 -350
2 2021-04-26 -345
2 2021-04-27 -350
3 2021-04-27 -7
3 2021-05-06 0
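The core of the user_running_total step is the `sum(sum(s)) over(...)` pattern: the inner sum collapses each user's movements per day, and the outer windowed sum accumulates those daily totals. A minimal sketch of just that step, run through Python's sqlite3 (SQLite 3.25+ assumed) instead of MySQL; the `movements` table is a hypothetical pre-flattened version of the union in the query, holding the opening balance and the signed movements for users 1 and 3 from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movements (user_id INTEGER, date TEXT, s INTEGER)")
conn.executemany(
    "INSERT INTO movements VALUES (?, ?, ?)",
    [
        (1, "2021-04-01", -5),   # opening balance
        (1, "2021-04-26", 5),    # deposit
        (1, "2021-04-27", -5),   # transaction
        (3, "2021-04-01", 10),   # opening balance
        (3, "2021-04-25", 5),
        (3, "2021-04-26", 10),
        (3, "2021-04-26", -15),  # same day as the deposit above
        (3, "2021-04-27", -17),
        (3, "2021-05-06", 73),
    ],
)

# Inner SUM(s): net movement per user and day (GROUP BY).
# Outer SUM(...) OVER: running total of those daily nets per user.
# MIN(rt, 0) keeps only the overdrawn part (LEAST(rt, 0) in MySQL).
running = conn.execute(
    """
    SELECT user_id, date, rt, MIN(rt, 0) AS ovd
    FROM (
        SELECT user_id, date,
               SUM(SUM(s)) OVER (PARTITION BY user_id ORDER BY date) AS rt
        FROM movements
        GROUP BY user_id, date
    ) AS per_day
    ORDER BY user_id, date
    """
).fetchall()

for row in running:
    print(row)
```

User 3 only dips below zero on 2021-04-27 (running total -7) and recovers on 2021-05-06, which lines up with the council 3 rows in the output above.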
Alternatively, provided the users table holds a closing (NOW()) balance, replace the user_running_total CTE with the following code
user_running_total as (
select user_id, date,
coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
coalesce(sum(sum(s)) over(partition by user_id order by date desc
rows between unbounded preceding and 1 preceding), sum(s)) rt
from (
select user_id, date, amount_tendered s
from transactions
union all
select user_id, date, -amount
from deposits
union all
select user_id, se.d, se.s
from users
cross join lateral (
select date(NOW() + interval 1 day) d, account_bal s
union all
select '2021-04-01' d, 0
) se
) t
where DATE(date) between date '2021-04-01' and date(NOW() + interval 1 day)
group by user_id, date
),
This way the query starts with the closing balance, dated the day after now, and rolls the running total back in reverse order until '2021-04-01', the starting date.
Output
council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -260
3 2021-04-01 -46
2 2021-04-25 -305
3 2021-04-25 -41
2 2021-04-26 -300
3 2021-04-26 -46
2 2021-04-27 -305
3 2021-04-27 -63
3 2021-05-06 0

Update columns based on calculation

My table looks like this:
id entry_date
1 21/12/2020 15:00
1 21/12/2020 17:00
1 21/12/2020 19:00
2 24/12/2020 00:00
2 24/12/2020 12:00
I have a list of id's connected to datestamps. I can manage to calculate the difference between their latest and first entry as follows:
SELECT id, TIMESTAMPDIFF(hour, MIN(entry_date), MAX(entry_date))
FROM mytable
GROUP BY id;
However, I am unsure how I can update my table to reflect these calculations. What I want is the following:
id entry_date time_difference
1 21/12/2020 15:00 4
1 21/12/2020 17:00 4
1 21/12/2020 19:00 4
2 24/12/2020 00:00 12
2 24/12/2020 12:00 12
In MySQL, you can self-join:
update mytable t
inner join (
select id,
timestampdiff(hour, min(entry_date), max(entry_date)) as time_difference
from mytable
group by id
) t1 on t1.id = t.id
set t.time_difference = t1.time_difference
I would not necessarily recommend storing this derived information, because it is hard to keep it up to date. Instead, you can create a view. If you are running MySQL 8.0:
create view myview as
select t.*,
timestampdiff(
hour,
min(entry_date) over(partition by id),
max(entry_date) over(partition by id)
) as timedifference
from mytable t
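To illustrate why a view stays correct without any maintenance, here is a small sketch using Python's sqlite3 as a stand-in for MySQL 8.0. SQLite has no `timestampdiff`, so the hour difference is computed from epoch seconds via `strftime('%s', ...)`; the dates are stored in ISO format, which is an assumption about how entry_date is actually typed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, entry_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "2020-12-21 15:00:00"), (1, "2020-12-21 17:00:00"),
        (1, "2020-12-21 19:00:00"), (2, "2020-12-24 00:00:00"),
        (2, "2020-12-24 12:00:00"),
    ],
)

# The view recomputes the per-id span (in whole hours) on every read.
conn.execute(
    """
    CREATE VIEW myview AS
    SELECT id, entry_date,
           (strftime('%s', MAX(entry_date) OVER (PARTITION BY id))
            - strftime('%s', MIN(entry_date) OVER (PARTITION BY id))) / 3600
           AS time_difference
    FROM mytable
    """
)

QUERY = "SELECT time_difference FROM myview ORDER BY id, entry_date"
before = [r[0] for r in conn.execute(QUERY)]

# A later entry for id 1 is picked up automatically, no UPDATE needed.
conn.execute("INSERT INTO mytable VALUES (1, '2020-12-21 23:00:00')")
after = [r[0] for r in conn.execute(QUERY)]

print(before)
print(after)
```

A stored time_difference column would still show 4 for id 1 after the new row arrives, until the update is run again.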
You can use a join in the update:
update mytable t join
(SELECT id, TIMESTAMPDIFF(hour, MIN(entry_date), MAX(entry_date)) as diff
FROM mytable
GROUP BY id
) tt
using (id)
set t.time_difference = tt.diff;

Get Percentage of Last X entries in MySQL

I have 2 tables in MySQL(InnoDB). The first is an employee table. The other table is the expense table. For simplicity, the employee table contains just id and first_name. The expense table contains id, employee_id(foreign key), amount_spent, budget, and created_time. What I would like is a query that returns the percentage of their budget spent for the most recent X number of expense they've registered.
So given the employee table:
| id | first_name
-------------------
1 alice
2 bob
3 mike
4 sally
and the expense table:
| id | employee_id | amount_spent | budget | created_time
----------------------------------------------------------
1 1 10 100 10/18
2 1 50 100 10/19
3 1 0 40 10/20
4 2 5 20 10/22
5 2 10 70 10/23
6 2 75 100 10/24
7 3 50 50 10/25
The query for the last 3 trips would return
|employee_id| first_name | percentage_spent |
--------------------------------------------
1 alice .2500 <----------(60/240)
2 bob .4736 <----------(90/190)
3 mike 1.000 <----------(50/50)
The query for the last 2 trips would return
|employee_id| first_name | percentage_spent |
--------------------------------------------
1 alice .3571 <----------(50/140)
2 bob .5000 <----------(85/170)
3 mike 1.000 <----------(50/50)
It would be nice if the query, as noted above, did not return any employees who have not registered any expenses (sally). Thanks in advance!!
I'd advise you to convert the datatype of created_time to DATETIME in order to get accurate results.
For now, I've assumed that the most recent id indicates the most recent expense, as that's what the sample data suggests.
The following query should work (not tested, though):
select t2.employee_id,t1.first_name,
sum(t2.amount_spent)/sum(t2.budget) as percentage_spent
from employee t1
inner join
(select temp.* from
(select e.*,@num := if(@type = employee_id, @num + 1, 1) as rn,
@type := employee_id as dummy
from expense e
order by employee_id,id desc) temp where temp.rn <= 3 -- write the value of n here
) t2
on t1.id = t2.employee_id
group by t2.employee_id, t1.first_name
;
Feel free to ask doubt(s), if you've any.
Hope it helps!
If you are using MySQL 8.0.2 or higher, you can use window functions for it.
SELECT employee_id, first_name, sliding_sum_spent/sliding_sum_budget AS percentage_spent
FROM
(
SELECT employee_id, first_name,
-- last 3 rows: the current row and the 2 before it
SUM(amount_spent) OVER (PARTITION BY employee_id
ORDER BY created_time
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sliding_sum_spent,
SUM(budget) OVER (PARTITION BY employee_id
ORDER BY created_time
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sliding_sum_budget,
ROW_NUMBER() OVER (PARTITION BY employee_id
ORDER BY created_time DESC) rn
FROM expense
JOIN employee ON expense.employee_id = employee.id
) t
WHERE t.rn = 1
As mentioned by Harshil, ordering the rows by created_time may be a problem as it stands; it would be better to store it using a date data type.
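Another way to express "the last N expenses per employee" is to number the rows per employee from newest to oldest and keep only the first N. The sketch below does this with `ROW_NUMBER()` through Python's sqlite3 (standing in for MySQL 8.0). It assumes, like the first answer, that a higher id means a more recent expense, and `percentage_spent` is a helper name invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE expense (id INTEGER PRIMARY KEY, employee_id INTEGER,
                      amount_spent INTEGER, budget INTEGER);
INSERT INTO employee VALUES (1,'alice'),(2,'bob'),(3,'mike'),(4,'sally');
INSERT INTO expense VALUES (1,1,10,100),(2,1,50,100),(3,1,0,40),
                           (4,2,5,20),(5,2,10,70),(6,2,75,100),(7,3,50,50);
""")


def percentage_spent(conn, n):
    """Share of budget spent over each employee's n most recent expenses."""
    return conn.execute(
        """
        SELECT e.id, e.first_name,
               1.0 * SUM(x.amount_spent) / SUM(x.budget) AS percentage_spent
        FROM employee e
        JOIN (
            -- rn = 1 is the newest expense (highest id) of each employee
            SELECT *, ROW_NUMBER() OVER (PARTITION BY employee_id
                                         ORDER BY id DESC) AS rn
            FROM expense
        ) AS x ON x.employee_id = e.id
        WHERE x.rn <= ?
        GROUP BY e.id, e.first_name
        ORDER BY e.id
        """,
        (n,),
    ).fetchall()


last3 = percentage_spent(conn, 3)
last2 = percentage_spent(conn, 2)
print(last3)
print(last2)
```

Because the join is an inner join, employees without any expenses (sally) drop out of the result on their own.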

COUNT DISTINCT + COUNT GROUP BY HAVING (value) + GROUP BY months

I have a table with columns: cid, date
Sample table data: Note: cid contains string values eg: 'otsytb8o7sbs50w9doghwzvfy0vb8f9h' many are duplicated.
cid date
--------------------------------------------------------
1 2015-10-10 04:57:57
2 2015-10-10 05:03:58
3 2015-10-10 05:24:49
4 2015-10-10 05:28:24
5 2015-10-10 05:28:26
6 2015-10-10 05:28:40
7 2015-10-10 05:30:39
8 2015-10-10 05:33:04
9 2015-10-10 05:35:42
9 2015-10-10 05:36:03
I want to get the following:
Count of Distinct cid as uniqVisits
Count of cid HAVING (count <= 1) as bounced
Grouped by month
I want to get bounce rate per month from Cookie ID's (cid).
So I am looking for: ( COUNT of unique Cookie ID's with a count of <=1 ) for bounced, and ( COUNT DISTINCT cid's ) for total unique visitors, Grouped By month
Desired result:
uniqVisits | bounced | month
-----------|---------|-------
2345 | 325 | 2015-10
-----------|---------|-------
7345 | 734 | 2015-11
-----------|---------|-------
3982 | 823 | 2015-12
-----------|---------|-------
4291 | 639 | 2016-01
I have tried a lot of methods; the one below is the closest I can get, but it gives me the error: "Operand should contain 1 column(s)"
SELECT count(*) AS bounced,
( SELECT count( DISTINCT(cid) ) AS uniqVisits,
SUBSTR(DATE(date),1,7) AS month
FROM table ) AS uniqVisits
FROM (
SELECT COUNT(cid) AS bounced,
SUBSTR(DATE(date),1,7) AS month
FROM table
GROUP BY cid
HAVING (count <= 1)
) AS x
GROUP BY month
How can I write this query to give me the desired result I want in the "Desired result:" chart / table illustrated above?
BTW: I also tried the below query but it times out, and then throws a server error: It also does not group the second query into month, obviously because of the "cid having count <=1"
SELECT c1.uniqVisits,
c1.month,
c2.bounced
FROM ( SELECT COUNT(DISTINCT t1.cid) AS `uniqVisits`,
SUBSTR(DATE(t1.date),1,7) AS `month`
FROM table t1
GROUP BY month
) c1
JOIN ( SELECT COUNT(*) AS `bounced`,
SUBSTR(DATE(t2.date),1,7) AS `month`
FROM table t2
GROUP BY month, cid HAVING (count <= 1)
) c2
ON c2.month = c1.month
ORDER BY c1.month
So I have resolved this:
SELECT uniqVisitors, COUNT(*) AS bounced, T1.month
FROM (
SELECT cid,
SUBSTR(DATE(date),1,7) AS month
FROM table
GROUP BY cid
HAVING COUNT(*) <= 1
) T1
LEFT JOIN
( SELECT count( DISTINCT(cid) ) AS uniqVisitors,
SUBSTR(DATE(date),1,7) AS month
FROM table
GROUP By month ) T2
ON T1.month = T2.month
GROUP BY month
Gives me:
uniqVisitors | bounced | month
---------------------------------
7237 6822 2015-10
12597 12136 2015-11
12980 12573 2015-12
12091 11695 2016-01
5396 5134 2016-02
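For reference, both counts can also come out of a single pass: first count the hits per (month, cid), then aggregate those per month. The sketch below is one possible reading rather than a drop-in replacement, because it treats a cid as bounced when it has exactly one hit within that month, whereas the query above groups by cid across all months. It runs through Python's sqlite3, and the table name `visits` and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (cid TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("a", "2015-10-10 04:57:57"),
        ("b", "2015-10-10 05:03:58"),
        ("b", "2015-10-11 05:24:49"),
        ("c", "2015-10-12 05:28:24"),
        ("a", "2015-11-01 05:28:26"),
        ("d", "2015-11-02 05:28:40"),
        ("d", "2015-11-03 05:30:39"),
        ("d", "2015-11-04 05:33:04"),
    ],
)

monthly = conn.execute(
    """
    SELECT month,
           COUNT(*) AS uniqVisits,                           -- one row per cid
           SUM(CASE WHEN hits = 1 THEN 1 ELSE 0 END) AS bounced
    FROM (
        SELECT substr(date, 1, 7) AS month, cid, COUNT(*) AS hits
        FROM visits
        GROUP BY substr(date, 1, 7), cid
    ) AS per_cid
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for row in monthly:
    print(row)
```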