Find last two amounts of each customer based on dates in MySQL

recharge_table
r_date       r_name     r_amount
01-01-2020   Phineas    120
01-02-2020   Phineas    130
01-03-2020   Phineas    199
01-04-2020   Candes     299
03-01-2020   Candes     149
03-02-2020   Ferb       149
03-03-2020   Platypus   349
05-08-2020   Ferb       459
09-11-2020   Candes     199
06-10-2020   Platypus   299
Find the last two amounts of each customer based on dates, in ascending order of name.
The output must be as below:
| Candes   | 199 | 299 |
| Ferb     | 459 | 149 |
| Phineas  | 199 | 130 |
| Platypus | 299 | 349 |
If possible, also give an explanation.

You can use row_number() and conditional aggregation:
select r_name,
       max(case when seqnum = 1 then r_amount end) as latest_amount,
       max(case when seqnum = 2 then r_amount end) as previous_amount
from (select r.*,
             row_number() over (partition by r_name order by r_date desc) as seqnum
      from recharge_table r
     ) r
where seqnum <= 2
group by r_name
order by r_name;
Explanation: the derived table numbers each customer's rows from the newest date backwards, so seqnum = 1 is the most recent recharge and seqnum = 2 the one before it. The outer query keeps only those two rows per customer and pivots them into columns with conditional aggregation; order by r_name gives the ascending name order requested.
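One caveat: this assumes r_date is a DATE column. If the dates are stored as DD-MM-YYYY strings, as they appear in the sample above, ordering by r_date directly would sort them as text; a sketch of the same query using STR_TO_DATE, under that assumption:
select r_name,
       max(case when seqnum = 1 then r_amount end) as latest_amount,
       max(case when seqnum = 2 then r_amount end) as previous_amount
from (select r.*,
             -- parse the DD-MM-YYYY string so the ordering is chronological
             row_number() over (partition by r_name
                                order by str_to_date(r_date, '%d-%m-%Y') desc) as seqnum
      from recharge_table r
     ) r
where seqnum <= 2
group by r_name
order by r_name;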

Related

sql exclude rows based on first occurrence of data and conditions

I have created a dataset that has columns for 2 customers:
Cust_No Transaction_date amount credit_debit running_total row_num
1 5/27/2022 800 D -200 1
1 5/26/2022 300 D 600 2
1 5/22/2022 800 C 900 3
1 5/20/2022 100 C 100 4
9 5/16/2022 500 D -300 1
9 5/14/2022 300 D 200 2
9 5/6/2022 200 C 500 3
9 5/5/2022 500 D 300 4
9 5/2/2022 300 D 800 5
9 5/2/2022 500 C 1100 6
9 5/1/2022 500 C 600 7
9 5/1/2022 100 C 100 8
The result I am looking for is:
Cust_No Transaction_date amount credit_debit running_total row_num
1 5/27/2022 800 D -200 1
1 5/26/2022 300 D 600 2
1 5/22/2022 800 C 900 3
9 5/16/2022 500 D -300 1
9 5/14/2022 300 D 200 2
9 5/6/2022 200 C 500 3
9 5/5/2022 500 D 300 4
9 5/2/2022 300 D 800 5
9 5/2/2022 500 C 1100 6
I sorted the dataset based on latest transaction for each customer.
We note the latest transaction amount and search for the first occurrence of the same amount that was a credit (C), and exclude the rest of the rows after it.
In the example above, customer 9 has a latest debit transaction of 500, so we look for the most recent credit transaction of 500 and exclude all the rows after that for customer 9.
Progress Made so far:
I calculated the running total using this logic:
sum (case when credit_debit ='C' then amount else -1*amount end) over (partition by cust_no order by transaction_date desc ) as running_total
I also got the data using lead with offsets 1 through 5, but this is not efficient, since there could be many rows before the first credit with the same amount as the first row:
case when lead(amount, 1) over(partition by cust_no order by transaction_date desc) = amount then amount else null end as lead1
Not sure which DBMS this is for, but it needs a lateral join in Postgres.
It finds the most recent transaction (the row where rn = 1), matches its amount to an earlier credit transaction of the same amount, and uses the rn of that row as the boundary of row numbers to be returned:
with CTE as (
    select
        Cust_No, Transaction_date, amount, credit_debit, running_total,
        row_number() over (partition by cust_no order by transaction_date desc) as rn
    from mytable
),
RANGE as (
    select *
    from CTE
    left join lateral (
        select c.rn as ignore_after
        from CTE as c
        where CTE.Cust_No = c.Cust_No
          and CTE.amount = c.amount
          and c.credit_debit = 'C'
          and CTE.rn = 1
        order by c.rn asc
        limit 1
    ) oa on true
    where CTE.rn = 1
)
select CTE.*
from CTE
inner join RANGE
    on CTE.rn between RANGE.rn and RANGE.ignore_after
   and CTE.cust_no = RANGE.cust_no
Cust_No | Transaction_date | amount | credit_debit | running_total | rn
------: | :--------------- | -----: | :----------- | ------------: | -:
1 | 2022-05-27 | 800 | D | -200 | 1
1 | 2022-05-26 | 300 | D | 600 | 2
1 | 2022-05-22 | 800 | C | 900 | 3
9 | 2022-05-16 | 500 | D | -300 | 1
9 | 2022-05-14 | 300 | D | 200 | 2
9 | 2022-05-06 | 200 | C | 500 | 3
9 | 2022-05-05 | 500 | D | 300 | 4
9 | 2022-05-02 | 300 | D | 800 | 5
9 | 2022-05-02 | 500 | C | 1100 | 6
For Postgres, see: db<>fiddle here
NB: for an "outer apply" example I have also used SQL Server; for that, see the following fiddle: db<>fiddle here
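For reference, an OUTER APPLY version of the same idea for SQL Server could look roughly like the sketch below (the CTE name bounds is just an illustrative choice; the table and column names are the same as above):
with CTE as (
    select Cust_No, Transaction_date, amount, credit_debit, running_total,
           row_number() over (partition by Cust_No order by Transaction_date desc) as rn
    from mytable
),
bounds as (
    select CTE.Cust_No, CTE.rn, oa.ignore_after
    from CTE
    outer apply (
        -- earliest (smallest rn) credit with the same amount as the latest transaction
        select top (1) c.rn as ignore_after
        from CTE as c
        where c.Cust_No = CTE.Cust_No
          and c.amount = CTE.amount
          and c.credit_debit = 'C'
        order by c.rn asc
    ) oa
    where CTE.rn = 1
)
select CTE.*
from CTE
join bounds
  on CTE.Cust_No = bounds.Cust_No
 and CTE.rn between bounds.rn and bounds.ignore_after;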

SQL query to find the concurrent sessions based on start and end time

Below is a sample dataset showing TV sessions of each TV set of each household.
Household “111” switches on its TV “1” at 500 and switches it off at 570. However, this has been captured
in the data as 2 separate rows. You will have to write a query to convert this into a single row.
A similar modification needs to be made to all other subsequent occurrences. Please note that a single
valid TV session can be split into more than 2 rows as well (as shown by rows 5-8).
Input :
Table [session]
Household_ID TV_Set_ID Start_time End_time
111 1 500 550
111 1 550 570
111 1 590 620
111 1 650 670
111 2 660 680
111 2 680 700
111 2 700 750
111 2 750 770
112 2 1050 1060
113 1 1060 1080
113 1 1080 1100
113 1 1100 1120
113 1 1500 1520
Expected Output :-
Household_ID TV_Set_ID Start_time End_time
111 1 500 570
111 1 590 620
111 1 650 670
111 2 660 770
112 2 1050 1060
113 1 1060 1120
113 1 1500 1520
I tried to find the lead start time, calculate the difference between it and the end time, and thought I could group by, but that logic won't work, since we don't just want the overall start and end times but also the gaps between sessions. I'm stuck with the logic. Could someone tell me how to proceed further?
with result as
(
select Household_ID, TV_Set_ID, Start_time, End_time, lead(Start_time)
over (partition by Household_ID, TV_Set_ID order by Household_ID, TV_Set_ID) as lead_start
from session )
select *,lead_start - End_time as diff from result ;
Here is a way to get this done.
In the data block I create groups: any record whose previous end_time doesn't match its start_time gets a new group_number; otherwise it keeps the same one.
After that, in the main block, I group by this group_number along with household_id and tv_set_id to get the results.
with data as (
    select *,
           case when lag(end_time) over (partition by household_id, tv_set_id order by end_time)
                     <> start_time
                then sum(1) over (partition by household_id, tv_set_id order by end_time)
                else sum(0) over (partition by household_id, tv_set_id order by end_time)
           end as group_number
    from t
)
select household_id,
       tv_set_id,
       min(start_time) as start_time,
       max(end_time) as end_time
from data
group by household_id, tv_set_id, group_number
+--------------+-----------+------------+----------+
| household_id | tv_set_id | start_time | end_time |
+--------------+-----------+------------+----------+
| 111 | 1 | 500 | 570 |
| 111 | 1 | 590 | 620 |
| 111 | 1 | 650 | 670 |
| 111 | 2 | 660 | 770 |
| 112 | 2 | 1050 | 1060 |
| 113 | 1 | 1060 | 1120 |
| 113 | 1 | 1500 | 1520 |
+--------------+-----------+------------+----------+
db fiddle link
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ba5ade186ebc3cf693c505d863691670
You could conditionally increment a counter, as in this case.
You need to keep incrementing it in case the same household and TV set is used again later; the column sequence created in the CTE is there for that reason. Here is an indicative answer.
WITH cte AS (
    select t.*,
           sum(flag) over (partition by household_id, tv_set_id
                           order by household_id, tv_set_id, start_time) as sequence
    from (select t.*,
                 case when start_time = LAG(end_time, 1) OVER (PARTITION BY household_id, tv_set_id
                                                               ORDER BY household_id, tv_set_id, start_time)
                      then 0
                      else 1
                 end as flag
          from t
         ) t
)
SELECT household_id, tv_set_id, MIN(start_time), MAX(end_time)
FROM cte
GROUP BY household_id, tv_set_id, sequence

Select where in() for each id return equal rows count

How can I select an equal number of rows for each user_id?
My example table:
mp3_id | user_id
--------------------
120 | 840
123 | 840
126 | 840
128 | 455
130 | 840
131 | 840
132 | 840
135 | 840
144 | 840
158 | 840
159 | 455
161 | 455
169 | 455
180 | 840
181 | 455
184 | 455
186 | 455
189 | 455
My simple query:
select mp3_id where user_id IN (840,455) limit 8
Return:
mp3_id | user_id
--------------------
120 | 840
123 | 840
126 | 840
128 | 455
130 | 840
131 | 840
132 | 840
135 | 840
But I want this result:
mp3_id | user_id
--------------------
120 | 840
123 | 840
126 | 840
130 | 840
128 | 455
159 | 455
161 | 455
169 | 455
I want each user_id to return an equal row count. How to?
You could do this with a UNION (in MySQL, each SELECT needs its own parentheses for LIMIT to apply per branch):
(select mp3_id from my_table where user_id = 840 order by mp3_id limit 4)
union all
(select mp3_id from my_table where user_id = 455 order by mp3_id limit 4)
try this query:
select
yt.mp3_id,
e.user_id
from
(
select distinct user_id from your_table
) e
join your_table yt on true
where yt.mp3_id in (select tt.mp3_id from your_table tt where tt.user_id = e.user_id order by tt.mp3_id limit 4)
And this is the same query, but with a condition on the user_ids:
select
yt.mp3_id,
e.user_id
from
(
select distinct user_id from your_table where user_id in (840,455)
) e
join your_table yt on true
where yt.mp3_id in (select tt.mp3_id from your_table tt where tt.user_id = e.user_id order by tt.mp3_id limit 4)
SELECT x.*
FROM my_table x
JOIN my_table y
ON y.user_id = x.user_id
AND y.mp3_id <= x.mp3_id
GROUP
BY x.mp3_id HAVING COUNT(*) <= 4
ORDER
BY user_id DESC
, mp3_id;
or faster
SELECT mp3_id, user_id
FROM (
    SELECT x.*,
           CASE WHEN @prev = user_id THEN @i := @i + 1 ELSE @i := 1 END AS i,
           @prev := user_id
    FROM my_table x, (SELECT @prev := null, @i := 1) vars
    ORDER BY user_id DESC, mp3_id
) a
WHERE i <= 4;
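As a side note, MySQL rejects LIMIT inside an IN (...) subquery, and the user-variable trick above relies on an evaluation order that MySQL 8 no longer guarantees. On MySQL 8+ a window function is the usual way to take the first N rows per user; a minimal sketch, assuming the table is called my_table as in the previous answer:
SELECT mp3_id, user_id
FROM (
    SELECT mp3_id, user_id,
           -- number each user's rows by ascending mp3_id
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY mp3_id) AS rn
    FROM my_table
    WHERE user_id IN (840, 455)
) t
WHERE rn <= 4
ORDER BY user_id DESC, mp3_id;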

Group by date every 10 days

I have this schema:
+----+--------+------------+
| ID | Amount | paydate    |
+----+--------+------------+
| 1  | 200    | 2016-11-05 |
| 2  | 3000   | 2016-11-10 |
| 3  | 2500   | 2016-11-11 |
| 4  | 100    | 2016-11-21 |
| 1  | 200    | 2016-11-22 |
| 2  | 3000   | 2016-11-23 |
| 3  | 2500   | 2016-11-29 |
+----+--------+------------+
How can I get the total Amount grouped by every 10 days, i.e. from the 1st of every month to the 10th, then from the 11th to the 20th, and from the 21st to the end of the month?
It should be shown like this:
+-----------+------------------------+
| Amount | paydate |
+-----------+------------------------+
| 3200 |2016-11-1 to 2016-11-10 |
+-----------+------------------------+
| 2500 |2016-11-11 to 2016-11-20|
+-----------+------------------------+
| 5800 |2016-11-21 to 2016-11-31|
+-----------+------------------------+
I tried
SELECT
SUM(Amount) AS Amount,
year(Facture.paydate) AS Annee,
month(Facture.paydate) AS Mois
FROM Facture
GROUP BY year(Facture.paydate), month(Facture.paydate)
but this does not give me the result I need.
select sum(Amount) as sum_amount,
       case
           when day(paydate) <= 10 then concat(DATE_FORMAT(paydate, '%Y-%m-01'), ' to ', DATE_FORMAT(paydate, '%Y-%m-10'))
           when day(paydate) <= 20 then concat(DATE_FORMAT(paydate, '%Y-%m-11'), ' to ', DATE_FORMAT(paydate, '%Y-%m-20'))
           else concat(DATE_FORMAT(paydate, '%Y-%m-21'), ' to ', DATE_FORMAT(paydate, '%Y-%m-31'))
       end as paydate_period
from t
group by paydate_period;
sum_amount paydate_period
3200 2016-11-01 to 2016-11-10
2500 2016-11-11 to 2016-11-20
5800 2016-11-21 to 2016-11-31
Here is an example query:
select
case
when day(date_field) between 1 and 10 then "01 to 10"
when day(date_field) between 11 and 20 then "11 to 20"
when day(date_field) between 21 and 31 then "21 to 31"
end as the_range,
date_format(date_field, "%m%Y") as the_month,
count(*)
from
the_table
group by
the_range, the_month
order by
the_month, the_range;
You can adapt the query so you display your result the way you need.
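For example, swapping count(*) for sum(Amount) and plugging in the table and column names from the question (Facture, paydate, Amount) would give the requested totals per 10-day bucket; a sketch along those lines:
select
    case
        when day(paydate) between 1 and 10 then '01 to 10'
        when day(paydate) between 11 and 20 then '11 to 20'
        else '21 to 31'
    end as the_range,
    date_format(paydate, '%Y-%m') as the_month,
    -- same bucketing as above, but summing Amount instead of counting rows
    sum(Amount) as total_amount
from
    Facture
group by
    the_range, the_month
order by
    the_month, the_range;
Using '%Y-%m' for the_month keeps the ordering chronological if the data spans more than one year.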

Still show the proper set of time even if there's no entry for that time

I have this query which gets the average and groups the values into 15-minute intervals from 12:00 AM to 11:45 PM.
SELECT FROM_UNIXTIME(t_stamp/1000, '%m/%d/%Y %l:%i %p') as t_stamp,
ROUND(AVG(CASE WHEN id = '001' THEN value END),2) Value1,
ROUND(AVG(CASE WHEN id = '002' THEN value END),2) Value2,
ROUND(AVG(CASE WHEN id = '003' THEN value END),2) Value3
FROM table1
WHERE tagid IN ("001", "002", "003") and
date(from_unixtime(t_stamp/1000)) BETWEEN "2014-05-01" AND "2014-05-01"
GROUP BY DATE(from_unixtime(t_stamp/1000)), HOUR(from_unixtime(t_stamp/1000)), MINUTE(from_unixtime(t_stamp/1000)) DIV 15
The output looks like this
t_stamp | Value1 | Value2 | Value3
05/01/2014 12:00 AM | 199 | 99 | 100
05/01/2014 12:15 AM | 299 | 19 | 140
05/01/2014 12:30 AM | 399 | 59 | 106
05/01/2014 12:45 AM | 499 | 59 | 112
.
.
.
05/01/2014 11:00 PM | 149 | 199 | 100
05/01/2014 11:15 PM | 599 | 93 | 123
05/01/2014 11:30 PM | 129 | 56 | 150
05/01/2014 11:45 PM | 109 | 60 | 134
It works fine, but I've noticed that sometimes, if there's no entry for a time such as 12:30, then instead of showing
t_stamp | Value1 | Value2 | Value3
05/01/2014 12:00 AM | 199 | 99 | 100
05/01/2014 12:15 AM | 299 | 19 | 140
05/01/2014 12:30 AM | Null | Null | Null
05/01/2014 12:45 AM | 499 | 59 | 112
It will show the times like this:
t_stamp | Value1 | Value2 | Value3
05/01/2014 12:00 AM | 199 | 99 | 100
05/01/2014 12:15 AM | 299 | 19 | 140
05/01/2014 12:33 AM | 122 | 141 | 234
05/01/2014 12:45 AM | 499 | 59 | 112
What I would like to happen is this: when there's no entry for a 15-minute slot, it should still show the proper time and just show NULL in the value columns. The output I would like is like this:
t_stamp | Value1 | Value2 | Value3
05/01/2014 12:00 AM | 199 | 99 | 100
05/01/2014 12:15 AM | 299 | 19 | 140
05/01/2014 12:30 AM | Null | Null | Null
05/01/2014 12:45 AM | 499 | 59 | 112
How can I do this?
Thank You.
You need a table that's a source of cardinal numbers as a start for this. For the moment let's assume it exists, and it's called cardinal.
Then, you need to create a query (a virtual table) that will return rows with timestamps every fifteen minutes, starting with the earliest relevant timestamp and ending with the latest. Here's how to do that for your query.
SELECT '2014-05-01' + INTERVAL (cardinal.n * 15) MINUTE AS slot_start
FROM cardinal
WHERE cardinal.n < 24*4
Then you need to LEFT JOIN your table to that virtual table, with the generated slot timestamps on the left so that empty slots are kept, as follows:
SELECT DATE_FORMAT(slot.slot_start, '%m/%d/%Y %l:%i %p') AS t_stamp,
       ROUND(AVG(CASE WHEN id = '001' THEN value END), 2) AS Value1,
       ROUND(AVG(CASE WHEN id = '002' THEN value END), 2) AS Value2,
       ROUND(AVG(CASE WHEN id = '003' THEN value END), 2) AS Value3
FROM (
       SELECT '2014-05-01' + INTERVAL (cardinal.n * 15) MINUTE AS slot_start
       FROM cardinal
       WHERE cardinal.n < 24*4
     ) AS slot
LEFT JOIN table1 AS t
       ON FROM_UNIXTIME(t.t_stamp/1000) >= slot.slot_start
      AND FROM_UNIXTIME(t.t_stamp/1000) < slot.slot_start + INTERVAL 15 MINUTE
      AND t.tagid IN ('001', '002', '003')
GROUP BY slot.slot_start
ORDER BY slot.slot_start
Notice that the LEFT JOIN, with the generated time slots on the left-hand side, makes sure that slots with no matching rows still appear in the result set with NULL values.
Now, where does this magical cardinal table come from?
You can generate it as two views, like this. These views generate the numbers 0 through 99,999, which is more than enough for the quarter-hours in a year.
CREATE OR REPLACE VIEW cardinal10 AS
SELECT 0 AS N UNION
SELECT 1 AS N UNION
SELECT 2 AS N UNION
SELECT 3 AS N UNION
SELECT 4 AS N UNION
SELECT 5 AS N UNION
SELECT 6 AS N UNION
SELECT 7 AS N UNION
SELECT 8 AS N UNION
SELECT 9 AS N;
CREATE OR REPLACE VIEW cardinal AS
SELECT A.N + 10*(B.N + 10*(C.N + 10*(D.N + 10*(E.N)))) AS N
FROM cardinal10 A,cardinal10 B,cardinal10 C,
cardinal10 D,cardinal10 E;
Here's a writeup on the topic.
http://www.plumislandmedia.net/mysql/filling-missing-data-sequences-cardinal-integers/
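On MySQL 8+ you could also skip the cardinal views and generate the 96 quarter-hour slots with a recursive CTE; a minimal sketch of just the slot generator (the join to table1 would stay the same):
WITH RECURSIVE slots AS (
    SELECT 0 AS n            -- 96 slots: n = 0 .. 95
    UNION ALL
    SELECT n + 1 FROM slots WHERE n + 1 < 24*4
)
SELECT '2014-05-01' + INTERVAL (n * 15) MINUTE AS slot_start
FROM slots;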