SQL query for latest Order Status in MySQL - mysql

We are tracking order statuses sent by our shipping partners via a webhook that they fire. The webhook adds a row every time it is fired, so each order has multiple rows associated with it.
Structure of the table: (image not included)
We are trying to create a SQL query that does the following:
Find the last received row for an 'awb' and get the current_status in that row. If the current_status is any of 'PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED', then find the row with the first occurrence of these statuses for that specific 'awb'.
Check the number of days between the first occurrence and the last occurrence of those statuses for the awb,
and output the awbs that have more than 2 days' difference.
Here is the query I have been able to create.
WITH ranked_order_status AS (
SELECT os.*,
datediff(
now() ,
first_value(recived_at) over (partition by awb order by recived_at asc)
) as diff,
ROW_NUMBER() OVER (PARTITION BY awb ORDER BY recived_at desc) AS rn
FROM order_status AS os where current_status in ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
)
SELECT * FROM ranked_order_status WHERE rn = 1 and diff > 2
Unfortunately, this shows me all awbs that have rows with those statuses, not only the ones whose last received current_status is 'PICKUP EXCEPTION', 'OUT FOR PICKUP' or 'PICKUP RESCHEDULED'.
Any idea how I can edit this?

So if I understood it correctly, this should be a clear case of an analytical function using RANK().
This would be my approach regarding the constraints you mentioned:
WITH t1 AS (
SELECT os.*,
FIRST_VALUE(os.received_at) OVER(PARTITION BY os.awb
ORDER BY os.received_at) AS first_received_at
FROM order_status AS os
WHERE os.current_status IN ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
),
t2 AS (SELECT RANK() OVER (PARTITION BY t1.awb
ORDER BY t1.received_at DESC) AS reverse_event_sequence,
-- MySQL's DATEDIFF takes two arguments and returns the difference in days
DATEDIFF(t1.received_at, t1.first_received_at) AS day_diff,
t1.*
FROM t1
),
final AS (
SELECT *
FROM t2
WHERE t2.day_diff > 2
AND t2.reverse_event_sequence = 1
)
SELECT *
FROM final
Basically, you first grab the first value of received_at for every row, then you rank all the events for every awb in descending order so that the last event always has rank = 1, and then you apply the desired constraints on the date difference :)
I have to mention that not having a data sample doesn't help, though. I would appreciate any feedback on my approach :)

You can enumerate the rows from the most recent rows in two ways:
Partitioned by awb and ordered by received at descending
Partitioned by awb and status and ordered by received at descending
For the last status for each awb, the difference between these will be zero. You can select this and then aggregate:
select awb, current_status,
min(received_at), max(received_at)
from (select os.*,
row_number() over (partition by awb order by received_at desc) as seqnum,
row_number() over (partition by awb, current_status order by received_at desc) as seqnum_2
from order_status os
) os
where seqnum = seqnum_2 and
current_status in ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
group by awb, current_status
having max(received_at) > min(received_at) + interval 2 day;
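To see what the seqnum = seqnum_2 condition keeps, here is a small sketch in Python using the standard sqlite3 module. It is only an illustration under assumptions: the rows are made up, the table layout is guessed from the query, SQLite (3.25 or newer, for window functions) stands in for MySQL, and SQLite's datetime(..., '+2 days') replaces MySQL's + interval 2 day.

```python
import sqlite3

# Hypothetical sample rows: (awb, current_status, received_at).
rows = [
    ("A1", "OUT FOR PICKUP", "2023-01-01 10:00:00"),
    ("A1", "OUT FOR PICKUP", "2023-01-05 10:00:00"),    # same status, 4 days apart
    ("A2", "PICKUP EXCEPTION", "2023-01-01 10:00:00"),
    ("A2", "DELIVERED", "2023-01-06 10:00:00"),         # latest status is not a pickup status
    ("A3", "OUT FOR PICKUP", "2023-01-01 10:00:00"),
    ("A3", "OUT FOR PICKUP", "2023-01-02 10:00:00"),    # same status, only 1 day apart
    ("A4", "OUT FOR PICKUP", "2023-01-01 10:00:00"),
    ("A4", "PICKUP EXCEPTION", "2023-01-02 10:00:00"),
    ("A4", "OUT FOR PICKUP", "2023-01-10 10:00:00"),    # run interrupted by another status
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_status (awb TEXT, current_status TEXT, received_at TEXT)"
)
conn.executemany("INSERT INTO order_status VALUES (?, ?, ?)", rows)

query = """
SELECT awb, current_status,
       MIN(received_at) AS first_seen, MAX(received_at) AS last_seen
FROM (
    SELECT os.*,
           ROW_NUMBER() OVER (PARTITION BY awb
                              ORDER BY received_at DESC) AS seqnum,
           ROW_NUMBER() OVER (PARTITION BY awb, current_status
                              ORDER BY received_at DESC) AS seqnum_2
    FROM order_status AS os
) AS ranked
-- Rows where both counters agree belong to the latest unbroken run of one status.
WHERE seqnum = seqnum_2
  AND current_status IN ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
GROUP BY awb, current_status
HAVING MAX(received_at) > datetime(MIN(received_at), '+2 days')
ORDER BY awb
"""

late_awbs = conn.execute(query).fetchall()
print(late_awbs)
```

Only A1 comes back. A4 is a useful case to look at: its earlier OUT FOR PICKUP row is separated from the latest one by a different status, so the two counters disagree and the earlier row is not counted as part of the current run.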

Related

mysql select row before max value

My requirement is to get the row before the one with max(created_at).
SELECT servicecalls_servicecall_id, max(created_at) FROM
service_followup_details where servicecalls_servicecall_id IN (SELECT
service_call_id from service_calls where status=2) group by
servicecalls_servicecall_id
How can I do it by MySQL query?
Get the max(created_at), then pick the latest row that is earlier than that max value.
select servicecalls_servicecall_id,created_at
from `service_followup_details`
where created_at<(
-- scalar subquery: no GROUP BY here, because a comparison subquery must return a single row
SELECT max(created_at)
FROM `service_followup_details`
where servicecalls_servicecall_id IN
(SELECT service_call_id from service_calls
where status=2)
)
order by created_at desc
limit 1
;
Well, you can achieve this with the help of the ROW_NUMBER() function.
Check my implementation below. I sort the records by created_at in DESC order. Of course, this is partitioned by servicecalls_servicecall_id, so the sequence restarts for each new servicecalls_servicecall_id. Then you can simply pick the records that have rn = 2 to get the closest record to the max created_at. The ranking has to sit in a subquery, because a window function result cannot be filtered in the WHERE clause of the same SELECT, and ROW_NUMBER is a reserved word in MySQL 8, so the alias is rn.
SELECT servicecalls_servicecall_id, created_at
FROM (
SELECT servicecalls_servicecall_id, created_at,
ROW_NUMBER() OVER (PARTITION BY servicecalls_servicecall_id ORDER BY created_at DESC) AS rn
FROM service_followup_details
WHERE servicecalls_servicecall_id IN (
SELECT service_call_id FROM service_calls WHERE status=2
)
) AS ranked
WHERE rn = 2;
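The "second row per group" idea can be checked on a few made-up rows. The following is a sketch in Python with the standard sqlite3 module, assuming SQLite 3.25 or newer in place of MySQL; the table contents are hypothetical, and the ranking is placed in a subquery so that the row number can be filtered on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE service_calls (service_call_id INTEGER, status INTEGER);
CREATE TABLE service_followup_details (
    servicecalls_servicecall_id INTEGER,
    created_at TEXT
);
-- Hypothetical data: calls 1 and 2 have status 2, call 3 does not.
INSERT INTO service_calls VALUES (1, 2), (2, 2), (3, 1);
INSERT INTO service_followup_details VALUES
    (1, '2023-01-01'), (1, '2023-01-02'), (1, '2023-01-03'),
    (2, '2023-01-05'),
    (3, '2023-01-01'), (3, '2023-01-02');
""")

query = """
SELECT servicecalls_servicecall_id, created_at
FROM (
    SELECT servicecalls_servicecall_id, created_at,
           ROW_NUMBER() OVER (PARTITION BY servicecalls_servicecall_id
                              ORDER BY created_at DESC) AS rn
    FROM service_followup_details
    WHERE servicecalls_servicecall_id IN (
        SELECT service_call_id FROM service_calls WHERE status = 2
    )
) AS ranked
WHERE rn = 2          -- rn = 1 is the max(created_at) row, rn = 2 the one before it
ORDER BY servicecalls_servicecall_id
"""

row_before_max = conn.execute(query).fetchall()
print(row_before_max)
```

Call 2 has a single follow-up, so it has no "row before the max" and does not appear in the result.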

Average days duration between multiple transactions and latest transaction gap mysql

I have the transaction table with the following columns :
TRANSACTION_ID, USER_ID, MERCHANT_NAME, TRANSACTION_DATE, AMOUNT
1) A query to calculate the time difference (in days) between the current and previous order of each customer.
2) The average time difference between two orders for every customer.
Note: Exclude users with a single transaction.
I tried the following code to get the 1st part of the query but it looks too messy
with t1 as
(Select USER_ID,TRANSACTION_DATE,Dense_rank() over(partition by USER_ID order by TRANSACTION_DATE desc) as r1
from CDM_Bill_Details
order by USER_ID, TRANSACTION_DATE desc)
Select t11.USER_ID, datediff(t11.TRANSACTION_DATE,t111.TRANSACTION_DATE) from t1 as t11,t1 as t111
where (t11.r1=1 and t111.r1=2) and (t11.USER_ID=t111.USER_ID)
Please try this:
with t2 as (select *,
lag(t1.TRANSACTION_DATE, 1) OVER (PARTITION BY USER_ID ORDER BY TRANSACTION_DATE) AS previous_date,
datediff(t1.TRANSACTION_DATE, lag(t1.TRANSACTION_DATE, 1) OVER (PARTITION BY USER_ID ORDER BY TRANSACTION_DATE)) AS diff_prev_curr
from CDM_Bill_Details t1)
select *,
avg(diff_prev_curr) OVER (PARTITION BY USER_ID) AS avg_days_diff
from t2
where previous_date is not null
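A quick way to sanity-check the LAG() approach is to run it on a few invented rows. This sketch uses Python's sqlite3 module (SQLite 3.25 or newer assumed); SQLite has no DATEDIFF, so the day difference is computed with julianday() instead, and only the two columns the query needs are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CDM_Bill_Details (USER_ID INTEGER, TRANSACTION_DATE TEXT)")
# Hypothetical data: user 1 has three transactions, user 2 only one.
conn.executemany(
    "INSERT INTO CDM_Bill_Details VALUES (?, ?)",
    [(1, "2023-01-01"), (1, "2023-01-04"), (1, "2023-01-10"), (2, "2023-02-01")],
)

query = """
WITH t2 AS (
    SELECT USER_ID, TRANSACTION_DATE,
           LAG(TRANSACTION_DATE) OVER (PARTITION BY USER_ID
                                       ORDER BY TRANSACTION_DATE) AS previous_date
    FROM CDM_Bill_Details
)
SELECT USER_ID, TRANSACTION_DATE,
       CAST(julianday(TRANSACTION_DATE) - julianday(previous_date) AS INTEGER)
           AS diff_prev_curr,
       AVG(julianday(TRANSACTION_DATE) - julianday(previous_date))
           OVER (PARTITION BY USER_ID) AS avg_days_diff
FROM t2
-- The first transaction of each user has no previous row, so users with a
-- single transaction drop out here.
WHERE previous_date IS NOT NULL
ORDER BY USER_ID, TRANSACTION_DATE
"""

gaps = conn.execute(query).fetchall()
for row in gaps:
    print(row)
```

User 1 has gaps of 3 and 6 days, so the average shown on both rows is 4.5. User 2 does not appear at all.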

MySQL: How to find the maximum length of an uninterrupted sequence of certain values?

Given a table:
date value
02.10.2019 1
03.10.2019 2
04.10.2019 2
05.10.2019 -1
06.10.2019 1
07.10.2019 1
08.10.2019 2
09.10.2019 2
10.10.2019 -1
11.10.2019 2
12.10.2019 1
How to find the maximum length of an uninterrupted sequence of positive values (4 in that example)?
This is a gaps-and-islands problem. One simple method is the difference of row numbers to identify the islands:
select min(date), max(date), count(*) as length
from (select t.*,
row_number() over (order by date) as seqnum_1,
row_number() over (partition by sign(value) order by date) as seqnum_2
from t
) t
group by sign(value), (seqnum_1 - seqnum_2)
order by count(*) desc
limit 1;
This is a little hard to explain. I find that if you stare at the results of the subquery, you will see how the difference identifies the groups.
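For anyone who wants to stare at those subquery results, here is a sketch that runs the difference-of-row-numbers idea on the sample table with Python's sqlite3 module. Assumptions: SQLite 3.25 or newer stands in for MySQL, the date column is renamed to day and stored in ISO format, and value > 0 is used as the partition key in place of sign(value) (zero would count as non-positive).

```python
import sqlite3

values = [1, 2, 2, -1, 1, 1, 2, 2, -1, 2, 1]      # 02.10.2019 .. 12.10.2019
rows = [(f"2019-10-{i + 2:02d}", v) for i, v in enumerate(values)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (day TEXT, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

numbered_sql = """
SELECT day, value,
       ROW_NUMBER() OVER (ORDER BY day) AS seqnum_1,
       ROW_NUMBER() OVER (PARTITION BY value > 0 ORDER BY day) AS seqnum_2
FROM t
"""

# Inspect the subquery: within one island seqnum_1 - seqnum_2 stays constant.
numbered = conn.execute(numbered_sql + " ORDER BY day").fetchall()
for day, value, seqnum_1, seqnum_2 in numbered:
    print(day, value, seqnum_1, seqnum_2, seqnum_1 - seqnum_2)

longest = conn.execute(f"""
SELECT MIN(day), MAX(day), COUNT(*) AS length
FROM ({numbered_sql}) AS n
WHERE value > 0                      -- only islands of positive values
GROUP BY seqnum_1 - seqnum_2
ORDER BY length DESC
LIMIT 1
""").fetchone()
print(longest)
```

The printed differences for the positive rows are 0, 0, 0, then 1, 1, 1, 1, then 2, 2, which are exactly the three islands; the longest one runs from 2019-10-06 to 2019-10-09.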
Assuming there are no gaps in the dates, another method finds the next non-positive number (if any):
select t.*,
-- later date first, so the run length is positive; the last run ends one day after max_date
datediff(coalesce(next_end_date, max_date + interval 1 day), date) as num
from (select t.*,
min(case when value <= 0 then date end) over (order by date desc) as next_end_date,
max(date) over () as max_date
from t
) t
where value > 0
order by num desc
limit 1;

MySQL Date difference between two rows

I have a TABLE with Columns: USER_ID,TIMESTAMP and ACTION
Every row tells me which user did what action at a certain time-stamp.
Example:
Alice starts the application at 2014-06-12 16:37:46
Alice stops the application at 2014-06-12 17:48:55
I want a list of users with the time difference between the first row in which they start the application and the last row in which they close it.
Here is how I'm trying to do it:
SELECT USER_ID,DATEDIFF(
(SELECT timestamp FROM MOBILE_LOG WHERE ACTION="START_APP" AND USER_ID="Alice" order by TIMESTAMP LIMIT 1),
(SELECT timestamp FROM MOBILE_LOG WHERE ACTION ="CLOSE_APP" AND USER_ID="Alice" order by TIMESTAMP LIMIT 1)
) AS Duration FROM MOBILE_LOG AS t WHERE USER_ID="Alice";
I ask for the DATEDIFF between two SELECT queries, but I just get a list of Alices with -2 as Duration.
Am I on the right track?
I think you should group this table by USER_ID and find the minimum date of "START_APP" and the maximum date of "CLOSE_APP" for each user. Also, in DATEDIFF you should pass the CLOSE_APP time first and then the START_APP time; that way you get a positive result.
SELECT USER_ID,
DATEDIFF(MAX(CASE WHEN ACTION="CLOSE_APP" THEN timestamp END),
MIN(CASE WHEN ACTION="START_APP" THEN timestamp END)
) AS Duration
FROM MOBILE_LOG AS t
GROUP BY USER_ID
SELECT user_id, start_time, close_time, DATEDIFF(close_time, start_time) duration
FROM
(SELECT MIN(timestamp) start_time, user_id FROM MOBILE_LOG WHERE action="START_APP" GROUP BY user_id) start_action
JOIN
(SELECT MAX(timestamp) close_time, user_id FROM MOBILE_LOG WHERE ACTION ="CLOSE_APP" GROUP BY user_id) close_action
USING (user_id)
WHERE USER_ID="Alice";
You make two "tables" with the earliest time for start for each user, and the latest time for close for each user. Then join them so that the actions of the same user are together.
Now that you have everything setup you can easily subtract between them.
You get an int value because you use the DATEDIFF function, which returns the number of days between two dates. If you want the hours, minutes and seconds between the two values, you have to use TIMEDIFF.
Try this:
select t1.USER_ID, TIMEDIFF(t2.timestamp, t1.timestamp)
from MOBILE_LOG t1, MOBILE_LOG t2
where (t1.action,t1.timestamp) in (select action, max(timestamp) from MOBILE_LOG t where t.ACTION = "START_APP" group by USER_ID)
and (t2.action,t2.timestamp) in (select action, max(timestamp) from MOBILE_LOG t where t.ACTION = "CLOSE_APP" group by USER_ID)
and t1.USER_ID = t2.USER_ID
It will show you the difference between the two latest dates (start date, end date) for every user.
P.S.: Sorry, I wrote it without a database at hand, so there may be some mistakes. If you have problems with (t1.action,t1.timestamp) in (select...), split it in two: where t1.action in (select ...) and t1.timestamp in (select ...)
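The day-count versus elapsed-time distinction behind DATEDIFF and TIMEDIFF can be illustrated outside the database. This is only an analogy in Python's datetime module, using Alice's two timestamps from the question; it is not how MySQL implements either function.

```python
from datetime import datetime

start = datetime(2014, 6, 12, 16, 37, 46)   # Alice starts the application
stop = datetime(2014, 6, 12, 17, 48, 55)    # Alice stops the application

# DATEDIFF-like: only the date parts are compared, so a same-day session gives 0.
day_diff = (stop.date() - start.date()).days

# TIMEDIFF-like: the full elapsed time between the two timestamps.
elapsed = stop - start

print(day_diff)       # 0
print(str(elapsed))   # 1:11:09
```

A session that starts and stops on the same day therefore always reports a duration of 0 days, however long it lasted, which is why a time-based difference is the better fit for this log.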

MySQL - How to select rows with the min(timestamp) per hour of a given date

I have a table of production readings and need to get a result set containing a row for the min(timestamp) for EACH hour.
The column layout is quite simple:
ID,TIMESTAMP,SOURCE_ID,SOURCE_VALUE
The data sample would look like:
123,'2013-03-01 06:05:24',PMPROD,12345678.99
124,'2013-03-01 06:15:17',PMPROD,88888888.99
125,'2013-03-01 06:25:24',PMPROD,33333333.33
126,'2013-03-01 06:38:14',PMPROD,44444444.44
127,'2013-03-01 07:12:04',PMPROD,55555555.55
128,'2013-03-01 10:38:14',PMPROD,44444444.44
129,'2013-03-01 10:56:14',PMPROD,22222222.22
130,'2013-03-01 15:28:02',PMPROD,66666666.66
Records are added to this table throughout the day and the source_value is already calculated, so no sum is needed.
I can't figure out how to get a row for the min(timestamp) for each hour of the current_date.
select *
from source_readings
use index(ID_And_Time)
where source_id = 'PMPROD'
and date(timestamp)=CURRENT_DATE
and timestamp =
( select min(timestamp)
from source_readings use index(ID_And_Time)
where source_id = 'PMPROD'
)
The above code, of course, gives me one record. I need one record for the min(hour(timestamp)) of the current_date.
My result set should contain the rows for IDs: 123,127,128,130. I've played with it for hours. Who can be my hero? :)
Try below:
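SELECT source_readings.* FROM source_readings
JOIN
(
-- ID is not selected here: with GROUP BY on the hour it would be an arbitrary row's ID
SELECT DATE_FORMAT(timestamp, '%Y-%m-%d %H') as current_hour, MIN(timestamp) as min_timestamp
FROM source_readings
WHERE source_id = 'PMPROD'
GROUP BY current_hour
) As reading_min
ON source_readings.timestamp = reading_min.min_timestamp
AND source_readings.source_id = 'PMPROD'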
SELECT a.*
FROM Table1 a
INNER JOIN
(
SELECT DATE(TIMESTAMP) date,
HOUR(TIMESTAMP) hour,
MIN(TIMESTAMP) min_date
FROM Table1
GROUP BY DATE(TIMESTAMP), HOUR(TIMESTAMP)
) b ON DATE(a.TIMESTAMP) = b.date AND
HOUR(a.TIMESTAMP) = b.hour AND
a.timestamp = b.min_date
With window function:
WITH ranked AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY HOUR(timestamp) ORDER BY timestamp) rn
FROM source_readings -- original table
WHERE date(timestamp)=CURRENT_DATE AND source_id = 'PMPROD' -- your custom filter
)
SELECT * -- this will contain `rn` column. you can select only necessary columns
FROM ranked
WHERE rn=1
I haven't tested it, but the basic idea is:
1) ROW_NUMBER() OVER(PARTITION BY HOUR(timestamp) ORDER BY timestamp)
This will give each row a number, starting from 1 for each hour, increasing by timestamp. The result might look like:
|rest of columns |rn
123,'2013-03-01 06:05:24',PMPROD,12345678.99,1
124,'2013-03-01 06:15:17',PMPROD,88888888.99,2
125,'2013-03-01 06:25:24',PMPROD,33333333.33,3
126,'2013-03-01 06:38:14',PMPROD,44444444.44,4
127,'2013-03-01 07:12:04',PMPROD,55555555.55,1
128,'2013-03-01 10:38:14',PMPROD,44444444.44,1
129,'2013-03-01 10:56:14',PMPROD,22222222.22,2
130,'2013-03-01 15:28:02',PMPROD,66666666.66,1
2) Then in the main query we select only the rows with rn=1, in other words, the rows that have the lowest timestamp in each hourly partition (the 1st row in each hour after sorting by timestamp).
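The two steps can be tried on the sample data from the question. This sketch uses Python's sqlite3 module, assuming SQLite 3.25 or newer instead of MySQL; strftime('%H', ...) plays the role of HOUR(), and the date is fixed to '2013-03-01' instead of CURRENT_DATE so that the sample rows match.

```python
import sqlite3

readings = [
    (123, "2013-03-01 06:05:24", "PMPROD", 12345678.99),
    (124, "2013-03-01 06:15:17", "PMPROD", 88888888.99),
    (125, "2013-03-01 06:25:24", "PMPROD", 33333333.33),
    (126, "2013-03-01 06:38:14", "PMPROD", 44444444.44),
    (127, "2013-03-01 07:12:04", "PMPROD", 55555555.55),
    (128, "2013-03-01 10:38:14", "PMPROD", 44444444.44),
    (129, "2013-03-01 10:56:14", "PMPROD", 22222222.22),
    (130, "2013-03-01 15:28:02", "PMPROD", 66666666.66),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_readings "
    "(ID INTEGER, timestamp TEXT, source_id TEXT, source_value REAL)"
)
conn.executemany("INSERT INTO source_readings VALUES (?, ?, ?, ?)", readings)

query = """
WITH ranked AS (
    SELECT ID, timestamp,
           ROW_NUMBER() OVER (PARTITION BY strftime('%H', timestamp)
                              ORDER BY timestamp) AS rn
    FROM source_readings
    WHERE date(timestamp) = '2013-03-01' AND source_id = 'PMPROD'
)
SELECT ID, timestamp
FROM ranked
WHERE rn = 1            -- earliest reading within each hour
ORDER BY timestamp
"""

first_per_hour = conn.execute(query).fetchall()
first_ids = [row[0] for row in first_per_hour]
print(first_ids)
```

The result contains IDs 123, 127, 128 and 130, which is the result set the question asks for.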