I am trying to write a MySQL query that returns the current streak based on status:
ID  Dated       Status
1   2022-03-08  1
2   2022-03-09  1
3   2022-03-10  0
4   2022-03-11  1
5   2022-03-12  0
6   2022-03-13  1
7   2022-03-14  1
8   2022-03-16  1
9   2022-03-18  0
10  2022-03-19  1
11  2022-03-20  1
In the above table, the current streak should be 2 (i.e., 2022-03-19 to 2022-03-20), based on status 1. Any help or ideas would be greatly appreciated!
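-- Number of status-1 rows that come after the most recent status-0 row
WITH cte AS (
    SELECT SUM(Status) OVER (ORDER BY Dated DESC) s1,
           SUM(NOT Status) OVER (ORDER BY Dated DESC) s2
    FROM yourTable  -- placeholder name; TABLE itself is a reserved word in MySQL
)
SELECT MAX(s1)
FROM cte
WHERE NOT s2;
-- Days between the latest status-1 date and the latest status-0 date
SELECT DATEDIFF(MAX(CASE WHEN Status THEN Dated END),
                MAX(CASE WHEN NOT Status THEN Dated END))
FROM yourTable;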
and so on...
This is a gaps and islands problem. In your case, you want the island of status 1 records which occurs last. We can use the difference in row numbers method, assuming you are using MySQL 8+.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY Dated) rn1,
ROW_NUMBER() OVER (PARTITION BY Status ORDER BY Dated) rn2
FROM yourTable
),
cte2 AS (
SELECT *, RANK() OVER (ORDER BY rn1 - rn2 DESC) rnk
FROM cte
WHERE Status = 1
)
SELECT ID, Dated, Status
FROM cte2
WHERE rnk = 1
ORDER BY Dated;
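To see why the difference in row numbers identifies an island, here is a minimal Python sketch of the same idea on the question's data. It is an illustration only, not part of the original answer; the names `statuses`, `islands`, and `latest_island` are made up for this example, and the list index stands in for both `rn1` and `ID` because the sample IDs happen to follow date order.

```python
from collections import defaultdict

# Status values from the question, in Dated order (IDs 1..11).
statuses = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1]

seen = defaultdict(int)      # running row number per status (rn2)
islands = defaultdict(list)  # (status, rn1 - rn2) -> row numbers in that island

for rn1, status in enumerate(statuses, start=1):
    seen[status] += 1
    rn2 = seen[status]
    # Within a run of equal statuses both counters advance together,
    # so rn1 - rn2 stays constant for the whole run.
    islands[(status, rn1 - rn2)].append(rn1)

# The latest status-1 island is the one with the largest difference.
latest_diff = max(diff for status, diff in islands if status == 1)
latest_island = islands[(1, latest_diff)]

print(latest_island)       # [10, 11]
print(len(latest_island))  # 2
```

The largest difference belongs to the most recent island because every earlier row of the other status pushes `rn1` ahead of `rn2`.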
We can use two one-row CTEs to find the latest date on which the status differed from the most recent status, and then count the records after that date.
**Schema (MySQL v8.0)**
create table t(
ID int,
Dated date,
Status int);
insert into t values
(1,'2022-03-08',1),
(2,'2022-03-09',1),
(3,'2022-03-10',0),
(4,'2022-03-11',1),
(5,'2022-03-12',0),
(6,'2022-03-13',1),
(7,'2022-03-14',1),
(8,'2022-03-16',1),
(9,'2022-03-18',0),
(10,'2022-03-19',1),
(11,'2022-03-20',1);
---
**Query #1**
with latest AS
(SELECT
dated lastDate,
status lastStatus
from t
order by dated desc
limit 1 ),
lastDiff as
(select MAX(dated) diffDate
from t,latest
where not status = lastStatus
)
select count(*)
from t ,lastDiff
where dated > diffDate;
| count(*) |
| -------- |
| 2 |
---
[View on DB Fiddle](https://www.db-fiddle.com/)
We could also use DATEDIFF() to find the number of days the streak has lasted, which may be more informative than COUNT() because some days have no record.
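The two measures can be compared with a small Python sketch that follows the same steps as the query above (latest row, latest differing date, then count or day difference). This is an illustrative assumption of how the logic plays out, not the answerer's code; the variable names are invented for the example.

```python
from datetime import date

# (Dated, Status) rows from the question.
rows = [
    (date(2022, 3, 8), 1), (date(2022, 3, 9), 1), (date(2022, 3, 10), 0),
    (date(2022, 3, 11), 1), (date(2022, 3, 12), 0), (date(2022, 3, 13), 1),
    (date(2022, 3, 14), 1), (date(2022, 3, 16), 1), (date(2022, 3, 18), 0),
    (date(2022, 3, 19), 1), (date(2022, 3, 20), 1),
]

# "latest": the most recent row (dates are unique, so max() picks by date).
last_date, last_status = max(rows)

# "lastDiff": the latest date whose status differs from the latest status.
# Assumes at least one such row exists; max() raises ValueError otherwise.
diff_date = max(d for d, s in rows if s != last_status)

record_count = sum(1 for d, _ in rows if d > diff_date)  # like COUNT(*)
day_span = (last_date - diff_date).days                  # like DATEDIFF()

print(record_count)  # 2
print(day_span)      # 2
```

On this data both values are 2; they would differ for a streak that skips a calendar day.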
Trying to get the 2nd transaction month details for all the customers
Date User_id amount
2021-11-01 1 100
2021-11-21 1 200
2021-12-20 2 110
2022-01-20 2 200
2022-02-04 1 50
2022-02-21 1 100
2022-03-22 2 200
For every customer, get all the records in the month of their 2nd transaction month (there can be multiple transactions in a month, or even in a day, by a particular user).
Expected Output
Date User_id amount
2022-02-04 1 50
2022-02-21 1 100
2022-01-20 2 200
You can use dense_rank:
select Date, User_id, amount from
(select *, dense_rank() over(partition by User_id order by year(Date), month(date)) r
from table_name) t
where r = 2;
If dense_rank is an option you can:
with cte1 as (
select *, extract(year_month from date) as yyyymm
from t
), cte2 as (
select *, dense_rank() over (partition by user_id order by yyyymm) as dr
from cte1
)
select *
from cte2
where dr = 2
Note that it is possible to write the above using one cte.
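As a cross-check of what the dense rank over year and month selects, here is a short Python sketch on the question's data. It is only an illustration under the assumption that "2nd transaction month" means the second distinct calendar month per user; the names `tx`, `second_month`, and `result` are hypothetical.

```python
from datetime import date

# (Date, User_id, amount) rows from the question.
tx = [
    (date(2021, 11, 1), 1, 100), (date(2021, 11, 21), 1, 200),
    (date(2021, 12, 20), 2, 110), (date(2022, 1, 20), 2, 200),
    (date(2022, 2, 4), 1, 50), (date(2022, 2, 21), 1, 100),
    (date(2022, 3, 22), 2, 200),
]

# Collect the distinct (year, month) pairs per user.
months_by_user = {}
for d, user, _ in tx:
    months_by_user.setdefault(user, set()).add((d.year, d.month))

# Dense rank 2 corresponds to the second distinct month in sorted order.
second_month = {
    user: sorted(months)[1]
    for user, months in months_by_user.items()
    if len(months) >= 2
}

# Keep every row that falls in that month for its user.
result = [
    row for row in tx
    if second_month.get(row[1]) == (row[0].year, row[0].month)
]

for d, user, amount in sorted(result, key=lambda r: (r[1], r[0])):
    print(d, user, amount)
```

Users with transactions in only one month produce no rows, which matches `where r = 2` returning nothing for them.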
Suppose I have the following set in a table:
empid  start_time  end_time
1      8           9
1      9           10
1      11          12
1      12          13
1      13          14
1      14          15
I want an SQL query (or an SQL process) that converts the previous set into the following set:
empid  start_time  end_time
1      8           10
1      11          15
In other words, if the end_time of a record equals the start_time of the next record, the two records should be merged into one that spans both (without modifying the main table, of course).
This is a type of gaps-and-islands problem. In this case, you can use lag to see where an "island" starts, then use a cumulative sum to assign the same number within an island and aggregate:
select empid, min(start_time), max(end_time)
from (select t.*,
sum(case when prev_end_time = start_time then 0 else 1 end) over (partition by empid order by start_time) as island
from (select t.*,
lag(end_time) over (partition by empid order by start_time) as prev_end_time
from t
) t
) t
group by empid, island;
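The lag-plus-cumulative-sum idea can be traced step by step with a small Python sketch on the question's data. This is a hedged illustration rather than the answerer's code; `prev`, `island`, and `groups` are names chosen here to mirror `prev_end_time`, the running sum, and the final `group by`.

```python
# (empid, start_time, end_time) rows from the question.
rows = [(1, 8, 9), (1, 9, 10), (1, 11, 12), (1, 12, 13), (1, 13, 14), (1, 14, 15)]

island = 0     # running island number (the cumulative sum)
prev = None    # (empid, end_time) of the previous row (the lag)
groups = {}    # (empid, island) -> (min start_time, max end_time)

for empid, start, end in sorted(rows):
    # A new island starts when this row does not continue the previous one.
    if prev != (empid, start):
        island += 1
    key = (empid, island)
    lo, hi = groups.get(key, (start, end))
    groups[key] = (min(lo, start), max(hi, end))
    prev = (empid, end)

merged = [(empid, lo, hi) for (empid, _), (lo, hi) in groups.items()]
print(merged)  # [(1, 8, 10), (1, 11, 15)]
```

Unlike the SQL, the counter here is shared across employees, but grouping by `(empid, island)` gives the same islands.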
I have this table below and want to get the min value of quantity, max value of quantity, first value of quantity and last value of quantity. The new table should be grouped by date with a 1 day interval.
id item quantity date
1 xLvCm 2 2020-01-10 19:15:03
1 UBizL 4 2020-01-10 20:16:41
1 xLvCm 1 2020-01-10 21:21:12
1 xLvCm 3 2020-01-11 11:14:00
1 UBizL 1 2020-01-11 15:01:10
1 moJEe 4 2020-01-12 00:15:50
1 moJEe 1 2020-01-12 02:11:23
1 UBizL 1 2020-01-12 04:16:17
1 KiZoX 3 2020-01-13 10:10:02
1 KiZoX 2 2020-01-13 19:05:40
1 KiZoX 1 2020-01-13 20:14:33
This is the expected table result
min(quantity) max(quantity) first(quantity) last(quantity) date
1 4 2 1 2020-01-10 19:15:03
1 3 3 1 2020-01-11 11:14:00
1 4 4 1 2020-01-12 00:15:50
1 3 3 1 2020-01-13 10:10:02
The SQL query I have tried is
SELECT MIN(quantity), MAX(quantity), FIRST(quantity), LAST(quantity) FROM tablename GROUP BY date
I can't figure out how to include the first and last values of quantity and group by day (like 10, 11, 12, 13) instead of date like (2020-01-10 19:15:03)
It is important to state the database tool you are using because of the different functionality available in each of them. But if you were using Snowflake this is something I would try:
select distinct day(date) as day_of_month,
min(quantity) over (partition by day(date) order by date range between unbounded preceding and UNBOUNDED FOLLOWING) min_quantity,
max(quantity) over (partition by day(date) order by date range between unbounded preceding and UNBOUNDED FOLLOWING) max_quantity ,
last_value(QUANTITY) over (partition by day(date) order by date range BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) as last_quantity,
first_value(QUANTITY) over (partition by day(date) order by date range BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) as first_quantity
from demo_db.staging.test
It is important to note that this is a costly query. If your table is huge this might take too long.
A common approach to this problem is to use window functions and aggregation. Here is one method:
SELECT date(date), MIN(quantity), MAX(quantity),
MAX(CASE WHEN seqnum_a = 1 THEN quantity END) as first_quantity,
MAX(CASE WHEN seqnum_d = 1 THEN quantity END) as last_quantity
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY date(date) ORDER BY date) as seqnum_a,
ROW_NUMBER() OVER (PARTITION BY date(date) ORDER BY date desc) as seqnum_d
FROM tablename t
) t
GROUP BY date(date);
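For reference, the expected per-day values can be computed from the question's data with a few lines of Python. This is a sketch for checking results, not a replacement for the query; the names `rows`, `by_day`, and `summary` are made up for the example.

```python
from datetime import datetime

# (date, quantity) pairs from the question.
rows = [
    ("2020-01-10 19:15:03", 2), ("2020-01-10 20:16:41", 4),
    ("2020-01-10 21:21:12", 1), ("2020-01-11 11:14:00", 3),
    ("2020-01-11 15:01:10", 1), ("2020-01-12 00:15:50", 4),
    ("2020-01-12 02:11:23", 1), ("2020-01-12 04:16:17", 1),
    ("2020-01-13 10:10:02", 3), ("2020-01-13 19:05:40", 2),
    ("2020-01-13 20:14:33", 1),
]

# Group quantities by calendar day, in timestamp order.
by_day = {}
for ts, qty in sorted((datetime.fromisoformat(t), q) for t, q in rows):
    by_day.setdefault(ts.date(), []).append(qty)

# (min, max, first, last) per day; first/last rely on the timestamp ordering.
summary = {
    day.isoformat(): (min(q), max(q), q[0], q[-1])
    for day, q in by_day.items()
}

for day, values in summary.items():
    print(day, values)
```

The first and last values come from list positions after sorting, which plays the role of `seqnum_a = 1` and `seqnum_d = 1` in the query.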
Try this:
select A.minquantity,A.maxquantity,B.firstquantity,C.lastquantity,A.date from (
(select min(quantity) as minquantity,max(quantity) as maxquantity,Date(date) as date
from Test group by Date(date))A
join
(select Date(date) as date,quantity as firstquantity from
Test where date in (select min(date) from Test group by Date(date)))B
on A.date=B.date
join
(select Date(date)as date,quantity as lastquantity from Test
where date in (select max(date) from Test group by Date(date)))C
on A.date=C.date
);
Output:
1 4 2 1 2020-01-10
1 3 3 1 2020-01-11
1 4 4 1 2020-01-12
1 3 3 1 2020-01-13
Data in table rider_status will be like:
rider_id online_status date_time
2 1 2019-10-17 08:00:40
3 1 2019-10-17 09:30:30
2 0 2019-10-17 12:30:40
2 1 2019-10-17 14:50:50
2 0 2019-10-17 18:50:50
Online status 0 = not working
Online status 1 = working
Now I want to calculate the total working hours of rider '2' for a particular date (for example '2019-10-17'). I also want to calculate that rider's total hours for a particular date range (for example '2019-10-05' to '2019-10-30').
My answer for rider_id '2' should be like:
12:30:40 - 08:00:40 = 04:30:00
18:50:50 - 14:50:50 = 04:00:00
--------
Total working hour = 08:30:00
Assuming that the online status always alternates 1, 0, 1, 0 and so on, you can use LEAD() and LAG() (available in MySQL 8.0 and later) to match each log-off to its log-on. Make sure you partition the window by the rider ID.
https://www.geeksforgeeks.org/mysql-lead-and-lag-function/
This also applies to MSSQL from SQL Server 2012 onwards, which is when LEAD() and LAG() were introduced.
You can then use TIMEDIFF() to get the difference between the two times:
https://www.w3resource.com/mysql/date-and-time-functions/mysql-timediff-function.php
In MSSQL you would use DATEDIFF(MINUTE,Time1,Time2)
Assuming that the 1s and 0s are interleaved for each person, you can use:
select rider_id,
sum(timestampdiff(second, date_time, next_date_time)) as time_in_seconds
from (select t.*,
lead(date_time) over (partition by rider_id order by date_time) as next_date_time
from t
) t
where online_status = 1
group by rider_id;
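The pairing of each online row with the next event can be checked against the question's data with a short Python sketch. It assumes, as the answer does, that the 1s and 0s alternate for each rider; the names `log`, `by_rider`, and `totals` are illustrative only.

```python
from datetime import datetime, timedelta

# (rider_id, online_status, date_time) rows from the question.
log = [
    (2, 1, "2019-10-17 08:00:40"),
    (3, 1, "2019-10-17 09:30:30"),
    (2, 0, "2019-10-17 12:30:40"),
    (2, 1, "2019-10-17 14:50:50"),
    (2, 0, "2019-10-17 18:50:50"),
]

# Group events per rider.
by_rider = {}
for rider, status, ts in log:
    by_rider.setdefault(rider, []).append((datetime.fromisoformat(ts), status))

totals = {}
for rider, events in by_rider.items():
    events.sort()
    total = timedelta()
    # Pair each event with the next one for the same rider (the LEAD).
    for (ts, status), (next_ts, _) in zip(events, events[1:]):
        if status == 1:  # only time spent online counts
            total += next_ts - ts
    totals[rider] = total

print(totals[2])  # 8:30:00
```

A rider whose last event is still online (rider 3 here) contributes nothing for that open interval, just as the NULL from `lead()` is ignored by `sum()`.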
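-- For each event, the time spent in that status until the next status change.
-- Rows with online_status = 1 are the working periods.
SELECT rider_id, online_status, date_time,
       CASE WHEN LEAD(online_status) OVER (PARTITION BY rider_id ORDER BY date_time) <> online_status
            THEN TIMEDIFF(LEAD(date_time) OVER (PARTITION BY rider_id ORDER BY date_time), date_time)
       END AS time_in_status
FROM rider_status;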
I've got a table named 'T1' which I want to transpose and have date_from and date_to columns. The table itself has the data of who is a manager of a particular company. So I want to know since when to when a user was responsible for a company. I can do it easily in BigQuery with the following query but I'm struggling to do the same in MySQL.
WITH T1 AS ( SELECT 9 as rating, 'company1' as cid, 100 as user, '2017-08-20' AS created UNION ALL
SELECT 9 as rating, 'company1' as cid, 101 as user, '2017-08-22' AS created UNION ALL
SELECT 10 as rating, 'company1' as cid, 101 as user, '2017-08-21' AS created
)
SELECT cid, rating, user, CAST(created as DATE) as date_from,
CAST(COALESCE(MIN(CAST(created as DATE)) OVER(PARTITION BY cid, rating ORDER BY CAST(created as DATE) DESC ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),
DATE_ADD(current_date(), INTERVAL 1 DAY)) as DATE) AS date_to
FROM T1
The original table format:
rating cid user created
9 company1 100 2017-08-20
9 company1 101 2017-08-22
10 company1 101 2017-08-21
The final table should have the following format:
cid rating user date_from date_to
1 company1 9 101 2017-08-22 2018-02-24
2 company1 9 100 2017-08-20 2017-08-22
3 company1 10 101 2017-08-21 2018-02-24
Thank you!
You really need lead(), which is not available in MySQL before version 8.0 (and which would make the BigQuery query simpler). One method that works in older versions uses a correlated subquery:
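select t1.*, t1.created as date_from,
       coalesce((select min(tt1.created)
                 from t1 tt1
                 where tt1.cid = t1.cid
                   and tt1.rating = t1.rating  -- same partition as in the question (cid, rating)
                   and tt1.created > t1.created
                ), date_add(current_date(), interval 1 day)  -- open-ended rows, as in the question
               ) as date_to
from t1;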
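The intended result can be sketched in Python to check the correlated-subquery logic against the sample rows. This assumes, following the BigQuery query in the question, that rows are compared within the same cid and rating and that open-ended rows get tomorrow's date; `rows`, `open_end`, and `result` are names invented for the example.

```python
from datetime import date, timedelta

# (rating, cid, user, created) rows from the question.
rows = [
    (9, "company1", 100, date(2017, 8, 20)),
    (9, "company1", 101, date(2017, 8, 22)),
    (10, "company1", 101, date(2017, 8, 21)),
]

# Fallback for rows with no later record, mirroring current_date + 1 day.
open_end = date.today() + timedelta(days=1)

result = []
for rating, cid, user, created in rows:
    # Later "created" dates in the same (cid, rating) group.
    later = [c for r, k, _, c in rows if (k, r) == (cid, rating) and c > created]
    date_to = min(later, default=open_end)
    result.append((cid, rating, user, created, date_to))

for row in result:
    print(row)
```

Only the first row has a later record in its group, so it ends on 2017-08-22; the other two stay open-ended.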