I have a calendar table that looks like this:
calendar_date
2022-01-01
2022-01-02
2022-01-03
2022-01-04
And another table that has a date_from and a date_to field. If date_to is empty, that row is valid until a more recent date appears.
date_from | date_to | value
:--------- | :--------- | ----:
2022-01-01 | null | 1
2022-01-04 | 2022-01-05 | 2
2022-01-07 | null | 3
My expected result is the following:
calendar_date | value
:------------ | ----:
2022-01-01 | 1
2022-01-02 | 1
2022-01-03 | 1
2022-01-04 | 2
2022-01-05 | 2
2022-01-06 | 1
2022-01-07 | 3
I've tried joining the tables on date_between but it leaves a gap between when 2 ends and 3 begins... This is my query so far:
select
c.calendar_date, v.value
from calendar c
left join values v
on v.start_date <= c.calendar_date
and (
c.calendar_date between v.start_date
and ifnull(
v.stop_date,
ifnull(
(
select min(v2.start_date)
from values v2
where v2.start_date > v.start_date
),
date_add(curdate(), interval 2 year)
)
)
)
Like this:
calendar_date | value
:------------ | ----:
2022-01-01 | 1
2022-01-02 | 1
2022-01-03 | 1
2022-01-04 | 2
2022-01-05 | 2
2022-01-06 | null
2022-01-07 | 3
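One way to read the rule is: for each calendar date, take the row with the latest date_from that has already started and has not yet ended. An open-ended row (date_to is null) then stays valid in the background and takes over again once a later bounded row expires. The sketch below checks that reading against the sample data using Python's built-in sqlite3 module. The table name `vals` and the seven-row calendar are assumptions made for illustration; the correlated subquery itself uses only standard SQL plus `LIMIT`, so it should carry over to MySQL largely unchanged.

```python
import sqlite3


def resolve_values():
    """Return (calendar_date, value) pairs for the sample data."""
    con = sqlite3.connect(":memory:")
    # Hypothetical schema: `vals` stands in for the value table
    # (`values` itself is a reserved word).
    con.executescript("""
        CREATE TABLE calendar (calendar_date TEXT);
        CREATE TABLE vals (date_from TEXT, date_to TEXT, value INTEGER);
        INSERT INTO calendar VALUES
            ('2022-01-01'), ('2022-01-02'), ('2022-01-03'), ('2022-01-04'),
            ('2022-01-05'), ('2022-01-06'), ('2022-01-07');
        INSERT INTO vals VALUES
            ('2022-01-01', NULL,         1),
            ('2022-01-04', '2022-01-05', 2),
            ('2022-01-07', NULL,         3);
    """)
    rows = con.execute("""
        SELECT c.calendar_date,
               -- latest row that has started and has not yet ended
               (SELECT v.value
                  FROM vals v
                 WHERE v.date_from <= c.calendar_date
                   AND (v.date_to IS NULL OR v.date_to >= c.calendar_date)
                 ORDER BY v.date_from DESC
                 LIMIT 1) AS value
          FROM calendar c
         ORDER BY c.calendar_date
    """).fetchall()
    con.close()
    return rows
```

With this reading, 2022-01-06 falls back to value 1 because the row for value 2 has ended and the row for value 3 has not started yet.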
Related
I have a database that looks like this:
ID | Sale_Date(YYYY-MM-DD) | Total_Volume
--: | :-------------------- | -----------:
123 | 2022-01-01 | 0
123 | 2022-01-02 | 2
123 | 2022-01-03 | 5
456 | 2022-04-06 | 38
456 | 2022-04-07 | 40
456 | 2022-04-08 | 45
I want to derive a daily sale column from Total_Volume by subtracting the total volume on date x-1 from the total volume on date x, for each ID.
ID | Sale_Date(YYYY-MM-DD) | Total_Volume | Daily_Sale
--: | :-------------------- | -----------: | ---------:
123 | 2022-01-01 | 0 | 0
123 | 2022-01-02 | 2 | 2
123 | 2022-01-03 | 5 | 3
456 | 2022-04-06 | 38 | 38
456 | 2022-04-07 | 40 | 2
456 | 2022-04-08 | 45 | 5
My initial attempt used a rank function and a self join, but it didn't produce the correct result.
with x as (
select
distinct t1.ID,
t1.Sale_Date,
t1.Total_volume,
rank() over (partition by ID order by Sale_Date) as ranker
from t t1 order by t1.Sale_Date)
select t2.ID, t2.ranker, t2.Sale_date, t1.Total_volume, t1.Total_volume - t2.Total_volume as Daily_sale
from x t1, x t2 where t1.ID = t2.ID and t2.ranker = t1.ranker-1 order by t1.ID;
You should use:
the LAG window function to retrieve the previous "Total_Volume" value within each "ID"
the COALESCE function to fall back to "Total_Volume" on the first row of each "ID", where LAG returns NULL
Subtract the previous Total_Volume from the current one, and let COALESCE return Total_Volume itself when the LAG value is NULL.
SELECT *,
COALESCE(`Total_Volume`
-LAG(`Total_Volume`) OVER(PARTITION BY `ID`
ORDER BY `Sale_Date(YYYY-MM-DD)`), `Total_Volume`) AS `Daily_Sale`
FROM tab
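As a quick check of the LAG plus COALESCE idea against the sample rows, here is a minimal sketch that runs essentially the same query through Python's sqlite3 module (SQLite 3.25 or later is needed for window functions). The column name `Sale_Date` is shortened from the original for readability, and the table name `tab` follows the answer; both are assumptions about the real schema.

```python
import sqlite3


def daily_sales():
    """Return (ID, Sale_Date, Total_Volume, Daily_Sale) for the sample data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE tab (ID INTEGER, Sale_Date TEXT, Total_Volume INTEGER);
        INSERT INTO tab VALUES
            (123, '2022-01-01', 0), (123, '2022-01-02', 2), (123, '2022-01-03', 5),
            (456, '2022-04-06', 38), (456, '2022-04-07', 40), (456, '2022-04-08', 45);
    """)
    rows = con.execute("""
        SELECT ID, Sale_Date, Total_Volume,
               -- LAG is NULL on the first row of each ID, so the subtraction
               -- is NULL there and COALESCE falls back to Total_Volume
               COALESCE(
                   Total_Volume - LAG(Total_Volume) OVER (
                       PARTITION BY ID ORDER BY Sale_Date),
                   Total_Volume) AS Daily_Sale
          FROM tab
         ORDER BY ID, Sale_Date
    """).fetchall()
    con.close()
    return rows
```

The Daily_Sale values come out as 0, 2, 3 for ID 123 and 38, 2, 5 for ID 456, matching the expected table in the question.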
I have this table with my ID, a "Rank" giving the order in which the messages were sent, and the message_send_time.
ID Rank message_send_time
1 1 2022-01-01 00:33:04
1 2 2022-01-01 00:34:04
2 1 2022-01-01 00:30:04
2 2 2022-01-01 00:32:04
2 3 2022-01-01 00:33:04
I want to calculate the interval in minutes between consecutive messages within each ID, ordered by the rank of the messages. How can I calculate this in SQL?
ID Rank message_send_time Interval_time_minutes
1 1 2022-01-01 00:33:04
1 2 2022-01-01 00:34:04 1
2 1 2022-01-01 00:30:04
2 2 2022-01-01 00:32:04 2
2 3 2022-01-01 00:33:04 1
You can try to use lag window function with TIMESTAMPDIFF
Query #1
select
id,
`Rank`,
message_send_time,
TIMESTAMPDIFF(MINUTE,lag(message_send_time,1) over (partition by id order by `Rank`),message_send_time) Interval_time_minutes
from T;
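id | Rank | message_send_time | Interval_time_minutes
-: | ---: | :------------------ | --------------------:
1 | 1 | 2022-01-01 00:33:04 | null
1 | 2 | 2022-01-01 00:34:04 | 1
2 | 1 | 2022-01-01 00:30:04 | null
2 | 2 | 2022-01-01 00:32:04 | 2
2 | 3 | 2022-01-01 00:33:04 | 1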
We can use the functions timediff() and lag():
create table messages(
ID int,
message_send_time timestamp);
insert into messages values
(1,' 2022-01-01 00:33:04'),
(1,' 2022-01-01 00:34:04'),
(2,' 2022-01-01 00:30:04'),
(2,' 2022-01-01 00:32:04'),
(2,' 2022-01-01 00:33:04');
select
id,
rank() over(partition by id order by message_send_time) "Rank",
message_send_time,
timediff(
message_send_time,
lag(message_send_time,1) over (partition by id order by message_send_time)
) as "Interval"
from messages;
id | Rank | message_send_time | Interval
-: | ---: | :------------------ | :-------
1 | 1 | 2022-01-01 00:33:04 | null
1 | 2 | 2022-01-01 00:34:04 | 00:01:00
2 | 1 | 2022-01-01 00:30:04 | null
2 | 2 | 2022-01-01 00:32:04 | 00:02:00
2 | 3 | 2022-01-01 00:33:04 | 00:01:00
select
id
, `Rank`
, message_send_time,
TIMESTAMPDIFF(
MINUTE
, LAG(message_send_time,1) over (
partition by id order by `Rank`
)
, message_send_time
) Interval_time_minutes
from T;
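To make the LAG-per-partition logic concrete outside the database, the following is a small plain-Python sketch of the same computation: rows are grouped by id, ordered by rank, and each timestamp is compared with the previous one in its group. The function name and the tuple layout are illustrative assumptions, not part of the original answers.

```python
from datetime import datetime
from itertools import groupby

FMT = "%Y-%m-%d %H:%M:%S"


def minute_intervals(rows):
    """rows: iterable of (id, rank, 'YYYY-MM-DD HH:MM:SS').

    Returns (id, rank, timestamp, minutes), where minutes is the number of
    whole minutes since the previous message of the same id, or None for
    the first message (the counterpart of LAG returning NULL).
    """
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        previous = None
        for msg_id, rank, sent in group:
            current = datetime.strptime(sent, FMT)
            if previous is None:
                minutes = None
            else:
                # Whole minutes; matches TIMESTAMPDIFF(MINUTE, ...) for
                # non-negative gaps.
                minutes = int((current - previous).total_seconds() // 60)
            out.append((msg_id, rank, sent, minutes))
            previous = current
    return out


messages = [
    (1, 1, "2022-01-01 00:33:04"),
    (1, 2, "2022-01-01 00:34:04"),
    (2, 1, "2022-01-01 00:30:04"),
    (2, 2, "2022-01-01 00:32:04"),
    (2, 3, "2022-01-01 00:33:04"),
]
```

Calling `minute_intervals(messages)` yields the intervals None, 1, None, 2, 1 in row order.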
There is a table in my database (MySQL 5.7.36). I am trying to count consecutive days,
with one condition:
if the consecutive-day count goes above 7,
the count is reset to zero (so the next day starts again at 1).
DATE_SERV
2022-01-01
2022-01-02
2022-01-03
2022-01-05
2022-01-06
2022-01-09
2022-01-10
2022-01-11
My expected table is:
DATE_SERV | day_consecutive
:--------- | --------------:
2022-01-01 | 1
2022-01-02 | 2
2022-01-03 | 3
2022-01-05 | 1
2022-01-06 | 2
2022-01-09 | 1
2022-01-10 | 2
2022-01-11 | 3
2022-01-12 | 4
2022-01-13 | 5
2022-01-14 | 6
2022-01-15 | 7
2022-01-16 | 1
2022-01-17 | 2
I wrote this up before, thinking you were using MySQL 8.x (which supports window functions, unfortunately 5.x does not). Anyway, just posting it in case it's useful to someone else ...
You can adapt the approach from this blog Gaps and Islands Across Date Ranges. First identify the "islands" or groups of consecutive dates
SELECT
DATE_SERV
, SUM( IF( DATEDIFF(DATE_SERV, Prev_Date) = 1, 0, 1) ) OVER(
ORDER BY DATE_SERV
) AS DateGroup_Num
FROM
(
SELECT DATE_SERV
, LAG(DATE_SERV,1) OVER (
ORDER BY DATE_SERV
) AS Prev_Date
FROM YourTable
) grp
Which produces this result:
DATE_SERV | DateGroup_Num
:--------- | ------------:
2022-01-01 | 1
2022-01-02 | 1
2022-01-03 | 1
2022-01-05 | 2
2022-01-06 | 2
2022-01-09 | 3
2022-01-10 | 3
2022-01-11 | 3
Then use MIN(...) as a window function to find the earliest date per group, and display the number of consecutive days since that date:
SELECT
t.DATE_SERV
, DATEDIFF(
t.DATE_SERV
, MIN(t.DATE_SERV) OVER(
PARTITION BY t.DateGroup_Num
ORDER BY t.DATE_SERV
)
) +1 AS Consecutive_Days
FROM (
SELECT
DATE_SERV
, SUM( IF( DATEDIFF(DATE_SERV, Prev_Date) = 1, 0, 1) ) OVER(
ORDER BY DATE_SERV
) AS DateGroup_Num
FROM
(
SELECT DATE_SERV
, LAG(DATE_SERV,1) OVER (
ORDER BY DATE_SERV
) AS Prev_Date
FROM YourTable
) grp
) t
Results:
DATE_SERV | Consecutive_Days
:--------- | ---------------:
2022-01-01 | 1
2022-01-02 | 2
2022-01-03 | 3
2022-01-05 | 1
2022-01-06 | 2
2022-01-09 | 1
2022-01-10 | 2
2022-01-11 | 3
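The queries above number the days within each island but do not apply the question's extra condition that the count restarts once it passes 7. As a sketch of that rule only, and not a MySQL 5.7 solution, here is the counting logic in plain Python; the function name and the `reset_after` parameter are illustrative assumptions.

```python
from datetime import date, timedelta


def consecutive_days(dates, reset_after=7):
    """Number each date by its position in a run of consecutive days.

    The counter restarts at 1 after a gap, and also once a run has
    reached `reset_after` days.
    """
    result = []
    previous = None
    count = 0
    for day in sorted(dates):
        is_next_day = previous is not None and day - previous == timedelta(days=1)
        if is_next_day and count < reset_after:
            count += 1
        else:
            count = 1  # gap in the dates, or the run hit the limit
        result.append((day, count))
        previous = day
    return result


# Sample dates from the question.
sample = [date(2022, 1, d) for d in (1, 2, 3, 5, 6, 9, 10, 11)]
# A longer unbroken run, as in the question's expected table.
long_run = [date(2022, 1, d) for d in range(9, 18)]
```

For `sample` the counts are 1, 2, 3, 1, 2, 1, 2, 3; for `long_run` (2022-01-09 through 2022-01-17) they are 1 through 7 followed by 1, 2, which is the behaviour shown in the question's expected table.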
How can I do this in MySQL so that I get all users for every day in the selected date range, even those with no records (absent) on a given date?
attendance_tbl
ID | user_id | time_in | time_out | created_at
-: | :------ | :------------------ | :------------------ | :------------------
1 | 001 | 2022-01-01 08:00:00 | 2022-01-01 17:00:00 | 2022-01-03 08:00:00
2 | 002 | 2022-01-01 08:15:24 | 2022-01-01 17:00:00 | 2022-01-03 08:15:24
3 | 003 | 2022-01-02 08:44:55 | 2022-01-02 17:00:00 | 2022-01-04 08:44:55
4 | 004 | 2022-01-03 08:40:22 | 2022-01-03 17:00:00 | 2022-01-04 08:40:22
users_tbl
ID | user_id | f_name
-: | :------ | :----------
1 | 001 | John Doe
2 | 002 | Jane Doe
3 | 003 | Ronal Black
4 | 004 | Lucy White
Expected output for the date range 2022-01-01 to 2022-01-03, listing every user's full name for each day:
ID | user_id | Date | f_name | time_in | time_out | created_at
-: | :------ | :--------- | :---------- | :------------------ | :------------------ | :------------------
1 | 001 | Jan 1 2022 | John Doe | 2022-01-01 08:00:00 | 2022-01-01 17:00:00 | 2022-01-03 08:00:00
2 | 002 | Jan 1 2022 | Jane Doe | 2022-01-01 08:15:24 | 2022-01-01 08:15:24 | 2022-01-03 08:00:00
3 | 003 | Jan 1 2022 | Ronal Black | | |
4 | 004 | Jan 1 2022 | Lucy White | | |
5 | 001 | Jan 2 2022 | John Doe | | |
6 | 002 | Jan 2 2022 | Jane Doe | | |
7 | 003 | Jan 2 2022 | Ronal Black | 2022-01-02 17:00:00 | 2022-01-02 17:00:00 | 2022-01-02 17:00:00
8 | 004 | Jan 2 2022 | Lucy White | | |
9 | 001 | Jan 3 2022 | John Doe | | |
10 | 002 | Jan 3 2022 | Jane Doe | | |
11 | 003 | Jan 3 2022 | Ronal Black | | |
12 | 004 | Jan 3 2022 | Lucy White | 2022-01-04 17:00:00 | 2022-01-04 17:00:00 | 2022-01-04 17:00:00
Given that you want to include the absent data, we need to start by getting the date range for the desired period. Using a user variable to store and increment a counter value is a performant way of doing this -
SELECT
'2022-01-01' + INTERVAL @row_number DAY `date`,
@row_number := @row_number + 1
FROM `attendance_tbl`, (SELECT @row_number := 0) AS `x`
LIMIT 31 /* 31 days in January */
If you have a table with a contiguous integer sequence (auto-incremented PK without deletes), you could use that instead -
SELECT '2022-01-01' + INTERVAL (`id` - 1) DAY `date`
FROM `attendance_tbl`
WHERE `id` <= 31 /* 31 days in January */
ORDER BY `id` ASC
We then add a cross join to build the full set of dates and users -
SELECT *
FROM (
SELECT '2022-01-01' + INTERVAL (`id` - 1) DAY `date`
FROM `attendance_tbl`
WHERE `id` <= 31
ORDER BY `id` ASC
) d
CROSS JOIN `users_tbl` `u`
By cross joining between these two tables we get the cartesian product (all combinations of the two sets). We then just take it a step further by using a left join to the attendance data -
SELECT
`u`.`user_id`,
DATE_FORMAT(`d`.`date`, '%b %e %Y') `date`,
`u`.`f_name`,
`a`.`time_in`,
`a`.`time_out`
FROM (
SELECT
'2022-01-01' + INTERVAL (`id` - 1) DAY `date`,
(SELECT TIMESTAMP(`date`, '00:00:00')) `begin`,
(SELECT TIMESTAMP(`date`, '23:59:59')) `end`
FROM `attendance_tbl`
WHERE `id` <= 31
ORDER BY `id` ASC
) d
CROSS JOIN `users_tbl` `u`
LEFT JOIN `attendance_tbl` `a`
ON `u`.`user_id` = `a`.`user_id`
AND `a`.`time_in` BETWEEN `d`.`begin` AND `d`.`end`
ORDER BY `d`.`date`, `u`.`user_id`
If your attendance_tbl can have more than one row per user per day, you will need to add GROUP BY d.date, u.user_id and use aggregate functions in the select list.
I have added begin and end to the derived table. This is to allow for index use for the join. This is not important while the attendance_tbl is small but will matter more as the table grows. Adding an index on (user_id, time_in) will make a huge difference to performance in the longer term.
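The dates-times-users grid followed by a left join can be checked end to end on the sample data. The sketch below does that with Python's sqlite3 module, with one deliberate substitution: the date list comes from a recursive CTE rather than from a user variable or an id sequence, and it uses SQLite's `date(d, '+1 day')` where MySQL 8 would use `d + INTERVAL 1 DAY`. The function name, the parameter names, and the omission of `created_at` are assumptions made to keep the example short.

```python
import sqlite3


def attendance_grid(first_day="2022-01-01", last_day="2022-01-03"):
    """Return one row per (day, user), with attendance columns or None."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE users_tbl (id INTEGER, user_id TEXT, f_name TEXT);
        CREATE TABLE attendance_tbl (
            id INTEGER, user_id TEXT, time_in TEXT, time_out TEXT);
        INSERT INTO users_tbl VALUES
            (1, '001', 'John Doe'), (2, '002', 'Jane Doe'),
            (3, '003', 'Ronal Black'), (4, '004', 'Lucy White');
        INSERT INTO attendance_tbl VALUES
            (1, '001', '2022-01-01 08:00:00', '2022-01-01 17:00:00'),
            (2, '002', '2022-01-01 08:15:24', '2022-01-01 17:00:00'),
            (3, '003', '2022-01-02 08:44:55', '2022-01-02 17:00:00'),
            (4, '004', '2022-01-03 08:40:22', '2022-01-03 17:00:00');
    """)
    rows = con.execute("""
        WITH RECURSIVE cal(d) AS (
            SELECT :first_day
            UNION ALL
            SELECT date(d, '+1 day') FROM cal WHERE d < :last_day
        )
        SELECT cal.d, u.user_id, u.f_name, a.time_in, a.time_out
          FROM cal
         CROSS JOIN users_tbl u            -- every user on every day
          LEFT JOIN attendance_tbl a       -- attendance where it exists
            ON a.user_id = u.user_id
           AND date(a.time_in) = cal.d
         ORDER BY cal.d, u.user_id
    """, {"first_day": first_day, "last_day": last_day}).fetchall()
    con.close()
    return rows
```

For the three-day range this returns 12 rows (3 days times 4 users), of which only four carry a time_in; the rest have None in the attendance columns, which is how absences show up.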
To run this from PHP using PDO you could do something like this -
<?php
$pdo = new PDO($dsn, $user, $password);
$sql = "SELECT
`u`.`user_id`,
DATE_FORMAT(`d`.`date`, '%b %e %Y') `date`,
`u`.`f_name`,
`a`.`time_in`,
`a`.`time_out`
FROM (
SELECT
:START_DATE + INTERVAL (`id` - 1) DAY `date`,
(SELECT TIMESTAMP(`date`, '00:00:00')) `begin`,
(SELECT TIMESTAMP(`date`, '23:59:59')) `end`
FROM `attendance_tbl`
WHERE `id` <= :DAYS_RANGE
ORDER BY `id` ASC
) d
CROSS JOIN `users_tbl` `u`
LEFT JOIN `attendance_tbl` `a`
ON `u`.`user_id` = `a`.`user_id`
AND `a`.`time_in` BETWEEN `d`.`begin` AND `d`.`end`
ORDER BY `d`.`date`, `u`.`user_id`";
$stmt = $pdo->prepare($sql);
$startDate = new DateTime('2022-01-01');
$endDate = new DateTime('2022-02-01');
$interval = $startDate->diff($endDate, true);
$daysRange = $interval->days + 1;
// Execute the statement
$stmt->execute([
':START_DATE' => $startDate->format('Y-m-d'),
':DAYS_RANGE' => $daysRange]
);
$attendance = $stmt->fetchAll(PDO::FETCH_OBJ);
Check this. Here I reference attendance_tbl twice: once to build the list of dates and users, and once to fetch the data (time in and time out). BETWEEN, as @nnichols suggested, filters the result to the date range you want.
select u.`user_id`, date(a.time_in) as `date`, u.`f_name`, b.`time_in`, b.`time_out`, b.created_at from attendance_tbl a
join users_tbl u
left join attendance_tbl b on b.`user_id`=u.`user_id` and date(b.`time_in`)=date(a.`time_in`)
WHERE DATE(a.time_in) BETWEEN '2022-01-01' AND '2022-01-31'
GROUP BY `date`, u.user_id;
RESULT
user_id date f_name time_in time_out created_at
------- ---------- ----------- ------------------- ------------------- ---------------------
001 2022-01-01 John Doe 2022-01-01 08:00:00 2022-01-01 17:00:00 2022-01-03 08:00:00
002 2022-01-01 Jane Doe 2022-01-01 08:15:24 2022-01-01 17:00:00 2022-01-03 08:15:24
003 2022-01-01 Ronal Black (NULL) (NULL) (NULL)
004 2022-01-01 Lucy White (NULL) (NULL) (NULL)
001 2022-01-02 John Doe (NULL) (NULL) (NULL)
002 2022-01-02 Jane Doe (NULL) (NULL) (NULL)
003 2022-01-02 Ronal Black 2022-01-02 08:44:55 2022-01-02 17:00:00 2022-01-04 08:44:55
004 2022-01-02 Lucy White (NULL) (NULL) (NULL)
001 2022-01-03 John Doe (NULL) (NULL) (NULL)
002 2022-01-03 Jane Doe (NULL) (NULL) (NULL)
003 2022-01-03 Ronal Black (NULL) (NULL) (NULL)
004 2022-01-03 Lucy White 2022-01-03 08:40:22 2022-01-03 17:00:00 2022-01-04 08:40:22
For the ID column just create a table with AUTO_INCREMENT id and insert your selected data.
To format your date (if you really need to) like the one in your example result, just change DATE(a.time_in) to DATE_FORMAT(a.time_in, '%b %d %Y').
I'm somewhat new to SQL and I'm running into a bit of an issue with a project. I have a table like this:
ID | subscription_ID | renewal_date
-: | --------------: | :------------------
1 | 11 | 2022-01-01 00:00:00
2 | 11 | 2022-01-02 00:00:00
3 | 12 | 2022-01-01 00:00:00
4 | 12 | 2022-01-01 12:00:00
5 | 13 | 2022-01-01 12:00:00
6 | 13 | 2022-01-03 12:00:00
My goal is to return rows where the subscription_ID matches and the start_date is within or equal to a certain # of days (hours would work as well). For instance, I'd like rows where subscription_ID matches and the start_date is within or equal to 1 day such that my results from the table above would be:
ID | subscription_ID | renewal_date
-: | --------------: | :------------------
1 | 11 | 2022-01-01 00:00:00
2 | 11 | 2022-01-02 00:00:00
3 | 12 | 2022-01-01 00:00:00
4 | 12 | 2022-01-01 12:00:00
Any assistance would be greatly appreciated--thanks!
If I understand correctly maybe you are trying something like:
select t.*
from test_tbl t
join ( SELECT subscription_id
, MAX(diff) max_diff
FROM
( SELECT x.subscription_id
, DATEDIFF(MIN(y.start_date),x.start_date) diff
FROM test_tbl x
JOIN test_tbl y ON y.subscription_id = x.subscription_id
AND y.start_date > x.start_date
GROUP BY x.subscription_id , x.start_date
) z
GROUP BY subscription_id
) as t1 on t.subscription_id=t1.subscription_id
where t1.max_diff<=1;
Result:
id subscription_id start_date
1 11 2022-01-01 00:00:00
2 11 2022-01-02 00:00:00
3 12 2022-01-01 00:00:00
4 12 2022-01-01 12:00:00
The subquery returns:
subscription_id max_diff
11 1
12 0
13 2
which is used on the where condition.
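The filtering rule behind that query can also be sketched in plain Python: for each subscription, compute the gap in calendar days between each renewal and the next one, and keep the subscription only if its largest gap is at most one day. Like DATEDIFF, the sketch compares dates and ignores the time of day. The function name and the `max_days` parameter are illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def within_days(rows, max_days=1):
    """rows: list of (id, subscription_id, 'YYYY-MM-DD HH:MM:SS').

    Keep the rows of every subscription whose consecutive renewals are
    never more than `max_days` calendar days apart.
    """
    by_sub = defaultdict(list)
    for _, sub_id, stamp in rows:
        by_sub[sub_id].append(datetime.strptime(stamp, FMT))

    keep = set()
    for sub_id, stamps in by_sub.items():
        stamps.sort()
        # Date-only difference between each renewal and the next one.
        gaps = [(b.date() - a.date()).days for a, b in zip(stamps, stamps[1:])]
        # A subscription with a single row has no gap and is dropped,
        # as the inner self join in the query above would drop it.
        if gaps and max(gaps) <= max_days:
            keep.add(sub_id)
    return [row for row in rows if row[1] in keep]


renewals = [
    (1, 11, "2022-01-01 00:00:00"),
    (2, 11, "2022-01-02 00:00:00"),
    (3, 12, "2022-01-01 00:00:00"),
    (4, 12, "2022-01-01 12:00:00"),
    (5, 13, "2022-01-01 12:00:00"),
    (6, 13, "2022-01-03 12:00:00"),
]
```

On the sample rows, `within_days(renewals)` keeps ids 1 to 4 and drops subscription 13, whose renewals are two days apart.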