I have a table that stores records of users' session events (subscribe, unsubscribe, away, online). I can calculate the duration of each session with the query below.
Suppose a user starts an online session on "15-May-2022 at 11:00:00 PM" and then, on the next day, sets himself away on "16-May-2022 at 02:00:00 AM".
The total online time is 3 hours, which I currently get as a single row dated 15-May-2022.
But I need it like this:
On 15-May, until "15-May-2022 23:59:59", he was online for 1 hour, and on 16-May, from "16-May-2022 00:00:00" to "16-May-2022 02:00:00 AM", he was online for 2 hours. So the result should show 1 hour for 15-May and 2 hours for 16-May, not a total of 3 hours on 15-May.
I am using the LEAD function to get the duration from the created_at column. Is there any way to restrict LEAD so that the duration is calculated only up to 23:59:59 of the same day rather than up to the next created_at?
Here is my working query. I am using the latest MySQL version (8).
select `id`, `user_id`, `status`, `created_at`,
SEC_TO_TIME(TIMESTAMPDIFF(SECOND, created_at,
LEAD(created_at) OVER (PARTITION BY user_id ORDER BY created_at))) as duration,
date(created_at) as date from `user_websocket_events` as `all_status`
where created_at between '2022-05-15 00:00:00' and '2022-05-16 23:59:59' and `status` is not null
and user_id in (69) order by `id` asc;
Here is some sample data.
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10816, 69, 'subscribe', 'online', null, '2022-05-15 12:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10817, 69, 'away', 'away', null, '2022-05-15 20:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10818, 69, 'online', 'online', null, '2022-05-15 22:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10819, 69, 'away', 'away', null, '2022-05-16 02:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10820, 69, 'unsubscribe', 'unsubscribe', null, '2022-05-16 03:57:31', '2022-05-14 10:57:37');
Using an on-the-fly calendar table to split a session by days
with recursive calendar as (
select timestamp('2022-05-01 00:00') start_time, timestamp('2022-05-01 23:59:59') end_time, 1 id
union all
select start_time + interval 1 day, end_time + interval 1 day, id+1
from calendar
where id < 100
)
select e.id, e.status, date(greatest(c.start_time, e.created_at)) date,
greatest(c.start_time, e.created_at) as created_at,
least(c.end_time, e.ended_at) as ended_at
from (
select `id`, `user_id`, `status`, `created_at`,
-- a session end is restricted to the end of the required interval
LEAD(created_at, 1, '2022-05-16 23:59:59') OVER (PARTITION BY user_id ORDER BY created_at) as ended_at
from `user_websocket_events`
where
-- only sessions started within the required interval
created_at between '2022-05-15 00:00:00' and '2022-05-16 23:59:59' and `status` is not null
and user_id in (69)
) e
join calendar c on c.start_time < e.ended_at and e.created_at < c.end_time
order by id;
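The splitting idea behind the calendar join can also be sketched outside SQL. The following is a minimal Python illustration, not the query itself: the function name split_by_day is hypothetical, and it cuts at midnight exactly (a half-open interval) instead of at 23:59:59, so the example from the question comes out as exactly 1 hour and 2 hours.

```python
from datetime import datetime, timedelta

def split_by_day(start, end):
    """Return (date, seconds) pieces of the interval [start, end), cut at each midnight."""
    pieces = []
    cursor = start
    while cursor < end:
        # Midnight that begins the day after the cursor's day.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time()
        )
        piece_end = min(end, next_midnight)
        seconds = int((piece_end - cursor).total_seconds())
        pieces.append((cursor.date().isoformat(), seconds))
        cursor = piece_end
    return pieces

# The session from the question: online at 23:00 on 15 May, away at 02:00 on 16 May.
session = split_by_day(datetime(2022, 5, 15, 23, 0, 0), datetime(2022, 5, 16, 2, 0, 0))
print(session)  # [('2022-05-15', 3600), ('2022-05-16', 7200)]
```

Because the loop keeps cutting until it reaches the end of the session, a session that spans several days yields one piece per day, which is the same effect the join against the calendar rows has.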
You need to handle it using a CASE expression.
Now instead of the actual created_at and LEAD(created_at), we need something like below.
First Case:
If created_at and LEAD(created_at) fall on different dates, use date(created_at) + '23:59:59' as ENDTIME; otherwise use LEAD(created_at).
CASE
WHEN Date(lead_created_at)=Date(created_at) THEN lead_created_at
ELSE Addtime(Timestamp(Date(created_at)),'23:59:59')
END
Second Case:
If created_at and LAG(created_at) fall on different dates, use date(created_at) + '00:00:00' as STARTTIME; otherwise use created_at.
CASE
WHEN Date(lag_created_at)=Date(created_at) THEN created_at
ELSE Timestamp(Date(created_at))
END
Finally, the query can be written as below to get the desired output.
SELECT `id`,
`user_id`,
`status`,
`created_at`,
CASE
WHEN Date(lag_created_at)=Date(created_at) THEN created_at
ELSE Timestamp(Date(created_at))
end new_starttime,
CASE
WHEN Date(lead_created_at)=Date(created_at) THEN lead_created_at
WHEN lead_created_at is null then null
ELSE Addtime(Timestamp(Date(created_at)),'23:59:59')
end AS new_endtime,
Sec_to_time(Timestampdiff(second,
CASE
WHEN Date(lag_created_at)=Date(created_at) THEN created_at
ELSE Timestamp(Date(created_at))
end,
CASE
WHEN Date(lead_created_at)=Date(created_at) THEN lead_created_at
WHEN lead_created_at is null then null
ELSE Addtime(Timestamp(Date(created_at)),'23:59:59')
end )) AS duration,
date
FROM (
SELECT `id`,
`user_id`,
`status`,
`created_at`,
(Lead(created_at) over (partition BY user_id ORDER BY created_at)) AS lead_created_at,
coalesce(lag(created_at) over (partition BY user_id ORDER BY created_at),created_at) AS lag_created_at,
date(created_at) AS date
FROM `user_websocket_events` AS `all_status`
WHERE created_at BETWEEN '2022-05-15 00:00:00' AND '2022-05-16 23:59:59'
AND `status` IS NOT NULL
AND user_id IN (69) )tmp
ORDER BY `id` ASC;
Resultset:
id     user_id  status       created_at           new_starttime        new_endtime          duration  date
10816  69       online       2022-05-15 12:57:31  2022-05-15 12:57:31  2022-05-15 20:57:31  08:00:00  2022-05-15
10817  69       away         2022-05-15 20:57:31  2022-05-15 20:57:31  2022-05-15 22:57:31  02:00:00  2022-05-15
10818  69       online       2022-05-15 22:57:31  2022-05-15 22:57:31  2022-05-15 23:59:59  01:02:28  2022-05-15
10819  69       away         2022-05-16 02:57:31  2022-05-16 00:00:00  2022-05-16 03:57:31  03:57:31  2022-05-16
10820  69       unsubscribe  2022-05-16 03:57:31  2022-05-16 03:57:31  null                 null      2022-05-16
Note: The query will not handle the scenario where a session lasts more than 48 hours.
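For readers who want to check the two CASE expressions by hand, here is a small Python sketch that mirrors them. The function names clamp_end and clamp_start are hypothetical, and treating a missing previous row as "same day" is an assumption taken from the COALESCE in the final query.

```python
from datetime import datetime, time

def clamp_end(created_at, lead_created_at):
    """Mirror of the ENDTIME CASE: stop at 23:59:59 when the next event is on another date."""
    if lead_created_at is None:
        return None  # last event of the user: no end time, as in the result set
    if lead_created_at.date() == created_at.date():
        return lead_created_at
    return datetime.combine(created_at.date(), time(23, 59, 59))

def clamp_start(created_at, lag_created_at):
    """Mirror of the STARTTIME CASE: start at 00:00:00 when the previous event is on another date."""
    if lag_created_at is None or lag_created_at.date() == created_at.date():
        return created_at
    return datetime.combine(created_at.date(), time(0, 0, 0))

online = datetime(2022, 5, 15, 22, 57, 31)  # row 10818
away = datetime(2022, 5, 16, 2, 57, 31)     # row 10819

end_of_online = clamp_end(online, away)
start_of_away = clamp_start(away, online)
print(end_of_online, (end_of_online - online).total_seconds())  # 2022-05-15 23:59:59 3748.0
print(start_of_away)                                            # 2022-05-16 00:00:00
```

3748 seconds is the 01:02:28 shown for row 10818. The sketch clamps only to the event's own date, which is why any days that lie entirely between two events are never emitted.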
Related
Using MySQL, I'm trying to get the number of active users I have in any given month. I have a table with ActivationDate and TerminationDate columns, and if the month being counted is after the ActivationDate and TerminationDate is null, then the user is active and should be counted. I would like to summarize these amounts by month. I'm thinking I could just sum each side and calculate the total but breaking that down won't give me a running total. I've tried with window functions, but I don't have enough experience with them to know exactly what I'm doing wrong and I'm not certain how to ask the right question.
So for instance, if I have the following data...
UserId ActivationDate TerminationDate
1 2020-01-01 null
2 2020-01-15 null
3 2020-01-20 2020-01-30
4 2020-02-01 null
5 2020-02-14 2020-02-27
6 2020-02-15 2020-02-28
7 2020-03-02 null
8 2020-03-05 null
9 2020-03-20 2020-03-21
I would like my results to be similar to:
2020-01 2 (there are 2 active users, since one signed up but cancelled before the end of the month)
2020-02 3 (2 from the previous month, plus 1 that signed up this month and is still active)
2020-03 5 (3 from previous, 2 new, 1 cancellation)
You can unpivot, then aggregate and sum. In MySQL 8.0.14 or higher, you can use a lateral join:
select date_format(x.dt, '%Y-%m-01') as dt_month,
sum(sum(cnt)) over(order by date_format(x.dt, '%Y-%m-01')) as cnt_active_users
from mytable t
cross join lateral (
select t.activationdate as dt, 1 as cnt
union all select t.terminationdate, -1
) x
where x.dt is not null
group by dt_month
order by dt_month
In earlier 8.x versions:
select date_format(x.dt, '%Y-%m-01') as dt_month,
sum(sum(cnt)) over(order by date_format(x.dt, '%Y-%m-01')) as cnt_active_users
from (
select activationdate as dt, 1 as cnt from mytable
union all select terminationdate, -1 from mytable
) x
where x.dt is not null
group by dt_month
order by dt_month
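The unpivot-and-running-sum idea can be checked by hand with a short Python sketch. It uses the sample rows from the question; the variable names are illustrative only, and months are taken as the first seven characters of each date string.

```python
from collections import Counter
from itertools import accumulate

# (ActivationDate, TerminationDate) pairs from the question's sample data.
users = [
    ("2020-01-01", None), ("2020-01-15", None), ("2020-01-20", "2020-01-30"),
    ("2020-02-01", None), ("2020-02-14", "2020-02-27"), ("2020-02-15", "2020-02-28"),
    ("2020-03-02", None), ("2020-03-05", None), ("2020-03-20", "2020-03-21"),
]

# Unpivot: +1 in the activation month, -1 in the termination month (if any).
deltas = Counter()
for activated, terminated in users:
    deltas[activated[:7]] += 1
    if terminated is not None:
        deltas[terminated[:7]] -= 1

# Running total over the months in order, like SUM(SUM(cnt)) OVER (ORDER BY month).
months = sorted(deltas)
running = dict(zip(months, accumulate(deltas[m] for m in months)))
print(running)  # {'2020-01': 2, '2020-02': 3, '2020-03': 5}
```

A month with no activation or termination gets no row in this approach, so it would need a calendar of months if gaps must be shown.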
You don't say what version of MySQL. If you're using 8.0, this should work:
create table userdates (
UserId int not null,
ActivationDate date not null,
TerminationDate date null
);
insert into userdates (UserId, ActivationDate, TerminationDate)
values
(1, cast("2020-01-01" as date), null )
, (2, cast("2020-01-15" as date), null )
, (3, cast("2020-01-20" as date), cast("2020-01-30" as date))
, (4, cast("2020-02-01" as date), null )
, (5, cast("2020-02-14" as date), cast("2020-02-27" as date))
, (6, cast("2020-02-15" as date), cast("2020-02-28" as date))
, (7, cast("2020-03-02" as date), null )
, (8, cast("2020-03-05" as date), null )
, (9, cast("2020-03-20" as date), cast("2020-03-21" as date))
, (10, cast("2020-07-20" as date), null)
, (11, cast("2019-09-12" as date), cast("2019-09-14" as date));
WITH RECURSIVE d (dt)
AS (
SELECT cast("2019-01-01" as date)
UNION ALL
SELECT date_add(dt, interval 1 month)
FROM d
WHERE dt < cast("2020-12-01" as date)
)
select d.dt
, count(distinct ud.UserId) as UserCount
from userdates ud
right outer join d on d.dt >= date_format(ud.ActivationDate, '%Y-%m-01')
and (d.dt <= ud.TerminationDate or ud.TerminationDate is null)
group by d.dt;
I have an orders table with start date, end date and anticipated end date columns. I am able to get all the active work orders in a month, but I am looking for the average number of working orders in a selected month.
I have been trying to find a way to do this without success. Can someone please help?
SQL Fiddle
Updated Fiddle (Can we combine those 3 queries into single Query1+Query2-Query3 = desired count which is 7 in this case)
Updated as per the comments:
Average working means, for example, that there are thousands of orders in the database: some might close in the middle of the month, some might start at the beginning of the month, and some might start in the next month. So I want to know how many orders, on average, are working in the month.
Desired Result or Count is: 7, because 4 Orders are closed in the month and 4 are started in the month.
MySQL 5.6 Schema Setup:
CREATE TABLE `orders` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`order_num` BIGINT(20) NULL DEFAULT NULL,
`start_date` DATE NULL DEFAULT NULL,
`anticpated_end_date` DATE NULL DEFAULT NULL,
`end_date` DATE NULL DEFAULT NULL,
PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
AUTO_INCREMENT=1
;
INSERT INTO `orders` (`order_num`, `start_date`, `anticpated_end_date`, `end_date`) VALUES
('124267', '2019-01-11', '2020-01-10', '2020-01-10'),
('464335', '2019-01-03', '2019-11-15', '2019-12-13'),
('313222', '2019-01-03', '2020-02-15', NULL),
('63356', '2019-04-12', '2019-05-15', '2019-06-13'),
('235233', '2020-01-20', '2020-11-15', NULL),
('313267', '2019-01-03', '2020-01-15', '2020-01-19'),
('123267', '2019-12-10', '2020-07-31', NULL),
('234523', '2019-12-07', '2020-10-15', NULL),
('12344', '2020-01-03', '2020-02-15', NULL),
('233523', '2019-01-03', '2020-01-02', '2020-01-02'),
('233423', '2020-01-05', '2020-03-15', NULL),
('45644', '2020-01-11', '2020-08-15', NULL),
('233723', '2019-06-03', '2020-01-05', '2020-01-05'),
('345234', '2020-02-02', '2020-02-15', NULL),
('232423', '2020-02-03', '2020-03-15', NULL);
Query 1:
SELECT order_num, start_date, anticpated_end_date, end_date
FROM orders
WHERE start_date <= date("2020-01-31")
AND
(
(
end_date IS NULL AND
(
anticpated_end_date >= date("2020-01-31") OR
anticpated_end_date BETWEEN date("2020-01-01") AND date("2020-01-31")
)
) OR
(
end_date >= date("2020-01-31") OR
end_date BETWEEN date("2020-01-01") AND date("2020-01-31")
)
);
For the first query, I find this easier to read...
SELECT order_num, start_date, anticpated_end_date, end_date
FROM orders
WHERE start_date < '2020-01-01'
AND COALESCE(end_date,anticpated_end_date) > '2020-01-31';
If you're only interested in the count of that result, then consider the following...
SELECT SUM(start_date < '2020-01-01' AND COALESCE(end_date,anticpated_end_date) > '2020-01-31')n
FROM orders;
Does that help?
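To see what the COALESCE condition selects, here is a minimal Python sketch over a handful of the sample rows. The helper name spans_month is hypothetical. Note that, as written, the condition keeps only orders that started before the month and end (or are expected to end) after it.

```python
from datetime import date

# (start_date, anticpated_end_date, end_date): a subset of the sample rows.
orders = [
    (date(2019, 1, 11), date(2020, 1, 10), date(2020, 1, 10)),
    (date(2019, 1, 3), date(2020, 2, 15), None),
    (date(2020, 1, 20), date(2020, 11, 15), None),
    (date(2019, 12, 10), date(2020, 7, 31), None),
    (date(2019, 6, 3), date(2020, 1, 5), date(2020, 1, 5)),
]

def spans_month(start, anticipated_end, end, month_start, month_end):
    """True when the order starts before the month and its effective end is after it."""
    effective_end = end if end is not None else anticipated_end  # COALESCE(end, anticipated)
    return start < month_start and effective_end > month_end

n = sum(spans_month(*order, date(2020, 1, 1), date(2020, 1, 31)) for order in orders)
print(n)  # 2
```

Orders that start or close inside January are excluded by this condition, so it answers a narrower question than the original Query 1.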
I am assuming that if the end date is marked as null then it is an active order else the order is not active.
So all active orders for month of Jan would be where end date is null and the start date is on or before 31 Jan 2020.
Based on the above two assumptions, the resulting query would look like this:
select order_num, start_date, end_date, anticpated_end_date
from orders
where end_date is null
and start_date <= date("2020-01-31")
order by start_date,end_date,anticpated_end_date;
I have a table containing alternating ON and OFF events with their timestamps. How do I calculate the total time between each ON and OFF?
Status Timestamp
============================
ON 2019-01-01 07:00:00
OFF 2019-01-01 08:30:00
ON 2019-01-01 09:00:00
OFF 2019-01-01 10:00:00
ON 2019-01-01 10:30:00
OFF 2019-01-01 11:30:00
Consider the following...
CREATE TABLE my_table
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
dt DATETIME NOT NULL,
status VARCHAR(5) NOT NULL
);
INSERT INTO my_table VALUES
(1,'2015-01-01 13:00:00','ON'),
(2,'2015-01-01 13:10:00','OFF'),
(3,'2015-01-01 13:20:00','ON'),
(4,'2015-01-01 13:30:00','OFF'),
(5,'2015-01-01 13:35:00','ON'),
(6,'2015-01-01 13:40:00','OFF'),
(7,'2015-01-01 13:50:00','ON'),
(8,'2015-01-01 15:00:00','OFF');
SELECT x.*,
TIMEDIFF(MIN(y.dt),x.dt) AS TimeDiff
FROM my_table AS x
INNER JOIN my_table AS y ON y.dt >= x.dt
WHERE x.status = 'ON' AND y.status = 'OFF'
GROUP
BY x.id;
Refer DB FIDDLE For More:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=00dc040da540f852f08b2f02750bc16d
CREATE TABLE events (
`status` VARCHAR(3),
`timestamp` VARCHAR(19)
);
INSERT INTO events
(`status`, `timestamp`)
VALUES
('ON', '2019-01-01 07:00:00'),
('OFF', '2019-01-01 08:30:00'),
('ON', '2019-01-01 09:00:00'),
('OFF', '2019-01-01 10:00:00'),
('ON', '2019-01-01 10:30:00'),
('OFF', '2019-01-01 11:30:00');
SELECT
TIME_FORMAT(SEC_TO_TIME(
-- sum the ranges in seconds; SUM() over TIME values would add them as plain numbers
SUM(TIMESTAMPDIFF(SECOND, ontime, offtime))
), '%H:%i')
AS total FROM (
SELECT e.timestamp AS offtime, (SELECT timestamp
FROM events AS st WHERE st.timestamp < e.timestamp AND st.status = 'ON'
ORDER BY st.timestamp DESC LIMIT 1) AS ontime
FROM events AS e WHERE e.status='OFF') AS onoffs;
Selects every OFF record, joins the most recent ON record to it, sums time ranges. With your data it gives the result: total 03:30
Doesn't account for open ranges. E.g. if the data series is started with OFF; or if it ends with ON, the time up to current moment would not be counted.
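The pairing logic can be sketched in Python as follows. This is an illustration under the assumption that events strictly alternate; the function name total_on_time is hypothetical, and, like the query, it ignores an OFF with no preceding ON and a trailing ON with no OFF.

```python
from datetime import datetime, timedelta

events = [
    ("ON", "2019-01-01 07:00:00"), ("OFF", "2019-01-01 08:30:00"),
    ("ON", "2019-01-01 09:00:00"), ("OFF", "2019-01-01 10:00:00"),
    ("ON", "2019-01-01 10:30:00"), ("OFF", "2019-01-01 11:30:00"),
]

def total_on_time(events):
    """Sum the time between each ON and the OFF that follows it."""
    total = timedelta()
    last_on = None
    for status, stamp in sorted(events, key=lambda event: event[1]):
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if status == "ON":
            last_on = moment
        elif status == "OFF" and last_on is not None:
            total += moment - last_on
            last_on = None  # an open range is closed only once
    return total

total = total_on_time(events)
print(total)  # 3:30:00
```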
https://www.db-fiddle.com/f/hr27GhACxGd7ZvFaa52xiK/0#
I have a table with 6 columns: failure date, ipaddress, assettag, sid(primary key), rdl and error type.
I need a table with columns as First failure, Recent(Last) failure, ipaddress, assettag, rdl
But the records are to be there only if the date is repeated for 4 days from the current datetime. Not even one single day to be missed.
Ex: If today is 30th May, I need all the records whose failure date is there every single day--30th, 29th, 28th, 27th. If a record date is there only for two/three/one day(s)--it has to be ignored.
I can get First and Last failures using "min(date) and max(date)-group by ipaddress" but not able to get the records as per the condition--"failure (date) to be repeated for 4 days from current datetime"
select min(date), max(date), ipaddress, assettag, rdl
from flashinglist.response
where ((DATE_FORMAT((date_sub(NOW(), interval 24 hour)), '%y-%m-%d')) in
(select group_concat((DATE_FORMAT(date,'%y-%m-%d')) separator ', ')
from flashinglist.response group by ipaddress)
and (DATE_FORMAT((date_sub(NOW(), interval 48 hour)), '%y-%m-%d')) in
(select group_concat((DATE_FORMAT(date,'%y-%m-%d')) separator ', ')
from flashinglist.response group by ipaddress)
and (DATE_FORMAT((date_sub(NOW(), interval 72 hour)), '%y-%m-%d')) in
(select group_concat((DATE_FORMAT(date,'%y-%m-%d')) separator ', ')
from flashinglist.response group by ipaddress)
and (DATE_FORMAT((date_sub(NOW(), interval 96 hour)), '%y-%m-%d')) in
(select group_concat((DATE_FORMAT(date,'%y-%m-%d')) separator ', ')
from flashinglist.response group by ipaddress) )
order by max(date) desc
The above query should work as I am concatenating all dates group by IP and checking through 'IN' condition, but it doesn't work, not able to figure out why. (used 'date_format' to find only date instead of timestamp)
Below is the schema and sample data:
CREATE TABLE `response` (
`date` varchar(50) NOT NULL,
`ipaddress` varchar(16) NOT NULL,
`assettag` varchar(200) NOT NULL,
`sid` int(4) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`rdl` varchar(30) NOT NULL,
`errortype` int(2) NOT NULL)
ENGINE=InnoDB DEFAULT CHARSET=latin1;
Sample data:
INSERT INTO `response` (`date`, `ipaddress`, `assettag`, `sid`, `rdl`, `errortype`) VALUES
('2019-05-31 09:46:10.878', '123.34.45.67', 'fresh', 483, '13234', 1),
('2019-05-30 19:46:11.578', '123.34.45.67', 'fresh', 490, '13234', 1),
('2019-05-29 14:30:11.577', '123.34.45.67', 'fresh', 496, '13234', 1),
('2019-05-28 17:23:11.573', '123.34.45.67', 'fresh', 499, '13234', 1),
('2019-05-27 22:32:11.550', '123.34.45.67', 'fresh', 503, '13234', 1),
('2019-05-29 12:54:11.571', '457.673.768.24', 'store', 560, '9297', 1),
('2019-05-31 08:46:11.569', '457.673.768.24', 'store', 565, '9297', 1),
('2019-05-28 10:45:11.566', '457.673.768.24', 'store', 567, '9297', 1),
('2019-05-30 20:16:11.566', '457.673.768.24', 'store', 569, '9297', 1),
('2019-05-29 23:46:11.234', '140.232.546.74', 'sample', 580, '6076', 1),
('2019-05-31 09:26:11.562', '140.232.546.74', 'sample', 581, '6076', 1),
('2019-05-30 19:34:16.533', '140.232.546.74', 'sample', 583, '6076', 1);
COMMIT;
Please change values according to today's date and the last 4 days.
My output should return First failure, Recent (Last) failure, ipaddress, assettag, rdl. With the above sample data, it has to show the records for IPs 123.34.45.67 and 457.673.768.24 with the corresponding max and min dates, within the range of 1 to 96 hours (4 days) only.
IP 140.232.546.74 should not appear, as its failure is not repeated on all 4 days (the 28th is missing). I hope this clarifies my question.
Count the number of different dates in the result, and test if this is the required number.
SELECT min(date) AS mindate, max(date) AS maxdate, ipaddress, assettag, rdl
FROM flashinglist.response
WHERE date < DATE_SUB(NOW(), interval 1 hour)
AND date > date_sub(NOW(), interval 96 hour)
GROUP BY ipaddress
HAVING COUNT(DISTINCT DATE(date)) = DATEDIFF(maxdate, mindate) + 1
ORDER BY mindate DESC
You also shouldn't have these lines:
AND (date > date_sub(NOW(), interval 24 hour) )
AND (date > date_sub(NOW(), interval 48 hour))
AND (date > date_sub(NOW(), interval 72 hour))
since they will exclude rows that are more than 1 day old.
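The counting idea can be illustrated with a short Python sketch. It assumes the required number of days is fixed at 4 and that the window ends on a given "today" (both are assumptions for the example, as is the function name failed_every_day). Comparing the count only with the span between the first and last date would also accept a three-day run with no gaps, so the sketch compares against the required number directly.

```python
from datetime import date, timedelta

def failed_every_day(failure_dates, today, required_days=4):
    """True when there is at least one failure on each of the last required_days days."""
    window_start = today - timedelta(days=required_days - 1)
    in_window = {d for d in failure_dates if window_start <= d <= today}
    return len(in_window) == required_days

today = date(2019, 5, 31)  # assumed "current date" for the sample data
by_ip = {
    "123.34.45.67": [date(2019, 5, d) for d in (31, 30, 29, 28, 27)],
    "457.673.768.24": [date(2019, 5, d) for d in (29, 31, 28, 30)],
    "140.232.546.74": [date(2019, 5, d) for d in (29, 31, 30)],
}

result = {ip: failed_every_day(dates, today) for ip, dates in by_ip.items()}
print(result)
# {'123.34.45.67': True, '457.673.768.24': True, '140.232.546.74': False}
```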
I have a question that makes me feel silly!
I have to do some stats on the use of my apps.
I have a table called customer_point:
id int(11) auto_increment
id_customer int(11)
type_point int(11)
date timestamp CURRENT_TIMESTAMP
I want to run this query for the entire month (with a row for each night ;) ):
SELECT COUNT( id_customer ) , type_point, date(date)
FROM customer_point
WHERE date BETWEEN "2014-06-01 20:00:00" AND "2014-06-02 10:00:00"
GROUP BY type_point, date;
I am nearly sure that I am missing a crucial point, but I can't find which one.
Thank you very much for reading!
Bye,
edit :
Sample :
INSERT INTO `customer_point` ( `id` , `id_customer` , `type_point`, `date` )
VALUES ( '', '15', '1', '2014-06-01 22:50:00'), ( '', '15', '1', '2014-06-01 23:52:00'), ( '', '15', '1', '2014-06-02 9:50:00'), ( '', '15', '1', '2014-06-30 22:50:00'), ( '', '15', '1', '2014-06-30 23:52:00'), ( '', '15', '1', '2014-07-01 02:50:00'), ( '', '15', '1', '2014-07-01 09:50:00');
result :
1, 3, 2014-06-01
1, 4, 2014-06-30
I hope this will help everybody to understand my problem :/
If you just want counts of the actual data, check that the date is within the range you are interested in and that the time is at night (i.e., later than 8pm or earlier than 10am, it would seem from your SQL):-
SELECT type_point, date(customer_point.date) AS aDate, COUNT( id_customer )
FROM customer_point
WHERE DATE(customer_point.date) BETWEEN "2014-06-01" AND "2014-06-30"
AND (TIME(customer_point.date) >= '20:00:00' OR TIME(customer_point.date) <= '10:00:00')
GROUP BY type_point, aDate;
To get a row per day, irrespective of whether there is any data that day (i.e., a count of zero if there is no data), you need to generate a list of dates and then LEFT JOIN your data to it.
Something like this:-
SELECT sub0.aDate, type_point, COUNT( id_customer )
FROM
(
SELECT DATE_ADD('2014-06-01', INTERVAL units.i + tens.i * 10 DAY) AS aDate
FROM
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
CROSS JOIN
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3) tens
) sub0
LEFT OUTER JOIN customer_point
ON sub0.aDate = date(customer_point.date)
WHERE sub0.aDate BETWEEN "2014-06-01" AND "2014-06-30"
GROUP BY sub0.aDate, type_point;
You would also probably need to generate a list of type_point values.
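The "generate the dates, then left join" idea looks like this in a minimal Python sketch. The three input rows are hypothetical sample dates, not the question's data; a day without rows simply gets a count of zero.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical dates of existing rows (already reduced to the day).
row_dates = [date(2014, 6, 1), date(2014, 6, 1), date(2014, 6, 30)]
counts = Counter(row_dates)

# One entry per day of June 2014, like the units/tens derived table.
calendar = [date(2014, 6, 1) + timedelta(days=offset) for offset in range(30)]

# "LEFT JOIN": every calendar day appears, with 0 when nothing matched.
per_day = [(day, counts.get(day, 0)) for day in calendar]
print(per_day[0], per_day[1], per_day[-1])
```

The SQL version generates up to 40 dates and trims them with the WHERE clause; the sketch generates exactly the 30 days it needs.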
EDIT - to go with the updated question, can you just subtract 10 hours from the date / time. So 10am on the 1st July becomes midnight on the 30th June?
SELECT type_point, date(DATE_ADD(customer_point.date, INTERVAL -10 HOUR)) AS aDate, COUNT( id_customer )
FROM customer_point
WHERE DATE(DATE_ADD(customer_point.date, INTERVAL -10 HOUR)) BETWEEN "2014-06-01" AND "2014-06-30"
AND (TIME(customer_point.date) >= '20:00:00' OR TIME(customer_point.date) <= '10:00:00')
GROUP BY type_point, aDate;
SQL fiddle:-
http://www.sqlfiddle.com/#!2/ddc95/2
The issue with this is whether items from before 10am on the 1st of June count as dates for May or for June?
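The effect of shifting by 10 hours can be checked against the sample rows from the question with a small Python sketch. The function name night_of is hypothetical, and the timestamps are written zero-padded here.

```python
from collections import Counter
from datetime import datetime, timedelta

stamps = [
    "2014-06-01 22:50:00", "2014-06-01 23:52:00", "2014-06-02 09:50:00",
    "2014-06-30 22:50:00", "2014-06-30 23:52:00", "2014-07-01 02:50:00",
    "2014-07-01 09:50:00",
]

def night_of(stamp):
    """Date of the night a timestamp belongs to: shift back 10 hours, then take the date."""
    moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return (moment - timedelta(hours=10)).date().isoformat()

per_night = dict(Counter(night_of(stamp) for stamp in stamps))
print(per_night)  # {'2014-06-01': 3, '2014-06-30': 4}
```

The counts match the result expected in the question (3 for 2014-06-01 and 4 for 2014-06-30), because the morning rows up to 10:00 fall back onto the previous date.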
Using mysql you even could do
WHERE date LIKE "2014-06-%"
Edit: You need the range to start exactly at 20:00, so you also have to take into account the first day of the next month until 22:00...
OK, then just subtract those 20 hours from the date:
SELECT DATE_SUB(column, INTERVAL 20 HOUR)....
Finally:
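SELECT COUNT( id_customer ) , type_point, DATE(DATE_SUB(date, INTERVAL 20 HOUR)) as mydate
FROM customer_point
-- a column alias cannot be used in WHERE, so the expression is repeated here
WHERE DATE_SUB(date, INTERVAL 20 HOUR) LIKE '2014-06-%'
GROUP BY type_point, mydate;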