I have a table with user_id, in_out (clock-in = 1, clock-out = 0), and date_time for a few employees. in_out shows when someone goes on (1) or off (0) the clock.
A shift can span midnight, as in punching in before midnight and punching out in the morning (e.g. the 21st in the table).
A shift can last more than 24 hours, hypothetically (e.g. the 24th).
Punch-ins and punch-outs can happen multiple times within 24 hours as well (e.g. the 22nd).
I would like to get the sum of hours worked per day for any given user_id, but from midnight to midnight, even though a shift might span midnight. Timestamps are all shown with :30:00 for clarity. Only one user_id is shown, but the table can hold data for multiple users, so user_id will be used in the WHERE clause.
[id] [User_id] [Date_time] [in_out]
1 1 2022-08-20 09:30:00 1
2 1 2022-08-20 21:30:00 0
3 1 2022-08-21 20:30:00 1
4 1 2022-08-22 08:30:00 0
5 1 2022-08-22 09:30:00 1
6 1 2022-08-22 14:30:00 0
7 1 2022-08-23 12:30:00 1
8 1 2022-08-25 09:30:00 0
9 1 2022-08-25 12:30:00 1
The desired query result would be something like the following. The formatting does not matter; total time per day in seconds, minutes, or anything else will work.
[Day] [hours_worked]
2022-08-20 12:00:00
2022-08-21 03:30:00
2022-08-22 13:30:00
2022-08-23 11:30:00
2022-08-24 24:00:00
2022-08-25 09:30:00
I started with the code from "Get total hours worked in a day mysql". This works well when punch-ins happen before punch-outs within a day, but it does not handle midnights. I am just trying to adapt it to this specific case. Any help is much appreciated.
To do this in MySQL 5.6, I can only think of a not so nice query, but let's create the data first
CREATE TABLE events
(`id` int, `User_id` int, `Date_time` datetime, `in_out` int);
INSERT INTO events
(`id`, `User_id`, `Date_time`, `in_out`)
VALUES
(1, 1, '2022-08-20 09:30:00', 1),
(2, 1, '2022-08-20 21:30:00', 0),
(3, 1, '2022-08-21 20:30:00', 1),
(4, 1, '2022-08-22 08:30:00', 0),
(5, 1, '2022-08-22 09:30:00', 1),
(6, 1, '2022-08-22 14:30:00', 0),
(7, 1, '2022-08-23 12:30:00', 1),
(8, 1, '2022-08-25 09:30:00', 0),
(9, 1, '2022-08-25 12:30:00', 1);
Based on https://stackoverflow.com/a/60173743/19657183, one can get the dates for every single day between the first and last event date. Then, you can JOIN the result with the events to figure out the ones which overlap. From that you can calculate the time differences and sum them up grouped by day:
SELECT User_id, start_of_day,
sec_to_time(sum(timestampdiff(SECOND, CAST(GREATEST(cast(start_of_day as datetime), prev_date_time) AS datetime),
CAST(LEAST(start_of_next_day, Date_time) AS datetime)))) AS diff
FROM (
SELECT * FROM (
SELECT id, User_id,
CASE WHEN @puid IS NULL or @puid <> User_id THEN NULL ELSE @pdt END AS prev_date_time, @pdt := Date_time AS Date_time,
CASE WHEN @puid IS NULL or @puid <> User_id THEN NULL ELSE @pio END AS prev_in_out, @pio := in_out in_out,
@puid := User_id
FROM (SELECT * FROM events ORDER BY User_id, Date_time) e,
(SELECT @pdt := '1970-01-01 00:00:00', @pio := NULL, @puid := NULL) init ) tr
WHERE prev_in_out = 1 and in_out = 0) event_ranges
JOIN (
SELECT @d start_of_day,
@d := date_add(@d, interval 1 day) start_of_next_day
FROM (SELECT @d := date(min(Date_time)) FROM events) min_d,
(SELECT x1.N + x10.N*10 + x100.N*100 + x1000.N*1000
FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x1,
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x10,
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x100,
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) x1000
WHERE x1.N + x10.N*10 + x100.N*100 + x1000.N*1000 <= (SELECT datediff(max(Date_time), min(Date_time)) FROM events)) days_off) day_ranges
ON prev_date_time < start_of_next_day AND Date_time >= start_of_day
GROUP BY User_id,start_of_day;
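To make the overlap logic of the query easier to follow, here is a minimal Python sketch of the same idea: pair each clock-out with the preceding clock-in, then cut the resulting interval at every midnight and add the pieces to the day they fall on. The names `events`, `seconds_per_day` and `hours_worked` are made up for this illustration and are not part of the query; the data is the sample from the question for a single user.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample punches for one user: (Date_time, in_out), 1 = clock in, 0 = clock out.
events = [
    ("2022-08-20 09:30:00", 1),
    ("2022-08-20 21:30:00", 0),
    ("2022-08-21 20:30:00", 1),
    ("2022-08-22 08:30:00", 0),
    ("2022-08-22 09:30:00", 1),
    ("2022-08-22 14:30:00", 0),
    ("2022-08-23 12:30:00", 1),
    ("2022-08-25 09:30:00", 0),
    ("2022-08-25 12:30:00", 1),
]


def seconds_per_day(rows):
    """Sum worked seconds per calendar day, splitting shifts at midnight."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), in_out) for ts, in_out in rows
    )
    totals = defaultdict(int)
    # Look at each row together with its predecessor and keep only
    # (clock in -> clock out) pairs, like prev_in_out = 1 and in_out = 0.
    for (prev_ts, prev_io), (ts, io) in zip(parsed, parsed[1:]):
        if prev_io != 1 or io != 0:
            continue
        start = prev_ts
        while start < ts:
            midnight = datetime.combine(start.date(), datetime.min.time())
            end = min(ts, midnight + timedelta(days=1))
            totals[start.date().isoformat()] += int((end - start).total_seconds())
            start = end
    return dict(totals)


hours_worked = seconds_per_day(events)
for day, seconds in sorted(hours_worked.items()):
    print(day, timedelta(seconds=seconds))
```

A trailing clock-in without a matching clock-out (the last row of the sample) is simply ignored in this sketch.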
I encountered a problem using sqlfiddle.com: it returned 00:00:00 when, for example, the time difference was exactly 24 hours (regardless of whether I used timediff or sec_to_time). I haven't seen this problem on MySQL 8 or on db-fiddle.com (using MySQL 5.6), so you may have to work around it.
EDIT: rewrote completely to solve the problem in MySQL 5.6 as requested by the OP.
EDIT #2: Updated the query to take sorting and grouping into account.
EDIT #3: changed initial assignment of the variables.
Using MySQL, I'm trying to get the number of active users I have in any given month. I have a table with ActivationDate and TerminationDate columns, and if the month being counted is after the ActivationDate and TerminationDate is null, then the user is active and should be counted. I would like to summarize these amounts by month. I'm thinking I could just sum each side and calculate the total but breaking that down won't give me a running total. I've tried with window functions, but I don't have enough experience with them to know exactly what I'm doing wrong and I'm not certain how to ask the right question.
So for instance, if I have the following data...
UserId ActivationDate TerminationDate
1 2020-01-01 null
2 2020-01-15 null
3 2020-01-20 2020-01-30
4 2020-02-01 null
5 2020-02-14 2020-02-27
6 2020-02-15 2020-02-28
7 2020-03-02 null
8 2020-03-05 null
9 2020-03-20 2020-03-21
I would like my results to be similar to:
2020-01 2 (there are 2 active users, since one signed up but cancelled before the end of the month)
2020-02 3 (2 from the previous month, plus 1 that signed up this month and is still active)
2020-03 5 (3 from previous, 2 new, 1 cancellation)
You can unpivot, then aggregate and sum. In MySQL 8.0.14 or higher, you can use a lateral join:
select date_format(x.dt, '%Y-%m-01') as dt_month,
sum(sum(cnt)) over(order by date_format(x.dt, '%Y-%m-01')) as cnt_active_users
from mytable t
cross join lateral (
select t.activationdate as dt, 1 as cnt
union all select t.terminationdate, -1
) x
where x.dt is not null
group by dt_month
order by dt_month
In earlier 8.x versions:
select date_format(x.dt, '%Y-%m-01') as dt_month,
sum(sum(cnt)) over(order by date_format(x.dt, '%Y-%m-01')) as cnt_active_users
from (
select activationdate as dt, 1 as cnt from mytable
union all select terminationdate, -1 from mytable
) x
where x.dt is not null
group by dt_month
order by dt_month
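As a rough illustration of the "unpivot, then aggregate and sum" idea, the following Python sketch turns every activation into a +1 event and every termination into a -1 event, sums the events per month, and then builds a running total. The names `users`, `active_users_by_month` and `monthly_active` are invented for the example; the rows are the sample data from the question.

```python
from collections import Counter
from itertools import accumulate

# (ActivationDate, TerminationDate) per user; None stands for NULL.
users = [
    ("2020-01-01", None),
    ("2020-01-15", None),
    ("2020-01-20", "2020-01-30"),
    ("2020-02-01", None),
    ("2020-02-14", "2020-02-27"),
    ("2020-02-15", "2020-02-28"),
    ("2020-03-02", None),
    ("2020-03-05", None),
    ("2020-03-20", "2020-03-21"),
]


def active_users_by_month(rows):
    """Running count of active users per month (YYYY-MM)."""
    delta = Counter()
    for activated, terminated in rows:
        delta[activated[:7]] += 1          # unpivoted activation event
        if terminated is not None:
            delta[terminated[:7]] -= 1     # unpivoted termination event
    months = sorted(delta)
    running = accumulate(delta[month] for month in months)
    return dict(zip(months, running))


monthly_active = active_users_by_month(users)
print(monthly_active)
```

As with the queries, a month in which nobody activates or terminates produces no row here; a calendar of months would be needed to fill such gaps.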
You don't say what version of MySQL. If you're using 8.0, this should work:
create table userdates (
UserId int not null,
ActivationDate date not null,
TerminationDate date null
);
insert into userdates (UserId, ActivationDate, TerminationDate)
values
(1, cast("2020-01-01" as date), null )
, (2, cast("2020-01-15" as date), null )
, (3, cast("2020-01-20" as date), cast("2020-01-30" as date))
, (4, cast("2020-02-01" as date), null )
, (5, cast("2020-02-14" as date), cast("2020-02-27" as date))
, (6, cast("2020-02-15" as date), cast("2020-02-28" as date))
, (7, cast("2020-03-02" as date), null )
, (8, cast("2020-03-05" as date), null )
, (9, cast("2020-03-20" as date), cast("2020-03-21" as date))
, (10, cast("2020-07-20" as date), null)
, (11, cast("2019-09-12" as date), cast("2019-09-14" as date));
WITH RECURSIVE d (dt)
AS (
SELECT cast("2019-01-01" as date)
UNION ALL
SELECT date_add(dt, interval 1 month)
FROM d
WHERE dt < cast("2020-12-01" as date)
)
select d.dt
, count(distinct ud.UserId) as UserCount
from userdates ud
right outer join d on d.dt >= date_format(ud.ActivationDate, '%Y-%m-01')
and (d.dt <= ud.TerminationDate or ud.TerminationDate is null)
group by d.dt;
I have a database table (raw_data) where there are multiple rows. I am looking to count the number of rows between a given time interval (9:25:00 and 9:29:59) by grouping the rows if the time difference between each row is less than or equal to 2 seconds.
For example:
EventId Date Time
1 2019/10/16 9:27:08
2 2019/10/16 9:27:11
3 2019/10/16 9:27:37
4 2019/10/16 9:27:40
5 2019/10/16 9:27:45
6 2019/10/16 9:27:45
7 2019/10/16 9:27:45
8 2019/10/16 9:27:57
The data in this snippet should yield a count of 6 (when counting items that are less than 2 seconds from each other), i.e. if an item is less than 2 seconds from the next row, chances are it's the same event and it is therefore grouped together.
Much appreciated
Have attempted queries like:
(found at: MySQL grouping results by time periods)
SELECT count(*)
FROM
(
SELECT a.starttime AS ThisTimeStamp, MIN(b.starttime) AS NextTimeStamp
FROM raw_data a
INNER JOIN raw_data b
ON a.starttime < b.starttime
and a.startdate = b.startdate
where a.startdate ='2019-10-16'
and a.starttime >= '09:27:00' and a.starttime < '09:28:00'
and b.startdate ='2019-10-16'
and b.starttime >= '09:27:00' and b.starttime < '09:28:00'
GROUP BY a.starttime
) Sub1
WHERE Sub1.ThisTimeStamp < (Sub1.NextTimeStamp - 2)
I purposefully hard-coded the dates and times and compared the results manually, but the result always ends up being different from the manual count.
With this table.
CREATE TABLE table2
(`EventId` int, `Date` date, `Time` time)
;
INSERT INTO table2
(`EventId`, `Date`, `Time`)
VALUES
(1, '2019-10-16', '9:27:08'),
(2, '2019-10-16', '9:27:11'),
(3, '2019-10-16', '9:27:37'),
(4, '2019-10-16', '9:27:40'),
(5, '2019-10-16', '9:27:45'),
(6, '2019-10-16', '9:27:45'),
(7, '2019-10-16', '9:27:45'),
(8, '2019-10-16', '9:27:57')
;
And this select statement
SELECT
EventId,`Date`, `Time`
FROM
(Select
EventId,`Date`, `Time`
,if (TIMESTAMPDIFF(SECOND,@date_time,STR_TO_DATE(CONCAT(`Date`, ' ', `Time`), '%Y-%m-%d %H:%i:%s') ) > 2
,1,-1) inintrvall
,@date_time := STR_TO_DATE(CONCAT(`Date`, ' ', `Time`), '%Y-%m-%d %H:%i:%s')
From table2,
(SELECT @date_time:= (SELECT min(STR_TO_DATE(CONCAT(`Date`, ' ', SUBTIME( `Time`, "5")), '%Y-%m-%d %H:%i:%s') )
FROM table2)) ti
order by `Date` ASC, `Time` ASC) t1
WHERE inintrvall = 1
order by `Date` ASC, `Time` ASC;
You will get 6 rows
EventId Date Time
1 2019-10-16 09:27:08
2 2019-10-16 09:27:11
3 2019-10-16 09:27:37
4 2019-10-16 09:27:40
5 2019-10-16 09:27:45
8 2019-10-16 09:27:57
GROUP BY will not work on time intervals, hence this little algorithm.
I check for every row whether the prior row's datetime is more than 2 seconds older.
If so, the row is marked with 1, otherwise with -1.
The ugly part is building the actual datetime in order to calculate the time difference correctly, for example when a new day begins.
For these purposes it would be better to store the value directly as a timestamp or datetime.
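The same marking step can be sketched in plain Python, which may make it easier to check the count by hand. This is only an illustration under the assumption that a row starts a new group when it is more than 2 seconds after the previous row; `rows`, `count_groups` and `group_count` are names made up for the example.

```python
from datetime import datetime, timedelta

# (Date, Time) pairs from the sample table.
rows = [
    ("2019-10-16", "09:27:08"),
    ("2019-10-16", "09:27:11"),
    ("2019-10-16", "09:27:37"),
    ("2019-10-16", "09:27:40"),
    ("2019-10-16", "09:27:45"),
    ("2019-10-16", "09:27:45"),
    ("2019-10-16", "09:27:45"),
    ("2019-10-16", "09:27:57"),
]


def count_groups(pairs, gap_seconds=2):
    """Count rows that are more than gap_seconds after the previous row."""
    stamps = sorted(
        datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M:%S")
        for day, clock in pairs
    )
    gap = timedelta(seconds=gap_seconds)
    count = 0
    previous = None
    for stamp in stamps:
        # The very first row always opens a group.
        if previous is None or stamp - previous > gap:
            count += 1
        previous = stamp
    return count


group_count = count_groups(rows)
print(group_count)
```

Combining date and time into one datetime up front is what keeps the comparison correct across a day boundary.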
I am trying to calculate data from my database but first I've noticed strange behavior from the results I get, second, I have trouble making a request that take into account refills.
I have a table with :
Name - DateTime - content
I want to group by day the rows and select the difference of the number to have the consumption.
For example :
Name - DateTime - Content
Foo - 22-04-2018 6:00 - 120
Foo - 22-04-2018 10:00 - 119
Foo - 22-04-2018 16:00 - 118
The content has decreased, the result should be -2.
Output of my request = -2
Another example :
Name - DateTime - Content
Foo - 23-04-2018 6:00 - 50
Foo - 23-04-2018 10:00 - 90
Foo - 23-04-2018 16:00 - 120
Here we can notice that the number has increased. It means that instead of a consumption, we have refilled the reserve and the content has increased.
The result should be -70.
Output of my request : 30
My request :
SELECT day,
Abs(Sum(diffn)) AS totN
FROM (SELECT Date(datetime) AS day,
Max(content) - Min(content) AS diffN
FROM logs
WHERE NAME = 'Foo'
AND datetime >= '2018-04-22 00:00:00'
AND datetime <= '2018-04-23 00:00:00'
GROUP BY Date(datetime)) a
GROUP BY day;
But for the second example I have 30 as a result instead of 70, I don't know why...
I would like your help to change my request and take refills into account so that I get the results I want.
Thanks!
You need to determine the sign by comparing the highest and the lowest value, taking the time (hour) into account. I'm using a CASE expression with two subqueries here.
You may need to reorder year-month-day, because I'm using the German datetime format.
SET @datetime = '2018-04-22';
SELECT date(datetime) as day
,(CASE WHEN
(SELECT content FROM logs WHERE date(datetime) = @datetime ORDER BY datetime LIMIT 1)
>
(SELECT content FROM logs WHERE date(datetime) = @datetime ORDER BY datetime desc LIMIT 1)
THEN min(content) - max(content)
ELSE max(content) - min(content) END) as diffN
FROM logs
WHERE Name = 'Foo' AND date(datetime) = @datetime
GROUP BY day(datetime)
ORDER BY datetime
;
This should do the job:
SELECT day(datetime) as day, max(content) - min(content) as diffN
FROM logs
WHERE Name = 'Foo'
AND datetime >= '2018-04-23 00:00:00'
AND datetime <= '2018-04-24 00:00:00'
GROUP BY day(datetime)
Also, change the date filters: they should be between the 23rd and the 24th.
It might be that you need to establish the first and last datetime and their associated content. For example
drop table if exists t;
create table t (name varchar(3), dt datetime, content int);
insert into t values
('Foo' , '2018-04-22 06:00:00', 120),
('Foo' , '2018-04-22 10:00:00', 119),
('Foo' , '2018-04-22 16:00:00', 118),
('Foo' , '2018-04-23 06:00:00', 50),
('Foo' , '2018-04-23 10:00:00', 90),
('Foo' , '2018-04-23 16:00:00', 120);
select s.name,lastinday,firstinday,lastinday - firstinday
from
(
select name,dt, content lastinday
from t
where dt = (Select max(dt) from t t1 where t1.name = t.name and date(t1.dt) = date(t.dt))
) s
join
(
select name,dt, content firstinday
from t
where dt = (Select min(dt) from t t1 where t1.name = t.name and date(t1.dt) = date(t.dt))
) t
on t.name = s.name and date(t.dt) = date(s.dt);
+------+-----------+------------+------------------------+
| name | lastinday | firstinday | lastinday - firstinday |
+------+-----------+------------+------------------------+
| Foo | 118 | 120 | -2 |
| Foo | 120 | 50 | 70 |
+------+-----------+------------+------------------------+
2 rows in set (0.00 sec)
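For comparison, here is a small Python sketch of the same "last reading of the day minus first reading of the day" idea. It is only an illustration on the sample rows; `logs`, `daily_change` and `changes` are names invented for the example.

```python
# (name, dt, content) rows, same sample data as in the table above.
logs = [
    ("Foo", "2018-04-22 06:00:00", 120),
    ("Foo", "2018-04-22 10:00:00", 119),
    ("Foo", "2018-04-22 16:00:00", 118),
    ("Foo", "2018-04-23 06:00:00", 50),
    ("Foo", "2018-04-23 10:00:00", 90),
    ("Foo", "2018-04-23 16:00:00", 120),
]


def daily_change(rows):
    """Return {(name, day): last content of the day - first content of the day}."""
    first_last = {}
    # ISO-formatted timestamps sort chronologically as plain strings.
    for name, dt, content in sorted(rows, key=lambda row: row[1]):
        key = (name, dt[:10])
        if key not in first_last:
            first_last[key] = [content, content]
        else:
            first_last[key][1] = content
    return {key: last - first for key, (first, last) in first_last.items()}


changes = daily_change(logs)
print(changes)
```

A negative value means the content went down over the day (consumption); a positive value means it went up (refill).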
Why are you grouping a second time?
Ideally this should work:
SELECT Date(datetime) AS day,
Max(content) - Min(content) AS diffN
FROM logs
WHERE NAME = 'Foo'
AND datetime >= '2018-04-22 00:00:00'
AND datetime <= '2018-04-23 00:00:00'
GROUP BY Date(datetime)
The result of this query will contain only 2 rows: one for the 22nd and one for the 23rd. There is no need to group it again by day.
I have a question that makes me feel silly!
I have to compute some stats on the use of my apps.
I have a table called customer_point:
id int(11) auto_increment
id_customer int(11)
type_point int(11)
date timestamp CURRENT_TIMESTAMP
I want to run this query for the entire month (with a row for each night ;) ):
SELECT COUNT( id_customer ) , type_point, date(date)
FROM customer_point
WHERE date BETWEEN "2014-06-01 20:00:00" AND "2014-06-02 10:00:00"
GROUP BY type_point, date;
I'm nearly sure that I'm missing a crucial point, but I can't find which one.
Thank you very much for reading!
Bye,
edit :
Sample :
INSERT INTO `customer_point` ( `id` , `id_customer` , `type_point`, `date` )
VALUES ( '', '15', '1', '2014-06-01 22:50:00'), ( '', '15', '1', '2014-06-01 23:52:00'), ( '', '15', '1', '2014-06-02 9:50:00'), ( '', '15', '1', '2014-06-30 22:50:00'), ( '', '15', '1', '2014-06-30 23:52:00'), ( '', '15', '1', '2014-07-01 02:50:00'), ( '', '15', '1', '2014-07-01 09:50:00');
result :
1, 3, 2014-06-01
1, 4, 2014-06-30
I hope this will help everybody understand my problem :/
If you just want counts of the actual data, check that the date is within the range you are interested in and that the time is at night (i.e. later than 8pm or earlier than 10am, it would seem from your SQL):-
SELECT type_point, date(customer_point.date) AS aDate, COUNT( id_customer )
FROM customer_point
WHERE DATE(customer_point.date) BETWEEN "2014-06-01" AND "2014-06-30"
AND (TIME(customer_point.date) >= '20:00:00' OR TIME(customer_point.date) <= '10:00:00')
GROUP BY type_point, aDate;
To get a row per day irrespective of whether there is any data for that day (i.e. a count of zero if there is no data), you need to generate a list of dates and then LEFT JOIN your data to it.
Something like this:-
SELECT sub0.aDate, type_point, COUNT( id_customer )
FROM
(
SELECT DATE_ADD('2014-06-01', INTERVAL units.i + tens.i * 10 DAY) AS aDate
FROM
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
CROSS JOIN
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3) tens
) sub0
LEFT OUTER JOIN customer_point
ON sub0.aDate = date(customer_point.date)
WHERE sub0.aDate BETWEEN "2014-06-01" AND "2014-06-30"
GROUP BY sub0.aDate, type_point;
You would also probably need to generate a list of type_point values.
EDIT - to go with the updated question, can you just subtract 10 hours from the date / time. So 10am on the 1st July becomes midnight on the 30th June?
SELECT type_point, date(DATE_ADD(customer_point.date, INTERVAL -10 HOUR)) AS aDate, COUNT( id_customer )
FROM customer_point
WHERE DATE(DATE_ADD(customer_point.date, INTERVAL -10 HOUR)) BETWEEN "2014-06-01" AND "2014-06-30"
AND (TIME(customer_point.date) >= '20:00:00' OR TIME(customer_point.date) <= '10:00:00')
GROUP BY type_point, aDate;
SQL fiddle:-
http://www.sqlfiddle.com/#!2/ddc95/2
The issue with this is whether items from before 10am on the 1st of June count as dates for May or for June?
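The effect of subtracting 10 hours can be checked with a short Python sketch: every timestamp up to 10:00 falls back onto the previous calendar day, so a whole night ends up under the date on which it started. This is only an illustration on the sample rows from the question; `points`, `counts_per_night` and `night_counts` are hypothetical names, and no month filter is applied here.

```python
from collections import Counter
from datetime import datetime, time, timedelta

# (type_point, date) rows from the sample INSERT in the question.
points = [
    (1, "2014-06-01 22:50:00"),
    (1, "2014-06-01 23:52:00"),
    (1, "2014-06-02 09:50:00"),
    (1, "2014-06-30 22:50:00"),
    (1, "2014-06-30 23:52:00"),
    (1, "2014-07-01 02:50:00"),
    (1, "2014-07-01 09:50:00"),
]


def counts_per_night(rows, shift_hours=10):
    """Count rows per (type_point, night), where a night runs 20:00 to 10:00."""
    counts = Counter()
    for type_point, stamp in rows:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        # Keep night-time rows only: at or after 20:00, or at or before 10:00.
        if not (moment.time() >= time(20, 0) or moment.time() <= time(10, 0)):
            continue
        # Shifting back moves the morning hours onto the previous date.
        night = (moment - timedelta(hours=shift_hours)).date().isoformat()
        counts[(type_point, night)] += 1
    return dict(counts)


night_counts = counts_per_night(points)
print(night_counts)
```

On the sample data this gives 3 rows for the night of 2014-06-01 and 4 rows for the night of 2014-06-30, which matches the result asked for in the question.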
Using mysql you even could do
WHERE date LIKE "2014-06-%"
Edit: You need it exactly from 20:00, and then you have to take into account the first day of the next month until 22:00...
OK, then just subtract those 20 hours from the date:
SELECT DATE_SUB(column, INTERVAL 20 HOUR)....
Finally:
SELECT COUNT( id_customer ) , type_point, DATE(DATE_SUB(date, INTERVAL 20 HOUR)) as mydate
FROM customer_point
WHERE DATE_SUB(date, INTERVAL 20 HOUR) LIKE "2014-06-%"
GROUP BY type_point, mydate;