Group by week returning strange intervals - mysql

For some odd reason, group by week is returning odd date intervals with a datetime field.
"Completed" is a datetime field, and using this query:
SELECT
Completed,
COUNT( DISTINCT Table1.ID ) AS ActivityCount
FROM Table1
JOIN Table1Items
ON Table1.ID = Table1Items.ID
JOIN database_database.Table2
ON Table2.Item = Table1Items.Item
WHERE Completed != '0000-00-00' AND Completed >= '2012-09-25' AND Completed <= '2012-10-25'
GROUP BY WEEK(Completed)
I'm getting:
Completed ActivityCount CompletedTimestamp
2012-09-25 300 2012-09-25 00:00:00
2012-10-02 764 2012-10-02 00:00:00
2012-10-08 379 2012-10-08 00:00:00
2012-10-17 659 2012-10-17 00:00:00
2012-10-22 382 2012-10-22 00:00:00
some are 7 days apart, others are 6 days apart, others are 5.... and one is 9?
Why does it group the dates by such strange intervals instead of just 7 days?

The WEEK function does not measure the difference between dates.
It returns the week number of a date. If you group by it, each group contains every date from the start to the end of that week and in between, and the Completed value shown for a group is just one of those dates. The gap between the dates shown for two consecutive groups can therefore be more or less than 7 days.
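To see the effect on the dates from the question, the week number of each displayed date can be computed directly. This is a rough sketch in Python rather than SQL. It assumes MySQL's default WEEK() mode 0 (weeks start on Sunday), which Python's `%U` format code mirrors; the helper name `week_number` is made up for illustration.

```python
from datetime import date

def week_number(d: date) -> int:
    """Week of the year with Sunday as the first day (like WEEK(d) in mode 0)."""
    return int(d.strftime("%U"))

# The Completed dates shown in the question's result set.
shown = [date(2012, 9, 25), date(2012, 10, 2), date(2012, 10, 8),
         date(2012, 10, 17), date(2012, 10, 22)]

# One week number per displayed row.
weeks = [week_number(d) for d in shown]

# Day gaps between consecutive displayed dates.
gaps = [(b - a).days for a, b in zip(shown, shown[1:])]

print(weeks)
print(gaps)
```

The week numbers come out consecutive even though the day gaps are uneven, because each displayed date sits at a different position inside its own week.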

The answer, as alluded to by juergen d, was to aggregate the date column. Use MIN or MAX depending on whether you want the first or the last date of each week to represent the group, e.g.:
SELECT
MIN(Completed),
COUNT( DISTINCT Table1.ID ) AS ActivityCount
FROM Table1
JOIN Table1Items
ON Table1.ID = Table1Items.ID
JOIN database_database.Table2
ON Table2.Item = Table1Items.Item
WHERE Completed != '0000-00-00' AND Completed >= '2012-09-25' AND Completed <= '2012-10-25'
GROUP BY WEEK(Completed)
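Note that MIN(Completed) returns the earliest date that actually occurs in each week, which is not necessarily the first day of that week. If evenly spaced labels are wanted, one option is to label each group with the start of its week. The Python sketch below illustrates the difference; the function names and the sample dates are hypothetical, and weeks are assumed to start on Sunday as in WEEK() mode 0.

```python
from collections import defaultdict
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Sunday on or before d (weeks assumed to start on Sunday)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)

def summarize_by_week(dates):
    """Return {week_start: (earliest_date_in_week, row_count)}."""
    groups = defaultdict(list)
    for d in dates:
        groups[week_start(d)].append(d)
    return {start: (min(ds), len(ds)) for start, ds in sorted(groups.items())}

# Hypothetical completion dates, not taken from the question's table.
completed = [date(2012, 10, 17), date(2012, 10, 15),
             date(2012, 10, 19), date(2012, 10, 22)]

summary = summarize_by_week(completed)
for start, (earliest, count) in summary.items():
    print(start, earliest, count)
```

The dictionary keys are always exactly 7 days apart for consecutive weeks, while the earliest date per week depends on which rows happen to exist.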


Get occupancy per every 15-minute slot

We have a room where we can only have XX number of people inside due to current limitations. They come at different times and stay for a different length of time.
I'm trying to get a sum of people currently inside for each 15-min period for a specific date. The server is MySQL 8.0.21 deployed on AWS RDS.
MySQL 8.0 Table: Booking
ID  Name  PartySize  Date        BookedFrom           BookedTo
1   John  2          2021-01-01  2021-01-01 08:30:00  2021-01-01 10:00:00
2   Mary  4          2021-01-01  2021-01-01 09:00:00  2021-01-01 11:00:00
3   Nick  3          2021-01-01  2021-01-01 10:30:00  2021-01-01 12:30:00
I also have a "helper table" with one row for each 15-minute slot in a 24-hour day:
MySQL Table: Timeslot
ID  Time
1   00:00:00
2   00:15:00
3   00:30:00
35  08:30:00
37  09:00:00
38  09:15:00
For example, when I run the query below, I get the correct count (6 people) for 09:15. What is the most efficient way to get this result for each 15-minute slot? Please note that while the BookedTo (datetime field) value may be past midnight, I will only ever be making date-specific queries.
SELECT
t.id, b.date, t.time, SUM(b.partysize) AS total
FROM
booking b,
timeslot t
WHERE
b.date = '2021-01-01'
AND t.time = '09:15:00'
AND b.bookedfrom <= '2021-01-01 09:15:00'
AND b.bookedto >= '2021-01-01 09:15:00'
Looking for this output for all times (including zeros):
Slot_ID  Date        Time      Total
33       2021-01-01  08:00:00  0
34       2021-01-01  08:15:00  0
35       2021-01-01  08:30:00  2
36       2021-01-01  08:45:00  2
37       2021-01-01  09:00:00  6
38       2021-01-01  09:15:00  6
SELECT
t.id as slot_id,
coalesce(b.date, '2021-01-01') as date,
t.time,
coalesce(sum(b.partysize),0) as total
FROM
timeslot t
LEFT JOIN booking b
ON t.time >= TIME(b.bookedfrom) AND t.time < TIME(b.bookedto) AND b.date = '2021-01-01'
WHERE
t.time BETWEEN '08:00:00' AND '17:00:00'
GROUP BY
t.id,
b.date,
t.time
Now, you have some other confusing requirements, but basically this works because the time range in the join condition lets multiple rows of timeslot match a single row of booking.
The confusing part is that you say it's only for 8am-5pm, but also that bookings might extend to the next day. Does that mean a booking can start at 4pm and finish at 9am the next day? If so, you might need to change AND b.date = '2021-01-01' to something more like AND (DATE(b.bookedfrom) = '2021-01-01' OR DATE(b.bookedto) = '2021-01-01') ...
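The matching logic of the LEFT JOIN above can be checked against the sample data outside the database. The following is a minimal Python sketch, not a replacement for the query: it assumes the same half-open rule as the join condition (a party counts from BookedFrom up to, but not including, BookedTo), and the function name `occupancy` is made up for illustration.

```python
from datetime import datetime, timedelta

# Sample rows from the Booking table: (name, party_size, booked_from, booked_to).
bookings = [
    ("John", 2, datetime(2021, 1, 1, 8, 30), datetime(2021, 1, 1, 10, 0)),
    ("Mary", 4, datetime(2021, 1, 1, 9, 0), datetime(2021, 1, 1, 11, 0)),
    ("Nick", 3, datetime(2021, 1, 1, 10, 30), datetime(2021, 1, 1, 12, 30)),
]

def occupancy(bookings, first_slot, last_slot, step=timedelta(minutes=15)):
    """Return {slot_start: people inside}, keeping slots with zero people."""
    totals = {}
    slot = first_slot
    while slot <= last_slot:
        # A booking covers the slot if it started at or before the slot
        # and has not yet ended at the slot start.
        totals[slot] = sum(size for _, size, start, end in bookings
                           if start <= slot < end)
        slot += step
    return totals

totals = occupancy(bookings,
                   datetime(2021, 1, 1, 8, 0),
                   datetime(2021, 1, 1, 17, 0))
for slot, total in totals.items():
    print(slot.time(), total)
```

Iterating over the slots first and summing the matching bookings second is what keeps the empty slots in the output, which is the role the LEFT JOIN from timeslot plays in the query.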
Use a CTE that returns the specific date for which you want the results (which may not be the same as the column Date in Booking) and CROSS JOIN it to Timeslot.
The result should then be LEFT JOINed to Booking and aggregated:
WITH cte(Date) AS (SELECT '2021-01-01')
SELECT t.ID, t.time, c.Date,
COALESCE(SUM(b.PartySize), 0) Total
FROM cte c CROSS JOIN Timeslot t
LEFT JOIN Booking b
ON b.BookedFrom <= CONCAT(c.Date, ' ', t.time)
AND b.BookedTo >= CONCAT(c.Date, ' ', ADDTIME(t.time, '00:15:00'))
WHERE t.time BETWEEN '08:00:00' AND '17:00:00'
GROUP BY t.ID, c.Date, t.time
Since BookedFrom and BookedTo may not contain the same date, it is not safe to compare only the time parts of the 2 columns to the column time of Timeslot.
This is why all these conditions in the ON clause are needed.
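A small Python sketch of the problem with time-only comparison. The booking below is hypothetical (it is not in the question's sample data) and simply runs past midnight:

```python
from datetime import datetime, time

# Hypothetical booking that starts before midnight and ends after it.
booked_from = datetime(2021, 1, 1, 23, 0)
booked_to = datetime(2021, 1, 2, 1, 0)

# The 23:30 slot on the booking's start date.
slot_time = time(23, 30)
slot_start = datetime.combine(booked_from.date(), slot_time)

# Time parts only: 23:30 is not earlier than 01:00, so the slot is missed.
time_only_match = booked_from.time() <= slot_time < booked_to.time()

# Full datetimes: 2021-01-01 23:30 lies inside the booking.
full_match = booked_from <= slot_start < booked_to

print(time_only_match, full_match)
```

Building a full datetime from the query date and the slot time, as the CONCAT expressions in the ON clause do, avoids the false negative.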
This query works well. If you want all slots for all dates, you will also need a date table (ideally cross joined with timeslot).
Use an inner join if you only want matching dates and timeslots.
SELECT t.id as slot_id
, b.date
, t.time as slot
, sum(ifnull(party_size,0)) as total
FROM test.timeslot t
LEFT JOIN test.booking b
ON t.time BETWEEN time(b.booked_from) AND time(b.booked_to)
GROUP BY t.id
, b.date
, t.time;
for all timeslots and selected dates:
https://www.db-fiddle.com/f/gLt2Fs8HTDUakMahZHxcTi/0
for matching timeslots and dates:
SELECT t.id as slot_id
, b.date
, t.time as slot
, sum(ifnull(party_size,0)) as total
FROM test.timeslot t
JOIN test.booking b
ON t.time BETWEEN time(b.booked_from) AND time(b.booked_to)
GROUP BY t.id
, b.date
, t.time;

Summing data for last 7 day look back window

I want a query that returns, for each date, the sum of impressions over the last 7 days.
The output should contain the date and the 7-day look-back sum of impressions for that date.
E.g., I have a table tblFactImps with the data below:
dateFact impressions id
2015-07-01 4022 30
2015-07-02 4021 33
2015-07-03 4011 34
2015-07-04 4029 35
2015-07-05 1023 39
2015-07-06 3023 92
2015-07-07 8027 66
2015-07-08 2024 89
I need output with 2 columns:
dateFact impressions_last_7
The query I have so far:
select dateFact, sum(if(datediff(curdate(), dateFact)<=7, impressions,0)) impressions_last_7 from tblFactImps group by dateFact;
Thanks!
If your fact table is not too big, then a correlated subquery is a simple way to do what you want:
select i.dateFact,
(select sum(i2.impressions)
from tblFactImps i2
where i2.dateFact >= i.dateFact - interval 6 day
and i2.dateFact <= i.dateFact
) as impressions_last_7
from tblFactImps i;
You can achieve this by LEFT OUTER JOINing the table with itself on a date range, and summing the impressions grouped by date, as follows:
SELECT
t1.dateFact,
SUM(t2.impressions) AS impressions_last_7
FROM
tblFactImps t1
LEFT OUTER JOIN
tblFactImps t2
ON
t2.dateFact BETWEEN
DATE_SUB(t1.dateFact, INTERVAL 6 DAY)
AND t1.dateFact
GROUP BY
t1.dateFact;
This should give you a sliding 7-day sum for each date in your table.
Assuming your dateFact column is indexed, this query should also be relatively fast.
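As a cross-check, the sliding window can be computed for the question's sample rows in a few lines of Python. This is only a sketch of the same rule (each date plus the 6 days before it); the function name `last_7_days` is made up for illustration.

```python
from datetime import date, timedelta

# Rows from tblFactImps in the question: dateFact -> impressions.
impressions = {
    date(2015, 7, 1): 4022,
    date(2015, 7, 2): 4021,
    date(2015, 7, 3): 4011,
    date(2015, 7, 4): 4029,
    date(2015, 7, 5): 1023,
    date(2015, 7, 6): 3023,
    date(2015, 7, 7): 8027,
    date(2015, 7, 8): 2024,
}

def last_7_days(data):
    """Return {day: sum of values from day - 6 days through day, inclusive}."""
    result = {}
    for day in sorted(data):
        window_start = day - timedelta(days=6)
        result[day] = sum(value for d, value in data.items()
                          if window_start <= d <= day)
    return result

sums = last_7_days(impressions)
for day, total in sums.items():
    print(day, total)
```

Both bounds of the window matter: without the upper bound, every date would also pick up impressions from later dates.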

Calculating 'on time performance' between scheduled and actual times

I have a table of scheduled times for an event and a second of actual times the event happened, for example:
Table A
ID Date Scheduled
1 2014-09-01 07:05:00
2 2014-09-02 07:05:00
3 2014-09-03 08:05:00
4 2014-09-04 07:10:00
Table B
ID Date Actual
1 2014-09-01 07:10:00
2 2014-09-02 07:16:00
3 2014-09-03 08:00:00
4 2014-09-04 14:15:00
If we assume that anything within 10 minutes of schedule is considered 'on time', is there a way to return the 'on time performance' using MySQL? In the data set above, the on time performance would be 50%, since two of the events happened within 10 minutes of the schedule.
Supplementary edit: If an event is early, that would also be considered on time
Yes. However, I don't understand why the date and time are in different columns. You just need to join the two tables together and do some conditional logic:
select avg(case when a.date = b.date and b.actual <= a.scheduled + interval 10 minute then 1
when b.date < a.date then 1
else 0
end) as OnTimePerformance
from tablea a join
tableb b
on a.id = b.id;
This doesn't handle the case where an event is scheduled on one day (say 11:55 p.m.) and the actual time is the next day (12:01 a.m.). Your data suggests this does not happen. This condition would be easier if the date and time were in a single column.
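To illustrate that last point, here is a minimal Python sketch that treats each scheduled and actual value as a single datetime. It assumes "on time" means the actual time is no later than 10 minutes after the scheduled time (so early events count); the function name `on_time_rate` is hypothetical.

```python
from datetime import datetime, timedelta

# Table A and Table B from the question, with date and time combined.
scheduled = {
    1: datetime(2014, 9, 1, 7, 5),
    2: datetime(2014, 9, 2, 7, 5),
    3: datetime(2014, 9, 3, 8, 5),
    4: datetime(2014, 9, 4, 7, 10),
}
actual = {
    1: datetime(2014, 9, 1, 7, 10),
    2: datetime(2014, 9, 2, 7, 16),
    3: datetime(2014, 9, 3, 8, 0),
    4: datetime(2014, 9, 4, 14, 15),
}

def on_time_rate(scheduled, actual, tolerance=timedelta(minutes=10)):
    """Fraction of events whose actual time is at most `tolerance` late."""
    ids = scheduled.keys() & actual.keys()
    if not ids:
        return None  # nothing to compare
    on_time = sum(actual[i] <= scheduled[i] + tolerance for i in ids)
    return on_time / len(ids)

rate = on_time_rate(scheduled, actual)
print(rate)
```

Because whole datetimes are compared, an event scheduled at 11:55 p.m. that happens at 12:01 a.m. the next day needs no special case.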
Here is another way of doing it
select
round((on_time/tot)*100) as performance
from
(
select
count(*) as tot,
sum(
case when
timestampdiff(minute,concat(t1.Date,' ',t1.Scheduled),concat(t2.Date,' ',t2.Actual)) < 10
then 1
end
) as on_time
from tableA t1
join tableB t2 on t1.id = t2.id
)p;

Mysql select dates that doesn't intersects with booked date ranges

I have a table of dates. Each date represents a task, and each task takes three days to complete.
I want to select all the unbooked dates that don't intersect with a booked task.
I've been trying and googling for three days now and I think it is time to ask for help.
date booked
=========== =======
2014-09-01 0
2014-09-02 1
2014-09-05 0
2014-09-10 1
2014-09-15 0
2014-09-16 0
2014-09-20 1
2014-09-25 0
The expected result:
date booked
=========== =======
2014-09-01 0
2014-09-15 0
2014-09-16 0
2014-09-25 0
You can use a left join on the same table (adding 3 days to the date column) with an IS NULL check to get the unbooked dates that do not intersect with another task's booked dates:
select t.*,
t1.date date1
from t
left join t t1 on(t.date = t1.date + interval 3 day)
where t.booked = 0 and t1.date is null
Refer to the following answer.
Detect overlapping date ranges from the same table
If you could rename date to Start_Date and add a column End_Date (Start_Date + 3), then adding NOT to the condition in the quoted answer will give you the rows that do not have overlapping date ranges.
select dr1.* from date_ranges dr1
inner join date_ranges dr2
where NOT (dr2.start > dr1.start -- start after dr1 is started
and dr2.start < dr1.end) -- start before dr1 is finished
From the results of the above query, you can then select the rows that have 0 in the booked column.
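For comparison, here is a small Python sketch of one possible reading of the question. It assumes an unbooked date is excluded when it falls on, or within three days after, a booked date. That rule is an assumption inferred from the expected result in the question, not something the question states, and the function name `free_dates` is made up.

```python
from datetime import date, timedelta

# Rows from the question: (date, booked flag).
rows = [
    (date(2014, 9, 1), 0),
    (date(2014, 9, 2), 1),
    (date(2014, 9, 5), 0),
    (date(2014, 9, 10), 1),
    (date(2014, 9, 15), 0),
    (date(2014, 9, 16), 0),
    (date(2014, 9, 20), 1),
    (date(2014, 9, 25), 0),
]

def free_dates(rows, duration=timedelta(days=3)):
    """Unbooked dates that are not inside any booked task's range.

    Assumption: a booked task starting on b blocks b through b + duration.
    """
    booked = [d for d, flag in rows if flag == 1]
    return [d for d, flag in rows
            if flag == 0
            and not any(b <= d <= b + duration for b in booked)]

result = free_dates(rows)
for d in result:
    print(d)
```

Under this reading 2014-09-05 is dropped because it falls inside the range blocked by the task booked on 2014-09-02.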

Finding gaps in concurrent date ranges - MySQL

I have a table like the one below. In reality there are 50,000 users, and a technically infinite number of ranges for each user. There is no limit on date gaps, starts, ends, overlaps, etc.
User From To
A 2011-01-03 2013-04-09
A 2012-04-16 2012-03-08
A 2012-12-11 2013-06-17
A 2013-07-17
A 2013-09-22 2013-12-24
B 2011-04-06 2013-01-02
B 2012-02-12 2012-02-14
B 2012-11-10 2013-03-16
B 2013-04-16
B 2013-04-22
I need to calculate the number of weekdays in 2013 not covered by these ranges for each user. The blank 'To' date means the range is ongoing.
In the example above it would be the number of weekdays between 2013-06-18 and 2013-07-16 for user A, and between 2013-03-17 and 2013-04-15 for B.
I have a lookup table of individual weekdays, but anything I do to the date ranges using min and max ends up giving me a 'solid' date range from 2013-01-01 to 2013-12-31.
I'm not bright....
Thank you.
SELECT users.User, COUNT(*)
FROM users
CROSS JOIN weekdays
LEFT JOIN userDates ON
userDates.User = users.User
AND userDates.From <= weekdays.date
AND (userDates.To IS NULL OR userDates.To >= weekdays.date)
WHERE weekdays.date >= '2013-01-01'
AND weekdays.date < '2014-01-01'
AND userDates.User IS NULL
GROUP BY users.User
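The query above pairs every user with every weekday in 2013 and keeps only the days that no range covers. The same idea can be sketched in Python against the sample data. This sketch assumes weekdays are Monday to Friday with no holiday calendar, treats a missing 'To' as open-ended, and uses a hypothetical function name `uncovered_weekdays`.

```python
from datetime import date, timedelta

# Ranges from the question; None means the range is still ongoing.
ranges = {
    "A": [
        (date(2011, 1, 3), date(2013, 4, 9)),
        (date(2012, 4, 16), date(2012, 3, 8)),
        (date(2012, 12, 11), date(2013, 6, 17)),
        (date(2013, 7, 17), None),
        (date(2013, 9, 22), date(2013, 12, 24)),
    ],
    "B": [
        (date(2011, 4, 6), date(2013, 1, 2)),
        (date(2012, 2, 12), date(2012, 2, 14)),
        (date(2012, 11, 10), date(2013, 3, 16)),
        (date(2013, 4, 16), None),
        (date(2013, 4, 22), None),
    ],
}

def uncovered_weekdays(user_ranges, year):
    """Count Monday-Friday days in `year` that no range covers."""
    count = 0
    day = date(year, 1, 1)
    while day.year == year:
        is_weekday = day.weekday() < 5  # Monday=0 ... Friday=4
        covered = any(start <= day and (end is None or end >= day)
                      for start, end in user_ranges)
        if is_weekday and not covered:
            count += 1
        day += timedelta(days=1)
    return count

gaps = {user: uncovered_weekdays(rs, 2013) for user, rs in ranges.items()}
print(gaps)
```

Testing each day against every range, instead of taking the MIN and MAX of the ranges, is what keeps the gaps from being swallowed into one solid interval.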