Only count working days in a DATEDIFF (MySQL) - mysql

So, next problem :'), I have the following query that @MatBailie provided to me here (thanks again!):
SELECT
taskname,
employee,
SUM(
DATEDIFF(
LEAST( enddate, '2023-12-31'),
GREATEST(startdate, '2023-01-01')
)
+1
) AS total_days
FROM
schedule
WHERE
startDate <= '2023-12-31'
AND
endDate >= '2023-01-01'
GROUP BY
employee,
taskname
This query will tell me how many days a certain employee has spent on a certain task in a given period of time, and it works great!
The next thing I would like to do, however, is to subtract non-working days from the SUM of DATEDIFFs for some of the tasks (e.g. when the task has "count_non_working_days = 0" in a reference table called 'activities').
For example, my schedule also keeps track of the number of days off every employee has taken (days off are also scheduled as tasks). But of course, days off that fall on a weekend or on a holiday should not be counted towards the total of days off a person has taken in a year. (Note that I did consider scheduling days off only on weekdays/non-holidays, but this is not a practical option in the scheduling software I use, because employees request leave from date A to date B, and this request is approved or denied as-is; they don't make 3 holiday requests excluding the weekends if they want to go on a vacation for 3 weeks, if you get my drift.)
So, if an employee goes on a vacation for 10 days, this is counted as 10 days off, but this holiday may have 1 or 2 weekends in it, so the sum of days off that the employee has taken should be 6, 7 or 8, and not 10. Furthermore, if it has a holiday such as Easter Monday in it (I have all dates of my holidays in a PHP array), this should also be subtracted.
I have tried the solutions mentioned here, but I couldn't get them to work (a) because those are in SQL server and (b) because they don't allow putting in an array of holidays, (c) nor allow toggling the subtraction on and off depending on the event type.
Here's my attempt of explaining what I'm trying to do in my pseudo-SQL:
SELECT
taskname,
employee,
IF( activities.count_non_working_days=1,
-- Just count the days that fall in the current year:
SUM(
DATEDIFF(
LEAST( enddate, '2023-12-31'),
GREATEST( startdate, '2023-01-01')
)
+ 1
) AS total_days,
-- Subtract the amount of saturdays, sundays and holidays:
SUM(
DATEDIFF(
LEAST( enddate, '2023-12-31'),
GREATEST( startdate, '2023-01-01')
)
- [some way of getting the amount of saturdays, sundays and holidays that fall within this date range]
+ 1
) AS total_days
)
FROM
schedule
LEFT JOIN
activities
ON activity.name = schedule.name
WHERE
startDate <= '2023-12-31'
AND
endDate >= '2023-01-01'
GROUP BY
employee,
taskname
I know the query above is probably faulty on so many levels, but I hope it clarifies what I'm trying to do.
Thanks once more for all the help!
Edit: basically I need something like this, but in MySQL and preferably with a toggle that turns the subtraction on or off depending on the task type.
Edit 2: To clarify: my schedule table holds ALL activities, including holidays. For example, some records may include:
employee     | taskname    | startDate  | endDate
-------------|-------------|------------|-----------
Mr. Anderson | Programming | 2023-01-02 | 2023-01-06
Mr. Anderson | Programming | 2023-01-09 | 2023-01-14
Mr. Anderson | Vacation    | 2023-01-14 | 2023-01-31
In another table, Programming is defined as "count_non_working_days=1", because working in the weekends should count, while Vacation is defined as "count_non_working_days=0", because taking a day off on the weekend should not count towards your total amount of days taken off.
The totals for this month should therefore state that:
Mr. Anderson has done Programming for 11 days (of which 1 was on a Saturday)
Mr. Anderson has taken 12 days off (because the 3 weekends in this period don't count as days off).

Create a calendar table with every date of interest (so, something like 2000-01-01 to 2099-01-01) and include columns such as is_working_day, which can be set to TRUE/FALSE or 1/0. Then you can update that column as necessary, and join on that table in your query to get the working dates that the employee has booked off.
In short, you count the relevant dates, rather than deducting the irrelevant dates.
SELECT
s.employee,
s.taskname,
COUNT(*) AS total_days
FROM
(
schedule AS s
INNER JOIN
activities AS a
ON a.taskname = s.taskname
)
INNER JOIN
calendar AS c
ON c.calendar_date >= s.startDate
AND c.calendar_date <= s.endDate
AND c.is_working_day >= 1 - a.count_non_working_days
WHERE
c.calendar_date >= '2023-01-01'
AND c.calendar_date <= '2023-12-31'
GROUP BY
s.employee,
s.taskname
Your calendar table can then also include flags such as is_weekend, is_bank_holiday, is_fubar, is_amazing, etc, and the is_working_day can be a computed column from those inputs.
Note on is_working_day filter...
WHERE
( count_non_working_day = 1 AND is_working_day IN (0, 1) )
OR
( count_non_working_day = 0 AND is_working_day IN ( 1) )
-- change to (1 - count_non_working_day)
WHERE
( (1 - count_non_working_day) = 0 AND is_working_day IN (0, 1) )
OR
( (1 - count_non_working_day) = 1 AND is_working_day IN ( 1) )
-- simplify
WHERE
( (1 - count_non_working_day) <= is_working_day )
OR
( (1 - count_non_working_day) <= is_working_day )
-- simplify
WHERE
( (1 - count_non_working_day) <= is_working_day )
Demo: https://dbfiddle.uk/YAmpLmVE
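To sanity-check the "count the relevant dates" idea outside the database, here is a minimal Python sketch of the same logic. It is not the answer's SQL: the HOLIDAYS set, the function names and the in-memory schedule list are assumptions made for illustration, and the weekend rule (Saturday and Sunday are non-working) is assumed from the question.

```python
from datetime import date, timedelta

# Hypothetical holiday list; the question keeps its holidays in a PHP array.
HOLIDAYS = {date(2023, 4, 10)}  # Easter Monday 2023

def is_working_day(d):
    """True for Monday to Friday dates that are not listed holidays."""
    return d.weekday() < 5 and d not in HOLIDAYS

def count_days(start, end, period_start, period_end, count_non_working_days):
    """Count the days of start..end that fall inside the period.

    Non-working days are skipped when count_non_working_days is 0.
    """
    day = max(start, period_start)
    last = min(end, period_end)
    total = 0
    while day <= last:
        if count_non_working_days or is_working_day(day):
            total += 1
        day += timedelta(days=1)
    return total

# Sample rows from the question, plus the per-activity flag.
schedule = [
    ("Mr. Anderson", "Programming", date(2023, 1, 2), date(2023, 1, 6)),
    ("Mr. Anderson", "Programming", date(2023, 1, 9), date(2023, 1, 14)),
    ("Mr. Anderson", "Vacation", date(2023, 1, 14), date(2023, 1, 31)),
]
flags = {"Programming": 1, "Vacation": 0}

totals = {}
for employee, task, start, end in schedule:
    key = (employee, task)
    totals[key] = totals.get(key, 0) + count_days(
        start, end, date(2023, 1, 1), date(2023, 12, 31), flags[task])

for key, days in sorted(totals.items()):
    print(key, days)
```

With the sample rows this gives 11 days of Programming and 12 days of Vacation, matching the totals the question asks for.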

This calculates all the weekend days between two given dates. It may help you:
SELECT (
((WEEK('2022-12-31') - WEEK('2022-01-01')) * 2) -
(case when weekday('2022-12-31') = 6 then 1 else 0 end) -
(case when weekday('2022-01-01') = 5 then 1 else 0 end)
)
You will also have to subtract holidays that fall within this date range.
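As a cross-check for any closed-form weekend count, a short Python sketch that counts Saturdays and Sundays in an inclusive date range may be useful. It is a different formulation from the WEEK()-based expression above (whole weeks plus a walk over the leftover days), and the function name is made up for this example. Python's weekday() numbers days the same way as MySQL's WEEKDAY(): Monday is 0 and Sunday is 6.

```python
from datetime import date

def weekend_days(start, end):
    """Count Saturdays and Sundays in the inclusive range start..end."""
    if end < start:
        return 0
    n = (end - start).days + 1
    full_weeks, extra = divmod(n, 7)
    count = full_weeks * 2  # every block of 7 consecutive days has one Sat and one Sun
    first_wd = start.weekday()  # Monday=0 .. Sunday=6
    # The leftover days (at most 6) have the weekdays first_wd, first_wd + 1, ...
    for i in range(extra):
        if (first_wd + i) % 7 >= 5:
            count += 1
    return count

print(weekend_days(date(2023, 1, 14), date(2023, 1, 31)))
print(weekend_days(date(2022, 1, 1), date(2022, 12, 31)))
```

Holidays still need to be subtracted separately, taking care not to subtract a holiday that already falls on a weekend.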

Related

How to hardcode certain datetimes from now for five years ahead

I'm not very advanced in SQL and hope that someone can help me with one query. I'm building a booking platform in React with a backend and API in Symfony. I have starttime and endtime (endtime will be calculated automatically with DATEADD, depending on the duration of the chosen service), and I will calculate available timeslots to offer for new bookings (with DATEDIFF).
It seems this is only possible if I have an additional table "Agenda_timeslots" with its own starttime, endtime and index: [0] free, [1] booked or [2] not available. Each service, depending on its fixed duration, will then take a certain number of timeslots in this agenda.
So my question is: how do I write a query to generate such a table with these conditions: timeslots of 20 min, starting from 9am till 5pm, only from Monday to Friday, for the next five years? Once I have these timeslots, I can loop through them and use SQL formulas. I believe there should be a better way than doing it manually.
Or do I not need timeslots at all? Only the time from 9am to 5pm, and then calculate the available hours in between bookings, to see if the requested service will fit in the free time gap?
Thank you for any help
timeslots by 20min start from 9am till 5pm only from Monday to Friday for the next five years.
WITH RECURSIVE
cte1 AS ( SELECT @start_date `date`
UNION ALL
SELECT `date` + INTERVAL 1 DAY FROM cte1 WHERE `date` < @end_date ),
cte2 AS ( SELECT CAST('09:00' AS TIME) `time`
UNION ALL
SELECT `time` + INTERVAL 20 minute FROM cte2 WHERE `time` < '17:00' )
SELECT TIMESTAMP(cte1.`date`, cte2.`time`) `datetime`
FROM cte1
CROSS JOIN cte2
WHERE WEEKDAY(cte1.`date`) < 5
For the provided conditions, SET @end_date := @start_date + INTERVAL 5 YEAR.
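The same date-by-time cross product can be sketched in Python, which may help when checking how many rows to expect. This is an illustration, not a replacement for the CTE: the function name and parameters are made up here, and it assumes that every 20-minute slot must end by 17:00, so the last slot of a day starts at 16:40 (the CTE as written also emits a 17:00 row).

```python
from datetime import date, datetime, time, timedelta

def timeslots(start_date, end_date, step_minutes=20,
              opens=time(9, 0), closes=time(17, 0)):
    """Yield slot start datetimes, Monday to Friday, each slot ending by `closes`."""
    step = timedelta(minutes=step_minutes)
    day = start_date
    while day <= end_date:
        if day.weekday() < 5:  # Monday=0 .. Friday=4
            slot = datetime.combine(day, opens)
            day_end = datetime.combine(day, closes)
            while slot + step <= day_end:
                yield slot
                slot += step
        day += timedelta(days=1)

# One calendar week, Monday 2023-01-02 to Sunday 2023-01-08.
slots = list(timeslots(date(2023, 1, 2), date(2023, 1, 8)))
print(len(slots), slots[0], slots[-1])
```

Under these assumptions a working day has 24 slots, so one week yields 120 rows.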

Returning the next-to-last entry using MySQL

A little info: people check-in but they don't check out. Each check-in creates an auto-incremented entry into the _checkins table with a timestamp, MemberID, etc.
Here's the data the query needs to return:
Member info (name, picture, ID, etc)
The number of check-ins they've had in the last 30 days
The time since their last check-in must be less than 2 hours for them to be on the list.
The date of their last check-in NOT COUNTING TODAY (in other words, the next to last "Created" entry in the _checkins table).
I have it all working except the last part. I feel like LIMIT is going to be part of the solution but I just can't find a way to implement it correctly.
Here's what I've got so far:
SELECT m.ImageURI, m.ID, m.FirstName, m.LastName,
ROUND(time_to_sec(timediff(NOW(), MAX(ci.Created))) / 3600, 1) as
'HoursSinceCheckIn', CheckIns
FROM _checkins ci LEFT JOIN _members m ON ci.MemberID = m.ID
INNER JOIN(SELECT MemberID, COUNT(DISTINCT ID) as 'CheckIns'
FROM _checkins
WHERE(
Created BETWEEN NOW() - INTERVAL 30 DAY AND NOW()
)
GROUP BY MemberID
) lci ON ci.MemberID=lci.MemberID
WHERE(
ci.Created BETWEEN NOW() - INTERVAL 30 DAY AND NOW()
AND TIMESTAMPDIFF(HOUR, ci.Created, NOW()) < 2
AND ci.Reverted = 0
)
GROUP BY m.ID
ORDER BY CheckIns ASC
You can simplify greatly (and make your code safer, as well):
SELECT _Members.ImageURI, _Members.ID, _Members.FirstName, _Members.LastName,
ROUND(TIME_TO_SEC(TIMEDIFF(NOW(), _FilteredCheckins.lastCheckin)) / 3600, 1) AS hoursSinceCheckIn, _FilteredCheckins.checkIns,
(SELECT MAX(_Checkins.created)
FROM _Checkins
WHERE _Checkins.memberId = _Members.ID
AND _Checkins.created < _FilteredCheckins.lastCheckin) AS previousCheckin
FROM _Members
JOIN (SELECT memberId, COUNT(*) AS checkIns, MAX(created) AS lastCheckin
FROM _Checkins
WHERE created >= NOW() - INTERVAL 30 DAY
GROUP BY memberId
HAVING lastCheckin >= NOW() - INTERVAL 2 HOUR) _FilteredCheckins
ON _FilteredCheckins.memberId = _Members.ID
ORDER BY _FilteredCheckins.checkIns ASC
We're counting all checkins in the last 30 days, including the most recent, but that's trivially adjustable.
I'm assuming _Checkins.id is unique (it should be), so COUNT(DISTINCT ID) can be simplified to COUNT(*). If this isn't the case you'll need to put it back.
(Side note: please don't use BETWEEN, especially with date/time types)
(humorous side note: I keep mentally reading this as "chickens"....)
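For readers who want to verify the expected output by hand, the logic of the query above (count in the last 30 days, keep members whose latest check-in is at most 2 hours old, then look up the newest check-in before that one) can be sketched in Python. The function name, the tuple layout and the sample rows are assumptions for illustration only.

```python
from datetime import datetime, timedelta

def summarize(checkins, now):
    """checkins: iterable of (member_id, created).

    Returns {member_id: (count_last_30_days, last_checkin, previous_checkin)}
    for members whose last check-in is at most 2 hours old.
    """
    by_member = {}
    for member_id, created in checkins:
        by_member.setdefault(member_id, []).append(created)

    result = {}
    for member_id, times in by_member.items():
        recent = [t for t in times if t >= now - timedelta(days=30)]
        if not recent:
            continue
        last = max(recent)
        if last < now - timedelta(hours=2):
            continue  # mirrors the HAVING filter on lastCheckin
        earlier = [t for t in times if t < last]  # mirrors the correlated subquery
        previous = max(earlier) if earlier else None
        result[member_id] = (len(recent), last, previous)
    return result

now = datetime(2023, 1, 10, 12, 0)
rows = [
    (1, datetime(2023, 1, 10, 11, 0)),
    (1, datetime(2023, 1, 8, 9, 0)),
    (1, datetime(2022, 11, 1, 9, 0)),
    (2, datetime(2023, 1, 10, 8, 0)),   # 4 hours ago: filtered out
    (3, datetime(2023, 1, 10, 11, 30)), # first ever check-in
]
summary = summarize(rows, now)
print(summary)
```

Note that "previous" here is simply the entry before the latest one. If two check-ins on the same day should be treated as one, the comparison would need to be on the date part instead.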

Select all available items in a specific period

So I have 2 tables caring and client, like this
client {
id,
name
}
caring {
id,
startDate,
endDate,
clientId
}
I need to get all clients that have at least one day available between two provided dates, you can see my screenshot as reference.
In the screenshot I have two clients, and I need to return both of them. As you can see, the first client has three free days (21.5.-23.5.) within the provided period (16.5.-29.5.) and the second client does not have any caring periods.
So far I have tried something like this
SELECT * FROM client cl
WHERE cl.id NOT IN (SELECT clientId FROM caring
WHERE endDate >= CURDATE() AND endDate <= DATE_ADD(CURDATE(), INTERVAL 14 DAY))
This one returns only clients that don't have carings at all. That is only partially what I need, because this query doesn't cover the first client from my screenshot. Then I tried the query below.
SELECT ca.startDate, ca.endDate, cl.firstName, cl.lastName
FROM caring ca
LEFT JOIN client cl on cl.id = ca.clientId
WHERE ca.startDate NOT IN (
SELECT endDate
FROM caring
) AND ca.startDate <= '2017-05-29' AND ca.endDate >= '2017-05-16'
But I'm not getting the desired results.
Any idea how I can achieve this? Thanks in advance!
Select carings in period of interest and limit start/end dates to this period, respectively. This limitation will allow for easier counting of "booked" i.e. not-free days later on.
SELECT ca.id,
-- Limit start/end dates to period of interest, respectively
GREATEST (ca.startDate, '2017-05-16') AS `effectiveStartDate`,
LEAST (ca.endDate, '2017-05-29') AS `effectiveEndDate`,
ca.clientId
FROM carings ca
WHERE ca.startDate <= '2017-05-29' AND ca.endDate >= '2017-05-16';
Next, count booked days:
DATEDIFF (DATE_ADD (LEAST (ca.endDate, '2017-05-29'), INTERVAL 1 DAY),
GREATEST (ca.startDate, '2017-05-16'))
AS `effectiveDays`
Finally, filter out clients that are booked over the whole period. This is done by comparing
the sum of booked days per client (GROUP BY) to
the number of days of the whole period (HAVING sumDays < DATEDIFF(...)).
As you want also clients that are not booked at all over the whole period, I would suggest to start from the clients table and "just" LEFT JOIN the (effective) carings:
SELECT cl.id, cl.name, IFNULL (SUM (eca.effectiveDays), 0) AS `sumDays`
FROM clients cl
LEFT JOIN
(SELECT ca.id,
-- Limit start/end dates to period of interest, respectively
GREATEST (ca.startDate, '2017-05-16') AS `effectiveStartDate`,
LEAST (ca.endDate, '2017-05-29') AS `effectiveEndDate`,
DATEDIFF (
DATE_ADD (LEAST (ca.endDate, '2017-05-29'), INTERVAL 1 DAY),
GREATEST (ca.startDate, '2017-05-16'))
AS `effectiveDays`,
ca.clientId
FROM carings ca
WHERE ca.startDate <= '2017-05-29' AND ca.endDate >= '2017-05-16')
eca -- effectiveCarings
ON eca.clientId = cl.id
GROUP BY cl.id, cl.name
HAVING sumDays <
DATEDIFF (DATE_ADD ('2017-05-29', INTERVAL 1 DAY), '2017-05-16')
ORDER BY cl.id;
See also http://sqlfiddle.com/#!9/1038b9/19
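The arithmetic behind this query (clip each caring to the period, sum the booked days per client, compare with the length of the period) can be checked with a small Python sketch. The client names and caring dates below are hypothetical, chosen so that the first client has the three free days 21.5.-23.5. from the question. Like the SQL, the sketch assumes that a client's carings do not overlap, otherwise the sum would count some days twice.

```python
from datetime import date

def booked_days(carings, period_start, period_end):
    """Sum the days of each (start, end) caring that fall inside the period."""
    total = 0
    for start, end in carings:
        lo = max(start, period_start)   # GREATEST(startDate, period start)
        hi = min(end, period_end)       # LEAST(endDate, period end)
        if lo <= hi:
            total += (hi - lo).days + 1
    return total

def clients_with_free_days(clients, period_start, period_end):
    """Return the clients that have at least one unbooked day in the period."""
    period_len = (period_end - period_start).days + 1
    return [name for name, carings in clients.items()
            if booked_days(carings, period_start, period_end) < period_len]

# Hypothetical data for illustration.
clients = {
    "client 1": [(date(2017, 5, 10), date(2017, 5, 20)),
                 (date(2017, 5, 24), date(2017, 6, 2))],
    "client 2": [],
    "client 3": [(date(2017, 5, 1), date(2017, 6, 30))],  # booked throughout
}
free = clients_with_free_days(clients, date(2017, 5, 16), date(2017, 5, 29))
print(free)
```

Here "client 1" has 11 booked days out of 14, "client 2" has none, and "client 3" is booked for the whole period and is therefore filtered out.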
Select clients whose endDate happens before the last day of your provided period and there's a gap between endDate and startDate during the specified period.
SELECT * FROM client FULL OUTER JOIN caring ON client.id = caring.clientId WHERE endDate <= '2017-05-28' AND DATEDIFF(day, startDate, endDate) > DATEDIFF(day, '2017-05-16' , endDate);

MySQL group by week

I have a large number of records with a transaction datetime field going back several years. I would like to do a comparative analysis between the same timespan this year and last. How can I group by week over a 3 month range?
I'm running into problems using the YEARWEEK and WEEK functions because of the day the year 2012 starts on versus the day 2011 starts on.
Given that I have records with datetimes everyday from Jan 1st to the current day, and records with the same datetimes from the prior year, how can I group by week so the output is sums with dates like: 01/01/2011, 01/08/2011, 01/15/2011, etc., and 01/01/2012, 01/08/2012, 01/15/2012, etc.?
My query so far is as follows:
SELECT
DATE_FORMAT(A.transaction_date, '%Y-%m-%d') as date,
ROUND(sum(A.quantity), 3) AS quantity,
ROUND(sum(A.total_amount), 3) AS amount,
A.product_code,
D.fuel_type_code,
D.fuel_type_name,
C.customer_code,
C.customer_name
FROM
cl_transactions AS A
INNER JOIN
card AS B ON A.card_number=B.card_number
INNER JOIN
customer AS C ON B.customer_code=C.customer_code
INNER JOIN
fuel_type AS D ON A.fuel_type=D.fuel_type_code
WHERE
((A.transaction_date >= DATE_FORMAT(NOW() - INTERVAL 3 MONTH, '%Y-%m-01')) OR (A.transaction_date - INTERVAL 1 YEAR >= DATE_FORMAT(NOW() - INTERVAL 15 MONTH, '%Y-%m-01') AND A.transaction_date <= NOW() - INTERVAL 1 YEAR))
GROUP BY
A.transaction_date, fuel_type_code;
I would essentially like something that achieves the following pseudo-query:
GROUP BY
STARTING FROM THE OLDEST DATE (A.transaction_date + INTERVAL 6 DAY)
I started with an inner query using SQL variables to build out from/to ranges for this year and last year, each from its respective start of year/month/day (e.g. 2012-01-01 and 2011-01-01 respectively). From that, I'm also pre-formatting the date for final output so you have ONE master date basis for display, reflecting whatever the "this year" week would be.
From that, I do a join to the transaction table where the transaction date is BETWEEN the respective start of the current week and the start of the next week. Since date/time stamps include hours and minutes, 2012-01-01 by itself is implied as 12:00:00 am (midnight) of that day, and BETWEEN will go UP TO 12:00:00 am 7 days later. That date then becomes the start date of the following week.
So, by joining on the date being between EITHER last year's or this year's time period, it's the same group qualification. The field selection does a ROUND( SUM( IF() )) for last year and this year respectively. If the incoming transaction date is LESS than the current year's week start, then it must be a record from the prior year, otherwise it's for the current year. So, respectively, add the value itself, or zero, as it applies.
So now, you have the group by. The week that it qualified for was already prepared from the inner query via "ThisYearWeekOf" formatted column, regardless of the otherwise computed "YEARWEEK()" or "WEEK()". The date ranges took care of that qualification for us.
Finally, I added the fuel-type as a join and included that as the group by. You have to group by all non-aggregate columns for proper SQL, although MySQL lets you get by by just grabbing the first entry for the given group if it is NOT so specified in group by.
To close, I DID include the information for the customer as you didn't have it in the group by and did not appear to be applicable... it would just arbitrarily grab one. However, I've added it to the group by, so now your records will show at the per customer level, per product and fuel type, how much sales and quantity between this year and last.
SELECT
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.Quantity, 0 )), 3) as LastYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.total_amount, 0 )), 3) as LastYrAmt,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.Quantity )), 3) as ThisYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.total_amount )), 3) as ThisYrAmt
FROM
( SELECT
DATE_FORMAT(@ThisYearDate, '%Y-%m-%d') as ThisYearWeekOf,
@LastYearDate as LastYrWeekStart,
@ThisYearDate as ThisYrWeekStart,
@LastYearDate := date_add( @LastYearDate, interval 7 day ) LastYrStartOfNextWeek,
@ThisYearDate := date_add( @ThisYearDate, interval 7 day ) ThisYrStartOfNextWeek
FROM
(select @ThisYearDate := '2012-01-01',
@LastYearDate := '2011-01-01' ) sqlvars,
cl_transactions justForLimit
HAVING
ThisYrWeekStart < '2012-04-01'
LIMIT 15 ) JustWeekRange
JOIN cl_transactions AS CTrans
ON CTrans.transaction_date BETWEEN
JustWeekRange.LastYrWeekStart AND JustWeekRange.LastYrStartOfNextWeek
OR CTrans.transaction_date BETWEEN
JustWeekRange.ThisYrWeekStart AND JustWeekRange.ThisYrStartOfNextWeek
JOIN fuel_type FT
ON CTrans.fuel_type = FT.fuel_type_code
JOIN card
ON CTrans.card_number = card.card_number
JOIN customer AS C
ON card.customer_code = C.customer_code
GROUP BY
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name
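The core idea of the query above is that weeks are counted from January 1 of each year rather than taken from WEEK() or YEARWEEK(), so week N of 2011 lines up with week N of 2012. A minimal Python sketch of that bucketing, with made-up function names, looks like this:

```python
from datetime import date, timedelta

def week_index(d):
    """0-based number of the 7-day bucket, counted from January 1 of d's year."""
    return (d - date(d.year, 1, 1)).days // 7

def week_start(d):
    """First day of the 7-day bucket containing d."""
    return date(d.year, 1, 1) + timedelta(days=week_index(d) * 7)

for d in (date(2011, 1, 10), date(2012, 1, 10)):
    print(d, week_index(d), week_start(d))
```

Both sample dates land in bucket 1, starting on 2011-01-08 and 2012-01-08 respectively, regardless of which weekday each year starts on. The last bucket of a year is shorter than 7 days, which does not matter for a 3 month comparison.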

Grouping by X days

I have a database that shows me different stats about different campaigns, each row has a timestamp value name "date".
I wrote the code for choosing and summarizing a range of dates, for example: 21-24/07/2010.
Now I need to add an option to choose a range of dates, but also to group the stats for each X days.
Let's say the user chooses to see stats from all the month: 01/07-31/07. I would like to present him the stats grouped by X days, let's say 3, so he will see the stats 01-03/07, 04-06/07,07-09/07 and so on...
I almost managed doing it using this code:
SELECT t1.camp_id,from_days( floor( to_days( date ) /3 ) *3 ) AS 'first_date'
FROM facebook_stats t1
INNER JOIN facebook_to_campaigns t2 ON t1.camp_id = t2.facebook_camp_id
WHERE date
BETWEEN 20100717000000
AND 20100724235959
GROUP BY from_days( floor( to_days( date ) /3 ) *3 ) , t2.camp_id
It actually does group it (by 3 days), but the problem is that for some reason it starts from the 16/07, and not the 17/07, then grouping each time 3 days at a time.
Would love to hear a solution to the code or I gave, or a better solution you have in mind.
TO_DAYS returns the number of days since year 0. Dividing that by 3 and applying FLOOR keeps only the quotient and drops the remainder. E.g. if it has been 5 days since year 0, then floor(5 / 3) returns 1, and multiplying by 3 again gives day 3, not day 5.
TO_DAYS(20100717000000) must be leaving a remainder of 1 when divided by 3. Basically TO_DAYS(20100716000000) is exactly divisible by 3 but the value for the 17th is not, so the groups start on the 16th.
You could try this query:
-- MySQL user variables need no DECLARE outside stored programs
SET @startDate = 20100717000000;
SET @endDate = 20100724235959;
SET @groupByInterval = 3;
SELECT
t1.camp_id, from_days(
to_days(@startDate)+floor(
(to_days(date)-to_days(@startDate))/@groupByInterval
)
* @groupByInterval)
AS first_date
FROM facebook_stats t1
INNER JOIN facebook_to_campaigns t2 ON t1.camp_id = t2.facebook_camp_id
WHERE date
BETWEEN @startDate
AND @endDate
GROUP BY first_date , t2.camp_id
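The difference between the two bucketing formulas can be reproduced in Python. This sketch assumes that MySQL's TO_DAYS equals Python's date.toordinal() plus 365, which agrees with the documented example TO_DAYS('2007-10-07') = 733321. The function names are made up for illustration.

```python
from datetime import date, timedelta

def to_days(d):
    """Equivalent of MySQL TO_DAYS, assuming TO_DAYS = toordinal() + 365."""
    return d.toordinal() + 365

def bucket_global(d, n):
    """Original formula: buckets aligned to multiples of n days since year 0."""
    return date.fromordinal((to_days(d) // n) * n - 365)

def bucket_anchored(d, start, n):
    """Answer's formula: buckets aligned to the chosen start date."""
    return start + timedelta(days=((d - start).days // n) * n)

start = date(2010, 7, 17)
print(to_days(start) % 3)
for offset in range(0, 8):
    d = start + timedelta(days=offset)
    print(d, bucket_global(d, 3), bucket_anchored(d, start, 3))
```

The remainder printed first is 1, which is why the original query starts its first group on 16/07, while the anchored version starts it on 17/07.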