MySQL group by week - mysql

I have a large number of records with a transaction datetime field going back several years. I would like to do a comparative analysis between the same timespan this year and last. How can I group by week over a 3 month range?
I'm running into problems using the YEARWEEK and WEEK functions because of the day the year 2012 starts of versus the day 2011 starts on.
Given that I have records with datetimes everyday from Jan 1st to the current day, and records with the same datetimes from the prior year, how can I group by week so the output is sums with dates like: 01/01/2011, 01/08/2011, 01/15/2011, etc., and 01/01/2012, 01/08/2012, 01/15/2012, etc.?
My query so far is as follows:
SELECT
DATE_FORMAT(A.transaction_date, '%Y-%m-%d') as date,
ROUND(sum(A.quantity), 3) AS quantity,
ROUND(sum(A.total_amount), 3) AS amount,
A.product_code,
D.fuel_type_code,
D.fuel_type_name,
C.customer_code,
C.customer_name
FROM
cl_transactions AS A
INNER JOIN
card AS B ON A.card_number=B.card_number
INNER JOIN
customer AS C ON B.customer_code=C.customer_code
INNER JOIN
fuel_type AS D ON A.fuel_type=D.fuel_type_code
WHERE
((A.transaction_date >= DATE_FORMAT(NOW() - INTERVAL 3 MONTH, '%Y-%m-01')) OR (A.transaction_date - INTERVAL 1 YEAR >= DATE_FORMAT(NOW() - INTERVAL 15 MONTH, '%Y-%m-01') AND A.transaction_date <= NOW() - INTERVAL 1 YEAR))
GROUP BY
A.transaction_date, fuel_type_code;
I would essentially like something that achieves the following pseudo-query:
GROUP BY
STARTING FROM THE OLDEST DATE (A.transaction_date + INTERVAL 6 DAY)

I started with an inner query using sqlvariables to build out from/to ranges for this year and last year of each respective start of year/month/day (ex: 2012-01-01 and 2011-01-01 respectively). From that, I'm also pre-formatting the date for final output so you have ONE master date basis for display reflecting that of whatever the "this year" week would be.
From that, I do a join to the transaction table where the transaction date is BETWEEN the respective start of current week and start of next week. Since date/time stamps include hour minute, 2012-01-01 by itself is implied as 12:00:00am (midnight) of the day. and between will go UP TO 7 days later 12:00:00 am. And that date will become the start date of the following week.
So, by joining on the date being between EITHER last yr or this yr time period, its the same group qualification. So the field selection does a ROUND( SUM( IF() )) per respective last year or this year. if the incoming transaction date is LESS than the current year's week start, then it must be a record from prior year, otherwise its for the current year. So, respectively, add the value itself, or zero as it applies.
So now, you have the group by. The week that it qualified for was already prepared from the inner query via "ThisYearWeekOf" formatted column, regardless of the otherwise computed "YEARWEEK()" or "WEEK()". The date ranges took care of that qualification for us.
Finally, I added the fuel-type as a join and included that as the group by. You have to group by all non-aggregate columns for proper SQL, although MySQL lets you get by by just grabbing the first entry for the given group if it is NOT so specified in group by.
To close, I DID include the information for the customer as you didn't have it in the group by and did not appear to be applicable... it would just arbitrarily grab one. However, I've added it to the group by, so now your records will show at the per customer level, per product and fuel type, how much sales and quantity between this year and last.
SELECT
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.Quantity, 0 )), 3) as LastYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, CTrans.total_amount, 0 )), 3) as LastYrAmt,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.Quantity )), 3) as ThisYrQty,
ROUND( SUM( IF( CTrans.transaction_date < JustWeekRange.ThisYrWeekStart, 0, CTrans.total_amount )), 3) as ThisYrAmt,
FROM
( SELECT
DATE_FORMAT(#ThisYearDate, '%Y-%m-%d') as ThisYearWeekOf,
#LastYearDate as LastYrWeekStart,
#ThisYearDate as ThisYrWeekStart,
#LastYearDate := date_add( #LastYearDate, interval 7 day ) LastYrStartOfNextWeek,
#ThisYearDate := date_add( #ThisYearDate, interval 7 day ) ThisYrStartOfNextWeek
FROM
(select #ThisYearDate := '2012-01-01',
#LastYearDate := '2011-01-01' ) sqlvars,
cl_transactions justForLimit
HAVING
ThisYrWeekStart < '2012-04-01'
LIMIT 15 ) JustWeekRange
JOIN cl_transactions AS CTrans
ON CTrans.transaction_date BETWEEN
JustWeekRange.LastYrWeekStart AND JustWeekRange.LastYrStartOfNextWeek
OR CTrans.transaction_date BETWEEN
JustWeekRange.ThisYrWeekStart AND JustWeekRange.ThisYrStartOfNextWeek
JOIN fuel_type FT
ON CTrans.fuel_type = FT.fuel_type_code
JOIN card
ON CTrans.card_number = card.card_number
JOIN customer AS C
ON card.customer_code = C.customer_code
GROUP BY
JustWeekRange.ThisYearWeekOf,
CTrans.product_code,
FT.fuel_type_code,
FT.fuel_type_name,
C.customer_code,
C.customer_name

Related

Only count working days in a DATEDIFF (MySQL)

So, next problem :'), I have the following query that #MatBailie provided to me here (thanks again!):
SELECT
taskname,
employee,
SUM(
DATEDIFF(
LEAST( enddate, '2023-12-31'),
GREATEST(startdate, '2023-01-01')
)
+1
) AS total_days,
FROM
schedule
WHERE
startDate <= '2023-12-31'
AND
endDate >= '2023-01-01'
GROUP BY
employee,
taskname
This query will tell me how many days a certain employee has spent on a certain task in a given period of time, and it works great!
The next thing I would like to do however, is to substract non-working days from the SUM of DATEDIFFs for some of the tasks (e.g. when the task has "count_non_working_days= 0" in a reference table called 'activities').
For example, my schedule also keeps track of the amount of days off every employee has taken (days off are also scheduled as tasks). But of course, days off that fall in a weekend or on a holiday should not be counted towards the total of days off a person has taken in a year. (Note that I did consider scheduling days off only on weekdays/non-holidays, but this is not a practical option in the scheduling software I use because employees request a leave from date A to date B, and this request is approved or denied as-is (they don't make 3 holiday requests excluding the weekends if they want to go on a vacation for 3 weeks, if you get my drift).
So, if an employee goes on a vacation for 10 days, this is counted as 10 days off, but this holiday may have 1 or 2 weekends in it, so the sum of days of that the employee has taken off should be 6, 7 or 8, and not 10. Furthermore, if it has a holiday such as Easter Monday in it (I have all dates of my holidays in a PHP array), this should also be subtracted.
I have tried the solutions mentioned here, but I couldn't get them to work (a) because those are in SQL server and (b) because they don't allow putting in an array of holidays, (c) nor allow toggling the subtraction on and off depending on the event type.
Here's my attempt of explaining what I'm trying to do in my pseudo-SQL:
SELECT
taskname,
employee,
IF( activities.count_non_working_days=1,
-- Just count the days that fall in the current year:
SUM(
DATEDIFF(
LEAST( enddate, '2023-12-31'),
GREATEST( startdate, '2023-01-01')
)
+ 1
) AS total_days,
-- Subtract the amount of saturdays, sundays and holidays:
SUM(
DATEDIFF(
LEAST( enddate, '2023-12-31'),
GREATEST( startdate, '2023-01-01')
)
- [some way of getting the amount of saturdays, sundays and holidays that fall within this date range]
+ 1
) AS total_days
)
FROM
schedule
LEFT JOIN
activities
ON activity.name = schedule.name
WHERE
startDate <= '2023-12-31'
AND
endDate >= '2023-01-01'
GROUP BY
employee,
taskname
I know the query above is probably faulty on so many levels, but I hope it clarifies what I'm trying to do.
Thanks once more for all the help!
Edit: basically I need something like this, but in MySQL and preferably with a toggle that turns the subtraction on or off depending on the task type.
Edit 2: To clarify: my schedule table holds ALL activities, including holidays. For example, some records may include:
employee
taskname
startDate
endDate
Mr. Anderson
Programming
2023-01-02
2023-01-06
Mr. Anderson
Programming
2023-01-09
2023-01-14
Mr. Anderson
Vacation
2023-01-14
2023-01-31
In another table, Programming is defined as "count_non_working_days=1", because working in the weekends should count, while Vacation is defined as "count_non_working_days=0", because taking a day off on the weekend should not count towards your total amount of days taken off.
The totals for this month should therefore state that:
Mr. Anderson has done Programming for 11 days (of which 1 was on a saturday)
Mr. Anderson has taken 12 days off for (because the 2 weekends in this period don't count as days off).
Create a calendar table, with every date of interest (so, something like 2000-01-01 to 2099-01-01) and include columns such as is_working_day which can be set to TRUE/FLASE or 1/0. Then you can update that column as necessary, and join on that table in your query to get working dates that the employee has booked off.
In short, you count the relevant dates, rather than deducting the irrelevant dates.
SELECT
s.employee,
s.taskname,
COUNT(*) AS total_days,
FROM
(
schedule AS s
INNER JOIN
activities AS a
ON a.taskname = s.taskname
)
INNER JOIN
calendar AS c
ON c.calendar_date >= s.startDate
AND c.calendar_date <= s.endDate
AND c.is_working_day >= 1 - a.count_non_working_days
WHERE
c.calendar_date >= '2023-01-01'
AND c.calendar_date <= '2023-12-31'
GROUP BY
s.employee,
s.taskname
Your calendar table can then also include flags such as is_weekend, is_bank_holiday, is_fubar, is_amazing, etc, and the is_working_day can be a computed column from those inputs.
Note on is_working_day filter...
WHERE
( count_non_working_day = 1 AND is_working_day IN (0, 1) )
OR
( count_non_working_day = 0 AND is_working_day IN ( 1) )
-- change to (1 - count_non_working_day)
WHERE
( (1 - count_non_working_day) = 0 AND is_working_day IN (0, 1) )
OR
( (1 - count_non_working_day) = 1 AND is_working_day IN ( 1) )
-- simplify
WHERE
( (1 - count_non_working_day) <= is_working_day )
OR
( (1 - count_non_working_day) <= is_working_day )
-- simplify
WHERE
( (1 - count_non_working_day) <= is_working_day )
Demo: https://dbfiddle.uk/YAmpLmVE
This is to calculate all the weeekends between two giving dates It may help you :
SELECT (
((WEEK('2022-12-31') - WEEK('2022-01-01')) * 2) -
(case when weekday('2022-12-31') = 6 then 1 else 0 end) -
(case when weekday('2022-01-01') = 5 then 1 else 0 end)
)
You will have to substract also holidays that fall within this date range.

How to count only business days (Monday to Friday) per month between two dates in MySQL 5.7?

I have the following table called vacations, where the employee id is displayed along with the start and end date of their vacations:
employee
start
end
1001
26/10/21
22/11/21
What I am looking for is to visualize the number of vacation days that each employee had, but separating them by month and without non-working days (Saturdays and Sundays).
For example, if you wanted to view the vacations for employee 1001, the following result should be displayed:
days
month
4
10
16
11
I have the following query that I have worked with:
SELECT id_employee,
EXTRACT(YEAR_MONTH FROM t.Date) as YearMonth,
COUNT(1) as Days
FROM (SELECT v.id_employee,
DATE_ADD(v.start, interval s.seq - 1 DAY) AS Date
FROM vacations v
CROSS JOIN seq_1_to_100 s
WHERE DATE_ADD(v.start, interval s.seq - 1 DAY) <= v.end
ORDER BY v.id_employee, , v.start, s.seq
) t
GROUP BY id_employee,
EXTRACT(YEAR_MONTH FROM t.Date)
With this query I separate the days between a range of two dates with their respective month, but how could I adapt it to stop considering Saturdays and Sundays? I'm working with MySQL 5.7 in phpMyAdmin
instead of count sum the compaarison of weekday function, which give what day it is .
But you should always save fates n a valid mysql manner 2021-10-28
SELECT id_employee,
EXTRACT(YEAR_MONTH FROM t.Date) as YearMonth,
SUM(WEEKDAY(`Date`) < 5) as Days
FROM (SELECT v.id_employee,
DATE_ADD(v.start, interval s.seq - 1 DAY) AS Date
FROM vacations v
CROSS JOIN seq_1_to_100 s
WHERE DATE_ADD(v.start, interval s.seq - 1 DAY) <= v.end
ORDER BY v.id_employee, v.start, s.seq
) t
GROUP BY id_employee,
EXTRACT(YEAR_MONTH FROM t.Date)

How to group mysql results in weeks

I have a table like this:
I need to sum how many messages were delivered per msisdn in last 8 weeks(but for each week) from date entered. Here is what I came up with:
SELECT count(*) as ukupan_broj, SUM(IF (sent_messages.delivered = 1,1,0 )) as broj_dostavljenih,
count(*) - SUM(IF (sent_messages.delivered = 1,1,0 )) as non_billed,
SUM(IF (sent_messages.delivered = 1,1,0 )) / count(*) as ratio,
`sent_messages`.`msisdn`,
MONTH(`sent_messages`.`datetime`) AS MONTH, WEEK(`sent_messages`.`datetime`) AS WEEK,
DATE_FORMAT(`sent_messages`.`datetime`, '%Y-%m-%d') AS DATE
FROM `sent_messages`
INNER JOIN `received_messages` on `received_messages`.`uniqueid`=`sent_messages`.`originalID`
and `received_messages`.`msisdn`=`sent_messages`.`msisdn`
WHERE `sent_messages`.`datetime` >= '2016-12-12'
AND `sent_messages`.`originalID` = `received_messages`.`uniqueid`
AND `sent_messages`.`datetime` <= '2017-12-30'
AND `sent_messages`.`datetime` >= `received_messages`.`datetime`
AND `sent_messages`.`datetime` <= ( `received_messages`.`datetime` + INTERVAL 2 HOUR )
AND `sent_messages`.`type` = 'PAID'
GROUP BY WEEK
ORDER BY DATE ASC
And because I'm grouping it by WEEK, my result is showing sum of all delivered, undelivered etc. but not per msisdn. Here is how result looks like:
And when I add msisdn in GROUP BY clause I don't get the result the way I need it.
And I need it like this:
Please help me to write optimized query to fetch these results for each msisdn per last 8 weeks, because I'm stuck.
WEEK(...) has a problem near the first of the year. Instead, you could use TO_DAYS:
WHERE datetime > CURDATE() - INTERVAL 8 WEEK -- for the last 8 weeks
GROUP BY MOD(TO_DAYS(datetime), 7) -- group by week
That is quite simple, but there is a bug in it. It only works if today is the last day of a "week". And if date%7 lands on the desired day of week.
WHERE datetime > CURDATE() - INTERVAL 9 WEEK -- for the last 8 weeks
GROUP BY MOD(TO_DAYS(datetime) - 3, 7) -- group by week
Is the first cut at fixing the bugs -- 9-week interval will include the current partial week and the partial week 8 weeks ago. The "- 3" (or whatever number works) will align your "week" to start on Monday or Sunday or whatever.
SUM(IF (sent_messages.delivered = 1,1,0 )) can be shortened to SUM(delivered = 1) or even SUM(delivered) if that column only has 0 or 1 values.

Query with grouping my multiple date ranges

I need to query data with count and sum by multiple date ranges and I am looking for a faster query than what I am doing now.
I have a transaction table with a date and amount. I need to present a table with a count of transactions and total amount by date ranges of today, yesterday, this week, last week, this month, last month. Currently I am doing sub queries, is there a better way?
select
(select count(date) from transactions where date between ({{today}})) as count_today,
(select sum(amount) from transactions where date between ({{today}})) as amount_today,
(select count(date) from transactions where date between ({{yesterday}})) as count_yesterday,
(select sum(amount) from transactions where date between ({{yesterday}})) as amount_yesterday,
(select count(date) from transactions where date between ({{thisweek}})) as count_thisweek,
(select sum(amount) from transactions where date between ({{thisweek}})) as amount_thisweek,
etc...
Is there a better way?
although you have a marked solution, I have another that will probably simplify your query even further using MySQL variables so you don't have to mis-type / calculate dates and such...
Instead of declaring variables up front, you can do them inline as a select statement, then use them as if they were columns in another table. Since it is created as a single row, there is no Cartesian result. First the query, then I'll describe the computations on it.
select
sum( if( t.date >= #today AND t.date < #tomorrow, 1, 0 )) as TodayCnt,
sum( if( t.date >= #today AND t.date < #tomorrow, amount, 0 )) as TodayAmt,
sum( if( t.date >= #yesterday AND t.date < #today, 1, 0 )) as YesterdayCnt,
sum( if( t.date >= #yesterday AND t.date < #today, amount, 0 )) as YesterdayAmt,
sum( if( t.date >= #FirstOfWeek AND t.date < #EndOfWeek, 1, 0 )) as WeekCnt,
sum( if( t.date >= #FirstOfWeek AND t.date < #EndOfWeek, amount, 0 )) as WeekAmt
from
transations t,
( select #today := curdate(),
#yesterday := date_add( #today, interval -1 day ),
#tomorrow := date_add( #today, interval 1 day ),
#FirstOfWeek := date_add( #today, interval +1 - dayofweek( #today) day ),
#EndOfWeek := date_add( #FirstOfWeek, interval 7 day ),
#minDate := least( #yesterday, #FirstOfWeek ) ) sqlvars
where
t.date >= #minDate
AND t.date < #EndOfWeek
Now, the dates. Since the #variables are prepared in sequence, you can think of it as an inline program to set the variables. Since they are a pre-query, they are done first and available for the duration of the rest of the query as previously stated. So to start, I am working with whatever "curdate()" is which gets the date portion only without respect to time. From that, subtract 1 day (add -1) to get the beginning of yesterday. Add 1 day to get Tomorrow. Then, the first of the week is whatever the current date is +1 - the actual day of week (you will see shortly). Add 7 days from the first of the week to get the end of the week. Finally, get whichever date is the LEAST between a yesterday (which COULD exist at the end of the prior week), OR the beginning of the week.
Now look at today for example... Feb 23rd.
Sun Mon Tue Wed Thu Fri Sat Sun
21 22 23 24 25 26 27 28
Today = 23
Yesterday = 22
Tomorrow = 24
First of week = 23 + 1 = 24 - 3rd day of week = 21st
End of Week = 21st + 7 days = 28th.
Why am I doing a cutoff of the dates stripping times? To simplify the SUM() condition for >= AND <. If I stated some date = today, what if your transactions were time-stamped. Then you would have to extract the date portion only to qualify. By this approach, I can say that "Today" count and amount is any date >= Feb 23 at 12am midnight AND < Feb 24th 12 am midnight. This is all time inclusive Feb 23rd up to 11:59:59pm hence LESS than Feb 24th (tomorrow).
Similar consideration for yesterday is all inclusive UP TO but not including whatever "today" is. Similarly for the week range.
Finally the WHERE clause is looking for the earliest date as the range so it does not have to run through the entire database of transactions to the end.
Lastly, if you ever wanted the counts and totals for a prior week / period, whatever, you could just extrapolate and change
#today := '2015-01-24'
and the computations will be AS IF the query was run ON THAT DATE.
Similar if you cared to alter such as for a month, you could compute the first of the month to the first of a following month for MONTHLY totals.
Hope you enjoy this flexible solution to you.
Yes, you can use aggregate functions on conditional expressions, like so:
SELECT SUM(IF(date between ({{today}})), 1, 0) AS count_today
, SUM(IF(date between ({{today}})), amount, 0) AS amount_today
, ...

Is there a MySQL Statement for this or are multiple statements needed?

I have a table with MLSNumber, ListingContractDate, CloseDate.
I want to summarize the activity grouped my month starting with the current month and going back to January 2000.
I have this statement which summarizes the ListingContractDate by month.
SELECT COUNT(MLSNumber) AS NewListings, DATE_FORMAT(ListingContractDate,'%M %Y')
FROM Listings
WHERE Neighbourhood = 'Beachside'
AND ListingContractDate >= '2000-01-01'
GROUP BY YEAR(ListingContractDate), MONTH(ListingContractDate)
ORDER BY ListingContractDate DESC
The two problems with this statement are if there is nothing found in a specific month it skips that month, and I would need to return a 0 so no months are missing, and I am not sure how to get the same count on the CloseDate field or if I just have to run a 2nd query and match the two results up by month and year using PHP.
An exceptionally useful item to have is a "tally table" which simply consists on a set of integers. I used a script found HERE to generate such a table.
With that table I can now LEFT JOIN the time related data to it as shown below:
set #startdt := '2000-01-01';
SELECT COUNT(MLSNumber) AS NewListings, DATE_FORMAT(T.Mnth,'%M %Y')
FROM (
select
tally.id
, date_add( #startdt, INTERVAL (tally.id - 1) MONTH ) as Mnth
, date_add( #startdt, INTERVAL tally.id MONTH ) as NextMnth
from tally
where tally.id <= (
select period_diff(date_format(now(), '%Y%m'), date_format(#startdt, '%Y%m')) + 1
)
) t
LEFT JOIN Temp On Temp.ListingContractDate >= T.Mnth and Temp.ListingContractDate < T.NextMnth
GROUP BY YEAR(T.Mnth), MONTH(T.Mnth)
ORDER BY T.Mnth DESC
Logc,
define a stating date
calculate the number of months from that date until now (using
PERIOD_DIFF + 1)
choose that number of records from the tally table
create period start and end dates (tally.Mnth & tally.NextMnth)
LEFT JOIN the actual data to the tally table using
Temp.ListingContractDate >= T.Mnth and Temp.ListingContractDate < T.NextMnth
group and count the data
see this sqlfiddle`