Count number of occurrences of a date range - mysql

I am working on a MySQL SELECT query whose ultimate goal is to return an integer: the count of a subset of a predefined date range.
The scenario has a few constraints I can't seem to navigate cleanly. For the sake of being concise, let's assume that I am working with data for a membership organization. The basics of my situation include:
A user joins on a date (later referenced as 'join_date' and stored in my database in a date field in the format 'yyyy-mm-dd').
My calendar is based on two periods: a Fall Semester and a Spring Semester (names may be a little misleading). Within the calendar year, the Spring semester is defined as February 16 through September 15, while the Fall semester is defined as September 16 through February 15 (into the next year).
When a user joins, for the purposes of counting in this query, the next semester is considered the first semester (i.e., joining on January 5 (Fall) means that the user's first semester will be Spring of that year, while joining on September 20 (Fall) means that the user's first semester will be Spring of the next year).
Back to my query needs. Relative to the next semester from the current date, I need to calculate the number of Fall + Spring semesters that remain up to and including the eighth semester. For example, if I join on April 1, 2016, my first semester would be Fall 2016, and my eighth would be Spring 2020. As of the current date (today = 2017-07-17), I need to get a count of 6 returned, which would represent Fall 2017, Spring 2018, Fall 2018, Spring 2019, Fall 2019, and Spring 2020.
A few other examples:
Join 2016-01-01, 1st semester is Spring 2016, 8th semester is Fall 2019, remaining count needed to return is 5.
Join 2016-10-01, 1st semester is Spring 2017, 8th semester is Fall 2020, remaining count needed to return is 7.
I have been using CASE statements to help break this down, but I am a novice and do not know if this is the best approach. I would strongly prefer to do this dynamically rather than use a staging table or similar, with which it would be easy. My current CASE statements, which seem to work, include:
CASE
WHEN DATE_FORMAT( join_date, '%m/%d' ) BETWEEN '01/01' AND '02/14' THEN CONCAT( 'Spring ', YEAR ( join_date) )
WHEN DATE_FORMAT( join_date, '%m/%d' ) BETWEEN '02/15' AND '09/14' THEN CONCAT( 'Fall ', YEAR ( join_date) )
WHEN DATE_FORMAT( join_date, '%m/%d' ) BETWEEN '09/15' AND '12/31' THEN CONCAT( 'Spring ', YEAR ( join_date) + 1 )
END AS '1st semester'
CASE
WHEN DATE_FORMAT( join_date, '%m/%d' ) BETWEEN '01/01' AND '02/14' THEN CONCAT( 'Fall ', YEAR ( join_date) + 3 )
WHEN DATE_FORMAT( join_date, '%m/%d' ) BETWEEN '02/15' AND '09/14' THEN CONCAT( 'Spring ', YEAR ( join_date) + 4 )
WHEN DATE_FORMAT( join_date, '%m/%d' ) BETWEEN '09/15' AND '12/31' THEN CONCAT( 'Fall ', YEAR ( join_date ) + 4 )
END AS '8th semester'
CASE
WHEN DATE_FORMAT( CURDATE( ), '%m/%d' ) BETWEEN '01/01' AND '02/14' THEN CONCAT( 'Fall ', YEAR ( CURDATE( ) ) - 1 )
WHEN DATE_FORMAT( CURDATE( ), '%m/%d' ) BETWEEN '02/15' AND '09/14' THEN CONCAT( 'Spring ', YEAR ( CURDATE( ) ) )
WHEN DATE_FORMAT( CURDATE( ), '%m/%d' ) BETWEEN '09/15' AND '12/31' THEN CONCAT( 'Fall ', YEAR ( CURDATE( ) ) )
END AS 'current semester'
CASE
WHEN DATE_FORMAT( CURDATE( ), '%m/%d' ) BETWEEN '01/01' AND '02/14' THEN CONCAT( 'Spring ', YEAR ( CURDATE( ) ) )
WHEN DATE_FORMAT( CURDATE( ), '%m/%d' ) BETWEEN '02/15' AND '09/14' THEN CONCAT( 'Fall ', YEAR ( CURDATE( ) ) )
WHEN DATE_FORMAT( CURDATE( ), '%m/%d' ) BETWEEN '09/15' AND '12/31' THEN CONCAT( 'Spring ', YEAR ( CURDATE( ) ) + 1 )
END AS 'next semester'
Ultimately, I need to find the count of the 'remaining semesters'.
Any help or guidance would be greatly appreciated. You can find a sample of a few user examples at http://sqlfiddle.com/#!9/01319.
The expected result for these examples would be to get 2 for user 3060457, 5 for user 3060458, 6 for user 3060459, and 7 for user 3060460.
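One possible way to get the count without a staging table is to map every date to a running semester number and subtract. The following is only a sketch under stated assumptions: the table name `members` and the column `user_id` are hypothetical, `@today` is pinned to the date used in the question, and the boundaries follow the prose definition (Spring = February 16 to September 15), which differs by a day from the CASE statements above. With semester number idx, the first semester is idx(join_date) + 1, the eighth is idx(join_date) + 8, and the next semester is idx(today) + 1, so the remaining count is idx(join_date) + 8 - idx(today).

```sql
-- Assumptions: `members` and `user_id` are hypothetical names.
-- Pin "today" to the question's date; use CURDATE() in real use.
SET @today := CAST('2017-07-17' AS DATE);

SELECT user_id,
       join_date,
       -- (8th semester) - (next semester) + 1, clamped to the range 0..8
       LEAST(8, GREATEST(0, join_idx + 8 - today_idx)) AS remaining_semesters
FROM (
    SELECT user_id,
           join_date,
           -- Semester number: Spring Y = 2*Y, Fall Y = 2*Y + 1
           2 * YEAR(join_date)
             + CASE WHEN DATE_FORMAT(join_date, '%m-%d') < '02-16' THEN -1  -- Fall of previous year
                    WHEN DATE_FORMAT(join_date, '%m-%d') < '09-16' THEN 0   -- Spring
                    ELSE 1                                                  -- Fall
               END AS join_idx,
           2 * YEAR(@today)
             + CASE WHEN DATE_FORMAT(@today, '%m-%d') < '02-16' THEN -1
                    WHEN DATE_FORMAT(@today, '%m-%d') < '09-16' THEN 0
                    ELSE 1
               END AS today_idx
    FROM members
) AS s;
```

Checked by hand against the three worked examples in the question with today = 2017-07-17: a join date of 2016-04-01 gives 6, 2016-01-01 gives 5, and 2016-10-01 gives 7.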

Related

How to handle Leap Years in anniversary for a current month in MySQL

I'm attempting to write a query that finds each user's work anniversary for the current month and also takes leap years into account (I have no idea how to handle that within the query).
Table "emp_detail":
emp_no | join_date
1      | 2002-06-10
2      | 2022-06-25
3      | 2020-02-29
4      | 2002-02-15
5      | 2011-02-01
So far I have tried the below query:
SELECT no,
join_date
CASE WHEN DATEADD(YY,DATEDIFF(yy,join_date,GETDATE()),join_date) < GETDATE()
THEN DATEDIFF(yy,join_date,GETDATE())
ELSE DATEDIFF(yy,join_date,GETDATE()) - 1
END AS 'anniversary'
FROM emp_detail
WHERE 'status' = 'active'
HAVING MONTH(join_date) = 06/07/08 -- ...so on
EDIT:
Expected output:
For the month of February in the current year, 2022:
emp_no | join_date  | anniversary_date
3      | 2020-02-29 | 2022-02-28 (here I want the 29 Feb 2020 leap-year record mapped to the non-leap year 2022)
4      | 2002-02-15 | 2022-02-15
5      | 2011-02-01 | 2022-02-01
Looking for a way to display employees with anniversary dates coming up at the start of the current month considering the leap year.
Am I going in the right direction? Any help would be great.
Most (all?) SQL engines already handle year arithmetic involving leap days the way you want: folding the leap day to the final day of February.
So, computing the employee's join_date + INTERVAL x YEAR will handle '2020-02-29' correctly. To compute that interval in MySQL/MariaDB for the current year, you may use TIMESTAMPDIFF or compute the difference between the EXTRACTed years yourself:
SELECT emp_no,
join_date,
join_date +
INTERVAL (EXTRACT(YEAR FROM CURDATE()) -
EXTRACT(YEAR FROM join_date)) YEAR
AS "anniversary_date_this_year",
....
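As a sketch of how the fragment above might be completed for the current month: the WHERE clause and the pinned `@today` value are assumptions on my part, chosen to reproduce the February 2022 example from the question.

```sql
-- Assumption: pin "today" to February 2022 to match the question's
-- expected output; use CURDATE() in real use.
SET @today := CAST('2022-02-10' AS DATE);

SELECT emp_no,
       join_date,
       -- Adding whole years to 2020-02-29 lands on 2022-02-28 in a non-leap year.
       join_date + INTERVAL (YEAR(@today) - YEAR(join_date)) YEAR AS anniversary_date
FROM emp_detail
WHERE MONTH(join_date) = MONTH(@today)   -- anniversaries in the current month only
  AND YEAR(join_date) < YEAR(@today);    -- skip employees who joined this year
```

With the sample rows from the question, this should return employees 3, 4 and 5 with 2022-02-28, 2022-02-15 and 2022-02-01.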
You can split your problem into three steps:
filtering your "join_date" values using the current month
changing the year of your "join_date" to the current year
getting the minimum value between your updated "join_date" and the last day of the month for that date (this handles leap years fairly efficiently compared with other solutions that check for specific years every time)
WITH cte AS (
SELECT emp_no,
join_date,
STR_TO_DATE(CONCAT_WS('-',
YEAR (CURRENT_DATE()),
MONTH(join_date ),
DAY (join_date )),
'%Y-%m-%d') AS join_date_now
FROM tab
WHERE MONTH(join_date) = MONTH(CURRENT_DATE())
AND YEAR(join_date) < YEAR(CURRENT_DATE())
)
SELECT emp_no,
join_date,
LEAST(join_date_now, LAST_DAY(join_date_now)) AS anniversary_date
FROM cte
Check the demo here
Note: in the demo, since you want to look at February months and we are in July, the WHERE clause will contain an additional -5 while checking the month.
You can make use of the EXTRACT function in MySQL:
select * from emp_detail where extract( month from (select now())) = extract( month from join_date) and extract( year from (select now())) != extract( year from join_date);
The above query will display all employees whose work anniversary is in the current month.
The following query also considers leap year.
If the employee joined on Feb-29 in a leap year and the current year is a non-leap year, then the query displays the anniversary date as 'currentYear-Feb-28'.
If the employee joined on Feb-29 in a leap year and the current year is also a leap year, then the query displays the anniversary date as 'currentYear-Feb-29'.
select empId ,
case
when ( ( extract(year from (select now()))%4 = 0 and extract(year from (select now()))%100 != 0 ) or extract(year from (select now())) % 400 = 0 ) then
cast( concat( extract(year from (select now())), '-', extract( month from join_date),'-', extract( day from join_date) ) as date)
when ( ( (extract(year from join_date) % 4 = 0 and extract(year from join_date)%100 != 0) or extract( year from join_date)%400 = 0) and extract(month from join_date) =2
and extract(day from join_date) = 29 ) then
cast( concat( cast( extract(year from (select now())) as nchar), '-02-28') as date)
else cast( concat( extract(year from (select now())), '-', extract( month from join_date),'-', extract( day from join_date) ) as date)
end as AnniversaryDate
from emp_detail
where extract(year from join_date) != extract(year from (select now()));
Further, if you want to restrict the results to the current month only, you can add the EXTRACT(month ...) comparison from the first query to the WHERE clause.

MYSQL Query Age Calculation

This is my MySQL Query to return the age from the date of birth
SELECT
PensionerDOB,
YEAR( CURDATE() ) AS Year,
DATE_FORMAT( STR_TO_DATE( PensionerDOB, '%d-%M-%Y' ), '%Y') AS age,
YEAR( CURDATE() ) - DATE_FORMAT( STR_TO_DATE(`PensionerDOB`, '%d-%M-%Y' ), '%Y' ) AS differenage
FROM
`pensionerbasicdata`
The query executes, but the age difference it returns is a negative value.
SELECT *,
TIMESTAMPDIFF(year, STR_TO_DATE(CONCAT(SUBSTRING_INDEX(PensionerDOB, '-', 2), '-19', SUBSTRING_INDEX(PensionerDOB, '-', -1)), '%d-%M-%Y'), CURRENT_DATE) AS age
FROM pensionerbasicdata
This fixes the problem with 2-digit years: all years are treated as 19xx.
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=f356258c99b20d13b0c4e2349b801f18
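To make the string manipulation easier to follow, here is the same expression applied to a single literal. This is a sketch: the sample value '12-March-45' is my assumption about what the PensionerDOB strings look like (day, month name, two-digit year), and the reference date is fixed so the result is reproducible.

```sql
-- Assumption: DOB strings look like '12-March-45'.
SELECT TIMESTAMPDIFF(
         YEAR,
         STR_TO_DATE(
           CONCAT(SUBSTRING_INDEX('12-March-45', '-', 2),    -- '12-March'
                  '-19',                                     -- force the 20th century
                  SUBSTRING_INDEX('12-March-45', '-', -1)),  -- '45'
           '%d-%M-%Y'),                                      -- parses '12-March-1945'
         '2022-07-01'                                        -- fixed reference date
       ) AS age;                                             -- 77
```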
Try this one; it should work.
Query:
SELECT DATE_FORMAT(FROM_DAYS(DATEDIFF(now(),YourDateofBirth)), '%Y')+0 AS Age from AgeCalculationFromDatetime
Here MySQL is not parsing two-digit years as expected: instead of 1945 it returns 2065, and instead of 1953 it returns 2053.
Please follow this link to parse dates with two-digit years:
how-to-use-str-to-date-mysql-to-parse-two-digit-year-correctly

Should I Write Lengthy SQL Query or Breakdown into Few Iterations

I have a very lengthy SQL query to fetch the expected output but on the other side, I also can generate the expected output by using multiple iterations.
Which one should I use?
I care about performance and writing better code.
Using the lengthy SQL query, it takes around 3000 ms to generate the output.
Using iterations, I need about 4 to 5 iterations to generate the output.
What the query/code is doing
This code generates the total number of forecast records for each financial-year quarter, regardless of whether the total is 0 or not.
Using the Lengthy SQL Query
SELECT
CONCAT('FY\'', SUBSTR(`quarters`.fy, 3), ' Q', `quarters`.fy_quarter) AS name,
(
SELECT
COUNT(*)
FROM
member_project_stages
WHERE
YEAR ( member_project_stages.start_at ) = `quarters`.fy
AND QUARTER ( member_project_stages.start_at ) = `quarters`.fy_quarter
AND member_project_stages.stage_id = 9
) AS actual,
(
SELECT
COUNT(*)
FROM
projects AS a
WHERE
( a.forecast IS NOT NULL AND a.forecast > '' )
AND a.forecast LIKE CONCAT( '%FY\'', SUBSTR( `quarters`.fy, 3 ), '%' )
AND a.forecast LIKE CONCAT( '% Q', `quarters`.fy_quarter, '%' )
AND a.deleted_at IS NULL
GROUP BY
a.forecast
) AS forecast
FROM
`member_project_stages`,
(
    SELECT YEAR(DATE_ADD(CURDATE(), INTERVAL -9 MONTH)) AS fy,
           QUARTER(DATE_ADD(CURDATE(), INTERVAL -9 MONTH)) AS fy_quarter
    UNION
    SELECT YEAR(DATE_ADD(CURDATE(), INTERVAL -6 MONTH)) AS fy,
           QUARTER(DATE_ADD(CURDATE(), INTERVAL -6 MONTH)) AS fy_quarter
    UNION
    SELECT YEAR(DATE_ADD(CURDATE(), INTERVAL -3 MONTH)) AS fy,
           QUARTER(DATE_ADD(CURDATE(), INTERVAL -3 MONTH)) AS fy_quarter
    UNION
    SELECT YEAR(CURDATE()) AS fy,
           QUARTER(CURDATE()) AS fy_quarter
    UNION
    SELECT YEAR(DATE_ADD(CURDATE(), INTERVAL 3 MONTH)) AS fy,
           QUARTER(DATE_ADD(CURDATE(), INTERVAL 3 MONTH)) AS fy_quarter
    UNION
    SELECT YEAR(DATE_ADD(CURDATE(), INTERVAL 6 MONTH)) AS fy,
           QUARTER(DATE_ADD(CURDATE(), INTERVAL 6 MONTH)) AS fy_quarter
    UNION
    SELECT YEAR(DATE_ADD(CURDATE(), INTERVAL 9 MONTH)) AS fy,
           QUARTER(DATE_ADD(CURDATE(), INTERVAL 9 MONTH)) AS fy_quarter
) AS `quarters`
GROUP BY
`quarters`.fy,
`quarters`.fy_quarter
Using Iteration
for(...) {
run SQL query
}
for(...) {
using the previous output and run SQL query again
}
for(...) {
using the previous output and run SQL query again
}
for(...) {
using the previous output and run SQL query again
}
Finally I have my output
It's true that interacting with the DB is costly. Hence people prefer combining several queries into one to optimize performance.
As for better code, why not explain the logic in comments?
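If the single query is kept, one thing that may be worth trying is dropping the cross join between `member_project_stages` and `quarters` in the outer FROM clause, since the outer query only selects columns from `quarters`. The following is a sketch of that idea for the "actual" column only, not a drop-in replacement: it swaps the correlated subquery for a LEFT JOIN, the aliases `q`, `s` and `offsets` are mine, and the forecast column is left out.

```sql
-- Sketch: build the seven quarters once, then LEFT JOIN the stage rows to them.
SELECT CONCAT('FY\'', SUBSTR(q.fy, 3), ' Q', q.fy_quarter) AS name,
       COUNT(s.start_at) AS actual          -- 0 for quarters with no matching rows
FROM (
    SELECT YEAR(CURDATE() + INTERVAL n MONTH)    AS fy,
           QUARTER(CURDATE() + INTERVAL n MONTH) AS fy_quarter
    FROM (
        SELECT -9 AS n UNION ALL SELECT -6 UNION ALL SELECT -3 UNION ALL
        SELECT 0 UNION ALL SELECT 3 UNION ALL SELECT 6 UNION ALL SELECT 9
    ) AS offsets
) AS q
LEFT JOIN member_project_stages AS s
       ON YEAR(s.start_at) = q.fy
      AND QUARTER(s.start_at) = q.fy_quarter
      AND s.stage_id = 9
GROUP BY q.fy, q.fy_quarter;
```

Whether this is actually faster depends on the data and indexes, so it is worth comparing both versions with EXPLAIN.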

How can I efficiently calculate the sales since the nth day of the month?

I have an up to date mysql database installation and I need a function to calculate the sales between the nth of the last month and today. The function will be called several times per day because I'm looking for the point at which the cumulative sales since the previous nth of the month cross a threshold.
The background is that I am developing a subscription site. Customers will sign up on ad-hoc days of the year and make purchases throughout the year. I want to be able to calculate the sales on the month-to-date basis.
If a customer signs up on the nth of the month then I need to calculate the sales between the previous nth of the month and today.
If the customer signs up on the 28th and today is the 30th then you'd think that there had only been 2 days of sales but if today is 30th March then it's been 30 days. In another example: if the customer signs up on Oct 31st how would this handle Feb 28th or April 30th?
I've looked at functions such as timestampdiff() but cannot figure out a practical solution. By that I mean one that will do the job efficiently, without costing too many CPU cycles.
Thanks in anticipation
You might be looking for something like :
DATE_ADD(
LAST_DAY(DATE_SUB(CURDATE(), interval 2 MONTH)),
INTERVAL 15 DAY
)
Given a number of days N (15 here), it returns the Nth day of last month.
As of today, with N = 15, this yields '2018-12-15' (the 15th day of last month, i.e. December 15th, 2018).
If you want to ensure that the returned date will never exceed the last day of last month (like : current month is March and N = 31), you can use :
LEAST(
DATE_ADD(
LAST_DAY(DATE_SUB(CURDATE(), interval 2 MONTH)),
INTERVAL 15 DAY),
LAST_DAY(DATE_SUB(CURDATE(), interval 1 MONTH))
)
Typically, this :
SELECT LEAST(
DATE_ADD(
LAST_DAY(DATE_SUB('2018-03-28', interval 2 MONTH)),
INTERVAL 31 DAY),
LAST_DAY(DATE_SUB('2018-03-28', interval 1 MONTH))
)
Yields: '2018-02-28' (the last day of February).
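To connect this back to the question, the capped expression can serve as the lower bound of a range filter. This is a minimal sketch, assuming a hypothetical table `sales` with columns `customer_id`, `amount` and `sold_at`; `@n` stands for the customer's sign-up day of the month. Like the expressions above, it always starts from the nth day of last month.

```sql
-- Assumptions: `sales`, `customer_id`, `amount`, `sold_at` are hypothetical names.
SET @n := 15;             -- sign-up day of the month
SET @today := CURDATE();

-- nth day of last month, capped at that month's last day
SET @start := LEAST(
    DATE_ADD(LAST_DAY(DATE_SUB(@today, INTERVAL 2 MONTH)), INTERVAL @n DAY),
    LAST_DAY(DATE_SUB(@today, INTERVAL 1 MONTH))
);

SELECT COALESCE(SUM(amount), 0) AS sales_since_nth
FROM sales
WHERE customer_id = 1
  AND sold_at >= @start                    -- from the start of the nth day
  AND sold_at <  @today + INTERVAL 1 DAY;  -- up to the end of today
```

Comparing the raw `sold_at` column against a range, rather than wrapping it in DATE(), leaves room for an index on (customer_id, sold_at) to be used, which matters if the query runs several times a day.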
Thanks to both of you for your contributions. I'd already spent several hours looking into the issue and some of the most promising answers came from stackoverflow which is why I thought to post my question here.
Having spent quite a bit more time trying to define the problem, I agree that the question is ill-formed. As a consequence I'm going to change the problem into something which can be easily resolved.
Rather than rely on the subscription date I'm going to run on a calendar month basis. This forces changes elsewhere but I think I can live with them.
Thanks again
I would recommend TIMESTAMP and INTERVAL. They are simple and efficient, and there is no need for anything more elaborate.
So consider the following schema and query
DROP TABLE IF EXISTS `example_sales`;
CREATE TABLE `example_sales`(
`id` INT(11) UNSIGNED AUTO_INCREMENT,
`id_customer` MEDIUMINT(8) UNSIGNED NOT NULL,
`profits` DECIMAL(16,2) NOT NULL DEFAULT 0,
`ts` TIMESTAMP NOT NULL,
PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARACTER SET = utf8 COLLATE = utf8_unicode_ci;
-- add some values, it doesn't matter the order in the example
INSERT INTO `example_sales`( `id_customer`, `profits`, `ts` ) VALUES
( 1, 10.00, NOW( ) - INTERVAL 12 DAY ),
( 1, 14.00, NOW( ) - INTERVAL 1 WEEK ),
( 1, 110.00, NOW( ) - INTERVAL 30 DAY ),
( 1, 153.00, NOW( ) - INTERVAL 8 DAY ),
( 1, 5.00, NOW( ) - INTERVAL 2 DAY ),
( 1, 97.00, NOW( ) - INTERVAL 13 DAY ),
( 1, 1.00, '2018-02-28 13:00:00' ),
( 1, 2.00, '2018-03-28 13:00:00' ),
( 1, 3.00, '2018-01-30 13:00:00' ),
( 1, 4.00, '2018-03-30 13:00:00' ),
( 1, 42.00, NOW( ) - INTERVAL 42 DAY );
Updated, since I originally didn't fully understand the question. I've kept the timestamp, though, and made some corrections I hadn't noticed earlier.
-- use '2018-03-28' or '2018-03-29' instead of NOW( ), or anything you like
SET @last := DATE( NOW( ) );
SET @first := LEAST( DATE_SUB( @last, INTERVAL 1 MONTH ), LAST_DAY( DATE_SUB( @last, INTERVAL 1 MONTH ) ) );
-- change @last and @first to test different sets
SELECT `profits`, DATE( `ts` ) AS `date`
FROM `example_sales`
WHERE `id_customer` = 1
HAVING `date` BETWEEN @first AND @last
ORDER BY `date`;
And when you are confident enough that this will do the job:
-- filter in WHERE so that only rows inside the range are summed
SELECT SUM( `profits` ) AS `total_profits`
FROM `example_sales`
WHERE `id_customer` = 1
AND DATE( `ts` ) BETWEEN @first AND @last;
Hope that this will do the trick this time.

GROUP BY each year with missing years

I am trying to summarize records using the following query in MySQL. It works great as long as there is at least one record in each year. If records are missing in years, then the year doesn't show up. How can I modify this to show each year within my filter?
SELECT SUM( SICK_SIZE + DEAD_SIZE ) AS Cases, DATE_FORMAT( EVENT_DATE, '%Y' ) AS DateYear
FROM report_case_ext
WHERE DATE_FORMAT( EVENT_DATE, '%Y' ) >= DATE_FORMAT( DATE_ADD( CURDATE( ) , INTERVAL -4YEAR ) , '%Y' )
AND DATE_FORMAT( EVENT_DATE, '%Y' ) <= DATE_FORMAT( CURDATE( ) , '%Y' )
GROUP BY DATE_FORMAT( EVENT_DATE, '%Y' )
In MySQL, you can use user variables (@variables) joined to any other table to simulate row creation. This returns a result set of the years you are looking for, which you then LEFT JOIN to your other table so you know you'll always get the years you want...
select
YearsYouWant.RequireYear as DateYear,
SUM( RCE.SICK_SIZE + RCE.DEAD_SIZE ) AS Cases
from
( select @nYear := @nYear +1 as RequireYear
from report_case_ext,
( select @nYear := year( curdate()) -5 ) sqlvars
limit 5 ) as YearsYouWant
LEFT JOIN
report_case_ext RCE
on YearsYouWant.RequireYear = year( RCE.Event_Date )
GROUP BY
YearsYouWant.RequireYear
The inner prequery that uses "report_case_ext" only needs a table with at least 5 records to generate the years you want. In this case,
@nYear is initialized to 1 year less than the 4 you were looking for, hence -5.
For example, with curdate() in 2013: 2013 - 5 = 2008.
Then, the first time through, select @nYear := @nYear +1 makes the first year 2009 and continues for 5 years, thus generating a record for 2009, 2010, 2011, 2012 and 2013 (via LIMIT 5).
Now that result (all of the years) is LEFT JOINed to the report_case_ext table on common years, so even the years that have no records are still returned.
Create a table that contains all possible years/months/dates (depending on your needs).
Then LEFT JOIN your data to that table.
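If creating a permanent table is not an option, the list of years can also be generated on the fly. The sketch below swaps the physical table for a recursive CTE, which assumes MySQL 8.0 or later; the CTE name `years` and its column `yr` are my own, while the other names come from the question.

```sql
-- Assumption: MySQL 8.0+ (recursive CTEs are not available in earlier versions).
WITH RECURSIVE years (yr) AS (
    SELECT YEAR(CURDATE()) - 4               -- first year of the 5-year window
    UNION ALL
    SELECT yr + 1 FROM years WHERE yr < YEAR(CURDATE())
)
SELECT y.yr AS DateYear,
       COALESCE(SUM(r.SICK_SIZE + r.DEAD_SIZE), 0) AS Cases  -- 0 for empty years
FROM years AS y
LEFT JOIN report_case_ext AS r
       ON YEAR(r.EVENT_DATE) = y.yr
GROUP BY y.yr
ORDER BY y.yr;
```

The LEFT JOIN keeps every generated year, and COALESCE turns the NULL sum of an empty year into 0.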
What is the most straightforward way to pad empty dates in sql results (on either mysql or perl end)?