How can I handle NULL values while doing datetime comparisons? - mysql

My goal is to work out how many accounts there have been on my website at specific times. I allow accounts to be cancelled at any time, but if they were cancelled after the month I'm looking at, I would still like them to be counted, since they were active at that snapshot in time.
My accounts table looks like this:
--------------------------------------------------
id | int
signUpDate | varchar
cancellationTriggeredDate | datetime (NULLABLE)
--------------------------------------------------
I wrote a select statement to accomplish this goal which looks like:
SELECT
COUNT(*) AS January_2020
FROM
Accounts
WHERE
STR_TO_DATE(signUpDate, '%d/%m/%Y') <= STR_TO_DATE('31/01/2020', '%d/%m/%Y')
AND cancellationTriggeredDate <= '2020-01-31 00:00:00'
The expected result would be 3, which is how many accounts I had in January that had not yet been cancelled at that point. The actual result is 0. I believe this is because not all of my accounts have a cancellation date set, but I'm not sure how to handle this.
To make it easier to get help, I have created a SQL Fiddle including sample data and schema.
http://sqlfiddle.com/#!9/64f3e3

Your date comparison needs to be correct for the cancellation date.
Then you can use OR to handle NULL:
SELECT COUNT(*) AS January_2020
FROM Accounts
WHERE STR_TO_DATE(signUpDate, '%d/%m/%Y') <= '2020-01-31' and
(cancellationTriggeredDate > '2020-01-31' OR
cancellationTriggeredDate IS NULL
)
Here is the db<>fiddle. Note that the above gives the users who are active at exactly 2020-01-31 00:00:00 -- that is, at the beginning of the day. There are different things that you might mean:
Customers active at exactly 2020-01-31 00:00:00
Customers active at exactly 2020-02-01 00:00:00
Customers active for the entire day of 2020-01-31
Customers active for the entire month of 2020-01
Customers active at any time during the month of 2020-01
All of these use basically the same logic, just by tweaking the specific comparisons.
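The variants listed above differ only in which instants are compared. Here is a minimal Python sketch of the same logic, with None standing in for SQL NULL. The function names and sample rows are hypothetical and are not taken from the fiddle:

```python
from datetime import datetime

def active_at(signed_up, cancelled, instant):
    """Active at one instant: signed up by then and not yet cancelled.

    `cancelled` is None for accounts that were never cancelled,
    mirroring `cancellationTriggeredDate IS NULL` in the SQL.
    """
    return signed_up <= instant and (cancelled is None or cancelled > instant)

def active_during(signed_up, cancelled, start, end):
    """Active at any time in the half-open window [start, end)."""
    return signed_up < end and (cancelled is None or cancelled > start)

# Hypothetical rows: (signUpDate, cancellationTriggeredDate)
accounts = [
    (datetime(2019, 12, 1), None),                   # never cancelled
    (datetime(2020, 1, 10), datetime(2020, 3, 5)),   # cancelled after January
    (datetime(2019, 11, 1), datetime(2020, 1, 15)),  # cancelled mid-January
    (datetime(2020, 2, 2), None),                    # signed up after January
]

snapshot = datetime(2020, 1, 31)
at_snapshot = sum(active_at(s, c, snapshot) for s, c in accounts)
in_january = sum(
    active_during(s, c, datetime(2020, 1, 1), datetime(2020, 2, 1))
    for s, c in accounts
)
```

With these rows, the account cancelled mid-January counts for "any time during January" but not for the snapshot at 2020-01-31 00:00:00.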

Alternate solution - you can use CASE WHEN and initialize cancellationTriggeredDate with a hypothetically higher date (i.e. end of this century) before taking this field for comparison in WHERE predicate.
SELECT
COUNT(*) AS January_2020
FROM
Accounts
WHERE
STR_TO_DATE(signUpDate, '%d/%m/%Y') <= STR_TO_DATE('31/01/2020', '%d/%m/%Y')
AND CASE WHEN cancellationTriggeredDate IS NULL THEN '2099-12-31' ELSE cancellationTriggeredDate END > '2020-01-31 00:00:00'
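A bare comparison against NULL evaluates to NULL rather than true, which is why rows without a cancellation date drop out of the original query. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained), with hypothetical rows stored as ISO-8601 strings. COALESCE, which MySQL also supports, is a shorter spelling of the CASE WHEN ... IS NULL expression:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (id INTEGER, cancellationTriggeredDate TEXT)")
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    # Hypothetical rows: one never cancelled, one cancelled later, one earlier
    [(1, None), (2, "2020-03-05 00:00:00"), (3, "2020-01-15 00:00:00")],
)

def count_where(predicate):
    # `predicate` is a trusted, hard-coded SQL fragment in this sketch
    sql = f"SELECT COUNT(*) FROM Accounts WHERE {predicate}"
    return conn.execute(sql).fetchone()[0]

# NULL > '...' is NULL, so the never-cancelled row is filtered out
bare = count_where("cancellationTriggeredDate > '2020-01-31 00:00:00'")

# Sentinel date substituted for NULL via CASE WHEN
case_when = count_where(
    "CASE WHEN cancellationTriggeredDate IS NULL THEN '2099-12-31' "
    "ELSE cancellationTriggeredDate END > '2020-01-31 00:00:00'"
)

# Same substitution via COALESCE
coalesce = count_where(
    "COALESCE(cancellationTriggeredDate, '2099-12-31') > '2020-01-31 00:00:00'"
)
```

Both sentinel forms count the never-cancelled account; the bare comparison does not.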

Related

How to get the number of active events happening in a week given start and end date (MYSQL)?

This is the current table I have.
ID Start_Date End_Date
6446 2018-01-01 00:00:00 2018-04-01 00:00:00
6848 2018-05-01 00:00:00 2018-05-31 00:00:00
3269 2016-11-09 00:00:00 2016-11-21 00:00:00
7900 2018-11-07 00:00:00 2018-11-30 00:00:00
4006 2017-04-06 00:00:00 2017-04-30 00:00:00
Is there a way to get the number of active events per week? Some events might run past a few weeks. Event ID is distinct and can be used to count.
Please help and happy to furnish more info if required.
EDIT 1: The dataset I want is
2019 week 1 - 60 active events
2019 week 2 - 109 active events
I know about WEEK(datetime), however that does not capture the event being active for subsequent weeks.
The issue is that I don't capture the number of active events after the week they are started.
EDIT 2: Week would be defined as the integer returned using the week() function in mysql on a date object. My data is only for 2019.
Try to use count() function in MySQL.
SELECT COUNT(*) FROM your_table WHERE Start > 'start-date' AND End < 'end-date'
Give the query below a try:
SELECT
IFNULL(DATE_FORMAT(Start_Date, '%Y WEEK %U'), 0) AS STARTDate,
IFNULL(DATE_FORMAT(End_Date, '%Y WEEK %U'), 0) AS ENDDate,
IFNULL(COUNT(ID),0) AS total,
group_concat(ID)
FROM `event`
where Start_Date < End_Date
Group by STARTDate;
I found an answer, but encountered a new problem.
Building upon Kranti's code, the answer is as follows.
SELECT
EXTRACT(WEEK FROM starting_date) AS STARTDate,
EXTRACT(WEEK FROM ending_date) AS ENDDate,
discount_type,
COUNT(ID) AS total
FROM `event`
where starting_date < ending_date
Group by 1,2
What this gives me is the number of events that ran from week 1 to week 3, from week 1 to week 4, and so on.
Afterwards, we do a left join with the weeks of interest, on the condition that the week number lies between STARTDate and ENDDate. Because of how a join works, each row is duplicated once for every week that fulfils the condition.
We follow up with a GROUP BY and a SUM, which gives us the number of events that were active in each week.
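The join-then-sum step described above amounts to expanding each event into every week it touches and then counting events per week. Here is a rough Python sketch of that idea, using hypothetical events and ISO week numbers as a stand-in (MySQL's WEEK() numbers weeks differently depending on its mode, so the labels will not necessarily match):

```python
from collections import Counter
from datetime import date, timedelta

def active_events_per_week(events):
    """Count, for each (ISO year, ISO week), the events active on any day of it."""
    counts = Counter()
    for start, end in events:
        weeks = set()
        day = start
        while day <= end:
            iso = day.isocalendar()
            weeks.add((iso[0], iso[1]))  # (ISO year, ISO week number)
            day += timedelta(days=1)
        # Each event contributes at most once per week it overlaps
        counts.update(weeks)
    return counts

# Hypothetical events: (start date, end date), both inclusive
events = [
    (date(2019, 1, 1), date(2019, 1, 15)),  # spans three weeks
    (date(2019, 1, 8), date(2019, 1, 9)),   # sits inside a single week
]
per_week = active_events_per_week(events)
```

Walking day by day is simple rather than fast; for large tables, a calendar table of weeks joined in SQL, as the answer describes, is the more natural fit.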

MySQL searching timestamp columns by date only

I am building out a query to search a table by a timestamp column value. An example of the format I am passing to the api is 2018-10-10. The user has the ability to select a date range. Often times the date range start date is 2018-10-10 and end date is the same day, 2018-10-10. The below doesn't seem to do the trick. What is the simplest way to accomplish this without having to specify the time? Obviously, I'd like to query for the entire day of 2018-10-10 from start to end of day.
SELECT
count(*)
FROM
contact
WHERE
created_at >= '2018-10-10'
AND created_at <= '2018-10-10';
The problem here is that the TIMESTAMP data type also carries a time (HH:MM:SS) component. When comparing a datetime with a date, MySQL automatically assumes 00:00:00 as the time for the date value.
So, 2018-10-10 12:23:22 will not match the condition created_at <= '2018-10-10', since it is treated as '2018-10-10 12:23:22' <= '2018-10-10 00:00:00', which is false.
To handle this, you can add one day to the end date (date_to in the filter) and use the < operator for the range check.
SELECT
count(*)
FROM
contact
WHERE
created_at >= '2018-10-10'
AND created_at < ('2018-10-10' + INTERVAL 1 DAY);
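The same half-open range can be sketched in Python to show why the end bound moves to the next midnight. The helper names here are hypothetical:

```python
from datetime import date, datetime, time, timedelta

def day_range(date_from, date_to):
    """Return [start, end) covering every instant from date_from through date_to."""
    start = datetime.combine(date_from, time.min)
    # Exclusive upper bound: midnight at the start of the day after date_to
    end = datetime.combine(date_to + timedelta(days=1), time.min)
    return start, end

def in_range(ts, date_from, date_to):
    start, end = day_range(date_from, date_to)
    return start <= ts < end

day = date(2018, 10, 10)
```

For example, `in_range(datetime(2018, 10, 10, 12, 23, 22), day, day)` is true, while midnight on 2018-10-11 falls outside the range.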

Dynamic due dates checking with the given period of dates

id start_date interval period
1 1/22/2018 2 month
2 2/25/2018 3 week
3 11/24/2017 3 day
4 7/22/2017 1 year
5 2/25/2018 2 week
The above is a sample of my table data. Each start_date expires based on interval and period (i.e. id 1 is due 2 months after its start_date, id 2 is due 3 weeks after, and so on). period is an enum of (day, week, month, year). The client can supply any date range, say 25-06-2026 to 13-07-2026. I have to return the ids whose due dates fall within that range. I hope I have made my question clear.
Here is what I have done so far. I am using MySQL 5.7. I found ways to achieve this with recursive CTEs, which are not available in MySQL 5.7. There is also a way to do it by generating virtual records with inline subqueries and unions, but that is a performance killer and limits how many records can be generated (as in the linked question, Generating a series of dates). I have got as far as getting results for a single date, which is easy. Below is my query (in Oracle):
select id
from (select a.*,
case
when period='week'
then mod((to_date('22-07-2018','dd-mm-yyyy')-start_date),7*interval)
when period='month' and to_char(to_date('22-07-2018','dd-mm-yyyy'),'dd')=to_char(start_date,'dd')
and mod(months_between(to_date('22-07-2018','dd-mm-yyyy'),start_date),interval)=0
then 0
when period='year' and to_char(to_date('22-07-2018','dd-mm-yyyy'),'dd-mm')=to_char(start_date,'dd-mm')
and mod(months_between(to_date('22-07-2018','dd-mm-yyyy'),start_date)/12,interval)=0
then 0
when period='day'
and mod((to_date('22-07-2018','dd-mm-yyyy')-start_date),interval)=0
then 0 else 1 end filter from kml_subs a)
where filter=0;
But I need to do this for a period of dates not a single date. Any suggestions or solutions will be much appreciated.
Thanks,
Kannan
Assuming this is an Oracle question and not MySQL:
I think the first thing that you need to do is calculate when the due date is. I think a simple case statement can handle that for you:
case when period = 'day' then start_date + numtodsinterval(interval,period)
when period = 'week' then start_date + numtodsinterval(interval*7,'day')
when period = 'month' then add_months(start_date,interval)
when period = 'year' then add_months(start_date,interval*12)
end due_date
Then, using that new due_date field, you can check if the due date falls between the desired date range.
select *
from(
select id,
start_date,
interval,
period,
case when period = 'day' then start_date + numtodsinterval(interval,period)
when period = 'week' then start_date + numtodsinterval(interval*7,'day')
when period = 'month' then add_months(start_date,interval)
when period = 'year' then add_months(start_date,interval*12)
else null end due_date
from data)
where due_date between date '2018-02-25' and date '2018-03-12'
The above query checking between 2/25/18 and 3/12/18 produces the following output using your data:
+----+-------------+----------+--------+-------------+
| id | start_date | interval | period | due_date |
+----+-------------+----------+--------+-------------+
| 2 | 05-FEB-2018 | 3 | week | 26-FEB-2018 |
| 5 | 25-FEB-2018 | 2 | week | 11-MAR-2018 |
+----+-------------+----------+--------+-------------+
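The due-date calculation in this answer can also be sketched outside the database. This is a minimal Python version that assumes a single due date per row, as the answer does. The helper names are hypothetical, the date range in the usage note is chosen only for illustration, and the end-of-month handling is a simple clamp that may differ from Oracle's ADD_MONTHS:

```python
import calendar
from datetime import date, timedelta

def add_months(d, months):
    """Add calendar months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))

def due_date(start, interval, period):
    if period == "day":
        return start + timedelta(days=interval)
    if period == "week":
        return start + timedelta(weeks=interval)
    if period == "month":
        return add_months(start, interval)
    if period == "year":
        return add_months(start, 12 * interval)
    raise ValueError(f"unknown period: {period!r}")

# Sample rows from the question: (id, start_date, interval, period)
rows = [
    (1, date(2018, 1, 22), 2, "month"),
    (2, date(2018, 2, 25), 3, "week"),
    (3, date(2017, 11, 24), 3, "day"),
    (4, date(2017, 7, 22), 1, "year"),
    (5, date(2018, 2, 25), 2, "week"),
]

def ids_due_between(rows, first, last):
    # Inclusive on both ends, like SQL BETWEEN
    return [i for i, s, n, p in rows if first <= due_date(s, n, p) <= last]
```

Calling `ids_due_between(rows, date(2018, 3, 1), date(2018, 3, 31))` returns the ids whose single due date lands in March 2018.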

Return rows between two dates

I have ItemsRent table,
ID | ParentID | SubID | StartDate | EndDate |
--------------------------------------------------------------------
1 | 100 | 102 | 2014-09-09 17:40:00 | 2014-11-09 17:40:00 |
2 | 70 | 73 | 2014-08-09 14:20:00 | 2014-12-09 13:40:00 |
The dates are in sql format.
My input dates are:
InputStartDate: 2014-09-09 18:00:00
InputEndDate: 2014-10-09 13:47:00
And I want to return the best row only if the dates are between the two dates. So for example:
Let's call StartDate S and EndDate E.
The input dates will be IS for InputStartDate and IE for InputEndDate.
S E
|----------------|
IS IE
|XXXXXXX--------|
Any suggestions ?
This query will produce a result matching your illustration. It will find all rows where any time at all was spent between InputStartDate and InputEndDate, and output a modified date range that is clamped by InputStartDate and InputEndDate.
SELECT ID, ParentId, SubId,
GREATEST( InputStartDate, StartDate ) AS Date_Start,
LEAST( InputEndDate, EndDate ) AS Date_End
FROM `itemsRent`
WHERE InputStartDate <= EndDate AND InputEndDate >= StartDate
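The overlap test and clamping in the query above can be sketched in Python, where the built-in two-argument max and min pick the later start and the earlier end. The function name is hypothetical; the rows and input dates are the ones from the question:

```python
from datetime import datetime

def clamp_overlap(start, end, input_start, input_end):
    """Return the part of [start, end] inside [input_start, input_end], or None."""
    # Two ranges overlap when each one starts before the other ends
    if input_start <= end and input_end >= start:
        return max(input_start, start), min(input_end, end)
    return None

input_start = datetime(2014, 9, 9, 18, 0)
input_end = datetime(2014, 10, 9, 13, 47)

# ID -> (StartDate, EndDate)
rows = {
    1: (datetime(2014, 9, 9, 17, 40), datetime(2014, 11, 9, 17, 40)),
    2: (datetime(2014, 8, 9, 14, 20), datetime(2014, 12, 9, 13, 40)),
}
clamped = {
    key: clamp_overlap(s, e, input_start, input_end)
    for key, (s, e) in rows.items()
}
```

Both sample rows fully contain the input range, so each is clamped to exactly the input start and end.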
Normally, I would say to use the BETWEEN operator, but since you are storing both the start and end dates in the table, this would get more complicated than it needs to be. If you assume that the start date being stored is before the end date, you only need to perform two checks.
SELECT * FROM `itemsRent` WHERE `StartDate` > 1410285600 AND `EndDate` < 1410356820
This verifies that the start date of the item takes place after the specified start date. The issue with this is that it does not check if it takes place before the end date. Instead of explicitly writing this check, you can make sure of this by checking that the item's end date takes place before the specified end date.
NOTE: Might cause issues if the start date does not occur before the end date. If this is a possibility, then you will need to explicitly write these checks. This would be a good case in which to use the BETWEEN operation.
Why not just pull out the max end date in the table as the high date?
SELECT *
FROM itemsRent
WHERE (InputStartDate BETWEEN startdate AND enddate)
AND (InputEndDate BETWEEN startdate AND enddate);
Your query is currently looking for rows whose EndDate is earlier than your provided StartDate or after your provided EndDate. I don't think that is what you want.
If you want rows where both StartDate and EndDate are between your provided dates (let's call them your-start-date and your-end-date), your query should be something like:
SELECT * FROM `itemsRent` WHERE `StartDate` > your-start-date AND `StartDate` < your-end-date AND `EndDate` > your-start-date AND `EndDate` < your-end-date

MySQL - Using a date range vs functions

I needed to know how many users registered during June and July, this is the first query I wrote:
select count(*) from users where created_at>="2013-06-01" and created_at<="2013-07-31"
Result: 15,982
select count(*) from users where year(created_at)=2013 and month(created_at) in (6,7)
Result: 16,278
Why do they return a different result? Could someone explain? Or am I missing something?
Thanks.
Both queries should be equivalent, except that the first one is able to make use of an index and should be faster, and except when created_at is a TIMESTAMP rather than a DATE.
If created_at is a timestamp, you should write your first query this way:
select count(*) from users
where created_at>='2013-06-01' and created_at<'2013-08-01'
Otherwise your first query will exclude all records created on the 31st of July after midnight, e.g. 2013-07-31 00:00:00 will be included while 2013-07-31 09:15:43 will not.
The reason is that your date values do not include the last day: The date constants are converted to a timestamp at midnight. You are querying between these values:
2013-06-01 00:00:00
2013-07-31 00:00:00
So only the very start of the last day (midnight exactly) is included.
Try this:
select count(*)
from users
where created_at>="2013-06-01"
and created_at<="2013-07-31 23:59:59"
Or more simply make less than the next day:
select count(*)
from users
where created_at>="2013-06-01"
and created_at<"2013-08-01" -- < 1st day of next month
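Computing the exclusive upper bound as the first day of the following month avoids hard-coding month lengths or a last second of the day. Here is a small Python sketch; the helper name is hypothetical, and the December branch is included only to show the year rollover:

```python
from datetime import datetime

def month_span(year, first_month, last_month):
    """Half-open [start, end) covering first_month..last_month of one year."""
    start = datetime(year, first_month, 1)
    if last_month == 12:
        end = datetime(year + 1, 1, 1)  # roll over into January of next year
    else:
        end = datetime(year, last_month + 1, 1)
    return start, end

start, end = month_span(2013, 6, 7)
late_july = datetime(2013, 7, 31, 9, 15, 43)

# Bounds of the first query: the upper bound is midnight on 31 July
inclusive_midnight = start <= late_july <= datetime(2013, 7, 31)
# Half-open bounds: anything before 1 August counts
half_open = start <= late_july < end
```

The timestamp from the answer, 2013-07-31 09:15:43, fails the first check and passes the second, which accounts for the kind of discrepancy the question observed.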