Getting records whose date is a multiple of 30 days ago - MySQL

I have the following query to get appointments that need a reminder once a month if they are not done yet. I want to get records dated 30, 60, 90, 120, etc. days in the past from the current date.
SELECT
a.*
FROM
appointments a
WHERE
DATEDIFF(CURDATE(), a.appointment_date) % 30 = 0
Is there another way to achieve this without DATEDIFF? I want to improve the performance of this query.

OK, let's put the dates and DATEDIFF aside for a moment. Looking at the question, the person is trying to find all appointments in the past that don't necessarily have another in the future, such as a FOLLOW-UP appointment with a doctor: "Come back in a month to see how things change." This makes me think there is probably some patient ID in the appointments table. So the question probably becomes looking at appointments from 30, 60 or 90 days ago to see whether a corresponding appointment has been scheduled in the future. If one is already scheduled, the patient does not need a reminder call to get into the office.
That said, I would start a bit differently: get all patients that DID have an appointment within the last 90 days, and see whether they already have a follow-up appointment on the schedule. This way, the office person can contact those patients to get them on the calendar.
Start by getting the most recent appointment for each patient within the last 90 days. If someone had an appointment 90 days ago and a follow-up 59 days ago, then you probably only care about the most recent appointment, to make sure THAT one has a follow-up.
select
a1.patient_id,
max( a1.appointment_date ) MostRecentApnt
from
appointments a1
WHERE
a1.appointment_date > date_sub( curdate(), interval 90 day )
group by
a1.patient_id
Now, from this fixed list and beginning date, all we care about is how many days it has been since their last appointment. Is it X number of days? Just use DATEDIFF and sort, and you can see the number of days directly. Rather than trying to break them into buckets of 30, 60 or 90 days, just knowing how many days have passed since the last appointment is probably as easy as sorting in DESCENDING order, with the oldest appointments getting called first and those that just happened last. Maybe even cut off the calling list at, say, 20 days for patients who still have not made an appointment and are getting CLOSE to the expected 30 days in question.
SELECT
p.LastName,
p.FirstName,
p.Phone,
Last90.Patient_ID,
Last90.MostRecentApnt,
DATEDIFF(CURDATE(), Last90.MostRecentApnt) LastAppointmentDays
FROM
( select
a1.patient_id,
max( a1.appointment_date ) MostRecentApnt
from
appointments a1
WHERE
a1.appointment_date > date_sub( curdate(), interval 90 day )
group by
a1.patient_id ) Last90
-- Guessing you might want patient data to do phone calling
JOIN Patients p
on Last90.Patient_id = p.patient_id
order by
LastAppointmentDays DESC,
p.LastName,
p.FirstName
Sometimes an answer to just the direct question doesn't address the actual need. Hopefully I am more on target with the ultimately desired outcome. Again, the above implies joining to the patient table for follow-up calls to schedule an appointment.
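To make the intended result concrete, here is a minimal Python sketch of the same idea: keep each patient's most recent appointment inside the look-back window, compute the days since, and list the oldest first. The function name `follow_up_call_list` and the tuple layout are hypothetical, and skipping appointments dated after today is an assumption that is not in the query above.

```python
from datetime import date, timedelta


def follow_up_call_list(appointments, today, window_days=90):
    """appointments: iterable of (patient_id, appointment_date) tuples.

    Returns [(patient_id, most_recent_date, days_since)], oldest first.
    """
    cutoff = today - timedelta(days=window_days)
    latest = {}
    for patient_id, appt_date in appointments:
        # Assumption: only past appointments inside the window count.
        if cutoff < appt_date <= today:
            if patient_id not in latest or appt_date > latest[patient_id]:
                latest[patient_id] = appt_date
    rows = [(pid, d, (today - d).days) for pid, d in latest.items()]
    # Largest number of days since the last appointment comes first.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

A patient whose only appointments fall outside the window simply does not appear in the list.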

You could use the following query, which compares the day of the month of the appointment to the day of the month of today.
We also test whether today is the last day of the month so as to get appointments due at the end of the month. For example, on the 28th of February (not a leap year) we will accept days of the month >= 28, i.e. 28, 29, 30 and 31; days 29 to 31 would otherwise be missed.
This method has the same problem as your current system: appointments falling during the weekend will be missed.
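select a.*
from appointments a,
(select
day(now()) today,
-- last_day() returns a date, so take its day number before comparing
case when day(now()) = day(last_day(now())) then day(now()) else 99 end lastDay
) days
where day(a.appointment_date) = today or day(a.appointment_date) >= lastDay;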
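The matching rule in this query can be sketched in Python to check the end-of-month behaviour. This is only an illustration of the rule described above; the function name `due_by_day_of_month` is hypothetical.

```python
import calendar
from datetime import date


def due_by_day_of_month(appointment_date, today):
    """True when the appointment's day of month matches today's.

    On the last day of a month, later day numbers (29, 30, 31) also match.
    """
    last_day = calendar.monthrange(today.year, today.month)[1]
    if appointment_date.day == today.day:
        return True
    return today.day == last_day and appointment_date.day >= last_day
```

For instance, an appointment on 31 January matches on 28 February 2022 but not on 27 February 2022.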

You just want the appointments for 30 days in the future? Are they stored as DATE? Or DATETIME? Well, this works in either case:
SELECT ...
WHERE appt_date >= CURDATE() + INTERVAL 30 DAY
AND appt_date < CURDATE() + INTERVAL 31 DAY
If you have INDEX(appt_date) (or any index starting with appt_date), the query will be efficient.
Expressions that wrap the column in a function, such as DATE(appt_date) or DATEDIFF(CURDATE(), appt_date), are not "sargable" and prevent the use of an index.
If your goal is to nag customers, I see nothing in your query to prevent nagging everyone over and over. This might need a separate "nag" table, where customers who have satisfied the nag can be removed. Then performance won't be a problem, since the table will be small.
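For the original 30/60/90-day requirement, one way to keep the filter sargable is to compute the candidate dates in application code and compare the bare column against that list. The sketch below assumes a Python client; the helper names `reminder_dates` and `build_query`, the `oldest` cut-off, and the `%s` placeholder style are assumptions, not something from the question.

```python
from datetime import date, timedelta


def reminder_dates(today, oldest, step_days=30):
    """Dates exactly 30, 60, 90, ... days before today, not earlier than oldest."""
    dates = []
    current = today - timedelta(days=step_days)
    while current >= oldest:
        dates.append(current)
        current -= timedelta(days=step_days)
    return dates


def build_query(dates):
    """Parameterised query comparing the bare column against a date list."""
    if not dates:
        raise ValueError("no candidate dates")
    placeholders = ", ".join(["%s"] * len(dates))
    return (
        "SELECT a.* FROM appointments a "
        f"WHERE a.appointment_date IN ({placeholders})"
    )
```

Because the column is compared directly with constants, an index on `appointment_date` can be considered by the optimizer; this assumes the column is a DATE rather than a DATETIME.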

If your primary concern is to speed up this query, we can add an INT column that stores the day number modulo 30, and index it. We then add triggers that calculate DATEDIFF between the appointment date and the start of the Unix epoch, 1970-01-01 (or any other fixed date if you prefer), take it modulo 30, and store the result in this column.
This will take a small amount of storage space and slow down insert and update operations. This will not be noticeable when we add or modify one appointment at a time, which I suspect to be the general case.
When we query the table, we calculate the same value for today, which takes very little time because it is done only once, and compare it with the days column, which is very quick because the column is indexed and no per-row calculation is involved.
Finally we run your current query and look at it using EXPLAIN to see that, even though we have indexed the column date_, the index cannot be used for this query.
CREATE TABLE appointments (
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
date_ date,
days int
);
CREATE INDEX ix_apps_days ON appointments (days);
CREATE PROCEDURE apps_day()
BEGIN
-- Backfill existing rows with the same value the triggers compute
UPDATE appointments SET days = DATEDIFF(date_, '1970-01-01') % 30;
END
CREATE TRIGGER t_apps_insert BEFORE INSERT ON appointments
FOR EACH ROW
BEGIN
SET NEW.days = DATEDIFF(NEW.date_, '1970-01-01') % 30 ;
END;
CREATE TRIGGER t_apps_update BEFORE UPDATE ON appointments
FOR EACH ROW
BEGIN
SET NEW.days = DATEDIFF(NEW.date_, '1970-01-01') % 30 ;
END;
insert into appointments (date_) values ('2022-01-01'),('2022-01-01'),('2022-04-15'),(now());
update appointments set date_ = '2022-01-12' where id = 1;
select * from appointments
id | date_ | days
-: | :--------- | ---:
1 | 2022-01-12 | 14
2 | 2022-01-01 | 3
3 | 2022-04-15 | 17
4 | 2022-04-22 | 24
select
*
from appointments
where DATEDIFF(CURDATE() , '1970-01-01') % 30 = days;
id | date_ | days
-: | :--------- | ---:
4 | 2022-04-22 | 24
explain
select DATEDIFF(CURDATE() , '1970-01-01')
from appointments
where DATEDIFF(CURDATE() , '1970-01-01') % 30 = days;
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :---------- | :----------- | :--------- | :--- | :------------ | :----------- | :------ | :---- | ---: | -------: | :----------
1 | SIMPLE | appointments | null | ref | ix_apps_days | ix_apps_days | 5 | const | 1 | 100.00 | Using index
CREATE INDEX ix_apps_date_ ON appointments (date_);
SELECT
a.*
FROM
appointments a
WHERE
DATEDIFF(CURDATE(), a.date_) % 30 = 0
id | date_ | days
-: | :--------- | ---:
4 | 2022-04-22 | 24
explain
SELECT
a.*
FROM
appointments a
WHERE
DATEDIFF(CURDATE(), a.date_) % 30 = 0
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :---------- | :---- | :--------- | :--- | :------------ | :--- | :------ | :--- | ---: | -------: | :----------
1 | SIMPLE | a | null | ALL | null | null | null | null | 4 | 100.00 | Using where
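The reason the stored value works is plain modular arithmetic: the difference between two day numbers is a multiple of 30 exactly when both day numbers leave the same remainder modulo 30. A small Python sketch of that check, using the same 1970-01-01 reference date (the function names are hypothetical):

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def day_bucket(d, step=30):
    """Days since 1970-01-01 modulo step, i.e. the value the triggers store."""
    return (d - EPOCH).days % step


def is_multiple_of_step(today, d, step=30):
    """Equivalent of DATEDIFF(today, d) % step = 0 for dates not after today."""
    return (today - d).days % step == 0
```

With the sample rows above, `day_bucket` gives 14 for 2022-01-12, 3 for 2022-01-01, 17 for 2022-04-15 and 24 for 2022-04-22, matching the days column.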

Related

MySQL get NOT overlapping date ranges

I have seen so many questions similar to this, but they all seem to be tailored to highlighting when date ranges are overlapping, I need the opposite.
Let's say I have a table like so:
id| start_date | end_date | room_id
1 | 15/05/2018 | 30/06/2020 | 1
2 | 01/11/2018 | 31/10/2019 | 2
3 | 01/08/2020 | 31/07/2022 | 1
4 | 01/12/2019 | 30/11/2021 | 2
5 | 01/08/2020 | 31/07/2022 | 3
As you can see, there are multiple bookings for each room. I need to be able to specify a start date, an end date, or both, and get back what DOESN'T overlap (i.e., the available rooms).
For example, if I specify just a start date of 01/05/2018 then every room will be returned, and if I specify just an end date of 30/07/2020 then every room will be returned, because neither of those dates falls between the start and end date of any booking. Even though room 1 has a booking that ends on 30/06/2020 and a new one that starts on 01/08/2020, it would still be available because someone could book between those 2 dates.
If I specify both start and end dates, it should return only the rooms that have no bookings between the 2 dates at all.
I have read plenty of questions online and the logic seems to be
SELECT *
FROM bookings
WHERE $start_date < end_date AND $end_date > start_date
which I understand, but if I run the query above with the following dates
SELECT *
FROM bookings
WHERE '2018-10-01' < end_date AND '2019-10-01' > start_date
it returns
id| start_date | end_date | room_id
1 | 15/05/2018 | 30/06/2020 | 1
2 | 01/11/2018 | 31/10/2019 | 2
How do I get it so that when I pass either a start date, end date or BOTH it returns the rooms that are available?
By De Morgan's Laws, we can negate the overlapping range query you gave as follows:
SELECT *
FROM bookings
WHERE $start_date >= end_date OR $end_date <= start_date;
The expression ~(P ^ Q) is equivalent to ~P V ~Q.
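One thing worth checking is that negating the condition row by row returns the bookings that do not overlap, which is not quite the same as the rooms that are free: a room with one overlapping and one non-overlapping booking would still be listed. A hedged Python sketch of the per-room check, using the sample table (the function names are hypothetical, and rooms without any booking are not represented in this data):

```python
from datetime import date


def overlaps(booking_start, booking_end, start, end):
    """The overlap test from the question."""
    return start < booking_end and end > booking_start


def available_rooms(bookings, start, end):
    """bookings: iterable of (room_id, start_date, end_date) tuples.

    A room is available only if none of its bookings overlaps the range.
    """
    all_rooms = {room for room, _, _ in bookings}
    busy = {room for room, s, e in bookings if overlaps(s, e, start, end)}
    return sorted(all_rooms - busy)
```

In SQL this usually corresponds to a NOT EXISTS or NOT IN subquery over the overlapping bookings rather than a negated WHERE on the bookings table itself.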

MySQL select avg reading every hour even if there is no reading

I'm having a hard time making a MySQL statement from a Postgres one for a project we are migrating. I won't give the exact use case since it's pretty involved, but I can create a simple comparable situation.
We have a graphing tool that needs somewhat raw output for our data in hourly intervals. In Postgres, the SQL would generate a series for the date and hour over a time span, then join a query against that for the average wherever that date and hour existed. We were able to get, for example, the average sales by hour, even if that number is 0.
Here's a table example:
Sales
datetime | sale
2017-12-05 08:34:00 | 10
2017-12-05 08:52:00 | 20
2017-12-05 09:15:00 | 5
2017-12-05 10:22:00 | 10
2017-12-05 10:49:00 | 10
Where something like
SELECT DATE_FORMAT(s.datetime,'%Y%m%d%H') as "byhour", AVG(s.sale) as "avg sales" FROM sales s GROUP BY byhour
would produce
byhour | avg sales
2017120508 | 15
2017120509 | 5
2017120510 | 10
I'd like something that gives me the last 24 hours, even the 0/NULL values like
byhour | avg sales
2017120501 | null
2017120502 | null
2017120503 | null
2017120504 | null
2017120505 | null
2017120506 | null
2017120507 | null
2017120508 | 15
2017120509 | 5
2017120510 | 10
...
2017120600 | null
Does anyone have any ideas how I could do this in MySQL?
Join the result on a table that you know contains all the desired hours,
something like this:
SELECT
* FROM (
SELECT
DATE_FORMAT(h.datetime, '%Y%m%d%H') AS 'byhour'
FROM
table_that_has_hours h
GROUP BY byhour) hours LEFT OUTER JOIN (
SELECT
DATE_FORMAT(s.datetime, '%Y%m%d%H') AS 'byhour',
AVG(s.sale) AS 'avg sales'
FROM
sales s
GROUP BY byhour) your_stuff ON your_stuff.byhour = hours.byhour
If you don't have a table like that, you can create one,
like this:
CREATE TABLE ref (h INT);
INSERT INTO ref (h)
VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23);
and then you can add each h as an hour offset to today's date and format it, e.g. DATE_FORMAT(DATE(NOW()) + INTERVAL h HOUR, '%Y%m%d%H'), to get the byhour values.
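The effect of the left join against a complete list of hours can be sketched in Python: build every hour key for the period first, then attach an average where one exists and None otherwise. The function name `hourly_averages` and its parameters are hypothetical; the key format mirrors '%Y%m%d%H'.

```python
from datetime import datetime, timedelta


def hourly_averages(sales, now, hours=24):
    """sales: iterable of (datetime, amount). Returns [(byhour, avg or None)]."""
    totals = {}
    for ts, amount in sales:
        key = ts.strftime("%Y%m%d%H")
        running_sum, count = totals.get(key, (0, 0))
        totals[key] = (running_sum + amount, count + 1)

    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    result = []
    # Walk from the oldest hour to the current one so no hour is skipped.
    for back in range(hours - 1, -1, -1):
        key = (top_of_hour - timedelta(hours=back)).strftime("%Y%m%d%H")
        if key in totals:
            running_sum, count = totals[key]
            result.append((key, running_sum / count))
        else:
            result.append((key, None))
    return result
```

Hours without any sale come back with None, which plays the role of the NULL produced by the outer join.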

Use Max date to create a date range

I need to create a date range in a table that houses transaction information. The table updates sporadically throughout the week from a manual process. Each time the table is updated transactions are added up to the previous Sunday. For instance, the upload took place yesterday and so transactions were loaded through last Sunday (Feb 26th). If it had been loaded on Wednesday it would still be dated for Sunday. The point is that I have a moving target with my transactions and also when the data is loaded to the table. I am trying to fix my look back period to the date of the latest transaction then go three weeks back. Here is the query that I came up with:
SELECT distinct TransactionDate
FROM TransactionTABLE TB
inner join (
SELECT distinct top 21 TransactionDate FROM TransactionTABLE ORDER BY TransactionDate desc
) A on TB.TransactionDate = A.TransactionDate
ORDER BY TB.TransactionDate desc
Technically this code works. The problem that I am running into now is when there were no transactions on a given date, such as bank holidays (in this case Martin Luther King Day), then the query looks back one day too far.
I have tried a few different options including MAX(TransactionDate) but if I use that in a sub-query or CTE then use the new value in a WHERE statement as a reference I only get the max value or the value I subtract that statement by. For instance if I say WHERE TransactionDate >= MAX(TransactionDate)-21 and the max date is Feb 26th then the result is Feb 2nd instead of the range of dates from Feb 2nd through Feb 26th.
IN SUMMARY, what I need is a date range looking three weeks back from the date of the latest transaction date. This is for a daily report so I cannot hardcode the date in. Since I am also using Excel Connections the use of Declare statements is prohibited.
Thank you StackOverflow gurus in advance!
You could use something like this:
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (21)
[Date]=convert(date,dateadd(day, row_number() over (order by (select 1))-1
, dateadd(day,-20,(select max(TransactionDate) from t) ) ) )
from n as deka
cross join n as hecto
order by [Date]
)
select Date=convert(varchar(10),dates.date,120) from dates
rextester demo: http://rextester.com/ZFYV25543
returns:
+------------+
| Date |
+------------+
| 2017-02-06 |
| 2017-02-07 |
| 2017-02-08 |
| 2017-02-09 |
| 2017-02-10 |
| 2017-02-11 |
| 2017-02-12 |
| 2017-02-13 |
| 2017-02-14 |
| 2017-02-15 |
| 2017-02-16 |
| 2017-02-17 |
| 2017-02-18 |
| 2017-02-19 |
| 2017-02-20 |
| 2017-02-21 |
| 2017-02-22 |
| 2017-02-23 |
| 2017-02-24 |
| 2017-02-25 |
| 2017-02-26 |
+------------+
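The key point is that the range is anchored on the latest transaction date and built from the calendar, so days without transactions (such as bank holidays) do not shift the window. A minimal Python sketch of that calculation, with a hypothetical function name:

```python
from datetime import date, timedelta


def lookback_range(transaction_dates, days=21):
    """Every calendar date in the `days`-day window ending at the latest date."""
    latest = max(transaction_dates)
    start = latest - timedelta(days=days - 1)
    return [start + timedelta(days=offset) for offset in range(days)]
```

With a latest transaction date of 2017-02-26 this yields 2017-02-06 through 2017-02-26, the same 21 dates as the output above.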
I just found this for looking up dates that fall within a given week. The code can be manipulated to change the week start date.
select convert(datetime,dateadd(dd,-datepart(dw,convert(datetime,convert(varchar(10),DateAdd(dd,-1/*this # changes the week start day*/,getdate()),101)))+1/*this # is used to change the week start date*/,
convert(datetime,convert(varchar(10),getdate(),21))))/*also can enter # here to change the week start date*/
I've included a screenshot of the results if you were to include this with a full query. This way you can see how it looks with a range of dates. I did a little manipulation so that the week starts on Monday and references Monday's date.
Since I am only looking back three weeks a simple GETDATE()-21 is sufficient because as the query moves forward through the week it will look back 21 days and pick the Monday at the beginning of the week as my start date.

MySQL doesn't use indexes in a SELECT clause subquery

I have an "events" table
table events
id (pk, auto inc, unsigned int)
field1,
field2,
...
date DATETIME (indexed)
I am trying to analyse holes in the traffic (the periods in a day where there are 0 events).
I tried this kind of query:
SELECT
e1.date AS date1,
(
SELECT date
FROM events AS e2
WHERE e2.date > e1.date
LIMIT 1
) AS date2
FROM events AS e1
WHERE e1.date > NOW() -INTERVAL 10 DAY
It takes a very long time.
Here is the EXPLAIN output:
+----+--------------------+-------+-------+---------------------+---------------------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------------+---------------------+---------+------+----------+-------------+
| 1 | PRIMARY | t1 | range | DATE | DATE | 6 | NULL | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | t2 | ALL | DATE | NULL | NULL | NULL | 58678524 | Using where |
+----+--------------------+-------+-------+---------------------+---------------------+---------+------+----------+-------------+
2 rows in set (0.00 sec)
Tested on MySQL 5.5
Why can't MySQL use the DATE index? Is it because of the subquery?
Your query suffers from the problem shown here, which also presents a quick solution with temp tables. That is a MySQL forum page, which I unearthed through this Stack Overflow question.
You may find that creating and populating such a new table on the fly yields bearable performance and is easy to implement for the range of datetimes from now() less 10 days.
If you need assistance in crafting anything, let me know. I will see if I can help.
You are looking for dates with no events?
First build a table Days with all possible dates (dy). This will give you the uneventful days:
SELECT dy
FROM Days
WHERE NOT EXISTS ( SELECT * FROM events
WHERE date >= Days.dy
AND date < Days.dy + INTERVAL 1 DAY )
AND dy > NOW() -INTERVAL 10 DAY
Please note that 5.6 has some optimizations in this general area.
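The logic of the calendar-table approach can be sketched in Python: enumerate the candidate days, then keep those on which no event falls. The function name `uneventful_days` is hypothetical, and the look-back is counted in whole days here, which only approximates `NOW() - INTERVAL 10 DAY`.

```python
from datetime import date, datetime, timedelta


def uneventful_days(event_times, today, lookback_days=10):
    """event_times: iterable of datetimes. Returns days with no events."""
    days_with_events = {ts.date() for ts in event_times}
    candidates = [
        today - timedelta(days=back)
        for back in range(lookback_days - 1, -1, -1)
    ]
    return [day for day in candidates if day not in days_with_events]
```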

getting greatest value in specific column and row

I have a table that is going to have several time stamp entries added throughout the day with a specific employee ID tied to each entry. I am curious how I would get the first timestamp of the day and the last time stamp of the day to calculate amount of time worked for that specific employee on the specific date. My table is below:
+----+------------+----------+---------+---------------------+-----------+------------+-----------+---------+
| id | employeeID | date | timeIn | jobDescription | equipType | unitNumber | unitHours | timeOut |
+----+------------+----------+---------+---------------------+-----------+------------+-----------+---------+
| 1 | 1 | 01/13/13 | 8:17 pm | Worked in Hubbard | Dozer | 2D | 11931 | 8:17 pm |
| 2 | 1 | 01/13/13 | 8:17 pm | Worked in Jefferson | Excavator | 01E | 8341 | 8:18 pm |
+----+------------+----------+---------+---------------------+-----------+------------+-----------+---------+
So far I have a query like this to retrieve the time values:
$stmt = $conn->prepare('SELECT * FROM `timeRecords` WHERE `date`= :dateToday AND `employeeID` = :employeeID ORDER BY employeeID ASC');
$stmt->execute(array(':employeeID' => $_SESSION['employeeID'], ':dateToday' => $dateToday));
But I am unsure of how to obtain the greatest value in the timeOut column.
Really, you just need the aggregate MAX() and MIN() grouped by employeeID. Use the TIMEDIFF() function to calculate the difference in time between the two.
SELECT
`employeeID`,
MIN(`timeIn`) AS `timeIn`,
MAX(`timeOut`) AS `timeOut`,
TIMEDIFF(MAX(`timeOut`), MIN(`timeIn`)) AS `totalTime`
FROM `timeRecords`
WHERE
`date` = :dateToday
AND `employeeID` = :employeeID
/* Selecting only one employeeID you don't actually need the GROUP BY */
GROUP BY `employeeID`
However, this won't report the total time worked if an employee clocks in and out several times during one day. In that case, you would need to SUM() the result of the TIMEDIFF() for each of the in/out pairs.
Something like:
SELECT
`employeeID`,
/* Assumes no times overlap across rows */
SEC_TO_TIME(SUM(TIME_TO_SEC(TIMEDIFF(`timeOut`, `timeIn`)))) AS `totalTime`
FROM `timeRecords`
WHERE
`date` = :dateToday
AND `employeeID` = :employeeID
GROUP BY `employeeID`
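The difference between the two queries can be illustrated in Python: the first measures from the earliest clock-in to the latest clock-out, while the second adds up each in/out pair. This is a sketch with hypothetical function names; it assumes the times are already parsed into datetime values and that pairs do not overlap.

```python
from datetime import datetime, timedelta


def span_worked(pairs):
    """Latest time out minus earliest time in (like MAX/MIN with TIMEDIFF)."""
    earliest_in = min(time_in for time_in, _ in pairs)
    latest_out = max(time_out for _, time_out in pairs)
    return latest_out - earliest_in


def total_worked(pairs):
    """Sum of each (time_in, time_out) pair (like SUM over TIMEDIFF)."""
    total = timedelta()
    for time_in, time_out in pairs:
        total += time_out - time_in
    return total
```

When an employee clocks out and back in during the day, `span_worked` includes the gap while `total_worked` does not.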