Date table lookup efficiency - MySQL

I need to calculate the number of "working minutes" between two datetime values; let's call them 'Created' and 'Finished'.
'Finished' is always subsequent to 'Created'. The two values can differ by anything from 1 second to several years. The median difference is 50,000 seconds or roughly 14 hours.
Working minutes are defined as those occurring between 0900 and 1700 hours, Monday to Friday, excluding weekends and official holidays in our country.
I decided a lookup table was the way to go, so I generated a table of all work minutes, explicitly excluding weekends, nights and holidays...
CREATE TABLE `work_minutes` (
`min` datetime NOT NULL,
PRIMARY KEY (`min`),
UNIQUE KEY `min_UNIQUE` (`min`)
)
I populated this programmatically with all the "working minutes" for the years 2017 to 2024, and at this point I started to get the feeling I was being very inefficient as the table began to balloon to several hundred thousand rows.
I can do a lookup easily enough, for instance:
SELECT COUNT(min) FROM `work_minutes` AS wm
WHERE wm.min > '2022-01-04 00:04:03'
AND wm.min <= '2022-02-03 14:13:09';
#Returns 10394 'working minutes' in 0.078 sec
This is good enough for a one-off lookup but to query a table of 70,000 value pairs takes over 90 minutes.
So, I am uncomfortable with the slowness of the query and the sense that the lookup table is unnecessarily bloated.
I am thinking I need to set up two tables, one just for dates and another just for minutes, but I am not sure how to implement this. Date logic has never been my forte. The most important thing to me is that the lookup can query over 70,000 values reasonably quickly and efficiently.
Working in MySQL 5.7.30. Thanks in advance for your expertise.

Divide the time range into three parts: the incomplete day at the start, the incomplete day at the end, and the middle part, which consists of complete days. If both timestamps fall on the same date there is only one part, and if their dates are consecutive you'll have two parts to process.
Calculating the working minutes in an incomplete day is straightforward: the usual interval-overlap formula, combined with a weekday (and holiday) check, does the job.
Create a static calendar (service) table that is guaranteed to start before the earliest possible starting timestamp and to extend past the latest possible finishing timestamp. For each date in the table, store the cumulative number of working minutes. The table then gives the working time in any range of complete days with a single subtraction.
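The arithmetic behind these three steps can be sketched in Python. This is only an illustration of the logic, not MySQL code, and it assumes the 09:00 to 17:00 window from the question. The holiday date and the names HOLIDAYS, build_cumulative and working_minutes are hypothetical; in a real setup the cumulative values would live in the calendar table.

```python
from datetime import date, datetime, time, timedelta

WORK_START = time(9, 0)    # 09:00, from the question
WORK_END = time(17, 0)     # 17:00, from the question
FULL_DAY = 8 * 60          # working minutes in one complete workday
HOLIDAYS = {date(2022, 1, 26)}  # hypothetical holiday, for illustration only


def is_workday(d):
    """Monday to Friday and not a listed holiday."""
    return d.weekday() < 5 and d not in HOLIDAYS


def minutes_on_day(d, lo, hi):
    """Working minutes on day d between clock times lo and hi (overlap formula)."""
    if not is_workday(d):
        return 0
    start = max(lo, WORK_START)
    end = min(hi, WORK_END)
    if end <= start:
        return 0
    return (datetime.combine(d, end) - datetime.combine(d, start)).total_seconds() / 60


def build_cumulative(first, last):
    """Map each date to the working minutes of all complete days up to and including it."""
    cum, total, d = {}, 0, first
    while d <= last:
        total += FULL_DAY if is_workday(d) else 0
        cum[d] = total
        d += timedelta(days=1)
    return cum


def working_minutes(created, finished, cum):
    c, f = created.date(), finished.date()
    if c == f:
        # Single part: both timestamps fall on the same date.
        return minutes_on_day(c, created.time(), finished.time())
    head = minutes_on_day(c, created.time(), WORK_END)
    tail = minutes_on_day(f, WORK_START, finished.time())
    # Complete days strictly between c and f, via one subtraction.
    middle = cum[f - timedelta(days=1)] - cum[c]
    return head + middle + tail


cum = build_cumulative(date(2017, 1, 1), date(2024, 12, 31))
print(working_minutes(datetime(2022, 1, 4, 0, 4, 3),
                      datetime(2022, 2, 3, 14, 13, 9), cum))
```

Each lookup costs two dictionary reads plus a little arithmetic, so its cost does not grow with the length of the interval, which is the point of storing cumulative totals rather than one row per minute.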

Plan A: Convert the DATETIME values to seconds (from some arbitrary time) via TO_SECONDS(), then manipulate them with simple arithmetic.
Plan B: Use the DATEDIFF() function.
Your COUNT(min) counts the number of rows where min IS NOT NULL. You may as well say COUNT(*). But did you really want to count the number of rows?

Related

Calculate the difference in hours between two String dates in MySQL?

I have two String columns in MySQL database. Those two columns were populated from a Java program in following way:
System.currentTimeMillis(); //first column
System.currentTimeMillis() + someStringHours; //second column; the String someStringHours represents some number of hours, let's say 5 hours in millis...
Which function in MySQL can be used to calculate the difference, to get the number of hours between these two columns?
You call them string dates but they are actually UNIX timestamps in milliseconds (also called JavaScript timestamps). That's what System.currentTimeMillis() generates. It's a Java long data item, and a MySQL BIGINT data item. You can store it in a string if you must, but searching and sorting numbers stored as strings is an unreliable mess; beware!
A typical JavaScript timestamp (or UNIX timestamp in milliseconds) is a big 13-digit integer like 1600858176374 (1 600 858 176 374).
You can convert such a timestamp to a MySQL TIMESTAMP value with FROM_UNIXTIME() like this
FROM_UNIXTIME(column * 0.001)
Why multiply the column value by 0.001 (that is, divide it by 1000)? Because FROM_UNIXTIME() takes the timestamp in seconds, whereas System.currentTimeMillis() generates it in milliseconds.
Then you can use DATEDIFF() like this
DATEDIFF(FROM_UNIXTIME(laterTs*0.001),FROM_UNIXTIME(earlierTs*0.001))
This gives an integer number of days.
If you need the time difference in some other unit, such as hours, minutes, or calendar quarters, you can use TIMESTAMPDIFF(). This gives you your difference in hours.
TIMESTAMPDIFF(HOUR,
FROM_UNIXTIME(earlierTs*0.001),
FROM_UNIXTIME(laterTs*0.001));
You can use SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, or YEAR as the time unit in this function.
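As a cross-check of the conversion, here is a minimal Python sketch of the same calculation outside the database. The function names are hypothetical, and the two sample values are simply millisecond timestamps one day apart. Floor division yields whole hours, which matches a whole-hour count for a non-negative difference.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_millis(ms):
    """Convert a millisecond UNIX timestamp (as produced by
    System.currentTimeMillis()) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=int(ms))


def whole_hours_between(earlier_ms, later_ms):
    """Whole hours elapsed from the earlier to the later timestamp."""
    return (from_millis(later_ms) - from_millis(earlier_ms)) // timedelta(hours=1)


# The columns are described as strings, so int() is applied inside from_millis().
print(whole_hours_between("1600858176374", "1600944576374"))
```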
Pro tip: Use your DBMS's date arithmetic functions if you possibly can. They've worked out all sorts of edge cases for you already.
And, by the way, if you declare your columns like this (Timestamp with a millisecond precision: How to save them in MySQL):
laterTs TIMESTAMP(3),
earlierTs TIMESTAMP(3),
You'll have an easier time indexing on and searching by these times.
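SELECT (1600944576374-1600858176374)/(24*60*60*1000) as Days
where 1600944576374 and 1600858176374 are the two timestamps (later minus earlier) and (24*60*60*1000) is the number of milliseconds in a day. Divide by (60*60*1000) instead to get hours.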

UNIX Timestamp Time Difference Average MYSQL

I am currently working on a ticket system in which I would like to work out the average amount of time it is taking staff to respond to tickets.
I have 2 columns that hold the UNIX timestamps: timestamp (when ticket was submitted) and endstamp (when ticket was closed)
SELECT AVG(TIMEDIFF(endstamp,timestamp)) AS timetaken FROM `tickets`
I'm not really sure what I am doing wrong.
Any help would be much appreciated!
A UNIX timestamp is just a representation of a point in time as a number of seconds, so basically an integer value. On the other hand, date function timestampdiff() operates on 3 parameters: a unit, and two values (or expressions) of datetime datatype (or the-like). Your query should actually raise a syntax error, since what you are giving as first argument is not a legal unit.
If you want the difference in seconds between two UNIX timestamps, just subtract them, so:
SELECT AVG(endstamp - timestamp) AS timetaken FROM `tickets`

get average interarrival time for service requests by timestamp

I have partly the following MySQL schema
ServiceRequests
----------
id int
RequestDateTime datetime
This is what a typical collection of records might look like.
1 | 2009-10-11 14:34:22
2 | 2009-10-11 14:34:56
3 | 2009-10-11 14:35:01
In this case the average request time is (34+5)/2 = 19.5 seconds, being
14:34:22 ---> (34 seconds) ----> 14:34:56 ------> (5 seconds) -----> 14:35:01
Basically I need to work out the difference in time between consecutive records, sum that up and divide by the number of records.
The closest thing I can think of is to convert the timestamp to epoch time and start there. I can add a field to the table to precalculate the epoch time if necessary.
How do I determine 19.5 using a sql statement(s)?
You don't really need the time difference between each pair of consecutive records to get the average. You have x data points ranging from some time t0 to t1, and the consecutive gaps add up to t1 - t0; notice that last time - first time is also 39 seconds here. So (max-min)/(count-1) should work for you:
select timestampdiff(second, min(RequestDateTime), max(RequestDateTime)) / (count(id)-1) from ServiceRequests;
Note: This returns NULL rather than an average when the table has fewer than two rows: with a single row the divisor is zero, and with an empty table MAX and MIN are NULL.
Note2: Different databases handle subtraction of dates differently so you may need to turn that difference into seconds.
Hint: maybe using TIMEDIFF(expr1,expr2) and/or TIME_TO_SEC(expr3)
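To see why (max-min)/(count-1) equals the average of the consecutive gaps, here is a small Python check using the three sample records from the question. The function name is hypothetical; the explicit gap list is only there to confirm that both approaches agree.

```python
from datetime import datetime


def mean_interarrival_seconds(times):
    """Mean gap between consecutive timestamps, or None with fewer than two rows."""
    if len(times) < 2:
        return None
    return (max(times) - min(times)).total_seconds() / (len(times) - 1)


requests = [
    datetime(2009, 10, 11, 14, 34, 22),
    datetime(2009, 10, 11, 14, 34, 56),
    datetime(2009, 10, 11, 14, 35, 1),
]

# Explicit version: average the gaps between consecutive (sorted) records.
ordered = sorted(requests)
gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]

print(gaps)                                 # [34.0, 5.0]
print(sum(gaps) / len(gaps))                # 19.5
print(mean_interarrival_seconds(requests))  # 19.5
```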

Time Over 23:59:59 in PostgreSQL?

In MySQL I can create a table with a time field, and the value can be as high as 838:59:59 (839 hours - 1 second). I just read that in PostgreSQL, the hour field cannot exceed 23:00:00 (24 hours). Is there a way around this? I'm trying to make a simple DB that keeps track of how many hours & minutes were spent doing something, so it'll need to go higher than 23 hours & some minutes. I can do this in MySQL, but I need to use PostgreSQL for this. I Googled, but didn't find what I'm looking for, so I'm hoping I just didn't use the right keywords.
Postgres has no "hour field" - it has a few date/time types which serve different needs. The type I believe best fits your needs is INTERVAL.
Although they use the same notation, there's a difference between time of day and elapsed time. Some of their values overlap, but they're different domains. 838 isn't a valid value for an hour if you're talking about a time of day. 838 is a valid value for an hour if you're talking about elapsed time.
This distinction leads to two different data types: timestamp and interval.
create table intervals (
ts timestamp primary key,
ti interval not null
);
insert into intervals values (current_timestamp, '145:23:12');
select *
from intervals;
2011-08-03 21:51:16.837 145:23:12
select extract(hour from ti)
from intervals
145
I believe you are right, but it should not be an issue to work around. I would suggest storing the UNIX time integers for when you "punch in" and out again, and then adding the delta to an int field.
This will yield the number of seconds spent, which can be translated trivially into an hours:minutes:seconds format.
The delta (difference) can be calculated by subtracting the start timestamp from the end timestamp.
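The translation from a seconds count into hours:minutes:seconds can be sketched in Python as follows. The function name and the punch-in value are hypothetical, and the sketch assumes a non-negative delta; the hour part is deliberately allowed to exceed 23.

```python
def format_elapsed(seconds):
    """Format a non-negative number of seconds as H:MM:SS, with no cap on hours."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


punch_in = 1312408276                                # hypothetical UNIX timestamp
punch_out = punch_in + 145 * 3600 + 23 * 60 + 12     # 145 h 23 min 12 s later
print(format_elapsed(punch_out - punch_in))          # 145:23:12
```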
You could use a datetime field... 839 hours being something on the order of 35 days...

Select day of week from date

I have the following table in MySQL that records event counts of stuff happening each day
event_date event_count
2011-05-03 21
2011-05-04 12
2011-05-05 12
I want to be able to query this efficiently by date range AND by day of week. For example - "What is the event_count on Tuesdays in May?"
Currently the event_date field is a date type. Are there any functions in MySQL that let me query this column by day of week, or should I add another column to the table to store the day of week?
The table will hold hundreds of thousands of rows, so given a choice I'll choose the most efficient solution (as opposed to most simple).
Use DAYOFWEEK in your query, something like:
SELECT * FROM mytable WHERE MONTH(event_date) = 5 AND DAYOFWEEK(event_date) = 7;
This will find all info for Saturdays in May.
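DAYOFWEEK() numbers the days from 1 (Sunday) to 7 (Saturday), so the Tuesdays asked about in the question correspond to 3 rather than 7. A short Python sketch of that numbering, applied to the sample rows from the question (the helper name mysql_dayofweek is hypothetical):

```python
from datetime import date


def mysql_dayofweek(d):
    """Mirror the DAYOFWEEK() numbering: 1 = Sunday ... 7 = Saturday."""
    return d.isoweekday() % 7 + 1


rows = [
    (date(2011, 5, 3), 21),
    (date(2011, 5, 4), 12),
    (date(2011, 5, 5), 12),
]

TUESDAY = 3
tuesdays_in_may = sum(
    count for day, count in rows
    if day.month == 5 and mysql_dayofweek(day) == TUESDAY
)
print(tuesdays_in_may)  # 21
```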
To get the fastest reads store a denormalized field that is the day of the week (and whatever else you need). That way you can index columns and avoid full table scans.
Just try the above first to see if it suits your needs and if it doesn't, add some extra columns and store the data on write. Just watch out for update anomalies (make sure you update the day_of_week column if you change event_date).
Note that the denormalized fields will increase the time taken to do writes, increase calculations on write, and take up more space. Make sure you really need the benefit and can measure that it helps you.
Check the DAYOFWEEK() function.
If you want a textual representation of the day of the week, use the DAYNAME() function.