Why is MySQL's maximum time limit 838:59:59? - mysql

I've run into the limit myself, but despite lots of chatter online, I've never seen an explanation for why the upper and lower limit for the TIME data type is what it is. The official reference at http://dev.mysql.com/doc/refman/5.7/en/time.html says
TIME values may range from '-838:59:59' to '838:59:59'. The hours part may be so large because the TIME type can be used not only to represent a time of day (which must be less than 24 hours), but also elapsed time or a time interval between two events (which may be much greater than 24 hours, or even negative).
But I'm wondering not why the hours part is allowed to be "so large", but why it's cut off where it is. There doesn't seem to be any significance to that many hours in regards to days, or if I try to imagine possible cutoffs for how many seconds could be stored as an integer. So why the range?

The TIME values were always stored on 3 bytes in MySQL. But the format changed on version 5.6.4. I suspect this was not the first time when it changed. But the other change, if there was one, happened long time ago and there is no public evidence of it. The MySQL source code history on GitHub starts with version 5.5 (the oldest commit is from May 2008) but the change I am looking for happened somewhere around 2001-2002 (MySQL 4 was launched in 2003)
The current format, as described in the documentation, uses 6 bits for seconds (possible values: 0 to 63), 6 bits for minutes, 10 bits for hours (possible values: 0 to 1023), 1 bit for sign (add the negative values of the already mentioned intervals) and 1 bit is unused and labelled "reserved for future extensions".
It is optimized for working with time components (hours, minutes, seconds) and doesn't waste much space. Using this format it's possible to store values between -1023:59:59 and +1023:59:59. However MySQL limits the number of hours to 838, probably for backward compatibility with applications that were written a while ago, when I think this was the limit.
Until version 5.6.4, the TIME values were also stored on 3 bytes and the components were packed as days * 24 * 3600 + hours * 3600 + minutes * 60 + seconds. This format was optimized for working with timestamps (because it was, in fact, a timestamp). Using this format it would be possible to store values in the range of about -2330 to +2330 hours. While having this big range of values available, MySQL was still limiting the values to -838 to +838 hours.
There was bug #11655 on MySQL 4. It was possible to return TIME values outside the -838..+838 range using nested SELECT statements. It was not a feature but a bug and it was fixed.
The only reason to limit the values to this range and to actively change any piece of code that produces TIME values outside it was backward compatibility.
I suspect MySQL 3 used a different format that, due to the way the data was packed, limited the valid values to the range -838..+838 hours.
By looking into the current MySQL's source code I found this interesting formula:
#define TIME_MAX_VALUE (TIME_MAX_HOUR*10000 + TIME_MAX_MINUTE*100 + TIME_MAX_SECOND)
Let's ignore for the moment the MAX part of the names used above and let's remember only that TIME_MAX_MINUTE and TIME_MAX_SECOND are numbers between 00 and 59. The formula just concatenates the hours, minutes and seconds in a single integer number. For example, the value 170:29:45 becomes 1702945.
This formula raises the following question: given that the TIME values are stored on 3 bytes with sign, what is the maximum positive value that can be represented this way?
The value we are looking for is 0x7FFFFF that in decimal notation is 8388607. Since the last four digits (8607) should be read as minutes (86) and seconds (07) and their maximum valid values is 59, the greatest value that can be stored on 3 bytes with sign using the formula above is 8385959. Which, as TIME is +838:59:59. Ta-da!
Guess what? The fragment of C code listed above was extracted from this:
/* Limits for the TIME data type */
#define TIME_MAX_HOUR 838
#define TIME_MAX_MINUTE 59
#define TIME_MAX_SECOND 59
#define TIME_MAX_VALUE (TIME_MAX_HOUR*10000 + TIME_MAX_MINUTE*100 + TIME_MAX_SECOND)
I am sure this is how MySQL 3 used to keep the TIME values internally. This format imposed the limitation of the range, and the backward compatibility requirement on the subsequent versions propagated the limitation to our days.

DATETIME is stored based on a base of 10, see Date and Time Data Type Representation:
DATETIME: Eight bytes: A four-byte integer for date packed as YYYY×10000 + MM×100 + DD and a four-byte integer for time packed as HH×10000 + MM×100 + SS
For convinience and some other reasons, the (old) time format was encoded in the same way, using 3 bytes:
Hours * 10000 + Minutes * 100 + Seconds
This means:
3 bytes = 2^24 = 16.777.216
with sign: 2^23 = 8.388.608
Using the encoding, this represents the magical 838 hours. And max. 8608 seconds for the minutes and seconds (without overflow), which results in the largest valid time 838:59:59. One nice thing about this is that the integer representation of that time, 8385959, is easily readable to a human. But this encoding of course leaves gaps, invalid (unused) integer values (like 8309999).
As of MySQL 5.6.4, time format changed its encoding to
1 bit sign (1= non-negative, 0= negative)
1 bit unused (reserved for future extensions)
10 bits hour (0-838)
6 bits minute (0-59)
6 bits second (0-59)
---------------------
24 bits = 3 bytes
Even though it could now store more hours, for compatibility it still just allows 838 hours.

Obviously, it's hard to answer these types of questions without getting direct feedback from the designers of the database.
But there is some documentation regarding how the different data types are stored internally, and, to an extent, it can help us understand this a little bit.
For, instance, regarding the TIME data type, notice how it's stored internally according to the documentation:
TIME encoding for nonfractional part:
1 bit sign (1= non-negative, 0= negative)
1 bit unused (reserved for future extensions)
10 bits hour (0-838)
6 bits minute (0-59)
6 bits second (0-59)
---------------------
24 bits = 3 bytes
So, as you can see, the goal is to fit the information within 3 bytes. And, of those 3 bytes, 10 bits are reserved for the hours, which pretty much determines the overall range.
That said, 10 bits does allow values up to 1023, so I guess, technically, without any changes to the storage size, the range could have been -1023:59:59 to 1023:59:59. Why they didn't do that and they chose 838 as the cutoff, I have no idea.

Related

Ideal data type for currency rate in MySQL

What should I ideally use to store currency rate information in my MySQL database? Should I use:
double
decimal like (10,4)
float or
something else?
Currency rates are usually decimal numbers like 1.2362.
How precise do you need to be? I ask because the problem may go beyond the storage format of the conversion factor. For example, perhaps the rules might be to keep the factor to N decimal places, perform the multiplication (or division) to M places, then round. And the "round" might be "financial" -- 0.5 always rounds up, instead of IEEE-754, "round to nearest even".
Assuming you can afford to be off by one unit in the last place (cents or centimes or whatever), I would simply use DOUBLE.
I assume you won't be storing more than, say, 100 conversion factors, so the fact that DOUBLE occupies 8 bytes will not be an issue. If you are recording the conversion factors for every minute for the past 10 years, I would rethink this decision.
FLOAT gives you 6-7 significant digits (and takes 4 bytes); DOUBLE gives about 16. 12345 and 123.45 and 0.00012345 each have 5 "significant" digits.
For currency, itself, I think the maximum number of decimal places for any currency today is 4. That is, DECIMAL(nn, 4) should suffice for storing money.
However, that does not provide the necessary rounding.
ROUND(currency1 * factor, decimals)
where decimals is 2 for a target currency of USD and Euros (and many other currencies). As I pointed out above, this may differ from what a bank would compute by as much as one cent.
(Would you call this "opinion based"? Or something better?)

Best data type to store money values in MySQL

I want to store many records in a MySQL database. All of them contains money values. But I don't know how many digits will be inserted for each one.
Which data type do I have to use for this purpose?
VARCHAR or INT (or other numeric data types)?
Since money needs an exact representation don't use data types that are only approximate like float. You can use a fixed-point numeric data type for that like
decimal(15,2)
15 is the precision (total length of value including decimal places)
2 is the number of digits after decimal point
See MySQL Numeric Types:
These types are used when it is important to preserve exact precision, for example with monetary data.
You can use DECIMAL or NUMERIC both are same
The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC. : MySQL
i.e. DECIMAL(10,2)
Good read
I prefer to use BIGINT, and store the values in by multiply with 100, so that it will become integer.
For e.g., to represent a currency value of 93.49, the value shall be stored as 9349, while displaying the value we can divide by 100 and display. This will occupy less storage space.
Caution:
Mostly we don't perform currency * currency multiplication, in case if we are doing it then divide the result with 100 and store, so that it returns to proper precision.
It depends on your need.
Using DECIMAL(10,2) usually is enough but if you need a little bit more precise values you can set DECIMAL(10,4).
If you work with big values replace 10 with 19.
If your application needs to handle money values up to a trillion then this should work: 13,2
If you need to comply with GAAP (Generally Accepted Accounting Principles) then use: 13,4
Usually you should sum your money values at 13,4 before rounding of the output to 13,2.
At the time this question was asked nobody thought about Bitcoin price. In the case of BTC, it is probably insufficient to use DECIMAL(15,2). If the Bitcoin will rise to $100,000 or more, we will need at least DECIMAL(18,9) to support cryptocurrencies in our apps.
DECIMAL(18,9) takes 12 bytes of space in MySQL (4 bytes per 9 digits).
We use double.
*gasp*
Why?
Because it can represent any 15 digit number with no constraints on where the decimal point is. All for a measly 8 bytes!
So it can represent:
0.123456789012345
123456789012345.0
...and anything in between.
This is useful because we're dealing with global currencies, and double can store the various numbers of decimal places we'll likely encounter.
A single double field can represent 999,999,999,999,999s in Japanese yens, 9,999,999,999,999.99s in US dollars and even 9,999,999.99999999s in bitcoins
If you try doing the same with decimal, you need decimal(30, 15) which costs 14 bytes.
Caveats
Of course, using double isn't without caveats.
However, it's not loss of accuracy as some tend to point out. Even though double itself may not be internally exact to the base 10 system, we can make it exact by rounding the value we pull from the database to its significant decimal places. If needed that is. (e.g. If it's going to be outputted, and base 10 representation is required.)
The caveats are, any time we perform arithmetic with it, we need to normalize the result (by rounding it to its significant decimal places) before:
Performing comparisons on it.
Writing it back to the database.
Another kind of caveat is, unlike decimal(m, d) where the database will prevent programs from inserting a number with more than m digits, no such validations exists with double. A program could insert a user inputted value of 20 digits and it'll end up being silently recorded as an inaccurate amount.
If GAAP Compliance is required or you need 4 decimal places:
DECIMAL(13, 4)
Which supports a max value of:
$999,999,999.9999
Otherwise, if 2 decimal places is enough:
DECIMAL(13,2)
src: https://rietta.com/blog/best-data-types-for-currencymoney-in/
Indeed this relies on the programmer's preferences. I personally use: numeric(15,4) to conform to the Generally Accepted Accounting Principles (GAAP).
Try using
Decimal(19,4)
this usually works with every other DB as well
Storing money as BIGINT multiplied by 100 or more with the reason to use less storage space makes no sense in all "normal" situations.
To stay aligned with GAAP it is sufficient to store currencies in DECIMAL(13,4)
MySQL manual reads that it needs 4 bytes per 9 digits to store DECIMAL.
https://dev.mysql.com/doc/refman/8.0/en/precision-math-decimal-characteristics.html
DECIMAL(13,4) represents 9 digits + 4 fraction digits (decimal places) => 4 + 2 bytes = 6 bytes
compare to 8 bytes required to store BIGINT.
There are 2 valid options:
use integer amount of currency minor units (e.g. cents)
represent amount as decimal value of the currency
In both cases you should use decimal data type to have enough significant digits. The difference can be in precision:
even for integer amount of minor units it's better to have extra precisions for accumulators (account for accumulating 10% fees from 1-cent operations)
different currencies have different number of decimals, cryptocurrencies have up to 18 decimals
The number of decimals can change over time due to inflation
Source and more caveats and facts.
Multiplies 10000 and stores as BIGINT, like "Currency" in Visual Basic and Office. See https://msdn.microsoft.com/en-us/library/office/gg264338.aspx

MySQL storing duration time - datatype?

I need to calculate the time a user spends on site. It is difference between logout time and login time to give me something like "Mr X spent 4 hours and 43 minutes online". So to store the4 hours and 43 minutes i declared it like this:
duration time NOT NULL
Is this valid or a better way to store this? I need to store in the DB because I have other calculations I need to use this for + other use cases.
Storing it as an integer number of seconds will be the best way to go.
The UPDATE will be clean and simple - i.e. duration = duration + $increment
As Tristram noted, there are limitations to using the TIME field - e.g. "TIME values may range from '-838:59:59' to '838:59:59'"
The days/hours/minutes/seconds display formatting won't be hardcoded.
The execution of your other calculations will almost surely be clearer when working with an integer "number of seconds" field.
I wouldn't use time as you would be limited to 24 hours. The easiest would just to store an integer in minutes (or seconds depending on the resolution you need).
Consider storing both values as a UNIX-epoch-delta.
I generally prefer to use a signed (64b) bigint (for secondly resolution), or a (signed) (64b) double (if fractional seconds are needed), or a signed (32b) int (if scaled down to minutely or hourly).
Make the unit explicit in the name of the column, for example with a suffix like "_epoch_minutely", for example: "started_epoch_minutely", "finished_epoch_minutely".

Calculate date from numeric value

The number 71867806 represents the present day, with the smallest unit of days.
Sorry guy's, caching owned me, it's actually milliseconds!
How can I
calculate the currente date from it?
(or) convert it into an Unix timestamp?
Solution shouldn't use language depending features.
Thanks!
This depends on:
What unit this number represents (days, seconds, milliseconds, ticks?)
When the starting date was
In general I would discourage you from trying to reinvent the wheel here, since you will have to handle every single exception in regards to dates yourself.
If it's truly an integer number of days, and the number you've given is for today (April 21, 2010, for me as I'm reading this), then the "zero day" (the epoch) was obviously enough 71867806 days ago. I can't quite imagine why somebody would pick that though -- it works out to roughly 196,763 years ago (~194,753 BC, if you prefer). That seems like a strange enough time to pick that I'm going to guess that there's more to this than what you've told us (perhaps more than you know about).
It seems to me the first thing to do is verify that the number does increase by one every 24 hours. If at all possible keep track of the exact time when it does increment.
First, you have only one point, and that's not quite enough. Get the number for "tomorrow" and see if that's 71867806+1. If it is, then you can safely bet that +1 means +1 day. If it's something like tomorrow-today = 24, then odds are +1 means +1 hour, and the logic to display days only shows you the "day" part. If it's something else check to see if it's near (24*60, which would be minutes), (24*60*60, which would be seconds), or (24*60*60*1000, which would be milliseconds).
Once you have an idea of what kind of units you are using, you can estimate how many years ago the "start" date of 0 was. See if that aligns with any of the common calendar systems located at http://en.wikipedia.org/wiki/List_of_calendars. Odds are that the calendar you are using isn't a truly new creation, but a reimplementation of an existing calendar. If it seems very far back, it might be an Julian Date, which has day 0 equivalent to BCE 4713 January 01 12:00:00.0 UT Monday. Julian Dates and Modified Julian dates are often used in astronomy calculations.
The next major goal is to find Jan 1, 1970 00:00:00. If you can find the number that represents that date, then you simply subtract it from this foreign calendar system and convert the remainder from the discovered units to milliseconds. That will give you UNIX time which you can then use with the standard UNIX utilities to convert to a time in any time zone you like.
In the end, you might not be able to be 100% certain that your conversion is exactly the same as the hand implemented system, but if you can test your assumptions about the calendar by plugging in numbers and seeing if they display as you predicted. Use this technique to create a battery of tests which will help you determine how this system handles leap years, etc. Remember, it might not handle them at all!
What time is: 71,867,806 miliseconds from midnight?
There are:
- 86,400,000 ms/day
- 3,600,000 ms/hour
- 60,000 ms/minute
- 1,000 ms/second
Remove and tally these units until you have the time, as follows:
How many days? None because 71,867,806 is less than 86,400,000
How many hours? Maximum times 3,600,000 can be removed is 19 times
71,867,806 - (3,600,000 * 19) = 3,467,806 ms left.
How many minutes? Maximum times 60,000 can be removed is 57 times.
3,467,806 - (60,000 * 57) = 47,806 ms left
How many seconds? Maximum times 1,000 can be removed is 47 times.
47,806 - (1,000 * 47) = 806
So the time is: 19:57:47.806
It is indeed a fairly long time ago if the smallest number is in days. However, assuming you're sure about it I could suggest the following shell command which would be obviously not valid for dates before 1st Jan. 1970:
date -d "#$(echo '(71867806-71853086)*3600*24'|bc)" +%D
or without bc:
date -d "#$(((71867806 - 71853086) * 3600 * 24))" +%D
Sorry again for the messy question, i got the solution now. In js it looks like that:
var dayZero = new Date(new Date().getTime() - 71867806 * 1000);

MySQL: Order by time (MM:SS)?

I'm currently storing various metadata about videos and one of those bits of data is the length of a video.
So if a video is 10 minutes 35 seconds long, it's saved as "10:35" in the database.
But what I'd like to do is retrieve a listing of videos by length (longest first, shortest last).
The problem I'm having is that if a video is "2:56", it's coming up as longest because the number 2 is more than the number 1 in.
So, how can I order data based on that length field so that "10:35" is recognized as being longer than "2:56" (as per my example)?
SELECT * FROM table ORDER BY str_to_date(meta_time,'%l:%i')
You can find the specific formatters on the MySQL Website.
For example:
%k -> Hour (0..23)
%l -> Hour (1..12)
The easiest choice is to store a integer (seconds) or a float (minutes) instead of a string. So 10:35 would be 635 in seconds or 10.583 in minutes. You can sort by these numerically very easily. And you can output them in the format you'd like with some simple math and string functions.
Some options:
Save it as an integer representing the total number of seconds. "10:35" => 635
Save it as a timestamp object with no date component. "10:35" => MAKETIME(0, 10, 34)
Save it with leading decimals or spaces. "2:25" => " 2:25"
My preference would be for the first option.
You could try to see if
ORDER BY TIME_TO_SEC(timefield)
would parse it correctly, however it is not an optimal approach to store time as strings in the database, and I suggest that you store them as TIME if you are able to. Then you can use standard formatting functions to present them as you like.
I had the same problem - storing videos length in database.
I solved it by using TIME mysql type - it solves all ordering and converting issues.