Store date of birth in database as UNIX time? - mysql

I want to store the date of birth as a UNIX timestamp in my database, because this keeps the database small and speeds up the queries.
However, when converting the date of birth to a UNIX time using strtotime, it will output the wrong value, namely the inputted value with one hour difference. I know setting the date_default_timezone_set('UTC'); will output the correct date of birth in UNIX time, but the date of birth has nothing to do with where someone lives, right? Date of birth stays the date of birth, no matter where someone lives.
For example:
$bDay = 20;
$bMonth = 6;
$bYear = 1993;
strtotime($bDay.'-'.$bMonth.'-'.$bYear); // output: 740527200 == Sat, 19 Jun 1993 22:00:00 UTC
PS: Database field is defined as: bDate int(4) UNSIGNED

UTC is not a great choice for whole calendar dates such as a date of birth.
My date of birth is 1976-08-27. Not 1976-08-27T00:00:00Z.
I currently live in the US Pacific time zone.
My next birthday is from 2016-08-27T00:00:00-07:00 until 2016-08-28T00:00:00-07:00
In UTC, that's equivalent to 2016-08-27T07:00:00Z until 2016-08-28T07:00:00Z
Of course, if I move to a different time zone before then, I'll celebrate my birthday over a completely different set of ranges.
If I move to Japan, then my birthday will come 16 hours sooner.
My next birthday would be from 2016-08-27T00:00:00+09:00 until 2016-08-28T00:00:00+09:00
In UTC, that's equivalent to 2016-08-26T15:00:00Z until 2016-08-27T15:00:00Z
Therefore, a date of birth (or anniversary date, hire date, etc.) should be stored as a simple year, month and day. No time, and no time zone.
In MySQL, use the DATE type. Do not use DATETIME, TIMESTAMP or an integer containing Unix time.
Also consider that evaluation of age depends on the time zone where the person is currently located, not the time zone where they were born. If the person's location is unknown to the asker - then it's the asker's time zone that is relevant. "How old are you according to you?" is not necessarily the same as "How old are you according to me?".
Of course, where you live doesn't actually make you older or younger - but it comes down to how we as humans evaluate age in years based on our local calendars. If you were instead to ask "How many minutes old am I?" then the answer depends on the instantaneous point in time where you were born - which could be measured in UTC, but will usually be given as a local time and time zone. However, in the common case, one does not usually collect that level of detail.
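To make the point about age concrete, here is a minimal Python sketch (the question itself uses PHP, so treat this as an illustration only). The helper name `age_in_years` and the two fixed UTC offsets are assumptions made for the example; a real application would use named time zones.

```python
from datetime import date, datetime, timedelta, timezone

def age_in_years(date_of_birth: date, today: date) -> int:
    """Whole years completed on `today`; neither argument has a time or zone."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)

# The date of birth is stored as a plain calendar date.
date_of_birth = date(1976, 8, 27)

# One instant in time, observed from two places.
instant = datetime(2016, 8, 26, 16, 0, tzinfo=timezone.utc)
pacific = timezone(timedelta(hours=-7))  # assumption: fixed offset for US Pacific in summer
japan = timezone(timedelta(hours=9))     # assumption: fixed offset for Japan

age_pacific = age_in_years(date_of_birth, instant.astimezone(pacific).date())
age_japan = age_in_years(date_of_birth, instant.astimezone(japan).date())
print(age_pacific, age_japan)
```

At that instant it is still 26 August on the US west coast but already 27 August in Japan, so the same stored date yields two different ages.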

Unix does not know that you are storing a birth date. It just knows that you are storing a timestamp in Unix format. The timestamp includes a time component.
When you convert from the birth date to the timestamp, and back from the timestamp to the birth date, you need to use consistent timezones in order to avoid a time difference in either direction.
Using UTC is a fine choice. The key though is consistency.
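A small Python sketch of what "consistent" means here, under the assumption that the asker's server was running at UTC+2 (which is what the value 740527200 implies). The variable names are made up for the illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server's local zone was UTC+2 when the question was asked.
server_zone = timezone(timedelta(hours=2))

# Midnight on the birth date, interpreted in the server's zone.
ts_local = int(datetime(1993, 6, 20, tzinfo=server_zone).timestamp())

# Reading it back in UTC shifts the calendar date; reading it back in the
# same zone it was written in does not.
read_as_utc = datetime.fromtimestamp(ts_local, tz=timezone.utc)
read_as_server = datetime.fromtimestamp(ts_local, tz=server_zone)

# Writing and reading in UTC on both sides is equally consistent.
ts_utc = int(datetime(1993, 6, 20, tzinfo=timezone.utc).timestamp())
read_utc_both_ways = datetime.fromtimestamp(ts_utc, tz=timezone.utc)

print(ts_local, read_as_utc.date(), read_as_server.date(), read_utc_both_ways.date())
```

The mismatch only appears when the write side and the read side disagree about the zone.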

Related

How to store time UTC in MySQL?

Which type I should use to store current date + time in UTC?
Then to be able to convert UTC date to specific timezone?
Now I use TIMESTAMP type and CURRENT_TIMESTAMP.
It stores data like: 2019-08-19 20:44:11
But the minutes are different from the real UTC time, and I don't know why.
My server time is local. It is correct under Windows Server.
It is up to you to decide the best way to solve the timezone problem when users and the server are in different locales.
No matter the case or the app (mobile, web, etc.), the problem is the same: you need to find the simplest way to handle time zones that works in your case.
Here are a few options that you can use:
MySQL
From MySQL Date and Time Types - you can create table fields that will hold your date and time values.
"The date and time types for representing temporal values are DATE, TIME, DATETIME, TIMESTAMP, and YEAR. Each temporal type has a range of valid values, as well as a “zero” value that may be used when you specify an invalid value that MySQL cannot represent. The TIMESTAMP type has special automatic updating behavior, described later."
In respect to MySQL Data Type Storage Requirements read the link and make sure you satisfy the table storage engine and type requirements in your project.
Setting the timezone in MySQL by:
SET time_zone = '+8:00'
To me this is a bit more work to handle, but the data is fully loaded, managed and updated by MySQL. No PHP here!
Using MySQL might seem like a better idea (that's what I'd like to think), but there's a lot more to it.
To be able to choose, you will have to make an educated decision. There's a lot to cover in regards to using MySQL. Here's a practical article that goes into the rabbit hole of using MySQL to manage date, time and timezone.
Since you didn't specify how you interface the database, here's a PHP example and functions to handle the date, time and time zones.
PHP
1. Save date, time and time zone
E.g. Chicago (USA - Illinois) - UTC offset: UTC-5 hours (during daylight saving time)
You can save the date time
2015-11-01 00:00:00
and the time zone
America/Chicago
You will have to work out DST transitions and months having different numbers of days.
Here's a reference to the DateTime to work out any timezone and DST differences:
DateTime Arithmetic
2. Unix Timestamp and Time Zone
Before we go into the details of this option we should be aware of the following:
The unix time stamp is a way to track time as a running total of seconds. This count starts at the Unix Epoch on January 1st, 1970 at UTC. Therefore, the unix time stamp is merely the number of seconds between a particular date and the Unix Epoch. It should also be pointed out (thanks to the comments from visitors to this site) that this point in time technically does not change no matter where you are located on the globe. This is very useful to computer systems for tracking and sorting dated information in dynamic and distributed applications both online and client side.
What happens on January 19, 2038?
On this date the Unix Time Stamp will cease to work due to a 32-bit overflow. Before this moment millions of applications will need to either adopt a new convention for time stamps or be migrated to 64-bit systems which will buy the time stamp a "bit" more time.
Here's how the timestamp works:
08/19/2019 @ 8:59pm (UTC) translates to 1566248380 seconds since Jan 01 1970 (UTC).
Using the PHP date() function you can format to anything you want like:
echo date('l jS \of F Y h:i:s A', 1566248380);
Monday 19th of August 2019 08:59:40 PM
or MySQL:
SELECT from_unixtime(2147483647);
+---------------------------+
| from_unixtime(2147483647) |
+---------------------------+
| 2038-01-19 03:14:07       |
+---------------------------+
More example formats that you can convert to:
08/19/2019 @ 8:59pm (UTC)
2019-08-19T20:59:40+00:00 in ISO 8601
Mon, 19 Aug 2019 20:59:40 +0000 in RFC 822, 1036, 1123, 2822
Monday, 19-Aug-19 20:59:40 UTC in RFC 850
2019-08-19T20:59:40+00:00 in RFC 3339
The PHP date() function can be used as a reference.
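For readers outside PHP, the same conversions can be sketched in Python using only the standard library. This is an illustration of the idea, not a replacement for the PHP calls above; the variable names are arbitrary.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

ts = 1566248380

# Interpret the timestamp explicitly as UTC rather than relying on a default zone.
moment = datetime.fromtimestamp(ts, tz=timezone.utc)

iso_8601 = moment.isoformat()            # ISO 8601 / RFC 3339 style
rfc_2822 = format_datetime(moment)       # RFC 2822 style, as used in e-mail headers
plain = moment.strftime("%Y-%m-%d %H:%M:%S")

# Converting back gives the original number of seconds.
round_trip = int(moment.timestamp())

print(iso_8601)
print(rfc_2822)
print(plain, round_trip)
```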
Again you will have to save the time zone:
America/Chicago
Set the PHP script time zone for your users by using date_default_timezone_set() function:
// Set the default time zone to use (available since PHP 5.1).
date_default_timezone_set('UTC');
// Or, for a user in Chicago (a later call replaces the earlier setting):
date_default_timezone_set('America/Chicago');
You can't store a date/time with time zone information.
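MySQL does not store time zone information in either DATETIME or TIMESTAMP columns. TIMESTAMP values are converted from the session time zone to UTC for storage and back again on retrieval; DATETIME values are stored and returned exactly as given.
The only ugly workaround is to set the whole MySQL server/VM/Docker container to UTC.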

Compare creation dates of things that may have been made before Christ

I'm thinking about making a project in a database with a large amount of objects / people / animals / buildings, etc.
The application would let the user select two candidates and see which came first. The comparison would be made by date, of course.
MySQL only allows dates after 01/01/1000.
If a user were to compare which came first, Michael Jackson or Freddie Mercury, the answer would be easy since they both came after this year.
But if they were to compare which came first, Tyrannosaurus Rex or Dog, both came before the earliest accepted date.
How could I make those comparisons considering the SQL limit?
I didn't do anything yet, but this is something I'd like to know before I start doing something that will never work.
THIS IS NOT A DUPLICATE OF OTHER QUESTIONS ABOUT OLD DATES.
In other questions, people are asking about how to store. It would be extremely easy, just make a string out of it. But in my case, I'd need to compare such dates, which they didn't ask for, yet.
I could store the dates as a string, using A for after and B for before, as people answered in other questions. There would be no problem. But how could I compare those dates? What part of the string I'd need to break?
You could take a signed BIGINT field and use it as a UNIX timestamp.
A UNIX timestamp is the number of seconds that passed since January 1, 1970, at 0:00 UTC.
Any point in time would simply be a negative timestamp.
If my amateurish calculation is correct, a BIGINT would be enough to take you 292471208678 years into the past (from 1970) and the same number of years into the future. That ought to be enough for pretty much anything.
That would make dates very easy to compare - you'd simply have to see whether one date is bigger than the other.
The conversion from calendar date to timestamp you'd have to do outside mySQL, though.
Depending on what platform you are using there may be a date library to help you with the task.
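As a rough check of that estimate, and of how the comparison would work, here is a small Python sketch. It uses 365-day years, which appears to be what the figure above assumes; `approx_timestamp` is a hypothetical helper, and the T. rex figure is only an illustrative order of magnitude.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # ignores leap years, as a rough estimate
BIGINT_MAX = 2**63 - 1                  # largest signed 64-bit value
BIGINT_MIN = -2**63

# How many 365-day years fit on either side of 1970 (rounded down).
reach_in_years = BIGINT_MAX // SECONDS_PER_YEAR

def approx_timestamp(years_before_1970: int) -> int:
    """Hypothetical helper: a coarse negative timestamp for a point far in the past."""
    return -years_before_1970 * SECONDS_PER_YEAR

dog = approx_timestamp(40_000)         # rough figure, for illustration
t_rex = approx_timestamp(68_000_000)   # assumption: illustrative order of magnitude

fits_in_bigint = BIGINT_MIN <= t_rex <= BIGINT_MAX
came_first = "Tyrannosaurus Rex" if t_rex < dog else "Dog"
print(reach_in_years, fits_in_bigint, came_first)
```

Rounded up, `reach_in_years` gives the figure quoted above. The comparison itself is just "smaller number came first".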
Why deal with static age at time of entry and offset?
User is going to want to see a date as a date anyway
Complex data entry
Three fields
year smallint (negative for BC; good back to roughly 32,768 BC)
month tinyint
day tinyint
if ( (y1*10000 + m1*100 + d1) > (y2*10000 + m2*100 + d2) )
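A minimal Python sketch of the three-field comparison, assuming BC years are stored as negative numbers. The helper names are made up, and the BC dates are arbitrary illustrative values.

```python
def sort_key(year: int, month: int, day: int) -> int:
    """Single comparable number built from the three stored fields."""
    # month * 100 + day is always below 10000, so it never spills into the year part,
    # which keeps the ordering correct for negative years too.
    return year * 10000 + month * 100 + day

def came_first(a: tuple, b: tuple) -> tuple:
    """Return whichever (year, month, day) triple is earlier."""
    return a if sort_key(*a) <= sort_key(*b) else b

freddie_mercury = (1946, 9, 5)
michael_jackson = (1958, 8, 29)
march_500_bc = (-500, 3, 15)   # illustrative value only
april_500_bc = (-500, 4, 1)    # illustrative value only

print(came_first(freddie_mercury, michael_jackson))
print(came_first(april_500_bc, march_500_bc))
```

Comparing the `(year, month, day)` tuples directly gives the same ordering, which is also what `ORDER BY year, month, day` would do in SQL.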
OK I had an idea.
Store the age in days, since the hours/seconds are irrelevant for this case.
Christ's age in days: -2015 * 365.
Dog's age in days: -40000 * 365.
In order to make precise calculations, I'd only need an extra field with the date on which I added the values. Then adjust the "age in days" by the number of days between the day I added the record and the day the user makes the comparison.
For example:
Dog's age was added on 29/12/2015 and the age in days is -40000 * 365.
User is making a comparison on day 29/01/2016.
The difference in days between the two dates is 31 days.
So dog's age in days should be -40000 * 365 - 31.
Using a signed big int can do the trick (the values are negative, so an unsigned type would not work).
Thanks to Pekka for suggesting using negative numbers for any date before the current date.
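The arithmetic from the example above, as a short Python sketch. The function name `age_in_days_now` is hypothetical.

```python
from datetime import date

def age_in_days_now(age_in_days_at_entry: int, entry_date: date, comparison_date: date) -> int:
    """Shift the stored (negative) age by the days elapsed since the record was added."""
    elapsed = (comparison_date - entry_date).days
    return age_in_days_at_entry - elapsed

elapsed_days = (date(2016, 1, 29) - date(2015, 12, 29)).days
dog = age_in_days_now(-40000 * 365, date(2015, 12, 29), date(2016, 1, 29))
print(elapsed_days, dog)
```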

What is enough to store dates/times in the DB from multiple time zones for accurate calculations?

This is a HARD question. In fact it is so hard it seems the SQL standard and most of the major databases out there don't have a clue in their implementation.
Converting all datetimes to UTC allows for easy comparison between records but throws away the timezone information, which means you can't do calculations with them (e.g. add 8 months to a stored datetime) nor retrieve them in the time zone they were stored in. So the naive approach is out.
Storing the timezone offset from UTC in addition to the timestamp (e.g. timestamp with time zone in postgres) would seem to be enough, but different timezones can have the same offset at one point in the year and a different one 6 months later due to DST. For example you could have New York and Chile both at UTC-4 now (August) but after the 4th of November New York will be UTC-5 and Chile (after the 2nd of September) will be UTC-3. So storing just the offset will not allow you to do accurate calculations either. Like the above naive approach it also discards information.
What if you store the timezone identifier (e.g. America/Santiago) with the timestamp instead? This would allow you to distinguish between a Chilean datetime and a New York datetime. But this still isn't enough. If you are storing an expiration date, say midnight 6 months into the future, and the DST rules change (as unfortunately politicians like to do) then your timestamp will be wrong and expiration could happen at 11 pm or 1 am instead. Which might or might not be a big deal to your application. So using a timestamp also discards information.
It seems that to truly be accurate you need to store the local datetime (e.g. using a non timezone aware timestamp type) with the timezone identifier. To support faster comparisons you could cache the utc version of it until the timezone db you use is updated, and then update the cached value if it has changed. So that would be 2 naive timestamp types plus a timezone identifier and some kind of external cron job that checks if the timezone db has changed and runs the appropriate update queries for the cached timestamp.
Is that an accurate solution? Or am I still missing something? Could it be done better?
I'm interested in solutions for MySQL, SQL Server, Oracle, PostgreSQL and other DBMS that handle TIMESTAMP WITH TIME ZONE.
You've summarized the problem well. Sadly the answer is to do what you've described.
The correct format to use does depend on the pragmatics of what the timestamp is supposed to represent. It can in general be divided between past and future events (though there are exceptions):
Past events can and usually should be stored as something which can never be reinterpreted differently. (eg: a UTC time stamp with a numeric time zone). If the named time zone should be kept (to be informative to the user) then this should be separate.
Future events need the solution you've described. Local timestamp and named time zone. This is because you want to change the "actual" (UTC) time of that event when the time zone rules change.
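A minimal Python sketch of the "local timestamp plus named time zone" idea for future events. It assumes Python 3.9+ with a time zone database available to `zoneinfo`; the function name `cached_utc` and the sample events are made up for the illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # needs a tz database on the system (or the tzdata package)

def cached_utc(local_naive: datetime, zone_name: str) -> datetime:
    """Derive the UTC instant from the stored local time and zone name.

    This is the value you would cache, and recompute whenever the tz database changes.
    """
    aware = local_naive.replace(tzinfo=ZoneInfo(zone_name))
    return aware.astimezone(timezone.utc)

# Stored as two columns: a zone-less local datetime and a zone identifier.
summer_event = (datetime(2012, 8, 1, 0, 0), "America/New_York")
winter_event = (datetime(2012, 11, 5, 0, 0), "America/New_York")

summer_utc = cached_utc(*summer_event)
winter_utc = cached_utc(*winter_event)
print(summer_utc.isoformat(), winter_utc.isoformat())
```

The same zone name yields a different offset for the two dates, which is why storing only the offset is not enough for future events.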
I would question whether time zone conversion is really such an overhead. It's usually pretty quick. I'd only go through the pain of caching if you are seeing a really significant performance hit. There are (as you pointed out) some big operations which will require caching (such as sorting billions of rows based on the actual (UTC) time).
If you require future events to be cached in UTC for performance reasons then yes, you need to put a process in place to update the cached values. Depending on the type of DB, it is possible that this could be done by the sysadmins, as TZ rules change rarely.
If you care about the offset, you should store the actual offset. Storing the timezone identifier is not the same thing, as timezones can, and do, change over time. By storing the timezone offset, you can calculate the correct local time at the time of the event, rather than the local time based on the current offset. You may still want to store the timezone identifier, if it's important to know what actual timezone the event was considered to have happened in.
Remember, time is a physical attribute, but a timezone is a political one.
If you convert to UTC you can order and compare the records
If you add the name of the timezone it originated from, you can represent it in its original tz and be able to add/subtract time periods like weeks, months etc. (instead of elapsed time).
In your question you state that this is not enough because DST might be changed. DST makes calculating with dates (other than elapsed time) complicated and quite code intensive. Just like you need code to deal with leap years, you need to take into account whether, for a given date / period, you need to apply a DST correction or not. For some years the answer will be yes, for others no.
See this wiki page for how complex those rules have become.
Storing the offset is basically storing the result of those calculations. That calculated offset is only valid for that given point in time and can't be applied as is to later or earlier points like you suggest in your question. You do the calculation on the UTC time and then convert the resulting time to the required timezone based on the rules that are active at that time in that timezone.
Note that there wasn't any DST before the first world war anywhere and date/time systems in databases handle those cases perfectly.
I'm interested in solutions for MySQL, SQL Server, Oracle, PostgreSQL and other DBMS that handle TIMESTAMP WITH TIME ZONE.
Oracle converts the instant in time to UTC but keeps the time zone or UTC offset, depending on what you pass. Oracle (correctly) makes a difference between the time zone and UTC offset and returns what you passed to you. This only costs two additional bytes.
Oracle does all calculations on TIMESTAMP WITH TIME ZONE in UTC. This does not make a difference for adding months, but it does make a difference for adding days, as UTC has no daylight saving time. Note that the result of a calculation must always be a valid timestamp, e.g. adding one month to January 31st will throw an exception in Oracle as February 31st does not exist.

What is special about dates before the year 1970?

I see a lot of discussion about getting dates that are pre-1970. For example, I see people ask a question like, "how do I get a date before 1970?"
What I'd like to know is what is so special about 1970? Why do people have trouble getting dates before that particular year? Was it the beginning of the universe or something?
It is the beginning of the UNIX epoch, timestamp 0. All UNIX timestamps are the number of seconds since January 1st 1970 UTC. The moment of this writing is timestamp 1298440626.
UNIX timestamps pop up in the datetime libraries of a lot of languages and software, as storing times as a number of seconds is convenient for various reasons.
Since 1970 is time 0, dates before then can't typically be stored as timestamps.
It has to do with UNIX times. They're stored as the number of seconds since the epoch, and the epoch is defined as the start of the day January 1, 1970 (UTC).
That's also the cause of the upcoming Y2K38 bug, where the value will roll over to negative on 19 January 2038. Unless they up it to beyond a signed 32-bit value, of course.
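A short Python sketch of both points: dates before 1970 are simply negative second counts, and a signed 32-bit counter runs out in January 2038. The helper `from_unix` is a made-up name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_unix(seconds: int) -> datetime:
    """Convert seconds since the epoch to a UTC datetime."""
    # Adding a timedelta sidesteps platform limits some systems place on negative timestamps.
    return EPOCH + timedelta(seconds=seconds)

# A date before 1970 is just a negative number of seconds.
before_epoch = int((datetime(1960, 1, 1, tzinfo=timezone.utc) - EPOCH).total_seconds())

# The largest value a signed 32-bit integer can hold, and where it wraps to.
last_32bit_moment = from_unix(2**31 - 1)
wrapped_moment = from_unix(-2**31)

print(before_epoch)
print(last_32bit_moment.isoformat())
print(wrapped_moment.isoformat())
```

Whether negative values are accepted depends on the library or column type in use, which is why pre-1970 dates cause trouble in some systems and not in others.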
It was the beginning of the UNIX era.

Timezone conversions on date only and time only - is it necessary?

We've been working on implementing timezone support for our Web app.
This great SO post has helped us a bunch: Daylight saving time and time zone best practices
We've implemented the Olson TZ database in MySQL and are using that for TZ conversions.
We're building a scheduling app so:
We are storing all our bookings which occur on a specific date at a specific time in UTC time in DateTime fields and converting them using CONVERT_TZ(). This is working great.
What we aren't so sure about is stuff like vacations and breaks:
Vacations are just Date references and don't include a time portion. Because CONVERT_TZ() doesn't work on date objects we are guessing that we are best to just store the date value as per the user's timezone?
id1  id3  startDate   endDate
---------------------------------
3    6    2010-12-25  2011-01-03
4    3    2010-09-22  2010-09-26
Same thing with recurring breaks stored for each day of the week. We currently store their breaks indexed 0-6 for each day of the week. Because these are just time objects we can't use CONVERT_TZ() and assume we should just store them as time values in the user's time zone?
bID  sID  dayID  startTime  endTime
-----------------------------------
1    4    1      12:00:00   14:00:00
2    4    4      13:30:00   13:30:00
In this case with vacations and breaks we would only compare them to booking times AFTER the booking times have been converted to the user's local time.
Is this the correct way to handle things, or should we be storing both vacations and breaks in some other way so that we can convert them to UTC (not sure how this would work for breaks).
Thanks for your assistance!
The two storage formats look fine. You just need to convert them to the user's local time when you pull them out of the table.
Actually, for the breaks table I presume they're already nominally in local time, so you just compare directly against the local time of the appointment.
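Roughly, the comparison could look like this in Python. Several things here are assumptions for the sake of a self-contained sketch: a fixed UTC-6 offset stands in for the user's named time zone, dayID 0 is taken to mean Monday, and `falls_in_break` is a made-up helper.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: fixed offset standing in for the user's zone; a real app would
# convert with the named zone (as CONVERT_TZ() does in MySQL).
USER_ZONE = timezone(timedelta(hours=-6))

# (dayID, startTime, endTime), stored in the user's local time.
# Assumption: dayID 0 means Monday, matching Python's weekday().
BREAKS = [(1, time(12, 0), time(14, 0))]

def falls_in_break(booking_utc: datetime) -> bool:
    """Convert the UTC booking to local time, then compare with the local break times."""
    local = booking_utc.astimezone(USER_ZONE)
    return any(
        day_id == local.weekday() and start <= local.time() < end
        for day_id, start, end in BREAKS
    )

# 18:30 UTC on Tuesday 2010-12-28 is 12:30 local time, inside the break.
inside = falls_in_break(datetime(2010, 12, 28, 18, 30, tzinfo=timezone.utc))
# 21:00 UTC the same day is 15:00 local time, after the break.
outside = falls_in_break(datetime(2010, 12, 28, 21, 0, tzinfo=timezone.utc))
print(inside, outside)
```

Only the booking is converted; the break rows are never touched, because they are already expressed in local wall-clock time.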
I don't understand your question well enough to say my answer is 100% correct for you. But I think what you need to do is store the DateTime in "local" time and also store the timezone. This way you have it correct even if daylight savings time shifts (which happens).
Good article at http://blogs.windwardreports.com/davidt/2009/11/what-every-developer-should-know-about-time.html (yes by me).