In my web project using Angular, Node and MongoDB with JSON, dates are not natively supported by the JSON serializer. There is a workaround for this problem, as shown here. However, I wonder: what is the benefit of saving a date as a date object instead of a string in MongoDB? I'm not that far into the project, so I don't see the difference.
By saving your dates not as dates but as strings, you are missing out on some very useful features:
MongoDB can query date-ranges with $gt and $lt.
In version 3.0, the aggregation framework got many useful aggregation operators for date handling. None of those work on strings and few of them can be adequately substituted by string operators.
MongoDB dates are internally stored as milliseconds since the UNIX epoch, so nasty details like saving timestamps from different timezones or daylight saving time are a non-issue.
A BSON Date is just 8 bytes. A date in the minimal form YYYYMMDD is 12 bytes (strings in BSON are prefixed with a 4-byte integer for the length). When you store it as an ISO date string that uses everything the ISO 8601 standard has to offer (date, time accurate to the millisecond, and timezone), you are at 32 bytes: four times the storage space.
You need to know if any of this matters for your project.
When you really want to avoid the BSON Date type, consider storing your dates as a number representing the elapsed milliseconds/seconds/hours/days (whatever is appropriate for your use case) since a fixed point in time, instead of as a string. That way you retain all of the advantages except point 2.
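The numeric alternative described above can be sketched in a few lines. This is an illustrative Python example (the day-resolution variant), not MongoDB-specific code; the function names are arbitrary:

```python
from datetime import date

# Days since 1970-01-01 as the "fixed point in time"; day resolution is an
# assumption here - pick milliseconds/seconds/hours for finer granularity.
UNIX_EPOCH = date(1970, 1, 1).toordinal()

def to_days(d: date) -> int:
    """Encode a date as the number of days elapsed since the epoch."""
    return d.toordinal() - UNIX_EPOCH

def from_days(n: int) -> date:
    """Decode a day count back into a date."""
    return date.fromordinal(n + UNIX_EPOCH)

d = date(2017, 5, 17)
n = to_days(d)
assert from_days(n) == d
assert to_days(date(2017, 1, 1)) < n  # numeric order matches chronological order
```

Because the stored value is a plain number, range queries ($gt/$lt) and sorting keep working; only the aggregation date operators are lost.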
You should at least use ISO dates if you go for this approach. I would, however, argue that there are benefits to storing date values as date objects: it allows you to add indexes and should also help with date-range queries. That said, many developers seem happy to store dates as strings; see What is the best way to store dates in MongoDB?
Related
I have a column with values in int and float format, e.g. 10000, but I need to display this value in the UI in money format with comma separation. I have written the code below to convert it:
PARSENAME(CONVERT(VARCHAR, CAST(amount AS MONEY), 1), 2)
E.g., for amount = 10000, the result of the above query is 10,000.
Since the above conversion uses PARSENAME, CONVERT, CAST, etc., I suspect there is some performance overhead, but I am not sure.
Is there a better way to do this operation, or is this good enough?
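If the formatting is purely for display, one option is to skip the SQL string juggling entirely and format in the application layer. A minimal sketch in Python (assuming the amount reaches the application as a plain number; not an answer about T-SQL performance itself):

```python
def money_format(amount: float) -> str:
    """Group thousands with commas; drop the decimals, as in the example."""
    return f"{amount:,.0f}"

assert money_format(10000) == "10,000"
assert money_format(1234567.4) == "1,234,567"
```

Formatting at the presentation layer also keeps the column numeric, so aggregation and indexing in the database are unaffected.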
I am extracting some data from a DB using R and RMySQL, and the dates come back as factors. I then have to call as.Date() on this column. Because there are a lot of records, this takes a long time. Is there a way to strongly type the return values from fetch? That is, just as read.csv, for example, allows you to specify column types to prevent R from automatically trying to recognize them, is there something like this available? The dates in my DB are typed as Date.
Bad luck. The RMySQL documentation has this to say: "Time variables are imported/exported as character data, so you need to convert these to your favorite date/time representation." So you'll always have to convert.
RODBC seems to support dates properly, though.
I want to store a large number of time-series (time vs. value) data points, and I would prefer to use MySQL. Right now I am planning to store each time series as a binary BLOB in MySQL. Is this the best way? What would be the best approach?
You should store your values as whatever type they are (int, boolean, char) and your times as either a date or an int containing the UNIX timestamp, whichever fits your application better.
If you want to process the information in any way using MySQL, you should store it as a date or numeric type.
The only scaling issue I see (if you only intend to store the information) is the extra disk size.
As both Tom and aeon said, you should store data in its native format in MySQL if you want to do anything with that data (-> process the data with SQL).
If, on the other hand, you don't want to work on that data but just store/retrieve it, then you are using MySQL merely as a blob container, where every blob is a group of multiple time/data points, and it may not be the best tool for the job: that's what files are designed for.
You could investigate a hybrid approach where you store the data unstructured but store the timestamps as discrete values; in other words, a key/value store with your timestamps as keys. That opens up the possibility of NoSQL solutions, and you may find that they fit better (e.g. think about running map/reduce jobs directly on the DB in a Riak cluster).
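The timestamps-as-keys idea can be sketched in memory. This is an illustrative Python example only; a real key/value store would keep the sorted keys on disk and distribute them across nodes:

```python
import bisect

# Sorted list of (unix_timestamp, value) pairs: timestamps act as the keys,
# and binary search gives cheap range queries.
series = []

def insert(ts, value):
    bisect.insort(series, (ts, value))

def query_range(start, end):
    """Return all points with start <= timestamp < end."""
    lo = bisect.bisect_left(series, (start,))
    hi = bisect.bisect_left(series, (end,))
    return series[lo:hi]

insert(1000, 20.5)
insert(3000, 21.0)
insert(2000, 20.7)
assert query_range(1000, 2500) == [(1000, 20.5), (2000, 20.7)]
```

The same access pattern (point lookups by key, scans over a key range) is exactly what key/value stores are optimized for, which is why they suit time series well.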
Depending on your application, using the spatial data extensions of MySQL could be an option. You could then use spatial indexing to query the data fast.
To represent a time series, the LineString class might be the most appropriate choice. It represents a sequence of tuples you could use to store time and value.
Here it says that "On a world map, LineString objects could represent rivers."
Creating line strings is easy:
-- if you see a blob, use CONVERT(AsText(...) USING utf8)
SELECT AsText(GeomFromText('LineString(1 1, 2 2, 3.5 3.9)'));
Some links to get started:
https://dev.mysql.com/doc/refman/5.1/en/spatial-extensions.html
https://dev.mysql.com/doc/refman/5.1/en/populating-spatial-columns.html
I can't see any info about this. Where can I find the oldest date MySQL can support?
For the specific example you used in your question (year 1200), technically things will work.
In general, however, timestamps are inadvisable for such uses.
First, the range limitation is arbitrary: in MySQL it's Jan 1st, 1000. If you are working with 12th-13th century material, things go fine... but if at some point you need to add something older (10th century or earlier), the dates will break miserably, and fixing the issue will require reformatting all your historic dates into something more adequate.
Timestamps are normally represented as raw integers, with a given "tick interval" and "epoch point", so the number is the number of ticks elapsed between the epoch and the represented date (or vice versa for negative values). This means that, as with any fixed-width integer data type, the set of representable values is finite. Most timestamp formats I know of sacrifice range in favor of precision, mostly because applications that need to perform time arithmetic often need to do so with decent precision, while applications that work with historical dates very rarely need to perform serious arithmetic.
In other words, timestamps are meant for precise representation of dates. Second (or even sub-second) precision makes no sense for historical dates: could you tell me, down to the millisecond, when Henry VIII was crowned King of England?
In the case of MySQL, the format is inherently defined as "4-digit years", so any related optimization can rely on the assumption that the year will have 4 digits, or that the entire string will have exactly 10 chars ("yyyy-mm-dd"), etc. It's just a matter of luck that the date you mentioned in your title still fits, but even relying on that is dangerous: besides what the DB itself can store, you need to be aware of what the rest of your server stack can handle. For example, if you are using PHP to interact with your database, handling historical dates is very likely to crash at some point (on a 32-bit environment, the range for UNIX-style timestamps is December 13, 1901 through January 19, 2038).
In summary: MySQL will properly store any date with a 4-digit year; but in general, using timestamps for historical dates is almost guaranteed to trigger issues and headaches. I strongly advise against such usage.
Hope this helps.
Edit/addition:
Thank you for this very interesting answer. Should I create my own algorithm for historical dates, or choose another DB? But which one? – user284523
I don't think any DB has much support for this kind of date: applications using them most often make do with a string/text representation. Actually, for dates in year 1 and later, a textual representation will even yield correct sorting/comparison (as long as the date is written in order of magnitude: year, month, day). Comparisons will break, however, if "negative" dates are also involved: they would still compare as earlier than any positive date, but comparing two negative dates would yield a reversed result.
If you only need Year 1 and later dates, or if you don't need sorting, then you can make your life a lot easier by using strings.
Otherwise, the best approach is to use some kind of number and define your own "tick interval" and "epoch point". A good interval could be days (unless you really need finer precision, but even then you can rely on "real" (floating-point) numbers instead of integers), and a reasonable epoch could be Jan 1, 1. The main problem will be turning these values into their text representation, and vice versa. You need to keep in mind the following details:
Leap years have one extra day.
The rule for leap years was "any multiple of 4" until 1582, when the switch from the Julian to the Gregorian calendar changed it to "any multiple of 4, except multiples of 100, unless they are also multiples of 400".
The last day of the Julian calendar was Oct 4th, 1582. The next day, first of the Gregorian calendar, was Oct 15th, 1582. 10 days were skipped to make the new calendar match again with the seasons.
As stated in the comments, the two rules above vary by country: Papal states and some catholic countries did adopt the new calendar on the stated dates, but many other countries took longer to do so (the last being Turkey in 1926). This means that any date between the papal bull in 1582 and the last adoption in 1926 will be ambiguous without geographical context, and even more complex to process.
There is no "year 0": the year before year 1 was year -1, or year 1 BCE.
All of this requires quite elaborate parser and formatter functions, but beyond the many case-by-case exceptions there isn't really too much complexity (it'd be tedious to code, but quite straightforward). Using numbers as the underlying representation ensures correct sorting/comparison for any pair of values.
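As a sketch of what such a "tick" representation could look like, here is a day count based on the standard Julian Day Number formulas (Fliegel & Van Flandern), assuming the papal 1582 cutover described above. It is illustrative, not a complete historical-date library:

```python
def tdiv(a: int, b: int) -> int:
    """Integer division truncating toward zero, as the original Fortran
    formulas require (Python's // floors, which differs for negatives)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q

def day_number(year: int, month: int, day: int) -> int:
    """Days elapsed since a fixed ancient epoch (the Julian Day Number).

    Julian calendar through 1582-10-04, Gregorian from 1582-10-15
    (the papal cutover; the real switch date varies by country).
    """
    if (year, month, day) >= (1582, 10, 15):
        # Gregorian calendar branch
        return (tdiv(1461 * (year + 4800 + tdiv(month - 14, 12)), 4)
                + tdiv(367 * (month - 2 - 12 * tdiv(month - 14, 12)), 12)
                - tdiv(3 * tdiv(year + 4900 + tdiv(month - 14, 12), 100), 4)
                + day - 32075)
    # Julian calendar branch
    return (367 * year
            - tdiv(7 * (year + 5001 + tdiv(month - 9, 7)), 4)
            + tdiv(275 * month, 9)
            + day + 1729777)

assert day_number(2000, 1, 1) == 2451545                         # well-known JDN
assert day_number(1582, 10, 15) - day_number(1582, 10, 4) == 1   # the 10 skipped days
```

Stored as a plain integer column, these day numbers sort and compare correctly across the calendar switch and for dates long before year 1000.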
Knowing this, now it's your choice to take the approach that better fits your needs.
From the documentation:
DATE
A date. The supported range is '1000-01-01' to
'9999-12-31'.
Yes. MySQL dates start in year 1000.
For whatever it's worth, I found that the MySQL DATE field does support dates < 1000 in practice, though the documentation says otherwise. E.g., I was able to enter 325 and it stores as 0325-00-00. A search WHERE table.date < 1000 also gave correct results.
But I am hesitant to rely on the < 1000 dates when they are not officially supported, plus I sometimes need BCE years with more than 4 digits anyway (e.g. 10000 BCE). So separate INT fields for year, month and day (as suggested above) do seem the only choice.
I do wish the DATE type (or perhaps a new HISTDATE type) supported a full range of historical dates - it would be nice to combine three fields into one and simply sort by date instead of having to sort by year, month, day.
Use SMALLINT for the year, so the year will accept values from -32768 (BC) to 32767 (AD).
As for months and days, use TINYINT UNSIGNED.
Most historical events don't have a month and day, so you could query like this:
SELECT events FROM history WHERE year='-4990'
Result: 'Noah Ark'
Or: SELECT events FROM history WHERE year='570' AND month='4' AND day='20'
Returns: 'Muhammad pbuh was born'
Depending on requirements, you could also add a DATETIME column and make it NULL for dates before 1000, and vice versa (thus saving some bytes).
This is an important and interesting problem which has another solution.
Instead of relying on the database platform to support a potentially infinite number of dates with millisecond precision, rely on an object-oriented programming language compiler and runtime to correctly handle date and time arithmetic.
It is possible to do this using the Java Virtual Machine (JVM), where time is measured in milliseconds relative to midnight, January 1, 1970 UTC (Epoch), by persisting the required value as a long in the database (including negative values), and performing the required conversion/calculation in the component layer after retrieval.
For example:
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

Date d = new Date(Long.MIN_VALUE);
DateFormat df = new SimpleDateFormat("EEE, d MMM yyyy G HH:mm:ss Z");
System.out.println(df.format(d));
Should show:
Sun, 2 Dec 292269055 BC 16:47:04 +0000
This also enables independence of database versions and platforms as it abstracts all date and time arithmetic to the JVM runtime, i.e. changes in database versions and platforms will be much less likely to require re-implementation, if at all.
I had a similar problem: I wanted to keep relying on date fields in the DB, to allow date-range searches with an accuracy of up to a day for historic values.
(My DB includes dates of birth and the dates of Roman emperors...)
The solution was to add a constant number of years (for example: 3000) to all dates before inserting them into the DB, and to subtract the same number before displaying query results to the users.
If your DB already has some date values in it, remember to update the existing values with the new constant.
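A minimal sketch of this offset trick in Python. The constant 3000 is the example value from the answer, the function names are arbitrary, and only the year component is shown (the real value would be a full DATE):

```python
OFFSET = 3000  # constant added to every year before storage

def shift_for_storage(year: int) -> int:
    """Map a (possibly negative, i.e. BCE) year into DATE's supported range."""
    return year + OFFSET

def shift_for_display(stored_year: int) -> int:
    return stored_year - OFFSET

# Julius Caesar's assassination, 44 BCE (astronomical year -43):
stored = shift_for_storage(-43)
assert stored == 2957                      # safely inside DATE's 1000..9999 range
assert shift_for_display(stored) == -43
```

Because the shift is a constant, ordering and day-accurate range searches on the shifted dates behave exactly as they would on the real ones.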
JSON text (RFC 4627) has unambiguous representations of objects, arrays, strings, numbers, Boolean values (literally true or false) and null. However, it defines nothing for representing time information like dates and times of day, which is very common in applications. What are the current methods in use to represent time in JSON, given the constraints and grammar laid out in RFC 4627?
Note to respondents: The purpose of this question is to document the various methods known to be in circulation along with examples and relative pros and cons (ideally from field experience).
The only representation that I have seen in use (though, admittedly, my experience is limited to Dojo) is ISO 8601, which works nicely and represents just about anything you could possibly think of.
For examples, you can visit the link above.
Pros:
Represents pretty much anything you could possibly throw at it, including timespans (i.e. 3 days, 2 hours).
Cons:
Umm... I don't know, actually. Other than perhaps that it might take a bit of getting used to? It's certainly easy enough to parse, if there aren't built-in functions to parse it already.
ISO 8601 seems like a natural choice, but if you'd like to parse it with JavaScript running in a browser, you will need to use a library, because browser support for the parts of the JavaScript Date object that can parse ISO 8601 dates is inconsistent, even in relatively new browsers. Another problem with ISO 8601 is that it is a large, rich standard, and date/time libraries support only part of it, so you will have to pick a subset of ISO 8601 that is supported by the libraries you use.
Instead, I represent times as the number of milliseconds since 1970-01-01T00:00Z. This is understood by the constructor for the Date object in much older browsers, at least going back to IE7 (which is the oldest I have tested).
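A round-trip sketch of this milliseconds-since-epoch representation, shown here in Python rather than browser JavaScript (the field name "at" and the function names are arbitrary):

```python
import json
from datetime import datetime, timezone

def encode(dt: datetime) -> int:
    """Milliseconds since 1970-01-01T00:00Z, stored as a plain JSON number."""
    return int(dt.timestamp() * 1000)

def decode(ms: int) -> datetime:
    """Recover a UTC datetime from the stored millisecond count."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

event = {"name": "release",
         "at": encode(datetime(2017, 5, 17, 23, 9, 14, tzinfo=timezone.utc))}
payload = json.dumps(event)   # '{"name": "release", "at": 1495062554000}'
decoded = json.loads(payload)
assert decode(decoded["at"]) == datetime(2017, 5, 17, 23, 9, 14, tzinfo=timezone.utc)
```

Since the value is just a number, it survives any RFC 4627 parser unchanged, and ordering/range comparisons work without parsing a date format.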
There is no standard date literal, so use what's easiest for you. For most people, that's either a string of the UTC output or a long integer holding a UTC-based timecode.
Read this for a bit more background: http://msdn.microsoft.com/en-us/library/bb299886.aspx
I recommend using RFC 3339 format, which is nice and simple, and understood by an increasing number of languages, libraries, and tools.
Unfortunately, RFC 3339, Unix epoch time, and JavaScript millisecond time, are all still not quite accurate, since none of them account for leap seconds! At some point we're all going to have to revisit time representations yet again. Maybe the next time we can be done with it.
Sorry to comment on such an old question, but in the intervening years more solutions have turned up.
Representing date and/or time information in JSON is a special case of the more general problem of representing complex types and complex data structures in JSON. Part of what makes the problem tricky is that if you represent complex types like timestamps as JSON objects, then you need a way to express associative arrays and objects that happen to look like your JSON-object representation of a timestamp as some other marked-up object.
Google's protocol buffers have a JSON mapping which has the notion of a timestamp type, with defined semantics.
MongoDB's BSON has an Extended JSON which says { "$date": "2017-05-17T23:09:14.000000Z" }.
Both can also express far more complex structures, in addition to datetimes.