How to store approximate dates in MySQL? - mysql

I need to store dates such as 'Summer 1878' or 'Early June 1923', or even 'Mid-afternoon on a Tuesday in August'. How would you suggest I do this?
I have considered breaking the date and time up into separate (integer) columns, and giving each column an ancillary (integer) column containing a range (0 if exact; NULL if unknown). But I'm sure there's other ways...
Thanks!

Since 'Mid-afternoon on a Tuesday in August' ("A Sunday Afternoon on the Island of La Grande Jatte"?) doesn't specify a year, the only real solution is your table of all date and time components, all nullable.
Otherwise, you're conflating your data.
You have two (admittedly related) things here: a human readable string, the date_description, and a range of possible dates.
If you can specify at least a range, you can do this:
create table artwork (
artwork_id int not null auto_increment primary key,
name varchar(80),
-- ... other columns
date_description varchar(80),
earliest_possible_creation_date datetime,
latest_possible_creation_date datetime
);
insert into artwork(
name,
date_description,
earliest_possible_creation_date,
latest_possible_creation_date
) values (
'A Sunday Afternoon on the Island of La Grande Jatte',
'Mid-afternoon on a Tuesday in August',
'1884-01-01',
'1886-12-31'
), (
'Blonde Woman with Bare Breasts',
'Summer 1878',
'1878-05-01',
'1878-08-31'
), (
'Paulo on a Donkey',
'Early June 1923',
'1923-06-01',
'1923-06-15'
);
This allows you to display whatever you want, and search for:
select * from artwork
where @some_date between
earliest_possible_creation_date and latest_possible_creation_date;
And obviously, "creation date" (the date the artist created the work) is entirely different from "date depicted in work", if the latter can be determined at all.
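For anyone who wants to try the range lookup without a MySQL server, here is a minimal sketch of the same idea in Python. It assumes an in-memory SQLite database as a stand-in for MySQL and stores the bounds as ISO date strings; the helper name works_possibly_created_on is made up for this example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table artwork (
        artwork_id integer primary key,
        name text,
        date_description text,
        earliest_possible_creation_date text,
        latest_possible_creation_date text
    )
    """
)
conn.executemany(
    "insert into artwork (name, date_description,"
    " earliest_possible_creation_date, latest_possible_creation_date)"
    " values (?, ?, ?, ?)",
    [
        ("A Sunday Afternoon on the Island of La Grande Jatte",
         "Mid-afternoon on a Tuesday in August", "1884-01-01", "1886-12-31"),
        ("Paulo on a Donkey", "Early June 1923", "1923-06-01", "1923-06-15"),
    ],
)


def works_possibly_created_on(some_date):
    """Return names of works whose possible creation range contains some_date."""
    rows = conn.execute(
        "select name from artwork"
        " where ? between earliest_possible_creation_date"
        " and latest_possible_creation_date",
        (some_date,),
    )
    return [name for (name,) in rows]


matches = works_possibly_created_on("1885-08-11")
```

ISO-formatted strings sort in date order, which is why a plain between works on text columns in this sketch. With real datetime columns in MySQL the comparison is the same.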

I'm using Postgres, and I wanted to do the same thing. Perhaps you can do it the same way as I did it, if MySQL has some similar geometric types: http://www.electricwords.org/2008/11/fuzzy-date-matching-in-postgresql/

Almost no matter what you do, you almost certainly won't be able to get the database to do the heavy lifting for you. So you are left with two options:
1 - Use natural strings as you have described
2 - Store a precise date as well as the precision of that date
For example, you could store "5:10:23pm on Sep 23,1975", "plus or minus 6 months", and when someone wants to search for records that occurred in that timeframe this could pop up.
This doesn't help with queries, because to the best of my knowledge MySQL doesn't provide any support for tolerances (nor do any others I know of). You basically have to query it all and then filter it yourself.
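The "filter it yourself" step could look roughly like this in Python. This is only a sketch: the function names are made up, and 183 days is an approximation of "6 months" chosen for the example.

```python
from datetime import datetime, timedelta


def tolerance_window(center, tolerance):
    """Expand a point in time and a +/- tolerance into (earliest, latest)."""
    return center - tolerance, center + tolerance


def may_match(center, tolerance, search_start, search_end):
    """True if the fuzzy window overlaps the searched interval at all."""
    earliest, latest = tolerance_window(center, tolerance)
    return earliest <= search_end and latest >= search_start


# "5:10:23pm on Sep 23, 1975", plus or minus roughly six months
# (183 days is an approximation chosen for this sketch).
center = datetime(1975, 9, 23, 17, 10, 23)
tolerance = timedelta(days=183)
```

A search for January 1976 would then call may_match(center, tolerance, datetime(1976, 1, 1), datetime(1976, 1, 31)) on each fetched row.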

I don't think any native MySQL date representation is going to work for you. Your two-column solution would work well if paired with a Unix time stamp (generated with the UNIX_TIMESTAMP() function with a MySQL date as the argument). Use the second column (the range width) for an upper and lower bound in your selects, and make sure the date column is indexed.

In the end I decided upon: a column for each of the date components (year, month, day, hour, minute, second), and accompanying columns for the range of each of these (year_range, month_range, day_range, hour_range, minute_range, second_range), mainly because this method allows me to specify that I know for sure that a particular photo was taken in August (for instance) in the late '60s (year=1868, year_range=2, month=8, month_range=0).
Thank you all for your help!

Create a table with a list of the values you might want, like "Early" or "Summer". Then whatever sets up the data could use an algorithm that sets a foreign key depending on the date.

Going with Chris Arguin's answer, in the second column just have another datetime column that you can use to store the +/-, then you should be able to write a query that uses both columns to get an approximate datetime.

Use two dates and determine the start and end date of the fuzzy region. For stuff like Summer 1878, enter 18780621 to 18780920. For Early June 1923 you have to decide when early ends, maybe 19230601 to 19230610. This makes it possible to select against the values. You might still want to filter afterward but this will get you close.
For the ones without years, you'll have to find a different system.
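As a rough illustration of the two-date approach, the boundaries above can be generated by small helpers. The function names are hypothetical, and where "summer" or "early" begin and end is a judgment call taken from the answer, not any standard.

```python
from datetime import date


def summer(year):
    """Summer as 21 June to 20 September (the boundaries used above)."""
    return date(year, 6, 21), date(year, 9, 20)


def early_month(year, month, last_day=10):
    """'Early' in a month as day 1 through last_day (10 is an assumption)."""
    return date(year, month, 1), date(year, month, last_day)


def as_yyyymmdd(d):
    """Encode a date as the integer form used above, e.g. 18780621."""
    return d.year * 10000 + d.month * 100 + d.day


summer_1878 = tuple(as_yyyymmdd(d) for d in summer(1878))
early_june_1923 = tuple(as_yyyymmdd(d) for d in early_month(1923, 6))
```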

Related

Relative dates in Semantic Mediawiki?

Is there a way in Semantic Mediawiki to store and use relative dates?
I would like to store genealogical data in Semantic Mediawiki and there is sometimes information like: »On January 10th 2021 John, son of the deceased Jack, married Mary.« Now I know that Jack died BEFORE 2021-01-10. Is there any way to store (and query) such information -- BEFORE 2021-01-10 -- in a date property, just like in GEDCOM format?
To store such data, you can define Record datatype:
Property:Relative date of birth:
[[Has type::Record]]
[[Has fields::Sign;Date value]]
Property:Date value:
[[Has type::Date]]
Property:Sign:
[[Has type::Text]]
[[Allows value::Before]]
[[Allows value::Exactly]]
[[Allows value::After]]
To store data, use [[Relative date of birth::Before;January 9th, 1976]].
Querying such data is not an easy task. For an exact day, use {{#ask:[[Relative date of birth::Exactly;January 9th,1976]]}}. To query for people born before the 9th of January 1976, you need a more complicated query, or a union of queries: {{#ask:[[Relative date of birth::Exactly||Before;<January 9th,1976]]|?Relative date of birth.Date value=date}}.
I have a set of functions for "GEDdates". I store dates with two fields: one for the date in ccyymmdd format and another for a modifier. The date can be truncated if you don't have specifics: ccyy or ccyymm. The modifiers are <, >, c, - for BEF, AFT, ABT and BTW in GEDCOM. The - is followed in the modifier field by the later date, such as -ccyymm. I've recently also used the Unicode character for between, ≬ (U+226C), which is more aligned with the data type.
This data structure gives all the flexibility needed. There are code examples at GitHub
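To make the two-field idea concrete, here is a minimal Python sketch of how such a date plus modifier might be turned into bounds. It is not the code from the GitHub examples mentioned above; the function names and the loose handling of "c" are assumptions for illustration.

```python
import calendar
from datetime import date


def expand(partial):
    """Turn 'ccyy', 'ccyymm' or 'ccyymmdd' into (first_day, last_day)."""
    year = int(partial[0:4])
    if len(partial) == 4:
        return date(year, 1, 1), date(year, 12, 31)
    month = int(partial[4:6])
    if len(partial) == 6:
        last = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last)
    day = int(partial[6:8])
    return date(year, month, day), date(year, month, day)


def bounds(partial, modifier=""):
    """Return (earliest, latest); None marks an open-ended side."""
    first, last = expand(partial)
    if modifier == "<":           # BEF: anything earlier than `first`
        return None, first
    if modifier == ">":           # AFT: anything later than `last`
        return last, None
    if modifier.startswith("-"):  # BTW: the modifier carries the later date
        return first, expand(modifier[1:])[1]
    return first, last            # exact, or "c" (ABT) treated loosely here
```

For example, bounds("1976", "<") gives an open start and 1 January 1976 as the (exclusive) upper limit.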

How to calculate the difference between days in a table field

Ok this should be a relatively easy thing to do, yet I'm at the head desk stage trying to figure out the insanity here.
I have a table called tblPersonnel. I'm tracking two document expiration dates in date/time fields called CED and PPED. When I run a query against tblPersonnel I need it to look at PPED, determine if that document is expired and if so use CED instead. I have a few fields in the query that need to use this concept to determine what the output value is, but I am hitting a wall here trying to get the query to spit out the correct value. Here's what I'm using for one of the fields - Document Expiration Date: IIf([PPED]-Now()<0,[CED],[PPED]). What's happening is that the expression is constantly popping as false, so PPED is getting used regardless if it's an expired date or not. Does anyone have any ideas as to what I'm doing wrong here?
I've also tried to set this up as its own field in tblPersonnel, but that's even more aggravating. If I try to set the field to just a text field - IIf([PPED]-Now()<0,"Yes","No"), the formula will accept the use of Now(), but it doesn't like the reference to the other fields in the table. If I set it as a calculated column, I can reference the other fields but it doesn't like Now(). I'm at a loss here.
If PPED is less than Date(), it is expired. Don't need to subtract. Assuming CED and PPED are just date parts, no time, consider:
IIf([PPED] < Date(), [CED], [PPED])
If PPED could be null:
IIf(Nz([PPED],0) < Date(), [CED], [PPED])
Ok finally fixed it here. I had another issue in that I wasn't accounting for how Access would handle a Null or blank value in PPED. The functioning formula is Document Expiration Date: IIf(Len([PPED])>0,IIf([PPED]<Date(),[CED],[PPED]),[CED]) Thanks to June7 for helping me simplify the expression, as I was using DateDiff('d',[PPED],Date())<0 but their answer is just so much cleaner and quicker to type.

Querying https://musicbrainz.org for all artists

How can I query for all artists who were born after 1720 and died before 1900 on https://musicbrainz.org?
I need to retrieve their IDs and some information about them.
Is it possible to get data in JSON format?
For those who don't want to read a long post, here is everything the OP asked for in a single query:
http://musicbrainz.org/ws/2/artist/?query=begin:[1720 TO 1900] AND end:[1720 TO 1900] AND type:"person"&fmt=json
This should return perfect results, and has got to be the best answer possible.
- all artists, born after 1720 and dead before 1900, in json format, which retrieves their IDs, and lots of information about them...
The explanation and thought process:
Since Brian's currently accepted answer includes a link to the API documentation, I can say it is technically complete, but I don't consider pointing to the spec the best possible answer, and it can be greatly improved.
Firstly it is easy to return json by adding the json format parameter.
&fmt=json
Secondly, while I don't reckon there were many boy bands back in the day, given that the OP is asking about births and deaths we may conclude they are interested only in people rather than groups or other types of artists.
AND type:"person"
At this point, as Brian suggests, you would make another call for each end date and then filter the results, keeping only those who died by 1900.
If you did this, you would need far more than the 180 searches that answer suggests: one for each combination of birth year and death year, from 1720/1720 all the way through 1900/1900. My math stinks, but that is thousands of searches.
What makes this an even worse search is that dates are sometimes recorded with only the year and sometimes with month, day and year.
So if you search for begin 1929 and end 1990, you would not get any results for the artist below, because the birthday is stored in full:
ex:
id "2b8a16a9-468f-49b0-93ea-5e6726f41643" type "Person" life-span
begin "1929-11-10"
end "1990"
ended true
Therefore in order to get any good results using only the year you would need to add the fuzzy search syntax
musicbrainz.org/ws/2/artist/?query=begin:1960~ AND end:1990~ AND type:"person"&fmt=json
But this does nothing to solve the bigger problem, the sheer number of searches suggested. Knowing the search is Lucene-based, I decided to learn some Lucene and realized there is a range syntax.
Therefore you can do all of the above with one query:
http://musicbrainz.org/ws/2/artist/?query=begin:[1720 TO 1900] AND end:[1720 TO 1900] AND type:"person"&fmt=json
PS: I recommend quoting or even URL-encoding your parameter values to prevent breakage.
For example, leaving the quotes off the begin and end numerals in the example above causes no problem, but leaving them off the type value will fail.
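If you build the URL from code, the encoding advice can be followed with the standard library. A minimal Python sketch, assuming the same endpoint and range query as above; the function name artist_search_url is made up, and no request is actually sent here.

```python
from urllib.parse import urlencode

BASE_URL = "http://musicbrainz.org/ws/2/artist/"


def artist_search_url(begin_from, begin_to, end_from, end_to):
    """Build the range query from the answer above with encoded parameters."""
    query = (
        f"begin:[{begin_from} TO {begin_to}] AND "
        f"end:[{end_from} TO {end_to}] AND "
        'type:"person"'
    )
    # urlencode percent-encodes brackets, colons and quotes, and turns
    # spaces into '+', so the query survives as a single parameter value.
    return BASE_URL + "?" + urlencode({"query": query, "fmt": "json"})


url = artist_search_url(1720, 1900, 1720, 1900)
```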
First, Musicbrainz only returns XML, as far as I know, so you'll have to convert the results to JSON.
To answer your question, it doesn't look like you'll be able to get the data you want in a single call. (The following is based off the XML Web Service Search documentation.)
This call will retrieve all artists who were born in a given year:
http://musicbrainz.org/ws/2/artist/?query=begin:1720
I believe you'd need to write 180 calls (one for each year between 1720 and 1900) to get the data you need. You'd also need to manually filter out artists who died after 1900, by looking at the <end> node within <life-span>. This is because the end field will only get you artists who died in a specific year.

Calculate membership duration, until now or until end date

I have a MySQL table with (among others) the following columns:
[name] [member_since_date] [member_until_date]
When somebody's membership ends, the [member_until_date] field is populated, otherwise it contains NULL.
I need a purely SQL based solution for the following:
I want to calculate how long someone is a member: when [member_until_date] is filled, I need it to calculate [member_until_date] - [member_since_date].
When somebody is still a member, the field [member_until_date] is NULL, so then I need it to calculate [NOW] - [member_since_date].
I hope I'm clear enough on this, and I hope somebody has an answer for me.
To get the difference between dates, in days, use DATEDIFF(). To take one value based on a condition, or another, use IF, though in this case I am using the similar IFNULL().
SELECT DATEDIFF(IFNULL(member_until_date, NOW()), member_since_date) AS days_member
FROM ...
IFNULL() says use the first argument, unless it's null, then use the second argument.
DATEDIFF() expects the larger date first in order to get a positive result.
COALESCE() provides similar functionality to IFNULL() and would be the ANSI SQL way of doing this.
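Here is a small runnable sketch of the COALESCE() variant. It assumes SQLite (via Python) as a stand-in for MySQL; SQLite has no DATEDIFF(), so julianday() subtraction is swapped in, and "today" is passed as a parameter instead of NOW() to keep the result repeatable. The table contents are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table members (name text, member_since_date text,"
    " member_until_date text)"
)
conn.executemany(
    "insert into members values (?, ?, ?)",
    [
        ("alice", "2020-01-01", "2020-01-31"),  # membership ended
        ("bob", "2020-01-01", None),            # still a member
    ],
)


def membership_days(today):
    """Days of membership per name, using `today` when the end date is NULL."""
    rows = conn.execute(
        "select name,"
        " cast(julianday(coalesce(member_until_date, ?))"
        "      - julianday(member_since_date) as integer)"
        " from members order by name",
        (today,),
    )
    return dict(rows.fetchall())


durations = membership_days("2020-03-01")
```

In MySQL itself the equivalent expression would be DATEDIFF(COALESCE(member_until_date, NOW()), member_since_date).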
Here's a solution using DATEDIFF().
SELECT name, DATEDIFF(IF(member_until_date,member_until_date,NOW()),
member_since_date) AS membership_duration
FROM members;
First, we check if member_until_date is null. If it is, we use NOW(), otherwise we use member_until_date.
IF(member_until_date,member_until_date,NOW())
Now we calculate the date difference between the above and the beginning of the membership, member_since_date, and return it as membership_duration.
DATEDIFF(IF(member_until_date,member_until_date,NOW()),
member_since_date) AS membership_duration

Access data conversion issue

I'm using Access 2003. Have a table with some date values in a text data column like this;
May-97
Jun-99
Jun-00
Sep-02
Jan-04
I need to convert them to a proper date format in another Date/Time column, so I created a new Date/Time column and updated it with the values from the Text column. At first it looked fine, except for years after the year 2000. The new column converted the dates as follows:
May-97 > 01/05/1997
Jun-99 > 01/06/1999
Jun-00 > 01/06/2000
Sep-02 > 01/09/2010
Jan-04 > 01/01/2010
As you can see, any data with a year after 2000 gets converted to 2010. The same thing happens if I query the data using FORMAT(dateString, "dd/mm/yyyy").
Any ideas why this is so? Do I have to split the month and year and combine them again?
Thanks
Access/Jet/ACE (and many other Windows components) use a window for interpreting 2-digit years. For 00 to 29, it's assumed to be 2000-2029, and for 30-99, 1930-1999. This was put in place to address Y2K compatibility issues sometime in the 1997-98 time frame.
I do not allow 2-digit year input anywhere in any of my apps. Because of that, I don't have to have any code to interpret what is intended by the user (which could conceivably make mistakes).
This also points up the issue of the independence of display format and data storage with Jet/ACE date values. The storage is as a double, with the integer part indicating the day since 12/30/1899 and the decimal part the time portion within the day. Any date you enter is going to be stored as only one number.
If you input an incomplete date (i.e., with no century explicitly indicated for the year), your application has to make an assumption as to what the user intends. The 2029 window is one solution to the 2-digit year problem, but in my opinion, it's entirely inappropriate to depend on it because the user can change it in their Control Panel Regional Settings. I don't write any complicated code to verify dates, I just require 4-digit year entry and avoid the problem entirely. I have been doing this since c. 1998 as a matter of course, and everybody is completely accustomed to it. A few users squawked back then, and I had the "it's because of Y2K" as the excuse that shut them down. Once they got used to it, it became a non-issue.
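If you would rather make the window explicit in your own code than rely on a system setting, the rule described above (00-29 becomes 2000-2029, 30-99 becomes 1930-1999) is easy to write down. A minimal Python sketch; the function names are made up, and the month-year parser assumes English three-letter month abbreviations like those in the question.

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def expand_two_digit_year(yy, pivot=30):
    """Map 0..pivot-1 to 20yy and pivot..99 to 19yy (default window: 2029)."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 2000 + yy if yy < pivot else 1900 + yy


def parse_month_year(text, pivot=30):
    """Parse strings like 'Sep-02' into the first day of that month."""
    month_name, yy = text.split("-")
    return date(expand_two_digit_year(int(yy), pivot), MONTHS[month_name], 1)
```

Because the string is split explicitly, the two digits can never be mistaken for a day number.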
The date is ambiguous, so it is seeing 02 as the day number. Depending on your locale, something like this may suit:
cdate("01-" & Field)
However, it may be best to convert to four digit year, month, day format, which is always unambiguous.
Access seems to get confused between MM-YYYY format and MM-DD format. Don't know why it is doing it for dates after the year 2000, but solved it by converting the original string date to full date (01-May-01). Now Access converts the year into 2001 instead of 2010.
If you don't supply a year and the two sets of digits entered into a date field could be a day and month then Access assumes the current year. So your first three dates definitely have a year in them. But the last two don't.
Note that this isn't Access but actually the operating system doing the work. You get the same results in Excel. I had an interesting conversation with some Microsoft employees on this issue and it's actually OLEAUT32.DLL.