How can I query for all artists who were born after 1720 and died before 1900 on https://musicbrainz.org?
I need to retrieve their IDs and some information about them.
Is it possible to get data in JSON format?
For those who don't want to read a long post, here is everything the OP asked for in a single query:
http://musicbrainz.org/ws/2/artist/?query=begin:[1720 TO 1900] AND end:[1720 TO 1900] AND type:"person"&fmt=json
This should return exactly what was asked for:
- all artists born after 1720 and dead before 1900, in JSON format, including their IDs and lots of information about them.
The explanation and thought process:
Since Brian's currently accepted answer includes a link to the API documentation, I can say it is technically complete, but I don't consider pointing to the spec the best possible answer, and it can be greatly improved.
First, it is easy to return JSON by adding the format parameter:
&fmt=json
Second, while I don't reckon there were many boy bands back in the day, the OP is asking about births and deaths, so we may conclude they are interested only in people rather than groups or other types of artists:
AND type:"person"
At this point, Brian suggests making a separate call for each year and then filtering the results, keeping only those who died by 1900.
If you did this, you would need far more than the 180 searches that answer suggests: one for each combination of birth year and death year, from 1720-1720 all the way through 1900-1900. My math stinks, but that is thousands of searches.
What makes this still such a horrible search is that dates are sometimes written with only the year and sometimes with month, day and year.
So if you search for begin 1929 and end 1990, you would get no results for an artist whose date includes the month and day, because of the full birthday:
ex:
id "2b8a16a9-468f-49b0-93ea-5e6726f41643" type "Person" life-span
begin "1929-11-10"
end "1990"
ended true
Therefore, to get any good results using only the year, you would need to add the fuzzy search syntax:
musicbrainz.org/ws/2/artist/?query=begin:1960~ AND end:1990~ AND type:"person"&fmt=json
But this does nothing to solve the big problem, the sheer number of searches suggested. Knowing the search is Lucene-based, I decided to learn some Lucene and found that it has a range syntax.
Therefore you can do all of the above with one query:
http://musicbrainz.org/ws/2/artist/?query=begin:[1720 TO 1900] AND end:[1720 TO 1900] AND type:"person"&fmt=json
PS: I recommend quoting, or even URL-encoding, your parameter values to prevent breakage.
For example, leaving the quotes off the begin and end numerals in the example above causes no problem, but leaving them off the type value will fail.
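As a rough illustration of the URL-encoding advice, here is a minimal Python sketch that builds the same range query with the standard library. The endpoint, field names and `fmt=json` parameter come from the query above; the function name `build_artist_search_url` is made up for this example, and nothing here actually contacts the server.

```python
from urllib.parse import urlencode

# Endpoint taken from the query shown above.
BASE_URL = "http://musicbrainz.org/ws/2/artist/"


def build_artist_search_url(first_year: int, last_year: int) -> str:
    """Build the range query from the answer with every value URL-encoded.

    Hypothetical helper for illustration only.
    """
    query = (
        f"begin:[{first_year} TO {last_year}] AND "
        f"end:[{first_year} TO {last_year}] AND "
        'type:"person"'
    )
    # urlencode escapes the spaces, brackets, colons and quotes for us.
    return BASE_URL + "?" + urlencode({"query": query, "fmt": "json"})


url = build_artist_search_url(1720, 1900)
print(url)
```

The resulting string can be pasted into a browser or passed to any HTTP client.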
First, Musicbrainz only returns XML, as far as I know, so you'll have to convert the results to JSON.
To answer your question, it doesn't look like you'll be able to get the data you want in a single call. (The following is based off the XML Web Service Search documentation.)
This call will retrieve all artists who were born in a given year:
http://musicbrainz.org/ws/2/artist/?query=begin:1720
I believe you'd need to write 180 calls (one for each year between 1720 and 1900) to get the data you need. You'd also need to manually filter out artists who died after 1900, by looking at the <end> node within <life-span>. This is because the end field will only get you artists who died in a specific year.
I'm pretty much trying to get all of the rows that contain any of the relevant tags in any of the relevant columns.
Take a look at an example row:
[LeadID Leadname Ratings AvgRating Address Website Phone TimesOpen Category LeadDescription CurrentStatus]
1 Siena Tuscan Steakhouse 396 4.300 104 S Broadway, Wichita, KS 67202, United States http://www.sienawichita.com/ +1 316-440-5300 LGBTQ+ friendly2022-05-19 Thursday 5PM–12AM
2022-05-20 Friday 6:30–10AM
2022-05-21 Saturday 7–11AM
2022-05-22 Sunday 7–11AM
2022-05-23 Monday 6:30–10AM
2022-05-24 Tuesday 6:30–10AM
2022-05-25 Wednesday 5PM–10AM
restaurants Hotel restaurant-bar offering refined Italian plates & many wines in a warm & elegant atmosphere.
I don't think you'll need to see it in structured form so I apologize for it being messy.
Everything in [ ] are the column names, and the following are its respective fields.
Here is my query
SELECT LeadID
FROM cleancopy
WHERE
Website OR LeadName OR LeadDescription OR Category
IN ('%Event%' OR '%Live%' OR '%Music%' OR '%Venue%');
This query is returning all rows unfiltered.
I want the query to select all rows that contain any number of the relevant tags "Event", "Live", "Music", "Venue", in any of the column names Website, LeadName, LeadDescription, Category.
So one or all of the tags could be in one or all of the attribute types.
More simply put, I'm trying to filter out any row that doesn't contain any of the keywords I want.
First thing: "I don't think you'll need to see it in structured form" is a very bad assumption. We DO need this because it makes it much easier to provide a good answer.
Second thing: This is not the kind of data checking SQL is designed for. There is no simple way to do it, especially when you really need a list of strings combined with a LIKE condition. Such complex data handling is better avoided, or done outside SQL within the application.
Anyway, the shortest way to do what you describe will be to CONCAT all columns and then search for your strings using OR.
SELECT leadid FROM cleancopy WHERE
CONCAT(website,leadname,leaddescription,category) LIKE '%Event%'
OR CONCAT(website,leadname,leaddescription,category) LIKE '%Live%'
OR CONCAT(website,leadname,leaddescription,category) LIKE '%Music%'
OR CONCAT(website,leadname,leaddescription,category) LIKE '%Venue%'
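If the filtering is moved into the application, as suggested above, one option is to generate the conditions from the two lists instead of typing them out. The following is only a sketch: it uses Python's built-in sqlite3 with an in-memory table, the second row is invented test data, and it tests each column separately rather than using CONCAT (in MySQL, CONCAT returns NULL as soon as one of its arguments is NULL, which would hide rows with an empty column).

```python
import sqlite3

# Column names and keywords taken from the question.
COLUMNS = ["Website", "LeadName", "LeadDescription", "Category"]
KEYWORDS = ["Event", "Live", "Music", "Venue"]


def build_query(columns, keywords):
    """Return SQL with one 'column LIKE ?' test per (column, keyword) pair."""
    # Column names are fixed constants here, never user input.
    conditions = " OR ".join(
        f"{col} LIKE ?" for col in columns for _ in keywords
    )
    params = [f"%{kw}%" for _ in columns for kw in keywords]
    return f"SELECT LeadID FROM cleancopy WHERE {conditions}", params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cleancopy (LeadID INTEGER, Website TEXT, LeadName TEXT, "
    "LeadDescription TEXT, Category TEXT)"
)
conn.executemany(
    "INSERT INTO cleancopy VALUES (?, ?, ?, ?, ?)",
    [
        (1, "http://www.sienawichita.com/", "Siena Tuscan Steakhouse",
         "Hotel restaurant-bar offering refined Italian plates", "restaurants"),
        # Invented row, only there so that something matches.
        (2, None, "Hypothetical Hall", "Live music most weekends", "Venue"),
    ],
)

sql, params = build_query(COLUMNS, KEYWORDS)
matching_ids = [row[0] for row in conn.execute(sql, params)]
print(matching_ids)
```

The keywords are passed as bound parameters, so adding or removing a tag only means editing the `KEYWORDS` list.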
Is there a way in Semantic Mediawiki to store and use relative dates?
I would like to store genealogical data in Semantic Mediawiki and there is sometimes information like: »On January 10th 2021 John, son of the deceased Jack, married Mary.« Now I know that Jack died BEFORE 2021-01-10. Is there any way to store (and query) such information -- BEFORE 2021-01-10 -- in a date property, just like in GEDCOM format?
To store such data, you can define Record datatype:
Property:Relative date of birth:
[[Has type::Record]]
[[Has fields::Sign;Date value]]
Property:Date value:
[[Has type::Date]]
Property:Sign:
[[Has type::Text]]
[[Allows value::Before]]
[[Allows value::Exactly]]
[[Allows value::After]]
To store data, use [[Relative date of birth::Before;January 9th, 1976]].
Querying such data is not an easy task. For an exact day, use {{#ask:[[Relative date of birth::Exactly;January 9th,1976]]}}. To query for people born before the 9th of January 1976, you need a more complicated query, or a union of queries: {{#ask:[[Relative date of birth::Exactly||Before;<January 9th,1976]]|?Relative date of birth.Date value=date}}.
I have a set of functions for "GEDdates". I store dates with two fields: one for the date in ccyymmdd format and another for a modifier. The date can be truncated if you don't have specifics: ccyy or ccyymm. The modifiers are <, >, c, - for BEF, AFT, ABT and BTW in GEDCOM. The - is followed in the modifier field by the later date, such as -ccyymm. I've recently also used the Unicode character for between, ≬ (U+226C), which is more aligned with the data type.
This data structure gives all the flexibility needed. There are code examples at GitHub
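To make the two-field idea concrete, here is a small Python sketch of how such a date/modifier pair could be turned into a pair of bounds for comparisons. This is not the code from the GitHub repository mentioned above; the function names `span` and `bounds` are invented for this example, and treating "c" (about) the same as an exact date is an assumption.

```python
import calendar
from datetime import date
from typing import Optional, Tuple


def span(value: str) -> Tuple[date, date]:
    """Earliest and latest day covered by a ccyy, ccyymm or ccyymmdd string."""
    year = int(value[0:4])
    if len(value) == 4:
        return date(year, 1, 1), date(year, 12, 31)
    month = int(value[4:6])
    if len(value) == 6:
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    day = int(value[6:8])
    return date(year, month, day), date(year, month, day)


def bounds(value: str, modifier: str = "") -> Tuple[Optional[date], Optional[date]]:
    """Translate a date plus modifier into (lower, upper); None means open-ended."""
    first, last = span(value)
    if modifier == "<":              # before the date
        return None, first
    if modifier == ">":              # after the date
        return last, None
    if modifier.startswith("-"):     # between: modifier carries the later date
        return first, span(modifier[1:])[1]
    return first, last               # exact, or "c" (assumed: same span)


print(bounds("20210110", "<"))
print(bounds("192306", "-192308"))
```

With bounds like these, "Jack died before 2021-01-10" becomes an open lower bound and an upper bound of 2021-01-10.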
Let's consider a multiple selection parameter on a report: Employee
This parameter has a lot of possible values. Initially nothing is shown on the list and there is a textfield search parameter associated, that updates the Employee selection list with top n matches for the searched string.
If the entered search query is John Doe we can imagine that now the selection list shows:
John Doe
...
Xavier John Doesson
Now I can select as many items as I want from this filtered list, but if I want to select both John Doe and Alicia Keys, the following happens:
First when I enter the search string "John Doe" the selection list gets populated accordingly
I select John Doe - OK
I enter search string "Alicia Keys", the selection list gets populated also
Selection of John Doe is gone - I want to be able to select both Alicia and John at the same time, but I don't want to go through a selection list thousands of names long
Update:
Forgot to mention that we have an OLAP cube in the background with dimension 'Employee'. This dimension is used as the source of the parameter and the param dataset uses MDX to fetch the values, therefore the SQL solution cannot be applied here.
The current solution creates a custom set with the MDX Filter and Head functions, and this set is then used in the ROWS part of the MDX query.
Here is how the set is created:
SET setEmployees AS {
HEAD(
FILTER( [Employees].[Employees].ALLMEMBERS,
INSTR([Employees].[Employees].CURRENTMEMBER.Name, @EmployeeSearch) >= 1
)
,100)
}
Basically, the problem with this solution is: how do you add multiple search strings to the INSTR function?
Is there a common solution to this kind of situation? Am I approaching the problem from the wrong direction?
What you could do is make the search parameter more flexible, so you can handle input such as:
John OR Jane
If "OR" queries are more common than "AND" queries you could support it with queries such as:
John Jane
Note that this may throw people off, because the search features they're used to (such as Google search) typically tend to interpret multiple words in the "AND" sense.
Anyhow, the tricky bit of course is the SQL behind the Employee data set. This should use the search parameter in a more flexible way. You haven't specified how that's currently working, but I imagine you may be using something like:
WHERE Employee.FullName LIKE '%' + @SearchParameter + '%'
You would need to extend that to support "OR" queries. There's a whole range of solutions for that, from quick 'n dirty handmade SQL (e.g. string split combined with WHERE...IN) to full-text querying. Choose a solution that's best for your situation.
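As one possible shape for the quick and dirty end of that range, the sketch below splits the search text on the word OR and builds one LIKE condition per term. It is written in Python against an in-memory SQLite table purely for illustration; the table, the sample names and the helper `build_name_filter` are assumptions, not part of the original report.

```python
import sqlite3


def build_name_filter(search: str):
    """Split 'John OR Jane' into terms and build a parameterised WHERE clause."""
    terms = [term.strip() for term in search.split(" OR ") if term.strip()]
    if not terms:
        return "1 = 0", []           # nothing entered: match no rows
    clause = " OR ".join("FullName LIKE ?" for _ in terms)
    return clause, [f"%{term}%" for term in terms]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (FullName TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?)",
    [("John Doe",), ("Xavier John Doesson",), ("Alicia Keys",), ("Jane Roe",)],
)

clause, params = build_name_filter("John Doe OR Alicia Keys")
names = [
    row[0]
    for row in conn.execute(
        f"SELECT FullName FROM Employee WHERE {clause} ORDER BY FullName", params
    )
]
print(names)
```

Because the terms are bound as parameters, whatever the user types into the search box never ends up in the SQL text itself.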
If you have a fixed number of search terms, then you can do something like the following.
FILTER( [Employees].[Employees].ALLMEMBERS,
INSTR([Employees].[Employees].CURRENTMEMBER.Name, @EmployeeSearch1) >= 1 OR
INSTR([Employees].[Employees].CURRENTMEMBER.Name, @EmployeeSearch2) >= 1
)
Even if you can do that, I do not recommend it. You don't have the luxury to index Analysis Services like you do SQL. A better possible approach would be to query your data warehouse for the employees and return the appropriate keys, and then filter by those keys in your MDX statement.
I'm using Access 2003. I have a table with some date values in a text column, like this:
May-97
Jun-99
Jun-00
Sep-02
Jan-04
I need to convert them to a proper date format in another Date/Time column, so I created a new Date/Time column and updated it with the values from the Text column. At first it looked fine, except for years after the year 2000. The new column converted the dates as follows:
May-97 > 01/05/1997
Jun-99 > 01/06/1999
Jun-00 > 01/06/2000
Sep-02 > 01/09/2010
Jan-04 > 01/01/2010
As you can see, any date with a year after 2000 gets converted to 2010. The same thing happens if I query the data using FORMAT(dateString, "dd/mm/yyyy").
Any ideas why this is so? Do I have to split the month and year and combine them again?
Thanks
Access/Jet/ACE (and many other Windows components) use a window for interpreting 2-digit years. For 00 to 29, it's assumed to be 2000-2029, and for 30-99, 1930-1999. This was put in place to address Y2K compatibility issues sometime in the 1997-98 time frame.
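The window rule is easy to reproduce outside Access. The following Python sketch applies the 00-29 / 30-99 split described above to strings such as "Sep-02" and always treats the second part as a year, never as a day. The helper names and the English month table are assumptions made for this example; this is not how Access itself is implemented.

```python
from datetime import date

MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def expand_year(two_digit: int, window_end: int = 2029) -> int:
    """Map 00-29 to 2000-2029 and 30-99 to 1930-1999 (for the default window)."""
    century = window_end - window_end % 100   # 2000
    pivot = window_end % 100                  # 29
    if two_digit <= pivot:
        return century + two_digit
    return century - 100 + two_digit


def parse_month_year(text: str) -> date:
    """Parse 'Mon-YY' as the first day of that month."""
    month_name, year_text = text.split("-")
    return date(expand_year(int(year_text)), MONTHS[month_name], 1)


for value in ["May-97", "Jun-00", "Sep-02", "Jan-04"]:
    print(value, "->", parse_month_year(value).isoformat())
```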
I do not allow 2-digit year input anywhere in any of my apps. Because of that, I don't have to have any code to interpret what is intended by the user (which could conceivably make mistakes).
This also points up the issue of the independence of display format and data storage with Jet/ACE date values. The storage is as a double, with the integer part indicating the day since 12/30/1899 and the decimal part the time portion within the day. Any date you enter is going to be stored as only one number.
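To illustrate the storage model just described (a single number counting days since 12/30/1899, with the fraction as the time of day), here is a short Python sketch. The function names are invented, and the sketch only claims to be right for dates on or after the epoch; it makes no attempt to model how negative values are handled.

```python
from datetime import datetime, timedelta

# Day zero of the serial number, as stated above.
EPOCH = datetime(1899, 12, 30)


def to_serial(moment: datetime) -> float:
    """Days since the epoch; the fractional part is the time within the day."""
    return (moment - EPOCH) / timedelta(days=1)


def from_serial(value: float) -> datetime:
    """Inverse of to_serial for non-negative values."""
    return EPOCH + timedelta(days=value)


noon = datetime(2002, 9, 1, 12, 0)
serial = to_serial(noon)
print(serial, from_serial(serial))
```

Whatever display format is chosen, the stored value stays the same number, which is why a date typed without a century has to be resolved to one specific year before it can be saved.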
If you input an incomplete date (i.e., with no century explicitly indicated for the year), your application has to make an assumption as to what the user intends. The 2029 window is one solution to the 2-digit year problem, but in my opinion, it's entirely inappropriate to depend on it because the user can change it in their Control Panel Regional Settings. I don't write any complicated code to verify dates, I just require 4-digit year entry and avoid the problem entirely. I have been doing this since c. 1998 as a matter of course, and everybody is completely accustomed to it. A few users squawked back then, and I had the "it's because of Y2K" as the excuse that shut them down. Once they got used to it, it became a non-issue.
The date is ambiguous, so it is seeing 02 as the day number. Depending on your locale, something like this may suit:
cdate("01-" & Field)
However, it may be best to convert to four digit year, month, day format, which is always unambiguous.
Access seems to be getting confused between MM-YYYY format and MM-DD format. I don't know why it does this for dates after the year 2000, but I solved it by converting the original string date to a full date (01-May-01). Now Access converts the year into 2001 instead of 2010.
If you don't supply a year and the two sets of digits entered into a date field could be a day and month then Access assumes the current year. So your first three dates definitely have a year in them. But the last two don't.
Note that this isn't Access but actually the operating system doing the work. You get the same results in Excel. I had an interesting conversation with some Microsoft employees on this issue and it's actually OLEAUT32.DLL.
I need to store dates such as 'Summer 1878' or 'Early June 1923', or even 'Mid-afternoon on a Tuesday in August'. How would you suggest I do this?
I have considered breaking the date and time up into separate (integer) columns, and giving each column an ancillary (integer) column containing a range (0 if exact; NULL if unknown). But I'm sure there's other ways...
Thanks!
Since 'Mid-afternoon on a Tuesday in August' ("A Sunday Afternoon on the Island of La Grande Jatte"?) doesn't specify a year, the only real solution is your table of all date and time components, all nullable.
Otherwise, you're conflating your data.
You have two (admittedly related) things here: a human readable string, the date_description, and a range of possible dates.
If you can specify at least a range, you can do this:
create table artwork (
artwork_id int not null primary key,
name varchar(80),
-- ... other columns
date_description varchar(80),
earliest_possible_creation_date datetime,
latest_possible_creation_date datetime
);
insert into artwork(
name,
date_description,
earliest_possible_creation_date,
latest_possible_creation_date
) values (
'A Sunday Afternoon on the Island of La Grande Jatte',
'Mid-afternoon on a Tuesday in August',
'1884-01-01',
'1886-12-31'
), (
'Blonde Woman with Bare Breasts',
'Summer 1878',
'1878-05-01',
'1878-08-31'
), (
'Paulo on a Donkey',
'Early June 1923',
'1923-06-01',
'1923-06-15'
);
This allows you to display whatever you want, and search for:
select * from artwork
where @some_date between
earliest_possible_creation_date and latest_possible_creation_date;
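For anyone who wants to try the range lookup without a MySQL server, here is a minimal Python sketch using the built-in sqlite3 module and the three rows from above. Storing the dates as ISO text (which sorts correctly as plain strings) and the helper name `works_possibly_created_on` are choices made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE artwork (
        artwork_id INTEGER PRIMARY KEY,
        name TEXT,
        date_description TEXT,
        earliest_possible_creation_date TEXT,  -- ISO yyyy-mm-dd
        latest_possible_creation_date TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO artwork (name, date_description, "
    "earliest_possible_creation_date, latest_possible_creation_date) "
    "VALUES (?, ?, ?, ?)",
    [
        ("A Sunday Afternoon on the Island of La Grande Jatte",
         "Mid-afternoon on a Tuesday in August", "1884-01-01", "1886-12-31"),
        ("Blonde Woman with Bare Breasts", "Summer 1878",
         "1878-05-01", "1878-08-31"),
        ("Paulo on a Donkey", "Early June 1923", "1923-06-01", "1923-06-15"),
    ],
)


def works_possibly_created_on(some_date: str):
    """Names of all works whose possible creation range contains some_date."""
    cursor = conn.execute(
        "SELECT name FROM artwork WHERE ? BETWEEN "
        "earliest_possible_creation_date AND latest_possible_creation_date "
        "ORDER BY name",
        (some_date,),
    )
    return [row[0] for row in cursor]


print(works_possibly_created_on("1885-07-14"))
```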
And obviously, "creation date" (the date the artist created the work) is entirely different from "date depicted in work", if the latter can be determined at all.
I'm using Postgres, and I wanted to do the same thing. Perhaps you can do it the same way as I did it, if MySQL has some similar geometric types: http://www.electricwords.org/2008/11/fuzzy-date-matching-in-postgresql/
Almost no matter what you do, you almost certainly won't be able to get the database to do the heavy lifting for you. So you are left with two options:
1 - Use natural strings as you have described
2 - Store a precise date as well as the precision of that date
For example, you could store "5:10:23pm on Sep 23,1975", "plus or minus 6 months", and when someone wants to search for records that occurred in that timeframe this could pop up.
This doesn't help with queries, because to the best of my knowledge MySQL doesn't provide any support for tolerances ( nor do any others I know of ). You have to basically query it all and then filter out yourself.
I don't think any native MySQL date representation is going to work for you. Your two-column solution would work well if paired with a Unix time stamp (generated with the UNIX_TIMESTAMP() function with a MySQL date as the argument). Use the second column (the range width) for an upper and lower bound in your selects, and make sure the date column is indexed.
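The timestamp-plus-width idea can be sketched in a few lines of Python. Everything below is illustrative: the helper names are made up, the times are treated as UTC, and the sample values come from the "Sep 23, 1975, plus or minus 6 months" example in another answer here. One caveat worth checking before adopting this: as far as I know, MySQL's UNIX_TIMESTAMP() only covers dates from 1970 onwards, so it would not handle something like 'Summer 1878'.

```python
from datetime import datetime, timedelta, timezone


def to_unix(moment: datetime) -> int:
    """Seconds since 1970-01-01, treating the naive datetime as UTC."""
    return int(moment.replace(tzinfo=timezone.utc).timestamp())


def window(center_ts: int, width_seconds: int):
    """Lower and upper bound implied by a timestamp and its range width."""
    return center_ts - width_seconds, center_ts + width_seconds


def matches(center_ts: int, width_seconds: int, query_ts: int) -> bool:
    """True when query_ts falls inside the stored timestamp's window."""
    low, high = window(center_ts, width_seconds)
    return low <= query_ts <= high


stored = to_unix(datetime(1975, 9, 23, 17, 10, 23))
half_year = int(timedelta(days=182).total_seconds())

print(matches(stored, half_year, to_unix(datetime(1975, 12, 1))))
print(matches(stored, half_year, to_unix(datetime(1977, 1, 1))))
```

In SQL the same test would be two comparisons against the indexed timestamp column, with the width added and subtracted in the WHERE clause.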
In the end I decided upon: a column for each of the date components (year, month, day, hour, minute, second), and accompanying columns for the range of each of these (year_range, month_range, day_range, hour_range, minute_range, second_range), mainly because this method allows me to specify that I know for sure that a particular photo was taken in August (for instance) in the late '60s (year=1868, year_range=2, month=8, month_range=0).
Thank you all for your help!
Create a table with a list of the values you might want, like "Early" or "Summer". Then whatever you have setting up the data could use an algorithm that sets a foreign key depending on the date.
Going with Chris Arguin's answer, in the second column just have another datetime column that you can use to store the +/-, then you should be able to write a query that uses both columns to get an approximate datetime.
Use two dates and determine the start and end date of the fuzzy region. For stuff like Summer 1878, enter 18780621 to 18780920. For Early June 1923 you have to decide when early ends, maybe 19230601 to 19230610. This makes it possible to select against the values. You might still want to filter afterward but this will get you close.
For the ones without years, you'll have to find a different system.