Wikipedia-API fallback if extract is empty - json

I am requesting data from the Wikipedia API.
My request URL looks like this:
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts%7Cpageimages%7Cinfo&inprop=url&piprop=thumbnail&pithumbsize=144&pilimit=50&exintro&explaintext&redirects=1n&generator=geosearch&ggscoord=47.073733%7C15.3918631&ggsradius=10000&ggslimit=50&origin=*
It returns pages located close to the given coordinates, along with their extracts. For some articles, however, the extract is empty. Can you help me extend my request so that it either falls back to something else when the extract is empty, or returns both the extract AND the first 10 sentences?
I don't want to always request the first 10 sentences, because the extract (if available) makes more sense.
Thank you
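One way to get the fallback on the client side is a second request for only those pages whose extract came back empty. The sketch below is a minimal illustration in Python, not a confirmed fix: the helper names pages_with_empty_extract and fallback_urls are made up for this example, and it assumes the default JSON layout where query.pages is an object keyed by page ID. It builds one follow-up URL per page because, as far as I know, TextExtracts only returns multiple extracts per request when exintro is set.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def pages_with_empty_extract(response: dict) -> list:
    """Return the page IDs whose 'extract' is missing or blank."""
    pages = response.get("query", {}).get("pages", {})
    return [
        page["pageid"]
        for page in pages.values()
        if not page.get("extract", "").strip()
    ]


def fallback_urls(page_ids: list, sentences: int = 10) -> list:
    """Build one follow-up URL per page asking for the first sentences.

    exintro is deliberately left out, so the sentences are taken from
    the whole article rather than from the (empty) intro section.
    """
    urls = []
    for page_id in page_ids:
        params = {
            "format": "json",
            "action": "query",
            "prop": "extracts",
            "explaintext": 1,
            "exsentences": sentences,
            "pageids": page_id,
            "origin": "*",
        }
        urls.append(f"{API_URL}?{urlencode(params)}")
    return urls
```

The idea is to run the original geosearch request first, pass its parsed JSON to pages_with_empty_extract, and only fetch the URLs from fallback_urls for the pages that need them.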

receive Excel data and turn into objects to format a JSON

I have this solution that helps me create a wizard to fill in some data and turn it into JSON. The problem now is that I have to receive an xlsx file and turn specific data from it into JSON: not all the data, only the fields I want, which are documented in the last link.
In this link: https://stackblitz.com/edit/xlsx-to-json I can access the Excel data and turn it into an object (when I print document.getElementById('output').innerHTML = JSON.parse(dataString); it shows [object Object])
I want to implement this solution and automatically get the fields specified in config.ts, but I can't get it to work. For now, I have these in my HTML and app-component.ts
https://stackblitz.com/edit/angular-xbsxd9 (It's probably not compiling but it's to show the code only)
It wasn't quite clear what you were asking, so I'm assuming that what you are trying to do is:
Take the data in the spreadsheet that is uploaded
Use a config that holds the list of column names you want returned in the JSON when the user clicks to download
Based on this, I've created a fork of your sample here -> Forked StackBlitz
What I've done is:
Use the map operator on the array returned from the sheet_to_json method.
Within the map, loop through each key of the record (each key being a column in this case).
If a column in the row is defined in the propertymap file (config), then return it.
This approach strips out all columns you don't care about up front, so that by the time the user clicks to download the file, only the columns you want are returned. If you need to maintain the original columns, you can move this logic somewhere more convenient for you.
I also augmented the property map a little to give you more granular control over how the data is formatted in the returned JSON, e.g. so that numbers are not treated as strings in the final output. You can use this as a template if it suits your needs for any additional formatting.
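The filtering step described above can be sketched roughly as follows. This is a language-neutral illustration written in Python rather than the TypeScript of the fork, and the column names in PROPERTY_MAP and the helper pick_columns are hypothetical, chosen only for this example. It assumes each row is a plain mapping from column name to cell value, which is roughly what sheet_to_json produces.

```python
import json

# Hypothetical property map: column name -> converter for the output value.
PROPERTY_MAP = {
    "Name": str,
    "Quantity": int,
    "Price": float,
}


def pick_columns(rows, property_map):
    """Keep only the mapped columns of each row, converting each value."""
    return [
        {
            column: property_map[column](value)
            for column, value in row.items()
            if column in property_map
        }
        for row in rows
    ]


# Example rows as they might come out of the spreadsheet (made-up data).
rows = [
    {"Name": "Widget", "Quantity": "3", "Price": "9.5", "Notes": "ignore me"},
]
print(json.dumps(pick_columns(rows, PROPERTY_MAP)))
```

Keeping the converter next to the column name is what gives the finer control over formatting: numbers stay numbers in the final JSON instead of being written out as strings.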
Hope it helps.

rest api response format

Should I treat every API response as a "resource" and return a JSON object, or would a simple array be appropriate as well?
For instance, are all of the responses below valid?
GET /rest/someresource should return collection of ids
[{id:1},{id:2}]
{{id:1},{id:2}}
[1,2]
GET /rest/someresource?id>0 search for ids bigger than zero and return collection of ids
[{id:1},{id:2}]
{{id:1},{id:2}}
[1,2]
Collection Resources
It is acceptable to return an array of resources - either a list of ids, or object structures - such a thing is commonly known as a 'collection' resource.
See http://51elliot.blogspot.com.au/2014/06/rest-api-best-practices-4-collections.html for an examination of resources and collections.
While not required by REST, it's common to use a plural noun to refer to a collection resource - e.g.
/rest/someresources
REST also requires the use of defined media types, and there are a couple available to assist with collections, e.g.:
Collection+JSON
Provides a structure with metadata around a list of items, wherein you define the structure of each item as your resource
HAL
Provides a structure with embedded collections and embedded resources
And many more
All provide a defined structure for including hypermedia links for your resource, or for each resource in your collection. If you are doing REST, hypermedia is one of the constraints the architectural style requires you to satisfy (even though many people don't).
Your Proposed JSON Structures
Some more specific comments on your proposed JSON structures:
Option 2 is not valid JSON. Consider:
{{id:1},{id:2}}
A JSON object must consist of name:value pairs, e.g.
{somename:{id:1},someothername:{id:2}}
would have a valid structure - but would not be very useful!
Also, strictly speaking, JSON requires every name to be enclosed in double quotes, and every string value as well.
So if you don't want to use a commonly used media type as referenced above, your options are 1 or 3, which should be written as:
[{"id":1},{"id":2}]
[1, 2]
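If you want to check the candidates yourself, a strict JSON parser will tell you quickly. Here is a small sketch using Python's standard json module; the helper name is_valid_json is just for this example.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


candidates = [
    '[{id:1},{id:2}]',          # option 1 as written: names are not quoted
    '{{id:1},{id:2}}',          # option 2: an object without names
    '[1,2]',                    # option 3
    '[{"id":1},{"id":2}]',      # option 1 with quoted names
]
for text in candidates:
    print(text, "->", is_valid_json(text))
```

Only the last two candidates parse, which matches the points above: option 2 is structurally invalid, and option 1 needs its names quoted.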
Both are valid, however option 1 will give you more flexibility to add more properties to each element of the array if you decide in the future you would like to return more than an id. e.g. at some point in the future you might decide to return:
[{"id":1,"name":"fred"},{"id":2,"name":"wilma"}]
Option 3 will only ever be able to return a list of ids.
So personally I would go with option 1.
Depends on how RESTful you're aiming to be.
In addition to what @Chris Simon said, I'll add that if the server only returned IDs at GET /rest/someresource, the client would have to repeatedly call something like GET /rest/someresource/{id} to obtain data it can display in the UI, right? That in turn would just increase the load on the server. If the ID alone is enough, you can probably get away with the proposed solution.
Also, once you decide, you'd better be consistent.
Given that the 2nd option is not even valid, and the last is pretty limiting, I'd also go for the first option, the array of JSON objects.
Just to make it clear, we are talking about different representations of the same resource here:
For GET /rest/someresource, both [{id:1},{id:2}] and [1,2] are valid responses, but the client should make clear which one it wants to see, e.g. with the Prefer header. So with Prefer: return=minimal you would return [1,2], and if the header is not present, [{id:1},{id:2}]. Just make sure that Prefer is listed in the Vary header, or you will have caching troubles.
With GET /rest/someresource?id>0 you filter your collection. So either the /rest/someresource?id>0 URI identifies a different, filtered collection resource, or it identifies the same collection resource and the filter query string indicates that the client expects a filtered representation of the resource rather than the full one. You can use the same approach for the minimal representation if you don't want to use the Prefer header: GET /rest/someresource?return=minimal.
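A rough server-side sketch of that negotiation, in Python and independent of any web framework. The function names, the RESOURCES data and the plain-dict representation of headers and query parameters are assumptions made for this example. The Prefer parsing is deliberately simplified: it ignores preference parameters and assumes the header name arrives in exactly this capitalisation.

```python
import json

# Hypothetical collection; a real service would load this from storage.
RESOURCES = [{"id": 1}, {"id": 2}]


def wants_minimal(headers: dict, query: dict) -> bool:
    """True if the client asked for the minimal representation."""
    prefer = headers.get("Prefer", "")
    preferences = [p.strip().lower() for p in prefer.split(",")]
    return "return=minimal" in preferences or query.get("return") == "minimal"


def render_collection(headers: dict, query: dict):
    """Return (response_headers, body) for GET /rest/someresource."""
    if wants_minimal(headers, query):
        body = [resource["id"] for resource in RESOURCES]
    else:
        body = RESOURCES
    response_headers = {
        "Content-Type": "application/json",
        # Tell caches that the representation depends on the Prefer header.
        "Vary": "Prefer",
    }
    return response_headers, json.dumps(body)
```

The Vary header is sent on every response, not only on the minimal one, because a cache has to know about the dependency before it stores either representation.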
Note that if you want your client to query again, then you should send them hyperlinks in your response. The REST client must get the URIs (or URI templates) from these hyperlinks and it should not start to build URIs on its own.

Finding a particular string in HTML

I need to extract data from a website and display it to the user. I'm receiving HTML, and I need to find a particular number inside it.
For example the string would be : "Canada = 50, USA = 60, France = 70". I need to search for "Canada" and find only the number 50.
I've been searching online for how to actually search the returned string of HTML and can't seem to get anything to work.
I don't know how this could be done in an app, since you want the app to look for specific words in a text file.
However, I know this can be done with data analysis tools like R, which can filter large amounts of text to create word clouds.
http://georeferenced.wordpress.com/2013/01/15/rwordcloud/
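For the specific search described in the question, a regular expression is probably enough. The sketch below is in Python and only illustrates the pattern; the helper name find_value is made up, and it assumes the text really has the "Name = number" shape shown in the question. Most platforms ship a regex engine that accepts a similar pattern.

```python
import re


def find_value(text: str, name: str):
    """Return the integer that follows '<name> =' in text, or None."""
    pattern = rf"\b{re.escape(name)}\s*=\s*(\d+)"
    match = re.search(pattern, text)
    return int(match.group(1)) if match else None


html = "<p>Canada = 50, USA = 60, France = 70</p>"
print(find_value(html, "Canada"))
```

If the surrounding HTML is more complicated than this, parsing it with a proper HTML parser first and running the pattern on the extracted text will be more robust than matching against the raw markup.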

Advice required on correct RESTful response format for the "deepest" resource item

I have a "conceptual" question about designing a RESTful API which returns and accepts data in JSON format.
Consider following requests and responses:
GET http://host/records/12345
{ "id":"12345", "address":{"street":"main street","number":5,"city":"springfield"}}
GET http://host/records/12345/address
{"street":"main street","number":5,"city":"springfield"}
GET http://host/records/12345/address/city
{"city":"springfield"}
OR
springfield (=not valid json)
I realize that the second answer isn't a valid JSON response, so I presume the first is the correct answer to my question. However, it looks redundant to me to respond in the form of a key/value pair, since the requester already knew the "key" when making the request.
The same goes for updates:
When I want to update the city of my 12345 record with another value, which would be more correct to submit:
PUT http://host/records/12345/address/city
{"city":"paris"} <- content of body when submitting
OR
paris <- content of body when submitting (=not valid json)
The reason I'm asking is that the following would already be enough:
PUT http://host/records/12345/address
{"city":"paris"} <- content of body when submitting
What would be considered to be most appropriate approach to this?
Thanks,
Jay
REST APIs generally work on resources, which loosely translate to objects or tables in a database. Your first example of a GET does not indicate that you are trying to get a resource of type "address". If you later want to add other resources to your API, for example "companies", this would not be clear. There should also be a way to get a list of all of the addresses. To get all of the addresses, the API call would look like
GET http://host/records/address
[{"id":"12345", "street":"main street","number":5,"city":"springfield"},
{"id":"12346", "street":"foo street","number":1,"city":"alexandria"}]
To get a specific address it would look like
GET http://host/records/address/12345
{"id":"12345", "street":"main street","number":5,"city":"springfield"}
That id is part of the address object and I do not see any need to break it out into a parent object as in your example. You then use that id to let your web service know what needs to be updated. So your update would look like this.
PUT http://host/records/address
{"id":"12345", "street":"main street","number":5,"city":"paris"}
Usually the client would send the whole object over and not just the fields to update.
If you really want to do this "micro-PUT" style of updating then consider just sending the body using the text/plain media type. One of the beauties of using HTTP is that you can freely mix and match media types to use what is the most appropriate.
PUT http://host/records/12345/address/city
Content-Type: text/plain
Content-Length: 5
paris
=>
200 OK
Be warned though: HTTP is optimized for working with coarse-grained resources. If you see your users wanting to do these kinds of small updates frequently, then maybe you need to reconsider the approach.
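On the client side, the "micro-PUT" with a text/plain body could be built like this. It is a minimal sketch using Python's standard urllib; the helper name build_city_update is hypothetical, and http://host is the placeholder host from the question. The request is only constructed here, not sent.

```python
from urllib.request import Request


def build_city_update(base_url: str, record_id: str, city: str) -> Request:
    """Build a PUT request whose body is the bare city name as plain text."""
    return Request(
        f"{base_url}/records/{record_id}/address/city",
        data=city.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )


request = build_city_update("http://host", "12345", "paris")
print(request.get_method(), request.full_url, request.data)
```

Passing the request to urllib.request.urlopen would send it; urllib fills in Content-Length from the body at that point, so it does not need to be set by hand.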

Facebook JSON API question

I am using the Facebook Graph API (JSON) to retrieve all the posts from a particular group. I can access all the posts without authentication. However, if a post has more comments (say 10), the JSON output shows only 3 of them. The "Count" key in the JSON output does give the value 10 (meaning 10 comments for the post), but only 3 are displayed. How can I resolve this problem?
Any help is greatly appreciated!
In that JSON data set there is a key named 'paging' that gives you the next and previous sets ('next' and 'previous' respectively). You can use those URLs to traverse all comments.
See the section named 'Paging' in the docs here: http://developers.facebook.com/docs/api?ref=mf for more stuff (like limits and time queries)
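Following the 'next' URLs can be sketched as a small loop. This Python example is only an illustration: it assumes each page is a parsed JSON object with a 'data' list and an optional 'paging' object, and the fetch argument is a hypothetical callable you supply that takes a URL and returns the parsed JSON of that page.

```python
def iter_all_items(first_page: dict, fetch):
    """Yield every item, following 'paging.next' until it is absent."""
    page = first_page
    while page is not None:
        yield from page.get("data", [])
        next_url = page.get("paging", {}).get("next")
        # Stop when the current page has no 'next' link.
        page = fetch(next_url) if next_url else None


# Made-up pages standing in for real API responses.
fake_pages = {
    "page2": {"data": [{"id": "c4"}, {"id": "c5"}], "paging": {"next": "page3"}},
    "page3": {"data": [{"id": "c6"}]},
}
first = {
    "data": [{"id": "c1"}, {"id": "c2"}, {"id": "c3"}],
    "paging": {"next": "page2"},
}
for comment in iter_all_items(first, fake_pages.__getitem__):
    print(comment["id"])
```

In real use, fetch would perform the HTTP GET for the given URL and decode the response body as JSON.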