HATEOAS with HAL and links to embedded resources - json

I think the answer to this question is great because it explains a lot about HAL: How to handle nested resources with JSON HAL?
However, it does not fully answer the question (at least for me). Assume we have an /employees resource that returns a list of all employees. I want the employees embedded, but with only some basic information (not the full employee). This is OK according to the above answer and the spec.
So what would _links look like? Let's simplify the example and assume there is no paging:
GET /employees
{
"_links": {
"self": { "href": "/employees" },
"employees": { "href": "/employees/{id}", "templated": true }
},
"_embedded": {
"employees": [{
"id": "1",
"fullname": "bla bli",
"_links": { ... }
},
{
"id": "2",
"fullname": "djsjsdj",
"_links": { ... }
}]
}
}
Does the templated "employees" URL make sense, or is this a case where you would not put any entry in _links? And if the URL is OK: is it necessary that the template parameter (here "id") matches the attribute in the embedded employee objects?

My heuristic is to consider the analogs in HTML - if it's OK for a web page, then it will also be OK for HAL.
"employees": { "href": "/employees/{id}", "templated": true }
What's the HTML analog? It's a form with a GET action. Can we have a form with a get action on a web page that also has digests of the information that will be reached via the form? Of course. So it must be fine here.
is it necessary that the template parameter (here "id") does match the attribute in the embedded employee objects?
I don't think it's necessary (the machines don't really care), but it's going to make life easier for the humans, and that alone has value.
Imagine, if you will, reading the documentation of a schema, and discovering that the same semantic concept (an identifier for an employee) has two different names with unrelated spellings. I would guess that (a) it would introduce avoidable errors in the documentation when authors get confused about which spelling context they are in, and (b) it's the sort of inconsistency that would make me suspicious of the quality of the specification as a whole.
But there may well be tradeoffs, with other benefits that outweigh these liabilities.
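To make the "machines don't really care" point concrete, here is a minimal sketch of a client expanding the templated link with values taken from the embedded employees. The helper name expand_template is hypothetical, and it only handles simple {name} placeholders, not the full URI Template syntax. The template variable matching the embedded "id" attribute is what lets the client pass each embedded object straight in.

```python
import re
from urllib.parse import quote


def expand_template(href: str, values: dict) -> str:
    """Expand simple {name} placeholders (hypothetical helper, not full RFC 6570)."""
    def substitute(match):
        name = match.group(1)
        # Raises KeyError if the template variable is missing from values.
        return quote(str(values[name]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, href)


# The response from the question, with the elided per-employee _links left out.
response = {
    "_links": {
        "self": {"href": "/employees"},
        "employees": {"href": "/employees/{id}", "templated": True},
    },
    "_embedded": {
        "employees": [
            {"id": "1", "fullname": "bla bli"},
            {"id": "2", "fullname": "djsjsdj"},
        ]
    },
}

link = response["_links"]["employees"]
employee_urls = [
    expand_template(link["href"], employee) if link.get("templated") else link["href"]
    for employee in response["_embedded"]["employees"]
]
print(employee_urls)
```

If the template used a different variable name than the embedded attribute, the client would need an extra mapping step, which is the kind of avoidable friction described above.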

Related

Wikipedia API: how to parse content text into JSON?

EDIT
Not sure what to do because I realized the question I originally asked was irrelevant to what I really wanted, because I thought the descriptionurl and shortdescriptionurl from a Wikipedia API query of an image file would return text that described the image, but really they're just descriptions of the URL, so I feel dumb about that.
I tried to delete the question but it wouldn't let me, because there's already an answer.
So I'm going to change the question to what I really want to know, but now the answer that already exists will not make any sense, so this is kind of a mess but I don't know what to do about it.
What I actually wanted to know
When I do this:
https://en.wikipedia.org/w/api.php?action=query&pageids=18306940&prop=revisions&formatversion=2&rvprop=content
I get this:
{
"batchcomplete": true,
"query": {
"pages": [
{
"pageid": 18306940,
"ns": 6,
"title": "File:Rot-Weiss Essen Fans, May 2008.jpg",
"revisions": [
{
"contentformat": "text/x-wiki",
"contentmodel": "wikitext",
"content": "== Summary ==\n{{Information\n|Description=Fans of Rot-Weiss Essen are celebrating a 1-0 away victory against 1. FC Magdeburg in the 2007/08 Regionalliga Nord.\n|Source=I created this work entirely by myself.\n|Date=May 24, 2008\n|Author=[[User:Povldr|Povldr]] ([[User talk:Povldr|talk]])\n|other_versions=\n}}\n== Licensing: ==\n{{self|cc-by-sa-3.0|GFDL}}\n\n{{Copy to Wikimedia Commons|bot=Fbot|priority=true}}"
}
]
}
]
}
}
What I'd like to do is have the query return only these parts of the content:
Fans of Rot-Weiss Essen are celebrating a 1-0 away victory against 1. FC Magdeburg in the 2007/08 Regionalliga Nord. (the Description)
May 24, 2008 (the Date)
Povldr (the Author)
I could just get all that out of the content string by chopping up the string in C#, but is there any way to get it spit back to me formatted as nice little JSON in the first place?
I haven't been able to figure this out from The Wikipedia API page on the parse action, nor from the Wikipedia API Sandbox.
Can it be done?
Here is the old question, which was asking the wrong thing
title was: Wikipedia API: how do I use descriptionurl and shortdescriptionurl?
When I do this, for example:
https://en.wikipedia.org/w/api.php?action=query&list=allimages&aiprop=url&date&format=json&ailimit=1&aifrom=rot
...one of the pieces of JSON info is called "descriptionurl," and another is "shortdescriptionurl."
When I type those urls into a browser, it just takes me to the image's entire page.
How do I use those urls to get just the text of the actual description and short description?
Oh, and before you just type the link to the Wikipedia API, I have been trying to find out this information on there and failing. It's full of general information but I can't find this specific thing.
When I put your URL in a browser, I get some nice JSON as expected:
{
"warnings": {
"main": {
"*": "Unrecognized parameter: date."
}
},
"batchcomplete": "",
"continue": {
"aicontinue": "Rot-Weiss_Essen_logo.svg",
"continue": "-||"
},
"query": {
"allimages": [{
"name": "Rot-Weiss_Essen_Fans,_May_2008.jpg",
"url": "https://upload.wikimedia.org/wikipedia/en/5/5c/Rot-Weiss_Essen_Fans%2C_May_2008.jpg",
"descriptionurl": "https://en.wikipedia.org/wiki/File:Rot-Weiss_Essen_Fans,_May_2008.jpg",
"descriptionshorturl": "https://en.wikipedia.org/w/index.php?curid=18306940",
"ns": 6,
"title": "File:Rot-Weiss Essen Fans, May 2008.jpg"
}]
}
}
To extract an individual entry, you'll need to parse the JSON with your programming language of choice.
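For the edited question (pulling Description, Date and Author out of the revision content), one option is to parse the wikitext on the client. The following is only a sketch under assumptions: it expects each parameter of the Information template on its own line, as in the sample above, and the helper names information_fields and strip_wikilinks are made up for illustration. Templates with nested or multi-line values would need a real wikitext parser.

```python
import json
import re

# Copy of the "content" string from the response shown in the question
# (the trailing Commons template is left out). In a real client this would
# come from response["query"]["pages"][0]["revisions"][0]["content"].
content = (
    "== Summary ==\n{{Information\n"
    "|Description=Fans of Rot-Weiss Essen are celebrating a 1-0 away victory "
    "against 1. FC Magdeburg in the 2007/08 Regionalliga Nord.\n"
    "|Source=I created this work entirely by myself.\n"
    "|Date=May 24, 2008\n"
    "|Author=[[User:Povldr|Povldr]] ([[User talk:Povldr|talk]])\n"
    "|other_versions=\n}}\n"
    "== Licensing: ==\n{{self|cc-by-sa-3.0|GFDL}}"
)


def information_fields(wikitext: str) -> dict:
    """Return the |key=value parameters of the first {{Information}} block."""
    block = re.search(r"\{\{Information\s*\n(.*?)\n\}\}", wikitext, re.DOTALL)
    if block is None:
        return {}
    fields = {}
    for line in block.group(1).split("\n"):
        if line.startswith("|") and "=" in line:
            key, _, value = line[1:].partition("=")
            fields[key.strip()] = value.strip()
    return fields


def strip_wikilinks(text: str) -> str:
    """Turn [[target|label]] into label and [[target]] into target."""
    return re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)


fields = information_fields(content)
wanted = {
    key: strip_wikilinks(fields[key])
    for key in ("Description", "Date", "Author")
    if key in fields
}
print(json.dumps(wanted, indent=2))
```

With this sample the Author comes out as "Povldr (talk)", because the wikitext contains both a user link and a talk link; trimming the "(talk)" part would be an extra, template-specific step.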

Microsoft Academic API, Knowledge graph search -- ReferenceIDs always empty

I'm using the graph search method of the Microsoft Academic API to retrieve citation IDs and reference IDs for a paper. However, while retrieving citation IDs works, the reference IDs field is always empty, even for papers which should have linked references. For example, retrieving this publication through the API:
POST https://westus.api.cognitive.microsoft.com/academic/v1.0/graph/search?mode=json
Content-Type: application/json
Host: westus.api.cognitive.microsoft.com
Ocp-Apim-Subscription-Key: my-api-key
{
"path": "/paper",
"paper": {
"select": [
"OriginalTitle",
"CitationIDs",
"ReferenceIDs"
],
"type": "Paper",
"id": [2059999322]
}
}
yields this response (I shortened the CitationIDs list for the sake of legibility):
{
"Results": [
[
{
"CellID": 2059999322,
"CitationIDs": "[630584464,2053566310,2239657960,...]",
"OriginalTitle": "Biodistribution of colloidal gold nanoparticles after intravenous administration: Effect of particle size",
"ReferenceIDs": ""
}
]
]
}
One thing I've noticed is that the graph schema provided here (at the bottom of the page) doesn't match the schema shown here (some of the attributes were renamed, e.g. NormalizedPaperTitle -> NormalizedTitle), so I thought the field was perhaps renamed to something else.
What is the correct query to get reference IDs through the API?
It should be ReferencesIDs, not ReferenceIDs
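Taking the answer's field name at face value, a small sketch of building the corrected request body and decoding the ID lists might look like this. The function names are hypothetical, and the assumption that ReferencesIDs comes back string-encoded in the same way as CitationIDs in the sample response is not confirmed by the thread.

```python
import json


def build_paper_query(paper_id: int) -> dict:
    """Request body from the question, with ReferenceIDs renamed to ReferencesIDs."""
    return {
        "path": "/paper",
        "paper": {
            "select": ["OriginalTitle", "CitationIDs", "ReferencesIDs"],
            "type": "Paper",
            "id": [paper_id],
        },
    }


def decode_id_list(raw: str) -> list:
    """The sample response encodes ID lists as strings; "" means no IDs."""
    return json.loads(raw) if raw else []


body = json.dumps(build_paper_query(2059999322))
print(body)

# First three citation IDs from the (shortened) sample response.
print(decode_id_list("[630584464,2053566310,2239657960]"))
print(decode_id_list(""))
```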

Mongo DB query of complex json structure

Say I have a json structure like so:
{
"A":{
"name":"dog",
"foo":"bar",
"array":[
{"name":"one"},
{"name":"two"}
]
},
"B":{
"name":"cat",
"foo":"bar",
"array":[
{"name":"one"},
{"name":"three"}
]
}
}
I want to be able to do two things.
1: Query for any "name":* within "A.array".
2: Query for any "name":"one" within "*.array".
That is, any object within a specific document's array, and any specific object within any document's array.
I hope I have used proper terminology here, I am just starting to familiarize myself with a lot of these concepts. I have tried searching for an answer but am having trouble finding something like my case.
Thanks.
EDIT:
Since I still haven't really made progress on this, I'll just explain what I'm trying to do: I want to use the "AllSets" dataset (after I trim it down below 16 MB) available on mtgjson.com. I am having problems getting mongo to play nicely, though.
In an effort to try and learn what's going on, I have downloaded one set: http://mtgjson.com/json/OGW.json.
Here is a photo of its structure laid out:
I am unable to even get mongo to return an object from within the cards array using:
"find({cards: {$elemMatch: {name:"Deceiver of Form"}}})"
"find({"cards.name":"Deceiver of Form"})"
When I run either of the commands above it just returns the entire document to me.
You could use the positional projection $ operator to limit the contents of an array. For example, if you have a single document like below:
{
"block": "Battle for Zendikar",
"booster": "...",
"translations": "...",
"cards": [
{
"name": "Deceiver of Form",
"power": "8"
},
{
"name": "Eldrazi Mimic",
"power": "2"
},
{
"name": "Kozilek, the Great Distortion",
"power": "12"
}
]
}
You can query for a card name matching "Deceiver of Form", and limit fields to return only the matching array card element(s) using:
> db.collection.find({"cards.name":"Deceiver of Form"}, {"cards.$":1})
{
"_id": ObjectId("..."),
"cards": [
{
"name": "Deceiver of Form",
"power": "8"
}
]
}
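To see why the plain find() returned the whole document while the projected one does not, here is a plain-Python emulation of what the positional projection does for a single document. It is not MongoDB or driver code, and project_first_match is a hypothetical name; the point it illustrates is that the query selects whole documents, and the projection then keeps only the first array element that matched.

```python
from typing import Optional


def project_first_match(document: dict, array_field: str, key: str, value) -> Optional[dict]:
    """Mimic find({"<array>.<key>": value}, {"<array>.$": 1}) on one document."""
    for element in document.get(array_field, []):
        if element.get(key) == value:
            projected = {array_field: [element]}  # only the first match is kept
            if "_id" in document:
                projected = {"_id": document["_id"], **projected}
            return projected
    return None  # the document does not match the query at all


document = {
    "block": "Battle for Zendikar",
    "cards": [
        {"name": "Deceiver of Form", "power": "8"},
        {"name": "Eldrazi Mimic", "power": "2"},
        {"name": "Kozilek, the Great Distortion", "power": "12"},
    ],
}

print(project_first_match(document, "cards", "name", "Deceiver of Form"))
print(project_first_match(document, "cards", "name", "No Such Card"))
```

Because only the first matching element is returned, a query that should return several elements of the same array needs a different approach, for example an aggregation with $filter.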
Having said the above, I think you should reconsider your data model. MongoDB is a document-oriented database, and a record in MongoDB is a document. Keeping everything in a single record does not use the database's strengths; it is similar to storing all your data in a single row of a table.
You should try storing the cards in a collection of their own, where each document is a single card. Depending on your use case, you could add a reference to another collection containing the deck information (block, type, releaseDate, etc.). For example:
// a document in cards collection:
{
"name": "Deceiver of Form",
"power": "8",
"deck_id": 1
}
// a document in decks collection:
{
"deck_id": 1,
"releaseDate": "2016-01-22",
"type": "expansion"
}
For different types of data model designs and examples, please see Data Model Design.
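As a rough sketch of that remodelling step, the following splits one set document into a deck document and a list of card documents that reference it. The function name split_set and the choice of an integer deck_id are assumptions for illustration; the sample values are taken from the examples above.

```python
def split_set(set_document: dict, deck_id: int):
    """Split one set document into (deck_document, card_documents)."""
    # Each card becomes its own document with a reference to the deck.
    cards = [dict(card, deck_id=deck_id) for card in set_document.get("cards", [])]
    # Everything except the cards array stays on the deck document.
    deck = {key: value for key, value in set_document.items() if key != "cards"}
    deck["deck_id"] = deck_id
    return deck, cards


set_document = {
    "releaseDate": "2016-01-22",
    "type": "expansion",
    "cards": [
        {"name": "Deceiver of Form", "power": "8"},
        {"name": "Eldrazi Mimic", "power": "2"},
    ],
}

deck, cards = split_set(set_document, deck_id=1)
print(deck)
print(cards)
```

The two results could then be written to separate collections (for instance with insert_one and insert_many in a driver), after which a query such as {"name": "Deceiver of Form"} on the cards collection returns just that card.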

REST JSON API optional parameters design

Our goal is to develop an API where you can POST to /data/save/ with JSON data like the example below. The main requirement is that the JSON must contain exactly one of the following attributes:
"attribute1", "attribute2", "attribute3". That is, when one of these attributes is present, the others must not be.
{
"name": "test name",
"attribute1": [
"test1", "test2"
]
or
"attribute2": [
"test3", "test4"
]
or
"attribute3": true
}
The question is how to design such an API correctly, so that it is easy to use and not confusing for the client.
It would be good to know some best practices in this direction.
I would return a
400 Bad Request
The request could not be understood by the server due to malformed
syntax. The client SHOULD NOT repeat the request without
modifications.
and a phrase explaining that multiple attributes are not supported.
I agree that such an API is confusing for the client.
What about creating different endpoints:
POST /data/save/attribute1 json_1
POST /data/save/attribute2 json_2
A custom media type should clarify how to use your API. It should specify what to include in your request.
Another solution might be, building the request like this:
{
"name": "test name",
"attr-key": "my-attribute1",
"values": ["test1", "test2"]
}
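If the original single-endpoint design is kept, the server-side check behind the 400 response could look roughly like this. It is a framework-independent sketch: validate_save_request is a hypothetical name, the function returns a (status, message) pair instead of a real HTTP response, and using 200 for the success case is an assumption.

```python
EXCLUSIVE_ATTRIBUTES = ("attribute1", "attribute2", "attribute3")


def validate_save_request(payload: dict):
    """Return (status_code, message) for a POST /data/save/ body."""
    present = [name for name in EXCLUSIVE_ATTRIBUTES if name in payload]
    if len(present) == 1:
        return 200, "OK"
    if not present:
        return 400, "Exactly one of " + ", ".join(EXCLUSIVE_ATTRIBUTES) + " is required."
    return 400, "Multiple attributes are not supported, got: " + ", ".join(present)


print(validate_save_request({"name": "test name", "attribute1": ["test1", "test2"]}))
print(validate_save_request({"name": "test name"}))
print(validate_save_request({"name": "test name", "attribute1": [], "attribute3": True}))
```

With the "attr-key"/"values" shape, this check becomes unnecessary for the exclusivity rule, since the body can only name one attribute at a time.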

Design RESTful API using HAL - serialize model relationships

I'm relatively new to REST, but I've been doing my homework on what a RESTful API should look like. Now I'm trying to create a RESTful API with a JSON+HAL serializer for my models, which have relationships with other models.
Example models in python:
class Category(Model):
    name = CharField()
    parent = ManyToOneField(Category)
    categories = OneToManyField(Category)
    products = ManyToManyField(Product)

class Product(Model):
    name = CharField()
    price = DecimalField()
    related = ManyToManyField(Product)
    categories = ManyToManyField(Category)
Let's suppose we have a category "catalog" with a sub-category "food" containing the products "burger" and "hot-dog", which are related to each other.
First question. Categories and products should be resources, so they need a URI. Should I implement a uri field in my model and store it in the DB, or somehow calculate it at runtime? And what about multiple identifiers (URIs)?
Second question. Discoverability: in HAL format, what should "GET /" and the different nodes return to make the API easily self-discoverable?
{
"_links":{
"self":{
"href":"/"
},
"categories":[
{
"href":"/catalog"
}
]
}
}
Third question. Add as properties, embed or link. Example "GET /catalog/food":
{
"_links":{
"self":{
"href":"/catalog/food"
}
},
"name":"food",
"parent":"/catalog",
"categories":[],
"products":[
"/products/burger",
"/products/hot-dog"
]
}
{
"_links":{
"self":{
"href":"/catalog/food"
},
"parent":{
"href":"/catalog"
},
"categories":[
],
"products":[
{
"href":"/products/burger"
},
{
"href":"/products/hot-dog"
}
]
},
"name":"food"
}
{
"_links":{
"self":{
"href":"/catalog/food"
}
},
"name":"food",
"_embedded":{
"parent":{
"_links":{
"self":{
"href":"/catalog"
}
},
"name":"catalog",
...
},
"categories":[
],
"products":[
{
"_links":{
"self":{
"href":"/products/burger"
}
},
"name":"burger",
...
},
{
"_links":{
"self":{
"href":"/products/hot-dog"
}
},
"name":"hot-dog",
...
}
]
}
}
Fourth question. How deep should I go when returning structures? Example "GET /catalog":
{
"_links":{
"self":{
"href":"/catalog"
}
},
"name":"catalog",
"parent":null,
"categories":[
{
"name":"food",
"parent":{...},
"categories":[],
"products":[
{
"name":"burger",
"price":"",
"categories":[...],
"related":[...]
},
{
"name":"hot-dog",
"price":"",
"categories":[...],
"related":[...]
}
]
}
],
"products": []
}
About 1st question: I wouldn't store the URIs in the DB. You can easily calculate them inside your controller at runtime, and it is the controller's responsibility to care about URIs. This way you keep your model and your API decoupled, and should you decide to change the API structure in the future, you won't need to update your whole database with the new URIs.
About multiple identifiers, I'm not sure what the question is, but again, in my opinion, it has nothing to do with the DB; it's the routing and the controllers that should deal with any URIs.
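A minimal sketch of calculating the URI at runtime, assuming a stripped-down Category with only a name and a parent (a stand-in for the question's model, not the real ORM class). The function name category_uri and the "/categories" prefix are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Category:
    """Stand-in for the question's model: only the fields the URI needs."""
    name: str
    parent: Optional["Category"] = None


def category_uri(category: Category, prefix: str = "/categories") -> str:
    """Build the URI by walking up the parent chain; nothing is stored in the DB."""
    parts = []
    node = category
    while node is not None:
        parts.append(node.name)
        node = node.parent
    return prefix + "/" + "/".join(reversed(parts))


catalog = Category("catalog")
food = Category("food", parent=catalog)

print(category_uri(food))
print(category_uri(food, prefix="/api/categories"))
```

Changing the API structure then means changing one function (or the routing table) rather than rewriting stored values.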
About 2nd question: First of all, as a side note: I would keep the word categories as part of the URI. For example, I'd have http://domain.com/api/categories/catalog/food. This way, you make your API more descriptive and more "hackable", meaning that a user should be able to remove the /catalog/food part and expect to receive a collection with all the available categories.
Now, about what GET should return to allow discoverability: I think your URI structure already makes it clear. When a user hits GET /categories, they expect to get a list of the categories (the name and URI for each, to keep it lightweight), and when they follow one of the URIs, like GET /categories/catalog, they should receive the resource catalog, which is a category. Likewise, on GET /products/burger they should receive a product resource with all the attributes you have in your model. You may want to check this example about the structure of your responses.
About 3rd question: Again, the same example can help you form the structure. I think your 2nd form of response is closest to it, but I would also add a name field, not only the href.
About 4th question: When the GET request expects a collection of resources (like GET /categories), I would suggest providing only what is necessary for each resource, that is, the name and the URI; only when the user follows the desired URI do they receive the rest of the information.
In your example, catalog is a resource, so on GET /categories/catalog I would of course include the name of the resource (catalog) and its self link; for the parent, sub-categories and products related to it, I would provide just the name and the URI of each, to keep it light. But this was a general thought about designing APIs. In your actual case, you should decide based on your specific business problem. If your API is about a restaurant menu with categories and dishes, you may want to include the price or a short description even when responding with a collection of products rather than a single product, because that is probably important information for your users. So, generally, provide all the necessary info (only you know what that is for your problem) when responding with a list of resources, and provide all the details of the resource when responding about a specific resource.
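The "name and URI for each" idea can be sketched as a small serializer that puts related resources in _links with both href and name. The function names and the (href, name) tuple format are assumptions; the URIs follow the /categories prefix suggested above.

```python
def named_link(href: str, name: str) -> dict:
    """A HAL link object carrying a name next to the href."""
    return {"href": href, "name": name}


def category_to_hal(name, self_href, parent=None, subcategories=(), products=()):
    """Lightweight representation: related resources are links, not full objects.

    parent is an (href, name) tuple; subcategories and products are lists of them.
    """
    links = {"self": {"href": self_href}}
    if parent is not None:
        links["parent"] = named_link(*parent)
    links["categories"] = [named_link(href, label) for href, label in subcategories]
    links["products"] = [named_link(href, label) for href, label in products]
    return {"_links": links, "name": name}


food = category_to_hal(
    "food",
    "/categories/catalog/food",
    parent=("/categories/catalog", "catalog"),
    products=[("/products/burger", "burger"), ("/products/hot-dog", "hot-dog")],
)
print(food)
```

Fields such as a price or short description could be added to each entry when the business case calls for it, as discussed above.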
1. I would store something in the DB and calculate the URI at runtime. That way, if you move boxes, the URI is not static.
2. Create a 'bookmark' page. The page we created was just a list of links with their rels. I believe HAL defines that specifically. The bookmark page was the only page other pages needed to know about.
3. Not sure about this.
4. How deep you go is up to you. There is a big debate now at my place of work about fine grain vs. coarse grain. I'm going to do fine grain with small resources to keep the API simple, but then use the expandability concept. It's a combination of the idea of composite resources defined on pg 35 of Subbu’s REST book and the expand concept used by Netflix. http://developer.netflix.com/docs/REST_API_Conventions
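A rough sketch of fine-grained resources with opt-in expansion: by default related products are only linked, and they are embedded when the client asks for them. The expand query parameter name, its comma-separated format, and the function names are assumptions loosely modelled on the idea described above, not a documented convention of any particular API.

```python
def parse_expand(query_value: str) -> frozenset:
    """Parse a hypothetical ?expand=products,parent query value."""
    return frozenset(part for part in query_value.split(",") if part)


def render_category(category: dict, expand=frozenset()) -> dict:
    """Link related products by default; embed them only when expanded."""
    body = {"_links": {"self": {"href": category["href"]}}, "name": category["name"]}
    products = category.get("products", [])
    if "products" in expand:
        body["_embedded"] = {
            "products": [
                {"_links": {"self": {"href": p["href"]}}, "name": p["name"]}
                for p in products
            ]
        }
    else:
        body["_links"]["products"] = [{"href": p["href"]} for p in products]
    return body


food = {
    "href": "/catalog/food",
    "name": "food",
    "products": [
        {"href": "/products/burger", "name": "burger"},
        {"href": "/products/hot-dog", "name": "hot-dog"},
    ],
}

print(render_category(food))                            # GET /catalog/food
print(render_category(food, parse_expand("products")))  # GET /catalog/food?expand=products
```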