REST URLs: integer vs string and the behavior of PUT - json

This is a multi-part question:
1) Given a REST API with URLs containing natural numbers as path segments, is the generally expected behavior that the number be interpreted as an index or a key?
2) When performing a PUT against a deep resource path, is the generally expected behavior that the path be interpreted as a declaration of state, meaning that all non-existent resources along the path are created? Or should an error be returned if any resource along the path does not exist?
3) Expanding on question 2, if the path does exist but defines a resource structure different from the one that is present, should the preexisting resources be overwritten, again as a declaration of state, or should an error be returned indicating a type mismatch?
For example, consider the endpoint:
domain.tld/datasource/foo/2/bar/1/baz
foo is a string, and identifies a top level resource.
2 could be interpreted as either an index or a key.
bar is a string, interpreted as a key.
1 could be interpreted as either an index or a key.
baz is a string, interpreted as a key, pointing to a leaf node.
In other words, the data residing at domain.tld/datasource under the identifier foo could be any of the following:
index based:
[
    null,
    null,
    {
        'bar': [
            null,
            {'baz': null}
        ]
    }
]
key based:
{
    '2': {
        'bar': {
            '1': {'baz': null}
        }
    }
}
both index and key based:
{
    '2': {
        'bar': [
            null,
            {'baz': null}
        ]
    }
}
Question 1
Should 2 and 1 be considered integers or strings? As this is potentially impossible to know from the URL alone, is there a standard for type annotation in REST URLs that addresses this case? Some solutions on the whiteboard so far are as follows, with the assertion that 2 is a key and 1 is an index:
domain.tld/datasource/foo/2:str/bar/1:int/baz
    where :str indicates that the preceding value is a key
    and :int indicates that the preceding value is an index
domain.tld/datasource/foo/2/bar/1/baz?types=ki
    where k, being member 0 of types, maps to the first int-like segment, and indicates that the value is a key
    and i, being member 1 of types, maps to the second int-like segment, and indicates that the value is an index
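As a rough illustration of the first whiteboard idea, the sketch below parses the suffix-annotated path into typed segments. The function name parse_typed_path and the rule that unannotated segments are treated as keys are assumptions made for this example only; they are not part of any standard.

```python
from typing import List, Union

Segment = Union[int, str]


def parse_typed_path(path: str) -> List[Segment]:
    """Split a path and apply the hypothetical ':int' / ':str' suffixes."""
    segments: List[Segment] = []
    for raw in path.strip("/").split("/"):
        value, sep, kind = raw.rpartition(":")
        if sep and kind == "int":
            segments.append(int(value))  # annotated as a list index
        elif sep and kind == "str":
            segments.append(value)       # annotated as a key, even if it looks numeric
        else:
            segments.append(raw)         # assumption: unannotated segments are keys
    return segments


print(parse_typed_path("foo/2:str/bar/1:int/baz"))
# ['foo', '2', 'bar', 1, 'baz']
```

With this scheme the server never has to guess: an int in the parsed result means "index into a list", a str means "key into a mapping".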
Question 2
If none of the above data was present, should a PUT against this path create those resources or return an error? If an error is returned, should each resource at each level be created individually, requiring multiple PUTs against the path?
Question 3
If the data from the first illustration (index based) is already present should the data from the second illustration (key based) forcibly overwrite all data at all levels in the path or return an error indicating a type mismatch? The inference here being that again, multiple PUTs are required for any assignment that changes the type.
I'm probably over-complicating the issue or missing something basic but I haven't found much in the way of definitive guidance. I have complete control over the system and can enforce any rules I see fit. However, I'm interested in the experience, meaning interactions should be easy to reason about, logical, expected, deterministic, etc.
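The two PUT behaviours asked about in questions 2 and 3 can be sketched against an in-memory store. Everything here is hypothetical (the put function, the PathError exception, the create_missing flag, and the choice to treat null placeholders as "missing"); it only shows that "declaration of state" and "error on mismatch" are both simple, deterministic rules once segment types are known.

```python
from typing import Any, List, Union

Segment = Union[int, str]


class PathError(Exception):
    """Stands in for the error response the API would return."""


def put(root: dict, path: List[Segment], value: Any, create_missing: bool = True) -> None:
    node: Any = root
    for i, seg in enumerate(path):
        # An int segment needs a list (index); a str segment needs a dict (key).
        expected = list if isinstance(seg, int) else dict
        if not isinstance(node, expected):
            raise PathError(f"type mismatch at {seg!r}: found {type(node).__name__}")
        if isinstance(seg, int):
            exists = seg < len(node) and node[seg] is not None
        else:
            exists = seg in node and node[seg] is not None
        if i == len(path) - 1:
            child = value                      # leaf: always overwritten
        elif exists:
            child = node[seg]                  # descend into the existing resource
        elif create_missing:
            # Declaration of state: create the container the next segment implies.
            child = [] if isinstance(path[i + 1], int) else {}
        else:
            raise PathError(f"missing resource at {seg!r}")
        if isinstance(seg, int) and seg >= len(node):
            node.extend([None] * (seg + 1 - len(node)))  # pad the list with nulls
        node[seg] = child
        node = child


store: dict = {}
put(store, ["foo", 2, "bar", 1, "baz"], None)
print(store)
# {'foo': [None, None, {'bar': [None, {'baz': None}]}]}
```

Calling put with create_missing=False gives the stricter variant, where each level has to be created by its own PUT.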

From my point of view, you should never make something like 'deep resources' when trying to be 'RESTful' or 'RESTy'. I really don't see the benefit. It just makes the system much harder to understand, to use and to develop (e.g. see your questions :) ).
Why not keep it simple and have 'single' URLs for single resources? That way it is clear to the client what a PUT will do and what a DELETE will do.
So, just as an example, you could have the list resource endpoint domain.com/datasource, which returns a list of all foos registered. It returns a list of HREFs, like domain.com/foo/1. Besides some metadata, foo/1 could also include a list of bars, but again, they are not nested in the 'foo URI'; they are simple top-level resources, e.g. 'domain.com/bar/1'.
This way a client can easily delete, update and create items. You can link them by setting the correct links in the entities.
Regarding your questions 2 and 3: I think that totally depends on your system. If you see the link domain.com/datasource/foo/1/bar/2/baz as ONE big resource, meaning the response will include information not only about baz but also about bar, foo and datasource, then yes, a PUT would 'recreate' (fully update) the resource. If that link "only" returns information about baz, a PUT would fully update only this resource.

Returning different JSON results for the same request - is this a violation of REST?

Note the following from Roy Fielding concerning REST design guidelines and principles.
5.2.1.1 Resources and Resource Identifiers
The key abstraction of information in REST is a resource. Any
information that can be named can be a resource: a document or image,
a temporal service (e.g. "today's weather in Los Angeles"), a
collection of other resources, a non-virtual object (e.g. a person),
and so on. In other words, any concept that might be the target of an
author's hypertext reference must fit within the definition of a
resource.
A resource is a conceptual mapping to a set of entities, not the
entity that corresponds to the mapping at any particular point in
time.
More precisely, a resource R is a temporally varying membership
function M_R(t), which for time t maps to a set of entities, or values,
which are equivalent. The values in the set may be resource
representations and/or resource identifiers. A resource can map to the
empty set, which allows references to be made to a concept before any
realization of that concept exists -- a notion that was foreign to
most hypertext systems prior to the Web [61]. Some resources are
static in the sense that, when examined at any time after their
creation, they always correspond to the same value set. Others have a
high degree of variance in their value over time.
The only thing that is required to be static for a resource is the
semantics of the mapping, since the semantics is what distinguishes
one resource from another.
The key points have been bolded, the rest of the paragraph I have included is for context.
Here is the scenario.
I have a web API that has an endpoint: http://www.myfakeapi.com/people
When a client does a GET request to this endpoint, they receive back a list of people.
Person
{
    "Name": "John Doe",
    "Age": "23",
    "Favorite Color": "Green"
}
Ok, well that's cool.
But is it against REST design practices and principles if I have a 'Person' who does not have a Favorite Color and I want to return them like this:
Person
{
    "Name": "Bob Doe",
    "Age": "23"
}
Or should I return them like this:
Person
{
    "Name": "Bob Doe",
    "Age": "23",
    "Favorite Color": null
}
The issue is that the client requesting the resource has to do extra work to see whether the property even exists in the first place. Some Persons have favorite colors and some don't. Is it against REST principles to just omit the JSON property 'Favorite Color' when it doesn't exist, or should that property be given a null or blank value?
What does REST say about this? I am thinking that I should give back a null and not change the representation of the resource the client is requesting by omitting properties.
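To make the "extra work" concrete, here is a small client-side sketch using only Python's standard json module. The payloads are the two Bob Doe variants from the question; the variable names are made up for the example.

```python
import json

with_null = json.loads('{"Name": "Bob Doe", "Age": "23", "Favorite Color": null}')
omitted = json.loads('{"Name": "Bob Doe", "Age": "23"}')

# A tolerant lookup hides the difference: both variants yield None.
print(with_null.get("Favorite Color"), omitted.get("Favorite Color"))
# None None

# Only an explicit membership test tells the two representations apart.
print("Favorite Color" in with_null, "Favorite Color" in omitted)
# True False

# A strict lookup works for the null variant but fails for the omitted one.
try:
    omitted["Favorite Color"]
except KeyError as err:
    print("missing property:", err)
```

A client that always uses the tolerant lookup can consume either representation; a client that indexes directly breaks as soon as the property is omitted.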
Off the top of my head I can't think of any REST constraints that this violates (here's a link to a brief overview if you're interested). It also doesn't violate idempotency for a GET request. However, it is still bad practice.
The consumer of your API should know what to expect and ideally this should be well documented (I like using Swagger a lot for this). Any changes in what to expect should be communicated to consumers, possibly in the form of release notes. Changes that could potentially be breaking for your consumer should be delivered in a new version of your API.
Since your Person1 and Person2 are technically different object structures, that could be breaking in itself (let's face it, we don't always find the edge cases as devs). You don't just want your API to work on a basic level and to hell with the end users - you want to design it with the end-consumer in mind so that their lives are made easier.
There are various ways to deal with this, depending on the use case. I'll list them one by one.
1) Prefer enums (only if it makes sense to your use case)
{
    "Name": "Bob Doe",
    "Age": "23",
    "Favorite Color": "NO_COLOR"
}
When you know the possible values of a property up front, define a set of enum constants and assign a default value if the property does not apply to the user. This helps in a few ways:
Your client knows the possible values, so they can prepare their client system accordingly.
A default enum constant conveys that the value of the field was successfully retrieved, whether from persistent storage or from another remote service, and that it holds the default because the property does not apply to the user or the user has no value for it.
By avoiding the null pattern, client code becomes more resilient, and the client can write code for the default enum constant.
When you start to serve more users, you may need to add a few more enum constants, which may not apply to every client. Clients that receive an enum value they don't know can handle it in their parsing libraries and convert it into something that suits the client application's design. In Jackson, we can use DeserializationFeature.READ_UNKNOWN_ENUM_VALUES_AS_NULL for this.
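The same idea, sketched in Python rather than Jackson: a client-side enum with a NO_COLOR default, plus a parser that maps values it does not recognise to None instead of failing. The FavoriteColor members and the parse_color helper are invented for this illustration; only the Enum behaviour (a ValueError on unknown values) comes from the standard library.

```python
from enum import Enum
from typing import Optional


class FavoriteColor(Enum):
    GREEN = "GREEN"
    BLUE = "BLUE"
    NO_COLOR = "NO_COLOR"  # default constant: property does not apply


def parse_color(raw: str) -> Optional[FavoriteColor]:
    """Return the matching constant, or None for values this client predates."""
    try:
        return FavoriteColor(raw)
    except ValueError:
        return None


print(parse_color("GREEN"))     # FavoriteColor.GREEN
print(parse_color("NO_COLOR"))  # FavoriteColor.NO_COLOR
print(parse_color("MAUVE"))     # None
```

The server can then add new constants later without breaking clients that were written against the older list.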
2) Use Null - Do not create enum constants for everything
There are cases where a null value is perfectly valid. For instance, in the example below, it makes sense to use null if there is no favourite quote.
{
    "Name": "Bob Doe",
    "Age": "23",
    "Favorite Quote": null
}
3) Document your required properties clearly
If you use Swagger for your REST API documentation, you can mark mandatory properties as required. The ones not marked are optional. That way, the client will be prepared to handle them being null or an empty string. (The same should apply to other API documentation tools.)
Bad practice:
I have noticed that some people send errors in the same response model they use for a successful 200 response. Refer to this question & answer. This is definitely a bad practice. Don't mix two different responses and mark one property as optional; use status codes to convey any problems. I'm not talking about partial responses here.
4) Add/Modify properties (as long as you're not breaking a contract with the client)
Say the Favorite Color property is added later and currently you're sending the following response to your client. You will publish your new contract to your clients when you add Favorite Color, but your clients should have fail-safe code that handles unknown properties. In Jackson, this means disabling DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, which is enabled by default. Non-breaking changes do not necessarily require a v2.
Person
{
    "Name": "Bob Doe",
    "Age": "23"
}
So, the answer to your question is that you should start with the first three options while you design your REST API; you don't need to omit any properties. But you may need to add a few properties later (covered in #4), which is perfectly fine.
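A fail-safe client of the kind described in #4 might look like the sketch below. The Person dataclass and person_from_json are illustrative names, not part of any library; the point is only that the client reads the properties it knows and ignores the rest, so a newly added Favorite Color does not break it.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: str


def person_from_json(text: str) -> Person:
    data = json.loads(text)
    # Read only the properties this client knows about; anything the
    # server adds later is simply ignored.
    return Person(name=data["Name"], age=data["Age"])


old_contract = '{"Name": "Bob Doe", "Age": "23"}'
new_contract = '{"Name": "Bob Doe", "Age": "23", "Favorite Color": "Green"}'

print(person_from_json(old_contract) == person_from_json(new_contract))
# True
```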

Perl6: Convert Match object to JSON-serializable Hash

I am currently gettin' my hands dirty on some Perl6. Specifically I am trying to write a Fortran parser based on grammars (the Fortran::Grammar module)
For testing purposes, I would like to have the possibility to convert a Match object into a JSON-serializable Hash.
Googling / official Perl6 documentation didn't help. My apologies if I overlooked something.
My attempts so far:
I know that one can convert a Match $m to a Hash via $m.hash. But this keeps nested Match objects.
Since this just has to be solvable via recursion, I tried but gave up in favor of asking here first whether a simpler or existing solution exists.
Dealing with Match objects' contents is obviously best accomplished via make/made. I would love to have a super simple Actions object to hand to .parse with a default method for all matches that basically just does a make $/.hash or something the like. I just have no idea on how to specify a default method.
Here's an action class method from one of my Perl 6 projects, which does what you describe.
It does almost the same as what Christoph posted, but is written more verbosely (and I've added copious amounts of comments to make it easier to understand):
#| Fallback action method that produces a Hash tree from named captures.
method FALLBACK ($name, $/) {
    # Unless an embedded { } block in the grammar already called make()...
    unless $/.made.defined {
        # If the Match has named captures, produce a hash with one entry
        # per capture:
        if $/.hash -> %captures {
            make hash do for %captures.kv -> $k, $v {
                # The key of the hash entry is the capture's name.
                $k => $v ~~ Array
                    # If the capture was repeated by a quantifier, the
                    # value becomes a list of what each repetition of the
                    # sub-rule produced:
                    ?? $v.map(*.made).cache
                    # If the capture wasn't quantified, the value becomes
                    # what the sub-rule produced:
                    !! $v.made
            }
        }
        # If the Match has no named captures, produce the string it matched:
        else { make ~$/ }
    }
}
Notes:
This totally ignores positional captures (i.e. those made with ( ) inside the grammar) - only named captures (e.g. <foo> or <foo=bar>) are used to build the Hash tree. It could be amended to handle them too, depending on what you want to do with them. Keep in mind that:
$/.hash gives the named captures, as a Map.
$/.list gives the positional captures, as a List.
$/.caps (or $/.pairs) gives both the named and positional captures, as a sequence of name=>submatch and/or index=>submatch pairs.
It allows you to override the AST generation for specific rules, either by adding a { make ... } block inside the rule in the grammar (assuming that you never intentionally want to make an undefined value), or by adding a method with the rule's name to the action class.
"I just have no idea on how to specify a default method."
The method name FALLBACK is reserved for this purpose.
Adding something like this
method FALLBACK($name, $/) {
    make $/.pairs.map(-> (:key($k), :value($v)) {
        $k => $v ~~ Match ?? $v.made !! $v>>.made
    }).hash || ~$/;
}
to your actions class should work.
For each named rule without an explicit action method, it will make either a hash containing its subrules (either named ones or positional captures), or if the rule is 'atomic' and has no such subrules the matching string.

REST service semantics; include properties not being updated?

Suppose I have a resource called Person. I can update Person entities by doing a POST to /data/Person/{ID}. Suppose for simplicity that a person has three properties, first name, last name, and age.
GET /data/Person/1 yields something like:
{ id: 1, firstName: "John", lastName: "Smith", age: 30 }.
My question is about updates to this person and the semantics of the services that do this. Suppose I wanted to update John, he's now 31. In terms of design approach, I've seen APIs work two ways:
Option 1:
POST /data/Person/1 with { id: 1, age: 31 } does the right thing. Implicitly, any property that isn't mentioned isn't updated.
Option 2:
POST /data/Person/1 with the full object that would have been received by GET -- all properties must be specified, even if many don't change, because the API (in the presence of a missing property) would assume that its proper value is null.
Which option is correct from a recommended design perspective? Option 1 is attractive because it's short and simple, but has the downside of being ambiguous in some cases. Option 2 has you sending a lot of data back and forth even if it's not changing, and doesn't tell the server what's really important about this payload (only the age changed).
Option 1 - updating a subset of the resource - is now formalised in HTTP as the PATCH method. Option 2 - updating the whole resource - is the PUT method.
In real-world scenarios, it's common to want to upload only a subset of the resource. This is better for performance of the request and modularity/diversity of clients.
For that reason, PATCH is now more useful than PUT in a typical API (imo), though you can support both if you want to. There are a few corner cases where a platform may not support PATCH, but I believe they are rare now.
If you do support both, don't just make them interchangeable. The difference with PUT is, if it receives a subset, it should assume the whole thing was uploaded, so should then apply default properties to those that were omitted, or return an error if they are required. Whereas PATCH would just ignore those omitted properties.
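The difference can be sketched as two small functions over plain dictionaries. DEFAULTS, REQUIRED, apply_put and apply_patch are hypothetical names chosen for this example, and treating firstName and lastName as required is an assumption; a real service would take these from its own schema.

```python
from typing import Any, Dict

# Assumed schema for the Person example: which fields are required,
# and what an omitted optional field defaults to.
REQUIRED = {"firstName", "lastName"}
DEFAULTS: Dict[str, Any] = {"firstName": None, "lastName": None, "age": None}


def apply_put(payload: Dict[str, Any]) -> Dict[str, Any]:
    """PUT: the payload IS the new resource; omitted fields fall back to defaults."""
    missing = REQUIRED - set(payload)
    if missing:
        raise ValueError(f"missing required properties: {sorted(missing)}")
    return {**DEFAULTS, **payload}


def apply_patch(current: Dict[str, Any], payload: Dict[str, Any]) -> Dict[str, Any]:
    """PATCH: only the properties present in the payload change."""
    return {**current, **payload}


john = {"id": 1, "firstName": "John", "lastName": "Smith", "age": 30}

print(apply_patch(john, {"age": 31}))
# {'id': 1, 'firstName': 'John', 'lastName': 'Smith', 'age': 31}

print(apply_put({"id": 1, "firstName": "John", "lastName": "Smith"}))
# {'firstName': 'John', 'lastName': 'Smith', 'age': None, 'id': 1}
```

Sending { id: 1, age: 31 } to apply_put raises an error here, because under PUT semantics the missing names would otherwise be wiped.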

play scalaJson how to retrieve record when querying for value

So I am having a couple of problems with Play's ScalaJson. The first is that I am somehow not able to make my keys anything other than strings. I have defined a JSON Writes converter as follows:
import play.api.libs.json._

implicit val musics = new Writes[Question] {
  def writes(question: Question) = Json.obj(
    "questionID" -> question.questionID,
    "questionText" -> question.questionText,
    "responseURI" -> question.responseURI,
    "constraints: min,max,Optional" -> Json.arr(question.minResponse, question.maxResponse, question.optionalQ),
    "responseDataType" -> question.responseDataType
  )
}
My model case class for Question:
case class Question(questionID: Int,
                    questionText: String,
                    responseURI: String,
                    minResponse: Option[Int],
                    maxResponse: Option[Int],
                    optionalQ: Boolean,
                    responseDataType: String)
When designing my REST API in Play, I wanted to access a specific question of this survey app with a URL such as /questionBlock/questionID
I tried to simply make question.questionID the parent key and nest the rest of the JSON as the value of this key, but it would not allow me to do this, saying expected String actual Int
The actual JSON rendered out looks like this:
[{"questionID":0,"questionText":"What is your favorite musical artist?",
"responseURI":"/notDoneYet","constraints: min,max,Optional":[1,1,false],
"responseDataType":"String"},{"questionID":1,"questionText":"What is your favorite music genre?",
"responseURI":"/notDoneYet","constraints: min,max,Optional":[1,1,false],"responseDataType":"String"}]
But using this I cannot figure out how to return the entire record where questionID equals 1 or 2 etc. I have used the 0th, 1st, etc. element of the array, but that is not the ideal approach for me, since question IDs may not always start at 0 for a particular sequence of questions.
Basically, I want to be able to show the entire record for one question when I provide the value of questionID. In JavaScript I would have made the outermost key this questionID value, but I am unable to figure out how to do this using ScalaJson. If there is an alternative way to accomplish this, I am open to suggestions.
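For reference, the lookup being asked for is sketched below in Python rather than Scala, purely to show the shape of the data; it is not a Play solution. It re-keys the rendered array by questionID and also shows why the key ends up as a string once serialized: JSON object keys are always strings. The records are abbreviated from the output above.

```python
import json

rendered = """[
  {"questionID": 0, "questionText": "What is your favorite musical artist?"},
  {"questionID": 1, "questionText": "What is your favorite music genre?"}
]"""

questions = json.loads(rendered)

# Index the records by their ID instead of by array position.
by_id = {q["questionID"]: q for q in questions}
print(by_id[1]["questionText"])
# What is your favorite music genre?

# After a round trip through JSON the integer keys come back as strings.
print(list(json.loads(json.dumps(by_id)).keys()))
# ['0', '1']
```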

Avoid even Option fields. Always empty string for String and 0 for Int optional fields

I have a Scala REST service based on JSON and Play Framework. Some of the fields of the JSON are optional (e.g. middleName). I can mark such a field as Option, e.g.
middleName: Option[String]
and not even expect it in the JSON. But I would like to avoid possible app errors in the future and simplify life. I would like to mark it as expected but empty if the user doesn't want to provide this info, and have no Option fields throughout the entire application (the JSON/DB overhead is minor).
Is it a good idea to avoid Option fields throughout the application? If a String field is empty, it contains an empty string but is mandatorily present in the JSON/DB. If an Int field is empty, it contains 0, etc.
Thanks in advance
I think you would regret avoiding Option because of the loss of type safety. If you go passing around potentially null object references, everyone who touches them has to remember to check for null because there is nothing that forces them to do so. Failure to remember is a NullPointerException waiting to happen. The use of Option forces code to deal with the possibility that there is no value to work with; forgetting to do so will cause a compilation error:
case class Foo(name: Option[String])
...
if (foo1.name startsWith "/") // ERROR: no startsWith on Option
I very occasionally do use nulls in a very localized bit of code where I think either performance is critical or I have many, many objects and don't want to have all of those Some and None objects taking up memory, but I would never leak the null out across a public API. Using nulls is a complicating optimization that should only be used where the extra vigilance required to avoid catastrophe is justified by the benefit. Such cases are rare.
I am not entirely sure I understand what your needs are with regard to JSON, but it sounds like you might like to have Option fields not disappear from JSON documents. In Spray-json there is a NullOptions trait specifically for this. You simply mix it into your protocol type and it affects all of the JsonFormats defined within (you can have other protocol types that do not mix it in if you like), e.g.
trait FooJsonProtocol extends DefaultJsonProtocol with NullOptions {
  // your jsonFormats
}
Without NullOptions, Option members with value None are omitted altogether; with it, they appear with null values. I think that it is clearer for users if you show the optional fields with null values rather than having them disappear, but for transmission efficiency you might want them omitted. With Spray-json, at least, you can pick.
I don't know whether other JSON packages have a similar option, but perhaps that will help you look for it if for some reason you don't want to use Spray-json (which, by the way, is very fast now).
I think that would depend on your business logic and how you want to use these values.
In the case of the middleName I am assuming you are using it primarily to address the user in a personal manner and you just concatenate title, firstName, middleName and lastName. So you treat the value exactly the same whether the user has specified it or not. So I think using an empty String instead of None might be preferable.
In the case of values where 0 or the "" is a valid value in terms of your business logic I would go with the Option[String], also in cases where you have different behaviours depending on whether the value is specified or not.
x match {
  case 0 => foo
  case i => bar(i)
}
is less descriptive than
x match {
  case Some(i) => bar(i)
  case None => foo
}
It's a bad idea, because normally you want to handle the absence of something differently. If you pass a value of "" or 0 around, this can very easily be confused with a real value; you might end up sending an email that starts "Dear Mr ," or wishing them Happy 35th Birthday because the timestamp 0 comes out as 1st January 1970. If you keep a distinction between a value and None in code and in the type system, this forces you to think about whether a value is actually set and what you want to do if it isn't.
Don't blindly just push Options everywhere though, either. If it's an error for a value to not be supplied, you should check that immediately and throw an error as soon as possible, not wait until much later in your application when it will be harder to debug where that None came from.
It won't make your "life easier". If anything, it will make it harder, and instead of avoiding app errors will make them more likely. Your app code will have to be infested with checks like if(middleName != "") { doSomething(middleName); } or if(age == 0) "Unknown age" else age.toString, and you will have to rely on the programmer remembering to handle those "kinda-optional" fields in a special way.
All of this you could get "for free" using the monadic properties of Option with middleName.foreach(doSomething) or age.map(_.toString).getOrElse("")