Is it possible to use invalid(non existing) Uri for JSON schema definition? - json

Is it possible to use invalid(non existing) Uri for JSON schema definition?
So that I can specify it and use for versioning, without need to deploy it anywhere?

A URL is expected to resolve to the resource, so if you say "this is the URL for the schema" then that URL should resolve to the schema.
However, URLs are not the only sort of URI - it sounds like a URN might be what you want. In contrast to a URL (uniform resource location), a URN (uniform resource name) is an identifier for a resource, but it doesn't carry a generic method to resolve it.
For example, the URN urn:ietf:rfc:2648 is an identifier for RFC 2648, but there isn't a standard way to get the RFC text from just that URN (you'd need some kind of special service that knew about urn:ietf:rfc:... URNs). If you used something like this, then it should (in theory) do what you want.
(You might run into trouble referencing one schema from another if your library is mistakenly assuming all URIs are URLs, but that would be a bug in your library.)

Related

What is json schema equivalent of targetNamespace?

In any xml file i can say what namespace I refer to using xmlns-attributes to describe the namespace. It is well descibed here: What does "xmlns" in XML mean?
Then I can use a xml schema having a target namespace so that everyone knows that the schema describes that namespace. One question about that is found here: Why do we need targetNamespace?
Using json-schema we can define schemas for json documents. My mental model is that this is roughly equivalent to having a xsd file.
Now, how do I reference the schema in a json object? I can reference a schema using $schema attribute, but how do I declare the name of the schema i develop myself? I dont understand the equivalent of targetNamespace
Researching for writing the question I found the answer. The closest thing of a targetNamespace is the $id attribute. The standard states...
The "$id" keyword defines a URI for the schema, and the base URI that
other URI references within the schema are resolved against. A
subschema's "$id" is resolved against the base URI of its parent
schema. If no parent sets an explicit base with "$id", the base URI is
that of the entire document, as determined per RFC 3986 section 5
[RFC3986].
... which is kind of the mirror image of the leading text for $schema...
The "$schema" keyword is both used as a JSON Schema version identifier
and the location of a resource which is itself a JSON Schema, which
describes any schema written for this particular version. The value
of this keyword MUST be a URI [RFC3986] (containing a scheme) and this
URI MUST be normalized. The current schema MUST be valid against the
meta-schema identified by this URI.
so it is essentially the same thing.
Some things to note, however:
a) you use $schema in a schema to define what schema should be used for defining your own custom schema. It is not stated in the spec that $schema in any kind of object should indicate validation for a schema.
b) You may define in your schema that $schema should be an indication about what schema to use for validation.
c) there are other ways to indicate the schema for data. One such example is using content-type in http headers. Another is to use link http headers.
d) vscode and visual studio both interpret $schema as a reference to a schema for use in validation
The issue has been discussed at the github repo for the spec.
https://github.com/json-schema/json-schema/issues/235
https://github.com/json-schema/json-schema/issues/220#issuecomment-209452992

REST API with method PATCH/UPDATE and resource ID

I have to implement REST endpoints to update a resource.
I will use the methods PUT and PATCH (the latter used to send a json that has only the attributes to modify).
The payload of the calls will be a json that is parsed by Jackson.
Using custom deserializers and converter,s the parser will create the correct instances of java beans.
I know that usually these endpoints will have in their URL the ID of the resource to update.
For the PATCH endpoint, I would prefer to send the ID of the resource in the json, together with all the other resources.
I can see PROS and CONS.
CONS: the URL looks like something that is updating the collection of resources and not a single resource,
PROS: the json contains all the necessary information and the parser can find the resource in the database and add the information that wasn't sent with the request.
This would simplify the code because the parser will return an object ready to use.
As you pointed out, it would be better to pass the ID in the URL.
One of the main constraint of a REST API is the "Resource identification in requests": it should be possible to easily identify the resource modified looking the call.
It is not enough that the representation passed has the information necessary to identify the resource: it is also important that the identification is easy and does not require the knowlodege of internal details.

REST API Best practices: args in query string vs in request body

A REST API can have arguments in several places:
In the request body - As part of a json body, or other MIME type
In the query string - e.g. /api/resource?p1=v1&p2=v2
As part of the URL-path - e.g. /api/resource/v1/v2
What are the best practices and considerations of choosing between 1 and 2 above?
2 vs 3 is covered here.
What are the best practices and considerations of choosing between 1
and 2 above?
Usually the content body is used for the data that is to be uploaded/downloaded to/from the server and the query parameters are used to specify the exact data requested. For example when you upload a file you specify the name, mime type, etc. in the body but when you fetch list of files you can use the query parameters to filter the list by some property of the files. In general, the query parameters are property of the query not the data.
Of course this is not a strict rule - you can implement it in whatever way you find more appropriate/working for you.
You might also want to check the wikipedia article about query string, especially the first two paragraphs.
I'll assume you are talking about POST/PUT requests. Semantically the request body should contain the data you are posting or patching.
The query string, as part of the URL (a URI), it's there to identify which resource you are posting or patching.
You asked for a best practices, following semantics are mine. Of course using your rules of thumb should work, specially if the web framework you use abstract this into parameters.
You most know:
Some web servers have limits on the length of the URI.
You can send parameters inside the request body with CURL.
Where you send the data shouldn't have effect on debugging.
The following are my rules of thumb...
When to use the body:
When the arguments don't have a flat key:value structure
If the values are not human readable, such as serialized binary data
When you have a very large number of arguments
When to use the query string:
When the arguments are such that you want to see them while debugging
When you want to be able to call them manually while developing the code e.g. with curl
When arguments are common across many web services
When you're already sending a different content-type such as application/octet-stream
Notice you can mix and match - put the the common ones, the ones that should be debugable in the query string, and throw all the rest in the json.
The reasoning I've always used is that because POST, PUT, and PATCH presumably have payloads containing information that customers might consider proprietary, the best practice is to put all payloads for those methods in the request body, and not in the URL parms, because it's very likely that somewhere, somehow, URL text is being logged by your web server and you don't want customer data getting splattered as plain text into your log filesystem.
That potential exposure via the URL isn't an issue for GET or DELETE or any of the other REST operations.

Why is the HTTP location header only set for POST requests/201 (Created) responses?

Ignoring 3xx responses for a moment, I wonder why the HTTP location header is only used in conjunction with POST requests/201 (Created) responses.
From the RFC 2616 spec:
For 201 (Created) responses, the Location is that of the new resource which was created by the request.
This is a widely supported behavior, but why shouldn't it be used with other HTTP methods? Take the JSON API spec as an example:
It defines a self referencing link for the current resource inside the JSON payload (not uncommon for RESTful APIs). This link is included in every payload. The spec says that you MUST include an HTTP location header, if you create a new document via POST and that the value is the same as the self referencing link in the payload, but this is ONLY needed for POST. Why bother with a custom format for a self referencing link, if you could just use the HTTP location header?
Note: This isn't specific to JSON API. It's the same for HAL, JSON Hyper-Schema or other standards.
Note 2: It isn't even specific to the HTTP location header as it is the same with the HTTP link header. As you can see the JSON API, HAL and JSON Hyper-Schema not only define conventions for self referencing links, but also to express information about related resources or possible actions for a resource. But it seems that they all could just use the HTTP link header. (They could even put the self referencing link into the HTTP link header, if they don't want to use the HTTP location header.)
I don't want to rant, it just seems to be some sort of "reinventing the wheel". It also seems to be very limiting: if you would just use HTTP location/link header, it doesn't matter if you ask for JSON, XML or whatever in your HTTP accept header and you would get useful meta-information about your resource on a HEAD request, which wouldn't contain the links if you would use JSON API, HAL or JSON Hyper-Schema.
The semantics of the Location header isn't that of a self-referencing link, but of a link the user-agent should follow in order to complete the request. That makes sense in redirects, and when you create a new resource that will be in a new location you should go to. If your request is already completed, meaning you already have a full representation of the resource you wanted, it doesn't make sense to return a Location.
The Link header may be considered semantically equivalent to an hypertext Link, but it should be used to reference metadata related to the given resource when the media-type is not hypermedia-aware, so it doesn't replace the functionality of a link to related resources in a RESTful API.
The need for a custom link format in the resource representation is inherent to the need to decouple the resource from the underlying implementation and protocol. REST is not coupled to HTTP, and any protocol for which there's a valid URI scheme can be used. If you decided to use the Link header for all links, you're coupling to HTTP.
Let's say you present an FTP link for clients to follow. Where would be the Link in that case?
The semantic of the Location header depends on the status code. For 201, it links to the newly created resource, but in 3xx requests it can have multiple (although similiar) meanings. I think that is why it is generally avoided for other usages.
The alternative is the Content-Location header, which always has a consistent meaning. It tells the client the canonical URL the resource it requested. It is purely informative (in contrast to the Location, which is expected to be processed by the client).
So, the Content-Location header seems to closer resemble a self-referencing link. However, the Content-Location also has no defined behavior for PUT and POST. It also seems to be quite rarely used.
This blogs post Location vs Content-Location is a nice comparison. Here is a quote:
Finally, neither header is meant for general-purpose linking.
In sum, requiring a standardized, self link in the body seems to be good idea. It avoids a lot of confusion on the client side.

What are REST resources?

What are REST resources and how do they relate to resource names and resource representations?
I read a few articles on the subject, but they were too abstract and they left me more confused than I was before.
Is the following URL a resource? If it is, what is the name of that resource and what is its representation?
http://api.example.com/users.json?length=2&offset=5
The GET response of the URL should look something like:
[
{
id: 6,
name: "John"
},
{
id: 7,
name: "Jane"
}
]
The reason why articles on REST resources are abstract is because the concept of a REST resource is abstract. It's basically "whatever thing is accessed by the URL you supply". So, in your example, the resource would be the list of two users starting at offset 5 in some bigger list. Note that, how the resource is implemented is a detail you don't care about unless you are the one writing the implementation.
Is the following URL a resource?
The URL is not a resource, it is a label that identifies the resource, it is, if you like, the name of the resource.
The JSON is a representation of the resource.
What’s a Resource?
A resource is anything that’s important enough to be referenced as a
thing in itself. If your users might “want to create a hypertext link
to it, make or refute assertions about it, retrieve or cache a
representation of it, include all or part of it by reference into
another representation, annotate it, or perform other operations on
it”, then you should make it a resource.
Usually, a resource is something that can be stored on a computer and
represented as a stream of bits: a document, a row in a database, or
the result of running an algorithm. A resource may be a physical
object like an apple, or an abstract concept like courage, but (as
we’ll see later) the representations of such resources are bound to be
disappointing. Here are some possible resources:
Version 1.0.3 of the software release
The latest version of the software release
The first weblog entry for October 24, 2006
A road map of Little Rock, Arkansas
Some information about jellyfish
A directory of resources pertaining to jellyfish
The next prime number after 1024
The next five prime numbers after 1024
The sales numbers for Q42004
The relationship between two acquaintances, Alice and Bob
A list of the open bugs in the bug database
The text is from the O'Reilly book "RESTful Web Services".
The URL is never a resource or its name or its representation.
URL just tells where the resource is located and You can invoke GET,POST,PUT,DELETE etc on this URL to invoke the resource.
Data responded back are the resources while the form of the data is its representation.
Let's say Your URL with given GET parameters can output a JSON resource - this is the JSON representation of this resource. While with other flag in the GET it could respond with the same data in XML - that will be another representation of the very same resource.
EDIT: Due to the comments to the OP and to my answer I'm adding another explanations.
Also the resource name is considered to be the 'script name', e.g. in this case it is users.json while this resource name is self describing the resource representation itself - when calling this resource we expect the resource is in JSON, while when calling e.g. users.xml we would expect the data in XML.
When I change the offset parameter in GET the response contains
different data set - is it a new resource or its representation?
When I define which columns are returned in response in GET, is it a different resource or different representation, or?
Well, here the problem and answer are clear - we still call the same URL, the server responses with the data in the same form (still it is JSON), data still contains information about users - just the information itself has changed due to the new offset parameter. So it is obvious that it is still the same resource with the same representation and the same resource name as before.
Second problem could be a little confusing. Though we are calling the same resource, though the resource contains the same data (just with only predefined column set) and though the data is in the same representation it could seem to us as a different resource. But due to the points in the paragraph above it is nor the different resource or different representation. Though the data set contains less information the requesting side (filtering this data set) should be considering this and behave accordingly. So again: it is the same resource with the same resource name and the same resource representation.
REST
This architectural style was defined in the chapter 5 of Roy T. Fielding's dissertation.
REST is about resource state manipulation through their representations on the top of stateless communication between client and server. It's a protocol independent architectural style but, in practice, it's commonly implemented on the top of the HTTP protocol.
Resources
The resource itself is an abstraction and, in the words of the author, a resource can be any information that can be named. The domain entities of an application (e.g. a person, a user, a invoice, a collection of invoices, etc) can be resources. See the following quote from Fielding's dissertation:
5.2.1.1 Resources and Resource Identifiers
The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author's hypertext reference must fit within the definition of a resource. A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.
More precisely, a resource R is a temporally varying membership function MR(t), which for time t maps to a set of entities, or values, which are equivalent. The values in the set may be resource representations and/or resource identifiers. [...]
Resource representations
A JSON document is resource representation that allows you to represent the state of a resource. A server can provide different representations for the same resource. For example, using XML and JSON documents. A client can use content negotiation to request different representations of the same resource.
Quoting Fielding's dissertation:
5.2.1.2 Representations
REST components perform actions on a resource by using a representation to capture the current or intended state of that resource and transferring that representation between components. A representation is a sequence of bytes, plus representation metadata to describe those bytes. Other commonly used but less precise names for a representation include: document, file, and HTTP message entity, instance, or variant.
A representation consists of data, metadata describing the data, and, on occasion, metadata to describe the metadata (usually for the purpose of verifying message integrity). Metadata is in the form of name-value pairs, where the name corresponds to a standard that defines the value's structure and semantics. Response messages may include both representation metadata and resource metadata: information about the resource that is not specific to the supplied representation. [...]
Over HTTP, request and response headers can be used to exchange metadata about the representation.
Resource identifiers
A URL a resource identifier that identifies/locates a resource in the server.
This answer may also be insightful.
What are REST resources and how do they relate to resource names and resource representations?
REST doesn't mean a great deal more then you use HTTP verbs (GET, POST, PUT, DELETE, etc) properly.
Is the following URL a resource?
All URLs are strings that tell computers where a resource can be located. (Hence the name: Uniform Resource Locator).
A resource is:
a noun
that is unique
and can be represented as data
and has at least one URI
I go into more detail on my blog post, What, Exactly, Is a RESTful Resource?
Representational State Transfer (REST) is a style of software architecture for distributed systems such as the World Wide Web. REST-style architectures consist of clients and servers. Clients initiate requests to servers; servers process requests and return appropriate responses. Requests and responses are built around the transfer of representations of resources. Resources are a set of addressable objects, basically files and documents, linked using URLs. As correctly pointed out above by Quentin, REST archiecture simply implies that you'd use the HTTP verbs GET/POST/PUT/DELETE...
Conceptually you can think about a resource as everything which is accessible on the web using an URL.
If you stick to this rule http://api.example.com/users.json?length=2&offset=5 can be considered a resource
You've only provided what appear to be relative parameters rather than "ID" which is (or should be) concrete. Remember, get operations should be idempotent (i.e. repeatable with the same outcome).
What is REST?
REST is an architecture Style which stands for Representational(RE) State(S) transfer(T).
What is REST Resource ?
Rest Resource is data on which we want to perform operation(s).So this data can be present in database as record(s) of table(s) or in any other form.This record has unique identifier with which it can be identified like id for Employee.
Now when this data is requested by unique url like http://www.example.com/employees/123,so ultimately data or record which is present in database will be converted to JSON/XML/Plain text format by Rest Service and will be sent to Consumer.
So basically,what is happening here is REPRESENTATIONAL STATE TRANSFER, in a way that state of the data present in database is transferred to another format which can be JSON/XML or plain text.
So in this case 1 employee represents 1 resource which can be accessed by unique url like http://www.example.com/employees/123
In case we want to get list of all resources(employees),we will do:
http://www.example.com/employees
Hope this will help.
REST stands for REpresentational State Transfer. It's a method of transferring variable information from one location to another. A common method of doing this is using JSON - a way of formatting your variables so that they can be transferred without loss of information.
PHP, for example, has built in JSON support. Passing a PHP array to json_encode($array) will output a string in the format you posted (which by the way, is indeed a REST resource, because it gives you variables and information).
In PHP, the array you posted would turn out as:
Array (
[0]=>Array (
['id']=>6;
['name']=>'John';
)
[1]=>Array (
['id']=>7;
['name']=>'Jane';
)
)