Orion Context Provider query multiple entities - fiware

Tax information system contains all Tax information regarding every citizen in the city.
Following FIWARE principles, it seems to make sense for consumers to query Orion for an entity's (citizen's) tax information, and for Orion to forward the request to a Context Provider (i.e. TaxInformationSystem).
Query citizen X tax information -> Orion -> TaxInformationSystem_CP
According to documentation, Context Providers can register themselves as source for specific attributes. This, for example, could make this work:
http://{{orion}}/v2/entities/urn:citizenID/attrs/name/tax
However, this seems to require every citizen to be registered as an entity, so the tax information system would have to register multiple times (once per citizen). (And so would residenceInformationSystem, healthInformationSystem, and so on.)
"entities": [
{
"id" : "citizenID", //one per citizen ???
"type": "taxInformation"
}
],
and that seems, at least, a lot of unnecessary/superfluous work.
After reading a bit more, it seems that no workaround is implemented/supported yet:
It seems I can't use query parameters such as http://{{orion}}/v2/entities/tax?citizen=X, as they aren't forwarded to the CP.
It seems I can't query any citizen's tax at http://{{orion}}/v2/entities/X/tax if the entity hasn't been explicitly created first.
It seems I can't set idPattern (currently only .* is supported), as it would return every citizen's tax, because the Broker forwards neither the request filters nor the entity to the CP.
The same goes for typePattern.
(IIUC, isPattern now seems to be deprecated in favour of idPattern/typePattern.)
Am I doing something wrong? Is registering once per citizen the only way to go?

In FIWARE, as in every system or platform, some features are widely used and mature while others are more experimental and less stable. The more real use cases and real customers ask for a feature (and, above all, use it in real deployments), the more consolidated, proven and extended it becomes. That is not the case for registrations: complex federation scenarios are not part of the current state of the art. I agree that they enable some really interesting experimental use cases, but in real deployments federation adds an extra level of complexity, which makes it undesirable at this stage.

Not sure if I fully understand your case...
You could do a registration for all citizens like this one:
{
"dataProvided": {
"entities": [
{
"idPattern": ".*",
"type": "taxInformation"
}
],
"attrs": [
...
]
},
"provider": {
"http": {
"url": "http://thetaxsystem.com"
}
}
}
So if you want to get the tax information of a specific citizen, you could do something like this at the CB:
GET /v2/entities/1234567H?type=taxInformation
and that registration would cause the request to be forwarded to the tax system as Context Provider.
EDIT: there is an issue in Context Broker (this one) that prevents this case from working. In particular, the second case:
regR = .*, query = 'E', attrs = {null}
EDIT2: the above case has been solved in Orion Context Broker. It is now available in master branch (:latest tag in dockerhub) and will be included in the next Orion Context Broker release (3.1.0).
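As a rough illustration of the registration and query above, here is a minimal Python sketch that only builds the registration payload (the body you would POST to /v2/registrations) and the query URL, without sending anything. The Orion address and the attribute name taxAmount are assumptions of mine; the original registration elides its attribute list.

```python
import json
from urllib.parse import urlencode

# Assumption: Orion listening on its usual local port.
ORION = "http://localhost:1026"


def build_registration(entity_type, provider_url, attrs):
    """Build one registration that covers every entity of the given type."""
    return {
        "dataProvided": {
            "entities": [{"idPattern": ".*", "type": entity_type}],
            "attrs": attrs,
        },
        "provider": {"http": {"url": provider_url}},
    }


def entity_query_url(base, entity_id, entity_type):
    """Build the URL a consumer would GET for a single citizen."""
    return f"{base}/v2/entities/{entity_id}?{urlencode({'type': entity_type})}"


# "taxAmount" is a hypothetical attribute name used only for illustration.
registration = build_registration(
    "taxInformation", "http://thetaxsystem.com", ["taxAmount"]
)
query_url = entity_query_url(ORION, "1234567H", "taxInformation")

print(json.dumps(registration, indent=2))
print(query_url)
```

The point of the single idPattern registration is that the tax system registers once, and the citizen identifier only appears in the consumer's query.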

Related

Returning different JSON results for the same request - is this a violation of REST?

Note the following from Roy Fielding concerning REST design, guidelines and principles.
5.2.1.1 Resources and Resource Identifiers
The key abstraction of information in REST is a resource. Any
information that can be named can be a resource: a document or image,
a temporal service (e.g. "today's weather in Los Angeles"), a
collection of other resources, a non-virtual object (e.g. a person),
and so on. In other words, any concept that might be the target of an
author's hypertext reference must fit within the definition of a
resource.
A resource is a conceptual mapping to a set of entities, not the
entity that corresponds to the mapping at any particular point in
time.
More precisely, a resource R is a temporally varying membership
function MR(t), which for time t maps to a set of entities, or values,
which are equivalent. The values in the set may be resource
representations and/or resource identifiers. A resource can map to the
empty set, which allows references to be made to a concept before any
realization of that concept exists -- a notion that was foreign to
most hypertext systems prior to the Web [61]. Some resources are
static in the sense that, when examined at any time after their
creation, they always correspond to the same value set. Others have a
high degree of variance in their value over time.
The only thing that is required to be static for a resource is the
semantics of the mapping, since the semantics is what distinguishes
one resource from another.
The key points have been bolded, the rest of the paragraph I have included is for context.
Here is the scenario.
I have a web API that has an endpoint: http://www.myfakeapi.com/people
When a client does a GET request to this endpoint, they receive back a list of people.
Person
{
"Name": "John Doe",
"Age": "23",
"Favorite Color": "Green"
}
Ok, well that's cool.
But is it against REST design practices and principles if I have a 'Person' who does not have a Favorite Color and I want to return them like this:
Person
{
"Name": "Bob Doe",
"Age": "23"
}
Or should I return them like this:
Person
{
"Name": "Bob Doe",
"Age": "23",
"Favorite Color": null
}
The issue is that the client requesting the resource has to do extra work to see whether the property even exists in the first place. Some Persons have a favorite color and some don't. Is it against REST principles to just omit the JSON property 'Favorite Color' when it doesn't exist, or should that property be given a null or blank value?
What does REST say about this? I am thinking that I should give back a null and not change the representation of the resource the client is requesting by omitting properties.
Off the top of my head I can't think of any REST constraints that this violates (here's a link to a brief overview if you're interested). It also doesn't violate idempotency for a GET request. However, it is still bad practice.
The consumer of your API should know what to expect and ideally this should be well documented (I like using Swagger a lot for this). Any changes in what to expect should be communicated to consumers, possibly in the form of release notes. Changes that could potentially be breaking for your consumer should be delivered in a new version of your API.
Since your Person1 and Person2 are technically different object structures, that could be breaking in itself (let's face it, we don't always find the edge cases as devs). You don't just want your API to work on a basic level and to hell with the end users - you want to design it with the end-consumer in mind so that their lives are made easier.
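To make the "same structure for every Person" idea concrete, here is a small Python sketch of a server-side step that fills in any optional property the stored record lacks, so that every response carries the same keys. The function name and the list of optional fields are my own assumptions, not part of any framework.

```python
import json

# Assumption: "Favorite Color" is the only optional property in this model.
OPTIONAL_FIELDS = ("Favorite Color",)


def normalize_person(person):
    """Return a copy of the record with every optional field present (null if unset)."""
    normalized = dict(person)
    for field in OPTIONAL_FIELDS:
        normalized.setdefault(field, None)
    return normalized


john = normalize_person({"Name": "John Doe", "Age": "23", "Favorite Color": "Green"})
bob = normalize_person({"Name": "Bob Doe", "Age": "23"})

print(json.dumps(john))
print(json.dumps(bob))
```

With this in place the client can always read the property and only has to handle a null value, rather than checking for the key first.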
There are various ways to deal with this, depending on the use case. I'll list them one by one.
1) Prefer enums (only if it makes sense to your use case)
{
"Name": "Bob Doe",
"Age": "23",
"Favorite Color": "NO_COLOR"
}
When you know the values for your property at the beginning, define a set of enum constants, and assign a default value if the property does not apply to the user. This helps in a few ways:
Your client knows what are the possible values so they can prepare their client system accordingly.
By giving default enum constant, we convey that value of the particular field is successfully retrieved from either persistent storage or maybe from another remote service, but it has default value because the property may not apply to the user OR user doesn't have any value for this property.
By avoiding NULL pattern, your client code will be resilient and the client can prepare their code for default enum constant.
When you start to serve more users, you may need to add a few more enum constants which may not apply to every client of yours. When you add new enums which they don't know, they can easily handle this in their parsing libraries and convert into something as per client application design. In Jackson, we can use DeserializationFeature.READ_UNKNOWN_ENUM_VALUES_AS_NULL for this.
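The Jackson feature mentioned above is Java-specific. As a language-neutral illustration of the same idea, here is a minimal Python sketch in which the client maps enum values it does not know to None instead of failing. The enum members other than NO_COLOR are assumptions for the example.

```python
from enum import Enum


class FavoriteColor(Enum):
    NO_COLOR = "NO_COLOR"  # default constant when the property does not apply
    GREEN = "GREEN"
    BLUE = "BLUE"


def parse_color(raw):
    """Convert a wire value to the enum, or None if the server sent a newer constant."""
    try:
        return FavoriteColor(raw)
    except ValueError:
        return None


known = parse_color("GREEN")
default = parse_color("NO_COLOR")
unknown = parse_color("MAGENTA")  # a constant this client has never heard of

print(known, default, unknown)
```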
2) Use Null - Do not create enum constants for everything
There are cases perfectly valid to have a NULL object. For instance, in the below example, it makes sense to use null if there is no favourite quote.
{
"Name": "Bob Doe",
"Age": "23",
"Favorite Quote": null
}
3) Document your required properties clearly
If you use swagger for your rest API documentation, you can mark mandatory properties as required. The ones not marked are optional. In that way, the client will be prepared to handle if they are NULL or empty string. (It should apply to other API documentation tools as well)
Bad practice:
I have noticed that some people return errors in the same response model they use for a successful 200 response (refer to this question and answer). This is definitely a bad practice. Don't mix two different responses and mark one property as optional; use status codes to convey any problems. I'm not talking about partial responses here.
4) Add/Modify properties (as long as you're not breaking a contract with the client)
Say the Favorite Color property is added later and currently you're sending the following response to your client. You will publish your new contract to your clients when you add Favorite Color, but your clients should have fail-safe code and should tolerate unknown properties. In Jackson, this is done by disabling DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES. Non-breaking changes do not necessarily require a v2.
Person
{
"Name": "Bob Doe",
"Age": "23"
}
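The fail-safe client described in #4 can be sketched in Python as follows. This is only an illustration under my own assumptions: the Person class, the key mapping and the extra "Shoe Size" property are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    age: str
    favorite_color: Optional[str] = None


# Mapping from the JSON property names the client knows to its own field names.
JSON_KEYS = {"Name": "name", "Age": "age", "Favorite Color": "favorite_color"}


def person_from_json(document):
    """Build a Person, silently skipping any property the client does not know."""
    known = {JSON_KEYS[key]: value for key, value in document.items() if key in JSON_KEYS}
    return Person(**known)


old_contract = person_from_json({"Name": "Bob Doe", "Age": "23"})
new_contract = person_from_json(
    {"Name": "Bob Doe", "Age": "23", "Favorite Color": "Green", "Shoe Size": 42}
)

print(old_contract)
print(new_contract)
```

A client written this way keeps working when the server starts sending additional properties, which is why such additions need not be treated as breaking.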
So, the answer to your question is: start by looking at the first three options while you design your REST API; you don't need to omit any properties. But you may need to add a few properties later (covered in #4), which is perfectly fine.

Best way to represent nested resources that has two different type (entities) owners in REST APIs? [closed]

Suppose I have an entity called Appointment. This entity represents a medical appointment between one Doctor and one Patient.
To create a new Appointment, I send something like:
POST /appointments
{
"doctorId": 98173821,
"patientId": 2138212,
... omitted for brevity
}
Works well.
As you can see, this object references two other resources (Patient and Doctor).
Imagine that the logged-in Patient (using his JWT token) wants to see his history of Appointments.
Today, I do that sending a request to:
GET /patients/2138212/appointments?page=1&size=2&startDate=2019-01-01&endDate=2019-08-01
(please, feel free to criticize)
As this is a history, there's no need to retrieve the whole Appointment objects. The response contains only basic information about the Appointments and looks like this:
{
"timestamp": 1566216359,
"transactionId": "6eed92831cad128",
"data": {
"appointments": [
{
"id": 6372,
"doctorId": 98173821,
"date": "2019-01-01"
},
{
"id": 6985,
"doctorId": 98173821,
"date": "2019-02-01"
}
]
}
}
Now, what if the Patient selects a specific Appointment in the front-end and the detailed information about this specific Appointment needs to be shown? What is the best way to model a REST endpoint like this?
My options:
GET /patients/2138212/appointments/6372
GET /appointments/6372
The first one looks nice and follows REST patterns for nested resources, since it can represent that Appointment 6372 belongs to Patient 2138212.
The second one is less readable. I mean, looking at the resource, I don't know who owns this Appointment. That's why I prefer the first approach.
Now, going down the road, the Doctor also has to see the history of his Appointments.
Today, I do that sending a request to:
GET /doctors/98173821/appointments?page=1&size=2&startDate=2019-01-01&endDate=2019-08-01
(please, feel free to criticize)
The response looks like this:
{
"timestamp": 1566216359,
"transactionId": "6eed92831cad321",
"data": {
"appointments": [
{
"id": 6372,
"patientId": 2138212,
"date": "2019-01-01"
},
{
"id": 6985,
"patientId": 2138212,
"date": "2019-02-01"
}
]
}
}
Now, what if the Doctor selects a specific Appointment in the front-end and the detailed information about this specific Appointment needs to be shown? What is the best way to model a REST endpoint like this?
My options:
GET /doctors/98173821/appointments/6372
GET /appointments/6372
The questions I have:
How do I represent a nested resource that has "two owners"? Is there a better approach than the one I describe in this question?
For the history of Appointments, it has been suggested that I use /appointments/doctors/{doctorId} and /appointments/patients/{patientId}, which I heavily disagree with, since the Doctor and the Patient own the Appointment, not the opposite. What do you suggest?
REST doesn't care what spelling you use for your resource identifiers.
GET /patients/2138212/appointments/6372
GET /appointments/6372
GET /5877971d-4f91-4297-9995-94f560190463
All three of those spellings are fine. From the point of view of a REST client, the URI is just an opaque sequence of bytes that can be used as a cache key.
Choosing a spelling for a URI is a lot like choosing a spelling for a variable name -- the machines don't care; you can choose any spelling you like that is convenient.
Furthermore, REST doesn't particularly care how you organize your domain model into resources, aka "documents". The "doctor's view of appointment 6372" is not necessarily the same resource as "the patient's view of appointment 6372"; there are trade offs to consider there.
Example:
https://stackoverflow.com/questions/57557129/best-way-to-represent-nested-resources-that-has-two-different-type-entities-ow
https://api.stackexchange.com/2.2/questions/57557129?order=desc&sort=activity&site=stackoverflow
For the history of Appointments, it has been suggested that I use /appointments/doctors/{doctorId} and /appointments/patients/{patientId}, which I heavily disagree with, since the Doctor and the Patient own the Appointment, not the opposite. What do you suggest?
The hierarchy of path segments in a URI need not imply "ownership". Again, from the perspective of a REST client, there is no relation between /X/Y and /X/Y/Z -- they are different identifiers, and therefore different resources.
There's no fundamental difference between
/appointments/patients/{patientId}
/patients/{patientId}/appointments
We normally prefer the latter, not because it is "better" in and of itself, but because it can be more convenient when using relative references to identify another resource in the hierarchy relative to the position of this one.
/appointments/patients/{patientId} + .. -> /appointments/patients
/patients/{patientId}/appointments + .. -> /patients/{patientId}
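The two resolutions above can be checked with Python's standard urljoin, which follows the RFC 3986 resolution rules. This is a small sketch; the host name is a placeholder of mine. One detail worth noting: under those rules the last path segment of the base is dropped before ".." is applied, so the results shown informally above correspond to base URIs that end in a slash.

```python
from urllib.parse import urljoin

BASE = "http://example.com"  # hypothetical host, for illustration only

# Base URIs ending in "/" behave like "directories": ".." goes one level up.
patient_first = urljoin(BASE + "/patients/2138212/appointments/", "..")
appointments_first = urljoin(BASE + "/appointments/patients/2138212/", "..")

# Without the trailing slash, "appointments" is treated as a leaf and dropped
# first, so ".." climbs one level further than you might expect.
no_trailing_slash = urljoin(BASE + "/patients/2138212/appointments", "..")

print(patient_first)       # http://example.com/patients/2138212/
print(appointments_first)  # http://example.com/appointments/patients/
print(no_trailing_slash)   # http://example.com/patients/
```

With the patient-first layout, going "up" from the appointment list lands on the patient, which is usually the more useful neighbour.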

How to create advanced subscriptions expression at Orion Context Broker NGSIv2?

According to official documentation of Orion Context Broker NGSIv2 :
You can include filtering expressions in conditions. For example, to
get notified not only if pressure changes, but if it changes within
the range 700-800. This is an advanced topic, see the "Subscriptions"
section in the NGSIv2 specification.
In NGSIv2 subscriptions there is no notifyConditions as in NGSIv1; it was replaced by the subject.condition object:
condition: Condition to trigger notifications. This field is optional
and it may contain two properties, both optional:
attrs: array of attribute names
expression: an expression composed of q, mq, georel,
geometry and coords (see "List entities" operation above about this
field)
When we use subject.condition.attrs, it contains an array of attribute names. These names define the "triggering attributes", i.e. attributes whose creation or change (due to entity creation or update) triggers the notification.
But for subject.condition.expression there is no example in the official documentation.
Putting the pieces of the puzzle together, it is possible to deduce:
It is possible to combine subject.condition.expression and subject.condition.attrs. If I set an attribute different from the one in the expression, e.g. attr foo with expression 'boo>10', what will it do? Will this behave like an OR or an AND?
It is possible to set multiple expressions. Will these behave like an OR or an AND?
It would be nice to have some examples of these more complex subscriptions combining the different ways of delimiting the entities in the subscription.
NOTE: This question is related to Orion Version 1.7.0+
I think the following example, from the NGSIv2 Overview for Developers That Already Know NGSIv1 presentation (slide 34 in the current version), could help to clarify.
Example: subscribe to speed changes in any entities of any type ending with Vehicle (such as RoadVehicle, AirVehicle, etc.) whenever speed is greater than 90, its average metadata is between 80 and 90, and the vehicle's distance to Madrid city center is less than 100 km.
Request:
POST /v2/subscriptions
...
{
"subject": {
"entities": [
{
"idPattern": ".*",
"typePattern": ".*Vehicle"
}
],
"condition": {
"attrs": [ "speed" ],
"expression": {
"q": "speed>90",
"mq": "speed.average==80..90",
"georel": "near;maxDistance:100000",
"geometry": "point",
"coords": "40.418889,-3.691944"
}
}
},
...
}
As this example illustrates, you can use different conditions (q, mq, geo-query, etc.) and they are interpreted in the AND sense. Moreover, q and mq allow complex expressions that are also interpreted in the AND sense, such as:
"q": "speed>90;engine!=fail",
Note that q and mq, when they appear in a subscription's expression, follow the same rules as when they appear in synchronous queries (i.e. GET /v2/entities?q=...). These rules are described in the "Simple Query Language" section of the NGSIv2 specification.
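If you build such subscriptions programmatically, the AND semantics map naturally onto joining the individual statements with ";". Here is a minimal Python sketch that only assembles the subject part of the payload. The helper names are mine, and the rest of the subscription body (elided as "..." in the example above) is deliberately left out.

```python
import json


def join_statements(statements):
    """Join q/mq statements with ';', which the Simple Query Language reads as AND."""
    return ";".join(statements)


def build_subject(type_pattern, trigger_attrs, q_statements, mq_statements):
    """Assemble the 'subject' object of an NGSIv2 subscription."""
    expression = {}
    if q_statements:
        expression["q"] = join_statements(q_statements)
    if mq_statements:
        expression["mq"] = join_statements(mq_statements)
    return {
        "entities": [{"idPattern": ".*", "typePattern": type_pattern}],
        "condition": {"attrs": trigger_attrs, "expression": expression},
    }


subject = build_subject(
    ".*Vehicle",
    ["speed"],
    ["speed>90", "engine!=fail"],
    ["speed.average==80..90"],
)

print(json.dumps({"subject": subject}, indent=2))
```

All of the parts must hold for a notification to be sent: the triggering attribute changed, every q statement matches, and every mq statement matches.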

REST service semantics; include properties not being updated?

Suppose I have a resource called Person. I can update Person entities by doing a POST to /data/Person/{ID}. Suppose for simplicity that a person has three properties, first name, last name, and age.
GET /data/Person/1 yields something like:
{ id: 1, firstName: "John", lastName: "Smith", age: 30 }.
My question is about updates to this person and the semantics of the services that do this. Suppose I wanted to update John, he's now 31. In terms of design approach, I've seen APIs work two ways:
Option 1:
POST /data/Person/1 with { id: 1, age: 31 } does the right thing. Implicitly, any property that isn't mentioned isn't updated.
Option 2:
POST /data/Person/1 with the full object that would have been received by GET -- all properties must be specified, even if many don't change, because the API (in the presence of a missing property) would assume that its proper value is null.
Which option is correct from a recommended design perspective? Option 1 is attractive because it's short and simple, but has the downside of being ambiguous in some cases. Option 2 has you sending a lot of data back and forth even if it's not changing, and doesn't tell the server what's really important about this payload (only the age changed).
Option 1 - updating a subset of the resource - is now formalised in HTTP as the PATCH method. Option 2 - updating the whole resource - is the PUT method.
In real-world scenarios, it's common to want to upload only a subset of the resource. This is better for performance of the request and modularity/diversity of clients.
For that reason, PATCH is now more useful than PUT in a typical API (imo), though you can support both if you want to. There are a few corner cases where a platform may not support PATCH, but I believe they are rare now.
If you do support both, don't just make them interchangeable. The difference with PUT is, if it receives a subset, it should assume the whole thing was uploaded, so should then apply default properties to those that were omitted, or return an error if they are required. Whereas PATCH would just ignore those omitted properties.
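A minimal Python sketch of that difference, using plain dicts in place of a real server. Which fields are required and which have defaults is an assumption I made for the example.

```python
class MissingRequiredField(ValueError):
    """Raised when a PUT body omits a property the resource cannot live without."""


# Assumptions for this sketch: names are required, age falls back to a default.
REQUIRED = ("firstName", "lastName")
DEFAULTS = {"age": None}


def apply_put(stored, body):
    """Treat the body as the whole resource: omitted fields get defaults or fail.

    The stored representation is intentionally ignored, because PUT replaces it.
    """
    missing = [name for name in REQUIRED if name not in body]
    if missing:
        raise MissingRequiredField(", ".join(missing))
    replaced = dict(DEFAULTS)
    replaced.update(body)
    return replaced


def apply_patch(stored, body):
    """Treat the body as a set of changes: omitted fields are left alone."""
    merged = dict(stored)
    merged.update(body)
    return merged


john = {"id": 1, "firstName": "John", "lastName": "Smith", "age": 30}

patched = apply_patch(john, {"age": 31})
replaced = apply_put(john, {"id": 1, "firstName": "John", "lastName": "Smith"})

print(patched)   # age updated, everything else kept
print(replaced)  # age reset to its default because the PUT body omitted it
```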

Use of PUT vs PATCH methods in REST API real life scenarios

First of all, some definitions:
PUT is defined in Section 9.6 of RFC 2616:
The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI.
PATCH is defined in RFC 5789:
The PATCH method requests that a set of changes described in the request entity be applied to the resource identified by the Request-URI.
Also according to RFC 2616 Section 9.1.2 PUT is Idempotent while PATCH is not.
Now let us take a look at a real example. When I do POST to /users with the data {username: 'skwee357', email: 'skwee357#domain.example'} and the server is capable of creating a resource, it will respond with 201 and the resource location (let's assume /users/1), and any subsequent call to GET /users/1 will return {id: 1, username: 'skwee357', email: 'skwee357#domain.example'}.
Now let us say I want to modify my email. Email modification is considered "a set of changes" and therefore I should PATCH /users/1 with a "patch document". In my case it would be the JSON document: {email: 'skwee357#newdomain.example'}. The server then returns 200 (assuming permissions are OK). This brings me to the first question:
PATCH is NOT idempotent. It said so in RFC 2616 and RFC 5789. However if I issue the same PATCH request (with my new email), I will get the same resource state (with my email being modified to the requested value). Why is PATCH not then idempotent?
PATCH is a relatively new verb (RFC introduced in March 2010), and it comes to solve the problem of "patching" or modifying a set of fields. Before PATCH was introduced, everybody used PUT to update resources. But after PATCH was introduced, it leaves me confused about what PUT is used for. And this brings me to my second (and the main) question:
What is the real difference between PUT and PATCH? I have read somewhere that PUT might be used to replace entire entity under specific resource, so one should send the full entity (instead of set of attributes as with PATCH). What is the real practical usage for such case? When would you like to replace / overwrite an entity at a specific resource URI and why is such an operation not considered updating / patching the entity? The only practical use case I see for PUT is issuing a PUT on a collection, i.e. /users to replace the entire collection. Issuing PUT on a specific entity makes no sense after PATCH was introduced. Am I wrong?
NOTE: When I first spent time reading about REST, idempotence was a confusing concept to try to get right. I still didn't get it quite right in my original answer, as further comments (and Jason Hoetger's answer) have shown. For a while, I have resisted updating this answer extensively, to avoid effectively plagiarizing Jason, but I'm editing it now because, well, I was asked to (in the comments).
After reading my answer, I suggest you also read Jason Hoetger's excellent answer to this question, and I will try to make my answer better without simply stealing from Jason.
Why is PUT idempotent?
As you noted in your RFC 2616 citation, PUT is considered idempotent. When you PUT a resource, these two assumptions are in play:
You are referring to an entity, not to a collection.
The entity you are supplying is complete (the entire entity).
Let's look at one of your examples.
{ "username": "skwee357", "email": "skwee357#domain.example" }
If you POST this document to /users, as you suggest, then you might get back an entity such as
## /users/1
{
"username": "skwee357",
"email": "skwee357#domain.example"
}
If you want to modify this entity later, you choose between PUT and PATCH. A PUT might look like this:
PUT /users/1
{
"username": "skwee357",
"email": "skwee357#gmail.com" // new email address
}
You can accomplish the same using PATCH. That might look like this:
PATCH /users/1
{
"email": "skwee357#gmail.com" // new email address
}
You'll notice a difference right away between these two. The PUT included all of the parameters on this user, but PATCH only included the one that was being modified (email).
When using PUT, it is assumed that you are sending the complete entity, and that complete entity replaces any existing entity at that URI. In the above example, the PUT and PATCH accomplish the same goal: they both change this user's email address. But PUT handles it by replacing the entire entity, while PATCH only updates the fields that were supplied, leaving the others alone.
Since PUT requests include the entire entity, if you issue the same request repeatedly, it should always have the same outcome (the data you sent is now the entire data of the entity). Therefore PUT is idempotent.
Using PUT wrong
What happens if you use the above PATCH data in a PUT request?
GET /users/1
{
"username": "skwee357",
"email": "skwee357#domain.example"
}
PUT /users/1
{
"email": "skwee357#gmail.com" // new email address
}
GET /users/1
{
"email": "skwee357#gmail.com" // new email address... and nothing else!
}
(I'm assuming for the purposes of this question that the server doesn't have any specific required fields, and would allow this to happen... that may not be the case in reality.)
Since we used PUT, but only supplied email, now that's the only thing in this entity. This has resulted in data loss.
This example is here for illustrative purposes -- don't ever actually do this (unless your intent is to drop the omitted fields, of course... then you are using PUT as it should be used). This PUT request is technically idempotent, but that doesn't mean it isn't a terrible, broken idea.
How can PATCH be idempotent?
In the above example, PATCH was idempotent. You made a change, but if you made the same change again and again, it would always give back the same result: you changed the email address to the new value.
GET /users/1
{
"username": "skwee357",
"email": "skwee357#domain.example"
}
PATCH /users/1
{
"email": "skwee357#gmail.com" // new email address
}
GET /users/1
{
"username": "skwee357",
"email": "skwee357#gmail.com" // email address was changed
}
PATCH /users/1
{
"email": "skwee357#gmail.com" // new email address... again
}
GET /users/1
{
"username": "skwee357",
"email": "skwee357#gmail.com" // nothing changed since last GET
}
My original example, fixed for accuracy
I originally had examples that I thought were showing non-idempotency, but they were misleading / incorrect. I am going to keep the examples, but use them to illustrate a different thing: that multiple PATCH documents against the same entity, modifying different attributes, do not make the PATCHes non-idempotent.
Let's say that at some past time, a user was added. This is the state that you are starting from.
{
"id": 1,
"name": "Sam Kwee",
"email": "skwee357#olddomain.example",
"address": "123 Mockingbird Lane",
"city": "New York",
"state": "NY",
"zip": "10001"
}
After a PATCH, you have a modified entity:
PATCH /users/1
{"email": "skwee357#newdomain.example"}
{
"id": 1,
"name": "Sam Kwee",
"email": "skwee357#newdomain.example", // the email changed, yay!
"address": "123 Mockingbird Lane",
"city": "New York",
"state": "NY",
"zip": "10001"
}
If you then repeatedly apply your PATCH, you will continue to get the same result: the email was changed to the new value. A goes in, A comes out, therefore this is idempotent.
An hour later, after you have gone to make some coffee and take a break, someone else comes along with their own PATCH. It seems the Post Office has been making some changes.
PATCH /users/1
{"zip": "12345"}
{
"id": 1,
"name": "Sam Kwee",
"email": "skwee357#newdomain.example", // still the new email you set
"address": "123 Mockingbird Lane",
"city": "New York",
"state": "NY",
"zip": "12345" // and this change as well
}
Since this PATCH from the post office doesn't concern itself with email, only zip code, if it is repeatedly applied, it will also get the same result: the zip code is set to the new value. A goes in, A comes out, therefore this is also idempotent.
The next day, you decide to send your PATCH again.
PATCH /users/1
{"email": "skwee357#newdomain.example"}
{
"id": 1,
"name": "Sam Kwee",
"email": "skwee357#newdomain.example",
"address": "123 Mockingbird Lane",
"city": "New York",
"state": "NY",
"zip": "12345"
}
Your patch has the same effect it had yesterday: it set the email address. A went in, A came out, therefore this is idempotent as well.
What I got wrong in my original answer
I want to draw an important distinction (something I got wrong in my original answer). Many servers will respond to your REST requests by sending back the new entity state, with your modifications (if any). So, when you get this response back, it is different from the one you got back yesterday, because the zip code is not the one you received last time. However, your request was not concerned with the zip code, only with the email. So your PATCH document is still idempotent - the email you sent in PATCH is now the email address on the entity.
So when is PATCH not idempotent, then?
For a full treatment of this question, I again refer you to Jason Hoetger's answer which already fully answers that.
Though Dan Lowe's excellent answer very thoroughly answered the OP's question about the difference between PUT and PATCH, its answer to the question of why PATCH is not idempotent is not quite correct.
To show why PATCH isn't idempotent, it helps to start with the definition of idempotence (from Wikipedia):
The term idempotent is used more comprehensively to describe an operation that will produce the same results if executed once or multiple times [...] An idempotent function is one that has the property f(f(x)) = f(x) for any value x.
In more accessible language, an idempotent PATCH could be defined as: After PATCHing a resource with a patch document, all subsequent PATCH calls to the same resource with the same patch document will not change the resource.
Conversely, a non-idempotent operation is one where f(f(x)) != f(x), which for PATCH could be stated as: After PATCHing a resource with a patch document, subsequent PATCH calls to the same resource with the same patch document do change the resource.
To illustrate a non-idempotent PATCH, suppose there is a /users resource, and suppose that calling GET /users returns a list of users, currently:
[{ "id": 1, "username": "firstuser", "email": "firstuser#example.org" }]
Rather than PATCHing /users/{id}, as in the OP's example, suppose the server allows PATCHing /users. Let's issue this PATCH request:
PATCH /users
[{ "op": "add", "username": "newuser", "email": "newuser#example.org" }]
Our patch document instructs the server to add a new user called newuser to the list of users. After calling this the first time, GET /users would return:
[{ "id": 1, "username": "firstuser", "email": "firstuser#example.org" },
{ "id": 2, "username": "newuser", "email": "newuser#example.org" }]
Now, if we issue the exact same PATCH request as above, what happens? (For the sake of this example, let's assume that the /users resource allows duplicate usernames.) The "op" is "add", so a new user is added to the list, and a subsequent GET /users returns:
[{ "id": 1, "username": "firstuser", "email": "firstuser#example.org" },
{ "id": 2, "username": "newuser", "email": "newuser#example.org" },
{ "id": 3, "username": "newuser", "email": "newuser#example.org" }]
The /users resource has changed again, even though we issued the exact same PATCH against the exact same endpoint. If our PATCH is f(x), f(f(x)) is not the same as f(x), and therefore, this particular PATCH is not idempotent.
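The same scenario as a small Python sketch, with a list standing in for the /users resource. The id assignment rule (highest id plus one) is my own assumption about how such a server might behave.

```python
def patch_users(users, patch_document):
    """Apply an 'add'-style patch document to a copy of the user list."""
    result = [dict(user) for user in users]
    for operation in patch_document:
        if operation["op"] == "add":
            # Assumption: the server assigns the next free id to each added user.
            next_id = max((user["id"] for user in result), default=0) + 1
            result.append(
                {
                    "id": next_id,
                    "username": operation["username"],
                    "email": operation["email"],
                }
            )
    return result


users = [{"id": 1, "username": "firstuser", "email": "firstuser#example.org"}]
patch = [{"op": "add", "username": "newuser", "email": "newuser#example.org"}]

once = patch_users(users, patch)
twice = patch_users(once, patch)

print(len(once), len(twice))  # 2 3
```

Applying the identical patch a second time changes the resource again, so f(f(x)) differs from f(x).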
Although PATCH isn't guaranteed to be idempotent, there's nothing in the PATCH specification to prevent you from making all PATCH operations on your particular server idempotent. RFC 5789 even anticipates advantages from idempotent PATCH requests:
A PATCH request can be issued in such a way as to be idempotent,
which also helps prevent bad outcomes from collisions between two
PATCH requests on the same resource in a similar time frame.
In Dan's example, his PATCH operation is, in fact, idempotent. In that example, the /users/1 entity changed between our PATCH requests, but not because of our PATCH requests; it was actually the Post Office's different patch document that caused the zip code to change. The Post Office's different PATCH is a different operation; if our PATCH is f(x), the Post Office's PATCH is g(x). Idempotence states that f(f(f(x))) = f(x), but makes no guarantees about f(g(f(x))).
TLDR - Dumbed Down Version
PUT => Set all new attributes for an existing resource.
PATCH => Partially update an existing resource (not all attributes required).
I was curious about this as well and found a few interesting articles. I may not answer your question to its full extent, but this at least provides some more information.
http://restful-api-design.readthedocs.org/en/latest/methods.html
The HTTP RFC specifies that PUT must take a full new resource
representation as the request entity. This means that if for example
only certain attributes are provided, those should be remove (i.e. set
to null).
Given that, then a PUT should send the entire object. For instance,
/users/1
PUT {id: 1, username: 'skwee357', email: 'newemail#domain.example'}
This would effectively update the email. The reason PUT may not be too effective is that you're only really modifying one field, so including the username is kind of useless. The next example shows the difference.
/users/1
PUT {id: 1, email: 'newemail#domain.example'}
Now, if the PUT was designed according to the spec, then the PUT would set the username to null and you would get the following back.
{id: 1, username: null, email: 'newemail#domain.example'}
When you use a PATCH, you only update the field you specify and leave the rest alone as in your example.
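As a rough Python sketch of that difference, assume the resource is a plain dict with the fixed fields id, username and email. The helper names put and patch and the starting email value are hypothetical; a real server may well behave differently.

```python
FIELDS = ("id", "username", "email")  # assumed schema of the user resource

def put(stored, body):
    """Full replacement: any field missing from the body becomes None."""
    return {name: body.get(name) for name in FIELDS}

def patch(stored, body):
    """Partial update: only the fields present in the body are touched."""
    return {**stored, **body}

# Hypothetical starting state; the old email value is invented for the demo.
user = {"id": 1, "username": "skwee357", "email": "old#domain.example"}
change = {"id": 1, "email": "newemail#domain.example"}

after_put = put(user, change)
after_patch = patch(user, change)

print(after_put["username"])    # None
print(after_patch["username"])  # skwee357
```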
The following take on PATCH is a little different from what I have seen before.
http://williamdurand.fr/2014/02/14/please-do-not-patch-like-an-idiot/
The difference between the PUT and PATCH requests is reflected in the
way the server processes the enclosed entity to modify the resource
identified by the Request-URI. In a PUT request, the enclosed entity
is considered to be a modified version of the resource stored on the
origin server, and the client is requesting that the stored version be
replaced. With PATCH, however, the enclosed entity contains a set of
instructions describing how a resource currently residing on the
origin server should be modified to produce a new version. The PATCH
method affects the resource identified by the Request-URI, and it also
MAY have side effects on other resources; i.e., new resources may be
created, or existing ones modified, by the application of a PATCH.
PATCH /users/123
[
{ "op": "replace", "path": "/email", "value": "new.email#example.org" }
]
You are more or less treating the PATCH as a way to update a field. So instead of sending over the partial object, you're sending over the operation. i.e. Replace email with value.
The article ends with this.
It is worth mentioning that PATCH is not really designed for truly REST
APIs, as Fielding's dissertation does not define any way to partially
modify resources. But, Roy Fielding himself said that PATCH was
something [he] created for the initial HTTP/1.1 proposal because
partial PUT is never RESTful. Sure you are not transferring a complete
representation, but REST does not require representations to be
complete anyway.
Now, I don't know if I particularly agree with the article as many commentators point out. Sending over a partial representation can easily be a description of the changes.
For me, I am mixed on using PATCH. For the most part, I will treat PUT as a PATCH since the only real difference I have noticed so far is that PUT "should" set missing values to null. It may not be the 'most correct' way to do it, but good luck coding perfect.
tl;dr version
POST: is used to create an entity
PUT: is used to update/replace an existing entity where you must send the entire representation of the entity as you wish for it to be stored
PATCH: is used to update an entity where you send only the fields that need to be updated
The difference between PUT and PATCH is that:
PUT is required to be idempotent. In order to achieve that, you have to put the entire complete resource in the request body.
PATCH can be non-idempotent. Which implies it can also be idempotent in some cases, such as the cases you described.
PATCH requires some "patch language" to tell the server how to modify the resource. The caller and the server need to define some "operations" such as "add", "replace", "delete". For example:
GET /contacts/1
{
"id": 1,
"name": "Sam Kwee",
"email": "skwee357#olddomain.example",
"state": "NY",
"zip": "10001"
}
PATCH /contacts/1
[{"operation": "add", "field": "address", "value": "123 main street"},
{"operation": "replace", "field": "email", "value": "abc#myemail.example"},
{"operation": "delete", "field": "zip"}]
GET /contacts/1
{
"id": 1,
"name": "Sam Kwee",
"email": "abc#myemail.example",
"state": "NY",
"address": "123 main street"
}
Instead of using explicit "operation" fields, the patch language can make it implicit by defining conventions like:
in the PATCH request body:
The existence of a field means "replace" or "add" that field.
If the value of a field is null, it means delete that field.
With the above convention, the PATCH in the example can take the following form:
PATCH /contacts/1
{
"address": "123 main street",
"email": "abc#myemail.example",
"zip": null
}
Which looks more concise and user-friendly. But the users need to be aware of the underlying convention.
With the operations I mentioned above, the PATCH is still idempotent. But if you define operations like: "increment" or "append", you can easily see it won't be idempotent anymore.
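Here is a small Python sketch of the implicit convention, where None plays the role of JSON null. The convention is close in spirit to JSON Merge Patch (RFC 7396). The function names merge_patch and increment, and the "visits" counter, are invented for the example.

```python
def merge_patch(resource, patch):
    """Present field = add/replace; a None value = delete that field."""
    result = dict(resource)
    for field, value in patch.items():
        if value is None:
            result.pop(field, None)  # deleting a missing field is a no-op
        else:
            result[field] = value
    return result

def increment(resource, field, amount=1):
    """A hypothetical non-idempotent operation: each call changes the value."""
    result = dict(resource)
    result[field] = result.get(field, 0) + amount
    return result

contact = {
    "id": 1,
    "name": "Sam Kwee",
    "email": "skwee357#olddomain.example",
    "state": "NY",
    "zip": "10001",
}
patch = {"address": "123 main street", "email": "abc#myemail.example", "zip": None}

once = merge_patch(contact, patch)
twice = merge_patch(once, patch)
print(once == twice)  # True: add/replace/delete are idempotent

counter = {"visits": 0}
inc_once = increment(counter, "visits")
inc_twice = increment(inc_once, "visits")
print(inc_once == inc_twice)  # False: increment is not
```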
In my humble opinion, idempotence means:
PUT:
I send a complete resource definition, so the resulting resource state is exactly as defined by the PUT params. Each and every time I update the resource with the same PUT params, the resulting state is exactly the same.
PATCH:
I send only part of the resource definition, so it might happen that other users are updating this resource's OTHER parameters in the meantime. Consequently, consecutive patches with the same parameters and values might result in a different resource state. For instance:
Presume an object defined as follows:
CAR:
- color: black,
- type: sedan,
- seats: 5
I patch it with:
{color: 'red'}
The resulting object is:
CAR:
- color: red,
- type: sedan,
- seats: 5
Then, some other user patches this car with:
{type: 'hatchback'}
so, the resulting object is:
CAR:
- color: red,
- type: hatchback,
- seats: 5
Now, if I patch this object again with:
{color: 'red'}
the resulting object is:
CAR:
- color: red,
- type: hatchback,
- seats: 5
Which is DIFFERENT from what I got previously!
This is why PATCH is not idempotent while PUT is idempotent.
I might be a bit off topic considering your questions about idempotency, but I'd like you to consider evolvability.
Consider you have the following element :
{
"username": "skwee357",
"email": "skwee357#domain.example"
}
If you modify with PUT, you have to give the whole representation of the object :
PUT /users/1
{
"username": "skwee357",
"email": "skwee357#newdomain.example"
}
Now you update the schema, and add a field phone :
PUT /users/1
{
"username": "skwee357",
"email": "skwee357#newdomain.example",
"phone": "123-456-7890"
}
Now, if you update it again with PUT the same way as before (without phone), it will set phone to null. To avoid that bad side effect, you have to update all the components that modify elements every time you update your schema. Lame.
By using PATCH, you do not have this problem, because PATCH only updates the given fields. So, in my opinion, you should use PATCH to modify an element (whether it is really idempotent or not). That comes from real-life experience.
Everyone else has answered the PUT vs PATCH. I was just going to answer what part of the title of the original question asks: "... in REST API real life scenarios". In the real world, this happened to me with an internet application that had a RESTful server and a relational database with a Customer table that was "wide" (about 40 columns). I mistakenly used PUT, assuming it was like a SQL Update command, and had not filled out all the columns. Problems: 1) some columns were optional (so blank was a valid answer), 2) many columns rarely changed, 3) some columns the user was not allowed to change, such as the time stamp of Last Purchase Date, 4) one column was a free-form text "Comments" column that users diligently filled with half-page customer service comments, like the spouse's name to ask about OR the usual order, 5) I was working on an internet app at the time and there was worry about packet size.
The disadvantage of PUT is that it forces you to send a large packet of info (all columns including the entire Comments column, even though only a few things changed) AND multi-user issue of 2+ users editing the same customer simultaneously (so last one to press Update wins). The disadvantage of PATCH is that you have to keep track on the view/screen side of what changed and have some intelligence to send only the parts that changed. Patch's multi-user issue is limited to editing the same column(s) of same customer.
Let me quote and comment more closely on RFC 7231 section 4.2.2, already cited in earlier comments:
A request method is considered "idempotent" if the intended effect on
the server of multiple identical requests with that method is the same
as the effect for a single such request. Of the request methods
defined by this specification, PUT, DELETE, and safe request methods
are idempotent.
(...)
Idempotent methods are distinguished because the request can be
repeated automatically if a communication failure occurs before the
client is able to read the server's response. For example, if a
client sends a PUT request and the underlying connection is closed
before any response is received, then the client can establish a new
connection and retry the idempotent request. It knows that repeating
the request will have the same intended effect, even if the original
request succeeded, though the response might differ.
So, what should be "the same" after a repeated request of an idempotent method? Not the server state, nor the server response, but the intended effect. In particular, the method should be idempotent "from the point of view of the client". Now, I think that this point of view shows that the last example in Dan Lowe's answer, which I don't want to plagiarize here, indeed shows that a PATCH request can be non-idempotent (in a more natural way than the example in Jason Hoetger's answer).
Indeed, let's make the example slightly more precise by making explicit one possible intent for the first client. Let's say that this client goes through the list of users with the aim of checking their emails and zip codes. He starts with user 1, notices that the zip is right but the email is wrong. He decides to correct this with a PATCH request, which is fully legitimate, and sends only
PATCH /users/1
{"email": "skwee357#newdomain.example"}
since this is the only correction. Now, the request fails because of some network issue and is re-submitted automatically a couple of hours later. In the meantime, another client has (erroneously) modified the zip of user 1. Then, sending the same PATCH request a second time does not achieve the intended effect of the client, since we end up with an incorrect zip. Hence the method is not idempotent in the sense of the RFC.
If instead the client uses a PUT request to correct the email, sending to the server all properties of user 1 along with the email, his intended effect will be achieved even if the request has to be re-sent later and user 1 has been modified in the meantime, since the second PUT request will overwrite all changes since the first request.
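The retry scenario can be sketched in Python as follows. This is only an illustration under simple assumptions: the resource is a dict, put replaces it wholesale, patch merges fields, and the concrete zip values and the wrong email are invented for the demo.

```python
def put(stored, body):
    """Replace the stored resource with the full representation sent."""
    return dict(body)

def patch(stored, body):
    """Merge the fields sent into the stored resource."""
    return {**stored, **body}

# State the first client saw: zip is right, email is wrong (values invented).
seen_by_client = {"email": "skwee357#wrongdomain.example", "zip": "10001"}

# The first attempt is lost; meanwhile another client breaks the zip.
server_state = patch(seen_by_client, {"zip": "99999"})

# Automatic retry of the original requests, hours later.
after_patch_retry = patch(server_state, {"email": "skwee357#newdomain.example"})
after_put_retry = put(
    server_state, {"email": "skwee357#newdomain.example", "zip": "10001"}
)

print(after_patch_retry["zip"])  # 99999 (the erroneous zip survives)
print(after_put_retry["zip"])    # 10001 (the client's intended state)
```

Note that the PUT retry gets the client's intended state only by discarding the other client's change, which is the "last write wins" trade-off.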
To conclude the discussion on the idempotency, I should note that one can define idempotency in the REST context in two ways. Let's first formalize a few things:
A resource is a function whose domain is the class of strings. In other words, a resource is a subset of String × Any, where all the keys are unique. Let's call the class of the resources Res.
A REST operation on resources, is a function f(x: Res, y: Res): Res. Two examples of REST operations are:
PUT(x: Res, y: Res): Res = x, and
PATCH(x: Res, y: Res): Res, which works like PATCH({a: 2}, {a: 1, b: 3}) == {a: 2, b: 3}.
(This definition is specifically designed to argue about PUT and PATCH, and e.g. doesn't make much sense for GET and POST, as it doesn't care about persistence).
Now, by fixing x: Res (informatically speaking, using currying), PUT(x: Res) and PATCH(x: Res) are univariate functions of type Res → Res.
A function g: Res → Res is called globally idempotent, when g ○ g == g, i.e. for any y: Res, g(g(y)) = g(y).
Let x: Res be a resource, and k = x.keys. A function g = f(x) is called left idempotent, when for each y: Res, we have g(g(y))|ₖ == g(y)|ₖ. It basically means that the result should be the same, if we look at the applied keys.
So, PATCH(x) is not globally idempotent, but is left idempotent. And left idempotency is the thing that matters here: if we patch a few keys of the resource, we want those keys to be same if we patch it again, and we don't care about the rest of the resource.
And when the RFC talks about PATCH not being idempotent, it is talking about global idempotency. Well, it's good that it's not globally idempotent, otherwise it would have been a broken operation.
Now, Jason Hoetger's answer is trying to demonstrate that PATCH is not even left idempotent, but it's breaking too many things to do so:
First of all, PATCH is used on a set, although PATCH is defined to work on maps / dictionaries / key-value objects.
If someone really wants to apply PATCH to sets, then there is a natural translation that should be used: t: Set<T> → Map<T, Boolean>, defined with x in A iff t(A)(x) == True. Using this definition, patching is left idempotent.
In the example, this translation was not used, instead, the PATCH works like a POST. First of all, why is an ID generated for the object? And when is it generated? If the object is first compared to the elements of the set, and if no matching object is found, then the ID is generated, then again the program should work differently ({id: 1, email: "me#site.example"} must match with {email: "me#site.example"}, otherwise the program is always broken and the PATCH cannot possibly patch). If the ID is generated before checking against the set, again the program is broken.
One can make examples of PUT being non-idempotent by breaking half of the things that are broken in this example:
An example with generated additional features would be versioning. One may keep record of the number of changes on a single object. In this case, PUT is not idempotent: PUT /user/12 {email: "me#site.example"} results in {email: "...", version: 1} the first time, and {email: "...", version: 2} the second time.
Messing with the IDs, one may generate a new ID every time the object is updated, resulting in a non-idempotent PUT.
All the above examples are natural examples that one may encounter.
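The versioning case can be sketched in a few lines of Python. The class VersionedStore and its put method are hypothetical and only mimic a server that stamps a change counter on every write.

```python
class VersionedStore:
    """Toy store that bumps a version counter on every PUT (assumed behaviour)."""

    def __init__(self):
        self._data = {}

    def put(self, key, body):
        previous = self._data.get(key)
        version = previous["version"] + 1 if previous else 1
        # The stored state depends on how many times PUT was called.
        self._data[key] = {**body, "version": version}
        return self._data[key]

store = VersionedStore()
first = store.put("/user/12", {"email": "me#site.example"})
second = store.put("/user/12", {"email": "me#site.example"})

print(first["version"], second["version"])  # 1 2
```

Two identical PUT requests leave the resource in two different states, so this PUT is not idempotent in the strict "same server state" sense.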
My final point is that PATCH should not be globally idempotent, otherwise it won't give you the desired effect. You want to change the email address of your user, without touching the rest of the information, and you don't want to overwrite the changes of another party accessing the same resource.
A very nice explanation is here-
https://blog.segunolalive.com/posts/restful-api-design-%E2%80%94-put-vs-patch/#:~:text=RFC%205789,not%20required%20to%20be%20idempotent.
A Normal Payload-
// House on plot 1
{
address: 'plot 1',
owner: 'segun',
type: 'duplex',
color: 'green',
rooms: '5',
kitchens: '1',
windows: 20
}
PUT For Updated-
// PUT request payload to update windows of House on plot 1
{
address: 'plot 1',
owner: 'segun',
type: 'duplex',
color: 'green',
rooms: '5',
kitchens: '1',
windows: 21
}
Note: In above payload we are trying to update windows from 20 to 21.
Now see the PATCH payload-
// Patch request payload to update windows on the House
{
windows: 21
}
Since PATCH is not idempotent, failed requests are not automatically re-attempted on the network. Also, if a PATCH request is made to a non-existent url e.g attempting to replace the front door of a non-existent building, it should simply fail without creating a new resource unlike PUT, which would create a new one using the payload. Come to think of it, it’ll be odd having a lone door at a house address.
PUT method is ideal to update data in tabular format like in a relational db or entity like storage. Based on use case it can be used to update data partially or replace the entity as a whole. This will always be idempotent.
PATCH method can be used to update (or restructure) data in JSON or XML format which is stored in a local file system or a NoSQL database. This can be performed by mentioning the action/operation to be performed in the request, like adding/removing/moving a key-value pair in a JSON object. The remove operation can be used to delete a key-value pair, and a duplicate request will result in an error because the key was deleted earlier, making it a non-idempotent method. Refer to RFC 6902 for the JSON Patch request format.
This article has detailed information related to the PATCH method.
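A minimal Python sketch of the repeated "remove" case follows. It is not a JSON Patch implementation: apply_remove is a hypothetical stand-in that only handles top-level keys, and it relies on the RFC 6902 rule that the target of a remove must exist.

```python
def apply_remove(document, key):
    """Remove a top-level key; fail if it is missing, as RFC 6902 requires."""
    if key not in document:
        raise KeyError(f"cannot remove missing key: {key}")
    result = dict(document)
    del result[key]
    return result

document = {"name": "Sam Kwee", "zip": "10001"}

once = apply_remove(document, "zip")  # first request succeeds

try:
    apply_remove(once, "zip")         # identical second request
    second_failed = False
except KeyError:
    second_failed = True

print(once)           # {'name': 'Sam Kwee'}
print(second_failed)  # True
```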
I will try to summarize in layman terms what I understood (maybe it helps)
Patch is not fully idempotent (it can be in an ideal situation where nobody changes another field of your entity).
In a non-ideal (real-life) situation, somebody modifies another field of your object with another PATCH operation, and then both operations are not idempotent (meaning that the resource you are both modifying comes back "wrong" from one point of view or the other)
So you cannot call it Idempotent if it does not cover 100% of the situations.
Maybe this is not that important to some, but to others it is
One additional point I just want to add is that a PATCH request uses less bandwidth compared to a PUT request, since just a part of the data is sent and not the whole entity. So just use a PATCH request for updates of specific records (like 1-3 records), and a PUT request for updating a larger amount of data. That is it, don't think too much or worry about it too much.