I am trying to figure out the best way of adding multiple users to a group via a REST API.
Right now, I am thinking this is the best way of adding a single user at a time:
PUT /groups/123/{userID}
Then, to remove the user from the group:
DELETE /groups/123/{userID}
But how would I add multiple users to the group at the same time? Would this be the best way?
PUT /groups/123
Content body as an array:
[
"user1",
"user2",
"user3"
]
...and to remove the users from the group, I would do the same thing via a DELETE request.
Is there anything "wrong" with this setup, or would there be a better, more "industry standard" way of doing this?
How to PUT multiple resources via a REST API
You don't do that - each request in HTTP has one-and-exactly-one target resource; the semantics of the PUT request are that the message body of the request is a replacement representation for the resource.
Resources are generalizations of documents.
It's entirely reasonable to have a document that is a representation of the members of some group.
PUT /groups/123
[
"user1",
"user2",
"user3"
]
What this request means is "replace your copy of /groups/123 with this copy". It's purely a document editing command.
The useful work, like actually making changes to the master group membership list, is a side effect of making your copy of the document look like the provided copy. See Webber 2011.
In this style, "removing" a user from the group would look like another edit to the /groups/123 document:
PUT /groups/123
[
"user1",
"user3"
]
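A minimal sketch of the "side effect" idea, in Python: the server treats the PUT body as the new copy of the document and derives the domain work (who to add, who to remove) by comparing it with its current copy. The function name apply_membership_put and the in-memory set are assumptions for illustration, not part of any framework.

```python
import json

def apply_membership_put(current_members, request_body):
    """Replace the stored member list with the PUT body.

    Returns (new_members, added, removed); the added/removed sets are the
    domain-level side effects of the document edit.
    """
    new_members = set(json.loads(request_body))
    added = new_members - current_members
    removed = current_members - new_members
    return new_members, added, removed

# PUT /groups/123 with ["user1", "user3"] against a stored copy of three users
members, added, removed = apply_membership_put(
    {"user1", "user2", "user3"}, '["user1", "user3"]'
)
print(sorted(members), sorted(added), sorted(removed))
```

Because the request carries the whole document, sending it twice leaves the group in the same state, which is what makes PUT idempotent here.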
Of course, remote authoring semantics aren't required; you could instead do:
POST /groups/123
Please remove user1
or
POST /groups/123
action=addUser&id=user4
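As a rough sketch of how a server might process the form-encoded variant, assuming the field names action and id from the request above; the removeUser action and the handler name are hypothetical additions for illustration.

```python
from urllib.parse import parse_qs

def handle_group_post(members, form_body):
    """Apply a form-encoded command to a copy of the member set."""
    fields = parse_qs(form_body)
    action = fields["action"][0]
    user_id = fields["id"][0]
    updated = set(members)
    if action == "addUser":
        updated.add(user_id)
    elif action == "removeUser":  # hypothetical counterpart to addUser
        updated.discard(user_id)
    else:
        raise ValueError(f"unsupported action: {action}")
    return updated

result = handle_group_post({"user1", "user3"}, "action=addUser&id=user4")
print(sorted(result))
```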
to remove the users from the group, I would do the same thing via a DELETE request
Careful - DELETE (like PUT) means something specific in HTTP, and that may not be what you mean.
Relatively few resources allow the DELETE method
The semantics of DELETE belong firmly in the "transfer documents over a network" domain, not in the "manage users and groups" domain.
I am looking at defining a REST API using HATEOAS. In particular, I find very interesting the concept of indicating for a given resource the actions that are available right now.
Some of the HATEOAS specifications include too much overhead for my needs, so I was looking at the HAL specification, as I find it very concise and practical:
{
  "_links": {
    "self": { "href": "/orders/523" },
    "warehouse": { "href": "/warehouse/56" },
    "invoice": { "href": "/invoices/873" }
  },
  "currency": "USD",
  "status": "shipped",
  "total": 10.20
}
However, links in HAL only contain a list of related resources, but not the available actions on them. As per the example above, am I allowed to cancel the order now, or not anymore? Some HAL examples solve this by using a specific URL for cancellations, and add a corresponding link in the response only if cancellation is possible:
"cancel": { "href": "/orders/523/cancel" }
But that is not very RESTful. Cancellations are not a resource. Cancellations are a DELETE of a resource, i.e.:
DELETE /orders/523
Is there a nice way to represent this with HAL, or should I use a different HATEOAS specification?
I am considering returning a "cancel" link with the same URL as self, but in this case the client would have to know that to cancel they have to use the DELETE verb, which is not really being described in the HATEOAS response.
"self": { "href": "/orders/523" },
"cancel": { "href": "/orders/523" }
Would this be the recommended approach as per HATEOAS / HAL? I understand HAL does not have any "method" parameter, and adding it myself would be against the HAL specification.
Some HAL examples solve this by using a specific URL for cancellations, and add a corresponding link in the response only if cancellation is possible
Yes. Just like web sites: if you want to alert the client to the possibility of reaching some other application state, you provide the client with a link, including the identifier for the resource involved.
But that is not very RESTful.
It may not be "RESTful", but it certainly conforms to the REST architectural style.
Cancellations are a DELETE of a resource, i.e.: DELETE /orders/523
You are confusing the actions on the domain model with the actions on the integration model. What a REST API does is guide the client through a protocol to achieve some end; it is not a mapping of domain semantics onto HTTP.
Jim Webber phrased it this way:
The web is not your domain; it's a document management system. All of the HTTP verbs apply to the document management domain. URIs do NOT map onto domain objects -- that violates encapsulation. Work (ex: issuing commands to the domain model) is a side effect of managing resources.
One of the REST constraints is the uniform interface; in the case of HTTP, it means that all resources understand methods in a uniform way; DELETE means the semantics described in RFC 7231, section 4.3.5.
In other words, if I send the request
OPTIONS /x/y/z/foobar ...
and the response includes DELETE in the Allow header, then I know what it means. The side effects in your domain? I don't know anything about the side effects.
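A client-side check along those lines might look like this in Python. It only inspects the value of the Allow header; the helper name allowed_methods is made up for this sketch, and the check says nothing about what a DELETE would do in your domain.

```python
def allowed_methods(allow_header):
    """Parse an Allow header value such as 'GET, HEAD, DELETE' into a set."""
    return {m.strip() for m in allow_header.split(",") if m.strip()}

methods = allowed_methods("GET, HEAD, OPTIONS, DELETE")
print("DELETE" in methods)
```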
In the definition of DELETE, note the following
Relatively few resources allow the DELETE method -- its primary use is for remote authoring environments, where the user has some direction regarding its effect.
Anyway, you aren't really asking about DELETE, but about HAL
Is there a nice way to represent this with HAL, or should I use a different HATEOAS specification?
From what I can tell, the official way to do it is to document it with the link relation. In other words, instead of using "cancel" as the link relation, you use something like
https://www.rfc-editor.org/rfc/rfc5023#section-5.4.2
And then your consumers, if they want to discover what a link is for, can follow the relation to learn what is going on.
HAL Discuss: Why No Methods? has a lot of good information.
I like Mike Kelly's summary:
The idea is the available methods can be conveyed via the link
relation documentation and don't need to be in the json messages.
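As a rough sketch of that idea in Python: the client carries a table built from reading the link relation documentation, and the HAL document only tells it which relations are currently present. The relation URL https://example.com/rels/cancel and the table contents are invented for illustration; HAL itself does not define them.

```python
# What the client learned from the (hypothetical) relation documentation.
RELATION_DOCS = {
    "https://example.com/rels/cancel": {"method": "DELETE"},
}

def available_actions(hal_document):
    """Return (method, href) pairs for links whose relation the client knows."""
    actions = []
    for rel, link in hal_document.get("_links", {}).items():
        doc = RELATION_DOCS.get(rel)
        if doc is not None:
            actions.append((doc["method"], link["href"]))
    return actions

order = {
    "_links": {
        "self": {"href": "/orders/523"},
        "https://example.com/rels/cancel": {"href": "/orders/523"},
    },
    "status": "processing",
}
print(available_actions(order))
```

If the server omits the cancel relation (say, once the order has shipped), the function returns an empty list and the client simply has nothing to offer.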
According to this article from LosTechies, from a CQRS perspective it's accepted to use URLs such as /orders/<id>/<command> and call these with PUT requests. So it's OK to use "cancel": { "href": "/orders/523/cancel" }.
However, if you absolutely want to use DELETE and you only use command links to modify your resources (i.e. /orders/<id>/<command>) as the article proposes, why can't you just add a link such as "cancel": { "href": "/orders/523" } and deduce the HTTP verb?
I mean, in practice REST APIs over HTTP mostly use five verbs (GET, POST, PUT, PATCH and DELETE). We can't use POST on a URL such as /<resource>/<id>, GET is already defined as the "self" relation, we mentioned above that modifications (PUT) will be handled by command links (i.e. /<resource>/<id>/<command>), and because we use command links there is no need to use PATCH. After that, the only option left is DELETE.
It's not perfect, but it works and it doesn't break any convention.
I am using Symfony2 and its ACL security component in my project. I want to use the ACL information in the frontend framework for show/hide elements.
Would it be a terrible idea security-wise to attach formatted ACL information for the current user to the current object?
Let's say the user has permission to VIEW and EDIT the object, so the JSON data would look like this:
{
"id": 1,
"name": "Product",
"_permissions": ["VIEW", "EDIT"]
}
What security holes could this solution potentially cause?
I don't think there is a security problem. You will agree that having the id and type of the object in your data cannot be a problem :-). So the only things we need to look at are the VIEW and EDIT attributes. These values are not secret; they are part of the Symfony documentation. So it's only about the information whether you have these permissions for that object.
If you return that JSON together with your data, I think the VIEW attribute carries no additional information, as the data would not be returned at all if you did not have the VIEW permission. So the only information you provide here is whether the user can edit that object.
If you think about it, you will agree that you would reveal the same information if you decided on the server side, based on permissions, whether to add an edit link for that object to an HTML page.
So if you call isGranted("EDIT", $product) to decide whether to return EDIT as part of your JSON, I cannot see any security hole there.
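The reasoning above can be sketched in Python. This is not the Symfony API: is_granted here is a stand-in callback, and the ACL table and function names are assumptions for illustration. The point is that the permission list is computed on the server for the current user only, and nothing is returned when VIEW is not granted.

```python
def serialize_with_permissions(obj, user, is_granted, candidates=("VIEW", "EDIT")):
    """Return obj's data plus the permissions the user holds on it.

    Returns None when VIEW is not granted, so nothing is leaked.
    """
    if not is_granted(user, "VIEW", obj):
        return None
    data = dict(obj)
    data["_permissions"] = [p for p in candidates if is_granted(user, p, obj)]
    return data

# Toy ACL standing in for the real security component.
ACL = {("alice", 1): {"VIEW", "EDIT"}, ("bob", 1): {"VIEW"}}

def is_granted(user, permission, obj):
    return permission in ACL.get((user, obj["id"]), set())

product = {"id": 1, "name": "Product"}
print(serialize_with_permissions(product, "alice", is_granted))
print(serialize_with_permissions(product, "bob", is_granted))
```

The frontend can use _permissions to show or hide elements, but the server still has to check the permission again when an edit request actually arrives.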
I have found very little detail about best practices when responding to PUT or POST commands with a REST API.
Assume the example is that the API is for a list of movies in a movie store and has the following:
GET api/Movies
GET api/Movies/{id}
PUT api/Movies/
PUT api/Movies/{id}
POST api/Movies/
POST api/Movies/{id}
Where you can PUT or POST either single or collections. I included both because I do not want to get into a discussion about PUT vs. POST, and would like an answer on best practices, particularly in response to errors.
If working on a single item I can return HTTP status codes and a response easily, but what should be done when handling POST and PUT of collections, especially in a non-idempotent method?
My thought for returning a package would be as follows:
{
  "version": "1.0",
  "status": 200,
  "errors": [
    // List of object id's, and errors
  ],
  "data": [
    // List of movies POSTed or PUT
  ]
}
With the errors being generated for each specific ID that failed, but I'm not sure it passes the smell test in regards to overall status and HttpStatus. Should I return another status if a portion of the collection fails or a single entity fails?
Generally in REST an operation needs to completely succeed or completely fail. Operations like this should be atomic and idempotent.
So what you're asking is simply outside of what REST can do for you. From the horse's mouth:
"If you find yourself in need of a batch operation, then most likely you just haven’t defined enough resources."
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven#comment-743
So what does that quote mean? It doesn't mean that you can't have a resource representing the same data as several other resources (e.g.: your collection), but if you are using PUT to update it, you are still 100% replacing its contents. Not partially.
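A minimal Python sketch of an all-or-nothing PUT on the collection, assuming an in-memory dict as the store and a made-up validation rule (each movie needs an id and a title). The status codes and the shape of the error body are illustrative choices, not a standard.

```python
def put_movies(store, movies):
    """Replace the whole collection, or change nothing and report why."""
    errors = [
        {"id": m.get("id"), "error": "id and title are required"}
        for m in movies
        if "id" not in m or not m.get("title")
    ]
    if errors:
        # Reject the request as a whole; the stored collection is untouched.
        return 400, {"errors": errors}
    store.clear()
    store.update({m["id"]: m for m in movies})
    return 200, {"data": movies}

store = {1: {"id": 1, "title": "Jaws"}}
status, body = put_movies(store, [{"id": 2, "title": "Alien"}, {"id": 3}])
print(status, sorted(store))
```

Validating everything before touching the store is what keeps the operation atomic: a single bad item means the whole replacement is refused.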
I'm adding a new REST service to our API, and wanted to canvass a few opinions on the best REST API design. The service is used to retrieve the user's email address in case they have forgotten what their username is. The service requires three parameters:
Account number (this is a number that is on their printed statement)
Surname
Date of birth
If we find a match for these three pieces of info, the service returns JSON containing a masked version of the user's registered email address (e.g. jo******@gmail.com) so that the UI can present a message something like "We are going to send your username to j******@g******.com. Is that OK?"
Note that the service doesn't actually change anything within their account or send an email (it is purely fetching info so that the user can confirm the next step), so it seems to me that a GET request is the way to go. The question is how to represent it? It strikes me that /users is a reasonable place to start(?), but then what? Using the URL path, I might end up with something like:
/users/accountEmail/accountNumber/123456/surname/Smith/dateOfBirth/25-12-1970
This seems icky as, ordinarily, our /users URLs contain the username (eg. /users/john/transactions), but clearly for this API call we don't actually know who the user is yet. I'm also not sure it really indicates what the service actually does. Alternatively, I could use URL query params:
/users/accountEmail?accountNumber=123456&surname=Smith&dateOfBirth=25-12-1970
This feels a bit more natural, but I'm unsure that stringing all those input parameters into the URL is a good idea. Then again, maybe /users is the wrong noun. Maybe it should be like:
/accountEmail/...
Having said all that, maybe given the service's idempotence, I could actually use a PUT request and encode the parameters into the HTTP body. Not sure about using PUT for read-only requests though... it seems a bit like heading down the RPC path. The one nice thing about the PUT approach though is that it doesn't log this relatively sensitive user input into any web server logs.
I'd be interested in opinions or hearing what other API developers did in a similar situation. Thanks.
First of all, don't use GET with sensitive information in URL query parameters or in the URL path, because that information can be stored in web server access logs, the browser's history, HTTP proxy logs, etc.
Security-wise, you need to use POST in this case. As for the URL to use, I'm not so sure; probably something like /accounts, with all the parameters in the request body.
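A small Python sketch of what such a request could look like, built with the standard library and not actually sent. The host api.example.com, the JSON field names, and the helper name are assumptions; only the /accounts path comes from the suggestion above.

```python
import json
from urllib.request import Request

def build_lookup_request(base_url, account_number, surname, date_of_birth):
    """Build a POST whose sensitive inputs travel in the body, not the URL."""
    body = json.dumps({
        "accountNumber": account_number,
        "surname": surname,
        "dateOfBirth": date_of_birth,
    }).encode("utf-8")
    return Request(
        f"{base_url}/accounts",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

req = build_lookup_request("https://api.example.com", "123456", "Smith", "1970-12-25")
print(req.get_method(), req.full_url)
```

Since access logs typically record the request line and not the body, the account number, surname and date of birth stay out of them.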
Your second approach is what I would use. Logically, the URLs are built by following these steps.
Collection Resource of Users
The URL
GET /users
returns a list of all users including all user properties.
{
  "12345": {
    "surname": "Smith",
    "firstname": "John",
    "dateOfBirth": "1970-12-25",
    "accountEmail": "john.smith@example.com"
  },
  "6789": {
    "surname": "Hallow",
    "firstname": "Jane",
    "dateOfBirth": "1981-02-15",
    "accountEmail": "jane.hallowh@example.com"
  }
}
Sub-Collection Resource of User Emails
The URL
GET /users/accountEmail
returns a list of all emails for all users.
{
  "12345": {
    "accountEmail": "john.smith@example.com"
  },
  "6789": {
    "accountEmail": "jane.hallowh@example.com"
  }
}
Filter this Resource
The URL
GET /users/accountEmail?accountNumber=123456&surname=Smith&dateOfBirth=25-12-1970
returns a filtered list of emails for the users that match the query parameters.
{
  "12345": {
    "accountEmail": "john.smith@example.com"
  }
}
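The filtering step could be sketched in Python as follows. The in-memory USERS table and the function name are assumptions; for the sketch the account numbers and date format in the data have been made to match the query string exactly, since matching is done by plain string comparison.

```python
from urllib.parse import parse_qs

# Illustrative data, keyed by account number.
USERS = {
    "123456": {"surname": "Smith", "dateOfBirth": "25-12-1970",
               "accountEmail": "john.smith@example.com"},
    "6789": {"surname": "Hallow", "dateOfBirth": "15-02-1981",
             "accountEmail": "jane.hallowh@example.com"},
}

def filter_account_emails(users, query_string):
    """Return only the emails of users matching every query parameter."""
    params = {k: v[0] for k, v in parse_qs(query_string).items()}
    account_number = params.pop("accountNumber", None)
    matches = {}
    for number, user in users.items():
        if account_number is not None and number != account_number:
            continue
        if all(user.get(field) == value for field, value in params.items()):
            matches[number] = {"accountEmail": user["accountEmail"]}
    return matches

result = filter_account_emails(
    USERS, "accountNumber=123456&surname=Smith&dateOfBirth=25-12-1970"
)
print(result)
```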
I have a collection resource called Columns. A GET with Accept: application/json can't directly return a collection, so my representation needs to nest it in a property:-
{ "propertyName": [
{ "Id": "Column1", "Description": "Description 1" },
{ "Id": "Column2", "Description": "Description 2" }
]
}
Questions:
what is the best name to use for the identifier propertyName above? should it be:
d (i.e. is d an established convention or is it specific to some particular frameworks (MS WCF and MS ASP.NET AJAX)?)
results (i.e. is results an established convention or is it specific to some particular specifications (MS OData)?)
Columns (i.e. the top level property should have a clear name and it helps to disambiguate my usage of generic application/json as the Media Type)
NB I feel pretty comfortable that there should be something wrapping it, and as pointed out by @tuespetre, XML or any other representation would force you to wrap it to some degree anyway
when PUTting the content back, should the same wrapping in said property be retained [given that it's not actually necessary for security reasons and perhaps conventional JSON usage idioms might be to drop such nesting for PUT and POST given that they're not necessary to guard against scripting attacks] ?
my gut tells me it should be symmetric as for every other representation but there may be prior art for dropping the d/results [assuming that's the answer to part 1]
... Or should a PUT-back (or POST) drop the need for a wrapping property and just go with:-
[
{ "Id": "Column1", "Description": "Description 1" },
{ "Id": "Column2", "Description": "Description 2" }
]
Where would any root-level metadata go if one wished to add that?
How/would a person crafting a POST Just Know that it needs to be symmetric?
EDIT: I'm specifically interested in an answer with a reasoned rationale that specifically takes into account the impacts on client usage with JSON. For example, HAL takes care to define a binding that makes sense for both target representations.
EDIT 2: Not accepted yet, why? The answers so far don't have citations or anything that makes them stand out over me doing a search and picking something out of the top 20 hits that seems reasonable. Am I just too picky? I guess I am (or more likely I just can't ask questions properly :D). It's a bit mad that a week and 3 days, even with an (admittedly measly) bonus on, still only gets 123 views (from which 3 answers ain't bad)
Updated Answer
Addressing your questions (as opposed to going off on a bit of a tangent in my original answer :D), here are my opinions:
1) My main opinion on this is that I dislike d. As a client consuming the API I would find it confusing. What does it even stand for anyway? data?
The other options look good. Columns is nice because it mirrors back to the user what they requested.
If you are doing pagination, then another option might be something like page or slice as it makes it clear to the client, that they are not receiving the entire contents of the collection.
{
"offset": 0,
"limit": 100,
"page" : [
...
]
}
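A minimal Python sketch of building such an envelope, assuming offset/limit paging as in the example above; the function name paginate is made up for illustration.

```python
def paginate(items, offset=0, limit=100):
    """Wrap one slice of a collection in an envelope naming the slice."""
    return {
        "offset": offset,
        "limit": limit,
        "page": items[offset:offset + limit],
    }

columns = [{"Id": f"Column{i}"} for i in range(1, 6)]
envelope = paginate(columns, offset=2, limit=2)
print(envelope)
```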
2) TBH, I don't think it makes that much difference which way you go for this, however if it was me, I probably wouldn't bother sending back the envelope, as I don't think there is any need (see below) and why make the request structure any more complicated than it needs to be?
I think POSTing back the envelope would be odd. POST should let you add items into the collection, so why would the client need to post the envelope to do this?
PUTting the envelope back could make sense from a RESTful standpoint as it could be seen as updating metadata associated with the collection as a whole. I think it is worth thinking about the sort of metadata you will be exposing in the envelope. All the stuff I think would fit well in this envelope (like pagination, aggregations, search facets and similar metadata) is read-only, so it doesn't make sense for the client to send this back to the server. If you find yourself with a lot of data in the envelope that the client is able to mutate, then it could be a sign to break that data out into a separate resource with the list as a sub-collection. Rubbish example:
/animals
{
"farmName": "farm",
"paging": {},
"animals": [
...
]
}
Could be broken up into:
/farm/1
{
"id": 1,
"farmName": "farm"
}
and
/farm/1/animals
{
"paging": {},
"animals": [
...
]
}
Note: Even with this split, you could still return both combined as a single response using something like Facebook's or LinkedIn's field expansion syntax. E.g. http://example.com/api/farm/1?field=animals.offset(0).limit(10)
In response to your question about how the client should know what the JSON payload they are POSTing and PUTting should look like: this should be reflected in your API documentation. I'm not sure if there is a better tool for this, but Swagger provides a spec that allows you to document what your request bodies should look like using JSON Schema - check out this page for how to define your schemas and this page for how to reference them as a parameter of type body. Unfortunately, Swagger doesn't visualise the request bodies in its fancy web UI yet, but it is open source, so you could always add something to do this.
Original Answer
Check out William's comment in the discussion thread on that page - he suggests a way to avoid the exploit altogether, which means you can safely use a JSON array at the root of your response and then you need not worry about either of your questions.
The exploit you link to relies on your API using a Cookie to authenticate a user's session - just use a query string parameter instead and you remove the exploit. It's probably worth doing this anyway since using Cookies for authentication on an API isn't very RESTful - some of your clients may not be web browsers and may not want to deal with cookies.
Why Does this fix work?
The exploit is a form of CSRF attack which relies on the attacker being able to add a script tag on his/her own page to a sensitive resource on your API.
<script src="http://mysite.com/api/columns"></script>
The victim's web browser will send all cookies stored under mysite.com to your server, and to your server this will look like a legitimate request - you will check the session_id cookie (or whatever your server-side framework calls the cookie) and see the user is authenticated. The request will look like this:
GET http://mysite.com/api/columns
Cookie: session_id=123456789;
If you change your API to ignore cookies and use a session_id query string parameter instead, the attacker will have no way of tricking the victim's web browser into sending the session_id to your API.
A valid request will now look like this:
GET http://mysite.com/api/columns?session_id=123456789
If using a JavaScript client to make the above request, you could get the session_id from a cookie. An attacker using JavaScript from another domain will not be able to do this, as you cannot get cookies for other domains (see here).
Now that we have fixed the issue and are ignoring session_id cookies, the script tag on the attacker's website will still send a similar request with a GET line like this:
GET http://mysite.com/api/columns
But your server will respond with a 403 Forbidden since the GET is missing the required session_id query string parameter.
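The server-side check described here might be sketched in Python like this. The VALID_SESSIONS set is a stand-in for a real session store and the function name is hypothetical; the cookies argument is accepted only to show that it is deliberately never read.

```python
from urllib.parse import parse_qs, urlsplit

VALID_SESSIONS = {"123456789"}  # stand-in for a real session store

def authorize(request_url, cookies):
    """Return an HTTP status code; cookies are deliberately ignored."""
    query = parse_qs(urlsplit(request_url).query)
    session_id = query.get("session_id", [None])[0]
    return 200 if session_id in VALID_SESSIONS else 403

# Forged request from a script tag: cookie present, no query parameter.
forged = authorize("http://mysite.com/api/columns", {"session_id": "123456789"})
# Legitimate request from your own client.
legit = authorize("http://mysite.com/api/columns?session_id=123456789", {})
print(forged, legit)
```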
What if I'm not authenticating users for this API?
If you are not authenticating users, then your data cannot be sensitive and anyone can call the URI. CSRF should be a non-issue since, with no authentication, even if you prevent CSRF attacks, an attacker could just call your API server-side to get your data and use it in any way he/she wants.
I would go for 'd' because it clearly separates the 'envelope' of your resource from its content. This would also make it easier for consumers to parse your responses, as opposed to 'guessing' the name of the wrapping property of a given resource before being able to access what it holds.
I think you're talking about two different things:
POST request should be sent in application/x-www-form-urlencoded. Your response should basically mirror a GET if you choose to include a representation of the newly created resource in your reply. (not mandatory in HTTP).
PUTs should definitely be symmetric to GETs. The purpose of a PUT request is to replace an existing resource representation with another. It just makes sense to have both requests share the same conventions, doesn't it?
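A small Python sketch of what symmetry between GET and PUT could mean in practice: the same envelope key is used to render the representation and to parse it when it is PUT back. The key name Columns and both function names are assumptions for illustration.

```python
import json

ENVELOPE_KEY = "Columns"  # illustrative choice of wrapping property

def render_columns(columns):
    """Body of the GET response: the collection wrapped in the envelope."""
    return json.dumps({ENVELOPE_KEY: columns})

def parse_columns_put(body):
    """Body of the PUT request: the same envelope is expected back."""
    document = json.loads(body)
    if not isinstance(document, dict) or ENVELOPE_KEY not in document:
        raise ValueError(f"expected an object with a '{ENVELOPE_KEY}' property")
    return document[ENVELOPE_KEY]

columns = [{"Id": "Column1", "Description": "Description 1"}]
round_tripped = parse_columns_put(render_columns(columns))
print(round_tripped == columns)
```

With this arrangement a client can GET the representation, edit it, and PUT the result back without reshaping it.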
Go with 'Columns' because it is semantically meaningful. It helps to think of how JSON and XML could mirror each other.
If you would PUT the collection back, you might as well use the same media type (syntax, format, what you will call it.)