REST API which needs multiple different resources?

I'm designing a REST API for running jobs on virtual machines in different domains (Active Directory domains; virtual machines with the same name can exist in different domains).
/domains
/domains/{dname}
/domains/{dname}/vms
/domains/{dname}/vms/{cname}
And for jobs, which will be stored in a database
/jobs
/jobs/{id}
Now I need to add a new API for the following user stories.
As a user, I want to run a job (just job definition, not the stored job) on an existing VM.
As a user, I want to run a job (just job definition, not the stored job) on VM named x, which may or may not exist. The system should create the VM if x doesn't exist.
How should the API be designed?
Approach 1:
PUT /domains/{dname}
{ "state": "running_job", "vm": "vm_name", "job_definition": { .... } }
Approach 2:
PUT /domains/{dname}/vms/{vm_name}
{ "state": "running_job", "job_definition": { .... } }
Approach 3:
PUT /jobs
{ "state": "running", "domain": "name", "vm": "vm_name", "job_definition": { .... } }
Approach 4: create a new resource, saying scheduler,
PUT /scheduler
{ "domain": "name", "vm": "vm_name", "job_definition": { .... } }
(what if I need to update some attributes of scheduler in the future?)
In general, how should a REST API URL be designed when it needs multiple resources?

How should the API be designed?
How would you design this on the web?
There would be an HTML form, right? With a bunch of input controls to collect information from the operator about which job to use, which VM to target, and so on. The operator would fill in the details and submit the form. The browser would then use the form to create the appropriate HTTP request to send to the server (the request-target being computed from the form metadata).
Since the server gets to decide what the request-target should be (benefits of using hypertext), it can choose any resource identifier it wants. In HTTP, a successful unsafe request happens to invalidate previously cached responses with the same request target, so one possible strategy is to consider which is the most important resource changed by successfully handling the request, and use that resource as the target.
In this specific case, we might have a resource that represents the job queue (ex /jobs), and what we are doing here is submitting a new entry in the queue, so we might expect
POST /jobs HTTP/1.1
....
If the server, in its handling of the request, also creates new resources for the specific job, then those would be indicated in the response:
HTTP/1.1 201 Created
Location: /jobs/931a8a02-1a87-485a-ba5b-dd6ee716c0ef
....
Could you instead just use PUT?
PUT /jobs/931a8a02-1a87-485a-ba5b-dd6ee716c0ef HTTP/1.1
???
Yes, if (a) the client knows what spelling to use for the request-target and (b) the client knows what the representation of the resource should look like.
Which unsafe HTTP method you use in the messages that trigger your business activities doesn't actually matter very much. You need to use the methods correctly (so that general-purpose HTTP connectors don't get misled).
In particular, the important thing to remember about PUT is that the request body should be a complete representation of the resource - in other words, the request body for a PUT should match the response body of a GET. Think "save file"; we've made local edits to our copy of a resource, and we send back a copy of the entire document.
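To make that concrete, here is a hedged sketch of such a PUT, reusing the field names from Approach 3 above (the exact representation is up to your API, not prescribed here):
PUT /jobs/931a8a02-1a87-485a-ba5b-dd6ee716c0ef HTTP/1.1
Content-Type: application/json
{ "state": "running", "domain": "name", "vm": "vm_name", "job_definition": { .... } }
A later GET of the same request-target would then be expected to return essentially this same document.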


How do applications know how to return the data an API requested?

Let's say I created Twilio. Below is their alerts API
https://www.twilio.com/docs/usage/monitor-alert:
# Download the helper library from https://www.twilio.com/docs/python/install
import os
from twilio.rest import Client

# Your Account Sid and Auth Token from twilio.com/console
# and set the environment variables. See http://twil.io/secure
account_sid = os.environ['TWILIO_ACCOUNT_SID']
auth_token = os.environ['TWILIO_AUTH_TOKEN']
client = Client(account_sid, auth_token)

alert = client.monitor.alerts('NOXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX').fetch()
print(alert.alert_text)
Example JSON API response:
{
"account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
"alert_text": "alert_text",
"api_version": "2010-04-01",
"date_created": "2015-07-30T20:00:00Z",
"date_generated": "2015-07-30T20:00:00Z",
"date_updated": "2015-07-30T20:00:00Z",
"error_code": "error_code",
"log_level": "log_level",
"more_info": "more_info",
"request_method": "GET",
"request_url": "http://www.example.com",
"request_variables": "request_variables",
"resource_sid": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"response_body": "response_body",
"response_headers": "response_headers",
"request_headers": "request_headers",
"sid": "NOXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
"url": "https://monitor.twilio.com/v1/Alerts/NOXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
"service_sid": "PNe2cd757cd5257b0217a447933a0290d2"
}
How does the application return the data above? How does it know to return the above data?
Do you have to create an object with the data above to respond to said API call? Does the programmer need to write a special function, one they would not normally create if they didn't want to provide API access to the application, to respond to the API call and return the data?
So for example, if I had a website that let people enter their first and last names into a database, I would write a function that inserts the names. If I then wanted to create an API so others could retrieve the names from the database, I would need to write another function, one I wouldn't otherwise create, to fetch the names for the API call. Either that, or the API call would invoke some function that gathers the data, builds an object with everything my API promises to return, and returns it to the caller.
Someone wrote it. Or maybe they wrote code that wrote it.
But probably someone just wrote it, in at least one iteration; and since everything is translated, compiled and interpreted nowadays, the distinction hardly matters.
Some of the best external API services evolved out of internal APIs, like AWS.
The hard part is not throwing together an API; it is testing it and ensuring it is robust, secure, privacy-capable and sufficiently powerful.
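To make the "someone wrote it" point concrete, here is a minimal sketch of the kind of handler someone has to write for the names example above, using a hypothetical Express app and an in-memory list standing in for the real database:
var express = require('express');
var app = express();

// Stand-in for the database of first and last names.
var names = [{ first: 'Ada', last: 'Lovelace' }];

app.get('/api/names', function (req, res) {
  // The shape of this response object is whatever the API documentation promises.
  res.json({ names: names });
});

app.listen(3000);
A framework can generate parts of this for you, but ultimately some code like this maps the request to your data and serializes the response.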

Restrict a Feathers service method to the current user for external calls but allow any queries for internal calls

I want to restrict external calls to a Feathers service method with associateCurrentUser.
I also want to allow the server to call this service method without restricting it.
The use case is that clients use a lock table through this service: all clients can see all locks, and occasionally the server should clear out abandoned rows in this table. Row abandonment can happen on network failures etc. When the server removes data, the normal Feathers remove events should be emitted to the clients.
I would imagine that this should be a mix of associateCurrentUser and disallow hooks but I can't even begin to experiment with this as I don't see how it would be put together.
How would one implement this, please?
Update:
I found the answer "User's permissions in feathers.js API" from Daff, which implies that if the hook's context.params.provider is null then the call is internal, otherwise external. Can anyone confirm if this is really so in all cases, please?
It seems to be so from my own tests but I don't know if there are any special cases out there that might come and bite me down the line.
If the call is external, params.provider will be set to the transport that has been used (currently either rest, socketio or primus; see the documentation for each transport).
If called internally on the server there is not really any magic: it will be whatever you pass as params. If you pass nothing, it will be undefined; if you pass (or merge with) hook.params in a hook, it will be the same as what the original method was called with.
// `params` is an empty object so `params.provider` will be `undefined`
app.service('messages').find({})
// `params.provider` will be `server`
app.service('messages').find({ provider: 'server' })
// `params.provider` will be whatever the original hook was called with
function(hook) {
hook.app.service('otherservice').find(hook.params);
}
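Building on that, a rough sketch of a hook that only applies the restriction to external calls might look like this (the userId field and the authentication details are placeholders; adapt them to what associateCurrentUser would set up in your app):
function restrictExternalToCurrentUser() {
  return function (hook) {
    // Internal server calls (e.g. the server clearing abandoned locks) have no
    // provider set and pass through untouched, so the normal events still fire.
    if (!hook.params.provider) {
      return hook;
    }
    // External calls get scoped to the authenticated user.
    hook.params.query = Object.assign({}, hook.params.query, {
      userId: hook.params.user._id // placeholder field name
    });
    return hook;
  };
}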

HTTP ReST: update large collections: better approach than JSON PATCH?

I am designing a web service to regularly receive updates to lists. At this point, a list can still be modeled as a single entity (/lists/myList) or an actual collection with many resources (/lists/myList/entries/<ID>). The lists are large (millions of entries) and the updates are small (often less than 10 changes).
The client will get web service URLs and lists to distribute, e.g.:
http://hostA/service/lists: list1, list2
http://hostB/service/lists: list2, list3
http://hostC/service/lists: list1, list3
It will then push lists and updates as configured. It is likely but undetermined if there is some database behind the web service URLs.
I have been researching and it seems an HTTP PATCH using the JSON Patch format is the best approach.
Context and examples:
Each list has an identifying name, a priority and millions of entries. Each entry has an ID (determined by the client) and several optional attributes. Example to create a list "requiredItems" with priority 1 and two list entries:
PUT /lists/requiredItems
Content-Type: application/json
{
"priority": 1,
"entries": {
"1": {
"color": "red",
"validUntil": "2016-06-29T08:45:00Z"
},
"2": {
"country": "US"
}
}
}
For updates, the client would first need to know what the list looks like now on the server. For this I would add a property "revision" to the list entity.
Then, I would query this attribute:
GET /lists/requiredItems?property=revision
Then the client would see what needs to change between the revision on the server and the latest revision known by the client and compose a JSON patch. Example:
PATCH /lists/requiredItems
Content-Type: application/json-patch+json
[
{ "op": "test", "path": "revision", "value": 3 },
{ "op": "add", "path": "entries/3", "value": { "color": "blue" } },
{ "op": "remove", "path": "entries/1" },
{ "op": "remove", "path": "entries/2/country" },
{ "op": "add", "path": "entries/2/color", "value": "green" },
{ "op": "replace", "path": "revision", "value": 10 }
]
Questions:
This approach has the drawback of slightly less client support due to the not-often-used HTTP verb PATCH. Is there a more compatible approach without sacrificing HTTP compatibility (idempotency et cetera)?
Modelling the individual list entries as separate resources and using PUT and DELETE (perhaps with ETag and/or If-Match) seems an option (PUT /lists/requiredItems/entries/3, DELETE /lists/requiredItems/entries/1, PUT /lists/requiredItems/revision), but how would I make sure all those operations are applied when the network drops in the middle of an update chain? Is an HTTP PATCH allowed to work on multiple resources?
Is there a better way to 'version' the lists, perhaps implicitly also improving how they are updated? Note that the client determines the revision number.
Is it correct to query the revision number with GET /lists/requiredItems?property=revision? Should it be a separate resource like /lists/requiredItems/revision? If it should be a separate resource, how would I update it atomically (i.e. the list and revision are both updated or both not updated)?
Would it work in JSON patch to first test the revision value to be 3 and then update it to 10 in the same patch?
This approach has the drawback of slightly less client support due to the not-often-used HTTP verb PATCH.
As far as I can tell, PATCH is really only appropriate if your server is acting like a dumb document store, where the action is literally "please update your copy of the document according to the following description".
So if your resource really just is a JSON document that describes a list with millions of entries, then JSON-Patch is a great answer.
But if you are expecting that the patch will, as a side effect, update an entity in your domain, then I'm suspicious.
Is a HTTP PATCH allowed to work on multiple resources?
RFC 5789
The PATCH method affects the resource identified by the Request-URI, and it also MAY have side effects on other resources
I'm not keen on querying the revision number; it doesn't seem to have any clear advantage over using an ETag/If-Match approach. Some obvious disadvantages - the caches between you and the client don't know that the list and the version number are related; a cache will happily tell a client that version 12 of the list is version 7, or vice versa.
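For illustration, a conditional PATCH guarded by an ETag might look something like this (the ETag value is made up):
PATCH /lists/requiredItems HTTP/1.1
If-Match: "abc123"
Content-Type: application/json-patch+json

[ ... operations ... ]
If the list was modified since the client last fetched it, the stored ETag no longer matches "abc123" and the server rejects the request with 412 Precondition Failed, so the client knows to re-fetch and recompute its patch.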
Answering my own question. My first bullet point may be opinion-based and, as has been pointed out, I've asked many questions in one post. Nevertheless, here's a summary of what was answered by others (VoiceOfUnreason) and my own additional research:
ETags are HTTP's resource 'hashes'. They can be combined with If-Match headers to have a versioning system. However, ETag-headers are normally not used to declare the ETag of a resource that is being created (PUT) or updated (POST/PATCH). The server storing the resource usually determines the ETag. I've not found anything explicitly forbidding this, but many implementations may assume that the server determines the ETag and get confused when it is provided with PUT or PATCH.
A separate revision resource is a valid alternative to ETags for versioning. This resource must be updated at the same time as the resource it is the revision of.
It is not semantically enforceable at the HTTP level to have commit/rollback transactions, unless the transaction itself is modelled as a REST resource, which would make things much more complicated.
However, some properties of PATCH allow it to be used for this:
An HTTP PATCH must be atomic and can operate on multiple resources. RFC 5789:
The server MUST apply the entire set of changes atomically and never provide (e.g., in response to a GET during this operation) a partially modified representation. If the entire patch document cannot be successfully applied, then the server MUST NOT apply any of the changes.
The PATCH method affects the resource identified by the Request-URI, and it also MAY have side effects on other resources; i.e., new resources may be created, or existing ones modified, by the application of a PATCH. PATCH is neither safe nor idempotent
JSON PATCH can consist of multiple operations on multiple resources and all must be applied or none must be applied, making it an implicit transaction. RFC 6902: Operations are applied sequentially in the order they appear in the array.
Thus, the revision can be modeled as a separate resource and still be updated at the same time. Querying the current revision is a simple GET. Committing a transaction is a single PATCH request containing first a test of the revision, then the operations on the resource(s) and finally the operation to update the revision resource.
The server can still choose to publish the revision as ETag of the main resource.

How can I configure Polymer's platinum-sw-* to NOT cache one URL path?

How can I configure Polymer's platinum-sw-cache or platinum-sw-fetch to cache all URL paths except for /_api, which is the URL for Hoodie's API? I've configured a platinum-sw-fetch element to handle the /_api path, then platinum-sw-cache to handle the rest of the paths, as follows:
<platinum-sw-register auto-register
clients-claim
skip-waiting
on-service-worker-installed="displayInstalledToast">
<platinum-sw-import-script href="custom-fetch-handler.js"></platinum-sw-import-script>
<platinum-sw-fetch handler="HoodieAPIFetchHandler"
path="/_api(.*)"></platinum-sw-fetch>
<platinum-sw-cache default-cache-strategy="networkFirst"
precache-file="precache.json">
</platinum-sw-cache>
</platinum-sw-register>
custom-fetch-handler.js contains the following. Its intent is simply to return the results of the request the way the browser would if the service worker was not handling the request.
var HoodieAPIFetchHandler = function(request, values, options){
return fetch(request);
}
What doesn't seem to be working correctly is that after user 1 has signed in and then signed out, and user 2 has signed in, the Network tab in Chrome Dev Tools shows that Hoodie regularly continues to make requests to BOTH users' API endpoints, like the following:
http://localhost:3000/_api/?hoodieId=uw9rl3p
http://localhost:3000/_api/?hoodieId=noaothq
Instead, it should be making requests to only ONE of these API endpoints. In the Network tab, each of these URLs appears twice in a row, and in the "Size" column the first request says "(from ServiceWorker)," and the second request states the response size in bytes, in case that's relevant.
The other problem which seems related is that when I sign in as user 2 and submit a form, the app writes to user 1's database on the server side. This makes me think the problem is due to the app not being able to bypass the cache for the /_api route.
Should I not have used both platinum-sw-cache and platinum-sw-fetch within one platinum-sw-register element, since the docs state they are alternatives to each other?
In general, what you're doing should work, and it's a legitimate approach to take.
If there's an HTTP request made that matches a path defined in <platinum-sw-fetch>, then that custom handler will be used, and the default handler (in this case, the networkFirst implementation) won't run. The HTTP request can only be responded to once, so there's no chance of multiple handlers taking effect.
I ran some local samples and confirmed that my <platinum-sw-fetch> handler was properly intercepting requests. When debugging this locally, it's useful to either add in a console.log() within your custom handler and check for those logs via the chrome://serviceworker-internals Inspect interface, or to use the same interface to set some breakpoints within your handler.
What you're seeing in the Network tab of the controlled page is expected—the service worker's network interactions are logged there, whether they come from your custom HoodieAPIFetchHandler or the default networkFirst handler. The network interactions from the perspective of the controlled page are also logged—they don't always correspond one-to-one with the service worker's activity, so logging both does come in handy at times.
So I would recommend looking deeper into the reason why your application is making multiple requests. It's always tricky thinking about caching personalized resources, and there are several ways that you can get into trouble if you end up caching resources that are personalized for a different user. Take a look at the line of code that's firing off the second /_api/ request and see if it's coming from a cached resource that needs to be cleared when your users log out. <platinum-sw> uses the sw-toolbox library under the hood, and you can make use of its uncache() method directly within your custom handler scripts to perform cache maintenance.
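If a stale cached entry does turn out to be involved, one possible sketch, relying on the sw-toolbox global that <platinum-sw> makes available to imported handler scripts, is to evict the personalized URL inside the custom handler before going to the network:
var HoodieAPIFetchHandler = function(request, values, options) {
  // Drop any previously cached copy of this per-user URL, then pass the
  // request straight through to the network as before.
  return toolbox.uncache(request.url).then(function() {
    return fetch(request);
  });
};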

How do I use FIDO U2F to allow users to authenticate with my website?

With all the recent buzz around the FIDO U2F specification, I would like to implement FIDO U2F test-wise on a testbed to be ready for the forthcoming roll out of the final specification.
So far, I have a FIDO U2F security key produced by Yubico and the FIDO U2F (Universal 2nd Factor) extension installed in Chrome. I have also managed to set up the security key to work with my Google log-in.
Now, I'm not sure how to make use of this stuff for my own site. I have looked through Google's Github page for the U2F project and I have checked their web app front-end. It looks really simple (JavaScript only).
So is implementing second factor auth with FIDO as simple as implementing a few JavaScript calls? All that seems to be happening for the registration in the example is this:
var registerRequest = {
appId: enrollData.appId,
challenge: enrollData.challenge,
version: enrollData.version
};
u2f.register([registerRequest], [], function (result) {
if (result.errorCode) {
document.getElementById('status')
.innerHTML = "Failed. Error code: " + result.errorCode;
return;
}
document.location = "/enrollFinish"
+ "?browserData=" + result.clientData
+ "&enrollData=" + result.registrationData
+ "&challenge=" + enrollData.challenge
+ "&sessionId=" + enrollData.sessionId;
});
But how can I use that for an implementation myself? Will I be able to use the callback from this method call for the user registration?
What you are trying to do is implement a so-called "relying party", meaning that your web service will rely on the identity assertion provided by the FIDO U2F token.
You will need to understand the U2F specifications to do that. Especially how the challenge-response paradigm is to be implemented and how app ids and facets work. This is described in the spec in detail.
You are right: The actual code necessary to work with FIDO U2F from the front end of your application is almost trivial (that is, if you use the "high-level" JavaScript API as opposed to the "low-level" MessagePort API). Your application will however need to work with the messages generated by the token and validate them. This is not trivial.
To illustrate how you could pursue implementing a relying party site, I will give a few code examples, taken from a Virtual FIDO U2F Token Extension that I have programmed lately for academic reasons. You can see the page for the full example code.
Before your users can use their FIDO U2F tokens to authenticate, they need to register it with you.
In order to allow them to do so, you need to call window.u2f.register in their browser. To do that, you need to provide a few parameters (again, read the spec for details).
Among them are a challenge and the ID of your app. For a web app, this ID must be the web origin of the web page triggering the FIDO operation. Let's assume it is example.org:
window.u2f.register([
{
version : "U2F_V2",
challenge : "YXJlIHlvdSBib3JlZD8gOy0p",
appId : "http://example.org",
sessionId : "26"
}
], [], function (data) {
});
Once the user performs a "user presence test" (e.g. by touching the token), you will receive a response, which is a JSON object (see spec for more details)
dictionary RegisterResponse {
DOMString registrationData;
DOMString clientData;
};
This data contains several elements that your application needs to work with.
The public key of the generated key pair -- You need to store this for future authentication use.
The key handle of the generated key pair -- You also need to store this for future use.
The certificate -- You need to check whether you trust this certificate and the CA.
The signature -- You need to check whether the signature is valid (i.e. conforms to the key stored with the certificate) and whether the data signed is the data expected.
I have lately prepared a rough implementation draft for the relying party server in Java that shows how to extract and validate this information.
Once the registration is complete and you have somehow stored the details of the generated key, you can sign requests.
As you said, this can be initiated short and sweet through the high-level JavaScript API:
window.u2f.sign([{
version : "U2F_V2",
challenge : "c3RpbGwgYm9yZWQ/IQ",
appId : "http://example.org",
sessionId : "42",
keyHandle: "ZHVtbXlfa2V5X2hhbmRsZQ"
}], function (data) {
});
Here, you need to provide the key handle you obtained during registration.
Once again, after the user performs a "user presence test" (e.g. by touching the token), you will receive a response, which is a JSON object (again, see spec for more details)
dictionary SignResponse {
DOMString keyHandle;
DOMString signatureData;
DOMString clientData;
};
You then need to validate the signature data contained therein.
You need to make sure that the signature matches the public key you have obtained before.
You also need to validate that the string signed is appropriate.
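As part of that, the clientData field is websafe-base64-encoded JSON carrying the type, the challenge and the origin, so a minimal Node.js sketch of checking it against what you issued might look like this (the function and variable names are mine, not from the spec):
function checkClientData(clientDataB64, expectedChallenge, expectedOrigin) {
  // websafe base64 -> standard base64, then decode to the JSON string
  var b64 = clientDataB64.replace(/-/g, '+').replace(/_/g, '/');
  var clientData = JSON.parse(Buffer.from(b64, 'base64').toString('utf8'));
  // the challenge must be the one you issued and the origin must be an allowed facet
  return clientData.challenge === expectedChallenge &&
         clientData.origin === expectedOrigin;
}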
Once you have performed these validations, you can consider the user authenticated. A brief example implementation of the server side code for that is also contained in my server example.
I have recently written instructions for this, as well as a list of all U2F server libraries (most of them bundle a fully working demo server), at developers.yubico.com/U2F. The goal is to enable developers to implement/integrate U2F without having to read the specifications.
Disclaimer: I work as a developer at Yubico.