RESTful HTTP API Standards? [closed]

Tasked to detail a simple API, I did a little research and suspect everything I know about the Internet is wrong.
I've Googled for far longer than I care to admit, reading a number of articles, StackOverflow questions and websites that all seem to vehemently disagree. I recognize every developer does things differently, but I still suspect there is an official standard somewhere, or at least a general best practice (although, admittedly, not everyone follows whatever the best practice is).
This API would use JSON. That I cannot change.
What my local peers do/have told me (very likely incorrect):
HTTP is a complex, antiquated beast that we should deal with as little as possible. It's simply a vehicle for moving chunks of JSON back and forth, where the magic happens. All data and metadata should be in the JSON, and you can set it up exactly as you like.
Use a 200 status code for everything, even if there is an application-level error or problem with the user's input. The other error codes mean something went wrong with the HTTP operation--a catastrophic unexpected server error, using the wrong URL, that kind of thing.
"Envelope" the JSON data for messages from the server; have JSON properties for metadata and include the actual JSON object/array inside a consistent property, like "data"
HTTPS is "nice to have" but not important for minor projects
Use PUT requests for everything
Log in to get a randomized string of characters as an access token from the server. The server stores information on when the token expires, what account it is for and what IP address used it. Pass that access token to the server for every other call; the client does not store the password.
URLs tend to be verbs, like /register or /checkout or /changepassword. All other needed data is in the JSON. Each operation has its own URL
What I THINK might be right based on my reading, but I'm not sure:
HTTP is the divine data structure. Header information and server return codes can encompass any possible metadata and is, indeed, designed for that purpose. The contents need only contain the actual JSON object(s) the applications are acting upon. Put nothing in the JSON body that could possibly be part of the HTTP metadata.
ALWAYS use HTTPS, for everything
For any possible error (a form field didn't validate, the user's session expired, their game character is dead), send an HTTP status code. Try to pick what seems closest based on the W3C descriptions, but all that really matters is that you use it consistently in your system. The code should be enough to tell the client app what to do (show user validation errors and make them fix form input, make user log in again, take user back to main screen). The body, in case of errors, contains extra details about the error, if necessary.
The client app should pass login info with every request, in the HTTP header. This means it needs to use basic auth, which means it needs to remember the user's password.
The JSON data should never be in an "envelope". There is no standard format, because the contents directly represent the object(s) needed for the given operation, as indicated by the combination of URL and HTTP method (GET/POST/PUT/DELETE).
URLs tend to be nouns, like /user or /shoppingcart. Subdirectories of the URL refer to the object ID being acted on: /user/johndoe or /shoppingcart/12359. A URL could be used for different operations: GET (retrieve data), POST (update data), PUT (create new data), DELETE (remove data).
I'm not even sure that either of these is fully right--can you tell me the rules for what is the official, or most recommended way to structure such an API?

You should read the relevant part of the Fielding dissertation, which defines what REST is: http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
A related dissertation from Markus Lanthaler: http://www.markus-lanthaler.com/research/third-generation-web-apis-bridging-the-gap-between-rest-and-linked-data.pdf He started the work of creating a standard REST implementation: the http://www.hydra-cg.com/ RDF REST vocabulary and the http://json-ld.org/ RDF JSON format. Currently we don't have a standard way to describe the uniform interface of any REST service. It is as if we had no HTML standard; that's why we are not able to write REST browsers, only application-specific clients.
(Hydra is not production-ready; I guess they'll need another 2-3 years to standardize it and start building Hydra-specific tools. Until then we cannot really talk about real REST, because most APIs define an implementation-specific format or use a non-standard but more or less common format, like HAL.)


REST: Creating a Json based Query: Which http method to use?

I have a question about RESTful services. In REST the POST method is used to create an entity.
And GET is used to query entities. Right?
As I read in other posts, it is not allowed in HTTP to send a GET request with a body.
But when I want to send JSON to make a query, what is the best way? Are there any best practices, or how do you solve such JSON queries?
Thanks for your answers
In REST the POST method is used to create an entity. And GET is used to query entities. Right?
Not really. GET is used to fetch representations of resources. POST is deliberately vague -- anything not worth standardizing can use POST.
when I want to send JSON to make a query, what is the best way?
There is no best way to do it, just trade-offs.
The basic plot of HTTP is that you GET representations of resources. If the resource you want doesn't exist, you create a new one. So the "REST" flow would look something like sending a request to the server to create a "the answer to my query" resource, and then using GET to obtain the current representation of that resource. Which is great, because we can fetch the latest representation of that resource any time we're worried that our copy is out of date. Other people with the same query can use the same resource, so we can use a general-purpose cache to take on a lot of the work. The end result is "web scale".
OK, not that great, because we learned that sending information over insecure channels is a bad idea; but we can put a general-purpose caching proxy in front of our server, and get some scale that way.
But "create a new resource" is a lot of ceremony when you only expect to need the query once.
Creating a new resource was using POST in this situation anyway, so why not return a representation of the solution right away? And the answer is: go right ahead! That works great... but doesn't give you any cache support at all. You are effectively performing a remote call under the guise of modifying a resource.
Also, POST doesn't promise idempotent semantics -- on an unreliable network, requests can get lost, and general purpose components won't know that in this particular case it is harmless to just repeat the same request.
PUT has idempotent semantics... but it also has very specific opinions about the contents of the payload that don't match "query" at all.
You can dig through other standardized methods, but there aren't really any good fits. The only methods that are close are SEARCH and REPORT, which are coupled to WebDAV semantics.
You can invent your own non-standard method, but general-purpose components won't understand it.
You can standardize a new method with the semantics you need, but that's a lot of work.
Or you can just use POST.
Remember, the web took over the world using nothing more than GET and POST. So it's probably fine.
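If you do end up just using POST, a minimal client-side sketch might look like this; the /search endpoint and the query fields are assumptions for illustration, not any standard:
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(encode_json decode_json);

# Send the JSON query in the body of a POST; the endpoint is hypothetical
my $query    = { status => 'open', assignee => 'alice' };
my $response = HTTP::Tiny->new->post(
    'https://api.example.com/search',
    {
        headers => { 'Content-Type' => 'application/json' },
        content => encode_json($query),
    },
);
die "Query failed: $response->{status}" unless $response->{success};
my $results = decode_json($response->{content});
Note the trade-off discussed above: this response won't be cached by general-purpose components, so repeated identical queries hit the server every time.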

Why is there a difference between get and put requests? [duplicate]

Back when I first started developing client/server apps which needed to make use of HTTP to send data to the server, I was pretty naive when it came to HTTP methods. I literally used GET requests for EVERYTHING.
I later learned that I should use POST for sending data and GET for requesting data. However, I was slightly confused as to why this is best practice. From a functionality perspective, I was able to use either GET or POST to achieve the exact same thing.
Why is it important to use specific HTTP methods rather than using the same method for everything?
I understand that POST is more secure than GET (GET makes the data visible in the HTTP URL). However, couldn't we just use POST for everything then?
I'm going to take a stab at giving a short answer to this.
GET is used for reading information. It's the 'default' method, and everything uses this to jump from one link to the next. This includes browsers, but also crawlers.
GET is 'safe'. This means that if you do a GET request, you are guaranteed that you will never change something on the server. If a GET request could cause something to be deleted on the server, this could be very problematic, because a spider/crawler/search engine might assume that following links is safe and automatically delete things.
This is why we have a couple of different methods. GET is meant to allow you to 'get' things from the server. Likewise, PUT allows you to set something new on a server and DELETE allows you to remove something.
POST's biggest original purpose is submitting forms. You're posting a form to the server and ask the server to do something with that form.
Any client (a human/browser or machine/crawler) knows that POST is 'unsafe'. It won't do POST requests automatically on your behalf unless it really knows that's what you (the user) want. It's also used for things that are kinda similar to submitting forms.
So when you design your website, make sure you use GET only for getting things from the server, and use POST if your Ajax request will cause 'something' to change on the server.
Fun fact: there are a lot of official HTTP methods. At least 30. You'll probably only use a very few of them though.
So to answer the question in the title more precisely:
Why are there multiple HTTP Methods available?
Different HTTP methods have different rules and restrictions. If everyone agrees on those rules, we can start making assumptions about what the intent is. Because these guarantees exist, HTTP servers, clients and proxies can make smart decisions without understanding your specific application.
Suppose you have a task app in which you can store and delete data. Now suppose the route of your web page is /xx. To get the web page, to store data using the add button, or to delete data using the delete button, you send requests to /xx. But how will the web server know whether you are asking for the web page, want to add data, or want to delete data, since /xx is the same for all requests? That's why we have different request methods: the browser always sends the request method (GET, POST, PUT, DELETE) in the header to the server, so the server can understand what you need.
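To make that concrete, here is a rough sketch of such a dispatch as a PSGI app; the /xx route and the in-memory task list are made up, and Plack is assumed to be installed:
use strict;
use warnings;
use JSON::PP qw(encode_json);

# Same path, three methods: the server picks a handler from the request
# method, not from the URL
my @tasks;
my $app = sub {
    my $env = shift;
    return [404, ['Content-Type' => 'text/plain'], ['not found']]
        unless $env->{PATH_INFO} eq '/xx';

    my $method = $env->{REQUEST_METHOD};
    if ($method eq 'GET') {      # asking for the page/data
        return [200, ['Content-Type' => 'application/json'],
                [encode_json(\@tasks)]];
    }
    if ($method eq 'POST') {     # the "add" button stores a task
        push @tasks, { id => scalar(@tasks) + 1 };
        return [201, ['Content-Type' => 'text/plain'], ['created']];
    }
    if ($method eq 'DELETE') {   # the "delete" button removes one
        pop @tasks;
        return [204, [], []];
    }
    return [405, ['Content-Type' => 'text/plain', 'Allow' => 'GET, POST, DELETE'],
            ['method not allowed']];
};
$app;    # save as app.psgi and run with: plackup app.psgi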

Best practice for email links that will set a DB flag?

Our business wants to email our customers a survey after they work with support. For internal reasons, we want to ask them the first question in the body of the email. We'd like to have a link for each answer. The link will go to a web service, which will store the answer, then present the rest of the survey.
So far so good.
The challenge I'm running into: making a server-side change based on an HTTP GET is bad practice, but you can't do a POST from a link. Options seem to be:
Use an HTTP GET instead, even though that's not correct and could cause problems (https://twitter.com/rombulow/status/990684453734203392)
Embed an HTML form in the email and style some buttons to look like links (likely not compatible with a number of email platforms)
Don't include the first question in the email (not possible for business reasons)
Use HTTP GET, but have some sort of mechanism which prevents a link from altering the server state more than once
Does anybody have any better recommendations? Googling hasn't turned up much about this specific situation.
One thing to keep in mind is that HTTP is specifying semantics, not implementation. If you want to change the state of your server on receipt of a GET request, you can. See RFC 7231:
This definition of safe methods does not prevent an implementation from including behavior that is potentially harmful, that is not entirely read-only, or that causes side effects while invoking a safe method. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it. For example, most servers append request information to access log files at the completion of every response, regardless of the method, and that is considered safe even though the log storage might become full and crash the server. Likewise, a safe request initiated by selecting an advertisement on the Web will often have the side effect of charging an advertising account.
Domain-agnostic clients are going to assume that GET is safe, which means your survey results could get distorted by web spiders crawling the links, browsers pre-loading resources to reduce perceived latency, and so on.
Another possibility that works in some cases is to treat the path through the graph as the resource. Each answer link acts like a breadcrumb trail, encoding into itself the history of the client's answers. So a client that answered A and B to the first two questions is looking at /survey/questions/questionThree?AB, while the one that answered C to both is looking at /survey/questions/questionThree?CC. In other words, you aren't changing the state of the server; you are just guiding the client through a pre-generated survey graph.
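A small sketch of generating those breadcrumb links; the paths and the A/B/C answer codes are illustrative:
use strict;
use warnings;

# Each link to the next question carries the history of answers so far,
# so GET stays safe: nothing is written until the survey is submitted
sub next_question_links {
    my ($next_question, $history) = @_;    # e.g. (3, 'AB')
    return map { "/survey/questions/question$next_question?$history$_" } qw(A B C);
}

my @links = next_question_links(3, 'AB');
# yields /survey/questions/question3?ABA, ...?ABB and ...?ABC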

REST API - file (ie images) processing - best practices

We are developing a server with a REST API which accepts and responds with JSON. The problem is how to upload images from the client to the server.
Note: I am talking about a use-case where the entity (user) can have multiple files (carPhoto, licensePhoto) and also other properties (name, email...), but when you create a new user, you don't send these images; they are added after the registration process.
The solutions I am aware of, but each of them has some flaws:
1. Use multipart/form-data instead of JSON
good : POST and PUT requests are as RESTful as possible; they can contain text inputs together with a file.
cons : It is not JSON anymore, and JSON is much easier to test, debug, etc. compared to multipart/form-data.
2. Allow updating separate files
The POST request for creating a new user does not allow adding images (which is OK in our use-case, as I said at the beginning); uploading pictures is done by a PUT request as multipart/form-data to, for example, /users/4/carPhoto.
good : Everything (except the file uploading itself) remains in JSON; it is easy to test and debug (you can log complete JSON requests without being afraid of their length).
cons : It is not intuitive; you can't POST or PUT all variables of an entity at once, and the address /users/4/carPhoto can be read more as a collection (a standard use-case for a REST API looks like /users/4/shipments). Usually you can't (and don't want to) GET/PUT each variable of an entity, for example users/4/name. You can get the name with GET and change it with PUT at users/4. If there is something after the id, it is usually another collection, like users/4/reviews.
3. Use Base64
Send it as JSON but encode files with Base64.
good : Same as the first solution; it is as RESTful a service as possible.
cons : Once again, testing and debugging are a lot worse (the body can have megabytes of data), and there is an increase in size and processing time on both the client and the server.
I would really like to use solution no. 2, but it has its cons... Can anyone give me better insight into what the "best" solution is?
My goal is to have RESTful services with as much standards included as possible, while I want to keep it as simple as possible.
OP here (I am answering this question after two years; the post made by Daniel Cerecedo was not bad at the time, but web services are developing very fast)
After three years of full-time software development (with focus also on software architecture, project management and microservice architecture) I definitely choose the second way (but with one general endpoint) as the best one.
If you have a special endpoint for images, it gives you much more power over handling those images.
We have the same REST API (Node.js) for both mobile apps (iOS/Android) and the frontend (using React). This is 2017, therefore you don't want to store images locally; you want to upload them to some cloud storage (Google Cloud, S3, Cloudinary, ...), and therefore you want some general handling for them.
Our typical flow is that as soon as you select an image, it starts uploading in the background (usually a POST to the /images endpoint), returning the ID after uploading. This is really user-friendly, because the user chooses an image and then typically proceeds to some other fields (i.e. address, name, ...), so by the time he hits the "send" button, the image is usually already uploaded. He does not wait, watching a screen that says "uploading...".
The same goes for getting images. Especially thanks to mobile phones and limited mobile data, you don't want to send original images; you want to send resized images, so they do not take that much bandwidth (and to make your mobile apps faster, you often don't want to resize them at all; you want the image that fits perfectly into your view). For this reason, good apps are using something like Cloudinary (or we have our own image server for resizing).
Also, if the data are not private, then you send just a URL back to the app/frontend and it downloads the image from cloud storage directly, which is a huge saving of bandwidth and processing time for your server. In our bigger apps there are a lot of terabytes downloaded every month; you don't want to handle that directly on each of your REST API servers, which are focused on CRUD operations. You want to handle that in one place (our image server, which has caching etc.) or let cloud services handle all of it.
Small 2023 update: if possible, put a CDN in front of the pictures; it usually will save you a lot of money and make the pictures even more available (i.e. no issues when traffic peaks happen).
Cons : The only "con" you should think about is unassigned images. The user selects images and continues filling in other fields, but then says "nah" and turns off the app or tab, while meanwhile you have successfully uploaded the image. This means you have uploaded an image which is not assigned anywhere.
There are several ways of handling this. The easiest one is "I don't care", which is relevant if this does not happen very often, or if you even want to store every image the user sends you (for any reason) and don't want any deletion.
Another one is easy too: you have a cron job that, for example every week, deletes all unassigned images older than one week.
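For illustration, that weekly cleanup could be as small as the sketch below; the DSN, table, and column names are assumptions, and DBD::SQLite is assumed to be installed:
use strict;
use warnings;
use DBI;

# Weekly cron job: delete images that were uploaded but never assigned
my $dbh = DBI->connect('dbi:SQLite:dbname=app.db', '', '', { RaiseError => 1 });
my $deleted = $dbh->do(
    q{DELETE FROM images
      WHERE assigned_to IS NULL
        AND uploaded_at < datetime('now', '-7 days')}
) + 0;
print "Deleted $deleted unassigned image(s)\n";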
There are several decisions to make:
The first about resource path:
Model the image as a resource on its own:
Nested in user (/user/:id/image): the relationship between the user and the image is implicit
In the root path (/image):
The client is held responsible for establishing the relationship between the image and the user, or;
If a security context is being provided with the POST request used to create an image, the server can implicitly establish a relationship between the authenticated user and the image.
Embed the image as part of the user
The second decision is about how to represent the image resource:
As a Base64-encoded JSON payload
As a multipart payload
This would be my decision track:
I usually favor design over performance unless there is a strong case for it. It makes the system more maintainable and can be more easily understood by integrators.
So my first thought is to go for a Base64 representation of the image resource, because it lets you keep everything JSON. If you choose this option you can model the resource path as you like.
If the relationship between user and image is 1-to-1, I'd favor modeling the image as an attribute, especially if both data sets are updated at the same time. In any other case you can freely choose to model the image either as an attribute, updating it via PUT or PATCH, or as a separate resource.
If you choose a multipart payload, I'd feel compelled to model the image as a resource on its own, so that other resources, in our case the user resource, are not impacted by the decision to use a binary representation for the image.
Then comes the question: is there any performance impact in choosing Base64 vs multipart? We might think that exchanging data in multipart format should be more efficient, but this article shows how little the two representations differ in terms of size.
My choice is Base64:
Consistent design decision
Negligible performance impact
As browsers understand data URIs (base64 encoded images), there is no need to transform these if the client is a browser
I won't cast a vote on whether to have it as an attribute or standalone resource, it depends on your problem domain (which I don't know) and your personal preference.
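For what it's worth, a minimal sketch of the Base64 option; the field names are made up, and JSON::PP and MIME::Base64 are used because they ship with Perl:
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);
use JSON::PP qw(encode_json);

# Embed the image bytes directly in the JSON document
open my $fh, '<:raw', 'carPhoto.jpg' or die "open: $!";
my $bytes = do { local $/; <$fh> };
close $fh;

my $payload = encode_json({
    name     => 'John Doe',
    carPhoto => {
        contentType => 'image/jpeg',
        data        => encode_base64($bytes, ''),   # '' disables line breaks
    },
});
# $payload can now be sent to the user resource as application/json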
Your second solution is probably the most correct. You should use the HTTP spec and mimetypes the way they were intended and upload the file via multipart/form-data. As far as handling the relationships, I'd use this process (keeping in mind I know zero about your assumptions or system design):
POST to /users to create the user entity.
POST the image to /images, making sure to return a Location header to where the image can be retrieved per the HTTP spec.
PATCH to /users/:id/carPhoto, assigning it the ID of the photo given in the Location header from step 2.
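A rough client-side sketch of that flow; the host, field names, and response shapes are assumptions, and step 2 posts the raw bytes for brevity where multipart/form-data would also work:
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(encode_json decode_json);

my $http = HTTP::Tiny->new;
my $base = 'https://api.example.com';    # assumed host

# 1. Create the user entity (assumes the server echoes back an id)
my $user_res = $http->post("$base/users", {
    headers => { 'Content-Type' => 'application/json' },
    content => encode_json({ name => 'John Doe' }),
});
my $user_id = decode_json($user_res->{content})->{id};

# 2. Upload the image; the server answers with a Location header
open my $fh, '<:raw', 'carPhoto.jpg' or die "open: $!";
my $bytes = do { local $/; <$fh> };
close $fh;
my $img_res = $http->post("$base/images", {
    headers => { 'Content-Type' => 'image/jpeg' },
    content => $bytes,
});
my $image_location = $img_res->{headers}{location};   # HTTP::Tiny lowercases header names

# 3. Point the user's carPhoto at the uploaded image
$http->request('PATCH', "$base/users/$user_id/carPhoto", {
    headers => { 'Content-Type' => 'application/json' },
    content => encode_json({ image => $image_location }),
});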
There's no easy solution; each way has its pros and cons. But the canonical way is to use the first option: multipart/form-data. As the W3C recommendation says:
The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.
We aren't really sending forms, but the implicit principle still applies. Using Base64 as a binary representation is incorrect because you're using the wrong tool to accomplish your goal; on the other hand, the second option forces your API clients to do more work in order to consume your API service. You should do the hard work on the server side in order to supply an easy-to-consume API. The first option is not easy to debug, but once you get it working, it probably never changes.
Using multipart/form-data, you stick with the REST/HTTP philosophy. You can view an answer to a similar question here.
Another option is mixing the alternatives: you can use multipart/form-data, but instead of sending every value separately, you send a value named payload with the JSON payload inside it. (I tried this approach using ASP.NET WebAPI 2 and it works fine.)
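A sketch of that mixed approach with LWP; the URL and part names are assumptions:
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);
use JSON::PP qw(encode_json);

# One multipart request: a "payload" part holding the JSON, plus the file
my $ua  = LWP::UserAgent->new;
my $res = $ua->request(POST 'https://api.example.com/users',
    Content_Type => 'form-data',
    Content      => [
        payload  => encode_json({ name => 'John Doe' }),
        carPhoto => ['carPhoto.jpg'],   # read from disk by HTTP::Request::Common
    ],
);
die $res->status_line unless $res->is_success;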

How to create a website based on a Perl script? [closed]

I was trying to learn Perl and ended up writing a script that tries to find all possible schedules given course names, where a possible schedule means there are no clashes between the course times, by iterating through all sections.
I crawled my university's schedule of classes and placed the data in a messy structure: a hash of hashes of 2D arrays, where the first hash key is the subject and the second is the course number, pointing to an array of sections where each section is an array of all its data (not the most appealing data structure).
I then processed all schedule combinations by iterating through every possible combination and returning all schedules that didn't have a clash as a 3D array (where each entry is a schedule, each schedule has courses, and each course has its specific data).
Right now, I hard-code the input in the script as a 2D array where each element consists of a subject name and course number.
What I want to do now is to transform this into a website.
I took an online course on database but I don't have a clue on how to handle databases from Perl or whether this is a good approach.
I don't know how to store the data crawled permanently so it could be used for further computations.
I know basic HTML and CSS and Javascript but I have no idea on how to integrate the script with them and take the input from the user (I only know how to do that in Javascript). Google lead me towards "cgi-scripts" but I don't anything about servers except that they are responsible for computation done by website and one of them is called Apache or AJAX. I am not sure whether this is true or not but I want to give you an idea of my level of expertise.
Could you please point me in the right direction by telling me what do I need to learn in order to be able to make this website.
I took an online course on database but I don't have a clue on how to handle databases from Perl or whether this is a good approach.
Database access in Perl is done via DBI. You can use DBIx::Class to get a nice OO abstraction for it.
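A minimal DBI sketch, assuming SQLite as the backend and guessing at a schema for the crawled sections:
use strict;
use warnings;
use DBI;

# Connect (creating the file if needed), define a table, insert, query
my $dbh = DBI->connect('dbi:SQLite:dbname=schedule.db', '', '',
                       { RaiseError => 1 });
$dbh->do(q{CREATE TABLE IF NOT EXISTS sections (
    subject TEXT, course TEXT, section TEXT,
    start_time INTEGER, end_time INTEGER
)});

my $insert = $dbh->prepare(
    q{INSERT INTO sections (subject, course, section, start_time, end_time)
      VALUES (?, ?, ?, ?, ?)});
$insert->execute('MATH', '101', 'A', 900, 1030);

# Pull a course's sections back out for the clash computation
my $rows = $dbh->selectall_arrayref(
    q{SELECT section, start_time, end_time FROM sections
      WHERE subject = ? AND course = ?},
    undef, 'MATH', '101');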
I don't know how to store the data crawled permanently so it could be used for further computations.
Databases are a good choice.
I know basic HTML and CSS and Javascript but I have no idea on how to integrate the script with them and take the input from the user (I only know how to do that in Javascript).
Use a <form>. Set the action to the URL of a server-side program. Submit the form.
Google led me towards "cgi-scripts", but I don't know anything about servers except that they are responsible for the computation done by a website and that one of them is called Apache or AJAX. I am not sure whether this is true, but I want to give you an idea of my level of expertise.
An HTTP server listens for HTTP requests and provides HTTP responses. Browsers (and search engines, and other clients) make HTTP requests to servers that host websites. The servers respond with the data (HTML, CSS, JavaScript, Images, etc) needed to render the site and the client renders it (or indexes it, or whatever).
Apache HTTPD is one of the most commonly used HTTP servers.
CGI is a means by which an HTTP server can determine what to respond with by running a program instead of just handing over a static file. It is very simple but not very efficient. Some alternatives are described in this answer.
Ajax has nothing to do with this. It means "Using JavaScript, in a web page, to tell the browser to make a new HTTP request (without leaving the page) and make the response available to the JavaScript".
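To make the CGI part concrete, a minimal script might look like this; the "courses" form field is made up:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;

# The HTTP server runs this program and relays its output to the browser
my $q       = CGI->new;
my $courses = $q->escapeHTML($q->param('courses') // '');

print $q->header('text/html');
print "<html><body><p>You asked for: $courses</p></body></html>\n";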
For a pure Perl setup, the HTTP::Daemon and HTTP::Response modules are your best friends. I tried to write a web server using nothing but IO::Socket and nearly drove myself crazy.
Getting started is pretty easy.
use strict;
use warnings;
use HTTP::Daemon;
my %opt = (
    'listen-host' => 'localhost',
    'listen-port' => 8808,
);
my $d = HTTP::Daemon->new(
    LocalPort => $opt{'listen-port'},
    LocalAddr => $opt{'listen-host'},
    Reuse     => 1,
) or die "HTTP listener failed at $opt{'listen-host'}:$opt{'listen-port'} - $!";
print "Started HTTP listener!\n";
my $c = $d->accept;
Now your script will sit there until you get a connection from a browser. Of course you still need to send a response, so see HTTP::Response on how to send data back.
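Picking up where that sketch stops, a minimal reply using HTTP::Response might look like this; the page content is just a placeholder:
# $c is the connection accepted above; answer its requests one by one
while (my $request = $c->get_request) {
    my $response = HTTP::Response->new(200);
    $response->header('Content-Type' => 'text/html');
    $response->content('<html><body><p>Hello from Perl!</p></body></html>');
    $c->send_response($response);
}
$c->close;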
This is going to be a partial/vague answer...
For the database, what you want to do is learn to use DBI. It is a database-implementation-independent API for talking to databases (it can even write to CSV files!). You would also need a driver for your database of choice.
As for the website, it is somewhat beyond my skills, but there are many ways to do it. Perl would be used server-side via something called CGI. JavaScript, on the other hand, is typically processed on the client side and is used to add dynamic elements to your site. Apache is web server software; it takes care of talking with your browser and passing it the relevant HTML pages. You might need to use it, but you would not need to code anything for it in basic use-cases.
For Perl web pages, you can start with this tutorial to understand things better, and then look to PerlMonks for a better (and more up-to-date) answer. This post will also give you more practical advice, like using Dancer.