We are designing a fairly complex REST API, in which most of the I/O are JSON encoded objects with a specific structure. One challenge we have found is to document the API in such a way that makes it easier for clients to post correct input and process output. Because the data of both the input and output requires fairly complex JSON objects, client developers often introduce bugs related to the structure of the I/O objects.
With all of the JSON web API's these days, I would have hoped for a general solution, but I am having a hard time finding one. I looked into json-schema which is a json-validation schema but both the IETF draft and implementations seem to be fairly immature (even though they have been around for a while, which is not a good sign).
A slightly different approach is offered by Protocol Buffers and Apache Avro, where the schema is not used for validation, but actually required for the encoding/decoding of the message. Of these 2, Avro seems to have rather limited documentation and implementations. ProtoBuf seems better, but I am not sure if this is really suitable to use in the browser to call a JSON api?
Now I am starting to doubt if I am looking at this from the right angle. Are there other methods available to make my API a bit more strong-typed'ish? Or is a formal description of a JSON REST/RPC API something that defeats the purpose of using JSON?
Edit: 6 months after this topic we found mongoose, which is very close to what we were lookin for.
Below a reply I received by email from Douglas Crockford.
I am not a believer in schemas as an alternative to input validation.
There are properties that cannot be verified from the syntax. I think
that was one of the ways that XML went wrong.
If your formats are too complex, then I would look at simplifying
them.
Such systems exist and I'm the author of one of them. It is called Piqi-RPC and it does IDL-based validation of the input and output parameters for RPC-style APIs over HTTP.
It supports JSON, XML and Google Protocol Buffers as data representation formats for input and output of HTTP POST requests. Clients can choose to use any of the three formats and specify their choice using the standard Accept and Content-Type HTTP headers.
So, yes, in theory, you are looking in the right direction. However, at the moment, Piqi-RPC supports writing servers only in Erlang and it wouldn't be very useful for you if you use a different stack. I heard that Apache Thrift also supports JSON over HTTP transport, but I haven't checked. Another kind of similar system I know of (also for Erlang) is called UBF. I have heard of libraries for Java that can parse and validate JSON based on Protocol Buffers specification (e.g. http://code.google.com/p/protostuff/).
The idea itself is far from being new, but there aren't many systems that approach it in practice. It is a challenging problem.
Historically, IDLs were used for interface definition and binary data serialization and not so much for validating dynamic data interchange formats (e.g. XML and JSON) which emerged later. Sun-RPC IDL and CORBA IDL fall in the first category. WSDL would be one of few examples covering both areas, but it is a terrible piece of technology and it would be a bad choice for most modern systems. In addition, there are many schema languages (also known as DDLs -- data definition languages), most of which are highly specialized and work with only one representation format, e.g. XML or JSON schemas. Few of those have stable implementations.
The Piqi project and Piqi-RPC, which is based on it, are build around several fairly simple realizations:
DLL doesn't have to be explicitly tied to any particular data representation format or built around it. Instead, such language can be fairly universal and cover wide range of practical use-cases (e.g. cross-language data serialization and data validation) and data formats (e.g. JSON, XML, Protocol Buffers).
IDL for RPC-style communication can be implemented as a thin, mostly syntactic layer on top of the universal DDL.
Such IDL and interface specifications can be transport agnostic.
Speaking of REST-style APIs over HTTP compared to RPC-style APIs over HTTP.
With RPC-style APIs, service developer or an automated system have to validate three things: function name (according to some service naming scheme), input and, if you choose so, output.
In case of REST-style APIs, people get themselves in trouble for no good reason. Now, they have a lot more stuff to validate: arbitrarily complex URL syntax, including dynamic parameters encoded in URL segments (for all HTTP methods) and URL query string (only for HTTP GET method), HTTP method correspondence (whether it should be GET, POST, PUT, DELETE, etc.), HTTP body when some parameters go there (sometimes they do it manually twice for parameters represented in JSON and XML), custom HTTP headers, and separately -- service documentation. Imagine an IDL supporting all that!
XML is better for RESTful services in many ways. It has native linking (<link href=, for all those HATEOAS fans), native language support (lang="en") and a great ecosystem.
It is also better for future proofing and future API refactorings. Converting this:
<profile>
<username>alganet</username>
</profile>
To support more usernames:
<profile>
<username>alganet</username>
<username>alexandre</username>
</profile>
Is much more simpler to do without breaking existing clients using XML. JSON is hard on that.
If you really need JSON, JSON-Schema is the way to go. It's immature, but I don't know anything better on that case. Maybe your consumers could choose between XML and JSON, so they can choose between a small payload (JSON) or RESTful candies (XML) using Content Negotiation.
I'd say the answer to your last question is yes. If you need a way to constrain and document the JSON "schema", why didn't you go with XML in the first place? It is not that much harder to parse, and being able to enforce a schema for it is a great advantage.
Related
According to a post on Stackflow.com called “what’s is JSOn and why would I use it? “web services used XML as their primary data format for transmitting back data, but since JSON appeared, it is preferred method.” Why do must web services use JSON over XML, is because it’s a better method for interchanging?
XML was designed primarily for document formats, e.g. papers in scientific journals. It contains many features that aren't needed for simple data interchange, and these features can get in the way when you are processing XML because they can't be easily represented in Javascript. So the code for processing the XML ends up a lot more complicated than it could be. By contrasts, JSON has an exact match to the data structures Javascript can handle natively. Of course, that problem could in principle be solved by using a language with better XML support than JavaScript - XSLT, for example - but unfortunately XSLT in the browser has never had the same level of investment put into it.
Additionally, for reasons I have never understood, the browser security folks decided that reading JSON from alien web sites (i.e. from a different domain from your HTML page) is safe, but reading XML from alien sites isn't. So if you switch from XML to JSON, you get rid of a lot of cross-site-scripting hassle.
JSON is less verbose and it is sufficient for simple data transmission, i.e. if you do not need any transformations (XSLT).
I have chosen jeroMQ for building Asynchronous message channel for publishing content from multiple clients. On the other end server side workers processes request and notify client only if server wanted to notify client based on the message received.
On digging deep, looking for messaging library to marshal/un-marshal message. I found kvpmsg class which does the job for simple key-value.
Don't want to re-invent the wheel if some standard library exists, that can be applied for bigger objects
It seems like you are asking for data serialization libraries. Check Wikipedia for a list and a comparison of data serialization formats.
Also there is a relevant entry in ZeroMQ FAQ explaining why ZeroMQ doesn't include any serialization format:
Does ØMQ include APIs for serializing data to/from the wire representation?
No. This design decision adheres to the UNIX philosophy of "do one thing and do it well". In the case of ØMQ, that one thing is moving messages, not marshaling data to/from binary representations.
Some middleware products do provide their own serialization API. We believe that doing so leads to bloated wire-level specifications like CORBA (1055 pages). Instead, we've opted to use the simplest wire formats possible which ensure easy interoperability, efficiency and reduce the code (and bug) bloat.
If you wish to use a serialization library, there are plenty of them out there. See for example
Google Protocol Buffers
MessagePack
JSON-GLib
C++ BSON Library
Note that serialization implementations might not be as performant as you might expect. You may need to benchmark your workloads with several serialization formats and libraries in order to understand performance and which format/implementation is best for your use case (ease of development must also be considered).
Currently doing some exams and I'm struggling through some concepts. These have all been 'mentioned' in my notes really but I didn't really understand how they all linked together. As far as my understanding is:
SOA - a solution to make service consumers/providers communicate. (as far as I understand this is the umbrella term for everything else)
WSDL - A language that describes the provider service.
SOAP - A XML protocol 'wrapper' used by the services to send messages. Works in conjunction with WSDL as to provide parameters?
REST - A design pattern that is similar to SOAP in function but avoids the XML? (really not sure about this one)
JSON - An alternative to XML that uses javascript? (not sure about this one either)
Looking around in the internet there doesn't seem to be a clear definition of what all of these are and how they interlink.
Imagine you are developing a web-application and you decide to decouple the functionality from the presentation of the application, because it affords greater freedom.
You create an API and let others implement their own front-ends over it as well. What you just did here is implement an SOA methodology, i.e. using web-services.
Web services make functional building-blocks accessible over standard
Internet protocols independent of platforms and programming languages.
So, you design an interchange mechanism between the back-end (web-service) that does the processing and generation of something useful, and the front-end (which consumes the data), which could be anything. (A web, mobile, or desktop application, or another web-service). The only limitation here is that the front-end and back-end must "speak" the same "language".
That's where SOAP and REST come in.
They are standard ways you'd pick communicate with the web-service.
SOAP:
SOAP internally uses XML to send data back and forth. SOAP messages have rigid structure and the response XML then needs to be parsed.
WSDL is a specification of what requests can be made, with which parameters, and what they will return. It is a complete specification of your API.
REST:
REST is a design concept.
The World Wide Web represents the largest implementation of a system
conforming to the REST architectural style.
It isn't as rigid as SOAP. RESTful web-services use standard URIs and methods to make calls to the webservice. When you request a URI, it returns the representation of an object, that you can then perform operations upon (e.g. GET, PUT, POST, DELETE). You are not limited to picking XML to represent data, you could pick anything really (JSON included)
Flickr's REST API goes further and lets you return images as well.
JSON and XML, are functionally equivalent, and common choices. There are also RPC-based frameworks like GRPC based on Protobufs, and Apache Thrift that can be used for communication between the API producers and consumers. The most common format used by web APIs is JSON because of it is easy to use and parse in every language.
WSDL: Stands for Web Service Description Language
In SOAP(simple object access protocol), when you use web service and add a web service to your project, your client application(s) doesn't know about web service Functions. Nowadays it's somehow old-fashion and for each kind of different client you have to implement different WSDL files. For example you cannot use same file for .Net and php client.
The WSDL file has some descriptions about web service functions. The type of this file is XML. SOAP is an alternative for REST.
REST: Stands for Representational State Transfer
It is another kind of API service, it is really easy to use for clients. They do not need to have special file extension like WSDL files. The CRUD operation can be implemented by different HTTP Verbs(GET for Reading, POST for Creation, PUT or PATCH for Updating and DELETE for Deleting the desired document) , They are based on HTTP protocol and most of times the response is in JSON or XML format. On the other hand the client application have to exactly call the related HTTP Verb via exact parameters names and types. Due to not having special file for definition, like WSDL, it is a manually job using the endpoint. But it is not a big deal because now we have a lot of plugins for different IDEs to generating the client-side implementation.
SOA: Stands for Service Oriented Architecture
Includes all of the programming with web services concepts and architecture. Imagine that you want to implement a large-scale application. One practice can be having some different services, called micro-services and the whole application mechanism would be calling needed web service at the right time.
Both REST and SOAP web services are kind of SOA.
JSON: Stands for javascript Object Notation
when you serialize an object for javascript the type of object format is JSON.
imagine that you have the human class :
class Human{
string Name;
string Family;
int Age;
}
and you have some instances from this class :
Human h1 = new Human(){
Name='Saman',
Family='Gholami',
Age=26
}
when you serialize the h1 object to JSON the result is :
[h1:{Name:'saman',Family:'Gholami',Age:'26'}, ...]
javascript can evaluate this format by eval() function and make an associative array from this JSON string. This one is different concept in comparison to other concepts I described formerly.
I'm designing a distributed application that will consist of a variety of REST services. Lately I've been going back and forth about whether to implement my REST services using the ASP.NET MVC 4 Web API or OData. Web API seems like it will some day be what I need but right now it's only half baked. Specifically, it only has a partial implementation of OData-style URI querying and doesn't do hypermedia out-of-the-box.
So this forces me to take another long hard look at OData. I really like the URI querying capability and structural hypermedia for lazy loading; I think I will use these features a lot in my application. However, the Atom Pub specification appears to be grossly inefficient.
I recently read a post about an efficient format for OData which mentions "dense JSON" but such a thing does not appear to actually exist. Is this true? And even if there's no such thing as dense JSON, regular JSON is still much more efficient than Atom Pub, correct?
Is there any situation where I would want to use Atom Pub over JSON?
There should be very little difference between ATOM and JSON on the semantic level with OData. Also most OData servers (WCF Data Services for sure) support both, so it's a choice of the client which one to use. As the blog post from Pablo mentions, to get the best payload size you should enable HTTP compression. It works great on both ATOM and JSON.
Reading JSON tends to be faster (XML parsing is kind of expensive), but that's if you're concerned with CPU consumption on the client. If I remember correctly, last time I saw the numbers, the compressed payload size for ATOM and JSON is not that different.
ATOM PUB is usually easier to consume in client which has available good XML or ATOM libraries and not JSON. And vice versa. But other than that, there should not be much of a difference.
Receiving and sending data with JSON is done with simple HTTP requests. Whereas in SOAP, we need to take care of a lot of things. Parsing XML is also, sometimes, hard. Even Facebook uses JSON in Graph API. I still wonder why one should still use SOAP? Is there any reason or area where SOAP is still a better option? (Despite the data format)
Also, in simple client-server apps (like Mobile apps connected with a server), can SOAP give any advantage over JSON?
I will be very thankful if someone can enlist the major/prominent differences between JSON and SOAP considering the information I have provided(If there are any).
I found the following on advantages of SOAP:
There is one big reason everyone sticks with SOAP instead of using JSON. With every JSON setup, you're always coming up with your own data structure for each project. I don't mean how the data is encoded and passed, but how the data formatted format is defined, the data model.
SOAP has an industry-mature way of specifying that data will be in a certain format: e.g. "Cart is a collection of Products and each Product can have these attributes, etc." A well put together WSDL document really has this nailed. See W3C specification: Web Services Description Language
JSON has similar ways of specifying this data structure — a JavaScript class comes to mind as the most common way of doing this — but a JavaScript class isn't really a data structure used for this purpose in any kind of agnostic, well established, widely used way.
In short, SOAP has a way of specifying the data structure in a maturely formatted document (WSDL). JSON doesn't have a standard way of doing this.
If you are creating a client application and your server implementation is done with SOAP then you have to use SOAP in client side.
Also, see: Why use SOAP over JSON and custom data format in an “ENTERPRISE” application? [closed]
Nowadays SOAP is a complete overkill, IMHO. It was nice to use it, nice to learn it, and it is beautiful we can use JSON now.
The only difference between SOAP and REST services (no matter whether using JSON) is that SOAP WS always has it's own WSDL document that could be easily transformed into a self-descriptive documentation while within REST you have to write the documentation for yourself (at least to document the data structures). Here are my cons'&'pros for both:
REST
Pros
lightweight (in all means: no server- nor client-side extensions needed, no big chunks of XML are needed to be transfered here and there)
free choice of the data format - it's up on you to decide whether you can use plain TXT, JSON, XML, or even create you own format of data
most of the current data formats (and even if used XML) ensures that only the really required amount of data is transfered over HTTP while with SOAP for 5 bytes of data you need 1 kB of XML junk (exaggerated, ofc, but you got the point)
Cons
even there are tools that could generate the documentation from docblock comments there is need to write such comments in very descriptive way if one wants to achieve a good documentation as well
SOAP
Pros
has a WSDL that could be generated from even basic docblock comments (in many languages even without them) that works well as a documentation
even there are tools that could work with WSDL to give an enhanced try this request interface (while I do not know about any such tool for REST)
strict data structure
Cons
strict data structure
uses an XML (only!) for data transfers while each request contains a lot of junk and the response contains five times more junk of information
the need for external libraries (for client and/or server, though nowadays there are such libraries already a native part of many languages yet people always tend to use some third-party ones)
To conclude, I do not see a big reason to prefer SOAP over REST (and JSON). Both can do the same, there is a native support for JSON encoding and decoding in almost every popular web programming language and with JSON you have more freedom and the HTTP transfers are cleansed from lot of useless information junk. If I were to build any API now I would use REST with JSON.
I disagree a bit on the trend of JSON I see here. Although JSON is an order maginitude easier, I'd venture to say it's quite limited. For example, SOAP WS is not the last thing. Indeed, between soap client/server you now have enterprise services bus, authentification scheme based on crypto, user management, timestamping requests/replies, etc. For all of this, there're some huge software platforms that provide services around SOAP (well, "web services") and will inject stuff in your XML. So although JSON is probably enough for small projects and an order of magnitude easier there, I think it becomes quite limited if you have decoupled transmission control and content (ie. you develop the content stuff, the actual server, but all the transmission is managed by another team, the authentification by one more team, deployment by yet another team). I don't know if my experience at a big corp is relevant, but I'd say that JSON won't survive there. There are too many constraints on top of the basic need of data representation. So the problem is not JSON RPC itself, the problem is it misses the additional tools to manage the complexity that arises in complex applications (not to say that what you do is not complex, it's just that the software reflects the complexity of the company that produces it)
I think there is a lot of basic misinformation on this thread. SOAP, REST, XML, and JSON concepts seem to be mixed up in the responses.
Here is some clarification -
XML and JSON (an others) are encodings of information.
SOAP is a communications protocol
REST is an (Architecture) style
each is used for something different although you might use more than one of these things together.
Lets start with encoding data structures as XML vs JSON:
Everything JSON currently supports can be done in XML, but not the other way around. JSON will eventually adopt all the features that XML has, but its proponents haven't encountered all of the problems yet, once they get more experience things will be added on to close the gap. for example JSON didn't start out with Schemas and binary formats.
SOAP is a communication protocol for calling an operation. It runs on top of things like, HTTP, SMTP, etc. Aside from many other features, SOAP messages can span multiple "application" layer protocols. i.e. i can sent a SOAP message by HTTP to a service endpoint which then puts it on a message queue for another system. SOAP solves the problem of maintaining authentication, message authenticity, etc. as the requested moved between different parts of a distributed system.
JSON and other data formats canbe sent via SOAP. I work with some systems that sent binary fixed-width encoded objects via SOAP, its not a problem.
The analogy is that - if only the postman is allowed to send you a letter, then it is just HTTP, but if anyone can send you a letter, then you want SOAP. (i.e. message transport security vs message content security)
the 6 REST constraints are architectural style. Interestingly the first several years of REST the examples were in SOAP. (there is no such thing as REST or SOAP they are not opposites)
A "heavyweight bloated, etc.etc." SOA SOAP system might have monoliths with operations like GET, PUT, POST instances of a single entity. SOAP doesn't have those operations predefined, but that is typically how it is used.
Consider that if you built a "REST" service on HTTP alone with an SSL/TLS terminating proxy, then you may have violated the 4th constraint of REST.
So for your software development today, you wouldn't normally interact with any of these directly. Just as if you were written a graphics program you wouldn't directly work with HDMI vs. DisplayPort typically.
The question is do you understand architecturally what your system needs to do and configure it to use the mechanism that does that job. (for example, all the challenges of applying today's microservices to general systems are old problems previously solved by SOAP, CORBA and the old protocols)
I have spent several years writing SOAP web services (with JAX WS). They are not hard to write. And I love the idea of a single endpoint and single HTTP method (POST). For me, REST is too verbose.
But as a data container, JSON is simpler, smaller, more readable, more flexible, looks closer to programming languages.
So, I reinvented the wheel and created my own approach to writing backends for AJAX requests. In comparison:
REST:
get user: method GET https://example.com/users/{id}
update user: method POST https://example.com/users/ (JSON with User object in request body)
RPC:
get user: method GET https://example.com/getUser?id=1
update user: method POST https://example.com/updateUser (JSON with User object in the request body)
My way (the proposed name is JOH - JSON over HTTP):
get user: method POST https://example.com/ (JSON specifies both user ID and class/method responsible for handling request)
update user: method POST https://example.com/ (JSON specifies both user object and class/method responsible for handling request)