Can someone tell me which APIs use the XML data format and which use JSON to represent data? I came across different answers on the internet and can't tell the difference between SOAP, REST, RESTful, or WS, or work out which of them is actually an API that supports JSON or XML.
I came across this service from Stack Overflow:
https://api.stackexchange.com/2.3/questions?fromdate=1519862400&todate=1522368000&order=desc&sort=activity&site=stackoverflow&tagged=python
I believe the source is a database. How do I build XML that produces data in a similar format?
I use the lines below:
xmldoc.Load(xmlFileName);                               // load the XML document from disk
Newtonsoft.Json.JsonConvert.SerializeXmlNode(xmldoc);   // returns the document serialized as a JSON string
Any recommendation on how to build the XML, i.e. the reverse process? My solutions are heavily dependent on XML and flat files.
According to https://api.stackexchange.com/docs the StackExchange API only supports JSON output, not XML. So you will have to convert the JSON to XML.
My own preference is to do this conversion "by hand" using XSLT 3.0 rather than using a standard library, because standard libraries often give you XML that's rather difficult to work with.
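To see what that answer means, here is a minimal Python sketch of the generic mapping a typical library performs, assuming element names are taken straight from the JSON keys (the sample document is invented for illustration):

import json
import xml.etree.ElementTree as ET

def json_to_xml(value, tag="root"):
    """Generic JSON-to-XML mapping: dict keys become child elements,
    list items become repeated elements, scalars become text."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(json_to_xml(child, key))
    elif isinstance(value, list):
        for item in value:
            elem.append(json_to_xml(item, "item"))
    else:
        elem.text = "" if value is None else str(value)
    return elem

doc = json.loads('{"question": {"title": "Hello", "tags": ["python", "xml"]}}')
print(ET.tostring(json_to_xml(doc), encoding="unicode"))
# <root><question><title>Hello</title><tags><item>python</item><item>xml</item></tags></question></root>

The output has no attributes or namespaces, and every array item gets the same element name, which is exactly the kind of XML that is hard to work with downstream.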
I'm aware that there are Python and PowerShell methods to convert plain text files, CSVs, etc. into JSON format for upload into NoSQL DBs such as CouchDB.
But the CouchDB definitive guide makes it seem as if there is a native, built-in way of doing this kind of conversion, without the need for a third-party tool.
This older thread appears to hint at this:
Filter and update functions in CouchDB?
This part in particular:
There are other design document functions that are being introduced at the time of this writing, including _update and _filter that we aren't covering in depth here. Filter functions are covered in Chapter 20, Change Notifications. Imagine a web service that POSTs an XML blob at a URL of your choosing when particular events occur. PayPal's instant payment notification is one of these. With an _update handler, you can POST these directly in CouchDB and it can parse the XML into a JSON document and save it. The same goes for CSV, multi-part form, or any other format.
But when I dig deeper I don't find anything concrete.
The supporting wiki link is not clear to me (a beginner with JSON/NoSQL/curl stuff): http://wiki.apache.org/couchdb/Document_Update_Handlers
Hopefully this is a simple yes/no. Any links on this that are better than the one above are also appreciated, thank you!
CouchDB supports transforming the internal documents/views into many other formats through the use of show and list functions. It's not a "native" transformation, as you define the transformation yourself; it's not magical.
That being said, there is no similar mechanism for the reverse (i.e., converting some arbitrary format into JSON documents), but you're much better off scripting that in a full-featured language and using the bulk docs API to do your imports in batches.
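For instance, a minimal Python sketch that turns a CSV file into JSON documents and imports them in one batch through the _bulk_docs endpoint. The host, database name, and CSV layout are assumptions, and a real server will usually also need credentials:

import csv
import json
import urllib.request

# Read each CSV row as one JSON document to be created.
with open("records.csv", newline="", encoding="utf-8") as f:
    docs = [dict(row) for row in csv.DictReader(f)]

# POST the whole batch to CouchDB's bulk-documents endpoint.
payload = json.dumps({"docs": docs}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:5984/mydb/_bulk_docs",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode("utf-8"))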
I would love to use protocol buffers, but I am not sure if they fit my use case. Here it is:
I have a Quiz app. This requires a bunch of data, like categories, questions, a list of answers (and which ones are correct). I do not want to be responsible for entering this data - I would prefer to pass it off to a non-programmer to serialize all this data for me, in either XML or JSON. Then my app would just read in the data file.
Does Google's Protocol Buffers fit my use case? Or should I stick to a more traditional format like XML or JSON?
I think not: Protobuf is a binary format, so you would then need to support both a text format like XML or JSON and Protobuf.
Also, it does not seem you would benefit from Protobuf's better performance at all.
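To illustrate sticking with JSON: a hand-editable data file plus the few lines of Python needed to read it. The file layout and field names here are invented for illustration:

import json

# quiz.json is edited by hand by a non-programmer; an illustrative layout:
# {"categories": [{"name": "Science",
#                  "questions": [{"text": "Which planet is the Red Planet?",
#                                 "answers": ["Venus", "Mars", "Jupiter"],
#                                 "correct": [1]}]}]}
with open("quiz.json", encoding="utf-8") as f:
    quiz = json.load(f)

for category in quiz["categories"]:
    for question in category["questions"]:
        correct = [question["answers"][i] for i in question["correct"]]
        print(category["name"], "-", question["text"], "->", correct)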
If I were to store the same markup in 2 separate documents, one XML, the other JSON, in MarkLogic 6, does MarkLogic automatically convert the JSON equivalent to XML, and index it in that regard, or are both stored in their respective formats?
What I'm getting at is, does MarkLogic store ALL documents as XML, regardless, and simply apply JSON transformations to JSON documents when queried?
If documents are stored in native format, is there any advantage, in terms of performance, to storing documents in JSON over XML?
Below is an example code-snippet:
if ($outputFormat = "json") then (: result in json format :)
  let $custom-config :=
    let $config := json:config("custom")
    return (
      map:put($config, "array-element-names", (
        xs:QName("lp:lesson_plan"),
        xs:QName("lp:instructional_segment"),
        xs:QName("lp:strand_type"),
        xs:QName("lp:resource"),
        xs:QName("lp:level"),
        xs:QName("lp:discipline"),
        xs:QName("lp:language"),
        xs:QName("lp:program"),
        xs:QName("lp:grade"),
        xs:QName("res:strand_type"),
        xs:QName("res:resource"),
        xs:QName("res:ISBN"),
        xs:QName("res:level"),
        xs:QName("res:standard"),
        xs:QName("res:secondaryURL"),
        xs:QName("res:grade"),
        xs:QName("res:keyword"))),
      map:put($config, "whitespace", "ignore"),
      map:put($config, "text-value", "value"),
      $config
    )
  return json:transform-to-json($finalResult, $custom-config)
else (: finalResult in xml format :)
  $finalResult
MarkLogic is XML-native and does need to convert JSON to XML to store it in the database. There is a high-level JSON library to perform the transformations. The main functions are json:transform-to-json and json:transform-from-json, which, when configured correctly, should provide lossless conversions.
I think the main difference from your example is whether you want to convert to XML using your own process or use MarkLogic's toolkit.
For more detailed information, see MarkLogic's docs:
http://docs.marklogic.com/guide/app-dev/json
On disk, MarkLogic stores highly compressed C++ data structures that represent hierarchical trees and corresponding indexes. (OK, that’s an over-simplification, but illustrative nonetheless.) There are two places where you as a developer will typically interact with those data structures: 1) building queries and application logic 2) deserializing/serializing data into and out of this internal data model. Today, MarkLogic uses the XML data model (XDM) for the latter and, correspondingly, XQuery, XPath, and XSLT for the former. We chose this stack for several reasons: XML is good at representing both text mark-up as well as data structures and the tooling around XML is mature and widespread.
Having said that, JSON has emerged as a popular serialization of hierarchical data structures—the “X” in AJAX. While we don't have the same watertight abstraction between JSON and MarkLogic’s internal data model today, we do provide a set of tools that allow you to efficiently and losslessly convert between JSON and the XML data model. Additionally, our REST and Java APIs allow you to store, retrieve, and even query tree structures that originated as JSON without having to think about this conversion step; the APIs handle this in the plumbing.
As for performance, there will be a little overhead converting between a JSON and XDM representation. However, I’d expect that to be negligible for most applications. The real benefits of XML will be in the expressiveness of XQuery, XPath, and XSLT in working with the data. There is no widespread equivalent to these in the JSON world today.
One footnote: The REST API (and thus the Java API wrapper around the REST API) provide a facade for the JSON conversion to XML -- that is, the APIs do the conversion to XML for you.
Usually, you don't need to think about the conversion except when you are creating range and geospatial indexes over the converted elements.
If you need to support JSON documents in your client, then the facade is convenient.
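For example, a Python sketch that stores a JSON document through the REST API and lets the facade deal with the server-side representation. The host, port, credentials, and document URI are assumptions, and the REST instance must already exist:

import requests  # third-party: pip install requests

# PUT a JSON document; the REST API facade handles any conversion
# to the database's internal representation.
resp = requests.put(
    "http://localhost:8003/v1/documents",
    params={"uri": "/people/alganet.json", "format": "json"},
    data='{"profile": {"username": "alganet"}}',
    headers={"Content-Type": "application/json"},
    auth=requests.auth.HTTPDigestAuth("rest-writer", "password"),
)
print(resp.status_code)  # 201 on create, 204 on update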
On the other hand, expressing the structure as JSON has no advantages for database operations and some limitations. (For instance, XML has the standards-based, baked atomic data types, schema validation, and server processing with XQuery or XSLT.) So, if you have complete control over the data structure, you might want to write it to the server as XML.
As of MarkLogic 8 (February 2015), JSON is now a native data type, just like XML. This eliminates the needs for a translation layer for applications that want to work exclusively in JSON. In addition, we’ve added JavaScript as a first-class language in the database itself (using Google’s V8 engine). This means that you can write stored procedures, triggers, and even full HTTP applications with JavaScript that runs in the database, close to the data.
We are designing a fairly complex REST API, in which most of the I/O are JSON encoded objects with a specific structure. One challenge we have found is to document the API in such a way that makes it easier for clients to post correct input and process output. Because the data of both the input and output requires fairly complex JSON objects, client developers often introduce bugs related to the structure of the I/O objects.
With all of the JSON web API's these days, I would have hoped for a general solution, but I am having a hard time finding one. I looked into json-schema which is a json-validation schema but both the IETF draft and implementations seem to be fairly immature (even though they have been around for a while, which is not a good sign).
A slightly different approach is offered by Protocol Buffers and Apache Avro, where the schema is not used for validation, but actually required for the encoding/decoding of the message. Of these 2, Avro seems to have rather limited documentation and implementations. ProtoBuf seems better, but I am not sure if this is really suitable to use in the browser to call a JSON api?
Now I am starting to doubt if I am looking at this from the right angle. Are there other methods available to make my API a bit more strong-typed'ish? Or is a formal description of a JSON REST/RPC API something that defeats the purpose of using JSON?
Edit: 6 months after this topic we found mongoose, which is very close to what we were looking for.
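(For reference, basic structural validation with one of those json-schema implementations looks like this; the Python jsonschema package is one of them, and the schema and payload are invented for illustration.)

from jsonschema import ValidationError, validate  # pip install jsonschema

# A formal description of one of the API's input objects.
schema = {
    "type": "object",
    "properties": {
        "username": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["username"],
}

payload = {"username": "alganet", "tags": ["json", "rest"]}
try:
    validate(instance=payload, schema=schema)
    print("payload matches the schema")
except ValidationError as err:
    print("invalid payload:", err.message)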
Below is a reply I received by email from Douglas Crockford.
I am not a believer in schemas as an alternative to input validation.
There are properties that cannot be verified from the syntax. I think
that was one of the ways that XML went wrong.
If your formats are too complex, then I would look at simplifying
them.
Such systems exist and I'm the author of one of them. It is called Piqi-RPC and it does IDL-based validation of the input and output parameters for RPC-style APIs over HTTP.
It supports JSON, XML and Google Protocol Buffers as data representation formats for input and output of HTTP POST requests. Clients can choose to use any of the three formats and specify their choice using the standard Accept and Content-Type HTTP headers.
So, yes, in theory, you are looking in the right direction. However, at the moment, Piqi-RPC supports writing servers only in Erlang and it wouldn't be very useful for you if you use a different stack. I heard that Apache Thrift also supports JSON over HTTP transport, but I haven't checked. Another kind of similar system I know of (also for Erlang) is called UBF. I have heard of libraries for Java that can parse and validate JSON based on Protocol Buffers specification (e.g. http://code.google.com/p/protostuff/).
The idea itself is far from being new, but there aren't many systems that approach it in practice. It is a challenging problem.
Historically, IDLs were used for interface definition and binary data serialization and not so much for validating dynamic data interchange formats (e.g. XML and JSON), which emerged later. Sun-RPC IDL and CORBA IDL fall into the first category. WSDL would be one of the few examples covering both areas, but it is a terrible piece of technology and it would be a bad choice for most modern systems. In addition, there are many schema languages (also known as DDLs -- data definition languages), most of which are highly specialized and work with only one representation format, e.g. XML or JSON schemas. Few of those have stable implementations.
The Piqi project and Piqi-RPC, which is based on it, are built around several fairly simple realizations:
A DDL doesn't have to be explicitly tied to any particular data representation format or built around it. Instead, such a language can be fairly universal and cover a wide range of practical use-cases (e.g. cross-language data serialization and data validation) and data formats (e.g. JSON, XML, Protocol Buffers).
IDL for RPC-style communication can be implemented as a thin, mostly syntactic layer on top of the universal DDL.
Such IDL and interface specifications can be transport agnostic.
Speaking of REST-style APIs over HTTP compared to RPC-style APIs over HTTP.
With RPC-style APIs, a service developer or an automated system has to validate three things: the function name (according to some service naming scheme), the input and, if you choose so, the output.
In the case of REST-style APIs, people get themselves into trouble for no good reason. Now they have a lot more to validate: arbitrarily complex URL syntax, including dynamic parameters encoded in URL segments (for all HTTP methods) and the URL query string (only for the HTTP GET method); HTTP method correspondence (whether it should be GET, POST, PUT, DELETE, etc.); the HTTP body, when some parameters go there (sometimes they do it manually twice, for parameters represented in both JSON and XML); custom HTTP headers; and, separately, service documentation. Imagine an IDL supporting all that!
XML is better for RESTful services in many ways. It has native linking (<link href=, for all those HATEOAS fans), native language support (lang="en") and a great ecosystem.
It is also better for future-proofing and future API refactorings. Converting this:
<profile>
<username>alganet</username>
</profile>
To support more usernames:
<profile>
<username>alganet</username>
<username>alexandre</username>
</profile>
Is much simpler to do in XML without breaking existing clients. JSON is harder in that respect: the value of "username" would have to change from a string to an array, which changes its type and breaks any client that expected a single string.
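To make that concrete, a Python sketch of how client code reacts to the change (the documents are the ones above). A query over XML already returns a list, so it keeps working when a second username appears; JSON code that assumed a single string breaks when the value becomes an array:

import json
import xml.etree.ElementTree as ET

# XML client: findall already returns a list, so one or many usernames work.
for xml_doc in ("<profile><username>alganet</username></profile>",
                "<profile><username>alganet</username>"
                "<username>alexandre</username></profile>"):
    print("xml:", [u.text for u in ET.fromstring(xml_doc).findall("username")])

# JSON client: the value's type silently changes from string to array.
old = json.loads('{"profile": {"username": "alganet"}}')
new = json.loads('{"profile": {"username": ["alganet", "alexandre"]}}')
print("json old:", old["profile"]["username"].upper())  # works on a string
try:
    print("json new:", new["profile"]["username"].upper())
except AttributeError as err:
    print("json new breaks:", err)  # a list has no .upper()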
If you really need JSON, JSON-Schema is the way to go. It's immature, but I don't know anything better in that case. Maybe your consumers could choose between XML and JSON, so they can pick between a small payload (JSON) or RESTful candies (XML) using content negotiation.
I'd say the answer to your last question is yes. If you need a way to constrain and document the JSON "schema", why didn't you go with XML in the first place? It is not that much harder to parse, and being able to enforce a schema for it is a great advantage.