I'm as new as can be to JSON. I understand that both JSON-LD and JSON Schema are used to validate JSON data. I, however, cannot find much information comparing and contrasting the two.
Which one is better?
Why use one over the other?
Advantages vs disadvantages?
Can these two even be compared?
Am I misunderstanding what JSON-LD and JSON Schema are?
JSON-LD's goal is to make JSON documents understandable by machines by linking them to well-defined vocabularies. It is not used to validate JSON data. JSON Schema is used for that purpose, though. So you can't really compare the two.
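For illustration, a minimal sketch of the kind of rule JSON Schema expresses (the property names here are made up):

    // A hypothetical JSON Schema: it constrains structure ("name" must be a
    // string and is required), not meaning.
    const personSchema = {
      type: "object",
      properties: {
        name: { type: "string" },
        age: { type: "integer", minimum: 0 }
      },
      required: ["name"]
    };

    // { "name": "Jane", "age": 30 }  -> valid against the schema
    // { "age": "thirty" }            -> invalid: "name" is missing, "age" is not an integer

JSON-LD, by contrast, attaches meaning to properties like "name" rather than constraining their types.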
Am I misunderstanding what JSON-LD and JSON Schema are?
About JSON-LD, yes. In the JSON-LD specification, JSON-LD is defined as:
a JSON-based format to serialize Linked Data
Can these two even be compared?
It would be better to compare JSON-LD with JSON (not JSON Schema). Or you could compare JSON Schema with other schema languages, e.g. XML Schema.
Note: The following section is about the remaining questions (considering the difference between JSON-LD and JSON).
Choosing between JSON and JSON-LD depends on the situation and context of use. Generally, JSON is a data format at the syntactic level: the data it encodes is merely machine-readable. JSON-LD, by contrast, is used to mark up data semantically, making it not only machine-readable but also machine-understandable, by adding syntax to JSON for the serialization of Linked Data.
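As a rough sketch, the same record as plain JSON and as JSON-LD (the schema.org vocabulary is used here purely as an example):

    // Plain JSON: machine-readable structure, but "name" has no defined meaning.
    const plain = { "name": "Jane Doe", "jobTitle": "Professor" };

    // JSON-LD: @context links the keys to a vocabulary (schema.org here), so a
    // consumer can resolve what "name" and "jobTitle" actually mean.
    const linked = {
      "@context": "https://schema.org",
      "@type": "Person",
      "name": "Jane Doe",
      "jobTitle": "Professor"
    };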
A recommended resource for understanding the detailed differences between JSON-LD and JSON is the JSON-LD specification published by the W3C.
Related
I'm wondering what is the difference between data parsing and transformation.
For example, if I need to convert data from XML format to JSON format will it be a transformation or parsing?
Transformation is a mapping from one form to another.
An XSLT transformation maps from XML to JSON, HTML, (different) XML, etc.
Parsing is an analysis of a sequential form to identify structural parts.
An XML parser reads XML and identifies its elements, attributes, and other parts.
Data conversion is fundamentally a transformation. Note, though, that transformations often leverage structure identified during parsing of the input form to create the output form.
Parsing technically is the process of establishing the logical structure of the textual input: for example establishing that <a b="3"/> represents an element named a containing an attribute named b whose value is 3.
Unfortunately the term seems to be increasingly misunderstood, and programmers without formal computer science training often misuse the term to mean almost any processing of the parsed data: we see questions on SO saying "I am writing a parser", when actually they are writing an application that consumes the output of a parser.
Converting XML to JSON is a three-stage process: parsing the XML, transforming the resulting data structure to a different data structure, and then serializing the transformed data structure into JSON syntax.
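For example, a minimal sketch of those three stages in a browser (DOMParser is assumed to be available; the element and property names are invented):

    const xml = '<person name="Jane" age="30"/>';

    // 1. Parse: turn the XML text into a tree of elements and attributes.
    const doc = new DOMParser().parseFromString(xml, "application/xml");
    const el = doc.documentElement;

    // 2. Transform: map that tree onto a different data structure.
    const data = {
      name: el.getAttribute("name"),
      age: Number(el.getAttribute("age"))
    };

    // 3. Serialize: write the new structure out as JSON text.
    const json = JSON.stringify(data); // '{"name":"Jane","age":30}'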
Originally, JSON borrowed its syntax from JavaScript (object literals), but then became a programming language agnostic data interchange format. Its structures (string, array, object) can be mapped directly to primitive data types in most dynamic programming languages and vice versa.
Now, since it is no longer tied to JavaScript, what is the abstract data model of JSON today? In other words, if we compare XML with JSON, is there an XML Infoset equivalent for JSON?
Obviously, JSON is not the only format that can be used for serialization of JSON-like documents. Alternatives include YAML, BSON, and even XML. Is there a name for that unified data model and perhaps a formal specification available?
XML is more complicated than the JSON format. Some common features that XML has and JSON lacks are namespaces, attributes, and comments. However, both formats can represent any kind of data, potentially with different structural logic.
What's the abstract data model of JSON? The same as it was when it was created; nothing has changed. JSON serves as a data format for server-client communication. It was never tied to JavaScript, since it is just formatted text and not some kind of executable. Its syntax originates from JavaScript, yes, but any language can interpret it with a text parser.
I am not sure what kind of information you are looking for, but the name of the process that converts language-specific structured data into strings and vice versa is serialization/deserialization, though you already know these terms ...
"Unified data model", "formal specification": what are you even looking for? Are you looking for principles of data formatting? Data storage? People need to store/transmit/present their data and they come up with ways to do it; there is nothing more to it.
JSON is better than XML for sure; I was wondering if there is any case where we should use XML instead of JSON.
If speaking in terms of REST, neither is better. Plain XML or plain JSON does not say anything about data transferred in either format. Though if you use well known formats like:
application/atom+xml
application/vnd.collection+json
comparison will boil down to which format suits your needs better.
If you compare XML to JSON from a programming-language perspective, yes, XML adds an extra layer between code and data, though nothing special. Oh, and XML is a little more verbose and larger in terms of bytes.
XML has been around for a long time, and there are a lot of tools in place for it that JSON does not yet have, or that are not yet commonplace or ubiquitous.
XML has XML Schema, RELAX NG, and DTDs. JSON does have an equivalent (JSON Schema), but it's not as commonplace.
XML has namespacing, which is great for mixing different document types. JSON does have some ideas on how to do namespacing (such as JSON-LD), but doing this correctly tends to take away exactly what people tend to enjoy JSON over XML for.
Namespacing in XML is everywhere, which gives you a very standard framework to re-use existing XML schemas for integration.
So I don't want to say "you should use XML" or "you should use JSON", but I would rather say that if you need to integrate with existing XML systems, or your needs would strongly benefit from features such as namespacing, schemas, linking, re-use of existing XML documents, XSLT, etc., XML might be a better choice.
Is JSON.stringify() equivalent to serialization, effectively serialization, or just a necessary step towards serialization?
In other words, is JSON.stringify() sufficient but not necessary for serialization? Or necessary but not sufficient? Or is it neither necessary nor sufficient for the serialization of JavaScript objects?
Serialization is the act of converting data into a format that can be written to disk or transmitted over the network (or written on paper, if that's what you want). Usually serialization transforms objects to text, but that's not required: there are several binary serialization formats, such as BitTorrent's bencoding and the old/ancient ASN.1 standards.
JSON is one text-based serialization format and is currently very popular due to its simplicity. It's not the only one, though. Other popular formats include XML and CSV.
Due to its popularity and its origin in JavaScript object literal syntax, ES5 introduced JSON.stringify() to generate a JSON string from an object. Previously you had to use libraries or write your own recursive serializer to do the job.
So, is JSON.stringify() enough for serialization? Yes, if the output format you want is JSON. No, if you want other output formats such as XML or CSV or bencode.
There are limitations to the JSON format. One limitation is that JSON cannot encode functions, so JSON.stringify() ignores functions/methods when serializing. JSON also can't encode circular references. Most other serialization formats have this limitation as well, but since JSON looks like JavaScript syntax, some people assume it can do what JavaScript object literals can. It can't.
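A quick sketch of both limitations:

    const obj = { name: "Jane", greet() { return "hi"; } };
    JSON.stringify(obj); // '{"name":"Jane"}' -- the method is silently dropped

    const a = {};
    a.self = a;          // circular reference
    JSON.stringify(a);   // throws TypeError: Converting circular structure to JSON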
So the relationship between "JSON" and "serialization" is like the relationship between "Toyota Prius" and "car". JSON.stringify() is simply a function that generates JSON strings so I guess that would make it a Toyota factory.
Old question, but the following information may be useful for posterity.
Of course, you can serialise any way you want, including any number of custom methods, but JSON has become an increasingly popular method.
The most obvious benefit of JSON is that it represents objects in the same way that JavaScript object literals do, though it is slightly less flexible. Nevertheless, if you can represent normal data in JavaScript then JSON is a good match.
The most significant feature is that, since it represents objects as well as arrays, it can represent fairly complex & hierarchical data.
For one reason or another, JSON has more-or-less supplanted XML as the preferred serialisation for sending data between the server and browser. It is so useful that many languages include their own JSON functions (PHP, for example, has the better-named json_encode & json_decode functions), as do some modern databases. I myself have found it convenient to use JSON functions to store a more complex data structure in a single field of a database (without JavaScript anywhere in sight).
The short answer is yes: for the most part it is a sufficient step for serializing most (non-binary) data. It is not, however, necessary, as there are alternatives.
Serializing binary data, on the other hand, now that’s another story …
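For instance, a typed array (raw binary data) does not survive a naive JSON.stringify round trip; it serializes as a plain index/value object:

    const bytes = new Uint8Array([1, 2, 3]);
    JSON.stringify(bytes); // '{"0":1,"1":2,"2":3}' -- structure-like, but no longer binary
    // A common workaround (just one option) is to encode the bytes as base64 text first.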
Short answer... Serialize means the same thing as Stringify, IMHO.
This is a near duplicate of How to reliably hash JavaScript objects?, where someone wants to reliably hash JavaScript objects.
Now that the JSON-LD specification has been finalized, I saw that there is a normalization procedure that they advertise as a potential way to normalize a JSON object:
normalize the data using the RDF Dataset normalization algorithm, and then dump the output to normalized NQuads format. The NQuads can then be processed via SHA-256, or similar algorithm, to get a deterministic hash of the contents of the Dataset.
Building a hash of a JSON object has always been a pain, because something like
sha1(JSON.stringify(object))
does not work, or is not guaranteed to work the same across implementations (the order of the keys is not defined, for example).
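For example, two objects holding the same data can stringify to different strings, and therefore hash differently:

    JSON.stringify({ a: 1, b: 2 }); // '{"a":1,"b":2}'
    JSON.stringify({ b: 2, a: 1 }); // '{"b":2,"a":1}' -- same data, different string, different hash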
Does JSON-LD work as advertised? Is it safe to use it as a universal JSON normalization procedure for hashing objects? Can those objects be standard JSON objects, or do they need some JSON-LD decorations (@context, ...) to be normalized?
Yes, normalization works with JSON-LD, but the objects do need to be given context (via the @context property) in order for them to produce any RDF. It is the RDF that is deterministically output in NQuads format (and that can then be hashed, for example).
If a property in a JSON-LD document is not defined via @context, then it will be dropped during processing. JSON-LD requires that you provide global meaning (semantics) to the properties in your document by associating them with URLs. These URLs may provide further machine-readable information about the meaning of the properties, their range, domain, etc. In this way data becomes "linked" -- you can both understand the meaning of a JSON document from one API in the context of another and you can traverse documents (via HTTP) to find more information.
So the short answer to the main question is "Yes, you can use JSON-LD normalization to build a unique hash for a JSON object", however, the caveat is that the JSON object must be a JSON-LD object, which really constitutes a subset of JSON. One of the main reasons for the invention of the normalization algorithm was for hashing and digitally-signing graphs (JSON-LD documents) for comparison.
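As a sketch of that workflow (assuming Node.js with the jsonld.js library and its canonize API, plus the built-in crypto module):

    const jsonld = require("jsonld"); // assumption: the jsonld.js library
    const crypto = require("crypto");

    // A JSON-LD document: @context gives the properties global meaning.
    const doc = {
      "@context": "https://schema.org",
      "@type": "Person",
      "name": "Jane Doe"
    };

    async function hashDoc(d) {
      // Canonicalize to deterministic N-Quads, then hash the resulting string.
      const nquads = await jsonld.canonize(d, {
        algorithm: "URDNA2015",
        format: "application/n-quads"
      });
      return crypto.createHash("sha256").update(nquads).digest("hex");
    }

Under those assumptions, logically equal JSON-LD documents should produce the same hash, which is the point of the normalization algorithm described above.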