External text and HTML - html

Suppose I have a few web forms to implement. The forms contain standard greetings, validation messages (e.g. "missing name", "email address is invalid"), errors (e.g. "temporary processing error"), etc.
Does it make sense to factor out all these text messages from the HTML and store it in an external text file so that non-technical people might edit the text ?
They say it is easier to edit text files instead of HTML. On the other hand I am afraid it would complicate the solution. What are the best practices in that field?

It depends from case. If texts are often edited by nontechnical people, it may make sense to move text into a separate file with simple structure. Otherwise, it indeed could complicate things.
Typically, server-side template engine is used to build pages from multiple different resources (such resources are, e.g., HTML-template files, database, configuration files, etc.). What type of resource and format of it to use is up to you and depends on situation. For example, you could store your error texts in JSON-format files like this:
{
"name" : {
"minlength" : {
"value": 2,
"error": "Name field must contain at least 2 characters"
},
"maxlength" : {
"value": 255,
"error": "Name field must contain not more than 255"
}
},
"email" : {
"pattern" : {
"value": "some_regexp_for_email_validation",
"error": "Please input a correct e-mail address"
}
}
}
In PHP in particular, JSON format can be read with json_decode() method.
An alternative to JSON is XML (though it's typically harder to use).
By the way, it may make sense to provide a web interface to edit form error rules and texts for nontechnical people. Then implementation details would be hidden from people who shouldn't know about them. So you could use whatever you want as for technical part of this while editors would see just a usable GUI with text fields.
You also may be interested in using ready server-side data-validation solutions like Zend Validator.

I'm using an java webapp which uses keys that mapped to strings in *.properties file.
I noticed that it's more difficult to support such code in cases you're searching "where's that field label "some cool field":
First you have to find key (ok, you get that key for that string is "submit.button.text"), and then you'll have to find where's key is actully used in your code.

Related

Inconsistence between REST POST and GET entity format?

I'm creating an API for submitting incoming email messages for internal processing. The mail server script will submit them in a simple format like, pretty much exactly as received:
POST /api/messages/
{
"sender": "sender#...",
"recipient": "recipient#...",
"email_message": "headers\nbody"
}
However upon submission the email_message is parsed, some fields are extracted, message body is parsed, etc, and what we actually store as the entity is this:
GET /api/messages/1/
{
"sender": "sender#...",
"recipient": "{our internal recipient ID}"
"subject": "Subject from email",
"date": "Date from email",
"parsed_body": "Output of some magic performed on the Email body",
... etc ...
}
As you can see this is quite different from what was submitted via POST in the first place.
Is this transformation permitted under the REST rules or should the entity be stored (and retrieved) as originally POSTed? And if it should be stored as is then what kind of API endpoint should I provide for submitting the unparsed messages?
Is this transformation permitted under the REST rules or should the entity be stored (and retrieved) as originally POSTed?
For POST, it's fine to store whatever makes sense. Think about how it works on the web; we fill out a form, which submits as a bunch of key value pairs, but the server makes no promise that it is going to store key value pairs.
PUT semantics are different; the client expects the new representation to overwrite the existing representation (if any), whatever that means. So you need to be more aware of the implications of those semantics, especially in regards to what assumptions the client is allowed to make as a consequence of various responses.
POST semantics are much more forgiving (and consequently, the client has less information available about what's going on; that's part of the tradeoff).

How to customise error message for invalid input?

How to customise error message for invalid input?
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"username": {
"type": "string",
"pattern": "^[A-Za-z0-9-_.]+$",
"minLength": 3
},
"password": {
"type": "string",
"minLength": 8,
"pattern": "^(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)[a-zA-Z\\d\\W]$"
}
},
"required": [
"username",
"password"
],
"errors": [
{
"property": "username",
"message": "min 3 characters, do not use spaces or special characters"
}
]
}
For example, if username input is not of required min length or doesn't satisfy regex pattern, display one custom message min 3 characters, do not use spaces or special characters
Custom error messages are not supported. However, there is some discussion going on to add a feature like this in the next version of JSON Schema.
Update 2021-01-26
JSON Schema never ended up supporting the customization of error messages in this way. The main problem with this is that appropriate error messaging is dependent on the audience and the context, so defining one message in this way is limiting. For example, a developer needs different feedback than an end user.
Instead, JSON Schema standardized the results that come back from a validation. This allows you to process the results to produce output that is appropriate for your audience. In theory libraries can be developed to produce error messaging for certain audiences. These would be decoupled from your validator library allowing you to more easily switch to another implementation in the future.
However, even the best error message producing libraries wouldn't solve the specific case presented in the original question. A library can't take a regular expression and produce a meaningful message. The good news is, JSON Schema provides an extension mechanism called vocabularies that you can use to create a custom keyword to annotate your schemas with the information that an output processor needs to produce a better error message. For example, the errors keyword in the original question would appear in the standard output results and can be used by an output processor as one of the ways it produces nice user facing error messages.
Unfortunately, no one has built one of these standard output processors yet, so you won't be able to pick one up off the shelf. It shouldn't be too hard to do, but you'd have to write it yourself. https://github.com/atlassian/better-ajv-errors is one of these output processor tools, but it uses ajv's proprietary output format rather than the standard format.
Both the standardized output format and JSON Schema Vocabularies are new in draft 2019-09 which has not seen big adoption so far. As time goes on, we expect to see more tools that make these kinds of customizations easy.

Using Hypermedia Constraint API to drive UI

I want to use a REST API with hypermedia constraint to drive my UI.
That is, depending on "possible next states" for the resources I fetch, I want to adapt my UI for this.
I'm quite new to UI dev on the web so I wonder if there is any special considerations I need to care about here?
Lets say I have a resource that looks like this:
{
href: "..",
orderDate: date..,
details: {
href : "..",
items: [..],
}
links: [
placeOrder : {
href : "...",
method : "post"
},
cancelOrder : {
href : "...",
method : "delete"
}]
}
Would the above links approach be valid within the context of HATEOAS ?
In a perfect world I guess one should just know about HTTP verbs for actions on the resource but if I want to let the UI know about what can be done to the resource, how do I do this in an idiomatic way?
What I mean is, the same kind of resource can have different "next possible state" depending on current status. And the UI needs to know about this.
Should the UI examine what links are available on the resource or how do I do this?
Yes, exactly. The UI should be coded entirely to the link relations presented to it. If a relation isn't available to follow, it shouldn't be returned in the link collection in the response. That drives not only current state, but also means that the UI isn't burdened with trying to calculate access control rules.
If by idiomatic you mean systematic then the description of the processing rules for the JSON you return need defining, then you communicate your cool new media-type via the Content-Type header.
https://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]
He's saying to put all your effort into figuring out how to design a data format and its rules such that the system is fully expressive for all its needs, like HTML.
(Or skip this and just use HTML for your API).

RESTful Collection Resources - idiomatic JSON representations and roundtripping

I have a collection resource called Columns. A GET with Accept: application/json can't directly return a collection, so my representation needs to nest it in a property:-
{ "propertyName": [
{ "Id": "Column1", "Description": "Description 1" },
{ "Id": "Column2", "Description": "Description 2" }
]
}
Questions:
what is the best name to use for the identifier propertyName above? should it be:
d (i.e. is d an established convention or is it specific to some particular frameworks (MS WCF and MS ASP.NET AJAX ?)
results (i.e. is results an established convention or is it specific to some particular specifications (MS OData)?)
Columns (i.e. the top level property should have a clear name and it helps to disambiguate my usage of generic application/json as the Media Type)
NB I feel pretty comfortable that there should be something wrapping it, and as pointed out by #tuespetre, XML or any other representation would force you to wrap it to some degree anyway
when PUTting the content back, should the same wrapping in said property be retained [given that it's not actually necessary for security reasons and perhaps conventional JSON usage idioms might be to drop such nesting for PUT and POST given that they're not necessary to guard against scripting attacks] ?
my gut tells me it should be symmetric as for every other representation but there may be prior art for dropping the d/*results** [assuming that's the answer to part 1]*
... Or should a PUT-back (or POST) drop the need for a wrapping property and just go with:-
[
{ "Id": "Column1", "Description": "Description 1" },
{ "Id": "Column2", "Description": "Description 2" }
]
Where would any root-level metadata go if one wished to add that?
How/would a person crafting a POST Just Know that it needs to be symmetric?
EDIT: I'm specifically interested in an answer that with a reasoned rationale that specifically takes into account the impacts on client usage with JSON. For example, HAL takes care to define a binding that makes sense for both target representations.
EDIT 2: Not accepted yet, why? The answers so far don't have citations or anything that makes them stand out over me doing a search and picking something out of the top 20 hits that seem reasonable. Am I just too picky? I guess I am (or more likely I just can't ask questions properly :D). Its a bit mad that a week and 3 days even with an )admittedly measly) bonus on still only gets 123 views (from which 3 answers ain't bad)
Updated Answer
Addressing your questions (as opposed than going off on a bit of a tangent in my original answer :D), here's my opinions:
1) My main opinion on this is that I dislike d. As a client consuming the API I would find it confusing. What does it even stand for anyway? data?
The other options look good. Columns is nice because it mirrors back to the user what they requested.
If you are doing pagination, then another option might be something like page or slice as it makes it clear to the client, that they are not receiving the entire contents of the collection.
{
"offset": 0,
"limit": 100,
"page" : [
...
]
}
2) TBH, I don't think it makes that much difference which way you go for this, however if it was me, I probably wouldn't bother sending back the envelope, as I don't think there is any need (see below) and why make the request structure any more complicated than it needs to be?
I think POSTing back the envelope would be odd. POST should let you add items into the collection, so why would the client need to post the envelope to do this?
PUTing the envelope back could make sense from a RESTful standpoint as it could be seen as updating metadata associated with the collection as a whole. I think it is worth thinking about the sort of meta data you will be exposing in the envelope. All the stuff I think would fit well in this envelope (like pagination, aggregations, search facets and similar meta data) is all read only, so it doesn't make sense for the client to send this back to the server. If you find yourself with a lot of data in the envelope that the client is able to mutate - then it could be a sign to break that data out into a separate resource with the list as a sub collection. Rubbish example:
/animals
{
"farmName": "farm",
"paging": {},
"animals": [
...
]
}
Could be broken up into:
/farm/1
{
"id": 1,
"farmName": "farm"
}
and
/farm/1/animals
{
"paging": {},
"animals": [
...
]
}
Note: Even with this split, you could still return both combined as a single response using something like Facebook's or LinkedIn's field expansion syntax. E.g. http://example.com/api/farm/1?field=animals.offset(0).limit(10)
In response, to your question about how the client should know what the JSON payload they are POSTing and PUTing should look like - this should be reflected in your API documentation. I'm not sure if there is a better tool for this, but Swagger provides a spec that allows you to document what your request bodies should look like using JSON Schema - check out this page for how to define your schemas and this page for how to reference them as a parameter of type body. Unfortunately, Swagger doesn't visualise the request bodies in it's fancy web UI yet, but it's is open source, so you could always add something to do this.
Original Answer
Check out William's comment in the discussion thread on that page - he suggests a way to avoid the exploit altogether which means you can safely use a JSON array at the root of your response and then you need not worry about either of you questions.
The exploit you link to relies on your API using a Cookie to authenticate a user's session - just use a query string parameter instead and you remove the exploit. It's probably worth doing this anyway since using Cookies for authentication on an API isn't very RESTful - some of your clients may not be web browsers and may not want to deal with cookies.
Why Does this fix work?
The exploit is a form of CSRF attack which relies on the attacker being able to add a script tag on his/her own page to a sensitive resource on your API.
<script src="http://mysite.com/api/columns"></script>
The victims web browser will send all Cookies stored under mysite.com to your server and to your servers this will look like a legitimate request - you will check the session_id cookie (or whatever your server-side framework calls the cookie) and see the user is authenticated. The request will look like this:
GET http://mysite.com/api/columns
Cookie: session_id=123456789;
If you change your API you ignore Cookies and use a session_id query string parameter instead, the attacker will have no way of tricking the victims web browser into sending the session_id to your API.
A valid request will now look like this:
GET http://mysite.com/api/columns?session_id=123456789
If using a JavaScript client to make the above request, you could get the session_id from a cookie. An attacker using JavaScript from another domain will not be able to do this, as you cannot get cookies for other domains (see here).
Now we have fixed the issue and are ignoring session_id cookies, the script tag on the attackers website will still send a similar request with a GET line like this:
GET http://mysite.com/api/columns
But your server will respond with a 403 Forbidden since the GET is missing the required session_id query string parameter.
What if I'm not authenticating users for this API?
If you are not authenticating users, then your data cannot be sensitive and anyone can call the URI. CSRF should be a non-issue since with no authentication, even if you prevent CSRF attacks, an attacker could just call your API server side to get your data and use it in anyway he/she wants.
I would go for 'd' because it clearly separates the 'envelope' of your resource from its content. This would also make it easier for consumers to parse your responses, as opposed to 'guessing' the name of the wrapping property of a given resource before being able to access what it holds.
I think you're talking about two different things:
POST request should be sent in application/x-www-form-urlencoded. Your response should basically mirror a GET if you choose to include a representation of the newly created resource in your reply. (not mandatory in HTTP).
PUTs should definitely be symmetric to GETs. The purpose of a PUT request is to replace an existing resource representation with another. It just makes sense to have both requests share the same conventions, doesn't it?
Go with 'Columns' because it is semantically meaningful. It helps to think of how JSON and XML could mirror each other.
If you would PUT the collection back, you might as well use the same media type (syntax, format, what you will call it.)

Can comments be used in JSON?

Can I use comments inside a JSON file? If so, how?
No.
JSON is data-only. If you include a comment, then it must be data too.
You could have a designated data element called "_comment" (or something) that should be ignored by apps that use the JSON data.
You would probably be better having the comment in the processes that generates/receives the JSON, as they are supposed to know what the JSON data will be in advance, or at least the structure of it.
But if you decided to:
{
"_comment": "comment text goes here...",
"glossary": {
"title": "example glossary",
"GlossDiv": {
"title": "S",
"GlossList": {
"GlossEntry": {
"ID": "SGML",
"SortAs": "SGML",
"GlossTerm": "Standard Generalized Markup Language",
"Acronym": "SGML",
"Abbrev": "ISO 8879:1986",
"GlossDef": {
"para": "A meta-markup language, used to create markup languages such as DocBook.",
"GlossSeeAlso": ["GML", "XML"]
},
"GlossSee": "markup"
}
}
}
}
}
No, comments of the form //… or /*…*/ are not allowed in JSON. This answer is based on:
https://www.json.org
RFC 4627:
The application/json Media Type for JavaScript Object Notation (JSON)
RFC 8259 The JavaScript Object Notation (JSON) Data Interchange Format (supercedes RFCs 4627, 7158, 7159)
Include comments if you choose; strip them out with a minifier before parsing or transmitting.
I just released JSON.minify() which strips out comments and whitespace from a block of JSON and makes it valid JSON that can be parsed. So, you might use it like:
JSON.parse(JSON.minify(my_str));
When I released it, I got a huge backlash of people disagreeing with even the idea of it, so I decided that I'd write a comprehensive blog post on why comments make sense in JSON. It includes this notable comment from the creator of JSON:
Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser. - Douglas Crockford, 2012
Hopefully that's helpful to those who disagree with why JSON.minify() could be useful.
Comments were removed from JSON by design.
I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't.
Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.
Source: Public statement by Douglas Crockford on G+
JSON does not support comments. It was also never intended to be used for configuration files where comments would be needed.
Hjson is a configuration file format for humans. Relaxed syntax, fewer mistakes, more comments.
See hjson.github.io for JavaScript, Java, Python, PHP, Rust, Go, Ruby, C++ and C# libraries.
DISCLAIMER: YOUR WARRANTY IS VOID
As has been pointed out, this hack takes advantage of the implementation of the spec. Not all JSON parsers will understand this sort of JSON. Streaming parsers in particular will choke.
It's an interesting curiosity, but you should really not be using it for anything at all. Below is the original answer.
I've found a little hack that allows you to place comments in a JSON file that will not affect the parsing, or alter the data being represented in any way.
It appears that when declaring an object literal you can specify two values with the same key, and the last one takes precedence. Believe it or not, it turns out that JSON parsers work the same way. So we can use this to create comments in the source JSON that will not be present in a parsed object representation.
({a: 1, a: 2});
// => Object {a: 2}
Object.keys(JSON.parse('{"a": 1, "a": 2}')).length;
// => 1
If we apply this technique, your commented JSON file might look like this:
{
"api_host" : "The hostname of your API server. You may also specify the port.",
"api_host" : "hodorhodor.com",
"retry_interval" : "The interval in seconds between retrying failed API calls",
"retry_interval" : 10,
"auth_token" : "The authentication token. It is available in your developer dashboard under 'Settings'",
"auth_token" : "5ad0eb93697215bc0d48a7b69aa6fb8b",
"favorite_numbers": "An array containing my all-time favorite numbers",
"favorite_numbers": [19, 13, 53]
}
The above code is valid JSON. If you parse it, you'll get an object like this:
{
"api_host": "hodorhodor.com",
"retry_interval": 10,
"auth_token": "5ad0eb93697215bc0d48a7b69aa6fb8b",
"favorite_numbers": [19,13,53]
}
Which means there is no trace of the comments, and they won't have weird side-effects.
Happy hacking!
Consider using YAML. It's nearly a superset of JSON (virtually all valid JSON is valid YAML) and it allows comments.
You can't. At least that's my experience from a quick glance at json.org.
JSON has its syntax visualized on that page. There isn't any note about comments.
Comments are not an official standard, although some parsers support C++-style comments. One that I use is JsonCpp. In the examples there is this one:
// Configuration options
{
// Default encoding for text
"encoding" : "UTF-8",
// Plug-ins loaded at start-up
"plug-ins" : [
"python",
"c++",
"ruby"
],
// Tab indent size
"indent" : { "length" : 3, "use_space": true }
}
jsonlint does not validate this. So comments are a parser specific extension and not standard.
Another parser is JSON5.
An alternative to JSON TOML.
A further alternative is jsonc.
The latest version of nlohmann/json has optional support for ignoring comments on parsing.
Here is what I found in the Google Firebase documentation that allows you to put comments in JSON:
{
"//": "Some browsers will use this to enable push notifications.",
"//": "It is the same for all projects, this is not your project's sender ID",
"gcm_sender_id": "1234567890"
}
You should write a JSON schema instead. JSON schema is currently a proposed Internet draft specification. Besides documentation, the schema can also be used for validating your JSON data.
Example:
{
"description": "A person",
"type": "object",
"properties": {
"name": {
"type": "string"
},
"age": {
"type": "integer",
"maximum": 125
}
}
}
You can provide documentation by using the description schema attribute.
If you are using Jackson as your JSON parser then this is how you enable it to allow comments:
ObjectMapper mapper = new ObjectMapper().configure(Feature.ALLOW_COMMENTS, true);
Then you can have comments like this:
{
key: "value" // Comment
}
And you can also have comments starting with # by setting:
mapper.configure(Feature.ALLOW_YAML_COMMENTS, true);
But in general (as answered before) the specification does not allow comments.
NO. JSON used to support comments but they were abused and removed from the standard.
From the creator of JSON:
I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't. - Douglas Crockford, 2012
The official JSON site is at JSON.org. JSON is defined as a standard by ECMA International. There is always a petition process to have standards revised. It is unlikely that annotations will be added to the JSON standard for several reasons.
JSON by design is an easily reverse-engineered (human parsed) alternative to XML. It is simplified even to the point that annotations are unnecessary. It is not even a markup language. The goal is stability and interoperablilty.
Anyone who understands the "has-a" relationship of object orientation can understand any JSON structure - that is the whole point. It is just a directed acyclic graph (DAG) with node tags (key/value pairs), which is a near universal data structure.
This only annotation required might be "//These are DAG tags". The key names can be as informative as required, allowing arbitrary semantic arity.
Any platform can parse JSON with just a few lines of code. XML requires complex OO libraries that are not viable on many platforms.
Annotations would just make JSON less interoperable. There is simply nothing else to add unless what you really need is a markup language (XML), and don't care if your persisted data is easily parsed.
BUT as the creator of JSON also observed, there has always been JS pipeline support for comments:
Go ahead and insert all the comments you like.
Then pipe it through JSMin before handing it to your JSON parser. - Douglas Crockford, 2012
If you are using the Newtonsoft.Json library with ASP.NET to read/deserialize you can use comments in the JSON content:
//"name": "string"
//"id": int
or
/* This is a
comment example */
PS: Single-line comments are only supported with 6+ versions of Newtonsoft Json.
Additional note for people who can't think out of the box: I use the JSON format for basic settings in an ASP.NET web application I made. I read the file, convert it into the settings object with the Newtonsoft library and use it when necessary.
I prefer writing comments about each individual setting in the JSON file itself, and I really don't care about the integrity of the JSON format as long as the library I use is OK with it.
I think this is an 'easier to use/understand' way than creating a separate 'settings.README' file and explaining the settings in it.
If you have a problem with this kind of usage; sorry, the genie is out of the lamp. People would find other usages for JSON format, and there is nothing you can do about it.
If your text file, which is a JSON string, is going to be read by some program, how difficult would it be to strip out either C or C++ style comments before using it?
Answer: It would be a one liner. If you do that then JSON files could be used as configuration files.
The idea behind JSON is to provide simple data exchange between applications. These are typically web based and the language is JavaScript.
It doesn't really allow for comments as such, however, passing a comment as one of the name/value pairs in the data would certainly work, although that data would obviously need to be ignored or handled specifically by the parsing code.
All that said, it's not the intention that the JSON file should contain comments in the traditional sense. It should just be the data.
Have a look at the JSON website for more detail.
JSON does not support comments natively, but you can make your own decoder or at least preprocessor to strip out comments, that's perfectly fine (as long as you just ignore comments and don't use them to guide how your application should process the JSON data).
JSON does not have comments. A JSON encoder MUST NOT output comments.
A JSON decoder MAY accept and ignore comments.
Comments should never be used to transmit anything meaningful. That is
what JSON is for.
Cf: Douglas Crockford, author of JSON spec.
I just encountering this for configuration files. I don't want to use XML (verbose, graphically, ugly, hard to read), or "ini" format (no hierarchy, no real standard, etc.) or Java "Properties" format (like .ini).
JSON can do all they can do, but it is way less verbose and more human readable - and parsers are easy and ubiquitous in many languages. It's just a tree of data. But out-of-band comments are a necessity often to document "default" configurations and the like. Configurations are never to be "full documents", but trees of saved data that can be human readable when needed.
I guess one could use "#": "comment", for "valid" JSON.
It depends on your JSON library. Json.NET supports JavaScript-style comments, /* commment */.
See another Stack Overflow question.
Yes, the new standard, JSON5 allows the C++ style comments, among many other extensions:
// A single line comment.
/* A multi-
line comment. */
The JSON5 Data Interchange Format (JSON5) is a superset of JSON that aims to alleviate some of the limitations of JSON. It is fully backwards compatible, and using it is probably better than writing the custom non standard parser, turning non standard features on for the existing one or using various hacks like string fields for commenting. Or, if the parser in use supports, simply agree we are using JSON 5 subset that is JSON and C++ style comments. It is much better than we tweak JSON standard the way we see fit.
There is already npm package, Python package, Java package and C library available. It is backwards compatible. I see no reason to stay with the "official" JSON restrictions.
I think that removing comments from JSON has been driven by the same reasons as removing the operator overloading in Java: can be used the wrong way yet some clearly legitimate use cases were overlooked. For operator overloading, it is matrix algebra and complex numbers. For JSON comments, its is configuration files and other documents that may be written, edited or read by humans and not just by parser.
JSON makes a lot of sense for config files and other local usage because it's ubiquitous and because it's much simpler than XML.
If people have strong reasons against having comments in JSON when communicating data (whether valid or not), then possibly JSON could be split into two:
JSON-COM: JSON on the wire, or rules that apply when communicating JSON data.
JSON-DOC: JSON document, or JSON in files or locally. Rules that define a valid JSON document.
JSON-DOC will allow comments, and other minor differences might exist such as handling whitespace. Parsers can easily convert from one spec to the other.
With regards to the remark made by Douglas Crockford on this issues (referenced by #Artur Czajka)
Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.
We're talking about a generic config file issue (cross language/platform), and he's answering with a JS specific utility!
Sure a JSON specific minify can be implemented in any language,
but standardize this so it becomes ubiquitous across parsers in all languages and platforms so people stop wasting their time lacking the feature because they have good use-cases for it, looking the issue up in online forums, and getting people telling them it's a bad idea or suggesting it's easy to implement stripping comments out of text files.
The other issue is interoperability. Suppose you have a library or API or any kind of subsystem which has some config or data files associated with it. And this subsystem is
to be accessed from different languages. Then do you go about telling people: by the way
don't forget to strip out the comments from the JSON files before passing them to the parser!
If you use JSON5 you can include comments.
JSON5 is a proposed extension to JSON that aims to make it easier for humans to write and maintain by hand. It does this by adding some minimal syntax features directly from ECMAScript 5.
The Dojo Toolkit JavaScript toolkit (at least as of version 1.4), allows you to include comments in your JSON. The comments can be of /* */ format. Dojo Toolkit consumes the JSON via the dojo.xhrGet() call.
Other JavaScript toolkits may work similarly.
This can be helpful when experimenting with alternate data structures (or even data lists) before choosing a final option.
JSON is not a framed protocol. It is a language free format. So a comment's format is not defined for JSON.
As many people have suggested, there are some tricks, for example, duplicate keys or a specific key _comment that you can use. It's up to you.
Disclaimer: This is silly
There is actually a way to add comments, and stay within the specification (no additional parser needed). It will not result into human-readable comments without any sort of parsing though.
You could abuse the following:
Insignificant whitespace is allowed before or after any token.
Whitespace is any sequence of one or more of the following code
points: character tabulation (U+0009), line feed (U+000A), carriage
return (U+000D), and space (U+0020).
In a hacky way, you can abuse this to add a comment. For instance: start and end your comment with a tab. Encode the comment in base3 and use the other whitespace characters to represent them. For instance.
010212 010202 011000 011000 011010 001012 010122 010121 011021 010202 001012 011022 010212 011020 010202 010202
(hello base three in ASCII) But instead of 0 use space, for 1 use line feed and for 2 use carriage return.
This will just leave you with a lot of unreadable whitespace (unless you make an IDE plugin to encode/decode it on the fly).
I never even tried this, for obvious reasons and neither should you.
You can have comments in JSONP, but not in pure JSON. I've just spent an hour trying to make my program work with this example from Highcharts.
If you follow the link, you will see
?(/* AAPL historical OHLC data from the Google Finance API */
[
/* May 2006 */
[1147651200000,67.79],
[1147737600000,64.98],
...
[1368057600000,456.77],
[1368144000000,452.97]
]);
Since I had a similar file in my local folder, there were no issues with the Same-origin policy, so I decided to use pure JSON… and, of course, $.getJSON was failing silently because of the comments.
Eventually I just sent a manual HTTP request to the address above and realized that the content-type was text/javascript since, well, JSONP returns pure JavaScript. In this case comments are allowed. But my application returned content-type application/json, so I had to remove the comments.
JSON doesn't allow comments, per se. The reasoning is utterly foolish, because you can use JSON itself to create comments, which obviates the reasoning entirely, and loads the parser data space for no good reason at all for exactly the same result and potential issues, such as they are: a JSON file with comments.
If you try to put comments in (using // or /* */ or # for instance), then some parsers will fail because this is strictly not
within the JSON specification. So you should never do that.
Here, for instance, where my image manipulation system has saved image notations and some basic formatted (comment) information relating to them (at the bottom):
{
"Notations": [
{
"anchorX": 333,
"anchorY": 265,
"areaMode": "Ellipse",
"extentX": 356,
"extentY": 294,
"opacity": 0.5,
"text": "Elliptical area on top",
"textX": 333,
"textY": 265,
"title": "Notation 1"
},
{
"anchorX": 87,
"anchorY": 385,
"areaMode": "Rectangle",
"extentX": 109,
"extentY": 412,
"opacity": 0.5,
"text": "Rect area\non bottom",
"textX": 98,
"textY": 385,
"title": "Notation 2"
},
{
"anchorX": 69,
"anchorY": 104,
"areaMode": "Polygon",
"extentX": 102,
"extentY": 136,
"opacity": 0.5,
"pointList": [
{
"i": 0,
"x": 83,
"y": 104
},
{
"i": 1,
"x": 69,
"y": 136
},
{
"i": 2,
"x": 102,
"y": 132
},
{
"i": 3,
"x": 83,
"y": 104
}
],
"text": "Simple polygon",
"textX": 85,
"textY": 104,
"title": "Notation 3"
}
],
"imageXW": 512,
"imageYW": 512,
"imageName": "lena_std.ato",
"tinyDocs": {
"c01": "JSON image notation data:",
"c02": "-------------------------",
"c03": "",
"c04": "This data contains image notations and related area",
"c05": "selection information that provides a means for an",
"c06": "image gallery to display notations with elliptical,",
"c07": "rectangular, polygonal or freehand area indications",
"c08": "over an image displayed to a gallery visitor.",
"c09": "",
"c10": "X and Y positions are all in image space. The image",
"c11": "resolution is given as imageXW and imageYW, which",
"c12": "you use to scale the notation areas to their proper",
"c13": "locations and sizes for your display of the image,",
"c14": "regardless of scale.",
"c15": "",
"c16": "For Ellipses, anchor is the center of the ellipse,",
"c17": "and the extents are the X and Y radii respectively.",
"c18": "",
"c19": "For Rectangles, the anchor is the top left and the",
"c20": "extents are the bottom right.",
"c21": "",
"c22": "For Freehand and Polygon area modes, the pointList",
"c23": "contains a series of numbered XY points. If the area",
"c24": "is closed, the last point will be the same as the",
"c25": "first, so all you have to be concerned with is drawing",
"c26": "lines between the points in the list. Anchor and extent",
"c27": "are set to the top left and bottom right of the indicated",
"c28": "region, and can be used as a simplistic rectangular",
"c29": "detect for the mouse hover position over these types",
"c30": "of areas.",
"c31": "",
"c32": "The textx and texty positions provide basic positioning",
"c33": "information to help you locate the text information",
"c34": "in a reasonable location associated with the area",
"c35": "indication.",
"c36": "",
"c37": "Opacity is a value between 0 and 1, where .5 represents",
"c38": "a 50% opaque backdrop and 1.0 represents a fully opaque",
"c39": "backdrop. Recommendation is that regions be drawn",
"c40": "only if the user hovers the pointer over the image,",
"c41": "and that the text associated with the regions be drawn",
"c42": "only if the user hovers the pointer over the indicated",
"c43": "region."
}
}
This is a "can you" question. And here is a "yes" answer.
No, you shouldn't use duplicative object members to stuff side channel data into a JSON encoding. (See "The names within an object SHOULD be unique" in the RFC).
And yes, you could insert comments around the JSON, which you could parse out.
But if you want a way of inserting and extracting arbitrary side-channel data to a valid JSON, here is an answer. We take advantage of the non-unique representation of data in a JSON encoding. This is allowed* in section two of the RFC under "whitespace is allowed before or after any of the six structural characters".
*The RFC only states "whitespace is allowed before or after any of the six structural characters", not explicitly mentioning strings, numbers, "false", "true", and "null". This omission is ignored in ALL implementations.
First, canonicalize your JSON by minifying it:
$jsonMin = json_encode(json_decode($json));
Then encode your comment in binary:
$hex = unpack('H*', $comment);
$commentBinary = base_convert($hex[1], 16, 2);
Then steg your binary:
$steg = str_replace('0', ' ', $commentBinary);
$steg = str_replace('1', "\t", $steg);
Here is your output:
$jsonWithComment = $steg . $jsonMin;
In my case, I need to use comments for debug purposes just before the output of the JSON. So I put the debug information in the HTTP header, to avoid breaking the client:
header("My-Json-Comment: Yes, I know it's a workaround ;-) ");
We are using strip-json-comments for our project. It supports something like:
/*
* Description
*/
{
// rainbows
"unicorn": /* ❤ */ "cake"
}
Simply npm install --save strip-json-comments to install and use it like:
var strip_json_comments = require('strip-json-comments')
var json = '{/*rainbows*/"unicorn":"cake"}';
JSON.parse(strip_json_comments(json));
//=> {unicorn: 'cake'}