JSON alternatives (for the purpose of specifying configuration)?

I like JSON as a format for configuration files for the software I write. I like that it's lightweight, simple, and widely supported. However, I'm finding that there are some things I'd really like in JSON that it doesn't have.
JSON doesn't have multiline strings or here documents ( http://en.wikipedia.org/wiki/Here_document ), and that is often very awkward when you want your JSON file to be human-readable and -editable. You can use arrays of strings, but that's a kludgy workaround.
JSON doesn't allow comments.
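For illustration, the array-of-strings workaround mentioned above might look like the following Python sketch. The "motd" key and its contents are made up for the example; the point is only that the reader of the file has to join the lines back together by hand.

```python
import json

# Hypothetical config: a long text stored as an array of lines,
# because a JSON string cannot contain a raw line break.
config_text = """
{
  "motd": [
    "Welcome to the server.",
    "Maintenance window: Sunday 02:00.",
    "Contact ops with questions."
  ]
}
"""

config = json.loads(config_text)
motd = "\n".join(config["motd"])  # reassemble the multi-line string
print(motd)
```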
If you look at the formats of unix configuration files, you see a lot of people designing their own awkward formats for things that it would really make more sense to do using some kind of general-purpose thing. For example, here's some code from an Apache config file:
RewriteEngine on
RewriteBase /temp
RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml\s*;\s*q=0
RewriteCond %{REQUEST_URI} \.html
RewriteCond %{THE_REQUEST} HTTP/1\.1
RewriteRule t\.html t.xhtml [T=application/xhtml+xml]
Essentially, what's going on here is that they've invented an extremely painful way of writing a boolean function f(w,x,y,z)=w&!x&y&z. You want a logical "or"? They've got some separate (ugly) mechanism for that, too.
What this seems to point toward is some kind of data description language that is simple and Turing-incomplete, but still more expressive, flexible, and convenient than JSON. Does anyone know of such a language?
To my taste, XML is too complicated, and Lisp expressions have the wrong features (Turing-completeness) and lack the right features (here documents, expressive syntax).
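To make the idea concrete, here is a rough Python sketch of the boolean function f(w,x,y,z)=w&!x&y&z from the Apache example written as plain data and evaluated by a tiny interpreter. The rule format (nested arrays of the form [operator, operand...]) and the evaluate function are invented for this illustration and are not part of any existing tool.

```python
import json

# Hypothetical rule format: nested JSON arrays of the form [operator, operand...].
# Leaves are variable names looked up in an environment of booleans.
rule_text = '["and", "w", ["not", "x"], "y", "z"]'

def evaluate(expr, env):
    """Evaluate a rule tree. The rule language has no loops or definitions,
    so evaluation always terminates."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    if op == "and":
        return all(evaluate(a, env) for a in args)
    if op == "or":
        return any(evaluate(a, env) for a in args)
    if op == "not":
        (arg,) = args
        return not evaluate(arg, env)
    raise ValueError(f"unknown operator: {op!r}")

rule = json.loads(rule_text)
print(evaluate(rule, {"w": True, "x": False, "y": True, "z": True}))  # True
print(evaluate(rule, {"w": True, "x": True, "y": True, "z": True}))   # False
```

A logical "or" needs no separate mechanism here; it is just another operator in the same tree.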
[EDIT] The title is misleading. I'm not literally interested in the next iteration of json. I'm not interested in languages that are a subset of javascript. I'm interested in alternative data-description languages.

The EDN format is one option based on Clojure literals. It is almost a superset of JSON, except that no special symbol separates keys and values in maps (as : does in JSON); rather, all elements are separated by whitespace and/or a comma and a map is encoded as a list with an even number of elements, enclosed in {..}.
EDN allows for comments (to newline using ;, or to end of the next element read using #_), but not here-docs. It is extensible to new types using a tag notation:
#myapp/Person {:first "Fred" :last "Mertz"}
The argument of the myapp/Person tag (i.e. {:first "Fred" :last "Mertz"}) must be a valid EDN expression, so tags cannot be used to add here-doc support.
It has two built-in tags: #inst for timestamps and #uuid. It also supports namespaced symbol (i.e. identifier) and keyword (i.e. map key consts) types; it distinguishes lists (..) and vectors [..]. An element of any type may be used as a key in a map.
In the context of your above problem, one could invent an #apache/rule-or tag which accepts a sequence of elements, whose semantics I leave up to you!

Have a look at http://github.com/igagis/puu/
It is even simpler than JSON.
It has C++ style comments.
It is possible to format multiline strings and use escaped new line \n and tab \t chars if "real" new line or tab is needed.
Here is the example snippet:
"String object"
AnotherStringObject
"String with children"{
"child 1"
Child2
"child three"{
SubChild1
"Subchild two"
Property1 {Value1}
"Property two" {"Value 2"}
//comment
/* multi-line
comment */
"multi-line
string"
"Escape sequences \" \n \r \t \\"
}
R"qwerty(
This is a
raw string, "Hello world!"
int main(argc, argv){
int a = 10;
printf("Hello %d", a);
}
)qwerty"
}

Consider TOML.
Designed for configuration. Appears to be pretty friendly and powerful. Easy to read and supports a wide range of datatypes and structures. There are parsers for a lot of languages:
C
C#
C++
Common Lisp
Crystal
Dart
Erlang
Fortran
Go
Janet
Java
JavaScript
Julia
Kotlin
Lua
Nim
OCaml
Perl
Perl6/Raku
Python
Rust
Swift
V

The 'J' in JSON is "JavaScript". If a particular desired syntax construct isn't in JavaScript, then it won't be in JSON.
Heredocs are beyond JSON's purview. That's a language syntax construct for simplified multi-line string definition, but JSON is a transport notation. It has nothing to do with construction. It does, however, have multiline strings, simply by allowing escaped \n newline characters within strings. Nothing in JSON says a string value can't contain a line break, as long as it is written as the \n escape rather than as a literal line break in the file. e.g.
{"x":"y\nz"}
is 100% legitimate valid JSON, and is a multiline string, whereas
{"x":"y
z"}
isn't and will fail on parsing.
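The difference between the two documents above can be checked with any strict parser. A quick sketch using Python's standard json module:

```python
import json

escaped = '{"x":"y\\nz"}'   # backslash + n inside the JSON text
raw = '{"x":"y\nz"}'        # an actual line break inside the JSON string

# The escaped form parses, and the resulting value spans two lines.
print(json.loads(escaped)["x"].splitlines())

# The raw line break is a control character inside a string, which is rejected.
try:
    json.loads(raw)
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```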

There's always what I like to call "real JSON". JSON stands for JavaScript Object Notation, and JavaScript does have comments and something close enough to heredocs.
For the heredoc, you would use JavaScript's E4X inline XML:
{
longString: <>
Hello, world!
This is a long string made possible with the magic of E4X.
Implementing a parser isn't so difficult.
</>.toString() // And a comment
/* And another
comment */
}
You can use Firefox's JavaScript engine (FF is the only browser to support E4X currently) or you can implement your own parser, which really isn't so difficult.
Here's the E4X quickstart guide, too.

Since March 2018 you can use JSON5 which seems to have added everything you (& many others) were missing from JSON.
Short Example (JSON5)
{
// comments
unquoted: 'and you can quote me on that',
singleQuotes: 'I can use "double quotes" here',
lineBreaks: "Look, Mom! \
No \\n's!",
hexadecimal: 0xdecaf,
leadingDecimalPoint: .8675309, andTrailing: 8675309.,
positiveSign: +1,
trailingComma: 'in objects', andIn: ['arrays',],
"backwardsCompatible": "with JSON",
}
The JSON5 Data Interchange Format (JSON5) is a superset of JSON that
aims to alleviate some of the limitations of JSON by expanding its
syntax to include some productions from ECMAScript 5.1.
Summary of Features
The following ECMAScript 5.1 features, which are not supported in
JSON, have been extended to JSON5.
Objects
Object keys may be an ECMAScript 5.1 IdentifierName.
Objects may have a single trailing comma.
Arrays
Arrays may have a single trailing comma.
Strings
Strings may be single quoted.
Strings may span multiple lines by escaping new line characters.
Strings may include character escapes.
Numbers
Numbers may be hexadecimal.
Numbers may have a leading or trailing decimal point.
Numbers may be IEEE 754 positive infinity, negative infinity, and NaN.
Numbers may begin with an explicit plus sign.
Comments
Single and multi-line comments are allowed.
White Space
Additional white space characters are allowed.
GitHub: https://github.com/json5/json5

One important attribute of JSON (probably the most important) is that you can easily "flip" between the string representation and the representation in object form, and the objects used to represent the object form are relatively simple arrays and maps. This is what makes JSON so useful in a networking context.
The functions you want would conflict with this dual nature of JSON.
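As a sketch of that "flip", here is the round trip in Python using the standard json module. The sample document is made up; the point is that the object form consists only of plain dicts, lists, strings, numbers, booleans and None.

```python
import json

text = '{"name": "demo", "ports": [80, 443], "tls": true, "proxy": null}'

obj = json.loads(text)  # string form -> plain dict/list objects
print(type(obj).__name__, type(obj["ports"]).__name__)

# object form -> string form -> object form again, with nothing lost
round_tripped = json.loads(json.dumps(obj))
print(round_tripped == obj)
```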

For configuration you could use an embeddable scripting language, such as Lua or Python. In fact, this is not an uncommon thing to do for configuration. That gives you multiline strings or here documents, and comments. It also makes it easier to have things like the boolean function you describe. However, the scripting languages are, of course, Turing complete.
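One possible middle ground, offered here only as a sketch and not as something this answer originally proposed: Python's ast.literal_eval accepts Python literal syntax (with comments and triple-quoted strings) but refuses anything that would execute code, so the config stays pure data. The keys below are made up.

```python
import ast

config_text = '''{
    # comments are fine
    "name": "demo",
    "banner": """first line
second line""",
    "ports": [80, 443],
}'''

config = ast.literal_eval(config_text)  # parses literals only, runs no code
print(config["banner"])

# Anything that is not a plain literal is rejected instead of being executed.
try:
    ast.literal_eval('{"x": __import__("os")}')
except ValueError:
    print("rejected non-literal config")
```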

There is also ELDF.
Although it does not support comments, they can be emulated via empty keys:
config_var1 = value1
=some comment
config_var2 = value2

Related

Regex for replacing unnecessary quotation marks within a JSON object containing an array

I am currently trying to format a JSON object using LabVIEW and have run into an issue where it adds additional quotation marks, invalidating my JSON formatting. I have not found a way around this, so I thought just formatting the string manually would be enough.
Here is the JSON object that I have:
{
"contentType":"application/json",
"content":{
"msgType":2,
"objects":"["cat","dog","bird"]",
"count":3
}
}
Here is the JSON object I want with the quotation marks removed.
{
"contentType":"application/json",
"content":{
"msgType":2,
"objects":["cat","dog","bird"],
"count":3
}
}
I am still not an expert with regex. Using a regex tester I was only able to grab the "objects" and "count" fields, and I feel I would still have to use substrings to remove the quotation marks.
Example I am using (would use a "count" to find the start of the next field and work backwards from there)
"([objects]*)"
Additionally, all the other Regex I have been looking at removes all instances of quotation marks whereas I only need a specific area trimmed. Thus, I feel that a specific regex replace would be a much more elegant solution.
If there is a better way to go about this I am happy to hear any suggestions!
Your question suggests that the built-in LabVIEW JSON tools are insufficient for your use case.
The built-in library converts LabVIEW clusters to JSON in a one-shot approach. Bundle all your data into a cluster and then convert it to JSON.
When it comes to parsing JSON, you use the path input terminal and the default type terminals to control what data is parsed from a JSON string.
If you need to handle JSON in a manner similar to say JavaScript, I would recommend something like the JSONText Toolkit which is free to use (and distribute) under the BSD licence. This allows more complex and iterative building of JSON strings from LabVIEW types and has text-path style element access along with many more features.
The Output controls from both my examples are identical - although JSONText provides a handy Pretty Print vi.
After using a regex from one of the comments, I ended up with this regex which allowed me to match the array itself.
(\[(?:"[^"]*"|[^"])+\])
I was able to split the JSON string into before match, match and after match, removed the quotation marks from the end of 'before match' and the start of 'after match', and concatenated the strings again to form a new output.
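Outside LabVIEW, the same fix can be sketched as a single substitution. The Python example below is an adaptation, not the exact steps above: it wraps the array pattern in the two stray quotation marks and keeps only the captured array. It assumes the array elements contain no escaped quotes or closing brackets.

```python
import json
import re

broken = '{"contentType":"application/json","content":{"msgType":2,"objects":"["cat","dog","bird"]","count":3}}'

# The array pattern from the answer above, wrapped in the two stray quotes.
stray_quotes = re.compile(r'"(\[(?:"[^"]*"|[^"])+\])"')

fixed = stray_quotes.sub(r"\1", broken)  # keep the array, drop the quotes around it
data = json.loads(fixed)                 # now parses as valid JSON
print(data["content"]["objects"])
```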

Custom reason for word wrap in VS Code extension to enable working with multiline values in Json

I write JSON documents for an API that often requires multiline values, because scripts are embedded in the data in the attributes. I've written an extension for myself that can escape and unescape multiline values, so I can cycle between these states:
{
"value": "
multiline
value
"
}
{
"value": "multiline\n value"
}
However, in the un-escaped, formatted state, I have invalid JSON, which just causes trouble. I have to switch between escaped and unescaped states to do any JSON operation (like format), which I work around by replacing \n with \\n and back.
I have even considered switching to another format, but none of those I tried had a killer feature that made me switch. Among those: JSONC (no multiline value support), XML (hard to write and read, but supports multiline values and indentation), YAML (would be an option, but does not support indentation in multiline values).
Can I force VS Code to render a specific sequence of characters as line break (in this case, it would be \\n) without changing the document data? The intended functionality is like what the Alt+Z word wrap does, just in a different place.
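For reference, the escaping step described above could be sketched like this. It is written in Python for brevity rather than as extension code, and the function name is made up. It walks the text, tracks whether it is inside a string literal, and replaces raw line breaks there with the \n escape; carriage returns and other control characters are not handled.

```python
import json

def escape_newlines_in_strings(text: str) -> str:
    """Replace raw line breaks inside JSON string literals with the \\n escape."""
    out = []
    in_string = False
    escaped = False  # True right after a backslash inside a string
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            elif ch == "\n":
                out.append("\\n")
                continue
        elif ch == '"':
            in_string = True
        out.append(ch)
    return "".join(out)

raw = '''{
"value": "
multiline
  value
"
}'''

valid = escape_newlines_in_strings(raw)
print(valid)
print(json.loads(valid))
```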
After some research, I've found the following:
* it is not possible, or at least hard, to hijack the default editor and make it render lines the way I want
* even if I managed to, there might be an underlying issue with long lines not being tokenized
I have decided to define a custom language, because:
* the API has a constant JSON structure
* I can define a new grammar for the API scripts and embed it in the JSON documents
This approach seems to be the way to go for me, although it might be a temporary solution. I'm losing JSON validation, so I no longer get an error for a raw new line in a value, but I also lose the errors for missing commas. I want to address this with the following precautions:
* if a JSON document is valid and contains attributes of the API-defined classes, offer a button to switch to my defined language, which also un-escapes \\\n to \n
* if the language is my language, escape the newlines and try to pass the result to JSON validation.

When can quotes be omitted in JSON?

It seems one of the best-kept secrets of JSON: When exactly can you leave out the quotes around a string – and what quotes (single or double) are you supposed to use anyway?
The JSON standard is pretty clear about it: use double quotes, and use them always. Yet nobody seems to follow that, and parsers seem generally fine with it.
For example, the keys in JSON documents generally don't seem to need quotes. (I guess that's because the parser can assume that the key must be a string literal). But is that an actual rule? Are there any other such rules? Are they parser-specific or language-specific?
Note that although the question is about JSON, this includes the standard way to express JSON objects in a given programming language. If a language (such as JavaScript) has official rules that deviate from the JSON standard, it would be helpful to see them defined.
Never. Dropping the quotes is legal in literals in JavaScript code, but illegal in JSON. Strings are always quoted, and keys are always strings. "Lax JSON" parsers may exist that accept illegal JSON with unquoted keys or other things, but that doesn't change the fact that it is illegal JSON as such, and no JSON parser is required to accept it.
Dropping the quotes around object-literal keys is a feature of the JavaScript language, and possibly others, not of JSON. Python, for instance, has a dictionary syntax that is pretty similar to JavaScript's except that key names cannot be unquoted (though they can be single-quoted, and they don't need to be strings).
May be a duplicate of this question: JSON Spec - does the key have to be surrounded with quotes?
And this one: What is the difference between object keys with quotes and without quotes?
Neither of which addresses the question of whether this is in the Javascript specification, or if it is just allowed by most browsers. I found this in the official ECMAScript specification:
http://www.ecma-international.org/ecma-262/5.1/#sec-11.1.5
http://www.ecma-international.org/ecma-262/5.1/#sec-7.6
The first defines an object literal, in which the PropertyName can be an IdentifierName, a StringLiteral, or a NumericLiteral. The second defines IdentifierName, which does not have quotes.
So, yes, unquoted property names are officially allowed in Javascript.

JSON - not all fields quoted in Dojo dijit.Tree sample code in book

O'Reilly book "Dojo - The Definitive Guide" page 378 shows the following sample Tree structure which is supposedly JSON. It seems to work in building the Dijit Tree structure.
{
identifier: 'name',
label:'name',
items: [
{
name: "Programming Languages",
children: [
etc...
Should the word identifier, label, items, name, and children be enclosed in quotes?
I'm writing a Python program to generate syntax that is compatible with their desired tree structure. Just to test my output, I tried:
testDict = "xxxx" where xxxx is the supposed JSON string above.
It always gives an error that 'identifier' is not defined.
So I'm curious if this was a typo - or if there are some new keywords or features of JSON that I need to learn.
Thanks,
Neal Walters
JSON doesn't really have any additional features. That's the beauty of it :)
Dojo doesn't require you to wrap those names in quotes, but strictly according to the JSON spec the names before the colon must be quoted. Why? Mostly (only?) because JavaScript gets upset when reserved words are used as object properties, for example if you had properties called 'function' or 'return'. Quoting these names consistently avoids this problem. Dojo doesn't care. It just uses eval to parse the JSON, and as long as you avoid keywords, it won't enforce the use of quotes. You can use quotes consistently if you'd like to be JSON compliant.
I'm not sure exactly what problem you had with your testDict example. I don't fully understand the context (what is testDict, what language are you using to set up that string, how is it used, etc.) Perhaps you needed to escape something in the JSON such as nested double quotes?
These are not new keywords or features of JSON but they are how dojo expects a JSON file to be structured. You should wrap them in quotes. Here's an example from dojocampus.
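Since the goal is to generate this structure from Python, one option is to build an ordinary dict (where the keys must be quoted string literals, otherwise Python looks for variables named identifier, label and so on) and let the standard json module produce the text. A minimal sketch; the empty children list is only a placeholder for the part the book excerpt elides:

```python
import json

# Keys are quoted string literals; a bare `identifier` would be a variable name.
tree = {
    "identifier": "name",
    "label": "name",
    "items": [
        # "children" is left empty here as a placeholder
        {"name": "Programming Languages", "children": []},
    ],
}

text = json.dumps(tree, indent=2)  # always emits double-quoted keys
print(text)
```

The output is strict JSON with every key quoted.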

Do the JSON keys have to be surrounded by quotes?

Example:
Is the following code valid against the JSON Spec?
{
precision: "zip"
}
Or should I always use the following syntax? (And if so, why?)
{
"precision": "zip"
}
I haven't really found something about this in the JSON specifications. Although they use quotes around their keys in their examples.
Yes, you need quotation marks. This is to make it simpler and to avoid having to have another escape method for JavaScript reserved keywords, e.g. {for:"foo"}.
You are correct to use strings as the key. Here is an excerpt from RFC 4627 - The application/json Media Type for JavaScript Object Notation (JSON)
2.2. Objects
An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a
string. A single colon comes after each name, separating the name
from the value. A single comma separates a value from a following
name. The names within an object SHOULD be unique.
object = begin-object [ member *( value-separator member ) ] end-object
member = string name-separator value
[...]
2.5. Strings
The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. [...]
string = quotation-mark *char quotation-mark
quotation-mark = %x22 ; "
Read the whole RFC here.
From 2.2. Objects
An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string.
and from 2.5. Strings
A string begins and ends with quotation marks.
So I would say that according to the standard: yes, you should always quote the key (although some parsers may be more forgiving)
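How forgiving a parser is depends on the implementation. As one data point, Python's standard json module is strict about this; the helper name below is made up for the sketch.

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if the strict standard-library parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

print(is_valid_json('{"precision": "zip"}'))   # double-quoted key
print(is_valid_json('{precision: "zip"}'))     # unquoted key
print(is_valid_json("{'precision': 'zip'}"))   # single-quoted key
```

Only the first document is accepted; the unquoted and single-quoted forms are both rejected.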
Yes, quotes are mandatory. http://json.org/ says:
string
""
" chars "
Not if you use JSON5
For regular JSON, yes, keys must be quoted. But if you need otherwise, check out the widely used JSON5, which is so named because it is a superset of JSON that allows ES5 syntax, including:
unquoted property keys
single-quoted, escaped and multi-line strings
alternate number formats
comments
extra whitespace
The JSON5 reference implementation (json5 npm package) provides a JSON5 object that has parse and stringify methods with the same args and semantics as the built-in JSON object.
widely used, and depended on by many high profile projects
JSON5 was started in 2012, and as of 2022, now gets >65M downloads/week, ranks in the top 0.1% of the most depended-upon packages on npm, and has been adopted by major projects like Chromium, Next.js, Babel, Retool, WebStorm, and more. It's also natively supported on Apple platforms like MacOS and iOS.
~ json5.org homepage
In your situation, both of them are valid, meaning that both of them will work.
However, you still should use the one with quotation marks in the key names because it is more conventional, which leads to more simplicity and ability to have key names with white spaces etc.
Therefore, use the one with the quotation marks.
edit// check this: What is the difference between JSON and Object Literal Notation?
Since you can put "parent.child" dotted notation and you don't have to put parent["child"] which is also valid and useful, I'd say both ways is technically acceptable. The parsers all should do both ways just fine. If your parser does not need quotes on keys then it's probably better not to put them (saves space). It makes sense to call them strings because that is what they are, and since the square brackets gives you the ability to use values for keys essentially it makes perfect sense not to.
In JavaScript (an object literal, not JSON) you can put...
>var keyName = "someKey";
>var obj = {[keyName]:"someValue"};
>obj
Object {someKey: "someValue"}
just fine without issues, if you need a value for a key and none quoted won't work, so if it doesn't, you can't, so you won't so "you don't need quotes on keys". Even if it's right to say they are technically strings. Logic and usage argue otherwise. Nor does it officially output Object {"someKey": "someValue"} for obj in our example run from the console of any browser.