When can quotes be omitted in JSON? - json

It seems one of the best-kept secrets of JSON: When exactly can you leave out the quotes around a string – and what quotes (single or double) are you supposed to use anyway?
The JSON standard is pretty clear about it: use double quotes, and use them always. Yet nobody seems to follow that, and parsers seem generally fine with it.
For example, the keys in JSON documents generally don't seem to need quotes. (I guess that's because the parser can assume that the key must be a string literal). But is that an actual rule? Are there any other such rules? Are they parser-specific or language-specific?
Note that although the question is about JSON, this includes the standard way to express JSON objects in a given programming language. If a language (such as JavaScript) has official rules that divert from the JSON standard, it would be helpful to see them defined.

Never. Dropping the quotes is legal in literals in JavaScript code, but illegal in JSON. Strings are always quoted, and keys are always strings. "Lax JSON" parsers may exist that accept illegal JSON with unquoted keys or other things, but that doesn't change the fact that it is illegal JSON as such, and no JSON parser is required to accept it.

Dropping the quotes in JSON object keys is a feature of the Javascript language, and possibly others. Python, for instance, has a dictionary syntax that is pretty similar to Javascript except that key names cannot be unquoted (though they can be single-quoted, and they don't need to be strings).
May be a duplicate of this question: JSON Spec - does the key have to be surrounded with quotes?
And this one: What is the difference between object keys with quotes and without quotes?
Neither of which addresses the question of whether this is in the Javascript specification, or if it is just allowed by most browsers. I found this in the official ECMAScript specification:
http://www.ecma-international.org/ecma-262/5.1/#sec-11.1.5
http://www.ecma-international.org/ecma-262/5.1/#sec-7.6
The first defines an object literal, in which the PropertyNameAndValue can be a StringLiteral or an IdentifierLiteral. The second defines an IdentifierLiteral, which does not have quotes.
So, yes, unquoted property names are officially allowed in Javascript.

Related

Is it good to always use string template literals with ES2015+/ES6+? [duplicate]

Is there a reason (performance or other) not to use backtick template literal syntax for all strings in a javascript source file? If so, what?
Should I prefer this:
var str1 = 'this is a string';
over this?
var str2 = `this is another string`;
Code-wise, there is no specific disadvantage. JS engines are smart enough to not have performance differences between a string literal and a template literal without variables.
In fact, I might even argue that it is good to always use template literals:
You can already use single quotes or double quotes to make strings. Choosing which one is largely arbitrary, and you just stick with one. However, it is encouraged to use the other quote if your string contains your chosen string marker, i.e. if you chose ', you would still do "don't argue" instead of 'don\'t argue'. However, backticks are very rare in normal language and strings, so you would actually more rarely have to either use another string literal syntax or use escape codes, which is good.
For example, you'd be forced to use escape sequences to have the string she said: "Don't do this!" with either double or single quotes, but you wouldn't have to when using backticks.
You don't have to convert if you want to use a variable in the string in the future.
However, those are very weak advantages. But still more than none, so I would mainly use template literals.
A real but in my opinion ignorable objection is the one of having to support environments where string literals are not supported. If you have those, you would know and wouldn't be asking this question.
The most significant reason not to use them is that ES6 is not supported in all environments.
Of course that might not affect you at all, but still: YAGNI. Don't use template literals unless you need interpolation, multiline literals, or unescaped quotes and apostrophes. Much of the arguments from When to use double or single quotes in JavaScript? carry over as well. As always, keep your code base consistent and use only one string literal style where you don't need a special one.
Always use template literals. In this case YAGNI is not correct. You absolutely will need it. At some point, you will have add a variable or new line to your string, at which point you will either need to change single quotes to backticks, or use the dreaded '+'.
Be careful when the values are for external use. We work with Tealium for marketing analysis, and it currently does not support ES6 template literals. Event data containing template literals aka string templates will cause the Tealium script to error.
I'm fairly convinced by other answers that there's no serious downside to using them exclusively, but one additional counterpoint is that template strings are also used in advanced "tagged template" syntax, and as illustrated in this Reddit comment, if you try to rely exclusively on JavaScript's automatic semicolon insertion or just forget to include a semicolon, you can run into parsing issues with statements that begin with a template string.
// OK (single (or double) quotes)
logger = console.log
'123'.split('').forEach(logger)
// OK (semicolon)
logger = console.log;
`123`.split('').forEach(logger)
// Not OK
logger = console.log
`123`.split('').forEach(logger) // Error

Unquoted json property name

One of the features in NewtonSoft's Json.Net parser is the support of unquoted property name, for example {test:"abc"}. Is it possible to turn off this feature so the json parser will throw an error when it parses a json string with unquoted property name?
What you can do is to deserialize and then serialize again and compare with original input. That will detect missing quotes. Even if this is not performance efficient, it is very simple and serves well when the relevant code performance does not matter much.

JSON alternatives (for the purpose of specifying configuration)?

I like json as a format for configuration files for the software I write. I like that it's lightweight, simple, and widely supported. However, I'm finding that there are some things I'd really like in json that it doesn't have.
Json doesn't have multiline strings or here documents ( http://en.wikipedia.org/wiki/Here_document ), and that is often very awkward when you want your json file to be human-readable and -editable. You can use arrays of strings, but that's a kludgy workaround.
Json doesn't allow comments.
If you look at the formats of unix configuration files, you see a lot of people designing their own awkward formats for things that it would really make more sense to do using some kind of general-purpose thing. For example, here's some code from an Apache config file:
RewriteEngine on
RewriteBase /temp
RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml\s*;\s*q=0
RewriteCond %{REQUEST_URI} \.html
RewriteCond %{THE_REQUEST} HTTP/1\.1
RewriteRule t\.html t.xhtml [T=application/xhtml+xml]
Essentially, what's going on here is that they've invented an extremely painful way of writing a boolean function f(w,x,y,z)=w&!x&y&z. You want a logical "or"? They've got some separate (ugly) mechanism for that, too.
What this seems to point toward is some kind of data description language that is simple and Turing-incomplete, but still more expressive, flexible, and convenient than json. Does anyone know of such a language?
To my taste, XML is too complicated, and lisp expressions have the wrong features (Turing-completeness) and lack the right features (here documents, expressive syntax).
[EDIT] The title is misleading. I'm not literally interested in the next iteration of json. I'm not interested in languages that are a subset of javascript. I'm interested in alternative data-description languages.
The EDN format is one option based on Clojure literals. It is almost a superset of JSON, except that no special symbol separates keys and values in maps (as : does in JSON); rather, all elements are separated by whitespace and/or a comma and a map is encoded as a list with an even number of elements, enclosed in {..}.
EDN allows for comments (to newline using ;, or to end of the next element read using #_), but not here-docs. It is extensible to new types using a tag notation:
#myapp/Person {:first "Fred" :last "Mertz"}
The argument of the myapp/Person tag (i.e. {:first "Fred" :last "Mertz"}) must be a valid EDN expression, which makes it unextensible to here-doc support.
It has two built-in tags: #inst for timestamps and #uuid. It also supports namespaced symbol (i.e. identifier) and keyword (i.e. map key consts) types; it distinguishes lists (..) and vectors [..]. An element of any type may be used as a key in a map.
In the context of your above problem, one could invent an #apache/rule-or tag which accepts a sequence of elements, whose semantics I leave up to you!
Have a look at http://github.com/igagis/puu/
It is even simpler than JSON.
It has C++ style comments.
It is possible to format multiline strings and use escaped new line \n and tab \t chars if "real" new line or tab is needed.
Here is the example snippet:
"String object"
AnotherStringObject
"String with children"{
"child 1"
Child2
"child three"{
SubChild1
"Subchild two"
Property1 {Value1}
"Property two" {"Value 2"}
//comment
/* multi-line
comment */
"multi-line
string"
"Escape sequences \" \n \r \t \\"
}
R"qwerty(
This is a
raw string, "Hello world!"
int main(argc, argv){
int a = 10;
printf("Hello %d", a);
}
)qwerty"
}
Consider TOML.
Designed for configuration. Appears to be pretty friendly and powerful. Easy to read and supports a wide range of datatypes and structures. There are parsers for a lot of languages:
C
C#
C++
Common Lisp
Crystal
Dart
Erlang
Fortran
Go
Janet
Java
JavaScript
Julia
Kotlin
Lua
Nim
OCaml
Perl
Perl6/Raku
Python
Rust
Swift
V
The 'J' in JSON is "Javascript". If a particular desired syntax construct isn't in Javascript, then it won't be on JSON.
Heredocs are beyond JSON's purview. That's a language syntax construct for simplified multi-line string definition, but JSON is a transport notation. It has nothing to do with construction. It does, however, have multiline strings, simply by allowing \n newline characters within strings. There's nothing in JSON that says you can't have a linebreak in a string. As long as the containing quote characters are correct, it's perfectly valid. e.g.
{"x":"y\nz"}
is 100% legitimate valid JSON, and is a multiline string, whereas
{"x":"y
z"}
isn't and will fail on parsing.
There's always what I like to call "real JSON". JSON stands for JavaScript Object Notation, and JavaScript does have comments and something close enough to heredocs.
For the heredoc, you would use JavaScript's E4X inline XML:
{
longString: <>
Hello, world!
This is a long string made possible with the magic of E4X.
Implementing a parser isn't so difficult.
</>.toString() // And a comment
/* And another
comment */
}
You can use Firefox's JavaScript engine (FF is the only browser to support E4X currently) or you can implement your own parser, which really isn't so difficult.
Here's the E4X quickstart guide, too.
Since March 2018 you can use JSON5 which seems to have added everything you (& many others) were missing from JSON.
Short Example (JSON5)
{
// comments
unquoted: 'and you can quote me on that',
singleQuotes: 'I can use "double quotes" here',
lineBreaks: "Look, Mom! \
No \\n's!",
hexadecimal: 0xdecaf,
leadingDecimalPoint: .8675309, andTrailing: 8675309.,
positiveSign: +1,
trailingComma: 'in objects', andIn: ['arrays',],
"backwardsCompatible": "with JSON",
}
The JSON5 Data Interchange Format (JSON5) is a superset of JSON that
aims to alleviate some of the limitations of JSON by expanding its
syntax to include some productions from ECMAScript 5.1.
Summary of Features
The following ECMAScript 5.1 features, which are not supported in
JSON, have been extended to JSON5.
Objects
Object keys may be an ECMAScript 5.1 IdentifierName.
Objects may have a single trailing comma.
Arrays
Arrays may have a single trailing comma.
Strings
Strings may be single quoted.
Strings may span multiple lines by escaping new line characters.
Strings may include character escapes.
Numbers
Numbers may be hexadecimal.
Numbers may have a leading or trailing decimal point.
Numbers may be IEEE 754 positive infinity, negative infinity, and NaN.
Numbers may begin with an explicit plus sign.
Comments
Single and multi-line comments are allowed.
White Space
Additional white space characters are allowed.
GitHub: https://github.com/json5/json5
One important attribute of JSON (probably the most important) is that you can easily "flip" between the string representation and the representation in object form, and the objects used to represent the object form are relatively simple arrays and maps. This is what makes JSON so useful in a networking context.
The functions you want would conflict with this dual nature of JSON.
For configuration you could use an embeddable scripting language, such as lua or python, in fact this is not an uncommon thing to do for configuration. That gives you multiline strings or here documents, and comments. It also makes it easier to have things like the boolean function you describe. However, the scripting languages are, of course, Turing complete.
There is also ELDF.
Although it does not support comments, they can be emulated via empty keys:
config_var1 = value1
=some comment
config_var2 = value2

JSON and escaping characters

I have a string which gets serialized to JSON in Javascript, and then deserialized to Java.
It looks like if the string contains a degree symbol, then I get a problem.
I could use some help in figuring out who to blame:
is it the Spidermonkey 1.8 implementation? (this has a JSON implementation built-in)
is it Google gson?
is it me for not doing something properly?
Here's what happens in JSDB:
js>s='15\u00f8C'
15°C
js>JSON.stringify(s)
"15°C"
I would have expected "15\u00f8C' which leads me to believe that Spidermonkey's JSON implementation isn't doing the right thing... except that the JSON homepage's syntax description (is that the spec?) says that a char can be
any-Unicode-character-
except-"-or-\-or-
control-character"
so maybe it passes the string along as-is without encoding it as \u00f8... in which case I would think the problem is with the gson library.
Can anyone help?
I suppose my workaround is to use either a different JSON library, or manually escape strings myself after calling JSON.stringify() -- but if this is a bug then I'd like to file a bug report.
This is not a bug in either implementation. There is no requirement to escape U+00B0. To quote the RFC:
2.5. Strings
The representation of strings is
similar to conventions used in the C
family of programming languages. A
string begins and ends with quotation
marks. All Unicode characters may be
placed within the quotation marks
except for the characters that must be
escaped: quotation mark, reverse
solidus, and the control characters
(U+0000 through U+001F).
Any character may be escaped.
Escaping everything inflates the size of the data (all code points can be represented in four or fewer bytes in all Unicode transformation formats; whereas encoding them all makes them six or twelve bytes).
It is more likely that you have a text transcoding bug somewhere in your code and escaping everything in the ASCII subset masks the problem. It is a requirement of the JSON spec that all data use a Unicode encoding.
hmm, well here's a workaround anyway:
function JSON_stringify(s, emit_unicode)
{
var json = JSON.stringify(s);
return emit_unicode ? json : json.replace(/[\u007f-\uffff]/g,
function(c) {
return '\\u'+('0000'+c.charCodeAt(0).toString(16)).slice(-4);
}
);
}
test case:
js>s='15\u00f8C 3\u0111';
15°C 3◄
js>JSON_stringify(s, true)
"15°C 3◄"
js>JSON_stringify(s, false)
"15\u00f8C 3\u0111"
This is SUPER late and probably not relevant anymore, but if anyone stumbles upon this answer, I believe I know the cause.
So the JSON encoded string is perfectly valid with the degree symbol in it, as the other answer mentions. The problem is most likely in the character encoding that you are reading/writing with. Depending on how you are using Gson, you are probably passing it a java.io.Reader instance. Any time you are creating a Reader from an InputStream, you need to specify the character encoding, or java.nio.charset.Charset instance (it's usually best to use java.nio.charset.StandardCharsets.UTF_8). If you don't specify a Charset, Java will use your platform default encoding, which on Windows is usually CP-1252.

Do the JSON keys have to be surrounded by quotes?

Example:
Is the following code valid against the JSON Spec?
{
precision: "zip"
}
Or should I always use the following syntax? (And if so, why?)
{
"precision": "zip"
}
I haven't really found something about this in the JSON specifications. Although they use quotes around their keys in their examples.
Yes, you need quotation marks. This is to make it simpler and to avoid having to have another escape method for javascript reserved keywords, ie {for:"foo"}.
You are correct to use strings as the key. Here is an excerpt from RFC 4627 - The application/json Media Type for JavaScript Object Notation (JSON)
2.2. Objects
An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a
string. A single colon comes after each name, separating the name
from the value. A single comma separates a value from a following
name. The names within an object SHOULD be unique.
object = begin-object [ member *( value-separator member ) ] end-object
member = string name-separator value
[...]
2.5. Strings
The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. [...]
string = quotation-mark *char quotation-mark
quotation-mark = %x22 ; "
Read the whole RFC here.
From 2.2. Objects
An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string.
and from 2.5. Strings
A string begins and ends with quotation marks.
So I would say that according to the standard: yes, you should always quote the key (although some parsers may be more forgiving)
Yes, quotes are mandatory. http://json.org/ says:
string
""
" chars "
Not if you use JSON5
For regular JSON, yes keys must be quoted. But if you need otherwise, checkout widely used JSON5, which is so-named because is a superset of JSON that allows ES5 syntax, including:
unquoted property keys
single-quoted, escaped and multi-line strings
alternate number formats
comments
extra whitespace
The JSON5 reference implementation (json5 npm package) provides a JSON5 object that has parse and stringify methods with the same args and semantics as the built-in JSON object.
widely used, and depended on by many high profile projects
JSON5 was started in 2012, and as of 2022, now gets >65M downloads/week, ranks in the top 0.1% of the most depended-upon packages on npm, and has been adopted by major projects like Chromium, Next.js, Babel, Retool, WebStorm, and more. It's also natively supported on Apple platforms like MacOS and iOS.
~ json5.org homepage
In your situation, both of them are valid, meaning that both of them will work.
However, you still should use the one with quotation marks in the key names because it is more conventional, which leads to more simplicity and ability to have key names with white spaces etc.
Therefore, use the one with the quotation marks.
edit// check this: What is the difference between JSON and Object Literal Notation?
Since you can put "parent.child" dotted notation and you don't have to put parent["child"] which is also valid and useful, I'd say both ways is technically acceptable. The parsers all should do both ways just fine. If your parser does not need quotes on keys then it's probably better not to put them (saves space). It makes sense to call them strings because that is what they are, and since the square brackets gives you the ability to use values for keys essentially it makes perfect sense not to.
In Json you can put...
>var keyName = "someKey";
>var obj = {[keyName]:"someValue"};
>obj
Object {someKey: "someValue"}
just fine without issues, if you need a value for a key and none quoted won't work, so if it doesn't, you can't, so you won't so "you don't need quotes on keys". Even if it's right to say they are technically strings. Logic and usage argue otherwise. Nor does it officially output Object {"someKey": "someValue"} for obj in our example run from the console of any browser.