Emitting JSON with yaml-cpp? - json

I'm using yaml-cpp on a for a variety of things on my project. Now I want to write out some data as JSON. Since JSON is a subset of YAML, at least for the features I need, I understand it should be possible to set some options in yaml-cpp to output pure JSON. How is that done?

yaml-cpp doesn't directly have a way to force JSON-compatible output, but you can probably emulate it.
YAML:Emitter Emitter;
emitter << YAML:: DoubleQuoted << YAML::Flow << /* rest of code */;

Jesse Beder's answer didn't seem to work for me; I still got multiple lines of output with YAML syntax. However, I found that by adding << YAML::BeginSeq immediately after << YAML::Flow, you can force everything to end up on one line with JSON syntax. You then have to remove the beginning [ character:
YAML::Emitter emitter;
emitter << YAML::DoubleQuoted << YAML::Flow << YAML::BeginSeq << node;
std::string json(emitter.c_str() + 1); // Remove beginning [ character
Here is a fully worked example.
There's still a major issue, though: numbers are quoted, turning them into strings. I'm not sure whether this is an intentional behavior of YAML::DoubleQuoted; looking at the tests, I didn't see any test case that covers what happens when you apply DoubleQuoted to a number. This issue has been filed here.

Related

how can i make my code indented like this for better readability and ease [duplicate]

Is it possible to have multi-line strings in JSON?
It's mostly for visual comfort so I suppose I can just turn word wrap on in my editor, but I'm just kinda curious.
I'm writing some data files in JSON format and would like to have some really long string values split over multiple lines. Using python's JSON module I get a whole lot of errors, whether I use \ or \n as an escape.
JSON does not allow real line-breaks. You need to replace all the line breaks with \n.
eg:
"first line
second line"
can be saved with:
"first line\nsecond line"
Note:
for Python, this should be written as:
"first line\\nsecond line"
where \\ is for escaping the backslash, otherwise python will treat \n as
the control character "new line"
Unfortunately many of the answers here address the question of how to put a newline character in the string data. The question is how to make the code look nicer by splitting the string value across multiple lines of code. (And even the answers that recognize this provide "solutions" that assume one is free to change the data representation, which in many cases one is not.)
And the worse news is, there is no good answer.
In many programming languages, even if they don't explicitly support splitting strings across lines, you can still use string concatenation to get the desired effect; and as long as the compiler isn't awful this is fine.
But json is not a programming language; it's just a data representation. You can't tell it to concatenate strings. Nor does its (fairly small) grammar include any facility for representing a string on multiple lines.
Short of devising a pre-processor of some kind (and I, for one, don't feel like effectively making up my own language to solve this issue), there isn't a general solution to this problem. IF you can change the data format, then you can substitute an array of strings. Otherwise, this is one of the numerous ways that json isn't designed for human-readability.
I have had to do this for a small Node.js project and found this work-around to store multiline strings as array of lines to make it more human-readable (at a cost of extra code to convert them to string later):
{
"modify_head": [
"<script type='text/javascript'>",
"<!--",
" function drawSomeText(id) {",
" var pjs = Processing.getInstanceById(id);",
" var text = document.getElementById('inputtext').value;",
" pjs.drawText(text);}",
"-->",
"</script>"
],
"modify_body": [
"<input type='text' id='inputtext'></input>",
"<button onclick=drawSomeText('ExampleCanvas')></button>"
],
}
Once parsed, I just use myData.modify_head.join('\n') or myData.modify_head.join(), depending upon whether I want a line break after each string or not.
This looks quite neat to me, apart from that I have to use double quotes everywhere. Though otherwise, I could, perhaps, use YAML, but that has other pitfalls and is not supported natively.
Check out the specification! The JSON grammar's char production can take the following values:
any-Unicode-character-except-"-or-\-or-control-character
\"
\\
\/
\b
\f
\n
\r
\t
\u four-hex-digits
Newlines are "control characters" so, no, you may not have a literal newline within your string. However you may encode it using whatever combination of \n and \r you require.
JSON doesn't allow breaking lines for readability.
Your best bet is to use an IDE that will line-wrap for you.
This is a really old question, but I came across this on a search and I think I know the source of your problem.
JSON does not allow "real" newlines in its data; it can only have escaped newlines. See the answer from #YOU. According to the question, it looks like you attempted to escape line breaks in Python two ways: by using the line continuation character ("\") or by using "\n" as an escape.
But keep in mind: if you are using a string in python, special escaped characters ("\t", "\n") are translated into REAL control characters! The "\n" will be replaced with the ASCII control character representing a newline character, which is precisely the character that is illegal in JSON. (As for the line continuation character, it simply takes the newline out.)
So what you need to do is to prevent Python from escaping characters. You can do this by using a raw string (put r in front of the string, as in r"abc\ndef", or by including an extra slash in front of the newline ("abc\\ndef").
Both of the above will, instead of replacing "\n" with the real newline ASCII control character, will leave "\n" as two literal characters, which then JSON can interpret as a newline escape.
Write property value as a array of strings. Like example given over here https://gun.io/blog/multi-line-strings-in-json/. This will help.
We can always use array of strings for multiline strings like following.
{
"singleLine": "Some singleline String",
"multiline": ["Line one", "line Two", "Line Three"]
}
And we can easily iterate array to display content in multi line fashion.
While not standard, I found that some of the JSON libraries have options to support multiline Strings. I am saying this with the caveat, that this will hurt your interoperability.
However in the specific scenario I ran into, I needed to make a config file that was only ever used by one system readable and manageable by humans. And opted for this solution in the end.
Here is how this works out on Java with Jackson:
JsonMapper mapper = JsonMapper.builder()
.enable(JsonReadFeature.ALLOW_UNESCAPED_CONTROL_CHARS)
.build()
This is a very old question, but I had the same question when I wanted to improve readability of our Vega JSON Specification code which uses complex conditoinal expressions. The code is like this.
As this answer says, JSON is not designed for human. I understand that is a historical decision and it makes sense for data exchange purposes. However, JSON is still used as source code for such cases. So I asked our engineers to use Hjson for source code and process it to JSON.
For example, in Git for Windows environment,
you can download the Hjson cli binary and put it in git/bin directory to use.
Then, convert (transpile) Hjson source to JSON. To use automation tools such as Make will be useful to generate JSON.
$ which hjson
/c/Program Files/git/bin/hjson
$ cat example.hjson
{
md:
'''
First line.
Second line.
This line is indented by two spaces.
'''
}
$ hjson -j example.hjson > example.json
$ cat example.json
{
"md": "First line.\nSecond line.\n This line is indented by two spaces."
}
In case of using the transformed JSON in programming languages, language-specific libraries like hjson-js will be useful.
I noticed the same idea was posted in a duplicated question but I would share a bit more information.
You can encode at client side and decode at server side. This will take care of \n and \t characters as well
e.g. I needed to send multiline xml through json
{
"xml": "PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiID8+CiAgPFN0cnVjdHVyZXM+CiAgICAgICA8aW5wdXRzPgogICAgICAgICAgICAgICAjIFRoaXMgcHJvZ3JhbSBhZGRzIHR3byBudW1iZXJzCgogICAgICAgICAgICAgICBudW0xID0gMS41CiAgICAgICAgICAgICAgIG51bTIgPSA2LjMKCiAgICAgICAgICAgICAgICMgQWRkIHR3byBudW1iZXJzCiAgICAgICAgICAgICAgIHN1bSA9IG51bTEgKyBudW0yCgogICAgICAgICAgICAgICAjIERpc3BsYXkgdGhlIHN1bQogICAgICAgICAgICAgICBwcmludCgnVGhlIHN1bSBvZiB7MH0gYW5kIHsxfSBpcyB7Mn0nLmZvcm1hdChudW0xLCBudW0yLCBzdW0pKQogICAgICAgPC9pbnB1dHM+CiAgPC9TdHJ1Y3R1cmVzPg=="
}
then decode it on server side
public class XMLInput
{
public string xml { get; set; }
public string DecodeBase64()
{
var valueBytes = System.Convert.FromBase64String(this.xml);
return Encoding.UTF8.GetString(valueBytes);
}
}
public async Task<string> PublishXMLAsync([FromBody] XMLInput xmlInput)
{
string data = xmlInput.DecodeBase64();
}
once decoded you'll get your original xml
<?xml version="1.0" encoding="utf-8" ?>
<Structures>
<inputs>
# This program adds two numbers
num1 = 1.5
num2 = 6.3
# Add two numbers
sum = num1 + num2
# Display the sum
print('The sum of {0} and {1} is {2}'.format(num1, num2, sum))
</inputs>
</Structures>
\n\r\n worked for me !!
\n for single line break and \n\r\n for double line break
I see many answers here that may not works in most cases but may be the easiest solution if let's say you wanna output what you wrote down inside a JSON file (for example: for language translations where you wanna have just one key with more than 1 line outputted on the client) can be just adding some special characters of your choice PS: allowed by the JSON files like \\ before the new line and use some JS to parse the text ... like:
Example:
File (text.json)
{"text": "some JSON text. \\ Next line of JSON text"}
import text from 'text.json'
{text.split('\\')
.map(line => {
return (
<div>
{line}
<br />
</div>
);
})}}
Assuming the question has to do with easily editing text files and then manually converting them to json, there are two solutions I found:
hjson (that was mentioned in this previous answer), in which case you can convert your existing json file to hjson format by executing hjson source.json > target.hjson, edit in your favorite editor, and convert back to json hjson -j target.hjson > source.json. You can download the binary here or use the online conversion here.
jsonnet, which does the same, but with a slightly different format (single and double quoted strings are simply allowed to span multiple lines). Conveniently, the homepage has editable input fields so you can simply insert your multiple line json/jsonnet files there and they will be converted online to standard json immediately. Note that jsonnet supports much more goodies for templating json files, so it may be useful to look into, depending on your needs.
If it's just for presentation in your editor you may use ` instead of " or '
const obj = {
myMultiLineString: `This is written in a \
multiline way. \
The backside of it is that you \
can't use indentation on every new \
line because is would be included in \
your string. \
The backslash after each line escapes the carriage return.
`
}
Examples:
console.log(`First line \
Second line`);
will put in console:
First line Second line
console.log(`First line
second line`);
will put in console:
First line
second line
Hope this answered your question.

Entry delimiter of JSON files for Hive table

We are collecting JSON data (public social media posts in particular) via REST API invocations, which we plan to dump into HDFS, then abstract a Hive table on top it using SerDe. I wonder though what would be the appropriate delimiter per JSON entry in a file? Is it new line ("\n")? So it would look like this:
{ id: entry1 ... post: }
{ id: entry2 ... post: }
...
{ id: entryn ... post: }
How about if we encounter a new line character within the JSON data itself, for example in post?
The best way would be one record per line, separated by "\n" exactly as you guessed.
This also means that you should be careful to escape "\n" that may be inside the JSON elements.
Indented JSON won't work well with hadoop/hive, since to distribute processing, hadoop must be able to tell when a records ends, so it can split processing of a file with N bytes with W workers in W chunks of size roughly N/W.
The splitting is done by the particular InputFormat that's been used, in case of text, TextInputFormat.
TextInputFormat will basically split the file at the first instance of "\n" found after byte i*N/W (for i from 1 to W-1).
For this reason, having other "\n" around would confuse Hadoop and it will give you incomplete records.
As an alternative, I wouldn't recommend it, but if you really wanted you could use a character other than "\n" by configuring the property "textinputformat.record.delimiter" when reading the file through hadoop/hive, using a character that won't be in JSON (for instance, \001 or CTRL-A is commonly used by Hive as a field delimiter) but that can be tricky since it has to also be supported by the SerDe.
Also, if you change the record delimiter, anybody who copies/uses the file on HDFS must be aware of the delimiter, or they won't be able to parse it correctly, and will need special code to do it, while keeping "\n" as a delimiter, the files will still be normal text files and can be used by other tools.
As for the SerDe, I'd recommend this one, with the disclaimer that I wrote it :)
https://github.com/rcongiu/Hive-JSON-Serde

Uncrustify command for CUDA kernel

I would like to apply uncrustify (via beautify in the Atom editor and a config file) to CUDA code. However, I don't know how to tell uncrustify to recognize CUDA kernel calls which have the following structure:
kernelName <<<N,M>>> (arg0,arg1,...);
However, uncrustify has problems with the <<< >>> and applying it gives the following unpleasant result
kernelName << < N, M >> >
(arg0,arg1,...);
I would like to have it look more like a function call and also avoid the formatting of <<< to << <. Ideally, the result would look like
kernelName <<< N, M >>> (arg0,arg1,
...); // line break if argument list is too long
Which arguments can I add to my config.cfg to achieve the above result?
Thank you very much.
Looking through whole documentation of uncrustify, I have found 2 arguments that could influence in your CUDA kernel style:
sp_compare { Ignore, Add, Remove, Force }
Add or remove space around compare operator '<', '>', '==', etc
And:
align_left_shift { False, True }
Align lines that start with '<<' with previous '<<'. Default=true
You can try to play around with these parameters to be more closely to the solution although I would try something like:
sp_compare = Remove
align_left_shift = False

Fixing a bad JSON grammar

I've just started learning about parsing, and I wrote this simple parser in Haskell (using parsec) to read JSON and construct a simple tree for it. I am using the grammar in RFC 4627.
However, when I try parsing the string {"x":1 }, I'm getting the output:
parse error at (line 1, column 8):
unexpected "}"
expecting whitespace character or ","
This only seems to be happening when I have spaces before a closing brace (]) or mustachio (}).
What have I done wrong? If I avoid whitespace before a closing symbol, it works perfectly.
Parsec doesn't do rewinding and backtracking automatically. When you write sepBy member valueSeparator, the valueSeparator consumes white space, so the parser will parse your value like so:
{"x":1 }
[------- object
% beginObject
[-] name
% nameSeparator
% jvalue
[- valueSeparator
X In valueSeparator: unexpected "}"
Legend:
[--] full match
% full char match
[-- incomplete match
X incomplete char match
When the valueSeparator fails, Parsec won't go back and try a different combination of parses, because one character has already matched in valueSeparator.
You have two options to solve your problem:
Since white space is insignificant in JSON, always consume white space after a significant token, never before. So, a tok should only consume white space after the char, so its definition is tok c = char c *> ws ((*>) from Control.Applicative); apply the same rule to all the other parsers. Since you'll never consume white space after having entered the "wrong parser" that way, you won't end up having to back-track.
Use back-tracking in Parsec by adding try in front of parsers that might consume more than one character, and that should rewind their input if they fail.
EDIT: updated ASCII graphic to make more sense.
A general solution would be to have all your parsers skip trailing whitespace. Check out lexeme (in ParsecToken) in the Parsec docs for a neat way to do this or just whip up a simple version yourself:
lexeme parser = do result <- parser
spaces
return result
Then use this function on all of your tokens (like numerical literals). This way you only ever have to worry about the whitespace at the very beginning of an expression.
For more info about ParsecToken and friends, look at the "Lexical Analysis" section of the Parsec docs.
It makes sense to only skip whitespace after a token except at the very beginning where you can skip it manually. You should take this approach even if you end up not using the ParsecToken module.
It seems you already have tok which acts like my lexeme except it consumes whitespace on both sides. Change it to only consume whitespace after the token and just ignore the whitespace at the very beginning of the input manually. That should (ideally :)) fix the problem.

Are multi-line strings allowed in JSON?

Is it possible to have multi-line strings in JSON?
It's mostly for visual comfort so I suppose I can just turn word wrap on in my editor, but I'm just kinda curious.
I'm writing some data files in JSON format and would like to have some really long string values split over multiple lines. Using python's JSON module I get a whole lot of errors, whether I use \ or \n as an escape.
JSON does not allow real line-breaks. You need to replace all the line breaks with \n.
eg:
"first line
second line"
can be saved with:
"first line\nsecond line"
Note:
for Python, this should be written as:
"first line\\nsecond line"
where \\ is for escaping the backslash, otherwise python will treat \n as
the control character "new line"
Unfortunately many of the answers here address the question of how to put a newline character in the string data. The question is how to make the code look nicer by splitting the string value across multiple lines of code. (And even the answers that recognize this provide "solutions" that assume one is free to change the data representation, which in many cases one is not.)
And the worse news is, there is no good answer.
In many programming languages, even if they don't explicitly support splitting strings across lines, you can still use string concatenation to get the desired effect; and as long as the compiler isn't awful this is fine.
But json is not a programming language; it's just a data representation. You can't tell it to concatenate strings. Nor does its (fairly small) grammar include any facility for representing a string on multiple lines.
Short of devising a pre-processor of some kind (and I, for one, don't feel like effectively making up my own language to solve this issue), there isn't a general solution to this problem. IF you can change the data format, then you can substitute an array of strings. Otherwise, this is one of the numerous ways that json isn't designed for human-readability.
I have had to do this for a small Node.js project and found this work-around to store multiline strings as array of lines to make it more human-readable (at a cost of extra code to convert them to string later):
{
"modify_head": [
"<script type='text/javascript'>",
"<!--",
" function drawSomeText(id) {",
" var pjs = Processing.getInstanceById(id);",
" var text = document.getElementById('inputtext').value;",
" pjs.drawText(text);}",
"-->",
"</script>"
],
"modify_body": [
"<input type='text' id='inputtext'></input>",
"<button onclick=drawSomeText('ExampleCanvas')></button>"
],
}
Once parsed, I just use myData.modify_head.join('\n') or myData.modify_head.join(), depending upon whether I want a line break after each string or not.
This looks quite neat to me, apart from that I have to use double quotes everywhere. Though otherwise, I could, perhaps, use YAML, but that has other pitfalls and is not supported natively.
Check out the specification! The JSON grammar's char production can take the following values:
any-Unicode-character-except-"-or-\-or-control-character
\"
\\
\/
\b
\f
\n
\r
\t
\u four-hex-digits
Newlines are "control characters" so, no, you may not have a literal newline within your string. However you may encode it using whatever combination of \n and \r you require.
JSON doesn't allow breaking lines for readability.
Your best bet is to use an IDE that will line-wrap for you.
This is a really old question, but I came across this on a search and I think I know the source of your problem.
JSON does not allow "real" newlines in its data; it can only have escaped newlines. See the answer from #YOU. According to the question, it looks like you attempted to escape line breaks in Python two ways: by using the line continuation character ("\") or by using "\n" as an escape.
But keep in mind: if you are using a string in python, special escaped characters ("\t", "\n") are translated into REAL control characters! The "\n" will be replaced with the ASCII control character representing a newline character, which is precisely the character that is illegal in JSON. (As for the line continuation character, it simply takes the newline out.)
So what you need to do is to prevent Python from escaping characters. You can do this by using a raw string (put r in front of the string, as in r"abc\ndef", or by including an extra slash in front of the newline ("abc\\ndef").
Both of the above will, instead of replacing "\n" with the real newline ASCII control character, will leave "\n" as two literal characters, which then JSON can interpret as a newline escape.
Write property value as a array of strings. Like example given over here https://gun.io/blog/multi-line-strings-in-json/. This will help.
We can always use array of strings for multiline strings like following.
{
"singleLine": "Some singleline String",
"multiline": ["Line one", "line Two", "Line Three"]
}
And we can easily iterate array to display content in multi line fashion.
While not standard, I found that some of the JSON libraries have options to support multiline Strings. I am saying this with the caveat, that this will hurt your interoperability.
However in the specific scenario I ran into, I needed to make a config file that was only ever used by one system readable and manageable by humans. And opted for this solution in the end.
Here is how this works out on Java with Jackson:
JsonMapper mapper = JsonMapper.builder()
.enable(JsonReadFeature.ALLOW_UNESCAPED_CONTROL_CHARS)
.build()
This is a very old question, but I had the same question when I wanted to improve readability of our Vega JSON Specification code which uses complex conditoinal expressions. The code is like this.
As this answer says, JSON is not designed for human. I understand that is a historical decision and it makes sense for data exchange purposes. However, JSON is still used as source code for such cases. So I asked our engineers to use Hjson for source code and process it to JSON.
For example, in Git for Windows environment,
you can download the Hjson cli binary and put it in git/bin directory to use.
Then, convert (transpile) Hjson source to JSON. To use automation tools such as Make will be useful to generate JSON.
$ which hjson
/c/Program Files/git/bin/hjson
$ cat example.hjson
{
md:
'''
First line.
Second line.
This line is indented by two spaces.
'''
}
$ hjson -j example.hjson > example.json
$ cat example.json
{
"md": "First line.\nSecond line.\n This line is indented by two spaces."
}
In case of using the transformed JSON in programming languages, language-specific libraries like hjson-js will be useful.
I noticed the same idea was posted in a duplicated question but I would share a bit more information.
You can encode at client side and decode at server side. This will take care of \n and \t characters as well
e.g. I needed to send multiline xml through json
{
"xml": "PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiID8+CiAgPFN0cnVjdHVyZXM+CiAgICAgICA8aW5wdXRzPgogICAgICAgICAgICAgICAjIFRoaXMgcHJvZ3JhbSBhZGRzIHR3byBudW1iZXJzCgogICAgICAgICAgICAgICBudW0xID0gMS41CiAgICAgICAgICAgICAgIG51bTIgPSA2LjMKCiAgICAgICAgICAgICAgICMgQWRkIHR3byBudW1iZXJzCiAgICAgICAgICAgICAgIHN1bSA9IG51bTEgKyBudW0yCgogICAgICAgICAgICAgICAjIERpc3BsYXkgdGhlIHN1bQogICAgICAgICAgICAgICBwcmludCgnVGhlIHN1bSBvZiB7MH0gYW5kIHsxfSBpcyB7Mn0nLmZvcm1hdChudW0xLCBudW0yLCBzdW0pKQogICAgICAgPC9pbnB1dHM+CiAgPC9TdHJ1Y3R1cmVzPg=="
}
then decode it on server side
public class XMLInput
{
public string xml { get; set; }
public string DecodeBase64()
{
var valueBytes = System.Convert.FromBase64String(this.xml);
return Encoding.UTF8.GetString(valueBytes);
}
}
public async Task<string> PublishXMLAsync([FromBody] XMLInput xmlInput)
{
string data = xmlInput.DecodeBase64();
}
once decoded you'll get your original xml
<?xml version="1.0" encoding="utf-8" ?>
<Structures>
<inputs>
# This program adds two numbers
num1 = 1.5
num2 = 6.3
# Add two numbers
sum = num1 + num2
# Display the sum
print('The sum of {0} and {1} is {2}'.format(num1, num2, sum))
</inputs>
</Structures>
\n\r\n worked for me !!
\n for single line break and \n\r\n for double line break
I see many answers here that may not works in most cases but may be the easiest solution if let's say you wanna output what you wrote down inside a JSON file (for example: for language translations where you wanna have just one key with more than 1 line outputted on the client) can be just adding some special characters of your choice PS: allowed by the JSON files like \\ before the new line and use some JS to parse the text ... like:
Example:
File (text.json)
{"text": "some JSON text. \\ Next line of JSON text"}
import text from 'text.json'
{text.split('\\')
.map(line => {
return (
<div>
{line}
<br />
</div>
);
})}}
Assuming the question has to do with easily editing text files and then manually converting them to json, there are two solutions I found:
hjson (that was mentioned in this previous answer), in which case you can convert your existing json file to hjson format by executing hjson source.json > target.hjson, edit in your favorite editor, and convert back to json hjson -j target.hjson > source.json. You can download the binary here or use the online conversion here.
jsonnet, which does the same, but with a slightly different format (single and double quoted strings are simply allowed to span multiple lines). Conveniently, the homepage has editable input fields so you can simply insert your multiple line json/jsonnet files there and they will be converted online to standard json immediately. Note that jsonnet supports much more goodies for templating json files, so it may be useful to look into, depending on your needs.
The reason OP asked is the same reason I ended up here. Had a json file with long text.
In VS Code it's just ALT+Z to turn on word wrapping in a json file. Changing the actual data isn't what you want, if all you really want is to read the contents of the file as a developer.
If it's just for presentation in your editor you may use ` instead of " or '
const obj = {
myMultiLineString: `This is written in a \
multiline way. \
The backside of it is that you \
can't use indentation on every new \
line because is would be included in \
your string. \
The backslash after each line escapes the carriage return.
`
}
Examples:
console.log(`First line \
Second line`);
will put in console:
First line Second line
console.log(`First line
second line`);
will put in console:
First line
second line
Hope this answered your question.