JSON slashes and backslashes in a string on Bourne shell

I am trying to parse json files that contain sequences of slashes and backslashes in some of their strings like this:
echo '{"tag_string":"/\/\/\ test"}' | jq
which gives me:
parse error: Invalid escape at line 1, column 27
I have tried escaping with backslashes at different positions, but I can't seem to find a correct way. How do I output the string as it is, without removing any character or getting errors?
This only works on bash, but not sh (or zsh):
echo '{"tag_string":"/\\/\\/\\ test"}' | jq -r '.tag_string'
/\/\/\ test

A forward slash character is legal, but a single backslash character is not. According to json.org char description, the valid chars are:
char
any Unicode character except " or \ or a control character
\"
\\
\/
\b
\f
\n
\r
\t
\u four-hex-digits
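So in your example, the backslash followed by a space is not a legal escape, and each \/ is legal but decodes to a plain /, which drops the backslash. To keep literal backslashes, write each one as "\\", which JSON decodes to a single backslash; otherwise remove the backslashes entirely.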
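One way to check a payload against these rules is to let jq parse it and look at the exit status. The following is a minimal sketch, assuming jq is on the PATH; the function name is_valid_json is made up for illustration and is not part of jq.

```shell
# Return 0 if the argument parses as JSON, non-zero otherwise.
# printf '%s' passes the text through untouched, so jq sees every backslash as typed.
is_valid_json() {
    printf '%s' "$1" | jq empty >/dev/null 2>&1
}

# "\/" is a legal escape, but "\ " (backslash, space) is not, so this is rejected.
is_valid_json '{"tag_string":"/\/\/\ test"}' || echo "rejected"

# Every backslash doubled: each "\\" is a legal escape, so this is accepted.
is_valid_json '{"tag_string":"/\\/\\/\\ test"}' && echo "accepted"
```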

If you are trying to include literal backslashes:
(bash)
echo '{"tag_string":"/\\/\\/\\ test"}' | jq
{
"tag_string": "/\\/\\/\\ test"
}
echo '{"tag_string":"/\\/\\/\\ test"}' | jq -r '.["tag_string"]'
/\/\/\ test
(sh)
echo '{"tag_string":"/\\\\/\\\\/\\\\ test"}' | jq -r '.["tag_string"]'
/\/\/\ test
printf "%s" '{"tag_string":"/\\/\\/\\ test"}' | jq -r '.["tag_string"]'
/\/\/\ test
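The difference between the shells comes from echo: some implementations, such as the builtin echo in dash or zsh, interpret backslash escapes in their arguments, whereas printf '%s' never does. A minimal sketch of a portable wrapper; the helper name emit_raw is hypothetical.

```shell
# Print the argument byte for byte, followed by a newline.
# printf scans only its format string for escapes, never the %s argument.
emit_raw() {
    printf '%s\n' "$1"
}

# Inside single quotes the shell keeps both characters of every "\\" pair.
json='{"tag_string":"/\\/\\/\\ test"}'
emit_raw "$json"
```

Piping the result into jq -r '.tag_string' should then behave the same in sh, bash, and zsh, as in the printf line above.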

If you are trying to convert a file with non-JSON strings, then consider a tool such as any-json. Using the "cson-to-json" mode, "\/" will be interpreted as "/":
$ any-json -format=cson
Input:
{"tag_string":"/\/\/\ test"}
Output:
{
"tag_string": "/// test"
}

Related

ANSI color codes with jq

Trying to make jq work with ANSI color codes.
Test cases:
$ echo '{"a":"b","c":"d"}' | jq -r .c
d # Matches my expected output
$ echo '{"a":"b","c":"\033[31md\033[0m"}' | jq -r .c
parse error: Invalid escape at line 1, column 31 # returns err code 4
$ echo '{"a":"b","c":"d"}' | jq -r '"foo"+.c+"bar"'
foodbar # Correct
$ echo '{"a":"b","c":"d"}' | jq -r '"\033[31m"+.c+"\033[0m"'
jq: error: Invalid escape at line 1, column 4 (while parsing '"\0"') at <top-level>, line 1:
"\033[31m"+.c+"\033[0m"
jq: error: Invalid escape at line 1, column 4 (while parsing '"\0"') at <top-level>, line 1:
"\033[31m"+.c+"\033[0m"
jq: 2 compile errors # returns err code 3
$ jq -rn '"\033[31mbar\033[0m"'
jq: error: Invalid escape at line 1, column 4 (while parsing '"\0"') at <top-level>, line 1:
"\033[31mbar\033[0m"
jq: error: Invalid escape at line 1, column 4 (while parsing '"\0"') at <top-level>, line 1:
"\033[31mbar\033[0m"
jq: 2 compile errors # returns err code 4
P.S. in case it matters, I am using the bash shell with version 5.1.16(1)-release on Linux.
Conclusion: ANSI colors do not work with jq, whether placed in the JSON string or concatenated directly through the + operator.
Question: how to make ANSI colors work in jq? Any help would be appreciated.
Octal escape sequences are not valid JSON syntax, so you need to encode the ASCII escape character as \u001b rather than \033. To add to the confusion, some versions of echo interpret backslash escape sequences themselves before the text reaches jq, so in cases like this it's much safer to use printf '%s\n':
$ printf '%s\n' '{"a":"b","c":"\u001b[31md\u001b[0m"}' | jq -r .c
d
(You can't see it, but that "d" is red in my terminal.)
BTW, an easy way to find things like this out is to get jq to encode them in JSON for you. Here, I'll set the shell variable to the actual string (using bash's $'...' string format, which interprets ANSI-C escape sequences like \033), then use --arg to pass that to jq:
$ seq=$'\033[31md\033[0m'
$ jq -nc --arg seq "$seq" '{"a":"b","c":$seq}'
{"a":"b","c":"\u001b[31md\u001b[0m"}
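The same \u001b escape also works inside a jq string literal, which covers the concatenation test cases from the question. A short sketch, assuming jq is installed; paint_red is a hypothetical helper name.

```shell
# Wrap the value of .c in red. \u001b is the JSON spelling of the ESC character (octal 033).
paint_red() {
    jq -r '"\u001b[31m" + .c + "\u001b[0m"'
}

printf '%s\n' '{"a":"b","c":"d"}' | paint_red
```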

How to extract elements from a string value in json, using jq [duplicate]

I'm trying to get jq to parse a JSON structure like:
{
"a" : 1,
"b" : 2,
"c" : "{\"id\":\"9ee ...\",\"parent\":\"abc...\"}\n"
}
That is, an element in the JSON is a string with escaped json.
So, I have something along the lines of
$ jq [.c] myFile.json | jq [.id]
But that crashes with jq: error: Cannot index string with string
This is because the output of .c is a string, not more JSON.
How do I get jq to parse this string?
My initial solution is to use sed to replace all the escape chars (\":\", \",\" and \") but that's messy, I assume there's a way built into jq to do this?
Thanks!
edit:
Also, the jq version available here is:
$ jq --version
jq version 1.3
I guess I could update it if required.
jq has the fromjson builtin for this:
jq '.c | fromjson | .id' myFile.json
fromjson was added in version 1.4.
You can use the raw output (-r) that will unescape characters:
jq -r .c myFile.json | jq .id
ADDENDUM: This has the advantage that it works in jq 1.3 and up; indeed, it should work in every version of jq that has the -r option.
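Both approaches can be tried side by side on the sample document from the question. This is a sketch assuming jq 1.4 or later (for fromjson); the function names are illustrative, and the heredoc merely stands in for myFile.json.

```shell
# Print the sample document; the quoted heredoc delimiter keeps backslashes literal.
sample_doc() {
    cat <<'EOF'
{
  "a": 1,
  "b": 2,
  "c": "{\"id\":\"9ee ...\",\"parent\":\"abc...\"}\n"
}
EOF
}

# Approach 1: decode the embedded document inside a single jq program.
id_via_fromjson() {
    sample_doc | jq -r '.c | fromjson | .id'
}

# Approach 2: print the string raw, then parse the result with a second jq.
id_via_raw() {
    sample_doc | jq -r '.c' | jq -r '.id'
}

id_via_fromjson
id_via_raw
```

Both calls should print the same id value.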
Motivation: you want to parse a JSON string, that is, to unescape a JSON object that is wrapped in quotes and stored as a string, and convert it back into a valid JSON object. For example:
an escaped JSON string:
"{\"name\":\"John Doe\",\"position\":\"developer\"}"
the expected result (a JSON object):
{"name":"John Doe","position":"developer"}
Solution: to unescape the JSON string and convert it into a valid JSON object, use sed on the command line with regular expressions that remove or replace specific characters:
cat current_json.txt | sed -e 's/\\\"/\"/g' -e 's/^.//g' -e 's/.$//g'
s/\\\"/\"/g replaces every backslash-quote pair ( \" ) with a plain quote (")
s/^.//g removes the first character of each line
s/.$//g removes the last character of each line
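Note that the sed pipeline only rewrites \" pairs, so any other escape inside the string (such as \\ or \n) passes through undecoded. The sketch below contrasts it with jq's fromjson on a made-up input that contains an escaped backslash; the sample value and the function names are assumptions for illustration.

```shell
# A JSON string whose embedded document holds the path C:\temp.
wrapped='"{\"path\":\"C:\\\\temp\"}"'

# The sed approach: turn \" into ", then drop the first and last character.
unwrap_with_sed() {
    printf '%s\n' "$wrapped" | sed -e 's/\\"/"/g' -e 's/^.//' -e 's/.$//'
}

# The jq approach: decode the outer string, then parse its content as JSON.
unwrap_with_jq() {
    printf '%s\n' "$wrapped" | jq -c 'fromjson'
}

unwrap_with_sed | jq -r '.path'   # one backslash too many
unwrap_with_jq  | jq -r '.path'   # the intended path
```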

how to escape single quote in `jq`

I am trying to format a json string using jq with expected output like this:
[
{
"command": [
"printf 'this is a text'"
]
}
]
However, I cannot get it to work for the single quotes ('), e.g. $ jq -n '[{"command": ["printf 'this is a text'"]}]' gives me a compile error.
I also thought about escaping all double quotes, e.g. jq -n "[{\"command\": [\"printf 'this is a text'\"]}]". This works, but the JSON string is passed in from a function, so I would have to replace all double quotes with \" before running the jq command, which is not very elegant.
Is there a better way to handle the single quotes inside a json string?
Here are four alternatives that should work with a bash or bash-like shell. They can be adapted for other shells as well.
jq -n $'[{"command": ["printf \'this is a text\'"]}]'
cat << EOF | jq .
[{"command": ["printf 'this is a text'"]}]
EOF
jq --arg cmd "printf 'this is a text'" -n '[{command: [ $cmd ]}]'
VAR="[{\"command\": [\"printf 'this is a text'\"]}]"
jq -n --argjson var "$VAR" '$var'
See also How to escape single quotes within single quoted strings
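A further option that works in any POSIX shell is the '\'' idiom: close the single-quoted string, add an escaped quote, and reopen the string. A short sketch, assuming jq is installed; the function name build_command_json is hypothetical.

```shell
# '\'' ends the quoted string, emits a literal single quote, and starts a new quoted string,
# so jq receives the program text [{"command": ["printf 'this is a text'"]}].
build_command_json() {
    jq -n '[{"command": ["printf '\''this is a text'\''"]}]'
}

build_command_json
```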

jq double backslash sometime removed

I have a first json file like this:
{
"env_vars": {
"TERRAFORM_CFG_TLS_CERT": "-----BEGIN CERTIFICATE----\\nMIIIqzCCB5O"
}
}
If I use the command:
cat <file> | jq -r '.env_vars'
The result is as expected (the backslashes are still there):
{
"TERRAFORM_CFG_TLS_CERT": "-----BEGIN CERTIFICATE----\\nMIIIqzCCB5O"
}
But if I execute this command:
cat <file> | jq -r '.env_vars' | jq -r 'keys[] as $k | "\($k)=\"\(.[$k])\""'
The result is:
TERRAFORM_CFG_TLS_CERT="-----BEGIN CERTIFICATE----\nMIIIqzCCB5O"
=> One backslash has been removed. Why?
How can I avoid this?
Thanks.
Using the -r option tells jq to "translate" the JSON string into a "raw" string by decoding the escape sequences that are special to JSON (see e.g. http://json.org). Thus, reducing the problem to a minimal reproducible example, we could start with:
$ jq . <<< '"X\\nY"'
"X\\nY"
$ jq -r . <<< '"X\\nY"'
X\nY
If you check the json.org specification of strings, you'll see this is exactly correct.
So if for some reason you want each occurrence of \\ in the JSON string to be replaced by two backslash characters (i.e. JSON: "\\\\"), you could use sub or gsub. That's a bit tricky, because the first argument of these functions is a regex. Behold:
$ jq -r 'gsub("\\\\"; "\\\\")' <<< '"X\\nY"'
X\\nY
You should output the string as json to preserve the escapes. By taking a string and outputting it raw, you're getting exactly what that string was, a literal backslash followed by an n.
$ ... | jq -r '.env_vars | to_entries[] | "\(.key): \(.value | tojson)"'
If any of the values are non-strings, add a tostring to the filter.
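Applied to the sample from the question, the tojson approach looks roughly like this. It is a sketch that inlines the document instead of reading <file> and uses = as the separator, as in the question's filter; the names doc and env_lines are illustrative.

```shell
# The sample document from the question; single quotes keep both backslashes intact.
doc='{"env_vars":{"TERRAFORM_CFG_TLS_CERT":"-----BEGIN CERTIFICATE----\\nMIIIqzCCB5O"}}'

# tojson re-encodes each value as JSON text, which restores the \\ escape and the quotes.
env_lines() {
    printf '%s\n' "$doc" | jq -r '.env_vars | to_entries[] | "\(.key)=\(.value | tojson)"'
}

env_lines
```

Because tojson emits the surrounding double quotes itself, the filter no longer needs the \" pairs around the value.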
