This question already has answers here:
Parsing JSON with Unix tools
(45 answers)
Closed 6 years ago.
In shell I have a requirement wherein I have to read the JSON response which is in the following format:
{ "Messages": [ { "Body": "172.16.1.42|/home/480/1234/5-12-2013/1234.toSort", "ReceiptHandle": "uUk89DYFzt1VAHtMW2iz0VSiDcGHY+H6WtTgcTSgBiFbpFUg5lythf+wQdWluzCoBziie8BiS2GFQVoRjQQfOx3R5jUASxDz7SmoCI5bNPJkWqU8ola+OYBIYNuCP1fYweKl1BOFUF+o2g7xLSIEkrdvLDAhYvHzfPb4QNgOSuN1JGG1GcZehvW3Q/9jq3vjYVIFz3Ho7blCUuWYhGFrpsBn5HWoRYE5VF5Bxc/zO6dPT0n4wRAd3hUEqF3WWeTMlWyTJp1KoMyX7Z8IXH4hKURGjdBQ0PwlSDF2cBYkBUA=", "MD5OfBody": "53e90dc3fa8afa3452c671080569642e", "MessageId": "e93e9238-f9f8-4bf4-bf5b-9a0cae8a0ebc" } ] }
Here I am only concerned with the "Body" property value. I made some unsuccessful attempts like:
jsawk -a 'return this.Body'
or
awk -v k="Body" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}
But that did not suffice. Can anyone help me with this?
There is jq for parsing json on the command line:
jq '.Body'
Visit this for jq: https://stedolan.github.io/jq/
tl;dr
$ cat /tmp/so.json | underscore select '.Messages .Body'
["172.16.1.42|/home/480/1234/5-12-2013/1234.toSort"]
Javascript CLI tools
You can use Javascript CLI tools like
underscore-cli:
json:select(): CSS-like selectors for JSON.
Example
Select all name children of a addons:
underscore select ".addons > .name"
The underscore-cli provide others real world examples as well as the json:select() doc.
Similarly using Bash regexp. Shall be able to snatch any key/value pair.
key="Body"
re="\"($key)\": \"([^\"]*)\""
while read -r l; do
if [[ $l =~ $re ]]; then
name="${BASH_REMATCH[1]}"
value="${BASH_REMATCH[2]}"
echo "$name=$value"
else
echo "No match"
fi
done
Regular expression can be tuned to match multiple spaces/tabs or newline(s). Wouldn't work if value has embedded ". This is an illustration. Better to use some "industrial" parser :)
Here is a crude way to do it: Transform JSON into bash variables to eval them.
This only works for:
JSON which does not contain nested arrays, and
JSON from trustworthy sources (else it may confuse your shell script, perhaps it may even be able to harm your system, You have been warned)
Well, yes, it uses PERL to do this job, thanks to CPAN, but is small enough for inclusion directly into a script and hence is quick and easy to debug:
json2bash() {
perl -MJSON -0777 -n -E 'sub J {
my ($p,$v) = #_; my $r = ref $v;
if ($r eq "HASH") { J("${p}_$_", $v->{$_}) for keys %$v; }
elsif ($r eq "ARRAY") { $n = 0; J("$p"."[".$n++."]", $_) foreach #$v; }
else { $v =~ '"s/'/'\\\\''/g"'; $p =~ s/^([^[]*)\[([0-9]*)\](.+)$/$1$3\[$2\]/;
$p =~ tr/-/_/; $p =~ tr/A-Za-z0-9_[]//cd; say "$p='\''$v'\'';"; }
}; J("json", decode_json($_));'
}
use it like eval "$(json2bash <<<'{"a":["b","c"]}')"
Not heavily tested, though. Updates, warnings and more examples see my GIST.
Update
(Unfortunately, following is a link-only-solution, as the C code is far
too long to duplicate here.)
For all those, who do not like the above solution,
there now is a C program json2sh
which (hopefully safely) converts JSON into shell variables.
In contrast to the perl snippet, it is able to process any JSON,
as long as it is well formed.
Caveats:
json2sh was not tested much.
json2sh may create variables, which start with the shellshock pattern () {
I wrote json2sh to be able to post-process .bson with Shell:
bson2json()
{
printf '[';
{ bsondump "$1"; echo "\"END$?\""; } | sed '/^{/s/$/,/';
echo ']';
};
bsons2json()
{
printf '{';
c='';
for a;
do
printf '%s"%q":' "$c" "$a";
c=',';
bson2json "$a";
done;
echo '}';
};
bsons2json */*.bson | json2sh | ..
Explained:
bson2json dumps a .bson file such, that the records become a JSON array
If everything works OK, an END0-Marker is applied, else you will see something like END1.
The END-Marker is needed, else empty .bson files would not show up.
bsons2json dumps a bunch of .bson files as an object, where the output of bson2json is indexed by the filename.
This then is postprocessed by json2sh, such that you can use grep/source/eval/etc. what you need, to bring the values into the shell.
This way you can quickly process the contents of a MongoDB dump on shell level, without need to import it into MongoDB first.
Related
This question already has answers here:
Parsing JSON with Unix tools
(45 answers)
Closed 21 days ago.
I have a json file file.json like this:
{
"abc": "123",
"def": 456,
"ghi": 789
}
I am trying to get value of all the keys using regex in bash terminal.
Here is what I tried for getting value of abc:
var=cat file.json
regex='(abc\":) \"(.+)\",'
[[ $var =~ $regex ]]
echo ${BASE_REMATCH[1]}
It doesn't print anything. I am trying to get/print value of abc i.e. "123"
Please note that I can't use jq parser.
While it is generally advised to use a Json parser (don't badly do work that has already be well done by someone else) It is sometimes just OK to parse Json via regex for use-and-throw-away scripts, or for parsing very simple json files/strings (like the one you used in your input)
If that is your case, this may work for you (as long as the values are scalars)
var=$(cat file.json)
regex='"abc"[[:space:]]*:[[:space:]]*("([^"]+|\\")+"|[^[:space:],"]+)'
[[ $var =~ $regex ]]
if [[ $var =~ $regex ]]; then
echo ${BASH_REMATCH[1]}
fi
For any other cases, use a json parser.
Perl is installed on almost every Linux flavour, and the module JSON::PP should be part of the core modules.
So you could do something like this:
perl -MJSON::PP -E '$j=decode_json <>; say $j->{abc}' -0777 file.json
Reading your file.json from stdin with python 3.
example.py:
import sys
import json
f = open(sys.stdin.fileno(), 'r')
data = json.load(f)
print(data[sys.argv[1]])
Usage: python3 example.py 'abc' < file.json
Output:
123
This question already has answers here:
How to replace values in a JSON dictionary with their respective shell variables in jq?
(2 answers)
Closed 6 months ago.
EDIT: I've updated the question as the answers so far are correct but unfortunately I oversimplified the problem to the point that it will likely need a new question entirely. The issue I'm running into is that I need to replace only a part of value, and that having more than one variable causes sed to gobble multiple variables at once.
here's what I've been trying to do:
# test.json
{
"name": "Mr. ${FIRSTNAME} ${LASTNAME}"
}
I'd like to load this file, replace the variable, and store it in a variable:
NAME=Frodo JSON=$(eval "echo \"$(cat test.json)\"")
echo $JSON
I'd like it to print { "name": "Mr. Frodo Baggins" }, however it seems like eval is stripping out the double quotes, producing invalid JSON: { name: Frodo }. Any idea on how I can do this?
I should also mention that this approach seemed the cleanest, as the variables act as a key-value map. The other approach I tried was with an associative array and sed loop, which seems to work:
KEYS=( "FIRSTNAME" "LASTNAME" )
VALUES=( "Frodo" "Baggins" )
JSON=$(cat test.json)
for index in "${!KEYS[#]}"; do
JSON=$(echo "$JSON" | sed -E "s/\\$\{${KEYS[0]}}/${VALUES[0]}/g")
done
echo $JSON
NOTE: I'm aware that eval is a security risk, however this script is for testing purposes only
edit for altered OP
I'd write the script dynamically, rather than eval during runtime.
$: keys=( "FIRSTNAME" "LASTNAME" )
$: values=( "Frodo" "Baggins" )
$: sed "$(for i in "${!keys[#]}"; do echo "s/\${${keys[i]}}/${values[i]}/;"; done)" test.json
{
"name": "Mr. Frodo Baggins"
}
end edit
First, use jq if you can.
That being said...
If you are VERY CAREFUL with your quoting and ORDER of operations, eval isn't needed.
$: JSON="$(NAME=Frodo; sed s/\${NAME}/$NAME/ json)" && echo "--> $JSON <--"
--> {
"name": "Frodo"
} <--
(When testing, remember to unset JSON between runs.) ;)
This does NOT work without the internal semicolon!
$: JSON="$(NAME=Frodo sed s/\${NAME}/$NAME/ json)" && echo "--> $JSON <--"
--> {
"name": ""
} <--
Remember, assigning NAME=Frodo happens on the same parse of the command line, so you can't use $NAME in that pass. It has to either be a subsequent command or a subshell.
This works, because the parse runs in the subshell where the variable has been set, rather than is being set.
$: echo 'sed s/\${NAME}/$NAME/ json' >script
$: JSON="$(NAME=Frodo ./script)" && echo "--> $JSON <--"
--> {
"name": "Frodo"
} <--
Lets say this is my json file content:
[ { "id":"45" }, { "id":"56" }, { "id":"13" }, { "id":"5" } ]
and I want to find out if id "13" is in the json file.
Is there an way to do this in bash?
I tried to run the command with jq and all sorts of different variations of it (with contain and without for example) and nothing answers this query for me.
Note: when the question was closed, I added this answer to the question in an effort to get the question reopened:
(( $(jq < file.json '[.[].id | select(. == "13")] | length') > 0))
The OP said it was inefficient. I do not know why. Here is what it does:
It passes the JSON through the jq program, which is a JSON parser. The bash shell has no native understanding of JSON, so any solution is going to make use of an external program. Other programs will treat JSON as text, and may work in some or most cases, but it is best to use a program like jq that follows the formal JSON specification to parse the data.
It creates an array to capture the output of...
It loops through the array, picking out all the id fields
It outputs the value of the id field if the value is "13"
It counts the length of the array, which is the number of the id fields whose value is "13"
Using native bash, it converts that output into a number and evaluates to true if the number is greater than 0 and false otherwise.
I do not think you will find something significantly more efficient that formally follows the JSON spec.
This only runs 1 program, jq, which is the de facto standard JSON processor. It is not part of the POSIX standard (which predates JSON) but it is the most likely JSON processor to be installed on a system.
This uses native bash constructs to interpret the output and to the test.
There is not going to be a solution that is more efficient because it runs zero programs (bash cannot do it alone) and there is not going to be a better program to use than jq.
There is not going to be a significantly better jq filter, because it is going to process the entire input (that is just how it works) and the select filter stops the processing of objects that fail the test, which is all or almost all of them.
The alternative "peak" suggests is more compact and more elegant (good things) but not significantly more (or less) efficient. It looks better in the post because a lot is left out. The full test would be
[[ $(jq < file.json 'any(.[]; .id == "13")') == "true" ]]
Actually, the .[]; generator is unnecessary, so the even more compact answer would be
[[ $(jq < file.json 'any(.id == "13")') == "true" ]]
Here is one simple way to to determine if a given "id" value is present using perl:
echo '[ { "id":"45" }, { "id":"56" }, { "id":"13" }, { "id":"5" } ]' | perl -00 -lne 'if (/"id":"13"/) {print "true"} else {print "false"}'
true
echo '[ { "id":"45" }, { "id":"56" }, { "id":"13" }, { "id":"5" } ]' | perl -00 -lne 'if (/"id":"33"/) {print "true"} else {print "false"}'
false
Here is one possibility:
any(.[]; .id == "13")
I am trying to write a shell script that loops through a JSON file and does some logic based on every object's properties. The script was initially written for Windows but it does not work properly on a MacOS.
The initial code is as follows
documentsJson=""
jsonStrings=$(cat "$file" | jq -c '.[]')
while IFS= read -r document; do
# Get the properties from the docment (json string)
currentKey=$(echo "$document" | jq -r '.Key')
encrypted=$(echo "$document" | jq -r '.IsEncrypted')
# If not encrypted then don't do anything with it
if [[ $encrypted != true ]]; then
echoComment " Skipping '$currentKey' as it's not marked for encryption"
documentsJson+="$document,"
continue
fi
//some more code
done <<< $jsonStrings
When ran on a MacOs, the whole file is processed at once, so it does not loop through objects.
The closest I got to making it work - after trying a lot of suggestions - is as follows:
jq -r '.[]' "$file" | while read i; do
for config in $i ; do
currentKey=$(echo "$config" | jq -r '.Key')
echo "$currentKey"
done
done
The console result is parse error: Invalid numeric literal at line 1, column 6
I just cannot find a proper way of grabbing the JSON object and reading its properties.
JSON file example
[
{
"Key": "PdfMargins",
"Value": {
"Left":0,
"Right":0,
"Top":20,
"Bottom":15
}
},
{
"Key": "configUrl",
"Value": "someUrl",
"IsEncrypted": true
}
]
Thank you in advance!
Try putting the $jsonStrings in doublequotes: done <<< "$jsonStrings"
Otherwise the standard shell splitting applies on the variable expansion and you probably want to retain the line structure of the output of jq.
You could also use this in bash:
while IFS= read -r document; do
...
done < <(jq -c '.[]' < "$file")
That would save some resources. I am not sure about making this work on MacOS, though, so test this first.
I'm working on parsing JSON data using JSON.sh. And I wanted to read data from json file (test.json) whose content will be something like,
{
"/home/ukrishnan/projects/test.yml": {
"LOG_DRIVER": "syslog",
"IMAGE": "mysql:5.6"
},
"/home/ukrishnan/projects/mysql/app.xml": {
"ENV_ACCOUNT_BRIDGE_ENDPOINT": "/u01/src/test/sample.txt"
}
}
And I try to parse this JSON using JSON.sh by using,
test_parser=`sh ./lib/JSON.sh < test/test.json`
echo $test_parser
It prints,
["/home/ukrishnan/projects/test.yml","LOG_DRIVER"] "syslog" ["/home/ukrishnan/projects/test.yml","IMAGE"] "mysql:5.6" ["/home/ukrishnan/projects/test.yml"] {"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"} ["/home/ukrishnan/projects/mysql/app.xml","ENV_ACCOUNT_BRIDGE_ENDPOINT"] "/u01/src/test/sample.txt" ["/home/ukrishnan/projects/mysql/app.xml"] {"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"} [] {"/home/ukrishnan/projects/test.yml":{"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"},"/home/ukrishnan/projects/mysql/app.xml":{"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}}
Whereas, the same command (sh ./lib/JSON.sh < test/test.json), if I run through terminal, it is printing with line breaks,
["/home/ukrishnan/projects/test.yml","LOG_DRIVER"] "syslog"
["/home/ukrishnan/projects/test.yml","IMAGE"] "mysql:5.6"
["/home/ukrishnan/projects/test.yml"] {"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"}
["/home/ukrishnan/projects/mysql/app.xml","ENV_ACCOUNT_BRIDGE_ENDPOINT"] "/u01/src/test/sample.txt"
["/home/ukrishnan/projects/mysql/app.xml"] {"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}
[] {"/home/ukrishnan/projects/test.yml":{"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"},"/home/ukrishnan/projects/mysql/app.xml":{"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}}
I wanted to read this and assign to bash variables like,
file_name='/home/ukrishnan/projects/test.yml'
key='LOG_DRIVER'
value='syslog'
As I'm almost completely new to shell script and grep or awk, I don't have much idea of how to achieve this. Any help on this would be greatly appreciated.
I wrote a JSON serializer / deserializer for gawk, if you're interested. Save that script and modify it, replacing everything above # === FUNCTIONS === with the following:
#!/usr/bin/gawk -f
# capture JSON string from beginning to end into a scalar variable
{ json = json ORS $0 }
END {
# objectify JSON string to the multilevel array "obj"
deserialize(json, obj)
for (filename in obj) {
print "file_name=" quote(filename)
for (key in obj[filename]) {
# print key="value"
print key "=" quote(obj[filename][key])
}
}
}
Do chmod 755 json.awk and execute it. Output will resemble this:
$ ./json.awk test5.json
file_name="/home/ukrishnan/projects/mysql/app.xml"
ENV_ACCOUNT_BRIDGE_ENDPOINT="/u01/src/test/sample.txt"
file_name="/home/ukrishnan/projects/test.yml"
LOG_DRIVER="syslog"
IMAGE="mysql:5.6"
Hopefully the logic is reasonably easy to follow. If you prefer to output filename=, key=, and value= on every loop iteration, modify the nested for loops accordingly:
for (filename in obj) {
for (key in obj[filename]) {
print "file_name=" quote(filename)
print "key=" quote(key)
print "value=" quote(obj[filename][key])
}
}
That change will result in the following output:
$ ./json.awk test5.json
file_name="/home/ukrishnan/projects/mysql/app.xml"
key="ENV_ACCOUNT_BRIDGE_ENDPOINT"
value="/u01/src/test/sample.txt"
file_name="/home/ukrishnan/projects/test.yml"
key="LOG_DRIVER"
value="syslog"
file_name="/home/ukrishnan/projects/test.yml"
key="IMAGE"
value="mysql:5.6"
Anyway, with that output, you can do something silly in BASH like this to populate and act upon the variables:
#!/bin/bash
./test.awk test5.json | while read -r line; do {
eval $line
[ "${line/=*/}" = "value" ] && {
echo "bash: file_name=$file_name"
echo "bash: key=$key"
echo "bash: value=$value"
echo "------"
}
}; done
It'd probably be more graceful just to do all processing within gawk from start to finish and not mess with the polyglot handoff, though.
Getting back to json.awk, if you prefer to keep json.awk modular for easy reuse in future projects, you could remove everything above # === FUNCTIONS ===, create a separate main.awk containing the code block at the top of this answer, and #include "json.awk" as a helper library pretty much anywhere outside of END {...} (just below the shbang, for example).
JSON.sh (from http://json.org) offers a nice bash friendly means of flattening out a JSON file. Which you've already provided how it looks in your question. So, the flatten form is the format:
[node] tab value
You have to think in UNIX script in extracting the information you want, you'll note the lines you're interested in actually follow this pattern:
["filename","key"] tab ["value"]
In regex notation, we replace:
filename with (.*)
key with (.*)
tab with \t
value with (.*)
We can retrieve the first, second and third matching groups with \1, \2, \3 respectively.
When used in sed we also note that these symbols []() need to be escaped with a backslash \, resulting in the following script:
./lib/JSON.sh < test/test.json | sed 's/\["\(.*\)","\(.*\)\"]\t"\(.*\)"/\1,\2,\3/;t;d'
/home/ukrishnan/projects/test.yml,LOG_DRIVER,syslog
/home/ukrishnan/projects/test.yml,IMAGE,mysql:5.6
/home/ukrishnan/projects/mysql/app.xml,ENV_ACCOUNT_BRIDGE_ENDPOINT,/u01/src/test/sample.txt
Now we put the lines in a loop and for each line, we can extract out filename,key,value:
for line in $(./lib/JSON.sh < test/test.json | sed 's/\["\(.*\)","\(.*\)\"]\t"\(.*\)"/\1,\2,\3/;t;d')
do
IFS="," read -ra arr <<< $line
filename=${arr[0]}
key=${arr[1]}
value=${arr[2]}
cat <<EOF
filename : $filename
key : $key
value : $value
EOF
done
Which outputs:
filename : /home/ukrishnan/projects/test.yml
key : LOG_DRIVER
value : syslog
filename : /home/ukrishnan/projects/test.yml
key : IMAGE
value : mysql:5.6
filename : /home/ukrishnan/projects/mysql/app.xml
key : ENV_ACCOUNT_BRIDGE_ENDPOINT
value : /u01/src/test/sample.txt