Get json value using regex in linux [duplicate]

This question already has answers here:
Parsing JSON with Unix tools
(45 answers)
Closed 21 days ago.
I have a json file file.json like this:
{
"abc": "123",
"def": 456,
"ghi": 789
}
I am trying to get value of all the keys using regex in bash terminal.
Here is what I tried for getting value of abc:
var=cat file.json
regex='(abc\":) \"(.+)\",'
[[ $var =~ $regex ]]
echo ${BASE_REMATCH[1]}
It doesn't print anything. I am trying to get/print value of abc i.e. "123"
Please note that I can't use jq parser.

While it is generally advised to use a JSON parser (don't do badly what someone else has already done well), it is sometimes OK to parse JSON with a regex in throw-away scripts, or when the JSON files/strings are very simple (like the one in your input).
If that is your case, this may work for you (as long as the values are scalars):
var=$(cat file.json)
regex='"abc"[[:space:]]*:[[:space:]]*("([^"]+|\\")+"|[^[:space:],"]+)'
if [[ $var =~ $regex ]]; then
  echo "${BASH_REMATCH[1]}"
fi
For any other cases, use a json parser.
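Since the question asks for the values of all the keys, the regex above can be wrapped in a small helper and called once per key. This is only a sketch under the same assumptions as the answer (scalar values, trusted and simple input); json_scalar is a hypothetical name, and the key is assumed to contain no regex metacharacters.

```shell
#!/usr/bin/env bash
# json_scalar KEY JSON: print the raw scalar stored under KEY (hypothetical helper).
# String values are printed with their surrounding double quotes, as in the answer.
json_scalar() {
  local key=$1 json=$2
  # Same pattern as in the answer, with the key name substituted in.
  local regex='"'"$key"'"[[:space:]]*:[[:space:]]*("([^"]+|\\")+"|[^[:space:],"]+)'
  if [[ $json =~ $regex ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1   # key not found
  fi
}

# Sample document from the question.
json=$'{\n"abc": "123",\n"def": 456,\n"ghi": 789\n}'

for key in abc def ghi; do
  printf '%s -> %s\n' "$key" "$(json_scalar "$key" "$json")"
done
```

Note that an unquoted value is read up to the next whitespace, comma, or double quote, so compact input such as {"ghi":789} would need the pattern adjusted.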
Perl is installed on almost every Linux flavour, and the module JSON::PP should be part of the core modules.
So you could do something like this:
perl -MJSON::PP -E '$j=decode_json <>; say $j->{abc}' -0777 file.json

Reading your file.json from stdin with python 3.
example.py:
import json
import sys

# Parse the JSON document arriving on standard input.
data = json.load(sys.stdin)
# Print the value stored under the key given as the first argument.
print(data[sys.argv[1]])
Usage: python3 example.py 'abc' < file.json
Output:
123

Related

jq truncates ENV variable after whitespace

Trying to write a bash script that replaces values in a JSON file we are running into issues with Environment Variables that contain whitespaces.
Given an original JSON file.
{
"version": "base",
"myValue": "to be changed",
"channelId": 0
}
We want to run a command to update some variables in it, so that after we run:
CHANNEL_ID=1701 MY_VALUE="new value" ./test.sh
The JSON should look like this:
{
"version": "base",
"myValue": "new value",
"channelId": 1701
}
Our script is currently at something like this:
#!/bin/sh
echo $MY_VALUE
echo $CHANNEL_ID
function replaceValue {
if [ -z $2 ]; then echo "Skipping $1"; else jq --argjson newValue \"${2}\" '. | ."'${1}'" = $newValue' build/config.json > tmp.json && mv tmp.json build/config.json; fi
}
replaceValue channelId ${CHANNEL_ID}
replaceValue myValue ${MY_VALUE}
In the above all values are replaced by string and strings are getting truncated at whitespace. We keep alternating between this issue and a version of the code where substitutions just stop working entirely.
This is surely an issue with expansions but we would love to figure out, how we can:
- Replace values in the JSON with both strings and values.
- Use whitespaces in the strings we pass to our script.
You don't have to mess with --arg or --argjson to import the environment variables into jq's context; jq can read the environment on its own. You don't need a separate script either: just set the values along with the invocation of jq.
CHANNEL_ID=1701 MY_VALUE="new value" \
jq '{"version": "base", myValue: env.MY_VALUE, channelId: (env.CHANNEL_ID | tonumber)}' build/config.json
Note that in the case above, the variables need not be exported globally but just locally to the jq command. This allows you to not export multiple variables into the shell and pollute the environment, but just the ones needed for jq to construct the desired JSON.
To write the changes back to the original file, append > tmp.json && mv tmp.json build/config.json or, more cleanly, install the sponge(1) utility from the moreutils package. If it is present, you can pipe the output of jq into it:
| sponge build/config.json
Pass variables with --arg. Do:
jq --arg key "$1" --arg value "$2" '.[$key] = $value'
Notes:
#!/bin/sh indicates that this is a POSIX shell script, not bash. Use #!/bin/bash in bash scripts.
function replaceValue { is syntax inherited from ksh. Prefer replaceValue() { to declare functions. See "Bash obsolete and deprecated syntax".
Use newlines in your script to make it readable.
--argjson passes a JSON-formatted argument, not a string. Use --arg to pass strings.
\"${2}\" doesn't quote the $2 expansion; it only prefixes and suffixes the string with a literal ". Because the expansion is not quoted, word splitting is performed, which splits your input on whitespace when creating arguments for jq.
Remember to quote variable expansions.
Use http://shellcheck.net to check your scripts.
. | does nothing in jq; it's like echo $(echo $(echo)). You could write jq '. | . | . | . | . | .' any number of times and it would pass the same thing through. Just write the thing you want to do.
Do:
#!/bin/bash
echo "$MY_VALUE"
echo "$CHANNEL_ID"
replaceValue() {
if [ -z "$2" ]; then
echo "Skipping $1"
else
jq --arg key "$1" --arg value "$2" '.[$key] = $value' build/config.json > tmp.json &&
mv tmp.json build/config.json
fi
}
replaceValue channelId "${CHANNEL_ID}"
replaceValue myValue "${MY_VALUE}"
#edit Replaced ."\($key)" with easier .[$key]
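The word-splitting point from the notes can be seen without jq at all. The sketch below counts how many arguments a command receives for each way of writing the expansion; count_args is a hypothetical helper introduced only for this illustration, and the default IFS is assumed.

```shell
#!/usr/bin/env bash
# count_args: hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

MY_VALUE="new value"

unquoted=$(count_args $MY_VALUE)        # unquoted: split on whitespace
quoted=$(count_args "$MY_VALUE")        # quoted: stays a single argument
wrapped=$(count_args \"${MY_VALUE}\")   # escaped quotes are literal characters; still split

echo "unquoted: $unquoted, quoted: $quoted, wrapped: $wrapped"
# prints: unquoted: 2, quoted: 1, wrapped: 2
```

The third form is what the original script passed to jq, which is why the value arrived truncated at the first space.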
jq allows you to build new objects:
MY_VALUE=foo;
CHANNEL_ID=4
echo '{
"version": "base",
"myValue": "to be changed",
"channelId": 0
}' | jq ". | {\"version\": .version, \"myValue\": \"$MY_VALUE\", \"channelId\": $CHANNEL_ID}"
The . selects the whole input and pipes it (|) into the construction of a new object (marked by {}). For version it selects .version from the input, but you can set your own values for the other two. We use double quotes around the filter to allow Bash variable expansion, which means the double quotes inside the JSON have to be escaped.
You'll need to adapt my snippet above to scriptify it.

Loop through JSON array shell script

I am trying to write a shell script that loops through a JSON file and does some logic based on every object's properties. The script was initially written for Windows but it does not work properly on macOS.
The initial code is as follows
documentsJson=""
jsonStrings=$(cat "$file" | jq -c '.[]')
while IFS= read -r document; do
# Get the properties from the document (json string)
currentKey=$(echo "$document" | jq -r '.Key')
encrypted=$(echo "$document" | jq -r '.IsEncrypted')
# If not encrypted then don't do anything with it
if [[ $encrypted != true ]]; then
echoComment " Skipping '$currentKey' as it's not marked for encryption"
documentsJson+="$document,"
continue
fi
//some more code
done <<< $jsonStrings
When run on macOS, the whole file is processed at once, so the loop does not iterate over the individual objects.
The closest I got to making it work - after trying a lot of suggestions - is as follows:
jq -r '.[]' "$file" | while read i; do
for config in $i ; do
currentKey=$(echo "$config" | jq -r '.Key')
echo "$currentKey"
done
done
The console result is parse error: Invalid numeric literal at line 1, column 6
I just cannot find a proper way of grabbing the JSON object and reading its properties.
JSON file example
[
{
"Key": "PdfMargins",
"Value": {
"Left":0,
"Right":0,
"Top":20,
"Bottom":15
}
},
{
"Key": "configUrl",
"Value": "someUrl",
"IsEncrypted": true
}
]
Thank you in advance!
Try putting $jsonStrings in double quotes: done <<< "$jsonStrings"
Otherwise the standard shell word splitting is applied to the variable expansion, and you want to retain the line structure of the jq output (one object per line).
You could also use this in bash:
while IFS= read -r document; do
...
done < <(jq -c '.[]' < "$file")
That saves some resources, because the jq output is not stored in a variable first and no extra cat process is needed. I am not sure about making this work on macOS, though, so test it first.
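A minimal sketch of the quoted here-string loop, using a hard-coded two-line string as a hypothetical stand-in for the output of jq -c '.[]' so that it runs without jq. As far as I know, it is the older bash releases (such as the 3.2 shipped with macOS) that word-split an unquoted here-string and collapse it to one line; quoting behaves the same everywhere.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: jsonStrings=$(jq -c '.[]' "$file")
jsonStrings=$'{"Key":"PdfMargins"}\n{"Key":"configUrl","IsEncrypted":true}'

# count_lines: hypothetical helper that counts the lines it reads from stdin.
count_lines() {
  local n=0 line
  while IFS= read -r line; do
    n=$((n + 1))
  done
  echo "$n"
}

# Quoting the expansion keeps one JSON object per line.
quoted_count=$(count_lines <<< "$jsonStrings")
echo "lines seen with quoting: $quoted_count"
```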

How to get an element of a JSON object in bash?

I'm performing a curl command using a bash file and the return is a json object. How to get a element of this json object in this bash file?
Put request
https://sms.world-text.com/v2.0/sms/send?id=11111&key=Testkey&srcaddr=DA_Health&dstaddr=000000000000&method=PUT&txt=Message_Text_Text
Response
{"status":"1","error":"1000","desc":"Authorisation Failure"}
to="000000000000"
message="Test_message"
url="https://sms.world-text.com/v2.0/sms/send?id=11111&key=TestKey&srcaddr=SMSMsg&dstaddr=${to}&method=PUT&txt=${message}"
return=$(curl -sm 5 $url --data-urlencode "${message}" -A 'Test')
Finally, the "return" variable has the value below:
{"status":"1","error":"1000","desc":"Authorisation Failure"}
I expect to perform that validation
if [[ "$status" != 0 ]]; then
>&2 echo "$return"
fi
But how can I get the element "status" and his value "1" from $return in the bash file?
It's ideologically wrong to process JSON with JSON-agnostic tools (like awk, sed, etc.). JSON must be processed with JSON-aware tools.
E.g., if your curl response was a multi-line JSON (which is quite often the case), then most likely the sed-based solution would not work right for you.
One of the Unix utilities for working with JSON is jtc; with it, your solution would look like this:
status=$(jtc -w'[status]' -qq <<<"$return")
and then you can apply your check:
if [[ "$status" != 0 ]]; then
>&2 echo "$return"
fi
PS> Disclosure: I'm the creator of the jtc - shell cli tool for JSON operations
You could use sed to extract the status code from $return.
status=$(echo "$return" | sed -E 's/\{"status"\s?:\s?"([0-9]+)".*/\1/')
and then you can do the test:
if [[ "$status" != 0 ]]; then
>&2 echo "$return"
fi
You can use jq. Here is a comprehensive guide about the tool -> https://stedolan.github.io/jq/tutorial/.
e.g.
echo '{ "foo": 123, "bar": 456 }' | jq '.foo'
This would print 123.
It also works with nested objects.

Get JSON files from particular interval based on date field

I have a lot of JSON files whose structure looks like this:
{
key1: 'val1'
key2: {
'key21': 'someval1',
'key22': 'someval2',
'key23': 'someval3',
'date': '2018-07-31T01:30:30Z',
'key25': 'someval4'
}
key3: []
... some other objects
}
My goal is to get only those files where the date field falls within some period,
for example from 2018-05-20 to 2018-07-20.
I can't rely on the creation date of these files, because all of them were generated on the same day.
Maybe it is possible using sed or a similar program?
Fortunately, the date in this format can be compared as a string. You only need something to parse the JSONs, e.g. Perl:
perl -l -0777 -MJSON::PP -ne '
$date = decode_json($_)->{key2}{date};
print $ARGV if $date gt "2018-07-01T00:00:00Z";
' *.json
-0777 makes perl slurp the whole files instead of reading them line by line
-l adds a newline to print
$ARGV contains the name of the currently processed file
See JSON::PP for details. If you have JSON::XS or Cpanel::JSON::XS, you can switch to them for faster processing.
I had to fix the input (replace ' by ", add commas, etc.) in order to make the parser happy.
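The claim that dates in this format can be compared as strings can be checked in isolation. The sketch below only shows the comparison step, assuming the date has already been extracted from the JSON by a parser; date_in_range is a hypothetical helper name, and the bounds are the ones from the question.

```shell
#!/usr/bin/env bash
# date_in_range TIMESTAMP FROM TO (hypothetical helper): succeeds when the day
# part of an ISO 8601 timestamp lies between two YYYY-MM-DD bounds, inclusive.
date_in_range() {
  local day=${1:0:10} from=$2 to=$3
  # [[ a < b ]] compares strings; there is no <= operator, so negate instead.
  ! [[ $day < $from ]] && ! [[ $day > $to ]]
}

for d in 2018-05-19T23:59:59Z 2018-07-20T01:30:30Z 2018-07-31T01:30:30Z; do
  if date_in_range "$d" 2018-05-20 2018-07-20; then
    echo "$d: in range"
  else
    echo "$d: out of range"
  fi
done
```

This works because every field is zero-padded and ordered from most to least significant, so lexicographic order matches chronological order.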
If your files actually contain valid JSON, the task can be accomplished in a one-liner with jq, e.g.:
jq 'if .key2.date[0:10] | (. >= "2018-05-20" and . <= "2018-07-31") then input_filename else empty end' *.json
This is just an illustration. jq has date-handling functions for dealing with more complex requirements.
Handling quasi-JSON
If your files contain quasi-JSON, then you could use jq in conjunction with a JSON rectifier. If your sample is representative, then hjson could be used, e.g.
for f in *.qjson
do
hjson -j "$f" | jq --arg f "$f" '
if .key2.date[0:7] == "2018-07" then $f else empty end'
done
Try it like this:
Find an online converter (for example: https://codebeautify.org/json-to-excel-converter#) and convert the JSON to CSV.
Open the CSV file with Excel.
Filter your data.

Read JSON data in a shell script [duplicate]

This question already has answers here:
Parsing JSON with Unix tools
(45 answers)
Closed 6 years ago.
In shell I have a requirement wherein I have to read the JSON response which is in the following format:
{ "Messages": [ { "Body": "172.16.1.42|/home/480/1234/5-12-2013/1234.toSort", "ReceiptHandle": "uUk89DYFzt1VAHtMW2iz0VSiDcGHY+H6WtTgcTSgBiFbpFUg5lythf+wQdWluzCoBziie8BiS2GFQVoRjQQfOx3R5jUASxDz7SmoCI5bNPJkWqU8ola+OYBIYNuCP1fYweKl1BOFUF+o2g7xLSIEkrdvLDAhYvHzfPb4QNgOSuN1JGG1GcZehvW3Q/9jq3vjYVIFz3Ho7blCUuWYhGFrpsBn5HWoRYE5VF5Bxc/zO6dPT0n4wRAd3hUEqF3WWeTMlWyTJp1KoMyX7Z8IXH4hKURGjdBQ0PwlSDF2cBYkBUA=", "MD5OfBody": "53e90dc3fa8afa3452c671080569642e", "MessageId": "e93e9238-f9f8-4bf4-bf5b-9a0cae8a0ebc" } ] }
Here I am only concerned with the "Body" property value. I made some unsuccessful attempts like:
jsawk -a 'return this.Body'
or
awk -v k="Body" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}'
But that did not suffice. Can anyone help me with this?
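There is jq for parsing JSON on the command line. Body is nested inside the Messages array, so the filter has to go through it (-r prints the raw string without the surrounding quotes):
jq -r '.Messages[].Body'
See https://stedolan.github.io/jq/ for more on jq.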
tl;dr
$ cat /tmp/so.json | underscore select '.Messages .Body'
["172.16.1.42|/home/480/1234/5-12-2013/1234.toSort"]
Javascript CLI tools
You can use Javascript CLI tools like
underscore-cli:
json:select(): CSS-like selectors for JSON.
Example
Select all name children of an addons:
underscore select ".addons > .name"
The underscore-cli documentation provides other real-world examples as well as the json:select() doc.
Similarly, using a Bash regexp. This should be able to extract any key/value pair.
key="Body"
re="\"($key)\": \"([^\"]*)\""
while read -r l; do
if [[ $l =~ $re ]]; then
name="${BASH_REMATCH[1]}"
value="${BASH_REMATCH[2]}"
echo "$name=$value"
else
echo "No match"
fi
done
Regular expression can be tuned to match multiple spaces/tabs or newline(s). Wouldn't work if value has embedded ". This is an illustration. Better to use some "industrial" parser :)
Here is a crude way to do it: Transform JSON into bash variables to eval them.
This only works for:
JSON which does not contain nested arrays, and
JSON from trustworthy sources (else it may confuse your shell script, perhaps it may even be able to harm your system, You have been warned)
Well, yes, it uses Perl to do this job (thanks to CPAN), but it is small enough for inclusion directly into a script and hence is quick and easy to debug:
json2bash() {
perl -MJSON -0777 -n -E 'sub J {
my ($p,$v) = @_; my $r = ref $v;
if ($r eq "HASH") { J("${p}_$_", $v->{$_}) for keys %$v; }
elsif ($r eq "ARRAY") { $n = 0; J("$p"."[".$n++."]", $_) foreach @$v; }
else { $v =~ '"s/'/'\\\\''/g"'; $p =~ s/^([^[]*)\[([0-9]*)\](.+)$/$1$3\[$2\]/;
$p =~ tr/-/_/; $p =~ tr/A-Za-z0-9_[]//cd; say "$p='\''$v'\'';"; }
}; J("json", decode_json($_));'
}
use it like eval "$(json2bash <<<'{"a":["b","c"]}')"
Not heavily tested, though. Updates, warnings and more examples see my GIST.
Update
(Unfortunately, following is a link-only-solution, as the C code is far
too long to duplicate here.)
For all those, who do not like the above solution,
there now is a C program json2sh
which (hopefully safely) converts JSON into shell variables.
In contrast to the perl snippet, it is able to process any JSON,
as long as it is well formed.
Caveats:
json2sh was not tested much.
json2sh may create variables, which start with the shellshock pattern () {
I wrote json2sh to be able to post-process .bson with Shell:
bson2json()
{
printf '[';
{ bsondump "$1"; echo "\"END$?\""; } | sed '/^{/s/$/,/';
echo ']';
};
bsons2json()
{
printf '{';
c='';
for a;
do
printf '%s"%q":' "$c" "$a";
c=',';
bson2json "$a";
done;
echo '}';
};
bsons2json */*.bson | json2sh | ..
Explained:
bson2json dumps a .bson file such, that the records become a JSON array
If everything works OK, an END0-Marker is applied, else you will see something like END1.
The END-Marker is needed, else empty .bson files would not show up.
bsons2json dumps a bunch of .bson files as an object, where the output of bson2json is indexed by the filename.
This then is postprocessed by json2sh, such that you can use grep/source/eval/etc. what you need, to bring the values into the shell.
This way you can quickly process the contents of a MongoDB dump on shell level, without need to import it into MongoDB first.