How to parse multi properties with spacing in jq json bash script? - json

I have a json string in the following format:
json='{"x": [{"y":"yyy1", "z":"zzz1 zzz1"}, {"y":"yyy2", "z":"zzz2 zzz2"}]}'
I'm trying to get an array var of all "y" and separate array of "z" with jq like
y_arr=$(echo "${json}" | jq '.x | [] | .y') # => outputs 2 array elements
y_arr=$(echo "${json}" | jq '.x | [] | .z') # => outputs 4 array elements due to space
Do you have any idea how to fix this?
Also, do you know any better way of parsing the json as single key/value array instead of two separate arrays?

Assuming your bash has readarray (aka mapfile) and that having the bash array values be valid JSON is acceptable, you could write:
readarray z_array < <(echo "${json}" | jq -c '.x[] | .z')
With your $json, this would yield an array of JSON strings (with quotation marks).
shell loop
If your shell does not support readarray,
then you could use the same invocation of jq but read the values in a bash loop, e.g.:
declare -a array
while read v
do
array+=("$v")
done < <(echo "${json}" | jq -c '.x[] | .z')
Strings
If all the .z values are JSON strings, then provided these strings do not have embedded newline characters, and that you want the bash array values to be "raw" strings (as opposed to JSON strings), you could add the -r option to the invocation of jq.
Associative Arrays
If your bash supports associative arrays, consider:
declare -A a
i=0
while read v
do
i=$((i+1))
a["$v"]=$i
done < <(echo "${json}" | jq -c '.x[].z')
declare -p a
for i in "${!a[#]}" ; do
echo $i
done
This produces:
declare -A a=(["\"zzz2 zzz2\""]="2" ["\"zzz1 zzz1\""]="1" )
"zzz2 zzz2"
"zzz1 zzz1"

Related

What is returned through a jq filter on a json object in bash? [duplicate]

I parsed a json file with jq like this :
# cat test.json | jq '.logs' | jq '.[]' | jq '._id' | jq -s
It returns an array like this : [34,235,436,546,.....]
Using bash script i described an array :
# declare -a msgIds = ...
This array uses () instead of [] so when I pass the array given above to this array it won't work.
([324,32,45..]) this causes problem. If i remove the jq -s, an array forms with only 1 member in it.
Is there a way to solve this issue?
We can solve this problem by two ways. They are:
Input string:
// test.json
{
"keys": ["key1","key2","key3"]
}
Approach 1:
1) Use jq -r (output raw strings, not JSON texts) .
KEYS=$(jq -r '.keys' test.json)
echo $KEYS
# Output: [ "key1", "key2", "key3" ]
2) Use #sh (Converts input string to a series of space-separated strings). It removes square brackets[], comma(,) from the string.
KEYS=$(<test.json jq -r '.keys | #sh')
echo $KEYS
# Output: 'key1' 'key2' 'key3'
3) Using tr to remove single quotes from the string output. To delete specific characters use the -d option in tr.
KEYS=$((<test.json jq -r '.keys | #sh')| tr -d \')
echo $KEYS
# Output: key1 key2 key3
4) We can convert the comma-separated string to the array by placing our string output in a round bracket().
It also called compound Assignment, where we declare the array with a bunch of values.
ARRAYNAME=(value1 value2 .... valueN)
#!/bin/bash
KEYS=($((<test.json jq -r '.keys | #sh') | tr -d \'\"))
echo "Array size: " ${#KEYS[#]}
echo "Array elements: "${KEYS[#]}
# Output:
# Array size: 3
# Array elements: key1 key2 key3
Approach 2:
1) Use jq -r to get the string output, then use tr to delete characters like square brackets, double quotes and comma.
#!/bin/bash
KEYS=$(jq -r '.keys' test.json | tr -d '[],"')
echo $KEYS
# Output: key1 key2 key3
2) Then we can convert the comma-separated string to the array by placing our string output in a round bracket().
#!/bin/bash
KEYS=($(jq -r '.keys' test.json | tr -d '[]," '))
echo "Array size: " ${#KEYS[#]}
echo "Array elements: "${KEYS[#]}
# Output:
# Array size: 3
# Array elements: key1 key2 key3
To correctly parse values that have spaces, newlines (or any other arbitrary characters) just use jq's #sh filter and bash's declare -a. (No need for a while read loop or any other pre-processing)
// foo.json
{"data": ["A B", "C'D", ""]}
str=$(jq -r '.data | #sh' foo.json)
declare -a arr="($str)" # must be quoted like this
$ declare -p arr
declare -a arr=([0]="A B" [1]="C'D" [2]="")
The reason that this works correctly is that #sh will produce a space-separated list of shell-quoted words:
$ echo "$str"
'A B' 'C'\''D' ''
and this is exactly the format that declare expects for an array definition.
Use jq -r to output a string "raw", without JSON formatting, and use the #sh formatter to format your results as a string for shell consumption. Per the jq docs:
#sh:
The input is escaped suitable for use in a command-line for a POSIX shell. If the input is an array, the output will be a series of space-separated strings.
So can do e.g.
msgids=($(<test.json jq -r '.logs[]._id | #sh'))
and get the result you want.
From the jq FAQ (https://github.com/stedolan/jq/wiki/FAQ):
𝑸: How can a stream of JSON texts produced by jq be converted into a bash array of corresponding values?
A: One option would be to use mapfile (aka readarray), for example:
mapfile -t array <<< $(jq -c '.[]' input.json)
An alternative that might be indicative of what to do in other shells is to use read -r within a while loop. The following bash script populates an array, x, with JSON texts. The key points are the use of the -c option, and the use of the bash idiom while read -r value; do ... done < <(jq .......):
#!/bin/bash
x=()
while read -r value
do
x+=("$value")
done < <(jq -c '.[]' input.json)
++ To resolve this, we can use a very simple approach:
++ Since I am not aware of you input file, I am creating a file input.json with the following contents:
input.json:
{
"keys": ["key1","key2","key3"]
}
++ Use jq to get the value from the above file input.json:
Command: cat input.json | jq -r '.keys | #sh'
Output: 'key1' 'key2' 'key3'
Explanation: | #sh removes [ and "
++ To remove ' ' as well we use tr
command: cat input.json | jq -r '.keys | #sh' | tr -d \'
Explanation: use tr delete -d to remove '
++ To store this in a bash array we use () with `` and print it:
command:
KEYS=(`cat input.json | jq -r '.keys | #sh' | tr -d \'`)
To print all the entries of the array: echo "${KEYS[*]}"

Pretty-print valid JSONs mixed with string keys

I have a Redis hash with keys and values like string key -- serialized JSON value.
Corresponding rediscli query (hgetall some_redis_hash) being dumped in a file:
redis_key1
{"value1__key1": "value1__value1", "value1__key2": "value1__value2" ...}
redis_key2
{"value2__key1": "value2__value1", "value2__key2": "value2__value2" ...}
...
and so on.
So the question is, how do I pretty-print these values enclosed in brackets? (note that key strings between are making the document invalid, if you'll try to parse the entire one)
The first thought is to get particular pairs from Redis, strip parasite keys, and use jq on the remaining valid JSON, as shown below:
rediscli hget some_redis_hash redis_key1 > file && tail -n +2 file
- file now contains valid JSON as value, the first string representing Redis key is stripped by tail -
cat file | jq
- produces pretty-printed value -
So the question is, how to pretty-print without such preprocessing?
Or (would be better in this particular case) how to merge keys and values in one big JSON, where Redis keys, accessible on the upper level, will be followed by dicts of their values?
Like that:
rediscli hgetall some_redis_hash > file
cat file | cool_parser
- prints { "redis_key1": {"value1__key1": "value1__value1", ...}, "redis_key2": ... }
A simple way for just pretty-printing would be the following:
cat file | jq --raw-input --raw-output '. as $raw | try fromjson catch $raw'
It tries to parse each line as json with fromjson, and just outputs the original line (with $raw) if it can't.
(The --raw-input is there so that we can invoke fromjson enclosed in a try instead of running it on every line directly, and --raw-output is there so that any non-json lines are not enclosed in quotes in the output.)
A solution for the second part of your questions using only jq:
cat file \
| jq --raw-input --null-input '[inputs] | _nwise(2) | {(.[0]): .[1] | fromjson}' \
| jq --null-input '[inputs] | add'
--null-input combined with [inputs] produces the whole input as an array
which _nwise(2) then chunks into groups of two (more info on _nwise)
which {(.[0]): .[1] | fromjson} then transforms into a list of jsons
which | jq --null-input '[inputs] | add' then combines into a single json
Or in a single jq invocation:
cat file | jq --raw-input --null-input \
'[ [inputs] | _nwise(2) | {(.[0]): .[1] | fromjson} ] | add'
...but by that point you might be better off writing an easier to understand python script.

create json from bash variable and associative array [duplicate]

This question already has answers here:
Constructing a JSON object from a bash associative array
(5 answers)
Closed 5 months ago.
Lets say I have the following declared in bash:
mcD="had_a_farm"
eei="eeieeio"
declare -A animals=( ["duck"]="quack_quack" ["cow"]="moo_moo" ["pig"]="oink_oink" )
and I want the following json:
{
"oldMcD": "had a farm",
"eei": "eeieeio",
"onThisFarm":[
{
"duck": "quack_quack",
"cow": "moo_moo",
"pig": "oink_oink"
}
]
}
Now I know I could do this with an echo, printf, or assign text to a variable, but lets assume animals is actually very large and it would be onerous to do so. I could also loop through my variables and associative array and create a variable as I'm doing so. I could write either of these solutions, but both seem like the "wrong way". Not to mention its obnoxious to deal with the last item in animals, after which I do not want a ",".
I'm thinking the right solution uses jq, but I'm having a hard time finding much documentation and examples on how to use this tool to write jsons (especially those that are nested) rather than parse them.
Here is what I came up with:
jq -n --arg mcD "$mcD" --arg eei "$eei" --arg duck "${animals['duck']}" --arg cow "${animals['cow']}" --arg pig "${animals['pig']}" '{onThisFarm:[ { pig: $pig, cow: $cow, duck: $duck } ], eei: $eei, oldMcD: $mcD }'
Produces the desired result. In reality, I don't really care about the order of the keys in the json, but it's still annoying that the input for jq has to go backwards to get it in the desired order. Regardless, this solution is clunky and was not any easier to write than simply declaring a string variable that looks like a json (and would be impossible with larger associative arrays). How can I build a json like this in an efficient, logical manner?
Thanks!
Assuming that none of the keys or values in the "animals" array contains newline characters:
for i in "${!animals[#]}"
do
printf "%s\n%s\n" "${i}" "${animals[$i]}"
done | jq -nR --arg oldMcD "$mcD" --arg eei "$eei" '
def to_o:
. as $in
| reduce range(0;length;2) as $i ({};
.[$in[$i]]= $in[$i+1]);
{$oldMcD,
$eei,
onthisfarm: [inputs] | to_o}
'
Notice the trick whereby {$x} in effect expands to {(x): $x}
Using "\u0000" as the separator
If any of the keys or values contains a newline character, you could tweak the above so that "\u0000" is used as the separator:
for i in "${!animals[#]}"
do
printf "%s\0%s\0" "${i}" "${animals[$i]}"
done | jq -sR --arg oldMcD "$mcD" --arg eei "$eei" '
def to_o:
. as $in
| reduce range(0;length;2) as $i ({};
.[$in[$i]]= $in[$i+1]);
{$oldMcD,
$eei,
onthisfarm: split("\u0000") | to_o }
'
Note: The above assumes jq version 1.5 or later.
You can reduce associative array with for loop and pipe it to jq:
for i in "${!animals[#]}"; do
echo "$i"
echo "${animals[$i]}"
done |
jq -n -R --arg mcD "$mcD" --arg eei "$eei" 'reduce inputs as $i ({onThisFarm: [], mcD: $mcD, eei: $eei}; .onThisFarm[0] += {($i): (input | tonumber ? // .)})'

Constructing a JSON object from a bash associative array

I would like to convert an associative array in bash to a JSON hash/dict. I would prefer to use JQ to do this as it is already a dependency and I can rely on it to produce well formed json. Could someone demonstrate how to achieve this?
#!/bin/bash
declare -A dict=()
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
for i in "${!dict[#]}"
do
echo "key : $i"
echo "value: ${dict[$i]}"
done
echo 'desired output using jq: { "foo": 1, "bar": 2, "baz": 3 }'
There are many possibilities, but given that you already have written a bash for loop, you might like to begin with this variation of your script:
#!/bin/bash
# Requires bash with associative arrays
declare -A dict
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
for i in "${!dict[#]}"
do
echo "$i"
echo "${dict[$i]}"
done |
jq -n -R 'reduce inputs as $i ({}; . + { ($i): (input|(tonumber? // .)) })'
The result reflects the ordering of keys produced by the bash for loop:
{
"bar": 2,
"baz": 3,
"foo": 1
}
In general, the approach based on feeding jq the key-value pairs, with one key on a line followed by the corresponding value on the next line, has much to recommend it. A generic solution following this general scheme, but using NUL as the "line-end" character, is given below.
Keys and Values as JSON Entities
To make the above more generic, it would be better to present the keys and values as JSON entities. In the present case, we could write:
for i in "${!dict[#]}"
do
echo "\"$i\""
echo "${dict[$i]}"
done |
jq -n 'reduce inputs as $i ({}; . + { ($i): input })'
Other Variations
JSON keys must be JSON strings, so it may take some work to ensure that the desired mapping from bash keys to JSON keys is implemented. Similar remarks apply to the mapping from bash array values to JSON values. One way to handle arbitrary bash keys would be to let jq do the conversion:
printf "%s" "$i" | jq -Rs .
You could of course do the same thing with the bash array values, and let jq check whether the value can be converted to a number or to some other JSON type as desired (e.g. using fromjson? // .).
A Generic Solution
Here is a generic solution along the lines mentioned in the jq FAQ and advocated by #CharlesDuffy. It uses NUL as the delimiter when passing the bash keys and values to jq, and has the advantage of only requiring one call to jq. If desired, the filter fromjson? // . can be omitted or replaced by another one.
declare -A dict=( [$'foo\naha']=$'a\nb' [bar]=2 [baz]=$'{"x":0}' )
for key in "${!dict[#]}"; do
printf '%s\0%s\0' "$key" "${dict[$key]}"
done |
jq -Rs '
split("\u0000")
| . as $a
| reduce range(0; length/2) as $i
({}; . + {($a[2*$i]): ($a[2*$i + 1]|fromjson? // .)})'
Output:
{
"foo\naha": "a\nb",
"bar": 2,
"baz": {
"x": 0
}
}
This answer is from nico103 on freenode #jq:
#!/bin/bash
declare -A dict=()
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
assoc2json() {
declare -n v=$1
printf '%s\0' "${!v[#]}" "${v[#]}" |
jq -Rs 'split("\u0000") | . as $v | (length / 2) as $n | reduce range($n) as $idx ({}; .[$v[$idx]]=$v[$idx+$n])'
}
assoc2json dict
You can initialize a variable to an empty object {} and add the key/values {($key):$value} for each iteration, re-injecting the result in the same variable :
#!/bin/bash
declare -A dict=()
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
data='{}'
for i in "${!dict[#]}"
do
data=$(jq -n --arg data "$data" \
--arg key "$i" \
--arg value "${dict[$i]}" \
'$data | fromjson + { ($key) : ($value | tonumber) }')
done
echo "$data"
This has been posted, and credited to nico103 on IRC, which is to say, me.
The thing that scares me, naturally, is that these associative array keys and values need quoting. Here's a start that requires some additional work to dequote keys and values:
function assoc2json {
typeset -n v=$1
printf '%q\n' "${!v[#]}" "${v[#]}" |
jq -Rcn '[inputs] |
. as $v |
(length / 2) as $n |
reduce range($n) as $idx ({}; .[$v[$idx]]=$v[$idx+$n])'
}
$ assoc2json a
{"foo\\ bar":"1","b":"bar\\ baz\\\"\\{\\}\\[\\]","c":"$'a\\nb'","d":"1"}
$
So now all that's needed is a jq function that removes the quotes, which come in several flavors:
if the string starts with a single-quote (ksh) then it ends with a single quote and those need to be removed
if the string starts with a dollar sign and a single-quote and ends in a double-quote, then those need to be removed and internal backslash escapes need to be unescaped
else leave as-is
I leave this last iterm as an exercise for the reader.
I should note that I'm using printf here as the iterator!
bash 5.2 introduces the #k parameter transformation which, makes this much easier. Like:
$ declare -A dict=([foo]=1 [bar]=2 [baz]=3)
$ jq -n '[$ARGS.positional | _nwise(2) | {(.[0]): .[1]}] | add' --args "${dict[#]#k}"
{
"foo": "1",
"bar": "2",
"baz": "3"
}

extract 2 values from JSON object and use as variables in loop using jq and bash

I am new to jq. I am trying to write a simple script that loops through a JSON file, gets two values within each object and assigns them to two separate variables I can use with a curl REST call. I see both values as output when I echo $i but how can I get value and addr as separate variables?
for i in `cat /Users/egraham/Downloads/test2 | jq .[] | jq ."value,.addr"`; do
You can do this:
jq -rc '.populator.value + " " + .populator.addr' file.json |
while read -r value addr; do
echo do something with "$value" and "$addr"
done
If spaces or tabs or other special characters make using 'read -r' problematic, and if your shell has "readarray", then it could be used:
$ readarray -t v < <(jq -rc '.populator | (.value,.addr)' file.json)
The values would then be available as ${v[0]} and ${v[1]}
This approach is especially useful if there are more than two values of interest, or if the number of values is variable or not known beforehand.
If your shell does not have readarray, then you can still use the array-oriented approach, e.g. along the lines of:
i=-1; while read -r a ; do i=$((i+1)); v[$i]="$a" ; done
First:
for i in cat /Users/egraham/Downloads/test2 | jq .[] | jq .value; do echo $i done
Second:
for i in cat /Users/egraham/Downloads/test2 | jq .[] | jq .addr; do echo $i done
I don't know any way to get it without running the commands separately. I don't know AWK, but maybe it's something worth considering.