Update values in JSON array elements using jq in a loop

I have an input JSON file containing an array. I need to update two values (ver and date) in each array element. I came up with the script below but need help. I have hardcoded the ver and date values to simplify the script.
input.json
[
  {
    "svcname": "svc1",
    "repo": "https://repo.mycom.org/repocontext/svc1-list",
    "ver": "0.1",
    "date": "2019-11-05"
  },
  {
    "svcname": "svc1",
    "repo": "https://repo.mycom.org/repocontext/svc1-list",
    "ver": "0.1",
    "date": "2019-12-21"
  }
]
Script:
#!/bin/bash
set +x
injson=input.json
updatedjson=$(jq .[] ${injson})
services=$(cat ${injson} | jq '.[] | .svcname' | tr -d \")
i=1
for svc in $services; do
  echo "==>$svc"
  echo "======> input json=${updatedjson}"
  echo "======> update ver=${i}"
  updatedjson=$(echo ${updatedjson} | jq ". | select( .name ==\"$svc\").ver=\"$i\"" | jq . )
  svcdate="2020-01-$i"
  echo "======> update date=$svcdate"
  updatedjson=$(echo ${updatedjson} | jq ". | select( .name ==\"$svc\").date=\"$svcdate\"" | jq . )
  echo "============================================"
  echo
  i=`expr $i + 1`
done
echo "======= write to file ====="
echo ${updatedjson}
echo ${updatedjson} | jq . > outjson.json

You are not using the real power of jq. What you've written as a loop, iterating over all the JSON objects, can be reduced to a single reduce expression, which is jq's version of a for loop: given an initial value, it runs the update filter incrementally over a stream of generated values.
jq 'reduce range(0, length) as $d (.; (.[$d].ver = ($d+1|tostring)) | (.[$d].date = "2020-01-\($d+1)")) '
A brief explanation of how it works:
The range expression generates numbers from 0 up to (but not including) the length of the array. For the given input it produces 0 and 1, each of which is bound to $d in turn.
The reduce expression takes the whole JSON document as its initial input value . and runs the update filter once per generated number, setting the values in the object indexed by $d. So .[$d].ver refers to the ver field of the object at index $d. This is done incrementally until all the objects are processed.
The date field is modified the same way, via .[$d].date, with the value string prefixed with the year and month (2020-01-) and the day taken from $d+1.
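For reference, running the filter on the sample input.json should yield the array below; note the day numbers are not zero-padded, because the filter interpolates $d+1 directly:
$ jq 'reduce range(0, length) as $d (.; (.[$d].ver = ($d+1|tostring)) | (.[$d].date = "2020-01-\($d+1)"))' input.json
[
  {
    "svcname": "svc1",
    "repo": "https://repo.mycom.org/repocontext/svc1-list",
    "ver": "1",
    "date": "2020-01-1"
  },
  {
    "svcname": "svc1",
    "repo": "https://repo.mycom.org/repocontext/svc1-list",
    "ver": "2",
    "date": "2020-01-2"
  }
]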

jq add list to object list until condition

Background
I have an object in which each value is a nested list containing only strings. For each string within a nested list, I want to look that string up as a key in the object and add all of its values to the current value.
Here's what I have so far:
#!/bin/bash
in=$(jq -n '{
  "bar": [["re", "de"]],
  "do": [["bar","baz"]],
  "baz": [["re"]],
  "re": [["zoo"]]
}')
echo "expected:"
jq -n '{
  "bar": [["re", "de"], ["zoo"]],
  "do": [["bar","baz"], ["re", "de"], ["re"], ["zoo"]],
  "baz": [["re"], ["zoo"]],
  "re": [["zoo"]]
}'
echo "actual:"
echo ${in} | jq '. as $origin
  | map_values( . +
      until(
        length == 0;
        (. | flatten | map($origin[.]) | map(select( . != [[]] )) | add )
      )
    )'
Problem:
The output is exactly the same as the input $in. If the until() function is removed from the statement, then the output correctly shows one iteration. However, I want to recursively look up the resulting strings within the object and add the looked-up values until the lookup value is empty or non-existent.
For example, the key do has a value of [["bar","baz"]]. If we iterate through the values of do we come across baz. The value of baz within the object is [["re"]]. Add baz's value ["re"] to do so that do equals [["bar","baz"], ["re"]]. Since re IS a key within the object, also add the value of ["re"], which is ["zoo"]. Since zoo is NOT a key within the object, finish with baz and continue to the next key within the object.
The following solves the problem as originally stated, but the "expected" output as shown does not quite match the stated problem.
echo ${in} | jq -c '
. as $dict
| map_values(reduce (..|strings) as $v (.;
. + $dict[$v] ))
'
produces (after some manual reformatting for clarity):
{"bar":[["re","de"],["zoo"]],
"do":[["bar","baz"],["re","de"],["re"]],
"baz":[["re"],["zoo"]],"re":[["zoo"]]}
If some kind of recursive lookup is needed, then please reformulate the problem statement, being sure to avoid infinite loops.

Nested array in JSON to different rows in CSV

I have the following JSON:
{
  "transmitterId": "30451155eda2",
  "rssiSignature": [
    {
      "receiverId": "001bc509408201d5",
      "receiverIdType": 1,
      "rssi": -52,
      "numberOfDecodings": 5,
      "rssiSum": -52
    },
    {
      "receiverId": "001bc50940820228",
      "receiverIdType": 1,
      "rssi": -85,
      "numberOfDecodings": 5,
      "rssiSum": -85
    }
  ],
  "timestamp": 1574228579837
}
I want to convert it to CSV format, where each row corresponds to an entry in rssiSignature (I have added the header row for visualization purposes):
timestamp,transmitterId,receiverId,rssi
1574228579837,"30451155eda2","001bc509408201d5",-52
1574228579837,"30451155eda2","001bc50940820228",-85
My current attempt is the following, but I get a single CSV row:
$ jq -r '[.timestamp, .transmitterId, .rssiSignature[].receiverId, .rssiSignature[].rssi] | @csv' test.jsonl
1574228579837,"30451155eda2","001bc509408201d5","001bc50940820228",-52,-85
How can I use jq to generate different rows for each entry of the rssiSignature array?
In order to reuse a value from the upper level, like the timestamp, for every item of the rssiSignature array, you can bind it to a variable. You can get your CSV like this:
jq -r '.timestamp as $t | .transmitterId as $tid |
.rssiSignature[] | [ $t, $tid, .receiverId, .rssi] | @csv
' file.json
Output:
1574228579837,"30451155eda2","001bc509408201d5",-52
1574228579837,"30451155eda2","001bc50940820228",-85
Also, here is a way to print headers for an output file in bash, independent of what commands we call, using command grouping.
(
  printf "timestamp,transmitterId,receiverId,rssi\n"
  jq -r '.timestamp as $t | .transmitterId as $tid |
    .rssiSignature[] | [ $t, $tid, .receiverId, .rssi] | @csv
  ' file.json
) > output.csv
Actually, the task can be accomplished without the use of any variables; one can also coax jq to include a header:
jq -r '
["timestamp","transmitterId","receiverId","rssi"],
[.timestamp, .transmitterId] + (.rssiSignature[] | [.receiverId,.rssi])
| @csv'
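Run against the sample file, this prints the header row as well; @csv quotes all string fields, including the header names:
"timestamp","transmitterId","receiverId","rssi"
1574228579837,"30451155eda2","001bc509408201d5",-52
1574228579837,"30451155eda2","001bc50940820228",-85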
A single header with multiple files
One way to produce a single header with multiple input files would be to use inputs in conjunction with the -n command-line option. This also happens to be efficient:
jq -nr '
["timestamp","transmitterId","receiverId","rssi"],
(inputs |
[.timestamp, .transmitterId] + (.rssiSignature[] | [.receiverId,.rssi]))
| @csv'
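It would be invoked with the program in a file, along these lines (tocsv.jq and the input file names here are illustrative):
$ jq -nr -f tocsv.jq first.json second.json > combined.csv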

Perform average of values of a particular field inside a JSON using jq

I have the following JSON data.
{
  "meta": {
    "earliest_ts": 1601425980,
    "latest_ts": 1601482740,
    "interval": "minutes",
    "level": "cluster",
    "entity_id": "xxxxx-xxxxx-xxxxx-xxxxx-xxxxx",
    "stat_labels": [
      "status_code_classes_per_workspace_total"
    ],
    "entity_type": "workspace",
    "start_ts": "1601425980"
  },
  "stats": {
    "cluster": {
      "1601431620": {
        "3xx": 2,
        "4xx": 87,
        "5xx": 31,
        "2xx": 352
      },
      "1601472780": {
        "3xx": 14,
        "4xx": 296,
        "5xx": 2,
        "2xx": 3811
      },
      "1601479140": {
        "3xx": 17,
        "4xx": 397,
        "5xx": 19,
        "2xx": 4399
      }
    }
  }
}
I am trying to compute the average of all the "3xx" fields.
Using jq, I manage to get the keys of my cluster:
echo $data | jq -r '.stats.cluster|keys' | while read key; do
  echo $key
done
Output:
[
"1601431620",
"1601472780",
"1601479140"
]
But when I try to go further, I can't manage to retrieve the data from each field.
I got some inspiration from this.
The code below doesn't work, but you get the idea:
# total var will be used to calculate the average
total=$(echo $data | jq ".stats.cluster" | jq length)
# for each cluster ...
echo $data | jq '.stats.cluster|keys' | while read key; do
  # ... we retrieve the value "3xx"
  i=$($data | jq '.stats.cluster.$key."3xx"')
  # ... that we add into a sum var
  sum=$(( sum + i ))
done
# we calculate the average
avg=$(( $sum / $total ))
echo "The average is $avg"
I can't put the path directly in jq, like jq '.stats.cluster."1601431620"."3xx"', because there are many cluster keys and they change all the time.
The desired output for my example above would be 11, i.e. (2 + 14 + 17) / 3, all of those numbers coming from the "3xx" fields.
You can get the value directly from jq:
$ jq '[.stats.cluster[]["3xx"]] | add / length' <<< "$data"
11
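To see the intermediate step: the inner expression gathers the three "3xx" values into an array, and add / length then averages them:
$ jq -c '[.stats.cluster[]["3xx"]]' <<< "$data"
[2,14,17]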

Constructing a JSON object from a bash associative array

I would like to convert an associative array in bash to a JSON hash/dict. I would prefer to use jq to do this, as it is already a dependency and I can rely on it to produce well-formed JSON. Could someone demonstrate how to achieve this?
#!/bin/bash
declare -A dict=()
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
for i in "${!dict[#]}"
do
echo "key : $i"
echo "value: ${dict[$i]}"
done
echo 'desired output using jq: { "foo": 1, "bar": 2, "baz": 3 }'
There are many possibilities, but given that you already have written a bash for loop, you might like to begin with this variation of your script:
#!/bin/bash
# Requires bash with associative arrays
declare -A dict
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
for i in "${!dict[#]}"
do
echo "$i"
echo "${dict[$i]}"
done |
jq -n -R 'reduce inputs as $i ({}; . + { ($i): (input|(tonumber? // .)) })'
The result reflects the ordering of keys produced by the bash for loop:
{
  "bar": 2,
  "baz": 3,
  "foo": 1
}
In general, the approach based on feeding jq the key-value pairs, with one key on a line followed by the corresponding value on the next line, has much to recommend it. A generic solution following this general scheme, but using NUL as the "line-end" character, is given below.
Keys and Values as JSON Entities
To make the above more generic, it would be better to present the keys and values as JSON entities. In the present case, we could write:
for i in "${!dict[#]}"
do
echo "\"$i\""
echo "${dict[$i]}"
done |
jq -n 'reduce inputs as $i ({}; . + { ($i): input })'
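Assuming the same key ordering as above, this yields the same object, the difference being that the values are now parsed by input as JSON (numbers, in this case):
{
  "bar": 2,
  "baz": 3,
  "foo": 1
}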
Other Variations
JSON keys must be JSON strings, so it may take some work to ensure that the desired mapping from bash keys to JSON keys is implemented. Similar remarks apply to the mapping from bash array values to JSON values. One way to handle arbitrary bash keys would be to let jq do the conversion:
printf "%s" "$i" | jq -Rs .
You could of course do the same thing with the bash array values, and let jq check whether the value can be converted to a number or to some other JSON type as desired (e.g. using fromjson? // .).
A Generic Solution
Here is a generic solution along the lines mentioned in the jq FAQ and advocated by @CharlesDuffy. It uses NUL as the delimiter when passing the bash keys and values to jq, and has the advantage of only requiring one call to jq. If desired, the filter fromjson? // . can be omitted or replaced by another one.
declare -A dict=( [$'foo\naha']=$'a\nb' [bar]=2 [baz]=$'{"x":0}' )
for key in "${!dict[#]}"; do
printf '%s\0%s\0' "$key" "${dict[$key]}"
done |
jq -Rs '
split("\u0000")
| . as $a
| reduce range(0; length/2) as $i
({}; . + {($a[2*$i]): ($a[2*$i + 1]|fromjson? // .)})'
Output:
{
"foo\naha": "a\nb",
"bar": 2,
"baz": {
"x": 0
}
}
This answer is from nico103 on freenode #jq:
#!/bin/bash
declare -A dict=()
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
assoc2json() {
  declare -n v=$1
  printf '%s\0' "${!v[@]}" "${v[@]}" |
    jq -Rs 'split("\u0000") | . as $v | (length / 2) as $n | reduce range($n) as $idx ({}; .[$v[$idx]]=$v[$idx+$n])'
}
assoc2json dict
You can initialize a variable to an empty object {} and add the key/value pairs {($key):$value} at each iteration, re-injecting the result into the same variable:
#!/bin/bash
declare -A dict=()
dict["foo"]=1
dict["bar"]=2
dict["baz"]=3
data='{}'
for i in "${!dict[#]}"
do
data=$(jq -n --arg data "$data" \
--arg key "$i" \
--arg value "${dict[$i]}" \
'$data | fromjson + { ($key) : ($value | tonumber) }')
done
echo "$data"
This has been posted, and credited to nico103 on IRC, which is to say, me.
The thing that scares me, naturally, is that these associative array keys and values need quoting. Here's a start that requires some additional work to dequote keys and values:
function assoc2json {
  typeset -n v=$1
  printf '%q\n' "${!v[@]}" "${v[@]}" |
    jq -Rcn '[inputs] |
      . as $v |
      (length / 2) as $n |
      reduce range($n) as $idx ({}; .[$v[$idx]]=$v[$idx+$n])'
}
$ assoc2json a
{"foo\\ bar":"1","b":"bar\\ baz\\\"\\{\\}\\[\\]","c":"$'a\\nb'","d":"1"}
$
So now all that's needed is a jq function that removes the quotes, which come in several flavors:
if the string starts with a single-quote (ksh) then it ends with a single-quote, and those need to be removed
if the string starts with a dollar sign and a single-quote and ends in a single-quote, then those need to be removed and internal backslash escapes need to be unescaped
else leave it as-is
I leave this last item as an exercise for the reader (though see the sketch below).
I should note that I'm using printf here as the iterator!
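For the curious, a minimal sketch of such a dequoting function in jq might look like the following; it only strips the quotes for the two quoted flavors and deliberately leaves the backslash unescaping (the actual exercise) undone:
# Sketch only: strips %q-style quoting; backslash escapes are NOT decoded.
def dequote:
  if startswith("$'") and endswith("'")
  then ltrimstr("$'") | rtrimstr("'")    # $'...' (ANSI-C quoting)
  elif startswith("'") and endswith("'")
  then ltrimstr("'") | rtrimstr("'")     # '...' (plain single quotes)
  else .                                 # bare word: leave as-is
  end;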
bash 5.2 introduces the @k parameter transformation, which makes this much easier. Like:
$ declare -A dict=([foo]=1 [bar]=2 [baz]=3)
$ jq -n '[$ARGS.positional | _nwise(2) | {(.[0]): .[1]}] | add' --args "${dict[@]@k}"
{
"foo": "1",
"bar": "2",
"baz": "3"
}
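The values come through as strings; if numbers are wanted, the tonumber? // . idiom from earlier can be applied to each value:
$ jq -n '[$ARGS.positional | _nwise(2) | {(.[0]): (.[1] | tonumber? // .)}] | add' --args "${dict[@]@k}"
{
  "foo": 1,
  "bar": 2,
  "baz": 3
}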

Using jq, Flatten Arbitrary JSON to Delimiter-Separated Flat Dictionary

I'm looking to transform JSON using jq to a delimiter-separated and flattened structure.
There have been attempts at this. For example, Flatten nested JSON using jq.
However the solutions on that page fail if the JSON contains arrays. For example, if the JSON is:
{"a":{"b":[1]},"x":[{"y":2},{"z":3}]}
The solution above will fail to transform the above to:
{"a.b.0":1,"x.0.y":2,"x.1.z":3}
In addition, I'm looking for a solution that will also allow for an arbitrary delimiter. For example, suppose the space character is the delimiter. In this case, the result would be:
{"a b 0":1,"x 0 y":2,"x 1 z":3}
I'm looking to have this functionality accessed via a Bash (4.2+) function as is found in CentOS 7, something like this:
flatten_json()
{
    local JSONData="$1"
    # jq command to flatten $JSONData, putting the result to stdout
    jq ... <<<"$JSONData"
}
The solution should work with all JSON data types, including null and boolean. For example, consider the following input:
{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}
It should produce:
{"a b 0":"p q r","w 0 x":null,"w 1 y":false,"w 2 z":3}
If you stream the data in, you'll get pairs of paths and values for all the leaf values and, for the events that aren't pairs, a path marking the end of the definition of an object/array at that path. Using leaf_paths as you found would only give you paths to truthy leaf values, so you'd miss out on null or even false values. As a stream, you won't have this problem.
There are many ways these events could be combined into an object; I'm partial to using reduce and assignment in these situations.
$ cat input.json
{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}
$ jq --arg delim '.' 'reduce (tostream|select(length==2)) as $i ({};
.[[$i[0][]|tostring]|join($delim)] = $i[1]
)' input.json
{
"a.b.0": "p q r",
"w.0.x": null,
"w.1.y": false,
"w.2.z": 3
}
Here's the same solution broken up a bit to allow room for explanation of what's going on.
$ jq --arg delim '.' 'reduce (tostream|select(length==2)) as $i ({};
[$i[0][]|tostring] as $path_as_strings
| ($path_as_strings|join($delim)) as $key
| $i[1] as $value
| .[$key] = $value
)' input.json
Converting the input to a stream with tostream, we'll receive multiple values of pairs/paths as input to our filter. With this, we can pass those multiple values into reduce which is designed to accept multiple values and do something with them. But before we do, we want to filter those pairs/paths by only the pairs (select(length==2)).
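To see exactly what reduce consumes, one can run just that portion of the filter; for the sample input the selected pair events are:
$ jq -c 'tostream | select(length==2)' input.json
[["a","b",0],"p q r"]
[["w",0,"x"],null]
[["w",1,"y"],false]
[["w",2,"z"],3]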
Then in the reduce call, we start with a clean object and assign new values using a key derived from the path and the corresponding value. Remember that every value produced in the reduce call is used for the next iteration. Binding values to variables doesn't change the current context, and assignments effectively "modify" the current value (the initial object) and pass it along.
$path_as_strings is just the path, which is an array of strings and numbers, converted to an array of strings only. [$i[0][]|tostring] is a shorthand I use as an alternative to map when the array I want to map over is not the current input; it's more compact since the mapping is done as a single expression, instead of having to write ($i[0]|map(tostring)) to get the same result. The outer parentheses might not be necessary in general, but it's still two separate filter expressions versus one (and more text).
From there we convert that array of strings to the desired key using the provided delimiter, and assign the appropriate value to the current object.
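Wiring this filter into the bash skeleton from the question gives a wrapper along these lines (a sketch: the optional second argument for the delimiter is my own addition):
flatten_json()
{
    local JSONData="$1"
    local Delim="${2:-.}"    # delimiter; defaults to "."
    jq --arg delim "$Delim" 'reduce (tostream|select(length==2)) as $i ({};
        .[[$i[0][]|tostring]|join($delim)] = $i[1]
    )' <<<"$JSONData"
}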
The following has been tested with jq 1.4, jq 1.5 and the current "master" version. The requirement about including paths to null and false is the reason for "allpaths" and "all_leaf_paths".
# all paths, including paths to null
def allpaths:
  def conditional_recurse(f): def r: ., (select(.!=null) | f | r); r;
  path(conditional_recurse(.[]?)) | select(length > 0);

def all_leaf_paths:
  def isscalar: type | (. != "object" and . != "array");
  allpaths as $p
  | select(getpath($p)|isscalar)
  | $p;

. as $in
| reduce all_leaf_paths as $path ({};
    . + { ($path | map(tostring) | join($delim)): $in | getpath($path) })
With this jq program in flatten.jq:
$ cat input.json
{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}
$ jq --arg delim . -f flatten.jq input.json
{
"a.b.0": "p q r",
"w.0.x": null,
"w.1.y": false,
"w.2.z": 3
}
Collisions
Here is a helper function that illustrates an alternative path-flattening algorithm. It converts keys that contain the delimiter to quoted strings, and array elements are presented in square brackets (see the example below):
def flattenPath(delim):
  reduce .[] as $s ("";
    if $s|type == "number"
    then ((if . == "" then "." else . end) + "[\($s)]")
    else . + ($s | tostring | if index(delim) then "\"\(.)\"" else . end)
    end );
Example: Using flattenPath instead of map(tostring) | join($delim), the object:
{"a.b": [1]}
would become:
{
"\"a.b\"[0]": 1
}
To add a new option to the solutions already given, jqg is a script I wrote to flatten any JSON file and then search it using a regex. For your purposes, your regex would simply be '.', which would match everything.
$ echo '{"a":{"b":[1]},"x":[{"y":2},{"z":3}]}' | jqg .
{
"a.b.0": 1,
"x.0.y": 2,
"x.1.z": 3
}
and can produce compact output:
$ echo '{"a":{"b":[1]},"x":[{"y":2},{"z":3}]}' | jqg -q -c .
{"a.b.0":1,"x.0.y":2,"x.1.z":3}
It also handles the more complicated example that @peak used:
$ echo '{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}' | jqg .
{
"a.b.0": "p q r",
"w.0.x": null,
"w.1.y": false,
"w.2.z": 3
}
as well as empty arrays and objects (and a few other edge-case values):
$ jqg . test/odd-values.json
{
"one.start-string": "foo",
"one.null-value": null,
"one.integer-number": 101,
"two.two-a.non-integer-number": 101.75,
"two.two-a.number-zero": 0,
"two.true-boolean": true,
"two.two-b.false-boolean": false,
"three.empty-string": "",
"three.empty-object": {},
"three.empty-array": [],
"end-string": "bar"
}
(reporting empty arrays & objects can be turned off with the -E option).
jqg was tested with jq 1.6
Note: I am the author of the jqg script.