Insert value into JSON array if it is not already present

I'm building a bash script that will read the response from an API and insert a value into a [] array (not a {} object) if the value isn't already present. Fake example response from the API:
#response.json contains:
{
"foods": {
"menu": [
"tacos",
"spaghetti",
"pizza",
"chicken_florentine",
"bacon_cheeseburge",
"chow_mein",
"sushi",
"chocolate",
"whiskey"
]
}
}
The variable from my bash script is order="lasagna". If 'foods.menu[]' contains $order, do nothing; otherwise insert the $order value into 'foods.menu[]'.
Using the bash variable order="lasagna", which doesn't currently exist in 'foods.menu[]', the resulting JSON should be:
{
"foods": {
"menu": [
"tacos",
"spaghetti",
"pizza",
"chicken_florentine",
"bacon_cheeseburge",
"chow_mein",
"sushi",
"chocolate",
"whiskey",
"lasagna" <----
]
}
}
I started by trying a bash for loop and variations of jq's if-then-else, select, and contains, but went down a rabbit hole. Any help is appreciated.

You don't need a loop for that:
jq --arg order lasagna '.foods.menu |= if index($order) then . else . + [$order] end' response.json
But you may need one for inserting multiple orders:
jq 'reduce $ARGS.positional[] as $order (.;
.foods.menu |= if index($order) then . else . + [$order] end
)' response.json --args apple banana sushi
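For completeness, here's a minimal sketch of wiring the first command into the bash script, assuming you want response.json updated in place (jq has no in-place flag, so write to a temporary file and move it over):
#!/bin/bash
order="lasagna"
# Append $order only if it is not already present in .foods.menu,
# then replace the original file with the updated copy.
tmp=$(mktemp)
jq --arg order "$order" \
  '.foods.menu |= if index($order) then . else . + [$order] end' \
  response.json > "$tmp" && mv "$tmp" response.json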

I don't know if this fits with any unspoken requirements.
Add the item regardless, and then unique the results:
echo "$result" | jq --arg order pizza '.foods.menu |= (. + [$order] | unique)'
{
"foods": {
"menu": [
"bacon_cheeseburge",
"chicken_florentine",
"chocolate",
"chow_mein",
"pizza",
"spaghetti",
"sushi",
"tacos",
"whiskey"
]
}
}
The resulting array is sorted. If there were any duplicates in the input, those are gone.
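If you want to keep the original insertion order instead of a sorted array, a sketch of an order-preserving de-duplication (using reduce in place of unique) could look like this:
echo "$result" | jq --arg order pizza '
.foods.menu |= (. + [$order]
  | reduce .[] as $item ([]; if index($item) then . else . + [$item] end))
'
This keeps the first occurrence of each value and only appends $order if it wasn't already there.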

Here's another way using the alternative operator // to default to the array's length for the item's index:
jq --arg order 'lasagna' '
(.foods.menu | .[index($order) // length]) = $order
' response.json
Adding a list of items in the same manner:
jq '.foods.menu |= reduce $ARGS.positional[] as $order (.;
.[index($order) // length] = $order
)' response.json --args 'lasagna' 'sushi' 'milk'
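Since the values ultimately come from a bash script, note that --args also expands a bash array cleanly; a sketch (the orders array name is just an example):
orders=(lasagna sushi milk)
jq '.foods.menu |= reduce $ARGS.positional[] as $order (.;
  .[index($order) // length] = $order
)' response.json --args "${orders[@]}"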

Related

jq: how to loop through sub arrays

I have the following dataset:
{
"data": {
"activeFindings": {
"findings": [
{
"findingId": "someFindingID#84209",
"products": [
"hostA.corp.somedomain.org",
"hostB.corp.somedomain.org"
],
"totalAffectedObjectsCount": 6
},
{
"findingId": "someFindingID#2145016",
"products": [
"hostC.corp.somedomain.org"
],
"totalAffectedObjectsCount": 1
},
{
"findingId": "someFindingID#67129",
"products": [
"hostD.corp.somedomain.org"
],
"totalAffectedObjectsCount": 4
},
{
"findingId": "someFindingID#67774",
"products": [
"hostA.corp.somedomain.org"
],
"totalAffectedObjectsCount": 6
}
]
}
}
}
The following command (though the first result returns null) gives the list of findingId values and their associated host(s):
cat test | jq -r '.data[] | .. | "\(.findingId?) \(.products?)"'
null null
someFindingID#84209 ["hostA.corp.somedomain.org","hostB.corp.somedomain.org"]
someFindingID#2145016 ["hostC.corp.somedomain.org","hostE.corp.somedomain.org","hostG.corp.somedomain.org"]
someFindingID#67129 ["hostD.corp.somedomain.org"]
someFindingID#67774 ["hostA.corp.somedomain.org"]
What I'd like to achieve is to loop through each value and pass the findingId & products as arguments to a bash script.
The following:
someFindingID#84209 ["hostA.corp.somedomain.org","hostB.corp.somedomain.org"]
someFindingID#2145016 ["hostC.corp.somedomain.org","hostE.corp.somedomain.org","hostG.corp.somedomain.org"]
someFindingID#67129 ["hostD.corp.somedomain.org"]
someFindingID#67774 ["hostA.corp.somedomain.org"]
Would result in:
./somescript.sh someFindingID#84209 hostA.corp.somedomain.org
./somescript.sh someFindingID#84209 hostB.corp.somedomain.org
./somescript.sh someFindingID#2145016 hostC.corp.somedomain.org
./somescript.sh someFindingID#2145016 hostE.corp.somedomain.org
./somescript.sh someFindingID#2145016 hostG.corp.somedomain.org
./somescript.sh someFindingID#67129 hostD.corp.somedomain.org
[...]
Any help/guidance on how to achieve the above would be greatly appreciated!
Thanks,
Solution:
jq -r '
.data[][][] |
.products[] as $product |
#sh "./somescript.sh \( .findingId ) \( $product )"
'
First of all, .data[] | .. returns way too many nodes.
.data[][][] would work great here.
.data.activeFindings.findings[] can be used if you want to be more precise.
You ask how to loop, but you're already doing it: [] is used to loop over an array.
The catch is that you want to loop without changing the context (.). To do that, we can use as:
.products[] as $product
Finally, we want to avoid code injection bugs, so we'll use @sh "...". In the string literal that follows @sh, all interpolated values are converted into proper shell string literals.
$ jq -rn '"foo bar" | @sh "cmd \( . )"'
cmd 'foo bar'
All together, we get the following program:
.data[][][] |
.products[] as $product |
#sh "./somescript.sh \( .findingId ) \( $product )"
Demo on jqplay
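To actually run the generated commands rather than just print them, one common pattern is to pipe jq's raw output into a shell; a sketch, assuming the program above is saved as program.jq (a hypothetical filename) and the data is in the file test:
# program.jq holds the jq program shown above (the name is arbitrary)
# Inspect the generated commands first, then pipe them to sh to execute
jq -r -f program.jq test
jq -r -f program.jq test | sh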
I'd go with something like this:
jq -r '.data[]
| ..
| objects
| select(has("findingId"))
| "./somescript.sh \"\(.findingId)\" " + .products[]
'
You might want to quote the "product" values as well.
Or consider using @sh.
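If you do want that quoting handled for you, the same select-based approach can be combined with @sh; a sketch (same behaviour as above, but both arguments are shell-quoted and each product gets its own line):
jq -r '.data[]
| ..
| objects
| select(has("findingId"))
| @sh "./somescript.sh \(.findingId) \(.products[])"
'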

How can jq be used to insert dynamic field names recursively for all objects in an array?

I'm new to jq, and hoping to convert the JSON below so that, for each object in the records array, the "Account" object is deleted and replaced with an "AccountID" field which has the value of Account.Id.
Assume I don't know the name of the field (e.g. Account) prior to executing, so it has to be dynamically included as an argument to --arg.
Contacts.json:
{
"records": [
{
"attributes": {
"type": "Contact",
"referenceId": "ContactRef1"
},
"Account": {
"attributes": {
"type": "Account",
"url": "/services/data/v51.0/sobjects/Account/asdf"
},
"Id": "asdf"
}
},
{
"attributes": {
"type": "Contact",
"referenceId": "ContactRef2"
},
"Account": {
"attributes": {
"type": "Account",
"url": "/services/data/v51.0/sobjects/Account/qwer"
},
"Id": "qwer"
}
}
]
}
to
{
"records": [
{
"attributes": {
"type": "Contact",
"referenceId": "ContactRef1"
},
"AccountID": "asdf"
},
{
"attributes": {
"type": "Contact",
"referenceId": "ContactRef2"
},
"AccountID": "qwer"
}
]
}
This example is a little contrived because, in actuality, I need to be able to dynamically name the ID field in order to port the new JSON structure into the destination system. For my use case, it's not always valid to tack "ID" onto the field name (e.g. Account + ID), so I pass the field names to --arg.
This is as close as I got, but it's not quite there, and I suspect there is a better way.
jq -c --arg field "Account" --arg field_name_id "AccountID" '. |= . + if .records?[]?[$field] != null then { "\($field_name_id)" : .records[][$field].Id } else empty end | if .records?[]?[$field] != null then del(.records[][$field]) else empty end' Contacts.json
I've wrestled with this for quite a while, but this is as far as I was able to get without running into tons of syntax errors. I'd really appreciate any help adding an AccountID field to each object in the records array.
Here's the actual bash script where jq is being run (the relevant parts are where FIELD(S) is used):
#! /bin/bash
# This script takes a soql file as its first and only argument
# The main purpose is to tweak the json results from an sfdx:data:tree:export so the json is compatible with sfdx:data:tree:import
# This is needed because sfdx export & import are inadequate when relationships are more than 2 levels deep in the export query.
# grab all unique object names within the soql file for any objects where the ID field is being SELECTed (eg. "Account Iteration__r Profile UserRole")
FIELDS=`grep -oe '\([A-Za-z_]\+\)\.[iI][dD]' $1 | cut -f 1 -d . - | sort -u`
#find all json files in the directory and rewrite the relationship FIELD blocks into something sfdx can import
for FIELD in $FIELDS;
do
if [[ $FIELD =~ __r ]]
then
FIELD_NAME_ID=`sed 's/__r/__c/' <<< $FIELD`
else
FIELD_NAME_ID="${FIELD}ID"
fi
JSON_FILES=`ls *.json`
#Loop over all json files in the directory
for DATA_FILE in $JSON_FILES
do
#replace any email addresses left in custom data (just in case)
#using gsed because Mac lacks the -i flag for in-place substitution
gsed -i 's/[^# "]*#[^#]*\.[^# ,"]*/fake#test.com/g' $DATA_FILE
# make temporary file to hold the rewritten json
TEMP_FILE="temp-${DATA_FILE}.bk"
echo $DATA_FILE $FIELD $FIELD_NAME_ID
#For custom relationship attributes, change __r to __c to get the name of the Id field; otherwise just add "ID".
jq -c --arg field $FIELD --arg field_name_id $FIELD_NAME_ID '. |= . + if .records?[]?[$field] != null then { "\($field_name_id)" : .records[][$field].Id } else empty end | if .records?[]?[$field] != null then del(.records[][$field]) else empty end' $DATA_FILE 1> ./$TEMP_FILE 2> modify-json.errors
# if TEMP_FILE is not empty, then jq revised it, so replace the contents of the original JSON DATA_FILE
if [[ -s ./$TEMP_FILE ]]
then
#JSON format spacing/line-breaks
jq '.' $TEMP_FILE > $DATA_FILE
fi
rm $TEMP_FILE
done
done
The key to a simple solution is |=. Here's one using map:
.records |= map( .Account.Id as $x
| del(.Account)
| . + {AccountID: $x} )
which can be simplified to:
.records |= map( . + {AccountID: .Account.Id}
| del(.Account) )
Either of these can easily be adapted to the case where the two field names are passed in as arguments, or if they must be inferred from the "owner" of "Id".
Adapting peak's answer to use the dynamic field name:
jq -c --arg field "Account" \
--arg field_name_id "AccountID" '
.records |= map(.[$field].Id as $x
| del(.[$field])
| . + {($field_name_id): $x})
'
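For the "inferred" case mentioned above, where the field name is not passed in but has to be discovered from whichever object-valued member carries an Id, a rough sketch (the + "ID" naming rule here is an assumption; the __r/__c rule from the bash script would still need to be handled separately):
.records |= map(
  reduce (to_entries[] | select(.value | type == "object" and has("Id"))) as $e
    (.; .[$e.key + "ID"] = $e.value.Id | del(.[$e.key]))
)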

How to conditionally select array index to update based on value

I have a JSON file as below and need to append the new name to .root.<application>.names. If the name passed in has a prefix (everything before "-"; in the example below it's jr), then find the list that already contains names with the same prefix and update it; if there is only one names list, or no list matches the prefix, then update the first list.
In the example below:
If $application == 'application1' and $name == <whatever>: just update the first list under application1, as there is only one list under application1, so there is nothing to choose from.
If $application == 'application2' and $name has no prefix delimiter "-", or has an unmatched prefix (say sr-allen): update the first list under application2, because the name has no matching prefix.
If $application == 'application2' and, say, $name == jr-allen: update the second list under application2, because $name has the prefix "jr-" and there is a list whose items match this prefix.
{
"root": {
"application1": [
{
"names": [
"john"
],
"project": "generic"
}
],
"application2": [
{
"names": [
"peter",
"jack"
],
"project": "generic"
},
{
"names": [
"jr-sam",
"jr-mike",
"jr-rita"
],
"project": "junior-project"
}
]
}
}
I found out how to update the list, but I'm not sure how to add these conditions. Any help please?
jq '."root"."application2"[1].names[."root"."application2"[1].names| length] |= . + "jr-allen"' foo.json
Update:
It would be good if I could do this with jq's walk; I am still trying, as below, but couldn't get anywhere close.
prefix=$(echo ${name} | cut -d"-" -f1) # this gives the prefix, e.g: "jr"
jq -r --arg app "${application}" name "${name}" prefix "${prefix}"'
def walk(f):
. as $in
| if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
walk( if type=="object" and ."$app" and (.names[]|startswith("$prefix")) ) then .names[]="$name" else . end )
' foo.json
It took me a while to have any confidence that I have understood the requirements, but I believe the following at least captures the essence of what you have in mind.
To make the solution easier to understand, we begin with a helper function, the name of which makes its purpose clear enough, at least given the context:
def updateFirstNamesArrayWithMatchingPrefix($prefix; $value):
(first( range(0; length) as $i
| if any(.[$i].names[]; startswith($prefix))
then $i else empty end) // 0) as $i
| .[$i].names += [$value] ;
.root |=
if .[$app] | length == 1
then .[$app][0].names += [ $name ]
elif .[$app] | length > 1
then
( $name | split("-")) as $components
| if $components|length==1 # no prefix
then .[$app][0].names += [ $name ]
else ($components[0] + "-" ) as $prefix
| .[$app] |= updateFirstNamesArrayWithMatchingPrefix($prefix; $name)
end
else .
end
Testing
The above passes the four tests originally proposed by @Inian:
jq --arg app "application1" --arg name "foo" -f script.jq jsonFile
jq --arg app "application2" --arg name "jr-foo" -f script.jq jsonFile
jq --arg app "application2" --arg name "sr-foo" -f script.jq jsonFile
jq --arg app "application2" --arg name "foo" -f script.jq jsonFile
Herrings?
Based on my understanding of the problem, it seems to me that walk may be a bit of a red herring, but if not, I hope you'll be able to adapt the above to meet your actual requirements.
Your requirement is: If prefix exists in name, update the last array element, else the first one.
When the array has one element, like for application1, the last is the first also.
#!/bin/bash
application="$1"
name="$2"
json_file="file.json"
ind=0
# if name matches "-", set index to the last array element
[[ "$name" == *"-"* ]] && ind=-1
jq --arg app "$application" \
--arg name "$name" \
--argjson ind "$ind" \
'.root[$app][$ind].names += [$name]' "$json_file"
I believe the above script is self-explanatory enough; --argjson is used to get an unquoted (numeric) index, and += is shorthand for |= . +.
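A quick standalone illustration of the negative-index mechanism the script relies on (a sketch with a simplified structure):
# .[-1] addresses the last element of an array, .[0] the first,
# so the same filter can target either end depending on $ind
jq -nc '[["peter","jack"],["jr-sam"]] | .[-1] += ["jr-allen"]'
# [["peter","jack"],["jr-sam","jr-allen"]]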
Testing
Commands below produce the expected result.
bash test.sh application1 jr-John
bash test.sh application2 jr-John
bash test.sh application1 Mary
bash test.sh application2 Mary

Map arrays to objects with no common fields

How might one use jq-1.5-1-a5b5cbe to join a filtered set of arrays from STDIN to a set of objects which contains no common fields, assuming that all elements will be in predictable order?
Standard Input (pre-slurpfile; generated by multiple GETs):
{"ref":"objA","arr":["alpha"]}
{"ref":"objB","arr":["bravo"]}
Existing File:
[{"name":"foo"},{"name":"bar"}]
Desired Output:
[{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]
Current Bash:
$ multiGET | jq --slurpfile stdin /dev/stdin '.[].arr = $stdin[].arr' file
[
{
"name": "foo",
"arr": [
"alpha"
]
},
{
"name": "bar",
"arr": [
"alpha"
]
}
]
[
{
"name": "foo",
"arr": [
"bravo"
]
},
{
"name": "bar",
"arr": [
"bravo"
]
}
]
Sidenote: I wasn't sure when to use pretty/compact JSON in this question; please comment with your opinion on best practice.
Get jq to read file before stdin, so that the first entity in file will be . and you can get everything else using inputs.
$ multiGET | jq -c '. as $objects
| [ foreach (inputs | {arr}) as $x (-1; .+1;
. as $i | $objects[$i] + $x
) ]' file -
[{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]
"Slurping" (whether using -s or --slurpfile) is sometimes necessary but rarely desirable, because of the memory requirements. So here's a solution that takes advantage of the fact that your multiGET produces a stream:
multiGET | jq -n --argjson objects '[{"name":"foo"},{"name":"bar"}]' '
$objects
| [foreach inputs as $in (-1; .+1;
. as $ix
| $objects[$ix] + ($in | del(.ref)))]
'
Here's a functional approach that might be appropriate if your stream was in fact already packaged as an array:
multiGET | jq -s --argjson objects '[{"name":"foo"},{"name":"bar"}]' '
[$objects, map(del(.ref))]
| transpose
| map(add)
'
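To see why transpose | map(add) produces the pairing, here is a minimal standalone illustration; the literal arrays stand in for $objects and the slurped (already del(.ref)-stripped) stream:
jq -nc '[[{"name":"foo"},{"name":"bar"}],
        [{"arr":["alpha"]},{"arr":["bravo"]}]]
       | transpose | map(add)'
# [{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]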
If the $objects array is in a file or too big for the command line, I'd suggest using --argfile, even though it is technically deprecated.
If the $objects array is in a file, and if you want to avoid --argfile, you could still avoid slurping, e.g. by using the fact that unless -n is used, jq will automatically read one JSON entity from stdin:
(echo '[{"name":"foo"},{"name":"bar"}]';
multiGET) | jq '
. as $objects
| [foreach inputs as $in (-1; .+1;
. as $ix | $objects[$ix] + $in | del(.ref))]
'

Need help to parse and print only 'category' values either using jq or jsawk or shell script

I need help parsing and printing only the category values, using either jq, jsawk, or a shell script. Here is the input:
{
"fine_grained": {
"dog": [
{
"category": "cocker spaniel",
"mark": 0.9958831668
}
]
},
"coarse": [
{
"category": "dog",
"mark": 0.948208034
}
]
}
Assuming all category values are simple strings and you want all of them, regardless of where they appear in the JSON, you could use this jq filter:
.. | objects.category // empty
This returns the following strings:
"cocker spaniel"
"dog"
Here is a solution which uses leaf_paths and select to find all the paths ending in a "category" key, and then extracts the corresponding values with foreach:
foreach (leaf_paths | select(.[-1] == "category")) as $p (
.
; .
; getpath($p)
)
If your input is in a file called input.json and the above filter is in a file called filter.jq then the shell command
jq -f filter.jq input.json
should produce
"cocker spaniel"
"dog"
You can use the -r flag if you don't want the quotes in the output.
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R, so the above is really just:
(leaf_paths | select(.[-1] == "category")) as $p
| getpath($p)
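A quick way to convince yourself of that equivalence, as a standalone sketch unrelated to the input above:
# Both commands emit the same stream: 0, 10, 20
jq -n 'foreach range(3) as $x (.; .; $x * 10)'
jq -n 'range(3) as $x | $x * 10'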