Print key if any nested value matches a set value - json

This is best explained with expected input and output.
Given this input:
{
"27852380038": {
"compute_id": 34234234,
"to_compute": [
{
"asset_id": 304221854,
"new_scheme": "mynewscheme",
"original_host": "oldscheme1234.location.com"
},
{
"asset_id": 12123121,
"new_scheme": "myotherscheme",
"original_host": "olderscheme1234.location.com"
}
]
},
"31352333022": {
"compute_id": 43888877,
"to_compute": [
{
"asset_id": 404221555,
"new_scheme": "mynewscheme",
"original_host": "oldscheme1234.location.com"
},
{
"asset_id": 52123444,
"new_scheme": "myotherscheme",
"original_host": "olderscheme1234.location.com"
}
]
}
}
Given the asset_id that I'm searching for, 12123121, the output should be:
27852380038
So I want the top level keys where any of the asset_ids in to_compute match my input asset_id.
I haven't seen any jq example so far that combines nested access with an any test / if else.

The task can be accomplished without using environment variables, e.g.
< input.json jq -r --argjson ASSET_ID 12123121 '
to_entries[]
| {key, asset_id: .value.to_compute[].asset_id}
| select(.asset_id==$ASSET_ID)
| .key'
or more efficiently, using the filter:
to_entries[]
| select( any( .value.to_compute[]; .asset_id==$ASSET_ID) )
| .key
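For anyone who wants to try the any-based filter without creating input.json first, here is a minimal self-contained sketch. The inline sample is trimmed from the question to the fields the filter actually reads, and the shell variable names (input, matching_keys) are only illustrative.

```shell
# Sample data trimmed from the question; only the fields the filter reads are kept.
input='{
  "27852380038": {"compute_id": 34234234, "to_compute": [{"asset_id": 304221854}, {"asset_id": 12123121}]},
  "31352333022": {"compute_id": 43888877, "to_compute": [{"asset_id": 404221555}, {"asset_id": 52123444}]}
}'

# --argjson passes 12123121 as a JSON number, so == compares number to number.
# any(generator; condition) stops at the first matching asset in to_compute.
matching_keys=$(printf '%s' "$input" | jq -r --argjson ASSET_ID 12123121 '
  to_entries[]
  | select(any(.value.to_compute[]; .asset_id == $ASSET_ID))
  | .key')

printf '%s\n' "$matching_keys"
```

Because select is applied to the whole entry, each top-level key is printed at most once, even if several assets under it were to match.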

With some help from a coworker I was able to figure it out:
$ export ASSET_ID=12123121
$ cat input.json | jq -r "to_entries[] | .value.to_compute[] + {job: .key} | select(.asset_id==$ASSET_ID) | .job"
27852380038

Related

JSON selector for jq

I would like to get a CSV file from the Mozilla CommonVoice JSON statistics. It looks like this:
{
...,
"locales": {
"en": {
...,
"validHrs": 2275.28
},
"fa": {
...,
"validHrs": 327.14
},
}
}
I manage to get the value of validHrs for a single language:
jq ".locales.en.validHrs" cv-corpus-10.0-2022-07-04.json
2275.28
jq ".locales.fa.validHrs" cv-corpus-10.0-2022-07-04.json
327.14
But not for all. Goal is a CSV with:
en,2275.28
fa,327.14
...
This works for me:
cat cv-corpus-10.0-2022-07-04.json| jq -r '.locales | keys[] as $k | "\($k),\(.[$k] | .validHrs)"'
The output looks like:
ab,59.71
ar,87.747
as,2.14
ast,0
az,0.12
ba,256.46
bas,2.04
be,1089.38
bg,9.15
bn,61.6
br,9.56
ca,1389.83
...
You can use to_entries which will give you each key, value pair:
$ jq -r '.locales | to_entries[] | "\(.key),\(.value.validHrs)"' cv.json
en,2275.28
fa,327.14
fr,868.35
es,411.31
sl,10.09
kab,552.81
[...]
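A small runnable sketch of the same to_entries approach, assuming a cut-down stand-in for the statistics file that holds only the two locales shown in the question (the variable names stats and csv are made up for the example):

```shell
# Stand-in for the CommonVoice statistics file, reduced to two locales.
stats='{"locales":{"en":{"validHrs":2275.28},"fa":{"validHrs":327.14}}}'

# to_entries turns {"en": {...}, ...} into [{"key": "en", "value": {...}}, ...],
# so each locale code and its validHrs can be interpolated into one line.
csv=$(printf '%s' "$stats" | jq -r '.locales | to_entries[] | "\(.key),\(.value.validHrs)"')

printf '%s\n' "$csv"
```

If the values could ever contain commas or quotes, building an array and piping it to @csv instead of using string interpolation would take care of the quoting.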

Index string array by prefix in JQ

I'm trying to figure out how to print the names of keys and specific nested values under them.
My JSON is:
{
"results": 3,
"rows": [
{
"hostname1": {
"tags": [
"owner:TEAM_A",
"friendlyname:myhost1",
"x:abc",
"y:jkl"
]
}
},
{
"hostname2": {
"tags": [
"friendlyname:myhost2",
"owner:TEAM_A",
"x:def",
"q:jkl"
]
}
},
{
"hostname3": {
"tags": [
"owner:TEAM_A",
"x:ghi",
"friendlyname:myhost3",
"q:jkl"
]
}
}
]
}
What I've already achieved is to print just keys of hostnames:
jq -r '.rows[] | keys[]' example.json
hostname1
hostname2
hostname3
I know how to print key:values from tags array:
jq -r .rows[0].hostname1.tags[0,1] example.json
owner:TEAM_A
friendlyname:myhost1
But I can't figure out how to print
hostname1
"owner:TEAM_A",
"friendlyname:myhost1",
hostname2
"owner:TEAM_A",
"friendlyname:myhost2",
hostname3
"owner:TEAM_A",
"friendlyname:myhost3",
Be aware that the entries in the tags array come in a different order for each host, so I cannot reach them through .rows[0].hostname1.tags[0,1]. I'm looking for something like .rows[0].all_keys.tags[owner,friendlyname]
My bash script was very close, but the order of the tags breaks it.
hostnames=`jq -r '.rows[] | keys[]' example.json`
count=0
for i in $hostnames
do
jq -r .rows[$count].$i\.tags[0,1] example.json
echo $i
((count=count+1))
done
Turning tags into an object first would make it easier to retrieve tags in a particular order.
.rows[][].tags | INDEX(sub(":.*"; "")) | .owner, .friendlyname
And it seems like you don't need a shell loop for this task, JQ can do all that and even more on its own.
.rows[]
| keys_unsorted[] as $hostname
| .[$hostname].tags
| INDEX(sub(":.*"; ""))
| $hostname, "\t" + (.owner, .friendlyname)
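To see what the INDEX step does on its own, here is a minimal sketch run against the tags of hostname2, where friendlyname comes before owner. It assumes a jq version that ships the INDEX builtin (1.6 or later, as far as I know) and a build with regex support for sub; the variable names are illustrative.

```shell
# Tags of hostname2 from the question; note that friendlyname precedes owner.
tags='["friendlyname:myhost2","owner:TEAM_A","x:def","q:jkl"]'

# INDEX(f) builds an object whose keys are f applied to each array element;
# here the key is the part of the tag before the first colon.
# Looking up .owner and .friendlyname then no longer depends on array order.
result=$(printf '%s' "$tags" | jq -r 'INDEX(sub(":.*"; "")) | .owner, .friendlyname')

printf '%s\n' "$result"
```

The values in the indexed object are the original tag strings, which is why the full owner:TEAM_A text is printed rather than just TEAM_A.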
You can use to_entries to convert an object into an array of key-value pairs, then access .key and .value of its items to your own likings. For instance:
jq -r '.rows[] | to_entries[] | [.key, .value.tags[0,1]] | join("\n ")'
hostname1
owner:TEAM_A
friendlyname:myhost1
hostname2
friendlyname:myhost2
owner:TEAM_A
hostname3
owner:TEAM_A
x:ghi
Another example:
jq -r '
.rows[] | to_entries[] | [.key, (
.value.tags[] | select(startswith("owner:", "friendlyname:"))
)] | join("\n ")
'
hostname1
owner:TEAM_A
friendlyname:myhost1
hostname2
friendlyname:myhost2
owner:TEAM_A
hostname3
owner:TEAM_A
friendlyname:myhost3

How do I write a jq query to convert a JSON file to CSV?

The JSON files look like:
{
"name": "My Collection",
"description": "This is a great collection.",
"date": 1639717379161,
"attributes": [
{
"trait_type": "Background",
"value": "Sand"
},
{
"trait_type": "Skin",
"value": "Dark Brown"
},
{
"trait_type": "Mouth",
"value": "Smile Basic"
},
{
"trait_type": "Eyes",
"value": "Confused"
}
]
}
I found a shell script that uses jq and has this code:
i=1
for eachFile in *.json; do
cat $i.json | jq -r '.[] | {column1: .name, column2: .description} | [.[] | tostring] | @csv' > extract-$i.csv
echo "converted $i of many json files..."
((i=i+1))
done
But its output is:
jq: error (at <stdin>:34): Cannot index string with string "name"
converted 1 of many json files...
Any suggestions on how I can make this work? Thank you!
Quick jq lesson
===========
jq filters are applied like this:
jq -r '.name_of_json_field_0 <optional filter>, .name_of_json_field_1 <optional filter>'
and so on and so forth. A single dot is the simplest filter; it leaves the data untouched.
jq -r '.name_of_field | .'
You may also omit the filter for the same effect.
In your case:
jq -r '.name, .description'
will extract the values of both those fields.
.[] will unwrap an array to have the next piped filter applied to each unwrapped value. Example:
jq -r '.attributes | .[]'
extracts each object in the attributes array.
You may sometimes want to repackage objects in an array by surrounding the filter in brackets:
jq -r '[.name, .description, .date]'
You may sometimes want to repackage data in an object by surrounding the filter in curly braces:
jq -r '{new_field_name: .name, super_new_field_name: .description}'
Playing around with these, I was able to get
jq -r '[.name, .description, .date, (.attributes | [.[] | .trait_type] | @csv | gsub(",";";") | gsub("\"";"")), (.attributes | [.[] | .value] | @csv | gsub(",";";") | gsub("\"";""))] | @csv'
to give us:
"My Collection","This is a great collection.",1639717379161,"Background;Skin;Mouth;Eyes","Sand;Dark Brown;Smile Basic;Confused"
Name, description, and date were left as is, so let's break down the weird parts, one step at a time.
.attributes | [.[] | .trait_type]
.[] extracts each element of the attributes array and pipes the result of that into the next filter, which says to simply extract trait_type, where they are re-packaged in an array.
.attributes | [.[] | .trait_type] | @csv
turns the array into a CSV-formatted string.
(.attributes | [.[] | .trait_type] | @csv | gsub(",";";") | gsub("\"";""))
Parens separate this from the rest of the evaluations, obviously.
The first gsub here replaces commas with semicolons so they don't get interpreted as a separate field, the second removes all extra double quotes.
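As a variation on the breakdown above, the sketch below swaps the inner @csv plus two gsub calls for map and join(";"), which builds the semicolon-separated cell directly. This is a substitute technique, not the command from the answer, and the shell variable names (collection, row) are only illustrative. The sample data is the JSON from the question.

```shell
# The JSON from the question, on one line.
collection='{"name":"My Collection","description":"This is a great collection.","date":1639717379161,"attributes":[{"trait_type":"Background","value":"Sand"},{"trait_type":"Skin","value":"Dark Brown"},{"trait_type":"Mouth","value":"Smile Basic"},{"trait_type":"Eyes","value":"Confused"}]}'

# map(.trait_type) collects one field from every attribute;
# join(";") glues them into a single string, so each list fills one CSV cell.
row=$(printf '%s' "$collection" | jq -r '
  [ .name, .description, .date,
    (.attributes | map(.trait_type) | join(";")),
    (.attributes | map(.value) | join(";")) ]
  | @csv')

printf '%s\n' "$row"
```

One difference to keep in mind: the gsub version strips every double quote and turns every comma inside a value into a semicolon, while join leaves the values as they are and lets the outer @csv handle the quoting.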

Parsing nested json with jq

I am parsing a nested json to get specific values from the json response. The json response is as follows:
{
"custom_classes": 2,
"images":
[
{
"classifiers":
[
{
"classes":
[
{
"class": "football",
"score": 0.867376
}
],
"classifier_id": "players_367677167",
"name": "players"
}
],
"image": "1496A400EDC351FD.jpg"
}
],
"images_processed": 1
}
From images => classifiers => classes, "class" and "score" are the values that I want to save in a CSV file. I have found how to save the result in a CSV file, but I am unable to parse images alone. I can get custom_classes and images_processed.
I am using jq-1.5.
The different commands I have tried :
curl "Some address"| jq '.["images"]'
curl "Some address"| jq '.[.images]'
curl "Some address"| jq '.[.images["image"]]'
Most of the times the error is about not being able to index the array images.
Any hints?
I must say, I'm not terribly good at jq, so probably all those array iterations could be shorthanded somehow, but this yields the values you mentioned:
cat foo.json | jq ".images | .[] | .classifiers | .[] | .classes | .[] | .[]"
If you want the keys, too, just omit that last .[].
Edit
As @chepner pointed out in the comments, this can indeed be shortened to
cat foo.json | jq ".images[].classifiers[].classes[] | [.class, .score] | @csv"
Depending on the data, this filter, which uses recursive descent (..), objects, and has, may work:
.. | objects | select(has("class")) | [.class,.score] | @csv
Sample Run (assuming data in data.json)
$ jq -Mr '.. | objects | select(has("class")) | [.class,.score] | @csv' data.json
"football",0.867376
Try it online at jqplay.org
Here is another variation which uses paths and getpath
getpath( paths(has("class")?) ) | [.class,.score] | @csv
Try it online at jqplay.org
jq solution to obtain a prepared csv record:
jq -r '.images[0].classifiers[0].classes[0] | [.class, .score] | @csv' input.json
The output:
"football",0.867376
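The [0] indexes above pick only the first element at each level. A small sketch of the iterating form, using a response with a second class added purely for illustration (the "tennis" entry and its score are hypothetical, not from the question), shows one CSV row per class:

```shell
# Response from the question with one extra, hypothetical class ("tennis").
response='{"images":[{"classifiers":[{"classes":[{"class":"football","score":0.867376},{"class":"tennis","score":0.12}],"classifier_id":"players_367677167","name":"players"}],"image":"1496A400EDC351FD.jpg"}]}'

# [] at each level iterates over every element instead of only the first one.
# -r prints the @csv string raw rather than as a JSON-encoded string.
rows=$(printf '%s' "$response" | jq -r '.images[].classifiers[].classes[] | [.class, .score] | @csv')

printf '%s\n' "$rows"
```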

How to map an object to arrays so it can be converted to csv?

I'm trying to convert an object that looks like this:
{
"123" : "abc",
"231" : "dbh",
"452" : "xyz"
}
To csv that looks like this:
"123","abc"
"231","dbh"
"452","xyz"
I would prefer to use the command line tool jq but can't seem to figure out how to do the assignment. I managed to get the keys with jq '. | keys' test.json but couldn't figure out what to do next.
The problem is you can't convert a k:v object like this straight into CSV with @csv. It needs to be an array, so we need to convert to an array first. If the keys were named, it would be simple, but they're dynamic so it's not so easy.
Try this filter:
to_entries[] | [.key, .value]
to_entries converts an object to an array of key/value objects. [] breaks up the array into its individual items;
then each item is converted to an array containing the key and value.
This produces the following output:
[
"123",
"abc"
],
[
"231",
"dbh"
],
[
"452",
"xyz"
]
Then you can use the @csv filter to convert the rows to CSV rows.
$ echo '{"123":"abc","231":"dbh","452":"xyz"}' | jq -r 'to_entries[] | [.key, .value] | @csv'
"123","abc"
"231","dbh"
"452","xyz"
Jeff's answer is a good starting point. Here is something closer to what you expect:
cat input.json | jq 'to_entries | map([.key, .value]|join(","))'
[
"123,abc",
"231,dbh",
"452,xyz"
]
But I did not find a way to join using a newline:
cat input.json | jq 'to_entries | map([.key, .value]|join(","))|join("\n")'
"123,abc\n231,dbh\n452,xyz"
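The \n stays escaped there because jq prints its result as a JSON string by default. A minimal sketch, assuming the only change needed is the -r (raw output) flag, which writes string results without JSON quoting or escaping:

```shell
# Same filter as above; -r makes jq emit the joined string raw,
# so the "\n" separators become real line breaks.
lines=$(printf '%s' '{"123":"abc","231":"dbh","452":"xyz"}' \
  | jq -r 'to_entries | map([.key, .value] | join(",")) | join("\n")')

printf '%s\n' "$lines"
```

Unlike @csv, join(",") does not quote the fields, so this form only suits values that contain no commas or quotes.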
Here's an example I ended up using this morning (processing PagerDuty alerts):
cat /tmp/summary.json | jq -r '
.incidents
| map({desc: .trigger_summary_data.description, id:.id})
| group_by(.desc)
| map(length as $len
| {desc:.[0].desc, length: $len})
| sort_by(.length)
| map([.desc, .length] | @csv)
| join("\n") '
This dumps a CSV document that looks something like:
"[Triggered] Something annoyingly frequent",31
"[Triggered] Even more frequent alert!",35
"[No data] Stats Server is probably acting up",55
Try this; it gives the output you want:
echo '{"123":"abc","231":"dbh","452":"xyz"}' | jq -r 'to_entries | .[] | "\"" + .key + "\",\"" + (.value | tostring)+ "\""'
onecol2txt () {
awk 'BEGIN { RS="_end_"; FS="\n"}
{ for (i=2; i <= NF; i++){
printf "%s ",$i
}
printf "\n"
}'
}
cat jsonfile | jq -r -c '....,"_end_"' | onecol2txt