JSON selector for jq - json

I would like to get a CSV file from the Mozilla CommonVoice JSON statistics. It looks like this:
{
...,
"locales": {
"en": {
...,
"validHrs": 2275.28
},
"fa": {
...,
"validHrs": 327.14
},
}
}
I managed to get the value of validHrs for a single language:
jq ".locales.en.validHrs" cv-corpus-10.0-2022-07-04.json
2275.28
jq ".locales.fa.validHrs" cv-corpus-10.0-2022-07-04.json
327.14
But I can't get it for all languages. The goal is a CSV like:
en,2275.28
fa,327.14
...

This works for me:
cat cv-corpus-10.0-2022-07-04.json| jq -r '.locales | keys[] as $k | "\($k),\(.[$k] | .validHrs)"'
The output looks like:
ab,59.71
ar,87.747
as,2.14
ast,0
az,0.12
ba,256.46
bas,2.04
be,1089.38
bg,9.15
bn,61.6
br,9.56
ca,1389.83
...

You can use to_entries which will give you each key, value pair:
$ jq -r '.locales | to_entries[] | "\(.key),\(.value.validHrs)"' cv.json
en,2275.28
fa,327.14
fr,868.35
es,411.31
sl,10.09
kab,552.81
[...]
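To check the to_entries approach without the full statistics file, here is a minimal sketch on a made-up two-locale sample (the values come from the question; the variable names sample and out are only illustrative). Note that to_entries keeps the keys in the order they appear in the file, whereas keys returns them sorted, which is why the two answers above list the locales in a different order.

```shell
# Hypothetical two-locale sample standing in for the real statistics file.
sample='{"locales":{"en":{"validHrs":2275.28},"fa":{"validHrs":327.14}}}'

# to_entries yields one {key, value} object per locale, in file order.
out=$(printf '%s' "$sample" | jq -r '.locales | to_entries[] | "\(.key),\(.value.validHrs)"')
printf '%s\n' "$out"
```

If a locale code or value could ever contain a comma or a quote, building an array and piping it to @csv would be the safer variant.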

Related

How I can transform json to csv with jq?

My json file is similar of this:
{
"A1": "1.2"
"A2": "3.5"
"A3": "2.6"
}
I need transform it to csv file and it looks like this:
A1,1.2
A2,3.5
A3,2.6
My code is:
jq -r 'map(.[] | tonumber) | @csv' file.json > file.csv
and my result is:
1.2,3.5,2.6
Once you fix your example JSON so it's valid:
$ jq -r 'to_entries[] | [.key, (.value | tonumber)] | @csv' input.json
"A1",1.2
"A2",3.5
"A3",2.6
to_entries turns an object into an array of objects, each with a key field holding the name of one of the original object's keys and a value field holding the corresponding value. Each of those objects is then turned into a two-element array, which is fed to @csv.
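The two steps described above can be looked at separately. This is a small sketch using the question's data with the missing commas added; the variable names fixed, entries and rows are only illustrative.

```shell
# The question's example with the missing commas added so that it parses.
fixed='{"A1": "1.2", "A2": "3.5", "A3": "2.6"}'

# Step 1: to_entries alone, printed compactly to show the intermediate shape.
entries=$(printf '%s' "$fixed" | jq -c 'to_entries')
printf '%s\n' "$entries"

# Step 2: one [key, number] array per entry, rendered as a CSV row by @csv.
rows=$(printf '%s' "$fixed" | jq -r 'to_entries[] | [.key, (.value | tonumber)] | @csv')
printf '%s\n' "$rows"
```

The first command prints a single array of {"key": ..., "value": ...} objects; the second prints the three CSV rows shown above.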

Print key if any nested value matches a set value

This is best explained with expected input and output.
Given this input:
{
"27852380038": {
"compute_id": 34234234,
"to_compute": [
{
"asset_id": 304221854,
"new_scheme": "mynewscheme",
"original_host": "oldscheme1234.location.com"
},
{
"asset_id": 12123121,
"new_scheme": "myotherscheme",
"original_host": "olderscheme1234.location.com"
}
]
},
"31352333022": {
"compute_id": 43888877,
"to_compute": [
{
"asset_id": 404221555,
"new_scheme": "mynewscheme",
"original_host": "oldscheme1234.location.com"
},
{
"asset_id": 52123444,
"new_scheme": "myotherscheme",
"original_host": "olderscheme1234.location.com"
}
]
}
}
And the asset_id that I'm searching for, 12123121, the output should be:
27852380038
So I want the top level keys where any of the asset_ids in to_compute match my input asset_id.
I haven't seen any jq example so far that combines nested access with an any test / if else.
The task can be accomplished without using environment variables, e.g.
< input.json jq -r --argjson ASSET_ID 12123121 '
to_entries[]
| {key, asset_id: .value.to_compute[].asset_id}
| select(.asset_id==$ASSET_ID)
| .key'
or more efficiently, using the filter:
to_entries[]
| select( any( .value.to_compute[]; .asset_id==$ASSET_ID) )
| .key
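For completeness, the any-based filter can be run as a full command like this. It is a sketch on a trimmed-down copy of the question's input that keeps only the fields the filter reads; the variable names input and key are illustrative.

```shell
# Trimmed-down version of the question's input (only the fields the filter reads).
input='{
  "27852380038": {"to_compute": [{"asset_id": 304221854}, {"asset_id": 12123121}]},
  "31352333022": {"to_compute": [{"asset_id": 404221555}, {"asset_id": 52123444}]}
}'

# --argjson passes the id as a JSON number, so == compares number with number.
key=$(printf '%s' "$input" | jq -r --argjson ASSET_ID 12123121 '
  to_entries[]
  | select(any(.value.to_compute[]; .asset_id == $ASSET_ID))
  | .key')
printf '%s\n' "$key"
```

Because any stops at the first match, each top-level key is printed at most once, even if several entries in to_compute carry the same asset_id.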
With some help from a coworker I was able to figure it out:
$ export ASSET_ID=12123121
$ cat input.json | jq -r "to_entries[] | .value.to_compute[] + {job: .key} | select(.asset_id==$ASSET_ID) | .job"
27852380038

How to print path and key values of JSON file using JQ

I would like to print each path and value of a json file with included key values line by line. I would like the output to be comma delimited or at least very easy to cut and sort using Linux command line tools. Given the following json and jq, I have been given jq code which seems to do this for the test JSON, but I am not sure it works in all cases or is the proper approach.
Is there a function in jq which does this automatically? If not, is there a "most concise best way" to do it?
My wish would be something like:
$ cat short.json | jq -doit '.'
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Test JSON:
$ cat short.json | jq '.'
{
"Reservations": [
{
"Groups": [],
"Instances": [
{
"ImageId": "ami-a",
"InstanceId": "i-a",
"InstanceType": "t2.micro",
"KeyName": "ubuntu"
}
]
}
]
}
Code Recommended:
https://unix.stackexchange.com/questions/561460/how-to-print-path-and-key-values-of-json-file
Supporting:
https://unix.stackexchange.com/questions/515573/convert-json-file-to-a-key-path-with-the-resulting-value-at-the-end-of-each-k
JQ Code Too long and complicated!
jq -r '
paths(scalars) as $p
| [ ( [ $p[] | tostring ] | join(".") )
, ( getpath($p) | tojson )
]
| join(": ")
' short.json
Result:
Reservations.0.Instances.0.ImageId: "ami-a"
Reservations.0.Instances.0.InstanceId: "i-a"
Reservations.0.Instances.0.InstanceType: "t2.micro"
Reservations.0.Instances.0.KeyName: "ubuntu"
A simple jq query to achieve the requested format:
paths(scalars) as $p
| $p + [getpath($p)]
| join(",")
If your jq is ancient and you cannot upgrade, insert | map(tostring) before the last line above.
Output with the -r option
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Caveat
If a key or atomic value contains "," then of course using a comma may be inadvisable. For this reason, it might be preferable to use TAB as the separator. A TAB can still occur inside a JSON string, but @tsv writes it as the escape sequence \t (and likewise escapes newlines, carriage returns and backslashes), so the columns stay unambiguous. Consider therefore using @tsv:
paths(scalars) as $p
| $p + [getpath($p)]
| @tsv
(The comment above about ancient versions of jq applies here too.)
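A short sketch of the tab-separated variant, run on a hypothetical document whose value contains a comma (the input and the variable names doc and tsv are made up for illustration):

```shell
# Hypothetical input: the value "x,y" would be ambiguous in comma-joined output.
doc='{"a": {"b": "x,y"}, "c": [1, true]}'

# One line per scalar leaf: the path components followed by the value, TAB-separated.
tsv=$(printf '%s' "$doc" | jq -r 'paths(scalars) as $p | $p + [getpath($p)] | @tsv')
printf '%s\n' "$tsv"
```

One thing to keep in mind with paths(scalars): leaves whose value is false or null are skipped, because the path is only kept when the filter produces a truthy value.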
Read it as a stream.
$ jq --stream -r 'select(.[1]|scalars!=null) | "\(.[0]|join(".")): \(.[1]|tojson)"' short.json
Use -c paths as follows:
cat short.json | jq -c paths | tr -d '[' | tr -d ']'
I am using jq-1.5-1-a5b5cbe

Parsing nested json with jq

I am parsing a nested json to get specific values from the json response. The json response is as follows:
{
"custom_classes": 2,
"images":
[
{
"classifiers":
[
{
"classes":
[
{
"class": "football",
"score": 0.867376
}
],
"classifier_id": "players_367677167",
"name": "players"
}
],
"image": "1496A400EDC351FD.jpg"
}
],
"images_processed": 1
}
The values I want to save in a CSV file are "class" and "score", found under images => classifiers => classes. I have found how to save the result in a CSV file, but I am unable to parse images on its own. I can get custom_classes and images_processed.
I am using jq-1.5.
The different commands I have tried :
curl "Some address"| jq '.["images"]'
curl "Some address"| jq '.[.images]'
curl "Some address"| jq '.[.images["image"]]'
Most of the time the error is about not being able to index the array images.
Any hints?
I must say, I'm not terribly good at jq, so probably all those array iterations could be shorthanded somehow, but this yields the values you mentioned:
cat foo.json | jq ".images | .[] | .classifiers | .[] | .classes | .[] | .[]"
If you want the keys, too, just omit that last .[].
Edit
As @chepner pointed out in the comments, this can indeed be shortened to
cat foo.json | jq -r ".images[].classifiers[].classes[] | [.class, .score] | @csv"
Depending on the data, this filter, which uses recursive descent (..), objects and has, may work:
.. | objects | select(has("class")) | [.class,.score] | @csv
Sample Run (assuming data in data.json)
$ jq -Mr '.. | objects | select(has("class")) | [.class,.score] | @csv' data.json
"football",0.867376
Here is another variation which uses paths and getpath
getpath( paths(has("class")?) ) | [.class,.score] | @csv
jq solution to obtain a prepared csv record:
jq -r '.images[0].classifiers[0].classes[0] | [.class, .score] | @csv' input.json
The output:
"football",0.867376
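The [0] indexes above pick only the first image, classifier and class. If the response can contain several of each, a variant along the following lines might be closer to what is needed. It is a sketch, not taken from the answers above; it additionally carries the image name into each row, and the variable names resp and csv are illustrative.

```shell
# The question's response, reduced to the fields used below.
resp='{"images":[{"classifiers":[{"classes":[{"class":"football","score":0.867376}],"name":"players"}],"image":"1496A400EDC351FD.jpg"}]}'

# Remember the image name, then emit one CSV row per class of every classifier.
csv=$(printf '%s' "$resp" | jq -r '
  .images[]
  | .image as $img
  | .classifiers[].classes[]
  | [$img, .class, .score]
  | @csv')
printf '%s\n' "$csv"
```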

How to map an object to arrays so it can be converted to csv?

I'm trying to convert an object that looks like this:
{
"123" : "abc",
"231" : "dbh",
"452" : "xyz"
}
To csv that looks like this:
"123","abc"
"231","dbh"
"452","xyz"
I would prefer to use the command line tool jq but can't seem to figure out how to do the assignment. I managed to get the keys with jq '. | keys' test.json but couldn't figure out what to do next.
The problem is you can't convert a k:v object like this straight into CSV with @csv. It needs to be an array, so we need to convert to an array first. If the keys were known in advance, it would be simple, but they're dynamic so it's not so easy.
Try this filter:
to_entries[] | [.key, .value]
to_entries converts an object to an array of key/value objects. [] breaks up the array into its individual items; each item is then converted to an array containing the key and value.
This produces the following output:
[
"123",
"abc"
]
[
"231",
"dbh"
]
[
"452",
"xyz"
]
Then you can use the @csv filter to convert the rows to CSV rows.
$ echo '{"123":"abc","231":"dbh","452":"xyz"}' | jq -r 'to_entries[] | [.key, .value] | @csv'
"123","abc"
"231","dbh"
"452","xyz"
Jeff's answer is a good starting point; here is something closer to what you expect:
cat input.json | jq 'to_entries | map([.key, .value]|join(","))'
[
"123,abc",
"231,dbh",
"452,xyz"
]
But I did not find a way to join using newlines:
cat input.json | jq 'to_entries | map([.key, .value]|join(","))|join("\n")'
"123,abc\n231,dbh\n452,xyz"
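The \n stays escaped above because jq prints the joined result as a JSON string. With the -r (raw output) option the string is written as-is, so each \n becomes a real line break. A small sketch using the question's object (the variable names obj and lines are illustrative):

```shell
obj='{"123":"abc","231":"dbh","452":"xyz"}'

# -r prints the final string raw instead of JSON-encoding it,
# so the "\n" separators turn into actual newlines.
lines=$(printf '%s' "$obj" | jq -r 'to_entries | map([.key, .value] | join(",")) | join("\n")')
printf '%s\n' "$lines"
```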
Here's an example I ended up using this morning (processing PagerDuty alerts):
cat /tmp/summary.json | jq -r '
.incidents
| map({desc: .trigger_summary_data.description, id:.id})
| group_by(.desc)
| map(length as $len
| {desc:.[0].desc, length: $len})
| sort_by(.length)
| map([.desc, .length] | @csv)
| join("\n") '
This dumps a CSV document that looks something like:
"[Triggered] Something annoyingly frequent",31
"[Triggered] Even more frequent alert!",35
"[No data] Stats Server is probably acting up",55
Try this; it gives the output you want:
echo '{"123":"abc","231":"dbh","452":"xyz"}' | jq -r 'to_entries | .[] | "\"" + .key + "\",\"" + (.value | tostring)+ "\""'
onecol2txt () {
awk 'BEGIN { RS="_end_"; FS="\n"}
{ for (i=2; i <= NF; i++){
printf "%s ",$i
}
printf "\n"
}'
}
cat jsonfile | jq -r -c '....,"_end_"' | onecol2txt