jq: how to loop through sub arrays - json

I have the following dataset:
{
"data": {
"activeFindings": {
"findings": [
{
"findingId": "someFindingID#84209",
"products": [
"hostA.corp.somedomain.org",
"hostB.corp.somedomain.org"
],
"totalAffectedObjectsCount": 6
},
{
"findingId": "someFindingID#2145016",
"products": [
"hostC.corp.somedomain.org"
],
"totalAffectedObjectsCount": 1
},
{
"findingId": "someFindingID#67129",
"products": [
"hostD.corp.somedomain.org"
],
"totalAffectedObjectsCount": 4
},
{
"findingId": "someFindingID#67774",
"products": [
"hostA.corp.somedomain.org"
],
"totalAffectedObjectsCount": 6
}
]
}
}
}
The following command lists each findingId with its associated host(s), although the first result is null null:
cat test | jq -r '.data[] | .. | "\(.findingId?) \(.products?)"'
null null
someFindingID#84209 ["hostA.corp.somedomain.org","hostB.corp.somedomain.org"]
someFindingID#2145016 ["hostC.corp.somedomain.org","hostE.corp.somedomain.org","hostG.corp.somedomain.org"]
someFindingID#67129 ["hostD.corp.somedomain.org"]
someFindingID#67774 ["hostA.corp.somedomain.org"]
What I'd like to achieve is to loop through each value and pass the findingId and each product as arguments to a bash script.
The following:
someFindingID#84209 ["hostA.corp.somedomain.org","hostB.corp.somedomain.org"]
someFindingID#2145016 ["hostC.corp.somedomain.org","hostE.corp.somedomain.org","hostG.corp.somedomain.org"]
someFindingID#67129 ["hostD.corp.somedomain.org"]
someFindingID#67774 ["hostA.corp.somedomain.org"]
Would result in:
./somescript.sh someFindingID#84209 hostA.corp.somedomain.org
./somescript.sh someFindingID#84209 hostB.corp.somedomain.org
./somescript.sh someFindingID#2145016 hostC.corp.somedomain.org
./somescript.sh someFindingID#2145016 hostE.corp.somedomain.org
./somescript.sh someFindingID#2145016 hostG.corp.somedomain.org
./somescript.sh someFindingID#67129 hostD.corp.somedomain.org
[...]
Any help/guidance on how to achieve the above would be greatly appreciated!
Thanks,

Solution:
jq -r '
.data[][][] |
.products[] as $product |
@sh "./somescript.sh \( .findingId ) \( $product )"
'
First of all, .data[] | .. returns way too many nodes.
.data[][][] would work great here.
.data.activeFindings.findings[] can be used if you want to be more precise.
You ask how to loop, but you're already doing it: [] is used to loop over an array.
The catch is that you want to loop without changing the context (.). To do that, we can use as:
.products[] as $product
Finally, we want to avoid code injection bugs, so we'll use @sh "...". In the string literal that follows @sh, all interpolated values are converted into proper shell string literals.
$ jq -rn '"foo bar" | @sh "cmd \( . )"'
cmd 'foo bar'
All together, we get the following program:
.data[][][] |
.products[] as $product |
@sh "./somescript.sh \( .findingId ) \( $product )"
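To see the whole thing end to end, here is a minimal sketch that runs the program against a trimmed copy of the question's data. The inline JSON and the variable names json and cmds are my own illustration, not part of the original question; I've also used the more precise .data.activeFindings.findings[] path.

```shell
# Hypothetical sample input: two findings trimmed from the question's dataset.
json='{"data":{"activeFindings":{"findings":[
  {"findingId":"someFindingID#84209","products":["hostA.corp.somedomain.org","hostB.corp.somedomain.org"]},
  {"findingId":"someFindingID#2145016","products":["hostC.corp.somedomain.org"]}]}}}'

# Emit one shell-quoted command line per (findingId, product) pair.
cmds=$(printf '%s\n' "$json" | jq -r '
  .data.activeFindings.findings[]
  | .products[] as $product
  | @sh "./somescript.sh \(.findingId) \($product)"
')

# Print the generated commands for inspection instead of executing them.
printf '%s\n' "$cmds"
```

Each printed line is a complete command with both arguments single-quoted. Once the lines look right, piping the jq output to sh would actually run the script.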

I'd go with something like this:
jq -r '.data[]
| ..
| objects
| select(has("findingId"))
| "./somescript.sh \"\(.findingId)\" " + .products[]
'
You might also want to quote the "product" values as well.
Or consider using @sh.

Related

Grouping and sorting JSON records in Bash

I'm using curl to get a JSON file. I would like to print each group of 4 values on one line and sort the lines by the first column.
I'm trying:
curl -L 'http://mylink/' | jq '.[] | .location, .host_name, .serial_number, .model'
I'm getting
"Office-1"
"work-1"
"11xxx111"
"hp"
"Office-2"
"work-2"
"33xxx333"
"lenovo"
"Office-1"
"work-3"
"22xxx222"
"dell"
I would like to have:
"Office-1", "work-1", "11xxx111", "hp"
"Office-1", "work-3", "22xxx222", "dell"
"Office-2", "work-2", "33xxx333", "lenovo"
I tried jq -S ".[]| .location| group_by(.location)" and a few other combinations, such as sort_by(.location), but none of them work. I get the error: jq: error (at <stdin>:1): Cannot iterate over string ("Office-1")
Sample of my JSON file:
[
{
"location": "Office-1",
"host_name": "work-1",
"serial_number": "11xxx111",
"model": "hp"
},
{
"location": "Office-2",
"host_name": "work-2",
"serial_number": "33xxx333",
"model": "lenovo"
},
{
"location": "Office-1",
"host_name": "work-3",
"serial_number": "22xxx222",
"model": "dell"
}
]
To sort by .location only, without an external sort:
map( [ .location, .host_name, .serial_number, .model] )
| sort_by(.[0])[]
| map("\"\(.)\"") | join(", ")
The ", " is per the stated requirements.
If you want the output as CSV, simply replace the last line in the jq program above with | @csv.
If minimizing keystrokes is a goal, then if you are certain that the keys are always in the desired order, you could get away with replacing the first line by map( [ .[] ] )
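As a quick check, here is a sketch that runs the sort_by program above against the sample from the question. The inline JSON and the variable names json and rows are only for illustration.

```shell
# Sample records copied from the question, inlined for a self-contained run.
json='[
  {"location":"Office-1","host_name":"work-1","serial_number":"11xxx111","model":"hp"},
  {"location":"Office-2","host_name":"work-2","serial_number":"33xxx333","model":"lenovo"},
  {"location":"Office-1","host_name":"work-3","serial_number":"22xxx222","model":"dell"}
]'

# Build one array per record, sort by the first column, then quote and join.
rows=$(printf '%s\n' "$json" | jq -r '
  map([.location, .host_name, .serial_number, .model])
  | sort_by(.[0])[]
  | map("\"\(.)\"") | join(", ")
')

printf '%s\n' "$rows"
```

Records with the same location keep their input order, so work-1 is listed before work-3.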
You can ask jq to produce arbitrary formatted strings.
curl -L 'http://mylink/' |
jq -r '.[]| "\"\(.location)\", \"\(.host_name)\", \"\(.serial_number)\", \"\(.model)\""' |
sort
Inside the double quotes, \" produces a literal double quote, and \(.field) interpolates the value of a field. The -r option is required to print raw strings rather than JSON-encoded ones.
This will get you the output you wanted:
jq -r 'group_by(.location) | .[] | .[] | map(values) | "\"" + join ("\", \"") + "\""'
like so:
$ jq -r 'group_by(.location) | .[] | .[] | map(values) | "\"" + join ("\", \"") + "\""' /tmp/so7713.json
"Office-1", "work-1", "11xxx111", "hp"
"Office-1", "work-3", "22xxx222", "dell"
"Office-2", "work-2", "33xxx333", "lenovo"
If you want it all as one string, it's a bit simpler:
$ jq 'group_by(.location) | .[] | .[] | map(values) | join (", ")' /tmp/so7713.json
"Office-1, work-1, 11xxx111, hp"
"Office-1, work-3, 22xxx222, dell"
"Office-2, work-2, 33xxx333, lenovo"
Note the lack of -r in the second example.
I feel there has to be a better way of doing .[] | .[], but I don't know what it is (yet).
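On that last point: as far as I can tell, .[] | .[] can be shortened to .[][], since the iteration operators chain. The sketch below compares the two forms on a small made-up input (the data and the variable names json, long_form and short_form are assumptions for illustration).

```shell
# Small hypothetical input with two locations.
json='[{"location":"Office-2","host_name":"work-2"},
       {"location":"Office-1","host_name":"work-1"}]'

# The long form used in the answer above.
long_form=$(printf '%s\n' "$json" | jq -c 'group_by(.location) | .[] | .[]')

# The chained form.
short_form=$(printf '%s\n' "$json" | jq -c 'group_by(.location) | .[][]')

# Report whether both forms produced identical output.
if [ "$long_form" = "$short_form" ]; then
  echo "same output"
else
  echo "outputs differ"
fi
```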

Map arrays to objects with no common fields

How might one use jq-1.5-1-a5b5cbe to join a filtered set of arrays from STDIN to a set of objects which contains no common fields, assuming that all elements will be in predictable order?
Standard Input (pre-slurpfile; generated by multiple GETs):
{"ref":"objA","arr":["alpha"]}
{"ref":"objB","arr":["bravo"]}
Existing File:
[{"name":"foo"},{"name":"bar"}]
Desired Output:
[{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]
Current Bash:
$ multiGET | jq --slurpfile stdin /dev/stdin '.[].arr = $stdin[].arr' file
[
{
"name": "foo",
"arr": [
"alpha"
]
},
{
"name": "bar",
"arr": [
"alpha"
]
}
]
[
{
"name": "foo",
"arr": [
"bravo"
]
},
{
"name": "bar",
"arr": [
"bravo"
]
}
]
Sidenote: I wasn't sure when to use pretty/compact JSON in this question; please comment with your opinion on best practice.
Get jq to read file before stdin, so that the first entity in file will be . and you can get everything else using inputs.
$ multiGET | jq -c '. as $objects
| [ foreach (inputs | {arr}) as $x (-1; .+1;
. as $i | $objects[$i] + $x
) ]' file -
[{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]
"Slurping" (whether using -s or --slurpfile) is sometimes necessary but rarely desirable, because of the memory requirements. So here's a solution that takes advantage of the fact that your multiGET produces a stream:
multiGET | jq -n --argjson objects '[{"name":"foo"},{"name":"bar"}]' '
$objects
| [foreach inputs as $in (-1; .+1;
. as $ix
| $objects[$ix] + ($in | del(.ref)))]
'
Here's a functional approach that might be appropriate if your stream was in fact already packaged as an array:
multiGET | jq -s --argjson objects '[{"name":"foo"},{"name":"bar"}]' '
[$objects, map(del(.ref))]
| transpose
| map(add)
'
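To try the transpose approach without a real multiGET, here is a sketch in which printf stands in for it and emits the two sample objects from the question. The printf stand-in and the variable name merged are assumptions for illustration.

```shell
# printf is a stand-in for multiGET: it emits the stream from the question.
merged=$(printf '%s\n' '{"ref":"objA","arr":["alpha"]}' '{"ref":"objB","arr":["bravo"]}' |
  jq -sc --argjson objects '[{"name":"foo"},{"name":"bar"}]' '
    # Pair the i-th named object with the i-th streamed object, then merge each pair.
    [$objects, map(del(.ref))]
    | transpose
    | map(add)
  ')

printf '%s\n' "$merged"
```

transpose turns the two parallel arrays into an array of pairs, and add merges each pair into one object, so the result should match the desired output from the question.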
If the $objects array is in a file or too big for the command line, I'd suggest using --argfile, even though it is technically deprecated.
If the $objects array is in a file, and if you want to avoid --argfile, you could still avoid slurping, e.g. by using the fact that unless -n is used, jq will automatically read one JSON entity from stdin:
(echo '[{"name":"foo"},{"name":"bar"}]';
multiGET) | jq '
. as $objects
| [foreach inputs as $in (-1; .+1;
. as $ix | $objects[$ix] + $in | del(.ref))]
'

Convert json to csv / jq Cannot iterate over string

[
{
"Description": "Copied for Destination xxx from Sourc 30c for Snapshot 1. Task created on X,52,87,14,76.",
"Encrypted": false,
"ID": "snap-074",
"Progress": "100%",
"Time": "2019-06-11T09:25:23.110Z",
"Owner": "883065",
"Status": "completed",
"Volume": "vol1",
"Size": 16
},
{
"Description": "Copied for Destination yy from Source 31c for Snapshot 2. Task created on X,52,87,14,76.",
"Encrypted": false,
"ID": "snap-096",
"Progress": "100%",
"Time": "2019-06-11T10:18:01.410Z",
"Owner": "1259",
"Status": "completed",
"Volume": "vol-2",
"Size": 4
}
]
I have this JSON file, which I'm trying to convert to CSV using the following command:
jq -r '. | map(.Description[], .Encrypted, .ID, .Progress, .Time, .Owner, .Status, .Volume, .Size | join(",")) | join("\n")' snapshots1.json
But I'm getting error:
jq: error (at snapshots1.json:24): Cannot iterate over string ("Copied for...)
I looked at a similar post, "jq: error: Cannot iterate over string", but can't figure out the error. Any help is appreciated.
jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv' snapshots1.json >> myfile.csv
Found this post that explains this code and it worked for me.
I think you were on the right track. Here is how I'd do it:
jq -r '.[] | map(..) | @csv' snapshot1.json > snapshot1.csv
There's a couple of small problems with your code:
.Description[] - Description is a string, not an array, so the square brackets don't work: there's nothing to iterate over.
Suppose we get rid of the square brackets. The code then works insofar as it puts the contents of the objects into an array, but it puts everything into one array, so your CSV will only have one line (and I'm assuming that you want each object on a separate row). This is because map collects all of its results into a single array (see the jq Manual), so you have to open up the input array first.
The leading dot (.) in your code doesn't do anything; it simply returns the whole JSON as is. If you want to play around with it, try .[] and experiment from there.
There's a risk in using .. here to extract the "values" in an object: what if the ordering of the keys in the input objects differs between objects?
Here's a generic filter which addresses this and other issues. It also emits a suitable "header" line:
def object2array(stream):
foreach stream as $x (null;
if . == null then $x | [true, keys_unsorted] else .[0]=false end;
(if .[0] then .[1] else empty end),
.[1] as $keys | $x | [getpath( $keys[] | [.]) ] );
Example
def data: [{a:1,b:2}, {b:22,a:11,c:0}];
object2array(data[])
produces:
["a","b"]
[1,2]
[11,22]
Just right for piping to @csv or @tsv.
Solution
So the solution to the original problem would essentially be:
object2array(.[]) | @csv
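For a self-contained run, the sketch below feeds the small two-object example from above through object2array and @csv. I've added parentheses around the second output expression for readability, which I believe doesn't change how it parses; the variable name csv is my own.

```shell
# -n: no input is read; the two example objects are given inline as a stream.
csv=$(jq -rn '
  def object2array(stream):
    foreach stream as $x (null;
      # State is [is_first_item, keys_of_first_object].
      if . == null then $x | [true, keys_unsorted] else .[0] = false end;
      # Emit the header once, then one row per object in header order.
      (if .[0] then .[1] else empty end),
      (.[1] as $keys | $x | [getpath($keys[] | [.])]));
  object2array({a:1,b:2}, {b:22,a:11,c:0}) | @csv
')

printf '%s\n' "$csv"
```

The header comes from the first object only, so the extra key c in the second object is dropped, and the second object's values are reordered to match the header.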

Building new JSON with JQ and bash

I am trying to create JSON from scratch using bash.
The final structure needs to be like:
{
"hosts": {
"a_hostname" : {
"ips" : [
1,
2,
3
]
},
{...}
}
}
First I'm creating an input file with the format:
hostname ["1.1.1.1","2.2.2.2"]
host-name2 ["3.3.3.3","4.4.4.4"]
This is being created by:
for host in $( ansible -i hosts all --list-hosts ) ; \
do echo -n "${host} " ; \
ansible -i hosts $host -m setup | sed '1c {' | jq -r -c '.ansible_facts.ansible_all_ipv4_addresses' ; \
done > hosts.txt
The key point here is that the IP list/array, is coming from a JSON file and being extracted by jq. This extraction outputs an already valid / quoted JSON array, but as a string in a txt file.
Next I'm using jq to parse the whole text file into the desired JSON:
jq -Rn '
{ "hosts": [inputs |
split("\\s+"; "g") |
select(length > 0 and .[0] != "") |
{(.[0]):
{ips:.[1]}
}
] | add }
' < ~/hosts.txt
This is almost correct, everything except for the IPs value which is treated as a string and quoted leading to:
{
"hosts": {
"hostname1": {
"ips": "[\"1.1.1.1\",\"2.2.2.2\"]"
},
"host-name2": {
"ips": "[\"3.3.3.3\",\"4.4.4.4\"]"
}
}
}
I'm now stuck at this final hurdle - how to insert the IPs without causing them to be quoted again.
Edit - quoting solved by using {ips: .[1] | fromjson} instead of {ips:.[1]}.
However, this was rendered moot by @CharlesDuffy's help suggesting converting to TSV.
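For reference, the fromjson fix from the edit can be sketched as follows. The two input lines mirror the hosts.txt format shown earlier; the inline printf input and the variable name fixed are assumptions for illustration. Note that this relies on the IP arrays containing no whitespace (which jq -c guarantees) and on a jq build with regex support.

```shell
# Two lines in the hosts.txt format: hostname, whitespace, compact JSON array.
fixed=$(printf '%s\n' 'hostname ["1.1.1.1","2.2.2.2"]' 'host-name2 ["3.3.3.3","4.4.4.4"]' |
  jq -Rnc '
    { hosts: [ inputs
               | split("\\s+"; "g")
               | select(length > 0 and .[0] != "")
               # fromjson parses the string back into a real JSON array.
               | { (.[0]): { ips: (.[1] | fromjson) } }
             ] | add }
  ')

printf '%s\n' "$fixed"
```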
Original Q body:
So far I've got to
jq -n {hosts:{}} | \
for host in $( ansible -i hosts all --list-hosts ) ; \
do jq ".hosts += {$host:{}}" | \
jq ".hosts.$host += {ips:[1,2,3]}" ; \
done ;
([1,2,3] is actually coming from a subshell but including it seemed unnecessary as that part works, and made it harder to read)
This sort of works, but there seem to be 2 problems.
1) The final output only has a single host in it, containing data from the first host in the list (this persists even if the second problem is bypassed):
{
"hosts": {
"host_1": {
"ips": [
1,
2,
3
]
}
}
}
2) One of the hostnames has a - in it, which causes syntax and compiler errors from jq. I'm stuck going around quote hell trying to get it to be interpreted but also quoted. Help!
Thanks for any input.
Let's say your input format is:
host_1 1 2 3
host_2 2 3 4
host-with-dashes 3 4 5
host-with-no-addresses
...re: edit specifying a different format: Add @tsv onto the JQ command producing the existing format to generate this one instead.
If you want to transform that to the format in question, it might look like:
jq -Rn '
{ "hosts": [inputs |
split("\\s+"; "g") |
select(length > 0 and .[0] != "") |
{(.[0]): .[1:]}
] | add
}' <input.txt
Which yields as output:
{
"hosts": {
"host_1": [
"1",
"2",
"3"
],
"host_2": [
"2",
"3",
"4"
],
"host-with-dashes": [
"3",
"4",
"5"
],
"host-with-no-addresses": []
}
}

Need help to parse and print only 'category' values either using jq or jsawk or shell script

Need help to parse and print only category values either using jq or jsawk or shell script.
{
"fine_grained": {
"dog": [
{
"category": "cocker spaniel",
"mark": 0.9958831668
}
]
},
"coarse": [
{
"category": "dog",
"mark": 0.948208034
}
]
}
Assuming all category values are simple strings and you want all category values, regardless of where it is in the JSON, you could use this filter using jq:
.. | objects.category // empty
This returns the following strings:
"cocker spaniel"
"dog"
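Here is a sketch of running that filter from the shell with the question's data inlined. I've written it as objects | .category, which should be equivalent to objects.category, and used -r to drop the quotes; the variable names json and cats are my own.

```shell
# The document from the question, inlined for a self-contained run.
json='{"fine_grained":{"dog":[{"category":"cocker spaniel","mark":0.9958831668}]},
       "coarse":[{"category":"dog","mark":0.948208034}]}'

# Visit every node, keep only objects, and print any non-null category value.
cats=$(printf '%s\n' "$json" | jq -r '.. | objects | .category // empty')

printf '%s\n' "$cats"
```

Objects without a category key yield null, which // empty discards, so only the two category strings are printed.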
Here is a solution which uses leaf_paths and select to find all the paths with a leaf "category" member and then extracts the corresponding values with foreach:
foreach (leaf_paths | select(.[-1] == "category")) as $p (
.
; .
; getpath($p)
)
If your input is in a file called input.json and the above filter is in a file called filter.jq then the shell command
jq -f filter.jq input.json
should produce
"cocker spaniel"
"dog"
You can use the -r flag if you don't want the quotes in the output.
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
(leaf_paths | select(.[-1] == "category")) as $p
| getpath($p)