Modify arrays within objects in jq - json

I have an array of objects and want to filter the array in each object's b property so that it keeps only the elements equal to that object's a property.
[
{
"a": 3,
"b": [
1,
2,
3
]
},
{
"a": 5,
"b": [
3,
5,
4,
3,
5
]
}
]
should produce
[
{
"a": 3,
"b": [
3
]
},
{
"a": 5,
"b": [
5,
5
]
}
]
Currently, I've arrived at
[.[] | (.a as $a | .b |= [.[] | select(. == $a)])]
That works, but I'm wondering if there's a better (shorter, more readable) way.

I can think of two ways to do this with less code and both are variants of what you have already figured out on your own.
map(.a as $a | .b |= map(select(. == $a)))
del(.[] | .a as $a | .b[] | select(. != $a))
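As a quick check, here is a minimal sketch that runs both variants against the sample data from the question. The variable names (input, keep, drop) are only illustrative, and the input is the question's array in compact form.

```shell
# Sample input from the question, compacted onto one line.
input='[{"a":3,"b":[1,2,3]},{"a":5,"b":[3,5,4,3,5]}]'

# Variant 1: rebuild each .b, keeping only the elements equal to .a.
keep=$(printf '%s\n' "$input" | jq -c 'map(.a as $a | .b |= map(select(. == $a)))')

# Variant 2: delete every element of .b that differs from .a.
drop=$(printf '%s\n' "$input" | jq -c 'del(.[] | .a as $a | .b[] | select(. != $a))')

printf '%s\n' "$keep" "$drop"
```

Both commands should print the same array. The map version states what to keep, while the del version states what to remove.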

Related

jq - Get a higher level key after a selection

Given a JSON like the following:
{
"data": [{
"id": "1a2b3c",
"info": {
"a": {
"number": 0
},
"b": {
"number": 1
},
"c": {
"number": 2
}
}
}]
}
I want to select on a number that is greater than or equal to 2 and for that selection I want to return the values of id and number. I did this like so:
$ jq -r '.data[] | .id as $ID | .info[] | select(.number >= 2) | [$ID, .number]' in.json
[
"1a2b3c",
2
]
Now I would also like to return a higher level key for my selection, in my case I need to return c. How can I accomplish this?
Assuming you want the string "c" instead of 2 in the output, this will work:
$ jq '.data[] | .id as $ID | .info | to_entries[] | select(.value.number >= 2) | [$ID, .key]' input.json
[
"1a2b3c",
"c"
]
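To see why to_entries helps here, the following sketch prints the intermediate key/value list for the info object alone. The info, entries and names variables are made up for the illustration.

```shell
# Only the "info" object from the question's input.
info='{"a":{"number":0},"b":{"number":1},"c":{"number":2}}'

# to_entries turns an object into a list of {key, value} pairs,
# so the key stays available after select has filtered on the value.
entries=$(printf '%s\n' "$info" | jq -c 'to_entries')
names=$(printf '%s\n' "$info" | jq -r 'to_entries[] | select(.value.number >= 2) | .key')

printf '%s\n' "$entries" "$names"
```

Once each entry carries its own key, nothing has to be saved in a variable before the selection.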

jq - parse structure and save values in bash variable

I have the following JSON input:
{
"unique": 1924,
"coordinates": [
{
"time": "2015-01-25T00:00:01.683",
"xyz": [
{
"z": 4,
"y": 2,
"x": 1,
"id": 99,
"inner_arr" : [
{
"a": 1,
"b": 2
},
{
"a": 3,
"b": 4
}
]
},
{
"z": 9,
"y": 9,
"x": 8,
"id": 100,
"inner_arr" : [
{
"a": 1,
"b": 2
},
{
"a": 3,
"b": 4
}
]
},
{
"z": 9,
"y": 6,
"x": 10,
"id": 101,
"inner_arr" : [
{
"a": 1,
"b": 2
},
{
"a": 3,
"b": 4
}
]
}
]
},
{
"time": "2015-01-25T00:00:02.790",
"xyz": [
{
"z": 0,
"y": 3,
"x": 7,
"id": 99,
"inner_arr" : [
{
"a": 1,
"b": 2
},
{
"a": 3,
"b": 4
}
]
},
{
"z": 4,
"y": 6,
"x": 2,
"id": 100,
"inner_arr" : [
{
"a": 1,
"b": 2
},
{
"a": 3,
"b": 4
}
]
},
{
"z": 2,
"y": 9,
"x": 51,
"id": 101,
"inner_arr" : [
{
"a": 1,
"b": 2
},
{
"a": 3,
"b": 4
}
]
}
]
}
]
}
I want to parse this input with jq and store values in bash arrays:
#!/bin/bash
z=()
x=()
y=()
id=()
a=()
b=()
jq --raw-output '.coordinates[] | .xyz[] | (.z) as $z, (.y) as $y, (.x) as $x, (.id) as $id, .inner_arr[].a as $a, .inner_arr[].b as $b | $z, $y, $x, $id, $a, $b' <<< "$input"
echo -e "${z}"
Expected output for above echo command:
4
9
9
0
4
2
echo -e "${a}"
Expected output for above echo command:
1
3
1
3
1
3
1
3
1
3
1
3
How can I do it with jq with a single jq call looping through all arrays in a cascading fashion?
I want to save CPU by calling jq just once and extract all single or array values.
You cannot set environment variables directly from jq (cf. manual). What you can do is generate a series of bash declarations for the declare builtin. I suggest storing the declarations in an intermediate bash array (with mapfile) that is passed directly to declare, so that you can stay away from hazardous commands like eval.
mapfile -t < <(
jq --raw-output '
def m(exp): first(.[0] | path(exp)[-1]) + "=(" + (map(exp) | @sh) + ")";
[ .coordinates[].xyz[] ]
| m(.x), m(.y), m(.z), m(.id), m(.inner_arr[].a), m(.inner_arr[].b)
' input
)
declare -a "${MAPFILE[@]}"
The jq script packs all xyz objects in a single array and filters it with the m function for each field represented as a path expression. The function returns a string formatted as field=(val1 val2... valN), where the field name is the last component of the path expression, i.e. x for .x and a for .inner_arr[].a (extracted on the first item of the array).
Then you can check the shell variables with declare -p var or ${var[@]}. Note that ${var} refers to the first element only.
declare -p MAPFILE
declare -p z
echo a: "${a[@]}" / size = ${#a[@]}
declare -a MAPFILE=([0]="x=(1 8 10 7 2 51)" [1]="y=(2 9 6 3 6 9)" [2]="z=(4 9 9 0 4 2)" [3]="id=(99 100 101 99 100 101)" [4]="a=(1 3 1 3 1 3 1 3 1 3 1 3)" [5]="b=(2 4 2 4 2 4 2 4 2 4 2 4)")
declare -a z=([0]="4" [1]="9" [2]="9" [3]="0" [4]="4" [5]="2")
a: 1 3 1 3 1 3 1 3 1 3 1 3 / size = 12
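The following sketch runs the same m helper on a cut-down input so that the generated declaration strings can be inspected before they are handed to declare. The small variable and its label field are hypothetical and were added only to show how @sh quotes string values.

```shell
# Hypothetical cut-down input: two xyz objects, each with a number and a string.
small='{"coordinates":[{"xyz":[{"x":1,"label":"first point"},{"x":8,"label":"second"}]}]}'

# Same helper as above: the last path component becomes the variable name,
# and @sh renders the collected values as shell-quoted words.
decls=$(printf '%s\n' "$small" | jq -r '
  def m(exp): first(.[0] | path(exp)[-1]) + "=(" + (map(exp) | @sh) + ")";
  [ .coordinates[].xyz[] ] | m(.x), m(.label)
')

printf '%s\n' "$decls"
# x=(1 8)
# label=('first point' 'second')
```

Numbers are emitted bare and strings are single-quoted, so a value containing spaces still ends up as one array element.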

JQ - Denormalize nested object

I've been trying to convert some JSON to csv and I have the following problem:
I have the following input json:
{"id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }
{"id": 200, "a": [{"t": 2, "c" : 3 }] }
{"id": 300, "a": [{"t": 1, "c" : 3 }] }
And I expect the following CSV output:
id,t1,t2
100,2,3
200,,3
300,3,
Unfortunately, jq produces no output at all if one of the select calls has no match.
Example:
echo '{ "id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }' | jq '{t1: (.a[] | select(.t==1)).c , t2: (.a[] | select(.t==2)).c }'
output:
{ "t1": 2, "t2": 3 }
but if one of the select calls returns no match, jq doesn't output anything at all.
Example:
echo '{ "id": 100, "a": [{"t" : 1,"c" : 2 }] }' | jq '{t1: (.a[] | select(.t==1)).c , t2: (.a[] | select(.t==2)).c }'
Expected output:
{ "t1": 2, "t2": null }
Does anyone know how to achieve this with JQ?
EDIT:
Based on a comment made by @peak I found the solution that I was looking for.
jq -r '["id","t1","t2"],[.id, (.a[] | select(.t==1)).c//null, (.a[] | select(.t==2)).c//null ]|@csv'
The alternative operator does exactly what I was looking for.
Alternative Operator
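A small sketch of the difference, using the one-element input from the failing example. The row, without and with_alt variables are illustrative names, and the parentheses around each // expression are there to keep the object construction unambiguous.

```shell
# Input with a match for t==1 but none for t==2.
row='{"id":100,"a":[{"t":1,"c":2}]}'

# Without //: the second select yields nothing, so no object is produced.
without=$(printf '%s\n' "$row" | jq -c '{t1: (.a[] | select(.t==1)).c, t2: (.a[] | select(.t==2)).c}')

# With //: an empty (or null/false) left-hand side falls back to null.
with_alt=$(printf '%s\n' "$row" | jq -c '{t1: ((.a[] | select(.t==1)).c // null), t2: ((.a[] | select(.t==2)).c // null)}')

printf 'without: [%s]\nwith: %s\n' "$without" "$with_alt"
```

Keep in mind that // also replaces values that are false or null, which is harmless here because the fallback is null anyway.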
Here's a simple solution that does not assume anything about the ordering of the items in the .a array, and easily generalizes to arbitrarily many .t values:
# Convert an array of {t, c} to a dictionary:
def tod: map({(.t|tostring): .c}) | add;
["id", "t1", "t2"], # header
(inputs
| (.a | tod) as $dict
| [.id, (range(1;3) as $i | $dict[$i|tostring]) ])
| @csv
Command-line options
Use the -n option (because inputs is being used), and the -r option (to produce CSV).
This is an absolute mess, but it works:
$ cat tmp.json
{"id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }
{"id": 200, "a": [{"t": 2, "c" : 3 }] }
{"id": 300, "a": [{"t": 1, "c" : 3 }] }
$ cat filter.jq
def t(id):
.a |
map({key: "t\(.t)", value: .c}) |
# id comes first so that the row order matches the sorted header keys
({id:id, t1:null, t2:null} | to_entries) + . | from_entries
;
inputs |
map(.id as $id | t($id)) |
(.[0] | keys) as $hdr |
([$hdr] + map(to_entries |map(.value)))[]|
@csv
$ jq -rn --slurp -f filter.jq tmp.json
"id","t1","t2"
100,2,3
200,,3
300,3,
In short, you produce a direct object containing the values from your input, then add it to a "default" object to fill in the missing keys.
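The default-filling idea can be sketched in isolation with plain object addition, which is a simpler variant of the to_entries/from_entries combination used above. The literal values are made up, and merged is only an illustrative variable name.

```shell
# Right-hand keys override the defaults; keys missing on the right keep null.
merged=$(jq -cn '{id: null, t1: null, t2: null} + {id: 200, t2: 3}')

printf '%s\n' "$merged"
# {"id":200,"t1":null,"t2":3}
```

The key order of the result follows the default object on the left, which is convenient when the values are later turned into a CSV row.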

How to generate continuing indices for multiple objects in nested arrays that are in an array

Given
[{
"objects": [{
"key": "value"
},{
"key": "value"
}]
}, {
"objects": [{
"key": "value"
}, {
"key": "value"
}]
}]
How do I generate
[{
"objects": [{
"id": 0,
"key": "value"
},{
"id": 1,
"key": "value"
}]
}, {
"objects": [{
"id": 2,
"key": "value"
}, {
"id": 3,
"key": "value"
}]
}]
Using jq?
I tried to use this one, but ids are all 0:
jq '[(-1) as $i | .[] | {objects: [.objects[] | {id: ($i + 1 as $i | $i), key}]}]'
The key to a simple solution here is to break the problem down into easy pieces. This can be accomplished by defining a helper function, addId/1. Once that is done, the rest is straightforward:
# starting at start, add {id: ID} to each object in the input array
def addId(start):
reduce .[] as $o
([];
length as $l
| .[length] = ($o | (.id = start + $l)));
reduce .[] as $o
( {start: -1, answer: []};
(.start + 1) as $next
| .answer += [$o | (.objects |= addId($next))]
| .start += ($o.objects | length) )
| .answer
Inspired by @peak's answer, I came up with this solution. It is not much different: it just generates the IDs in a shorter way and uses foreach instead of reduce, since an intermediate result is involved.
def addIdsStartWith($start):
to_entries | map((.value.id = .key + $start) | .value);
[foreach .[] as $set (
{start: 0};
.set = $set |
.start as $start | .set.objects |= addIdsStartWith($start) |
.start += ($set.objects | length);
.set
)]
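The difference between reduce and foreach that both answers rely on can be sketched with plain numbers. Here the hypothetical group sizes 2, 2 and 3 stand in for the lengths of the objects arrays, and total and starts are illustrative names.

```shell
# reduce keeps only the final state: the total number of objects.
total=$(jq -cn '[2, 2, 3] | reduce .[] as $n (0; . + $n)')

# foreach emits something at every step: here, the first id of each group.
starts=$(jq -cn '[foreach (2, 2, 3) as $n ({start: 0, next: 0};
                  .start = .next | .next += $n;
                  .start)]')

printf '%s\n' "$total" "$starts"
# 7
# [0,2,4]
```

The running offset is exactly the state that has to be carried from one objects array to the next in order to get continuing ids.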

Swap keys in nested objects using JQ

Using jq, how can I transform:
{ "a": {"b": 0}, "c": {"d": 1}}
into:
{"b": {"a": 0}, "d": {"c": 1}}
without knowing the name of the keys in the source?
(I know that this can lose data in the general case, but not with my data)
Here's an alternative using with_entries:
with_entries(.key as $parent
| (.value|keys[0]) as $child
| {
key: $child,
value: { ($parent): .value[$child] }
}
)
def swapper:
. as $in
| reduce keys[] as $key
( {}; . + ( $in[$key] as $o
| ($o|keys[0]) as $innerkey
| { ($innerkey): { ($key): $o[$innerkey] } } ) ) ;
Example:
{ "a": {"b": 0}, "c": {"d": 1}} | swapper
produces:
{"b":{"a":0},"d":{"c":1}}
Here is a solution which uses jq streams and variables:
[
. as $d
| keys[]
| $d[.] as $v
| ($v|keys[]) as $vkeys
| {
($vkeys): {
(.): ($vkeys|$v[.])
}
}
] | add
It is easy to lose track of what is what at the end, so here is a slightly expanded version with additional comments and variables that shows more clearly what's going on.
[
. as $d # $d: {"a":{"b":0},"c":{"d": 1}}
| keys[] | . as $k # $k: "a", "c"
| $d[$k] as $v # $v: {"b": 0}, {"d": 1}
| ($v|keys[]) as $vkeys # $vkeys: "b", "d"
| ($vkeys|$v[.]) as $vv # $vv: 0, 1
| {
($vkeys): { # "b": { "d": {
($k): $vv # "a": 0 "c": 1
} # } , }
}
] | add
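As a sanity check, this sketch runs the with_entries variant from above on the question's input and on a second, made-up input in which two inner keys collide, which is the data-loss case the question mentions. The filter, swapped and collided variables are illustrative names.

```shell
# The with_entries variant, kept in a variable so it can be reused.
filter='with_entries(.key as $parent
  | (.value | keys[0]) as $child
  | {key: $child, value: {($parent): .value[$child]}})'

# Input from the question: every inner key is unique.
swapped=$(printf '%s\n' '{"a":{"b":0},"c":{"d":1}}' | jq -c "$filter")

# Hypothetical input where both inner keys are "x": the later entry wins.
collided=$(printf '%s\n' '{"a":{"x":0},"c":{"x":1}}' | jq -c "$filter")

printf '%s\n' "$swapped" "$collided"
# {"b":{"a":0},"d":{"c":1}}
# {"x":{"c":1}}
```

With unique inner keys the swap is lossless. With colliding keys only the last parent survives, so the filter is only safe for data shaped like the question's.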