Concat 2 arrays inside object based on object key/value - json

I have multiple JSON objects, and I want to end up with fewer of them by merging the list arrays of any objects whose name key holds the same value. I'm trying to accomplish this with jq.
I think I have to use group_by(.name) first to group matching keys. I'm also using slurp to first wrap all the objects into one big array.
I don't have anything working yet.
Given:
{
  "name": "a",
  "list": [ "a1", "a2" ]
}
{
  "name": "a",
  "list": [ "a3", "a4" ]
}
{
  "name": "b",
  "list": [ "b1", "b2" ]
}
should result in:
{
  "name": "a",
  "list": [ "a1", "a2", "a3", "a4" ]
}
{
  "name": "b",
  "list": [ "b1", "b2" ]
}

You can use reduce like this:
$ jq -c -n 'reduce inputs as $p ({}; .[$p.name] |= { name : $p.name, list : (.list + $p.list) }) | .[]' file
{"name":"a","list":["a1","a2","a3","a4"]}
{"name":"b","list":["b1","b2"]}

Here's a simple and efficient solution that uses a common "aggregate by" technique:
reduce inputs as $kv ({}; .[$kv.name] += $kv.list)
| keys_unsorted[] as $k
| {name: $k, list: .[$k]}
Since inputs has been used here, the -n command-line option of jq should be specified.
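For instance, assuming the three objects above are stored in a file named input.json (a hypothetical name), the invocation would look like:
$ jq -cn 'reduce inputs as $kv ({}; .[$kv.name] += $kv.list)
          | keys_unsorted[] as $k
          | {name: $k, list: .[$k]}' input.json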

Related

How can I merge matching keys into arrays via another key?

I have a GraphQL schema file with deeply nested object metadata that I'd like to extract into arrays of child properties. The original file is over 75,000 lines long, but I was able to extract the types and fields for each object using this command:
jq '.data.__schema.types[] | {name: .name, fields: .fields[]?.name?}' schema.json > output.json
Output:
{
  "name": "UsersConnection",
  "fields": "nodes"
}
{
  "name": "UsersConnection",
  "fields": "edges"
}
{
  "name": "UsersConnection",
  "fields": "pageInfo"
}
{
  "name": "UsersConnection",
  "fields": "totalCount"
}
{
  "name": "UsersEdge",
  "fields": "cursor"
}
{
  "name": "UsersEdge",
  "fields": "node"
}
...
But the output I want looks more like this:
[{
  "name": "UsersConnection",
  "fields": [ "nodes", "edges", "pageInfo", "totalCount" ]
},
{
  "name": "UsersEdge",
  "fields": [ "cursor", "node" ]
}]
I was able to produce this by manually comma-separating the objects, wrapping the output in { "data": [ -OUTPUT- ] }, and then running:
jq 'map(. |= (group_by(.name) | map(first + {fields: map(.fields)})))' output.json > output2.json
How can I do this with a single command?
Assuming .data.__schema.types is an array, and so is .fields, you could try map in both cases:
.data.__schema.types | map({name: .name, fields: (.fields | map(.name))})
I totally missed that I could simply put the fields expression inside brackets to collect the values into an array, like this:
jq '.data.__schema.types[] | {name: .name, fields: [.fields[]?.name?]}'
Keeping this up for posterity in case someone else is trying to do the same thing
Update: I was able to get a cleaner, comma-separated result like this:
jq 'reduce .data.__schema.types[] as $d (null; .[$d.name] += [$d.fields[]?.name?])'
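Note that this reduce yields a single object keyed by type name rather than an array of {name, fields} objects. If you need the latter shape, one way (a sketch) is to append a to_entries conversion:
jq '[reduce .data.__schema.types[] as $d (null; .[$d.name] += [$d.fields[]?.name?])
     | to_entries[] | {name: .key, fields: .value}]' schema.json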

Aggregate json arrays from multiple files using jq, grouping by key

I would like to merge two or more files into a single JSON document, aggregating the arrays that appear under the same key.
file1.json
{
  "shapes": [
    {
      "id": "1",
      "name": "circle"
    },
    {
      "id": "2",
      "name": "square"
    }
  ]
}
file2.json
{
  "shapes": [
    {
      "id": "3",
      "name": "triangle"
    }
  ]
}
Expected result:
{
  "shapes": [
    {
      "id": "1",
      "name": "circle"
    },
    {
      "id": "2",
      "name": "square"
    },
    {
      "id": "3",
      "name": "triangle"
    }
  ]
}
I can do this with the following jq command:
jq -s '{shapes: map(.shapes)|add }' file*.json
But this requires me to know the shapes attribute and hardcode it. Is there a simple way I can get the same result without ever using the key name explicitly?
Here is a solution that’s suitable when each top-level object has only one key, and that is both efficient and conceptually simple. It assumes jq is invoked with the -n option.
reduce inputs as $in (null;
  ($in|keys_unsorted[0]) as $k | { ($k): (.[$k] + $in[$k]) })
or slightly more compactly:
reduce inputs as $in (null; ($in|keys_unsorted[0]) as $k | .[$k] += $in[$k] )
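For example, with file1.json and file2.json as above, a sketch of the invocation:
$ jq -n 'reduce inputs as $in (null;
           ($in|keys_unsorted[0]) as $k | .[$k] += $in[$k])' file1.json file2.json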
Here is a solution that also solves a more general problem: first, it handles arbitrarily many input files; and second, it forms the "sum" by key, for every key, on the assumption that every top-level key is array-valued.
The generic function:
# the values at each key are assumed to be arrays
def aggregate(stream):
  reduce stream as $o ({};
    reduce ($o|keys_unsorted[]) as $k (.;
      .[$k] += $o[$k] ));
To avoid "slurping", we will use inputs:
aggregate(inputs)
The invocation must therefore use the -n command-line option:
jq -n -f program.jq *.json
Try the following; it can handle any number of files. All inputs are assumed to be JSON objects whose values are all arrays. The arrays are grouped by key and aggregated, and the output is a single object mapping each key to its aggregated array.
jq -s 'map(to_entries) | add | group_by(.key)
       | map({ key: .[0].key, value: (map(.value) | add) })
       | from_entries' file1.json file2.json
For your sample input this gives:
{
  "shapes": [
    {
      "id": "1",
      "name": "circle"
    },
    {
      "id": "2",
      "name": "square"
    },
    {
      "id": "3",
      "name": "triangle"
    }
  ]
}

Update one JSON file's values with values from another JSON file using jq (at all levels)

I have two JSON files:
source.json:
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key3": "z-z-z-z-z-z-z-z",
      "key4": "w-w-w-w-w-w-w-w"
    },
    "another": {
      "key": "123456",
      "comments": {
        "one": "111",
        "other": "222"
      }
    }
  },
  "title": "The best"
}
and the target.json:
{
  "general": {
    "level1": {
      "key1": "xxxxxxxx",
      "key2": "yyyyyyyy",
      "key3": "zzzzzzzz"
    },
    "onemore": {
      "kkeeyy": "0000000"
    }
  },
  "specific": {
    "stuff": "test"
  },
  "title": {
    "one": "one title",
    "other": "other title"
  }
}
I need all the values for keys which exist in both files to be copied from source.json to target.json, at every level.
I've seen and tested the solution from this post. It only copies the first level of keys, and I couldn't get it to do what I need. The result of the solution from that post looks like this:
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key3": "z-z-z-z-z-z-z-z",
      "key4": "w-w-w-w-w-w-w-w"
    },
    "another": {
      "key": "123456",
      "comments": {
        "one": "111",
        "other": "222"
      }
    }
  },
  "specific": {
    "stuff": "test"
  },
  "title": "The best"
}
Everything under the "general" key was copied as is.
What I need, is this:
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key2": "yyyyyyyy",
      "key3": "z-z-z-z-z-z-z-z"
    },
    "onemore": {
      "kkeeyy": "0000000"
    }
  },
  "specific": {
    "stuff": "test"
  },
  "title": {
    "one": "one title",
    "other": "other title"
  }
}
Only "key1" and "key3" should be copied.
Keys in target JSON must not be deleted and new keys should not be created.
Can anyone help?
One approach you could take is to get all the paths to scalar values in each input, take the set intersection, and then copy the values from source to target along those paths.
First we'll need an intersect function (which was surprisingly difficult to craft):
def set_intersect($other):
  (map({ ($other[] | tojson): true }) | add) as $o
  | reduce (.[] | tojson) as $v ({}; if $o[$v] then .[$v] = true else . end)
  | keys_unsorted
  | map(fromjson);
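As a quick sanity check of set_intersect on its own (a sketch, with the def prepended to the program):
$ jq -nc '[[1,2],[3]] | set_intersect([[4],[1,2]])'
[[1,2]]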
Then to do the update:
$ jq --argfile s source.json '
  reduce ([paths(scalars)] | set_intersect([$s | paths(scalars)])[]) as $p (.;
    setpath($p; $s | getpath($p))
  )
' target.json
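Note that recent jq releases deprecate --argfile. If yours warns about it, a --slurpfile variant (which wraps the file's contents in an array, hence the $s[0]) might look like this, again with set_intersect defined as above:
$ jq --slurpfile s source.json '
  $s[0] as $src
  | reduce ([paths(scalars)] | set_intersect([$src | paths(scalars)])[]) as $p (.;
      setpath($p; $src | getpath($p))
    )
' target.json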
[Note: this response answers the original question, with respect to the original data. The OP may have had paths in mind rather than keys.]
There is no need to compute the intersection to achieve a reasonably efficient solution.
First, let's hypothesize the following invocation of jq:
jq -n --argfile source source.json --argfile target target.json -f copy.jq
In the file copy.jq, we can begin by defining a helper function:
# emit an array of the distinct terminal keys in the input entity
def keys: [paths | .[-1] | select(type=="string")] | unique;
In order to inspect all the paths to leaf elements of $source, we can use tostream:
($target | keys) as $t
| reduce ($source|tostream|select(length==2)) as [$p,$v]
    ($target;
      if $t|index($p[-1]) then setpath($p; $v) else . end)
Alternatives
Since $t is sorted, it would (at least in theory) make sense to use bsearch instead of index:
bsearch($p[-1]) > -1
Also, instead of tostream we could use paths(scalars).
Putting these alternatives together:
($target | keys) as $t
| reduce ($source|paths(scalars)) as $p
    ($target;
      if $t|bsearch($p[-1]) > -1
      then setpath($p; $source|getpath($p))
      else . end)
Output
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key2": "yyyyyyyy",
      "key3": "z-z-z-z-z-z-z-z"
    },
    "onemore": {
      "kkeeyy": "0000000"
    }
  },
  "specific": {
    "stuff": "test"
  }
}
The following provides a solution to the revised question, which is actually about "paths" rather than "keys".
([$target|paths(scalars)] | unique) as $paths
| reduce ($source|paths(scalars)) as $p
    ($target;
      if $paths | bsearch($p) > -1
      then setpath($p; $source|getpath($p))
      else . end)
unique is called so that binary search can be used subsequently.
Invocation:
jq -n --argfile source source.json --argfile target target.json -f program.jq
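For the example files in the question, this produces exactly the desired result: only key1 and key3 are overwritten, and everything else in target.json is left untouched.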

jq: translate array of objects to object

I have a response from curl in a format like this:
[
  {
    "list": [
      {
        "value": 1,
        "id": 12
      },
      {
        "value": 15,
        "id": 13
      },
      {
        "value": -4,
        "id": 14
      }
    ]
  },
  ...
]
Given a mapping between ids like this:
{
  "12": "newId1",
  "13": "newId2",
  "14": "newId3"
}
I want to make this:
[
  {
    "list": {
      "newId1": 1,
      "newId2": 15,
      "newId3": -4
    }
  },
  ...
]
Such that I get a mapping from ids to values (and along the way I'd like to remap the ids).
I've been working at this for a while, and every time I hit a dead end.
Note: I can use shell or the like to perform loops if necessary.
Edit: Here's one version of what I've developed so far:
jq '[].list.id = ($mapping.[] | select(.id == key)) | del(.id)' -M --argjson "mapping" "$mapping"
I don't think it's the best one, but I'm looking to see if I can find an old version that was closer to what I need.
[EDIT: The following response was in answer to the question when it described (a) the mapping as shown below, and (b) the input data as having the form:
[
  {
    "list": [
      {
        "value": 1,
        "id1": 12
      },
      {
        "value": 15,
        "id2": 13
      },
      {
        "value": -4,
        "id3": 14
      }
    ]
  }
]
END OF EDIT]
In the following I'll assume that the mapping is available via the following function, but that is an inessential assumption:
def mapping: {
    "id1": "newId1",
    "id2": "newId2",
    "id3": "newId3"
  };
The following jq filter will then produce the desired output:
map( .list
     |= (map( to_entries[]
              | (mapping[.key]) as $mapped
              | select($mapped)
              | {($mapped|tostring): .value} )
         | add) )
There's plenty of ways to skin a cat. I'd do it like this:
.[].list |= reduce .[] as $i ({};
  ($i.id|tostring) as $k
  | (select($mapping | has($k))[$mapping[$k]] = $i.value) // .
)
You would just provide the mapping through a separate file or argument.
$ cat program.jq
.[].list |= reduce .[] as $i ({};
  ($i.id|tostring) as $k
  | (select($mapping | has($k))[$mapping[$k]] = $i.value) // .
)
$ cat mapping.json
{
  "12": "newId1",
  "13": "newId2",
  "14": "newId3"
}
$ jq --argfile mapping mapping.json -f program.jq input.json
[
  {
    "list": {
      "newId1": 1,
      "newId2": 15,
      "newId3": -4
    }
  }
]
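If your jq version warns that --argfile is deprecated, you could pass the mapping with --slurpfile instead; since --slurpfile wraps the file's contents in an array, the program would bind it first (a sketch):
$ jq --slurpfile m mapping.json '
  $m[0] as $mapping
  | .[].list |= reduce .[] as $i ({};
      ($i.id|tostring) as $k
      | (select($mapping | has($k))[$mapping[$k]] = $i.value) // .)
' input.json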
Here is a reduce-free solution to the revised problem.
In the following I'll assume that the mapping is available via the following function, but that is an inessential assumption:
def mapping:
{
"12": "newId1",
"13": "newId2",
"14": "newId3"
} ;
map( .list
     |= (map( mapping[.id|tostring] as $mapped
              | select($mapped)
              | {($mapped): .value} )
         | add) )
The "select" is for safety (i.e., it checks that the .id under consideration is indeed mapped). It might also be appropriate to ensure that $mapped is a string by writing {($mapped|tostring): .value}.

Select or exclude multiple objects with an array of IDs

I have the following JSON:
[
  {
    "id": "1",
    "foo": "bar-a",
    "hello": "world-a"
  },
  {
    "id": "2",
    "foo": "bar-b",
    "hello": "world-b"
  },
  {
    "id": "10",
    "foo": "bar-c",
    "hello": "world-c"
  },
  {
    "id": "42",
    "foo": "bar-d",
    "hello": "world-d"
  }
]
And I have the following array stored in a variable: ["1", "2", "56", "1337"] (note that the IDs are strings and may contain any regular character).
So, thanks to this SO post, I found a way to filter my original data: jq '[.[] | select(.id == ("1", "2", "56", "1337"))]' ./data.json (note the array is surrounded by parentheses and not brackets). This produces:
[
  {
    "id": "1",
    "foo": "bar-a",
    "hello": "world-a"
  },
  {
    "id": "2",
    "foo": "bar-b",
    "hello": "world-b"
  }
]
But I would also like to do the opposite (basically excluding IDs instead of selecting them). Using select(.id != ("1", "2", "56", "1337")) doesn't work, and jq '[. - [.[] | select(.id == ("1", "2", "56", "1337"))]]' ./data.json seems very ugly, and it doesn't work with my actual data (the output of aws ec2 describe-instances) anyway.
So, do you have any idea how to do that? Thank you!
To include them, you need to verify that the id matches any of the values in the keep set.
$ jq --argjson include '["1", "2", "56", "1337"]' 'map(select(.id == $include[]))' ...
To exclude them, you need to verify that the id matches none of the values in the excluded set. But it might just be easier to take the original array and remove the items that are in the excluded set.
$ jq --argjson exclude '["1", "2", "56", "1337"]' '. - map(select(.id == $exclude[]))' ...
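With jq 1.6 or later you could also use the IN builtin, which states the membership test directly (a sketch):
$ jq --argjson ids '["1", "2", "56", "1337"]' 'map(select(.id | IN($ids[])))' ./data.json
$ jq --argjson ids '["1", "2", "56", "1337"]' 'map(select(.id | IN($ids[]) | not))' ./data.json
The first keeps only the objects whose id is in $ids; the second keeps the rest.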
Here is a solution that uses inside. Assuming you run jq as
jq -M --argjson IDS '["1","2","56","1337"]' -f filter.jq data.json
This filter.jq
map( select([.id] | inside($IDS)) )
produces the objects from data.json whose ids are in the $IDS array:
[
  {
    "id": "1",
    "foo": "bar-a",
    "hello": "world-a"
  },
  {
    "id": "2",
    "foo": "bar-b",
    "hello": "world-b"
  }
]
and this filter.jq
map( select([.id] | inside($IDS) | not) )
produces the objects from data.json whose ids are not in the $IDS array:
[
  {
    "id": "10",
    "foo": "bar-c",
    "hello": "world-c"
  },
  {
    "id": "42",
    "foo": "bar-d",
    "hello": "world-d"
  }
]