Merge two complex JSON objects with arrays - json

I have the following two JSON documents as input:
{
"one": {
"vars": [
{
"name": "a",
"value": "a"
},
{
"name": "b",
"value": "b"
}
]
},
"two": {
"vars": [
{
"name": "c",
"value": "c"
},
{
"name": "d",
"value": "d"
}
]
},
"extras": "whatever"
}
{
"one": {
"vars": [
{
"name": "e",
"value": "e"
},
{
"name": "f",
"value": "f"
}
]
},
"two": {
"vars": [
{
"name": "g",
"value": "g"
},
{
"name": "h",
"value": "h"
}
]
}
}
I'd like to merge them to obtain the following result, where the vars arrays of each section are merged together:
{
"one": {
"vars": [
{
"name": "a",
"value": "a"
},
{
"name": "b",
"value": "b"
},
{
"name": "e",
"value": "e"
},
{
"name": "f",
"value": "f"
}
]
},
"two": {
"vars": [
{
"name": "c",
"value": "c"
},
{
"name": "d",
"value": "d"
},
{
"name": "g",
"value": "g"
},
{
"name": "h",
"value": "h"
}
]
},
"extras": "whatever"
}
Ideally, but not mandatory:
the keys (here one and two) would be arbitrary, and any number of them could be present.
the vars array would not contain duplicates (based on name), and right precedence would be applied, so that values from the second array override those from the first.
I managed to merge the two objects and only one array with the following command, but the key is hardcoded and I'm a bit stuck from there:
jq -s '.[0].one.vars=([.[].one.vars]|flatten)|.[0]' file1.json file2.json

First, here is a solution which is oblivious to the top-level key names, but which does not attempt to avoid duplicates:
$A
| reduce keys_unsorted[] as $k (.;
if .[$k] | (type == "object") and has("vars")
then (.[$k]|.vars) += ($B[$k]|.vars) else . end )
Here, of course, $A and $B refer to the two objects. You can set $A and $B in several ways, for example with jq's --argjson option.
If you want to reorder the top-level keys, you can simply extend the above with a filter specifying the order, e.g.: {extras, two, one}.
To avoid duplicates, I'd suggest writing a helper function to do just that, as illustrated in the following section.
Avoiding duplicates
def extend(stream):
reduce stream as $s (.;
(map(.name) | index($s|.name)) as $i
| if $i then .[$i] += $s
else . + [$s]
end) ;
$A
| reduce keys_unsorted[] as $k (.;
if .[$k] | (type == "object") and has("vars")
then (.[$k].vars) = ( .[$k].vars | extend(($B[$k].vars[])))
else . end
)
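For readers who want to cross-check the two filters above outside jq, here is a minimal Python sketch of the same merge logic. It is an illustration only, not part of the original answer: the function names merge_vars and merge_sections are hypothetical, and it assumes every vars entry is an object with a name key. Unlike the jq version, it treats a section that is missing from the second object as having no vars instead of raising an error.

```python
import copy


def merge_vars(left, right):
    """Merge two lists of {"name": ..., "value": ...} dicts.

    Entries are matched on "name"; a matching entry from `right` updates
    the existing one (right precedence), mirroring `.[$i] += $s` in jq.
    """
    merged = [dict(item) for item in left]
    for item in right:
        names = [entry["name"] for entry in merged]
        if item["name"] in names:
            merged[names.index(item["name"])].update(item)
        else:
            merged.append(dict(item))
    return merged


def merge_sections(a, b):
    """Merge the "vars" list of every top-level section of a with b's."""
    result = copy.deepcopy(a)
    for key, section in result.items():
        # Only touch sections that are objects with a "vars" key.
        if isinstance(section, dict) and "vars" in section:
            other = b.get(key, {}).get("vars", [])
            section["vars"] = merge_vars(section["vars"], other)
    return result


a = {"one": {"vars": [{"name": "a", "value": "a"}]}, "extras": "whatever"}
b = {"one": {"vars": [{"name": "a", "value": "A"}, {"name": "e", "value": "e"}]}}
print(merge_sections(a, b))
```

As in the jq helper, top-level keys that are not objects with a vars array (such as extras) are copied through unchanged.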

jq -n 'input as $b | input
| .one.vars |= . + $b.one.vars
| .two.vars |= . + $b.two.vars' file2.json file1.json
file1.json must come after file2.json: the second input is the object that gets updated and output, so this order preserves extras.

Related

How to make jq to pick name value pairs

This might be more or less the same question as How to get JQ name/value pair from nested (array?) response?, but that question and its example are far more convoluted than what I'm asking.
Given the input JSON as in https://jqplay.org/s/jyKBnpx9NYX,
pick out all the name/value pairs under .QueryString and .Params into the same unnested array.
E.g., for an input of
{
"Some": "Random stuff",
"One": {
"QueryString": [
{ "Name": "IsOrdered", "Value": "1" },
{ "Name": "TimeStamp", "Value": "11654116426247" }
]
},
"Two": {
"QueryString": [
{ "Name": "IsOrdered", "Value": "1" },
{ "Name": "TimeStamp", "Value": "11654116426247" }
]
},
"Params": [
{ "Name": "ClassName", "Value": "PRODUCT" },
{ "Name": "ListID", "Value": "Products" },
{ "Name": "Mode ", "Value": "1" },
{ "Name": "Dept" , "Value": "5" },
{ "Name": "HasPrevOrder", "Value": "" }
],
"And": {
"QueryString":[]
},
"More": "like",
"More+": "this"
}
The output would be:
[
{
"Name": "IsOrdered",
"Value": "1"
},
{
"Name": "TimeStamp",
"Value": "11654116426247"
},
{
"Name": "IsOrdered",
"Value": "1"
},
{
"Name": "TimeStamp",
"Value": "11654116426247"
},
{
"Name": "ClassName",
"Value": "PRODUCT"
},
{
"Name": "ListID",
"Value": "Products"
},
...
],
without any empty arrays ([]) in the output, while keeping the repeated values in the array.
I tried to remove the empty arrays ([]) by changing the jq expression from
[( .. | objects | ( .QueryString, .Params ) | select( . != null) )]
to
[( .. | objects | ( .QueryString, .Params ) | select( . != null && . != []) )]
but it failed.
The final output also needs to be unnested into a single array.
Bonus Q: Would it be possible to output each name/value pair on a line of its own, like the following?
{ "Name": "IsOrdered", "Value": "1" },
{ "Name": "TimeStamp", "Value": "11654116426247" },
{ "Name": "IsOrdered", "Value": "1" },
{ "Name": "TimeStamp", "Value": "11654116426247" },
To get the Name/Value objects, one per line, you could go with:
jq -c '.. | objects | (.QueryString, .Params) | .. | objects | select( .Name and .Value)'
or more cavalierly:
jq -c '.. | objects | select( .Name and .Value)'
The && must be replaced with and. On the result you can use | flatten to convert "array of arrays of objects" into just "array of objects".
Bonus A: Use the -c/--compact-output flag of jq together with | flatten[] instead of just | flatten.
Together:
jq -c '
[
..
| objects
| ( .QueryString, .Params )
| select(. != null and . != [])
]
| flatten[]' input.json
This expression can, however, be simplified to .. | objects | .QueryString[]?, .Params[]?
The output is:
{"Name":"ClassName","Value":"PRODUCT"}
{"Name":"ListID","Value":"Products"}
{"Name":"Mode ","Value":"1"}
{"Name":"Dept","Value":"5"}
{"Name":"HasPrevOrder","Value":""}
{"Name":"IsOrdered","Value":"1"}
{"Name":"TimeStamp","Value":"11654116426247"}
{"Name":"IsOrdered","Value":"1"}
{"Name":"TimeStamp","Value":"11654116426247"}
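To make the traversal order behind this output easier to follow, here is a minimal Python sketch of what the simplified filter does. It is an illustration under assumptions, not a translation of jq itself: the function name name_value_pairs is hypothetical, and it only unpacks QueryString and Params values that are arrays. Because a dict is inspected before its children, the top-level Params items come out first, as in the output above.

```python
import json


def name_value_pairs(node):
    """Yield every item of a "QueryString" or "Params" list, at any depth.

    Traversal is pre-order, like jq's `..`: a dict is inspected before
    its children are visited.
    """
    if isinstance(node, dict):
        for key in ("QueryString", "Params"):
            value = node.get(key)
            if isinstance(value, list):
                yield from value  # empty lists contribute nothing
        for child in node.values():
            yield from name_value_pairs(child)
    elif isinstance(node, list):
        for child in node:
            yield from name_value_pairs(child)


doc = {
    "Some": "Random stuff",
    "One": {"QueryString": [{"Name": "IsOrdered", "Value": "1"}]},
    "Params": [{"Name": "Dept", "Value": "5"}],
    "And": {"QueryString": []},
}
for pair in name_value_pairs(doc):
    # Compact separators give one object per line, similar to jq -c.
    print(json.dumps(pair, separators=(",", ":")))
```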

jq ~ collapse specific single object arrays?

Following up on jq ~ is there a better way to collapse single object arrays? and R: Nested data.table to JSON:
how do I collapse only specific elements?
I want to get rid of the "group" arrays in
[
{
"id2": "A",
"group": [
{
"data": [
{
"id1": 1,
"group": [
{
"data": [
{
"a": 1,
"b": 1
},
{
"a": 2,
"b": 2
}
],
"type": "test"
}
],
"type": "B"
}
],
"type": "C"
}
]
},
{
"id2": "C",
"group": [
{
"data": [
{
"id1": 3,
"group": [
{
"data": [
{
"a": 1,
"b": 1
}
],
"type": "test"
}
],
"type": "B"
}
],
"type": "C"
}
]
}
]
Desired output:
[{
"id2": "A",
"group": {
"data": [{
"id1": 1,
"group": {
"data": [{
"a": 1,
"b": 1
},
{
"a": 2,
"b": 2
}
],
"type": "test"
},
"type": "B"
}],
"type": "C"
}
},
{
"id2": "C",
"group": {
"data": [{
"id1": 3,
"group": {
"data": [{
"a": 1,
"b": 1
}],
"type": "test"
},
"type": "B"
}],
"type": "C"
}
}
]
The filter 'walk(if type=="array" and length==1 then .[0] else . end)' also collapses the single-element "data" arrays, which I want to keep.
Unfortunately, we are not able to install jq 1.6 on our RStudio Server, so I cannot use the walk function there (although it works perfectly fine on my local system).
Can anybody help me out with an alternative solution without walk? It would be highly appreciated.
Edit
OK, I got it. I can define the walk function manually, like this:
'def walk(f):
. as $in
| if type == "object" then
reduce keys_unsorted[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end; walk(if type=="object"
and has("group")
and (.group | type)=="array"
and (.group | length)==1
then .group = .group[0]
else . end)'
We could operate one level higher in the nesting hierarchy, and test for "group" being a key, then update accordingly .group = .group[0] instead of . = .[0]
jq 'walk(if type=="object"
and has("group")
and (.group | type)=="array"
and (.group | length)==1
then .group = .group[0]
else . end)'
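The same bottom-up idea can be sketched in Python for comparison. This is a minimal illustration, not part of the original answer: the function name collapse_group is hypothetical, and like the walk-based filter it only collapses a "group" value that is an array of length one, leaving "data" arrays untouched.

```python
def collapse_group(node):
    """Return a copy of node in which every one-element "group" list
    is replaced by its single element; all other lists are kept."""
    if isinstance(node, dict):
        # Process children first (bottom-up), then look at this object.
        out = {key: collapse_group(value) for key, value in node.items()}
        group = out.get("group")
        if isinstance(group, list) and len(group) == 1:
            out["group"] = group[0]
        return out
    if isinstance(node, list):
        return [collapse_group(item) for item in node]
    return node


data = [{"id2": "C", "group": [{"data": [{"a": 1, "b": 1}], "type": "C"}]}]
print(collapse_group(data))
```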

Parse 2 files based on key value and recreate another json file [JQ]

I am new to jq.
I need to build a JSON file from two other files.
I have worked on it the whole day and am stuck here. I badly need this.
Here is file 1
{
"name": "foo",
"key": "1",
"id": "x"
}
{
"name": "bar",
"key": "2",
"id": "x"
}
{
"name": "baz",
"key": "3",
"id": "y"
}
file 2
{
"name": "a",
"key": "1"
}
{
"name": "b",
"key": "1"
}
{
"name": "c",
"key": "2"
}
{
"name": "d",
"key": "2"
}
{
"name": "e",
"key": "3"
}
Expected Result:
{
"x": {
"foo": [
"a",
"b"
],
"bar": [
"c",
"d"
]
},
"y": {
"baz": [
"e"
]
}
}
I can do it with python script but I need it with jq.
Thanks in advance.
Use reduce over the first file's items ($i) to build up the result object step by step: setpath takes the item's id and name as the path, and the value is the list of names from the second file ($d) whose key matches the item's key.
jq -s --slurpfile d file2 '
reduce .[] as $i ({}; setpath(
[$i.id, $i.name];
[$d[] | select(.key == $i.key).name]
))
' file1
For efficiency, the following solution first constructs a "dictionary" based on file2; furthermore, it does so without having to "slurp" it.
< file2 jq -nc --slurpfile file1 file1 '
(reduce inputs as {$name, $key} ({};
.[$key] += [$name])) as $dict
| reduce $file1[] as {$name, $key, $id} ({};
.[$id] += {($name): $dict[$key]} )
'
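The dictionary-first approach can be sketched in Python as well, which may help when comparing results with an existing script. This is a minimal illustration, not part of the original answers: the function name group_names is hypothetical, and it assumes both files have already been parsed into lists of objects. For a key with no match in file2 it yields an empty list, as the setpath-based filter does.

```python
from collections import defaultdict


def group_names(file1_items, file2_items):
    """Build {id: {name: [names from file2 sharing the same key]}}."""
    # Index file2 once: key -> list of names, in input order.
    names_by_key = defaultdict(list)
    for item in file2_items:
        names_by_key[item["key"]].append(item["name"])

    result = {}
    for item in file1_items:
        matches = list(names_by_key.get(item["key"], []))
        result.setdefault(item["id"], {})[item["name"]] = matches
    return result


file1 = [{"name": "foo", "key": "1", "id": "x"}, {"name": "baz", "key": "3", "id": "y"}]
file2 = [{"name": "a", "key": "1"}, {"name": "b", "key": "1"}, {"name": "e", "key": "3"}]
print(group_names(file1, file2))
```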

Manipulate json, remove two items in a group by key value

How can I manipulate this chunk of json:
{
"id": "whatever",
"attributes": [
{
"key": "this",
"value": "A"
},
{
"key": "that",
"value": "B"
},
{
"key": "other",
"value": "C"
}
]
}
so that it matches on "that" and removes both the key and the value in that group, leaving JSON like this:
{
"id": "whatever",
"attributes": [
{
"key": "this",
"value": "A"
},
{
"key": "other",
"value": "C"
}
]
}
I am attempting to use jq on linux.
Try this
.attributes |= map(select(.key != "that"))
Figured it out.
jq 'del(.attributes[] | select(.key == "that"))' test.json | sponge test.json
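Both answers keep every attribute whose key differs from "that". For comparison, the same filtering can be sketched in Python; this is an illustration only, and the function name remove_attribute is hypothetical.

```python
def remove_attribute(doc, key):
    """Return a copy of doc without the attributes whose "key" equals key."""
    kept = [attr for attr in doc["attributes"] if attr["key"] != key]
    return {**doc, "attributes": kept}


doc = {
    "id": "whatever",
    "attributes": [
        {"key": "this", "value": "A"},
        {"key": "that", "value": "B"},
        {"key": "other", "value": "C"},
    ],
}
print(remove_attribute(doc, "that"))
```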

Is there a way to use default in object construction with jq?

I want to filter and assign a value from array based on a condition, and use default in case if array does not have the matched object.
Here is a sample object:
{
"array" : [
{
"id": "A",
"conversations": [
{
"conversation": "1",
"type": "good"
},
{
"conversation": "2",
"type": "bad"
}
]
},
{
"id": "B",
"conversations": [
{
"conversation": "3",
"type": "good"
},
{
"conversation": "4",
"type": "bad"
}
]
},
{
"id": "C",
"conversations": [
{
"conversation": "5",
"type": "bad"
},
{
"conversation": "6",
"type": "bad"
}
]
}
]
}
Required output:
{
"id": "A",
"goodConversation": "1"
}
{
"id": "B",
"goodConversation": "3"
}
{
"id": "C",
"goodConversation": null
}
My input as an echo command:
echo '{"array":[{"id":"A","conversations":[{"conversation":"1","type":"good"},{"conversation":"2","type":"bad"}]},{"id":"B","conversations":[{"conversation":"3","type":"good"},{"conversation":"4","type":"bad"}]},{"id":"C","conversations":[{"conversation":"5","type":"bad"},{"conversation":"6","type":"bad"}]}]}'
I tried running the following jq filter:
jq '.array[] | {id, "goodConversation": .conversations[] | select(.type == "good") | .conversation}'
Actual output:
{
"id": "A",
"goodConversation": "1"
}
{
"id": "B",
"goodConversation": "3"
}
Since the object with id "C" does not have any good conversation, the whole object gets filtered out. Is there a way to create an output object for "C" with null as the value?
Clarification:
"conversations" will have at most one good conversation.
I am using jq 1.5
A common way to provide a default value is to use the // "alternative" operator. Building on the foundations you've laid, you could write:
.array[]
| {id,
"goodConversation":
((.conversations[]
| select(.type == "good")
| .conversation) // null) }
If there is more than one "good" conversation, however, the filter above emits one object per good conversation, which may not be exactly what you want. If it's not, then consider using first, e.g.:
.array[]
| {id,
"goodConversation":
( first(.conversations[]
| select(.type == "good")
| .conversation) // null)}
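The "first match, else a default" pattern used above has a close analogue in Python's next() with a default argument. The sketch below is an illustration only, not part of the original answer; the function name good_conversations is hypothetical, and None plays the role of jq's null.

```python
def good_conversations(doc):
    """Return one {"id", "goodConversation"} dict per element of doc["array"]."""
    rows = []
    for entry in doc["array"]:
        good = next(
            (c["conversation"] for c in entry["conversations"] if c["type"] == "good"),
            None,  # default when no conversation is marked "good"
        )
        rows.append({"id": entry["id"], "goodConversation": good})
    return rows


doc = {
    "array": [
        {"id": "A", "conversations": [{"conversation": "1", "type": "good"}]},
        {"id": "C", "conversations": [{"conversation": "5", "type": "bad"}]},
    ]
}
print(good_conversations(doc))
```

As with first in jq, only the first matching conversation is used, so each id produces exactly one output object.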