Merge two complex JSON objects (using jq)

I'm trying to replace objects in a complex JSON object. It seemed that the tool jq could be offering the perfect solution, but I'm really struggling with the right choice / chain of filters.
I have a complete configuration JSON object which looks like this (has some more keys in it, shortened it for illustration):
{
"some-array": [
{
"name": "foo",
"attr": "value"
},
{
"name": "foo bar",
"attr": "value"
},
{
"name": "foo bar baz",
"attr": "value"
}
],
"some-other-array": []
}
Now I have another object containing an array with updated objects which I need to merge into the full configuration. Each nested object is identified by its name: I need to add it if it does not exist yet and replace it if it does.
{
"some-array": [
{
"name": "foo",
"attr": "new-value",
"new-attrib": "new-value"
},
{
"name": "foo bar",
"attr": "new-value"
}
]
}
So, with the above example, my expected result would be:
{
"some-array": [
{
"name": "foo",
"attr": "new-value",
"new-attrib": "new-value"
},
{
"name": "foo bar",
"attr": "new-value"
},
{
"name": "foo bar baz",
"attr": "value"
}
],
"some-other-array": []
}
I already tried select(."some-array"[].name == "foo") to begin with and a few other things as a jq filter, but I'm struggling to move forward here and would really appreciate some inspiration / an actual solution.
Can anyone tell me if what I'm trying to achieve is actually possible with jq or do I have to find another solution?

Here is a solution to the updated problem. This solution assumes that the names are string-valued. It relies on two helper functions:
# array-to-hash
def a2h(f): reduce .[] as $x ({}; . + {($x | f): $x});
# hash-to-array
def h2a: . as $in | reduce keys_unsorted[] as $k ([]; . + [$in[$k]]);
The first of these creates a "hash" based on an input array, and the second implements the inverse operation.
With these helper functions, the solution can be written:
.["some-array"] |= (a2h(.name) + ($update|.["some-array"] | a2h(.name)) | h2a)
where $update is the "new" value. This solution relies on the "right-dominance" of object-addition.
Output
For the given example, the output is:
{
"some-array": [
{
"name": "foo",
"attr": "new-value",
"new-attrib": "new-value"
},
{
"name": "foo bar",
"attr": "new-value"
},
{
"name": "foo bar baz",
"attr": "value"
}
],
"some-other-array": []
}
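For readers who want to cross-check the jq result outside of jq, the same keyed-merge idea can be sketched in Python. This is only an illustration, not part of the answer above: the function name merge_by_name and its key parameter are made up here, and the sketch assumes, like the jq version, that every element carries a string-valued name.

```python
def merge_by_name(current, update, key="name"):
    """Merge two lists of dicts, matching elements on `key` (hypothetical helper)."""
    # Index the current elements by name (the analogue of a2h).
    by_name = {item[key]: item for item in current}
    # Later entries win, mirroring the right-dominance of object addition in jq.
    by_name.update({item[key]: item for item in update})
    # Dicts keep insertion order, so existing names keep their position
    # and unseen names are appended (the analogue of h2a).
    return list(by_name.values())


config = {
    "some-array": [
        {"name": "foo", "attr": "value"},
        {"name": "foo bar", "attr": "value"},
        {"name": "foo bar baz", "attr": "value"},
    ],
    "some-other-array": [],
}
update = {
    "some-array": [
        {"name": "foo", "attr": "new-value", "new-attrib": "new-value"},
        {"name": "foo bar", "attr": "new-value"},
    ]
}

config["some-array"] = merge_by_name(config["some-array"], update["some-array"])
```

Because the index is keyed on the name, two elements sharing a name within one array would collapse into one, which is the same behaviour as the hash-based jq filter.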

Yes, it's possible, and in fact quite easy under various interpretations of the problem as originally stated.
The following solves the problem as it was originally stated, with "it" being interpreted as .["some-array"] rather than its constituents.
Assuming $update holds the object with the updated information as shown, the update could be performed using this filter:
.["some-array"] = ($update | .["some-array"])
There are many ways to endow $update with the desired value.

Related

Getting first level with JMESPath

I have this JSON document:
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
}
}
My expected result should be:
1,G1
2,GM1
With *.a I get
[
"G1",
"GM1"
]
but I am absolutely stuck for the rest.
Sadly, there is not much you can do that would fully match your use case and still scale properly.
This is because JMESPath does not have a way to reference the parent of the current node, although this has been requested before. Such a feature would allow something like
*.[join(',', [keys($), a])]
You can definitely extract a list of keys and values, thanks to the function keys:
@.{keys: keys(@), values: *.a}
That gives
{
"keys": [
"1",
"2"
],
"values": [
"G1",
"GM1"
]
}
But then you just fall under the same case as this other question, because keys will give you a list of keys.
You can also end with a list of lists:
@.[keys(@), *.a]
Will give you:
[
[
"1",
"2"
],
[
"G1",
"GM1"
]
]
And you can even go further and flatten it if needed:
@.[keys(@), *.a] []
Gives:
[
"1",
"2",
"G1",
"GM1"
]
With all this, if you do happen to have a list of exactly two items, then a solution would be to use a combination of join and slice:
@.[join(',',[keys(@),*.a][] | [::2]), join(',',[keys(@),*.a][] | [1::2])]
That would give the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, as soon as you have more than two items to consider you would end up with a buggy result:
[
"1,3,G1,GM3",
"2,4,GM1,GM4"
]
With a data set of
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
},
"3": {
"a": "GM3"
},
"4": {
"a": "GM4"
}
}
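The reason the slice trick breaks can be seen by replaying it by hand. The sketch below uses Python list slicing, which follows the same start:stop:step convention as the [::2] and [1::2] slices used above; the variable names are made up for illustration.

```python
keys = ["1", "2", "3", "4"]
values = ["G1", "GM1", "GM3", "GM4"]

# Flattening the list of lists puts all keys first, then all values.
flat = keys + values

# Every second element starting at 0, then every second element starting at 1.
first = ",".join(flat[::2])
second = ",".join(flat[1::2])

print(first)
print(second)
```

With two items the even and odd positions happen to line up as key, value. With four items the even positions are two keys followed by two values, so keys and values are no longer paired.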
And then, of course, the same can be achieved hardcoding indexes:
@.[join(',', [keys(@)[0], *.a | [0]]), join(',', [keys(@)[1], *.a | [1]])]
That also gives the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, this only works if you know in advance the number of rows that are going to be returned to you.
And if you want a single string, given that wherever you want to feed the data accepts \n as a new line, you can join the whole array again:
@.[join(',', [keys(@)[0], *.a | [0]]), join(',', [keys(@)[1], *.a | [1]])].join(`\n`,@)
Will give:
"1,G1\n2,GM1"
Finally this expression worked 100% for me:
[{key1:keys(@)[0],a:*.a| [0]},{key1:keys(@)[1],a:*.a| [1]}]
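Since the expressions above all depend on knowing the number of rows in advance, one way around the limitation is to do the key/value pairing in the host language rather than in JMESPath. The following is a minimal sketch of that swapped-in approach in plain Python, without any JMESPath library; the function name key_value_lines and its field parameter are assumptions made for this example.

```python
import json

document = json.loads('{"1": {"a": "G1"}, "2": {"a": "GM1"}}')


def key_value_lines(doc, field="a"):
    """Pair every top-level key with the chosen field of its value."""
    # Iterating over items() gives access to key and value together,
    # which is exactly what the JMESPath projection cannot do.
    return [f"{key},{entry[field]}" for key, entry in doc.items()]


lines = key_value_lines(document)
print("\n".join(lines))
```

This works for any number of top-level keys, at the cost of leaving JMESPath for the final step.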

pipe in del(... | select(... | ...)) works in v1.6, how to get same result in v1.5?

I'm trying to remove some objects based on tags within an array. I can get it working fine on jqplay.org (v1.6) but is there any way to get the same result in v1.5? I just get an error Invalid path expression with result
The goal is to return the JSON stripped of the top two (content and data) levels, and with the properties of notes stripped out if there isn't a types tag starting with 'x' or 'y' for that note.
Here's the v1.6 working example: https://jqplay.org/s/AVpz_IkfJa
There's also this: https://github.com/stedolan/jq/issues/1146 but I don't know how (or if it's possible) to apply the workaround for del() rather than path(), assuming it's the same basic problem.
JQ instructions:
.content.data
| del(
.hits[].doc.notes[]
| select
( .types
| any(startswith("x") or startswith("y"))
| not
)
)
input JSON:
{
"content": { "data": {
"meta": "stuff",
"hits": [
{ "doc":
{
"id": "10",
"notes": {
"f1": {"name": "F1", "types": ["wwwa", "zzzb"] },
"f2": {"name": "F2", "types": ["xxxa", "yyya"] }
}
},
"score": "1"
},
{ "doc":
{
"id": "11",
"notes": {
"f1": {"name": "F1", "types": ["wwwa", "zzzb"] },
"f3": {"name": "F3", "types": ["qzxb", "xxxb"] }
}
},
"score": "2"
} ] } } }
Desired result:
{
"meta": "stuff",
"hits": [
{
"doc": {
"id": "10",
"notes": {
"f2": {"name": "F2", "types": ["xxxa", "yyya"] }
}
},
"score": "1"
},
{
"doc": {
"id": "11",
"notes": {
"f3": {"name": "F3", "types": ["qzxb", "xxxb"] }
}
},
"score": "2"
} ] }
Any suggestions greatly appreciated. I'm pretty much a jq novice. Even if it's not practically do-able in v1.5 at least I won't lose more hours trying to make it work.
OP back after a few hours - I found something that seems to work, still very interested in any comments / other ways to crack the problem / improvements.
.content.data
| .hits[].doc.notes |= map (
if ( .types | any(startswith("x") or startswith("y")))
then .
else empty
end
)
This is just a variation of the solution proposed by the OP. It illustrates how a complex use of del can be expressed in a more straightforward and robust way by crafting a suitable helper function.
The relevant helper function in the present case implements the stripping-out requirement:
# Input: an object some keys of which are to be removed
def prune:
to_entries
| map( select( any(.value.types[]; test("^(x|y)")) ) )
| from_entries ;
The task can now be accomplished using a one-liner:
.content.data | .hits |= map( .doc.notes |= prune )
Invocation
With the above jq program in program.jq, a suitable invocation of jq would look like this:
jq -f program.jq input.json
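To check the expected output independently of the jq version in use, the same pruning logic can be sketched in Python. This is an illustration under assumptions: strip_document is a made-up name, prune mirrors the jq helper above, and every note is assumed to have a types list as in the sample input.

```python
def prune(notes):
    """Keep only the notes that have a type starting with 'x' or 'y'."""
    return {
        name: note
        for name, note in notes.items()
        if any(t.startswith(("x", "y")) for t in note["types"])
    }


def strip_document(document):
    """Drop the content/data wrapper and prune the notes of every hit."""
    data = document["content"]["data"]
    hits = [
        {**hit, "doc": {**hit["doc"], "notes": prune(hit["doc"]["notes"])}}
        for hit in data["hits"]
    ]
    return {**data, "hits": hits}


document = {
    "content": {"data": {
        "meta": "stuff",
        "hits": [
            {"doc": {"id": "10", "notes": {
                "f1": {"name": "F1", "types": ["wwwa", "zzzb"]},
                "f2": {"name": "F2", "types": ["xxxa", "yyya"]},
            }}, "score": "1"},
            {"doc": {"id": "11", "notes": {
                "f1": {"name": "F1", "types": ["wwwa", "zzzb"]},
                "f3": {"name": "F3", "types": ["qzxb", "xxxb"]},
            }}, "score": "2"},
        ],
    }}
}

result = strip_document(document)
```

As in the jq helper, the notes stay an object keyed by f1, f2 and so on; only the entries failing the test are dropped.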

Collect JSON objects with same attribute value, and create new key/value pairs

Here is a simplified sample of the JSON data I'm working with:
[
{ "certname": "one.example.com",
"name": "fact1",
"value": "value1"
},
{ "certname": "one.example.com",
"name": "fact2",
"value": 42
},
{ "certname": "two.example.com",
"name": "fact1",
"value": "value3"
},
{ "certname": "two.example.com",
"name": "fact2",
"value": 10000
},
{ "certname": "two.example.com",
"name": "fact3",
"value": { "anotherkey": "anothervalue" }
}
]
The result I want to achieve, using jq preferably, is the following:
[
{
"certname": "one.example.com",
"fact1": "value1",
"fact2": 42
},
{
"certname": "two.example.com",
"fact1": "value3",
"fact2": 10000,
"fact3": { "anotherkey": "anothervalue" }
}
]
It's worth pointing out that not all elements have the same name/value pairs, by any means. Also, values are often complex objects in their own right.
If I was doing this in Python, it wouldn't be a big deal (and yes, I can hear the chorus of "do it in Python" ringing in my ears now). I would like to understand how to do this in jq, and it's escaping me at the moment.
... using jq preferably ...
That's the spirit! And in that spirit, here's a concise solution:
map( {certname, (.name): .value} )
| group_by(.certname)
| map(add)
Of course there are other reasonable solutions. If the above is at first puzzling, you might like to add a debug statement here or there, or you might like to explore the pipeline by executing the first line by itself, etc.
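For comparison, and since the question mentions Python, here is a sketch of the same three steps (reshape, group, merge) in Python. The function name collect_facts is made up for this example, and it assumes no fact is itself named certname.

```python
from itertools import groupby


def collect_facts(rows):
    """Collapse name/value rows into one dict per certname."""
    # Step 1: one small dict per row, like map({certname, (.name): .value}).
    pairs = [{"certname": r["certname"], r["name"]: r["value"]} for r in rows]
    # Step 2: groupby only groups adjacent items, so sort first
    # (group_by in jq sorts as part of grouping).
    pairs.sort(key=lambda p: p["certname"])
    result = []
    for _, group in groupby(pairs, key=lambda p: p["certname"]):
        # Step 3: merge the dicts of one group, like map(add).
        merged = {}
        for pair in group:
            merged.update(pair)
        result.append(merged)
    return result


rows = [
    {"certname": "one.example.com", "name": "fact1", "value": "value1"},
    {"certname": "one.example.com", "name": "fact2", "value": 42},
    {"certname": "two.example.com", "name": "fact1", "value": "value3"},
    {"certname": "two.example.com", "name": "fact2", "value": 10000},
    {"certname": "two.example.com", "name": "fact3", "value": {"anotherkey": "anothervalue"}},
]

facts = collect_facts(rows)
```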

Update one value in array of dicts, using jq

I want to update a value in a dict, which I can only identify by another value in the dict. That is, given this input:
[
{
"format": "geojson",
"id": "foo"
},
{
"format": "geojson",
"id": "bar"
},
{
"format": "zip",
"id": "baz"
}
]
I want to change baz's accompanying format to 'csv':
[
{
"format": "geojson",
"id": "foo"
},
{
"format": "geojson",
"id": "bar"
},
{
"format": "csv",
"id": "baz"
}
]
I have found that this works:
jq 'map(if .id=="baz" then .format="csv" else . end)' my.json
But this seems rather verbose, so I wonder if there is a more elegant way to express this. jq seems to be missing some kind of expression selector, the equivalent of which might be [@id='baz'] in XPath.
(When I started this question, I had [.[] |...], then I discovered map, so it's not quite as bad as I thought.)
A complex assignment is what you're looking for:
jq '(.[] | select(.id == "baz") | .format) |= "csv"' my.json
Perhaps not shorter but it is more elegant, as requested. See the last section of the docs at: http://stedolan.github.io/jq/manual/#Assignment
Edit: using map:
jq 'map((select(.id == "baz") | .format) |= "csv")' my.json
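To make the intended semantics explicit (change one field only in the elements that match, leave everything else untouched), here is a small Python sketch of the same update. The function name set_format and its parameters are made up for illustration.

```python
def set_format(items, target_id, new_format):
    """Return a copy of items with format replaced where id matches."""
    return [
        # {**item, ...} copies the dict and overrides only the format key.
        {**item, "format": new_format} if item.get("id") == target_id else item
        for item in items
    ]


items = [
    {"format": "geojson", "id": "foo"},
    {"format": "geojson", "id": "bar"},
    {"format": "zip", "id": "baz"},
]

updated = set_format(items, "baz", "csv")
```

Like the jq filters above, this leaves non-matching elements as they are and does not add anything when no element matches.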

Using jq to list keys in a JSON object

I have a hierarchically deep JSON object created by a scientific instrument, so the file is somewhat large (1.3MB) and not readily readable by people. I would like to get a list of keys, up to a certain depth, for the JSON object. For example, given an input object like this
{
"acquisition_parameters": {
"laser": {
"wavelength": {
"value": 632,
"units": "nm"
}
},
"date": "02/03/2525",
"camera": {}
},
"software": {
"repo": "github.com/username/repo",
"commit": "a7642f",
"branch": "develop"
},
"data": [{},{},{}]
}
I would like output like this:
{
"acquisition_parameters": [
"laser",
"date",
"camera"
],
"software": [
"repo",
"commit",
"branch"
]
}
This is mainly for the purpose of being able to enumerate what is in a JSON object. After processing, the JSON objects from the instrument begin to diverge: for example, some may have a field like .frame.cross_section.stats.fwhm, while others may have .sample.species, so it would be convenient to be able to interrogate the JSON object on the command line.
The following should do exactly what you want
jq '[(keys - ["data"])[] as $key | { ($key): .[$key] | keys }] | add'
This will give the following output, using the input you described above:
{
"acquisition_parameters": [
"camera",
"date",
"laser"
],
"software": [
"branch",
"commit",
"repo"
]
}
Given your purpose you might have an easier time using the paths builtin to list all the paths in the input and then truncate at the desired depth:
$ echo '{"a":{"b":{"c":{"d":true}}}}' | jq -c '[paths|.[0:2]]|unique'
[["a"],["a","b"]]
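The truncate-and-deduplicate idea can also be sketched in Python, which may help when reasoning about what the jq one-liner returns. The names paths and truncated_paths below are made up for this example; the sort key places array indices before object keys, which is an assumption chosen to resemble jq's ordering of numbers before strings.

```python
def paths(value, prefix=()):
    """Yield the path to every nested element, as a tuple of keys and indices."""
    if isinstance(value, dict):
        children = value.items()
    elif isinstance(value, list):
        children = enumerate(value)
    else:
        return
    for key, child in children:
        yield prefix + (key,)
        yield from paths(child, prefix + (key,))


def truncated_paths(doc, depth):
    """Cut every path to at most `depth` components and drop duplicates."""
    unique = {p[:depth] for p in paths(doc)}
    # Tag each component so integers and strings never get compared directly.
    return sorted(unique, key=lambda p: [(isinstance(x, str), x) for x in p])


doc = {"a": {"b": {"c": {"d": True}}}}
result = truncated_paths(doc, 2)
```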
Here is another variation using reduce and setpath which assumes you have a specific set of top-level keys you want to examine:
. as $v
| reduce ("acquisition_parameters", "software") as $k (
{}; setpath([$k]; $v[$k] | keys)
)
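The same result can be double-checked with a few lines of Python. This is a sketch under the same assumption as the reduce variant, namely that the top-level keys of interest are known and map to objects; the function name keys_of is made up for illustration.

```python
def keys_of(doc, top_level_keys):
    """Map each chosen top-level key to the sorted keys of its object."""
    # sorted() over a dict iterates its keys; jq's keys also returns them sorted.
    return {key: sorted(doc[key]) for key in top_level_keys}


doc = {
    "acquisition_parameters": {
        "laser": {"wavelength": {"value": 632, "units": "nm"}},
        "date": "02/03/2525",
        "camera": {},
    },
    "software": {
        "repo": "github.com/username/repo",
        "commit": "a7642f",
        "branch": "develop",
    },
    "data": [{}, {}, {}],
}

summary = keys_of(doc, ["acquisition_parameters", "software"])
```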