How to use jq to reconstruct complete contents of json file, operating only on part of interest? - json

All the examples I've seen so far "reduce" the output (filter out) some part. I understand how to operate on the part of the input I want to, but I haven't figured out how to output the rest of the content "untouched".
The particular example would be an input file with several high level entries "array1", "field1", "array2", "array3" say. Each array contents is different. The specific processing I want to do is to sort "array1" entries by a "name" field which is doable by:
jq '.array1 | sort_by(.name)' test.json
but I also want this output as "array1" as well as all the other data to be preserved.
Example input:
{
"field1": "value1",
"array1":
[
{ "name": "B", "otherdata": "Bstuff" },
{ "name": "A", "otherdata": "Astuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
Expected output:
{
"field1": "value1",
"array1":
[
{ "name": "A", "otherdata": "Astuff" },
{ "name": "B", "otherdata": "Bstuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
I've tried using map but I can't seem to get the syntax correct to be able to handle any type of input other than the array I want to be sorted by name.

Whenever you use the assignment operators (=, |=, +=, etc.), the context of the expression is kept unchanged. So as long as your top-level filter(s) are assignments, in the end, you'll get the rest of the data (with your changes applied).
In this case, you're just sorting the array1 array so you could just update the array.
.array1 |= sort_by(.name)

Related

Updating Nested JSON Array with new key and value from another key

I have have a JSON file where I have IDs with tasks. Some tasks can be empty. I want to put the ID into the tasks where tasks are not empty.
[
{
"id": 1961126,
"tasks": [
{
"id": 70340700,
"title": "Test1",
},
{
"id": 69801130,
"title": "Test15A",
}
]
},
{
"id": 1961126,
"tasks": []
}
]
I would like to get the tasks array updated to look like
[
{
"id": 1961126,
"tasks": [
{
**"sId":1961126,**
"id": 70340700,
"title": "Test1",
},
{
**"sId":1961126,**
"id": 69801130,
"title": "Test15A",
}
]
},
{
"id": 1961126,
"tasks": []
}
]
I can't figure out how to get the id from the object into the nested array. Here is what I have come up with
jq 'map(.tasks[0]|select( . != null )|.sId = .id)' file.json
This is only pulling in the same id. I have tired to put in [].id but I get a error Cannot index number with string "id". I am still learning how to deal with nested arrays and objects.
Save the ID in a variable and add it as a new field to each array member.
jq 'map(.id as $sId | .tasks[] += {$sId})' file.json
Demo
Note #1: Get rid of the final , within each object (see the Demo), as it's not proper JSON.
Note #2: Object fields generally have no order, but if you want to have the propagated ID shown first, as seen in your expected output, you could try to replace += {$sId} (which by itself is shorthand for |= . + {$sId}) with |= {$sId} + . to flip the order of generation (Demo). Although there is no guarantee that it stays that way with further processing.

Getting first level with JMESPath

I have this JSON document:
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
}
}
My expected result should be:
1,G1
2,GM1
With *.a i get
[
"G1",
"GM1"
]
but I am absolutely stuck for the rest.
Sadly there is not much you can do that would be totally matching your use case and that would scale properly.
This is because JMESPath does not have a way to reference its parent, although this has been requested before, to allow you something like
*.[join(',', [keys($), a])]
You can definitely extract a list of keys and values, thanks to the function keys:
#.{keys: keys(#), values: *.a}
That gives
{
"keys": [
"1",
"2"
],
"values": [
"G1",
"GM1"
]
}
But then you just fall under the same case as this other question, because keys will give you a list of keys.
You can also end with a list of lists:
#.[keys(#), *.a]
Will give you:
[
[
"1",
"2"
],
[
"G1",
"GM1"
]
]
And you can even go further and flatten it if needed:
#.[keys(#), *.a] []
Gives:
[
"1",
"2",
"G1",
"GM1"
]
With all this if you do happen to have a list of exactly two items, then a solution would be to use a combination of join and slice:
#.[join(',',[keys(#),*.a][] | [::2]), join(',',[keys(#),*.a][] | [1::2])]
That would give the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, as soon as you have more than two items to consider you would end up with a buggy:
[
"1,3,G1,GM3",
"2,4,GM1,GM4"
]
With a data set of
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
},
"3": {
"a": "GM3"
},
"4": {
"a": "GM4"
}
}
And then, of course, the same can be achieved hardcoding indexes:
#.[join(',', [keys(#)[0], *.a | [0]]), join(',', [keys(#)[1], *.a | [1]])]
That also gives the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, this only works if you know in advance the number of rows that are going to be returned to you.
And if you want a single string, given that were you want to feed the data accepts \n as a new line, you can join he whole array again:
#.[join(',', [keys(#)[0], *.a | [0]]), join(',', [keys(#)[1], *.a | [1]])].join(`\n`,#)
Will give:
"1,G1\n2,GM1"
Finally this expression worked 100% for me:
[{key1:keys(#)[0],a:*.a| [0]},{key1:keys(#)[1],a:*.a| [1]}]

Filter JSON list based on an element appearing in a member's list using jq

I'm using jq to try to filter a JSON list based on the content of a list inside the objects within that list. Here's a sample of my JSON document:
{
"modules": [
{
"path": [
"root"
],
"outputs": {
"a": "b",
"c": "d"
}
},
{
"path": [
"other1"
],
"outputs": {
"e": "f",
"g": "h"
}
},
{
"path": [
"other2"
],
"outputs": {
"i": "j",
"k": "l"
}
}
]
}
I want to filter the modules list to the object where the path list contains "root", then return the outputs object. Essentially I want to return:
{"a":"b","c":"d"}
which I can do using jq .modules[0].outputs (see example on http://jqplay.org) but I don't want to make an assumption that the object I'm interested in is the 0th element of the modules list, instead I want to filter the modules list where the path list contains an element "root".
How can I do that?
Typical, as soon as I post the question I stumble upon the answer through trial and error.
.modules[] | select(.path == ["root"]).outputs
see example on jqplay.org

Create a new json string from jq output elements

My jq command returns objects in brackets but without comma separators. But I would like to create a new json string from it.
This call finds all elements of arr that have a FooItem in them and then returns texts from the nested array at index 3:
jq '.arr[] | select(index("FooItem")) | .[3].texts'
on this json (The original has more elements ):
{
"arr": [
[
"create",
"w199",
"FooItem",
{
"index": 0,
"texts": [
"aBarfoo",
"avalue"
]
}
],
[
"create",
"w200",
"NoItem",
{
"index": 1,
"val": 5,
"hearts": 5
}
],
[
"create",
"w200",
"FooItem",
{
"index": 1,
"texts": [
"mybarfoo",
"bValue"
]
}
]
]
}
returns this output:
[
"aBarfoo",
"avalue"
]
[
"mybarfoo",
"bValue"
]
But I'd like to create a new json from these objects that looks like this:
{
"arr": [
[
"aBarfoo",
"avalue"
],
[
"mybarfoo",
"bValue"
]
]
}
Can jq do this?
EDIT
One more addition: Considering that texts also has strings of zero length, how would you delete those/not have them in the result?
"texts": ["",
"mybarfoo",
"bValue",
""
]
You can always embed a stream of (zero or more) JSON entities within some other JSON structure by decorating the stream, that is, in the present case, by wrapping the STREAM as follows:
{ arr: [ STREAM ] }
In the present case, however, we can also take the view that we are simply editing the original document, and accordingly use a variation of the map(select(...)) idiom:
.arr |= map( select(index("FooItem")) | .[3].texts)
This latter approach ensures that the context of the "arr" key is preserved.
Addendum
To filter out the empty strings, simply add another map(select(...)):
.arr |= map( select(index("FooItem"))
| .[3].texts | map(select(length>0)))

Using jq to list keys in a JSON object

I have a hierarchically deep JSON object created by a scientific instrument, so the file is somewhat large (1.3MB) and not readily readable by people. I would like to get a list of keys, up to a certain depth, for the JSON object. For example, given an input object like this
{
"acquisition_parameters": {
"laser": {
"wavelength": {
"value": 632,
"units": "nm"
}
},
"date": "02/03/2525",
"camera": {}
},
"software": {
"repo": "github.com/username/repo",
"commit": "a7642f",
"branch": "develop"
},
"data": [{},{},{}]
}
I would like an output like such.
{
"acquisition_parameters": [
"laser",
"date",
"camera"
],
"software": [
"repo",
"commit",
"branch"
]
}
This is mainly for the purpose of being able to enumerate what is in a JSON object. After processing the JSON objects from the instrument begin to diverge: for example, some may have a field like .frame.cross_section.stats.fwhm, while others may have .sample.species, so it would be convenient to be able to interrogate the JSON object on the command line.
The following should do exactly what you want
jq '[(keys - ["data"])[] as $key | { ($key): .[$key] | keys }] | add'
This will give the following output, using the input you described above:
{
"acquisition_parameters": [
"camera",
"date",
"laser"
],
"software": [
"branch",
"commit",
"repo"
]
}
Given your purpose you might have an easier time using the paths builtin to list all the paths in the input and then truncate at the desired depth:
$ echo '{"a":{"b":{"c":{"d":true}}}}' | jq -c '[paths|.[0:2]]|unique'
[["a"],["a","b"]]
Here is another variation uing reduce and setpath which assumes you have a specific set of top-level keys you want to examine:
. as $v
| reduce ("acquisition_parameters", "software") as $k (
{}; setpath([$k]; $v[$k] | keys)
)