Updating Nested JSON Array with new key and value from another key - json

I have a JSON file where I have IDs with tasks. Some tasks can be empty. I want to put the ID into the tasks where tasks are not empty.
[
{
"id": 1961126,
"tasks": [
{
"id": 70340700,
"title": "Test1",
},
{
"id": 69801130,
"title": "Test15A",
}
]
},
{
"id": 1961126,
"tasks": []
}
]
I would like to get the tasks array updated to look like
[
{
"id": 1961126,
"tasks": [
{
**"sId":1961126,**
"id": 70340700,
"title": "Test1",
},
{
**"sId":1961126,**
"id": 69801130,
"title": "Test15A",
}
]
},
{
"id": 1961126,
"tasks": []
}
]
I can't figure out how to get the id from the object into the nested array. Here is what I have come up with
jq 'map(.tasks[0]|select( . != null )|.sId = .id)' file.json
This is only pulling in the same id. I have tried to put in [].id but I get an error: Cannot index number with string "id". I am still learning how to deal with nested arrays and objects.

Save the ID in a variable and add it as a new field to each array member.
jq 'map(.id as $sId | .tasks[] += {$sId})' file.json
Note #1: Remove the trailing , at the end of each task object in your sample input, as it's not valid JSON.
Note #2: Object fields generally have no order, but if you want the propagated ID shown first, as in your expected output, you could replace += {$sId} (which is shorthand for |= . + {$sId}) with |= {$sId} + . to flip the order of generation. There is no guarantee that the order stays that way with further processing.
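For readers who want to cross-check the result outside jq, here is a minimal Python sketch of the same transformation. The function name propagate_parent_id is hypothetical, and the sketch assumes every record has both an "id" and a "tasks" key, as in the sample input. It mirrors the |= {$sId} + . variant, so sId is placed first in each task.

```python
import json

def propagate_parent_id(records):
    """Return a copy of records where every task carries its parent's id as 'sId'."""
    result = []
    for record in records:
        # sId goes first, then the task's own fields (like `{$sId} + .` in jq).
        tasks = [{"sId": record["id"], **task} for task in record["tasks"]]
        result.append({**record, "tasks": tasks})
    return result

# Sample input from the question, with the trailing commas removed.
records = [
    {"id": 1961126, "tasks": [
        {"id": 70340700, "title": "Test1"},
        {"id": 69801130, "title": "Test15A"},
    ]},
    {"id": 1961126, "tasks": []},
]

updated = propagate_parent_id(records)
print(json.dumps(updated, indent=2))
```

Records with an empty tasks list pass through unchanged, which matches the jq filter: iterating over an empty array updates nothing.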

Related

JSONPath to get multiple values from nested json

I have a nested JSON in a field that contains multiple important keys that I would like to retrieve as an array:
{
"tasks": [
{
"id": "task_1",
"name": "task_1_name",
"assignees": [
{
"id": "assignee_1",
"name": "assignee_1_name"
}
]
},
{
"id": "task_2",
"name": "task_2_name",
"assignees": [
{
"id": "assignee_2",
"name": "assignee_2_name"
},
{
"id": "assignee_3",
"name": "assignee_3_name"
}
]}]}
All the queries that I've tried so far, e.g. $.tasks.*.assignees..id and many others, have returned
[
"assignee_1",
"assignee_2",
"assignee_3"
]
But what I need is:
[
["assignee_1"],
["assignee_2", "assignee_3"]
]
Is it possible to do with JSONPath or any script inside of it, without involving 3rd party tools?
The problem you're facing is that tasks and assignees are arrays. You need to use [*] instead of .* to get the items in the array. So your path should look like
$.tasks[*].assignees[*].id
You can try it at https://json-everything.net/json-path.
NOTE The output from my site will give you both the value and its location within the original document.
Edit
(I didn't read the whole thing :) )
You're not going to be able to get
[
["assignee_1"],
["assignee_2", "assignee_3"]
]
because, as @Tomalak mentioned, JSON Path is a query language. It's going to remove all structure and return only values.
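If post-processing outside JSONPath is acceptable, the grouping is simple to do in a general-purpose language. The following is a minimal Python sketch using only the standard library; the function name assignee_ids_per_task is hypothetical, and it assumes the document has the shape shown in the question.

```python
def assignee_ids_per_task(document):
    """Return one list of assignee ids per task, preserving task order."""
    return [
        [assignee["id"] for assignee in task.get("assignees", [])]
        for task in document["tasks"]
    ]

document = {
    "tasks": [
        {"id": "task_1", "name": "task_1_name",
         "assignees": [{"id": "assignee_1", "name": "assignee_1_name"}]},
        {"id": "task_2", "name": "task_2_name",
         "assignees": [{"id": "assignee_2", "name": "assignee_2_name"},
                       {"id": "assignee_3", "name": "assignee_3_name"}]},
    ]
}

grouped = assignee_ids_per_task(document)
print(grouped)
```

A task with no assignees key yields an empty inner list rather than being dropped, so the outer list always lines up with the tasks.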

jq output is empty when tag name does not exist

When I run the jq command to parse a json document from the amazon cli I have the following problem.
I’m extracting the IP address and a tag called "Environment". When the Environment tag does not exist on an instance, the command returns no result for that instance.
Here's an example of the relevant output returned by the AWS CLI
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
I’m running the following command
aws ec2 describe-instances --filters "Name=tag:Name,Values=Balance-OTA-SS_a" | jq -c '.Reservations[].Instances[] | ({IP: .PrivateIpAddress, Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)})'
## output
empty
How do I show the IP address in the output of the command even if the Environment tag does not exist?
Regards,
Let's assume this input:
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
This is the format returned by describe-instances, but with all the irrelevant fields removed.
Note that tags is always a list of objects, each of which has a Key and a Value. This format is perfect for from_entries, which can transform this list of tags into a convenient mapping object. Try this:
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags|from_entries.Environment)
}
{"IP":"10.0.0.1","Ambiente":"alpha"}
{"IP":"10.0.0.2","Ambiente":null}
That answers how to do it. But you probably want to understand why your approach didn't work.
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)
}
The .[] filter you're using on the tags can return zero or multiple results. Similarly, the select filter can eliminate some or all items. When you apply this inside an object constructor (the expression from { to }), you're causing that whole object to be created a variable number of times. You need to be very careful where you use these filters, because often that's not what you want at all. Often you instead want to do one of the following:
Wrap the expression that returns multiple results in an array constructor [ ... ]. That way instead of outputting the parent object potentially zero or multiple times, you output it once containing an array that potentially has zero or multiple items. E.g.
[.Tags[]|select(.Key=="Environment")]
Apply map to the array to keep it an array but process its contents, e.g.
.Tags|map(select(.Key=="Environment"))
Apply first(expr) to capture only the first value emitted by the expression. If the expression might emit zero items, you can use the comma operator to provide a default, e.g.
first((.Tags[]|select(.Key=="Environment")),null)
Apply some other array-level function, such as from_entries.
.Tags|from_entries.Environment
You can either use an if ... then ... else ... end construct, or //. For example:
.Reservations[].Instances[]
| {IP: .PrivateIpAddress} +
({Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)}
// null)
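To make the difference concrete, here is a minimal Python sketch of the from_entries approach: turn the tag list into a mapping first, then look up the key, so a missing tag becomes a null value instead of suppressing the whole object. The function name instance_summaries and its tag_key parameter are hypothetical; the input is the trimmed describe-instances output shown above.

```python
import json

def instance_summaries(response, tag_key="Environment"):
    """Yield one summary per instance; a missing tag gives None, not a dropped row."""
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            # Equivalent of jq's from_entries: list of Key/Value pairs -> mapping.
            tags = {tag["Key"]: tag["Value"] for tag in instance.get("Tags", [])}
            yield {"IP": instance["PrivateIpAddress"], "Ambiente": tags.get(tag_key)}

response = {
    "Reservations": [
        {"Instances": [{"PrivateIpAddress": "10.0.0.1", "Tags": [
            {"Key": "Name", "Value": "Balance-OTA-SS_a"},
            {"Key": "Environment", "Value": "alpha"},
        ]}]},
        {"Instances": [{"PrivateIpAddress": "10.0.0.2", "Tags": [
            {"Key": "Name", "Value": "Balance-OTA-SS_a"},
        ]}]},
    ]
}

summaries = list(instance_summaries(response))
for summary in summaries:
    print(json.dumps(summary))
```

The lookup happens on a value that always exists (the mapping), so exactly one object is produced per instance regardless of which tags are present.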

Adding a new root level property to JSON object using jq

I have a large JSON file (about 7K lines) with deeply nested items which has a missing required property collection that I need to add.
Current JSON object:
{
"item": [
{
"id": "123",
"name": "Customer",
"item": [
{
"id": "456",
"name": "Retrieve a customer"
....
Using a bash script, I need to add a top level property "collection" like this, which still contains the same nested items within it. This is my desired result:
{
"collection": {
"item": [
{
"id": "123",
"name": "Customer",
"item": [
{
"id": "456",
"name": "Retrieve a customer",
....
At the end of the JSON object I also need the matching closing } brace at the end of the file for my newly added collection: key. Is there a way to do this with JQ?
jq '{"collection": .}' <in.json >out.json
And if your JSON is the output of another jq command, just add the collection at the end, like:
# For example: delete an attribute from each entry and then
# wrap the entries in a "records" attribute (assuming
# the data is already a JSON list):
jq '[.[] | del(.undesiredAttribute)] | {"records": .}'
Then the output is:
{"records":[{"name":"Foo"},{"name":"Bar"}]}

How to use jq to reconstruct complete contents of json file, operating only on part of interest?

All the examples I've seen so far "reduce" the output (filter out) some part. I understand how to operate on the part of the input I want to, but I haven't figured out how to output the rest of the content "untouched".
The particular example would be an input file with several high level entries "array1", "field1", "array2", "array3" say. Each array contents is different. The specific processing I want to do is to sort "array1" entries by a "name" field which is doable by:
jq '.array1 | sort_by(.name)' test.json
but I also want this output as "array1" as well as all the other data to be preserved.
Example input:
{
"field1": "value1",
"array1":
[
{ "name": "B", "otherdata": "Bstuff" },
{ "name": "A", "otherdata": "Astuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
Expected output:
{
"field1": "value1",
"array1":
[
{ "name": "A", "otherdata": "Astuff" },
{ "name": "B", "otherdata": "Bstuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
I've tried using map but I can't seem to get the syntax correct to be able to handle any type of input other than the array I want to be sorted by name.
Whenever you use the assignment operators (=, |=, +=, etc.), the context of the expression is kept unchanged. So as long as your top-level filter(s) are assignments, in the end, you'll get the rest of the data (with your changes applied).
In this case, you're just sorting the array1 array so you could just update the array.
.array1 |= sort_by(.name)
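The same idea, update one part and keep everything else, can be sketched in Python for comparison. This is only an illustration under the assumption that the input has the shape shown in the question; the function name sort_array1_by_name is hypothetical, and the array2 and array3 contents are placeholder values.

```python
def sort_array1_by_name(document):
    """Return a copy of document with 'array1' sorted by 'name'; other keys untouched."""
    # Re-assigning an existing key keeps its original position in the dict.
    return {**document, "array1": sorted(document["array1"], key=lambda e: e["name"])}

document = {
    "field1": "value1",
    "array1": [
        {"name": "B", "otherdata": "Bstuff"},
        {"name": "A", "otherdata": "Astuff"},
    ],
    "array2": ["placeholder"],
    "array3": ["placeholder"],
}

result = sort_array1_by_name(document)
print(result)
```

As with the |= operator, the result is the whole document with only the targeted part changed.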

Using jq to list keys in a JSON object

I have a hierarchically deep JSON object created by a scientific instrument, so the file is somewhat large (1.3MB) and not readily readable by people. I would like to get a list of keys, up to a certain depth, for the JSON object. For example, given an input object like this
{
"acquisition_parameters": {
"laser": {
"wavelength": {
"value": 632,
"units": "nm"
}
},
"date": "02/03/2525",
"camera": {}
},
"software": {
"repo": "github.com/username/repo",
"commit": "a7642f",
"branch": "develop"
},
"data": [{},{},{}]
}
I would like an output like such.
{
"acquisition_parameters": [
"laser",
"date",
"camera"
],
"software": [
"repo",
"commit",
"branch"
]
}
This is mainly for the purpose of being able to enumerate what is in a JSON object. After processing the JSON objects from the instrument begin to diverge: for example, some may have a field like .frame.cross_section.stats.fwhm, while others may have .sample.species, so it would be convenient to be able to interrogate the JSON object on the command line.
The following should do exactly what you want
jq '[(keys - ["data"])[] as $key | { ($key): .[$key] | keys }] | add'
This will give the following output, using the input you described above:
{
"acquisition_parameters": [
"camera",
"date",
"laser"
],
"software": [
"branch",
"commit",
"repo"
]
}
Given your purpose you might have an easier time using the paths builtin to list all the paths in the input and then truncate at the desired depth:
$ echo '{"a":{"b":{"c":{"d":true}}}}' | jq -c '[paths|.[0:2]]|unique'
[["a"],["a","b"]]
Here is another variation using reduce and setpath, which assumes you have a specific set of top-level keys you want to examine:
. as $v
| reduce ("acquisition_parameters", "software") as $k (
{}; setpath([$k]; $v[$k] | keys)
)
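For comparison, here is a minimal Python sketch that produces the same two-level listing. Note one difference, which is an assumption of this sketch rather than something from the answers above: instead of excluding "data" by name or naming the keys to examine, it keeps every top-level entry whose value is an object. The function name keys_by_top_level is hypothetical.

```python
def keys_by_top_level(document):
    """Map each object-valued top-level key to the sorted list of its child keys."""
    return {
        key: sorted(value)  # sorted, like jq's `keys`
        for key, value in document.items()
        if isinstance(value, dict)
    }

document = {
    "acquisition_parameters": {
        "laser": {"wavelength": {"value": 632, "units": "nm"}},
        "date": "02/03/2525",
        "camera": {},
    },
    "software": {
        "repo": "github.com/username/repo",
        "commit": "a7642f",
        "branch": "develop",
    },
    "data": [{}, {}, {}],
}

listing = keys_by_top_level(document)
print(listing)
```

Filtering by type means arrays and scalars at the top level are skipped without having to be listed explicitly.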