Remove parent elements with certain key-value pairs using jq

I need to remove elements from a json file based on certain key values. Here is the file I am trying to process.
{
"element1": "Test Element 1",
"element2": {
"tags": "internal",
"data": {
"data1": "Test Data 1",
"data2": "Test Data 2"
}
},
"element3": {
"function1": {
"tags": [
"new",
"internal"
]
},
"data3": "Test Data 3",
"data4": "Test Data 4"
},
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
}
}
I want to remove all elements that contain a "tags" key with the value "internal", whether "tags" is a string or an array. So the result should look like this:
{
"element1": "Test Element 1",
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
}
}
I tried various approaches but could not get it working with jq. Any ideas? Thanks.
To add some more complexity, assume the JSON is:
{
"element1": "Test Element 1",
"element2": {
"tags": "internal",
"data": {
"data1": "Test Data 1",
"data2": "Test Data 2"
}
},
"element3": {
"function1": {
"tags": [
"new",
"internal"
]
},
"data3": "Test Data 3",
"data4": "Test Data 4"
},
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
},
"structure1" : {
"substructure1": {
"element5": "Test Element 5",
"element6": {
"tags": "internal",
"data6": "Test Data 6"
}
}
}
}
and I want to get
{
"element1": "Test Element 1",
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
},
"structure1" : {
"substructure1": {
"element5": "Test Element 5"
}
}
}

This is not trivial. Reliably finding elements that have, somewhere inside them, a tags key whose value is either the string internal or an array containing the string internal requires a compound boolean expression like the one below.
Once found, they can be deleted with the del built-in.
del(.[] | first(select(recurse
| objects
| has("tags") and (.tags
| . == "internal" or (
type == "array" and index("internal")
)
)
)))

I think I figured out how to also solve the more complex case. I am now running:
walk(if type == "object" and has("tags") and (.tags | . == "internal" or (type == "array" and index("internal"))) then del(.) else . end) | delpaths([paths as $path | select(getpath($path) == null) | $path])
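This removes every object whose "tags" contains "internal". Two caveats: the delpaths step also removes any key whose value was already null, and only the object that directly carries the tag is removed, so element3 is kept without function1.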
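A possible single-pass variant of the walk filter above, sketched under the assumption that only the object directly carrying the tag should be dropped. The helper name isInternal and the trimmed sample input are mine, not from the question. It filters object members with with_entries instead of nulling them first, so a key that legitimately holds null (data3 here) survives. It assumes a jq that ships walk (1.6 or later).

```shell
# Trimmed sample, same shape as the second input in the question (assumption).
input='{"element1":"Test Element 1","element3":{"function1":{"tags":["new","internal"]},"data3":null},"structure1":{"substructure1":{"element5":"Test Element 5","element6":{"tags":"internal"}}}}'

# isInternal (hypothetical helper): true for an object whose own .tags is
# "internal" or an array containing "internal".
result=$(printf '%s' "$input" | jq -c '
  def isInternal:
    type == "object" and has("tags")
    and (.tags | . == "internal" or (type == "array" and index("internal") != null));
  walk(if type == "object"
       then with_entries(select(.value | isInternal | not))
       else . end)')

printf '%s\n' "$result"
# {"element1":"Test Element 1","element3":{"data3":null},"structure1":{"substructure1":{"element5":"Test Element 5"}}}
```

Note that element3 stays (minus function1). If a whole top-level entry should go whenever anything inside it is tagged, a containment test over all descendants is needed instead.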

The following solution is written with a helper function for clarity. The helper function uses any for efficiency and is defined so as to add a dash of generality.
To understand the solution, it helps to know about with_entries and any/2, both of which are explained in the jq manual.
# Does the incoming JSON value contain an object which has a .tags
# value that is equal to $value or to an array containing $value ?
def hasTag($value):
any(.. | select(type=="object") | .tags;
. == $value or (type == "array" and index($value)));
Assuming the top-level JSON entity is a JSON object, we can now simply write:
with_entries( select( .value | hasTag("internal") | not) )
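For reference, here is how the two pieces above might be run together from a shell. This is only a sketch: the input is a compacted, slightly trimmed copy of the first sample, and the variable names input and result are mine.

```shell
# Compacted copy of the first sample from the question (trimmed; assumption).
input='{"element1":"Test Element 1","element2":{"tags":"internal","data":{"data1":"Test Data 1"}},"element3":{"function1":{"tags":["new","internal"]},"data3":"Test Data 3"},"element4":{"function2":{"tags":"new"},"data5":"Test Data 5"}}'

# hasTag is defined exactly as above; with_entries keeps only the
# top-level members that do not contain the tag anywhere inside them.
result=$(printf '%s' "$input" | jq -c '
  def hasTag($value):
    any(.. | select(type=="object") | .tags;
        . == $value or (type == "array" and index($value)));
  with_entries(select(.value | hasTag("internal") | not))')

printf '%s\n' "$result"
# {"element1":"Test Element 1","element4":{"function2":{"tags":"new"},"data5":"Test Data 5"}}
```

Because the test looks at every descendant, this works at the top level only: on the second, nested sample it would drop structure1 as a whole rather than just element6.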

Related

jq update json document to alter an array element

I know this has to be simple, but for some reason it's eluding me how to find an element given a condition and modify one of its fields. The doc should be fully output (sed style) with the edit made.
{
"state": "wait",
"steps": {
"step1": [
{ "name":"Foo", "state":"wait" },
{ "name":"Bar", "state":"wait" }
],
"step2": [
{ "name":"Foo", "state":"wait" },
{ "name":"Zoinks", "state":"ready" }
],
"step3": [
{ "name":"Foo", "state":"cancel" }
]
}
}
I'm expecting something like this should be workable.
jq '. | (select(.steps[][].name=="Foo" and .steps[][].state=="wait") |= . + {.state:"Ready"}'
or
jq '. | (select(.steps[][]) | if (.name=="Foo" and .state=="wait") then (.state="Ready") else . end)
The desired output, of course, would be
{
"state": "wait",
"steps": {
"step1": [
{ "name":"Foo", "state":"ready" },
{ "name":"Bar", "state":"wait" }
],
"step2": [
{ "name":"Foo", "state":"ready" },
{ "name":"Zoinks", "state":"ready" }
],
"step3": [
{ "name":"Foo", "state":"cancel" }
]
}
}
Instead, when I'm not getting cryptic errors, I'm either modifying a top-level field in the document, modifying the field for all the elements, or repeating the entire doc multiple times.
Any insights greatly appreciated.
Thanks.
P.S. Is there a better syntax than [] to wildcard the named elements under steps? Or, after the pipe, to identify the indices discovered by the select?

Pipe the output of .steps[][] into a select call that chooses the objects with the desired name and state values, then set the state value on the result.
$ jq '(.steps[][] | select(.name == "Foo" and .state == "wait")).state = "ready"' tmp.json
{
"state": "wait",
"steps": {
"step1": [
{
"name": "Foo",
"state": "ready"
},
{
"name": "Bar",
"state": "wait"
}
],
"step2": [
{
"name": "Foo",
"state": "ready"
},
{
"name": "Zoinks",
"state": "ready"
}
],
"step3": [
{
"name": "Foo",
"state": "cancel"
}
]
}
}
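On the p.s.: one way to see which elements the select picked is to wrap the same expression in path(...). A small sketch, assuming the sample document from the question is compacted into a shell variable (the names input and matched are mine):

```shell
# Compacted copy of the document from the question.
input='{"state":"wait","steps":{"step1":[{"name":"Foo","state":"wait"},{"name":"Bar","state":"wait"}],"step2":[{"name":"Foo","state":"wait"},{"name":"Zoinks","state":"ready"}],"step3":[{"name":"Foo","state":"cancel"}]}}'

# path(...) emits the location of each matched element instead of its value.
matched=$(printf '%s' "$input" | jq -c 'path(.steps[][] | select(.name == "Foo" and .state == "wait"))')

printf '%s\n' "$matched"
# ["steps","step1",0]
# ["steps","step2",0]
```

Each output line is a path array; the trailing number is the array index under the named step. These are the same paths the assignment above writes to.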
You can confirm the edit with diff (the first jq only normalizes the formatting, so that only the changes made by the second one show up in the diff):
$ diff <(jq . tmp.json) <(jq '...' tmp.json)
7c7
< "state": "wait"
---
> "state": "ready"
17c17
< "state": "wait"
---
> "state": "ready"

Split a string and trim a known prefix from each part in a complex JSON structure

I'm dealing with a fairly complex JSON-structure in which a single entry needs to be edited in several places. For example:
[
{
"name": "test 1",
"stuff": {
"properties": {
"id": 0,
"stuff_list": [
{
"entryId": 1,
"description": "- item 1\n- item 2\n- item 3"
},
{
"entryId": 2,
"description": "- item 1\n- item 2\n- item 3"
}
]
}
}
},
{
"name": "test 2",
"stuff": {
"properties": {
"id": 1,
"stuff_list": [
{
"entryId": 1,
"description": null
},
{
"entryId": 2,
"description": "- item 1\n- item 2\n- item 3"
}
]
}
}
}
]
Here I would like to edit each "description" element: the string needs to be split at each \n, and the leading "- " (anything matching "^\n?-\s") needs to be removed from each resulting array element. So it should result in:
{
"entryId": 1,
"description": ["item 1", "item 2", "item 3"]
}
My first approach is:
jq '.[].stuff.properties.stuff_list[].description | split("\n")' the_file.json
but that does not work in the first place because of the null values that can occur in some places. So now I wonder: how can I achieve what I want?

One option is to split() on \n and strip the leading "- " from each part with ltrimstr, leaving null values untouched:
.[].stuff.properties.stuff_list[].description |=
if . != null then
split("\n") | map(ltrimstr("- "))
else
.
end
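As a quick check of how the null entries behave, the filter can be run against a cut-down input. This is a sketch only; the input keeps just the fields needed, and the variable names are mine.

```shell
# Cut-down input: one entry with a description string, one with null.
# The \n stays a JSON escape here because of the single quotes.
input='[{"stuff":{"properties":{"stuff_list":[{"entryId":1,"description":"- item 1\n- item 2\n- item 3"},{"entryId":2,"description":null}]}}}]'

# Same update as above: split and trim strings, pass null through unchanged.
result=$(printf '%s' "$input" | jq -c '
  .[].stuff.properties.stuff_list[].description |=
    if . != null then split("\n") | map(ltrimstr("- ")) else . end')

printf '%s\n' "$result"
# [{"stuff":{"properties":{"stuff_list":[{"entryId":1,"description":["item 1","item 2","item 3"]},{"entryId":2,"description":null}]}}}]
```

Using |= rather than a plain pipe is what keeps the rest of the document in the output.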

How to join JSON objects on particular fields using jq?

I have following JSON structure:
{
"Reservations": [
{
"Id": "R-1",
"CustomerId": "1"
},
{
"Id": "R-2",
"CustomerId": "2"
}
],
"Customers": [
{
"Id": "1",
"Name": "customer 1"
},
{
"Id": "2",
"Name": "customer 2"
},
{
"Id": "3",
"Name": "customer 3"
}
]
}
I want to join Reservations with Customers and get something like this:
{
"ReservationId": "R-1",
"CustomerName": "customer 1"
}
{
"ReservationId": "R-2",
"CustomerName": "customer 2"
}
I've played with jq extensively, tried using multiple filters separated by commas, tried using variables, and read the docs, but it seems like such a simple task is impossible with jq. Or am I missing something?

Here's a simple solution using INDEX/2:
INDEX(.Customers[]; .Id) as $c
| .Reservations[]
| { ReservationId: .Id,
CustomerName: $c[.CustomerId].Name }
If your jq does not have INDEX/2 then now would be a good time to upgrade; otherwise, you can copy-and-paste its def from https://github.com/stedolan/jq/blob/master/src/builtin.jq, or you could use INDEX/3 as defined below.
INDEX/3
def INDEX(s; k; v):
reduce s as $x ({}; .[$x|k] = ($x|v));
INDEX(.Customers[]; .Id; .Name) as $c
| .Reservations[]
| { ReservationId: .Id,
CustomerName: $c[.CustomerId] }
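If a reservation can point at a customer that is not in the list, the lookup simply yields null. A hedged sketch of one way to handle that, using the INDEX/3 definition above plus a // fallback; the R-9 reservation and the "unknown" placeholder are made up for illustration:

```shell
# Sample with one reservation (R-9) whose customer does not exist (assumption).
input='{"Reservations":[{"Id":"R-1","CustomerId":"1"},{"Id":"R-9","CustomerId":"9"}],"Customers":[{"Id":"1","Name":"customer 1"},{"Id":"2","Name":"customer 2"}]}'

# Build an Id -> Name dictionary, then look each reservation up in it.
# The // operator substitutes a placeholder when the lookup yields null.
joined=$(printf '%s' "$input" | jq -c '
  def INDEX(s; k; v): reduce s as $x ({}; .[$x|k] = ($x|v));
  INDEX(.Customers[]; .Id; .Name) as $c
  | .Reservations[]
  | { ReservationId: .Id, CustomerName: ($c[.CustomerId] // "unknown") }')

printf '%s\n' "$joined"
# {"ReservationId":"R-1","CustomerName":"customer 1"}
# {"ReservationId":"R-9","CustomerName":"unknown"}
```

To drop unmatched reservations instead (an inner join), a select on the lookup result before building the object would do.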

Modifying array of key value in JSON jq

Suppose I have an original JSON document like the following:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "user"
}
]
}
]
}
}
I would like to modify, in place, the value for the matching key. I tried:
jq '.taskDefinition.containerDefinitions[0].environment[] | select(.name=="DB_USERNAME") | .value="new"' json
I got the output
{
"name": "DB_USERNAME",
"value": "new"
}
But what I want is more like an in-place edit: the whole original JSON, with only the new value changed, like this:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Is it possible to do with jq or any known workaround?
Thank you.
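Update: for anyone who needs to edit multiple values, here is the approach I use:
JQ=""
for e in DB_HOST=rds DB_USERNAME=xxx; do
  k=${e%%=*}  # name: text before the first "="
  v=${e#*=}   # value: text after the first "="
  JQ+="(.taskDefinition.containerDefinitions[0].environment[] | select(.name==\"$k\") | .value) |= \"$v\" | "
done
# Double quotes let the shell expand $JQ; %?? drops the trailing "| "
jq "${JQ%??}" json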
I think there should be a more concise way, but this seems to work fine.

It is enough to assign to the path, if you are using |=, e.g.
jq '
(.taskDefinition.containerDefinitions[0].environment[] |
select(.name=="DB_USERNAME") | .value) |= "new"
' infile.json
Output:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Here is a select-free solution using |=:
.taskDefinition.containerDefinitions[0].environment |=
map(if .name=="DB_USERNAME" then .value = "new"
else . end)
Avoiding select within the expression on the LHS of |= makes the solution more robust w.r.t. the version of jq being used.
You might like to consider this alternative to using |=:
walk( if type=="object" and .name=="DB_USERNAME"
then .value="new" else . end)
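For the multi-value case raised in the question's update, a possible alternative to building the jq program as a string is to pass the replacements in as data with --argjson. This is a sketch only: the variable name $updates and the extra DB_PORT entry are assumptions added for illustration.

```shell
# Cut-down task definition; DB_PORT is a made-up entry that has no update.
input='{"taskDefinition":{"containerDefinitions":[{"name":"web","environment":[{"name":"DB_HOST","value":"localhost"},{"name":"DB_USERNAME","value":"user"},{"name":"DB_PORT","value":"5432"}]}]}}'

# $updates maps variable names to new values; entries without a match
# keep their current value through the // fallback.
result=$(printf '%s' "$input" | jq -c --argjson updates '{"DB_HOST":"rds","DB_USERNAME":"xxx"}' '
  .taskDefinition.containerDefinitions[0].environment |=
    map(.value = ($updates[.name] // .value))')

printf '%s\n' "$result"
# {"taskDefinition":{"containerDefinitions":[{"name":"web","environment":[{"name":"DB_HOST","value":"rds"},{"name":"DB_USERNAME","value":"xxx"},{"name":"DB_PORT","value":"5432"}]}]}}
```

Since the values travel as JSON rather than being spliced into the program text, quotes or "=" signs inside a value need no escaping. One limitation: because of //, an update whose value is null or false would be ignored.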

JQ: How do I replace keys and values based on regex match?

I have two questions:
1. How can I use jq to find "name" fields that start with an underscore (like _RDS_PASSWORD) and remove the leading underscore (so it becomes RDS_PASSWORD)?
2. For those same entries, how can I use jq to pass the "value" (here cGFzc3dvcmQK) through a base64 decoder (e.g. echo "cGFzc3dvcmQK" | base64 --decode)?
Input:
[
{
"name": "RDS_DB_NAME",
"value": "rds_db_name"
},
{
"name": "RDS_HOSTNAME",
"value": "rds_hostname"
},
{
"name": "RDS_PORT",
"value": "1234"
},
{
"name": "RDS_USERNAME",
"value": "rds_username"
},
{
"name": "_RDS_PASSWORD",
"value": "cGFzc3dvcmQK"
}
]
Desired output:
[
{
"name": "RDS_DB_NAME",
"value": "rds_db_name"
},
{
"name": "RDS_HOSTNAME",
"value": "rds_hostname"
},
{
"name": "RDS_PORT",
"value": "1234"
},
{
"name": "RDS_USERNAME",
"value": "rds_username"
},
{
"name": "RDS_PASSWORD",
"value": "password"
}
]
Q1
walk( if type=="object" and has("name") and .name[0:1] == "_"
then .name |= .[1:]
else .
end)
If your jq does not have walk/1 then you can either upgrade to a more recent version of jq than 1.5, or include its def, which can be found at https://github.com/stedolan/jq/blob/master/src/builtin.jq
Q2
.. | objects | select(has("name") and .name[0:1] == "_") | .value
If you are certain that the encoded value is a UTF-8 string, you could use jq's @base64d; otherwise, invoke jq with the -r option and pipe the results to a decoder, as you indicated you planned to do.
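Both steps can also be combined in a single filter. The following is a sketch, assuming jq 1.6 or later (for @base64d), a top-level array as in the question, and string-valued name fields. The sample value decodes to "password" followed by a newline, so the sketch trims that with rtrimstr to match the desired output.

```shell
# Two entries from the question's input, compacted.
input='[{"name":"RDS_PORT","value":"1234"},{"name":"_RDS_PASSWORD","value":"cGFzc3dvcmQK"}]'

# For entries whose name starts with "_": drop that prefix and replace the
# value with its base64-decoded form, minus one trailing newline.
result=$(printf '%s' "$input" | jq -c '
  map(if (.name | startswith("_"))
      then .name |= ltrimstr("_")
         | .value |= (@base64d | rtrimstr("\n"))
      else . end)')

printf '%s\n' "$result"
# [{"name":"RDS_PORT","value":"1234"},{"name":"RDS_PASSWORD","value":"password"}]
```

If the decoded data might not be valid UTF-8, the -r plus external decoder route is the safer choice.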