How to access key at any level? - json

I have this input file
{
"description": "this is a fake description",
"owner": "john",
"region": "us-east-1",
"topics": {
"For collecting responses for prod1": {
"suffix_name": "response-to-hogwarts.fifo",
"tags": {}
}
},
"queues": {
"For collecting responses and requests from test1": {
"suffix_name": "rr_sf_name.fifo",
"tags": {}
},
"For collecting responses for test2": {
"suffix_name": "response-to-hogwarts.fifo",
"tags": {}
}
},
"subscriptions": {
"For receiving harry_potter requests": {
"topic": {
"suffix_name": "harry_potter.fifo",
"tag": "staging"
},
"queue": {
"suffix_name": "rr_sf_name.fifo"
}
},
"For receiving harry_potter requests from test3": {
"topic": {
"suffix_name": "harry_potter.fifo",
"tag": "harry_potter_hogwarts"
},
"queue": {
"suffix_name": "rr_sf_name.fifo"
}
},
"For receiving demo requests": {
"topic": {
"suffix_name": "demo.fifo",
"tag": "staging"
},
"queue": {
"suffix_name": "rr_sf_name.fifo"
}
},
"For receiving demo requests from test4 through connector": {
"topic": {
"suffix_name": "demo.fifo",
"tag": "harry_potter_hogwarts"
},
"queue": {
"suffix_name": "rr_sf_name.fifo"
}
},
"For receiving testing responses for hogwarts": {
"topic": {
"suffix_name": "response-to-hogwarts.fifo"
},
"queue": {
"suffix_name": "response-to-hogwarts.fifo"
}
}
}
}
Now I want to remove the ".fifo" suffix from the value of ALL the suffix_name fields, at any level
Example output (truncated for brevity)
"topics": {
"For collecting responses for prod1": {
"suffix_name": "response-to-hogwarts",
"tags": {}
}
},
Now I came up with this, which is working for me.
.queues |= with_entries(.value.suffix_name |= sub(".fifo";""))
| .topics |= with_entries(.value.suffix_name |= sub(".fifo";""))
| .subscriptions |= with_entries(.value |= with_entries(.value |= with_entries(.value |= rtrimstr(".fifo"))))
I want to know, if there is a better way, using recurse or something, to parse through all keys and if the key is suffix_name, trim the ".fifo" from the value.

You could use walk:
walk(if type=="object" and .suffix_name
then .suffix_name |= sub("[.]fifo$";"") else . end)
Alternatively, just use |=:
(.. | select(type == "object" and .suffix_name) | .suffix_name)
|= sub("[.]fifo$";"")

An alternative to walk, using the path functions getpath and setpath:
reduce ( paths | select(.[-1] | endswith("suffix_name")? ) ) as $p
( .; setpath($p; getpath($p) | sub("[.]fifo$";"") ) )
This identifies the paths from the root to each suffix_name key and iterates over them with reduce. For each path, setpath writes back the value with the ".fifo" suffix removed.
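For readers who want to check the logic outside jq, here is a minimal Python sketch of the same recursive idea as walk: visit every object and strip the suffix wherever the key matches. The function name strip_suffix_everywhere and its parameters are made up for illustration, and the sample document is a trimmed copy of the input above.

```python
def strip_suffix_everywhere(node, key="suffix_name", suffix=".fifo"):
    """Return a copy of node with suffix removed from every string stored under key."""
    if isinstance(node, dict):
        result = {}
        for k, v in node.items():
            if k == key and isinstance(v, str) and v.endswith(suffix):
                # Only trim a literal trailing suffix, like sub("[.]fifo$"; "") in jq.
                result[k] = v[: -len(suffix)]
            else:
                result[k] = strip_suffix_everywhere(v, key, suffix)
        return result
    if isinstance(node, list):
        return [strip_suffix_everywhere(item, key, suffix) for item in node]
    return node  # scalars are returned unchanged


# Trimmed copy of the question's input (assumption: only a few entries kept).
doc = {
    "topics": {
        "For collecting responses for prod1": {
            "suffix_name": "response-to-hogwarts.fifo",
            "tags": {},
        }
    },
    "subscriptions": {
        "For receiving demo requests": {
            "topic": {"suffix_name": "demo.fifo", "tag": "staging"},
            "queue": {"suffix_name": "rr_sf_name.fifo"},
        }
    },
}

cleaned = strip_suffix_everywhere(doc)
print(cleaned["topics"]["For collecting responses for prod1"]["suffix_name"])
```

Because the function builds a new structure instead of mutating its argument, the original document stays intact, which matches how a jq filter behaves.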

Related

Extract a subset of deeply nested JSON and print only the key/value pairs I am interested in

I have a deeply nested JSON file.
I want to extract and parse only the subset I am interested in, in my case everything under the 'node' key.
How can I:
extract the subset of this JSON file that contains "edges[].node" (edges is the parent key of node)
within each 'node' object, keep only these key/value pairs:
.url,
.headline.default, (*this one is a grandchild of the 'node' key*)
.firstPublished
I want to keep only the three items above inside the 'node' key.
How can I print the slimmed-down version of the JSON file?
A nice-to-have: can I also keep the structure, i.e. the full path from the JSON root down to the 'node' subset I am interested in?
Here is the full content of my JSON file:
{
"data": {
"legacyCollection": {
"longDescription": "The latest news, analysis and investigations from Europe.",
"section": {
"name": "world",
"url": "/section/world"
},
"collectionsPage": {
"stream": {
"pageInfo": {
"hasNextPage": true,
"__typename": "PageInfo"
},
"__typename": "AssetsConnection",
"edges": [
{
"node": {
"url": "https://www.nytimes.com/video/world/europe/100000008323381/icc-war-crimes-ukraine.html",
"firstPublished": "2022-04-27T23:28:33.241Z",
"headline": {
"default": "I.C.C. Joins Investigation of War Crimes in Ukraine",
"__typename": "CreativeWorkHeadline"
},
"summary": "Karim Khan, the chief prosecutor of the International Criminal Court, said that his organization would participate in a joint effort — with Ukraine, Poland and Lithuania — to investigate war crimes committed since Russia’s invasion.",
"promotionalMedia": {
"__typename": "Image",
"id": "SW1hZ2U6bnl0Oi8vaW1hZ2UvYTY3MTVhNDUtZDE0NS01OWZjLThkZWItNzYxMWViN2UyODhk"
},
"embedded": false
},
"__typename": "AssetsEdge"
},
{
"node": {
"__typename": "Article",
"url": "https://www.nytimes.com/2022/04/27/sports/soccer/chelsea-sale-roman-abramovich.html",
"firstPublished": "2022-04-27T19:42:17.000Z",
"typeOfMaterials": [
"News"
],
"archiveProperties": {
"lede": "",
"__typename": "ArticleArchiveProperties"
},
"headline": {
"default": "Endgame Nears in Bidding for Chelsea F.C.",
"__typename": "CreativeWorkHeadline"
},
"summary": "The American bank selling the English soccer team on behalf of its Russian owner could name its preferred suitor by the end of the week. But the drama isn’t over.",
"translations": []
},
"__typename": "AssetsEdge"
}
],
"totalCount": 52559
}
},
"sourceId": "100000004047788",
"tagline": "",
"__typename": "LegacyCollection"
}
}
}
Here is the command I have so far:
.data.legacyCollection.collectionsPage.stream.edges[].node|= with_entries(select([.key]|inside(["default","url","firstPublished"])))
And here is the output I get:
{
"data": {
"legacyCollection": {
"longDescription": "The latest news, analysis and investigations from Europe.",
"section": {
"name": "world",
"url": "/section/world"
},
"collectionsPage": {
"stream": {
"pageInfo": {
"hasNextPage": true,
"__typename": "PageInfo"
},
"__typename": "AssetsConnection",
"edges": [
{
"node": {
"url": "https://www.nytimes.com/video/world/europe/100000008323381/icc-war-crimes-ukraine.html",
"firstPublished": "2022-04-27T23:28:33.241Z"
},
"__typename": "AssetsEdge"
},
{
"node": {
"url": "https://www.nytimes.com/2022/04/27/sports/soccer/chelsea-sale-roman-abramovich.html",
"firstPublished": "2022-04-27T19:42:17.000Z"
},
"__typename": "AssetsEdge"
}
],
"totalCount": 52559
}
},
"sourceId": "100000004047788",
"tagline": "",
"__typename": "LegacyCollection"
}
}
}
Here is the output I expect to have
{
"data": {
"legacyCollection": {
"collectionsPage": {
"stream": {
"edges": [
{
"node": {
"url": "https://www.nytimes.com/video/world/europe/100000008323381/icc-war-crimes-ukraine.html",
"firstPublished": "2022-04-27T23:28:33.241Z"
}
},
{
"node": {
"url": "https://www.nytimes.com/2022/04/27/sports/soccer/chelsea-sale-roman-abramovich.html",
"firstPublished": "2022-04-27T19:42:17.000Z"
}
}
]
}
}
}
}
}
Here's a (somewhat) declarative solution:
(.data.legacyCollection.collectionsPage.stream.edges
| map( {node: (.node
| {url,
firstPublished,
headline: {default: .headline.default} })})) as $edges
| {data: {
legacyCollection: {
collectionsPage: {
stream: {
$edges
}
}
}
}
}
Here's one way to make the selection while ensuring that the structure is preserved. This solution may be of interest because it can easily be adapted for use with jq's "--stream" option.
def array_startswith($head): .[: $head|length] == $head;
. as $in
| ["data", "legacyCollection", "collectionsPage", "stream", "edges"] as $head
| ($head|length) as $len
| reduce (paths
| select( array_startswith($head) and .[1+$len] == "node" )) as $p
(null;
if ((($p|length) == $len + 3) and ($p[-1] | IN("url", "firstPublished")))
or ((($p|length) == $len + 4) and $p[-2:] == ["headline", "default"])
then setpath($p; $in | getpath($p))
else .
end)
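The same selection (keep url, firstPublished and headline.default under each node, and preserve the path from the root) can also be sketched in Python. This is only an illustration under assumptions: the names HEAD and slim_document are hypothetical, and the sample input is a heavily shortened stand-in for the document above.

```python
# Path from the root down to the list of edges, taken from the question's document.
HEAD = ("data", "legacyCollection", "collectionsPage", "stream", "edges")


def slim_document(doc, head=HEAD):
    """Rebuild the document keeping only three fields under each edges[].node."""
    edges = doc
    for key in head:
        edges = edges[key]  # descend to the edges array

    slim_edges = [
        {
            "node": {
                "url": edge["node"]["url"],
                "firstPublished": edge["node"]["firstPublished"],
                "headline": {"default": edge["node"]["headline"]["default"]},
            }
        }
        for edge in edges
    ]

    # Wrap the slimmed list back into the original chain of parent keys.
    result = slim_edges
    for key in reversed(head):
        result = {key: result}
    return result


# Shortened sample input (assumption: placeholder values, one edge only).
sample = {
    "data": {
        "legacyCollection": {
            "tagline": "",
            "collectionsPage": {
                "stream": {
                    "totalCount": 52559,
                    "edges": [
                        {
                            "node": {
                                "url": "https://example.invalid/a.html",
                                "firstPublished": "2022-04-27T23:28:33.241Z",
                                "headline": {"default": "Title A", "__typename": "X"},
                                "embedded": False,
                            },
                            "__typename": "AssetsEdge",
                        }
                    ],
                }
            },
        }
    }
}

slim = slim_document(sample)
print(slim)
```

Everything outside the chosen path (tagline, totalCount, __typename and so on) is dropped because the result is built from scratch rather than by deleting keys.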

Decrypt values with the same key at different levels from base64

My input is shown below. I want to search for the SearchString key (as you can see, a fixed index cannot be used for it) and, wherever the key appears, decode its value from base64 (perhaps using the @base64d filter). Is this possible with jq? If so, how?
[
{
"Name": "searchblock",
"Priority": 3,
"Statement": {
"RateBasedStatement": {
"Limit": 100,
"AggregateKeyType": "IP",
"ScopeDownStatement": {
"ByteMatchStatement": {
"SearchString": "Y2F0YWxvZ3NlYXJjaA==",
"FieldToMatch": {
"UriPath": {}
},
"TextTransformations": [
{
"Priority": 0,
"Type": "LOWERCASE"
}
],
"PositionalConstraint": "CONTAINS"
}
}
}
},
"Action": {
"Block": {}
},
"VisibilityConfig": {
"SampledRequestsEnabled": true,
"CloudWatchMetricsEnabled": true,
"MetricName": "searchblock"
}
},
{
"Name": "bot-block",
"Priority": 4,
"Statement": {
"ByteMatchStatement": {
"SearchString": "Ym90",
"FieldToMatch": {
"SingleHeader": {
"Name": "user-agent"
}
},
"TextTransformations": [
{
"Priority": 0,
"Type": "LOWERCASE"
}
],
"PositionalConstraint": "CONTAINS"
}
},
"Action": {
"Allow": {}
},
"VisibilityConfig": {
"SampledRequestsEnabled": true,
"CloudWatchMetricsEnabled": true,
"MetricName": "user-agent"
}
}
]
We use path, paths, getpath, and setpath built-ins for such operations when a fixed path is not available.
getpath(paths | select(.[-1] == "SearchString")) |= @base64d
walk is quite intuitive for this kind of task:
walk(if type == "object" and .SearchString
then .SearchString |= @base64d else . end)
Using this approach, it's also trivial to modify the program to make it more robust, e.g. to check that .SearchString is a string:
walk(if type == "object" and (.SearchString|type) == "string"
then .SearchString |= @base64d else . end)
Note: if your jq does not include walk, you can simply copy its def from any reputable web site, or from https://github.com/stedolan/jq/blob/master/src/builtin.jq
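As a cross-check of what the walk-based filter does, here is a small Python sketch that decodes every string found under a SearchString key, at any depth. The function name decode_search_strings is an assumption for illustration, and the sample is a cut-down version of the rules in the question.

```python
import base64


def decode_search_strings(node, key="SearchString"):
    """Return a copy of node with every string under key base64-decoded to text."""
    if isinstance(node, dict):
        return {
            k: (
                base64.b64decode(v).decode("utf-8")
                if k == key and isinstance(v, str)  # same robustness check as the jq version
                else decode_search_strings(v, key)
            )
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [decode_search_strings(item, key) for item in node]
    return node


# Cut-down copy of the question's input: the key sits at different depths.
rules = [
    {
        "Name": "searchblock",
        "Statement": {
            "RateBasedStatement": {
                "ScopeDownStatement": {
                    "ByteMatchStatement": {"SearchString": "Y2F0YWxvZ3NlYXJjaA=="}
                }
            }
        },
    },
    {
        "Name": "bot-block",
        "Statement": {"ByteMatchStatement": {"SearchString": "Ym90"}},
    },
]

decoded = decode_search_strings(rules)
print(decoded[1]["Statement"]["ByteMatchStatement"]["SearchString"])
```

Note that base64 is an encoding, not encryption, so no key is involved in recovering the text.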

Reconstructing JSON with jq

I have a JSON like this (sample.json):
{
"sheet1": [
{
"hostname": "sv001",
"role": "web",
"ip1": "172.17.0.3"
},
{
"hostname": "sv002",
"role": "web",
"ip1": "172.17.0.4"
},
{
"hostname": "sv003",
"role": "db",
"ip1": "172.17.0.5",
"ip2": "172.18.0.5"
}
],
"sheet2": [
{
"hostname": "sv004",
"role": "web",
"ip1": "172.17.0.6"
},
{
"hostname": "sv005",
"role": "db",
"ip1": "172.17.0.7"
},
{
"hostname": "vsv006",
"role": "db",
"ip1": "172.17.0.8"
}
],
"sheet3": []
}
I want to extract data like this:
sheet1
jq '(something command)' sample.json
{
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4"
]
},
"db": {
"hosts": [
"172.17.0.5"
]
}
}
Is it possible to perform the reconstruction with jq map?
(I will reuse the result for ansible inventory.)
Here's a short, straightforward and efficient solution. It is efficient in part because it avoids group_by, thanks to the following generic helper function:
def add_by(f;g): reduce .[] as $x ({}; .[$x|f] += [$x|g]);
.sheet1
| add_by(.role; .ip1)
| map_values( {hosts: .} )
Output
This produces the required output:
{
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4"
]
},
"db": {
"hosts": [
"172.17.0.5"
]
}
}
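The add_by helper is essentially a one-pass grouping: for each item, compute a key and append a derived value to that key's list. A rough Python equivalent, with the same hypothetical name add_by and the sheet1 data from the question, might look like this:

```python
def add_by(items, key_fn, value_fn):
    """Group value_fn(item) under key_fn(item) in a single pass, keeping first-seen order."""
    grouped = {}
    for item in items:
        grouped.setdefault(key_fn(item), []).append(value_fn(item))
    return grouped


sheet1 = [
    {"hostname": "sv001", "role": "web", "ip1": "172.17.0.3"},
    {"hostname": "sv002", "role": "web", "ip1": "172.17.0.4"},
    {"hostname": "sv003", "role": "db", "ip1": "172.17.0.5", "ip2": "172.18.0.5"},
]

# Equivalent of: add_by(.role; .ip1) | map_values({hosts: .})
by_role = add_by(sheet1, lambda host: host["role"], lambda host: host["ip1"])
inventory = {role: {"hosts": ips} for role, ips in by_role.items()}
print(inventory)
```

As in the jq version, only ip1 is collected, so the ip2 address of sv003 does not appear in the result.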
If the goal is to regroup the ips by their roles within each sheet you could do this:
map_values(
reduce group_by(.role)[] as $g ({};
.[$g[0].role].hosts = [$g[] | del(.hostname, .role)[]]
)
)
Which produces something like this:
{
"sheet1": {
"db": {
"hosts": [
"172.17.0.5",
"172.18.0.5"
]
},
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4"
]
}
},
"sheet2": {
"db": {
"hosts": [
"172.17.0.7",
"172.17.0.8"
]
},
"web": {
"hosts": [
"172.17.0.6"
]
}
},
"sheet3": {}
}
https://jqplay.org/s/3VpRc5l4_m
If you want to flatten everything into a single object keeping only unique ips, most of the filter stays the same: flatten the inputs before grouping and drop the map_values/1 call.
$ jq -n '
reduce ([inputs[][]] | group_by(.role)[]) as $g ({};
.[$g[0].role].hosts = ([$g[] | del(.hostname, .role)[]] | unique)
)
' sample.json
{
"db": {
"hosts": [
"172.17.0.5",
"172.17.0.7",
"172.17.0.8",
"172.18.0.5"
]
},
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4",
"172.17.0.6"
]
}
}
https://jqplay.org/s/ZGj1wC8hU3

Modifying array of key value in JSON jq

Suppose I have an original JSON document that looks like the following:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "user"
}
]
}
]
}
}
And I would like to modify, in place, the value for the matching key, like so:
jq '.taskDefinition.containerDefinitions[0].environment[] | select(.name=="DB_USERNAME") | .value="new"' json
I got the output
{
"name": "DB_USERNAME",
"value": "new"
}
But I want an in-place modification, i.e. the whole original JSON with only that value changed, like this:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Is it possible to do with jq or any known workaround?
Thank you.
Updated
For anyone looking to edit multiple values, here is the approach I use:
JQ=""
for e in DB_HOST=rds DB_USERNAME=xxx; do
k=${e%=*}
v=${e##*=}
JQ+="(.taskDefinition.containerDefinitions[0].environment[] | select(.name==\"$k\") | .value) |= \"$v\" | "
done
# Double quotes are required here so the shell expands the variable; %?? drops the trailing "| ".
jq "${JQ%??}" json
I think there should be a more concise way, but this seems to work fine.
It is enough to assign to the path, if you are using |=, e.g.
jq '
(.taskDefinition.containerDefinitions[0].environment[] |
select(.name=="DB_USERNAME") | .value) |= "new"
' infile.json
Output:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Here is a select-free solution using |=:
.taskDefinition.containerDefinitions[0].environment |=
map(if .name=="DB_USERNAME" then .value = "new"
else . end)
Avoiding select within the expression on the LHS of |= makes the solution more robust w.r.t. the version of jq being used.
You might like to consider this alternative to using |=:
walk( if type=="object" and .name=="DB_USERNAME"
then .value="new" else . end)
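For comparison, the multi-value update from the question can be sketched in Python as a single pass over the environment list. The function name set_env and the container_index parameter are assumptions made for this example; the input is the task definition shown above.

```python
import copy


def set_env(task, updates, container_index=0):
    """Return a copy of task with matching environment entries set to new values."""
    result = copy.deepcopy(task)  # leave the caller's document untouched
    container = result["taskDefinition"]["containerDefinitions"][container_index]
    for entry in container["environment"]:
        if entry["name"] in updates:
            entry["value"] = updates[entry["name"]]
    return result


task = {
    "taskDefinition": {
        "containerDefinitions": [
            {
                "name": "web",
                "image": "my-image",
                "environment": [
                    {"name": "DB_HOST", "value": "localhost"},
                    {"name": "DB_USERNAME", "value": "user"},
                ],
            }
        ]
    }
}

updated = set_env(task, {"DB_HOST": "rds", "DB_USERNAME": "xxx"})
print(updated["taskDefinition"]["containerDefinitions"][0]["environment"])
```

Unlike the walk variant, this sketch only touches entries inside the chosen container's environment array, so an unrelated object that happens to have a matching name elsewhere in the document is not modified.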

Filtering cloudformation stack resources using JQ

I'm trying to write a JQ-filter for filtering specific resources from an AWS cloudformation template based on resource properties.
For example, when starting from the following (shortened) cloudformation template:
{
"Resources": {
"vpc001": {
"Type": "AWS::EC2::VPC",
"Properties": {
"CidrBlock": "10.1.0.0/16",
"InstanceTenancy": "default",
"EnableDnsSupport": "true",
"EnableDnsHostnames": "true"
}
},
"ig001": {
"Type": "AWS::EC2::InternetGateway",
"Properties": {
"Tags": [
{
"Key": "Name",
"Value": "ig001"
}
]
}
}
}
}
I would like to construct a jq-filter enabling me to filter out specific resources based on (one or multiple) of their property fields.
For example:
when filtering for Type="AWS::EC2::InternetGateway" the result should be
{
"Resources": {
"ig001": {
"Type": "AWS::EC2::InternetGateway",
"Properties": {
"Tags": [
{
"Key": "Name",
"Value": "ig001"
}
]
}
}
}
}
An added bonus would be to be able to filter on a 'OR'-ed combination of values.
As such a filter for "AWS::EC2::InternetGateway" OR "AWS::EC2::VPC" should yield the original document.
Any suggestion or insight would be greatly appreciated.
Tx!
@hek2mgl's suggestion may be sufficient for your purposes, but it doesn't quite produce the answer you requested. Here's one very similar solution that does. It uses a generalization of jq's map() and map_values() filters that is often useful anyway:
def mapper(f):
if type == "array" then map(f)
elif type == "object" then
. as $in
| reduce keys[] as $key
({};
[$in[$key] | f ] as $value
| if $value | length == 0 then . else . + {($key): $value[0]}
end)
else .
end;
.Resources |= mapper(select(.Type=="AWS::EC2::VPC"))
Using your example input:
$ jq -f resources.jq resources.json
{
"Resources": {
"vpc001": {
"Type": "AWS::EC2::VPC",
"Properties": {
"CidrBlock": "10.1.0.0/16",
"InstanceTenancy": "default",
"EnableDnsSupport": "true",
"EnableDnsHostnames": "true"
}
}
}
}
As @hek2mgl pointed out, it's now trivial to specify a more complex selection criterion.
Here is a solution which uses a separate function to select all resources matching a specified condition which is passed a {key,value} pair for each resource.
def condition:
.value.Type == "AWS::EC2::VPC"
;
{
Resources: .Resources | with_entries(select(condition))
}
Output from sample data:
{
"Resources": {
"vpc001": {
"Type": "AWS::EC2::VPC",
"Properties": {
"CidrBlock": "10.1.0.0/16",
"InstanceTenancy": "default",
"EnableDnsSupport": "true",
"EnableDnsHostnames": "true"
}
}
}
}
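The OR-ed variant asked for in the question amounts to testing each resource's Type against a set of accepted values. A minimal Python sketch of that idea follows; the name filter_resources is hypothetical, and, like the jq filter above, it returns only the Resources key and drops any other top-level keys of the template.

```python
def filter_resources(template, wanted_types):
    """Keep only the resources whose Type is one of wanted_types."""
    wanted = set(wanted_types)
    return {
        "Resources": {
            name: resource
            for name, resource in template.get("Resources", {}).items()
            if resource.get("Type") in wanted
        }
    }


template = {
    "Resources": {
        "vpc001": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.1.0.0/16"},
        },
        "ig001": {
            "Type": "AWS::EC2::InternetGateway",
            "Properties": {"Tags": [{"Key": "Name", "Value": "ig001"}]},
        },
    }
}

only_gateways = filter_resources(template, ["AWS::EC2::InternetGateway"])
both = filter_resources(template, ["AWS::EC2::InternetGateway", "AWS::EC2::VPC"])
print(list(only_gateways["Resources"]))
```

Passing both types returns the original Resources section, which is the behaviour the question describes for the OR case.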
Use the select() function:
jq '.Resources[]|select(.Type=="AWS::EC2::VPC")' aws.json
You can use or if you want to filter by multiple conditions, like this:
jq '.Resources[]|select(.Type=="AWS::EC2::VPC" or .Type=="foo")' aws.json
Use aws cli's --query parameter.
Completely eliminates the need for jq.
http://docs.aws.amazon.com/cli/latest/userguide/controlling-output.html#controlling-output-filter
I found one way to do this without defining a function:
jq '.Resources | to_entries[] | select(.value.Type == "AWS::EC2::InternetGateway")|[{key: .key, value: .value}]|from_entries' example.json
{
"ig001": {
"Type": "AWS::EC2::InternetGateway",
"Properties": {
"Tags": [
{
"Key": "Name",
"Value": "ig001"
}
]
}
}
}