jq: insert a {"key": "value"} pair in a nested structure

I am wondering how to achieve this with jq (I am using the online playground).
AS_IS
{
"rules": {
"aaa": {
"xxx": {
"url": "https://microsoft.com"
},
"xxxx": {
"url": "https://netflix.com"
}
},
"bbb": {
"xxx": {
"url": "https://amazon.com"
}
},
"ccc": {
"xxx": {
"url": "https://google.com"
}
}
}
}
TO_BE
{
"rules": {
"aaa": {
"xxx": {
"url": "https://microsoft.com"
},
"xxxx": {
"url": "https://netflix.com"
}
},
"bbb": {
"xxx": {
"url": "https://amazon.com"
}
},
"ccc": {
"xxx": {
"url": "https://google.com",
"pass": "abc"
}
}
}
}
What I tried so far is
.rules[] | select(.xxx.url | try contains("google.com")) | .xxx += {"pass": "abc"}
The output is below; the "pass": "abc" pair is inserted successfully.
{
"xxx": {
"url": "https://google.com",
"pass": "abc"
}
}
However, I want the whole document in the output, not just the matched part.
ref. https://jqplay.org/s/S_UWrrLRl5

The idea is right, but the pipeline narrows the output to the object you selected, so only that part is printed. Wrap the path expression in parentheses so that the update is applied to the matched .xxx while the whole document, from the root, is returned:
( .rules[] | select(.xxx.url | try contains("google.com")) | .xxx ) += {"pass": "abc"}
Prefer an exact match condition over a partial match where possible:
( .rules[] | select(.xxx.url == "https://google.com") | .xxx ) += {"pass": "abc"}
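For readers who want to cross-check the behaviour outside jq, here is a minimal Python sketch of the same idea: update the matching entry in place and return the whole document rather than the matched fragment. The function name add_pair and its parameters are hypothetical, chosen only for illustration, and the sketch assumes the "rules" layout shown in the question.

```python
import copy


def add_pair(doc, url, extra):
    """Return a copy of doc in which every rule whose xxx.url equals url
    has the key/value pairs from extra merged into its xxx object."""
    result = copy.deepcopy(doc)  # leave the caller's document untouched
    for rule in result.get("rules", {}).values():
        if not isinstance(rule, dict):
            continue
        xxx = rule.get("xxx")
        # Exact match on the URL, mirroring the == condition above.
        if isinstance(xxx, dict) and xxx.get("url") == url:
            xxx.update(extra)
    return result  # the whole document, not just the matched part
```

Calling add_pair(doc, "https://google.com", {"pass": "abc"}) on the AS_IS input should give the TO_BE structure.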

Here's an even more "dynamic" approach:
walk(if type=="object"
and (.url|type)=="string"
and (.url|endswith("google.com"))
then .pass = "abc" else . end)
Of course, depending on the task requirements, a simpler solution such as the following might suffice:
.rules.ccc.xxx.pass = "abc"
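As a rough illustration of what walk does, the following Python sketch visits every value bottom-up and applies a function to it, which is how the "dynamic" filter above finds the object no matter where it sits. The names walk and add_pass are defined here for the example and are not part of any library.

```python
def walk(value, f):
    """Apply f to every value in a nested dict/list structure, children first."""
    if isinstance(value, dict):
        value = {k: walk(v, f) for k, v in value.items()}
    elif isinstance(value, list):
        value = [walk(v, f) for v in value]
    return f(value)


def add_pass(node):
    """Add "pass": "abc" to any object whose url is a string ending in google.com."""
    if (isinstance(node, dict)
            and isinstance(node.get("url"), str)
            and node["url"].endswith("google.com")):
        return {**node, "pass": "abc"}
    return node
```

walk(doc, add_pass) returns a new document; the input is not modified.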

Related

extract a subset of deep embed json and print only key,value pair I am interested in the subset json

I have a deeply nested JSON file.
I want to extract and parse only the subset I am interested in, which in my case is everything under the 'node' key.
How can I:
extract the subset of this JSON file that contains "edges[].node" (edges is the parent key of node)?
within each 'node' object, keep only these key/value pairs:
.url,
.headline.default, (*this one is a grandchild of the 'node' key*)
.firstPublished
I want to keep only the three items above inside the 'node' key.
How can I print this slimmed-down version of the JSON file?
Nice to have: can I keep the structure (the full path from the JSON root down to the nested 'node' subset I am interested in)?
The full content of my JSON file is below:
{
"data": {
"legacyCollection": {
"longDescription": "The latest news, analysis and investigations from Europe.",
"section": {
"name": "world",
"url": "/section/world"
},
"collectionsPage": {
"stream": {
"pageInfo": {
"hasNextPage": true,
"__typename": "PageInfo"
},
"__typename": "AssetsConnection",
"edges": [
{
"node": {
"url": "https://www.nytimes.com/video/world/europe/100000008323381/icc-war-crimes-ukraine.html",
"firstPublished": "2022-04-27T23:28:33.241Z",
"headline": {
"default": "I.C.C. Joins Investigation of War Crimes in Ukraine",
"__typename": "CreativeWorkHeadline"
},
"summary": "Karim Khan, the chief prosecutor of the International Criminal Court, said that his organization would participate in a joint effort — with Ukraine, Poland and Lithuania — to investigate war crimes committed since Russia’s invasion.",
"promotionalMedia": {
"__typename": "Image",
"id": "SW1hZ2U6bnl0Oi8vaW1hZ2UvYTY3MTVhNDUtZDE0NS01OWZjLThkZWItNzYxMWViN2UyODhk"
},
"embedded": false
},
"__typename": "AssetsEdge"
},
{
"node": {
"__typename": "Article",
"url": "https://www.nytimes.com/2022/04/27/sports/soccer/chelsea-sale-roman-abramovich.html",
"firstPublished": "2022-04-27T19:42:17.000Z",
"typeOfMaterials": [
"News"
],
"archiveProperties": {
"lede": "",
"__typename": "ArticleArchiveProperties"
},
"headline": {
"default": "Endgame Nears in Bidding for Chelsea F.C.",
"__typename": "CreativeWorkHeadline"
},
"summary": "The American bank selling the English soccer team on behalf of its Russian owner could name its preferred suitor by the end of the week. But the drama isn’t over.",
"translations": []
},
"__typename": "AssetsEdge"
}
],
"totalCount": 52559
}
},
"sourceId": "100000004047788",
"tagline": "",
"__typename": "LegacyCollection"
}
}
}
Here is the command I have so far:
.data.legacyCollection.collectionsPage.stream.edges[].node |= with_entries(select([.key]|inside(["default","url","firstPublished"])))
And here is the output I got
{
"data": {
"legacyCollection": {
"longDescription": "The latest news, analysis and investigations from Europe.",
"section": {
"name": "world",
"url": "/section/world"
},
"collectionsPage": {
"stream": {
"pageInfo": {
"hasNextPage": true,
"__typename": "PageInfo"
},
"__typename": "AssetsConnection",
"edges": [
{
"node": {
"url": "https://www.nytimes.com/video/world/europe/100000008323381/icc-war-crimes-ukraine.html",
"firstPublished": "2022-04-27T23:28:33.241Z"
},
"__typename": "AssetsEdge"
},
{
"node": {
"url": "https://www.nytimes.com/2022/04/27/sports/soccer/chelsea-sale-roman-abramovich.html",
"firstPublished": "2022-04-27T19:42:17.000Z"
},
"__typename": "AssetsEdge"
}
],
"totalCount": 52559
}
},
"sourceId": "100000004047788",
"tagline": "",
"__typename": "LegacyCollection"
}
}
}
Here is the output I expect to have
{
"data": {
"legacyCollection": {
"collectionsPage": {
"stream": {
"edges": [
{
"node": {
"url": "https://www.nytimes.com/video/world/europe/100000008323381/icc-war-crimes-ukraine.html",
"firstPublished": "2022-04-27T23:28:33.241Z"
}
},
{
"node": {
"url": "https://www.nytimes.com/2022/04/27/sports/soccer/chelsea-sale-roman-abramovich.html",
"firstPublished": "2022-04-27T19:42:17.000Z"
}
}
]
}
}
}
}
}
Here's a (somewhat) declarative solution:
(.data.legacyCollection.collectionsPage.stream.edges
| map( {node: (.node
| {url,
firstPublished,
headline: {default: .headline.default} })})) as $edges
| {data: {
legacyCollection: {
collectionsPage: {
stream: {
$edges
}
}
}
}
}
Here's one way to make the selection while ensuring that the structure is preserved. This solution may be of interest because it can easily be adapted for use with jq's "--stream" option.
def array_startswith($head): .[: $head|length] == $head;
. as $in
| ["data", "legacyCollection", "collectionsPage", "stream", "edges"] as $head
| ($head|length) as $len
| reduce (paths
| select( array_startswith($head) and .[1+$len] == "node" )) as $p
(null;
if ((($p|length) == $len + 3) and ($p[-1] | IN("url", "firstPublished")))
or ((($p|length) == $len + 4) and $p[-2:] == ["headline", "default"])
then setpath($p; $in | getpath($p))
else .
end)
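The path-based approach above can be sketched in Python as follows: enumerate the paths of the input, keep only those that match, and rebuild a fresh document from the surviving paths. This is an approximation under stated assumptions: it walks leaf paths only (jq's paths also emits paths to intermediate containers), and leaf_paths, set_path and slim are hypothetical helper names.

```python
HEAD = ("data", "legacyCollection", "collectionsPage", "stream", "edges")
KEEP_TAILS = (("url",), ("firstPublished",), ("headline", "default"))


def leaf_paths(value, prefix=()):
    """Yield (path, leaf) pairs for every scalar in a nested dict/list."""
    if isinstance(value, dict):
        for k, v in value.items():
            yield from leaf_paths(v, prefix + (k,))
    elif isinstance(value, list):
        for i, v in enumerate(value):
            yield from leaf_paths(v, prefix + (i,))
    else:
        yield prefix, value


def set_path(root, path, leaf):
    """Set leaf at path, creating dicts for string keys and lists for integer keys."""
    if not path:
        return leaf
    key, rest = path[0], path[1:]
    if isinstance(key, int):
        root = [] if root is None else root
        while len(root) <= key:
            root.append(None)  # pad skipped indexes, as jq pads with null
        root[key] = set_path(root[key], rest, leaf)
    else:
        root = {} if root is None else root
        root[key] = set_path(root.get(key), rest, leaf)
    return root


def slim(doc):
    """Keep only HEAD[i].node.{url, firstPublished, headline.default}."""
    out, n = None, len(HEAD)
    for path, leaf in leaf_paths(doc):
        # A matching path looks like HEAD + (index, "node") + tail.
        if (path[:n] == HEAD and len(path) > n + 1
                and path[n + 1] == "node" and path[n + 2:] in KEEP_TAILS):
            out = set_path(out, path, leaf)
    return out
```

Because the output is rebuilt from full paths, the structure from the root down to each node is preserved automatically.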

Decrypt values with the same key at different levels from base64

My input is shown below. I want to search for the SearchString key (as you can see, a fixed index cannot be used for it) and, wherever the key appears, decode its value from base64 (perhaps using the @base64d filter). Is this possible with jq? If so, how?
[
{
"Name": "searchblock",
"Priority": 3,
"Statement": {
"RateBasedStatement": {
"Limit": 100,
"AggregateKeyType": "IP",
"ScopeDownStatement": {
"ByteMatchStatement": {
"SearchString": "Y2F0YWxvZ3NlYXJjaA==",
"FieldToMatch": {
"UriPath": {}
},
"TextTransformations": [
{
"Priority": 0,
"Type": "LOWERCASE"
}
],
"PositionalConstraint": "CONTAINS"
}
}
}
},
"Action": {
"Block": {}
},
"VisibilityConfig": {
"SampledRequestsEnabled": true,
"CloudWatchMetricsEnabled": true,
"MetricName": "searchblock"
}
},
{
"Name": "bot-block",
"Priority": 4,
"Statement": {
"ByteMatchStatement": {
"SearchString": "Ym90",
"FieldToMatch": {
"SingleHeader": {
"Name": "user-agent"
}
},
"TextTransformations": [
{
"Priority": 0,
"Type": "LOWERCASE"
}
],
"PositionalConstraint": "CONTAINS"
}
},
"Action": {
"Allow": {}
},
"VisibilityConfig": {
"SampledRequestsEnabled": true,
"CloudWatchMetricsEnabled": true,
"MetricName": "user-agent"
}
}
]
We use path, paths, getpath, and setpath built-ins for such operations when a fixed path is not available.
getpath(paths | select(.[-1] == "SearchString")) |= @base64d
walk is quite intuitive for this kind of task:
walk(if type == "object" and .SearchString
then .SearchString |= @base64d else . end)
Using this approach, it's also trivial to modify the program to make it more robust, e.g. to check that .SearchString is a string:
walk(if type == "object" and (.SearchString|type) == "string"
then .SearchString |= @base64d else . end)
Note: if your jq does not include walk, you can simply copy its def from any reputable web site, or from https://github.com/stedolan/jq/blob/master/src/builtin.jq
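For comparison, here is a minimal Python sketch of the same recursive idea: descend through the structure and base64-decode the value whenever the key is SearchString and the value is a string. The function name decode_search_strings is hypothetical, and the sketch assumes the decoded bytes are UTF-8 text.

```python
import base64


def decode_search_strings(value, key="SearchString"):
    """Return a copy of value with every string stored under key base64-decoded."""
    if isinstance(value, dict):
        return {
            k: (base64.b64decode(v).decode("utf-8")  # assumes UTF-8 payloads
                if k == key and isinstance(v, str)
                else decode_search_strings(v, key))
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [decode_search_strings(v, key) for v in value]
    return value
```

With the input from the question, the two SearchString values decode to "catalogsearch" and "bot".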

Leveling select fields

I am fetching a json response of following structure:
{
"data": {
"children": [
{
"data": {
"id": "abcdef",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_1.jpg"
}
}
]
},
"title": "Boring Title One"
}
},
{
"data": {
"id": "ghijkl",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_2.jpg"
}
}
]
},
"title": "Boring Title Two"
}
},
{
"data": {
"id": "mnopqr",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_3.jpg"
}
}
]
},
"title": "Boring Title Three"
}
},
{
"data": {
"id": "stuvwx",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_4.jpg"
}
}
]
},
"title": "Boring Title Four"
}
}
]
}
}
Ideally I would like to have a shortened json like this:
{
"data": [
{
"id": "abcdef",
"title": "Boring Title One",
"url": "https://example.com/somefiles_1.jpg"
},
{
"id": "ghijkl",
"title": "Boring Title Two",
"url": "https://example.com/somefiles_2.jpg"
},
{
"id": "mnopqr",
"title": "Boring Title Three",
"url": "https://example.com/somefiles_3.jpg"
},
{
"id": "stuvwx",
"title": "Boring Title Four",
"url": "https://example.com/somefiles_4.jpg"
}
]
}
If this is not possible, I can work with joining those three values into a single string and splitting it later when necessary, like this:
abcdef#Boring Title One#https://example.com/somefiles_1.jpg
ghijkl#Boring Title Two#https://example.com/somefiles_2.jpg
mnopqr#Boring Title Three#https://example.com/somefiles_3.jpg
stuvwx#Boring Title Four#https://example.com/somefiles_4.jpg
This is where I am. I was using jq with select() and then piping the results to to_entries like this:
jq -r '.data.children[] | select(.data.post_type|test("image")?) | .data | to_entries[] | [ .value.title , .value.preview.images[0].source.url ] | join("#")' ~/Documents/json/sample.json
I don't understand what goes after to_entries[]. I have tried multiple variations of .key and .value; mostly I get no result, but sometimes I get key/value pairs I did not intend to select. How can I learn the proper syntax for it?
Is creating a flat JSON out of a nested JSON like this a good idea, or is it better to create string output? I feel the string approach might be error-prone, especially in the presence of spaces or special characters.
Apparently what you're looking for is the {field} syntax. You don't need to resort to string outputs.
{ data: [
.data.children[].data
| select(has("post_type") and (.post_type | index("image")))
| {id, title} + (.preview.images[].source | {url})
# or, if images array always contains one element:
# | {id, title, url: .preview.images[0].source.url}
]
}
A simple solution to the main question is:
{data: [.data.children[]
| .data
| {id, title, url: .preview.images[0].source.url} ]}
(The "post_type" seems to have disappeared, but hopefully if it's relevant, you will be able to adapt the above as required. Likewise if .images[1] and beyond are relevant.)
String Output
If you want linear output, you should probably consider CSV or TSV, both of which are supported by jq via @csv and @tsv respectively.
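To illustrate why a real tabular format is safer than joining with "#", here is a small Python sketch that writes the same three fields as tab-separated rows using the standard csv module. The function name to_tsv is hypothetical, and the sketch assumes every child has at least one image. Note that Python's csv writer protects special characters by quoting, whereas jq's @tsv uses backslash escapes, so the two outputs can differ for values containing tabs or newlines.

```python
import csv
import io


def to_tsv(doc):
    """Render id, title and first image URL of each child as tab-separated lines."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for child in doc["data"]["children"]:
        d = child["data"]
        # Assumes the images array is non-empty, as in the sample input.
        writer.writerow([d["id"], d["title"], d["preview"]["images"][0]["source"]["url"]])
    return buf.getvalue()
```

Spaces in titles need no special handling here, because only the tab character separates fields.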

Extract child elements and add parent field into them

I am trying to combine nested arrays into a single object so that I can do some sorting. For example, I have the following
{
"disk": [
{
"device": "/dev/sda",
"partitions": [
{ "type": "fat32", "mount": "/efi" },
{ "type": "ext4", "mount": "/boot" }
]
},
{
"device": "/dev/sdb",
"partitions": [
{ "type": "xfs", "mount": "/" }
]
}
]
}
I am trying to say 'give me all partitions where mount is not null, sort them by mount, but include their device name in the output'.
So far I have jq -c '.disk[].partitions[] | select (.mount != null)' which is giving me the correct partitions as such:
{ "type": "xfs", "mount": "/" }
{ "type": "ext4", "mount": "/boot" }
{ "type": "fat32", "mount": "/efi" }
However, I would like to pull in the parent device as such:
{ "type": "xfs", "mount": "/", "device": "/dev/sdb" }
{ "type": "ext4", "mount": "/boot", "device": "/dev/sda" }
{ "type": "fat32", "mount": "/efi", "device": "/dev/sda" }
I've seen other examples that drive off the parent and then pull in the children, but it doesn't seem to work when the parent itself is an array. Is there a way to say "get a child property" such as ... | .device = ..device
You don't need to go back one level to fetch device. Just get a copy of it, select partitions, and add them together.
.disk
| map((.partitions[] | select(.mount != null)) + {device})
| sort_by(.mount)[]
You can enrich the partitions with the device and filter the enriched partitions:
jq '.disk[]
| .device as $d
| .partitions[] += { device: $d }
| .partitions[]
| select(.mount != null)
' file.json
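The same "copy the parent field into each child" idea can be sketched in Python, assuming the input layout from the question. The function name mounted_partitions is hypothetical. Unlike the second jq filter above, this sketch also sorts the result by mount, as the question asks.

```python
def mounted_partitions(doc):
    """Return partitions that have a mount, each tagged with its parent device,
    sorted by mount point."""
    rows = [
        {**part, "device": disk["device"]}  # copy the parent's device into the child
        for disk in doc["disk"]
        for part in disk["partitions"]
        if part.get("mount") is not None
    ]
    return sorted(rows, key=lambda p: p["mount"])
```

No "go up one level" operator is needed, because the device is captured while the parent is still in scope.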

JsonPath query?

I have a JSON file like this:
{
"Resources": {
"myresource1": {
"Properties": {
"keyA": {
"Ref": "resource2"
},
"keyB": "something",
"keyC": {
"another object": {
"Ref": "resource3"
}
}
}
},
"resource2": {
"Properties": {
"keyA": 1,
"keyB": 2
}
},
"resource3": {}
}
}
I'd like a JSON Path query that finds all Resources that have Properties that have a Ref object in them.
So in the JSON above, myresource1 has two properties that satisfy this condition and the Refs are resource2 and resource3.
Is this possible?
I found this query works:
$..Properties.*..Ref
It gives me a list of paths:
[
"$['Resources']['myresource1']['Properties']['keyA']['Ref']",
"$['Resources']['myresource1']['Properties']['keyC']['another object']['Ref']"
]
Is there a better solution?
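If the goal is a mapping from each resource to the Refs found under its Properties (rather than a list of paths), one option is to post-process the parsed JSON directly. The following Python sketch is only an illustration under that assumption; find_refs and resources_with_refs are hypothetical names, and like the query above it looks for Ref at any depth inside each property value.

```python
def find_refs(value):
    """Yield the value of every "Ref" key found at any depth inside value."""
    if isinstance(value, dict):
        for k, v in value.items():
            if k == "Ref":
                yield v
            else:
                yield from find_refs(v)
    elif isinstance(value, list):
        for v in value:
            yield from find_refs(v)


def resources_with_refs(doc):
    """Map each resource name to the Refs found inside its Properties values."""
    result = {}
    for name, res in doc.get("Resources", {}).items():
        props = res.get("Properties", {}) if isinstance(res, dict) else {}
        refs = [ref for prop in props.values() for ref in find_refs(prop)]
        if refs:  # skip resources without any Ref
            result[name] = refs
    return result
```

For the sample document this should report only myresource1, with the Refs resource2 and resource3.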