JQ: Delete key:value pairs anywhere in JSON tree - json

I am looking for a way to delete key:value pairs (where key is a known string) anywhere in the JSON tree with jq.
Specifically: I would like to delete "type": "Link" and "linkType": "Entry" everywhere (in all parts of the hierarchy, where ever they appear). My JSON file is +800000 rows long and is nested deeply.
Snippet:
{
"entries": [
{
"sys": {
"id": "vcLKKhJ3mZNfGMvVZZi07",
"contentType": {
"sys": {
"id": "page"
}
}
},
"fields": {
"title": {
"de-DE": "Startseite",
"en-US": "Home"
},
"description": {
"en-US": "foo"
},
"keywords": {
"en-US": "bar"
},
"stageModules": {
"en-US": [
{
"sys": {
"type": "Link",
"linkType": "Entry",
"id": "11AfBBuNK8bx3EygAS3WTY"
}
}
]
}
}
}
]
}
I've tried so many different things to iterate through the file to remove these two from the output, I constantly end up with "Cannot index array with string". E.g.
[ .[] | select(.linkType != "Entry", .type != "Link" ) ]
Ideal result:
{
"entries": [
{
"sys": {
"id": "vcLKKhJ3mZNfGMvVZZi07",
"contentType": {
"sys": {
"id": "page"
}
}
},
"fields": {
"title": {
"de-DE": "Startseite",
"en-US": "Home"
},
"description": {
"en-US": "foo"
},
"keywords": {
"en-US": "bar"
},
"stageModules": {
"en-US": [
{
"sys": {
"id": "11AfBBuNK8bx3EygAS3WTY"
}
}
]
}
}
}
]
}
Can someone help me out?
Thank you for your help.

From the jq manual:
The walk(f) function applies f recursively to every component of the
input entity.
Use this to recursively go through all items, check your conditions and use del if necessary:
walk(
if type == "object" and .type == "Link" then del(.type) else . end
)
Edit: If "matching is less important", as you state, then just use del on any object:
walk(if type == "object" then del(.type, .linkType) else . end)

Related

Copying fields between array elements in JQ

I'm relatively new to JQ and this might be a simple question.
I have a big, pretty nested JSON file with thousands of assets. Each asset always contains one array with two objects (one mother and one child).
{
"data": [
{
"assets": {
"array": [
{
"id": "7978918",
"labels": {
"text": "mother",
"value": true
},
"properties": {
"context_ids": [],
"parent": null
}
},
{
"id": "caa17b2a-4582-4d13-b891-2607e4ba33c6",
"labels": {
"text": "child",
"value": true
},
"properties": {
"context_ids": [],
"parent": null
}
}
]
}
}
]
}
What I want to do is copying the id from the mother object into the parent field of the child object.
{
"data": [
{
"assets": {
"array": [
{
"id": "7978918",
"labels": {
"text": "mother",
"value": true
},
"properties": {
"context_ids": [],
"parent": null
}
},
{
"id": "caa17b2a-4582-4d13-b891-2607e4ba33c6",
"labels": {
"text": "child",
"value": true
},
"properties": {
"context_ids": [],
"parent": "7978918"
}
}
]
}
}
]
}
So far I got
cat test.json | jq '.data[].assets.array[]|=(.properties.parent=.id|select(.labels.text=="mother"))' |less
but this, of course, only copies the mother id into the mother object's parent field and deletes the child object entirely.
I also tried
cat test.json | jq '.data[].assets.array[].properties += {"parent": .data[].assets.array[0].id} |less
On first sight this looked to be correct but seemingly has created a strange loop which never stops and creates a bigger and bigger JSON.
Assuming the first element is always the mother, this should work just fine:
.data[].assets.array |= (.[1].properties.parent = .[0].id)
Online demo
Otherwise sort them by labels.text first so that they are always at certain indices.
.data[].assets.array |= (sort_by(.labels.text) | .[0].properties.parent = .[1].id)
jq '.data |= map(.assets.array | .[1].properties.parent = .[0].id)' test.json

Fetching the array name when traversing array items in JSONata

I need to fetch the name of the array while traversing the child array items.
for example, if my input looks like
{"title": [
{
"value": "18724-100",
"locale": "en-GB"
},
{
"value": "18724-5",
"locale": "en-GB"
},
{
"value": "18724-99",
"locale": "fr-FR"
}
]}
I need output as
{
"data": [
{
"locale": "en-GB",
"metadata": [
{
"key": "title",
"value": "18724-100"
},
{
"key": "title",
"value": "18724-5"
}
]
},
{
"locale": "fr-FR",
"metadata": {
"key": "title",
"value": "18724-99"
}
}
]
}
I tried following spec in JSONata
{
"data": title{locale: value[]} ~> $each(function($v, $k) {
{
"locale": $k,
"metadata": $v.{"key": ???,"value": $}
}
})
}
Please help me to fill "???" so that I can get the array name
Assuming that the input object will always have a single root-level key you can write your expression like this:
{
"data": title{locale: value[]} ~> $each(function($v, $k) {
{
"locale": $k,
"metadata": $v.{"key": $keys($$)[0],"value": $}
}
})
}
$keys returns an array containing keys in the object. $keys($$) will return all keys in root-level of this array (in this case: "title").
Note that for a following input object:
{"title": [
{
"value": "18724-100",
"locale": "en-GB"
},
{
"value": "18724-5",
"locale": "en-GB"
},
{
"value": "18724-99",
"locale": "fr-FR"
}
],
"foo": 123
}
$keys($$) would return an array of two elements (["title", "foo"]).

jq conditional delete from array

I have this json i got from aws, this is just a test i created not my actual rules
[
{
"Name": "Fortinet-all_rules",
"Priority": 0,
"Statement": {
"ManagedRuleGroupStatement": {
"VendorName": "Fortinet",
"Name": "all_rules",
"ExcludedRules": [
{
"Name": "Database-Vulnerability-Exploit-01"
},
{
"Name": "Database-Vulnerability-Exploit-02"
},
{
"Name": "Database-Vulnerability-Exploit-03"
},
{
"Name": "Malicious-Robot"
},
{
"Name": "OS-Command-Injection-01"
},
{
"Name": "OS-Command-Injection-02"
},
{
"Name": "SQL-Injection-01"
},
{
"Name": "SQL-Injection-02"
},
{
"Name": "SQL-Injection-03"
},
{
"Name": "Source-Code-Disclosure"
},
{
"Name": "Web-Application-Injection-01"
},
{
"Name": "Web-Application-Injection-02"
},
{
"Name": "Web-Application-Vulnerability-Exploit-01"
},
{
"Name": "Web-Application-Vulnerability-Exploit-02"
},
{
"Name": "Web-Application-Vulnerability-Exploit-03"
},
{
"Name": "Web-Application-Vulnerability-Exploit-04"
},
{
"Name": "Web-Application-Vulnerability-Exploit-05"
},
{
"Name": "Web-Application-Vulnerability-Exploit-06"
},
{
"Name": "Web-Application-Vulnerability-Exploit-07"
},
{
"Name": "Web-Scanner-01"
},
{
"Name": "Web-Scanner-02"
},
{
"Name": "Web-Scanner-03"
},
{
"Name": "Web-Server-Vulnerability-Exploit-01"
},
{
"Name": "Web-Server-Vulnerability-Exploit-02"
},
{
"Name": "Web-Server-Vulnerability-Exploit-03"
},
{
"Name": "Web-Server-Vulnerability-Exploit-04"
}
],
"ScopeDownStatement": {
"RegexPatternSetReferenceStatement": {
"ARN": "",
"FieldToMatch": {
"UriPath": {}
},
"TextTransformations": [
{
"Priority": 0,
"Type": "NONE"
}
]
}
}
}
},
"OverrideAction": {
"None": {}
},
"VisibilityConfig": {
"SampledRequestsEnabled": true,
"CloudWatchMetricsEnabled": true,
"MetricName": "Fortinet-all_rules"
}
},
{
"Name": "DDOS_rate_rule",
"Priority": 1,
"Statement": {
"RateBasedStatement": {
"Limit": 350,
"AggregateKeyType": "FORWARDED_IP",
"ScopeDownStatement": {
"NotStatement": {
"Statement": {
"IPSetReferenceStatement": {
"ARN": "",
"IPSetForwardedIPConfig": {
"HeaderName": "X-Forwarded-For",
"FallbackBehavior": "MATCH",
"Position": "FIRST"
}
}
}
}
},
"ForwardedIPConfig": {
"HeaderName": "X-Forwarded-For",
"FallbackBehavior": "MATCH"
}
}
},
"Action": {
"Block": {}
},
"VisibilityConfig": {
"SampledRequestsEnabled": true,
"CloudWatchMetricsEnabled": true,
"MetricName": "DDOS_rate_rule"
}
}
]
So what i want for example is to delete the element { "Name": "OS-Command-Injection-01" }
I need to do it conditionally
So i tried using select jq '. | select([].Statement.ManagedRuleGroupStatement.ExcludedRules[].Name == "Malicious-Robot")'
problem is it errors jq: error (at :150): Cannot iterate over null (null)
also if i try to chain selects it doesn't work
I will also need to delete several object at once, but if i can delete one i can run the query several times so that's not an issue
|= is useful for modifying elements of a data structure.
The left-hand side should return the things to modify. (Use parens if it contains |.)
The right-hand side is evaluated as if | was used instead of |=.
The right-hand side should return the new value. (Use parens if it contains |.)
The whole returns . with the modifications made.
jq '
( .[].Statement.ManagedRuleGroupStatement.ExcludedRules | arrays ) |=
map(select(.Name != "OS-Command-Injection-01"))
'
jqplay
To delete objects from arrays, you could use the template:
walk(if type == "arrayā€¯
then map(select(
( type=="object" and
(.Name|IN( ... ) ) ) | not ))
else . end)
You can try this :
jq 'walk(if type=="object" and
(.Name|IN("OS-Command-Injection-01","SQL-Injection-03"))
then empty
else . end)' input-file

Use jq to output a flat array of JSON objects nested anywhere within source document

I'd like to select/identity-output all objects in arrays under "emp" keys into a flat array of those objects.
[
{
"eng": {
"dev": {
"dir": {
"name": "Mickey"
},
"emp": [
{
"name": "Goofy",
"job": "laugh",
"start": "today"
},
{
"name": "Minnie",
"job": "laugh"
}
]
}
}
},
{
"mgmt": {
"dir": {
"name": "Donald"
},
"emp": [
{
"name": "Woody",
"job": "smile"
},
{
"name": "Buzz",
"job": "smile"
}
]
}
}
]
I'm looking for a flat array of arbitrary objects found in arbitrary locations within the document (in this example, under "emp" parent/keys).
In this example, it would look like
[
{
"name": "Goofy",
"job": "laugh",
"start": "today"
},
{
"name": "Minnie",
"job": "laugh"
},
{
"name": "Woody",
"job": "smile"
},
{
"name": "Buzz",
"job": "smile"
}
]
I've looked through a lot of documentation and am able to do this if I know in advance precisely where these 'emp' keys are in the document, but not if they're distributed through the document at a priori unknown locations/paths.
Use recurse to walk the structure. From all the substrucures, select objects with the emp key. Output the corresponding values and merge the resulting arrays.
jq '[recurse | select (type == "object" and .emp) | .emp ] | add' file.json

Map conditional child elements

I am working with a JSON file which has contains lot of data that can be removed before sending to an API.
Found that JQ can be used to achieve this but not sure on how to map to get the desired results.
Input JSON
{
"name": "Sample name",
"id": "123",
"userStory": {
"id": "234",
"storyName": "Story Name",
"narrative": "Narrative",
"type": "feature"
},
"testSteps": [
{
"number": 1,
"description": "Step 1",
"level": 0,
"children": [
{
"number": 2,
"description": "Description",
"children": [
{
"number": 3,
"description": "Description"
}
]
},
{
"number": 4,
"anotherfield": "another field"
}
]
}
]
}
Desired Output
{
"name": "Sample name",
"userStory": {
"storyName": "Story Name"
},
"testSteps": [
{
"description": "Step 1",
"children": [
{
"description": "Description",
"children": [
{
"description": "Description"
}
]
},
{
"anotherfield": "anotherfield"
}
]
}
]
}
Tried to do it with the following jq command
map_values(..|{name, id, userStory})
but not sure how to filter only the userStory.storyName.
Thanks in advance.
Note: The actual JSON has different child elements that are repeated in some cases.
To delete .id from root object:
del(.id)
To leave only .storyName in .userStory:
.userStory |= {storyName}
To delete .number and .level from every object on any level in .testSteps:
.testSteps |= walk(if type == "object" then del(.number, .level) else . end)
Putting it all together:
del(.id) | (.userStory |= {storyName}) | (.testSteps |=
walk(if type == "object" then del(.number, .level) else . end))
Online demo