How to search within sections of a JSON File?S - json

So, lets say I had a JSON File like this:
{
"content": [
{
"word": "cat",
"adjectives": [
{
"type": "textile",
"adjective": "fluffy"
},
{
"type": "visual",
"adjective": "small"
}
]
},
{
"word": "dog",
"adjectives": [
{
"type": "textile",
"adjective": "fluffy"
},
{
"type": "visual",
"adjective": "big"
}
]
},
{
"word": "chocolate",
"adjectives": [
{
"type": "visual",
"adjective": "small"
},
{
"type": "gustatory",
"adjective": "sweet"
}
]
}
]
}
Now, say I wanted to search for two words. For example, "Fluffy" and "Small." The problem with this is that both words' adjectives contain small, and so I would have to manually search for which one contains fluffy. So, how would I do this in a quicker manner?
In other words, how would I find the word(s) with both "fluffy" and "small"
EDIT: Sorry, new asker. Anything that words in a terminal is fair game. jq is a really great JSON searcher, and so this is preferred, and sorry for the confusion. I also fixed the JSON

A command-line solution would be to use jq:
jq -r '.content[] | select(.adjectives[].adjective == "fluffy") | .word' /pathToJsonFile.json
Output:
cat
Are you looking for something like this? Do you need a solution that uses other programming languages?
(P.S. your JSON example appears to be invalid)

Since jq is now fair game (this was only clarified later in the comments), here is one solution using jq.
First, fix the JSON to be actually valid:
{
"content": [
{
"word": "cat",
"adjectives": [
{
"type": "textile",
"adjective": "fluffy"
},
{
"type": "visual",
"adjective": "small"
}
]
},
{
"word": "dog",
"adjectives": [
{
"type": "textile",
"adjective": "fluffy"
},
{
"type": "visual",
"adjective": "big"
}
]
},
{
"word": "chocolate",
"adjectives": [
{
"type": "visual",
"adjective": "small"
},
{
"type": "gustatory",
"adjective": "sweet"
}
]
}
]
}
Then, the following jq filter returns an array containing the words which contain both adjectives:
.content
| map(
select(
.adjectives as $adj
| all("small","fluffy"; IN($adj[].adjective))
)
| .word
)
If a non-array output is required, and only one word per line, use .[] instead of map (either after content or as a final filter), e.g.:
jq -r '.content[]
| select(
.adjectives as $adj
| all("small","fluffy"; IN($adj[].adjective))
)
| .word'

Related

jq update json document to alter an array element

I know this has to be simple, but for some reason it's eluding me how to find an element given a condition and modify one of its fields. The doc should be fully output (sed style) with the edit made.
{
"state": "wait",
"steps": {
"step1": [
{ "name":"Foo", "state":"wait" },
{ "name":"Bar", "state":"wait" }
],
"step2": [
{ "name":"Foo", "state":"wait" },
{ "name":"Zoinks", "state":"ready" }
],
"step3": [
{ "name":"Foo", "state":"cancel" }
]
}
}
I'm expecting something like this should be workable.
jq '. | (select(.steps[][].name=="Foo" and .steps[][].state=="wait") |= . + {.state:"Ready"}'
or
jq '. | (select(.steps[][]) | if (.name=="Foo" and .state=="wait") then (.state="Ready") else . end)
The desired output, of course, would be
{
"state": "wait",
"steps": {
"step1": [
{ "name":"Foo", "state":"ready" },
{ "name":"Bar", "state":"wait" }
],
"step2": [
{ "name":"Foo", "state":"ready" },
{ "name":"Zoinks", "state":"ready" }
],
"step3": [
{ "name":"Foo", "state":"cancel" }
]
}
}
Instead, when I'm not getting cryptic errors, I'm either modifying a top-level field in the document or modifying the field for all the elements or repeated the entire doc multiple times.
Any insights greatly appreciated.
Thanks.
p.s. is there a better syntax than [] to wildcard the named-elements under steps? Or after the pipe to identify the indices discovered by the select?
Pipe the output of .steps[][] into a select call that chooses the objects with the desired name and state values, then set the state value on the result.
$ jq '(.steps[][] | select(.name == "Foo" and .state == "wait")).state = "ready"' tmp.json
{
"state": "wait",
"steps": {
"step1": [
{
"name": "Foo",
"state": "ready"
},
{
"name": "Bar",
"state": "wait"
}
],
"step2": [
{
"name": "Foo",
"state": "ready"
},
{
"name": "Zoinks",
"state": "ready"
}
],
"step3": [
{
"name": "Foo",
"state": "cancel"
}
]
}
}
You can help confirm this using diff (the first jq just normalizes the formatting so that only the changes made by the second one show up in the diff):
$ diff <(jq . tmp.json) <(jq '...' tmp.json)
7c7
< "state": "wait"
---
> "state": "ready"
17c17
< "state": "wait"
---
> "state": "ready"

Leveling select fields

I am fetching a json response of following structure:
{
"data": {
"children": [
{
"data": {
"id": "abcdef",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_1.jpg"
}
}
]
},
"title": "Boring Title One"
}
},
{
"data": {
"id": "ghijkl",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_2.jpg"
}
}
]
},
"title": "Boring Title Two"
}
},
{
"data": {
"id": "mnopqr",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_3.jpg"
}
}
]
},
"title": "Boring Title Three"
}
},
{
"data": {
"id": "stuvwx",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_4.jpg"
}
}
]
},
"title": "Boring Title Four"
}
}
]
}
}
Ideally I would like to have a shortened json like this:
{
"data": [
{
"id": "abcdef",
"title": "Boring Title One",
"url": "https://example.com/somefiles_1.jpg"
},
{
"id": "ghijkl",
"title": "Boring Title Two",
"url": "https://example.com/somefiles_2.jpg"
},
{
"id": "mnopqr",
"title": "Boring Title Three",
"url": "https://example.com/somefiles_3.jpg"
},
{
"id": "stuvwx",
"title": "Boring Title Four",
"url": "https://example.com/somefiles_4.jpg"
}
]
}
If this is not possible I can work with joining those three values into a single string and latter split when necessary; like this:
abcdef#Boring Title One#https://example.com/somefiles_1.jpg
ghijkl#Boring Title Two#https://example.com/somefiles_2.jpg
mnopqr#Boring Title Three#https://example.com/somefiles_3.jpg
stuvwx#Boring Title Four#https://example.com/somefiles_4.jpg
This is where I am. I was uring the jq with select() and then pipe the results to to_entries like this:
jq -r '.data.children[] | select(.data.post_type|test("image")?) | .data | to_entries[] | [ .value.title , .value.preview.images[0].source.url ] | join("#")' ~/Documents/json/sample.json
I don't understand what goes after to_entries[]; I have tried multiple variations of .key and .values; Mostly I don't get any result but sometimes I get key pairs I do not intend to select. How to learn the proper syntax for it?
Is creating a flat json out of a nested json like this good or is it better to create the string outputs? I feel the string might be error prone especially with the presence of spaces or special characters.
Apparently what you're looking for is the {field} syntax. You don't need to resort to string outputs.
{ data: [
.data.children[].data
| select(has("post_type") and (.post_type | index("image")))
| {id, title} + (.preview.images[].source | {url})
# or, if images array always contains one element:
# | {id, title, url: .preview.images[0].source.url}
]
}
A simple solution to the main question is:
{data: [.data.children[]
| .data
| {id, title, url: .preview.images[0].source.url} ]}
(The "post_type" seems to have disappeared, but hopefully if it's relevant, you will be able to adapt the above as required. Likewise if .images[1] and beyond are relevant.)
String Output
If you want linear output, you should probably consider CSV or TSV, both of which are supported by jq via #csv and #tsv respectively.

How can I filter for entries that do NOT contain a key-value pair within a nested array

Let's say I have the following JSON output:
{
"Stacks": [
{
"StackName": "hello-world",
"Tags": [
{
"Key": "environment",
"Value": "sandbox"
},
{
"Key": "Joe Shmo",
"Value": "Dev"
}
]
},
{
"StackName": "hello-man",
"Tags": [
{
"Key": "environment",
"Value": "live"
},
{
"Key": "Tandy",
"Value": "Dev"
}
]
}
]
}
How would I write a jq query to grab all StackNames for stacks that do NOT have a Tags value "Key": "Joe Shmo"? So the result would return simply hello-man.
.Stacks[]
| select( any(.Tags[]; .Key == "Joe Shmo" ) | not)
| .StackName
This checks for equality efficiently (any has short-circuit semantics), whereas contains would check for containment.
Using contains, like this:
jq -r '.Stacks[]|select(.Tags|contains([{"Key": "Joe Shmo"}])|not).StackName'
Note: -r removes the quotes from output, otherwise jq would print "hello-man" (within double quotes)

Modifying array of key value in JSON jq

In case, I have an original json look like the following:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "user"
}
]
}
]
}
}
And I would like to inplace modify the value for the matched key like so:
jq '.taskDefinition.containerDefinitions[0].environment[] | select(.name=="DB_USERNAME") | .value="new"' json
I got the output
{
"name": "DB_USERNAME",
"value": "new"
}
But I want more like in-place modify or the whole json from the original with new value modified, like this:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Is it possible to do with jq or any known workaround?
Thank you.
Updated
For anyone looking for editing multi-values,
here is the approach I use
JQ=""
for e in DB_HOST=rds DB_USERNAME=xxx; do
k=${e%=*}
v=${e##*=}
JQ+="(.taskDefinition.containerDefinitions[0].environment[] | select(.name==\"$k\") | .value) |= \"$v\" | "
done
jq '${JQ%??}' json
I think there should be more concise way, but this seems working fine.
It is enough to assign to the path, if you are using |=, e.g.
jq '
(.taskDefinition.containerDefinitions[0].environment[] |
select(.name=="DB_USERNAME") | .value) |= "new"
' infile.json
Output:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Here is a select-free solution using |=:
.taskDefinition.containerDefinitions[0].environment |=
map(if .name=="DB_USERNAME" then .value = "new"
else . end)
Avoiding select within the expression on the LHS of |= makes the solution more robust w.r.t. the version of jq being used.
You might like to consider this alternative to using |=:
walk( if type=="object" and .name=="DB_USERNAME"
then .value="new" else . end)

Adding Elements to AWS Route 53 JSON

I've been attempting to programmatically update the AWS Route 53 DNS records, so I've been using jq to update the following JSON file;
{
"Comment": "Update 'A' record for drivepoc.biz zone file",
"Changes": [
{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "www.domain.biz.",
"Type": "A",
"TTL": 60,
"ResourceRecords": [
{
"Value": "123.123.123.123"
}
]
}
}
]
}
So, the existing entry "Value": "123.123.123.123" needs to remain, but needs to have additional entry of "Value": "456.456.456.456". The nearest I got to do this was:
cat a_record.json | jq '.Changes[0].ResourceRecordSet.ResourceRecords |= .+ ["Value: 456.456.456.456"]'
but this puts it outside the braces and the quotes are wrong;
"ResourceRecords": [
{
"Value": "52.18.219.57"
},
"Value": "456.456.456.456"
]
Instead of what is required;
"ResourceRecords": [
{
"Value": "52.18.219.57"
},
{
"Value": "456.456.456.456"
}
]
Can anyone give me any tips please?
You're adding an object to that array, not a string. Create an object to be inserted.
.Changes[].ResourceRecordSet.ResourceRecords += [{Value:"456.456.456.456"}]