Using jq: Only select parents with a certain child key - json

Example json:
{
"version": "3",
"services": {
"web": {
"build": "web"
},
"redis": {
"image": "redis"
},
"datadog": {
"build": "datadog"
},
"another": {
"image": "mysql"
}
}
}
I'd like to return a list of services that have the "build" key, and not the "image" key. Note that the value for the build key isn't something I can key off of.
Output should be: ["web", "datadog"]

Here are two ways that work:
1.
jq '.services
| . as $services
| keys_unsorted
| map( select($services[.] | has("build")) )'
(Drill down to .services, remember it as $services for later use, get the list of keys, and select the ones such that the corresponding value in $services has a build key).
2.
jq '.services
| to_entries
| map( select(.value | has("build")) | .key)'
(Drill down to .services, convert to a list of {"key": ..., "value": ...} objects, select the ones where the .value has a build key, and return the .key for each).
The second is probably more idiomatic jq, but the first provides an interesting way to think about the problem as well.
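For readers more at home in a general-purpose language, the to_entries approach amounts to iterating over key/value pairs and keeping the keys whose value contains the wanted child key. Here is a minimal Python sketch of that idea, assuming the example document has already been parsed and that every service is an object; `services_with_key` is a hypothetical helper name, not something jq provides.

```python
import json

DOC = json.loads("""
{"version": "3",
 "services": {"web": {"build": "web"}, "redis": {"image": "redis"},
              "datadog": {"build": "datadog"}, "another": {"image": "mysql"}}}
""")

def services_with_key(doc, key):
    """Return the names of services whose definition contains `key` (hypothetical helper)."""
    # Mirrors: .services | to_entries | map(select(.value | has(key)) | .key)
    return [name for name, spec in doc["services"].items() if key in spec]

print(services_with_key(DOC, "build"))  # ['web', 'datadog']
```

As with keys_unsorted and to_entries, the result follows the order in which the services appear in the document.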

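Here's a third approach, notable for being oblivious to the upper reaches. It finds every path that ends in "build" and reports the key just above it:
[(paths(scalars)
| select(.[-1] == "build")) as $p
| $p[-2]]
Note that using getpath($p) in place of $p[-2] would return the value of each build key rather than the service name; with this sample the two happen to coincide. Because of paths(scalars), this approach only finds build keys whose value is a scalar.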

Related

How to retrieve recursive path to a specific key (not displaying the parents' key name, but the value from a different key of each parent)

I have the following JSON
[
{
"name": "alpha"
},
{
"fields": [
{
"name": "beta_sub_1"
},
{
"name": "beta_sub_2"
}
],
"name": "beta"
},
{
"fields": [
{
"fields": [
{
"name": "gamma_sub_sub_1"
}
],
"name": "gamma_sub_1"
}
],
"name": "gamma"
}
]
and I would like to get, for each "name" value, the chain of parent names that leads to it. Given the input above, I would like the following result:
"alpha"
"beta.beta_sub_1"
"beta.beta_sub_2"
"beta"
"gamma.gamma_sub_1.gamma_sub_sub_1"
"gamma.gamma_sub_1"
"gamma"
I've been searching around but I couldn't get to this result. So far, I have this:
tostream as [$p,$v] | select($p[-1] == "name" and $v != null) | "\([$p[0,1]] | join(".")).\($v)"
but this gives me the path with the key names of the parents, and it doesn't keep all the intermediate parents:
"0.name.alpha"
"1.fields.beta_sub_1"
"1.fields.beta_sub_2"
"1.name.beta"
"2.fields.gamma_sub_sub_1"
"2.fields.gamma_sub_1"
"2.name.gamma"
Any ideas?
P.S.: I've been searching for very detailed documentation on jq but couldn't find anything good enough. If anyone has any recommendations, I'd appreciate them.
The problem description does not seem to match the sample input and output, but the following jq program produces the required output:
def descend:
  select( type == "object" and has("name") )
  | if has("fields") then ([.name] + (.fields[] | descend)) else empty end,
    [.name] ;
.[]
| descend
| join(".")
With your input, and using the -r command-line option, this produces:
alpha
beta.beta_sub_1
beta.beta_sub_2
beta
gamma.gamma_sub_1.gamma_sub_sub_1
gamma.gamma_sub_1
gamma
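The recursion in descend can be easier to follow when written as a generator: each object first yields the paths of its children, prefixed with its own name, and then yields its own name. The following Python sketch mirrors that logic under the assumption that the input is already parsed; the function and variable names are illustrative only.

```python
def descend(node):
    """Yield name paths as lists, children first, then the node itself."""
    # Mirrors: select(type == "object" and has("name"))
    if not (isinstance(node, dict) and "name" in node):
        return
    # Mirrors: [.name] + (.fields[] | descend)
    for child in node.get("fields", []):
        for tail in descend(child):
            yield [node["name"]] + tail
    # Mirrors the trailing: [.name]
    yield [node["name"]]

DATA = [
    {"name": "alpha"},
    {"fields": [{"name": "beta_sub_1"}, {"name": "beta_sub_2"}], "name": "beta"},
    {"fields": [{"fields": [{"name": "gamma_sub_sub_1"}], "name": "gamma_sub_1"}],
     "name": "gamma"},
]

PATHS = [".".join(path) for item in DATA for path in descend(item)]
print("\n".join(PATHS))
```

Yielding the children before the node itself is what puts "beta" after its sub-entries, matching the order requested in the question.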
Resources
Apart from the jq manual, FAQ, and Cookbook, you might find the following helpful:
"jq Language Description"
"A Stream-Oriented Introduction to jq"

Get parent value from json using jq

My json file looks like this;
{
"RQBTYFE86MFC3oL": {
"name": "Nightmode",
"lights": [
"1",
"2",
"3",
"4",
"5",
"7",
"8",
"9",
"10",
"11"
],
"owner": "kvovodUUfn2vlby9h9okdDhv8SrTzkBFjk6kPz2v",
"recycle": false,
"locked": false,
"appdata": {
"version": 1,
"data": "QSXCj_r01_d99"
},
"picture": "",
"lastupdated": "2018-08-08T03:21:39",
"version": 2
}
}
I want to get the 'RQBTYFE86MFC3oL' value by doing a query for 'Nightmode'. So far I came up with this;
jq '.[] | select(.name == "Nightmode")'
This will return me the correct part of the Json but the 'RQBTYFE86MFC3oL' part is stripped. How do I get this part as well?
A simple way to determine the key name(s) corresponding to values satisfying a certain condition is to use to_entries, as explained in the jq manual.
Using this approach, the appropriate jq filter would be:
to_entries[] | select(.value.name == "Nightmode") | .key
with the result:
"RQBTYFE86MFC3oL"
If you want to get the key-value pair, you'd use with_entries as follows:
with_entries( select(.value.name == "Nightmode") )
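The difference between the two filters is only in what is kept: the first keeps the matching key, the second keeps the whole key/value pair. A rough Python equivalent, using an abbreviated copy of the input and hypothetical function names, might look like this:

```python
# Abbreviated version of the question's input (most fields omitted).
SCENES = {
    "RQBTYFE86MFC3oL": {"name": "Nightmode", "lights": ["1", "2"], "locked": False},
}

def keys_where_name(obj, name):
    # Mirrors: to_entries[] | select(.value.name == name) | .key
    return [k for k, v in obj.items() if isinstance(v, dict) and v.get("name") == name]

def entries_where_name(obj, name):
    # Mirrors: with_entries(select(.value.name == name))
    return {k: v for k, v in obj.items() if isinstance(v, dict) and v.get("name") == name}

print(keys_where_name(SCENES, "Nightmode"))  # ['RQBTYFE86MFC3oL']
```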
If the input JSON is too large to fit comfortably in memory, then it would make sense to use jq's streaming parser (invoked with the --stream command-line option):
jq --stream '
select(.[1] == "Nightmode" and (first|length) == 2 and first[1] == "name")
| first | first'
This would produce the key name.
The key idea is that the streaming parser produces arrays including pairs of the form: [ARRAYPATH, VALUE] where VALUE is the value at ARRAYPATH.
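To get a feel for those [ARRAYPATH, VALUE] pairs, the Python sketch below generates them from an already parsed document. It is a simplification for illustration: it does not parse incrementally, and it omits the one-element closing events that the real streaming parser also emits. The names `leaf_events` and `top_keys_named` are hypothetical.

```python
def leaf_events(value, path=()):
    """Yield [path, leaf] pairs depth-first, like the two-element --stream events."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            yield from leaf_events(child, path + (key,))
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from leaf_events(child, path + (index,))
    else:
        # Scalars, and empty arrays or objects, are reported as leaves.
        yield [list(path), value]

def top_keys_named(doc, name):
    # Mirrors: select(.[1] == name and (first|length) == 2 and first[1] == "name")
    #          | first | first
    return [p[0] for p, v in leaf_events(doc)
            if v == name and len(p) == 2 and p[1] == "name"]

# Abbreviated version of the question's input.
DOC = {"RQBTYFE86MFC3oL": {"name": "Nightmode",
                           "appdata": {"version": 1, "data": "QSXCj_r01_d99"},
                           "version": 2}}

print(top_keys_named(DOC, "Nightmode"))  # ['RQBTYFE86MFC3oL']
```

The length check on the path is what restricts the match to a "name" key directly under a top-level key.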
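You want the key rather than the value.
So use the keys filter, which returns the key names of the input object. Here that is 'RQBTYFE86MFC3oL', since everything else is the value of that key. Note that keys returns every top-level key without looking at the name field, so it is enough only when the object has a single entry.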
jq 'keys'
Here is a snippet: https://jqplay.org/s/YvpCb2PH42
Reference: https://stedolan.github.io/jq/manual/

Use JQ to select specific, arbitrarily nested objects from JSON

I'm looking for an efficient means to search through a large JSON object for "sub-objects" that match a filter (via select(), I imagine). However, the top-level JSON is an object with arbitrary nesting, containing simple values, objects, and arrays of objects. For example:
{
"name": "foo",
"class": "system",
"description": "top-level-thing",
"configuration": {
"status": "normal",
"uuid": "id"
},
"children": [
{
"id": "c1",
"class": "c1",
"children": [
{
"id": "c1.1",
"class": "c1.1"
},
{
"id": "c1.1",
"class": "FINDME"
}
]
},
{
"id": "c2",
"class": "FINDME"
}
],
"thing": {
"id": "c3",
"class": "FINDME"
}
}
I have a solution which does part of what I want (and is understandable):
jq -r '.. | arrays | .[] | select(.class=="FINDME"?) | .id'
which returns:
c2
c1.1
... however, it misses c3, and it changes the order of the items in the output. Additionally, since I expect this to operate on potentially very large JSON structures, I would like to make sure I find an efficient solution. Bonus points for something that remains readable by jq neophytes (myself included).
FWIW, references I was using to help me on the way, in case they help others:
Select objects based on value of variable in object using jq
How to use jq to find all paths to a certain key
Recursive search values by key
For small to modest-sized JSON input, you're on the right track with ..
but it seems you want to select objects, like so:
.. | objects | select(.class=="FINDME"?) | .id
For JSON documents that are very large, this might require too much memory, so it may be worth knowing about jq's streaming parser. Unfortunately it's much more difficult to use, so I'd suggest trying the above, and if you're interested, look in the usual places for documentation about the --stream option.
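The reason `.. | objects` finds c3 while `.. | arrays | .[]` does not is that the former visits every object wherever it sits, not only objects that are array elements. The Python sketch below walks a parsed document the same way, parents before children; it assumes the whole document fits in memory, and the function names are illustrative.

```python
def walk_objects(value):
    """Yield every dict inside `value`, parents before children (like `.. | objects`)."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from walk_objects(child)
    elif isinstance(value, list):
        for child in value:
            yield from walk_objects(child)

def ids_with_class(doc, wanted):
    # Mirrors: .. | objects | select(.class == wanted) | .id
    return [obj.get("id") for obj in walk_objects(doc) if obj.get("class") == wanted]

DOC = {
    "name": "foo", "class": "system", "description": "top-level-thing",
    "configuration": {"status": "normal", "uuid": "id"},
    "children": [
        {"id": "c1", "class": "c1",
         "children": [{"id": "c1.1", "class": "c1.1"},
                      {"id": "c1.1", "class": "FINDME"}]},
        {"id": "c2", "class": "FINDME"},
    ],
    "thing": {"id": "c3", "class": "FINDME"},
}

print(ids_with_class(DOC, "FINDME"))  # ['c1.1', 'c2', 'c3']
```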
Here's a streaming-parser solution. To make sense of it, you'll need to read up on the --stream option, but the key is that the output includes lines of the form: [PATH, VALUE]
program.jq
foreach inputs as $in (null;
  if has("id") and has("class") then null
  else . as $x
  | $in
  | if length != 2 then null
    elif .[0][-1] == "id" then ($x + {id: .[-1]})
    elif .[0][-1] == "class"
         and .[-1] == "FINDME" then ($x + {class: .[-1]})
    else $x
    end
  end;
  select(has("id") and has("class")) | .id )
Invocation
jq -n --stream -f program.jq input.json
Output with sample input
"c1.1"
"c2"
"c3"

Deep JSON merge

I have multiple JSON files that I'd like to merge into one.
Some have the same root element but different children. I don't want to overwrite the children but to extend them if they have the same parent element.
I've tried this answer, but it doesn't work:
jq: error (at file2.json:0): array ([{"title":"...) and array ([{"title":"...) cannot be multiplied
Sample files and wanted result (Gist)
Thank you in advance.
Here is a recursive solution which uses group_by(.key) to decide
which objects to combine. This could be a little simpler if .children
were more uniform. Sometimes it's absent in the sample data and sometimes it's the unusual value [{}].
def merge:
  def kids:
    map(
      .children
      | if length<1 then empty else .[] end
    )
    | if length<1 then {} else {children:merge} end
  ;
  def mergegroup:
    {
      title: .[0].title
    , key: .[0].key
    } + kids
  ;
  if .==[{}] then .
  else group_by(.key) | map(mergegroup)
  end
;
[ .[] | .[] ] | merge
When run with the -s option as follows
jq -M -s -f filter.jq file1.json file2.json
It produces the following output.
[
{
"title": "Title1",
"key": "12345678",
"children": [
{
"title": "SubTitle2",
"key": "123456713",
"children": [
{}
]
},
{
"title": "SubTitle1",
"key": "12345679",
"children": [
{
"title": "SubSubTitle1",
"key": "12345610"
},
{
"title": "SubSubTitle2",
"key": "12345611"
},
{
"title": "DifferentSubSubTitle1",
"key": "12345612"
}
]
}
]
}
]
If the ordering of the objects within the .children matters
then a sort_by can be added to the {children:merge} expression,
e.g. {children:merge|sort_by(.key)}
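The same group-then-recurse idea can be sketched in Python. The two input lists below are hypothetical stand-ins reconstructed from the output shown above, since the original sample files are not reproduced here; the sketch also assumes that keys are strings.

```python
from itertools import groupby

def merge(nodes):
    """Merge sibling nodes that share a "key", recursing into their children."""
    if nodes == [{}]:                        # the placeholder value is passed through
        return nodes

    def sort_key(node):
        key = node.get("key")
        return (key is not None, key or "")  # missing keys sort first, as null does in jq

    merged = []
    # Mirrors: group_by(.key) | map(mergegroup)
    for _, group in groupby(sorted(nodes, key=sort_key), key=sort_key):
        group = list(group)
        node = {"title": group[0].get("title"), "key": group[0].get("key")}
        # Mirrors kids: gather the children of every node in the group.
        kids = [child for n in group for child in (n.get("children") or [])]
        if kids:
            node["children"] = merge(kids)
        merged.append(node)
    return merged

# Hypothetical inputs standing in for file1.json and file2.json.
FILE1 = [{"title": "Title1", "key": "12345678", "children": [
    {"title": "SubTitle1", "key": "12345679", "children": [
        {"title": "SubSubTitle1", "key": "12345610"},
        {"title": "SubSubTitle2", "key": "12345611"}]}]}]
FILE2 = [{"title": "Title1", "key": "12345678", "children": [
    {"title": "SubTitle1", "key": "12345679", "children": [
        {"title": "DifferentSubSubTitle1", "key": "12345612"}]},
    {"title": "SubTitle2", "key": "123456713", "children": [{}]}]}]

RESULT = merge(FILE1 + FILE2)   # FILE1 + FILE2 plays the role of [ .[] | .[] ]
```

Because grouping sorts by key, "123456713" comes before "12345679" (string comparison), which is the ordering seen in the output above.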
Here is something that will reproduce your desired result. It's by no means automatic; it's really a proof of concept at this stage.
One liner:
jq -s '. as $in | ($in[0][].children[].children + $in[1][].children[0].children | unique) as $a1 | $in[1][].children[1] as $s1 | $in[0] | .[0].children[0].children = ($a1) | .[0].children += [$s1]' file1.json file2.json
Multi line breakdown (Copy/Paste):
jq -s '. as $in
| ($in[0][].children[].children + $in[1][].children[0].children
| unique) as $a1
| $in[1][].children[1] as $s1
| $in[0]
| .[0].children[0].children = ($a1)
| .[0].children += [$s1]' file1.json file2.json
Where:
$in : file1.json and file2.json combined input
$a1: merged "SubSubTitle" array
$s1: second subtitle object
I suspect the reason this didn't work was because your schema is different and has nested arrays.
I find it quite hypnotic looking at this; it would be good if you could elaborate a bit on how fixed the structure is and what the requirements are.

jq - How to iterate through keys of different names

I've got JSON that looks like this
{
"keyword1": {
"identifier1": 16
},
"keyword2": {
"identifier2": 16
}
}
and I need to loop through the keywords to get the identifiers (not sure if I'm using the right terminology here). It seems pretty simple, but because the keywords are all named differently, I don't know how to handle that.
The original tag for this question was jq so here is a jq solution:
.[] | keys[]
For example, with the input as shown in the question:
$ jq '.[] | keys[]' input.json
"identifier1"
"identifier2"
To retrieve the key names in the order they appear in the JSON object, use keys_unsorted.
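The point of `.[]` is that the outer key names never have to be spelled out: the filter iterates over the outer values, whatever their keys are called, and then lists the keys of each. A small Python sketch of the same two-level iteration, with illustrative variable names:

```python
DATA = {"keyword1": {"identifier1": 16}, "keyword2": {"identifier2": 16}}

# Mirrors `.[] | keys_unsorted[]`: inner key names, in document order.
inner_keys = [key for inner in DATA.values() for key in inner]

# Keeping the outer key alongside, for when both names are needed.
pairs = [(outer, key) for outer, inner in DATA.items() for key in inner]

print(inner_keys)  # ['identifier1', 'identifier2']
```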
I'd think something along these lines would work well:
jq '. | to_entries | .[].key'
see https://stedolan.github.io/jq/manual/#to_entries,from_entries,with_entries
or if you wanted to get the values from a variable:
JSON_DATA='{"main":{"k1":"v1","k2":"v2"}}'
result=$(jq -n "$JSON_DATA" | jq '.main | to_entries | .[].value' --raw-output)
echo $result
##outputs: v1 v2
I came here hoping to filter out a bunch of keys from my JSON, and I found the three functions "to_entries", "from_entries", and "with_entries" handy. With them you can filter entries by key or by value, like so:
JSON_DATA='
{
"fields": {
"first": null,
"second": "two",
"third": "three"
}
}
'
echo "$JSON_DATA" | jq '{fields: .fields | with_entries(select(.value != null and .key != "third")) }'
Output:
{
"fields": {
"second": "two"
}
}
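with_entries converts the object to key/value entries, applies the filter to each entry, and rebuilds the object from what remains. In Python terms this is close to a dictionary comprehension, as in the following sketch (the function name is hypothetical):

```python
def filter_fields(doc):
    # Mirrors: {fields: .fields | with_entries(select(.value != null and .key != "third"))}
    kept = {key: value for key, value in doc["fields"].items()
            if value is not None and key != "third"}
    return {"fields": kept}

DOC = {"fields": {"first": None, "second": "two", "third": "three"}}
print(filter_fields(DOC))  # {'fields': {'second': 'two'}}
```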
A simpler solution: just treat the inner object as a new object and add one more filter. The query that helped me:
$ docker network inspect bridge|jq '.[].Containers'
{
"35c9e1273c43db01c45b5f43f6999d04c18beff3996ea09fb8b87a8b635c38ff": {
"Name": "nginx",
"EndpointID": "a6e788d6f90eb14df2321a2eb02517f0862c1fe7fe50c02f2b8c103c0c79cb6b",
"MacAddress": "02:42:ac:11:00:02",
"IPv4Address": "172.17.0.2/16",
"IPv6Address": ""
},
"b46c157cec243969f9227251dfd6fa65b7a904e145df80a63f79d4dc8b281355": {
"Name": "sweet_gates",
"EndpointID": "a600d9c1ee35b9f7db31249ae8f589c202e0b260e10a394757a88bfd66b5b42f",
"MacAddress": "02:42:ac:11:00:03",
"IPv4Address": "172.17.0.3/16",
"IPv6Address": ""
}
}
As I needed only a couple of fields, I added one more filter to the query above:
$ docker network inspect bridge|jq -jr '.[].Containers[]|.IPv4Address, "\t", .Name, "\n"'
172.17.0.2/16 nginx
172.17.0.3/16 sweet_gates