How to extract all (also nested) key names with jq - json

How can I extract all key names with jq, including those in nested objects?
For example, I have this JSON:
{
"a": 1,
"b": {
"c": 2
}
}
and I want to get this list:
a, b, b.c
I know that I can get the top-level keys with:
. | to_entries[] | .key
but what about keys in nested objects?

Short jq solution:
jq -r '[paths | join(".")]' jsonfile
The output:
[
"a",
"b",
"b.c"
]
paths - outputs the path to every element of its input, each path being an array of keys
join(".") - concatenates the keys of each hierarchical path into one dotted string
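If a plain comma-separated list like the one in the question is wanted instead of a JSON array, a second join can flatten the result. This is only a sketch: the sample document is passed inline, and the variable names are made up for illustration.

```shell
# Sample document from the question, passed on stdin instead of a file.
doc='{"a":1,"b":{"c":2}}'

# One dotted path per line; -r strips the JSON string quotes.
per_line=$(printf '%s' "$doc" | jq -r 'paths | join(".")')

# All dotted paths joined into a single comma-separated string.
as_list=$(printf '%s' "$doc" | jq -r '[paths | join(".")] | join(", ")')

printf '%s\n' "$per_line"
printf '%s\n' "$as_list"   # a, b, b.c
```

Note that this assumes the document contains only objects; with arrays in the input, the paths also contain numeric indices.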

Given input foo.json
{"a":1,"b":[{"c":2}]}
jq '[
paths |
map(select(type!="number")) |
select(length > 0) |
join(".")
] | unique' foo.json
outputs
[
"a",
"b",
"b.c"
]
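To see why both the number filter and unique are needed, it can help to look at the raw paths first. The following sketch uses the same foo.json content inline; tostring is added here only so the array indices can be displayed, it is not part of the answer above.

```shell
doc='{"a":1,"b":[{"c":2}]}'

# Raw paths: array indices show up as path components.
raw=$(printf '%s' "$doc" | jq -c '[paths | map(tostring) | join(".")]')

# Dropping the numeric indices makes "b" appear twice, hence unique.
keys_only=$(printf '%s' "$doc" | jq -c '[paths | map(select(type != "number")) | select(length > 0) | join(".")] | unique')

echo "$raw"        # ["a","b","b.0","b.0.c"]
echo "$keys_only"  # ["a","b","b.c"]
```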

Related

Get unique nested JSON keys with JQ

How can I get the unique keys of the attributes objects with jq?
{"id":1, "attributes":{"a": 1, "b": 2, "c": 3}}
{"id":2, "attributes":{"a": 4, "b": 5, "d": 6}}
{"id":3, "name":"ABC"}
The result should look like this:
[
"a",
"b",
"c",
"d"
]
I tried this:
jq '.attributes' test.json | jq -r '[inputs | keys[]] | unique | sort'
or
jq -r '[inputs.attributes | keys[]] | unique | sort' test.json
but I get this error:
jq: error (at :11): null (null) has no keys
One way could be using reduce on subsequent inputs:
jq 'reduce inputs.attributes as $a (.attributes; . + $a) | keys'
[
"a",
"b",
"c",
"d"
]
Along the lines of your second attempt:
jq -n '[inputs.attributes // empty | keys_unsorted[]] | unique'
The important point is that we have to take care of the case where there is no "attributes" key.
Note also that unique sorts, so (unless you're using gojq) we can use keys_unsorted to avoid redundant sorting.
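The effect of the // empty guard can be checked directly. In this sketch the three records from the question are fed on stdin, and the variable names are illustrative only.

```shell
data='{"id":1, "attributes":{"a": 1, "b": 2, "c": 3}}
{"id":2, "attributes":{"a": 4, "b": 5, "d": 6}}
{"id":3, "name":"ABC"}'

# Without the guard, the third record yields null and keys_unsorted fails.
if printf '%s\n' "$data" | jq -n '[inputs.attributes | keys_unsorted[]]' >/dev/null 2>&1; then
  unguarded=ok
else
  unguarded=failed
fi

# With the guard, records lacking "attributes" are skipped.
guarded=$(printf '%s\n' "$data" | jq -c -n '[inputs.attributes // empty | keys_unsorted[]] | unique')

echo "$unguarded"  # failed
echo "$guarded"    # ["a","b","c","d"]
```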
With slurp:
jq -s 'map(.attributes|keys?)|add|unique' test.json
-s loads the input file as array
map(.attributes|keys?) extracts only the keys (ignoring errors, such as trying to get keys of null)
add merges all nested arrays into a single array ([[1,2],[2,3]] becomes [1,2,2,3])
unique sorts and removes duplicates
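The steps above can be followed one at a time by cutting the pipeline short. A sketch with the question's records passed inline (the step_* variable names are made up for this example):

```shell
data='{"id":1, "attributes":{"a": 1, "b": 2, "c": 3}}
{"id":2, "attributes":{"a": 4, "b": 5, "d": 6}}
{"id":3, "name":"ABC"}'

# After map: one key array per record that has attributes.
step_map=$(printf '%s\n' "$data" | jq -c -s 'map(.attributes | keys?)')

# After add: the key arrays concatenated, duplicates still present.
step_add=$(printf '%s\n' "$data" | jq -c -s 'map(.attributes | keys?) | add')

# After unique: sorted, duplicates removed.
result=$(printf '%s\n' "$data" | jq -c -s 'map(.attributes | keys?) | add | unique')

echo "$step_map"  # [["a","b","c"],["a","b","d"]]
echo "$step_add"  # ["a","b","c","a","b","d"]
echo "$result"    # ["a","b","c","d"]
```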

Extract value inside a matching block using JQ in a nested array [duplicate]

I have the following json file:
{
"FOO": {
"name": "Donald",
"location": "Stockholm"
},
"BAR": {
"name": "Walt",
"location": "Stockholm"
},
"BAZ": {
"name": "Jack",
"location": "Whereever"
}
}
I am using jq and want to get the "name" elements of the objects where 'location' is 'Stockholm'.
I know I can get all names by
cat json | jq .[] | jq ."name"
"Jack"
"Walt"
"Donald"
But I can't figure out how to print only certain objects, given the value of a sub key (here: "location" : "Stockholm").
Adapted from this post on Processing JSON with jq, you can use the select(bool) like this:
$ jq '.[] | select(.location=="Stockholm")' json
{
"location": "Stockholm",
"name": "Walt"
}
{
"location": "Stockholm",
"name": "Donald"
}
To obtain a stream of just the names:
$ jq '.[] | select(.location=="Stockholm") | .name' json
produces:
"Donald"
"Walt"
To obtain a stream of corresponding (key name, "name" attribute) pairs, consider:
$ jq -c 'to_entries[]
| select (.value.location == "Stockholm")
| [.key, .value.name]' json
Output:
["FOO","Donald"]
["BAR","Walt"]
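If the pairs are meant for further processing in the shell, tab-separated text may be easier to consume than JSON arrays. A sketch along those lines, with the question's document passed inline; the use of @tsv is a suggestion that goes beyond the answer above.

```shell
doc='{"FOO":{"name":"Donald","location":"Stockholm"},"BAR":{"name":"Walt","location":"Stockholm"},"BAZ":{"name":"Jack","location":"Whereever"}}'

# Emit "KEY<TAB>name" lines for the matching entries.
pairs=$(printf '%s' "$doc" | jq -r 'to_entries[]
  | select(.value.location == "Stockholm")
  | [.key, .value.name]
  | @tsv')

# Read the two columns back in a shell loop.
printf '%s\n' "$pairs" | while IFS="$(printf '\t')" read -r key name; do
  echo "$key -> $name"
done
```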
I had a related question: what if you want the original object format back (with the key names, e.g. FOO, BAR)?
jq provides to_entries and from_entries to convert between objects and arrays of key-value pairs. Combine them with a map around the select. The jq manual describes them as follows:
These functions convert between an object and an array of key-value
pairs. If to_entries is passed an object, then for each k: v entry in
the input, the output array includes {"key": k, "value": v}.
from_entries does the opposite conversion, and with_entries(foo) is a
shorthand for to_entries | map(foo) | from_entries, useful for doing
some operation to all keys and values of an object. from_entries
accepts key, Key, name, Name, value and Value as keys.
jq15 < json 'to_entries | map(select(.value.location=="Stockholm")) | from_entries'
{
"FOO": {
"name": "Donald",
"location": "Stockholm"
},
"BAR": {
"name": "Walt",
"location": "Stockholm"
}
}
Using the with_entries shorthand, this becomes:
jq15 < json 'with_entries(select(.value.location=="Stockholm"))'
{
"FOO": {
"name": "Donald",
"location": "Stockholm"
},
"BAR": {
"name": "Walt",
"location": "Stockholm"
}
}
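The location does not have to be hard-coded in the filter. As a sketch, it could be passed in with jq's --arg option; the variable name loc is arbitrary and the document is given inline here.

```shell
doc='{"FOO":{"name":"Donald","location":"Stockholm"},"BAR":{"name":"Walt","location":"Stockholm"},"BAZ":{"name":"Jack","location":"Whereever"}}'

# --arg binds the shell value to the jq variable $loc as a string.
matching=$(printf '%s' "$doc" | jq --arg loc "Stockholm" 'with_entries(select(.value.location == $loc))')

# Only the key names of the matching entries (keys sorts them).
matching_keys=$(printf '%s' "$doc" | jq -c --arg loc "Stockholm" 'with_entries(select(.value.location == $loc)) | keys')

echo "$matching"
echo "$matching_keys"  # ["BAR","FOO"]
```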
Just try this one as a full copy paste in the shell and you will grasp it.
# Pass the multiline string to jq and select the attribute named "card_id"
# ONLY if its sibling attribute "card_id_type" has the value "card_id_type-01".
# jq -r prints raw output, i.e. string values without the surrounding quotes.
cat << EOF | \
jq -r '.[]| select (.card_id_type == "card_id_type-01")|.card_id'
[
{ "card_id": "id-00", "card_id_type": "card_id_type-00"},
{ "card_id": "id-01", "card_id_type": "card_id_type-01"},
{ "card_id": "id-02", "card_id_type": "card_id_type-02"}
]
EOF
# The closing EOF above MUST start at the beginning of its line, with no leading whitespace.
# outputs:
# id-01
Or with an AWS CLI command:
# list the values of the tags named "Name" on my VPCs
aws ec2 describe-vpcs | jq -r '.| .Vpcs[].Tags[]
|select (.Key == "Name") | .Value'|sort -nr
Note that you can move up and down the hierarchy both while filtering and while selecting:
kubectl get services --all-namespaces -o json | jq -r '
.items[] | select( .metadata.name
| contains("my-srch-string")) |
# bind each port once so that nodePort and port stay paired
.spec.ports[] as $p |
{ name: .metadata.name, ns: .metadata.namespace
, nodePort: $p.nodePort
, port: $p.port}
'
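One thing to watch when building objects from an array such as .spec.ports: iterating the array twice inside one object construction yields every combination of the two iterations, whereas binding each element to a variable keeps its fields together. The sketch below uses a made-up service document (not real kubectl output) with two ports to show the difference.

```shell
# Hypothetical sample data shaped like a service list with two ports.
services='{"items":[{"metadata":{"name":"my-srch-string-svc","namespace":"default"},"spec":{"ports":[{"port":80,"nodePort":30080},{"port":443,"nodePort":30443}]}}]}'

# Two independent iterations: 2 x 2 = 4 objects for one service.
cross_count=$(printf '%s' "$services" | jq '[.items[] | {nodePort: .spec.ports[].nodePort, port: .spec.ports[].port}] | length')

# One iteration bound to $p: one row per port, fields paired.
paired=$(printf '%s' "$services" | jq -r '.items[]
  | .spec.ports[] as $p
  | [.metadata.name, $p.nodePort, $p.port]
  | @tsv')

echo "$cross_count"
printf '%s\n' "$paired"
```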

Listing all keys of a nested object with jq

I want to list the keys of a nested object of my document.
For example, I want the keys in the "a" object: "a1", "a2"
The sample document:
{
"a": {
"a1": "hello",
"a2": "world"
},
"b": {
"b1": "bonjour",
"b2": "monde"
}
}
I know I can use keys, but it seems to work only for the first level object: cat my.json | jq keys will output a, b.
So far I chain two calls to jq, but I wonder whether it can be done in one call:
cat my.json | jq .a | jq keys --> a1, a2
OK, I just found out how to do it in a single call:
cat my.json | jq '.a|keys'
a1, a2
Or even, as @Inian suggested, without the cat:
jq '.a|keys' my.json
a1, a2
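The same idea extends to listing the keys of every nested object at once. A sketch with the sample document inline; map_values and the variable names are additions for illustration, not part of the answer above.

```shell
doc='{"a":{"a1":"hello","a2":"world"},"b":{"b1":"bonjour","b2":"monde"}}'

# Keys of the "a" object as a plain comma-separated string.
a_keys=$(printf '%s' "$doc" | jq -r '.a | keys | join(", ")')

# Keys of each top-level object, keeping the outer key names.
all_keys=$(printf '%s' "$doc" | jq -c 'map_values(keys)')

echo "$a_keys"    # a1, a2
echo "$all_keys"
```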

How to group a JSON by a key and sort by its count?

I start from a jsonlines file similar to this
{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "bar", "age": 1}
{ "kw": "bar", "age": 1}
Please note that each line is valid JSON, but the file as a whole is not.
The output I'm seeking is a list of keywords sorted by number of occurrences, like this:
[
{"kw": "foo", "count": 3},
{"kw": "bar", "count": 2}
]
I'm able to group and count the keywords using the slurp option
jq --slurp '. | group_by(.kw) | .[] | {kw: .[0].kw, count: . | length }'
Output:
{"kw":"bar","count":2}
{"kw":"foo","count":3}
But:
This is not sorted
This is not valid JSON array
A very stupid solution I've found, is to pass twice via jq :)
jq --slurp --compact-output '. | group_by(.kw) | .[] | {kw: .[0].kw, count: . | length }' sample.json \
| jq --slurp --compact-output '. | sort_by(.count)'
But I'm pretty sure someone smarter than me can find a more elegant solution.
"This is not sorted"
That is not quite correct: group_by(.kw) internally sorts by the grouping expression, as sort_by(.kw) would, so the groups come out ordered by the value of .kw (bar before foo), not by count. See jq Manual - group_by(path_expression)
"This is not valid JSON array"
Just enclose the operation within [..]; the leading . is also optional. So just do
jq --slurp --compact-output '[ group_by(.kw)[] | {kw: .[0].kw, count: length } ]'
If you mean sorting by .count, you can do an ascending sort and then reverse it:
jq --slurp --compact-output '[ group_by(.kw)[] | {kw: .[0].kw, count: length }] | sort_by(.count) | reverse'
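As a possible alternative to sort_by followed by reverse, the count can be negated inside sort_by to get descending order in one step. A sketch with the question's sample lines passed inline; note that records with equal counts may come out in a different relative order than with reverse.

```shell
lines='{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "foo", "age": 1}
{ "kw": "bar", "age": 1}
{ "kw": "bar", "age": 1}'

# Group by keyword, count each group, sort by count descending.
sorted=$(printf '%s\n' "$lines" | jq --slurp --compact-output \
  '[group_by(.kw)[] | {kw: .[0].kw, count: length}] | sort_by(-.count)')

echo "$sorted"  # [{"kw":"foo","count":3},{"kw":"bar","count":2}]
```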
