jq - get dict element based on key regex - json

I'm working with a JSON object having the following structure:
{
"key-foo-1.0": [
{
"key1": "foo",
"key2": "bar",
"id": "01"
},
{
"key1": "foo",
"key2": "bar",
"id": "23"
}
],
"key-bar-1.0": [
{
"key1": "foo",
"key2": "bar",
"id": "45"
},
{
"key1": "foo",
"key2": "bar",
"id": "67"
}
],
"key-baz-1.0": [
{
"key1": "foo",
"key2": "bar",
"id": "89"
}
]
}
I want to get all the id values where the "parent" key name matches the pattern .*foo.* or .*bar.*.
So in my example something like this:
cat json | jq <some filter>
01
23
45
67
Based on https://unix.stackexchange.com/questions/443884/match-keys-with-regex-in-jq I tried:
$ cat json | jq 'with_entries(if (.key|test(".*foo.*$")) then ( {key: .key, value: .value } ) else empty end )'
{
"key-foo-1.0": [
{
"key1": "foo",
"key2": "bar",
"id": "01"
},
{
"key1": "foo",
"key2": "bar",
"id": "23"
}
]
}
But I don't really know how to continue.
I also think there is a better/simpler solution.

You could go with:
jq -r '.[keys_unsorted[] | select(test(".*foo.*|.*bar.*"))][].id'
01
23
45
67
This gathers all keys using keys_unsorted, then selects those matching the regular expression in test. The wrapping .[…] descends into them, the following [] iterates over the children, and .id outputs the values as raw text using the -r flag.
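As a minimal sketch of the same idea, assuming a trimmed copy of the input (only the id fields) held in a hypothetical shell variable named json: since test matches anywhere in the string, the shorter pattern foo|bar should select the same keys.

```shell
# Trimmed copy of the question's input; "json" and "ids" are illustrative names.
json='{"key-foo-1.0":[{"id":"01"},{"id":"23"}],"key-bar-1.0":[{"id":"45"},{"id":"67"}],"key-baz-1.0":[{"id":"89"}]}'

# keys_unsorted[] emits each key, select(test(...)) keeps the matching ones,
# .[...] looks them up, [] iterates the arrays, and .id extracts the values.
ids=$(printf '%s' "$json" | jq -r '.[keys_unsorted[] | select(test("foo|bar"))][].id')
printf '%s\n' "$ids"
```

keys_unsorted keeps the keys in their original order, so the ids come out in the order they appear in the input.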

You can use the following jq expression (the -r flag prints the ids without quotes):
jq -r 'to_entries[] | select(.key | test(".*foo.*|.*bar.*")) | .value[] | .id'

Related

Select objects from file A where value of path appear in file B in jq

I want to filter with jq the objects from json content of this fileA
[
{
"id": "bird",
"content": {
"key1": "a"
}
},
{
"id": "dog",
"content": {
"key1": "b"
}
},
{
"id": "cat",
"content": {
"key1": "c"
}
}
]
where the id appears in the JSON content of fileB under the key theId (the sort order does not matter):
[
{
"theId": "cat"
},
{
"theId": "bird"
}
]
Expected result (the sort order does not matter):
[
{
"id": "cat",
"content": {
"key1": "c"
}
},
{
"id": "bird",
"content": {
"key1": "a"
}
}
]
I think I can do this in a bash loop:
loop over the ids from fileB
and run jq to extract each given id, such as
jq -c "map(select(.id | contains(\"$id\")))"
but then I have to join the results with commas, which seems dirty.
I don't know how to tell jq that the filter consists of the values of the array stored in fileB.
Is it possible?
Here is one way:
$ jq 'map(.theId) as $ids | input | map(select(.id | IN($ids[])))' fileB fileA
[
{
"id": "bird",
"content": {
"key1": "a"
}
},
{
"id": "cat",
"content": {
"key1": "c"
}
}
]
A simple solution using --slurpfile:
jq --slurpfile b fileB 'map(select(.id|IN($b[][].theId)))' fileA
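Both answers can be tried side by side with a sketch like the following, which assumes a reasonably recent jq that provides IN and writes the two sample files into a temporary directory (the variable names are illustrative only). With two file arguments, the first file is the initial input and input reads the next one; --slurpfile wraps the file's contents in an extra array, which is why the second filter needs $b[][].

```shell
# Recreate the two sample files in a scratch directory.
tmp=$(mktemp -d)
printf '%s' '[{"id":"bird","content":{"key1":"a"}},{"id":"dog","content":{"key1":"b"}},{"id":"cat","content":{"key1":"c"}}]' > "$tmp/fileA"
printf '%s' '[{"theId":"cat"},{"theId":"bird"}]' > "$tmp/fileB"

# Variant 1: fileB is the first input, fileA is pulled in with `input`.
with_input=$(jq -c 'map(.theId) as $ids | input | map(select(.id | IN($ids[])))' "$tmp/fileB" "$tmp/fileA")

# Variant 2: fileB is bound to $b as an array holding the file's single JSON value.
with_slurp=$(jq -c --slurpfile b "$tmp/fileB" 'map(select(.id | IN($b[][].theId)))' "$tmp/fileA")

printf '%s\n%s\n' "$with_input" "$with_slurp"
rm -rf "$tmp"
```

Both variants keep the objects in the order they have in fileA, so bird comes before cat.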

Use jq to parse key path for each leaf

(I'm not sure which technical terms to use, but I can update the question if someone can clarify the terminology for what I'm trying to do. It might help someone find this answer in the future.)
Given the input JSON, how would I use jq to produce the expected output?
Input:
{
"items": {
"item1": {
"part1": {
"a": {
"key1": "value",
"key2": "value"
},
"b": {
"key1": "value",
"key2": "value"
}
},
"part2": {
"c": {
"key1": "value",
"key2": "value"
},
"d": {
"key1": "value",
"key2": "value"
}
}
},
"item2": {
"part3": {
"e": {
"key1": "value",
"key2": "value"
},
"f": {
"key1": "value",
"key2": "value"
}
},
"part4": {
"g": {
"key1": "value",
"key2": "value"
},
"h": {
"key1": "value",
"key2": "value"
}
}
}
}
}
Expected output:
{
"item1": [
"part1.a",
"part1.b",
"part2.c",
"part2.d"
],
"item2": [
"part3.e",
"part3.f",
"part4.g",
"part4.h"
]
}
Try this:
.items | map_values([path(.[][]) | join(".")])
Each output path will contain as many path components as the number of []s in the .[][] part; in other words, if you change .[][] to .[][][], for example, you'll see part1.a.key1, part1.a.key2, etc.
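The effect of the number of []s can be checked with a small sketch, assuming a cut-down input with a single leaf object (the variable names are made up for illustration):

```shell
# Cut-down input: one item, one part, one sub-key.
json='{"items":{"item1":{"part1":{"a":{"key1":"value"}}}}}'

# Two []s: each path has two components.
two=$(printf '%s' "$json" | jq -c '.items | map_values([path(.[][]) | join(".")])')

# Three []s: each path has three components.
three=$(printf '%s' "$json" | jq -c '.items | map_values([path(.[][][]) | join(".")])')

printf '%s\n%s\n' "$two" "$three"
```

The first command should print {"item1":["part1.a"]} and the second {"item1":["part1.a.key1"]}.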
This would do it:
# Output: a stream
def keyKey:
keys_unsorted[] as $k | $k + "." + (.[$k] | keys_unsorted[]);
.items | map_values([keyKey])
Some aspects are underspecified. For instance, you don't say how deep the aggregation should go for the array items. Is it always two levels deep, or is it the whole tree except the last level?
Here's one way to go exactly two levels deep, with the keys sorted alphabetically:
jq '.items | .[] |= [keys[] as $k | $k + "." + (.[$k] | keys[])]'
Here's another way, which descends to the second-to-last level:
jq '.items | .[] |= ([path(.. | scalars)[:-1] | join(".")] | unique)'
Output:
{
"item1": [
"part1.a",
"part1.b",
"part2.c",
"part2.d"
],
"item2": [
"part3.e",
"part3.f",
"part4.g",
"part4.h"
]
}
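To see why the variant that descends to the second-to-last level needs unique, here is a sketch on a cut-down input (an assumption for illustration, not the full data): every scalar under the same parent yields the same truncated path, so duplicates appear before deduplication.

```shell
# Cut-down input: "a" has two scalar children, "b" has one.
json='{"items":{"item1":{"part1":{"a":{"key1":"v","key2":"v"},"b":{"key1":"v"}}}}}'

# path(.. | scalars) lists the path to every leaf; [:-1] drops the last component.
raw=$(printf '%s' "$json" | jq -c '.items | .[] |= [path(.. | scalars)[:-1] | join(".")]')

# Adding unique sorts the list and removes the repeated parent paths.
deduped=$(printf '%s' "$json" | jq -c '.items | .[] |= ([path(.. | scalars)[:-1] | join(".")] | unique)')

printf '%s\n%s\n' "$raw" "$deduped"
```

Here part1.a appears twice in the raw list (once per scalar child) and once after unique.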
The unique sequence of jq paths to each and every leaf is returned by json2jqpath.jq:
json2jqpath.jq dat.json
.
.items
.items|.item1
.items|.item1|.part1
.items|.item1|.part1|.a
.items|.item1|.part1|.a|.key1
.items|.item1|.part1|.a|.key2
.items|.item1|.part1|.b
.items|.item1|.part1|.b|.key1
.items|.item1|.part1|.b|.key2
.items|.item1|.part2
.items|.item1|.part2|.c
.items|.item1|.part2|.c|.key1
.items|.item1|.part2|.c|.key2
.items|.item1|.part2|.d
.items|.item1|.part2|.d|.key1
.items|.item1|.part2|.d|.key2
.items|.item2
.items|.item2|.part3
.items|.item2|.part3|.e
.items|.item2|.part3|.e|.key1
.items|.item2|.part3|.e|.key2
.items|.item2|.part3|.f
.items|.item2|.part3|.f|.key1
.items|.item2|.part3|.f|.key2
.items|.item2|.part4
.items|.item2|.part4|.g
.items|.item2|.part4|.g|.key1
.items|.item2|.part4|.g|.key2
.items|.item2|.part4|.h
.items|.item2|.part4|.h|.key1
.items|.item2|.part4|.h|.key2
It is not the output you asked for but, as another answer noted, your question may be somewhat underspecified. Starting from a preprocessed structure such as this has the advantage of reducing any JSON file to its set of paths, which you can then work with.
json2jqpath

Using jq to parse Array and map to string

I have the following JSON Data-Structure:
{
"data": [
[
{
"a": "1",
"b": "i"
},
{
"a": "2",
"b": "ii"
},
{
"a": "3",
"b": "iii"
}
],
[
{
"a": "4",
"b": "iv"
},
{
"a": "5",
"b": "v"
},
{
"a": "6",
"b": "vi"
}
]
]
}
And I need to get the following output:
1+2+3 i|ii|iii
4+5+6 iv|v|vi
I tried the following without success:
$ cat data.json | jq -r '.data[] | .[].a | join("+")'
jq: error (at <stdin>:1642): Cannot iterate over string ("1")
And also this, but I have no idea how to continue from here:
$ cat data.json | jq -r '.data[] | to_entries | .[]'
This looks like an endless journey to me at the moment. If you can help me, I would be very happy. :-)
This should be pretty simple. Collect both fields into an array, join each with the required delimiter, and print the result as tab-separated values:
jq -r '.data[] | [ ( map(.a) | join("+") ), ( map(.b) | join("|") ) ] | @tsv'
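A short sketch of this approach on a shortened copy of the data, using jq's @tsv format string to join the two columns with a tab (the variable names are illustrative). The original attempt failed because .[].a emits individual strings, whereas join needs an array, which map(.a) provides.

```shell
# Shortened copy of the question's data: two rows with two objects each.
json='{"data":[[{"a":"1","b":"i"},{"a":"2","b":"ii"}],[{"a":"4","b":"iv"},{"a":"5","b":"v"}]]}'

# For each inner array: build [joined a values, joined b values], then emit one TSV line.
rows=$(printf '%s' "$json" | jq -r '.data[] | [(map(.a) | join("+")), (map(.b) | join("|"))] | @tsv')
printf '%s\n' "$rows"
```

Each inner array becomes one output line, with a tab between the two joined columns.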

jq: sort object values

I want to sort this data structure by the object keys (easy with -S) and sort the object values (the arrays) by the 'foo' property.
I can sort them with
jq -S '
. as $in
| keys[]
| . as $k
| $in[$k] | sort_by(.foo)
' < test.json
... but that loses the keys.
I've tried variations of adding | { "\($k)": . }, but then I end up with a list of objects instead of one object. I also tried variations of adding to $in (same problem) or using $in = $in * { ... }, but that gives me syntax errors.
The one solution I did find was to just have the separate objects and then pipe it into jq -s add, but ... I really wanted it to work the other way. :-)
Test data below:
{
"": [
{ "foo": "d" },
{ "foo": "g" },
{ "foo": "f" }
],
"c": [
{ "foo": "abc" },
{ "foo": "def" }
],
"e": [
{ "foo": "xyz" },
{ "foo": "def" }
],
"ab": [
{ "foo": "def" },
{ "foo": "abc" }
]
}
Maybe this?
jq -S '.[] |= sort_by(.foo)'
Output
{
"": [
{
"foo": "d"
},
{
"foo": "f"
},
{
"foo": "g"
}
],
"ab": [
{
"foo": "abc"
},
{
"foo": "def"
}
],
"c": [
{
"foo": "abc"
},
{
"foo": "def"
}
],
"e": [
{
"foo": "def"
},
{
"foo": "xyz"
}
]
}
@user197693 had a great answer. A suggestion I got in a private message elsewhere was to use
jq -S 'with_entries(.value |= sort_by(.foo))'
If for some reason using the -S command-line option is not a satisfactory option, you can also perform the by-key sort using the to_entries | sort_by(.key) | from_entries idiom. So a complete solution to the problem would be:
.[] |= sort_by(.foo)
| to_entries | sort_by(.key) | from_entries
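As a sketch of this -S-free variant, assuming a smaller input than the test data above (the variable names are hypothetical):

```shell
# Small input with keys out of order and unsorted arrays.
json='{"c":[{"foo":"b"},{"foo":"a"}],"ab":[{"foo":"z"},{"foo":"y"}]}'

# First sort each array by .foo, then rebuild the object with its keys in sorted order.
sorted=$(printf '%s' "$json" | jq -c '.[] |= sort_by(.foo) | to_entries | sort_by(.key) | from_entries')
printf '%s\n' "$sorted"
```

The update assignment binds more tightly than the pipe, so the arrays are sorted first and the resulting object is then passed on to to_entries.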

jq map object key value to array of objects containing both

I would like to put each object's parent key inside the object itself and collect the resulting objects into an array.
Given:
{
"field1": {
"key1": 11,
"key2": 10
},
"field2": {
"key1": 11,
"key2": 10
}
}
Desired output
[
{"name": "field1", "key1": 11, "key2": 10},
{"name": "field2", "key1": 11, "key2": 10}
]
I know that jq keys would give me ["field1", "field2"] and jq '[.[]]' would give
[
{ "key1": 11, "key2": 10 },
{ "key1": 11, "key2": 10 }
]
I cannot figure out a way to combine them, how should I go about it?
Generate an object in {"name": <key>} form for each key, and merge that with the key's value.
to_entries | map({name: .key} + .value)
or:
[keys_unsorted[] as $k | {name: $k} + .[$k]]
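Both filters can be compared with a sketch like this one, run against the input from the question (the shell variable names are illustrative):

```shell
# Input from the question, compacted onto one line.
json='{"field1":{"key1":11,"key2":10},"field2":{"key1":11,"key2":10}}'

# Variant 1: turn the object into key/value entries and merge {name: key} with each value.
via_entries=$(printf '%s' "$json" | jq -c 'to_entries | map({name: .key} + .value)')

# Variant 2: iterate over the keys and look each value up by key.
via_keys=$(printf '%s' "$json" | jq -c '[keys_unsorted[] as $k | {name: $k} + .[$k]]')

printf '%s\n%s\n' "$via_entries" "$via_keys"
```

Because {name: ...} is the left operand of +, name comes first in each resulting object.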
Something like the following: get the list of keys in the JSON using keys[] and, for each key $k, merge a new name field with the object stored under that key.
jq '[ keys[] as $k | { name: $k } + .[$k] ]'
If you want the ordering of keys maintained, use keys_unsorted[].