JQ: Filtering for keys - json

I'm trying to use the JQ command to filter a json object to selectively extract keys. Here's a sample object that I've placed in file x.txt:
{
"activities" : {
"-KSndgjqvmQkWVKHCpLh" : {
"create_device" : "...",
"stop_time_utc" : "2016-11-01T23:08:08Z"
},
"-KSoBSrBh6PZcjRocGD7" : {
"create_device" : "..."
},
"-KSptboGjo8g4bieUbGM" : {
"create_device" : "...",
"stop_time_utc" : "2017-01-17T23:08:08Z"
}
}
}
The following command can extract all of the activity keys:
cat x.txt | jq '.activities | keys'
[
"-KSndgjqvmQkWVKHCpLh",
"-KSoBSrBh6PZcjRocGD7",
"-KSptboGjo8g4bieUbGM"
]
I've been googling and experimenting for a few hours trying to filter the object to select only the activity entries that have a stop_time_utc value, and use something like a "select(.stop_time_utc | fromdateiso8601 > now)" to only pick activities that have expired. For example, I'd like to use filters to create an array from the sample object with only the one relevant entry:
[
"-KSndgjqvmQkWVKHCpLh"
]
Is attempting this with the keys option the wrong route? Any ideas or suggestions would be much appreciated.

with_entries/1 is your friend, e.g.:
.activities | with_entries( select(.value | has("stop_time_utc") ) )
produces:
{
"-KSndgjqvmQkWVKHCpLh": {
"create_device": "...",
"stop_time_utc": "2016-11-01T23:08:08Z"
},
"-KSptboGjo8g4bieUbGM": {
"create_device": "...",
"stop_time_utc": "2017-01-17T23:08:08Z"
}
It's now easy to add additional selection criteria, extract the key names of interest, etc. For example:
.activities
| with_entries( select( (.value.stop_time_utc? | fromdateiso8601?) < now ) )
| keys

Related

How to generate all JSON paths conditionally with jq

I have arbitrarily nested JSON objects similar to below.
{
"parent1" : "someval"`,
"parent2" : {
"a" : "someval",
"b" : "someval"
},
"parent3" : {
"child1" : {
"a" : "someval"
},
"child2" : {
"b" : "someval"
}
}
}
I need to recursively go through them and check to see if any parent has children keys a or b, or both, and generate the JSON path to that parent like so:
Output:
parent2
parent3.child1
parent3.child2
I have tried using
jq -r 'path(..) | map (. | tostring) | join (".")
Which helps me generate all paths, but I haven't found a way to combine conditions like has("a") with path successfully. How can I go about achieving this ?
You could use index to check if you keys are in the path array:
path(..) | select(index("a") or index("b")) | join(".")
"parent2.a"
"parent2.b"
"parent3.child1.a"
"parent3.child2.b"
JqPlay demo
If you don't want the last key, you could add [:-1] to 'remove' the last index in each array to output:
path(..) | select(index("a") or index("b"))[:-1] | join(".")
"parent2"
"parent2"
"parent3.child1"
"parent3.child2"
JqPlay demo

jq: How can I get array values based on superordinate key name

I'm trying to use jq to parse the output of https://ssl-config.mozilla.org/guidelines/5.6.json, a pretty simple JSON structure.
How can I get the "openssl" values if "configurations" is "modern" or "intermediate"?
The basic JSON structure would be:
{
"configurations": {
"intermediate": {
"ciphers": {
"openssl": [
"ECDHE-ECDSA-AES128-GCM-SHA256",
"ECDHE-RSA-AES128-GCM-SHA256",
"ECDHE-ECDSA-AES256-GCM-SHA384",
"ECDHE-RSA-AES256-GCM-SHA384",
"ECDHE-ECDSA-CHACHA20-POLY1305",
"ECDHE-RSA-CHACHA20-POLY1305",
"DHE-RSA-AES128-GCM-SHA256",
"DHE-RSA-AES256-GCM-SHA384"
]
}
}
}
}
I had to shorten it in order to avoid the "It looks like your post is mostly code; please add some more detail" error message.
To get all both the modern and intermediate openssl arrays, we can use:
jq '.configurations | with_entries(select([.key] | inside([ "modern", "intermediate" ])))[] | .ciphers.openssl' input
This will show:
[]
[
"ECDHE-ECDSA-AES128-GCM-SHA256",
"ECDHE-RSA-AES128-GCM-SHA256",
"ECDHE-ECDSA-AES256-GCM-SHA384",
"ECDHE-RSA-AES256-GCM-SHA384",
"ECDHE-ECDSA-CHACHA20-POLY1305",
"ECDHE-RSA-CHACHA20-POLY1305",
"DHE-RSA-AES128-GCM-SHA256",
"DHE-RSA-AES256-GCM-SHA384"
]
To get a result with an object so we can see on what key those openssl certs are found, use something like:
jq '.configurations | to_entries | map(select([.key] | inside([ "modern", "intermediate" ])) | { "\(.key)": .value.ciphers.openssl }) | add' input
This will produce:
{
"modern": [],
"intermediate": [
"ECDHE-ECDSA-AES128-GCM-SHA256",
"ECDHE-RSA-AES128-GCM-SHA256",
"ECDHE-ECDSA-AES256-GCM-SHA384",
"ECDHE-RSA-AES256-GCM-SHA384",
"ECDHE-ECDSA-CHACHA20-POLY1305",
"ECDHE-RSA-CHACHA20-POLY1305",
"DHE-RSA-AES128-GCM-SHA256",
"DHE-RSA-AES256-GCM-SHA384"
]
}

jq select error: "Cannot index string with string <object>"

command:
cat test.json | jq -r '.[] | select(.input[] | .["$link"] | contains("randomtext1")) | .id'
I was expecting to have both entries (a and b) to show up since they both contains randomtext1
Instead, I got the following output message:
a
jq: error (at <stdin>:22): Cannot index string with string "$link"
From some digging I understand that the issue is likely caused by the following object/value pair in the a entry:
"someotherobj": "123"
because it does not contain the object $link and the filter in the command expects to see $link in all objects under the input so it errors out before the command has a chance to search in the b entry.
What I really want is to be able to search for any entries that have at least one "$link": "randomtext1" pair under input. Is there a fuzzier search feature allowing me to achieve this?
I tried to use two contains hoping it will just pipe things through:
jq -r '.[] | select(.input[] | contains(["$link"]) | contains("randomtext1")) | .id'
but it did not like that at all..
the test.json file:
[
{
"input": {
"obj1": {
"$link": "randomtext1"
},
"obj2": {
"$link": "randomtext2"
},
"someotherobj": "123"
},
"id": "a"
},
{
"input": {
"obj3": {
"$link": "randomtext1"
},
"obj4": {
"$link": "randomtext2"
}
},
"id": "b"
}
]
What I really want is to be able to search for any entries that have at least one "$link": "randomtext1" pair under input.
The key word here, both in the question and the following answer, is any:
.[]
| select( any(.input[];
type=="object" and has("$link") and (.["$link"] | index("randomtext1"))))
| .id
Of course if you require the key's value to be "randomtext1", you'd write .["$link"] == "randomtext1".

Building objects with jq

Using jq I'd like to convert data of the format:
{
"key": "something-else",
"value": {
"value": "bloop",
"isEncrypted": false
}
}
{
"key": "something",
"value": {
"value": "blah",
"isEncrypted": false
}
}
To the format:
{
something: "blah",
something-else: "bloop"
}
Filtering out 'encrypted values' along the way. How can I achieve this? I've gotten as far as the following:
.parameters | to_entries[] | select (.value.isEncrypted == false) | .key + ": " + .value.value
Which produces:
"something-else: bloop"
"something: blah"
Close, but not there just yet. I suspect that there's some clever function for this.
Given the example input, here's a simple solution, assuming the stream of objects is available as an array. (This can be done using jq -s if the JSON objects are given as input to jq, or in your case, following your example, simply using .parameters | to_entries).
map( select(.value.isEncrypted == false) | {(.key): .value.value } )
| add
This produces the JSON object:
{
"something-else": "bloop",
"something": "blah"
}
The key ideas here are:
the syntax for object construction: {( KEYNAME ): VALUE}
add
One way to gain an understanding of how this works is to run the first part of the filter (map(...)) first.
Using keys_unsorted
If you want to avoid the overhead of to_entries, you might want to consider the following approach, which piggy-backs off your implicit description of .parameters:
.parameters
| [ keys_unsorted[] as $k
| if .[$k].isEncrypted == false
then { ($k) : .[$k].value } else empty end ]
| add

Using jq to extract common prefixes in a JSON data structure

I have a JSON data set with around 8.7 million key value pairs extracted from a Redis store, where each key is guaranteed to be an 8 digit number, and the key is an 8 alphanumeric character value i.e.
[{
"91201544":"INXX0019",
"90429396":"THXX0020",
"20140367":"ITXX0043",
...
}]
To reduce Redis memory usage, I want to transform this into a hash of hashes, where the hash prefix key is the first 6 characters of the key (see this link) and then store this back into Redis.
Specifically, I want my resulting JSON data structure (that I'll then write some code to parse this JSON structure and create a Redis command file consisting of HSET, etc) to look more like
[{
"000000": { "00000023": "INCD1234",
"00000027": "INCF1423",
....
},
....
"904293": { "90429300": "THXX0020",
"90429302": "THXX0024",
"90429305": "THXY0013"}
}]
Since I've been impressed by jq and I'm trying to be more proficient at functional style programming, I wanted to use jq for this task. So far I've come up with the following:
% jq '.[0] | to_entries | map({key: .key, pfx: .key[0:6], value: .value}) | group_by(.pfx)'
This gives me something like
[
[
{
"key": "00000130",
"pfx": "000001",
"value": "CAXX3231"
},
{
"key": "00000162",
"pfx": "000001",
"value": "CAXX4606"
}
],
[
{
"key": "00000238",
"pfx": "000002",
"value": "CAXX1967"
},
{
"key": "00000256",
"pfx": "000002",
"value": "CAXX0727"
}
],
....
]
I've tried the following:
% jq 'map(map({key: .pfx, value: {key, value}}))
| map(reduce .[] as $item ({}; {key: $item.key, value: [.value[], $item.value]} ))
| map( {key, value: .value | from_entries} )
| from_entries'
which does give me the correct result, but also prints out an error for every reduce (I believe) of
jq: error: Cannot iterate over null
The end result is
{
"000001": {
"00000130": "CAXX3231",
"00000162": "CAXX4606"
},
"000002": {
"00000238": "CAXX1967",
"00000256": "CAXX0727"
},
...
}
which is correct, but how can I avoid getting this stderr warning thrown as well?
I'm not sure there's enough data here to assess what the source of the problem is. I find it hard to believe that what you tried results in that. I'm getting errors with that all the way.
Try this filter instead:
.[0]
| to_entries
| group_by(.key[0:6])
| map({
key: .[0].key[0:6],
value: map(.key=.key[6:8]) | from_entries
})
| from_entries
Given data that looks like this:
[{
"91201544":"INXX0019",
"90429396":"THXX0020",
"20140367":"ITXX0043",
"00000023":"INCD1234",
"00000027":"INCF1423",
"90429300":"THXX0020",
"90429302":"THXX0024",
"90429305":"THXY0013"
}]
Results in this:
{
"000000": {
"23": "INCD1234",
"27": "INCF1423"
},
"201403": {
"67": "ITXX0043"
},
"904293": {
"00": "THXX0020",
"02": "THXX0024",
"05": "THXY0013",
"96": "THXX0020"
},
"912015": {
"44": "INXX0019"
}
}
I understand that this is not what you are asking for but, just for the reference, I think it will be MUCH more faster to do this with Redis's built-in Lua scripting.
And it turns out that it is a bit more straightforward:
for _,key in pairs(redis.call('keys', '*')) do
local val = redis.call('get', key)
local short_key = string.sub(key, 0, -2)
redis.call('hset', short_key, key, val)
redis.call('del', key)
end
This will be done in place without transferring from/to Redis and converting to/from JSON.
Run it from console as:
$ redis-cli eval "$(cat script.lua)" 0
For the record, jq's group_by relies on sorting, which of course will slow things down noticeably when the input is sufficiently large. The following is about 40% faster even when the input array has just 100,000 items:
def compress:
. as $in
| reduce keys[] as $key ({};
$key[0:6] as $k6
| $key[6:] as $k2
| .[$k6] += {($k2): $in[$key]} );
.[0] | compress
Given Jeff's input, the output is identical.