how to not skip array containing null using jq?

I want to process this data
{
"results": [
{
"headword": "binding",
"senses": [
{
"definition": [
"a promise, agreement etc that must be obeyed"
]
}
]
},
{
"headword": "non-binding",
"senses": [
{
"definition": [
"a non-binding agreement or decision does not have to be obeyed"
],
"examples": [
{
"text": "The industry has signed a non-binding agreement to reduce pollution."
}
]
}
]
}
]
}
into this
{
"headword": "binding",
"definition": "a promise, agreement etc that must be obeyed",
"examples": null
}
{
"headword": "non-binding",
"definition": "a non-binding agreement or decision does not have to be obeyed",
"examples": "The industry has signed a non-binding agreement to reduce pollution."
}
This command
cat data.json | jq '.results[] | {headword: .headword, definition: .senses[].definition[], examples: .senses[].examples[].text}'
errors out with 'Cannot iterate over null'.
To overcome that, I tried this command, which uses the '.[]?' filter:
cat data.json | jq '.results[] | {headword: .headword, definition: .senses[].definition[], examples: .senses[].examples[]?.text}'
but this outputs only
{
"headword": "non-binding",
"definition": "a non-binding agreement or decision does not have to be obeyed",
"examples": "The industry has signed a non-binding agreement to reduce pollution."
}
So, how do you iterate over null without skipping the array?

Using an if/else statement may help.
jq '.results[] | {
headword,
definition: .senses[0].definition[0],
examples: (if .senses[0].examples then .senses[0].examples[0].text else null end)
}' data.json
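The '.[]?' attempt drops the first entry because object construction emits one object per combination of the values its fields produce, so a field that produces no value at all yields no object. The sketch below contrasts that with plain indexing, which returns null for a missing key instead of raising an error. It assumes jq is on the PATH and uses input trimmed from the question; the variable names are only for illustration.

```shell
#!/bin/sh
# Input trimmed from the question: the first sense has no "examples" key.
data='{"results":[
  {"headword":"binding","senses":[{"definition":["a promise"]}]},
  {"headword":"non-binding","senses":[{"definition":["not obeyed"],
     "examples":[{"text":"A non-binding agreement."}]}]}]}'

# .examples[]? yields zero values for the first entry, so no object is built for it.
skipped=$(printf '%s' "$data" |
  jq -c '.results[] | {headword, examples: .senses[].examples[]?.text}')

# Indexing null with [0] or .text yields null rather than an error,
# so every entry produces exactly one object.
kept=$(printf '%s' "$data" |
  jq -c '.results[] | {headword, examples: .senses[0].examples[0].text}')

printf 'with []?:\n%s\nwith [0]:\n%s\n' "$skipped" "$kept"
```

Like the if/else version, this only looks at the first sense and the first example.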

As @oguzismail has implicitly pointed out,
assuming that the senses array has only one element
is risky, especially as the choice of name suggests
it was anticipated that each headword might have more than one sense.
A similar observation could be made about .examples, but
the question does not make it clear what should be done if .examples has more than one element.
In the following I shall therefore opt for a safe approach,
since it can easily be adjusted to meet more specific requirements.
.results[]
| { headword }
+ (.senses[]
| { definition: .definition[0],
examples: (if has("examples")
then [.examples[].text]
else null end) } )
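To see what the safe approach does when a headword has several senses, here is a small run against a hypothetical entry (the "bank" data is invented for illustration and is not from the question). It emits one object per sense and collects every example text into an array:

```shell
#!/bin/sh
# Hypothetical entry with two senses; only the first sense has examples.
data='{"results":[{"headword":"bank","senses":[
  {"definition":["a financial institution"],
   "examples":[{"text":"She went to the bank."},{"text":"The bank closed."}]},
  {"definition":["the side of a river"]}]}]}'

# Same filter as above: merge the headword into one object per sense.
out=$(printf '%s' "$data" | jq -c '
  .results[]
  | { headword }
    + (.senses[]
       | { definition: .definition[0],
           examples: (if has("examples")
                      then [.examples[].text]
                      else null end) })')

printf '%s\n' "$out"
```

The sense without an "examples" key is kept, with examples set to null.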

Related

Jq to string filter

I'm trying to filter JSON output. So far the filter works as expected when the software version is present. However, when the software version is missing, jq fails with an error. Basically, how do I guard the parenthesized (.software.version[] ...) expression so that it returns "empty" in the CSV file instead of failing?
.result [] | [ "https://vuldb.com/?id." + .entry.id ,.software.vendor // "empty"
,.software.name // "empty", (.software.version []| tostring // "empty")
,.software.type // "empty"
,.software.platform //"empty" ]
"result": [
{
"entry": {
"id": "206880",
"title": "CrowdStrike Falcon 6.31.14505.0\/6.42.15610 Uninstallation authorization",
"summary": "A vulnerability was found in CrowdStrike Falcon 6.31.14505.0\/6.42.15610. It has been classified as problematic. Affected is some unknown functionality of the component Uninstallation Handler. There is no information about possible countermeasures known. It may be suggested to replace the affected object with an alternative product.",
"details": {
"affected": "A vulnerability was found in CrowdStrike Falcon 6.31.14505.0\/6.42.15610. It has been classified as problematic.",
"vulnerability": "CWE is classifying the issue as CWE-862. The software does not perform an authorization check when an actor attempts to access a resource or perform an action.",
"impact": "This is going to have an impact on availability.",
"exploit": "It is declared as functional. The vulnerability was handled as a non-public zero-day exploit for at least 54 days. During that time the estimated underground price was around $0-$5k.",
"countermeasure": "There is no information about possible countermeasures known. It may be suggested to replace the affected object with an alternative product.",
"sources": "Further details are available at modzero.com."
},
"timestamp": {
"create": "1661155277",
"change": "1661155462"
},
"changelog": [
"vulnerability_cvss3_meta_basescore",
"vulnerability_cvss3_meta_tempscore",
"vulnerability_cvss3_researcher_basescore"
]
},
"software": {
"vendor": "CrowdStrike",
"name": "Falcon",
"version": [
"6.31.14505.0",
"6.42.15610"
],

Suppress the error and provide an alternative for empty values:
.result | map(
"https://vuldb.com/?id.\(.entry.id)",
(
.software |
.vendor // "empty",
.name // "empty",
(.version[] | tostring // "empty")? // "no version",
.type // "empty",
.platform // "empty"
)
)
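A sketch of the same error-suppression idea producing one CSV row per result. Wrapping each result in its own array and piping it to @csv is an assumption about the desired output shape, and the two sample records are hypothetical. The key part is (…)? // "no version": the ? turns the "Cannot iterate over null" error into no output, and // then supplies the fallback.

```shell
#!/bin/sh
# Hypothetical records: the second one has no "version" array.
data='{"result":[
  {"entry":{"id":"1"},"software":{"vendor":"Acme","name":"Tool","version":["1.0","2.0"]}},
  {"entry":{"id":"2"},"software":{"vendor":"Acme","name":"Other"}}]}'

# One array per result, rendered as a CSV row.
rows=$(printf '%s' "$data" | jq -r '
  .result[]
  | [ "https://vuldb.com/?id.\(.entry.id)",
      ( .software
        | .vendor // "empty",
          .name // "empty",
          ((.version[] | tostring)? // "no version") ) ]
  | @csv')

printf '%s\n' "$rows"
```

Note that a record with several versions gets one column per version, so rows can differ in length.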

jq - return array value if its length is not null

I have a report.json generated by a gitlab pipeline.
It looks like:
{"version":"14.0.4","vulnerabilities":[{"id":"64e69d1185ecc48a1943141dcb6dbd628548e725f7cef70d57403c412321aaa0","category":"secret_detection"....and so on
If no vulnerabilities are found, then "vulnerabilities":[]. I'm trying to come up with a bash script that checks whether the vulnerabilities array is empty or not. If it is not empty, it should print the value of the vulnerabilities key. Sadly, I'm very far from a scripting genius, so it's been a struggle.
While searching the web for a solution, I came across jq. It seems like select() should do the job.
I've tried:
jq "select(.vulnerabilities!= null)" report.json
but it returned {"version":"14.0.4","vulnerabilities":[{"id":"64e69d1185ecc48a194314... instead of expected "vulnerabilities":[{"id":"64e69d1185ecc48a194314...
and
map(select(.vulnerabilities != null)) report.json
returns "No matches found"
Would you mind pointing out what's wrong apart from my 0 experience with bash and JSON parsing? :)
Thanks in advance

Just use the .vulnerabilities filter to pick out the value of the vulnerabilities key.
Here are some cases:
$ jq '.vulnerabilities' <<END
{"version":"14.0.4","vulnerabilities":[{"id":"64e69d1185ecc48a1943141dcb6dbd628548e725f7cef70d57403c412321aaa0","category":"secret_detection"}]}
END
[
{
"id": "64e69d1185ecc48a1943141dcb6dbd628548e725f7cef70d57403c412321aaa0",
"category": "secret_detection"
}
]
If vulnerabilities is null, jq returns null:
$ jq '.vulnerabilities' <<END
{"version":"14.0.4","vulnerabilities":null}
END
null
Then, with the pipe |, you can transform the result into whatever output you want:
change null to []: .vulnerabilities | if . == null then [] else . end
filter out an empty array: .vulnerabilities | select(length > 0)
For further information about jq filters, you can read the jq manual.

Assuming that by "print the value of the vulnerabilities key" you mean the value of an item's id field, you can retrieve it using .id and have it extracted to bash with the -r option.
If the array is not empty and you want all of the "keys", iterate over the array using .[]. If you just want a specific key, say the first, address it using a 0-based index: .[0].
To check the length of an array there is a dedicated length builtin. However, as your final goal is to extract, you can also attempt the extraction right away, suppress a potential error using the ? operator, and have your bash script read an appropriate exit status using the -e option.
Your bash script could then include the following snippet:
if key=$(jq -re '.vulnerabilities[0].id?' report.json)
then
# If the array was not empty, $key contains the first key
echo "There is a vulnerability in key $key."
fi
# or
if keys=$(jq -re '.vulnerabilities[].id?' report.json)
then
# If the array was not empty, $keys contains all the keys
for k in $keys
do echo "There is a vulnerability in key $k."
done
fi

Firstly, please note that in the JSON world, it is important to distinguish
between [] (the empty array), the values 0 and null, and the absence of a value (e.g. as the result of the absence of a key in an object).
In the following, I'll assume that the output should be the value of .vulnerabilities
if it is not [], or nothing otherwise:
< sample.json jq '
select(.vulnerabilities != []).vulnerabilities
'
If the goal were to differentiate between two cases based on the return code from jq, you could use the -e command-line option.
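A minimal sketch of the -e idea, assuming jq is on the PATH. With -e, jq exits with status 1 when its last output is false or null, so a boolean test can drive a shell if directly. The helper name has_vulnerabilities is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when .vulnerabilities has a length greater than 0.
has_vulnerabilities() {
  # With -e, jq exits 1 when its last output is false or null.
  printf '%s' "$1" | jq -e '.vulnerabilities | length > 0' >/dev/null
}

report='{"version":"14.0.4","vulnerabilities":[]}'
if has_vulnerabilities "$report"
then printf '%s' "$report" | jq -c '.vulnerabilities'
else echo "no vulnerabilities"
fi
```

Because the length of null is 0, a null or missing vulnerabilities key is treated the same as an empty array here.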

You can use if-then-else.
Filter
if (.vulnerabilities | length) > 0 then {vulnerabilities} else empty end
Input
{
"version": "1.1.1",
"vulnerabilities": [
{
"id": "111",
"category": "secret_detection"
},
{
"id": "112",
"category": "secret_detection"
}
]
}
{
"version": "1.2.1",
"vulnerabilities": [
{
"id": "121",
"category": "secret_detection 2"
}
]
}
{
"version": "3.1.1",
"vulnerabilities": []
}
{
"version": "4.1.1",
"vulnerabilities": [
{
"id": "411",
"category": "secret_detection 4"
},
{
"id": "412",
"category": "secret_detection"
},
{
"id": "413",
"category": "secret_detection"
}
]
}
Output
{
"vulnerabilities": [
{
"id": "111",
"category": "secret_detection"
},
{
"id": "112",
"category": "secret_detection"
}
]
}
{
"vulnerabilities": [
{
"id": "121",
"category": "secret_detection 2"
}
]
}
{
"vulnerabilities": [
{
"id": "411",
"category": "secret_detection 4"
},
{
"id": "412",
"category": "secret_detection"
},
{
"id": "413",
"category": "secret_detection"
}
]
}
Demo
https://jqplay.org/s/wicmr4uVRm

Update deeply nested field with value from higher-level object in JQ

Given the following input JSON:
{
"version": 2,
"models": [
{
"name": "first_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": null
}
}
]
},
{
"name": "second_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": null
}
}
]
}
]
}
How would I, using jq, replace the null (i.e., the value of "compare_model") with the value from the "name" key? Note that the key-value pairs in question here are not at the same level in the hierarchy: the former is nested in an object in an array, and it is this array that is at the same level as the latter.
For example, the output file should read:
{
"version": 2,
"models": [
{
"name": "first_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": "first_table"
}
}
]
},
{
"name": "second_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": "second_table"
}
}
]
}
]
}
FWIW, this is an intermediate step in some YAML (via yq, the Python wrapper variety of jq as opposed to the go variant) wrangling I'm doing on DBT config files.
(Bonus points if you can wrap the replacement text with parentheses and/or prefix it without breaking out of jq. :D If not, no worries -- this step I can do with another program.)
Needless to say, but your help is very much appreciated!

The key to a simple solution is to use |=, e.g.
.models |=
map(.name as $name
| (.tests[]."dbt_utils.equal_rowcount".compare_model =
$name))
To wrap the replacement value in parentheses, just add them:
.models |=
map("(\(.name))" as $name
| (.tests[]."dbt_utils.equal_rowcount".compare_model =
$name))
If you want the replacement to be conditional on the existing value being null, you could perhaps (depending on the exact requirements) use //=.
Using //= and walk
Here's another take on the problem:
.models
|= map("(\(.name))" as $name
| walk(if type=="object" and has("compare_model")
then .compare_model //= $name
else . end))
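To illustrate the conditional variant without walk, here is a sketch with //= applied directly to the path. The input is hypothetical: the second model already carries a compare_model value ("custom_model"), which should be left alone. The final array is only there to make the result easy to inspect.

```shell
#!/bin/sh
# Hypothetical input: the second model already has a compare_model value.
data='{"models":[
  {"name":"first_table","tests":[{"dbt_utils.equal_rowcount":{"compare_model":null}}]},
  {"name":"second_table","tests":[{"dbt_utils.equal_rowcount":{"compare_model":"custom_model"}}]}]}'

# //= assigns only where the current value is null or false.
out=$(printf '%s' "$data" | jq -c '
  .models[] |= (.name as $name
    | .tests[]."dbt_utils.equal_rowcount".compare_model //= $name)
  | [.models[].tests[]."dbt_utils.equal_rowcount".compare_model]')

printf '%s\n' "$out"
```

Only the null value is replaced; the existing one survives.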

That the fields are not at the same level doesn't really matter here.
.models[] |= (.tests[]."dbt_utils.equal_rowcount".compare_model = "(\(.name))")

How to retrieve recursive path to a specific key (not displaying the parents' key name, but the value from a different key of each parent)

I have the following JSON
[
{
"name": "alpha"
},
{
"fields": [
{
"name": "beta_sub_1"
},
{
"name": "beta_sub_2"
}
],
"name": "beta"
},
{
"fields": [
{
"fields": [
{
"name": "gamma_sub_sub_1"
}
],
"name": "gamma_sub_1"
}
],
"name": "gamma"
}
]
and I would like to get the paths of "name" needed to get to each "name" values. Considering the above code, I would like the following result:
"alpha"
"beta.beta_sub_1"
"beta.beta_sub_2"
"beta"
"gamma.gamma_sub_1.gamma_sub_sub_1"
"gamma.gamma_sub_1"
"gamma"
I've been searching around but I couldn't get to this result. So far, I have this:
tostream as [$p,$v] | select($p[-1] == "name" and $v != null) | "\([$p[0,1]] | join(".")).\($v)"
but this gives me the path with the key names of the parents (and doesn't keep all the intermediate parents):
"0.name.alpha"
"1.fields.beta_sub_1"
"1.fields.beta_sub_2"
"1.name.beta"
"2.fields.gamma_sub_sub_1"
"2.fields.gamma_sub_1"
"2.name.gamma"
Any ideas?
P.S.: I've been searching for very detailed documentation on jq but couldn't find anything good enough. If anyone has any recommendations, I'd appreciate it.

The problem description does not seem to match the sample input and output, but the following jq program produces the required output:
def descend:
select( type == "object" and has("name") )
| if has("fields") then ([.name] + (.fields[] | descend)) else empty end,
[.name] ;
.[]
| descend
| join(".")
With your input, and using the -r command-line option, this produces:
alpha
beta.beta_sub_1
beta.beta_sub_2
beta
gamma.gamma_sub_1.gamma_sub_sub_1
gamma.gamma_sub_1
gamma
Resources
Apart from the jq manual, FAQ, and Cookbook, you might find the following helpful:
"jq Language Description"
"A Stream-Oriented Introduction to jq"

Using jq to list keys in a JSON object

I have a hierarchically deep JSON object created by a scientific instrument, so the file is somewhat large (1.3MB) and not readily readable by people. I would like to get a list of keys, up to a certain depth, for the JSON object. For example, given an input object like this
{
"acquisition_parameters": {
"laser": {
"wavelength": {
"value": 632,
"units": "nm"
}
},
"date": "02/03/2525",
"camera": {}
},
"software": {
"repo": "github.com/username/repo",
"commit": "a7642f",
"branch": "develop"
},
"data": [{},{},{}]
}
I would like an output like such.
{
"acquisition_parameters": [
"laser",
"date",
"camera"
],
"software": [
"repo",
"commit",
"branch"
]
}
This is mainly for the purpose of being able to enumerate what is in a JSON object. After processing the JSON objects from the instrument begin to diverge: for example, some may have a field like .frame.cross_section.stats.fwhm, while others may have .sample.species, so it would be convenient to be able to interrogate the JSON object on the command line.

The following should do exactly what you want
jq '[(keys - ["data"])[] as $key | { ($key): .[$key] | keys }] | add'
This will give the following output, using the input you described above:
{
"acquisition_parameters": [
"camera",
"date",
"laser"
],
"software": [
"branch",
"commit",
"repo"
]
}
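Note that keys sorts its output, which is why the order above differs from the order requested in the question. A possible variant, sketched below, uses keys_unsorted to keep the original order and selects top-level entries by type instead of excluding "data" by name. Both choices are assumptions about what is wanted rather than part of the answer above.

```shell
#!/bin/sh
# Sample input from the question, compacted.
data='{"acquisition_parameters":{"laser":{"wavelength":{"value":632,"units":"nm"}},
  "date":"02/03/2525","camera":{}},
  "software":{"repo":"github.com/username/repo","commit":"a7642f","branch":"develop"},
  "data":[{},{},{}]}'

# Keep only top-level entries whose value is an object,
# then replace each value with its keys in original order.
out=$(printf '%s' "$data" | jq -c '
  with_entries(select(.value | type == "object")
               | .value |= keys_unsorted)')

printf '%s\n' "$out"
```

The "data" array is skipped automatically because its value is not an object.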

Given your purpose you might have an easier time using the paths builtin to list all the paths in the input and then truncate at the desired depth:
$ echo '{"a":{"b":{"c":{"d":true}}}}' | jq -c '[paths|.[0:2]]|unique'
[["a"],["a","b"]]

Here is another variation using reduce and setpath, which assumes you have a specific set of top-level keys you want to examine:
. as $v
| reduce ("acquisition_parameters", "software") as $k (
{}; setpath([$k]; $v[$k] | keys)
)
