Intersecting nested JSON documents with jq

I'm looking for a general approach to intersect two nested JSON-objects, in order to retrieve all common key-value pairs.
Given the following JSON-documents
{
"a": 0,
"b": [
"ba",
{
"bb": {
"bba": "a",
"bbc": [1, 2]
}
},
{
"bc": {
"bca": [2],
"bcb": 5
}
}
],
"c": 1
}
and
{
"a": 1,
"b": [
"ba",
{
"bb": {
"bba": "a",
"bbc": [1,3]
}
}
],
"c": 1
}
I want to find their intersection.
I expect to get the following JSON document, which contains only those key-value pairs that are present in both inputs:
{
"b": [
"ba",
{
"bb": {
"bba": "a",
"bbc": [1]
}
}
],
"c": 1
}
I looked into jq's documentation, but could not find a hint on how to do this.
My attempts to use the minus operator were unsuccessful, yielding the error: object ... and object cannot be subtracted.
cat obj-list.txt | jq -c '.[1] - (.[0] - .[1])'
Can you give me some hints on how to compute an intersection of nested JSON objects with jq?
Thank you already in advance for helping me out.
Update:
The JSON objects may not have exactly the same structure; for example, a = {"a": [{"aa":1}]} and b = {"a": 0}. How can I add error checking to catch "Cannot index number with string" errors and the like?

Assuming $a holds the contents of one of your JSON entities and $b holds the other, then the following will perform the type of intersection you describe:
reduce ($a|paths) as $p (null;
  ($a|getpath($p)) as $va
  | [try ($b|getpath($p)) // empty] as $vb
  | if ($vb | length > 0) and ($va == $vb[0])
    then setpath($p;$va) else . end)
You may, however, wish to explore variations of this, e.g. to handle the case where there is nothing in common (the filter above then returns null).
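For readers who want to sanity-check the logic outside jq, here is a rough Python sketch of the same path-based idea: enumerate the paths of one document, look each path up in the other, and keep the value when both sides agree. All helper names (paths, getpath, setpath, same, intersect) are made up for this illustration. One deliberate difference is an assumption of the sketch: it uses a sentinel for "path missing", so common null and false values are kept, whereas the jq filter's `// empty` drops them.

```python
import json

_MISSING = object()  # sentinel: "path does not exist", distinct from JSON null


def paths(value, prefix=()):
    """Yield every non-empty path in value, parents before children."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return
    for key, child in items:
        yield prefix + (key,)
        yield from paths(child, prefix + (key,))


def getpath(value, path):
    """Follow path; return _MISSING instead of raising on a shape mismatch."""
    for key in path:
        if isinstance(value, dict) and isinstance(key, str) and key in value:
            value = value[key]
        elif isinstance(value, list) and isinstance(key, int) and key < len(value):
            value = value[key]
        else:
            return _MISSING
    return value


def setpath(root, path, value):
    """Return a copy of root with value stored at path, creating containers as needed."""
    if not path:
        return value
    key, rest = path[0], path[1:]
    if isinstance(key, int):
        container = list(root) if isinstance(root, list) else []
        container.extend([None] * (key + 1 - len(container)))  # pad with nulls
        child = container[key]
    else:
        container = dict(root) if isinstance(root, dict) else {}
        child = container.get(key)
    container[key] = setpath(child, rest, value)
    return container


def same(x, y):
    """Strict JSON equality (plain == would treat True and 1 as equal)."""
    return json.dumps(x, sort_keys=True) == json.dumps(y, sort_keys=True)


def intersect(a, b):
    result = None
    for p in paths(a):
        va, vb = getpath(a, p), getpath(b, p)
        if vb is not _MISSING and same(va, vb):
            result = setpath(result, p, va)
    return result


a = {"a": 0, "b": ["ba", {"bb": {"bba": "a", "bbc": [1, 2]}},
                   {"bc": {"bca": [2], "bcb": 5}}], "c": 1}
b = {"a": 1, "b": ["ba", {"bb": {"bba": "a", "bbc": [1, 3]}}], "c": 1}
print(json.dumps(intersect(a, b)))
```

Because getpath never raises, the mismatched-structure case from the question's update (a list on one side, a number on the other) simply contributes nothing to the result.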
Footnote 1:
The above is easily adapted to similar problems. For example, if the common structure is all that matters:
def structural_intersection($a;$b):
  reduce ($a|paths|select(.[-1]|type=="string")) as $p (null;
    ($a|getpath($p)) as $va
    | [try ($b|getpath($p)) // empty] as $vb
    | if ($vb | length > 0)
      then setpath($p;null) else . end) ;
Footnote 2:
To handle arrays using array-intersection, you might wish to consider the following, but please be aware that in some cases, this hybrid approach will probably produce results that you might not expect:
def special_intersection($a;$b):
  def i(x;y): x - (x-y);
  reduce ($a|paths) as $p (null;
    ($a|getpath($p)) as $va
    | [try ($b|getpath($p)) // empty] as $vb
    | if ($vb | length > 0)
      then if ($va == $vb[0])
           then setpath($p;$va)
           elif ($va|type == "array") and ($vb[0]|type) == "array"
           then setpath($p; i($va; $vb[0]))
           else . end
      else . end) ;
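The helper i(x;y) above relies on jq's array subtraction, which removes every occurrence of the right-hand elements from the left-hand array. A small Python sketch of that double subtraction may make the behaviour easier to see; the function names are invented for this illustration, and Python's `in` compares with ==, so True and 1 would count as equal here, unlike in jq.

```python
def subtract(x, y):
    """Mimic jq's array subtraction: drop every element of x that occurs in y."""
    return [e for e in x if e not in y]


def array_intersection(x, y):
    """x - (x - y): the elements of x that also occur in y, in x's order."""
    return subtract(x, subtract(x, y))


print(array_intersection([1, 2], [1, 3]))
print(array_intersection([1, 1, 2], [1]))
```

Note that duplicates in the first array survive, which is one of the ways this hybrid approach may produce results you do not expect.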

Related

Count number of objects whose attribute are "null" or contain "null"

I have the following JSON. I'd like to count how many objects have a type attribute that is either "null" or an array containing the value "null". In the following example, the answer would be two. Note that the JSON could also be deeply nested.
{
"A": {
"type": "string"
},
"B": {
"type": "null"
},
"C": {
"type": [
"null",
"string"
]
}
}
I came up with the following, but obviously this doesn't work since it misses the arrays. Any hints how to solve this?
jq '[..|select(.type?=="null")] | length'
This answer focuses on efficiency, straightforwardness, and generality.
In brief, the following jq program produces 2 for the given example.
def count(s): reduce s as $x (0; .+1);
def hasValue($value):
has("type") and
(.type | . == $value or (type == "array" and any(. == $value)));
count(.. | objects | select(hasValue("null")))
Notice that using this approach, it would be easy to count the number of objects having null or "null":
count(.. | objects | select(hasValue("null") or hasValue(null)))
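To cross-check the count outside jq, the same walk can be sketched in Python: visit every value recursively, keep only the objects, and test their "type" key. The names walk_values, has_value and count_matching are hypothetical, chosen only to mirror the jq definitions above.

```python
def walk_values(value):
    """Yield value and all of its descendants, similar to jq's `..`."""
    yield value
    if isinstance(value, dict):
        for child in value.values():
            yield from walk_values(child)
    elif isinstance(value, list):
        for child in value:
            yield from walk_values(child)


def has_value(obj, wanted):
    """True if obj has a "type" key equal to wanted, or a list containing it."""
    if "type" not in obj:
        return False
    t = obj["type"]
    return t == wanted or (isinstance(t, list) and wanted in t)


def count_matching(doc, wanted):
    # Count on the fly instead of building an intermediate list.
    return sum(1 for v in walk_values(doc)
               if isinstance(v, dict) and has_value(v, wanted))


doc = {
    "A": {"type": "string"},
    "B": {"type": "null"},
    "C": {"type": ["null", "string"]},
}
print(count_matching(doc, "null"))
```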
You were almost there. For arrays you could use IN. I also used objects, strings and arrays, which are shorthand for a select on the corresponding types.
jq '[.. | objects.type | select(strings == "null", IN(arrays[]; "null"))] | length'
2
On larger structures you could also improve performance by not creating that array of which you would only calculate the length, but by instead just iterating over the matching items (e.g. using reduce) and counting on the go.
jq 'reduce (.. | objects.type | select(strings == "null", IN(arrays[]; "null"))) as $_ (0; .+1)'
2

JQ: How to create JSON object using data picked by jq?

I have a complex JSON file that contains hundreds of "attributes" with their types identified by "objectTypeAttributeId".
I know that objectTypeAttributeId=328 means ticketid, objectTypeAttributeId=329 contains an array of hostnames, etc.
Here is a simplified version of the file:
{
"objectEntries": [
{
"attributes": [
{
"id": 279792,
"objectTypeAttributeId": 328,
"objectAttributeValues": [
{
"displayValue": "ITSM-24210"
}
]
},
{
"id": 279795,
"objectTypeAttributeId": 329,
"objectAttributeValues": [
{
"displayValue": "testhost1"
},
{
"displayValue": "testhost2"
}
]
},
{
"id": 279793,
"objectTypeAttributeId": 330,
"objectAttributeValues": [
{
"displayValue": "28.02.2020 11:45"
}
]
}
]
}
]
}
I need to create output JSON from particular values of the input JSON (picked according to their "objectTypeAttributeId" value), in a format like this:
{
"tickets": [
{
"ticketid": "ITSM-24210",
"hostnames": ["testhost1", "testhost2"],
"date": "28.02.2020 11:45"
}
]
}
I am new to jq; in XSLT this could be solved using a static template with placeholders for the picked values.
I have tried this approach; here is my jq filter:
.objectEntries[].attributes[] |
{ticketid: select(.objectTypeAttributeId == 328) | .objectAttributeValues[0].displayValue},
{hostnames: select(.objectTypeAttributeId == 329) | [.objectAttributeValues[].displayValue]},
{date: select(.objectTypeAttributeId == 330) | .objectAttributeValues[0].displayValue}
but the result of this approach is:
{
"ticketid": "ITSM-24210"
}
{
"hostnames": [
"testhost1",
"testhost2"
]
}
{
"date": "28.02.2020 11:45"
}
All my subsequent attempts to format the output better ended in a broken jq filter or a filter that does not return anything.
Any ideas how to solve this problem?
Assuming a ticket is to be generated for each object entry:
{tickets: [
  .objectEntries[]
  | [.attributes[]
     | [.objectTypeAttributeId,
        (.objectAttributeValues | map(.displayValue))] as [$id, $val]
     | if $id == 328 then {ticketid: $val[0]}
       elif $id == 329 then {hostnames: $val}
       elif $id == 330 then {date: $val[0]}
       else empty end
    ] | add
]}
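The shape of this transformation (one small object per recognised attribute, merged into one ticket per entry) can also be sketched in Python for comparison. The FIELDS table and the build_tickets name are assumptions made for this illustration; only the three ids named in the question are mapped.

```python
# Hypothetical lookup table: attribute id -> (output key, keep all values?)
FIELDS = {
    328: ("ticketid", False),
    329: ("hostnames", True),
    330: ("date", False),
}


def build_tickets(doc):
    tickets = []
    for entry in doc["objectEntries"]:
        ticket = {}
        for attr in entry["attributes"]:
            spec = FIELDS.get(attr["objectTypeAttributeId"])
            if spec is None:
                continue  # ignore attributes we have no mapping for
            name, keep_all = spec
            values = [v["displayValue"] for v in attr["objectAttributeValues"]]
            if keep_all:
                ticket[name] = values
            else:
                ticket[name] = values[0] if values else None
        tickets.append(ticket)
    return {"tickets": tickets}


doc = {"objectEntries": [{"attributes": [
    {"id": 279792, "objectTypeAttributeId": 328,
     "objectAttributeValues": [{"displayValue": "ITSM-24210"}]},
    {"id": 279795, "objectTypeAttributeId": 329,
     "objectAttributeValues": [{"displayValue": "testhost1"},
                               {"displayValue": "testhost2"}]},
    {"id": 279793, "objectTypeAttributeId": 330,
     "objectAttributeValues": [{"displayValue": "28.02.2020 11:45"}]},
]}]}
print(build_tickets(doc))
```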
Here we go, it's not pretty, there may be a better solution but it works: https://jqplay.org/s/sxussfa2Vj
.objectEntries | {tickets: map(.attributes |
  {ticketid: (reduce .[] as $r (null; if $r.objectTypeAttributeId == 328
      then $r.objectAttributeValues[0].displayValue else . end)),
   date: (reduce .[] as $r (null; if $r.objectTypeAttributeId == 330
      then $r.objectAttributeValues[0].displayValue else . end)),
   hostnames: (reduce .[] as $r ([]; if $r.objectTypeAttributeId == 329
      then $r.objectAttributeValues | map(.displayValue) else . end))})}
There's a lot of unpacking and repacking going on here that distracts from the core idea. You have an array of tickets (aka entries), and we map over those. The properties have to be grabbed from different elements of the attributes array, which is done using reduce: it goes through the array of objects, picks out the matching one, and keeps track of its value.
Maybe there's a nicer way, but this already works, so you can play with it further and try to simplify it.
Your original solution almost works, you did a good job there, just needed a map:
.objectEntries[].attributes |
{ticketid: . | map(select(.objectTypeAttributeId == 328))[0] |
.objectAttributeValues[0].displayValue,
date: . | map(select(.objectTypeAttributeId == 330))[0] |
.objectAttributeValues[0].displayValue,
hostnames: . | map(select(.objectTypeAttributeId == 329))[0] |
[.objectAttributeValues[].displayValue]}
Try it out, it even works with multiple tickets ;)
https://jqplay.org/s/ydoCgv9vsI

jq get number of jsons in an array containing a specific value

I've got an array of multiple JSON objects. I would like to get the number of objects which contain a specific value.
Example:
[
{
"key": "value1",
"2ndKey":"2ndValue1"
},
{
"key": "value2",
"2ndKey":"2ndValue2"
},
{
"key": "value1",
"2ndKey":"2ndValue3"
}
]
So in case I'm looking for value1 in key, the result should be 2.
I would like a solution using jq. I have already made some attempts, but they did not fully work. The best one so far is the following:
cat /tmp/tmp.txt | jq ' select(.[].key == "value1" ) | length '
I get the correct result, but it is shown multiple times.
Can anybody help me improve my code? Thanks in advance!
You are pretty close. Try this
map(select(.key == "value1")) | length
or the equivalent
[ .[] | select(.key == "value1") ] | length
An efficient and convenient way to count is to use 'count' as defined below:
def count(s; cond): reduce s as $x (0; if ($x|cond) then .+1 else . end);
count(.[]; .key == "value1")
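The point of count/2 is that it counts while streaming, without first materialising the filtered array. The same idea in Python, as a minimal sketch with a hypothetical count helper:

```python
def count(items, cond):
    """Count the items satisfying cond without building an intermediate list."""
    return sum(1 for item in items if cond(item))


data = [
    {"key": "value1", "2ndKey": "2ndValue1"},
    {"key": "value2", "2ndKey": "2ndValue2"},
    {"key": "value1", "2ndKey": "2ndValue3"},
]
print(count(data, lambda item: item.get("key") == "value1"))
```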

Remove all null values

I am trying to remove null values from a json object using jq. I found this issue on their github and so now I'm trying to remove them with del.
I have this:
'{ id: $customerId, name, phones: ([{ original: .phone },
{ original: .otherPhone}]), email} | del(. | nulls)'
This doesn't seem to do anything. However if I replace nulls with .phones it does remove the phone numbers.
The following illustrates how to remove all the null-valued keys from a JSON object:
jq -n '{"a":1, "b": null, "c": null} | with_entries( select( .value != null ) )'
{
"a": 1
}
Alternatively, paths/0 can be used as follows:
. as $o | [paths[] | {(.) : ($o[.])} ] | add
By the way, del/1 can also be used to achieve the same result, e.g. using this filter:
reduce keys[] as $k (.; if .[$k] == null then del(.[$k]) else . end)
Or less obviously, but more succinctly:
del( .[ (keys - [paths[]])[] ] )
And for the record, here are two ways to use delpaths/1:
jq -n '{"a":1, "b": null, "c": null, "d":2} as $o
| $o
| delpaths( [ keys[] | select( $o[.] == null ) ] | map( [.]) )'
$ jq -n '{"a":1, "b": null, "c": null, "d":2}
| [delpaths((keys - paths) | map([.])) ] | add'
In both these last two cases, the output is the same:
{
"a": 1,
"d": 2
}
For reference, if you wanted to remove null-valued keys from all JSON objects in a JSON text (i.e., recursively), you could use walk/1, or:
del(.. | objects | (to_entries[] | select(.value==null) | .key) as $k | .[$k])
This answer by Michael Homer on https://unix.stackexchange.com has a super concise solution which works since jq 1.6:
del(..|nulls)
It deletes all null-valued properties (and values) from your JSON. Simple and sweet :)
nulls is a builtin filter and can be replaced by custom selects:
del(..|select(. == "value to delete"))
To remove elements based on multiple conditions, e.g. remove all bools and all numbers:
del(..|booleans,numbers)
or, to only delete nodes not matching a condition:
del(..|select(. == "value to keep" | not))
(The last example is only illustrative – of course you could swap == for !=, but sometimes this is not possible. e.g. to keep all truthy values: del(..|select(.|not)))
All the other answers to date here are workarounds for old versions of jq, and it isn't clear how to do this simply in the latest released version. In jq 1.6 or newer this will do the job to remove nulls recursively:
$ jq 'walk( if type == "object" then with_entries(select(.value != null)) else . end)' input.json
Sourced from this comment on the issue where adding the walk() function was discussed upstream.
[WARNING: the definition of walk/1 given in this response is problematic, not least for the reason given in the first comment; note also that jq 1.6 defines walk/1 differently.]
I am adding this new answer to highlight an extended version of the script by @jeff-mercado. My version of the script assumes that the empty values are as follows:
null;
[] - empty arrays;
{} - empty objects.
Removing of empty arrays and objects was borrowed from here https://stackoverflow.com/a/26196653/3627676.
def walk(f):
. as $in |
if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ( $in[$key] | walk(f) ) } ) | f
elif type == "array" then
select(length > 0) | map( walk(f) ) | f
else
f
end;
walk(
if type == "object" then
with_entries(select( .value != null and .value != {} and .value != [] ))
elif type == "array" then
map(select( . != null and . != {} and .!= [] ))
else
.
end
)
That's not what del/1 was meant to be used for. Given an object as input, if you wanted to remove the .phones property, you'd do:
del(.phones)
In other words, the parameter to del is the path to the property you wish to remove.
If you wanted to use del for this, you would have to figure out all the paths to null values and pass them in. That would be more of a hassle, though.
Streaming the input in could make this task even simpler.
fromstream(tostream | select(length == 1 or .[1] != null))
Otherwise for a more straightforward approach, you'll have to walk through the object tree to find null values. If found, filter it out. Using walk/1, your filter could be applied recursively to exclude the null values.
walk(
(objects | with_entries(select(.value != null)))
// (arrays | map(select(. != null)))
// values
)
So if you had this for input:
{
"foo": null,
"bar": "bar",
"biz": [1,2,3,4,null],
"baz": {
"a": 1,
"b": null,
"c": ["a","b","c","null",32,null]
}
}
This filter would yield:
{
"bar": "bar",
"baz": {
"a": 1,
"c": ["a","b","c","null",32]
},
"biz": [1,2,3,4]
}
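As a cross-check of the expected result, the same recursive rule (drop null-valued keys from objects and null elements from arrays, leave everything else alone) can be written as a short Python sketch. The strip_nulls name is made up for this illustration; note that the string "null" is kept, just as in the jq output above.

```python
def strip_nulls(value):
    """Recursively drop None-valued keys from dicts and None items from lists."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value if v is not None]
    return value


doc = {
    "foo": None,
    "bar": "bar",
    "biz": [1, 2, 3, 4, None],
    "baz": {"a": 1, "b": None, "c": ["a", "b", "c", "null", 32, None]},
}
print(strip_nulls(doc))
```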
Elsewhere on this page, some interest has been expressed in
using jq to recursively eliminate occurrences of [] and {} as well as null.
Although it is possible to use the built-in definition of walk/1 to do
this, it is a bit tricky to do so correctly. Here therefore is a variant version
of walk/1 which makes it trivial to do so:
def traverse(f):
if type == "object" then map_values(traverse(f)) | f
elif type == "array" then map( traverse(f) ) | f
else f
end;
In order to make it easy to modify the criterion for removing elements,
we define:
def isempty: .==null or ((type|(.=="array" or .=="object")) and length==0);
The solution is now simply:
traverse(select(isempty|not))
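The key property of traverse is that children are processed before the emptiness test is applied to them, so a container that becomes empty after pruning is removed as well. A Python sketch of that bottom-up order, with hypothetical names is_empty and prune, might look like this. One difference to be aware of: for an input that is itself empty, the jq filter emits nothing, whereas this sketch returns the empty value.

```python
def is_empty(value):
    """None, [] and {} count as empty; 0, False and "" do not."""
    return value is None or (isinstance(value, (dict, list)) and len(value) == 0)


def prune(value):
    """Prune children first, then drop the ones that ended up empty."""
    if isinstance(value, dict):
        pruned = {k: prune(v) for k, v in value.items()}
        return {k: v for k, v in pruned.items() if not is_empty(v)}
    if isinstance(value, list):
        pruned = [prune(v) for v in value]
        return [v for v in pruned if not is_empty(v)]
    return value


doc = {
    "n": 0,
    "s": "",
    "x": {"y": [None]},
    "objectA": {"notNullB": "notNullB", "nullB": None, "emptyObjectB": {}},
    "emptyArrayA": [],
}
print(prune(doc))
```

Unlike a test based on length > 0, this keeps 0 and the empty string, which is the problem the warning further down this page alludes to.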
With newer versions of jq (1.6 and later)
You can use this expression to remove null-valued keys recursively:
jq 'walk( if type == "object" then with_entries(select(.value != null)) else . end)'
[WARNING: the response below has several problems, not least those arising from the fact that 0|length is 0.]
Elaborating on an earlier answer, in addition to removing properties with null values, it's often helpful to remove JSON properties with empty array or object values (i.e., [] or {}), too.
jq's walk() function (walk/1) makes this easy. walk() will be available in a future version of jq (> jq-1.5), but its definition can be added to current filters.
The condition to pass to walk() for removing nulls and empty structures is:
walk(
if type == "object" then with_entries(select(.value|length > 0))
elif type == "array" then map(select(length > 0))
else .
end
)
Given this JSON input:
{
"notNullA": "notNullA",
"nullA": null,
"objectA": {
"notNullB": "notNullB",
"nullB": null,
"objectB": {
"notNullC": "notNullC",
"nullC": null
},
"emptyObjectB": {},
"arrayB": [
"b"
],
"emptyBrrayB": []
},
"emptyObjectA": {},
"arrayA": [
"a"
],
"emptyArrayA": []
}
Using this function gives the result:
{
"notNullA": "notNullA",
"objectA": {
"notNullB": "notNullB",
"objectB": {
"notNullC": "notNullC"
},
"arrayB": [
"b"
]
},
"arrayA": [
"a"
]
}

Parse json and extract items in array upon condition of nested values

I've already asked a similar question here - Parse json and choose items\keys upon condition, but this time it's slightly different.
This is the Example:
[
{
"item1": "value123",
"array0": [
{
"item2": "aaa"
}
]
},
{
"item1": "value456",
"array1": [
{
"item2": "bbb"
}
]
},
{
"item1": "value789",
"array2": [
{
"item2": "ccc"
}
]
}
]
I'd like to get the value of "item1", only when "item2" has a specific value.
Let's say if item2 equals "bbb", then all I want to get back is "value456".
I've tried to solve it with jq like it worked for me in the issue mentioned above, but to no avail, as I can't extract values from a "higher" level than the one I'm searching in with jq's select and map.
An easier solution is available thanks to the magic powers of the recurse operator, ..:
jq -r '.[] | select(.. | .item2? == "bbb").item1'
Basically what this does is, for each (.[]) object in the original array, pick (select) only those in which any of the keys recursively (..) named .item2 equals "bbb", and then select the .item1 property of said object.
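For comparison, here is a rough Python sketch of the same "search anywhere below, report from the top" idea. The helper names contains_pair and lookup are invented for this illustration. Unlike the jq filter, which emits item1 once per match, this sketch reports each top-level object at most once.

```python
def contains_pair(value, key, wanted):
    """True if any object at or below value has value[key] == wanted."""
    if isinstance(value, dict):
        if key in value and value[key] == wanted:
            return True
        return any(contains_pair(child, key, wanted) for child in value.values())
    if isinstance(value, list):
        return any(contains_pair(child, key, wanted) for child in value)
    return False


def lookup(items, key, wanted):
    """Return item1 of every top-level object containing the key/value pair."""
    return [item["item1"] for item in items if contains_pair(item, key, wanted)]


data = [
    {"item1": "value123", "array0": [{"item2": "aaa"}]},
    {"item1": "value456", "array1": [{"item2": "bbb"}]},
    {"item1": "value789", "array2": [{"item2": "ccc"}]},
]
print(lookup(data, "item2", "bbb"))
```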
I found a solution using jq's manual (see #ifthenelse and #Objects) - http://stedolan.github.io/jq/manual
jq -r '.[] | {name1: .item1, name2: (.[] | arrays | .[].item2)} |
if .name2 == "bbb" then .name1 else empty end'
If you find the magic of .. too powerful for your purposes, the following may be useful. It restricts the search for the "item2" key.
def some(condition): reduce .[] as $x
(false; if . then . else $x|condition end);
.[]
| to_entries
| select( some( .value | (type == "array") and some(.item2 == "bbb")) )
| from_entries
| .item1
If your jq has any/1, then use it instead of some/1.
Here is a solution which uses tostream and getpath
foreach (tostream|select(length==2)) as [$p,$v] (.;.;
if $p[-1] == "item2" and $v == "bbb" then getpath($p[:-3]).item1 else empty end
)
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
(tostream | select(length==2)) as [$p,$v]
| if $p[-1] == "item2" and $v == "bbb" then getpath($p[:-3]).item1 else empty end