Filtering deeply within tree - json

I'm trying to prune nodes deeply within a JSON structure and I'm puzzled why empty behaves seemingly different from a normal value here.
Input
[
{
"name": "foo",
"children": [{
"name": "foo.0",
"color": "red"
}]
},
{
"name": "bar",
"children": [{
"name": "bar.0",
"color": "green"
},
{
"name": "bar.1"
}]
},
{
"name": "baz",
"children": [{
"name": "baz.0"
},
{
"name": "baz.1"
}]
}
]
Program
jq '(.[].children|.[])|=if has("color") then . else empty end' foo.json
Actual output
[
{
"name": "foo",
"children": [
{
"name": "foo.0",
"color": "red"
}
]
},
{
"name": "bar",
"children": [
{
"name": "bar.0",
"color": "green"
}
]
},
{
"name": "baz",
"children": [
{
"name": "baz.1"
}
]
}
]
Expected output
The output I get, except without the baz.1 child, as that one doesn't have a color.
Question
Apart from the right solution, I'm also curious why replacing empty in the script by a regular value like 42 would replace the children without colors with 42 as expected, but when replacing with empty, it looks like the else branch doesn't get executed?

.[].children |= map(select(.color))
Will remove children that does not has an color so the output becomes:
[
{
"name": "foo",
"children": [
{
"name": "foo.0",
"color": "red"
}
]
},
{
"name": "bar",
"children": [
{
"name": "bar.0",
"color": "green"
}
]
},
{
"name": "baz",
"children": []
}
]
Online demo
Regarding why your filter does not seem to like empty;
This git issue seems to be the cause, multiple elements with empty will fail.

There must be a bug with assigning empty to multiple paths.
In this case you can use del instead:
del(.[].children[] | select(has("color") | not))
Online demo

Related

jsonschema Required properties inoperative with $ref

I am going to write json schema to verify tree data.
Schema consisting of top root and block below.
There may be another block below the block.
Schema for validation.
schema = {
"$schema": "http://json-schema.org/draft-04/schema",
"$ref": "#/definitions/root",
"definitions":{
"root": {
"properties": {
"name": {
"type": "string"
},
"children": {
"type": "array",
"items": [
{"$ref":"#/definitions/block"}
]
}
},
"required": ["name", "children"]
},
"block": {
"properties": {
"name": {
"type": "string"
},
"children": {
"type": "array",
"items": [
{"$ref":"#/definitions/block"}
]
}
},
"required": ["name"]
}
}
}
Below is incorrect data for testing. The last name properties do not exist.
{
"name": "group8",
"children": [
{
"name": "group7",
"children": [
{
"name": "group6",
"children": [
{
"name": "group5",
"children": [
{ ###### wrong
"children": []
}
]
}
]
}
]
}
]
}
This data validates well, but it doesn't work on a slightly complex tree.
# Error: ValidationError: file /home/gulliver/.local/lib/python2.7/site-packages/jsonschema/validators.py line 934: 'name' is a required property #
{
"name": "group8",
"children": [
{
"name": "group7",
"children": [
{
"name": "group6",
"children": [
{
"name": "group12",
"children": [
{
"name": "group11",
"children": [
{
"name": "group10",
"children": []
}
]
}
]
},
{
"name": "group9",
"children": [
{
"name": "group5",
"children": [
{ ####### wrong
"children": []
}
]
}
]
}
]
}
]
},
{
"name": "group13",
"children": [
{
"name": "null1",
"children": []
}
]
}
]
}
It does not work when the data at the bottom of the tree is invalid.
My guess is that the branch splits and this happens, does anyone know why or how to fix it?
I tested using python and jsonschema.
When items is an array, it applies the subschema values to the same index location in the array in the instance.
For example, where you define...
"items": [
{"$ref":"#/definitions/block"}
]
only the first item in the array will be tested. It has nothing to do with deep nesting. For example, the follwing data is valid according to your schema...
{
"name": "group8",
"children": [
{
"name": "group7"
},
{
"something": "else",
"Not": "name"
}
]
}
(Live demo: https://jsonschema.dev/s/etFGE)
If you modify your use of items, then it will work like you expect:
"items": {"$ref":"#/definitions/block"}
(do this for both uses)
Live demo: https://jsonschema.dev/s/rk1OD

how to add values to JSON array without full path?

I have a problem with 'jq' I could not solve after searching for a few hours. Let's take this simple JSON as an example with "Category4" nested within "Category2":
{
"categories": [
{
"Id": 1,
"Name": "Category1",
"Children": []
},
{
"Id": 2,
"Name": "Category2",
"Children": [
{
"Id": 4,
"Name": "Category4",
"Children": []
}
]
},
{
"Id": 3,
"Name": "Category3",
"Children": []
}
]
}
I would like to add a "Category5" child within the "Category4" object such as:
{
"categories": [
{
"Id": 1,
"Name": "Category1",
"Children": []
},
{
"Id": 2,
"Name": "Category2",
"Children": [
{
"Id": 4,
"Name": "Category4",
"Children": [
{
"Id": 5,
"Name": "Category5",
"Children": []
}
]
}
]
},
{
"Id": 3,
"Name": "Category3",
"Children": []
}
]
}
I can do it by using the full path of the "Category4" object with:
jq --argjson a '[{"Id":5,"Name":"Category5","Children":[]}]' '.categories[1].Children[0].Children += $a' "myfile.json"
But how can I achieve the same result if I don't know the position of "Category4" (which could be at root level or nested deep inside other objects)? This command was my best guess:
jq --argjson a '[{"Id":5,"Name":"Category5","Children":[]}]' '.. | select(.Id?==4) | .Children += $a' "myfile"
but it only retrieves "Category4" and "Category5" objects (Category1, 2 and 3 are missing from the output). I feel like I am missing something stupid...
Thanks for any help!
Use walk builtin for applying filters to values at arbitrary depths without changing the overall structure.
walk(select(.Id? == 4) .Children += $a)
demo at jqplay.org

Using jq to convert object to key with values

I have been playing around with jq to format a json file but I am having some issues trying to solve a particular transformation. Given a test.json file in this format:
[
{
"name": "A", // This would be the first key
"number": 1,
"type": "apple",
"city": "NYC" // This would be the second key
},
{
"name": "A",
"number": "5",
"type": "apple",
"city": "LA"
},
{
"name": "A",
"number": 2,
"type": "apple",
"city": "NYC"
},
{
"name": "B",
"number": 3,
"type": "apple",
"city": "NYC"
}
]
I was wondering, how can I format it this way using jq?
[
{
"key": "A",
"values": [
{
"key": "NYC",
"values": [
{
"number": 1,
"type": "a"
},
{
"number": 2,
"type": "b"
}
]
},
{
"key": "LA",
"values": [
{
"number": 5,
"type": "b"
}
]
}
]
},
{
"key": "B",
"values": [
{
"key": "NYC",
"values": [
{
"number": 3,
"type": "apple"
}
]
}
]
}
]
I have followed this thread Using jq, convert array of name/value pairs to object with named keys and tried to group the json using this expression
jq '. | group_by(.name) | group_by(.city) ' ./test.json
but I have not been able to add the keys in the output.
You'll want to group the items at the different levels and building out your result objects as you want.
group_by(.name) | map({
key: .[0].name,
values: (group_by(.city) | map({
key: .[0].city,
values: map({number,type})
}))
})
Just keep in mind that group_by/1 yields groups in a sorted order. You'll probably want an implementation that preserves that order.
def group_by_unsorted(key_selector):
reduce .[] as $i ({};
.["\($i|key_selector)"] += [$i]
)|[.[]];

Filter only part of input using select

Given input like this:
{
"type": "collection",
"foo": "bar",
"children": [
{
"properties": {
"country": "GB"
},
"data": "..."
},
{
"properties": {
"country": "PL"
},
"data": "..."
}
]
}
How can I use jq to retain all of the JSON structure, but filter out some of the children using select(). For instance, If I wanted to return only children with country GB, I would expect the following output:
{
"type": "collection",
"foo": "bar",
"children": [
{
"properties": {
"country": "GB"
},
"data": "..."
}
]
}
If I only want the children, this is easy with .children[] | select(.properties.country == "GB"), but does not retain the rest of the JSON.
The key is to use |=. In the present case, you could use the following pattern:
.children |= map(select(...))

Traversing through nested json in scala play

I'm using scala play and am attempting to traverse a json tree in order to validate that specific name values have specific children with specific name values. I have the following Json in the form of a JsObject:
{ "name": "user", "children": [ { "name": "$a", "children": [ { "name": "foo", "children": [ ] }, { "name": "fooBar", "children": [ { "name": "$a", "children": [ { "name": "subFoobar1", "children": [ ] }, { "name": "subFoobar2", "children": [ { "name": "TEST", "children": [ ] } ] }, { "name": "subFoobar3", "children": [ ] } ] } ] }, { "name": "bar", "children": [ { "name": "$a", "children": [ ] }, { "name": "$c", "children": [ ] }, { "name": "$b", "children": [ ] } ] }, { "name": "barFoo", "children": [ ] } ] } ] }
Ideally I would use nested for loops to traverse but the JsObject structure is preventing me from accessing the underlying values when attempting traverse. I have also attempted mapping the JsObject to a map of type [Map[String,Map[String,Any]]] but I am getting invalid cast compiler errors.
Any tips on how I can traverse and validate the name value at each level would be appreciated. I would preferably like to use the play json library
Issue was in the case class I was attempting to use. I wasn't accounting for the recursive nature of my Json structure
case class ActorTree(name : String, children:Seq[ActorTree] )