Merge Json Array Nodes and roll up a child element - json

I have a requirement to roll up a collection of nodes: for each array in the collection, take the type value of each child node and collect those values into a string array, keeping the array's own key as the key.
Given:
{
"client": {
"addresses": [
{
"id": "27ef465ef60d2705",
"type": "RegisteredOfficeAddress"
},
{
"id": "b7affb035be3f984",
"type": "PlaceOfBusiness"
},
{
"id": "a8a3bef166141206",
"type": "EmailAddress"
}
],
"links": [
{
"id": "29a9de859e70799e",
"type": "Director",
"name": "Bob the Builder"
},
{
"id": "22493ad4c4fd8ac5",
"type": "Secretary",
"name": "Jennifer"
}
],
"Names": [
{
"id": "53977967eadfffcd",
"type": "EntityName",
"name": "Banjo"
}
]
}
}
From this, the output needs to be:
{
"client": {
"addresses": [
"RegisteredOfficeAddress",
"PlaceOfBusiness",
"EmailAddress"
],
"links": [
"Director",
"Secretary"
],
"Names": [
"EntityName"
]
}
}
What is the best way to achieve this? Any pointers to what/how to do this would be greatly appreciated.
Ron.

You can iterate over the entries of your client object with the $each function, pick the type values for each entry, and combine the results via $merge. The trailing [] keeps a single-element result (such as Names) as an array:
{
"client": client
~> $each(function($list, $key) {{ $key: $list.type[] }})
~> $merge
}
Live playground: https://stedi.link/OpuRdE9
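For comparison, the same iterate, pick, and combine steps can be sketched in plain Python. This is only an illustration of the transformation, not part of the JSONata answer; the function name roll_up_types and its field parameter are made up for this sketch.

```python
def roll_up_types(doc, field="type"):
    """Replace each array under "client" with the list of its items' `field` values."""
    return {
        "client": {
            key: [item[field] for item in items]
            for key, items in doc["client"].items()
        }
    }


# Input copied from the question.
doc = {
    "client": {
        "addresses": [
            {"id": "27ef465ef60d2705", "type": "RegisteredOfficeAddress"},
            {"id": "b7affb035be3f984", "type": "PlaceOfBusiness"},
            {"id": "a8a3bef166141206", "type": "EmailAddress"},
        ],
        "links": [
            {"id": "29a9de859e70799e", "type": "Director", "name": "Bob the Builder"},
            {"id": "22493ad4c4fd8ac5", "type": "Secretary", "name": "Jennifer"},
        ],
        "Names": [
            {"id": "53977967eadfffcd", "type": "EntityName", "name": "Banjo"},
        ],
    }
}

result = roll_up_types(doc)
print(result)
```

The dictionary comprehension keeps each parent key and always produces a list, so a single-element array such as Names stays an array.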

Related

Use jq to output a flat array of JSON objects nested anywhere within source document

I'd like to select/identity-output all objects in arrays under "emp" keys into a flat array of those objects.
[
{
"eng": {
"dev": {
"dir": {
"name": "Mickey"
},
"emp": [
{
"name": "Goofy",
"job": "laugh",
"start": "today"
},
{
"name": "Minnie",
"job": "laugh"
}
]
}
}
},
{
"mgmt": {
"dir": {
"name": "Donald"
},
"emp": [
{
"name": "Woody",
"job": "smile"
},
{
"name": "Buzz",
"job": "smile"
}
]
}
}
]
I'm looking for a flat array of arbitrary objects found in arbitrary locations within the document (in this example, under "emp" parent/keys).
In this example, it would look like
[
{
"name": "Goofy",
"job": "laugh",
"start": "today"
},
{
"name": "Minnie",
"job": "laugh"
},
{
"name": "Woody",
"job": "smile"
},
{
"name": "Buzz",
"job": "smile"
}
]
I've looked through a lot of documentation and am able to do this if I know in advance precisely where these 'emp' keys are in the document, but not if they're distributed through the document at a priori unknown locations/paths.
Use recurse to walk the structure. From all the substructures, select the objects that have an emp key. Output the corresponding values and merge the resulting arrays with add.
jq '[recurse | select (type == "object" and .emp) | .emp ] | add' file.json
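The same recursive walk can be sketched in Python for readers who want to see the steps spelled out. This is an illustrative equivalent under the assumption that the document is already parsed (for example with json.load); the helper name collect_key_arrays is hypothetical.

```python
def collect_key_arrays(node, key="emp"):
    """Walk a parsed JSON value and concatenate every array stored under `key`."""
    found = []
    if isinstance(node, dict):
        value = node.get(key)
        if isinstance(value, list):
            found.extend(value)
        # Keep descending: the key may appear again deeper in the structure.
        for child in node.values():
            found.extend(collect_key_arrays(child, key))
    elif isinstance(node, list):
        for child in node:
            found.extend(collect_key_arrays(child, key))
    return found


# Input copied from the question.
doc = [
    {"eng": {"dev": {"dir": {"name": "Mickey"},
                     "emp": [{"name": "Goofy", "job": "laugh", "start": "today"},
                             {"name": "Minnie", "job": "laugh"}]}}},
    {"mgmt": {"dir": {"name": "Donald"},
              "emp": [{"name": "Woody", "job": "smile"},
                      {"name": "Buzz", "job": "smile"}]}},
]

employees = collect_key_arrays(doc)
print([e["name"] for e in employees])
```

Like the jq filter, the walk does not need to know in advance where the emp keys are located.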

jsonschema Required properties inoperative with $ref

I am writing a JSON Schema to validate tree data.
The schema consists of a top-level root with blocks below it.
A block may itself contain further blocks.
Schema used for validation:
schema = {
"$schema": "http://json-schema.org/draft-04/schema",
"$ref": "#/definitions/root",
"definitions":{
"root": {
"properties": {
"name": {
"type": "string"
},
"children": {
"type": "array",
"items": [
{"$ref":"#/definitions/block"}
]
}
},
"required": ["name", "children"]
},
"block": {
"properties": {
"name": {
"type": "string"
},
"children": {
"type": "array",
"items": [
{"$ref":"#/definitions/block"}
]
}
},
"required": ["name"]
}
}
}
Below is invalid data for testing: the innermost node has no name property.
{
"name": "group8",
"children": [
{
"name": "group7",
"children": [
{
"name": "group6",
"children": [
{
"name": "group5",
"children": [
{ ###### wrong
"children": []
}
]
}
]
}
]
}
]
}
Validation correctly rejects this data with the error below, but the same problem is not caught in a slightly more complex tree.
# Error: ValidationError: file /home/gulliver/.local/lib/python2.7/site-packages/jsonschema/validators.py line 934: 'name' is a required property #
{
"name": "group8",
"children": [
{
"name": "group7",
"children": [
{
"name": "group6",
"children": [
{
"name": "group12",
"children": [
{
"name": "group11",
"children": [
{
"name": "group10",
"children": []
}
]
}
]
},
{
"name": "group9",
"children": [
{
"name": "group5",
"children": [
{ ####### wrong
"children": []
}
]
}
]
}
]
}
]
},
{
"name": "group13",
"children": [
{
"name": "null1",
"children": []
}
]
}
]
}
Validation does not fail here even though the data at the bottom of the tree is invalid.
My guess is that this happens because the tree branches. Does anyone know why, or how to fix it?
I tested using Python and jsonschema.
When items is an array, each subschema applies only to the instance element at the same index.
For example, where you define...
"items": [
{"$ref":"#/definitions/block"}
]
only the first item in the array will be tested. It has nothing to do with deep nesting. For example, the following data is valid according to your schema...
{
"name": "group8",
"children": [
{
"name": "group7"
},
{
"something": "else",
"Not": "name"
}
]
}
(Live demo: https://jsonschema.dev/s/etFGE)
If you modify your use of items, then it will work like you expect:
"items": {"$ref":"#/definitions/block"}
(do this for both uses)
Live demo: https://jsonschema.dev/s/rk1OD
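To make the difference concrete, here is a small standard-library sketch that mimics the two behaviours. It is not the jsonschema library and checks only the required name property; the function missing_names and its tuple_items flag are made up for this illustration.

```python
def missing_names(node, tuple_items, path="root"):
    """Return the paths of all checked nodes that lack a "name" property.

    tuple_items=True  mimics "items": [ {...} ]  (only index 0 is checked)
    tuple_items=False mimics "items": {...}      (every element is checked)
    """
    errors = []
    if "name" not in node:
        errors.append(path)
    children = node.get("children", [])
    checked = children[:1] if tuple_items else children
    for i, child in enumerate(checked):
        errors.extend(missing_names(child, tuple_items, f"{path}.children[{i}]"))
    return errors


# The counter-example from above: the second child has no name.
tree = {
    "name": "group8",
    "children": [
        {"name": "group7"},
        {"something": "else", "Not": "name"},
    ],
}

print(missing_names(tree, tuple_items=True))   # array form: second child is never checked
print(missing_names(tree, tuple_items=False))  # object form: second child is reported
```

With the array form the second child is skipped entirely, which is why the branching tree in the question passed validation.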

JsonPath - Extract object meeting multiple criteria?

In the JSON given below, I want to find all elements in which category = m AND whose "middle" array contains elements matching this condition: the element's own "middle" array has objects whose itemType = Executable.
I would like to use jsonpath to get the desired objects. I prefer not to use jmespath because it can be too complex for my purpose. But I am new to jsonpath, and I am not able to figure out the query from online tutorials, which are too trivial or basic. I wonder if it's better to use a programming language instead to get the data I need. Please advise.
So far, I was only able to extract elements in which category = m by using this jsonpath query: $.[?(@.category=="m")]. How do I do the remaining part?
JSON:
Overview - Every object has a "content" object. Each content object generally has a start, middle and end array besides other fields. Middle arrays can have multiple content objects inside them and so on. Some of the content objects have only a middle array. I am interested in locating items in such content objects as mentioned above.
Note that this is not the actual json which I have to process. It is an imitation which has been sanitized for SO.
{
"id": "123",
"contents": {
"title": "B1",
"start": [],
"middle": [
{
"level": "1",
"contents": {
"title": "C1",
"category": "c",
"start": [],
"middle": [
{
"level": "2",
"contents": {
"title": "M1",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec1"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT3",
"middle": [
{
"itemType": "Data"
}
]
}
}
],
"end": []
}
},
{
"level": "2",
"contents": {
"title": "M2",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec2"
}
]
}
}
],
"end": []
}
}
],
"end": []
}
},
{
"level": "1",
"contents": {
"title": "C2",
"category": "c",
"start": [],
"middle": [
{
"level": "2",
"contents": {
"title": "M1",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec3"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT3",
"middle": [
{
"itemType": "Data"
}
]
}
}
],
"end": []
}
},
{
"level": "2",
"contents": {
"title": "M2",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec4"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT3",
"middle": [
{
"itemType": "Data"
}
]
}
}
],
"end": []
}
}
],
"end": []
}
}
],
"end": []
}
}
Context
JSON with nested objects [1]
jsonpath expression language
choosing between jsonpath and jmespath (or other JSON expression engine)
Problem
DeveMasterJoe2 wants to extract some values from nested JSON
Discussion
There are lots of implementations of jsonpath out there, and they do not all support the same features
The structure and normalization of the source JSON is going to influence how easily this can be done with pure jsonpath
In choosing a JSON expression engine, one has to weigh multiple factors
how consistent are the implementations across languages?
how many choices are there within a given language?
how clear is the specification?
how many examples, unit-tests or tutorials are available?
who is supporting it?
Example solution using Python and jsonpath-ng
Here is an example solution using python 3.7 and jsonpath-ng
This example uses a mix of jsonpath and python instead of just pure jsonpath, because of the heavily-nested JSON
I will leave it for someone else to provide an answer that relies on pure jsonpath
Note that the source JSON arguably could stand to be cleaned up a bit
(for example, why is there no id field attached to itemType==Data elements?)
(for example, why is category not found on all contents elements?)
(for example, if you expressly specify level, why complicate things with heavily nested objects when depth can be determined from level?)
This example:
## import libraries
import json
from jsonpath_ng.ext import parse  # the ext parser is needed for filter expressions
##;;
## init vars
href = "path/to/my/jsonfile/nested_dict.json"
with open(href, encoding="utf8") as fh:
    json_dataroot = json.load(fh)
final_result = []
##;;
## init jsonpath outer-query: every entry of every "middle" array under a "contents" object
matches = parse('$..contents.middle[*]').find(json_dataroot)
##;;
## iterate through outer-query and gather subelements
for match in matches:
    contents = match.value.get('contents', {})
    ## restrict to desired category == 'm'
    if contents.get('category', '') == 'm':
        ## take the first entry of each child's own "middle" array
        json_datafrag001 = [
            child.get('contents', {}).get('middle', [{}])[0]
            for child in contents.get('middle', [])
        ]
        ## keep only the entries whose itemType is 'Executable'
        match001 = parse("$[?(@.itemType=='Executable')]").find(json_datafrag001)
        final_result.extend(found.value for found in match001)
##;;
## show final result
vout = json.dumps(final_result, sort_keys=True, indent=4, separators=(',', ': '))
print(vout)
##;;
... produces this result ...
[
{
"id": "exec1",
"itemType": "Executable"
},
{
"id": "exec2",
"itemType": "Executable"
},
{
"id": "exec3",
"itemType": "Executable"
},
{
"id": "exec4",
"itemType": "Executable"
}
]
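If pulling in a jsonpath library is not an option, the same selection can be sketched with the standard library alone. This is a rough alternative, not the jsonpath-ng solution above: the helper names walk and executables_under_category are hypothetical, and the sample is a shortened stand-in for the JSON in the question.

```python
def walk(node):
    """Yield every dict found anywhere in a parsed JSON structure."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for value in node:
            yield from walk(value)


def executables_under_category(root, category="m"):
    """Collect Executable items found one level below objects with the given category."""
    results = []
    for obj in walk(root):
        if obj.get("category") != category:
            continue
        for child in obj.get("middle", []):
            for item in child.get("contents", {}).get("middle", []):
                if item.get("itemType") == "Executable":
                    results.append(item)
    return results


# Shortened sample with the same shape as the question's JSON.
sample = {
    "id": "123",
    "contents": {"title": "B1", "middle": [
        {"level": "1", "contents": {"title": "C1", "category": "c", "middle": [
            {"level": "2", "contents": {"title": "M1", "category": "m", "middle": [
                {"level": "3", "contents": {"title": "MAT1",
                                            "middle": [{"itemType": "Data"}]}},
                {"level": "3", "contents": {"title": "MAT2",
                                            "middle": [{"itemType": "Executable",
                                                        "id": "exec1"}]}},
            ]}},
        ]}},
    ]},
}

found = executables_under_category(sample)
print(found)
```

The trade-off is the usual one: the plain-Python version has no dependency and is easy to debug, while a path expression is shorter once the query language is familiar.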
[1] aka dictionary, associative array, hash

Read Array Value Using Dataweave in Mule

I am trying to use dataweave in Mule to read specific data values from an incoming payload. My sample payload looks like below:
{
"source": [
{
"uri": "entities/1R6xV",
"createdBy": "API_USER",
"createdTime": 1562504739146,
"attributes": {
"label": "000000000002659654",
"value": {
"Name": [
{
}
],
"Id": [
{
}
],
"Number": [
{
"type": "config/Types/Number/attributes/Number",
"ov": true,
"value": "000000000002659654",
"uri": "entities/1R6xV/attributes/Num/1ZtyT/Number/60pvN6"
}
]
}
}
}
]
}
If I need to read the "label", I can achieve that by
label: payload.source.attributes.label
Similarly, how can I read the "value" under attributes > Number? This doesn't work:
Value: payload.source.attributes.Number.value
I am new to Dataweave. Please advise.
The problem is that the dot selector (.) works on objects and on arrays of objects. When it is applied to an array, it applies the dot selector to every element of the array that is an object and returns those results as an array.
Let's go part by part.
payload.source
Returns
[
{
"uri": "entities/1R6xV",
"createdBy": "API_USER",
"createdTime": 1562504739146,
"attributes": {
"label": "000000000002659654",
"value": {
"Name": [
{
}
],
"Id": [
{
}
],
"Number": [
{
"type": "config/Types/Number/attributes/Number",
"ov": true,
"value": "000000000002659654",
"uri": "entities/1R6xV/attributes/Num/1ZtyT/Number/60pvN6"
}
]
}
}
}
]
So far so good: payload is an object, so this returns the value of source, which is an array.
payload.source.attributes
Returns
[
{
"label": "000000000002659654",
"value": {
"Name": [
{
}
],
"Id": [
{
}
],
"Number": [
{
"type": "config/Types/Number/attributes/Number",
"ov": true,
"value": "000000000002659654",
"uri": "entities/1R6xV/attributes/Num/1ZtyT/Number/60pvN6"
}
]
}
}
]
This works because the result of payload.source is an array of objects, so the selection is applied to each of those objects.
Now when you execute
payload.source.attributes.value.Number
It returns
[
[
{
"type": "config/Types/Number/attributes/Number",
"ov": true,
"value": "000000000002659654",
"uri": "entities/1R6xV/attributes/Num/1ZtyT/Number/60pvN6"
}
]
]
That is an array of arrays, and this is where it breaks: the elements of the outer array are arrays, not objects, so selecting .value on them finds nothing.
My Solution
You have two alternatives here
Use flatten function
flatten(payload.source.attributes.value.Number).value
Use descendant selector
payload.source.attributes.value.Number..value
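The behaviour described above can be modelled in a few lines of Python. This is only a simplified model of the selector semantics explained in this answer, not DataWeave itself; select, select_path and flatten are hypothetical helpers, and the model returns an empty list where nothing matches.

```python
def select(value, key):
    """Model of the dot selector: pick `key` from an object, or from each object in an array."""
    if isinstance(value, dict):
        return value.get(key)
    if isinstance(value, list):
        return [item[key] for item in value if isinstance(item, dict) and key in item]
    return None


def select_path(value, *keys):
    """Apply select() once per key, left to right."""
    for key in keys:
        value = select(value, key)
    return value


def flatten(value):
    """Remove one level of array nesting."""
    flat = []
    for item in value:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


# Trimmed version of the payload from the question.
payload = {"source": [{"attributes": {
    "label": "000000000002659654",
    "value": {"Number": [{"ov": True, "value": "000000000002659654"}]},
}}]}

numbers = select_path(payload, "source", "attributes", "value", "Number")
print(numbers)                            # an array of arrays
print(select(numbers, "value"))           # nothing selected: the elements are arrays
print(select(flatten(numbers), "value"))  # flattening first makes the selection work
```

Flattening removes the extra level of nesting, so the selector once again sees an array of objects.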
Since Number is an array, you need to specify the index you want. In this case, the zeroth element:
Value: payload.source[0].attributes.value.Number[0].value
If you have multiple numbers, it would look something like this:
%dw 1.0
%output application/json
---
values: payload.source[0].attributes.value.Number map {
value: $.value
}

Traversing through nested json in scala play

I'm using scala play and am attempting to traverse a json tree in order to validate that specific name values have specific children with specific name values. I have the following Json in the form of a JsObject:
{ "name": "user", "children": [ { "name": "$a", "children": [ { "name": "foo", "children": [ ] }, { "name": "fooBar", "children": [ { "name": "$a", "children": [ { "name": "subFoobar1", "children": [ ] }, { "name": "subFoobar2", "children": [ { "name": "TEST", "children": [ ] } ] }, { "name": "subFoobar3", "children": [ ] } ] } ] }, { "name": "bar", "children": [ { "name": "$a", "children": [ ] }, { "name": "$c", "children": [ ] }, { "name": "$b", "children": [ ] } ] }, { "name": "barFoo", "children": [ ] } ] } ] }
Ideally I would use nested for loops to traverse it, but the JsObject structure is preventing me from accessing the underlying values when attempting to traverse. I have also attempted mapping the JsObject to a map of type [Map[String,Map[String,Any]]], but I am getting invalid cast compiler errors.
Any tips on how I can traverse and validate the name value at each level would be appreciated. I would preferably like to use the Play JSON library.
The issue was in the case class I was attempting to use: I wasn't accounting for the recursive nature of my JSON structure. A recursive case class models it:
case class ActorTree(name : String, children:Seq[ActorTree] )
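The idea of a self-referencing type is not specific to Scala. As a language-neutral illustration, here is a Python sketch of the same recursive structure together with a traversal that reports each name and its depth. It does not use Play JSON; from_json and walk are names invented for this example, and the sample is a shortened version of the tree in the question.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ActorTree:
    """Recursive node: a name plus zero or more child nodes of the same type."""
    name: str
    children: List["ActorTree"] = field(default_factory=list)

    @classmethod
    def from_json(cls, obj: dict) -> "ActorTree":
        # Recurse into "children" so the whole tree is converted.
        return cls(obj["name"], [cls.from_json(c) for c in obj.get("children", [])])


def walk(tree: ActorTree, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, name) for every node, parents before children."""
    yield depth, tree.name
    for child in tree.children:
        yield from walk(child, depth + 1)


# Shortened version of the tree from the question.
raw = {"name": "user", "children": [
    {"name": "$a", "children": [
        {"name": "foo", "children": []},
        {"name": "bar", "children": []},
    ]},
]}

tree = ActorTree.from_json(raw)
print(list(walk(tree)))
```

Once the tree is typed this way, validating that a given name has specific children becomes an ordinary recursive check over the children field.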