NiFi: manipulate all JSON keys/values from InvokeHTTP

I have JSON that comes from InvokeHTTP. I used SplitJson and JoltTransformJSON to get the keys and values, but I need to change all of the keys from camelCase to snake_case.
The keys will be different with every InvokeHTTP call. I've tried AttributesToJSON, EvaluateJsonPath, and some ReplaceText, but I haven't figured out a way to dynamically change just the keys and then merge them back with the values without writing a custom processor.
Original Data from InvokeHTTP:
{
  "data": {
    "Table": [
      {
        "Age": 51,
        "FirstName": "Bob",
        "LastName": "Doe"
      },
      {
        "Age": 26,
        "FirstName": "Ryan",
        "LastName": "Doe"
      }
    ]
  }
}
Input after SplitJson (which gives me each JSON object in a separate flowfile) and Jolt:
[
  {
    "Key": "Age",
    "Value": 51
  },
  {
    "Key": "FirstName",
    "Value": "Bob"
  },
  {
    "Key": "LastName",
    "Value": "Doe"
  }
]
Desired Output:
{
  "data": {
    "Table": [
      {
        "age": 51,
        "first_name": "Bob",
        "last_name": "Doe"
      },
      {
        "age": 26,
        "first_name": "Ryan",
        "last_name": "Doe"
      }
    ]
  }
}

If you know the fields, you can use JoltTransformJSON on the original input JSON so you don't need SplitJson at all. Here's a spec that does the (explicit) field-name conversion:
[
  {
    "operation": "shift",
    "spec": {
      "data": {
        "Table": {
          "*": {
            "Age": "data.Table[#2].age",
            "FirstName": "data.Table[#2].first_name",
            "LastName": "data.Table[#2].last_name"
          }
        }
      }
    }
  }
]
You could also use UpdateRecord; you'd just need separate schemas for the JsonTreeReader and the JsonRecordSetWriter.

I wrote an answer to a similar question that uses ReplaceText to replace . in JSON keys with _. The same logic could be applied here (a template is available in the link). As I pointed out in that answer, a cleaner solution would be to use ExecuteScript, especially since the conversion from camelCase to snake_case is easy in most scripting languages.
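The camelCase-to-snake_case logic such a script would need can be sketched in plain Python (the data is just the example from the question; a Jython ExecuteScript body would look very similar, reading the flowfile content instead of a literal dict):

```python
import json
import re

def camel_to_snake(name):
    # Insert "_" before each interior capital, then lowercase:
    # "FirstName" -> "first_name", "Age" -> "age"
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

original = {
    "data": {
        "Table": [
            {"Age": 51, "FirstName": "Bob", "LastName": "Doe"},
            {"Age": 26, "FirstName": "Ryan", "LastName": "Doe"},
        ]
    }
}

# Rename only the record keys; the desired output keeps the outer
# "data"/"Table" keys unchanged, so those are left alone.
converted = {
    "data": {
        "Table": [
            {camel_to_snake(k): v for k, v in row.items()}
            for row in original["data"]["Table"]
        ]
    }
}

print(json.dumps(converted, indent=2))
```

Because the renaming is driven by the regex rather than a field list, it handles whatever keys each InvokeHTTP response happens to contain.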

Related

Merge Json Array Nodes and roll up a child element

I have a requirement to roll up each collection of nodes: for every collection, collect each child node's type value into a string array, and use the parent's key as the output key.
Given:
{
  "client": {
    "addresses": [
      {
        "id": "27ef465ef60d2705",
        "type": "RegisteredOfficeAddress"
      },
      {
        "id": "b7affb035be3f984",
        "type": "PlaceOfBusiness"
      },
      {
        "id": "a8a3bef166141206",
        "type": "EmailAddress"
      }
    ],
    "links": [
      {
        "id": "29a9de859e70799e",
        "type": "Director",
        "name": "Bob the Builder"
      },
      {
        "id": "22493ad4c4fd8ac5",
        "type": "Secretary",
        "name": "Jennifer"
      }
    ],
    "Names": [
      {
        "id": "53977967eadfffcd",
        "type": "EntityName",
        "name": "Banjo"
      }
    ]
  }
}
From this, the output needs to be:
{
  "client": {
    "addresses": [
      "RegisteredOfficeAddress",
      "PlaceOfBusiness",
      "EmailAddress"
    ],
    "links": [
      "Director",
      "Secretary"
    ],
    "Names": [
      "EntityName"
    ]
  }
}
What is the best way to achieve this? Any pointers on what to do or how to do it would be greatly appreciated.
Ron.
You can iterate over the entries of your client object with the help of the $each function, then get the type values for each of them, and combine the results via $merge:
{
  "client": client
    ~> $each(function($list, $key) {{ $key: $list.type }})
    ~> $merge
}
Live playground: https://stedi.link/OpuRdE9
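If it helps to see the same logic outside JSONata, here's a plain-Python sketch of the roll-up: a dict comprehension over the collections, mirroring what $each and $merge do above (the data is just the example from the question):

```python
client_doc = {
    "client": {
        "addresses": [
            {"id": "27ef465ef60d2705", "type": "RegisteredOfficeAddress"},
            {"id": "b7affb035be3f984", "type": "PlaceOfBusiness"},
            {"id": "a8a3bef166141206", "type": "EmailAddress"},
        ],
        "links": [
            {"id": "29a9de859e70799e", "type": "Director", "name": "Bob the Builder"},
            {"id": "22493ad4c4fd8ac5", "type": "Secretary", "name": "Jennifer"},
        ],
        "Names": [
            {"id": "53977967eadfffcd", "type": "EntityName", "name": "Banjo"},
        ],
    }
}

# For each collection, keep the parent key and collect the child "type" values.
rolled = {
    "client": {
        key: [child["type"] for child in children]
        for key, children in client_doc["client"].items()
    }
}
```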

How can I build a JSON payload from a CSV file that has string data separated by line, using Python?

As an overview, let's say I have a CSV file with 5 entries of data (the real file will have a large number of entries) that I need to use dynamically while building a JSON payload with Python (in Databricks).
test.csv
1a2b3c
2n3m6g
333b4c
2m345j
123abc
payload.json
{
  "records": {
    "id": "37c8323c",
    "names": [
      {
        "age": "1",
        "identity": "Dan",
        "powers": {
          "key": "plus",
          "value": "1a2b3c"
        }
      },
      {
        "age": "2",
        "identity": "Jones",
        "powers": {
          "key": "minus",
          "value": "2n3m6g"
        }
      },
      {
        "age": "3",
        "identity": "Kayle",
        "powers": {
          "key": "multiply",
          "value": "333b4c"
        }
      },
      {
        "age": "4",
        "identity": "Donnis",
        "powers": {
          "key": "divide",
          "value": "2m345j"
        }
      },
      {
        "age": "5",
        "identity": "Layla",
        "powers": {
          "key": "power",
          "value": "123abc"
        }
      }
    ]
  }
}
The payload above is what I need to construct: multiple objects in the names array, with each value property read dynamically from the CSV file.
I basically need to append a JSON object like the one below to the existing names array, taking the value inside the powers object from the CSV file.
{
  "age": "1",
  "identity": "Dan",
  "powers": {
    "key": "plus",
    "value": "1a2b3c"
  }
}
Since I'm a newbie in Python, any guidance would be appreciated. Thanks to the StackOverflow team in advance.
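One minimal way to sketch this in plain Python (a sketch, not a definitive implementation): the age/identity/key metadata only appears in the example payload, so it is assumed fixed here; adjust it to wherever those fields really come from. Only the value fields are read from the CSV lines:

```python
import json

# Assumed-fixed metadata taken from the example payload: (age, identity, key).
metadata = [
    ("1", "Dan", "plus"),
    ("2", "Jones", "minus"),
    ("3", "Kayle", "multiply"),
    ("4", "Donnis", "divide"),
    ("5", "Layla", "power"),
]

# Contents of test.csv inlined for the example; for a real file use
# values = open("test.csv").read().splitlines()
csv_text = """1a2b3c
2n3m6g
333b4c
2m345j
123abc"""
values = csv_text.splitlines()

# Pair each metadata row with the matching CSV value and build the payload.
payload = {
    "records": {
        "id": "37c8323c",
        "names": [
            {
                "age": age,
                "identity": identity,
                "powers": {"key": key, "value": value},
            }
            for (age, identity, key), value in zip(metadata, values)
        ],
    }
}

print(json.dumps(payload, indent=2))
```

Because the list comprehension is driven by zip, adding more CSV lines (and matching metadata rows) grows the names array automatically.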

JSON schema for a complex JSON?

I have a JSON document that uses an object of objects in place of an array, to make it easy to look up data by key.
How can I validate this against a schema without hardcoding the keys into the schema? Should I be converting the objects into arrays before validating?
Example JSON :
{
  "item_value": {
    "id": 123,
    "name": "Item Value",
    "colors": {
      "red_value": {
        "id": 1231,
        "name": "Red Value"
      },
      "blue_value": {
        "id": 1231,
        "name": "Blue Value"
      }
    }
  },
  "another_item_value": {
    "id": 133,
    "name": "Another Item Value",
    "colors": {
      "red_value_xyz": {
        "id": 1331,
        "name": "Red Value Xyz"
      },
      "blue_value_bar": {
        "id": 1331,
        "name": "Blue Value Bar"
      }
    }
  }
}
You can use patternProperties for regex-matched property-name validation, or, if all of your properties follow the same subschema, just use additionalProperties with that subschema.
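As a sketch (assuming the Python jsonschema package, which is not part of the question), here's the additionalProperties approach applied to the example, with no key names hardcoded anywhere in the schema:

```python
from jsonschema import validate  # third-party: pip install jsonschema

# Every value under "colors", whatever its key, must match this subschema.
color_schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
    "required": ["id", "name"],
}

# Every top-level value, whatever its key, must match this subschema.
item_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "colors": {"type": "object", "additionalProperties": color_schema},
    },
    "required": ["id", "name", "colors"],
}

schema = {"type": "object", "additionalProperties": item_schema}

document = {
    "item_value": {
        "id": 123,
        "name": "Item Value",
        "colors": {"red_value": {"id": 1231, "name": "Red Value"}},
    }
}

validate(instance=document, schema=schema)  # raises ValidationError on mismatch
```

Swap additionalProperties for patternProperties (e.g. "^[a-z_]+$" as the key) if you also want to constrain what the key names may look like.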

How to make JOLT return a key's value as a list whether it has a single value or multiple values?

I want to convert one JSON format to another, so I'm using JOLT for that. (If there is something better than JOLT, please recommend it.)
When I map this object:
{
  "id": 1,
  "username": "sd4s5d4",
  "phone": "111",
  "groups": [
    {
      "id": 1
    },
    {
      "id": 2
    }
  ]
}
the expected output is fine; the value of groups is returned as a list:
{
  "id": 1,
  "username": "sd4s5d4",
  "phone": "111",
  "groups": [ 1, 2 ]
}
But when I map this object:
{
  "id": 5,
  "username": "sd4s5d4",
  "phone": "111",
  "groups": [
    {
      "id": 1
    }
  ]
}
it returns
{
  "id": 5,
  "username": "sd4s5d4",
  "phone": "111",
  "groups": 1
}
How do I make groups in the last output a list even when it has only one item?
Wanted format:
{
  "id": 5,
  "username": "sd4s5d4",
  "phone": "111",
  "groups": [1]
}
Spec
[
  {
    "operation": "shift",
    "spec": {
      "id": "id",
      "username": "username",
      "phone": "phone",
      "groups": {
        "*": {
          "id": "groups"
        }
      }
    }
  }
]
Replacing "id": "groups" with "id": "groups.[&1]" is enough. By the way, there's no need to repeat every element individually; just use the proper substitution, the ampersand operator prepended to an integer representing how many levels up the target key sits (i.e., the number of nesting levels you walk back out through to reach the related key). So convert "id": "groups.[&1]" to "id": "&2.[&1]", and use "*": "&" for the other elements, like so:
[
  {
    "operation": "shift",
    "spec": {
      "*": "&",
      "groups": {
        "*": {
          "id": "&2.[&1]"
        }
      }
    }
  }
]

I need assistance in building my own JSON schema

How do I build my own JSON schema to validate that the JSON coming back from an API has the same structure? I have this sample JSON:
{
  "httpStatus": 200,
  "httpStatusMessage": "success",
  "timestamp": "2020-11-11T19:32:45",
  "response": {
    "header": {
      "SchoolId": 10006,
      "SchoolName": "Naples"
    },
    "body": {
      "dataProviders": [
        {
          "dataProviderId": 14,
          "students": [
            {
              "studentId": 1000611000,
              "driverGrade": "Junior",
              "firstName": "Authur",
              "lastName": "Boccuto"
            },
            {
              "studentId": 1000611001,
              "studentGrade": "Senior",
              "firstName": "Antwan",
              "lastName": "Carter"
            }
          ]
        }
      ]
    }
  }
}
At times the response can come in with a different structure, so I need to build my own JSON schema to validate that the structure is the same before manipulating the JSON data. How do I build my own schema to make sure that it has a valid structure?
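One common approach (an assumption, not from the question) is to describe the required structure in JSON Schema and validate with the Python jsonschema package. The schema below just mirrors the sample's field names, and the grade field is left optional since the sample itself varies (driverGrade vs. studentGrade):

```python
from jsonschema import validate  # third-party: pip install jsonschema

# Subschema for one student; grade fields are deliberately not required.
student = {
    "type": "object",
    "properties": {
        "studentId": {"type": "integer"},
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
    },
    "required": ["studentId", "firstName", "lastName"],
}

schema = {
    "type": "object",
    "required": ["httpStatus", "httpStatusMessage", "timestamp", "response"],
    "properties": {
        "httpStatus": {"type": "integer"},
        "httpStatusMessage": {"type": "string"},
        "timestamp": {"type": "string"},
        "response": {
            "type": "object",
            "required": ["header", "body"],
            "properties": {
                "header": {"type": "object", "required": ["SchoolId", "SchoolName"]},
                "body": {
                    "type": "object",
                    "required": ["dataProviders"],
                    "properties": {
                        "dataProviders": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "required": ["dataProviderId", "students"],
                                "properties": {
                                    "students": {"type": "array", "items": student}
                                },
                            },
                        }
                    },
                },
            },
        },
    },
}

# A trimmed version of the sample response; validate() raises on mismatch.
sample = {
    "httpStatus": 200,
    "httpStatusMessage": "success",
    "timestamp": "2020-11-11T19:32:45",
    "response": {
        "header": {"SchoolId": 10006, "SchoolName": "Naples"},
        "body": {
            "dataProviders": [
                {
                    "dataProviderId": 14,
                    "students": [
                        {"studentId": 1000611000, "firstName": "Authur", "lastName": "Boccuto"}
                    ],
                }
            ]
        },
    },
}

validate(instance=sample, schema=schema)  # passes silently when the structure matches
```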