Jolt transform specification input - json

I have the following input JSON:
{
"tags": {
"event": "observation",
"source": "hunter"
}
}
The output JSON should look like below:
{
"tags" : [ "event:observation", "source:hunter" ]
}
Can anyone provide guidance on how to build a proper Jolt specification for the above?
Thank you very much for the help ^_^

You can use this specification
[
{ // combine each key-value pair within a common array
"operation": "shift",
"spec": {
"tags": {
"*": {
"$": "&2_&1",
"@": "&2_&1"
}
}
}
},
{ // concatenate each key-value pair with a colon character
"operation": "modify-overwrite-beta",
"spec": {
"*": "=join(':',@(1,&))"
}
},
{
"operation": "shift",
"spec": { // make array key common("tags") for all arrays
// through use of the _ separator and the * wildcard
"*_*": "&(0,1)"
}
}
]
You can try this spec on the demo site http://jolt-demo.appspot.com/.
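To cross-check the expected result outside Jolt, here is a minimal plain-Python sketch of the same reshaping. It is not Jolt; the function name tags_to_list is made up for illustration, and it assumes every value under "tags" is a simple scalar.

```python
def tags_to_list(doc):
    """Turn {"tags": {key: value, ...}} into {"tags": ["key:value", ...]}.

    Dict insertion order is preserved, so the list follows the input order.
    """
    return {"tags": [f"{key}:{value}" for key, value in doc["tags"].items()]}


source = {"tags": {"event": "observation", "source": "hunter"}}
result = tags_to_list(source)
print(result)
```

The three Jolt steps (pair up key and value, join with a colon, collect under one key) collapse into a single comprehension here, which makes it a convenient reference when comparing against the Jolt output.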

transforming all string attributes which are boolean to true booleans

I thought this would be simple, and perhaps it is, but on my Jolt learning journey I am once again struggling.
I have some JSON files (without a schema), up to about 30 MB in size, with many thousands of string attributes at all levels of the document. Some of them (say 20%) hold booleans as strings.
I get that I can write a spec to pick out individual ones and convert them, as per this post: https://stackoverflow.com/questions/64972556/convert-boolean-to-string-for-map-values-in-nifi-jolt
That technique won't work for me, as the nesting and levels are arbitrary and there are simply far too many attributes.
So how can I apply the data type transform to any attribute that has a boolean represented as a string?
For example, input:
{
"name": "Fred",
"age": 45,
"opentowork" : "true",
"friends" : [
{
"name": "penny",
"closefriend": "false"
},
{
"name": "roger",
"farfriend": "true"
}
]
}
to desired output:
{
"name": "Fred",
"age": 45,
"opentowork" : true,
"friends" : [
{
"name": "penny",
"closefriend": false
},
{
"name": "roger",
"farfriend": true
}
]
}
I want to pick up the attributes opentowork, closefriend and farfriend without explicitly defining them in the spec. I also need to leave the values of the other attributes as they are (whatever level they are at).
You can use the =toBoolean conversion, handling the friends array separately through the "f*" pattern while "*" covers the remaining top-level attributes, such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=toBoolean",
"f*": {
"*": {
"*": "=toBoolean"
}
}
}
}
]
Alternatively, without explicitly defining any attribute, array or object, you can chain several modify specs, one for each nesting level you need to cover, such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=toBoolean"
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": "=toBoolean"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": {
"*": "=toBoolean"
}
}
}
}
]
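Both specs above cover a fixed number of levels. For comparison, the transformation the question asks for (any depth, no attribute names) can be sketched as a recursive walk in plain Python. This is not Jolt; the function name strings_to_booleans is hypothetical, and it assumes only the exact lowercase strings "true" and "false" should be converted.

```python
def strings_to_booleans(node):
    """Recursively replace the strings "true"/"false" with real booleans."""
    if isinstance(node, dict):
        return {key: strings_to_booleans(value) for key, value in node.items()}
    if isinstance(node, list):
        return [strings_to_booleans(item) for item in node]
    if isinstance(node, str):
        if node == "true":
            return True
        if node == "false":
            return False
    # Numbers, other strings, None and real booleans pass through unchanged.
    return node


person = {
    "name": "Fred",
    "age": 45,
    "opentowork": "true",
    "friends": [
        {"name": "penny", "closefriend": "false"},
        {"name": "roger", "farfriend": "true"},
    ],
}
converted = strings_to_booleans(person)
print(converted)
```

A walk like this can serve as a reference result when checking how many stacked modify levels a given document actually needs.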

How to delete the text before the symbol ( _ ) using Jolt JSON?

I have an input JSON file with some attributes that contain information I want to drop from the output
(example: input = Hello_World => output = World).
For example, this is the input:
{
"test": "hello_world"
}
and this is output needed :
{
"test" : "world"
}
The result derived from the Jolt spec I tried is not even close. This is what I tried:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"test": "=test.substring(test.indexOf(_) + 1)"
}
}
]
Sorry, I'm a newbie at Jolt and have just started with it.
You can split by the _ character within a modify transformation spec and pick the element at index 1. With a single underscore, as in the current case, that is the last component of the array:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=split('_', @(1,&))"
}
},
{
"operation": "shift",
"spec": {
"*": {
"1": "&1" // &1 represents going one level up to reach to the level of the label "test" to replicate it
}
}
}
]
Consider an input whose attribute value contains more than one underscore:
{
"test": "how_are_you"
}
A more generic solution would be as follows:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"test_": "=split('_', @(1,test))",
"test": "=lastElement(@(1,test_))"
}
},
{
// get rid of the extra generated attribute
"operation": "remove",
"spec": {
"test_": ""
}
}
]
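The difference between picking index 1 and picking the last element only shows up with more than one underscore. A minimal plain-Python sketch of the two choices (not Jolt; both function names are made up for illustration, and the input is assumed to contain at least one underscore):

```python
def element_at_index_one(value):
    """Mirror of the first approach: split on "_" and take index 1."""
    return value.split("_")[1]


def last_element(value):
    """Mirror of the generic approach: split on "_" and take the last part."""
    return value.split("_")[-1]


single = "hello_world"
multiple = "how_are_you"

print(element_at_index_one(single), last_element(single))
print(element_at_index_one(multiple), last_element(multiple))
```

With "hello_world" both functions agree; with "how_are_you" only the last-element variant returns the final component.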

Nifi Jolt convert after subtract

[
{
"dst_cnt": "149125"
},
{
"src_cnt": "149136"
},
{
"TABLENAME": "NAME"
}
]
I want to subtract dst_cnt and src_cnt in this data using NiFi Jolt. Is an arithmetic operation possible after type conversion in NiFi?
Yes, it's possible. You can try the following transformation spec:
[
{
// combine individual attributes within the common object
"operation": "shift",
"spec": {
"*": {
"*": "&"
}
}
},
{
// sum up the integer values after determining the negative form of "dst_cnt"
"operation": "modify-overwrite-beta",
"spec": {
"dst_cnt_": "=divide(@(1,dst_cnt),-1)",
"cnt_dif": "=intSum(@(1,dst_cnt_),@(1,src_cnt))"
}
},
{
// only pick the result derived from the subtraction
"operation": "shift",
"spec": {
"cnt_*": "&"
}
}
]
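The arithmetic the spec is aiming for (merge the single-key objects, negate dst_cnt, add src_cnt) can be checked with a short plain-Python sketch. This is not Jolt; the variable names merged and difference are illustrative, and it assumes both counts are integer strings.

```python
records = [
    {"dst_cnt": "149125"},
    {"src_cnt": "149136"},
    {"TABLENAME": "NAME"},
]

# Step 1: combine the individual single-key objects into one object.
merged = {}
for record in records:
    merged.update(record)

# Step 2: convert the strings to integers and subtract dst_cnt from src_cnt,
# which is the same as adding the negated dst_cnt.
difference = {"cnt_dif": int(merged["src_cnt"]) + (-int(merged["dst_cnt"]))}
print(difference)
```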

Converting List to Comma Separated String in JOLT

I have the scenario below, where two operations need to be performed. One is parsing the list and creating a comma-separated string. The other is transforming that into the output JSON format.
Input -
{
"list": ["ABC","XYZ"]
}
Output -
{
"additionalAttributes" : {
"userContext" : [ {
"auths" : "ABC,XYZ"
} ]
}
}
Check this spec:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"list": "=join(',',@(1,list))"
}
}, {
"operation": "shift",
"spec": {
"list": "additionalAttributes.userContext[].auths"
}
}
]
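As a reference for the expected shape, the same two steps (join, then nest) look like this in a plain-Python sketch. It is not Jolt; the function name list_to_auths is hypothetical, and the list is assumed to contain only strings.

```python
def list_to_auths(doc):
    """Join doc["list"] with commas and nest it under the target structure."""
    joined = ",".join(doc["list"])
    return {"additionalAttributes": {"userContext": [{"auths": joined}]}}


payload = list_to_auths({"list": ["ABC", "XYZ"]})
print(payload)
```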

How to transform the rest of the JSON into one field value using Jolt?

Here is the JSON input:
{
"myRootKey": {
"directMove": "directValue",
"marker": "THE_MARKER",
"someTextField": "someString",
"someObject": {
"someKey": "value"
}
}
}
the output should be:
{
"myRootKey": {
"subKey": {
"directMove": "directValue"
},
"THE_MARKER": {
"someTextField": "someString",
"someObject": {
"someKey": "value"
}
}
}
}
The direct move is clear, but how do I move the rest of the input into an object keyed by the marker value?
You match down to "someTextField" and "someObject", but use the new "@" (look up the tree) logic to find the "marker" value to use as an output path.
Spec
[
{
"operation": "shift",
"spec": {
"myRootKey": {
"directMove": "myRootKey.subKey.directMove",
"someTextField": "@(1,marker).someTextField",
"someObject": "@(1,marker).someObject"
}
}
}
]
@(1,marker) retrieves the value of the marker field.
&1 retrieves the name of the key matched one level up (here, myRootKey).
So the spec you are looking for looks like this:
[
{
"operation": "shift",
"spec": {
"myRootKey": {
"directMove": "myRootKey.subKey.directMove",
"someTextField": "&1.@(1,marker).someTextField",
"someObject": "&1.@(1,marker).someObject"
}
}
}
]
You can also make the spec fully dynamic:
[
{
"operation": "shift",
"spec": {
"*": {
"directMove": "&1.subKey.&",
"*TextField": "&1.@(1,marker).&",
"*Object": "&1.@(1,marker).&"
}
}
}
]
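To verify the target structure independently of Jolt, the regrouping can be sketched in plain Python. This is not Jolt; the function name regroup_by_marker is made up for illustration. Unlike the dynamic spec, which only matches keys ending in TextField or Object, this sketch assumes that every key other than directMove and marker belongs under the marker object.

```python
def regroup_by_marker(doc):
    """Move directMove under "subKey" and everything else under the marker value."""
    output = {}
    for root_key, body in doc.items():
        marker = body["marker"]
        rest = {
            key: value
            for key, value in body.items()
            if key not in ("directMove", "marker")
        }
        output[root_key] = {
            "subKey": {"directMove": body["directMove"]},
            marker: rest,
        }
    return output


source = {
    "myRootKey": {
        "directMove": "directValue",
        "marker": "THE_MARKER",
        "someTextField": "someString",
        "someObject": {"someKey": "value"},
    }
}
regrouped = regroup_by_marker(source)
print(regrouped)
```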