Jolt spec for conditional presence of field - json

I have a scenario where I have two very similar input formats, but I need a single Jolt spec to process both consistently.
This is input style 1:
{
"creationTime": 1503999158000,
"device": {
"ip": "155.157.36.226",
"hostname": "server-123.example.int"
}
}
and this is input style 2:
{
"creationTime": 1503999158000,
"device": {
"ip6": "2001::face",
"hostname": "server-123.example.int"
}
}
The only difference is that style 1 uses device.ip, and style 2 uses device.ip6. There will always be one or neither of those fields, but never both.
I want to simply extract the following:
{
"created_ts": 1503999158000,
"src_ip_addr": "....."
}
I need src_ip_addr to be set to whichever field was present out of ip and ip6. If neither field was present in the source data, the value should default to null.
Is this possible with a single Jolt spec?

A single spec with two operations.
Spec
[
{
"operation": "shift",
"spec": {
"creationTime": "created_ts",
"device": {
// map ip or ip6 to src_ip_addr
"ip|ip6": "src_ip_addr"
}
}
},
{
"operation": "default",
"spec": {
// if src_ip_addr does not exist, then apply a default of null
"src_ip_addr": null
}
}
]
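To check the expected result outside of Jolt, here is a minimal plain-Python sketch of what the two operations are meant to compute. It is not Jolt itself, and the function name extract is made up for illustration; it assumes, as the question states, that ip and ip6 never appear together.

```python
def extract(record):
    """Mimic the intent of the shift + default spec above (not Jolt itself)."""
    device = record.get("device", {})
    out = {}
    if "creationTime" in record:
        out["created_ts"] = record["creationTime"]
    # "ip|ip6" in the shift spec: either key is mapped to src_ip_addr
    for key in ("ip", "ip6"):
        if key in device:
            out["src_ip_addr"] = device[key]
    # "default" step: fill the key with null only when it is still missing
    out.setdefault("src_ip_addr", None)
    return out


style2 = {
    "creationTime": 1503999158000,
    "device": {"ip6": "2001::face", "hostname": "server-123.example.int"},
}
print(extract(style2))
```

Running the same function on a record without ip or ip6 should give src_ip_addr set to None, which corresponds to the null default in the spec.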

I tried out the following and it worked for my requirements:
[
{
"operation": "shift",
"spec": {
"creationTime": "created_ts",
"device": {
// map both to src_ip_addr, whichever one is present will be used
"ip": "src_ip_addr",
"ip6": "src_ip_addr"
}
}
},
{
"operation": "default",
"spec": {
"src_ip_addr": null
}
}
]

Related

Nifi - Replace values

Good Afternoon!
I have a JSON with:
{
"cnpjemitente" : "48791685000168",
"pedido" : "543306",
"pedidocliente" : { },
"emissao" : "20220912"
}
I need to replace the empty object in "pedidocliente" with null:
{
"cnpjemitente" : "48791685000168",
"pedido" : "543306",
"pedidocliente" : null,
"emissao" : "20220912"
}
Sometimes the field will carry a real value; I only want to send null when it is empty, i.e. '{}'.
How can I do this?
Thanks!
You can use a modify-overwrite-beta transformation spec within a JoltTransformJSON processor such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"pedidocliente": null
}
}
]
as you only need to change an individual attribute's value without affecting the others.
If the value is not always {} (an empty object), use a shift transformation spec instead, such as
[
{
"operation": "shift",
"spec": {
"pedidocliente": {
"*": "&1.&"
},
"*": "&"
}
}
]
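For comparison, the behaviour the question asks for can be written in a few lines of plain Python. This is only a sketch of the intended result, not of how Jolt evaluates either spec; the helper name null_if_empty_object is hypothetical.

```python
def null_if_empty_object(doc, field="pedidocliente"):
    """Return a copy of doc where `field` becomes None if it is an empty object."""
    out = dict(doc)
    value = out.get(field)
    # Only an empty dict counts as "empty"; real values are left untouched
    if isinstance(value, dict) and not value:
        out[field] = None
    return out


record = {
    "cnpjemitente": "48791685000168",
    "pedido": "543306",
    "pedidocliente": {},
    "emissao": "20220912",
}
print(null_if_empty_object(record))
```

Feeding sample records through a check like this gives a reference output to compare against whatever the JoltTransformJSON processor produces.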

transforming all string attributes which are boolean to true booleans

I thought this would be simple, and perhaps it is, but on my Jolt learning journey I am once again struggling.
I have some JSON files (without a schema), up to about 30 MB in size, with many thousands of string attributes at all levels of the document, some of which (say 20%) hold booleans as strings.
I get that I can write a spec to pick out individual ones and convert them, as per this post: https://stackoverflow.com/questions/64972556/convert-boolean-to-string-for-map-values-in-nifi-jolt
That technique won't work for me, as the nesting and levels are very arbitrary and there are simply way too many of them.
So how can I apply the data type transform to any attribute that has a boolean represented as a string?
for example input
{
"name": "Fred",
"age": 45,
"opentowork" : "true",
"friends" : [
{
"name": "penny",
"closefriend": "false"
},
{
"name": "roger",
"farfriend": "true"
}
]
}
to desired
{
"name": "Fred",
"age": 45,
"opentowork" : true,
"friends" : [
{
"name": "penny",
"closefriend": false
},
{
"name": "roger",
"farfriend": true
}
]
}
I want to pick up the attributes opentowork, closefriend and farfriend without explicitly defining them in the spec. I also need to leave the values of the other attributes as they are (whatever level they are at).
You can use the =toBoolean conversion, handling the friends array separately through the "f*" key and everything else through the "*" (else) case, such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=toBoolean",
"f*": {
"*": {
"*": "=toBoolean"
}
}
}
}
]
or, without explicitly naming any attribute, array or object, you can add one modify spec per nesting level, for as many levels as you need, such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=toBoolean"
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": "=toBoolean"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": {
"*": "=toBoolean"
}
}
}
}
]
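Both specs above cover a fixed number of levels. To check the expected output for arbitrarily deep documents, a small recursive function in plain Python can serve as a reference. This is a sketch, not Jolt: the name strings_to_booleans is made up, and it assumes that only the exact lowercase strings "true" and "false" should be converted, which may differ from what =toBoolean accepts.

```python
def strings_to_booleans(node):
    """Recursively convert the strings "true"/"false" to booleans at any depth."""
    if isinstance(node, dict):
        return {key: strings_to_booleans(value) for key, value in node.items()}
    if isinstance(node, list):
        return [strings_to_booleans(item) for item in node]
    if node == "true":
        return True
    if node == "false":
        return False
    # Numbers, other strings, null and real booleans pass through unchanged
    return node


doc = {
    "name": "Fred",
    "age": 45,
    "opentowork": "true",
    "friends": [
        {"name": "penny", "closefriend": "false"},
        {"name": "roger", "farfriend": "true"},
    ],
}
print(strings_to_booleans(doc))
```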

How to delete the text before the symbol ( _ ) using Jolt JSON?

I have an input JSON file with some attributes that contain information I want to remove from the output
(example input: Hello_World => output: World)
For example this is the input :
{
"test": "hello_world"
}
and this is output needed :
{
"test" : "world"
}
the result derived from the jolt spec i tried is not even close, this is what i tried :
[
{
"operation": "modify-overwrite-beta",
"spec": {
"test": "=test.substring(test.indexOf(_) + 1)"
}
}
]
Sorry, I am a newbie at Jolt; I just started with it.
You can split by the _ character within a modify transformation spec and then pick the element at index 1. In the current case, with a single underscore, that is the last component of the array:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=split('_', @(1,&))"
}
},
{
"operation": "shift",
"spec": {
"*": {
"1": "&1" // &1 represents going one level up to reach to the level of the label "test" to replicate it
}
}
}
]
You can try this on the demo site http://jolt-demo.appspot.com/.
Considering an input whose attribute value has more than one underscore:
{
"test": "how_are_you"
}
a more generic solution would be as follows:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"test_": "=split('_', @(1,test))",
"test": "=lastElement(@(1,test_))"
}
},
{
// get rid of the extra generated attribute
"operation": "remove",
"spec": {
"test_": ""
}
}
]
You can try this one on the demo site as well: http://jolt-demo.appspot.com/
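The difference between picking index 1 and picking the last element shows up as soon as the value has more than one underscore. A short plain-Python sketch of the two approaches follows; the function names are invented for illustration, and returning None when there is no underscore is an assumption of this sketch, not a statement about Jolt's behaviour.

```python
def element_at_index_one(value):
    """First approach: split on '_' and take the piece at index 1."""
    parts = value.split("_")
    return parts[1] if len(parts) > 1 else None


def last_element(value):
    """More generic approach: split on '_' and take the last piece."""
    return value.split("_")[-1]


for text in ("hello_world", "how_are_you"):
    print(text, "->", element_at_index_one(text), "/", last_element(text))
```

For hello_world both functions return world; for how_are_you the first returns are while the second returns you, which is why the lastElement variant is the safer choice.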

Convert object property to array with one element per key

I am trying to use Jolt to transform an object into an array with one array element per key in the original object. I'd like the output objects to include the original key as a property, and preserve any properties from the source value. And I need to handle three scenarios for the input properties:
"key": null
"key": {}
"key": {...}
Here's an example:
{
"operations": {
"foo": null,
"bar": {},
"baz": {
"arbitrary": 1
}
}
}
And the desired output:
{
"operations": [
{
"operation": "foo"
},
{
"operation": "bar"
},
{
"operation": "baz",
"arbitrary": 1
}
]
}
Note: foo, bar and baz are arbitrary here. It needs to handle any property names inside the operations object.
This is really close to what I want:
[
{
"operation": "default",
"spec": {
"operations": {
"*": {}
}
}
},
{
"operation": "shift",
"spec": {
"operations": {
"*": {
"$": "operations[].operation"
}
}
}
}
]
But it drops "arbitrary": 1 from the baz operation.
Alternately this keeps the properties in the operations, but doesn't add a key for the operation name:
[
{
"operation": "default",
"spec": {
"operations": {
"*": {}
}
}
},
{
"operation": "shift",
"spec": {
"operations": {
"*": {
"@": "operations[]"
}
}
}
}
]
Any help getting both behaviors would be appreciated.
You can use a single shift transformation spec that relies on wildcards and & references rather than repeated literals, such as
[
{
"operation": "shift",
"spec": {
"*s": {
"*": {
"$": "&2[#2].&(2,1)",
"*": "&2[#2].&"
}
}
}
}
]
where
&2 goes 2 levels up the tree, traversing { signs twice, to pick up the key name operations (a bare &, which is identical to &(0) or &(0,0), would only traverse the colon and stay at the level of $)
[#2] also goes 2 levels up (counted through the { signs and the : sign, since it already sits on the right-hand side of the spec) and asks the node reached there how many matches it has had so far, which gives the array index
&(2,1) is a subkey lookup: the first argument says how many levels to go up to find the key name, and the second says which part of that key, as captured by the * wildcard, to use (in this case operations matched against *s yields the literal operation without the plural suffix)
the * wildcard, which always appears on the left-hand side, stands for the rest of the attributes (the else case).
You can try this on the demo site http://jolt-demo.appspot.com.
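As a reference for the desired output, the same reshaping can be sketched in plain Python. This is not a model of how Jolt walks the spec; the function name operations_to_array is hypothetical, and the sketch assumes the nested objects do not already contain a property called operation.

```python
def operations_to_array(doc):
    """Turn each key under "operations" into one array element."""
    entries = []
    for name, props in doc.get("operations", {}).items():
        entry = {"operation": name}  # the original key becomes a property
        if isinstance(props, dict):  # null and {} contribute no extra properties
            entry.update(props)
        entries.append(entry)
    return {"operations": entries}


source = {"operations": {"foo": None, "bar": {}, "baz": {"arbitrary": 1}}}
print(operations_to_array(source))
```

All three input scenarios (null, empty object, object with properties) go through the same loop, so the output keeps one element per key in the original order.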

Jolt to Map input fields with conditions

I am trying to write a Jolt spec where only one of the inputs is mapped to the output.
Any help or suggestions appreciated.
If topicA.owner and topicZ.owner are both present, output owner.name should be mapped to topicZ.owner
If only topicA.owner is present, output owner.name should be mapped to topicA.owner
If only topicZ.owner is present, output owner.name should be mapped to topicZ.owner
Input :
{
"topicA": {
"owner": "topic_a_owner"
},
"topicZ": {
"owner": "topic_z_owner"
}
}
Jolt:
[
{
"spec": {
"*": {
"ta": "@(2,topicA.owner)",
"za": "@(2,topicZ.owner)"
}
},
"operation": "modify-default-beta"
},
{
"operation": "shift",
"spec": {
"topicA": {
"ta": "owner.name"
},
"topicZ": {
"za": "owner.name"
}
}
}
]
Expected Output:
{
"owner" : {
"name" : "topic_z_owner"
}
}
The 3 conditions you have mentioned can be simplified into 2 conditions as below.
If topicZ.owner is present (irrespective of whether topicA.owner is present or not), then output owner.name should be mapped to topicZ.owner. (This merges your 1st and 3rd conditions.)
If only topicA.owner is present, then output owner.name should be mapped to topicA.owner.
So based on this, you can do the following operations.
Use a modify-default-beta operation to copy the value of topicA.owner into the topicZ.owner field when topicZ.owner is not present.
Use a shift operation to map the value of topicZ.owner to the owner.name field in the output.
[
{
"spec": {
"topicZ": {
"owner": "@(2,topicA.owner)"
}
},
"operation": "modify-default-beta"
},
{
"operation": "shift",
"spec": {
"topicZ": {
"owner": "owner.name"
}
}
}
]
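The two simplified conditions amount to "topicZ.owner wins, topicA.owner is the fallback". A minimal plain-Python sketch of that precedence, useful for checking expected outputs, is shown below. The function name owner_name is made up, and returning an empty object when neither owner exists is an assumption of this sketch, since the question does not cover that case.

```python
def owner_name(doc):
    """Prefer topicZ.owner; fall back to topicA.owner when topicZ.owner is missing."""
    z_owner = doc.get("topicZ", {}).get("owner")
    a_owner = doc.get("topicA", {}).get("owner")
    owner = z_owner if z_owner is not None else a_owner
    if owner is None:
        return {}  # neither owner present (case not specified in the question)
    return {"owner": {"name": owner}}


both = {
    "topicA": {"owner": "topic_a_owner"},
    "topicZ": {"owner": "topic_z_owner"},
}
print(owner_name(both))
```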