Apache NiFi transform with JoltTransform processor - json

I have some flowfile attributes, myId and count. Using these attributes, I want to write a Jolt spec that produces the following output.
{
"projectId": "projectId",
"ticketId": "NO_TICKET",
"trigger": "SCHEDULED_BACKLOG",
"timestamp": 1539060316494,
"pivotVersion": 1,
"pivotType": "FlattenedTodoStats",
"todoCount": "todoCount",
"pivots": [
{
"state": "BACKLOG",
"type": "NA"
}
]
}

You can use either a Jolt transform or the ReplaceText processor for this case.
Since the values are already attributes on the flowfile, ReplaceText is the simpler choice.
Set Replacement Value to:
{
"projectId": "${projectId}",
"ticketId": "${ticketId}",
"trigger": "${trigger}",
"timestamp": "${timestamp}",
"pivotVersion":"${pivotVersion}",
"pivotType":"${pivotType}",
"todoCount":"${todoCount}",
"pivots": [
{
"state": "${state}",
"type": "${type}"
}
]
}
Replace the attribute names (${projectId}, etc.) with your own attribute names.
Set Replacement Strategy to Always Replace.
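The ReplaceText approach amounts to filling ${name} placeholders in a fixed JSON template. A minimal Python sketch of that idea follows; it is not NiFi code, and string.Template only mirrors the simplest ${name} form of NiFi Expression Language. The attribute names myId and count come from the question, while their values here are made up for illustration.

```python
import json
from string import Template

# Hypothetical attribute values; only the names myId and count come from the question.
attributes = {"myId": "my-project", "count": "7"}

# Fixed JSON template with ${name} placeholders, playing the role of the Replacement Value.
template = Template("""{
  "projectId": "${myId}",
  "ticketId": "NO_TICKET",
  "trigger": "SCHEDULED_BACKLOG",
  "timestamp": 1539060316494,
  "pivotVersion": 1,
  "pivotType": "FlattenedTodoStats",
  "todoCount": "${count}",
  "pivots": [
    {"state": "BACKLOG", "type": "NA"}
  ]
}""")

rendered = template.substitute(attributes)
message = json.loads(rendered)  # raises if the rendered text is not valid JSON
print(json.dumps(message, indent=2))
```

Parsing the rendered text is a cheap way to check that the template still yields valid JSON after substitution.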
(or)
If you would rather use Jolt for this case, use the default operation to insert your attribute values and build the JSON message.
Example:
Jolt Specification
[
{
"operation": "shift",
"spec": {
"z": "z"
}
},
{
"operation": "default",
"spec": {
"projectId": "${projectId}",
"ticketId": "${ticketId}",
"trigger": "${trigger}",
"timestamp": "${timestamp}",
"pivotVersion": "${pivotVersion}",
"pivotType": "${pivotType}",
"todoCount": "${todoCount}",
"pivots[]": {
"*": {
"state": "${state}",
"type": "${type}"
}
}
}
}
]
As I don't have any attribute values, my output JSON has empty values throughout.
Adjust the Jolt spec to suit your requirements.

Related

How to get column value of key in JOLT

I want to break up the following nested JSON file and transform it into a format ready for SQL.
Input JSON file:
{
"Product1": {
"Purchase": 31
},
"Product2": {
"Purchase": 6213,
"Cancel": 1988,
"Change": 3702,
"Renewal": 5934
}
}
Desired output:
[
{
"product": "Product1",
"Purchase": 31
},
{
"product": "Product2",
"Purchase": 6213,
"Cancel": 1988,
"Change": 3702,
"Renewal": 5934
}
]
What you need is a $ wildcard within a shift transformation spec, which copies the key of the current object into the output, such as
[
{
"operation": "shift",
"spec": {
"*": {
"$": "[#2].product", // $ grabs the key matched one level up (the product name)
"*": "[#2].&" // keeps the remaining attributes, one object per product inside a common array
}
}
}
]
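To make the intended result concrete, here is a small Python sketch of the same reshaping. It is not Jolt, only an illustration of what the shift spec is expected to produce; the function name flatten_products is hypothetical.

```python
def flatten_products(doc):
    """Turn {"Product1": {...}, ...} into [{"product": "Product1", ...}, ...]."""
    # Each top-level key becomes the "product" value; the nested fields are kept as-is.
    return [{"product": name, **fields} for name, fields in doc.items()]


source = {
    "Product1": {"Purchase": 31},
    "Product2": {"Purchase": 6213, "Cancel": 1988, "Change": 3702, "Renewal": 5934},
}
rows = flatten_products(source)
print(rows)
```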

transforming all string attributes which are boolean to true booleans

I thought this would be simple, and perhaps it is, but on my Jolt learning journey I am once again struggling.
I have some JSON files (without a schema), up to about 30 MB in size, with many thousands of string attributes at all levels of the document, some of which (say 20%) hold booleans as strings.
I get that I can write a spec to pick out individual ones and convert them, as per this post: https://stackoverflow.com/questions/64972556/convert-boolean-to-string-for-map-values-in-nifi-jolt
That technique won't work for me, as the nesting and levels are very arbitrary and there are simply way too many attributes.
So how can I apply the data type transform to any attribute that has a boolean represented as a string?
For example, input:
{
"name": "Fred",
"age": 45,
"opentowork" : "true",
"friends" : [
{
"name": "penny",
"closefriend": "false"
},
{
"name": "roger",
"farfriend": "true"
}
]
}
to desired
{
"name": "Fred",
"age": 45,
"opentowork" : true,
"friends" : [
{
"name": "penny",
"closefriend": false
},
{
"name": "roger",
"farfriend": true
}
]
}
I want to pick up the attributes opentowork, closefriend and farfriend without explicitly defining them in the spec. I also need to leave the values of the other attributes as they are (whatever level they are at).
You can use the =toBoolean conversion, handling the friends array separately through the "f*" pattern and everything else through "*", such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=toBoolean",
"f*": {
"*": {
"*": "=toBoolean"
}
}
}
}
]
or, without explicitly naming any attribute, array, or object, you can chain several modify specs, one for each level of nesting you need to cover, such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=toBoolean"
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": "=toBoolean"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": {
"*": "=toBoolean"
}
}
}
}
]
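The chained specs above cover a fixed number of levels. For comparison, the behaviour the question asks for (any depth, other values untouched) can be pinned down with a short recursive sketch in Python. This is not Jolt and not a NiFi processor configuration; the function name strings_to_booleans is hypothetical, and it assumes only the exact strings "true" and "false" should be converted.

```python
def strings_to_booleans(node):
    """Recursively replace the strings "true"/"false" with real booleans."""
    if isinstance(node, dict):
        return {key: strings_to_booleans(value) for key, value in node.items()}
    if isinstance(node, list):
        return [strings_to_booleans(item) for item in node]
    if isinstance(node, str):
        # Assumption: only these two exact spellings count as booleans.
        if node == "true":
            return True
        if node == "false":
            return False
    return node  # numbers, other strings, None and real booleans pass through


person = {
    "name": "Fred",
    "age": 45,
    "opentowork": "true",
    "friends": [
        {"name": "penny", "closefriend": "false"},
        {"name": "roger", "farfriend": "true"},
    ],
}
converted = strings_to_booleans(person)
print(converted)
```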

Match JSON arrays with JOLT

I have JSON from REST API:
{
"fields": [
"advertiser_id",
"campaign_id",
"day"
],
"data": [
[
"8905",
"234870",
"2021-09-28"
],
[
"5634",
"88467870",
"2021-09-28"
]
]
}
I want to match the values inside the fields array with the values inside data. They have the same order. So I expect to get:
[
{
"advertiser_id": "8905",
"campaign_id": "234870",
"day": "2021-09-28"
},
{
"advertiser_id": "5634",
"campaign_id": "88467870",
"day": "2021-09-28"
}
]
Is there any way to do this with JOLT?
You can use a shift transformation spec that
goes 4 levels up (crossing one ":" and three "{") to reach the fields array, using [&1] to pick the fields entry at the same position as the current element of each data sub-array;
distributes the resulting key-value pairs into one object per sub-array through the [&2]. prefix,
such as
[
{
"operation": "shift",
"spec": {
"data": {
"*": {
"*": {
"#": "[&2].#(4,fields[&1])"
}
}
}
}
}
]
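The pairing the spec aims for is a positional zip of fields with each row of data. A minimal Python sketch of that expected result follows; it is not Jolt, and the function name rows_to_records is hypothetical.

```python
def rows_to_records(payload):
    """Pair each row of payload["data"] with the names in payload["fields"], by position."""
    fields = payload["fields"]
    return [dict(zip(fields, row)) for row in payload["data"]]


response = {
    "fields": ["advertiser_id", "campaign_id", "day"],
    "data": [
        ["8905", "234870", "2021-09-28"],
        ["5634", "88467870", "2021-09-28"],
    ],
}
records = rows_to_records(response)
print(records)
```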
The spec can be tried out on the demo site http://jolt-demo.appspot.com/.

convert Boolean to String for map values in nifi jolt

I want to achieve the following JSON transformation using the Jolt processor in NiFi.
The input JSON contains a map (here image is a key and image1.png is a value, and so on) whose values have different types (String, Boolean).
input JSON
{
"internal_value": "434252345",
"settings": {
"image": "image1.png",
"bold": false,
"country": false
}
}
Output JSON should be
{
"internal_value": "434252345",
"settings": {
"image": "image1.png",
"bold": "false",
"country": "false"
}
}
Is there a way to do this using existing Jolt operations?
Thanks.
In pure JOLT this would be:
[
{
"operation": "modify-overwrite-beta",
"spec": {
"settings": {
"bold": "=toString",
"country": "=toString"
}
}
}
]
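For reference, the expected effect on the settings map can be sketched in Python. This is not Jolt; the function name stringify_booleans is hypothetical, and it converts every boolean in the map rather than only the two keys named in the spec. json.dumps is used so that the result is the lowercase JSON spelling ("false") rather than Python's "False".

```python
import json


def stringify_booleans(settings):
    """Return a copy of settings with boolean values replaced by "true"/"false" strings."""
    return {
        key: json.dumps(value) if isinstance(value, bool) else value
        for key, value in settings.items()
    }


document = {
    "internal_value": "434252345",
    "settings": {"image": "image1.png", "bold": False, "country": False},
}
result = {**document, "settings": stringify_booleans(document["settings"])}
print(result)
```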
You can use this tool to prototype Jolt specs:
https://jolt-demo.appspot.com/#inception
Resources:
JOLT transformation for nested JSON?
JOLT change string to float

Jolt spec for conditional presence of field

I have a scenario where I have two very similar input formats, but I need one Jolt spec to process both formats consistently.
This is input style 1:
{
"creationTime": 1503999158000,
"device": {
"ip": "155.157.36.226",
"hostname": "server-123.example.int"
}
}
and this is input style 2:
{
"creationTime": 1503999158000,
"device": {
"ip6": "2001::face",
"hostname": "server-123.example.int"
}
}
The only difference is that style 1 uses device.ip, and style 2 uses device.ip6. There will always be one or neither of those fields, but never both.
I want to simply extract the following:
{
"created_ts": 1503999158000,
"src_ip_addr": "....."
}
I need src_ip_addr to be set to whichever field was present out of ip and ip6. If neither field was present in the source data, the value should default to null.
Is this possible with a single Jolt spec?
A single spec with two operations.
Spec
[
{
"operation": "shift",
"spec": {
"creationTime": "created_ts",
"device": {
// map ip or ip6 to src_ip_addr
"ip|ip6": "src_ip_addr"
}
}
},
{
"operation": "default",
"spec": {
// if src_ip_addr does not exist, then apply a default of null
"src_ip_addr": null
}
}
]
I tried out the following and it worked for my requirements:
[
{
"operation": "shift",
"spec": {
"creationTime": "created_ts",
"device": {
// map both to src_ip_addr, whichever one is present will be used
"ip": "src_ip_addr",
"ip6": "src_ip_addr"
}
}
},
{
"operation": "default",
"spec": {
"src_ip_addr": null
}
}
]
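The logic both specs express is "take ip if present, else ip6, else null". A minimal Python sketch of that rule follows, useful for checking expected outputs against the three cases. It is not Jolt; the function name extract is hypothetical, and it assumes creationTime and device are always present, as in both input styles.

```python
def extract(event):
    """Map creationTime to created_ts and ip or ip6 to src_ip_addr (None if neither exists)."""
    device = event["device"]
    # The question states that ip and ip6 never appear together.
    src_ip = device.get("ip", device.get("ip6"))
    return {"created_ts": event["creationTime"], "src_ip_addr": src_ip}


style1 = {"creationTime": 1503999158000,
          "device": {"ip": "155.157.36.226", "hostname": "server-123.example.int"}}
style2 = {"creationTime": 1503999158000,
          "device": {"ip6": "2001::face", "hostname": "server-123.example.int"}}
neither = {"creationTime": 1503999158000,
           "device": {"hostname": "server-123.example.int"}}

for sample in (style1, style2, neither):
    print(extract(sample))
```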