Splitting records in Apache NiFi - json

I am working with a Jolt transform processor in Apache NiFi and I am facing some issues. Please help me out.
Input:
{
"resourceid": "d6315d4d7f0c",
"timestamp": [
166406,
166404,
166504
],
"Key": [
"mem",
"net",
"diskspace"
],
"data": [
89,
90,
91
]
}
Expected output:
[
{
"resourceid": "d6315d4d7f0c",
"timestamp": 166406,
"Key": "mem",
"data": 89
},
{
"resourceid": "d6315d4d7f0c",
"timestamp": 166404,
"Key": "net",
"data": 90
},
{
"resourceid": "d6315d4d7f0c",
"timestamp": 166504,
"Key": "diskspace",
"data": 91
}
]

You can loop through the indexes of one of the arrays (I've chosen timestamp in this case) within a single shift transformation such as:
[
{
"operation": "shift",
"spec": {
"timestamp": {
"*": {
"@(2,resourceid)": "[&].resourceid",
"@": "[&].&2", // "[&]" stands for the indexes 0, 1, 2, ... and places each result inside an array element
"@(2,Key[&])": "[&].Key", // go two levels up the tree to reach the level of "Key", and grab its value at the current index
"@(2,data[&])": "[&].data"
}
}
}
}
]
You can try this spec on the demo site http://jolt-demo.appspot.com/.
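For readers who find the index references hard to follow, the effect of the spec can be mirrored in plain Python. This is only a sketch of the resulting data shape, not of how Jolt evaluates the spec; the function name `split_records` is made up for illustration, and it assumes that all arrays have the same length, as in the question's input.

```python
def split_records(record):
    """Pair up the i-th element of each array and repeat the scalar fields."""
    # Assumption: every array-valued field is as long as "timestamp".
    length = len(record["timestamp"])
    rows = []
    for i in range(length):
        row = {}
        for key, value in record.items():
            # Arrays contribute their i-th element; scalars are copied as-is.
            row[key] = value[i] if isinstance(value, list) else value
        rows.append(row)
    return rows


record = {
    "resourceid": "d6315d4d7f0c",
    "timestamp": [166406, 166404, 166504],
    "Key": ["mem", "net", "diskspace"],
    "data": [89, 90, 91],
}
result = split_records(record)
```

The loop variable `i` plays the role that `[&]` plays in the spec: it selects the same position in every array and decides which output element receives the value.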

Related

How to dynamically add key and values from one object into another object in an array via jolt

I'm using jolt and I have an input object where I would like to take the keys out of one property and insert them into each object of an array in another property dynamically:
My input:
{
"data": {
"NAN_KEY": 1,
"TEMP": 3
},
"attributes": [
{
"name": "attribute1",
"value": 3
},
{
"name": "attribute2",
"value": 2
}
]
}
The result I'm aiming for:
"attributes": [
{
"name": "attribute1",
"value": 3,
"NAN_KEY": 1,
"TEMP": 3
},
{
"name": "attribute2",
"value": 2,
"NAN_KEY": 1,
"TEMP": 3
}
]
This question was posted previously in this thread
But after using the solution, I realized I needed it to dynamically add the entire object instead of hardcoding the fields
Any help is appreciated!
Yes, it's possible to make it more dynamic, such as:
[
{
"operation": "shift",
"spec": {
"attributes": {
"*": {
"@2,data": { "*": "[&1].&" }, // go two levels up the tree to grab the key-value pairs of the "data" object
"*": "[&1].&"
}
}
}
}
]
You can try this spec on the demo site http://jolt-demo.appspot.com/.
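As a cross-check of the intended result, the same merge can be written in a few lines of Python. This is a sketch of the data transformation only, not of Jolt's evaluation; `merge_data_into_attributes` is a hypothetical helper name.

```python
def merge_data_into_attributes(doc):
    """Copy every key of doc["data"] into each object of doc["attributes"]."""
    extra = doc.get("data", {})
    # Attribute fields come first, then the shared "data" fields.
    # On a name clash the "data" value wins here, which is an
    # arbitrary choice made for this sketch.
    return {"attributes": [{**attr, **extra} for attr in doc["attributes"]]}


doc = {
    "data": {"NAN_KEY": 1, "TEMP": 3},
    "attributes": [
        {"name": "attribute1", "value": 3},
        {"name": "attribute2", "value": 2},
    ],
}
result = merge_data_into_attributes(doc)
```

Nothing in the helper names `NAN_KEY` or `TEMP`, which is the point of the dynamic version: whatever keys appear under "data" are carried into every attribute.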

Transform Dictionary to Object with JOLT

I want to use a JOLT transform to convert a dictionary to a JSON object, as the example below shows.
The root is an array that contains the results from different computers: "Computer1", "Computer2", and more.
This structure should stay as it is. I replaced the contents of the second and later array elements, which repeat in the same way, with {...}.
Given Object:
[
{
"name": "Computer1",
"events": {
"counts": [
{
"countType": "CRITICAL",
"count": 5
},
{
"countType": "HIGH",
"count": 12
},
{
"countType": "LOW",
"count": 40
}
]
},
"processes": {
"counts": [
{
"countType": "CRITICAL",
"count": 0
},
{
"countType": "HIGH",
"count": 2
},
{
"countType": "LOW",
"count": 80
}
]
}
},
{
"name": "Computer2",
"events": {...},...
}
]
Desired Output:
[
{
"name": "Computer1",
"events": {
"CRITICAL": 5,
"HIGH": 12,
"LOW": 40
},
"processes": {
"CRITICAL": 0,
"HIGH": 2,
"LOW": 80
}
}
, {
"name": "Computer2",
"events": {...},
...
}
]
Please help to identify the right JOLT spec.
Thanks in advance
Marcus
You can use shift transformation specs such as:
[
{
"operation": "shift",
"spec": {
"*": {
"name": "&1.&",
"events|processes": { // pipe represents "OR" logic
"counts": {
"*": {
"@count": "&4.&3.@countType" // "&4" and "&3" go four and three levels up the tree to grab the index of the outermost list and the key name of the object ("events" or "processes"), respectively
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": ""
}
}
]
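To make the target shape concrete, here is a plain-Python sketch of the same reshaping: each "counts" list becomes an object keyed by "countType". It illustrates the result only and is not how Jolt processes the spec; `flatten_counts` and its `sections` parameter are names invented for this example.

```python
def flatten_counts(computers, sections=("events", "processes")):
    """Turn each section's "counts" list into a {countType: count} object."""
    out = []
    for computer in computers:
        item = {"name": computer["name"]}
        for section in sections:
            if section in computer:
                counts = computer[section]["counts"]
                item[section] = {c["countType"]: c["count"] for c in counts}
        out.append(item)
    return out


computers = [
    {
        "name": "Computer1",
        "events": {"counts": [
            {"countType": "CRITICAL", "count": 5},
            {"countType": "HIGH", "count": 12},
            {"countType": "LOW", "count": 40},
        ]},
        "processes": {"counts": [
            {"countType": "CRITICAL", "count": 0},
            {"countType": "HIGH", "count": 2},
            {"countType": "LOW", "count": 80},
        ]},
    }
]
result = flatten_counts(computers)
```

The `sections` tuple corresponds to the `"events|processes"` key in the spec: both sections are handled by the same rule.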

JOLT specification to shift columns to the end of JSON

I am trying to write a spec to shift a few keys in a JSON object to the very end of the object.
{
"id": "12345",
"timestamp": "2019-10-28 13:24:44.547",
"action": "notify",
"name": "test"
}
to:
{
"id": "12345",
"action": "notify",
"name": "test",
"timestamp": "2019-10-28 13:24:44.547"
}
Appreciate any leads on how to go about doing this using JOLT.
I believe this is an answer, but I'm not sure that Jolt preserves key order.
[
{
"operation": "shift",
"spec": {
"id": "id",
"action": "action",
"name": "name",
"timestamp": "timestamp"
}
}
]
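The doubt about ordering is reasonable: the JSON specification describes an object as an unordered collection of name/value pairs, so a consumer should not depend on key position. If the order matters only for display, it can also be fixed after the transform. Below is a minimal Python sketch that relies on Python dicts keeping insertion order (Python 3.7+); `move_keys_to_end` is a hypothetical helper, not part of Jolt or NiFi.

```python
import json


def move_keys_to_end(obj, keys):
    """Return a copy of obj with the given keys moved to the end, in order."""
    front = {k: v for k, v in obj.items() if k not in keys}
    back = {k: obj[k] for k in keys if k in obj}
    return {**front, **back}


record = {
    "id": "12345",
    "timestamp": "2019-10-28 13:24:44.547",
    "action": "notify",
    "name": "test",
}
reordered = move_keys_to_end(record, ["timestamp"])
print(json.dumps(reordered, indent=2))
```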

Shift JOLT transformation - facing problem with below transformation

I'm trying to flatten the necessary column names and their values in the input JSON below while retaining all metadata.
Below is the input JSON that I have for my CDC use case.
{
"type": "update",
"timestamp": 1558346256000,
"binlog_filename": "mysql-bin-changelog.000889",
"binlog_position": 635,
"database": "books",
"table_name": "publishers",
"table_id": 111,
"columns": [
{
"id": 1,
"name": "id",
"column_type": 4,
"last_value": 2,
"value": 2
},
{
"id": 2,
"name": "name",
"column_type": 12,
"last_value": "Suresh",
"value": "Suresh123"
},
{
"id": 3,
"name": "email",
"column_type": 12,
"last_value": "Suresh#yahoo.com",
"value": "Suresh#yahoo.com"
}
]
}
Below is the expected output json
[
{
"type": "update",
"timestamp": 1558346256000,
"binlog_filename": "mysql-bin-changelog.000889",
"binlog_position": 635,
"database": "books",
"table_name": "publishers",
"table_id": 111,
"columns": {
"id": "2",
"name": "Suresh123",
"email": "Suresh#yahoo.com"
}
}
]
I tried the below spec from which I'm able to retrieve columns object but not the rest of the metadata.
[
{
"operation": "shift",
"spec": {
"columns": {
"*": {
"@(value)": "[#1].@(1,name)"
}
}
}
}
]
Any leads would be very much appreciated.
I found a JOLT spec for the above transformation. I'm posting it here in case anyone stumbles upon something like this.
[
{
"operation": "shift",
"spec": {
"columns": {
"*": {
"@(value)": "columns.@(1,name)"
}
},
"*": "&"
}
}
]
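For comparison, the same idea reads as follows in plain Python: copy every top-level field except "columns", then rebuild "columns" as a name-to-value object. This is a sketch of the data shape under the assumption that each column has a "name" and a "value"; `flatten_columns` is a made-up name. It returns a single object and keeps the original value types, so it does not add the outer array or turn 2 into "2" as shown in the expected output.

```python
def flatten_columns(event):
    """Keep the metadata and collapse "columns" into {name: value}."""
    out = {k: v for k, v in event.items() if k != "columns"}
    out["columns"] = {col["name"]: col["value"] for col in event["columns"]}
    return out


event = {
    "type": "update",
    "table_name": "publishers",
    "table_id": 111,
    "columns": [
        {"id": 1, "name": "id", "column_type": 4, "last_value": 2, "value": 2},
        {"id": 2, "name": "name", "column_type": 12,
         "last_value": "Suresh", "value": "Suresh123"},
    ],
}
result = flatten_columns(event)
```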

Unable to form the JOLT schema to transform JSON in NiFi

I am trying to use the jolt JSON to JSON transformation in Apache NiFi. I want to transform one JSON into another format.
Here is my original JSON:
{
"total_rows": 5884,
"offset": 0,
"rows": [
{
"id": "03888c0ab40c32451a018be6b409eba3",
"key": "03888c0ab40c32451a018be6b409eba3",
"value": {
"rev": "1-d5cc089dd8682422962ccab4f24bd21b"
},
"doc": {
"_id": "03888c0ab40c32451a018be6b409eba3",
"_rev": "1-d5cc089dd8682422962ccab4f24bd21b",
"topic": "iot-2/type/home-iot/id/1234/evt/temp/fmt/json",
"payload": {
"temperature": 36
},
"deviceId": "1234",
"deviceType": "home-iot",
"eventType": "temp",
"format": "json"
}
},
{
"id": "03888c0ab40c32451a018be6b409f163",
"key": "03888c0ab40c32451a018be6b409f163",
"value": {
"rev": "1-dee82cbb1b5ffa8a5e974135eb6340c5"
},
"doc": {
"_id": "03888c0ab40c32451a018be6b409f163",
"_rev": "1-dee82cbb1b5ffa8a5e974135eb6340c5",
"topic": "iot-2/type/home-iot/id/1234/evt/temp/fmt/json",
"payload": {
"temperature": 22
},
"deviceId": "1234",
"deviceType": "home-iot",
"eventType": "temp",
"format": "json"
}
}
]
}
I want this to be transformed in the following JSON:
[
{
"temperature":36,
"deviceId":"1234",
"deviceType":"home-iot",
"eventType":"temp"
},
{
"temperature":22,
"deviceId":"1234",
"deviceType":"home-iot",
"eventType":"temp"
}
]
This is what my spec looks like:
[
{
"operation": "shift",
"spec": {
"rows": {
"*": {
"doc": {
"deviceId": "[&1].deviceId",
"deviceType": "[&1].deviceType",
"eventType": "[&1].eventType",
"payload": {
"*": "[&1]"
}
}
}
}
}
}
]
I keep getting a null response. I am new to this and the documentation is not very easy to comprehend. Can somebody please help?
Because you are "down" one more level after the array index, by the time you get to deviceId you are 2 levels away from the index. Replace all the &1s with &2 except for payload. In that case you are another level "down" so you'll want to use &3 for the index. You also need to take whatever is matched by the * (temperature, e.g.) and set the outgoing field name to the same thing, by using & after the array index. Here's the resulting spec:
[
{
"operation": "shift",
"spec": {
"rows": {
"*": {
"doc": {
"deviceId": "[&2].deviceId",
"deviceType": "[&2].deviceType",
"eventType": "[&2].eventType",
"payload": {
"*": "[&3].&"
}
}
}
}
}
}
]
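The level counting may be easier to see next to an equivalent loop. The Python sketch below is only an analogy for the corrected spec, with the hypothetical name `extract_readings`: the array index is fixed by the outer loop, "doc" sits one level below it, and "payload" one level below that, which is why the spec has to climb further up from inside "payload".

```python
def extract_readings(doc):
    """Build one flat object per row from doc["rows"][i]["doc"]."""
    readings = []
    for row in doc["rows"]:               # the array index level
        inner = row["doc"]                # one level below the index
        reading = dict(inner["payload"])  # two levels below: copy every payload field
        for key in ("deviceId", "deviceType", "eventType"):
            reading[key] = inner[key]
        readings.append(reading)
    return readings


doc = {
    "total_rows": 5884,
    "offset": 0,
    "rows": [
        {"doc": {"payload": {"temperature": 36}, "deviceId": "1234",
                 "deviceType": "home-iot", "eventType": "temp", "format": "json"}},
        {"doc": {"payload": {"temperature": 22}, "deviceId": "1234",
                 "deviceType": "home-iot", "eventType": "temp", "format": "json"}},
    ],
}
result = extract_readings(doc)
```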