Jolt transform spec - json

I'm building an ETL flow in NiFi. Mongo sends JSON messages to Kafka with a structure like this:
{
"regionPriceEvent": {
"42": {
"type": "ACTIVATION",
"date": "2022-07-02T18:24:50.719Z"
},
"55": {
"type": "ACTIVATION",
"date": "2022-07-02T18:24:50.719Z"
}
},
"visibilityInRegions": [
{
"regionId": "42",
"visibility": "true"
},
{
"regionId": "66",
"visibility": "true"
}
]
}
I need to transform it into a structure like the one below, but I don't know how to do that. It looks like a full join in SQL:
{
"regionPriceEvent": [
{
"regionId": "42",
"type": "ACTIVATION",
"date": "2022-07-02T18:24:50.719Z",
"visibility": "true"
},
{
"regionId": "55",
"type": "ACTIVATION",
"date": "2022-07-02T18:24:50.719Z",
"visibility": ""
},
{
"regionId": "66",
"type": "",
"date": "",
"visibility": "true"
}
]
}
Is this possible, or am I just wasting my time?
Here is my spec:
[
{
"operation": "shift",
"spec": {
"visibilityInRegions": {
"*": {
"#": "visibilityInRegions.#regionId"
}
},
"*": "&"
}
},
{
"operation": "remove",
"spec": {
"visibilityInRegions": {
"*": {
"regionId": ""
}
}
}
},
{
"operation": "shift",
"spec": {
"regionPriceEvent": {
"*": "&"
},
"visibilityInRegions": {
"*": "&"
}
}
}
]
Here is the result:
{
"42": [
{
"type": "ACTIVATION",
"date": "2022-07-02T18:24:50.719Z"
},
{
"visibility": "true"
}
],
"55": {
"type": "ACTIVATION",
"date": "2022-07-02T18:24:50.719Z"
},
"66": {
"visibility": "true"
}
}

Yes, it's possible, with a spec such as
[
{
// Collect all attributes under common "regionId" values
"operation": "shift",
"spec": {
"reg*": {
"#": "&"
},
"*": {
"*": {
"*": "regionPriceEvent.#(1,regionId).&"
}
}
}
},
{
"operation": "default",
"spec": {
"*": {
"*": {
// Default all "subobjects" to have these keys
"type": "",
"date": "",
"visibility": ""
}
}
}
},
{
// complete the missing attributes called "regionId"
"operation": "shift",
"spec": {
"*": {
"*": {
"$": "&2.&1.regionId"
},
"#": "&"
}
}
},
{
// get rid of the labels of the objects
"operation": "shift",
"spec": {
"*": {
"*": ""
}
}
},
{
// pick only single one of the repeating "regionId" values
"operation": "cardinality",
"spec": {
"*": {
"*": "ONE"
}
}
}
]
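To check the expected result outside NiFi, the full-join logic that the spec implements can be sketched in plain Python. This is only an illustration of the intended semantics, not part of Jolt; the function name full_join_regions is made up for this example, and the sample message is the one from the question.

```python
def full_join_regions(message):
    """Full outer join of regionPriceEvent (keyed by region id) and visibilityInRegions (a list)."""
    merged = {}
    # Left side: objects keyed by region id.
    for region_id, event in message.get("regionPriceEvent", {}).items():
        row = merged.setdefault(region_id, {"regionId": region_id})
        row.update(event)
    # Right side: list entries that carry their own regionId.
    for entry in message.get("visibilityInRegions", []):
        row = merged.setdefault(entry["regionId"], {"regionId": entry["regionId"]})
        row["visibility"] = entry["visibility"]
    # Fill the attributes missing on either side with empty strings.
    rows = [
        {
            "regionId": row["regionId"],
            "type": row.get("type", ""),
            "date": row.get("date", ""),
            "visibility": row.get("visibility", ""),
        }
        for row in merged.values()
    ]
    return {"regionPriceEvent": rows}


message = {
    "regionPriceEvent": {
        "42": {"type": "ACTIVATION", "date": "2022-07-02T18:24:50.719Z"},
        "55": {"type": "ACTIVATION", "date": "2022-07-02T18:24:50.719Z"},
    },
    "visibilityInRegions": [
        {"regionId": "42", "visibility": "true"},
        {"regionId": "66", "visibility": "true"},
    ],
}
result = full_join_regions(message)
```

Region 42 appears on both sides and gets all attributes, while 55 and 66 each appear on one side only and get empty strings for the rest.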
The spec can be tried out on the demo site http://jolt-demo.appspot.com/.

Related

Transform complex data by jolt processor in apache nifi

I have the following input and expected output, and I want to transform the data with the Jolt processor.
Input JSON
{
"subscriptionId": "63",
"data": [
{
"type": "demo",
"Data": {
"transactionId": "598958",
"type": "json",
"xyz": "pqr",
"name": "john"
},
"nameData": [
{
"name": "Ama",
"nameCount": "215",
"genderData": [
{
"gender": "Male",
"genderCount": "140"
},
{
"gender": "Female",
"genderCount": "75"
}
]
},
{
"name": "Aedes",
"nameCount": "161",
"genderData": [
{
"gender": "Female",
"genderCount": "134"
},
{
"gender": "Male",
"genderCount": "27"
}
]
},
{
"name": "Culex",
"nameCount": "2610",
"genderData": [
{
"gender": "Male",
"genderCount": "1926"
},
{
"gender": "Female",
"genderCount": "684"
}
]
},
{
"name": "Kamp",
"nameCount": "465",
"genderData": [
{
"gender": "Male",
"genderCount": "465"
}
]
}
]
}
]
}
Expected Output JSON
{
"transactionId": "598958",
"type": "json",
"xyz": "pqr",
"name": "john",
"alert_array": [{
"abc": "123",
"xyz": "pqrs",
"properties": [{
"key": "Ama",
"value": "215",
"object": [{
"key": "Male",
"value": "140"
},
{
"key": "Female",
"value": "75"
}
]
},
{
"key": "Aedes",
"value": "161",
"object": [{
"key": "Male",
"value": "134"
},
{
"key": "Female",
"value": "27"
}
]
},
{
"key": "Culex",
"value": "2610",
"object": [{
"key": "Male",
"value": "1926"
},
{
"key": "Female",
"value": "684"
}
]
},
{
"key": "Kamp",
"value": "465",
"object": [{
"key": "Male",
"value": "465"
}]
},
{
"key": "type",
"value": "demo"
}
]
}]
}
The input JSON above has nameData as an array, with another array, genderData, nested inside it.
I need to convert the input JSON to the expected output JSON shown above.
Please suggest a JOLT spec to transform the above JSON.
You can start by defining the nodes alert_array, properties and object while diving down to the innermost attributes within the shift transformation, such as
[
{
"operation": "shift",
"spec": {
"data": {
"*": {
"Da*": "",
"nameD*": {
"#123": "alert_array[0].abc", // # wildcard is used to add fixed values
"#pqrs": "alert_array[0].xyz",
"*": {
"name": "alert_array[0].properties[&1].key",
"nameC*": "alert_array[0].properties[&1].value",
"*": "alert_array[0].properties[&1].object"
}
}
}
}
}
},
{
// rename the innermost attributes
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": {
"*": {
"*": {
"*": {
"*": {
"key": "#(1,gender)",
"value": "#(1,genderCount)"
}
}
}
}
}
}
}
},
{
// get rid of the former names of the innermost attributes
"operation": "remove",
"spec": {
"*": {
"*": {
"*": {
"*": {
"*": {
"*": {
"gender": "",
"genderCount": ""
}
}
}
}
}
}
}
}
]
The spec can be tried out on the demo site http://jolt-demo.appspot.com/.
Edit: You can add two more shift transformation specs in order to append the additionally requested object
{
"key" : "type",
"value" : "demo"
}
such as
[
{
"operation": "shift",
"spec": {
"data": {
"*": {
"Da*": "",
"nameD*": {
"#123": "alert_array[0].abc", // # wildcard is used to add fixed values
"#pqrs": "alert_array[0].xyz",
"*": {
"name": "alert_array[0].properties[&1].key",
"nameC*": "alert_array[0].properties[&1].value",
"*": "alert_array[0].properties[&1].object"
}
}
}
}
}
},
{
// rename the innermost attributes
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": {
"*": {
"*": {
"*": {
"*": {
"key": "#(1,gender)",
"value": "#(1,genderCount)"
}
}
}
}
}
}
}
},
{
// get rid of the former names of the innermost attributes
"operation": "remove",
"spec": {
"*": {
"*": {
"*": {
"*": {
"*": {
"*": {
"gender": "",
"genderCount": ""
}
}
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"alert_array": {
"*": {
"*": "&2[&1].&",
"pro*": {
"#type": "&3[&2].&1.&2.key",
"#demo": "&3[&2].&1.&2.value",
"*": "&3[&2].&1.&"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"alert_array": {
"*": {
"*": "&2[&1].&",
"pro*": {
"*": {
"#": "&4[&3].&2"
}
}
}
}
}
}
]
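As a cross-check of what the spec is meant to produce, here is a minimal Python sketch of the same reshaping. It is not Jolt and makes a few assumptions: the function name to_alert is invented for this example, only the first element of data is processed, and the trailing type entry is read from the record instead of being hard-coded as the #type and #demo literals are in the spec.

```python
def to_alert(doc):
    """Reshape the first 'data' record into the alert_array layout from the question."""
    record = doc["data"][0]
    properties = [
        {
            "key": item["name"],
            "value": item["nameCount"],
            # genderData entries become key/value pairs under "object".
            "object": [
                {"key": g["gender"], "value": g["genderCount"]}
                for g in item["genderData"]
            ],
        }
        for item in record["nameData"]
    ]
    # The extra entry requested in the edit: the record type as the last property.
    properties.append({"key": "type", "value": record["type"]})
    out = dict(record["Data"])  # Data attributes move to the top level
    out["alert_array"] = [{"abc": "123", "xyz": "pqrs", "properties": properties}]
    return out


doc = {
    "subscriptionId": "63",
    "data": [
        {
            "type": "demo",
            "Data": {"transactionId": "598958", "type": "json", "xyz": "pqr", "name": "john"},
            "nameData": [
                {
                    "name": "Kamp",
                    "nameCount": "465",
                    "genderData": [{"gender": "Male", "genderCount": "465"}],
                }
            ],
        }
    ],
}
alert = to_alert(doc)
```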

How to write JOLT Spec for nested arrays

I am trying to transform a JSON document using JOLT. The JSON consists of nested arrays and I am not able to transform it correctly. Can someone please help? Thanks.
{
"root": [
{
"id": "1234",
"password": "key1234",
"devices": [
{
"details": {
"deviceType": "tv-iot",
"deviceId": "tv-iot-111"
}
},
{
"details": {
"deviceType": "machine-iot",
"deviceId": "machine-iot-999"
}
}
]
},
{
"id": "6789",
"password": "key6789",
"devices": [
{
"details": {
"deviceType": "phone-iot",
"deviceId": "phone-iot-111"
}
},
{
"details": {
"deviceType": "mobile-iot",
"deviceId": "mobile-iot-999"
}
}
]
}
]
}
This is the spec that I have written.
[
{
"operation": "shift",
"spec": {
"root": {
"*": {
"id": "[&1].userid",
"password": "[&1].pwd",
"devices": {
"*": {
"details": {
"deviceType": "[&2].deviceCategory",
"deviceId": "[&2].deviceUniqueValue"
}
}
}
}
}
}
}
]
The expected JSON that I am looking for is:
[
{
"userid": "1234",
"pwd": "key1234",
"devices": [
{
"details": {
"deviceCategory": "tv-iot",
"deviceUniqueValue": "tv-iot-111"
}
},
{
"details": {
"deviceCategory": "machine-iot",
"deviceUniqueValue": "machine-iot-999"
}
}
]
},
{
"userid": "6789",
"pwd": "key6789",
"devices": [
{
"details": {
"deviceCategory": "phone-iot",
"deviceUniqueValue": "phone-iot-111"
}
},
{
"details": {
"deviceCategory": "mobile-iot",
"deviceUniqueValue": "mobile-iot-999"
}
}
]
}
]
However, I get the wrong output below. Somehow, my nested objects are being transformed into lists.
[
{
"userid" : "1234",
"pwd" : "key1234",
"deviceCategory" : [ "tv-iot", "phone-iot" ],
"deviceUniqueValue" : [ "tv-iot-111", "phone-iot-111" ]
},
{
"deviceCategory" : [ "machine-iot", "mobile-iot" ],
"deviceUniqueValue" : [ "machine-iot-999", "mobile-iot-999" ],
"userid" : "6789",
"pwd" : "key6789"
}
]
I am unable to figure out what is wrong. Can someone please help?
UPDATE (Solution): I was able to come up with a shorter spec that works as well!
[
{
"operation": "shift",
"spec": {
"root": {
"*": {
"id": "[&1].userId",
"password": "[&1].pwd",
"*": "[&1].&"
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"devices": {
"*": {
"details": {
"deviceType": "[&4].&3.[&2].&1.deviceCategory",
"deviceId": "[&4].&3.[&2].&1.deviceUniqueVal"
}
}
},
"*": "[&1].&"
}
}
}
]
You can start by diving down to the innermost object while partitioning the sub-objects by their id values through a shift transformation, such as
[
{
"operation": "shift",
"spec": {
"root": {
"*": {
"devices": {
"*": {
"details": {
"*": {
"#(4,id)": "#(5,id).userid",
"#(4,password)": "#(5,id).pwd",
"#": "#(5,id).devicedetails[&3].&2.&1"
}
}
}
}
}
}
}
},
{
// get rid of top level object names
"operation": "shift",
"spec": {
"*": ""
}
},
{
// get rid of the repeating components of each array
"operation": "cardinality",
"spec": {
"*": {
"us*": "ONE",
"pwd": "ONE"
}
}
},
{
// determine new key names for attributes respectively
"operation": "modify-overwrite-beta",
"spec": {
"*": {
"*": {
"*": {
"*": {
"deviceCategory": "=(#(1,deviceType))",
"deviceUniqueValue": "=(#(1,deviceId))"
}
}
}
}
}
},
{
// get rid of extra elements generated
"operation": "remove",
"spec": {
"*": {
"*": {
"*": {
"*": {
"deviceType": "",
"deviceId": ""
}
}
}
}
}
}
]
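For comparison, the target structure from the question can be described in a few lines of plain Python. This is a sketch of the expected result only, not of how Jolt evaluates the specs; rename_devices is a hypothetical helper name and the input is trimmed to one device per user.

```python
def rename_devices(doc):
    """Rename id/password and the device detail keys while keeping the nesting intact."""
    users = []
    for user in doc["root"]:
        users.append({
            "userid": user["id"],
            "pwd": user["password"],
            # Each user keeps its own devices list; only the inner keys are renamed.
            "devices": [
                {
                    "details": {
                        "deviceCategory": device["details"]["deviceType"],
                        "deviceUniqueValue": device["details"]["deviceId"],
                    }
                }
                for device in user["devices"]
            ],
        })
    return users


doc = {
    "root": [
        {
            "id": "1234",
            "password": "key1234",
            "devices": [{"details": {"deviceType": "tv-iot", "deviceId": "tv-iot-111"}}],
        },
        {
            "id": "6789",
            "password": "key6789",
            "devices": [{"details": {"deviceType": "phone-iot", "deviceId": "phone-iot-111"}}],
        },
    ]
}
users = rename_devices(doc)
```

The point to carry over to the Jolt spec is that the devices of a user are grouped by the index of the user in root, not by the index of the device.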

Update json attribute with a condition in jolt

I have a problem: I don't understand how to update an attribute conditionally in Jolt. For example, I have an Object with an inner array of Items. I need to update an Item attribute if another Item attribute equals a given value, and then return the Object.
Input:
{
"object": {
"id": "3cf1543e-be4d-11eb-84c0-87ba01ce01e0",
"a": "abc",
"del_sign": false,
"items": [
{
"id": "111",
"del_sign": false
},
{
"id": "222",
"del_sign": false
},
{
"id": "333",
"del_sign": false
}
],
"b": [],
"c": []
}
}
I need:
{
"object": {
"id": "3cf1543e-be4d-11eb-84c0-87ba01ce01e0",
"a": "abc",
"del_sign": false,
"items": [
{
"id": "111",
// here changes to true
"del_sign": true
},
{
"id": "222",
"del_sign": false
},
{
"id": "333",
"del_sign": false
}
],
"b": [],
"c": []
}
}
My current jolt spec:
[
{
"operation": "shift",
"spec": {
"object": {
"items": {
"*": {
"id": {
"111": {
"#2": {
"#true": "del_sign",
"$1": "&3"
}
}
}
}
}
}
}
}
]
You can use two shift transformations.
In the first step, set the related del_sign value conditionally, which yields two arrays (id and del_sign) within the same object (items); in the second step, format them properly, such as
[
{
"operation": "shift",
"spec": {
"*": {
"*": "&",
"items": {
"*": {
"id": {
"111": {
"#(2,&1)": "&4.&3.id",
"#true": "&4.&3.del_sign"
},
"*": {
"#1": "&4.&3.&2",
"#(2,del_sign)": "&4.&3.del_sign"
}
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"items": {
"*": {
"*": "&2.[#2].&"
}
}
}
}
]
Edit: If there are more attributes than the current ones (id, del_sign), prefer the following spec so that you don't have to repeat each key individually:
[
{
"operation": "shift",
"spec": {
"*": {
"*": "&",
"items": {
"*": {
"id": {
"111": {
"#2": {
"#true": "&5.&2.del_sign",
"*": "&5.&1.&"
}
},
"*": {
"#2": "&4.&3"
}
}
}
}
}
}
},
{
"operation": "cardinality",
"spec": {
"items": {
"*": {
"del_sign": "ONE"
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"items": {
"*": {
"*": "&2.[#2].&"
}
}
}
}
]
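The condition itself is simple once written imperatively. The following Python sketch shows the intended outcome, assuming the id to match is "111" as in the question; mark_deleted and its target_id parameter are names chosen for this example only.

```python
import copy


def mark_deleted(doc, target_id="111"):
    """Return a copy of doc where del_sign is True for the item whose id equals target_id."""
    result = copy.deepcopy(doc)  # leave the input untouched
    for item in result["object"]["items"]:
        if item["id"] == target_id:
            item["del_sign"] = True
    return result


doc = {
    "object": {
        "id": "3cf1543e-be4d-11eb-84c0-87ba01ce01e0",
        "a": "abc",
        "del_sign": False,
        "items": [
            {"id": "111", "del_sign": False},
            {"id": "222", "del_sign": False},
            {"id": "333", "del_sign": False},
        ],
        "b": [],
        "c": [],
    }
}
updated = mark_deleted(doc)
```

All other attributes of the object and of the items pass through unchanged, which is what the "*" branches in the specs take care of.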

Need Date Field Transformation in JSON Using JOLT (Convert Date "1118083350" into Date: 18/11/2020 and Time: 083350)

Below is my input:
[
{
"corrId": "ed1e30",
"payloadFormat": "CASH",
"payload": {
"DateTime": "1118083350"
}
},
{
"correlationId": "ed1e30c",
"payloadFormat": "CREDIT",
"payload": {
"DateTime": "1119092545"
}
}
]
Expected output should be
[
{
"correlationId": "ed1e30",
"payloadFormat": "CASH",
"Date": "18/11/2020",
"Time": "083350"
},
{
"correlationId": "ed1e30c",
"payloadFormat": "CREDIT",
"Date": "19/11/2020",
"Time": "092545"
}
]
Jolt doesn't have date utilities, but this can be done with modify-default-beta operations:
[
{
"operation": "modify-default-beta",
"spec": {
"*": {
"Time": "=substring(#(1,payload.DateTime),4,10)",
"month": "=substring(#(1,payload.DateTime),0,2)",
"day": "=substring(#(1,payload.DateTime),2,4)"
}
}
},
{
"operation": "modify-default-beta",
"spec": {
"*": {
"Date": "=concat(#(1,day),'/',#(1,month),'/2020')"
}
}
},
{
"operation": "remove",
"spec": {
"*": {
"payload": "",
"month": "",
"day": "",
"DateTime": ""
}
}
}
]
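The substring offsets used above are easier to verify in a small Python sketch. It assumes, as the spec does, that DateTime is laid out as MMDDhhmmss and that the year 2020 is a fixed literal, since the input carries no year; split_date_time and its year parameter are illustrative names.

```python
def split_date_time(record, year="2020"):
    """Split an MMDDhhmmss string into Date (DD/MM/YYYY) and Time (hhmmss)."""
    stamp = record["payload"]["DateTime"]
    month = stamp[0:2]  # same range as substring(..., 0, 2)
    day = stamp[2:4]    # same range as substring(..., 2, 4)
    time = stamp[4:10]  # same range as substring(..., 4, 10)
    # Copy everything except the payload, then add the derived fields.
    out = {key: value for key, value in record.items() if key != "payload"}
    out["Date"] = f"{day}/{month}/{year}"
    out["Time"] = time
    return out


record = {
    "correlationId": "ed1e30",
    "payloadFormat": "CASH",
    "payload": {"DateTime": "1118083350"},
}
converted = split_date_time(record)
```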
I have used the JOLT spec below to convert the date.
[{
"operation": "shift",
"spec": {
"*": {
"#": "&",
"payload": {
"DateTime": "&2.payload.TMPDE"
}
}
}
}, {
"operation": "modify-default-beta",
"spec": {
"*": {
"payload": {
"DateM": "=substring(#(1,TMPDE), 0, 2)",
"DateD": "=substring(#(1,TMPDE), 2, 4)",
"Time": "=substring(#(1,TMPDE), 4, 10)",
"TrnDate": "=join('/',#(1,DateD),#(1,DateM),2020)"
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"correlationId": "[&1].COR_REL_ID",
"payloadFormat": "[&1].payloadFormat",
"payload": {
"#TrnDate": "[#3].Date",
"#Time": "[#3].Time"
}
}
}
}
]

sort an array of json document

I'm wondering whether it's possible to sort an array of JSON objects, or to pick the element with the minimum value. I have read about this issue but found nothing.
This is the Input:
{
"intData": [
{
"DATE": "2018",
"NOME": "raf"
},
{
"DATE": "2001",
"NOME": "fabio"
},
{
"DATE": "2002",
"NOME": "fabiola"
}
]
}
I would like:
{
"intData": [
{
"DATE": "2001",
"NOME": "fabio"
},
{
"DATE": "2002",
"NOME": "fabiola"
},
{
"DATE": "2018",
"NOME": "raf"
}
]
}
or
{
"DATE": "2001",
"NOME": "fabio"
}
Is it possible?
Ordered Results
The steps are as follows:
Create object with structure: $.DATE.NOME.@
Sort it
Turn it back into an array
[
{
"operation": "shift",
"spec": {
"intData": {
"*": {
"#": "#(1,DATE).#(1,NOME)"
}
}
}
},
{
"operation": "sort"
},
{
"operation": "shift",
"spec": {
"*": {
"*": {
"#": "intData.[]"
}
}
}
}
]
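The three steps can be mirrored in Python to see why keying by DATE and NOME gives an ordered array. This is a sketch of the idea rather than of Jolt's internals; sort_by_date is a made-up name, the keys are compared as strings (fine for four-digit years), and two entries sharing both DATE and NOME would overwrite each other here.

```python
def sort_by_date(doc):
    """Key each entry by DATE then NOME, sort the keys, and flatten back into a list."""
    keyed = {}
    # Step 1: build an object shaped like {DATE: {NOME: entry}}.
    for item in doc["intData"]:
        keyed.setdefault(item["DATE"], {})[item["NOME"]] = item
    # Steps 2 and 3: walk the keys in sorted order and collect the entries.
    ordered = [
        keyed[date][name]
        for date in sorted(keyed)
        for name in sorted(keyed[date])
    ]
    return {"intData": ordered}


doc = {
    "intData": [
        {"DATE": "2018", "NOME": "raf"},
        {"DATE": "2001", "NOME": "fabio"},
        {"DATE": "2002", "NOME": "fabiola"},
    ]
}
ordered_doc = sort_by_date(doc)
```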
First result
The steps are as follows:
Create object with structure: $.DATE.NOME.@
Sort it
Turn it back into an array
Get first result
[
{
"operation": "shift",
"spec": {
"intData": {
"*": {
"#": "#(1,DATE).#(1,NOME)"
}
}
}
},
{
"operation": "sort"
},
{
"operation": "shift",
"spec": {
"*": {
"*": {
"#": "[]"
}
}
}
},
{
"operation": "shift",
"spec": {
"0": {
"#": ""
}
}
}
]
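Picking the first result after sorting is the same as taking the minimum, which in Python is a one-liner. The sketch below assumes string comparison of DATE with NOME as a tie-breaker, matching the key order used in the spec; earliest is a hypothetical function name.

```python
def earliest(doc):
    """Return the entry with the smallest DATE, using NOME to break ties."""
    return min(doc["intData"], key=lambda item: (item["DATE"], item["NOME"]))


doc = {
    "intData": [
        {"DATE": "2018", "NOME": "raf"},
        {"DATE": "2001", "NOME": "fabio"},
        {"DATE": "2002", "NOME": "fabiola"},
    ]
}
first = earliest(doc)
```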