Using Jolt transform to convert flat JSON to nested JSON for multiple payloads

I am having trouble converting flat JSON to nested JSON for multiple payloads. I want to aggregate the records into stops, with one aggregated object per unique payload. I use https://jolt-demo.appspot.com to test the following.
Input:
[
{
"container_id": "DEF_id",
"haulType": "OL",
"loadNumber": "DO123345",
"billOfLading": "DO12345",
"referenceNumbers": "LoadIDEF",
"addressLine1": "DEF_address",
"stopReferenceId": "0004",
"stopType": "PL",
"containerNumber": "454545"
},
{
"container_id": "DEF_id",
"haulType": "OL",
"loadNumber": "DO123345",
"billOfLading": "DO12345",
"referenceNumbers": "LoadIDEF",
"addressLine1": null,
"stopReferenceId": "0003",
"stopType": "PU",
"containerNumber": "454545"
},
{
"container_id": "ABC_id",
"haulType": "IL",
"loadNumber": "BO123345",
"billOfLading": "BO12345",
"referenceNumbers": "LoadID",
"addressLine1": null,
"stopReferenceId": "0002",
"stopType": "PL",
"containerNumber": "232323"
},
{
"container_id": "ABC_id",
"haulType": "IL",
"loadNumber": "BO123345",
"billOfLading": "BO12345",
"referenceNumbers": "LoadID",
"addressLine1": "ABC Street",
"stopReferenceId": "0001",
"stopType": "PU",
"containerNumber": "232323"
}
]
Expected Output:
[
{
"load": {
"container_id": "DEF_id",
"haulType": [
"OL"
],
"loadNumber": "DO123345",
"billOfLading": "DO12345",
"referenceNumbers": [
"LoadIDEF"
],
"stops": [
{
"addressLine1": "DEF_address",
"stopReferenceId": "0004",
"stopType": "PL"
},
{
"addressLine1": null,
"stopReferenceId": "0003",
"stopType": "PU"
}
]
},
"containerInfo": {
"containerNumber": "454545"
}
},
{
"load": {
"container_id": "ABC_id",
"haulType": [
"IL"
],
"loadNumber": "BO123345",
"billOfLading": "BO12345",
"referenceNumbers": [
"LoadID"
],
"stops": [
{
"addressLine1": null,
"stopReferenceId": "0002",
"stopType": "PL"
},
{
"addressLine1": "ABC Street",
"stopReferenceId": "0001",
"stopType": "PU"
}
]
},
"containerInfo": {
"containerNumber": "232323"
}
}
]
Here is the Jolt spec I used:
[
{
"operation": "shift",
"spec": {
"*": {
"container_id": "#(1,containerNumber).load.&",
"haulType": "#(1,containerNumber).load.&",
"loadNumber": "#(1,containerNumber).load.&",
"billOfLading": "#(1,containerNumber).load.&",
"referenceNumbers": "#(1,containerNumber).load.&",
"addressLine1": "#(1,containerNumber).load.stops[&1].&",
"stopReferenceId": "#(1,containerNumber).load.stops[&1].&",
"stopType": "#(1,containerNumber).load.stops[&1].&",
"containerNumber": "#(1,containerNumber).containerInfo.&"
}
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"*": "ONE",
"stops": "MANY"
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": "&",
"load": {
"haulType|referenceNumbers": "&1.&[]",
"*": "&1.&"
}
}
}
}
]

There is no need to write out each attribute individually to get the expected result. In the first spec, partition the records by their containerNumber values, using the * and & wildcards to represent the key-value pairs of all attributes. Then, in the second spec, separate the attributes (the conditional logic) to control how each key-value pair is displayed, such as
[
{
"operation": "shift",
"spec": {
"*": {
"*": "#(1,containerNumber).&"
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": {
"0": "&2.load.&1"
},
"haulType|referenceN*": {
"0": "&2.load.&1[]" // 0 : pick only value of the first index from the array, &1[] : wrap up the values with square brackets
},
"addressLine1|stop*": {
"*": "&2.load.stops[&].&1"
},
"containerN*": {
"0": "&2.load.containerInfo.&1"
}
}
}
},
{
// get rid of object labels
"operation": "shift",
"spec": {
"*": ""
}
}
]
The spec can be tested on the demo site, http://jolt-demo.appspot.com/
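To cross-check what the spec is expected to produce, the same grouping can be sketched outside Jolt. The following is plain Python, not a Jolt spec; the function name `nest_by_container` and the shortened sample rows are made up for illustration, and the field lists are taken from the expected output in the question.

```python
def nest_by_container(rows):
    """Group flat rows by containerNumber into load / stops / containerInfo."""
    stop_keys = ("addressLine1", "stopReferenceId", "stopType")
    list_keys = ("haulType", "referenceNumbers")  # shown wrapped in arrays in the expected output
    groups = {}  # dicts keep insertion order, so groups follow first appearance
    for row in rows:
        key = row["containerNumber"]
        if key not in groups:
            load = {}
            for name, value in row.items():
                if name in stop_keys or name == "containerNumber":
                    continue
                load[name] = [value] if name in list_keys else value
            load["stops"] = []
            groups[key] = {"load": load, "containerInfo": {"containerNumber": key}}
        # Every row contributes one stop; null (None) values are kept as-is.
        groups[key]["load"]["stops"].append({k: row[k] for k in stop_keys})
    return list(groups.values())


# Shortened sample: two stops for one container, one stop for another.
rows = [
    {"container_id": "DEF_id", "haulType": "OL", "referenceNumbers": "LoadIDEF",
     "addressLine1": "DEF_address", "stopReferenceId": "0004", "stopType": "PL",
     "containerNumber": "454545"},
    {"container_id": "DEF_id", "haulType": "OL", "referenceNumbers": "LoadIDEF",
     "addressLine1": None, "stopReferenceId": "0003", "stopType": "PU",
     "containerNumber": "454545"},
    {"container_id": "ABC_id", "haulType": "IL", "referenceNumbers": "LoadID",
     "addressLine1": None, "stopReferenceId": "0002", "stopType": "PL",
     "containerNumber": "232323"},
]
result = nest_by_container(rows)
```

Comparing `result` with the output of the Jolt demo is one way to confirm that each container ends up with its own `load`, `stops` and `containerInfo`.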

Related

JOLT add a field in JSON in the middle of the structure

I am trying to add a new field to a JSON structure. I need it in a specific position, before the array, but the spec places the field at the end.
This is my original JSON and the transformation spec, as written in the Jolt transformation web tool:
INPUT:
{
"Id": ">COS",
"equipment": "ALA",
"elementId": "M15463",
"zone": "AMBA",
"hub": "AVA",
"terminalServer": "XS0156",
"Area": "null",
"timestamp": "1576155950000",
"collectedData": [
{
"name": "llljljiouohh",
"instance": "X1.M1.YE9.ON18",
"value": "5",
"unit": "db"
}
]
}
JSON spec:
[
{
"operation": "default",
"spec": {
"timestamp_dt": "2022-10-14 15:00"
}
}
]
Result:
{
"Id": ">COS",
"equipment": "ALA",
"elementId": "M15463",
"zone": "AMBA",
"hub": "AVA",
"terminalServer": "XS0156",
"Area": "null",
"timestamp": "1576155950000",
"collectedData": [
{
"name": "llljljiouohh",
"instance": "X1.M1.YE9.ON18",
"value": "5",
"unit": "db"
}
],
"timestamp_dt": "2022-10-14 15:00"
}
Expected:
{
"Id": ">COS",
"equipment": "ALA",
"elementId": "M15463",
"zone": "AMBA",
"hub": "AVA",
"terminalServer": "XS0156",
"Area": "null",
"timestamp": "1576155950000",
"timestamp_dt": "2022-10-14 15:00",
"collectedData": [
{
"name": "llljljiouohh",
"instance": "X1.M1.YE9.ON18",
"value": "5",
"unit": "db"
}
]
}
Any suggestion please? Thanks!
You can write each key-value pair individually, in the desired order, within a shift transformation, such as
[
{
"operation": "default",
"spec": {
"timestamp_dt": "2022-10-14 15:00"
}
},
{
"operation": "shift",
"spec": {
"Id": "&",
"equipment": "&",
"elementId": "&",
"zone": "&",
"hub": "&",
"terminalServer": "&",
"Area": "&",
"timestamp": "&",
"timestamp_dt": "&",
"collectedData": "&"
}
}
]
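The idea behind the shift above is simply to rebuild the object key by key in the wanted order. A minimal sketch of the same idea in plain Python (not Jolt), relying on the fact that Python dicts preserve insertion order; the helper name `insert_before` and the shortened document are hypothetical:

```python
def insert_before(doc, new_key, new_value, before_key):
    """Return a copy of doc with new_key placed just before before_key."""
    out = {}
    for key, value in doc.items():
        if key == before_key:
            out[new_key] = new_value  # emit the new field first
        out[key] = value
    out.setdefault(new_key, new_value)  # before_key absent: append at the end
    return out


doc = {"Id": ">COS", "timestamp": "1576155950000", "collectedData": [{"value": "5"}]}
result = insert_before(doc, "timestamp_dt", "2022-10-14 15:00", "collectedData")
```

Unlike the shift spec, this sketch does not need every key listed, which is the trade-off of enumerating attributes in the spec.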

Convert Flat json to Nested Json with multiple arrays using Jolt transform

I'm trying to write a spec for the transformation below using Jolt. I need to convert flat JSON to nested JSON.
I have looked at examples but didn't get any closer to the desired output. I use https://jolt-demo.appspot.com to test the following.
Input :
[
{
"container_id": "a",
"carrier_scac": "b",
"location": "banglore",
"state": "karnataka",
"country": "India"
},
{
"container_id": "a",
"carrier_scac": "b",
"location": "pune",
"state": "maharashtra",
"country": "India"
},
{
"container_id": "c",
"carrier_scac": "d",
"location": "dharwad",
"state": "kan",
"country": "India"
},
{
"container_id": "c",
"carrier_scac": "d",
"location": "hubli",
"state": "kant",
"country": "India"
}
]
Desired Output:
[
{
"load": {
"container_id": "a",
"carrier_scac": "b",
"stops": [
{
"location": "banglore",
"state": "karnataka"
},
{
"location": "pune",
"state": "maharashtra"
}
]
},
"containerInfo": {
"country": "India"
}
},
{
"load": {
"container_id": "c",
"carrier_scac": "d",
"stops": [
{
"location": "dharwad",
"state": "kan"
},
{
"location": "hubli",
"state": "kant"
}
]
},
"containerInfo": {
"country": "India"
}
}
]
Jolt Spec that I'm using :
[
{
"operation": "shift",
"spec": {
"*": {
"container_id": "#(1,container_id).&",
"carrier_scac": "#(1,container_id).&",
"*": "#(1,container_id).stops[&1].&"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=recursivelySquashNulls"
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"container_id": "ONE",
"carrier_scac": "ONE"
}
}
},
{
"operation": "shift",
"spec": {
"*": ""
}
}
]
The current spec is pretty good; it just needs a few small modifications, such as adding the load and containerInfo nodes, and a bit of shortening, as below
[
{
"operation": "shift",
"spec": {
"*": {
"*": "#(1,container_id).load.stops[&1].&", // "else" case
"country": "#(1,container_id).load.containerInfo.&",
"c*": "#(1,container_id).load.&" // the attributes starting with "c" but other than "country"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=recursivelySquashNulls"
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"c*": "ONE",
"containerInfo": {
"*": "ONE"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": ""
}
}
]
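One reading of why the recursivelySquashNulls step is in this spec: stops[&1] reuses each record's position in the input array, so the records of container "c" (positions 2 and 3) are written to indices 2 and 3 of their own stops array, and the skipped slots are filled with nulls. The sketch below models that in plain Python under that assumption; the helper names are hypothetical, and the squash here is shallow, whereas the Jolt function is recursive.

```python
def place_at_index(target, index, value):
    """Write value at target[index], padding any skipped slots with None."""
    while len(target) <= index:
        target.append(None)
    target[index] = value


def squash_nulls(items):
    """Drop None entries from a list (shallow, for illustration only)."""
    return [item for item in items if item is not None]


stops_c = []
place_at_index(stops_c, 2, {"location": "dharwad", "state": "kan"})
place_at_index(stops_c, 3, {"location": "hubli", "state": "kant"})
# stops_c now starts with two None placeholders for positions 0 and 1.
squashed = squash_nulls(stops_c)
```

The same squashing also removes genuine null values, which matters when nulls need to be kept in the output.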

Convert Flat json to Nested Json with multiple arrays and keep null values in output using Jolt transform

I'm trying to write a spec for the transformation below using Jolt. I need to convert flat JSON to nested JSON while keeping null values. I have attached the input, the expected output and my Jolt spec. The null values need to be kept, but they do not appear in the output after the transform, so I don't get the exact output I want.
I have looked at examples but didn't get any closer. I use https://jolt-demo.appspot.com to test the following.
Input:
[
{
"container_id": "ABC",
"shipperN": null,
"PNumber": null,
"trackingNumber": null,
"priority": null,
"HType": "IN_Load",
"loadNumber": "123345",
"billOfLading": "12345",
"referenceNumbers": "LID",
"addressLine1": "ABC Street",
"addressLine2": "null",
"city": "Chicago",
"country": "US",
"latitude": "null",
"longitude": "null",
"earliestAppointmentTime": "XXXXX09:25",
"latestAppointmentTime": "XXXXX09:25",
"postalCode": "XXXXX3",
"sequence": "1",
"state": "XY",
"stopReferenceId": "0001",
"stopType": "PU",
"truckNumber": null,
"trailerNumber": null,
"driverPhone": null,
"railEquipmentInitials": null,
"railEquipmentNumber": null,
"containerNumber": "XXXXXXXX"
},
{
"container_id": "ABC",
"shipperN": null,
"PNumber": null,
"trackingNumber": null,
"priority": null,
"HType": "IN_Load",
"loadNumber": "123345",
"billOfLading": "12345",
"referenceNumbers": "LID",
"addressLine1": "null",
"addressLine2": "null",
"city": "null",
"country": "null",
"latitude": null,
"longitude": null,
"earliestAppointmentTime": "XXXXX09:25",
"latestAppointmentTime": "XXXXX09:25",
"name": "null",
"postalCode": "null",
"sequence": "2",
"state": "null",
"stopReferenceId": "XXXXD",
"stopType": "PL",
"truckNumber": null,
"trailerNumber": null,
"driverPhone": null,
"railEquipmentInitials": null,
"railEquipmentNumber": null,
"containerNumber": "XXXXXXXX"
}
]
Desired Output:
{
"load": {
"container_id": "ABC",
"shipperN": null,
"PNumber": null,
"trackingNumber": null,
"priority": null,
"HType": [ "IN_Load" ],
"loadNumber": "123345",
"billOfLading": "12345",
"referenceNumbers": [ "LID" ],
"stops": [
{
"addressLine1": "ABC Street",
"addressLine2": "null",
"city": "Chicago",
"country": "US",
"earliestAppointmentTime": "XXXXX09:25",
"latestAppointmentTime": "XXXXX09:25",
"postalCode": "XXXXX3",
"sequence": "1",
"state": "XY",
"stopReferenceId": "0001",
"stopType": "PU"
},
{
"earliestAppointmentTime": "2021-03-09T15:25:00.203Z",
"latestAppointmentTime": "2021-03-09T15:25:00.203Z",
"sequence": "2",
"stopReferenceId": "dummy",
"stopType": "PL",
"externalAddressId": "dummy"
}
]
},
"containerInfo": {
"containerNumber": "XXXXXXXX"
},
"trackingInfo": {
"truckNumber": null,
"trailerNumber": null,
"driverPhone": null,
"railEquipmentInitials": null,
"railEquipmentNumber": null
}
}
Jolt Spec that I'm using :
[
{
"operation": "shift",
"spec": {
"*": {
"*": "#(1,container_id).load.stops[&1].&",
"container_id": "#(1,container_id).load.&", // "else" case
"shipperN": "#(1,container_id).load.&",
"PNumber": "#(1,container_id).load.&",
"trackingNumber": "#(1,container_id).load.&",
"priority": "#(1,container_id).load.&",
"HType": "#(1,container_id).load.&",
"loadNumber": "#(1,container_id).load.&",
"billOfLading": "#(1,container_id).load.&",
"referenceNumbers": "#(1,container_id).load.&",
"containerNumber": "#(1,container_id).containerInfo.&",
"truckNumber": "#(1,container_id).trackingInfo.&",
"trailerNumber": "#(1,container_id).trackingInfo.&",
"driverPhone": "#(1,container_id).trackingInfo.&",
"railEquipmentInitials": "#(1,container_id).trackingInfo.&",
"railEquipmentNumber": "#(1,container_id).trackingInfo.&"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": "=recursivelySquashNulls"
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"container_id": "ONE",
"shipperN": "ONE",
"PNumber": "ONE",
"trackingNumber": "ONE",
"priority": "ONE",
"HType": "ONE",
"referenceNumbers": "ONE",
"loadNumber": "ONE",
"billOfLading": "ONE",
"containerInfo": {
"*": "ONE"
},
"trackingInfo": {
"*": "ONE"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": ""
}
}
]
You're so close. Three changes are needed:
The spec containing recursivelySquashNulls should be removed, since it strips null values
The .&[] identifier should be applied to the HType and referenceNumbers attributes, so that their values are wrapped in arrays
The cardinality spec can preferably be shortened
So use the following as the whole spec
[
{
"operation": "shift",
"spec": {
"*": {
"*": "#(1,container_id).load.stops[&1].&",
"container_id": "#(1,container_id).load.&", // "else" case
"shipperN": "#(1,container_id).load.&",
"PNumber": "#(1,container_id).load.&",
"trackingNumber": "#(1,container_id).load.&",
"priority": "#(1,container_id).load.&",
"HType": "#(1,container_id).load.&",
"loadNumber": "#(1,container_id).load.&",
"billOfLading": "#(1,container_id).load.&",
"referenceNumbers": "#(1,container_id).load.&",
"containerNumber": "#(1,container_id).containerInfo.&",
"truckNumber": "#(1,container_id).trackingInfo.&",
"trailerNumber": "#(1,container_id).trackingInfo.&",
"driverPhone": "#(1,container_id).trackingInfo.&",
"railEquipmentInitials": "#(1,container_id).trackingInfo.&",
"railEquipmentNumber": "#(1,container_id).trackingInfo.&"
}
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"*": "ONE",
"stops": "MANY"
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": "&",
"load": {
"HType|referenceNumbers": "&1.&[]",
"*": "&1.&" // &1 stands for the key "load", and & replicates the leaf values
}
}
}
}
]
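The shortened cardinality spec relies on two behaviours: ONE collapses a list to its first element, and MANY wraps a non-list value in a list. A rough sketch of those two rules in plain Python (not Jolt); the function names and the sample `load` object are hypothetical, and the empty-list case is an assumption:

```python
def to_one(value):
    """ONE: a list collapses to its first element; anything else passes through."""
    if isinstance(value, list):
        return value[0] if value else None  # empty-list handling is an assumption
    return value


def to_many(value):
    """MANY: a non-list value is wrapped in a list; a list passes through."""
    return value if isinstance(value, list) else [value]


# Hypothetical shape of one grouped "load" after the first shift.
load = {
    "HType": ["IN_Load", "IN_Load"],
    "loadNumber": ["123345", "123345"],
    "stops": [{"sequence": "1"}, {"sequence": "2"}],
}
# "*": "ONE" for everything except "stops", which stays MANY.
collapsed = {k: to_many(v) if k == "stops" else to_one(v) for k, v in load.items()}
# The final shift's "&1.&[]" target re-wraps the single HType value in an array.
collapsed["HType"] = to_many(collapsed["HType"])
```

This is why HType first becomes a single value and only then gets its square brackets back in the last shift.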

JOLT - Remove duplicates in array

I need to remove duplicates from the docAddrs array in my document and keep the rest of the JSON unchanged. My last transformation moves all the data into the docAddrs array, instead of just the addr objects. This is what I tried:
Input:
{
"docId1": "1",
"docId2": "2",
"docInfo": {
"info1": "info1",
"info2": "info2",
"lines": [
{
"lineNum": "1",
"val": "1"
},
{
"lineNum": "2",
"val": "2"
}
]
},
"docAddrs": [
{
"addrId": "111",
"street": "street1",
"city": "city1",
"st": "st"
},
{
"addrId": "111",
"street": "street1",
"city": "city1",
"st": "st"
},
{
"addrId": "112",
"street": "street2",
"city": "city2",
"st": "st2"
},
{
"addrId": "112",
"street": "street2",
"city": "city2",
"st": "st2"
}
]
}
Spec:
[
{
"operation": "shift",
"spec": {
"*": "&",
"docAddrs": {
"*": "#addrId[]"
}
}
},
{
"operation": "cardinality",
"spec": {
"*": "ONE"
}
},
{
"operation": "shift",
"spec": {
"*": {
"docId1": "docId1",
"docId2": "docId2",
"docInfo": "docInfo",
"#": "docAddrs.[]"
}
}
}
]
Output:
{
"docId1": "1",
"docId2": "2",
"docInfo": {
"info1": "info1",
"info2": "info2",
"lines": [
{
"lineNum": "1",
"val": "1"
},
{
"lineNum": "2",
"val": "2"
}
]
},
"111": [
{
"addrId": "111",
"street": "street1",
"city": "city1",
"st": "st"
},
{
"addrId": "111",
"street": "street1",
"city": "city1",
"st": "st"
}
],
"112": [
{
"addrId": "112",
"street": "street2",
"city": "city2",
"st": "st2"
},
{
"addrId": "112",
"street": "street2",
"city": "city2",
"st": "st2"
}
]
}
Expected Output:
{
"docId1": "1",
"docId2": "2",
"docInfo": {
"info1": "info1",
"info2": "info2",
"lines": [
{
"lineNum": "1",
"val": "1"
},
{
"lineNum": "2",
"val": "2"
}
]
},
"docAddrs": [
{
"addrId": "111",
"street": "street1",
"city": "city1",
"st": "st"
},
{
"addrId": "112",
"street": "street2",
"city": "city2",
"st": "st2"
}
]
}
Can someone please suggest how I can get this to work? Thanks in advance.
You can use a combination of the following specs
[
//exchange key-value pairs for "docAddrs" array
{
"operation": "shift",
"spec": {
"*": "&",
"docAddrs": {
"*": {
"*": {
"$": "&3[#(2,addrId)].#(0)"
}
}
}
}
},
// keep only the first element of each newly formed array;
// the elements within each array are identical anyway
{
"operation": "cardinality",
"spec": {
"docAddrs": {
"*": {
"*": "ONE"
}
}
}
},
// exchange key-value pairs again in order to collect each value array pairs
// under common keys respectively
{
"operation": "shift",
"spec": {
"*": "&",
"docAddrs": {
"*": {
"*": {
"$": "&3.#(0)"
}
}
}
}
},
// distribute each value to its related object
{
"operation": "shift",
"spec": {
"*": "&",
"docAddrs": {
"*": {
"*": "&2[&].&1"
}
}
}
}
]
The spec can be tested on http://jolt-demo.appspot.com/
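For comparison, the intended result (one address per addrId, everything else untouched) can be written directly in a general-purpose language. A minimal sketch in plain Python, not Jolt; the helper name `dedupe_by` is hypothetical, and it assumes that entries sharing an addrId are identical, as in the input above:

```python
def dedupe_by(items, key):
    """Keep the first item seen for each distinct value of item[key]."""
    seen = set()
    unique = []
    for item in items:
        if item[key] not in seen:
            seen.add(item[key])
            unique.append(item)
    return unique


doc = {
    "docId1": "1",
    "docId2": "2",
    "docAddrs": [
        {"addrId": "111", "street": "street1"},
        {"addrId": "111", "street": "street1"},
        {"addrId": "112", "street": "street2"},
        {"addrId": "112", "street": "street2"},
    ],
}
# Copy the document and replace only the docAddrs array.
result = {**doc, "docAddrs": dedupe_by(doc["docAddrs"], "addrId")}
```

The Jolt specs reach the same outcome by regrouping values under their addrId and keeping the first element of each group.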

apache nifi- how to create a custom date format

I am new to NiFi and I am trying to derive a week_start_date and a week_number from a date in JSON format.
I am using the Jolt transform.
The input is a Google Ads API response.
This is the spec I use:
[
{
"operation": "shift",
"spec": {
"customer_id": {
"*": "[&].customer_id"
},
"customer_name": {
"*": "[&].customer_name"
},
"account_currency_code": {
"*": "[&].account_currency_code"
},
"campaign_id": {
"*": "[&].campaign_id"
},
"campaign_name": {
"*": "[&].campaign_name"
},
"campaign_status": {
"*": "[&].campaign_status"
},
"ad_group_id": {
"*": "[&].ad_group_id"
},
"ad_group_name": {
"*": "[&].ad_group_name"
},
"clicks": {
"*": "[&].clicks"
},
"cost": {
"*": "[&].cost"
},
"impressions": {
"*": "[&].impressions"
},
"device": {
"*": "[&].device"
},
"date": {
"*": "[&].date"
},
"week_number": {
"*": "[&].week_number"
},
"year": {
"*": "[&].year"
},
"keywords": {
"*": "[&].keywords"
},
"keywords_id": {
"*": "[&].keywords_id"
}
}
},
{
"operation": "modify-default-beta",
"spec": {
"date": {
"date": "=intSubtract(#(1,date))"
}
}
}
]
The expected output should be:
[
{
"customer_id": "2538943578",
"customer_name": "test.com",
"account_currency_code": "USD",
"campaign_id": "11137311251",
"campaign_name": "testers",
"campaign_status": "ENABLED",
"ad_group_id": "1111",
"ad_group_name": "tesst- E",
"clicks": "6",
"cost": "26580000",
"impressions": "40",
"device": "DESKTOP",
"date": "2021-12-01",
"week_number": "48",
"week_start_date": "2021-11-29",
"year": 2021,
"keywords": "test",
"keywords_id": "56357925842"
}
]
The output I have:
[
{
"customer_id": "2538943578",
"customer_name": "test.com",
"account_currency_code": "USD",
"campaign_id": "11137311251",
"campaign_name": "testers",
"campaign_status": "ENABLED",
"ad_group_id": "1111",
"ad_group_name": "tesst- E",
"clicks": "6",
"cost": "26580000",
"impressions": "40",
"device": "DESKTOP",
"date": "2021-12-01",
"week_number": "2021-11-29",
"year": 2021,
"keywords": "test",
"keywords_id": "56357925842"
}
]
I am not sure how to use modify-default-beta correctly.
I also tried looking at the docs:
https://github.com/bazaarvoice/jolt/tree/master/jolt-core/src/test/resources/json/shiftr
What is the right way to understand the structure?
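To pin down the values being asked for, here is the date arithmetic behind the expected output, sketched in plain Python. This is not a Jolt spec or a NiFi processor configuration. It assumes ISO-8601 week numbering with weeks starting on Monday, which is consistent with the sample (2021-12-01 giving week 48 and a week start of 2021-11-29); the function name `week_fields` is hypothetical.

```python
from datetime import date, timedelta


def week_fields(iso_date):
    """Derive week_number and week_start_date from a YYYY-MM-DD string."""
    day = date.fromisoformat(iso_date)
    # weekday() is 0 for Monday, so subtracting it lands on the week's Monday.
    week_start = day - timedelta(days=day.weekday())
    return {
        "week_number": str(day.isocalendar()[1]),  # ISO week of the year
        "week_start_date": week_start.isoformat(),
    }


fields = week_fields("2021-12-01")
```

If the reporting week is meant to start on Sunday instead, the offset and the week numbering would both need to change.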