Jolt: Merge arrays from properties - json

I'm trying to extract and merge objects from an array contained in some (but not all) of my input elements. Using the JOLT JSON transformation library.
Also, the arrays I'm trying to merge contain objects that don't always have the same properties. One key might be present in some, but not others.
Example is contrived/nonsensical simplification, but has the general shape of our data.
Input:
{
"Widgets": [
{
"Id": "1",
"PetFriendly": "True",
"Features": [
{
"Name": "Easy Button",
"Type": "Button"
},
{
"Name": "Lunch Lever",
"Type": "Food Service",
"MenuItems": [
"Pizza",
"Cheezburger"
]
}
]
},
{
"Id": "2",
"PetFriendly": "True"
},
{
"Id": "3",
"PetFriendly": "False",
"Features": [
{
"Name": "Missles",
"Type": "Attack"
}
]
},
{
"Id": "4",
"PetFriendly": "False",
"Features": [
{
"Name": "Bombs",
"Type": "Attack",
"MenuItems": [
"Rat Poison"
]
}
]
}
]
}
Desired output:
{
"Widgets": [
{
"Id": "1"
"PetFriendly": "True"
},
{
"Id": "2"
"PetFriendly": "True"
},
{
"Id": "3",
"PetFriendly": "False"
},
{
"Id": "4",
"PetFriendly": "False"
}
],
"Features": [
{
"WidgetId": "1",
"Name": "Easy Button",
"Type": "Button"
},
{
"WidgetId": "1",
"Name": "Lunch Lever",
"Type": "Food Service",
"MenuItems": [
"Pizza",
"Cheezburger"
]
},
{
"WidgetId": "3",
"Name": "Missles",
"Type": "Attack"
},
{
"WidgetId": "4",
"Name": "Bombs",
"Type": "Attack",
"MenuItems": [
"Rat Poison"
]
}
]
}
I have tried many transforms with no success, and read all the ShiftR documentation and its unit tests. A little help?

Spec
[
{
"operation": "shift",
"spec": {
"Widgets": {
"*": {
// build the finished "Widgets" output
"Id": "Widgets[&1].Id",
"PetFriendly": "Widgets[&1].PetFriendly",
//
// Process the Features, by pushing the Id
// down into them, but maintain the same doubly
// nested structure.
// Shift works property by property, so first
// fix the properties in side each Features element,
// (pulling ID down).
// Then in a 2nd Shift can accumulate things into array.
"Features": {
"*": {
"#(2,Id)": "temp[&3].Features[&1].WidgetId",
"*": "temp[&3].Features[&1].&"
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
// passthru
"Widgets": "Widgets",
"temp": {
"*": {
"Features": {
// walk thru the doubly nested structure an
// now accumulate all non-null itens into
// the the final Features array.
"*": "Features[]"
}
}
}
}
}
]

Finally got it working with the below spec, BUT it has an undesirable side effect: It leaves empty default arrays. Is there a way to remove empty arrays, or otherwise mark them during the default step so they can be deleted? I checked this GitHub issue but not sure how to translate it to arrays of string. Anyone have a better solution?
[
// First fill in default value for "MenuItems" since not all Features have it.
{
"operation": "default",
"spec": {
"Widgets[]": {
"*": {
"Features[]": {
"*": {
"MenuItems": []
}
}
}
}
}
},
{
// Extract the Features' properties into arrays. The defaults added above ensure that we can merge the arrays into Feature objects as in this example:
// https://github.com/bazaarvoice/jolt/blob/master/jolt-core/src/test/resources/json/shiftr/mergeParallelArrays2_and-do-not-transpose.json.
"operation": "shift",
"spec": {
"Widgets": {
"*": {
"Id": "Widgets[&1].Id",
"PetFriendly": "Widgets[&1].PetFriendly",
"Features": {
"*": {
"#(2,Id)": "temp.WidgetId",
"Name": "temp.Name",
"Type": "temp.Type",
"MenuItems": "temp.MenuItems[]"
}
}
}
}
}
},
// Finally merge the arrays into Feature objects.
{
"operation": "shift",
"spec": {
"Widgets": "Widgets",
"temp": {
"WidgetId": {
"*": "Features[&0].WidgetId"
},
"Name": {
"*": "Features[&0].Name"
},
"Type": {
"*": "Features[&0].Type"
},
"MenuItems": {
"*": "Features[&0].MenuItems"
}
}
}
}
]
Result:
{
"Widgets": [
{
"Id": "1",
"PetFriendly": "True"
},
{
"Id": "2",
"PetFriendly": "True"
},
{
"Id": "3",
"PetFriendly": "False"
},
{
"Id": "4",
"PetFriendly": "False"
}
],
"Features": [
{
"WidgetId": "1",
"Name": "Easy Button",
"Type": "Button",
"MenuItems": []
},
{
"WidgetId": "1",
"Name": "Lunch Lever",
"Type": "Food Service",
"MenuItems": [ "Pizza", "Cheezburger" ]
},
{
"WidgetId": "3",
"Name": "Missles",
"Type": "Attack",
"MenuItems": []
},
{
"WidgetId": "4",
"Name": "Bombs",
"Type": "Attack",
"MenuItems": [ "Rat Poison" ]
}
]
}

Related

Conditional Jolt spec for nested array

I am trying to write Jolt spec for the following input. I need to populate the primaryEmail field based on the condition if primary field is true in the emails array
[
{
"uid": "1234mark",
"name": "mark",
"userName": "markw",
"displayName": "Mark W",
"emails": [
{
"primary": false,
"value": "mark#gmail.com"
},
{
"primary": true,
"value": "mark#hotmail.com"
}
]
},
{
"uid": "9876steve",
"name": "steve",
"userName": "stevew",
"displayName": "Steve W",
"emails": [
{
"primary": false,
"value": "steve#gmail.com"
},
{
"primary": true,
"value": "steve#hotmail.com"
}
]
}
]
The desired output is
[
{
"user": {
"externalId": "1234mark",
"name": "mark",
"userName": "markw",
"displayName": "Mark W",
"primaryEmail": "mark#hotmail.com"
}
},
{
"user": {
"externalId": "9876steve",
"name": "steve",
"userName": "stevew",
"displayName": "Steve W",
"primaryEmail": "steve#hotmail.com"
}
}
]
But I get the following incorrect output since I am not able to populate the primaryEmail field conditionally properly.
[
{
"user": {
"externalId": "1234mark",
"name": "mark",
"userName": "markw",
"displayName": "Mark W"
}
},
{
"user": {
"externalId": "9876steve",
"name": "steve",
"userName": "stevew",
"displayName": "Steve W"
}
}
]
The spec I have created is the following
[
{
"operation": "shift",
"spec": {
"*": {
"uid": "[&1].user.externalId",
"name": "[&1].user.name",
"userName": "[&1].user.userName",
"displayName": "[&1].user.displayName",
"title": "[&1].user.title",
"emails": {
"*": {
"primary": {
"true": {
"#(2,value)": "primaryEmail"
}
}
}
}
}
}
}
]
Could someone please help with this query. Thanks.
What you need is to go 5 levels the three up from the innermost object while adding an extra node called user such as
[
{
"operation": "shift",
"spec": {
"*": {
"uid": "[&1].user.externalId",
"*": "[&1].user.&", // the attributes except for "uid" and "emails" array
"emails": {
"*": {
"primary": {
"true": {
"#(2,value)": "[&5].user.&2Email" // replicate literal "primary" by using &2
}
}
}
}
}
}
}
]
the demo on the site http://jolt-demo.appspot.com/ is

JOLT Transformation split objects from JSON with common values

I have an array of objects and I want to split each key from the object into a separate array in JSON.
Input JSON:
[
{
"Company": "Tesla",
"Age": 30.2
},
{
"Company": "Facebook",
"Age": 40.5
},
{
"Company": "Amazon",
"Age": 60
}
]
Expected Output
[
{
"value": "Tesla",
"variable": "Company",
"model": "device"
},
{
"value": "Facebook",
"variable": "Company",
"model": "device"
},
{
"value": "Amazon",
"variable": "Company",
"model": "device"
},
{
"value": 30.2,
"variable": "Age",
"model": "device"
},
{
"value": 40.5,
"variable": "Age",
"model": "device"
},
{
"value": 60,
"variable": "Age",
"model": "device"
}
]
Here the value of the Model is fixed. I have tried with some jolt spec. It's not working.
Any help would be much appreciated!
You can use shift transformation spec such as
[
{
// distinguish the attributes by indexes
"operation": "shift",
"spec": {
"*": {
"*": {
"#": "&1[&2].value", // # represents values of the keys inherited from one upper level, &1 represents keys "Company" or "Age", [&2] represents going tree two levels up to grab the outermost indexes in an array fashion
"$": "&1[&2].variable", // $ represents values of the keys inherited from one upper level
"#device": "&1[&2].model"
}
}
}
},
{
// get rid of object/array labels
"operation": "shift",
"spec": {
"*": {
"*": ""
}
}
}
]
the demo on the site http://jolt-demo.appspot.com/ is
Edit : If the input was
[
{
"Company": "Tesla",
"Age": 30.2,
"Time": "27/10/1992"
},
{
"Company": "Facebook",
"Age": 40.5,
"Time": "27/10/1982"
}
]
as mentioned in the comment, then you could convert the spec to
[
{
"operation": "shift",
"spec": {
"*": {
"Company|Age": {
"#": "&1[&2].value",
"$": "&1[&2].variable",
"#(1,Time)": "&1[&2].DateofBirth",
"#device": "&1[&2].model"
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": ""
}
}
}
]
in order to get the following desired output :
[
{
"value": "Tesla",
"variable": "Company",
"DateofBirth": "27/10/1992",
"model": "device"
},
{
"value": "Facebook",
"variable": "Company",
"DateofBirth": "27/10/1982",
"model": "device"
},
{
"value": 30.2,
"variable": "Age",
"DateofBirth": "27/10/1992",
"model": "device"
},
{
"value": 40.5,
"variable": "Age",
"DateofBirth": "27/10/1982",
"model": "device"
}
]

Jolt Conversion - iterate list within a list and form single list

I am trying to iterate lists inside a list and form a single list with multiple objects. Iterating lists, I am able to achieve. But applying tags before the list iterating to each object is not happening if there are more objects in single list.
My input request is like below:
[
{
"success": [
{
"id": "4",
"Offers": [
{
"name": "Optional",
"type": {
"id": "1",
"name": "Optional"
},
"productOfferings": [
{
"id": "3",
"name": "Test1"
}
]
},
{
"name": "Default",
"type": {
"id": "2",
"name": "Default"
},
"productOfferings": [
{
"id": "1",
"name": "Test2"
},
{
"id": "2",
"name": "Test3"
}
]
}
]
}
]
}
]
My spec is like below:
[
{
"operation": "shift",
"spec": {
"*": {
"success": {
"*": {
"Offers": {
"*": {
"name": "[&1].[&3].typeName",
"type": {
"id": "[&2].[&4].typeNameId",
"name": "[&2].[&4].typeNameValue"
},
"productOfferings": {
"*": {
"id": "[&3].[&1].id",
"name": "[&3].[&1].name"
}
}
}
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": "[]"
}
}
}
]
Output Received from spec:
[
{
"typeName": "Optional",
"typeNameId": "1",
"typeNameValue": "Optional",
"id": "3",
"name": "Test1"
},
{
"typeName": "Default",
"typeNameId": "2",
"typeNameValue": "Default",
"id": "1",
"name": "Test2"
},
{
"id": "2",
"name": "Test3"
}
]
But Expected output is like below:
[
{
"typeName": "Optional",
"typeNameId": "1",
"typeNameValue": "Optional",
"id": "3",
"name": "Test1"
},
{
"typeName": "Default",
"typeNameId": "2",
"typeNameValue": "Default",
"id": "1",
"name": "Test2"
},
{
"typeName": "Default",
"typeNameId": "2",
"typeNameValue": "Default",
"id": "2",
"name": "Test3"
}
]
If there are more objects inside productOfferings object, I am not able to add typeName,typeNameId, typeNameValue to the actual object. Please help to fix this issue.
You seem just needing to collect all into the productOfferings array while prepending each key by the common identifier [&3].[&1] such as
[
{
"operation": "shift",
"spec": {
"*": {
"success": {
"*": {
"Offers": {
"*": {
"productOfferings": {
"*": {
"#(2,name)": "[&3].[&1].typeName",
"#(2,type.id)": "[&3].[&1].typeNameId",
"#(2,type.name)": "[&3].[&1].typeNameValue",
"*": "[&3].[&1].&"
}
}
}
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": "[]"
}
}
}
]

JOLT transform flatten nested array with key value pairs

I'm trying to transform the following JSON
{
"data": {
"keyvalues": [
{
"key": "location",
"value": "sydney, au"
},
{
"key": "weather",
"value": "sunny"
}
]
},
"food": {
"name": "AllFoods",
"date": "2018-03-08T09:35:17-03:00",
"count": 2,
"food": [
{
"name": "chocolate",
"date": "2018-03-08T12:59:58-03:00",
"rating": "10",
"data": null
},
{
"name": "hot dog",
"date": "2018-03-08T09:35:17-03:00",
"rating": "7",
"data": {
"keyvalues": [
{
"key": "topping",
"value": "mustard"
},
{
"key": "BUN type",
"value": "toasted"
},
{
"key": "servings",
"value": "2"
}
]
}
}
]
}
}
Into, something simpler like this, using JOLT (in NIFI). Bringing the first top-level food attributes (name, date, count) into the header and then pulling the nested food array up, and then flattening out the food.data.keyvalues into a dict/hashmap.
{
"header": {
"location": "sydney, au",
"weather": "sunny",
"date": "2018-03-08",
"count": 2
},
"foods": [
{
"name": "chocolate",
"date": "2018-03-08T12:59:58-03:00",
"rating": "10"
},
{
"name": "hot dog",
"date": "2018-03-08T09:35:17-03:00",
"rating": "7",
"topping": "mustard",
"bun_type": "toasted",
"servings": "2"
}
]
}
I've got the first data part working, but I'm not sure how to handle the nested food element. The top level food info needs to move into the header section, and the second level food array, needs to flatten out the data.keyvalues.
Current spec... (only handles the top data.keyvalues)
[
{
"operation": "shift",
"spec": {
"data": {
"keyvalues": {
"*": { "#value": "#key" }
}
}
}
}
]
Spec
[
{
"operation": "shift",
"spec": {
"data": {
"keyvalues": {
"*": {
"value": "header.#(1,key)"
}
}
},
"food": {
"date": "header.date",
"count": "header.count",
"food": {
"*": {
"name": "foods[&1].name",
"date": "foods[&1].date",
"rating": "foods[&1].rating",
"data": {
"keyvalues": {
"*": {
"value": "foods[&4].#(1,key)"
}
}
}
}
}
}
}
}
]

Unable to transform a JSON Object into Array of objects using JOLT JSON library

Input JSON that I have to transform is as follows :
{
"Business": [
{
"Label": "Entertainment",
"category": "Advert",
"weight": "",
"types": [
"T1",
"T2"
]
},
{
"Label": "FMCG",
"category": "Campaign",
"weight": "",
"types": [
"T9",
"T10"
]
}
]
}
Expected Output :
{
"Business": [
{
"Label": "Entertainment",
"category": "Advert",
"weight": "",
"types": "T1"
},
{
"Label": "Entertainment",
"category": "Advert",
"weight": "",
"types": "T2"
},
{
"Label": "FMCG",
"category": "Campaign",
"weight": "",
"types": "T9"
},
{
"Label": "FMCG",
"category": "Campaign",
"weight": "",
"types": "T10"
}
]
}
I have tried the different JsonSpecs provided at the JOLT github help page. But I am not able to solve this. Any help or pointers will be appreciated.
You have to do two shift operations.
You want to "duplicate" the Label and Category based on how many entries you have in "types" array. So do that first, into a temporary "bizArray".
Also record which "type" goes with that duplicated Label and Category in a temporary "typeArray", that has the same indexes as the bizArray.
In the second shift, "join" the two parallel arrays, "bizArray" and "typesArray" to get your final array.
Spec
[
{
"operation": "shift",
"spec": {
"Business": {
"*": { // business array
"types": {
"*": { // type array
"#2": "bizArray[]", // make a copy of the whole biz object
"#": "typesArray[]"
}
}
}
}
}
},
{
"operation": "shift",
"spec": {
"bizArray": {
"*": { // bizArray index
"Label": "Business[&1].Label",
"category": "Business[&1].category",
"weight": "Business[&1].weight"
}
},
"typesArray": {
"*": "Business[&].types"
}
}
}
]