How to Parse a JSON Object Element That Contains a Dynamic Attribute

We have a JSON object with the following structure:
{
  "results": {
    "timesheets": {
      "135288482": {
        "id": 135288482,
        "user_id": 1242515,
        "jobcode_id": 17288283,
        "customfields": {
          "19142": "Item 1",
          "19144": "Item 2"
        },
        "attached_files": [
          50692,
          44878
        ],
        "last_modified": "1970-01-01T00:00:00+00:00"
      },
      "135288514": {
        "id": 135288514,
        "user_id": 1242509,
        "jobcode_id": 18080900,
        "customfields": {
          "19142": "Item 1",
          "19144": "Item 2"
        },
        "attached_files": [
          50692,
          44878
        ],
        "last_modified": "1970-01-01T00:00:00+00:00"
      }
    }
  }
}
We need to access the elements that are inside results --> timesheets --> dynamic id.
Example:
{
  "id": 135288482,
  "user_id": 1242515,
  "jobcode_id": 17288283,
  "customfields": {
    "19142": "Item 1",
    "19144": "Item 2"
  },
  "attached_files": [
    50692,
    44878
  ],
  "last_modified": "1970-01-01T00:00:00+00:00"
}
The problem is that the key "135288482" is dynamic. How do we access what is inside it?
We are trying to create a data flow to parse the data. Because the keys are dynamic, accessing them by attribute name is not possible.

AFAIK, given your JSON structure and its dynamic keys, it might not be possible to get the desired result using a Data Flow.
I have reproduced the above and was able to get it done using Set Variable and ForEach activities, as shown below.
In your JSON, each object's key is the same as its id value, so I used that to get the list of keys first. Then, using that list of keys, I can access each inner JSON object.
These are my variables in the pipeline:
First, I created a Set Variable activity of type String and stored the split marker "id": in it.
Then I used a Lookup activity to read the above JSON file from blob storage.
I stored the timesheets object from the lookup as a string in another variable using the dynamic content below:
@string(activity('Lookup1').output.value[0].results.timesheets)
I then split that string on the "id": marker and stored the resulting array in an Array variable:
@split(variables('jsonasstring'), variables('ids'))
This produces an array like the one below.
Next, I used a ForEach over this array, skipping the first element. In each iteration, an Append Variable activity takes the first 9 characters of the current string; that substring is the key.
If your keys and id values are not the same, you can instead skip the last element of the split array and take the keys from the end of each fragment, as your data requires.
This is my dynamic content for the Append Variable activity inside the ForEach: @take(item(), 9)
Then I added another ForEach and passed this list of keys to it. Inside it, you can access each inner JSON object with the dynamic content below:
@string(activity('Lookup1').output.value[0].results.timesheets[item()])
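As a cross-check of the approach, here is a minimal Python sketch of the same key-extraction idea, assuming (as in the sample) that the ids are always 9 digits:

import json

# Sample payload mirroring the structure in the question (truncated).
payload = {
    "results": {
        "timesheets": {
            "135288482": {"id": 135288482, "user_id": 1242515},
            "135288514": {"id": 135288514, "user_id": 1242509},
        }
    }
}

# Stringify the timesheets object compactly (like ADF's string() output),
# split on the '"id":' marker, and take the first 9 characters of every
# fragment after the first: those are the dynamic keys.
as_string = json.dumps(payload["results"]["timesheets"], separators=(",", ":"))
keys = [fragment[:9] for fragment in as_string.split('"id":')[1:]]

# The recovered keys can then be used to index into the inner objects.
for key in keys:
    print(payload["results"]["timesheets"][key])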
This is my pipeline JSON:
{
"name": "pipeline1",
"properties": {
"activities": [
{
"name": "Lookup1",
"type": "Lookup",
"dependsOn": [
{
"activity": "for ids",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"source": {
"type": "JsonSource",
"storeSettings": {
"type": "AzureBlobFSReadSettings",
"recursive": true,
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "JsonReadSettings"
}
},
"dataset": {
"referenceName": "Json1",
"type": "DatasetReference"
},
"firstRowOnly": false
}
},
{
"name": "JSON as STRING",
"type": "SetVariable",
"dependsOn": [
{
"activity": "Lookup1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "jsonasstring",
"value": {
"value": "#string(activity('Lookup1').output.value[0].results.timesheets)",
"type": "Expression"
}
}
},
{
"name": "for ids",
"type": "SetVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "ids",
"value": {
"value": "#string('\"id\":')",
"type": "Expression"
}
}
},
{
"name": "after split",
"type": "SetVariable",
"dependsOn": [
{
"activity": "JSON as STRING",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "split_array",
"value": {
"value": "#split(variables('jsonasstring'), variables('ids'))",
"type": "Expression"
}
}
},
{
"name": "ForEach to append keys to array",
"type": "ForEach",
"dependsOn": [
{
"activity": "after split",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"items": {
"value": "#skip(variables('split_array'),1)",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "Append variable1",
"type": "AppendVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "key_ids_array",
"value": {
"value": "#take(item(), 9)",
"type": "Expression"
}
}
}
]
}
},
{
"name": "ForEach to access inner object",
"type": "ForEach",
"dependsOn": [
{
"activity": "ForEach to append keys to array",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"items": {
"value": "#variables('key_ids_array')",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "Each object",
"type": "SetVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "show_res",
"value": {
"value": "#string(activity('Lookup1').output.value[0].results.timesheets[item()])",
"type": "Expression"
}
}
}
]
}
}
],
"variables": {
"jsonasstring": {
"type": "String"
},
"ids": {
"type": "String"
},
"split_array": {
"type": "Array"
},
"key_ids_array": {
"type": "Array"
},
"show_res": {
"type": "String"
}
},
"annotations": []
}
}
Result:
Use @json() in the dynamic content to convert the resulting string back into an object.
If you want to store these JSON objects in a file, use a ForEach, and inside it use a SQL script that copies each object as a row into a table on each iteration using JSON_VALUE. Then, outside the ForEach, use a Copy activity to copy that SQL table to your destination as per your requirement.

Related

Parsing JSON array to individual variables in Azure logic apps

I am trying to write a logic app to parse a JSON object and update a Salesforce record. I am pretty new to both Salesforce and Azure Logic Apps, so I am trying to figure this out. Below is my JSON file:
{
  "ContactId": null,
  "Email": "asong@uog.com",
  "IsInternalUpdate": false,
  "Preferences": [
    {
      "PrefCode": "EmailOptIn",
      "CurrentValue": "Yes",
      "Locale": "en-US"
    },
    {
      "PrefCode": "MobilePhone",
      "CurrentValue": "1234567890",
      "Locale": "en-US"
    },
    {
      "PrefCode": "SMSOptIn",
      "CurrentValue": "Yes",
      "Locale": "en-US"
    },
    {
      "PrefCode": "ProductTrends",
      "CurrentValue": "ProductTrends,OffersPromotions",
      "Locale": "en-US"
    }
  ]
}
Based on the email value, I need to update a custom object in Salesforce. In the Preferences array, PrefCode maps to a field in Salesforce and CurrentValue maps to that field's value; i.e., the snippet below translates to setting the EmailOptIn field in Salesforce to "Yes".
{
  "PrefCode": "EmailOptIn",
  "CurrentValue": "Yes",
  "Locale": "en-US"
}
So far, I have been able to pass hardcoded values and successfully update a Salesforce record from the logic app.
I am trying to set individual variables for each field so that I can pass them directly to Salesforce. There are two issues I am running into:
What is the best way to capture the field/value mapping?
I have a couple of fields that allow multi-select; how do I set the multi-select values? Below is an example:
{
  "PrefCode": "ProductTrends",
  "CurrentValue": "ProductTrends,OffersPromotions",
  "Locale": "en-US"
}
Below is my logic app structure
What is the best way to capture the field value mapping?
I was able to achieve your requirement by constructing JSON with the required details and using a Parse JSON step again. Below is the flow of the logic app that worked for me.
I have couple of fields that allow multi select, how do I set the multiselect values. Below is an example
I converted ProductTrends into an array and retrieved each item by its index, as below:
array(split(body('Parse_JSON_2')?['ProductTrends'],','))[0]
array(split(body('Parse_JSON_2')?['ProductTrends'],','))[1]
Alternatively, if it is possible to make things work from the target end for multiple values, you can send the array or string directly to Salesforce.
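To make the mapping idea concrete, here is a small Python sketch of the same approach, using sample values from the question (the logic app does this with Append to string variable and a second Parse JSON):

# Build a PrefCode -> CurrentValue mapping from the Preferences array,
# then split multi-select values on commas.
preferences = [
    {"PrefCode": "EmailOptIn", "CurrentValue": "Yes", "Locale": "en-US"},
    {"PrefCode": "ProductTrends",
     "CurrentValue": "ProductTrends,OffersPromotions", "Locale": "en-US"},
]

mapping = {p["PrefCode"]: p["CurrentValue"] for p in preferences}
print(mapping["EmailOptIn"])                   # Yes
print(mapping["ProductTrends"].split(",")[0])  # ProductTrends
print(mapping["ProductTrends"].split(",")[1])  # OffersPromotions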
Below is the complete JSON of my logic app.
{
"definition": {
"$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
"actions": {
"Compose": {
"inputs": {
"ContactId": null,
"Email": "asong#uog.com",
"IsInternalUpdate": false,
"Preferences": [
{
"CurrentValue": "Yes",
"Locale": "en-US",
"PrefCode": "EmailOptIn"
},
{
"CurrentValue": "1234567890",
"Locale": "en-US",
"PrefCode": "MobilePhone"
},
{
"CurrentValue": "Yes",
"Locale": "en-US",
"PrefCode": "SMSOptIn"
},
{
"CurrentValue": "ProductTrends,OffersPromotions",
"Locale": "en-US",
"PrefCode": "ProductTrends"
}
]
},
"runAfter": {},
"type": "Compose"
},
"Compose_2": {
"inputs": "#array(split(body('Parse_JSON_2')?['ProductTrends'],','))[0]",
"runAfter": {
"Parse_JSON_2": [
"Succeeded"
]
},
"type": "Compose"
},
"For_each": {
"actions": {
"Append_to_string_variable": {
"inputs": {
"name": "Preferences",
"value": "\"#{items('For_each')?['PrefCode']}\":\"#{items('For_each')?['CurrentValue']}\",\n"
},
"runAfter": {},
"type": "AppendToStringVariable"
}
},
"foreach": "#body('Parse_JSON')?['Preferences']",
"runAfter": {
"Initialize_variable": [
"Succeeded"
]
},
"type": "Foreach"
},
"Initialize_variable": {
"inputs": {
"variables": [
{
"name": "Preferences",
"type": "string"
}
]
},
"runAfter": {
"Parse_JSON": [
"Succeeded"
]
},
"type": "InitializeVariable"
},
"Parse_JSON": {
"inputs": {
"content": "#outputs('Compose')",
"schema": {
"properties": {
"ContactId": {},
"Email": {
"type": "string"
},
"IsInternalUpdate": {
"type": "boolean"
},
"Preferences": {
"items": {
"properties": {
"CurrentValue": {
"type": "string"
},
"Locale": {
"type": "string"
},
"PrefCode": {
"type": "string"
}
},
"required": [
"PrefCode",
"CurrentValue",
"Locale"
],
"type": "object"
},
"type": "array"
}
},
"type": "object"
}
},
"runAfter": {
"Compose": [
"Succeeded"
]
},
"type": "ParseJson"
},
"Parse_JSON_2": {
"inputs": {
"content": "#concat('{',slice(variables('Preferences'),0,lastIndexOf(variables('Preferences'),',')),'}')",
"schema": {
"properties": {
"EmailOptIn": {
"type": "string"
},
"MobilePhone": {
"type": "string"
},
"ProductTrends": {
"type": "string"
},
"SMSOptIn": {
"type": "string"
}
},
"type": "object"
}
},
"runAfter": {
"For_each": [
"Succeeded"
]
},
"type": "ParseJson"
}
},
"contentVersion": "1.0.0.0",
"outputs": {},
"parameters": {},
"triggers": {
"manual": {
"inputs": {
"schema": {}
},
"kind": "Http",
"type": "Request"
}
}
},
"parameters": {}
}
After following the above process, I was able to get the preference values individually.

Azure Data Factory appending to JSON

Wanted to pick your brains on something.
In Azure Data Factory, I am running a set of activities which at the end of the run produce a JSON segment:
{"name":"myName", "email":"email@somewhere.com", .. <more elements> }
This set of activities occurs in a loop (a Loop Until activity).
My goal is to have a final JSON object like this:
"profiles":[
{"name":"myName", "email":"email#somewhere.com", .. <more elements> },
{"name":"myName", "email":"email#somewhere.com", .. <more elements> },
{"name":"myName", "email":"email#somewhere.com", .. <more elements> },
...
{"name":"myName", "email":"email#somewhere.com", .. <more elements> }
]
That is, a concatenation of all the individual segments.
To put it in perspective, each individual item is one page of data from a REST API, and together they constitute the final response. I have no control over how many there are.
I understand how to concatenate the individual items using two variables:
jsonTemp = @concat(finalJson, individualResponse)
finalJson = jsonTemp
But I do not know how to bring it all under the single "profiles" root afterwards.
So this is a bit of a hacky way of doing it, and I'm happy to hear a better solution.
I'm assuming you have stored all your results in an array variable (let's call this A).
The first step is to find the number of elements in this array. You can do this using the length(..) function.
Then you go into a loop, iterating a counter variable and concatenating each of the elements of the array, making sure you add a ',' between elements. You also have to make sure you do not add the ',' after the last element (you will need an If condition to check whether your counter has reached the length of the array). At the end of this you should have one string variable like this:
{"name":"myName","email":"email#somewhere.com"},{"name":"myName","email":"email#somewhere.com"},{"name":"myName","email":"email#somewhere.com"},{"name":"myName","email":"email#somewhere.com"}
Now all you need to do is use this expression when you are pushing the response anywhere:
@json(concat('{"profiles":[',<your_string_variable_here>,']}'))
I agree with @Anupam Chand, and I am following the same process with a different second step.
You mentioned that your object data comes from API pages and that, to end the Until loop, you have to give a condition based on the number of pages (the number of iterations).
This is my pipeline:
First I initialized a counter and used it in each web page URL and in the Until condition to stop after a certain number of pages. As ADF does not support self-referencing variables, I used another temporary variable to increment the counter.
To store the objects from the Web activity in each iteration, I created an Array variable in the pipeline.
Inside the Until loop, use an Append Variable activity to append each object to that array, like below.
For my sample Web activity the dynamic content is @activity('Data from REST').output.data[0]; for you it will be something like @activity('Web1').output. Change it as per your requirement.
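In Python terms, the Until loop above behaves roughly like this sketch (the endpoint and the data[0] shape come from my sample, so adjust both to your API):

import requests

arr = []
counter = 1
# Until condition @equals('3', variables('counter')): stop after pages 1 and 2.
while counter != 3:
    page = requests.get(f"https://reqres.in/api/users?page={counter}").json()
    arr.append(page["data"][0])  # @activity('Data from REST').output.data[0]
    counter += 1  # done via a temp variable in ADF, which can't self-reference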
This is my Pipeline JSON:
{
"name": "pipeline3",
"properties": {
"activities": [
{
"name": "Until1",
"type": "Until",
"dependsOn": [
{
"activity": "Counter intialization",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"expression": {
"value": "#equals('3', variables('counter'))",
"type": "Expression"
},
"activities": [
{
"name": "Data from REST",
"type": "WebActivity",
"dependsOn": [
{
"activity": "counter in temp variable",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"url": {
"value": "https://reqres.in/api/users?page=#{variables('counter')}",
"type": "Expression"
},
"method": "GET"
}
},
{
"name": "counter in temp variable",
"type": "SetVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "tempCounter",
"value": {
"value": "#variables('counter')",
"type": "Expression"
}
}
},
{
"name": "Counter increment using temp",
"type": "SetVariable",
"dependsOn": [
{
"activity": "Data from REST",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "counter",
"value": {
"value": "#string(add(int(variables('tempCounter')),1))",
"type": "Expression"
}
}
},
{
"name": "Append web output to array",
"type": "AppendVariable",
"dependsOn": [
{
"activity": "Counter increment using temp",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "arr",
"value": {
"value": "#activity('Data from REST').output.data[0]",
"type": "Expression"
}
}
}
],
"timeout": "0.12:00:00"
}
},
{
"name": "Counter intialization",
"type": "SetVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "counter",
"value": {
"value": "#string('1')",
"type": "Expression"
}
}
},
{
"name": "To show res array",
"type": "SetVariable",
"dependsOn": [
{
"activity": "Until1",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "res_show",
"value": {
"value": "#variables('arr')",
"type": "Expression"
}
}
}
],
"variables": {
"arr": {
"type": "Array"
},
"counter": {
"type": "String"
},
"tempCounter": {
"type": "String"
},
"res_show": {
"type": "Array"
},
"arr_string": {
"type": "String"
}
},
"annotations": []
}
}
Result in an array variable:
You can access this array by the variable name. If you want the output shaped like yours, you can use the dynamic content below.
@json(concat('{','"profiles":',string(variables('res_show')),'}'))
However, if you want to store this in a variable, you have to wrap it in @string(), as ADF variables currently support only the String, Boolean, and Array types.

Map step function result and extract certain keys

I have the following output coming from a Step Functions task: ListObjectsV2
{
  "Contents": [
    {
      "ETag": "\"86c12c034bc6c30cb89b500b954c188f\"",
      "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv",
      "LastModified": "2023-02-09T13:46:20Z",
      "Size": 796014,
      "StorageClass": "STANDARD"
    },
    {
      "ETag": "\"58e4a770e0f66073b00d185df500f07f\"",
      "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv",
      "LastModified": "2023-02-09T13:47:20Z",
      "Size": 934038,
      "StorageClass": "STANDARD"
    },
    {
      "ETag": "\"460abd0de64d5cb67e8f0d46878cb1ef\"",
      "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv",
      "LastModified": "2023-02-09T13:46:57Z",
      "Size": 794264,
      "StorageClass": "STANDARD"
    },
    {
      "ETag": "\"1bfedc3dc92e4ba8d04e24b9b5a0ed58\"",
      "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv",
      "LastModified": "2023-02-09T13:46:24Z",
      "Size": 788756,
      "StorageClass": "STANDARD"
    },
    {
      "ETag": "\"9d6c434ce5ebdf203a790fbcf19338dc\"",
      "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv",
      "LastModified": "2023-02-09T13:47:07Z",
      "Size": 831156,
      "StorageClass": "STANDARD"
    }
  ],
  "IsTruncated": false,
  "KeyCount": 5,
  "MaxKeys": 1000,
  "Name": "vita-internal-text-classification-dev-183576513728",
  "Prefix": "55271f52fffe4461a2ee3228ebb97157"
}
I want to have an array containing only the Key field, to pass to the next state, like so:
[
  { "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv" },
  { "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv" },
  { "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv" },
  { "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv" },
  { "Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv" }
]
So far I've tried setting the ResultPath to:
$.Contents[*].Key
$.Contents[*].['Key']
What I get is:
[
  "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv",
  "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv",
  "55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv",
  "55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv",
  "55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv"
]
That gives me a flat array of key strings rather than the array of objects I need. Any help?
The way I've solved this is to use an Inline Map state with a Pass state to build the necessary format. You can see this pattern in an example here for how to use Step Functions Distributed Map to bulk delete objects from S3. You can see this in the inner Create Object Identifier Array Map state. If you were doing this in Standard Workflows, this could be a cost concern given the number of state transitions involved. But since in the Item Processor I'm using Express Workflows, which are billed by duration (and these are super fast), it works pretty well.
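Conceptually, the inner Create Object Identifier Array map just projects each listing entry down to its Key. A minimal Python sketch of that projection, with abbreviated sample data:

# Entries shaped like the ListObjectsV2 Contents array (paths abbreviated).
contents = [
    {"ETag": "86c1...", "Key": "55271f.../input/batch_1.csv", "Size": 796014},
    {"ETag": "58e4...", "Key": "55271f.../input/batch_2.csv", "Size": 934038},
]

# The Pass state's Parameters {"Key.$": "$.Key"} does this for each item:
object_identifiers = [{"Key": item["Key"]} for item in contents]
print(object_identifiers)
# [{'Key': '55271f.../input/batch_1.csv'}, {'Key': '55271f.../input/batch_2.csv'}]

Here is the full state machine from that example: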
{
"Comment": "A state machine to bulk delete objects from S3 using Distributed Map",
"StartAt": "Confirm Bucket Provided",
"States": {
"Confirm Bucket Provided": {
"Type": "Choice",
"Choices": [
{
"Not": {
"Variable": "$.bucket",
"IsPresent": true
},
"Next": "Fail - No Bucket"
}
],
"Default": "Check for Prefix"
},
"Check for Prefix": {
"Type": "Choice",
"Choices": [
{
"Not": {
"Variable": "$.prefix",
"IsPresent": true
},
"Next": "Generate Parameters - Without Prefix"
}
],
"Default": "Generate Parameters - With Prefix"
},
"Generate Parameters - Without Prefix": {
"Type": "Pass",
"Parameters": {
"Bucket.$": "$.bucket",
"Prefix": ""
},
"ResultPath": "$.list_parameters",
"Next": "Delete Objects from S3 Bucket"
},
"Fail - No Bucket": {
"Type": "Fail",
"Error": "InsuffcientArguments",
"Cause": "No Bucket was provided"
},
"Generate Parameters - With Prefix": {
"Type": "Pass",
"Next": "Delete Objects from S3 Bucket",
"Parameters": {
"Bucket.$": "$.bucket",
"Prefix.$": "$.prefix"
},
"ResultPath": "$.list_parameters"
},
"Delete Objects from S3 Bucket": {
"Type": "Map",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "DISTRIBUTED",
"ExecutionType": "EXPRESS"
},
"StartAt": "Create Object Identifier Array",
"States": {
"Create Object Identifier Array": {
"Type": "Map",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "INLINE"
},
"StartAt": "Create Object Identifier",
"States": {
"Create Object Identifier": {
"Type": "Pass",
"End": true,
"Parameters": {
"Key.$": "$.Key"
}
}
}
},
"ItemsPath": "$.Items",
"ResultPath": "$.object_identifiers",
"Next": "Delete Objects"
},
"Delete Objects": {
"Type": "Task",
"Next": "Clear Output",
"Parameters": {
"Bucket.$": "$.BatchInput.bucket",
"Delete": {
"Objects.$": "$.object_identifiers"
}
},
"Resource": "arn:aws:states:::aws-sdk:s3:deleteObjects",
"Retry": [
{
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2,
"IntervalSeconds": 1,
"MaxAttempts": 6
}
],
"ResultSelector": {
"Deleted.$": "$.Deleted",
"RetryCount.$": "$$.State.RetryCount"
}
},
"Clear Output": {
"Type": "Pass",
"End": true,
"Result": {}
}
}
},
"ItemReader": {
"Resource": "arn:aws:states:::s3:listObjectsV2",
"Parameters": {
"Bucket.$": "$.list_parameters.Bucket",
"Prefix.$": "$.list_parameters.Prefix"
}
},
"MaxConcurrency": 5,
"Label": "S3objectkeys",
"ItemBatcher": {
"MaxInputBytesPerBatch": 204800,
"MaxItemsPerBatch": 1000,
"BatchInput": {
"bucket.$": "$.list_parameters.Bucket"
}
},
"ResultSelector": {},
"End": true
}
}
}

JSON Schema with Nested Objects with different properties

The entire JSON file is rather large, so I've only included the subsection I'm having an issue with.
{
"diagrams": {
"5f759d15cd046720c28531dd": {
"_id": "5f759d15cd046720c28531dd",
"offsetX": 320,
"offsetY": 42,
"zoom": 80,
"modified": 1604279356,
"nodes": {
"5f9f5c3ccd046720c28531e4": {
"nodeID": "5f9f5c3ccd046720c28531e4",
"type": "start",
"coords": [
360,
120
],
"data": {
"name": "Start",
"color": "standard",
"ports": [
{
"type": "",
"target": "5f9f5c3ccd046720c28531e6"
}
],
"steps": []
}
},
"5f9f5c3ccd046720c28531e5": {
"nodeID": "5f9f5c3ccd046720c28531e5",
"type": "block",
"coords": [
760,
120
],
"data": {
"name": "Help Message",
"color": "standard",
"steps": [
"5f9f5c3ccd046720c28531e6",
"5f9f5c3ccd046720c28531e7"
]
}
},
"5f9f5c3ccd046720c28531e6": {
"nodeID": "5f9f5c3ccd046720c28531e6",
"type": "speak",
"data": {
"randomize": false,
"dialogs": [
{
"voice": "Alexa",
"content": "You said help. Do you want to continue?"
}
],
"ports": [
{
"type": "",
"target": "5f9f5c3ccd046720c28531e7"
}
]
}
},
"5f9f5c3ccd046720c28531e7": {
"nodeID": "5f9f5c3ccd046720c28531e7",
"type": "interaction",
"data": {
"name": "Choice",
"else": {
"type": "path",
"randomize": false,
"reprompts": []
},
"choices": [
{
"intent": "",
"mappings": []
},
{
"intent": "",
"mappings": []
}
],
"reprompt": null,
"ports": [
{
"type": "else",
"target": null
},
{
"type": "",
"target": null
},
{
"type": "",
"target": "5f9f5c3ccd046720c28531e9"
}
]
}
},
"5f9f5c3ccd046720c28531e8": {
"nodeID": "5f9f5c3ccd046720c28531e8",
"type": "block",
"coords": [
1170,
260
],
"data": {
"name": "Exit",
"color": "standard",
"steps": [
"5f9f5c3ccd046720c28531e9"
]
}
},
"5f9f5c3ccd046720c28531e9": {
"nodeID": "5f9f5c3ccd046720c28531e9",
"type": "exit",
"data": {
"ports": []
}
}
},
"children": [],
"creatorID": 42661,
"variables": [],
"name": "Help Flow",
"versionID": "5f759d15cd046720c28531db"
}
}
}
The current JSON Schema definition I have is:
{
  "$schema": "http://json-schema.org/schema#",
  "type": "object",
  "properties": {
    "diagrams": {
      "type": "object"
    }
  },
  "required": [
    "diagrams"
  ]
}
The problem I am having is that diagrams contains multiple objects whose names are random strings, e.g. "5f759d15cd046720c28531dd".
Within each of those objects there are properties such as _id and offsetX, which I want to express, as well as a nodes object, which again contains multiple objects with arbitrary names (e.g. "5f9f5c3ccd046720c28531e4", "5f9f5c3ccd046720c28531e5", ...). Each of those has a unique node definition, and some nodes have different properties than others (nodeID, type, data vs. nodeID, type, data, coords).
My question is: with all these arbitrary names and differing properties per node, how do I write one JSON Schema definition that covers all the ways a diagram/node can be made?
You can do this with additionalProperties or patternProperties.
additionalProperties applies to any property that isn't declared in properties or patternProperties.
{
  "type": "object",
  "additionalProperties": {
    "type": "object",
    "properties": {
      "_id": { ... },
      "offsetX": { ... },
      ...
    }
  }
}
Your property names appear to always be hex numbers. If you want to enforce that those property names are always hex numbers, you can use patternProperties. Any property that matches the regex must conform to that schema.
{
  "type": "object",
  "patternProperties": {
    "^[0-9a-f]{24}$": {
      "type": "object",
      "properties": {
        "_id": { ... },
        "offsetX": { ... },
        ...
      }
    }
  },
  "additionalProperties": false
}
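If you want to sanity-check a schema like this, the Python jsonschema package (my choice of tool here, not something from the question) can validate a sample document against it:

from jsonschema import ValidationError, validate

schema = {
    "type": "object",
    "patternProperties": {
        # 24-character hex ids as property names
        "^[0-9a-f]{24}$": {"type": "object"}
    },
    "additionalProperties": False,
}

validate({"5f759d15cd046720c28531dd": {"offsetX": 320}}, schema)  # passes
try:
    validate({"not-a-hex-id": {}}, schema)
except ValidationError as err:
    print(err.message)  # the property name fails the pattern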

Azure DataFactory ForEach Copy activity is not iterating through but instead pulling all files in blob. Why?

I have a pipeline in DF2 that has to look at a folder in blob storage and process each of its 145 files sequentially into a database table. After each file has been loaded into the table, a stored procedure should be triggered that will check each record and either insert it, or update an existing record, in a master table.
Looking online, I feel as though I have tried every combination of Get Metadata, ForEach, Lookup and Set Variable activities that has been suggested, but for some reason my Copy Data activity STILL picks up all files at the same time and runs 145 times.
I recently found a blog online that I followed to use Set Variable, as it will be useful for multiple file locations, but it does not work for me. I need to read the files as CSVs into tables, not as binary objects, so I think this is where my issue is.
{
"name": "BulkLoadPipeline",
"properties": {
"activities": [
{
"name": "GetFileNames",
"type": "GetMetadata",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"dataset": {
"referenceName": "DelimitedText1",
"type": "DatasetReference",
"parameters": {
"fileName": "#item()"
}
},
"fieldList": [
"childItems"
],
"storeSettings": {
"type": "AzureBlobStorageReadSetting"
},
"formatSettings": {
"type": "DelimitedTextReadSetting"
}
}
},
{
"name": "CopyDataRunDeltaCheck",
"type": "ForEach",
"dependsOn": [
{
"activity": "BuildList",
"dependencyConditions": [
"Succeeded"
]
}
],
"typeProperties": {
"items": {
"value": "#variables('fileList')",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "WriteToTables",
"type": "Copy",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"source": {
"type": "DelimitedTextSource",
"storeSettings": {
"type": "AzureBlobStorageReadSetting",
"wildcardFileName": "*.*"
},
"formatSettings": {
"type": "DelimitedTextReadSetting"
}
},
"sink": {
"type": "AzureSqlSink"
},
"enableStaging": false,
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"name": "myID",
"type": "String"
},
"sink": {
"name": "myID",
"type": "String"
}
},
{
"source": {
"name": "Col1",
"type": "String"
},
"sink": {
"name": "Col1",
"type": "String"
}
},
{
"source": {
"name": "Col2",
"type": "String"
},
"sink": {
"name": "Col2",
"type": "String"
}
},
{
"source": {
"name": "Col3",
"type": "String"
},
"sink": {
"name": "Col3",
"type": "String"
}
},
{
"source": {
"name": "Col4",
"type": "String"
},
"sink": {
"name": "Col4",
"type": "String"
}
},
{
"source": {
"name": "DW Date Created",
"type": "String"
},
"sink": {
"name": "DW_Date_Created",
"type": "String"
}
},
{
"source": {
"name": "DW Date Updated",
"type": "String"
},
"sink": {
"name": "DW_Date_Updated",
"type": "String"
}
}
]
}
},
"inputs": [
{
"referenceName": "DelimitedText1",
"type": "DatasetReference",
"parameters": {
"fileName": "#item()"
}
}
],
"outputs": [
{
"referenceName": "myTable",
"type": "DatasetReference"
}
]
},
{
"name": "CheckDeltas",
"type": "SqlServerStoredProcedure",
"dependsOn": [
{
"activity": "WriteToTables",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"storedProcedureName": "[TL].[uspMyCheck]"
},
"linkedServiceName": {
"referenceName": "myService",
"type": "LinkedServiceReference"
}
}
]
}
},
{
"name": "BuildList",
"type": "ForEach",
"dependsOn": [
{
"activity": "GetFileNames",
"dependencyConditions": [
"Succeeded"
]
}
],
"typeProperties": {
"items": {
"value": "#activity('GetFileNames').output.childItems",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "Create list from variables",
"type": "AppendVariable",
"typeProperties": {
"variableName": "fileList",
"value": "#item().name"
}
}
]
}
}
],
"variables": {
"fileList": {
"type": "Array"
}
}
}
}
The Details screen of the pipeline output shows that the pipeline loops for the number of items in the blob, but each time the Copy Data and Stored Procedure activities run against every file in the list at once, as opposed to one file at a time.
I feel like I am close to the answer but am missing one vital part. Any help or suggestions are GREATLY appreciated.
Your payload is not correct.
The Get Metadata activity should not use the same dataset as the Copy activity.
The Get Metadata activity should reference a dataset that points at a folder containing all the files you want to deal with, but your dataset has a 'fileName' parameter. Also note that your Copy source sets "wildcardFileName": "*.*", which overrides the per-file fileName parameter and makes every iteration copy every file in the folder.
Use the output of the Get Metadata activity as the input of the ForEach activity.
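As a rough sketch of the intended control flow (the helper names below are hypothetical stand-ins for the ADF activities, not real ADF syntax): list the folder once, then copy exactly one named file per iteration.

def get_metadata(folder):
    """Get Metadata: list childItems of a folder-level dataset (no fileName)."""
    return ["file_001.csv", "file_002.csv", "file_003.csv"]

def copy_file(file_name):
    """Copy activity: dataset fileName = @item().name, no wildcard in source."""
    print(f"copying {file_name}")

def run_stored_procedure():
    """Stored Procedure activity: runs after each file is loaded."""
    print("checking deltas")

# ForEach (isSequential: true) over the childItems from Get Metadata.
for name in get_metadata("input-folder"):
    copy_file(name)
    run_stored_procedure()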