Azure Data Factory ForEach Copy activity is not iterating through but instead pulling all files in blob. Why?

I have a pipeline in Data Factory v2 that has to look at a folder in blob storage and process each of the 145 files sequentially into a database table. After each file has been loaded into the table, a stored procedure should be triggered that checks each record and either inserts it or updates an existing record in a master table.
Looking online, I feel as though I have tried every combination of "Get Metadata", "ForEach", "Lookup" and "Assign Variable" activities that has been suggested, but for some reason my Copy Data activity STILL picks up all files at the same time and runs 145 times.
I recently found a blog that I followed to use "Assign Variable", as it will be useful for multiple file locations, but it does not work for me. I need to read the files as CSVs into tables, not as binary objects, which I think is my issue.
{
"name": "BulkLoadPipeline",
"properties": {
"activities": [
{
"name": "GetFileNames",
"type": "GetMetadata",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"dataset": {
"referenceName": "DelimitedText1",
"type": "DatasetReference",
"parameters": {
"fileName": "#item()"
}
},
"fieldList": [
"childItems"
],
"storeSettings": {
"type": "AzureBlobStorageReadSetting"
},
"formatSettings": {
"type": "DelimitedTextReadSetting"
}
}
},
{
"name": "CopyDataRunDeltaCheck",
"type": "ForEach",
"dependsOn": [
{
"activity": "BuildList",
"dependencyConditions": [
"Succeeded"
]
}
],
"typeProperties": {
"items": {
"value": "#variables('fileList')",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "WriteToTables",
"type": "Copy",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"source": {
"type": "DelimitedTextSource",
"storeSettings": {
"type": "AzureBlobStorageReadSetting",
"wildcardFileName": "*.*"
},
"formatSettings": {
"type": "DelimitedTextReadSetting"
}
},
"sink": {
"type": "AzureSqlSink"
},
"enableStaging": false,
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"name": "myID",
"type": "String"
},
"sink": {
"name": "myID",
"type": "String"
}
},
{
"source": {
"name": "Col1",
"type": "String"
},
"sink": {
"name": "Col1",
"type": "String"
}
},
{
"source": {
"name": "Col2",
"type": "String"
},
"sink": {
"name": "Col2",
"type": "String"
}
},
{
"source": {
"name": "Col3",
"type": "String"
},
"sink": {
"name": "Col3",
"type": "String"
}
},
{
"source": {
"name": "Col4",
"type": "String"
},
"sink": {
"name": "Col4",
"type": "String"
}
},
{
"source": {
"name": "DW Date Created",
"type": "String"
},
"sink": {
"name": "DW_Date_Created",
"type": "String"
}
},
{
"source": {
"name": "DW Date Updated",
"type": "String"
},
"sink": {
"name": "DW_Date_Updated",
"type": "String"
}
}
]
}
},
"inputs": [
{
"referenceName": "DelimitedText1",
"type": "DatasetReference",
"parameters": {
"fileName": "#item()"
}
}
],
"outputs": [
{
"referenceName": "myTable",
"type": "DatasetReference"
}
]
},
{
"name": "CheckDeltas",
"type": "SqlServerStoredProcedure",
"dependsOn": [
{
"activity": "WriteToTables",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"storedProcedureName": "[TL].[uspMyCheck]"
},
"linkedServiceName": {
"referenceName": "myService",
"type": "LinkedServiceReference"
}
}
]
}
},
{
"name": "BuildList",
"type": "ForEach",
"dependsOn": [
{
"activity": "GetFileNames",
"dependencyConditions": [
"Succeeded"
]
}
],
"typeProperties": {
"items": {
"value": "#activity('GetFileNames').output.childItems",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "Create list from variables",
"type": "AppendVariable",
"typeProperties": {
"variableName": "fileList",
"value": "#item().name"
}
}
]
}
}
],
"variables": {
"fileList": {
"type": "Array"
}
}
}
}
The Details screen of the pipeline output shows that the pipeline loops for the number of items in the blob, but each time the Copy Data and Stored Procedure activities run against every file in the list at once, as opposed to one at a time.
I feel like I am close to the answer but missing one vital part. Any help or suggestions are GREATLY appreciated.

Your payload is not correct.
The GetMetadata activity should not use the same dataset as the Copy activity.
The GetMetadata activity should reference a dataset that points at a folder, where the folder contains all the files you want to deal with; but your dataset has a 'fileName' parameter.
Then use the output of the GetMetadata activity as the input of the ForEach activity.
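As a minimal sketch of that wiring (assuming a second, hypothetical dataset named DelimitedTextFolder that points at the folder and takes no fileName parameter, while the existing parameterized DelimitedText1 stays on the Copy activity; policies, mappings and the stored procedure step are elided). Note also that the wildcardFileName of *.* should be dropped from the Copy source's storeSettings, since a wildcard there makes every iteration read all files regardless of the dataset's fileName parameter:
{
"name": "GetFileNames",
"type": "GetMetadata",
"typeProperties": {
"dataset": {
"referenceName": "DelimitedTextFolder",
"type": "DatasetReference"
},
"fieldList": [
"childItems"
]
}
},
{
"name": "CopyDataRunDeltaCheck",
"type": "ForEach",
"dependsOn": [
{
"activity": "GetFileNames",
"dependencyConditions": [
"Succeeded"
]
}
],
"typeProperties": {
"items": {
"value": "@activity('GetFileNames').output.childItems",
"type": "Expression"
},
"isSequential": true,
"activities": [
{
"name": "WriteToTables",
"type": "Copy",
"typeProperties": {
"source": {
"type": "DelimitedTextSource",
"storeSettings": {
"type": "AzureBlobStorageReadSetting"
},
"formatSettings": {
"type": "DelimitedTextReadSetting"
}
},
"sink": {
"type": "AzureSqlSink"
}
},
"inputs": [
{
"referenceName": "DelimitedText1",
"type": "DatasetReference",
"parameters": {
"fileName": "@item().name"
}
}
],
"outputs": [
{
"referenceName": "myTable",
"type": "DatasetReference"
}
]
}
]
}
}
With this, each iteration copies exactly one file (@item().name), and the BuildList/Append Variable step becomes unnecessary.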

Azure Function App new JSON with only a few properties

I am struggling to find a feasible solution for my Azure Logic App.
An HTTP Request action call lists the virtual networks of an Azure subscription.
The response looks something like this:
{
"value": [
{
"name": "virtualNetworkName1",
"id": "/subscriptions/111-222-333/resourceGroups/resourceGroupName1/providers/Microsoft.Network/virtualNetworks/virtualNetworkName1",
"etag": "W/\"11111-1111-111\"",
"type": "Microsoft.Network/virtualNetworks",
"location": "eastus",
"properties": {
"provisioningState": "Succeeded",
"resourceGuid": "111-1111-11111",
"addressSpace": {
"addressPrefixes": [
"192.168.0.0/25"
]
}
}
},
{
"name": "virtualNetworkName2",
"id": "/subscriptions/111-222-333/resourceGroups/resourceGroupName2/providers/Microsoft.Network/virtualNetworks/virtualNetworkName2",
"etag": "W/\"22222-2222-222\"",
"type": "Microsoft.Network/virtualNetworks",
"location": "westeurope",
"properties": {
"provisioningState": "Succeeded",
"resourceGuid": "222-2222-22222",
"addressSpace": {
"addressPrefixes": [
"192.168.1.0/24"
]
}
}
}
]
}
The response has even more properties, which aren't necessary.
Because of this, I would like the HTTP Response action to return JSON with only a few properties:
Name
Id
Location
Like this:
[
{
"name": "virtualNetworkName1",
"id": "/subscriptions/111-222-333/resourceGroups/resourceGroupName1/providers/Microsoft.Network/virtualNetworks/virtualNetworkName1",
"location": "eastus"
},
{
"name": "virtualNetworkName2",
"id": "/subscriptions/111-222-333/resourceGroups/resourceGroupName2/providers/Microsoft.Network/virtualNetworks/virtualNetworkName2",
"location": "westeurope"
}
]
Is it possible to realize it with only Logic App native actions?
I have reproduced this in my environment and got the expected results as below.
First, I initialized your output and then used Parse JSON and Compose to get the required output.
Parse JSON schema:
{
"properties": {
"value": {
"items": {
"properties": {
"id": {
"type": "string"
},
"location": {
"type": "string"
},
"name": {
"type": "string"
}
},
"type": "object"
},
"type": "array"
}
},
"type": "object"
}
Output of the first and second Compose actions (screenshots omitted).
Code view of the Logic App:
{
"definition": {
"$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
"actions": {
"Compose": {
"inputs": {
"value": [
{
"etag": "W/\"11111-1111-111\"",
"id": "/subscriptions/111-222-333/resourceGroups/resourceGroupName1/providers/Microsoft.Network/virtualNetworks/virtualNetworkName1",
"location": "eastus",
"name": "virtualNetworkName1",
"properties": {
"addressSpace": {
"addressPrefixes": [
"192.168.0.0/25"
]
},
"provisioningState": "Succeeded",
"resourceGuid": "111-1111-11111"
},
"type": "Microsoft.Network/virtualNetworks"
},
{
"etag": "W/\"22222-2222-222\"",
"id": "/subscriptions/111-222-333/resourceGroups/resourceGroupName2/providers/Microsoft.Network/virtualNetworks/virtualNetworkName2",
"location": "westeurope",
"name": "virtualNetworkName2",
"properties": {
"addressSpace": {
"addressPrefixes": [
"192.168.1.0/24"
]
},
"provisioningState": "Succeeded",
"resourceGuid": "222-2222-22222"
},
"type": "Microsoft.Network/virtualNetworks"
}
]
},
"runAfter": {},
"type": "Compose"
},
"For_each": {
"actions": {
"Compose_2": {
"inputs": {
"id": "#items('For_each')?['id']",
"location": "#items('For_each')?['location']",
"name": "#items('For_each')?['name']"
},
"runAfter": {},
"type": "Compose"
}
},
"foreach": "#body('Parse_JSON')?['value']",
"runAfter": {
"Parse_JSON": [
"Succeeded"
]
},
"type": "Foreach"
},
"Parse_JSON": {
"inputs": {
"content": "#outputs('Compose')",
"schema": {
"properties": {
"value": {
"items": {
"properties": {
"id": {
"type": "string"
},
"location": {
"type": "string"
},
"name": {
"type": "string"
}
},
"type": "object"
},
"type": "array"
}
},
"type": "object"
}
},
"runAfter": {
"Compose": [
"Succeeded"
]
},
"type": "ParseJson"
}
},
"contentVersion": "1.0.0.0",
"outputs": {},
"parameters": {},
"triggers": {
"manual": {
"inputs": {
"schema": {}
},
"kind": "Http",
"type": "Request"
}
}
},
"parameters": {}
}
Now you will get the required fields, as I did.
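For illustration (derived from the sample input above), the Compose_2 action inside the loop would emit, for the first virtual network, something like:
{
"id": "/subscriptions/111-222-333/resourceGroups/resourceGroupName1/providers/Microsoft.Network/virtualNetworks/virtualNetworkName1",
"location": "eastus",
"name": "virtualNetworkName1"
}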

How to use Parser Transformation for JSON data in IICS?

I am new to IICS and I have JSON data as below, which I would like to parse into a CSV file. I am using this link as a reference to achieve the transformation. I created a valid mapping in IICS and the mapping runs fine. However, when I look at my jobs I receive the error below. I went to the path mentioned and opened the Events.cme file in Notepad, but cannot make out what the file is saying (note: in the output below I deleted a few of the numbers).
Not sure what is wrong. Do I need to save my JSON data file as a txt file?
Any help will be appreciated! Thanks in advance!
ERROR after running the mapping
[ERROR] Failed to process data: File C:/IICSLabFiles/test.json doesn't exist or isn't readable- for more information see file://C:/PROGRA~1/Informatica Cloud Secure Agent/apps/Data_Integration_Server/data/CMReports/Tmp/2022-06-01/HierarchyParser_h2r_udt_8gns3_ONLY_H2R_XMAP_/Events.cme
Opening Events.cme file in notepad produces following
<B#80010%#>
!~109146~165266~~10.2.2.65()
<B#80032%#>
</B#8032%#>
<m -- XMap%m>
!~103149~1654220266~~Pages\/page_m_1.cmv%Pages\/page_m_1.json
<B#80037%XML#>
!~1031~1654220266~~Pages\/Input_of_m_1.cmv%Pages\/Input_of_m_1.json
<LocalFile>
!~309025~16542266~~C:\/IICSLabFiles\/test.json
</LocalFile>
!~103205~16540266~~C:\/IICSLabFiles\/test.json
!~3033~1654220266~~
</B#8007%XML#>
</m -- XMap>
</B#80010%#>
JSON Data that is saved in test.json (with File type as JSON File):
{
"current_page": 1,
"first_page_url": "https://covid-api.com/api/regions?per_page=20&page=1",
"last_page_url": "https://covid-api.com/api/regions?per_page=20&page=50",
"next_page_url": "https://covid-api.com/api/regions?per_page=20&page=2",
"prev_page_url": null,
"per_page": "20",
"last_page": 50,
"from": 1,
"path": "https://covid-api.com/api/regions",
"to": 20,
"total": 997,
"data": [
{
"iso": "CHN",
"name": "China"
},
{
"iso": "TWN",
"name": "Taipei and environs"
},
{
"iso": "USA",
"name": "US"
},
{
"iso": "JPN",
"name": "Japan"
},
{
"iso": "THA",
"name": "Thailand"
},
{
"iso": "KOR",
"name": "Korea, South"
},
{
"iso": "SGP",
"name": "Singapore"
},
{
"iso": "PHL",
"name": "Philippines"
},
{
"iso": "MYS",
"name": "Malaysia"
},
{
"iso": "VNM",
"name": "Vietnam"
},
{
"iso": "AUS",
"name": "Australia"
},
{
"iso": "MEX",
"name": "Mexico"
},
{
"iso": "BRA",
"name": "Brazil"
},
{
"iso": "COL",
"name": "Colombia"
},
{
"iso": "FRA",
"name": "France"
},
{
"iso": "NPL",
"name": "Nepal"
},
{
"iso": "CAN",
"name": "Canada"
},
{
"iso": "KHM",
"name": "Cambodia"
},
{
"iso": "LKA",
"name": "Sri Lanka"
},
{
"iso": "CIV",
"name": "Cote d'Ivoire"
}
]
}
JSON schema that is saved in Hierarchy Schema (with file type as JSON File):
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"current_page": {
"type": "integer"
},
"first_page_url": {
"type": "string"
},
"last_page_url": {
"type": "string"
},
"next_page_url": {
"type": "string"
},
"prev_page_url": {
"type": "null"
},
"per_page": {
"type": "string"
},
"last_page": {
"type": "integer"
},
"from": {
"type": "integer"
},
"path": {
"type": "string"
},
"to": {
"type": "integer"
},
"total": {
"type": "integer"
},
"data": {
"type": "array",
"items": [
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
},
{
"type": "object",
"properties": {
"iso": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"iso",
"name"
]
}
]
}
},
"required": [
"current_page",
"first_page_url",
"last_page_url",
"next_page_url",
"prev_page_url",
"per_page",
"last_page",
"from",
"path",
"to",
"total",
"data"
]
}
Source connection setup (screenshot omitted)
The Path_text file contains the following information:
Path
C:/IICSLabFiles/test.json
The error message "C:/IICSLabFiles/test.json doesn't exist or isn't readable" suggests you are trying to read a local file. Is this the path to the file located on the Secure Agent running the mapping, or is it the path to a file stored on your laptop? What is your Source definition?
Keep in mind that you design the mapping on your laptop, where you have access to files stored on your laptop, but once you execute it, it gets processed by the Secure Agent (which can be a different machine, cloud-hosted, etc.). In this case it seems the Secure Agent cannot access the file at the given location.
It's also possible to have the Secure Agent installed on your machine and run the process on the laptop where you have actually been designing the mapping. In that case, please make sure there are no typos in the path and no leading or trailing spaces. And if it's a Windows-based Secure Agent, verify the path, as the one you use has forward slashes while Windows usually uses backslashes:
C:/IICSLabFiles/test.json
vs
C:\IICSLabFiles\test.json

ARM Templates - Values and parameters for Adding Dynamic Data disks to VMs?

I'm new to ARM Templates.
I've downloaded an ARM Template from the Portal after building a VM with 1 managed Data Disk.
My objective is to use ARM Templates to build several VMs in a row.
For now, they would have identical parameters except for the VM name and, of course, the NIC and disk names.
I noticed the parameters.json file had hardcoded values, which wouldn't work as a template, so I started modifying it to see how I could make it more dynamic.
However, I don't understand the data disk structure, which in this template is divided among different components, and that's making me struggle with dynamic naming for the disks.
Data disks appear in the template as a resource and then as a property of the VM, inside a copy function.
However, in the parameters file there are two objects, dataDisks and dataDiskResources.
I don't understand why the parameters have two different objects instead of one (for example, everything inside dataDisks instead of also having a dataDiskResources), and I also don't get why the VM disk property has different, and more, parameters than the disk resource.
This is the template.json
{
"$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"location": {
"type": "string"
},
"subnetName": {
"type": "string"
},
"virtualNetworkId": {
"type": "string"
},
"virtualMachineName": {
"type": "string"
},
"virtualMachineRG": {
"type": "string"
},
"osDiskType": {
"type": "string"
},
"dataDisks": {
"type": "array"
},
"dataDiskResources": {
"type": "array"
},
"virtualMachineSize": {
"type": "string"
},
"adminUsername": {
"type": "string"
},
"adminPassword": {
"type": "secureString"
},
"diagnosticsStorageAccountName": {
"type": "string"
},
"diagnosticsStorageAccountId": {
"type": "string"
},
"diagnosticsStorageAccountType": {
"type": "string"
},
"diagnosticsStorageAccountKind": {
"type": "string"
}
},
"variables": {
"vnetId": "[parameters('virtualNetworkId')]",
"subnetRef": "[concat(variables('vnetId'), '/subnets/', parameters('subnetName'))]",
"nicName": "[concat(parameters('virtualMachineName'), substring(uniqueString(resourceGroup().id),0,4))]"
},
"resources": [
{
"name": "[variables('nicName')]",
"type": "Microsoft.Network/networkInterfaces",
"apiVersion": "2019-07-01",
"location": "[parameters('location')]",
"dependsOn": [],
"properties": {
"ipConfigurations": [
{
"name": "ipconfig1",
"properties": {
"subnet": {
"id": "[variables('subnetRef')]"
},
"privateIPAllocationMethod": "Dynamic"
}
}
]
},
"tags": {
}
},
{
"name": "[concat(parameters('virtualMachineName'),'_DataDisk_0')]",
"type": "Microsoft.Compute/disks",
"apiVersion": "2019-07-01",
"location": "[parameters('location')]",
"properties": "[parameters('dataDiskResources')[copyIndex()].properties]",
"sku": {
"name": "[parameters('dataDiskResources')[copyIndex()].sku]"
},
"copy": {
"name": "managedDiskResources",
"count": "[length(parameters('dataDiskResources'))]"
},
"tags": {
}
},
{
"name": "[parameters('virtualMachineName')]",
"type": "Microsoft.Compute/virtualMachines",
"apiVersion": "2019-07-01",
"location": "[parameters('location')]",
"dependsOn": [
"managedDiskResources",
"[concat('Microsoft.Network/networkInterfaces/', variables('nicName'))]",
"[concat('Microsoft.Storage/storageAccounts/', parameters('diagnosticsStorageAccountName'))]"
],
"properties": {
"hardwareProfile": {
"vmSize": "[parameters('virtualMachineSize')]"
},
"storageProfile": {
"osDisk": {
"createOption": "fromImage",
"managedDisk": {
"storageAccountType": "[parameters('osDiskType')]"
}
},
"imageReference": {
"publisher": "MicrosoftVisualStudio",
"offer": "VisualStudio",
"sku": "VS-2017-Ent-Latest-Win10-N",
"version": "latest"
},
"copy": [
{
"name": "dataDisks",
"count": "[length(parameters('dataDisks'))]",
"input": {
"lun": "[parameters('dataDisks')[copyIndex('dataDisks')].lun]",
"createOption": "[parameters('dataDisks')[copyIndex('dataDisks')].createOption]",
"caching": "[parameters('dataDisks')[copyIndex('dataDisks')].caching]",
"writeAcceleratorEnabled": "[parameters('dataDisks')[copyIndex('dataDisks')].writeAcceleratorEnabled]",
"diskSizeGB": "[parameters('dataDisks')[copyIndex('dataDisks')].diskSizeGB]",
"managedDisk": {
"id": "[coalesce(parameters('dataDisks')[copyIndex('dataDisks')].id, if(equals(parameters('dataDisks')[copyIndex('dataDisks')].name, json('null')), json('null'), resourceId('Microsoft.Compute/disks', parameters('dataDisks')[copyIndex('dataDisks')].name)))]",
"storageAccountType": "[parameters('dataDisks')[copyIndex('dataDisks')].storageAccountType]"
}
}
}
]
},
"networkProfile": {
"networkInterfaces": [
{
"id": "[resourceId('Microsoft.Network/networkInterfaces', variables('nicName'))]"
}
]
},
"osProfile": {
"computerName": "[parameters('virtualMachineName')]",
"adminUsername": "[parameters('adminUsername')]",
"adminPassword": "[parameters('adminPassword')]",
"windowsConfiguration": {
"enableAutomaticUpdates": true,
"provisionVmAgent": true
}
},
"licenseType": "Windows_Server",
"diagnosticsProfile": {
"bootDiagnostics": {
"enabled": true,
"storageUri": "[concat('https://', parameters('diagnosticsStorageAccountName'), '.blob.core.windows.net/')]"
}
}
},
"tags": {
}
},
{
"name": "[parameters('diagnosticsStorageAccountName')]",
"type": "Microsoft.Storage/storageAccounts",
"apiVersion": "2019-06-01",
"location": "[parameters('location')]",
"properties": {},
"kind": "[parameters('diagnosticsStorageAccountKind')]",
"sku": {
"name": "[parameters('diagnosticsStorageAccountType')]"
},
"tags": {
}
}
],
"outputs": {
"adminUsername": {
"type": "string",
"value": "[parameters('adminUsername')]"
}
}
}
And this is the parameters.json
{
"location": {
"value": "location"
},
"subnetName": {
"value": "subnetname"
},
"virtualNetworkId": {
"value": "networkid"
},
"virtualMachineRG": {
"value": "vmRG"
},
"osDiskType": {
"value": "Standard_LRS"
},
"dataDisks": {
"value": [
{
"lun": 0,
"createOption": "attach",
"caching": "None",
"writeAcceleratorEnabled": false,
"id": null,
"storageAccountType": null,
"name": null,
"diskSizeGB": null,
"diskEncryptionSet": {
"id": null
}
}
]
},
"dataDiskResources": {
"value": [
{
"sku": "Standard_LRS",
"properties": {
"diskSizeGB": 128,
"creationData": {
"createOption": "empty"
}
}
}
]
},
"virtualMachineSize": {
"value": "Standard_B4ms"
},
"adminUsername": {
"value": "admin"
},
"diagnosticsStorageAccountName": {
"value": "rg01diag"
},
"diagnosticsStorageAccountId": {
"value": "Microsoft.Storage/storageAccounts/rg01diag"
},
"diagnosticsStorageAccountType": {
"value": "Standard_LRS"
},
"diagnosticsStorageAccountKind": {
"value": "Storage"
} }
I also can't find any documentation for this kind of template. All the quickstart templates I find have a simpler version of this. For example, they state all the disk properties inside the same template file, the parameters and properties are fewer, and there isn't a dataDiskResources object anywhere.
I want to understand how I would need to modify this disk structure to add dynamic naming that names the disks, for example, as the Azure portal does (VMName_DataDisk_LunNumber).
You have to specify different input when you create a data disk versus when you attach it. But you don't have to create the disks as separate resources; you can just tell the VM to create them. This would be one way of doing that:
"dataDisks": [
{
"diskSizeGB": "[parameters('sizeOfEachDataDiskInGB')]",
"lun": 0,
"createOption": "Empty"
},
{
"diskSizeGB": "[parameters('sizeOfEachDataDiskInGB')]",
"lun": 1,
"createOption": "Empty"
},
{
"diskSizeGB": "[parameters('sizeOfEachDataDiskInGB')]",
"lun": 2,
"createOption": "Empty"
},
{
"diskSizeGB": "[parameters('sizeOfEachDataDiskInGB')]",
"lun": 3,
"createOption": "Empty"
}
],
And you don't have to have a separate disk resource; these would be created automatically. You can also add a property called name to specify a name for each disk, as in the sketch after the link below.
https://github.com/Azure/azure-quickstart-templates/blob/master/101-vm-multiple-data-disk/azuredeploy.json
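As a minimal sketch (assuming the same sizeOfEachDataDiskInGB parameter as in that quickstart), the inline dataDisks array can instead be generated with a property copy loop, so both the LUN and the disk name are derived from the copy index; this reproduces the portal-style VMName_DataDisk_LunNumber naming:
"storageProfile": {
"copy": [
{
"name": "dataDisks",
"count": 4,
"input": {
"name": "[concat(parameters('virtualMachineName'), '_DataDisk_', copyIndex('dataDisks'))]",
"lun": "[copyIndex('dataDisks')]",
"createOption": "Empty",
"diskSizeGB": "[parameters('sizeOfEachDataDiskInGB')]"
}
}
]
}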

Azure Data Factory V2 Copy Activity with Rest API giving one row for nested JSON

I am trying to flatten a nested JSON returned from a REST source. The pipeline code is as follows.
The problem here is that this pipeline returns only the first object from the JSON dataset and skips all the rest of the rows.
Can you please guide me on how to iterate over the nested objects?
Thanks
Sameet
{
"name": "STG_NCR2",
"properties": {
"activities": [
{
"name": "Copy data1",
"type": "Copy",
"dependsOn": [],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"source": {
"type": "RestSource",
"httpRequestTimeout": "00:01:40",
"requestInterval": "00.00:00:00.010",
"requestMethod": "GET",
"additionalHeaders": {
"OData-MaxVersion": "4.0",
"OData-Version": "4.0",
"Prefer": "odata.include-annotations=*"
}
},
"sink": {
"type": "AzureSqlSink"
},
"enableStaging": false,
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"path": "$['value'][0]['tco_ncrid']"
},
"sink": {
"name": "NCRID"
}
},
{
"source": {
"path": "['tco_name']"
},
"sink": {
"name": "EquipmentSerialNumber"
}
}
],
"collectionReference": "$['value'][0]['tco_ncr_tco_equipment']"
}
},
"inputs": [
{
"referenceName": "Rest_PowerApps_NCR",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "Prestaging_PowerApps_NCREquipments",
"type": "DatasetReference"
}
]
}
],
"annotations": []
}
}
The JSON is in the following format
[
{
"value":[
{
"tco_ncrid":"abc-123",
"tco_ncr_tco_equipment":[
{
"tco_name":"abc"
}
]
},
{
"tco_ncrid":"abc-456",
"tco_ncr_tco_equipment":[
{
"tco_name":"xyz"
},
{
"tco_name":"yzx"
}
]
}
]
}
]
This can be resolved by amending the translator property as follows.
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": {
"path": "$.['value'][0].['tco_ncrid']"
},
"sink": {
"name": "NCRID",
"type": "String"
}
},
{
"source": {
"path": "$.['value'][0].['tco_text_id']"
},
"sink": {
"name": "EquipmentDescription",
"type": "String"
}
},
{
"source": {
"path": "['tco_name']"
},
"sink": {
"name": "EquipmentSerialNumber",
"type": "String"
}
}
],
"collectionReference": "$.['value'][*].['tco_ncr_tco_equipment']"
}
This code forces the pipeline to iterate over the nested array, but as you can see, the NCRID is hardcoded to the first element of the value array. This is not exactly what I want, as I am looking for all equipment serial numbers against every NCRID. Still researching...
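To illustrate the behaviour, against the sample JSON above this mapping would be expected to emit one row per equipment entry, but with every row carrying the NCRID of the first value element (illustration only):
NCRID     EquipmentSerialNumber
abc-123   abc
abc-123   xyz
abc-123   yzx
The second NCRID, abc-456, never appears, because of the hardcoded [0] index in the NCRID path.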

Azure Data Factory Copy Activity

I have been working on this for a couple of days and cannot get past this error. I have 2 activities in this pipeline. The first activity copies data from an ODBC connection to an Azure database, which is successful. The 2nd activity transfers the data from one Azure table to another Azure table and keeps failing.
The error message is:
Copy activity met invalid parameters: 'UnknownParameterName', Detailed message: An item with the same key has already been added..
I do not see any invalid parameters or unknown parameter names. I have rewritten this multiple times, both using the add-activity code template and by hand, but I do not receive any errors when deploying or while it is running. Below is the JSON pipeline code.
Only the 2nd activity is receiving the error.
Thanks.
Source Data set
{
"name": "AnalyticsDB-SHIPUPS_06shp-01src_AZ-915PM",
"properties": {
"structure": [
{
"name": "UPSD_BOL",
"type": "String"
},
{
"name": "UPSD_ORDN",
"type": "String"
}
],
"published": false,
"type": "AzureSqlTable",
"linkedServiceName": "Source-SQLAzure",
"typeProperties": {},
"availability": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"external": true,
"policy": {}
}
}
Destination Data set
{
"name": "AnalyticsDB-SHIPUPS_06shp-02dst_AZ-915PM",
"properties": {
"structure": [
{
"name": "SHIP_SYS_TRACK_NUM",
"type": "String"
},
{
"name": "SHIP_TRACK_NUM",
"type": "String"
}
],
"published": false,
"type": "AzureSqlTable",
"linkedServiceName": "Destination-Azure-AnalyticsDB",
"typeProperties": {
"tableName": "[olcm].[SHIP_Tracking]"
},
"availability": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"external": false,
"policy": {}
}
}
Pipeline
{
"name": "SHIPUPS_FC_COPY-915PM",
"properties": {
"description": "copy shipments ",
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "RelationalSource",
"query": "$$Text.Format('SELECT COMPANY, UPSD_ORDN, UPSD_BOL FROM \"orupsd - UPS interface Dtl\" WHERE COMPANY = \\'01\\'', WindowStart, WindowEnd)"
},
"sink": {
"type": "SqlSink",
"sqlWriterCleanupScript": "$$Text.Format('delete imp_fc.SHIP_UPS_IntDtl_Tracking', WindowStart, WindowEnd)",
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
},
"translator": {
"type": "TabularTranslator",
"columnMappings": "COMPANY:COMPANY, UPSD_ORDN:UPSD_ORDN, UPSD_BOL:UPSD_BOL"
}
},
"inputs": [
{
"name": "AnalyticsDB-SHIPUPS_03shp-01src_FC-915PM"
}
],
"outputs": [
{
"name": "AnalyticsDB-SHIPUPS_03shp-02dst_AZ-915PM"
}
],
"policy": {
"timeout": "1.00:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst",
"style": "StartOfInterval",
"retry": 3,
"longRetry": 0,
"longRetryInterval": "00:00:00"
},
"scheduler": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"name": "915PM-SHIPUPS-fc-copy->[imp_fc]_[SHIP_UPS_IntDtl_Tracking]"
},
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": "$$Text.Format('select distinct ups.UPSD_BOL, ups.UPSD_BOL from imp_fc.SHIP_UPS_IntDtl_Tracking ups LEFT JOIN olcm.SHIP_Tracking st ON ups.UPSD_BOL = st.SHIP_SYS_TRACK_NUM WHERE st.SHIP_SYS_TRACK_NUM IS NULL', WindowStart, WindowEnd)"
},
"sink": {
"type": "SqlSink",
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
},
"translator": {
"type": "TabularTranslator",
"columnMappings": "UPSD_BOL:SHIP_SYS_TRACK_NUM, UPSD_BOL:SHIP_TRACK_NUM"
}
},
"inputs": [
{
"name": "AnalyticsDB-SHIPUPS_06shp-01src_AZ-915PM"
}
],
"outputs": [
{
"name": "AnalyticsDB-SHIPUPS_06shp-02dst_AZ-915PM"
}
],
"policy": {
"timeout": "1.00:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst",
"style": "StartOfInterval",
"retry": 3,
"longRetryInterval": "00:00:00"
},
"scheduler": {
"frequency": "Day",
"interval": 1,
"offset": "04:15:00"
},
"name": "915PM-SHIPUPS-AZ-update->[olcm]_[SHIP_Tracking]"
}
],
"start": "2017-08-22T03:00:00Z",
"end": "2099-12-31T08:00:00Z",
"isPaused": false,
"hubName": "adf-tm-prod-01_hub",
"pipelineMode": "Scheduled"
}
}
Have you seen this link?
They get the same error message and suggest using AzureTableSink instead of SqlSink:
"sink": {
"type": "AzureTableSink",
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
}
It would make sense for you too, since your 2nd copy activity is Azure to Azure.
It could be a red herring, but I'm pretty sure "tableName" is a required entry in the typeProperties of an AzureSqlTable dataset. Yours is missing this for the input dataset. I appreciate you have a join in the sqlReaderQuery, so it's probably best to put a dummy (but real) table name in there.
By the way, it's not clear why you are using $$Text.Format and WindowStart/WindowEnd in your queries if you're not transposing these values into the query; you could just put the query between double quotes, as sketched below.
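For example, a sketch of the second activity's source with the same query, just without the formatting wrapper (assuming the time-slice values really are unused):
"source": {
"type": "SqlSource",
"sqlReaderQuery": "select distinct ups.UPSD_BOL, ups.UPSD_BOL from imp_fc.SHIP_UPS_IntDtl_Tracking ups LEFT JOIN olcm.SHIP_Tracking st ON ups.UPSD_BOL = st.SHIP_SYS_TRACK_NUM WHERE st.SHIP_SYS_TRACK_NUM IS NULL"
}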