Dynamically refer to Json value in Data Factory copy - json

I have ADF CopyRestToADLS activity which correctly saves json complex object to Data Lake storage. But I additionally need to pass one of the json values (myextravalue) to a stored procedure. I tried referencing it in the stored procedure parameter as #{activity('CopyRESTtoADLS').output.myextravalue but I am getting error
The actions CopyRestToADLS refernced by 'inputs' in the action ExectuteStored procedure1 are not defined in the template
{
"items": [1000 items],
"count": 1000,
"myextravalue": 15983444
}
I would like to try to dynamically reference this value because the CopyRestToADLS source REST dataset dynamically calls different REST endpoints so the structure of JSON object is different each time. But the myextravalue is always present in each JSON call.
How is it possible to refernce myextravalue and use it as a parameter?
Rich750

You could create another lookup active on REST data source to get the json value. Then pass it to the Stored Procedure active.
Yes, it will create a new REST request, and it seams to be an easy way to achieve your purpose. Lookup active to get the content of the source and won't save it.
The another solution may be get the value from the copy active output file, after the copy active completed.
I'm glad you solved it by this way:
"I created a Data Flow to read from the folder where Copy Activity saves dynamically named output json filenames. After importing schema from sample file, I selected the myextravalue as the only mapping in the Sink Mapping section."

Related

Advanced mapping of JSON in Azure Data Factory - some guidance requested

I'm trying to map a JSON document (sensor data) into a more meaningful representation using Mapping Dataflows. However, hard time getting this to work and would really appreciate some insight/recommendations on how to solve the following:
The input is
What I would like to end up with is the following:
Any pointers as to how this can be implemented are more than welcome.
This can be accomplished using the Copy activity and then split function in Derived Column transformation in Azure Data Factory.
Use the copy activity to read the JSON file as source and in sink, use SQL database to store the data as table. In Mapping tab, Import the schema and map the JSON records to the corresponding column names. Refer this third-part tutorial for guidance - https://sqlkover.com/dynamically-map-json-to-sql-in-azure-data-factory/
Finally, use the Data Flow activity and choose the SQL table as source now which you have used as sink above.
Select the Derived Column transformation.
Use split function.
Add the column which will take the split values which you want to split as shown below.
Use split(<column_name_to_split>, '_') function to split the column on with _ delimiter. Change <column_name_to_split> to the name of column you cant to split. Refer image below.
Preview the data to check the result.

Map nested JSON (Mongo ATLAS) to SQL [Azure Data Factory]

I want to map nested json to sql table (Microsoft SSMS)
Source is a Dataset of MongoAtlas &
Sink is a Dataset of Azure SQL Database Managed Instance
I am able to map parentArray using collection reference.
but not able to select child under it.
also childArrays are kind of scalar arrays (they don't have any keys)
Note : I tried the option Map complex values to string
but it is putting the values in column cell like ["ABC", PQR] which I dont want
is there any way to map it ?
Expected output for Table : childarray2
Currently in ADF, Copy Activity supports mapping of arrays for only 1 level.
There is not way to map nested arrays.
For this I had to use Data flows.
Limitation was, we cannot use mongoDB/mongo Atlas as a input source in Data flow, so the workaround was
Convert Mongo To Azure Blob JSON (Copy Activity Task)
Use Azure Blob JSON files as a input source and then SQL tables as sink
Note: You can select this option to delete you blob files, to save storage space

How can I reference a JSON source for a derived column action in Azure Data Factory

I'm new to Azure Data Factory. I've been able to generate a set of JSON files from a REST API source using a Pipeline. Each file consists of one top level JSON object with an array of up to 100 child objects. The output is saved to an Azure Blob Storage container.
I now want to use a Mapping Data Flow to modify the JSON before I write it to Azure SQL, however I'm struggling with the syntax. I've configured the source to point to the directory containing the JSON files. The Source Projection tab displays the correct schema. I can preview the data and I see a row for each file and I can expand the child objects to see the full structure.
However, when I add a Derived Column action, the Input Schema is blank in the Expression Builder. I can refer to the top level elements in the source using the byName and byPosition functions, but I don't know how I can reference the child elements.
The examples that I have been able to find online use a SQL table or CSV file as a source. I can't find any examples that use hierarchical data as the source for a derived column.
Am I missing something? Is this scenario supported?
I found a way to achieve what I want. This may not be the best approach, but it works.
It seems that it is difficult to deal with JSON that has multiple hierarchies as a source for copy data activities. You can choose one level of repeating data to map to a table structure (the Collection Reference property on the Mapping tab).
In my scenario, there was additional repeating data within the data I was mapping to my table. I updated the mapping to write the child JSON data to a text field in my SQL table. To do this, I needed to use the Azure Data Factory JSON editor for my pipeline. You can access this from the "Code" link in the top right corner of the pipeline visual editor.
I added the following line after the closing bracket for the "mappings" array for my copy activity:
"mapComplexValuesToString": true
The full path to the mapping array in the activity definition is typeProperties - translator - mappings. Make sure your commas are correct after you add the new element.
With this approach, I had a row in my SQL table for each array item in my Collection Reference. The scalar child elements in the array items are mapped to table columns and the child JSON element is written to a data column in the same table.
To extract the values I need within the child JSON, I created a SQL view that uses the CROSS APPLY OPENJSON syntax. This allows me to treat the JSON in the data field similar to a related table. You can specify the structure that your JSON is in. If you have nested data in your JSON, you can apply the same approach for each level.
The OPENJSON command is only supported by more recent versions of SQL Server. I'm using Azure SQL, so that works for me.

Data Factory v2 - Generate a json file per row

I'm using Data Factory v2. I have a copy activity that has an Azure SQL dataset as input and a Azure Storage Blob as output. I want to write each row in my SQL dataset as a separate blob, but I don't see how I can do this.
I see a copyBehavior in the copy activity, but that only works from a file based source.
Another possible setting is the filePattern in my dataset:
Indicate the pattern of data stored in each JSON file. Allowed values
are: setOfObjects and arrayOfObjects.
setOfObjects - Each file contains single object, or line-delimited/concatenated multiple objects. When this option is chosen in an output dataset, copy activity produces a single JSON file with each object per line (line-delimited).
arrayOfObjects - Each file contains an array of objects.
The description talks about "each file" so initially I thought it would be possible, but now I've tested them it seems that setOfObjects creates a line separated file, where each row is written to a new line. The setOfObjects setting creates a file with a json array and adds each line as a new element of the array.
I'm wondering if I'm missing a configuration somewhere, or is it just not possible?
What I did for now is to load the rows in to a SQL table and run a foreach for each record in the table. The I use a Lookup activity to have an array to loop in a Foreach activity. The foreach activity writes each row to a blob store.
For Olga's documentDb question, it would look like this:
In the lookup, you get a list of the documentid's you want to copy:
You use that set in your foreach activity
Then you copy the files using a copy activity within the foreach activity. You query a single document in your source:
And you can use the id to dynamically name your file in the sink. (you'll have to define the param in your dataset too):

ServiceNow - JSON Web Service, display related tables

I'm working on a C# program that retrieves data from a ServiceNow database and converts that data into C# .NET objects. I'm using the JSON Web Service to return my data in JSON format.
What I want to achieve is as follows: If there is a relational mapping between a value (for
example: I have a table called Company, where CEO is not a TEXT field but an sys_id to a Employee Table) I want to be able to output that data not with an sys_id (or just displaying the name property by using the 'displayvariable' parameter) but by an object displayed in JSON.
This means that the value of a property should be an object in JSON instead of just a single value.
A few examples:
// I don't want the JSON like this
{"Company":{"CEO":"b181e841c9212c008aeb36850331fab2"}}
// Or by displaying the name of the sys_id table
{"Company":{"CEO":"James Henderson" }}
// I want the data as follows, so I can have all the data I need inside a single JSON record.
{"Company":{"CEO":{"name":"James Henderson", "age":34, "sex":"male", "office":"SBN Left Floor 23"}}}
From reading the documentation I couldn't find anything in the JSON Web Service that allowed me to display the information like this nor
find any other alternative. It should have something to do with joining the tables and displaying it all in the right format.
I have been using SNC for almost three years and have not found you can automatically join tables in a web service. Your best option would be to use a scripted web service which possibly takes a query parameter and table parameter. Then you can json serialized your result as you see fit.
Or, another option would be to generate a new processor that will traverse the GlideRecord object. The ?JSON parameter you pass in to the URL is merely a flag to pass your request to a particular processor. Unfortunately the OOB one I believe is a Java class not a JS script, so you would need to write a script much like I mentioned earlier to traverse the object path serializing the object graph as far down as your want to go.