Azure Data Factory - Process SSIS Output

I'm working to lift an SSIS package into Azure Data Factory V2, and I have successfully set up an IR and executed the package.
Now I'm attempting to work with the results in ADF. This package was originally designed to return a recordset to the calling client. Now that I'm in ADF, I'd like to take the recordset produced by the package and copy it to table storage. However, I see no way to access this recordset from within the ADF pipeline.
Is it possible to access and process this recordset from the host ADF pipeline, or will the package itself have to be modified to no longer return a recordset and perform the copy instead?

In SSIS, write the result set to a text file and copy that file to a folder in blob storage, or even to an on-premises folder.
If you run SSIS on premises, store the file in a local folder and use the AzCopy tool to move it to an Azure blob:
https://blogs.technet.microsoft.com/canitpro/2015/12/28/step-by-step-using-azcopy-to-transfer-files-to-azure/
Otherwise, if you run SSIS on Azure as you mentioned, copy your rowset output to a flat file using the Flat File connection manager, then add another data flow task that uploads the file to Azure blob storage:
https://www.powerobjects.com/blog/2018/11/20/uploading-azure-blob-ssis/
Your ADF pipeline can then use that blob as the source in a Copy activity and write it to table storage as the sink.
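If the Feature Pack data flow described in that post is not an option, another route is to do the upload from a Script Task with the Azure.Storage.Blobs SDK. The following is only a minimal sketch of the Script Task's Main() method; the package variable names, the container name, and the assumption that the SDK can be referenced from the Script Task project are illustrative, not part of the original answer.

    // Sketch of Main() inside the Script Task's ScriptMain class.
    // Assumes: using Azure.Storage.Blobs; at the top of ScriptMain.cs, a package variable
    // User::OutputFilePath pointing at the flat file the package wrote, and
    // User::StorageConnectionString holding the storage account connection string.
    public void Main()
    {
        string connectionString = Dts.Variables["User::StorageConnectionString"].Value.ToString();
        string localFile = Dts.Variables["User::OutputFilePath"].Value.ToString();

        // Upload the package's output file so the ADF Copy activity can read it as a source.
        var container = new BlobContainerClient(connectionString, "ssis-output");
        container.CreateIfNotExists();
        container.GetBlobClient(System.IO.Path.GetFileName(localFile))
                 .Upload(localFile, overwrite: true);

        Dts.TaskResult = (int)ScriptResults.Success;
    }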
Let me know if you need more details on the implementation.

Related

How to set the path of a CSV file that is in an Azure storage account in an Azure Data Factory pipeline

I have created an SSIS package that reads from a CSV file (using the Flat File connection manager) and loads the records into a database. I have deployed it to an Azure Data Factory pipeline, and I need to pass the path of the CSV file as a parameter. I have created an Azure storage account and uploaded the source file there as shown below.
Can I just give the URL of the source file for the import file in the SSIS package settings as shown below? I tried it, but it is currently throwing a 2906 error. I am new to Azure - appreciate any help here.
First, you said Excel and then you said CSV. Those are two different formats. But since you mention the flat file connection manager, I'm going to assume you meant CSV. If not, let me know and I'll update my answer.
I think you will need to install the SSIS Feature Pack for Azure and use the Azure Storage Connection Manager. You can then use the Azure Blob Source in your data flow task (it supports CSV files). When you add the blob source, the GUI should help you create the new connection manager. There is a tutorial on MS SQL Tips that shows each step. It's a couple years old, but I don't think much has changed.
As a side thought, is there a reason you chose to use SSIS over native ADF V2? It does a nice job of copying data from blob storage to a database.

Azure Data Factory v2 Data Transformation

I am new to Azure Data Factory. I have a requirement to move data from an on-premises Oracle database and an on-premises SQL Server to blob storage. The data needs to be transformed into JSON format, with each row becoming one JSON file, and then moved to an Event Hub. How can I achieve this? Any suggestions?
You could use a Lookup activity plus a ForEach activity, with a Copy activity inside the ForEach. Please see this post: How to copy CosmosDb docs to Blob storage (each doc in single json file) with Azure Data Factory
The Copy Data tool that is part of Azure Data Factory is one option for copying on-premises data to Azure.
The tool comes with a configuration wizard that walks you through the required steps, such as configuring the source, the sink, the integration runtime, and the pipeline.
In the source you need to write a custom query that fetches the data from the required tables in JSON format.
For SQL Server you would use FOR JSON AUTO to convert the rows to JSON (OPENJSON works in the opposite direction, turning JSON into rows); this is supported from SQL Server 2016 onwards. For older versions you need to explore the options available. Worst case, you can write a simple console app in C# or Java to fetch the rows, convert them to JSON files, and then upload the files to Azure blob storage (see the sketch below). If this is a one-time activity, that option should work and you may not require a data factory at all.
For Oracle you can use the JSON_OBJECT function.
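A minimal sketch of that worst-case console app might look like the following. All names (the table dbo.Orders, its OrderId key, the connection string, and the staging folder) are illustrative assumptions, and it uses the System.Data.SqlClient and Newtonsoft.Json packages; the resulting files can then be uploaded to blob storage with AzCopy or the storage SDK.

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.IO;
    using Newtonsoft.Json;

    class RowsToJsonFiles
    {
        static void Main()
        {
            const string connectionString = "Server=myServer;Database=myDb;Integrated Security=true;";
            const string outputFolder = @"C:\staging\json";
            Directory.CreateDirectory(outputFolder);

            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand("SELECT * FROM dbo.Orders", connection))
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // Turn the current row into a column-name/value dictionary.
                        var row = new Dictionary<string, object>();
                        for (int i = 0; i < reader.FieldCount; i++)
                            row[reader.GetName(i)] = reader.IsDBNull(i) ? null : reader.GetValue(i);

                        // One JSON file per row, named after the key column, e.g. 123.json.
                        string json = JsonConvert.SerializeObject(row, Formatting.Indented);
                        File.WriteAllText(Path.Combine(outputFolder, row["OrderId"] + ".json"), json);
                    }
                }
            }
        }
    }

If the source is SQL Server 2016 or later, the JSON generation can instead be pushed into the query with FOR JSON, as mentioned above.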

select library in SAS from SSIS

I am using SSIS to extract some data out of an SAS server.
using this connection setup (SAS IOM Data Provider 9.3)
I can get the connection to read the default Library/Shared data folder.
What do I need to change/set to get it to read a different library?
These are the properties of the libraries:
The one on the left is the one I can read, the one on the right is the one I am trying to access.
If your data folder contains *.sas7bdat files then you could use this:
http://microsoft-ssis.blogspot.com/2016/09/using-sas-as-source-in-ssis.html
Simply write your SAS libname statement inside the SAS Workspace Init Script box, e.g. as follows:
libname YOURLIB "/your/path/to/sas/datasets" access=readonly;
More info: http://support.sas.com/kb/33/037.html

SSIS copy files task

I have a list of file paths in a table. I need to read these file paths and copy the files to a drive that is present on another server, without changing the directory structure. Is this possible?
You could implement this in SSIS in the following way:
1. Create a Data Flow Task. In the Data Flow Task, add a data source component to fetch the file names from the table, add a Recordset Destination component, and connect the two.
2. Create a Foreach Loop container and connect it to the Data Flow Task created in step 1. Change the enumerator type to the Foreach ADO enumerator.
3. Inside the Foreach Loop, add a Script Task that takes the file name and uses the System.IO.File.Copy method to copy the file to its destination (see the sketch below).
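A minimal sketch of the Script Task body for step 3 is shown below. The variable names and the idea of stripping the drive root to rebuild the directory structure under the destination share are illustrative assumptions, not part of the original answer.

    // Sketch of Main() inside the Script Task's ScriptMain class.
    // Assumes the Foreach Loop maps the current file path into User::SourceFilePath and that
    // User::DestinationRoot holds the target share, e.g. \\otherserver\share.
    public void Main()
    {
        string sourcePath = Dts.Variables["User::SourceFilePath"].Value.ToString();
        string destinationRoot = Dts.Variables["User::DestinationRoot"].Value.ToString();

        // Keep the original directory structure: drop the drive root and re-root the rest
        // under the destination, e.g. C:\Data\2018\file.txt -> \\otherserver\share\Data\2018\file.txt.
        string relativePath = sourcePath.Substring(System.IO.Path.GetPathRoot(sourcePath).Length);
        string destinationPath = System.IO.Path.Combine(destinationRoot, relativePath);

        System.IO.Directory.CreateDirectory(System.IO.Path.GetDirectoryName(destinationPath));
        System.IO.File.Copy(sourcePath, destinationPath, true);

        Dts.TaskResult = (int)ScriptResults.Success;
    }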

Can we have the mapping for OleDB Source to Excel Destination at runtime rather than at design time?

I am trying to create an SSIS package that will export CSV data to an Excel file. This package will be called from C#, with the CSV file given as input to the package. I am quite new to SSIS, and I've been able to produce the expected result when the headers stay the same.
I went with the following approach:
1. Script Task - create the temp table, bulk insert, and Excel table scripts from the CSV headers
2. Execute SQL Task - create a temp table in the database
3. Execute SQL Task - bulk insert the CSV data into the table
4. Execute SQL Task - create the Excel file
5. Data Flow Task - OLE DB Source to Excel Destination
6. Execute SQL Task - drop the temp table created
The challenge I am facing is that my CSV may have different headers (both the text and the number of headers may differ), and I want a single package to serve this purpose.
With the headers being different, the mapping between the OLE DB Source and the Excel Destination in step 5 above does not work for dynamic headers and gives unexpected results in the Excel output. Is there any way these mappings can be decided at runtime rather than at design time?
I don't believe that you can specify the columns or column mappings of a Data Flow at SSIS runtime. You could build the SSIS package on-the-fly, which would allow your C# code to create the column mappings, but the mappings have to be created before the package can run. See Building Packages Programmatically in MSDN for details.
On the other hand, if all you're trying to do is convert a CSV file into an Excel spreadsheet, it would seem logical to me to use the Workbook.SaveAs method of the Office object model.
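As a small illustration of that Workbook.SaveAs route, a console sketch is below. It assumes Excel is installed on the machine running the code and that a reference to Microsoft.Office.Interop.Excel has been added; the file paths are placeholders.

    using Excel = Microsoft.Office.Interop.Excel;

    class CsvToExcel
    {
        static void Main()
        {
            var excel = new Excel.Application { Visible = false, DisplayAlerts = false };
            try
            {
                // Excel parses the CSV columns on open, so no design-time column mapping is needed.
                Excel.Workbook workbook = excel.Workbooks.Open(@"C:\data\input.csv");
                workbook.SaveAs(@"C:\data\output.xlsx", Excel.XlFileFormat.xlOpenXMLWorkbook);
                workbook.Close(false);
            }
            finally
            {
                excel.Quit();
            }
        }
    }

Keep in mind that this requires Excel on the machine that runs it, which is fine for a client application but is not something Microsoft recommends for unattended server-side automation.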