Parallel execution in SSIS

I have developed one master package (Main.dtsx) and 3 child packages (Processor.dtsx). Note: the code is the same in all child packages; each one picks up files from a source location and processes them. To optimize performance, I want all 3 child packages to run simultaneously over 10,000 files, so that the first child picks up the 1st file and starts executing while the second picks up the 2nd file, and so on. Please share the code if you have it. I tried the 'MaxConcurrentExecutables' option, but in that case all components access the same file, which is not what I want.

This cannot be done with a Foreach Loop, but you can accomplish the task with a Script Task:
Add 3 string variables to hold the file names (i.e. File1, File2, File3).
Pass the variables from the master package to each child package.
In each child package, configure an expression on the file connection manager so that it uses the passed-in value as its connection string.
At the end of each package, make sure that the file is moved out of the source folder or renamed in such a way that it will be ignored in subsequent loops.
Set up a For Loop that ends when all the files have been processed. You can add a Boolean variable to the package, such as "ProcessingIsAllDone", and set it in the Script Task.
At the top of the For Loop, add a Script Task and connect it to the Execute Package Tasks with precedence constraints.
Use the script below to set the variables:
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;
namespace ST_e4ccd9cfaa4847ff86ec88c215c1961c
{
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            // Source folder that the child packages read from
            DirectoryInfo sourceDirectory = new DirectoryInfo(@"c:\temp");
            // Read the file list once so the loop and the "all done" check see the same snapshot
            FileInfo[] sourceFiles = sourceDirectory.GetFiles("*.txt");
            int loops = 3;
            // Hand the first three files to File3, File2 and File1, in that order
            foreach (FileInfo sourceFile in sourceFiles)
            {
                if (loops == 0)
                {
                    break;
                }
                string variableName = String.Format("File{0}", loops);
                Dts.Variables[variableName].Value = sourceFile.FullName;
                loops--;
            }
            // Three or fewer files left means this pass is the last one
            if (sourceFiles.Length <= 3)
            {
                Dts.Variables["ProcessingIsAllDone"].Value = true;
            }
            Dts.TaskResult = (int)ScriptResults.Success;
        }
        #region ScriptResults declaration
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion
    }
}
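One gap in the script above is that when fewer than three files remain, the unused File variables keep the names from the previous pass. Below is a minimal sketch of one way to handle that, kept free of SSIS types so it can be tried in isolation. `BatchPlanner`, `AssignBatch`, and `IsLastBatch` are hypothetical names, and using an empty string to mean "no file for this slot" is an assumption that each child package would have to check for.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of SSIS: models one pass of the For Loop.
public static class BatchPlanner
{
    // Returns slot name -> file path for up to slotCount pending files.
    // Slots without a file get an empty string (assumption: the child package skips empty names).
    public static Dictionary<string, string> AssignBatch(IList<string> pendingFiles, int slotCount)
    {
        var slots = new Dictionary<string, string>();
        for (int i = 0; i < slotCount; i++)
        {
            string slotName = String.Format("File{0}", i + 1);
            slots[slotName] = i < pendingFiles.Count ? pendingFiles[i] : String.Empty;
        }
        return slots;
    }

    // True when this batch consumes every pending file.
    public static bool IsLastBatch(IList<string> pendingFiles, int slotCount)
    {
        return pendingFiles.Count <= slotCount;
    }
}
```

In the Script Task, each returned pair would be written to Dts.Variables[slotName].Value, and the result of IsLastBatch to ProcessingIsAllDone.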

Related

WinSCP NuGet Package not working in SSIS Script Task [duplicate]

I'm trying to use the WinSCP.NET NuGet package to upload some files to an SFTP server through a Script Task in SSIS. Writing the code went fine, but when I attempt to build, the WinSCP.NET DLL does not seem to be picked up, which breaks all of the references.
I've tried adding the WinSCP path to my PATH variable (user). I've tried adding the local version of WinSCPnet.dll to the GAC. I've tried reinstalling the package through NuGet. I've even tried changing the framework version.
This is a problem I've had before with the WinSCP.NET DLL. Last time I ended up using a workaround: driving the WinSCP command line from C#. But I would like to use the DLL, as it's a much simpler implementation.
The code is basically the boilerplate from WinSCP, with some minor changes:
#region Namespaces
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using WinSCP;
#endregion
namespace ST_a1d3d6e0b5d54338bce6c79882c303c6
{
/// <summary>
/// ScriptMain is the entry point class of the script. Do not change the name, attributes,
/// or parent of this class.
/// </summary>
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
#region Help: Using Integration Services variables and parameters in a script
/* To use a variable in this script, first ensure that the variable has been added to
* either the list contained in the ReadOnlyVariables property or the list contained in
* the ReadWriteVariables property of this script task, according to whether or not your
* code needs to write to the variable. To add the variable, save this script, close this instance of
* Visual Studio, and update the ReadOnlyVariables and
* ReadWriteVariables properties in the Script Transformation Editor window.
* To use a parameter in this script, follow the same steps. Parameters are always read-only.
*
* Example of reading from a variable:
* DateTime startTime = (DateTime) Dts.Variables["System::StartTime"].Value;
*
* Example of writing to a variable:
* Dts.Variables["User::myStringVariable"].Value = "new value";
*
* Example of reading from a package parameter:
* int batchId = (int) Dts.Variables["$Package::batchId"].Value;
*
* Example of reading from a project parameter:
* int batchId = (int) Dts.Variables["$Project::batchId"].Value;
*
* Example of reading from a sensitive project parameter:
* int batchId = (int) Dts.Variables["$Project::batchId"].GetSensitiveValue();
* */
#endregion
#region Help: Firing Integration Services events from a script
/* This script task can fire events for logging purposes.
*
* Example of firing an error event:
* Dts.Events.FireError(18, "Process Values", "Bad value", "", 0);
*
* Example of firing an information event:
* Dts.Events.FireInformation(3, "Process Values", "Processing has started", "", 0, ref fireAgain)
*
* Example of firing a warning event:
* Dts.Events.FireWarning(14, "Process Values", "No values received for input", "", 0);
* */
#endregion
#region Help: Using Integration Services connection managers in a script
/* Some types of connection managers can be used in this script task. See the topic
* "Working with Connection Managers Programatically" for details.
*
* Example of using an ADO.Net connection manager:
* object rawConnection = Dts.Connections["Sales DB"].AcquireConnection(Dts.Transaction);
* SqlConnection myADONETConnection = (SqlConnection)rawConnection;
* //Use the connection in some code here, then release the connection
* Dts.Connections["Sales DB"].ReleaseConnection(rawConnection);
*
* Example of using a File connection manager
* object rawConnection = Dts.Connections["Prices.zip"].AcquireConnection(Dts.Transaction);
* string filePath = (string)rawConnection;
* //Use the connection in some code here, then release the connection
* Dts.Connections["Prices.zip"].ReleaseConnection(rawConnection);
* */
#endregion
/// <summary>
/// This method is called when this script task executes in the control flow.
/// Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.
/// To open Help, press F1.
/// </summary>
public void Main()
{
// TODO: Add your code here
// User::FileName,$Package::SFTP_HostName,$Package::SFTP_Password,$Package::SFTP_PortNumber,$Package::SFTP_UserName
SessionOptions sessionOptions = new SessionOptions
{
Protocol = Protocol.Sftp,
HostName = (string)Dts.Variables["$Package::SFTP_HostName"].Value,
UserName = (string)Dts.Variables["$Package::SFTP_UserName"].Value,
SshHostKeyFingerprint = (string)Dts.Variables["$Package::SFTP_Fingerprint"].Value,
Password = (string)Dts.Variables["$Package::SFTP_Password"].GetSensitiveValue(),
PortNumber = (int) Dts.Variables["$Package::SFTP_PortNumber"].Value,
};
try
{
using (Session session = new Session())
{
// As WinSCP .NET assembly has to be stored in GAC to be used with SSIS,
// you need to set path to WinSCP.exe explicitly,
// if using non-default location.
session.ExecutablePath = (string)Dts.Variables["$Package::WinSCP_Path"].Value;
// Connect
session.Open(sessionOptions);
// Upload files
TransferOptions transferOptions = new TransferOptions();
transferOptions.TransferMode = TransferMode.Binary;
TransferOperationResult transferOperationResult = session.PutFiles(
(string)Dts.Variables["User::FileName"].Value, (string) Dts.Variables["$Package::SFTP_RemoteFileName"].Value,
true, transferOptions);
// Throw on any error
transferOperationResult.Check();
// Print results
bool fireAgain = false;
foreach (TransferEventArgs transferEvent in transferOperationResult.Transfers)
{
Dts.Events.FireInformation(0, null,
string.Format("Upload of {0} succeeded", transferEvent.FileName),
null, 0, ref fireAgain);
}
}
}
catch (Exception e)
{
Dts.Events.FireError(0, null,
string.Format("Error when using WinSCP to upload files: {0}", e),
null, 0);
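Dts.TaskResult = (int)DTSExecResult.Failure;
// Return so that the Success assignment after the catch block does not overwrite this failure
return;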
}
Dts.TaskResult = (int)ScriptResults.Success;
}
#region ScriptResults declaration
/// <summary>
/// This enum provides a convenient shorthand within the scope of this class for setting the
/// result of the script.
///
/// This code was generated automatically.
/// </summary>
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion
}
}
This should compile as is and allow me to run the SSIS package to upload the file. Instead, the references break and I receive a lot of missing-reference errors:
Error CS0246: The type or namespace name 'WinSCP' could not be found (are you missing a using directive or an assembly reference?)
Error: This project references NuGet package(s) that are missing on this computer. Use NuGet Package Restore to download them. For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is ..\packages\WinSCP.5.15.0\build\WinSCP.targets.
I can indeed reproduce your problem when I use the WinSCP NuGet package. It looks like a problem between the NuGet package manager and SQL Server Data Tools. The file the error refers to actually does exist (in a path relative to the script task .csproj file).
Actually, it looks like using NuGet in SSIS is not even recommended. You should rather register the assembly in the GAC:
How can I use NuGet with SSDT?
Creating a reference to a custom assembly from an SSIS Script Task - vb
SSIS Script Task cant find reference to assembly
And indeed, if I follow the WinSCP instructions for using the assembly from SSIS (using the GAC), it works just fine.
Make sure you have uninstalled the NuGet package.
Install WinSCPnet.dll to the GAC, or subscribe to the AppDomain.AssemblyResolve event.
Then add WinSCPnet.dll as a reference in your script task project.
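For the AssemblyResolve alternative mentioned above, a minimal sketch might look like the following. `WinScpAssemblyResolver` and the directory in `AssemblyDirectory` are hypothetical and need to be adapted to the actual location of WinSCPnet.dll; only `AppDomain.AssemblyResolve`, `AssemblyName`, and `Assembly.LoadFile` are standard .NET APIs.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper: loads WinSCPnet.dll from a known folder instead of the GAC.
public static class WinScpAssemblyResolver
{
    // Assumed install location; adjust to where WinSCPnet.dll actually lives.
    public const string AssemblyDirectory = @"C:\Program Files (x86)\WinSCP";

    // Call once, before any method that uses WinSCP types is executed.
    public static void Register()
    {
        AppDomain.CurrentDomain.AssemblyResolve += Resolve;
    }

    // Returns the WinSCPnet assembly when it is the one requested and the file exists;
    // returns null otherwise so the runtime continues its normal resolution.
    public static Assembly Resolve(object sender, ResolveEventArgs args)
    {
        string simpleName = new AssemblyName(args.Name).Name;
        if (!String.Equals(simpleName, "WinSCPnet", StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }
        string path = Path.Combine(AssemblyDirectory, "WinSCPnet.dll");
        return File.Exists(path) ? Assembly.LoadFile(path) : null;
    }
}
```

In a Script Task, Register() would typically be called from a static constructor of ScriptMain, so that the handler is in place before Main() touches any WinSCP type.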

Connecting to SFTP via SSIS

I'm trying to connect to an SFTP server via an SSIS package. The package executes WinSCP with the following connection string in a .txt file:
open sftp://username:fc$#6444@example.com:22
However, the package keeps failing without being able to connect. Is it something to do with the special characters in the password?
I am able to connect to a different SFTP server if I replace the string, so I know it must be something to do with the syntax above. I've tried putting double quotes around the string as follows, without any success:
open "sftp://username:fc$#6444@example.com:22"
I had to do this too, for one of my work projects recently. We used the WinSCP .NET assembly inside an SSIS Script Task, as this is what WinSCP itself recommends for SFTP from SSIS.
See this guide: Using WinSCP .NET Assembly from SQL Server Integration Services (SSIS). It walks you through the install and setup and also contains working sample code (after you change the script to your needs, of course).
Sample code, to be used after you reference the WinSCPnet.dll assembly, is below.
using System;
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Tasks.ScriptTask;
using System.AddIn;
using WinSCP;
namespace ST_5a30686e70c04c5a8a93729fd90b8c79.csproj
{
    [AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            // Setup session options
            SessionOptions sessionOptions = new SessionOptions
            {
                Protocol = Protocol.Sftp,
                // To setup these variables, go to SSIS > Variables.
                // To make them accessible from the script task, in the context menu of the task,
                // choose Edit. On the Script task editor on Script page, select ReadOnlyVariables,
                // and tick the below properties.
                HostName = (string) Dts.Variables["User::HostName"].Value,
                UserName = (string) Dts.Variables["User::UserName"].Value,
                Password = (string) Dts.Variables["User::Password"].Value,
                SshHostKeyFingerprint = (string) Dts.Variables["User::SshHostKeyFingerprint"].Value
            };
            try
            {
                using (Session session = new Session())
                {
                    // As WinSCP .NET assembly has to be stored in GAC to be used with SSIS,
                    // you need to set path to WinSCP.exe explicitly, if using non-default location.
                    session.ExecutablePath = @"C:\winscp\winscp.exe";
                    // Connect
                    session.Open(sessionOptions);
                    // Upload files
                    TransferOptions transferOptions = new TransferOptions();
                    transferOptions.TransferMode = TransferMode.Binary;
                    TransferOperationResult transferResult;
                    transferResult = session.PutFiles(@"d:\toupload\*", "/home/user/", false, transferOptions);
                    // Throw on any error
                    transferResult.Check();
                    // Print results
                    bool fireAgain = false;
                    foreach (TransferEventArgs transfer in transferResult.Transfers)
                    {
                        Dts.Events.FireInformation(0, null,
                            string.Format("Upload of {0} succeeded", transfer.FileName),
                            null, 0, ref fireAgain);
                    }
                }
                Dts.TaskResult = (int)DTSExecResult.Success;
            }
            catch (Exception e)
            {
                Dts.Events.FireError(0, null,
                    string.Format("Error when using WinSCP to upload files: {0}", e),
                    null, 0);
                Dts.TaskResult = (int)DTSExecResult.Failure;
            }
        }
    }
}
Install WinSCP, then create a folder where you want to receive the file from the client or from which you want to send the file. Then open an Execute Process Task, go to the Expressions tab, and set the Executable and Arguments properties to run the script below (change the values to match your setup).
Write this script in Notepad and save it as C:\path\to\winscp.txt:
open sftp://Host_Name:Password@apacsftp01.mftservice.com/ -hostkey="ssh-rsa 2048 xxxxxxxxxxx...="
get -delete /home/client/Share/MediaData/Media_file.xlsx
exit
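On the original question about special characters: in a session URL, characters such as # and @ inside the user name or password have their own meaning in URL syntax, so they generally need to be percent-encoded. A minimal sketch of building such a URL with the standard Uri.EscapeDataString method follows; `SessionUrlBuilder` and `Build` are hypothetical names, and the sample password is the one from the question.

```csharp
using System;

// Hypothetical helper: builds an sftp:// session URL with encoded credentials.
public static class SessionUrlBuilder
{
    // Percent-encodes the user name and password; host and port are used as given.
    public static string Build(string userName, string password, string host, int port)
    {
        return String.Format("sftp://{0}:{1}@{2}:{3}",
            Uri.EscapeDataString(userName),
            Uri.EscapeDataString(password),
            host,
            port);
    }
}
```

For example, Build("username", "fc$#6444", "example.com", 22) yields sftp://username:fc%24%236444@example.com:22, which could then be written after the open command in the script file.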

How to display variable in ssis script task

I have an 'Execute SQL Task' that produces a row count. See screen 1.
I am trying to print 'ContactRowCount' in an SSIS 'Script Task'. See screen 2.
But Dts.Variables["User::ContactRowCount"] is null. I can't retrieve it.
Does anyone know how to retrieve a variable value set by an 'Execute SQL Task' in a Script Task?
Screen - 1
Screen - 2
Do read the documentation; all of this is covered there.
Variables
I have two variables. One is for my SQL, called Query, which is optional. The other is an Int32 called RowCount.
Execute SQL Task
I have an Execute SQL Task that uses an OLE DB Connection Manager. I have specified that it has a ResultSet of Single row. I use my Query variable as the source.
The value is
SELECT COUNT(1) AS ContactRowCount FROM sys.columns AS SC;
In the Result Set tab, I map the 0 ResultSet to my variable User::RowCount.
If you are using a different Connection Manager provider (ADO.NET or ODBC), then these semantics all change. But it's documented.
Script Task
I ensure that I am passing in my variable as a read-only object.
Within my script, I need to access that variable. The name is case sensitive. Furthermore, I want the .Value property. Your code, were it to work, would be casting the SSIS Variable itself to a string. That falls back to the object's default string representation, which is its type name: Microsoft.SqlServer.Dts.Runtime.Variable
Instead, we want to access the .Value property, which is returned as an object. If you were trying to do something mathematical with this value, you'd need to convert it to an integer, but since we're going to a string, that's easy.
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
namespace ST_454527867e9d448aad6a7f03563175b2.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion
        public void Main()
        {
            string strMessage = Dts.Variables["User::RowCount"].Value.ToString();
            MessageBox.Show(strMessage);
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}
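For the "something mathematical" case mentioned above, the boxed value has to be converted before use. A minimal sketch, independent of SSIS types, might be the following; `VariableValueConverter` and `ToRowCount` are hypothetical names, and treating null or DBNull as zero is an assumption rather than SSIS behaviour.

```csharp
using System;

// Hypothetical helper: converts the object returned by a variable's Value property to an int.
public static class VariableValueConverter
{
    public static int ToRowCount(object value)
    {
        // Assumption: a missing value is treated as a count of zero.
        if (value == null || value is DBNull)
        {
            return 0;
        }
        // Convert.ToInt32 accepts boxed integers of other widths as well as numeric strings.
        return Convert.ToInt32(value);
    }
}
```

Inside the Script Task this would be called as ToRowCount(Dts.Variables["User::RowCount"].Value).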
Use the variable name in the Script Task, not the result set name.
You can check the variable values while debugging at run time.

SSIS: Set list variables of (1) inventorycountNr and (2) StoreNr, and then use them in a where clause

1. Script Task: set arrays of (A) inventory count and (B) StoreNr
2. Data Flow Task: use the list variables in WHERE clauses (to filter and thereby speed up performance)
Note: the Script Task must read from server A and the Data Flow Task from server B.
I don't want to use a linked server, and I don't want to filter downstream in the data flow. Instead, I want to filter through WHERE clauses in the data flow source (OLE DB).
You can do it with two Data Flows.
In the first:
Select the values to be used in the WHERE clause from the source table.
Store these values in a string variable, ListToBeFetched, as a comma-separated list, using a Script Component as the destination with code similar to:
using System.Text;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Collects the incoming values as "v1,v2,v3,"
    StringBuilder sb;
    public override void PreExecute()
    {
        base.PreExecute();
        sb = new StringBuilder();
    }
    public override void PostExecute()
    {
        base.PostExecute();
        // Read/write variables can only be assigned in PostExecute; drop the trailing comma
        Variables.ListToBeFetched = sb.ToString().TrimEnd(',');
    }
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        if (!Row.Value_IsNull)
        {
            sb.AppendFormat("{0},", Row.Value);
        }
    }
}
Do the same with the second list.
In the second Data Flow, set a dynamically generated query as the SQL command of the OLE DB Source (taken from Jamie Thomson's blog):
Create a new variable called SourceSQL
Open up the properties pane for the SourceSQL variable (by pressing F4)
Set EvaluateAsExpression=TRUE
Set Expression to "select * from table where columnToBeSearched in (" + @[User::ListToBeFetched] + ")"
For your OLE DB Source component, open up the editor
Set Data Access Mode="SQL Command from variable"
Set VariableName = "SourceSQL"
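To see what the expression above evaluates to, the same concatenation can be sketched in plain C#. `SourceSqlBuilder` is a hypothetical name, integer ids are an assumption (string values would need quoting and escaping), and the "where 1 = 0" fallback for an empty list is also an assumption, added because "in ()" is not valid SQL.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper mirroring the SSIS expression for SourceSQL.
public static class SourceSqlBuilder
{
    // Joins the ids into "1,2,3"; no trailing comma to trim.
    public static string BuildIdList(IEnumerable<int> ids)
    {
        return String.Join(",", ids);
    }

    // Builds the query text; an empty list returns a query that matches no rows.
    public static string BuildQuery(string idList)
    {
        if (String.IsNullOrEmpty(idList))
        {
            return "select * from table where 1 = 0";
        }
        return "select * from table where columnToBeSearched in (" + idList + ")";
    }
}
```

With ids 10, 20 and 30 the resulting text is: select * from table where columnToBeSearched in (10,20,30)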

SSIS: Get any flat file source from folder and cache the name as a super global variable

I'm working in SSIS and Visual Studio 2008. When executed, I need to have the SSIS package perform the following tasks:
Check a folder for a file
If a file exists take the file and use it as the source for the flat file
Store the name of the file into a global variable that I can access in other parts of my package
The package will be run by some other script. Thus we need it to check for the file every time the package runs. We are trying to prevent the scenario where we have to monitor the folder and execute the package manually when the file appears.
Any suggestions?
The easiest way would be to set up a Foreach Loop container that has all the "work" of your package inside of it (optionally, you can use it as a precursor step and put a conditional expression on the constraint that follows it). This assumes you have 2 variables: FileName, which the value will be assigned to, and InputFolder, which contains the folder we should be looking in.
ForEach Loop Editor
Collection tab:
Enumerator = Foreach File Enumerator
Expression: Directory = @[User::InputFolder]
FileSpec: "YD.*"
Retrieve file name
* Fully qualified
Variable Mappings tab:
Variable: User::FileName
Index: 0
You can also do this via a script task, if you'd like to see that, let me know.
EDIT
This script again assumes you have the variables InputFolder and FileName defined. Create a Script Task and check InputFolder as a read-only variable and FileName as a read/write variable.
using System;
using System.Data;
using System.IO; // this needs to be added
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
// namespace will vary
namespace ST_bc177fa7cb7d4faca15531cb700b7f11.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion
        public void Main()
        {
            string inputFolder;
            string fileName;
            inputFolder = Dts.Variables["InputFolder"].Value.ToString();
            // File, if exists will look like YD.CCYYMMDD.hhmmss.done
            string fileMask = "YD.*.done";
            // this array will catch all the files matching a given pattern
            string[] foundFiles = null;
            foundFiles = System.IO.Directory.GetFiles(inputFolder, fileMask);
            // Since there should be only one file, we will grab the zeroeth
            // element, should it exist
            if (foundFiles.Length > 0)
            {
                fileName = foundFiles[0];
                // write the value to our global SSIS variable
                Dts.Variables["FileName"].Value = fileName;
            }
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}
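The script above takes element zero of whatever Directory.GetFiles returns, and the order of that array is not guaranteed. If more than one file could ever match, a minimal sketch of a deterministic pick follows; `DoneFileFinder` and `FindFirst` are hypothetical names, and choosing the first name in ordinal order (the earliest timestamp, given the YD.CCYYMMDD.hhmmss.done pattern) is an assumption about what is wanted.

```csharp
using System;
using System.IO;

// Hypothetical helper: picks one matching file in a repeatable way.
public static class DoneFileFinder
{
    // Returns the full path of the first match in ordinal name order, or null when nothing matches.
    public static string FindFirst(string inputFolder, string fileMask)
    {
        string[] foundFiles = Directory.GetFiles(inputFolder, fileMask);
        if (foundFiles.Length == 0)
        {
            return null;
        }
        // Sort explicitly instead of relying on the order the file system returns
        Array.Sort(foundFiles, StringComparer.Ordinal);
        return foundFiles[0];
    }
}
```

A null result can be used to leave the FileName variable untouched or to branch in the control flow.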
Here is another possible option. You can achieve this with the Foreach Loop container. The example below should give you the idea.
Step-by-step process:
On the SSIS package, create 3 variables as shown in screenshot #1. The scope CheckFile represents the package name. The variable Folder represents the folder that you would like to check for the file. Filename represents the file name to check for. The variable FilePath is the global variable that you need. It is filled in with the file path if the file exists; otherwise it is empty.
On the package's Control Flow tab, place a Foreach Loop container and a Script Task. The Script Task is there to show that the variable retains its value after the Foreach Loop container has finished executing. Refer to screenshot #2.
Configure the Foreach Loop container as shown in screenshots #3 and #4.
Replace the Main() method within the Script Task with the code given under the Script task code section. This demonstrates the value retained by the variable FilePath.
Screenshot #5 shows that no files exist in the path c:\temp\ and screenshot #6 shows the corresponding package execution.
Screenshot #7 shows that the file TestFile.txt exists in the path c:\temp\ and screenshot #8 shows the corresponding package execution.
If you would like to process the file when it exists, you can place a Data Flow Task within the Foreach Loop container to do that.
Hope that helps.
Script task code:
C# code that can be used only in SSIS 2008 and above.
public void Main()
{
    Variables varCollection = null;
    Dts.VariableDispenser.LockForRead("User::FilePath");
    Dts.VariableDispenser.GetVariables(ref varCollection);
    if (String.IsNullOrEmpty(varCollection["User::FilePath"].Value.ToString()))
    {
        MessageBox.Show("File doesn't exist.");
    }
    else
    {
        MessageBox.Show("File " + varCollection["User::FilePath"].Value.ToString() + " exists.");
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
Screenshot #1:
Screenshot #2:
Screenshot #3:
Screenshot #4:
Screenshot #5:
Screenshot #6:
Screenshot #7:
Screenshot #8: