SSIS - Move List of Files from One Folder to Another

I have a folder SourceFolder with about 10,000 PDF documents.
I have a list (FileList.csv) with the names of about 1,000 of those files.
I want to move the files on that list from the folder they are in to an empty folder I have created (DestFolder). I am using SSIS 2013.
As a Proof-of-Concept, I successfully configured a ForEachLoop container using a ForEachItem Enumerator with a FileSystem Task inside, and moved 2 of the files.
However, I had to enter the filenames by hand in the ForEachLoop Editor --> Collection --> Enumerator Configuration window.
I have Variables configured for FileName, SourceFolder, and FullSourcePath, and everything works.
My question is - How can I connect to the flat file to get the filenames into the variable?
I'm not allowed to post images, so I will try to explain what I have tried.
Adding a Flat File Source and Connection Manager, and using Expressions in the Connection Manager properties to assign the FileName variable to ConnectionString (no luck).
Feeding the Flat File Source into a Recordset Destination, assigning the result set to an Object variable, and changing the ForEach Loop Container to use the Foreach ADO Enumerator with that Object variable as the ADO object source variable (no luck).
This seems like such a simple task that I hope I am missing something obvious. Apologies for not including images.

Create a variable of type Object to store your file names.
In a Script Task, populate a String array with the file names and assign it to the Object variable.
Create a precedence constraint between your script task and your foreach container.
In the collection tab of your foreach loop container select Foreach From Variable Enumerator as enumerator type.
In the variable mapping tab select the variable that will store the current file name during the enumeration.
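The first step above (filling a collection with the file names) is the part the question was stuck on. The logic can be sketched outside SSIS in Python. This is only an illustration, not SSIS code; it assumes FileList.csv has one file name per row in the first column and no header row, and the function name read_file_names is hypothetical.

```python
import csv


def read_file_names(csv_path):
    """Return the non-empty values from the first column of a CSV file.

    Assumption: one file name per row, first column, no header row.
    """
    names = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            # Skip blank lines and rows whose first cell is empty.
            if row and row[0].strip():
                names.append(row[0].strip())
    return names
```

Inside a Script Task, the equivalent would be reading the file into a string array and assigning that array to the Object variable, which the Foreach From Variable Enumerator then walks one element at a time.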

I finally got this to work and thought I would post my solution for anyone else who might face the same issue.
I used an SQL Task with an embedded query to get my list of filenames straight from the database.
I fed this into a ForEachLoop Container with a File System Task inside.
I created the following variables:
SourceFolder | String | B:\Desktop\Source\
DestFolder | String | B:\Desktop\Dest\
FileName | String |
FileList | Object |
FileFullPath | String | (Expr:) @[User::SourceFolder] + @[User::FileName]
Connection Managers
OLE DB for the database (obviously)
DestFolder | UsageType: | Existing Folder | Browse to the Folder
SourceFolder | Usage Type: | Existing File | Point to ANY file in the Folder
IMPORTANT: In the SourceFolder Connection Manager, set the following Expression:
ConnectionString | @[User::FileFullPath]
In the SQL Task Editor | General window:
Set the ResultSet to Full result set
Paste the query into SQL Statement
In the SQL Task Editor | ResultSet window:
Result Name = 0
Variable Name = User::FileList
In the ForEach Loop Editor | Collection window:
Enumerator = Foreach ADO Enumerator
ADO object source variable = User::FileList
Enumeration Mode = Rows in the first table
In the ForEach Loop Editor | Variable Mappings window:
Variable = User::FileName
Index = 0
In the ForEach Loop Editor Properties:
Set the Maximum Error Count to some number greater than the number of files you are looping over. (This covers the situation where your query generates a filename that is not in your Source Folder. I'm sure there's a more elegant way to do this, and I look forward to learning it someday!)
In the File System Task Editor | General window:
IsDestinationPathVariable = False
DestinationConnection = DestFolder
Operation = MoveFile
IsSourcePathVariable = False
SourceConnection = SourceFolder
In the File System Task Properties window:
Set the Maximum Error Count > the number of files in SourceFolder
The package will now run and generate an error for every filename in the FileList that does not have a matching file in the Source Folder, but it will run and move the files.
Gotchas:
Don't forget the "\" at the end of the filepath variables
Don't forget to set the Expression in the Source Connection Manager Properties window (see above).
Thanks to everyone for your help.
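The answer above tolerates missing files by raising the Maximum Error Count. One alternative, sketched here in Python rather than SSIS, is to check that each file exists before moving it and to collect the names that were not found. The function name move_listed_files is hypothetical, and the sketch assumes the destination folder already exists, as DestFolder does in the question.

```python
import shutil
from pathlib import Path


def move_listed_files(names, source_folder, dest_folder):
    """Move each named file from source_folder to dest_folder.

    Returns (moved, missing): names that were moved and names that
    had no matching file in source_folder.
    """
    source = Path(source_folder)
    dest = Path(dest_folder)
    moved, missing = [], []
    for name in names:
        candidate = source / name
        if candidate.is_file():
            shutil.move(str(candidate), str(dest / name))
            moved.append(name)
        else:
            # Record the miss instead of raising an error.
            missing.append(name)
    return moved, missing
```

In a package, the same idea would presumably be an existence check placed ahead of the File System Task, so that a missing file is skipped rather than counted as an error.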

Much quicker way to accomplish this!
Drop a Foreach Loop container onto the control flow.
Under Collection, select Foreach File Enumerator.
Folder is where the files are, e.g. c:\SrcFiles. Leave "Fully qualified" selected and set Files to *.txt or whatever the file extension is.
Under Variable Mappings, add a String variable named FileName and make sure it is in package scope.
Drop a File System Task into the container.
Source Variable is FileName.
Destination Connection is the location you are moving to, e.g. d:\ReceivedFiles.
Overwrite destination = True
DelayValidation = True
Voila!

Related

SSIS ForEach Loop Container - How to dynamically change OLEDB Destination connection at run time

There are 4 Connection strings with different SQL Servers (which I set up in SSIS Connection Managers section):
Database name is same in all the servers:
SERVER DATABASE
dbTestServer dbFees (Main Server and Database)
dbTestServer1 dbFees1
dbTestServer2 dbFees1
dbTestServer3 dbFees1
dbTestServer is the OLE DB Source and the other servers are OLE DB Destinations that need to be updated every time we run the package.
Now, I want to take data from dbTestServer-dbFees and copy to all the other databases. I created a Dataflow task to copy data from dbTestServer to dbTestServer1.
But I need to put this data flow task inside ForEach Loop container to change the connection/Server dynamically so that it will work like:
First run: by default the OLE DB Source is set to dbTestServer and the OLE DB Destination is set to dbTestServer1, and data is copied from dbFees to dbFees1.
Second run: the OLE DB Source is set to dbTestServer and the OLE DB Destination is set to dbTestServer2, and data is copied from dbFees to dbFees1.
Third run: the OLE DB Source is set to dbTestServer and the OLE DB Destination is set to dbTestServer3, and data is copied from dbFees to dbFees1.
I need step by step solution as I am new to SSIS packages and I tried multiple solutions but NOTHING worked so far!
Appreciate your help!
Thank you
I suggest using a For Loop container.
The idea is to increment a variable on each iteration and build the connection string from an expression that includes the iteration number.
Step 1: create a Connection Manager with server name dbTestServer1 and database name dbFees1.
Step 2: assign that Connection Manager to the OLE DB Destination.
Step 3: create two variables, ConnString and Iteration.
Set the default value of Iteration to 1, because the first destination is dbTestServer1.
Set ConnString to an expression equal to your initial connection string, with the 1 in dbTestServer1 replaced by (DT_STR, 1, 65001)@[User::Iteration].
Once the variables are set, add an expression to the OLE DB Connection Manager:
from the property drop-down select ConnectionString and type @[User::ConnString].
Finally, configure the For Loop container to initialize, test, and increment Iteration.
NOTE: I can't test the package because I don't have servers named like yours, but this is the logic for solving your problem. It only covers what you asked; you must build the rest of the package yourself.
For the main server and database, just add one OLE DB Source with static server and database names.
You don't need a Script Task if you use this approach.
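To make the For Loop idea concrete, here is a small Python sketch of what the ConnString expression evaluates to on each iteration. It is only an illustration of the string building, not SSIS expression syntax. The server and database names come from the question; the provider and security settings are borrowed from the Script Task answer in this thread and are an assumption about the real environment. The function names are hypothetical.

```python
def build_connection_string(iteration):
    """Return the destination connection string for one loop iteration.

    Mirrors the expression idea: the fixed text of the connection string
    with the iteration number appended to the server name.
    """
    server = f"dbTestServer{iteration}"
    return (
        f"Data Source={server};Initial Catalog=dbFees1;"
        "Provider=SQLNCLI11.1;Integrated Security=SSPI;"  # assumed settings
    )


def destination_connection_strings(count):
    """Connection strings for iterations 1..count, in loop order."""
    return [build_connection_string(i) for i in range(1, count + 1)]


if __name__ == "__main__":
    for conn in destination_connection_strings(3):
        print(conn)
```

Each pass of the loop changes only the server name, which is why a single Connection Manager with an expression on ConnectionString is enough for all three destinations.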
Here is the code I have used to dynamically change connection server/database inside C# Script task in SSIS:
Variables I pass to the C# Script task under ReadOnlyVariables:
(set these up in your Variables inside SSIS)
User::DatabaseListOnThisLoop_ConnectionString
User::DatabaseListOnThisLoop_DatabaseName
This is the name of the connection manager (in my Connection Managers in SSIS) whose connection string I change dynamically:
SourceServerDBForClassification_Dynamic
FULL SCRIPT from my C# Script task inside SSIS. As long as you set up the variables and put the 2 above in the ReadOnlyVariables section of the Script task, you should be able to copy/paste the entire code below into your C# Script task.
NOTE: The namespace may give you an issue, so you may want to keep the one that is generated in your code when you add the Script task.
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms; // dont think this is needed, I used this for message box for some testing, but leaving here just in case
namespace ST_f8d6dad17af541bbb0010c9fce3ccbb0
{
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            // get connection string from variable
            string ServerConnectionStringOnThisLoop = Dts.Variables["DatabaseListOnThisLoop_ConnectionString"].Value.ToString();
            string DatabaseOnThisLoop = Dts.Variables["DatabaseListOnThisLoop_DatabaseName"].Value.ToString();
            // this could change depend on what type of connection you are using for provider and other settings
            string DynamicConnectionString = "Data Source=" + ServerConnectionStringOnThisLoop + ";Initial Catalog=" + DatabaseOnThisLoop + ";Provider=SQLNCLI11.1;Integrated Security=SSPI;";
            // Add the OLE DB connection manager set to existing connection
            ConnectionManager SourceServerDBForClassification_Dynamic = Dts.Connections["SourceServerDBForClassification_Dynamic"];
            // now set the dynamic connection above to the connection string passed in from SSIS package
            SourceServerDBForClassification_Dynamic.ConnectionString = DynamicConnectionString;
            // now set the package connection to the one we just created from using the variable from the SSIS package
            Dts.Connections["SourceServerDBForClassification_Dynamic"].ConnectionString = SourceServerDBForClassification_Dynamic.ConnectionString;
            Dts.TaskResult = (int)ScriptResults.Success;
        }
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
    }
}

Unable to use expression on excel connection manager in SSIS 2017

I'm trying to loop through excel files in a directory and perform a data flow task in SSIS.
The For-Each Loop container seems pretty simple to set up:
I map to a variable called FileNameTemp.
Inside the For-Each Loop, I have a data flow task where the source object is an Excel Source with an Excel Connection Manager. I use FileNameTemp in an expression to set the ExcelFilePath property of the connection manager:
My problem is whenever I try to run the package, I get the error below:
[Connection manager "Excel Connection Manager"] Error: SSIS Error Code
DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code:
0x80004005. An OLE DB record is available. Source: "Microsoft Access
Database Engine" Hresult: 0x80004005 Description: "Failure creating
file.".
I found other similar posts. I definitely have permission to write to this folder. If I remove the expression and just open the same file over and over it works. I also set DelayValidation to true on pretty much every level.
Try removing the "C:..." from your expression definition. The For-Each file enumerator will give the full path.
In the future you can set a breakpoint on your data flow task and view the value of your variable that you set in the locals tab.
This is the same answer as @mike Baron's, just more verbose: in the ForEach Loop Container, the "Fully Qualified" radio button is selected, and the result is pushed into our variable @[User::FileNameTemp].
Each file found in the specified source folder C:\SourceCode\ExcelSourceFinancialReconcilliation is in turn assigned to that variable in the form of
C:\SourceCode\ExcelSourceFinancialReconcilliation\file1.txt
C:\SourceCode\ExcelSourceFinancialReconcilliation\file2.csv
C:\SourceCode\ExcelSourceFinancialReconcilliation\file2.xls
Then, when we set the Expression on the Excel Connection Manager's ExcelFilePath property, we need to use just @[User::FileNameTemp]. As it stands, the expression is doubling up the path so that Excel is attempting to find
C:\SourceCode\ExcelSourceFinancialReconcilliation\file1.txt\C:\SourceCode\ExcelSourceFinancialReconcilliation\file1.txt
As a general rule, only use a direct variable in the Expressions associated with "objects" in SSIS: Property1 = @Variable. The reason is that you cannot put a breakpoint on the evaluation to determine why Property1 = "Foo" + @Variable is invalid. If you create a custom variable @Property1Variable = "Foo" + @Variable and then assign Property1 = @Property1Variable, you can put a breakpoint in the package and inspect the value of the SSIS variable. It's much easier to find problems this way.
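The doubled path is easy to reproduce outside SSIS. The Python sketch below is an illustration only: it assumes the broken expression prepends the folder to the variable, which is what the first answer suggests, and the names enumerated_value, doubled, and correct are hypothetical.

```python
from pathlib import PureWindowsPath

FOLDER = r"C:\SourceCode\ExcelSourceFinancialReconcilliation"


def enumerated_value(file_name):
    """What a Foreach File Enumerator set to 'Fully qualified' stores
    in the mapped variable: folder plus file name."""
    return str(PureWindowsPath(FOLDER) / file_name)


file_name_temp = enumerated_value("file1.xls")

# Folder prefix + variable: the folder appears twice, so the path is invalid.
doubled = FOLDER + "\\" + file_name_temp

# Variable alone: already a complete path.
correct = file_name_temp

print(doubled)
print(correct)
```

Printing both values side by side is the same check the breakpoint advice gives you inside the package: look at the final string before it is handed to the connection manager.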
Possibly helpful other answers on the subject
https://stackoverflow.com/a/18640174/181965
https://stackoverflow.com/a/21536893/181965

Access: display .vsd from attachments

I have an Access table where each item has attached a Visio file (.vsd).
In my Access form, I would like to see the file. I don't care if it is an editable Visio file, a preview or just an image.
I have built a VBA code that let me load the Visio file from a Directory. But I need to load the file from a table.
Here my VBA code.
Private Sub Carica_Dati()
    Dim path As String
    path = "C:\Users\VisioFlow_001.vsd"
    With Me.VisioObject ' name of the OLE Object where I want to put the Visio file
        .Class = "Visio.Drawing.11"
        .OLETypeAllowed = acOLELinked
        .SourceDoc = path ' HERE I WANT TO LOAD THE FILE FROM A TABLE OF THE DB
        .Enabled = True
        .Locked = False
        .Action = acOLECreateLink
        .SizeMode = acOLESizeZoom
    End With
End Sub
Here a preview of the form.
UPDATE
Here a picture to show how the file is attached to the table.
Since attachment fields in Access aren't very consistent, directly loading them into an OLE object is not an option, unless you're willing to do sophisticated things.
Microsoft's documentation on attachments can be found here
My observations on attachments: the binary data field contains one of the following:
Some characters I can't identify + the file type + the file data appended to it
Some characters I can't identify + the file type + a compressed version of the file data appended to it
Microsoft, in all its wisdom, has supplied us with a way to save the original file to disk, but hasn't supplied us with a way to remove those initial characters and the file type from the actual file data, or an easy way to identify whether the file is compressed (you can check the file type against the table in the linked documentation to see if it should be).
In conclusion, you're probably best off either replacing your attachment field with an OLE object in the database, or writing the attachment files to disk before displaying them.
If you use an OLE object field, and load them in as long binary data (not through the GUI), you can easily achieve the behaviour you seek without writing the file to disk, since the binary data is available without any extra characters.
To write an attachment file to disk:
Dim rsForm As DAO.Recordset2
Dim rsFiles As DAO.Recordset2
Set rsForm = Me.Recordset
' The attachment field's Value is itself a recordset of the attached files
Set rsFiles = rsForm.Fields("attachment_column").Value
If Not rsFiles.EOF Then
    Dim fileLocation As String
    ' Environ("TEMP") normally has no trailing backslash, so add one
    fileLocation = Environ("TEMP") & "\" & rsFiles.Fields("FileName").Value
    rsFiles.Fields("FileData").SaveToFile fileLocation
    'Your existing code to display the OLE object here
End If
You do not want to use the Attachment feature. Its purpose is different than what you are attempting.
Put the images into their own stand alone folder outside of the database.
In the table that holds the records for your main form - you need a new field which holds the path & image file name. This is a text field. (If the path segment is uniform for all one can insert that elsewhere via code rather than store it in this field.)
Then in form design, use the image control. This control (like all controls) has a source property, which will change with each record based on the field that holds the path & file name.
Search Bing/Google for changing an image with every record; the setup isn't necessarily intuitive. Note that older editions did things differently, so be sure you get relatively recent advice.
Then when you are using the form and change records - the image will change.
Note after having typed all this.... I have no idea if the visio file type works - I know that jpg and bmp do... so first sanity check a simple fixed image with that file type to see if it works ...

SSIS Scientific Notation Desirable

I am working on an SSIS project that scans a directory and loops through each Excel file, which is then loaded into MSSQL. Currently, I am having an issue with 2966171 being represented as 2.966171e+006. Here is what I have:
1) The Excel Connection String is passing IMEX=1; (Import Export Mode)
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\UNC\PATH\TO\Excel.xls;Extended Properties="EXCEL 8.0;HDR=NO;IMEX=1";
2) The data type for this field is confirmed as DT_WSTR of length 255.
Things I have tried:
1) Changing the datatype in Excel to Text
2) Creating a script component that explicitly converts a string to a decimal back to a string. (Terrible Approach)
3) Casting in a derived column component.
EDIT: I must keep this column a DT_WSTR type, some other rows contain alphanumeric values.
Use a Foreach Loop container to loop over the Excel files
and map the file path into a variable (e.g. @[User::strExcelFile]).
In the Foreach container you must use 2 Data Flow Tasks:
the first one contains an Excel Source and a Script Component as a destination; the second Data Flow Task is your task.
If the Excel files have the same structure, follow the steps below:
Open an Excel file and change the entire column type to Number.
In the Excel Connection Manager, choose this file.
In the second Data Flow Task, set the DelayValidation property of the Excel Source to True.
In the first Data Flow Task, in the Script Component properties (Script tab), put the variable "strExcelFile" in ReadOnlyVariables. In the script, do the following:
First, add Microsoft.Office.Interop.Excel.dll as a reference.
Second, read the Excel file path from the variable using the following code:
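' Alias the namespace so that Excel.Application, Excel.Workbook etc. resolve
Imports Excel = Microsoft.Office.Interop.Excel
Dim strExcelFile As String = String.Empty
Public Overrides Sub PreExecute()
    MyBase.PreExecute()
    ' Read the current file path from the read-only SSIS variable
    strExcelFile = Variables.strExcelFile
End Sub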
Third, in the main Sub, create an Excel.Application and set its Visible property to False.
Open the Excel file, change the entire column type to Number, save the file, and close the application using the following code:
Dim ColIdx As Integer = 1 'Number column index (Excel columns are 1-based)
Dim appExcel As New Excel.Application
appExcel.Visible = False
'Open the workbook directly; a Workbook is obtained from Workbooks.Open, not created with New
Dim wrkbExcel As Excel.Workbook = appExcel.Workbooks.Open(strExcelFile)
Dim wrkshExcel As Excel.Worksheet = wrkbExcel.Worksheets(1) 'Worksheets are 1-based
wrkshExcel.Cells(1, ColIdx).EntireColumn.NumberFormat = "0"
'this will change the EntireColumn Type to Number and eliminate scientific character E+
wrkbExcel.Close(True) 'True saves the changes
appExcel.Quit()
In short, every Excel file must be edited before importing data from it, because the scientific notation appears when a number is stored in a cell whose data type is not Number.
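A different technique from the one in this answer is to repair the value after it has been read as text: detect strings that look like scientific notation and expand them, leaving everything else untouched so the column can stay DT_WSTR. The Python sketch below illustrates only that conversion logic; the function name normalize_cell is hypothetical. One caveat: a genuine alphanumeric code that happens to look like a number in scientific notation (for example 1E5) would also be expanded.

```python
import re
from decimal import Decimal

# Matches strings such as 2.966171e+006 or 1E5, but not AB123 or 12-34.
_SCIENTIFIC = re.compile(r"^[+-]?\d+(\.\d+)?[eE][+-]?\d+$")


def normalize_cell(value):
    """Expand scientific notation in a text cell; return other text as-is."""
    text = value.strip()
    if not _SCIENTIFIC.match(text):
        return value
    number = Decimal(text)
    if number == number.to_integral_value():
        # Whole numbers are written without a decimal point.
        return str(int(number))
    # Fractions are written in plain fixed-point notation.
    return format(number, "f")
```

Decimal is used instead of float so that the digits of the original value are preserved exactly rather than rounded through binary floating point.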

Reading excel rows from Execute sql task SSIS

Is it possible to read all rows from an Excel sheet through an Execute SQL Task in SSIS and read each value in a For Loop container?
You probably could and save the output to a variable, which you can use in the loop container. There may be a gotcha with permissions and/or linked server setup.
Here's another approach:
Create Data Flow
Create Data Connection to Excel file
Create an Excel Source component
Use Recordset Destination to populate a variable
Use the variable in your loop, setting Enumerator property to Foreach ADO Enumerator