SSIS 2008 - ForEach Loop to look at specific group/list of files - csv

I've been searching the internet for an answer to what I thought would be a straightforward question. Hope you guys can help.
I am using a Foreach Loop to look for specific files and move them to a different folder with a File System Task.
Say I have 10 CSV files, named A to J.
I only want to move A, E and J, but I can't seem to get the Foreach Loop to look for just that group.
In the enumerator's Files text box I have tried entering the 3 file names split by various separators, but SSIS treats the whole string as one file name and none of the 3 get moved.
Can someone advise how this can be done? Just to confirm, I don't want to use wildcard logic, just a group of specific file names, similar to the IN operator in a SQL query.
Thanks in advance
Now added an image - please advise how to select 3 specific files in the text box marked with the arrow.

Since OP isn't able to proceed with just my comment, I'll explain in a bit more detail.
First create an Object-type variable in the Variables window. Then use an Execute SQL Task to load the names of the files you need into that variable (the source can be a stored procedure or a plain SQL query): set the task's ResultSet property to Full result set and map the result to the Object variable you just created. This variable now holds the list of files you need to loop through.
Next, drag a Foreach Loop Container from the SSIS toolbox. Configure it to use the Foreach ADO Enumerator and select the Object variable as its source. Create another variable of type String to hold the current file name on each iteration, and map it (index 0) on the Variable Mappings page of the Foreach Loop Container.
Now place the File System Task that moves the files inside the Foreach Loop, and use the file name variable you created so that only the required files are moved.
If you're not sure what SQL query to use in the first step to get the 3 file names:
SELECT 'A.csv'
UNION
SELECT 'E.csv'
UNION
SELECT 'J.csv'
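Outside SSIS, the logic of the steps above (loop over a fixed list of names and move each one) can be sketched in Python. This is only an illustration of the idea, not something SSIS runs; the function name and the folder layout in the demo are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def move_listed_files(source_dir, target_dir, wanted_names):
    """Move only the named files, like looping over the ADO result set."""
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in wanted_names:  # one iteration per row of the file list
        path = source / name
        # Exact name match, no wildcards: the same idea as SQL's IN (...).
        if path.is_file():
            shutil.move(str(path), str(target / name))
            moved.append(name)
    return moved


if __name__ == "__main__":
    # Hypothetical demo data: ten files A.csv to J.csv in a temp folder.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in"
        src.mkdir()
        for letter in "ABCDEFGHIJ":
            (src / f"{letter}.csv").write_text("demo\n")
        print(move_listed_files(src, Path(tmp) / "out", ["A.csv", "E.csv", "J.csv"]))
```

Names that are not present in the source folder are simply skipped in this sketch; in the package, a File System Task pointed at a missing file would need its own handling.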


How to load files with different names to different tables in SSIS?

I have two files that are named like this:
CustomerReport(08022021-08032021)
ComparingReport(08022021-08032021)
I need to load the CustomerReport to a table and the ComparingReport to another table.
I tried a Foreach Loop Container, but I can't work out what the expression should be to pick up the right file.
I'm thinking of something like Customer*.csv, where the * acts as a wildcard, but that didn't work. What can I do in this case?
The key to this answer is to use a Foreach Loop together with a Conditional Split.
Ignore the errors shown in the screenshots; they appear only because I don't have your CSV files or database tables.
Create a new variable, FileName, of data type String.
Add a Foreach Loop Container and configure it as in the screenshots:
Collection Tab:
Variable Mappings Tab:
Add a Data Flow Task inside the Foreach Loop Container.
Drag the components from the toolbox as shown in the image.
Connect the Flat File Source to one of your CSV files using a Flat File Connection Manager. Then, on the connection manager, under Properties > Expressions, set the ConnectionString property to the FileName variable.
Configure the Conditional Split as shown in the image.
Expression for Customer is:
LEFT(
SUBSTRING(@[User::FileName],FINDSTRING((@[User::FileName]),"Customer",1),100),
FINDSTRING(SUBSTRING(@[User::FileName],FINDSTRING((@[User::FileName]),"Customer",1),100),"Report",1) - 1
) == "Customer"
Connect the Conditional Split outputs to the OLE DB Destinations.
NOTE: As I said at the top of this answer, I can't run the package. Pay attention to the Conditional Split: this is how you test whether part of a string appears within the whole string.
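To make the string logic of that expression easier to follow, here is a rough Python equivalent. It is a sketch for reading purposes only (the function names are invented, and the not-found case returns None here, which is an assumption of the sketch rather than a statement about how SSIS behaves).

```python
def prefix_before_report(file_name, keyword):
    """Text from the first occurrence of keyword up to 'Report', or None."""
    start = file_name.find(keyword)
    if start == -1:
        return None
    # SUBSTRING(FileName, position of keyword, 100)
    tail = file_name[start:start + 100]
    end = tail.find("Report")
    if end == -1:
        return None
    # LEFT(tail, position of "Report" - 1)
    return tail[:end]


def is_customer_file(file_name):
    """Mirror of the Conditional Split test: ... == "Customer"."""
    return prefix_before_report(file_name, "Customer") == "Customer"
```

For example, `is_customer_file(r"C:\in\CustomerReport(08022021-08032021).csv")` is true, while the same call with the ComparingReport file is false, so each file can be routed to its own destination.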

SSIS package - How to produce multiple files with a foreach loop container?

How can several files be produced in a single SSIS package? I have created one that produces a single file, but have no idea how to produce several ones.
The package I produced uses variables to know which data to retrieve, and an expression in the flat file connection manager to assign the correct name to the file (which is based on variables).
The single package I created retrieves the city for which I want the sales data (New York) and the month (September 2020) as variables/parameters, and uses them to extract the appropriate data. Example of SQL statement executed:
select * from table1 where City = ? and Period = ?
It then uses those to build the name for the file to be exported and sends it to a folder. But how do you do that to produce several files within the same package? How can I make the same SSIS package produce another file for Chicago - July 2020, another for Denver - June 2020, and another for San Diego - March 2020?
I plan to have a table that indicates what needs to be produced.
ExampleRow1: Chicago, Sep 2020, Produce=Yes.
ExampleRow2: Miami, Jan 2020, Produce=Yes.
So the SSIS package would need to use that info to produce a file, and then do it again, and again, until there is nothing more to produce. Is this even possible? I know a foreach loop container can help, but not sure if it can handle so many variables changing. This is pretty much the first package I create, that's why I am this ignorant. Thanks in advance!
Right now, you have it working correctly for the values of your two SSIS variables (City and Period), and you have it parameterized, so I wouldn't discount that as your first SSIS package. People struggle with far easier tasks.
What you need to do is connect the orchestrator/driver table into your package. Here's how we're going to do that.
Create an SSIS variable called rsObject of type Object. This is going to hold a recordset object aka the results of our query.
Execute SQL Task
Add an Execute SQL Task to the Control Flow. Call it "SQL Get Driver Data". You'd use a query like
SELECT T.City, T.Period FROM dbo.ExtractSetup AS T WHERE T.Produce = 1;
Change the default of No Result Set to Full Result Set. That tells SSIS to expect a table shaped return object but something needs to catch that incoming data.
In the Result Set tab, you now need to map the results into an SSIS variable. Assuming an OLE DB type connection manager, you'll select User::rsObject in the Variable Name list and use 0 as the Result Name (doing this from memory, so specifics might be slightly off).
Save and run that task. Assuming no errors, when the package runs we have a, potentially empty, set of data in our recordset object. Let's do something with that.
Shredding the data
The name I generally see applied to getting data out of enumerable objects in SSIS is "shredding the data". It is implemented with a Foreach enumerator, one of the most powerful tools in your toolkit.
Drag a Foreach Loop Container onto the canvas. Drag the connector line (precedence constraint) from "SQL Get Driver Data" to the new Foreach Loop Container. I'd name it "FELC Shred Results" to indicate my intent.
Double-click the container and change the enumerator from the default Foreach File Enumerator to the Foreach ADO Enumerator. This has no bearing on whether you used an OLE DB, ODBC or ADO.NET connection manager to populate the table-like object. If it's a table, use the ADO enumerator.
Specify our variable [User::rsObject] as the ADO object source variable.
The last thing we need to do is configure what happens with the current row in the enumerator. That's on the Variable Mappings tab. Here you'll add two variables, using a zero-based ordinal system. Choose [User::City] (or whatever you've named your City variable) for your first entry and map it to index 0. Add a row, choose User::Period and map it to index 1.
The final step is to take the existing logic (the Data Flow Task and whatever else is variable dependent) and move it into the FELC. That's literally drawing a box around it with the mouse to select everything, then holding the left mouse button and dragging it into the FELC.
Hit F5 and you should have 2 files generated.
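The overall flow (query the driver table, then run the extract once per row) can be sketched in Python with an in-memory SQLite table standing in for dbo.ExtractSetup. Everything beyond the City/Period/Produce columns is assumed for illustration: the sample rows, the helper names and especially the file naming pattern, since the question does not show the real one.

```python
import sqlite3


def demo_connection():
    """Build a stand-in driver table with made-up rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE ExtractSetup (City TEXT, Period TEXT, Produce INTEGER)"
    )
    connection.executemany(
        "INSERT INTO ExtractSetup VALUES (?, ?, ?)",
        [("Chicago", "Sep 2020", 1), ("Miami", "Jan 2020", 1), ("Denver", "Jun 2020", 0)],
    )
    return connection


def planned_files(connection):
    """Return one output file name per driver row flagged for production."""
    # Equivalent of the Execute SQL Task with a full result set.
    rows = connection.execute(
        "SELECT City, Period FROM ExtractSetup WHERE Produce = 1 ORDER BY City, Period"
    ).fetchall()
    names = []
    for city, period in rows:  # index 0 -> City, index 1 -> Period
        # Hypothetical naming pattern; in the package this comes from the
        # expression on the flat file connection manager.
        names.append(f"Sales_{city.replace(' ', '')}_{period.replace(' ', '')}.csv")
    return names
```

With the sample rows above, only Chicago and Miami are flagged, so the loop body would run twice and the Denver row is skipped.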

SSIS - How to loop through files in folder and get path+file names and finally execute stored Procedure with parameter as Path + Filename

Any help is much appreciated. I am trying to create an SSIS package to loop through files in the folder and get the Path + filename and finally execute the stored proc with parameter as path+filename. I am not sure how to get the path+filename and insert the into the Stored proc as parameter. I have attached the screenshot for your reference:
Looks like you have the right idea in general, and the link @Speedbirt186 provided has some good details, but there are a couple of nuances regarding flow and variables that I thought I might point out.
The Foreach Loop can assign the fully qualified path, the file name only, or the file name and extension to a variable. The last option will be the most help in your case if you don't want to add a Script Task to split the file name from the path. It will be a little easier if you start by adding 5 variables to your package: one for the source directory path, one for the destination (archive) directory path, and one to hold the file name and extension assigned by the Foreach Loop. The remaining two are dynamic (expression-based) variables that simply combine the source directory and the file name to get the source full path, and the destination directory and the file name to get the destination full path.
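As a rough illustration of what the two dynamic variables evaluate to on each iteration, here is the same combination in Python. The folder and file names used below are hypothetical; in the package the equivalent work is done by expressions on the variables.

```python
from pathlib import PureWindowsPath


def full_paths(source_dir, archive_dir, file_name):
    """Combine each directory with the current file name and extension."""
    source_full = PureWindowsPath(source_dir) / file_name
    destination_full = PureWindowsPath(archive_dir) / file_name
    return str(source_full), str(destination_full)
```

Calling `full_paths("C:\\Import", "C:\\Import\\Archive", "Sales.xlsx")` gives the pair of full paths that the connection and the file task would use for that iteration.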
Next make sure you set up your database and Excel file connections. In your Excel file connection after setting it up go to Expressions in the properties window and set the "Connection String" property to SourceFullPath. This will tell the connection to change the file path at every iteration of your loop.
Now you just need to set up your loop. Add the Foreach Loop Container, set a directory and a file filter, and choose "Name and extension".
Then, in the Expressions box on the Collection page, set the Directory property to your source directory variable.
The last part of the Foreach Loop is to set the variable mappings so that the file name is stored in your variable: go to that tab, choose your file name variable and set the index to 0.
At this point you can add your data flow and setup your import just like you would with a normal file (note your default value for your file name parameter should be to an actual file with the structure you will want to import).
After your data flow, drop in your Execute SQL Task and set it up as you need. Here is an example using direct input; as you can see, an easy way to reference a parameter is simply a question mark (?).
Next in your sql task setup your parameter mapping by adding in the details you need such as:
Now you are on to your File System Task. Drop it in and set it up as you like, but choose your source and destination full path variables to tell the task which file to move.
That's it, you are done. There is one more thing to note, though. With the precedence constraints set as in the image you posted, control goes from your data flow to your SQL task and your file task simultaneously. If your stored procedure relies on the file, you may want to put the file task after the SQL task. You can always change the constraint option to "Completion" if you want to move the file even if your stored procedure fails.
What you want to do is to create a variable in your package, call it something like Filename. In the Edit window of the Foreach you can configure that variable to be set (on the Variable Mappings page- set index to 0).
To create a variable, you will need to have the Variables window showing. Use the View menu to show it if it's not currently open.
Then when calling your stored procedure you can pass the then current value of the variable as a parameter.
This link might help: https://www.simple-talk.com/sql/ssis/ssis-basics-introducing-the-foreach-loop-container/

Importing flat file which has changing column order using SSIS [duplicate]

Problem.
I regularly receive feed files from different suppliers. Although the column names are consistent, the problem comes when some suppliers send text files with more or fewer columns in their feed file.
Furthermore, the arrangement of the columns in these files is inconsistent.
Other than the Dynamic Data Flow Task provided by CozyRoc, is there another way I could import these files? I am not a C# guru, but I am leaning towards using a Script Task in the control flow or a Script Component in the data flow.
Any suggestion, samples or direction will greatly be appreciated.
http://www.cozyroc.com/ssis/data-flow-task
Some forums
http://www.sqlservercentral.com/Forums/Topic525799-148-1.aspx#bm526400
http://www.bidn.com/forums/microsoft-business-intelligence/integration-services/26/dynamic-data-flow
Off the top of my head, I have a 50% solution for you.
The problem
SSIS really cares about meta data so variations in it tend to result in exceptions. DTS was far more forgiving in this sense. That strong need for consistent meta data makes use of the Flat File Source troublesome.
Query based solution
If the problem is the component, let's not use it. What I like about this approach is that conceptually, it's the same as querying a table-the order of columns does not matter nor does the presence of extra columns matter.
Variables
I created 3 variables, all of type string: CurrentFileName, InputFolder and Query.
InputFolder is hard wired to the source folder. In my example, it's C:\ssisdata\Kipreal
CurrentFileName is the name of a file. During design time, it was input5columns.csv but that will change at run time.
Query is an expression "SELECT col1, col2, col3, col4, col5 FROM " + @[User::CurrentFileName]
Connection manager
Set up a connection to the input file using the JET OLEDB driver. After creating it as described in the linked article, I renamed it to FileOLEDB and set an expression on its ConnectionString property of "Data Source=" + @[User::InputFolder] + ";Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties=\"text;HDR=Yes;FMT=CSVDelimited;\";"
Control Flow
My Control Flow looks like a Data flow task nested in a Foreach file enumerator
Foreach File Enumerator
My Foreach File enumerator is configured to operate on files. I put an expression on the Directory property for @[User::InputFolder]. Notice that at this point, if the value of that folder needs to change, it'll correctly be updated in both the Connection Manager and the file enumerator. In "Retrieve file name", instead of the default "Fully qualified", choose "Name and extension".
In the Variable Mappings tab, assign the value to our @[User::CurrentFileName] variable.
At this point, each iteration of the loop will change the value of @[User::Query] to reflect the current file name.
Data Flow
This is actually the easiest piece. Use an OLE DB source and wire it as indicated.
Use the FileOLEDB connection manager and change the Data Access mode to "SQL command from variable". Use the @[User::Query] variable in there, click OK and you're ready to work.
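To see what the two expressions evaluate to as the loop runs, here is the same string concatenation written in Python. It only mirrors the expressions quoted above (the function names are invented for the sketch); it does not open a connection or run the query.

```python
def build_connection_string(input_folder):
    """Mirror of the ConnectionString expression: the folder is the data source."""
    return (
        "Data Source=" + input_folder
        + ';Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties="text;HDR=Yes;FMT=CSVDelimited;";'
    )


def build_query(current_file_name):
    """Mirror of the Query expression: the file name plays the role of the table."""
    return "SELECT col1, col2, col3, col4, col5 FROM " + current_file_name


if __name__ == "__main__":
    print(build_connection_string(r"C:\ssisdata\Kipreal"))
    for name in ["input5columns.csv", "input7columns.csv"]:
        print(build_query(name))
```

The connection string stays the same for every file in the folder; only the query text changes from one iteration to the next.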
Sample data
I created two sample files, input5columns.csv and input7columns.csv. All of the columns of 5 are in 7, but 7 has them in a different order (col2 is at zero-based ordinal position 2 in the first file and 6 in the second). I negated all the values in 7 to make it readily apparent which file is being operated on.
col1,col3,col2,col5,col4
1,3,2,5,4
1111,3333,2222,5555,4444
11,33,22,55,44
111,333,222,555,444
and
col1,col3,col7,col5,col4,col6,col2
-1111,-3333,-7777,-5555,-4444,-6666,-2222
-111,-333,-777,-555,-444,-666,-222
-1,-3,-7,-5,-4,-6,-2
-11,-33,-77,-55,-44,-666,-222
Running the package results in these two screen shots
What's missing
I don't know of a way to tell the query based approach that it's OK if a column doesn't exist. If there's a unique key, I suppose you could define your query to have only the columns that must be there and then perform lookups against the file to try and obtain the columns that ought to be there and not fail the lookup if the column doesn't exist. Pretty kludgey though.
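For comparison, a script-based reader can treat a missing column as empty instead of failing. The sketch below does this in plain Python rather than in an SSIS Script Component, so it is a different technique from the query approach above and only illustrates the idea; the column list and function name are assumptions for the example.

```python
import csv
import io

WANTED_COLUMNS = ["col1", "col2", "col3", "col4", "col5"]


def read_by_name(text, wanted=WANTED_COLUMNS):
    """Return rows in a fixed column order, whatever order the file uses."""
    reader = csv.DictReader(io.StringIO(text))
    # row.get(name) yields None when the file has no such column.
    return [[row.get(name) for name in wanted] for row in reader]
```

Extra columns (such as col6 and col7 in the second sample file) are ignored, reordered columns are picked up by name, and absent columns come back as None for the caller to handle.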
Our solution: we use parent/child packages. In the parent package we take the individual client files and transform them into our standard-format files, then call the child package to process the standard import using the file we created. This only works if the client is consistent in what they send, though; if they try to change their format from what they agreed to send us, we return the file.

SSIS - Load flat files, save file names to SQL Table

I have a complex task that I need to complete. It worked well before since there was only one file, but this is now changing. Each file has one long row that is first bulk inserted into a staging table. From there I'm supposed to save the file name into another table and then insert the broken-up parts of the staging table data. This is not the problem. We might have just one file or multiple files to load at once. What needs to happen is this:
The first SSIS task is a script task that does some checks. The second task prepares the file list.
The staging table is truncated.
The third task is currently a Foreach loop container task that uses the files from the file list and processes it:
File is loaded into table using Bulk Insert task.
The file name needs to be passed as a variable to the next process. This was done with a C# task before but it is now a bit more complex since there could be more than one file and each file name needs to be saved separately.
The last task is a SQL task that executes a stored procedure with the file name as input variable.
My problem is that before it was only one file. This was easy enough. What would the best way be to go about it now?
In the Data Flow Task that imports your file, create a Derived Column. Populate it with the value of the variable that holds the file name, and load the file name into the same table.
Use an Execute SQL Task to retrieve the distinct list of file names into a recordset (Object type variable).
Use a Foreach Loop Container to loop through the recordset. Place your code inside the container. The code will receive the file name from the loop as the value of a variable and process the file.
Use an Execute SQL Task in the Foreach Loop Container to call the SP. Pass the file name as a parameter like:
Exec sp_MyCode param1, param2, ?
Where ? will pass filename INPUT as a string
EDIT
To make the Flat File Connection pick up the file specified by a variable, use the ConnectionString property of the Flat File Connection:
Select the Flat File Connection, right-click and select Properties.
Click the empty field for Expressions and then click the ellipsis that appears. With Expressions you can define every property of the object listed there using variables. Many objects in SSIS can have Expressions specified.
Add an Expression, select the ConnectionString property and define an expression with the absolute path to the file (just to be on the safe side; it can be a UNC path too).
All the above can be accomplished using C# code in the script task itself. You can loop through all the files one by one and for each file :
1. Bulk Copy the data to the staging
2. Insert the filename to the other table
You can modify the logic as per your requirement and desired execution flow.
Add a column to your staging table - FileName
Capture the file name in an SSIS variable (using expressions), then run something like this on each loop iteration:
UPDATE StagingTable SET FileName=? WHERE FileName IS NULL
Why are you messing about with C#? From your description it's totally unnecessary.
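The effect of running that UPDATE once per loop iteration can be tried out with a small Python and SQLite sketch. The table layout (a RawLine column next to FileName), the sample file names and the helper names are assumptions for the demo; the inserts stand in for the Bulk Insert task.

```python
import sqlite3


def make_staging_db():
    """Create an in-memory stand-in for the staging table."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE StagingTable (RawLine TEXT, FileName TEXT)")
    return connection


def load_files(connection, files):
    """files maps a file name to its raw lines; returns the tagged rows."""
    for file_name, lines in files.items():
        # Stand-in for Bulk Insert: new rows arrive with FileName left NULL.
        connection.executemany(
            "INSERT INTO StagingTable (RawLine) VALUES (?)",
            [(line,) for line in lines],
        )
        # Tag only the rows loaded in this iteration.
        connection.execute(
            "UPDATE StagingTable SET FileName = ? WHERE FileName IS NULL",
            (file_name,),
        )
    return connection.execute(
        "SELECT FileName, RawLine FROM StagingTable ORDER BY rowid"
    ).fetchall()
```

Because the update runs before the next file is loaded, rows from earlier files already have a name and are left alone, so each row ends up tagged with the file it came from.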