Process Multiple Batches in Flat File SSIS

I have a flat file which contains multiple batches.
I want to read the file into a db table but maintain some reference to the batch each line belongs to.
My thought is to append to each detail row the date/timestamp in the header row from the batch to which each row belongs.
What I have done is read the file into an in-memory recordset and then use a Foreach Loop container to examine and process each line.
I am stuck on the following:
1. examine each line, determine if it is a header or not
2. append batch header information to each line.
Thanks

I found this sample script on MSDN which worked perfectly. If you are coding in C#, initialize your counter variable outside of the function; otherwise it will be reinitialized every time the function is called.
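The header-tracking idea can be sketched outside SSIS as follows. This is a minimal Python sketch, not the MSDN script mentioned above. It assumes, purely for illustration, that header lines start with `H|` followed by the batch timestamp and that every other non-empty line is a detail row. The current batch stamp is held in a variable that outlives each row, which mirrors the advice about not reinitializing state on every call.

```python
from typing import Iterable, Iterator, Optional, Tuple

HEADER_PREFIX = "H|"  # assumption: the real file's header marker may differ


def tag_rows_with_batch(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (batch_stamp, detail_line) for every detail line."""
    current_stamp: Optional[str] = None  # survives from one row to the next
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(HEADER_PREFIX):
            # A new batch starts: remember its timestamp, emit nothing.
            current_stamp = line[len(HEADER_PREFIX):]
            continue
        if current_stamp is None:
            raise ValueError("detail row found before any batch header")
        yield current_stamp, line


# Hypothetical sample data with two batches.
sample = [
    "H|2011-10-22 08:00:00",
    "D|1001|widget",
    "D|1002|gadget",
    "H|2011-10-23 09:30:00",
    "D|2001|sprocket",
]
tagged = list(tag_rows_with_batch(sample))
```

In an SSIS Script Component the same shape would apply: the stamp would be a class-level field, and the per-row method would either update it (header) or copy it into an output column (detail).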

Related

Jitterbit: target CSV-file created with only header although "do not create empty files" is checked

In Jitterbit Dataloader 10.37 I want to create CSV-files from Salesforce data but only if the query returns data.
I checked "do not create empty files" on the target type local file but it is still creating a csv just with the header but with no data. I do not want files created with no data in it. It is not an option to not have the header at all in the files - I will need it when there is data from the query.
Any suggestions? What am I missing?
I've seen this happen in situations where the write operation is after a couple of other operations. In that instance a header is written in the first operation, then another header is written in a second operation. The first row is read as the header, the second row (another header) is read as data, and written out.
I always add in a condition where I check if one of the fields equals its name. Something like this, to just skip those rows.
<trans>
If(Id == "Id",
  false,
  true
);
</trans>
The best way to do this is to send your output to a variable array, then check the variable to see if data is present. So set your target to a global variable, then add a script after that target and do your validation. While testing the script, use DEBUGBREAK(); to pause and look at the variable's content. That way you can see what is going into it.
Then make your condition statement.
if(Length($variable) > 1, RunOperation("operation:myexport"), "novalue");
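Both suggestions above come down to the same check: ignore rows that are only a repeated header, and run the export only when real data remains. The following is a rough Python sketch of that logic, not Jitterbit code. The function name `write_csv_if_data` and the field names are hypothetical.

```python
import csv
import os
import tempfile
from typing import Dict, List


def write_csv_if_data(path: str, fieldnames: List[str], rows: List[Dict[str, str]]) -> bool:
    """Write a CSV with a header only when at least one real data row exists."""
    # Drop rows that are just a repeated header (every field equals its own name).
    data = [r for r in rows if any(r.get(f) != f for f in fieldnames)]
    if not data:
        return False  # nothing to export, so no file is created
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(data)
    return True


# Demo in a throwaway directory: one query returns only a stray header row,
# the other returns one real record.
out_dir = tempfile.mkdtemp()
fields = ["Id", "Name"]
empty_path = os.path.join(out_dir, "empty.csv")
full_path = os.path.join(out_dir, "full.csv")
wrote_empty = write_csv_if_data(empty_path, fields, [{"Id": "Id", "Name": "Name"}])
wrote_full = write_csv_if_data(full_path, fields, [{"Id": "001", "Name": "Acme"}])
```

The header is still written whenever data exists, which matches the requirement in the question.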

SSIS 2008 - ForEach Loop to look at specific group/list of files

I've been searching the internet for an answer to what I thought would be a straightforward question. Hope you guys can help.
I am using a Foreach Loop to look for specific files and move them with a File System Task to a different folder.
Say I have 10 CSV files, named A to J.
I only want to move A, E and J, but I can't seem to get the Foreach Loop to look for that group.
In the enumerator's Files text box I have tried entering the 3 file names split by various separators, but SSIS treats the whole string as one file name and none of the 3 get moved.
Can someone advise how it can be done? Just to confirm, I don't want to use wildcard logic, just a group of specific file names, similar to the IN operator in a SQL query.
Thanks in advance
Now added an image - please advise how to select 3 specific files in the text box marked with the arrow.
Since OP isn't able to proceed with just my comment, I'll explain a bit more in detail -
Use an EXECUTE SQL TASK to dump the names of the files needed into an SSIS object (to do this you could use a stored procedure or an SQL query). Create an object-type variable in the variables tab prior, change the output in the EXECUTE SQL TASK to Full Result Set and map the result to the object you just created. Now this object holds the list of files you need to loop through.
Now drag-and-drop a ForEach container from the SSIS toolbox. It should be configured as a ForEach ADO Enumerator and map the object to it. Create another variable of type string that will hold the file names after each iteration of the ForEach container. Map this also in the Variables tab of the ForEach container.
Now, place the File System Task which you would use to move these files into the ForEach loop. Use the file-name-variable you created to move just the required files.
Now if you're not sure what SQL query to use for your case in step 1 to get the 3 file names -
SELECT 'A.csv'
UNION
SELECT 'E.csv'
UNION
SELECT 'J.csv'
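The effect of the steps above (loop over an explicit list of names rather than a wildcard) can be illustrated with a small Python sketch. It is not SSIS code. The helper name `move_named_files` and the temporary folders are assumptions made for the demo, and the case-insensitive comparison is a design choice, not something the answer specifies.

```python
import os
import shutil
import tempfile
from typing import Iterable, List


def move_named_files(source_dir: str, target_dir: str, wanted: Iterable[str]) -> List[str]:
    """Move only the listed file names, comparing names case-insensitively."""
    wanted_lower = {name.lower() for name in wanted}
    os.makedirs(target_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(source_dir)):
        if name.lower() in wanted_lower:
            shutil.move(os.path.join(source_dir, name), os.path.join(target_dir, name))
            moved.append(name)
    return moved


# Demo with ten throwaway files, A.csv to J.csv.
src = tempfile.mkdtemp()
dst = os.path.join(tempfile.mkdtemp(), "moved")
for letter in "ABCDEFGHIJ":
    open(os.path.join(src, letter + ".csv"), "w").close()
moved = move_named_files(src, dst, ["a.csv", "e.csv", "j.csv"])
```

The list passed as `wanted` plays the role of the result set returned by the Execute SQL Task.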

SSIS - Load flat files, save file names to SQL Table

I have a complex task that I need to complete. It worked well before since there was only one file but this is now changing. Each file has one long row that is first bulk inserted into a staging table. From here I'm supposed to save the file name into another table and then insert the broken up parts of the staging table data. This is not the problem. We might have just one file or even multiple files to load at once. What needs to happen is this:
The first SSIS task is a script task that does some checks. The second task prepares the file list.
The staging table is truncated.
The third task is currently a Foreach loop container task that uses the files from the file list and processes it:
File is loaded into table using Bulk Insert task.
The file name needs to be passed as a variable to the next process. This was done with a C# task before but it is now a bit more complex since there could be more than one file and each file name needs to be saved separately.
The last task is a SQL task that executes a stored procedure with the file name as input variable.
My problem is that before it was only one file. This was easy enough. What would the best way be to go about it now?
In the Data Flow Task which imports your file, create a derived column. Populate it with the system variable value of filename. Load filename into the same table.
Use an Execute SQL task to retrieve a distinct list of filenames into a recordset (Object type variable).
Use a For Each Loop container to loop through the recordset. Place your code inside the container. The code will receive the filename from the loop as the value of a variable and process the file.
Use Execute SQL task in For Each Loop container to call SP. Pass filename as a parameter like:
Exec sp_MyCode param1, param2, ?
Where ? will pass filename INPUT as a string
EDIT
To make Flat File Connection to pick up the file specified by a variable - use Connection String property of the Flat File Connection
Select FF Connection, right click and select Properties
Click on empty field for Expressions and then click ellipsis that appears. With Expressions you can define every property of the object listed there using variables. Many objects in SSIS can have Expressions specified.
Add an Expression, select Connection String Property and define an expression with absolute path to the file (just to be on a safe side, it can be a UNC path too).
All the above can be accomplished using C# code in the script task itself. You can loop through all the files one by one and for each file :
1. Bulk Copy the data to the staging
2. Insert the filename to the other table
You can modify the logic as per your requirement and desired execution flow.
Add a column to your staging table - FileName
Capture the filename in an SSIS variable (using expressions), then run something like this on each loop iteration:
UPDATE StagingTable SET FileName=? WHERE FileName IS NULL
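To show why the `WHERE FileName IS NULL` filter stamps only the rows loaded in the current iteration, here is a small Python sketch using an in-memory SQLite database as a stand-in for SQL Server. The `RawLine` column, the file names and the sample rows are hypothetical. It also assumes the staging table is not truncated between files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StagingTable (RawLine TEXT, FileName TEXT)")

# Stand-in for the files the Foreach loop would enumerate (hypothetical names).
files = {"batch_1.txt": ["AAA111"], "batch_2.txt": ["BBB222", "CCC333"]}

for file_name, lines in files.items():
    # Step 1: load the file's rows; FileName is still NULL at this point.
    conn.executemany(
        "INSERT INTO StagingTable (RawLine) VALUES (?)",
        [(line,) for line in lines],
    )
    # Step 2: stamp only the rows that were just loaded.
    conn.execute(
        "UPDATE StagingTable SET FileName = ? WHERE FileName IS NULL",
        (file_name,),
    )

rows = conn.execute(
    "SELECT RawLine, FileName FROM StagingTable ORDER BY RawLine"
).fetchall()
```

Rows from earlier files already carry a file name, so each UPDATE leaves them alone.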
Why are you messing about with C#? From your description it's totally unnecessary.

how to create a SSIS package which creates three text files, using same variables but the textfile is only created when the correct data is found?

There are only 3 files that can be created: "File_1", "File_2" and "File_3". The same variable names are used in each instance (User::FileDirectory and User::File_name), but because the actual value of the variable changes, a new file is created. However, the files are only created if there is data to go into them, i.e. if there are no records to populate a file, it will not be created at all. When a file is created, the date it was created should also be added to the filename, e.g. File1_22102011.txt
Ok if the above was a little confusing, the following is how it works,
All the files use the same variable, but it is reset before each file is created.
• So it populates a result set in memory with the first sql selection (ID number, First_Name and Main_Name). It sets the file variable to “File_1”. If there are records in the result set, it creates and writes to this filename.
• Then it creates a new result set with the second selection(Contract No). It sets the variable to "File_2". If there are records in this new result set, a new file will be created from the variable(which now has a new value)
• Finally a third result set is created (Contract_no, ExperianNo, Entity_ID_Number, First_Name, Main_Name), and the file variable is set to "File_3". Again if there are records in the result set, then this file will be created and written to.
I have worked on a few methods to achieve this but they have all failed, so a little help will be greatly appreciated.
While what you have works, I think it'd be rather painful to maintain.
I would approach it as 3 sequence containers running in parallel. Each container would have a data flow and two file tasks hanging off it based on success of the parent and the value of row count variable. If the row count variable is 0, delete the file. If it's greater than 0, rename it to File_n
As you can see, I have a container for the first file. The data flow creates an output a.txt file. Based on the value of the variable #RowCount1, it will either delete the empty file or rename it to File_1.
Each data flow would look like a source query, a row count transformation and a file destination with a temporary name (a.txt, b.txt, c.txt). As a file is always created, even if it's empty, we will need to delete or rename it afterwards which will be accomplished based on the file operation tasks.
In my opinion, this approach will be cleaner as it will allow you to test and debug each item in a cleaner manner rather than dealing with an in-memory dataset.
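The delete-or-rename step that follows each data flow can be sketched in Python as below. This only illustrates the decision. In the package it would be two File System Tasks behind precedence constraints on the row count variable. The function name `finalize_output` and the `File_n_DDMMYYYY.txt` naming are assumptions based on the example in the question.

```python
import os
import tempfile
from datetime import date
from typing import Optional


def finalize_output(temp_path: str, row_count: int, base_name: str, run_date: date) -> Optional[str]:
    """Delete the temp file when no rows were written, else rename it with a date suffix."""
    if row_count == 0:
        os.remove(temp_path)
        return None
    final_name = f"{base_name}_{run_date.strftime('%d%m%Y')}.txt"
    final_path = os.path.join(os.path.dirname(temp_path), final_name)
    os.replace(temp_path, final_path)
    return final_path


# Demo: a.txt received one row, b.txt stayed empty.
work_dir = tempfile.mkdtemp()
temp_a = os.path.join(work_dir, "a.txt")
temp_b = os.path.join(work_dir, "b.txt")
with open(temp_a, "w") as handle:
    handle.write("1,John,Smith\n")
open(temp_b, "w").close()

run_date = date(2011, 10, 22)
kept = finalize_output(temp_a, 1, "File_1", run_date)
dropped = finalize_output(temp_b, 0, "File_2", run_date)
```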

SSIS - read a single header record from a flat file or an excel file prior to processing

Is there a method by which one can read just the first record of a file, i.e., to read header information so that a decision can be made whether or not to process the remainder of the file?
I know that with the split transformation component one can write an expression that will ignore all of the rows besides the header based on a key word in the header. I would rather not go that route as that is inefficiently reading every record in the file.
Specifically, is there script component logic that I can implement to close the file
and end the dataflow after the first record has been read?
See this post from Todd McDermid:
Basically, you would set up a Foreach Container to loop over the files in your directory. Inside the Foreach, you would determine the "file type" - perhaps by creating a variable with a long-winded expression on it that pulls apart your file name and assumes a "file type" value - then passes control on to one of five Data Flows via conditional connectors. (Double-click on the standard green connector, change its Evaluation Operation to Expression and Constraint, and set the expression to be "file_type_variable = ".) Then each Data Flow picks apart one "file type".
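For the narrower question of reading only the first record before deciding whether to run the data flow, one option is to do the check before the data flow starts (for example in a Script Task) rather than inside it. The following is a minimal Python sketch of that check, not SSIS code. The function names, the keyword test and the sample file contents are hypothetical.

```python
import os
import tempfile
from typing import Optional


def read_first_record(path: str) -> Optional[str]:
    """Return the first line only; the rest of the file is never parsed."""
    with open(path, "r", newline="") as handle:
        first = handle.readline()  # stops at the first line break
    return first.rstrip("\r\n") if first else None


def should_process(path: str, required_keyword: str) -> bool:
    """Decide from the header record alone whether the file is worth loading."""
    header = read_first_record(path)
    return header is not None and required_keyword in header


# Demo with a throwaway file whose header names its content.
demo_path = os.path.join(tempfile.mkdtemp(), "orders.txt")
with open(demo_path, "w") as handle:
    handle.write("HDR|ORDERS|2011-10-22\nrow1\nrow2\n")

process_orders = should_process(demo_path, "ORDERS")
process_invoices = should_process(demo_path, "INVOICES")
```

The boolean result would be stored in a package variable and used in a precedence constraint, in the same way the quoted post routes files by "file type".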