How to get row count from flat file destination? - ssis

I am redirecting error rows from an OLE DB Destination to a Flat File Destination. I need the count of error rows that are redirected to the flat file, and if count(error rows) > 50 my SSIS package should fail.
The data loaded into the table should also be rolled back if count(error rows) > 50.
How can I achieve this?

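Use the Row Count transformation in the data flow and attach it to the error output (the red path) of the OLE DB Destination, ahead of the Flat File Destination. Configure it to write to a variable, since you'll be using the row count in an expression (create an Int32 variable called User::RowCount). Note that the variable is only populated once the last row has passed through the transformation, so it cannot be used by a Conditional Split in the same data flow. Instead, evaluate the condition @[User::RowCount] > 50 after the Data Flow Task, for example in a precedence constraint expression leading to a task that fails the package.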
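The "count the rejects, then fail and roll back above a threshold" logic can be sketched outside SSIS. The following is a minimal Python illustration using the standard-library sqlite3 module, not SSIS itself; the table name target, the helper load_with_error_limit, and its max_errors parameter are assumptions made up for the example. Rows are inserted inside one transaction, rejected rows are collected (these are the ones that would go to the flat file), and the whole load is rolled back if the reject count exceeds the limit.

```python
import sqlite3


def load_with_error_limit(conn, rows, max_errors=50):
    """Insert rows in one transaction; roll back if too many are rejected.

    Hypothetical helper for illustration. Returns (committed, error_rows).
    """
    error_rows = []
    cur = conn.cursor()
    for row in rows:
        try:
            cur.execute("INSERT INTO target (id, name) VALUES (?, ?)", row)
        except sqlite3.IntegrityError:
            # Rough equivalent of redirecting the row to the error output.
            error_rows.append(row)
    if len(error_rows) > max_errors:
        conn.rollback()  # undo every insert made by this load
        return False, error_rows
    conn.commit()
    return True, error_rows


# Assumed example table: the primary key and NOT NULL produce the rejects.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.commit()
```

In a package, the rollback half would need the load to run inside a transaction, for instance TransactionOption = Required on the enclosing container, or explicit BEGIN TRAN / ROLLBACK statements in Execute SQL Tasks with RetainSameConnection set on the connection manager. Which of these fits depends on the environment.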

Related

How do you get flat file name and perform a row count from multiple flat files with different columns in SSIS?

I'm trying to get all the file names from a folder directory along with their row counts. (Also file size in bytes if possible) I am using Microsoft Visual Studio 2010 Shell. Here's what I've done so far:
I have created a Foreach Loop Container, set the Enumerator to Foreach File Enumerator, and used Expressions to point the folder at a variable holding the folder I want to loop over. I left the Files section as *.* and chose to retrieve Name Only. In Variable Mappings I added a new variable called FullFilePath: Container is Package, Value type is String, and Value is blank.
I then added a Data Flow to the loop, containing a Flat File Source, a Row Count, and an OLE DB Destination. I changed the Flat File Source properties expression to the same folder variable used in the Foreach Loop Container expression. I added the variable RecordCount (Int32, value 0) to the Row Count transformation. The OLE DB Destination creates a new table with the name OLE DB Destination.
The next step is an Execute SQL Task that does an INSERT INTO DBO.FileData (FileName,RowCount) VALUES (?,?). I set 2 parameter mappings: 1) the variable from the Foreach Loop Container, FullFilePath, with data type VarChar; 2) the variable from Row Count, RecordCount, with data type Long.
I then have another Execute SQL Task that drops the table created by the data flow task. The problem is that even with all these steps the package still does not complete. It gets hung up and fails in the pre-execute phase. It says:
Warning: Access is denied. Error: Cannot open the datafile 'FullFilePath' Error: Flat File Source failed the pre-execute phase and returned error code 0xC020200E.
Anything you see I could be doing wrong? Let me know if pictures would help.
So I finally figured this out. In order to loop over all of the files with varying headers and column counts, I unselected the "File contains headers" option in the Flat File Source. This gave all the files the same first column, which by default is named Column 0 (the first column in all of my files is some sort of numeric field or ID). I was able to map this through the Row Count and insert it into a SQL table. Then I was able to finish the Foreach Loop and write the file name and row count into another SQL table to record the counts. It is, however, taking a really long time: it has been running for over 14 hours and has only counted through 13 files. Granted, some files have 250K+ rows, but I wouldn't think it would take this long.
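Since only the file name, row count, and size are wanted, the counts do not depend on the column layout at all. As a rough sketch of that idea in plain Python (not SSIS; the function name file_stats is made up for illustration), each file can be read in binary chunks and its newline bytes counted, with no column parsing and no intermediate table. Note that this counts physical lines, so a header line is included in the count.

```python
import os


def file_stats(folder):
    """Return [(file_name, row_count, size_bytes)] for each file in folder."""
    with os.scandir(folder) as it:
        entries = sorted(it, key=lambda e: e.name)
    stats = []
    for entry in entries:
        if not entry.is_file():
            continue
        row_count = 0
        last = b""
        # Read in 1 MiB binary chunks and count newline bytes only.
        with open(entry.path, "rb") as fh:
            while chunk := fh.read(1024 * 1024):
                row_count += chunk.count(b"\n")
                last = chunk[-1:]
        # A final line without a trailing newline still counts as a row.
        if last and last != b"\n":
            row_count += 1
        stats.append((entry.name, row_count, entry.stat().st_size))
    return stats
```

The same loop could plausibly live in a Script Task inside the Foreach Loop, writing one row per file to the FileData table; that placement is an assumption, not something the original post describes.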

Passing a variable as Parameter

I have a scenario where I need to pass a column from a text file or Excel file as a parameter to my SQL query in an SSIS package.
My text or Excel file has a column called Policy_no containing more than 1000 policy numbers (e.g. 12358685). I have a SQL script: select * from main_table where policy_no = ?. The '?' has to come from my package variable (populated from the text or Excel file).
Instead of manually writing a script for each and every policy, how can we achieve this through SSIS?
Thanks
Assuming you want to loop through each row in your file and run the query against each individual value, you can use a Data Flow task to read your text file and load the policy numbers in an ADO Recordset (declared as a package variable). Next, you'd use a Foreach Loop Container to iterate through the recordset, loading each policy number in turn into a second variable and then executing your query and doing whatever other work is needed.
See Use a Recordset Destination in MSDN for an overview and example.
You can use an Execute SQL Task (connecting to Excel with an OLE DB connection) to get the "Policy_no" data from Excel, then store the result in a variable, say policyNoGroup, whose data type should be Object. Then use a Foreach Loop to loop through the variable policyNoGroup; see the example: http://www.codeproject.com/Articles/14341/Using-the-Foreach-ADO-Enumerator-in-SSIS
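Both answers describe the same pattern: read the policy numbers into a collection, then run the parameterized query once per value. A minimal Python sketch of that pattern follows, using the standard csv and sqlite3 modules rather than SSIS. The function names and the holder column are hypothetical; only main_table, policy_no, and Policy_no come from the question.

```python
import csv
import io
import sqlite3


def policies_from_csv(text, column="Policy_no"):
    """Read one column of a delimited file into a list (the 'recordset')."""
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] for row in reader]


def fetch_for_policies(conn, policy_numbers):
    """Run the parameterized query once per policy number (the 'loop')."""
    results = []
    for policy_no in policy_numbers:
        cur = conn.execute(
            "SELECT policy_no, holder FROM main_table WHERE policy_no = ?",
            (policy_no,),  # bound to the ? placeholder
        )
        results.extend(cur.fetchall())
    return results
```

With around a thousand values, loading the policy numbers into a staging table and joining it to main_table in a single query may be a simpler alternative to looping.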

SSIS - Load flat files, save file names to SQL Table

I have a complex task that I need to complete. It worked well before since there was only one file, but this is now changing. Each file has one long row that is first bulk inserted into a staging table. From there I'm supposed to save the file name into another table and then insert the broken-up parts of the staging table data. This is not the problem. We might have just one file or multiple files to load at once. What needs to happen is this:
The first SSIS task is a script task that does some checks. The second task prepares the file list.
The staging table is truncated.
The third task is currently a Foreach loop container task that uses the files from the file list and processes it:
File is loaded into table using Bulk Insert task.
The file name needs to be passed as a variable to the next process. This was done with a C# task before but it is now a bit more complex since there could be more than one file and each file name needs to be saved separately.
The last task is a SQL task that executes a stored procedure with the file name as input variable.
My problem is that before it was only one file. This was easy enough. What would the best way be to go about it now?
In the Data Flow Task which imports your file, create a Derived Column. Populate it with the value of the variable that holds the file name. Load the file name into the same table.
Use an Execute SQL Task to retrieve a distinct list of file names into a recordset (Object type variable).
Use a Foreach Loop Container to loop through the recordset. Place your code inside the container. The code will receive the file name from the loop as the value of a variable and process the file.
Use Execute SQL task in For Each Loop container to call SP. Pass filename as a parameter like:
Exec sp_MyCode param1, param2, ?
Where ? will pass filename INPUT as a string
EDIT
To make the Flat File Connection pick up the file specified by a variable, use the Connection String property of the Flat File Connection:
Select the Flat File Connection, right click and select Properties.
Click on the empty field for Expressions and then click the ellipsis that appears. With Expressions you can define every property of the object listed there using variables. Many objects in SSIS can have Expressions specified.
Add an Expression, select the Connection String property and define an expression with the absolute path to the file (to be on the safe side; it can be a UNC path too).
All the above can be accomplished using C# code in the script task itself. You can loop through all the files one by one and for each file :
1. Bulk Copy the data to the staging
2. Insert the filename to the other table
You can modify the logic as per your requirement and desired execution flow.
Add a column to your staging table - FileName
Capture the filename in a SSIS Variable (using expressions) then run something like this each loop:
UPDATE StagingTable SET FileName=? WHERE FileName IS NULL
Why are you messing about with C#? From your description it's totally unnecessary.
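The per-file flow suggested in the last answer (load the file, stamp the new rows with the file name, record the file name) can be sketched as follows. This is a Python illustration with sqlite3, not SSIS. StagingTable and FileName come from the answer, while the RawLine column, the LoadedFiles table, and the load_files function are assumptions added for the example.

```python
import sqlite3


def load_files(conn, files):
    """Load each file's single long row, then tag it with its file name.

    files: iterable of (file_name, line) pairs standing in for the loop
    over the file list.
    """
    for file_name, line in files:
        # Stand-in for the Bulk Insert task.
        conn.execute("INSERT INTO StagingTable (RawLine) VALUES (?)", (line,))
        # Only the rows just loaded still have a NULL FileName.
        conn.execute(
            "UPDATE StagingTable SET FileName = ? WHERE FileName IS NULL",
            (file_name,),
        )
        # Save each file name separately in the second table.
        conn.execute("INSERT INTO LoadedFiles (FileName) VALUES (?)", (file_name,))
    conn.commit()
```

The UPDATE only works as intended if it runs after every single file load; if two files were loaded before the update ran, both would receive the second file name.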

ssis - capturing the bad rows

Hi, can you help me figure this out? Is there a way to get the row in which an error occurred in SSIS? I have a flat file with some 10k+ records which is being read via a Flat File Source.
Right now the error output defaults to error-column, error-code, and 'flatfilesourceerroroutputcolumn', and I use a script component to handle it. But none of these three inputs (to the script component) are user-friendly enough. So I want to get an output like the first column value (this is a unique identifier) of the row in which the error occurred. How can I add that?
While debugging this in SSIS, you can add a Data Viewer on the path to where your script handles error. This path has all the columns of the original row where your error is.
If you want to handle your SSIS errors and also do something else with it, you can direct the error output from your flat file source to a Multicast and then send one stream down to a file, a table, or something else ( a Recordset destination and a subsequent foreach loop on the object used to store the Recordset will let you do stuff on a row-by-row basis on the errored row(s)).
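If the error output delivers the failed row as a single text column holding the raw line (an assumption about how 'flatfilesourceerroroutputcolumn' arrives in the script), the identifier can be recovered by splitting that text on the file's delimiter. A minimal Python sketch of the parsing step, with a hypothetical function name and a comma assumed as the delimiter:

```python
def error_row_key(raw_line, delimiter=","):
    """Return the first delimited field of a raw error row.

    Surrounding whitespace and double quotes are stripped. This simple
    split assumes the identifier itself never contains the delimiter.
    """
    first_field = raw_line.split(delimiter, 1)[0]
    return first_field.strip().strip('"')
```

Inside an actual script component the raw column may arrive as a byte blob rather than a string, in which case it would first need decoding with the file's encoding.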

Erroneous row numbers in an SSIS task

I am importing a text file into SQL server table which has got number of constraints. I have created one package and associated tasks.
At the end of an SSIS package execution, I want to know the erroneous row numbers which were not successfully loaded into the DB. Is there any direct API or variable available in the Dts namespace that gives this information?
Kindly share with me any knowledge to get this information.
Thanks,
Rahul
The error (red line) output of your import step inside the data flow lets you redirect to an error table. This should list the information you are after.
http://msdn.microsoft.com/en-us/library/ms140083.aspx
Error Outputs ( http://msdn.microsoft.com/en-us/library/ms140080.aspx )
Sources, destinations, and transformations can include error outputs. You can specify how the data flow component responds to errors in each input or column by using the Configure Error Output dialog box. If an error or data truncation occurs at run time and the data flow component is configured to redirect rows, the data rows with the error are sent to the error output. By default, an error output contains the output columns and two error columns: ErrorCode and ErrorColumn. The output columns contain the data from the row that failed, ErrorCode provides the error code, and ErrorColumn identifies the failing column.
For more information, see Handling Errors in the Data Flow.
Redirect the error rows on the destination component, pipe them through a count operation and then log that to a log table or whatever.