How to capture SSIS source & destination error records into one file

My goal is to write error records from both source and destination into one file.
Currently, I'm encountering this warning:
Warning: The process cannot access the file because it is being used by another process.
PS: So far, while exploring this feature, I have only been able to write these error records into two separate files.

Instead of defining two flat file destinations, define just one. Feed both error outputs into a Union All transformation (the error output columns typically match in a simple source-to-destination flow, so they union cleanly) and connect the Union All to the single flat file destination. With only one destination holding the file open, the "process cannot access the file" warning disappears.

Related

ETL pipeline error: Error 0xC0202091 at Data Flow Task, Flat File Source [2]: An error occurred while skipping data rows

I get an error when trying to load my flat file (exported from Excel) into SQL Server using an ETL pipeline.
My flat file has 2 rows (1 and 2) with some data that I do not want to ingest into SQL Server. The data, including headers, starts at row 3.
Somehow I can't get it to skip rows 1 and 2 and upload the file starting at row 3.
I tried uploading the data the way I usually set up the pipeline.
Steps in the pipeline:
Flat File Source reading the Excel-exported file.
Within this step I set up the connection manager to skip the first 2 rows so the upload is correct.
OLE DB Destination step to upload the file into SQL Server.
All fields are mapped correctly, as in other pipelines where they work the correct way.
After the above steps I received the following error:
Error 0xC0202091 at Data Flow Task, Flat File Source [2]: An error occurred while skipping data rows.
So I need the first 2 rows skipped so that the data starts at row 3, as indicated in the attached table:
(Screenshot: sample data.)
How is it possible to get this pipeline working?

How to load multiple files into multiple tables using an SSIS Foreach Loop

I am a beginner with SSIS. I have to load multiple files into multiple destinations using a Foreach Loop in SSIS. The problem is that the Foreach Loop picks up the file names dynamically, but the destination table does not change: it keeps pointing to the very first table. That means the job loads the first file correctly, but for the second file it still points at the first table's columns, which I guess is what causes the error.
These are the variables I have used:
FileName
FolderPath
TargetTable
(Screenshots: first table source data in the flat file; second table source data in the flat file; the Control Flow; the Data Flow Task; and the error message.)

SSIS data validation

I have a JSON file that comes with around 125 columns, and I need to load it into a DB table. I'm using an SSIS package, and after dumping all the JSON file contents into a DB DUMP table, I need to validate the data, load only the valid data into the MASTER table, and send the rest to a failure table. The failure table has 250 columns, with an ERROR column for each data column. If the first column fails validation, I need to write the error message to the corresponding error column and continue with the validation of the second column. Is there some utility in SSIS that helps in achieving this requirement?
I've tried using a Conditional Split, but it appears that it doesn't fit the bill.
Thanks,
Vijay
I agree with Alleman's suggestion of getting this done via a stored procedure. In terms of implementation, there are various ways you can go about it. I am listing one here.
In the database, you can create some ten stored procedures as follows:
dbo.usp_ValidateData_Columns1_To_Columns25
dbo.usp_ValidateData_Columns26_To_Columns50
....
....
dbo.usp_ValidateData_Columns226_To_Columns250
In each of these procedures you can validate your data in bulk across that range of columns. If validation fails, you insert into the respective error columns.
Once you have this in place, you can call all of the above procedures in parallel as part of your SSIS package.
After that, you would need one more Data Flow Task (DFT) to pick up all the records that are good to be transferred to MASTER.
Basically you are modularizing the whole setup.
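As a rough sketch, one of those procedures could look like the following. The dump table name, the column names, and the validation rules here are assumptions for illustration, not from the original question:

    -- Minimal sketch of one bulk-validation procedure.
    -- dbo.DumpTable, Col1/Col2 and Error_Col1/Error_Col2 are assumed names.
    CREATE PROCEDURE dbo.usp_ValidateData_Columns1_To_Columns25
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Example rule: Col1 must be an integer.
        UPDATE dbo.DumpTable
        SET    Error_Col1 = 'Col1 is not a valid integer'
        WHERE  Col1 IS NOT NULL
           AND TRY_CAST(Col1 AS INT) IS NULL;

        -- Example rule: Col2 must be a valid date.
        UPDATE dbo.DumpTable
        SET    Error_Col2 = 'Col2 is not a valid date'
        WHERE  Col2 IS NOT NULL
           AND TRY_CONVERT(DATE, Col2) IS NULL;

        -- ...repeat for the remaining columns in this slice.
    END;

The final DFT can then pick the rows destined for MASTER by filtering on all error columns being NULL.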

Logging errors in SSIS

I have an SSIS project with 3 packages: one parent package which calls the other two based on some condition. In the parent package I have a Foreach Loop container that reads multiple .csv files from a location; based on the file name, one of the two child packages is executed and the data is uploaded into tables in MS SQL Server 2008. Since multiple files are read, if any file generates an error in a child package, I have to log the error details (file name, error message, row number, etc.) in a custom database table, delete all the records that got uploaded for that file, and move on to the next file; the package should not stop for files that are valid and load without error.
Say a file has 100 rows and there is a problem at row 50: then we need to log the error details in a table, delete rows 1 to 49 which already got uploaded to the database table, and have the package start executing the next file.
How can I achieve this in SSIS?
You will have to set TransactionOption=Required on your Foreach Loop container and TransactionOption=Supported on the control-flow items within it. This will allow your transactions to be rolled back if any complications happen in your child packages. More information on the TransactionOption property can be found at http://msdn.microsoft.com/en-us/library/ms137690.aspx
Custom logging can be performed within the child packages by redirecting the error output of your destination to your preferred error destination. However, this redirection logging only occurs on insertion errors, so if you wish to catch errors that occur anywhere in your child package, you will have to set up an OnError event handler or use the built-in SSIS logging (SSIS -> Logging...).
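As a minimal sketch of such custom logging (the log table, its columns, and the User::FileName variable are assumptions), an Execute SQL Task inside the OnError event handler could run something like this:

    -- Custom error-log table; the name and columns are assumed.
    CREATE TABLE dbo.PackageErrorLog (
        LogId        INT IDENTITY(1,1) PRIMARY KEY,
        PackageName  NVARCHAR(255) NOT NULL,
        FileName     NVARCHAR(255) NULL,
        ErrorMessage NVARCHAR(MAX) NULL,
        LoggedAt     DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME()
    );

    -- SQLStatement of the Execute SQL Task; on the task's Parameter
    -- Mapping page, map the ? placeholders to System::PackageName,
    -- User::FileName and System::ErrorDescription.
    INSERT INTO dbo.PackageErrorLog (PackageName, FileName, ErrorMessage)
    VALUES (?, ?, ?);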
I suggest you create two data flows in your loop container. The main idea here is to have a set of three tables to handle the error situations better and more easily. In the same flow you do the following:
1st dataflow:
It should read the .csv file and load the data into a temp table. If the file is processed with errors, you simply truncate the temp table. In addition, you should configure the flat file source's error output to redirect errors to an error-log table.
2nd dataflow:
On the other hand, if processing is error-free, you transfer the rows from the temp table into the destination table. So here the OLE DB source is the temp table and the OLE DB destination is the final table.
Don't forget to truncate the temp table in both cases, as the next file will need an empty table.
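Expressed as plain T-SQL (for example in an Execute SQL Task), the hand-off could look like the sketch below; dbo.TempTable, dbo.FinalTable, and the column list are assumed names:

    -- Runs only when the file was processed without errors.
    BEGIN TRY
        BEGIN TRANSACTION;

        -- Move the staged rows into the final table.
        INSERT INTO dbo.FinalTable (Col1, Col2)
        SELECT Col1, Col2
        FROM   dbo.TempTable;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        -- Re-raise so the package sees the failure
        -- (SQL Server 2008 has no THROW).
        RAISERROR('Transfer from TempTable to FinalTable failed.', 16, 1);
    END CATCH;

    -- In both the success and the error path, leave the staging table
    -- empty for the next file.
    TRUNCATE TABLE dbo.TempTable;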
Let's break this down a bit.
I assume that you have a data flow that processes an individual file at a time: it reads the input file via a source connection, transforms it, and then loads the data into the destination. You would basically need to configure the error output of your transformations by choosing "Redirect Row". Details on the error flow are available here: https://learn.microsoft.com/en-us/sql/integration-services/data-flow/error-handling-in-data.
If you need to skip an entire file due to a bad format, you will need to implement a Precedence Constraint for failure on the file system task.
My suggestion would be to get a copy of the exam preparation book for exam 70-463 - it has great practice examples on exactly the kind of scenarios that you have run into.
We do something similar with Excel files.
We have an ErrorsFound variable which is reset each time a new file is read within the for each loop.
A script component validates each row of the data and sets the ErrorsFound variable to true if an error is found, and builds up a string containing any error details.
Then - based on the ErrorsFound variable - either the data is imported or the error is recorded in a log table.
It gets a bit trickier when the Excel files are filled in badly enough that the process cannot read them at all, for example when text is entered in a date, number, or currency field. In this case we use the OnError event handler of the Data Flow Task to record an error in the log, but we won't know which row(s) caused the problem.

Why are all the records not being copied from CSV to a SQL table in an SSIS package?

I am trying to copy data from a flat file to a SQL table using SSIS.
I have a Data Flow Task where I have created a Flat File Source pointing to the csv file and an OLE DB Destination pointing to the table I want the data in.
The problem I am facing is that when I run the package, only 2,621 rows are copied to the SQL destination table, while the csv has about 170,000 records. I am not sure why this is happening.
Thanks in advance.
This could be a number of things. This is what comes to mind:
The connection string of your flat file connection manager is overwritten by a variable expression or a package configuration. Check SSIS -> Package Configurations or the Expressions property on your connection manager.
The DataRowsToSkip property on your flat file connection manager is set to a non-zero value.
The metadata definition of your flat file is incorrectly configured in your connection manager. See properties such as Format, Row delimiter, and Column delimiter. Use the preview function to check the output.
The error output on your flat file source is set to Ignore failure, meaning that rows which SSIS cannot process (due to, e.g., incompatible data types) are dropped without warning.