I'm quite new to SSIS - using the 2008 version.
I have a job that uses a few data flow tasks. On the third one I'm getting a primary key violation on the last row that it needs to insert, but only sometimes!
I'd like to ignore this problem for now and let the job continue. I have set the MaximumErrorCount property to 10 for the Data Flow Task, the Sequence Container and for the Package, but the task still fails and this causes the package to stop.
Could anyone please advise how I can get the package to ignore the error?
Thanks
Rob.
That error count refers to the number of tasks that SSIS will allow to error before it stops the package. You want to allow a set number of rows to error - and that's not what that property is counting.
Instead, you should go into your Destination and configure the Error Output on that destination to either ignore errors, or redirect errors (better). You can then pull a red arrow off the bottom of the destination component to a Derived Column (or any other type of component that doesn't need to attach its output to anything), and put a Data Viewer on that red link. Now all the rows that fail will go to the Derived Column, and show up in a Data Viewer for you to see (while in BIDS).
The other thing you'll have to do is change the batch size on the OLE DB Destination (if that's what you're using) to 1 so that it only inserts one row at a time - with fast load, this is the "Maximum insert commit size" property. Otherwise, it will fail the whole batch that contains the error...
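Since the failure is a primary key violation, it can also help to identify the offending key before the load. A quick sketch, assuming you can land the file in a staging table first - the table and column names here are hypothetical:

```sql
-- Key values duplicated within the staged file itself:
SELECT KeyCol, COUNT(*) AS Occurrences
FROM dbo.StagingTable
GROUP BY KeyCol
HAVING COUNT(*) > 1;

-- Key values that already exist in the destination table:
SELECT s.KeyCol
FROM dbo.StagingTable AS s
JOIN dbo.DestinationTable AS d ON d.KeyCol = s.KeyCol;
```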
I have an SSIS project with 3 SSIS packages: one is a parent package which calls the other 2 packages based on some condition. In the parent package I have a Foreach Loop container which reads multiple .csv files from a location; based on the file name, one of the two child packages is executed and the data is uploaded into tables in MS SQL Server 2008. Since multiple files are read, if any file generates an error in the child packages, I have to log the error details (like the filename, error message, row number etc.) in a custom database table, delete all the records that got uploaded into the table, and read the next file. The package should not stop for files which are valid and don't generate any errors when they are read.
Say a file has 100 rows and there is a problem at row number 50: we need to log the error details in a table, delete rows 1 to 49 which got uploaded into the database table, and have the package start executing the next file.
How can I achieve this in SSIS?
You will have to set TransactionOption=*Required* on your Foreach Loop container and TransactionOption=*Supported* on the control flow items within it. This will allow your transactions to be rolled back if any complications happen in your child packages. More information on the 'TransactionOption' property can be found at http://msdn.microsoft.com/en-us/library/ms137690.aspx
Custom logging can be performed within the child packages by redirecting the error output of your destination to your preferred error destination. However, this redirection logging only occurs on insertion errors. So if you wish to catch errors that occur anywhere in your child package, you will have to set up an 'OnError' event handler or utilize the built-in error logging for SSIS (SSIS -> Logging..)
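For the custom logging piece, a minimal sketch of the kind of table and statement involved - all names here are assumptions, and the ? markers are the 0-based parameters you map in the OnError event handler's Execute SQL task:

```sql
-- Hypothetical error log table for the per-file details.
CREATE TABLE dbo.PackageErrorLog (
    ErrorLogID   INT IDENTITY(1,1) PRIMARY KEY,
    PackageName  NVARCHAR(260)  NOT NULL,
    FileName     NVARCHAR(260)  NULL,
    ErrorMessage NVARCHAR(4000) NULL,
    LoggedAt     DATETIME       NOT NULL DEFAULT (GETDATE())
);

-- Statement for an Execute SQL task in the OnError event handler.
-- Map parameter 0 to System::PackageName, parameter 1 to a user variable
-- holding the current file name, and parameter 2 to System::ErrorDescription.
INSERT INTO dbo.PackageErrorLog (PackageName, FileName, ErrorMessage)
VALUES (?, ?, ?);
```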
I suggest you try creating two dataflows in your loop container. The main idea here is to have a set of three tables (temp, error log and final) so that error situations are easier to handle. In the same flow you do the following:
1st dataflow:
It should read the .csv file and load the data into a temp table. If the file is processed with errors, you simply truncate the temp table. In addition, you should also configure the flat file source output to redirect error rows to an error log table.
2nd dataflow:
If, on the other hand, the file is processed error-free, you transfer the rows from the temp table into the destination table. So here the OLE DB source is the temp table and the OLE DB destination is the final table.
DonĀ“t forget to truncate the temp table in both cases, as the next file will need an empty table.
Let's break this down a bit.
I assume that you have a data flow that processes an individual file at a time. The data flow would read the input file via a source connection, transform it and then load the data into the destination. You would basically need to implement the Error Handler flow in your transformations by choosing "Redirect Row". Details on the Error Flow are available here: https://learn.microsoft.com/en-us/sql/integration-services/data-flow/error-handling-in-data.
If you need to skip an entire file due to a bad format, you will need to implement a Precedence Constraint for failure on the file system task.
My suggestion would be to get a copy of the exam preparation book for exam 70-463 - it has great practice examples on exactly the kind of scenarios that you have run into.
We do something similar with Excel files.
We have an ErrorsFound variable which is reset each time a new file is read within the for each loop.
A script component validates each row of the data and sets the ErrorsFound variable to true if an error is found, and builds up a string containing any error details.
Then - based on the ErrorsFound variable - either the data is imported or the error is recorded in a log table.
It gets a bit more tricky when the Excel files are filled in badly enough that the process can't read them at all - for example, when text is entered in a date, number or currency field. In this case we use the OnError event handler of the Data Flow task to record an error in the log, but we won't know which row(s) caused the problem.
I have a bit of a problem. When I set up an SSIS package and fire it off, it shows me the number of rows going into the SQL table, but when I query the table there are almost 40,000 rows missing compared to the last count after the conditional split that I have in the package.
What causes this problem? Even if I load into a normal table or view it still does the same thing. I have to use the fast load option here as there are a lot of source files being loaded. This is only testing before sending it to production and I am stuck at the moment. Is there a way I can work around this problem and get all the data that is supposed to be pumped into the table? Please also note that the conditional split removes any NULL values, as seen in the first picture.
Check the Error Output (under Connection Manager and Mappings) within the destination component. If the error setting is set to Ignore Failure or Redirect Row, the component will succeed, but only the successful rows will be inserted.
What is the data source? Try checking your data and make sure you don't have any terminators stored in one of the rows.
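If you suspect stray terminators, a query along these lines will surface them - the table and column names are hypothetical:

```sql
-- Find rows whose text column contains an embedded carriage return,
-- line feed or tab, any of which can throw off a downstream parse.
SELECT *
FROM dbo.SourceTable
WHERE SomeTextColumn LIKE '%' + CHAR(13) + '%'
   OR SomeTextColumn LIKE '%' + CHAR(10) + '%'
   OR SomeTextColumn LIKE '%' + CHAR(9)  + '%';
```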
I have SSIS packages that extract fact tables into staging tables. I have a control table which contains the last extract date for each table, so the package extracts rows where the date > the control table date. The problem I have is that I want to redirect error rows to an error file in the data flow task of the package. If I do that the package will not fail (so I can't roll back) and some rows might actually go through, which, if I continue with the process, will ultimately reach my fact table. Now, next time I run the package: if I had updated the control table, I will miss the rows which had errors; if I had not updated the control table date, I will re-extract the rows which went through. What is the best practice for this?
How about adding a Row Count transformation onto the error branch? It sounds like you are using the transaction option in SSIS, so put the Data Flow in a Sequence Container and, after the Data Flow, evaluate the value of your row count variable. If it's greater than zero, roll back/abort processing.
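The control table update can then be made conditional on a clean run. A sketch, assuming a hypothetical control table and that this Execute SQL task only sits on the success path (where the row count variable is zero):

```sql
-- Advance the watermark only after an error-free load, so errored rows
-- are re-extracted on the next run. Names and parameters are assumptions:
-- map parameter 0 to the extraction start time captured before the run,
-- and parameter 1 to the name of the table just loaded.
UPDATE dbo.ExtractControl
SET LastExtractDate = ?
WHERE TableName = ?;
```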
I am running an SSIS package from a SQL Server 2008 job. The package crashes at some point while running. I have created my own mechanism to grab the error and record it in a table, so I can see that there is an error with a specific task, but I cannot find out what the error is.
When I run the same package from BIDS, it works perfectly - no error.
What I want to do is write the error string that is shown in the "Execution Results" tab to my own table.
So the question is which system variable holds the error string in SSIS.
The error is stored in the ErrorDescription system variable. See Handling Errors in the Data Flow for an example of how to get the error description.
Also, if you want to capture error information into a table, SSIS supports logging to a table using the SQL Server Log Provider. You can also customize the logging.
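With the SQL Server log provider enabled, SSIS 2008 writes its log rows to dbo.sysssislog in the database the log's connection manager points at, and the error text can be pulled back out like this:

```sql
-- OnError events recorded by the SQL Server log provider (SSIS 2008).
SELECT event, source, starttime, message
FROM dbo.sysssislog
WHERE event = 'OnError'
ORDER BY starttime DESC;
```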
Too easy.
Left-Click (highlight) on the object you want to capture the error event (Script, or Data Flow, etc.)
Click on 'Event Handlers' - screen should open with Executable = object you clicked and Event Handler = OnError
Click URL (click here to create....)
Drag Execute SQL object from SSIS Toolbox
Configure to the database/table you want to house the error message
Write: INSERT INTO DB.Schema.Table (DBName, SchemaName, TableName, ErrorMessage, DateAdded)
Write: VALUES (?, ?, ?, 'I am smart', GETDATE())
Click Parameters and select the User:: variables for the ?'s (plus my hard-coded comment).
Since this is run at the database server, it will pass in the ?'s. My SAC is already at the database as a value, but you will have selected System::ErrorDescription as parameter 3. Remember, this array is 0-based. DO NOT TRY TO NAME THE PARAMETERS - instead, number them 0 to n. The datatypes are based on what you have going in; mine are all VARCHAR so... :)
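Putting those steps together, the table and the Execute SQL statement might look like the following sketch - the names are placeholders for wherever you house the messages:

```sql
-- Placeholder error table (stands in for DB.Schema.Table above).
CREATE TABLE dbo.ErrorLog (
    DBName       SYSNAME        NULL,
    SchemaName   SYSNAME        NULL,
    TableName    SYSNAME        NULL,
    ErrorMessage NVARCHAR(4000) NULL,
    DateAdded    DATETIME       NOT NULL DEFAULT (GETDATE())
);

-- SQLStatement for the Execute SQL task in the OnError handler.
-- Parameters are mapped by ordinal, 0-based:
--   0 -> User::DBName, 1 -> User::SchemaName, 2 -> User::TableName,
--   3 -> System::ErrorDescription
INSERT INTO dbo.ErrorLog (DBName, SchemaName, TableName, ErrorMessage, DateAdded)
VALUES (?, ?, ?, ?, GETDATE());
```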
This is a much better solution than just logging whatever the server allows you to.
I can also add a counter variable and adjust it wherever I like, then pass it to the OnError event. This allows me to pinpoint exactly where the last successful object completed; it works best in scripting objects but is also available in other areas.
I'm using this so I can process thousands of cycles without actually failing the package. If a table doesn't exist or a column doesn't exist, I simply log it for further review later. Oh yeah, I'm cycling through hundreds of databases, capturing their architecture and the maximum column size actually used - not to be confused with the declared maximum column size.
Example: TelephoneNumber comes from a source column of char(500) (definitely bad design, but you can't change everything, so...). I capture the max length of that column and adjust the destination column to accommodate that size +/- a certain percentage.
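A sketch of that length capture - the source table name is hypothetical, and LEN conveniently ignores the trailing spaces a char(500) column pads with:

```sql
-- Longest value actually stored, as opposed to the declared char(500) width;
-- the destination column is then sized to this plus a safety margin.
SELECT MAX(LEN(TelephoneNumber)) AS MaxUsedLength
FROM dbo.SourceCustomer;
```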
If a table doesn't exist or a column doesn't exist anymore I log the error and keep churning. At the end, I can evaluate those entries and see if I can actually remove them from my warehouse. This happens more in the TEST and STAGE environments than in PROD. However, when a change goes through to PROD I most definitely will identify it as it's coming in to the warehouse.
Everything is configured - this includes dynamic MERGE/JOINs, INSERT, SELECT, ELEMENTS, SIZES, USAGESIZE, IDENTITY, SOURCEORDER, etc., with conversion of data to the destination datatypes.
All that because the systemic version of logging will not provide the granularity you might need for this type of operation. This OnError event handler can, if set up properly.
Check this out! He explains, with a step-by-step process, how to configure SSIS logging to include the error message parameter.
I'm using SSIS on SQL Server 2008. I have a data flow with a lookup component whose "no matching entries" option is set to "Fail component". I'm looking at the log of a previous execution of the package and I can see the following error message from the data flow:
Row yielded no match during lookup.
Later error messages indicate this is from my lookup component. However, after that I can see an information message (from the same data flow and the same execution) saying that the destination component wrote several thousand rows:
"component "OLE_DST ..." (578)" wrote 9924 rows.
An execution on another environment resulted in the same "Row yielded no match during lookup" error but then wrote zero rows to the destination.
The SSIS package is exactly the same in both environments. The data was slightly different but had the same characteristics - source rows, a small number with no matching lookup entry.
Is this behaviour allowed? Can the data flow begin writing an arbitrary number of rows before a lookup fails and then stop writing rows?
Tom,
Yeah, this behaviour is plausible. However I think (best to check this) it can be affected by FastLoadMaxInsertCommitSize, because that property determines how many rows are inserted before being committed. Batches committed before the lookup error stops the data flow will stay in the destination unless the whole thing runs inside a transaction, which could explain why one environment wrote 9924 rows and another wrote none.
Read more: Default value for OLE DB Destination FastLoadMaxInsertCommitSize in SQL Server 2008
cheers
JT