Ssis empty excel columns causing error - ssis

Using Microsoft Visual Studio Community 2015.
Goal of project
-create "*\temp\email" directory
-start program to extract all emails that include xls attachments to the previously created folder
-use for each loop to cycle through each file in the folder, process, and shift to sql table.
The problem I am running into is caused by either a blank excel document (which is occasionally sent from a remote location) or some of the original xls reports only contain 5 columns instead of 6 that I have mapped now. Is there any way to separate files that include the correct columns from those that do not match?
** as Long as these two problems do not exist I can run the ssis package and everything runs without issue.
Control flow;
File System Task (creates directory --->Execute Process Task (xls extraction)-->ForEach Loop(Data flow Task "email2Sql")
Data Flow;
Excel Source (uses expression ExcelFilePath,#user:filepath) delay validation ==true
(columns are initially set to f1-f6 and are mapped to for ex. a,b,c,d,e,f. The Older files that get mixed in only include a,b,c,d,e.) This is where I want to be able to separate the xls files
Conditional Transformation split (column names are not in row 1, this helps remove "null" values)
Ole Db destination (sql table)
Sorry for the amount of reading, but for the first post I tried to include anything that I thought may be relevant.

There are some tools out there which would allow you to open the excel doc and read it. However, I think the simplest thing to do would be to use SSIS out of the box:
1 - add a file system task after the data flow which reads the file.
2 - Make the precedence constraint from the data flow to the file system task "failure." This will cause that to only fire when the data flow task fails.
3 - set the file task to move the "bad" files to another folder
This will allow you to loop through all the files and move the failed ones. Ultimately, the package will end in failure. If you don't want that behavior you can change the ForceExecutionResult property to be success. However, it might be good to know that there were problems with some files so that they can be addressed.
m

Related

SSIS Execute File Task In Parent Container If Child Container Process fails

I'm trying to create a package that imports excel files into a database using SSIS.
As the operation has to perform this regularly and the file names follow a convention but are not the same, and equally the sheet/tab names are not always the same, the SSIS package is set up as follows:
Main Container
->
First For Each container (call it FE1)
Obtains filenames (assigns to a variable)
->
Second For Each Container (call it FE2)
Obtains worksheet name and starts the process to import.
What I have done is create a "failure" precedence constraint from FE2 to a file system task process in FE1.
The idea is that the file move is done if the import is unsuccessful for whatever reason.
(once it works I'd like to create a "success" process that moves the file to the archive folder)
The file task process works when there is only one "for each container" (i.e. not nested the way it is now) but it fails when all the processes are in the nested container citing "file in use". I'm assuming this is because the first for each container is locking the file, hence why I moved the file task process to the first for each container and used a precedent control.
Any help and advice much appreciated.
For the benefit of anyone else who may have the same problem:
for love nor money could I get the excel connector to release the files, even when the move file task was outside of the loop.
In the end I recorded the files that were moved into a table in the DB and then executed a second package that contained the move file task and would iterate through the table rows with the list of successfully imported (and failed import) files and moved them to their destination based on fail/success flag.
It was the only way I got to successfully make this happen.
When having to iterate through the worksheet of each excel file (i.e. two excel connections effectively) SSIS would not release the files so I forever received the file in use error and failure.

How to do File System Task in SSIS depending on Result of Data Flow

I'm writing a (what I thought to be a) simple SSIS package to import data from a CSV file into a SQL table.
On the Control Flow task I have a Data Flow Task. In that Data Flow Task I have
a Flat File Source "step",
followed by a Data Conversion "step",
followed by a OLE DB destination "step".
What I want to do is to move the source CSV file to a "Completed" folder or to a "Failed" folder based on the results of the Data Flow Task.
I see that I can't add a File System step inside the Data Flow Task, but I have to do it in the Control Flow tab.
My question is how do I do a simple thing like assign a value to a variable (I saw how to create variable and assign them a value at the bottom pane of Data Tools (2012)) depending of if the "step" succeeds or fails?
Thanks!
(You can tell by my question that I'm an SSIS rookie - and don't assume I can write a C# script, please)
I have used VB or C# scripts to accomplish this myself. Since you do not want to use scripts I would recommend using a different path for the project to flow. Have your success path lead to moving the file to completed and failure path lead to moving the file to failed. This keeps it simple and accomplishes what you are looking for.

The destination component does not have any available inputs for use in creating a path

I'm working with legacy tsql code that outputs to txt files.
For security reasons, I'm replacing these outputs with SSIS packages.
I've gotten most of them to work, but one particular one gives me the following error:
TITLE: Microsoft Visual Studio
Cannot create connector.
The destination component does not have any available inputs for use in creating a path.
The data flow itself is very simple. OLE DB Source runs an SQL command, then outputs to a flatfile source that points to an existing txtfile that was created by the TSQL.
Anyone know what the error means in regards to the available inputs?
Your SSIS toolbox is divided into 3 general groupings (pre 2012/2014)
Sources
Transformations
Destinations
A source has 1 to N output paths. Nothing can feed into a source. Things can only consume what a Source emits.
A Transformation does not generate* rows, it accepts rows from an upstream provider (either a Source or another Transformation). A Transformation has 1 to N output paths.
A Destination is the terminus for data. I'm not aware of any destinations that accept more than one input. It has one optional output path, Error.
Your problem, therefore, is that you are trying to route data into a Source. Change that to a Flat File Destination.
Replace the Flat File Source with a Flat File Destination task. Click on the RowCount task and drag the green arrow to the new destination.

how to prevent SSIS package failure when no file(s) exist to import

I have an SSIS package with several data flow tasks. Each one imports a flat file into a table in my DB. I have created a connection manager for each underlying flat file. The package works just fine if all of the files exist. However, even if one of the files is missing, the entire package fails. I don't want this behavior. For whatever files that exist, I want my package to import them. For those that don't exist, I want SSIS to simply ignore them. At least one of the files will always exist. How do I achieve this behavior? I have seen some solutions that involve either scripts or file control tasks, but I'm not sure which is appropriate for my situation.
my solution is
1. make a Script Task for checking the path file:
SSIS Script task to check if file exists in folder or not
2. ValidateExternalMetadata set to False in the source properties
3. link the Script Task with next step if skip and create a Constrain and Variables connection with if the file exist

How do I load Excel files selectively?

I have an SSIS package that needs to lookup two different types of excel files, type A and type B and load the data within to two different staging tables, tableA and tableB. The formats of these excel sheets are different and they match their respective tables.
I have thought of putting typeA.xls and typeB.xls in two different folders for simplicity(folder paths to be configureable). The required excel files will then be put here through some other application or manually.
What I want is to be able to have my dtsx package to scan the folder and pick the latest unprocessed file and load it ignoring others and then postfix the file name with '-loaded' (typeAxxxxxx-loaded.xls). The "-loaded" in the filename is how I plan to differentiate between the already loaded files and the ones yet to be loaded.
I need advice on:
a) How to check that configured folder for the latest file ie. without the '-loaded' in the filename and load it? ..and then after loading it, rename the same file in that configured folder with the '-loaded' postfixed.
b) Is this the best approach to doing this or is there a better way?
Thanks.
You can do it this way, but it might require several complex string expressions.
E.g. create a ForEach loop over .xls files, inside the loop add an empty script task, then a data flow to load this file. Connect them with a precedence constraint and make it conditional: precedence constraint expression will the check if file name does not end with -loaded.xls. You may either do it in script task or purely using SSIS expression on precedence constraint. Finally, add File System Task to rename the file. You may need to build new file name with another expression.
It might be easier to create two folders: Incoming for new unprocessed files, and Loaded for the files you've processed, and just move the .xls to this folder after processing without renaming. This will avoid the first conditional expression (and dummy script task), and simplify the configuration of File System task.
You can get the SQL File watcher Task and add it to your SSIS. I think this is a cleaner way to do what you want.
SQL File Watcher