I'm trying to create an SSIS package that imports Excel files into a database.
Because this has to run regularly, and the file names follow a convention but are never identical (and likewise the sheet/tab names vary), the SSIS package is set up as follows:
Main Container
-> First For Each container (call it FE1): obtains the file names (assigns each to a variable)
   -> Second For Each container (call it FE2): obtains the worksheet name and starts the import process.
What I have done is create a "failure" precedence constraint from FE2 to a file system task process in FE1.
The idea is that the file move is done if the import is unsuccessful for whatever reason.
(once it works I'd like to create a "success" process that moves the file to the archive folder)
The file system task works when there is only one For Each container (i.e. not nested the way it is now), but it fails when everything is inside the nested containers, citing "file in use". I'm assuming this is because the first For Each container is locking the file, which is why I moved the file system task into the first For Each container and used a precedence constraint.
Any help and advice much appreciated.
For the benefit of anyone else who may have the same problem:
For love nor money could I get the Excel connector to release the files, even when the move-file task was outside of the loop.
In the end I recorded the processed files in a table in the database and then executed a second package containing the move-file task; it iterates through the table rows listing the successfully imported (and failed) files and moves each one to its destination based on the success/failure flag.
It was the only way I could get this to work.
When I had to iterate through the worksheets of each Excel file (i.e. effectively two Excel connections), SSIS would not release the files, so I kept getting the "file in use" error and the package failed.
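For anyone trying something similar, here is a minimal sketch of the logging step, not the exact package: it assumes package variables User::CurrentFile, User::ImportSucceeded and User::DbConnectionString, and a dbo.FileImportLog table, all of which are placeholders.

```
using System.Data.SqlClient;

// Hypothetical Script Task body inside the loop: record each processed file
// and whether its import succeeded, so a second package can move the files later.
// The variables used here must be listed in the task's ReadOnlyVariables.
public void Main()
{
    string filePath = Dts.Variables["User::CurrentFile"].Value.ToString();
    bool succeeded = (bool)Dts.Variables["User::ImportSucceeded"].Value;
    string connString = Dts.Variables["User::DbConnectionString"].Value.ToString();

    using (var conn = new SqlConnection(connString))
    using (var cmd = new SqlCommand(
        "INSERT INTO dbo.FileImportLog (FilePath, Succeeded) VALUES (@path, @ok)", conn))
    {
        cmd.Parameters.AddWithValue("@path", filePath);
        cmd.Parameters.AddWithValue("@ok", succeeded);
        conn.Open();
        cmd.ExecuteNonQuery();
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}
```

The second package then just reads that table (for example with an Execute SQL Task feeding a Foreach Loop) and moves each file to the archive or failure folder based on the flag.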
Related
I have an SSIS package that looks in a folder, loops through each file inside, imports the file to SQL Server, executes a SQL task, then needs to delete the file before looping to the next one. The import and looping work without the delete-file step (a File System Task), but the delete errors with the message "Failed to lock variable. The variable cannot be found." The variable in question is the one I created for "Current File". It is used by the first part of the Foreach Loop container to look up the current file and import it successfully.
What I think is happening: importing the file locks the variable, so when the package goes to delete the file based on that variable, it can't access the variable because it is locked, and it fails. Any idea how to allow it to import the file based on the "Current File" variable, then delete it based on that same variable, then loop through the rest of the files?
Taking the delete-file step out of the Foreach Loop is not an option, because I need to delete each file right after it imports; if I delete everything after all the imports, I might delete files that arrived in the directory after the import started, so I'd be deleting non-imported files, I think. Thanks for any help!
A workaround to this issue would be to create a new object variable. In your current loop you can populate it with the full path and file name of each file that is imported. Then create a second "For Each Loop Container" control task that is enumerated on that object variable. Then within this new For Each Loop Container you can delete only those files (which were imported earlier) defined in the object variable. This way you do not delete any of the other files in the folder that you do not wish to touch.
The following link may help provide some idea as to how to set up the object variable in a For Each Loop Container and then parse out one file name at a time for deletion. Get List of Files
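If it helps, a minimal sketch of the Script Task inside the first loop that populates the object variable could look like this. The variable names User::ImportedFiles and User::CurrentFile are placeholders, not anything from your package.

```
using System.Collections;

// Hypothetical Script Task body run after each successful import:
// append the current file's full path to an Object variable so a second
// Foreach Loop (Foreach From Variable Enumerator) can delete just those files.
// User::ImportedFiles must be in the task's ReadWriteVariables list.
public void Main()
{
    ArrayList imported = Dts.Variables["User::ImportedFiles"].Value as ArrayList;
    if (imported == null)
    {
        imported = new ArrayList();
    }

    imported.Add(Dts.Variables["User::CurrentFile"].Value.ToString());
    Dts.Variables["User::ImportedFiles"].Value = imported;

    Dts.TaskResult = (int)ScriptResults.Success;
}
```

The second Foreach Loop then uses the Foreach From Variable Enumerator over User::ImportedFiles and maps each item to a string variable that the File System Task uses for the delete.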
Let me know if this helps or if you need additional details.
Using Microsoft Visual Studio Community 2015.
Goal of project
- Create a "*\temp\email" directory
- Start a program to extract all emails with xls attachments into the previously created folder
- Use a Foreach Loop to cycle through each file in the folder, process it, and load it into a SQL table.
The problem I am running into is caused either by a blank Excel document (which is occasionally sent from a remote location) or by some of the original xls reports containing only 5 columns instead of the 6 I have mapped. Is there any way to separate the files that include the correct columns from those that do not match?
** As long as neither of these two problems occurs, the SSIS package runs without issue.
Control flow:
File System Task (creates directory) ---> Execute Process Task (xls extraction) ---> Foreach Loop (Data Flow Task "email2Sql")
Data flow:
Excel Source (ExcelFilePath set by an expression to @[User::FilePath]), DelayValidation == True
(Columns are initially set to F1-F6 and are mapped to, for example, a, b, c, d, e, f. The older files that get mixed in only include a, b, c, d, e.) This is where I want to be able to separate the xls files.
Conditional Split transformation (column names are not in row 1; this helps remove "null" values)
OLE DB Destination (SQL table)
Sorry for the amount of reading, but for a first post I tried to include anything I thought might be relevant.
There are some tools out there which would allow you to open the excel doc and read it. However, I think the simplest thing to do would be to use SSIS out of the box:
1 - Add a File System Task after the data flow that reads the file.
2 - Make the precedence constraint from the data flow to the File System Task "failure". This will cause it to fire only when the data flow task fails.
3 - Set the File System Task to move the "bad" files to another folder.
This will allow you to loop through all the files and move the failed ones. Ultimately, the package will end in failure. If you don't want that behavior you can change the ForceExecutionResult property to be success. However, it might be good to know that there were problems with some files so that they can be addressed.
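If you would rather go the first route and inspect each workbook before the data flow touches it, a Script Task along these lines is one option. This is only a sketch: the ACE OLE DB provider must be installed, and the variable names and the 6-column threshold are assumptions.

```
using System.Data;
using System.Data.OleDb;

// Hypothetical Script Task body: open the workbook named in User::FilePath,
// look at the first worksheet, and flag whether it has at least 6 columns.
// Both variables must be listed on the task (read-only / read-write as appropriate).
public void Main()
{
    string filePath = Dts.Variables["User::FilePath"].Value.ToString();
    string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath +
                     ";Extended Properties=\"Excel 12.0;HDR=NO;IMEX=1\"";

    using (var conn = new OleDbConnection(connStr))
    {
        conn.Open();

        // First worksheet/named range in the workbook
        DataTable sheets = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        string sheetName = sheets.Rows[0]["TABLE_NAME"].ToString();

        var dt = new DataTable();
        using (var adapter = new OleDbDataAdapter("SELECT TOP 1 * FROM [" + sheetName + "]", conn))
        {
            adapter.Fill(dt);
        }

        // A precedence constraint expression can test this boolean to route good vs. bad files
        Dts.Variables["User::HasAllColumns"].Value = dt.Columns.Count >= 6;
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}
```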
I have the following use case:
I am trying to move the files in one folder to another folder. If any file is corrupt then whole process should be rolled back and no files should be moved.
To achieve this I am using one Data Flow Task and one File System Task: the data flow checks the integrity of the file, and the File System Task then moves it. These two tasks are inside a Foreach Loop container.
The TransactionOption property of the Foreach Loop is set to Required, and for the two tasks inside it I am keeping it as Supported.
Issue: there are 6 files in the folder to be moved, and file #4 is corrupt. I want the whole process to roll back when the corrupt file is detected. However, this is not happening, and the files up to file #3 still get moved.
Screen shot attached.
I have an SSIS package with several data flow tasks. Each one imports a flat file into a table in my DB. I have created a connection manager for each underlying flat file. The package works just fine if all of the files exist. However, even if one of the files is missing, the entire package fails. I don't want this behavior. For whatever files that exist, I want my package to import them. For those that don't exist, I want SSIS to simply ignore them. At least one of the files will always exist. How do I achieve this behavior? I have seen some solutions that involve either scripts or file control tasks, but I'm not sure which is appropriate for my situation.
My solution is:
1. Make a Script Task to check whether the file exists:
SSIS Script task to check if file exists in folder or not
2. Set ValidateExternalMetadata to False in the source properties.
3. Link the Script Task to the next step with a precedence constraint that uses both the constraint and an expression on a variable the script sets, so the branch only runs if the file exists.
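In case the linked question is unclear, the Script Task in step 1 can be as small as the sketch below, assuming a User::FilePath string variable and a User::FileExists boolean (both names are placeholders).

```
using System.IO;

// Hypothetical Script Task body: set a boolean package variable that the
// precedence constraint in step 3 can test in its expression.
// User::FilePath goes in ReadOnlyVariables, User::FileExists in ReadWriteVariables.
public void Main()
{
    string filePath = Dts.Variables["User::FilePath"].Value.ToString();
    Dts.Variables["User::FileExists"].Value = File.Exists(filePath);

    Dts.TaskResult = (int)ScriptResults.Success;
}
```

The precedence constraint then uses the "Expression and Constraint" evaluation operation with an expression such as @[User::FileExists] == TRUE.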
I have a package that needs to check if a file exists in a folder and if the file does exist then take a branch that will import the file to SQL Server and execute some stored procedures to process it. If the file does not exist then just end the current run of the package without error. I have all parts working just fine except for the file detection and branching depending on the results. (In other words currently it just runs as if the file is there and does the rest). I know how to use a script task to detect for the file and return an error if not found - I need to know how to make the main package just end without error in that case or go on and do the import and the rest of the processing if the file was found.
You could use a Foreach Loop container in the Control flow tab. Loop through a folder for a given pattern (say *.csv). Set the flat file connection manager to use the filepath obtained from the For each loop container as the connection string.
In this setup, the data flow task within the For each loop container will execute only if a file is found. Otherwise, it will end the process silently without any errors.
Here are a few other SO questions where I have provided examples of looping through files using the Foreach Loop container:
Creating an Expression for an Object Variable?
How can I load a large flat file into a database table using SSIS?
Hope that gives you an idea.