I am working in BIDS. I have an SSIS package that loops over all files in a given directory and imports all of them to a table. So, my control flow has a Foreach Loop Container with a Data Flow Task inside of it. The Data Flow Task goes straight from a Flat File Source to an OLE DB Destination. Because the directory can change, I have made the Foreach Directory dynamic (i.e., I gave it a package variable, which I can change when the directory changes). The Files field has something like stuff*.pip, such that all files in the directory that match this pattern get imported. The Variable Mappings tab has a file name variable, which is the same variable used in the Expressions of the flat file connection manager.
When I originally set up the flat file connection manager, I had to point it to an already existing file to pick up the file's metadata. I then changed the file connection manager to dynamic (i.e., I added a file name variable in the Expressions property) such that the Foreach Loop Container will pick up all files (that match the pattern).
Since then, the metadata on the file has changed. So, when I open the solution file, it throws a warning on the Flat File Source in the Data Flow Task. To resolve this, I have to temporarily change the flat file connection back to a hard-coded path such that I can change the metadata on the new file. I then make it dynamic again (i.e., use the file name variable in the Expressions, which overrides the hard-coded file path). (Finally, I double-click on the Flat File Source in the Data Flow Task to automatically update the metadata in the Flat File Source.)
Is this the correct way to update the metadata on a flat file? It seems a bit clunky to me.
You can prevent this error from occurring by editing the properties on the Flat File Manager and setting ValidateExternalMetaData=False.
You may also need to set DelayValidation=True on the Data Flow Task that contains the Flat File Manager.
This will prevent the warning - but if the file's metadata changes for some reason (columns change, data types change) and you do not remember to update the metadata then you will get a runtime error.
Related
how to load multiple Excel files using SSIS. I have had Packages in the past where I was looping through multiple Text files in a folder and loading them into SQL server tables.
Create a variable named FileName, set the scope to ImportMultipleExcelFiles, Data type is String.
Add a Foreach Loop Container in the Control Flow Task.
Edit the Foreach Loop Container, in the Collection section change the Enumerator value to Foreach File Enumerator
You need to change the Enumerator configuration as shown below :
Folder: Provide a complete folder path location where all our Excel
source files are stored.
Files: you need to read the Excel files from our source folder, so
enter *.xls in the Files section, this will make sure our SSIS
package will read all available .xls files from the source folder.
Here * indicates that the Excel file name can be anything, but file
extension will be .xls. If we need to read data from a specific Excel
file name then we have to configure it accordingly.
Retrieve File Name: select the Fully Qualified radio button.
Then, create variable mappings for the Foreach Loop container, select the "User::FileName" variable and set the Index value to 0 in the Variable Mappings section.
Add a Data Flow Task inside the Foreach Loop Container.
Right click on the recently added Data Flow task and click on Properties and mark the DelayValidation property to True.
Add an Excel Source in the Data Flow Task and create a new connection to any of the Excel source files.
You have to make the Excel connection dynamic so that it can connect to each Excel file in the source folder. To make the Excel source connection dynamic, right click on Excel Source Connection and then click on Properties.
Expand the Expression Properties, then select the Connection String property and then click on the expression icon, in the expression window :
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
+#[User::FileName]+";Extended Properties=\"Excel 8.0;HDR=YES\";"
Finally, execute the SSIS package and see the result.
I have set up part of my ssis package to check a folder and move the csv file to another folder using a file system task.
Is it possible to rename a file without knowing what the source name will be? as I have been told it will be a new name on a daily bases. can I use a wildcard in a file system task? would I be best renaming before moving what is best practise?
For an unknown file name you can wrap your file system task in a foreach loop
wrap your file system task in a foreach loop and set to file enumerate
set the folder to your source location
set search string to *.csv
set to full file path
map to a variable called fname
use fname as variable in file system task for source
Make sure you delay validation on connection manager
I have a list of file paths in a table. I need to read these file paths copy these files to a drive which is on present on another server without changing the directory structure. Is this possible ?
You could implement this in SSIS in the following way:
Create a Data Flow Task. In the Data Flow Task, add a Data source component to fetch the filenames from the table. Add a Recordset Destination component and connect the two
Create a For Each container. Connect to the Data Flow Task created in step 1. Change the enumerator type to ADO enumerator
In the For Each task, add a Script Task that takes the file name and uses System.IO.File::Copy method to copy the file to its destination.
I have a Flat File Source that is reading data from a flat file. We have recently added a new column to this flat file.
The flat file data is inserted into a database table. To accommodate the new field in the destination component, I used the ALTER TABLE statement to add the new column to the table. That is the only change I have done.
Should the mapping between flat file and destination component automatically change? I do not see the additional column present in the flat file anywhere within the SSIS package.
How do I configure the additional column in the flat file within SSIS package so that flat file source can pass the data to the destination component?
If you added a new column to the flat file, you need to update the Flat File Connection Manager to reflect the new changes. Flat File Connection Manager will be present under the Connection Manager tab at the bottom of the package.
Sample scenario illustrated using SSIS 2012:
Let's assume that you have a flat file with columns StateCode and StateName.
When you configure the Flat File Connection Manager, you will see these columns configured under Advanced tab page as shown below.
If you modify the flat file to add an additional column, say by adding the new column named CountryCode.
The flat file connection manager will not contain the new column definition. You need to open the Flat File Connection Manager to add the new column or you could delete the Flat File Connection Manager and create a new one with the new flat file column definition.
You need to click New and select appropriate option to insert the column. You cannot move the column positions. So, make sure you select the right option to add the columns. Set the appropriate properties to define the column.
When you modify source or destination schema, it will affect the source and destination components within data flow task. You might see a warning icon on the component as shown below because the component is out of sync with the metadata information of the connection manager that it is associated with.
Double-click the component showing the warning and click OK on the editor to resolve the mapping issue.
Hope that helps.
When you alter the metadata of an underlying component such as a flat file or a database, SSIS doesn't automatically refresh all the available columns. You have to do this manually.
Open the source component's editor and navigate to the "Columns" properties (on the left) and verify all of your external columns from the flat file are there and are selected as output columns.
Repeat this process for the Mappings property window of your destination component. Verify all flat file columns are mapped to the proper destination column.
The simplest way of updating your columns in your flat file source is to reset the columns on your flat file connection.
Open Your flat file connection from connection managers
Select Columns (below General)
Click Reset Columns - this then includes any new columns.
Of course you need to be careful if you have made custom changes to the data types, et cetera.
I am trying to import multiple tab delimited files into a sql server table using a SSIS package. I set the flat file source and created a flat file connection manager but I was told I will need to create multiple flat file sources for this. This cannot be true right?
Is there not someway I can use a loop and the source folder directory location?
So long as the files are all the same structure, you'd use a for-each loop, of type file. Point it at the folder with the files in, and assign a variable to the file+path. Then use that variable as an expression on the flat file connection manager.
Here is a link that shows how to do this: http://www.sqlis.com/sqlis/post/Looping-over-files-with-the-Foreach-Loop.aspx
I like the layout and the graphical indicators used on that page.