I receive a pipe-delimited flat file each week that has 50 columns. I am trying to use SSIS to take that file, delete the last 3 columns, then insert the remaining data into a new pipe-delimited flat file. At first I thought this would be very simple, but I've got a stubborn flat file connection manager. It keeps reverting back to the inbound file layout with the extra columns, and also keeps going back to a comma delimited file when the outbound file needs to be pipe delimited.
The way I'm "deleting" the unneeded columns is by just removing them from the inbound flat file connection manager, so they aren't listed in the output columns from the flat file source, and they don't show to be on the input columns of the flat file destination.
The file name of both files is dynamic...not sure if that's having something to do with it.
I have delay validation set to true for both, but I'm not sure what else to try. I've also tried deleting all of it and adding back in the connection managers and the files.
Is there some issue with having 2 flat file connection managers, one for the source and one for the destination? Is there a setting I'm missing?
Delete your connection managers and Source/Destination (effectively start over)
Add Flat File Source (FFSRC) with new FF Connection
Set up FFSRC as needed (pipe delimited, headers, etc) - don't delete any columns here
After clicking "OK", you're back in the Data Flow. Right click on your Flat File Source, and click "Show Advanced Editor"
Go to "Input and Output Properties" tab, then expand the FFSRC Output/Output Columns. Click a column, then click "Remove Column".
Add your Flat File Destination (FFDST) with a new connection manager, and map the inputs.
Your destination shouldn't have those columns now.
If the Flat File connection seems to be resetting because of dynamic names, look into supplying them as expressions/variables.
To do that, click on the Connection Manager node for your source/destination (not the data flow node), then in Properties, expand Expressions. You will want to make the ConnectionString dynamic via a variable.
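For example (a rough sketch only; the variable name User::OutputFilePath, the folder, and the file-name pattern are placeholders for whatever your package actually uses), the ConnectionString expression on the destination connection manager can simply reference a variable:

    ConnectionString = @[User::OutputFilePath]

and the variable itself can carry its own expression that builds the weekly file name, something like:

    "C:\\Exports\\WeeklyExtract_" + (DT_WSTR, 4)YEAR(GETDATE())
        + RIGHT("0" + (DT_WSTR, 2)MONTH(GETDATE()), 2)
        + RIGHT("0" + (DT_WSTR, 2)DAY(GETDATE()), 2) + ".txt"

Note that the SSIS expression language escapes backslashes, hence the doubled \\ in the path literal.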
What do you mean by "It keeps reverting back to the inbound file layout with the extra columns, and also keeps going back to a comma delimited file when the outbound file needs to be pipe delimited"?
Have you tried directly referencing the file, then setting the type of file (ragged right, delimited, or fixed width), and then applying the expression to the ConnectionString property of the connection manager?
I suggest ragged right; with it I can specify whatever 'columns' and widths I want.
When I create the SSIS package it requires a file to be referenced to pick up the file's metadata. For example, the column headers will be ColumnA, ColumnB.
I have always assumed that these column names need to be present in the file for it to be loaded. Recently the business, for whatever reason, changed one of the column names in the file, so the file now contains ColumnA, NotColumnB. When the SSIS package runs it ignores this and loads the file. I assumed that it would fail. Is my assumption correct and something weird is going on, or is my assumption incorrect? If so, please let me know why.
I have changed the column names in a few other packages that load data from a file and they also don't care what the column names are.
Click on the flat file source, and press F4 to show the Properties tab. There is a property called ValidateExternalMetadata; change it to True.
For more information check the following answer:
Detect new column in source not mapped to destination and fail in SSIS
Update 1
It looks like the flat file connection manager has no validation engine, and the metadata defined is used at configuration time to configure the mappings between the data file and the database.
Why Does't SSIS Flat File Data Check If Columns Names or Order Have Changed? What is best way to check?
Flat file destination columns data types validation
I have a data flow task that gets the data from a proc, counts the rows, and adds the rows to a flat file with some 20 columns, each with a different output column width specified on the Advanced tab of the flat file connection manager. The flat file destination overwrites the file every time, and the name of the file is created dynamically.
Now what I need to do is add a header and a footer row to that existing flat file, with only 5 columns, each with their own width. Values for the header and footer are not coming from the data set used in the above data flow task.
I think it will be its own flat file connection object with 5 columns in it. Some column values can come from variables.
How can I append a header row and a footer row to the existing file coming from the data flow task? I am not sure how to go about doing that.
Since the header and footer rows use different data, you can use three Data Flow Tasks (one each for the header, the current output, and the footer). On the Control Flow tab, use Precedence Constraints to link each DFT in the correct order.
For the header and footer tasks, add the necessary source components and set the output to a Flat File Destination. Create a string variable that will contain the name of the output file that all the results will be written to, and set this as the ConnectionString expression on all the Flat File Connection Managers that are used. This will ensure they all write to the same file.
On the first (header) DFT, select the "Overwrite data in the file" option on the Flat File Destination to ensure that a new file will be created. This can also be done by using the Advanced Editor, going to the Component Properties pane, and setting Overwrite to true. On the second (current) and footer DFTs, set the overwrite option to false so that the data will only be appended to the file for these tasks.
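As a rough sketch of the header and footer DFTs (everything here is hypothetical: the column names, the literals, and the variables User::FileDate, User::OutputFileName and User::RowCount stand in for whatever your package actually defines), a Derived Column fed by any single-row source can build the 5 columns from variables and literals, which then map onto the 5-column flat file connection:

    RecordType   "HDR"                            (the footer DFT might use "TRL" instead)
    FileDate     @[User::FileDate]
    FileName     @[User::OutputFileName]
    RowCount     (DT_WSTR, 12)@[User::RowCount]   (cast the integer row-count variable to string)
    Filler       ""

The footer DFT works the same way; only the literals and the variables it pulls in differ.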
I just added a new field to my OLE DB Source in an SSIS package. It then goes to a Flat File Destination. In the Flat File Destination object, the field shows up under Available Input Columns but doesn't show up anywhere else (Available Destination Columns, or the bottom box that lists fields).
How do I get this as an available destination field?
Go into the Flat File Connection manager used by your destination object, go to Advanced, and click "New", then add the column properties.
Delete the destination and recreate it; that usually works.
Does anyone have a tutorial on how to import a fixed-width flat file into a database using an SSIS package?
I have a flat file containing columns with varying lengths.
Column name Width
----------- -----
First name 25
Last name 25
Id 9
Date 8
How do I split the flat file into these columns?
Here is a sample package created using SSIS 2008 R2 that explains how to import a flat file into a database table.
Create a fixed-width flat file named Fixed_Width_File.txt with data as shown in the screenshot. The screenshot uses Notepad++ to display the file contents. It has the capability to show the special characters like carriage return and line feed. CR LF denotes the row delimiters Carriage return and Line feed.
In the SQL Server database, create a table named dbo.FlatFile using the create script provided under the SQL Scripts section below.
Create a new SSIS package and add a new OLE DB connection manager that connects to the SQL Server database. Let's assume that the OLE DB connection manager is named SQLServer.
On the package's control flow tab, place a Data Flow Task.
Double-click on the data flow task and you will be taken to the data flow tab. On the data flow tab, place a Flat File Source. Double-click on the flat file source and the Flat File Source Editor will appear. Click the New button to open the Flat File Connection Manager Editor.
On the General section of the Flat File Connection Manager Editor, enter a value for the connection manager name (say Source), then browse to the flat file location and select the file. This example uses the sample file in the path C:\temp\Fixed_Width_File.txt. If you have header rows in your file, you can enter the value 1 in the Header rows to skip textbox to skip the header row.
Click on the Columns section. Change the font to your liking; I chose Courier New so I could see more data with less scrolling. Enter the value 69 in the Row width text box. This value is the sum of the widths of all your columns plus 2 for the row delimiter (25 + 25 + 9 + 8 = 67, plus 2 for CR and LF = 69). Once you have set the correct row width, you should see the fixed-width file data laid out correctly in the Source data columns section. Now click at the appropriate positions to mark the column limits. Note the sections 4, 5 and 6 in the screenshot below.
Click on the Advanced section. You will notice 5 columns created for you automatically based on the column limits that we set in the Columns section in the previous step. The fifth column is for the row delimiter.
Rename the columns to FirstName, LastName, Id, Date and RowDelimiter.
By default, the columns will be given the data type string [DT_STR]. If we are fairly certain that a column will be of a different data type, we can configure it in the Advanced section. We will change the Id column to the data type four-byte signed integer [DT_I4] and the Date column to the data type date [DT_DATE].
Click on the Preview section. The data will be shown as per the column configuration.
Click OK on the Flat file connection manager editor and the flat file connection will be assigned to the Flat File Source in the data flow task.
On the Flat File Source Editor, click on the Columns section. You will notice the columns that were configured in the flat file connection manager. Uncheck RowDelimiter because we won't need it.
On the data flow task, place an OLE DB Destination. Connect the output from the Flat file source to the OLE DB Destination.
On the OLE DB Destination Editor, select the OLE DB connection manager named SQLServer and set the Name of the table or the view dropdown to [dbo].[FlatFile].
On the OLE DB Destination Editor, click on the Mappings section. Since the column names in the flat file connection manager are the same as the columns in the database, the mapping will take place automatically. If the names are different, you have to map the columns manually. Click OK.
Now the package is ready. Execute the package to load the fixed-width flat file data into the database.
If you query the table dbo.FlatFile in the database, you will see the flat file data that was imported.
This sample should give you an idea of how to import a fixed-width flat file into a database. It doesn't explain how to handle error logging, but it should get you started and help you discover other SSIS-related features as you play with packages.
Hope that helps.
SQL Scripts:
CREATE TABLE [dbo].[FlatFile](
[Id] [int] NOT NULL,
[FirstName] [varchar](25) NOT NULL,
[LastName] [varchar](25) NOT NULL,
[Date] [datetime] NOT NULL
)
In the Derived Column transformation you can use the SUBSTRING() function for each of the columns.
Example:
Columns      DerivedColumn
FirstName    SUBSTRING(Data, startFrom, length)
Here FirstName has a width of 25, and SUBSTRING in the SSIS expression language is 1-based, so starting from the first position you would specify it as SUBSTRING(Data, 1, 25).
Similarly for the other columns.
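Put together for the widths in the question (a sketch only, and assuming the whole record arrives in a single input column named Data), the four derived columns would look something like this:

    FirstName    SUBSTRING(Data, 1, 25)
    LastName     SUBSTRING(Data, 26, 25)
    Id           SUBSTRING(Data, 51, 9)
    Date         SUBSTRING(Data, 60, 8)

If Id needs to land in the table as an integer rather than a string, you can add a cast on top, for example (DT_I4)SUBSTRING(Data, 51, 9).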
Very well explained, Siva! Your tutorial and excellent illustrations point out what Microsoft should have made clear:
that the width for a fixed-length row has to include the Carriage Return and Line Feed (CR & LF) characters (which I figured out because the preview showed the rows were not lining up correctly)
the all-important step of defining an extra column to contain those CR & LF characters, even though they won't be imported. I figured this out, too. I would have benefited by finding your answer before I began.
Without those two things, an attempt to run the import will give this error message:
The data conversion for column "Column x" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
I have added this error text in hopes that someone will find this page while searching for the cause of their error. Your tutorial is worth finding, even if after the fact!
I have an SSIS package using a tab delimited flat file source with a TON of fields. Recently the provider of the tab delimited flat file has decided to change the format of the flat file by sprinkling a couple dozen new fields at random into the file. Needless to say, this hosed the package.
Rather than rebuild another flat file source and redefine all the fields, types, and lengths all over again, is there a way to reorder the fields in the flat file source? Sure would have been nice if Microsoft allowed you to move the fields around in the Advanced Columns pane, but noooooo.
Any help is appreciated.
If you only need to add columns to your file, you can do that in the Flat File connection editor. In the Advanced window, select the field next to the new one and click the chevron next to the New button. It will give you the choice to insert before or insert after.
If you truly have to move things around, you'll need to edit the XML source. If you use the existing file definition as a guide, you can build the new one in Excel or T-SQL relatively easily. Easier than typing everything in all over again at least.
I had a similar issue: I needed to change the order of columns in my flat file destination. The time-saving approach I settled on:
Delete the FF destination and FF connection manager (note down the file name/location!)
Clear the check boxes that enable output columns in the source component
Re-enable the columns in the order you want
Add a new FF destination and FF connection right from the FF destination's connection manager drop-down.
Review/sanity check column sizes in FF connection, as usual
Not a direct answer to the question, but I came here looking for advice on "how to rearrange flat file destination columns", perhaps this will help someone.
I haven't seen a solution for that problem. SSIS isn't very strong at changing metadata. You could try to do it in Notepad, but that is very tricky and very buggy. I would not recommend it.
In the Connection Managers pane at the bottom of your IDE you can double-click your file connection and edit everything you want.
This is still a "feature" of SSIS. To work around this I create a flat file connection called "NULL" with a single column named "NULL". Use the "New" button to add the column. I change the default column name from "Column 0" to "NULL". This column name must not match any column name in the list to be re-populated. If you have a real column named "NULL", pick something else for the column name that's not in use. You can keep the "NULL" flat file connection in the project for later use. (I expect to need it a few more times in this project.)
For this example, I use a flat file destination. Change the Flat File Destination to use the NULL connection.
Check the mapping to see there are no columns mapped. Saving this resets the metadata stored for the mapping.
Finally, change the Flat File Destination back to the correct connection to get a new mapping without metadata interference.
My example is a flat file destination. It should work for a flat file source for resetting the metadata. It is similar to the trick of changing a query to "select 1 as [NULL]" and back to purge metadata when using an ODBC source or such.
You could probably try something, but I haven't tested it: use expressions to set everything for your flat file source, and turn design-time validation off.