SSIS reading flatfile header column - csv

Can you guys help me (point me in the right direction) on how I can achieve the following in SSIS?
So, I have a flat file that looks like this:
ColumnA ColumnB ColumnC ColumnD ColumnN
1 x APPLE Random1 MoreRandomData1
2 y ORANGE Random2 MoreRandomData2
3 z OTHER Random3 MoreRandomData3
... and I need to store this data in a table in the following format:
ColumnA, ColumnB, BigBlurColumn
1 x ColumnC:APPLE, ColumnD:Random1, ColumnN:MoreRandomData1
2 y ColumnC:ORANGE, ColumnD:Random2, ColumnN:MoreRandomData2
3 z ColumnC:OTHER, ColumnD:Random3, ColumnN:MoreRandomData3
Here are my questions:
1. How can I read the header/column names of a flat file?
2. Is it possible to pivot the result of #1?
If I can manage both #1 and #2, the rest will be fairly easy for me to do in SSIS. Obviously I can script these, however my client insists on using SSIS as this is their standard ETL tool.
Any ideas on how I can achieve the above scenario?
Thanks

In the Flat File Connection Manager, uncheck the option that treats the first row as column names ("Column names in the first data row"). Then go to the Advanced tab, delete all columns except one, and change its length to 4000.
In the Data Flow Task, add a Script Component that splits each row and does the following:
Read the column headers from the first row
Generate the desired output columns in all remaining rows
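The two steps above can be sketched outside SSIS to show the logic the Script Component would need. This is only an illustration in Python (a real Script Component would be written in C# or VB.NET), and the function name `reshape_rows` and its parameters are hypothetical. It assumes the file is whitespace-delimited as in the sample and that the first two columns are kept as-is.

```python
def reshape_rows(lines, delimiter=None, key_count=2):
    """Turn delimited text lines into (key columns..., BigBlurColumn) tuples.

    lines      -- raw text rows, header row first
    delimiter  -- None splits on any whitespace (as in the sample); pass "," for a CSV
    key_count  -- number of leading columns kept as real columns (ColumnA, ColumnB)
    """
    rows = iter(lines)
    # First row: remember the column names instead of treating them as data.
    header = next(rows).rstrip("\r\n").split(delimiter)
    output = []
    for line in rows:
        values = line.rstrip("\r\n").split(delimiter)
        if not values or values == [""]:
            continue  # skip blank lines
        keys = values[:key_count]
        # Fold the remaining columns into one "Name:Value, Name:Value" string.
        blur = ", ".join(
            f"{name}:{value}"
            for name, value in zip(header[key_count:], values[key_count:])
        )
        output.append((*keys, blur))
    return output


sample = [
    "ColumnA ColumnB ColumnC ColumnD ColumnN",
    "1 x APPLE Random1 MoreRandomData1",
    "2 y ORANGE Random2 MoreRandomData2",
    "3 z OTHER Random3 MoreRandomData3",
]

for row in reshape_rows(sample):
    print(row)
```

Because the header is read at run time, the same logic keeps working when columns are added after ColumnB, which is the reason for loading each row as a single wide column in the first place.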
The following answers (different situations but they are helpful) will give you some insights:
SSIS ragged file not recognized CRLF
how to check column structure in ssis?
SSIS reading LF as terminator when its set as CRLF

Try loading the data into a staging table, then use the STRING_AGG() function (available from SQL Server 2017) to concatenate the data into the format you want and move it to the destination table.

Related

Remove null values from SSIS package export from DB source to csv

I set up an SSIS package to export a view over multiple tables that gives me 3 columns (ColumnA, ColumnB, ColumnC) to a single CSV file that contains only ColumnB and ColumnC. So it is an OLE DB source to a flat file destination. I am now trying to remove rows that have a blank entry in ColumnB. I can't seem to find anything that clearly shows how to remove the rows that have a null value in ColumnB when going from a DB source to a flat file destination. Any links or thoughts on the matter would be greatly appreciated.
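In your data flow, you can include a Conditional Split component that sends the rows with an empty entry in that column to their own output, and simply leave that output unconnected in the rest of your flow. A condition along the lines of ISNULL(ColumnB) || TRIM(ColumnB) == "" would catch both NULL and blank values (assuming ColumnB is a string column).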

SSIS 2012 column header is too long for column width Extracting fixed width flat file

I am attempting to extract a table from a SQL database into a fixed-width flat file.
The file should have a column header.
I am attempting to recreate a file that already existed, where the header for certain columns (for example Gender, with a width of 1) has a column name that is too long for its column format.
The existing file just cuts off these column headers, so Gender (the database column name and the input column to the destination) becomes 'G', just what will fit. When I try to reproduce the extract in SSIS 2012 by pointing at the existing file while creating the Flat File Connection Manager, it works without a header, but not when I check "column header in first data row".
Is there a way to change/shorten the column names to just what will fit in the format? I am using the "ragged right" file format and the data looks perfect without column headers.
Any help is appreciated.
Steve
SSIS really likes consistent metadata. The flat file definition specifies that Gender has a length of one, and SSIS is going to hold the column header to the same standard that it holds the data. My experience with fixed-width files is that they've never had headers, which is painful when they're a few thousand bytes wide, and that is likely due to this very problem.
What you can do is to manually specify a header row in the Flat File Destination.
Within my Connection Manager, I uncheck the Column Names in First Row and increment the Header Rows to Skip value to 1.
In my example, I used the following query
SELECT
    *
FROM
(
    VALUES
        ('AAAAAAAAAAAAAAAAAA','BBBBBBBBBBBBBBBBBBBBBBBB','M','CCCCCCC')
) D(c1, c2, Gender, c4);
This results in an output file that looks like
Col1Is18BytesWide NextColumnAlignsWithNextGenderSeeWhatIDidThere
AAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBMCCCCCCC
That may or may not be the solution you're looking for. I think it'd drive me mad seeing column headers not aligning with the data values but you never know how other systems expect their data.
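If the goal is instead a header that matches the original file, with each name cut to what fits, the header text can be generated once and pasted in as the manual header. The sketch below is a hypothetical helper in Python, not part of SSIS. It assumes the column widths 18, 24, 1 and 7 taken from the sample row above.

```python
def fixed_width_header(columns):
    """Build one header line where each name is cut or padded to its column width.

    columns -- list of (column_name, width) pairs in file order
    """
    return "".join(name[:width].ljust(width) for name, width in columns)


# Widths assumed from the sample data row: 18, 24, 1 and 7 characters.
columns = [("c1", 18), ("c2", 24), ("Gender", 1), ("c4", 7)]
header = fixed_width_header(columns)
print(repr(header))
```

Here "Gender" is reduced to "G" and lands directly above the "M" in the data row, so the header stays aligned with the values.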

SSIS: Importing from flatfile, how do I custom update a column that's not in flatfile?

Flat file has:
A,B
1,2
5,7
DB has:
A,B,C
1,2,null
5,7,null
During the import, I'd like to populate column C with the static value "3".
So it looks like:
A,B,C
1,2,3
5,7,3
Thanks in advance!
Hmm, it looks like this is the best solution:
Executing the same SSIS Package with different parameters at different time
In your Data Flow, you will need to add a Derived Column Component between your Flat File Source and the OLE DB Destination.
In your Derived Column, name the new column C and give it an Expression. Since you say the value is static, it might be as simple as putting the literal value 3 into the Expression column.

SSIS package - extract data from first n rows and import data from n+1th row from a flat file

I have a flat file with the following structure (first 3 lines are information about the file content and data starts at 4th row):
ImportSourceId,ReadTime,Location
ColumnHeader1,ColumnHeader2,ColumnHeader3,ColumnHeader4,ColumnHeader5,ColumnHeader6
Unit1,Unit2,Unit3,Unit4,Unit5,Unit6
DataForColumn1,DataForColumn2,DataForColumn3,DataForColumn4,DataForColumn5,DataForColumn6
I would appreciate suggestions for importing this data into a target SQL Server table using SSIS. I am thinking along these lines:
Add a connection manager. 3 columns will be created based on the number of values in the first row (ColumnHeader3 through ColumnHeader6 are all being treated as one column by the connection manager at this point). As I want to extract information from the first row, I can't set 'Header Rows To skip' (?).
Add a script component to read the first 3 rows into a string variable and extract the data as required.
(I'm not sure how to split the 3rd column into separate columns at this point.)
Regards,
Mohan.
Assuming the column names are always static:
When importing the file, use a flat file connection.
Skip the first 3 rows with "Header Rows to skip"
Uncheck "column names in first row"
Click "Advanced" and manually set your column names.
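The steps above drop the first three rows entirely. If the values in those rows are also needed, as the question suggests, the reading logic amounts to "set aside the first three rows, then map every later row onto fixed column names". The following is a minimal sketch of that logic in Python, purely as an illustration of what a script would have to do. The function `read_with_preamble` is hypothetical, and the column names are assumed to be static, as in the answer.

```python
import csv
import io


def read_with_preamble(text, column_names, skip_rows=3):
    """Split a file into its leading information rows and its data rows.

    text         -- full file content as a string
    column_names -- fixed names for the data columns (assumed static)
    skip_rows    -- number of leading rows that are not data
    """
    reader = csv.reader(io.StringIO(text))
    # Keep the leading rows so values such as the import source can still be used.
    preamble = [next(reader) for _ in range(skip_rows)]
    # Every remaining non-empty row is data, mapped onto the fixed column names.
    data = [dict(zip(column_names, row)) for row in reader if row]
    return preamble, data


text = (
    "ImportSourceId,ReadTime,Location\n"
    "ColumnHeader1,ColumnHeader2,ColumnHeader3,ColumnHeader4,ColumnHeader5,ColumnHeader6\n"
    "Unit1,Unit2,Unit3,Unit4,Unit5,Unit6\n"
    "DataForColumn1,DataForColumn2,DataForColumn3,DataForColumn4,DataForColumn5,DataForColumn6\n"
)
names = [f"ColumnHeader{i}" for i in range(1, 7)]
preamble, data = read_with_preamble(text, names)
print(preamble[0])
print(data[0])
```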

SSIS 2008 Script Transformation Inputs and Outputs

I have a flat file that I need to parse in SSIS. Part of this parsing is to chop off a load of extra text at the bottom of the file. To help do this, I added a row number to each row using a Script Transformation.
In the Script Transformation (ST) under Inputs and Outputs I have an Input Column defined called Column256_in (it has a length of 256) and its ID is 59.
For Output columns I have defined Column256_out, it has an ID of 68 and a MappedColumnID of 59, there is another Output Col called rowCount.
There is script code contained in the ST that calculates the row number for each row.
When I run the SSIS package with a Data Grid after the Script Transformation, I get the following:
Column256_in contains the data from the original text file.
rowCount is populated correctly. (I did something right today!)
Column256_out is empty --> I thought that the MappedColumnID of 59 would populate this column with the data from Column256_in.
What does the MappedColumnID attribute do on the output column?
Thanks for your assistance.
KD
MappedColumnID is just an alternative way of identifying the columns instead of using their names.
From MSDN
The use of these properties is not required. These properties provide an easier way for developers to associate related columns, such as input and output columns, in custom data flow components.