SSIS Excel destination: data loads into header row

I have an SSIS package that is supposed to load data into an Excel destination (a template file).
The first row of the destination is a title and the second row holds the column headers, so I use the following:
Select * from [TemplateName$A2:$AD10000]
However, the first row of data from the SQL source is written into the second row of the template, overwriting the header names. If I select from A3 instead, I get an error because the mapping needs column names.
Please suggest, thanks.

I just did some testing with this and I get the same exact behavior:
MyCol
<----- Data should start here
1 <----- Data actually starts here
2
3
My suggestion would be to remove the column headings from the sheet, uncheck "First row has column names" in your Excel connection, and then add the actual headings to your SQL data (converting everything to text). It would probably be good to file a Connect item for this as well.
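The idea of putting the headings into the SQL data can be sketched as a UNION ALL query. This is only an illustration, run here against an in-memory SQLite database through Python so it is self-contained; the function name rows_with_header, the sort_key column and the src table are assumptions, and the T-SQL you would use in the SSIS source will differ in details such as the CAST target type.

```python
import sqlite3


def rows_with_header(conn, table, columns):
    """Return the heading row followed by the data rows, every value as text.

    Table and column names are interpolated directly, so they must come
    from trusted code, not from user input.
    """
    # First SELECT: the headings as literal text values.
    header_select = ", ".join(f"'{name}' AS {name}" for name in columns)
    # Second SELECT: the real data, cast to text so both halves share a type.
    data_select = ", ".join(f"CAST({name} AS TEXT) AS {name}" for name in columns)
    sql = (
        f"SELECT {header_select}, 0 AS sort_key "
        "UNION ALL "
        f"SELECT {data_select}, 1 AS sort_key FROM {table} "
        "ORDER BY sort_key"  # keeps the heading row first
    )
    # Drop the helper sort_key column before returning.
    return [row[:-1] for row in conn.execute(sql)]
```

With "First row has column names" unchecked, the heading row is then written like any other row, which is why every column has to be text.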

Related

Extra text added by external table by impala,hive in first row from csv

I have many CSV files, each with some rows and multiple columns, in an S3 bucket. The first cell of each file is an id, for example c63-c5cf-44d7.
I created an external table on that location without skipping a header, because the files have no header and the first row contains actual values.
When I run select * from the table, the first cell value automatically gets extra text added in front of it, like this:
./Track.2017-02-06-12-11_ae55b12f.csv00006440000000000000000031413046064003015703 0ustar rootrootc63-c5cf-44d7
The actual cell value is at the end of the string above, after rootroot.
I tried to remove the extra text with regexp_replace and also tried regexp_extract, but both fail when applied in the select query while fetching values.
When I export the selected id list to CSV, it shows many ? marks.
Is this a CSV header issue? Or is it recommended to include a header when creating the CSV and skip it when creating the external table?

How to Get Data from CSV File and Send them to Excel Using Pentaho?

I have a tabular CSV file with seven columns containing the following data:
ID,Gender,PatientPrefix,PatientFirstName,PatientLastName,PatientSuffix,PatientPrefName
2 ,M ,Mr ,Lawrence ,Harry , ,Larry
I am new to Pentaho and I want to design a transformation that moves the data (the values of the 7 columns) to an empty Excel sheet. The Excel sheet has different column names, but should carry the same data, as shown:
prefix_name,first_name,middle_name,last_name,maiden_name,suffix_name,Gender,ID
I tried to design a transformation using the following series of steps, but it ends with errors that I could not interpret.
What is the proper design to move the data from the CSV file to the Excel sheet in this case? Any ideas to solve this problem?
As @Brian.D.Myers mentioned in the comments, you can use the Select values step. Here is a step-by-step explanation.
Select all the fields from the CSV file input step.
Configure the Select values step as follows.
In the Content tab of the Excel writer step, click the Get fields button to fill in the fields. Alternatively, you can use the Excel output step.
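What the Select values step does here is rename and reorder fields. A rough Python equivalent is sketched below. The pairing in FIELD_MAP is an assumption made for illustration: the question does not say which source column feeds which target column, middle_name and maiden_name have no obvious source and are left empty, and PatientPrefName has no obvious target and is dropped.

```python
import csv
import io

# Hypothetical target-to-source mapping; None means "no source column".
FIELD_MAP = {
    "prefix_name": "PatientPrefix",
    "first_name": "PatientFirstName",
    "middle_name": None,
    "last_name": "PatientLastName",
    "maiden_name": None,
    "suffix_name": "PatientSuffix",
    "Gender": "Gender",
    "ID": "ID",
}


def remap_rows(csv_text):
    """Return (target_header, rows) with fields renamed and reordered."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        # Strip the padding spaces that follow each value in the sample file.
        rows.append([
            (record[source] or "").strip() if source else ""
            for source in FIELD_MAP.values()
        ])
    return list(FIELD_MAP.keys()), rows
```

The returned header and rows are in the order the Excel sheet expects, so they can be written out by any spreadsheet writer.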

How to import a fixed width flat file into database using SSIS?

Does anyone have a tutorial on how to import a fixed width flat file into a database using an SSIS package?
I have a flat file containing columns with varying lengths.
Column name  Width
-----------  -----
First name   25
Last name    25
Id           9
Date         8
How do I convert a flat file into columns?
Here is a sample package created using SSIS 2008 R2 that explains how to import a flat file into a database table.
Create a fixed-width flat file named Fixed_Width_File.txt with data as shown in the screenshot. The screenshot uses Notepad++ to display the file contents. It has the capability to show the special characters like carriage return and line feed. CR LF denotes the row delimiters Carriage return and Line feed.
In the SQL server database, create a table named dbo.FlatFile using the create script provided under SQL Scripts section.
Create a new SSIS package and add a new OLE DB Connection manager that would connect to the SQL Server database. Let's assume that the OLE DB Connection manager is named as SQLServer.
On the package's control flow tab, place a Data Flow Task.
Double-click on the data flow task and you will be taken to the data flow tab. On the data flow tab, place a Flat File Source. Double-click on the flat file source and the Flat File Source Editor will appear. Click the New button to open the Flat File Connection Manager Editor.
On the General section of the Flat File Connection Manager Editor, enter a value in Connection manager name (say Source), browse to the flat file location, and select the file. This example uses the sample file at C:\temp\Fixed_Width_File.txt. Set Format to Fixed width. If you have header rows in your file, you can enter the value 1 in the Header rows to skip textbox to skip the header row.
Click on the Columns section. Change the font to your liking; I chose Courier New so I could see more data with less scrolling. Enter the value 69 in the Row width text box. This value is the sum of the widths of all your columns (25 + 25 + 9 + 8 = 67) plus 2 for the row delimiter. Once you have set the correct row width, the fixed-width file data should display correctly in the Source data columns section. Now click at the appropriate locations to set the column limits. Note the sections 4, 5, 6 and in the below screenshot.
Click on the Advanced section. You will notice 5 columns created for you automatically based on the column limits that we set on the Columns section in the previous step. The fifth column is for row delimiter.
Rename the columns to FirstName, LastName, Id, Date and RowDelimiter.
By default, the columns will be set with DataType string [DT_STR]. If we are fairly certain that a column will be of a different data type, we can configure it in the Advanced section. We will change the Id column to data type four-byte signed integer [DT_I4] and the Date column to data type date [DT_DATE].
Click on the Preview section. The data will be shown as per the column configuration.
Click OK on the Flat file connection manager editor and the flat file connection will be assigned to the Flat File Source in the data flow task.
On the Flat File Source Editor, click on the Columns section. You will notice the columns that were configured in the flat file connection manager. Uncheck the RowDelimiter because we won't need that.
On the data flow task, place an OLE DB Destination. Connect the output from the Flat file source to the OLE DB Destination.
On the OLE DB Destination Editor, select the OLE DB Connection manager named SQLServer and set the Name of the table or the view drop down to [dbo].[FlatFile]
On the OLE DB Destination Editor, click on the Mappings section. Since the column names in the flat file connection manager are same as the columns in the database, the mapping will take place automatically. If the names are different, you have to manually map the columns. Click OK.
Now the package is ready. Execute the package to load the fixed-width flat file data into the database.
If you query the table dbo.FlatFile in the database, you will notice the flat file data imported into the database.
This sample should give you an idea about how to import fixed-width flat file into database. It doesn't explain how to handle error logging but this should get you started and help you discover other SSIS related features when you play with packages.
Hope that helps.
SQL Scripts:
CREATE TABLE [dbo].[FlatFile](
[Id] [int] NOT NULL,
[FirstName] [varchar](25) NOT NULL,
[LastName] [varchar](25) NOT NULL,
[Date] [datetime] NOT NULL
)
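The row width of 69 and the column limits clicked in the Columns section follow directly from the column widths. A small Python sketch of that arithmetic, assuming the widths from the question plus a 2-character CR LF delimiter column (the names COLUMN_WIDTHS and column_boundaries are made up for this example):

```python
from itertools import accumulate

# Widths from the question, plus 2 characters for the CR LF row delimiter.
COLUMN_WIDTHS = {
    "FirstName": 25,
    "LastName": 25,
    "Id": 9,
    "Date": 8,
    "RowDelimiter": 2,
}


def column_boundaries(widths):
    """Return the character position at which each column ends."""
    return list(accumulate(widths.values()))


boundaries = column_boundaries(COLUMN_WIDTHS)
row_width = boundaries[-1]  # the value for the Row width text box
```

Each boundary except the last is a position where a column limit would be placed in the editor; the last one is the total row width.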
In a Derived Column transformation, you can use the SUBSTRING() function for each column.
Example:
Columns DerivedColumn
FirstName SUBSTRING(Data, startFrom, Length)
Here FirstName has a width of 25 and starts at the first character. SUBSTRING positions are 1-based, so in the derived column you would specify SUBSTRING(Data, 1, 25).
Similarly for the other columns.
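To make the start positions concrete, here is a Python sketch that mimics a 1-based SUBSTRING(text, start, length) and applies it to each column. It assumes the whole line arrives as a single text value, as the Data column above implies; the start positions in LAYOUT are derived from the widths in the question, and the helper names are hypothetical.

```python
def substring(text, start, length):
    """Mimic a 1-based SUBSTRING(text, start, length) using Python slicing."""
    return text[start - 1:start - 1 + length]


# (column name, 1-based start position, width)
LAYOUT = [
    ("FirstName", 1, 25),
    ("LastName", 26, 25),
    ("Id", 51, 9),
    ("Date", 60, 8),
]


def parse_record(line):
    """Split one fixed-width line into a dict of trimmed column values."""
    return {
        name: substring(line, start, length).strip()
        for name, start, length in LAYOUT
    }
```

Each start position is the previous start plus the previous width, so a mistake in one width shifts every later column.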
Very well explained, Siva! Your tutorial and excellent illustrations point out what Microsoft should have made clear:
1. that the width for a fixed-length row has to include the carriage return and line feed (CR and LF) characters (which I figured out because the preview showed the rows were not lining up correctly)
2. the all-important step of defining an extra column to contain those CR and LF characters, even though they won't be imported. I figured this out, too. I would have benefited from finding your answer before I began.
Without those two things, an attempt to run the import will give this error message:
The data conversion for column "Column x" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
I have added this error text in the hope that someone will find this page while searching for the cause of their error. Your tutorial is worth finding, even if after the fact!

how to load extracted file name into sql server table in SSIS

I have 3 CSV files in a folder which contain eid, ename and country fields, and my 5 CSV file names are test1_20120116_034512, test1_20120116_035512, test1_20120116_035812 etc. My requirement is to take the latest file based on timestamp and modified date, which I have done. Now I want to import the extracted file name into the destination table.
My destination table contains fields like:
filepath, filename, eid, ename, country
I posted about this before on the same site and got an answer for extracting the file name; now I want to load the extracted FileName into the destination table:
Import most recent csv file to sql server in ssis
My destination table should have output like:
C:/source test1_20120116_035812 1234 tester USA
In your Data Flow task, add a Derived Column Transformation. The value of CurrentFile will be the fully qualified path to the file. As you only want the file name, I would use a REPLACE function on it with the base folder and then strip the remaining slash. This does not strip the file extension, but you can add yet another call to REPLACE and substitute an empty string.
Derived Column Name: filename
Derived Column:
Expression: REPLACE(REPLACE(@[User::CurrentFile], @[User::RootFolder], ""), "\\", "")
The above expects it to look like
CurrentFile = "C:\source\test1_20120116_035812.csv"
RootFolder = "C:\source"
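The effect of the nested REPLACE calls can be checked outside SSIS. The Python sketch below mirrors the expression using the two example values; the function names and the default ".csv" extension are assumptions for illustration only.

```python
def extract_file_name(current_file, root_folder):
    """Mirror REPLACE(REPLACE(CurrentFile, RootFolder, ""), "\\", "")."""
    without_folder = current_file.replace(root_folder, "")
    # Remove the slash that is left between folder and file name.
    return without_folder.replace("\\", "")


def strip_extension(file_name, extension=".csv"):
    """Mirror the extra REPLACE that substitutes an empty string."""
    return file_name.replace(extension, "")
```

Note that, like the expression, this removes every backslash, so it only behaves as intended when the file sits directly in the root folder.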
Edit
I believe you've done something in your approach that I did not do. You should see a warning about possible truncation but given the values discussed in this and the preceding question, I don't believe the 4k limit on expressions will be of concern.
Displaying the derived column
Demonstrating the derived column does work
I will give you a +1 for providing an approach I wasn't aware of, but you'll still need to add a derived column to match your provided format (base path name)
The full path is provided by the custom properties. Use the REPLACE expression above to remove the path info, except use the column [FileName] instead of @[User::CurrentFile].
I tried to get the file name through the procedure that Billinkc gave, but it throws an error stating that the filename column failed because of a truncation error.
Anyhow, I tried a different approach to load the file name into the table.
Steps I used:
1. Right-click the Flat File Source and click Show Advanced Editor.
2. Select the Component Properties tab.
3. In the Custom Properties section there is a property named FileNameColumnName.
I assigned FileName to that property, like
FileNameColumnName ----> FileName
That's it; I am now able to get the file name into my destination table.

SSIS Need Flat File output with 2 column headers the same

I am trying to use an SSIS Flat File destination, but cannot come up with a workaround for getting the output file to have two columns named the same thing.
I have a requirement for the output file to have the column headers:
first1, last1, email, shortname, email
Whenever I try to map the source data, I get error messages saying things like "This column name already exists" and "There is more than one data source column with the name "email"".
What's the best workaround?
Thanks
Assuming I understand the problem correctly, you need to have the same column name in the output file twice. Doesn't matter whether it's same data or not, just the header needs to be repeated.
It's a little hokey, but in your connection manager, uncheck "Column Names in the first data row" and redefine the columns as email and email1. This will allow you to connect the columns to the right places in the file.
In your flat file destination, you have the ability to define Header row(s). It's very limited, you can't put useful things in there like dynamic checksums and such but in your case, paste in first1, last1, email, shortname, email and run the package. Data will be extracted to the correct columns and a header row will be prepended to the file with all the "right" field names.
Two downsides to this approach. First is the connection manager becomes output only as it would attempt to read in the header row from the file. Second is that any changes to the layout will not be kept in sync with the manual header row.
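The same trick, unique column names internally plus a hand-written header line, can be sketched in Python to show what ends up in the file. INTERNAL_NAMES and write_with_fixed_header are hypothetical names; HEADER_LINE is the header text from the question.

```python
import csv
import io

# Unique names used only for mapping the data, as in the connection manager.
INTERNAL_NAMES = ["first1", "last1", "email", "shortname", "email1"]
# The literal header text the output file must start with.
HEADER_LINE = "first1, last1, email, shortname, email"


def write_with_fixed_header(rows):
    """Write the fixed header line, then the data rows without column names."""
    buffer = io.StringIO()
    buffer.write(HEADER_LINE + "\r\n")
    writer = csv.DictWriter(buffer, fieldnames=INTERNAL_NAMES)
    # writeheader() is deliberately not called: the header is the manual line.
    writer.writerows(rows)
    return buffer.getvalue()
```

As with the SSIS version, the header text and the column list are maintained separately, so a layout change has to be made in both places.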