I have a source Excel file and I need to add the file name to every row. When I select the file name from the system variables, the system sets 0 as the length. How can I change this value?
This is typically done in the data flow with a derived column.
Simply add a column of type string and set the expression to the filename of your Excel file (typically through a variable), as in the sketch below.
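For example, assuming the filename lives in a string variable named @[User::SourceFileName] (the name is illustrative), the Derived Column expression can use an explicit cast so the column gets a real length instead of 0:

(DT_WSTR, 260)@[User::SourceFileName]

260 here is just a safe maximum for a path; size it to your needs.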
I have a problem in a small SSIS package that I'm trying to build for storing a query result in an Excel file.
I want the file to have a dynamic name of Missing_Timecards_#DATETIME#.xlsx
for example: "Missing_Timecards_20220808_131321.xlsx"
For this I have created a template file that has the columns and sheet name I want,
and I have set up a File System Task to copy this template file into a new one with the dynamic name I want to have.
For the variables I have set a combination of a few fields to get my dynamic filename with the date.
The expression for getting the date is:
REPLACE(REPLACE( REPLACE(SUBSTRING((DT_WSTR,50)GETDATE(),1,19),"-",""),":","")," ","_")
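The full filename can then be assembled from that result along these lines (a sketch, assuming the calculated part lands in the @[User::DateTime] variable mentioned below):

"Missing_Timecards_" + @[User::DateTime] + ".xlsx"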
So far so good, no errors here. When the process starts, the variable gets calculated, the File System Task creates a copy with the freshly calculated value, and the package goes on to the data flow that retrieves the data and saves it into the Excel file path set with the variable that was calculated originally for the filename + datetime.
However, here is where the issue appears: it seems that the variable is calculated again, so a new file gets created with a "fresh" datetime part in its name, and as the sheet name doesn't match, it gives an error.
I think the issue is that the variable is being calculated again. How do I stop this from happening? (I have set DelayValidation = True for the Excel connection and the data flow.)
As you've identified, GETDATE() is recalculated each time it is evaluated. Instead, I favor using a system-scoped variable like @[System::StartTime], as it is the time the package started execution but remains constant for the duration of the package.
Literally, swap the reference to GETDATE() with @[System::StartTime] and you're set.
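Applied to the expression above, that gives:

REPLACE(REPLACE(REPLACE(SUBSTRING((DT_WSTR,50)@[System::StartTime],1,19),"-",""),":","")," ","_")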
The other option is to:
Copy the existing expression to your clipboard
Clear the expression from the Variable
Add an Expression Task to the Control Flow and re-use the expression from the clipboard to assign the value to your @[User::DateTime] variable (see the sketch below)
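In the Expression Task, the assignment would look like:

@[User::DateTime] = REPLACE(REPLACE(REPLACE(SUBSTRING((DT_WSTR,50)GETDATE(),1,19),"-",""),":","")," ","_")

Because the task runs exactly once at its place in the Control Flow, the value stays fixed for the rest of the package run.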
Personally, I favor the former approach because, as a consultant, I still run into SQL Server 2008/R2 packages, and the Expression Task was not available in those versions of the product.
I have been working on a requirement which serves the following:
Fetching a record set from an OLE DB source through an Execute SQL Task.
This record set is then formatted into fixed width and merged into a single column with the help of another Execute SQL Task.
The formatted data is then exported to a flat file.
Now the requirement has changed: the record set (originally coming from the OLE DB source) must be exported to three separate flat files (each with a different set of data) depending on the value of a package variable.
E.g. if (User::Instructor = 'DEV') then 5 fields will be extracted to one flat file.
E.g. if (User::Instructor = 'Jerry') then 7 fields will be extracted to another flat file. And so on.
My current challenge is that I have to extract the different sets of data without using expressions on the precedence constraints.
You will need a different Data Flow Task for each file format that you want to be able to export: one task for the 5-field export, another for the 7-field export.
In the Control Flow, you can choose which of these data flow tasks gets executed based on the value of your package variable.
For example, if you set the Disable property of the 5-field Data Flow Task to the expression @[User::Instructor] != "DEV", then it would be disabled whenever the instructor was not DEV, and enabled whenever it was.
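A minimal sketch, pairing each task with the two instructor values from the example (property expressions on the Disable property of each Data Flow Task):

5-field Data Flow Task: @[User::Instructor] != "DEV"
7-field Data Flow Task: @[User::Instructor] != "Jerry"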
I have a pretty simple package. It reads a flat file, extracts a date from a header record and subsequently uses a Derived Column component to reformat the data to the desired output format. One of the columns in the Derived Column component (FileRunDate, string, length 8) is defined as a string, and in its expression I assign it the variable I set earlier in the Script Component - @[User::vRunTimeDate].
When the process runs, the output file gets generated, however FileRunDate is blank. The default value of the variable is blank, but if I set it to some date, then the output file does reflect that value. It seems that the variable assignment in the Script Component does not work, yet if I debug it, I can see the value being set. The variable has the ReadWrite attribute.
Any feedback is greatly appreciated.
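For reference, a likely explanation: a data flow Script Component only writes its ReadWriteVariables back when PostExecute runs, which happens after the data flow buffers have been processed, so a Derived Column in the same data flow still sees the variable's pre-execution (blank) value even though the debugger shows the assignment happening. A minimal C# sketch of the supported pattern (the date value is made up):

// Data flow Script Component: ReadWrite variables may only be
// assigned in PostExecute, after all rows have been processed.
public override void PostExecute()
{
    base.PostExecute();
    // Assumption: the header date was captured into a local field
    // while the rows were being processed.
    this.Variables.vRunTimeDate = "20220808";
}

If the Derived Column needs the value within the same data flow, extract the header date in an earlier step (for example a control flow Script Task) and only then start the data flow.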
I have an Excel source which has got 1000 rows with some 10 columns, and one of the columns is a date field. We have to retrieve the minimum date value and assign it to a variable in SSIS. Could you provide a script or the steps to map that value to the variable, so that I can use it in a Control Flow task to perform a truncate operation with the variable value?
Please advise.
Your help in this regard is appreciated.
Rosh..
It's fairly simple: you use an Execute SQL Task to retrieve the value and store it in a variable.
Basic steps:
A. Create an Excel Connection Manager, point it at your file
B. Create a variable to store the value
C. Add an Execute SQL Task
Connection type: EXCEL
Specify connection manager
ResultSet: single row
SQLSourceType: Direct input
SQLStatement: select min(fieldname) as fieldname from [sheetname$]
In the Result Set tab, add a row with the Result Name set to fieldname, and the earlier created variable in the Variable Name column.
Note that the sheet name qualification (square brackets) is necessary because of the required $. If your field (column) name contains a space, you also have to qualify it: [field name]
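A concrete statement for the question as asked, assuming a sheet named Sheet1 and a date column named Order Date (both names are illustrative):

select min([Order Date]) as MinOrderDate from [Sheet1$]

With MinOrderDate mapped in the Result Set tab, the variable holds the minimum date once the task has run.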
I have a couple of questions about the task on which I am stuck and any answer would be greatly appreciated.
I have to extract data from a flat file (CSV) as an input and load the data into the destination table with a specific format based on position.
For example, if I have order_id, Total_sales, Date_Ordered with some data in them, I have to extract the data and load it into a table like so:
The first field has a fixed length of 2 with numeric as the datatype.
total_sales is inserted into the total_sales column of the table with a numeric datatype and length 10.
The date is stored as datetime in a format different from that of the flat file, like ccyy-mm-dd.hh.mm.ss.xxxxxxxx (here the x's have to be filled up with zeros).
Maybe I don't have the right idea to solve this - any solution would be appreciated.
I have tried using the following ways:
Used a Flat File Source to read the CSV file and then fed it into an OLE DB Destination with a table of fixed data types created. The problem here is that the columns are loaded, but they have to be padded with zeros: the date when it is loaded, and most of the other columns whenever a value does not use the full length, in which case it has to be preceded with zeros.
For example, if I have an order id of length 4 and the flat file has an order id like 201, then it has to be changed to 0201 when it is loaded into the table.
I also tried another way: using a Flat File Source, I created a variable which takes the entire row as input and tried to separate it with Derived Columns. I was successful to an extent, but in the end the data type of the derived column got fixed to Boolean, which I am not able to change to the data type I want.
Please give me some suggestions on how to handle this issue...
Assuming you have a csv file in the following format
order_id,Total_sales,Date_Ordered
1,123.23,01/01/2010
2,242.20,02/01/2010
3,34.23,3/01/2010
4,9032.23,19/01/2010
I would start by creating a Flat File Source (inside a Data Flow Task), but rather than having it fixed width, set the format to Delimited. Tick "Column names in the first data row". On the Columns tab, make sure the row delimiter is set to "{CR}{LF}" and the column delimiter is set to "Comma(,)". Finally, on the Advanced tab, set the data types of each column to integer, decimal and date.
You mention that you want to pad the numeric data types with leading zeros when storing them in the database. Numeric data types in databases tend not to hold leading zeros, so you have two options: either hold the data as the types they are in the target system (int, decimal and datetime), or use the Derived Column component to convert them to strings. If you decide to store them as strings, adding an expression like
"00000" + (DT_WSTR, 5) [order_id]
to the Derived Column control will add up to 5 leading zeros to order id (don't forget to set the data type length to 5) and would result in an order id of "00001"
Create your target within a Data Flow Destination and make the table/field mappings accordingly (or let SSIS create a new table / mappings for you).
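The date column from the question can be handled the same way. A sketch of a Derived Column expression for the ccyy-mm-dd.hh.mm.ss.xxxxxxxx layout, assuming the source dates carry no time portion so everything past the day is simply zero-filled:

(DT_WSTR,4)YEAR(Date_Ordered) + "-" + RIGHT("0" + (DT_WSTR,2)MONTH(Date_Ordered),2) + "-" + RIGHT("0" + (DT_WSTR,2)DAY(Date_Ordered),2) + ".00.00.00.00000000"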