SSIS 2008 sequence number - ssis

I have a requirement where the output file needs to be saved dynamically with the naming convention FileName_YYYY-MM-DD_FileNumber, where FileNumber is a sequence number. For example:
ABC_2009-01-01_001
ABC_2009-01-01_002
ABC_2009-01-01_003 and so on
I am able to build the name part and the date part using an expression on the flat file (.TXT) connection, but I am unable to get the sequence number part. I would appreciate it if anyone could help me with a solution.
Thanks in advance!

Use a package variable that starts at 1 and add one to it for each new file.
EDIT: To populate the variable, one way is to use a Script Task that lists the file names in the folder (for example with a FileSystemObject), parses them, and works out the highest sequence number already used. Then add one to that number and set the variable to the result.
And no, I don't have any code handy that does that. You'll need to write it yourself.
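For what it's worth, the folder scan described in the EDIT could be sketched roughly as follows. It is written in Python only to keep the logic short; inside an SSIS Script Task the same steps would be written in C# or VB.NET. The function name `next_file_name`, the three-digit zero padding, and the optional file extension in the pattern are assumptions based on the example names in the question.

```python
import os
import re
from datetime import date


def next_file_name(folder, prefix, day):
    """Return the next name of the form prefix_YYYY-MM-DD_NNN for `folder`."""
    stamp = day.strftime("%Y-%m-%d")
    # Match e.g. ABC_2009-01-01_007, with an optional extension such as .txt.
    pattern = re.compile(
        r"^%s_%s_(\d+)(\.\w+)?$" % (re.escape(prefix), re.escape(stamp))
    )
    highest = 0
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    # Start at 001 when no file exists yet for this prefix and date.
    return "%s_%s_%03d" % (prefix, stamp, highest + 1)
```

The result would then be assigned to the package variable that feeds the connection string expression. Note that two packages running at the same time could compute the same number, so this only suits a single writer.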

Related

How to decouple variable names in external files and the code?

Imagine I have an external file dates.csv in the following format:
Name
Date
start_of_fin_year
01.03.2022
end_of_fin_year
28.02.2023
Obviously, the file may get updated in the future, and the date may change. I create a piece of code that checks the file periodically to extract needed dates and put them into the DB/variables. Roughly speaking, I have this pseudocode:
start_of_fin_year = SELECT Date FROM table WHERE Name = 'start_of_fin_year'
The problem I face: my code will break if I or someone else changes the name in the table. How do I prevent this?
FYI this is a personal project that I developed on my own, but I will have to give access to .csv files to others so they can update info. I'm afraid they may accidentally change the names, so that's why I'm worried.

SSIS: how to handle variable length in a Derived Column to avoid truncation

My setup is shown in the attached illustration; I hope it is clear enough. I load all .xls files from a given folder in a loop.
While adding a new Derived Column component to record the FileName from my loop, I got a truncation error. My variable was initially set to CCS.xls (Len=7, the shortest name).
I tried to increase the Length in the Derived Column Editor but could not, because the field is not editable and I cannot type anything there. I then worked out that the original Length came from the variable's value. In the Variables window I have DataType = String and no option to set a length.
So for now I made a dummy empty file with a long name, CCS____1.xls, to avoid the problem, and it works OK. But I would like to learn a better way to avoid it; it looks like in this setup I need to use the file with the longest name for the data connection (?)
You can change the Length property to 50 or larger manually in the Advanced Editor.
Right-click the Derived Column -> Show Advanced Editor -> Input and Output Properties -> Derived Column Output -> Output Columns -> the new column -> Data Type Properties -> Length

How to change Column Delimiter in MS VSTS for web performance test

I am using Microsoft VSTS to performance test a web application.
I am adding a data pool (.csv file) to parameterize multiple values, but the problem is that the .csv file is treated as column delimited, like this:
VariableA,VariableB,Variable3
Test1,Test2,Test3
Test4,Test5,Test6
But I want these multiple values in a single column. Whenever the column delimited type is selected, the .csv file automatically splits all the values into different columns.
In HP LoadRunner we have 3 options [Column, Tab, Space]. I looked in the VSTS data pool settings but could not find any such option.
I am trying to do this:
VariableA
Test1,Test2,Test3
Test5,Test6,Test7
Kindly help me out.
If you want to use Test1,Test2,Test3 in the first iteration and Test5,Test6,Test7 in the second, try the following in your CSV file:
VariableA
"Test1,Test2,Test3"
"Test5,Test6,Test7"
This should consider Test1,Test2,Test3 as a single variable.
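As a quick illustration of why the quotes help, here is a small check using Python's standard `csv` module. It only demonstrates the usual CSV quoting convention; whether the VSTS data source reads the file the same way is an assumption you would need to confirm in your own test.

```python
import csv
import io

data = 'VariableA\n"Test1,Test2,Test3"\n"Test5,Test6,Test7"\n'

rows = list(csv.reader(io.StringIO(data)))
header, values = rows[0], rows[1:]

# Each quoted line is read as one field even though it contains commas.
for row in values:
    print(len(row), row[0])
```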

How to create an SSIS package that creates three text files using the same variables, where each text file is only created when matching data is found?

Only 3 files can be created: "File_1", "File_2" and "File_3". The same variable names are used in each instance, (User::FileDirectory) and (User::File_name), but because the actual value of the variable changes, a new file is created. However, the files are only created if there is data to go into them, i.e. if there are no records to populate a file, it is not created at all. When a file is created, its creation date should also be added to the filename, e.g. File1_22102011.txt
OK, if the above was a little confusing, this is how it works:
All the files use the same variable, but it is reset before each file is created.
• It populates a result set in memory with the first SQL selection (ID number, First_Name and Main_Name) and sets the file variable to "File_1". If there are records in the result set, it creates and writes to this filename.
• Then it creates a new result set with the second selection (Contract No) and sets the variable to "File_2". If there are records in this new result set, a new file is created from the variable (which now has a new value).
• Finally a third result set is created (Contract_no, ExperianNo, Entity_ID_Number, First_Name, Main_Name), and the file variable is set to "File_3". Again if there are records in the result set, then this file will be created and written to.
I have tried a few methods to achieve this but they have all failed, so a little help would be greatly appreciated.
While what you have works, I think it'd be rather painful to maintain.
I would approach it as 3 sequence containers running in parallel. Each container would have a data flow with two file system tasks hanging off it, chosen by precedence constraints on the success of the data flow and the value of a row count variable. If the row count variable is 0, delete the file. If it's greater than 0, rename it to File_n.
As you can see, I have a container for the first file. The data flow creates an output a.txt file. Based on the value of the variable #RowCount1, it will either delete the empty file or rename it to File_1.
Each data flow would look like a source query, a row count transformation and a file destination with a temporary name (a.txt, b.txt, c.txt). As a file is always created, even if it's empty, we will need to delete or rename it afterwards which will be accomplished based on the file operation tasks.
In my opinion, this approach will be cleaner as it will allow you to test and debug each item in a cleaner manner rather than dealing with an in-memory dataset.
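The delete-or-rename decision itself is small. Here is a rough Python sketch of the logic, only to make the branching explicit; in the package it would be done with File System Tasks and precedence constraints rather than code. The function name `finalize_output` is hypothetical, and the ddMMyyyy date suffix is taken from the File1_22102011.txt example in the question.

```python
import os
from datetime import date


def finalize_output(temp_path, final_base, row_count, day):
    """Delete an empty temp file, or rename it to final_base_ddMMyyyy.txt."""
    if row_count == 0:
        # Nothing was written, so the placeholder file should not survive.
        os.remove(temp_path)
        return None
    folder = os.path.dirname(temp_path)
    final_path = os.path.join(
        folder, "%s_%s.txt" % (final_base, day.strftime("%d%m%Y"))
    )
    os.replace(temp_path, final_path)
    return final_path
```

For example, `finalize_output(r"C:\out\a.txt", "File_1", row_count, date.today())` would be called once per container with that container's own row count (the path here is just a placeholder).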

SSIS - Is there a Data Flow Source component that will handle CSV files where the column order may change?

We have written a number of SSIS packages that import data from CSV files using the Flat File Source.
It now seems that after these packages are deployed into production, the providers of these files may deliver files where the column order of the files changes (Don't ask!). Currently if this happens, our packages will fail.
For example, an additional column is inserted at the beginning of each row. In this case, the flat file source continues to use the existing column order, which obviously has a detrimental effect on the transformation!
E.g., in a trivial example, the original file has the following content:
OurReference,Client,Amount
235,ClientA,20000.00
236,ClientB,30000.00
The output from the flat file source is :
OurReference Client Amount
235 ClientA 20000.00
236 ClientB 30000.00
Subsequently, the file delivered changes to :
OurReference,ClientReference,Client,Amount
235,A244,ClientA,20000.00
236,B222,ClientB,30000.00
When the existing unchanged package is run against this file, the output from the flat file source is :
OurReference Client Amount
235 A244 ClientA,20000.00
236 B222 ClientB,30000.00
Ideally, we would like to use a data source that copes with this problem, i.e. one that produces output based on the column names instead of the column order.
Any suggestions would be welcomed!
Not that I know of.
A possibility to check for the problem in advance is to set up two different connection managers, one with a single flat row. This one can read the first row and tell if it's OK or not and abort.
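The header check is simple enough to sketch. This is Python purely for illustration (in the package it would live in a Script Task or Script Component in C# or VB.NET); the `EXPECTED_COLUMNS` list and the `header_matches` name are assumptions based on the example file in the question.

```python
EXPECTED_COLUMNS = ["OurReference", "Client", "Amount"]


def header_matches(first_line, expected=EXPECTED_COLUMNS, delimiter=","):
    """Return True when the header row lists exactly the expected columns, in order."""
    actual = [name.strip() for name in first_line.rstrip("\r\n").split(delimiter)]
    return actual == list(expected)
```

If this returns False for the first row, the package can fail early with a clear message instead of loading shifted data.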
If you want to do the work, you can take it a step further and make that flat one-field row the only connection manager, and use a script component in your flow to parse the row and assign to the columns you need later in the flow.
As far as I know, there is no way to dynamically add columns to the flow at runtime, so all the columns you need will have to be added to the script task output. Whether they can be found and parsed from each line is up to you. Any "new" (i.e. unanticipated) columns cannot be used. For columns which are missing, you could supply a default or throw an exception.
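To make the parse-by-name idea concrete, here is a rough Python sketch of what the script would do with each line: read the header once, build a name-to-position map, then pick out only the columns the rest of the flow expects. The names `NEEDED` and `rows_by_name` are hypothetical, and defaulting missing columns to `None` is just one of the two options mentioned above.

```python
import csv
import io

NEEDED = ["OurReference", "Client", "Amount"]


def rows_by_name(text, needed=NEEDED, default=None):
    """Yield one dict per data row, keyed by column name rather than position."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    position = {name: i for i, name in enumerate(header)}
    for fields in reader:
        row = {}
        for name in needed:
            i = position.get(name)
            # Fall back to the default when the column is absent or the line is short.
            row[name] = fields[i] if i is not None and i < len(fields) else default
        yield row
```

With the changed file from the question, the extra ClientReference column is simply ignored and Client still comes out as ClientA and ClientB.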
A final possibility is to use the SSIS object model to modify the package before running to alter the connection manager - or even to write the entire package dynamically using the object model based on an inspection of the input file. I have done quite a bit of package generation in C# using templates and then adding information based on metadata I obtained from master files describing the mainframe files.
Best approach would be to run a check before the SSIS package imports the CSV data. This may have to be an external script/application, because I don't think you can manipulate data in the MS Business Intelligence Studio.
Here is a rough approach. I will write down the limitations at the end.
Create a flat file source. Put the entire row in one column.
Do not check "Column names in the first data row".
Create a Script Component
Code:
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // The whole input line arrives in a single column.
    string sRow = Row.Column0;
    string sManipulated = string.Empty;
    string[] columns = sRow.Split(',');
    foreach (string column in columns)
    {
        // Append each field, right-padded with spaces to 15 characters.
        sManipulated = string.Format("{0}{1}", sManipulated, column.PadRight(15, ' '));
    }
    /* Note: For sake of demonstration I am padding to 15 chars.*/
    Row.Column0 = sManipulated;
}
Create a flat file destination
Map Column0 to Column0
Limitation: I have arbitrarily padded each field to 15 characters. Points to consider:
1. Do we need to have each field of same size?
2. If yes, what is that size?
A generic way to handle that would be to create a table to store the file name, fields, and field sizes.
Use the file name to dynamically create the source and destination connection manager.
Use the field name and corresponding field size to decide the padding. I'm not sure whether you need this much flexibility. If you have any questions, please respond.
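A minimal sketch of the metadata-driven padding, in Python for brevity. The widths in `FIELD_WIDTHS` are made-up values standing in for whatever the metadata table would hold, and `to_fixed_width` is a hypothetical name; fields without an entry fall back to the 15 characters used in the demonstration above.

```python
FIELD_WIDTHS = {"OurReference": 12, "Client": 20, "Amount": 15}  # hypothetical values


def to_fixed_width(header, fields, widths=FIELD_WIDTHS, default_width=15):
    """Pad each field to the width configured for its column name."""
    return "".join(
        value.ljust(widths.get(name, default_width))
        for name, value in zip(header, fields)
    )
```

Like PadRight, `ljust` only pads and never truncates, so a value longer than its configured width would still push the following fields to the right; that case would need its own handling.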