How to decouple variable names in external files and the code?

Imagine I have an external file dates.csv in the following format:
Name               Date
start_of_fin_year  01.03.2022
end_of_fin_year    28.02.2023
Obviously, the file may get updated in the future, and the date may change. I create a piece of code that checks the file periodically to extract needed dates and put them into the DB/variables. Roughly speaking, I have this pseudocode:
start_of_fin_year = SELECT Date FROM table WHERE Name = 'start_of_fin_year'
The problem I face: my code will break if I or someone else changes the name in the table. How do I prevent this?
FYI this is a personal project that I developed on my own, but I will have to give access to .csv files to others so they can update info. I'm afraid they may accidentally change the names, so that's why I'm worried.
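One possible guard, sketched in Python: declare the names the code depends on in a single place, validate the file on every load, and fail with a clear message if a name is missing. The sketch assumes a comma-separated file with a Name,Date header and dd.mm.yyyy dates; REQUIRED_NAMES and load_dates are hypothetical names, not part of any existing code.

```python
import csv
import io
from datetime import datetime

# The names the code relies on, declared once (hypothetical constant).
REQUIRED_NAMES = ("start_of_fin_year", "end_of_fin_year")


def load_dates(csv_file):
    """Read Name/Date rows and fail loudly if an expected name is missing."""
    found = {}
    for row in csv.DictReader(csv_file):
        # Tolerate stray whitespace and capitalisation in hand-edited names.
        name = row["Name"].strip().lower()
        found[name] = datetime.strptime(row["Date"].strip(), "%d.%m.%Y").date()

    missing = [name for name in REQUIRED_NAMES if name not in found]
    if missing:
        raise ValueError(f"dates.csv is missing expected names: {missing}")
    return found


# Usage with an in-memory stand-in for dates.csv.
sample = io.StringIO(
    "Name,Date\n"
    "start_of_fin_year,01.03.2022\n"
    "end_of_fin_year,28.02.2023\n"
)
dates = load_dates(sample)
start_of_fin_year = dates["start_of_fin_year"]
```

A renamed key then surfaces as an explicit error at load time instead of a silent wrong value later; it does not stop anyone from editing the names, it only makes the breakage visible.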

Related

Using the file name to determine if a SSIS package should be executed

I've got 4 different companies whose spreadsheets I have to process. Currently, when I receive a set of files, the SSIS solution executes all of its packages whether or not I received the corresponding files. I want to change this behavior by looking at the Source directory that stores the .xlsx files.
For instance, I have the files from two companies: A & B. The file names will always be in this format, with the company's name at the beginning of the file name.
From there, I would like to use the file names to set variables (not sure if this is the best way) that drive precedence constraints, so that only the packages related to the companies whose files we received are executed.
Based on the screenshots, I want the packages of Company A & B to execute while C & D are skipped, and then to move on to the next package.
Any suggestions on how to achieve this goal?

One way would be to create a boolean variable for each company.
#CompanyA_Recd, #CompanyB_Recd, #CompanyC_Recd, #CompanyD_Recd
Your foreach loop would loop through the files, analyze each file name, and mark the appropriate variable as true if you received that company's files.
Then your precedence constraints could just check the variable for that company.
Note that, depending on the logic you want for executing the Final Report Package, this means you might need two precedence constraints for each company. One that goes to the package if the variable is true, and one that goes straight to the final package if the variable is false.
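Outside of SSIS, the same flag-per-company logic can be sketched in a few lines of Python. This is only an illustration of the idea, not SSIS code: the company prefixes, the file names, and the received_flags helper are all assumptions made for the example.

```python
from pathlib import Path
import tempfile

# Hypothetical file-name prefixes, one per company.
COMPANIES = ("CompanyA", "CompanyB", "CompanyC", "CompanyD")


def received_flags(source_dir):
    """Return {company: True/False} depending on whether a file arrived."""
    flags = {company: False for company in COMPANIES}
    for path in Path(source_dir).glob("*.xlsx"):
        for company in COMPANIES:
            if path.name.startswith(company):
                flags[company] = True
    return flags


# Usage: simulate a Source folder that holds files from companies A and B only.
with tempfile.TemporaryDirectory() as source_dir:
    for name in ("CompanyA_2023.xlsx", "CompanyB_2023.xlsx"):
        (Path(source_dir) / name).touch()
    flags = received_flags(source_dir)

for company, received in flags.items():
    print(company, "run package" if received else "skip")
```

Each flag plays the role of one boolean package variable; the final loop stands in for the precedence constraints.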

Folder Structure In SSIS for Output file and naming convention of Output file

This is a general question that may have several valid answers.
If you are familiar with this scenario, please share the best solution.
I have 100 customers who send us files to load into our database.
After computation, the output file needs to be saved in an Output folder.
The output file should be named customer_name.txt, where customer_name
comes from a column in the input file.
How should I design this in SSIS to achieve this?

First, use a Foreach Loop Container to iterate over the input files.
For each file, load the data into your import table in the database, along with the file name or customer name.
Then, when you need to produce the output file:
Get the file name or customer name with a query and save it into an SSIS variable.
Create your file with 'DestinationVariable' set to 'User::yourVariableName'.
Add the data from your database, then use a simple script to move the file into the Output folder.
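To make the flow concrete, here is a rough Python sketch of the same idea without the database step: loop over the input files, group rows by the customer name column, and write one customer_name.txt per customer into the Output folder. It assumes every input file is a CSV with the same columns, including one called customer_name; the function name and folder layout are hypothetical.

```python
import csv
from collections import defaultdict
from pathlib import Path
import tempfile


def write_customer_files(input_dir, output_dir, name_column="customer_name"):
    """Group rows from every input CSV by customer and write <customer>.txt."""
    rows_by_customer = defaultdict(list)
    for path in sorted(Path(input_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                rows_by_customer[row[name_column]].append(row)

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for customer, rows in rows_by_customer.items():
        target = Path(output_dir) / f"{customer}.txt"
        with target.open("w", newline="") as handle:
            # Assumes all rows share the columns of the first row.
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(target.name)
    return sorted(written)


# Usage with a temporary folder standing in for the real input/Output folders.
with tempfile.TemporaryDirectory() as tmp:
    input_dir = Path(tmp) / "in"
    input_dir.mkdir()
    (input_dir / "batch1.csv").write_text(
        "customer_name,amount\nacme,10\nglobex,20\nacme,30\n"
    )
    created = write_customer_files(input_dir, Path(tmp) / "Output")
    acme_text = (Path(tmp) / "Output" / "acme.txt").read_text()
```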

What should be the appropriate name of a log file

I want to log my exceptions in a file. What will be the name of the file?
Error-ddmmyyyy.log
Log-ddmmyyyy.err
Log-ddmmyyyy.txt
or anything else?

If the date is important, I would use the yyyymmdd format: it makes it easier to get a sorted list. I would add hhmmss if relevant.
A .log suffix works well for me.
I would add the name of the command raising the exceptions as a prefix of the log file name.
Something like: myCommand-20100315-114235.log
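A small Python sketch of that naming scheme, assuming the command name and timestamp are passed in by the caller (log_file_name is a hypothetical helper):

```python
from datetime import datetime


def log_file_name(command, when):
    """Build '<command>-yyyymmdd-hhmmss.log' so names sort chronologically."""
    return f"{command}-{when:%Y%m%d-%H%M%S}.log"


name = log_file_name("myCommand", datetime(2010, 3, 15, 11, 42, 35))
print(name)
```

Because the date fields run from most to least significant, a plain alphabetical listing of one command's logs is also a chronological one.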

There are many ways to name your log files; you have to consider several factors:
Is your file generated by a specific server or application in a group? Then you should add the name of the server so you know where it comes from.
Example:
Server1.log
A log file can contain many levels of logging. If you want, you can configure different files for different levels.
Example
Server1.info.log
Server1.err.log
You should add a date if your application runs for many days or if you want to keep track of errors for future reference. Add it at the start of the name in yyyyMMdd format, on Windows or Linux, so the files are sorted by date automatically in command-line listings, or add it at the end if you want them grouped by name:
Server1.info.log.20100315 (Linux)
Server1.info.20100315.log (Win)
You can try with different combinations, it all depends on what sorting style and archiving style you want to achieve.
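As an illustration of the per-level files mentioned above, here is a minimal sketch using Python's standard logging module. The server name, date string, and build_logger helper are assumptions for the example; the file names follow the Server1.info.20100315.log pattern.

```python
import logging
import tempfile
from pathlib import Path


def build_logger(server, log_dir, day):
    """Route records below ERROR to the .info file and ERROR and above to the .err file."""
    logger = logging.getLogger(server)
    logger.setLevel(logging.INFO)
    logger.propagate = False

    info_handler = logging.FileHandler(Path(log_dir) / f"{server}.info.{day}.log")
    # A plain callable is accepted as a filter; keep only non-error records.
    info_handler.addFilter(lambda record: record.levelno < logging.ERROR)

    err_handler = logging.FileHandler(Path(log_dir) / f"{server}.err.{day}.log")
    err_handler.setLevel(logging.ERROR)

    logger.addHandler(info_handler)
    logger.addHandler(err_handler)
    return logger


# Usage: write one record of each kind, then read the two files back.
with tempfile.TemporaryDirectory() as log_dir:
    logger = build_logger("Server1", log_dir, "20100315")
    logger.info("service started")
    logger.error("disk full")
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    info_text = (Path(log_dir) / "Server1.info.20100315.log").read_text()
    err_text = (Path(log_dir) / "Server1.err.20100315.log").read_text()
```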

Is there any way to reorder fields in an SSIS flat file source?

I have an SSIS package using a tab delimited flat file source with a TON of fields. Recently the provider of the tab delimited flat file has decided to change the format of the flat file by sprinkling a couple dozen new fields at random into the file. Needless to say, this hosed the package.
Rather than rebuild another flat file source and redefine all the fields, types, and lengths all over again, is there a way to reorder the fields in the flat file source? Sure would have been nice if Microsoft allowed you to move the fields around in the Advanced Columns pane, but noooooo.
Any help is appreciated.

If you only need to add columns to your file, you can do that in the Flat File connection editor. In the advanced window, you can select the field next to the new one and click the chevron next to the New button. It will give you the choice insert before or insert after.
If you truly have to move things around, you'll need to edit the XML source. If you use the existing file definition as a guide, you can build the new one in Excel or T-SQL relatively easily. Easier than typing everything in all over again at least.

I had a similar issue: I needed to change the order of columns in my flat file destination. The time-saving approach I settled on:
Delete the FF destination and FF connection manager (note down file name/location!),
Clear the check boxes that enable output columns in the source component
Re-enable the columns in the order you want
Add a new FF destination and FF connection right from the FF destination's connection manager drop-down.
Review/sanity check column sizes in FF connection, as usual
Not a direct answer to the question, but I came here looking for advice on "how to rearrange flat file destination columns", perhaps this will help someone.

I haven't seen a solution for that problem. SSIS isn't very strong at changing metadata. You could try to do it in Notepad, but that is tricky and error-prone. I would not recommend it.

In the Connection Managers pane at the bottom of your IDE, you can double-click your file connection and edit everything you want.

This is still a "feature" of SSIS. To work around this I create a flat file connection called "NULL" with a single column named "NULL". Use the "New" button to add the column. I change the default column name from "Column 0" to "NULL". This column name must not match any column name in the list to be re-populated. If you have a real column named "NULL", pick something else for the column name that's not in use. You can keep the "NULL" flat file connection in the project for later use. (I expect to need it a few more times in this project.)
For this example, I use a flat file destination. Change the Flat File Destination to use the NULL connection.
Check the mapping to see there are no columns mapped. Saving this resets the metadata stored for the mapping.
Finally, change the Flat File Destination back to the correct connection to get a new mapping without metadata interference.
My example is a flat file destination. It should also work for resetting the metadata of a flat file source. It is similar to the trick of changing a query to "select 1 as [NULL]" and back to purge metadata when using an ODBC source or similar.

You could probably try something like this, though I haven't tested it: use expressions to set everything for your flat file source, and turn design-time validation off.

SSIS - Is there a Data Flow Source component that will handle CSV files where the column order may change?

We have written a number of SSIS packages that import data from CSV files using the Flat File Source.
It now seems that after these packages are deployed into production, the providers of these files may deliver files where the column order of the files changes (Don't ask!). Currently if this happens, our packages will fail.
For example, an additional column is inserted at the beginning of each row. In this case, the flat file source continues to use the existing column order, which obviously has a detrimental effect on the transformation!
E.g., using a trivial example, the original file has the following content:
OurReference,Client,Amount
235,ClientA,20000.00
236,ClientB,30000.00
The output from the flat file source is :
OurReference Client Amount
235 ClientA 20000.00
236 ClientB 30000.00
Subsequently, the file delivered changes to :
OurReference,ClientReference,Client,Amount
235,A244,ClientA,20000.00
236,B222,ClientB,30000.00
When the existing unchanged package is run against this file, the output from the flat file source is :
OurReference Client Amount
235 A244 ClientA,20000.00
236 B222 ClientB,30000.00
Ideally, we would like to use a data source that copes with this problem, i.e. one that produces output based on the column names instead of the column order.
Any suggestions would be welcomed!

Not that I know of.
A possibility to check for the problem in advance is to set up two different connection managers, one with a single flat row. This one can read the first row and tell if it's OK or not and abort.
If you want to do the work, you can take it a step further and make that flat one-field row the only connection manager, and use a script component in your flow to parse the row and assign to the columns you need later in the flow.
As far as I know, there is no way to dynamically add columns to the flow at runtime, so all the columns you need will have to be added to the script component output. Whether they can be found and parsed from each line is up to you. Any "new" (i.e. unanticipated) columns cannot be used. For columns which are missing, you could use a default or throw an exception.
A final possibility is to use the SSIS object model to modify the package before running to alter the connection manager - or even to write the entire package dynamically using the object model based on an inspection of the input file. I have done quite a bit of package generation in C# using templates and then adding information based on metadata I obtained from master files describing the mainframe files.
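The header-driven parsing described above (read each row as raw text, then assign values to a fixed set of output columns by name) can be sketched outside SSIS in Python. This is an illustration of the technique, not SSIS code; WANTED and rows_by_name are hypothetical names, and the sample data mirrors the question.

```python
import csv
import io

# The fixed set of columns the downstream flow needs.
WANTED = ("OurReference", "Client", "Amount")


def rows_by_name(text_stream):
    """Yield only the wanted columns, located by header name rather than position."""
    reader = csv.DictReader(text_stream)
    missing = [name for name in WANTED if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"input is missing columns: {missing}")
    for row in reader:
        # Unanticipated columns such as ClientReference are simply ignored.
        yield {name: row[name] for name in WANTED}


old_layout = "OurReference,Client,Amount\n235,ClientA,20000.00\n"
new_layout = "OurReference,ClientReference,Client,Amount\n235,A244,ClientA,20000.00\n"

old_rows = list(rows_by_name(io.StringIO(old_layout)))
new_rows = list(rows_by_name(io.StringIO(new_layout)))
```

Both layouts produce the same rows, because the mapping follows the header instead of the column position.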

Best approach would be to run a check before the SSIS package imports the CSV data. This may have to be an external script/application, because I don't think you can manipulate data in the MS Business Intelligence Studio.
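Such a pre-import check could be as small as the following Python sketch, which reads only the first line of the file and compares it with the layout the package was built for. EXPECTED_HEADER and header_matches are hypothetical names, and a plain comma split is assumed to be enough for the header row.

```python
import io

# The column order the existing package expects (taken from the question's example).
EXPECTED_HEADER = ("OurReference", "Client", "Amount")


def header_matches(text_stream, expected=EXPECTED_HEADER):
    """Read only the first line and report whether it has the expected layout."""
    first_line = text_stream.readline().rstrip("\r\n")
    actual = tuple(name.strip() for name in first_line.split(","))
    return actual == tuple(expected), actual


ok, _ = header_matches(io.StringIO("OurReference,Client,Amount\n235,ClientA,20000.00\n"))
changed_ok, changed_header = header_matches(
    io.StringIO("OurReference,ClientReference,Client,Amount\n235,A244,ClientA,20000.00\n")
)
```

A wrapper script could run this before starting the package and stop with an error when the check fails.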

Here is a rough approach. I will write down the limitations at the end.
Create a flat file source. Put the entire row in one column.
Do not check Column names in first data row.
Create a Script Component
Code:
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // The whole input line arrives in a single column (Column0).
    // Note: a plain Split does not handle commas inside quoted values.
    string[] columns = Row.Column0.Split(',');
    var padded = new System.Text.StringBuilder();
    foreach (string column in columns)
    {
        // For the sake of demonstration, pad every field to 15 characters.
        padded.Append(column.PadRight(15, ' '));
    }
    Row.Column0 = padded.ToString();
}
Create a flat file destination
Map Column0 to Column0
Limitation: I have arbitrarily padded each field to 15 characters. Points to consider:
1. Do we need to have each field of same size?
2. If yes, what is that size?
A generic way to handle that would be to create a table to store the file name, fields, and field sizes.
Use the file name to dynamically create the source and destination connection manager.
Use the field name and corresponding field size to decide the padding. I'm not sure whether you need this much flexibility. If you have any questions, please respond.
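The table-driven padding suggested above might look like this in Python. The widths in FIELD_WIDTHS are made up for the example and stand in for the table of field names and sizes; to_fixed_width is a hypothetical helper.

```python
# Hypothetical layout table: field name -> fixed width.
FIELD_WIDTHS = {
    "OurReference": 12,
    "Client": 10,
    "Amount": 10,
}


def to_fixed_width(line, header, widths):
    """Pad each comma-separated field to the width registered for its column."""
    values = line.split(",")
    # ljust only pads; values longer than their width are left untouched.
    return "".join(value.ljust(widths[name]) for name, value in zip(header, values))


header = ["OurReference", "Client", "Amount"]
fixed = to_fixed_width("235,ClientA,20000.00", header, FIELD_WIDTHS)
```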
