How to do dynamic column mapping in SSIS

I have a large number of flat files to migrate to SQL Server. Each file has a different column layout based on the metadata defined. Please tell me how I can create a single package to migrate all the files?

If this is a one-time import, you can let the import/export wizard create the package for you.
If you need a more lasting solution, you can use Biml, which generates packages from the available metadata at package-generation time.
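Since Biml is plain XML, the metadata-driven generation can be bootstrapped by any small program that writes a .biml file. Below is a hypothetical C# sketch that emits one FlatFileFormat per flat file from a metadata table; the dbo.FileMetadata table, its columns, and the connection string are all assumptions for illustration, not anything Biml prescribes.

```csharp
// Hypothetical sketch: emit a Biml FlatFileFormat per flat file from a
// home-grown metadata table (dbo.FileMetadata: FileName, ColumnName,
// DataType, Length). All names here are assumptions, not Biml requirements.
using System.Data.SqlClient;
using System.IO;
using System.Text;

class BimlEmitter
{
    static void Main()
    {
        var biml = new StringBuilder();
        biml.AppendLine("<Biml xmlns=\"http://schemas.varigence.com/biml.xsd\">");
        biml.AppendLine("  <FileFormats>");

        using (var con = new SqlConnection("Server=.;Database=Meta;Integrated Security=SSPI;"))
        using (var cmd = new SqlCommand(
            "SELECT FileName, ColumnName, DataType, Length " +
            "FROM dbo.FileMetadata ORDER BY FileName", con))
        {
            con.Open();
            string current = null;
            using (var rdr = cmd.ExecuteReader())
            {
                while (rdr.Read())
                {
                    string file = rdr.GetString(0);
                    if (file != current)   // start a new flat-file format block
                    {
                        if (current != null)
                            biml.AppendLine("      </Columns>\n    </FlatFileFormat>");
                        biml.AppendLine($"    <FlatFileFormat Name=\"{file}\" ColumnNamesInFirstDataRow=\"true\">");
                        biml.AppendLine("      <Columns>");
                        current = file;
                    }
                    // DataType values are assumed to already be Biml type names.
                    biml.AppendLine($"        <Column Name=\"{rdr.GetString(1)}\" " +
                                    $"DataType=\"{rdr.GetString(2)}\" Length=\"{rdr.GetInt32(3)}\" Delimiter=\",\" />");
                }
            }
            if (current != null)
                biml.AppendLine("      </Columns>\n    </FlatFileFormat>");
        }

        biml.AppendLine("  </FileFormats>");
        biml.AppendLine("</Biml>");
        File.WriteAllText("GeneratedFormats.biml", biml.ToString());
    }
}
```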

Related

Adding many output columns to script component

I have a Data Flow with an OLE DB Source, a Script Component (Transformation), and a Flat File Destination.
The OLE DB Source has 100+ columns. The Script Component is going to clean up the data in each column and then output it to the Flat File Destination.
Adding the output columns by hand in the Script Component is unthinkable to me.
What options do I have to mirror the output columns with the input columns in the Script Component? While the output column names will be the same, I plan to change the datatype from DT_STR to DT_WSTR.
Thank you.
You are out of luck here. Possible scenarios:
Either you use the Script Component and key in all the columns and their properties manually; in your case, you also have to set the proper datatype.
Or you can create your own custom component, which can be programmed to create output columns based on the input columns (a rough sketch follows this answer). It is not easy and I cannot recommend a simple guideline, but it can be done.
This might make sense if you have to repeat similar operations in many places, so that it is not a one-time task.
You can create a Biml script that creates a package based on metadata. However, the metadata (the list of columns and their datatypes) has to be prepared before running the Biml script, or you have to do some tricks to get it during script execution. Again, some proficiency with Biml is essential.
So, for a one-time job and little experience with Biml, I would go for the pure manual approach.
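For the custom-component option, here is a rough sketch of what that could look like, assuming the SSIS pipeline API (Microsoft.SqlServer.Dts.Pipeline). It mirrors every attached input column as a DT_WSTR output column; the class name is illustrative, and the actual row copying and conversion in ProcessInput is omitted.

```csharp
// Sketch only: mirror input columns as DT_WSTR output columns in a custom
// SSIS pipeline component. ProcessInput (where rows would be copied and
// converted) is omitted for brevity.
using Microsoft.SqlServer.Dts.Pipeline;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[DtsPipelineComponent(DisplayName = "Mirror Columns As WSTR",
                      ComponentType = ComponentType.Transform)]
public class MirrorColumnsComponent : PipelineComponent
{
    public override void ProvideComponentProperties()
    {
        base.RemoveAllInputsOutputsAndCustomProperties();
        ComponentMetaData.InputCollection.New().Name = "Input";

        IDTSOutput100 output = ComponentMetaData.OutputCollection.New();
        output.Name = "Output";
        output.SynchronousInputID = 0;   // asynchronous: we re-create the columns
    }

    public override void OnInputPathAttached(int inputID)
    {
        base.OnInputPathAttached(inputID);
        IDTSInput100 input = ComponentMetaData.InputCollection.GetObjectByID(inputID);
        IDTSOutput100 output = ComponentMetaData.OutputCollection[0];
        output.OutputColumnCollection.RemoveAll();

        // One DT_WSTR output column per upstream column, same name and length
        // (string inputs assumed, as in the question).
        foreach (IDTSVirtualInputColumn100 vcol in
                 input.GetVirtualInput().VirtualInputColumnCollection)
        {
            IDTSOutputColumn100 col = output.OutputColumnCollection.New();
            col.Name = vcol.Name;
            col.SetDataTypeProperties(DataType.DT_WSTR, vcol.Length, 0, 0, 0);
        }
    }
}
```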

SSIS staging truncate warehouse

Daily we get data in Excel format. We load the data into staging, then in an SSIS package
we take Excel as the connection manager, perform transformations, and move the data to the warehouse.
Since we are taking the data from Excel only, why create a stage and truncate it,
when Excel is the source and every manipulation is done within the package? Can someone please
explain a real-world scenario? I have seen many websites and couldn't understand what the concept is all about:
staging, source (Excel), lookup, target (warehouse).
Why create a stage, since everything is being done in the SSIS package anyway?
The staging area is mainly used to extract data quickly from the data sources, minimizing the impact on those sources. After the data has been loaded into the staging area, it is used to combine data from multiple sources and to perform transformations, validations, and data cleansing.
You can use a staging design pattern:
Incremental load
Truncate Insert
Using Delimiters with HashBytes for Change Detection
You can also read about the package design pattern for loading a data warehouse. A sketch of the HashBytes change-detection idea follows below.
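To make the HashBytes pattern concrete, here is a minimal C# sketch of the delimiter-plus-hash idea (the same computation T-SQL's HASHBYTES does server-side, and a common thing to compute in a Script Component). The column values are placeholders.

```csharp
// Change detection: hash the delimited concatenation of the tracked columns
// and compare against the hash stored with the warehouse row.
using System;
using System.Security.Cryptography;
using System.Text;

static class RowHash
{
    // The delimiter prevents "ab"+"c" and "a"+"bc" from hashing identically.
    public static string Compute(params string[] columns)
    {
        string payload = string.Join("|", columns);
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(payload));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}
```

A row is updated only when, say, RowHash.Compute(name, city, postalCode) differs from the hash saved on the previous load.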

How to use Format files in SSIS Data Flow task?

I am able to migrate data between two SQL Server tables easily using an SSIS Data Flow Task. Can I use format files to specify the columns to choose from the source and destination? If so, can you give me an example?
In our current system, the source and destination tables are not always the same. We were using SQL-DMO with format files so far and are now upgrading to SSIS.
Thanks in advance for your suggestions.
You can look up info on how to create a format file here: http://msdn.microsoft.com/en-us/library/ms191516.aspx
Google "SSIS Bulk Insert Task" to find more on that.
I would recommend using a Data Flow if you can, because it can eliminate columns from the source that do not exist in the destination, and it can outperform bulk inserts. It's worth considering (see the sketch after this answer).
Mark
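As a concrete illustration of the column-selection point above (not the author's code), here is a minimal ADO.NET sketch using SqlBulkCopy: only explicitly mapped columns are copied, so source columns that do not exist in the destination are simply dropped. Connection strings and table/column names are placeholders.

```csharp
// Sketch: copy only the columns both tables share; LegacyCode exists in the
// source but not the destination, so it is deliberately left unmapped.
using System.Data;
using System.Data.SqlClient;

class SelectiveCopy
{
    static void Main()
    {
        using (var src = new SqlConnection("Server=.;Database=SourceDb;Integrated Security=SSPI;"))
        using (var dst = new SqlConnection("Server=.;Database=DestDb;Integrated Security=SSPI;"))
        {
            src.Open();
            dst.Open();

            using (var cmd = new SqlCommand(
                "SELECT Id, Name, City, LegacyCode FROM dbo.Customers", src))
            using (IDataReader reader = cmd.ExecuteReader())
            using (var bulk = new SqlBulkCopy(dst) { DestinationTableName = "dbo.Customers" })
            {
                bulk.ColumnMappings.Add("Id", "Id");
                bulk.ColumnMappings.Add("Name", "Name");
                bulk.ColumnMappings.Add("City", "City");
                bulk.WriteToServer(reader);
            }
        }
    }
}
```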
Here is a post where I just finished answering my own question; I thought I would link the two posts together:
SSIS - Export multiple SQL Server tables to multiple text files

Best practice to organize a 200+ tables import project

This question is going to be a purely organizational question about SSIS project best practice for medium sized imports.
So I have a source database which is continuously being enriched with new data. Then I have a staging database into which I sometimes load the data from the source database, so I can work on a copy of the source database while migrating the current system. I am currently using an SSIS Visual Studio project to import this data.
My issue is that I have realised the actual design of my project is not really optimal, and now I would like to move this project to SQL Server so I can schedule the import instead of running the Visual Studio project manually. That means the actual project needs to be cleaned up and optimized.
So basically, for each table, the process is simple: truncate table, extract from source and load into destination. And I have about 200 tables. Extractions cannot be parallelized as the source database only accepts one connection at a time. So how would you design such a project?
I read in the Microsoft documentation that they recommend using one Data Flow per package, but managing 200 different packages seems quite impossible, especially since I will have to chain them for the scheduled import. On the other hand, a single package with 200 Data Flows seems unmanageable too...
Edit 21/11:
The first approach I wanted to use when starting this project was to extract my tables automatically by iterating over a list of table names. This could have worked out well if my source and destination tables had all had the same schema object names, but since the source and destination databases are from different vendors (Btrieve and Oracle), they also have different naming restrictions. For example, Btrieve does not reserve names and allows names longer than 30 characters, which Oracle does not. So that is how I ended up manually creating 200 data flows with semi-automatic column mapping (most were automatic).
When generating the CREATE TABLE queries for the destination database, I created a reusable C# library containing the methods to generate the new schema object names, in case the methodology could be automated. If there were a custom tool to generate the packages that could use an external .NET library, then this might do the trick.
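Purely as an illustration of that approach, a hypothetical sketch of such a name-mapping helper might look like this; the reserved-word list is a tiny sample, not Oracle's full list, and the method name is made up.

```csharp
// Hypothetical name-mapping helper in the spirit of the library described
// above: uppercase, replace spaces, dodge reserved words, and respect
// Oracle's 30-character identifier limit.
using System;
using System.Collections.Generic;

public static class OracleNameMapper
{
    private const int MaxLength = 30;   // Oracle identifier limit (pre-12.2)

    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "DATE", "LEVEL", "SIZE", "USER", "COMMENT" };

    public static string Map(string btrieveName)
    {
        string name = btrieveName.ToUpperInvariant().Replace(' ', '_');
        if (Reserved.Contains(name))
            name += "_COL";                       // avoid reserved words
        if (name.Length > MaxLength)
            name = name.Substring(0, MaxLength);  // a real library must also de-duplicate truncated names
        return name;
    }
}
```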
Have you looked into BIDS Helper's BIML (Business Intelligence Markup Language) as a package generation tool? I've used it to create multiple packages that all follow the same basic truncate-extract-load pattern. If you need slightly more cleverness than what's built into BIML, there's BimlScript, which adds the ability to embed C# code into the processing.
From your problem description, I believe you'd be able to write one BIML file and have that generate two hundred individual packages. You could probably use it to generate one package with two hundred data flow tasks, but I've never tried pushing SSIS that hard.
You can basically create 10 child packages, each having 20 Data Flow Tasks, and create a master package which triggers these child packages. Using parent-to-child configuration, create a single XML configuration file. Define precedence constraints in the master package so the child packages execute serially. This way, maintainability will be better than with 200 packages or a single package with 200 Data Flow Tasks.
The following link may be useful to you:
Single SSIS Package for Staging Process
Hope this helps!

SSIS - 2008 - Use a single config table for multiple copies of the same package

I am somewhat new to SSIS.
I have to deliver a 'generic' SSIS package that the client will make multiple copies of, deploying and scheduling each copy for a different source database. I have a single SSIS configuration table in a separate common database, and I would like to use this single configuration table for all connections. However, the challenge is with the configuration filter. When the client makes a copy of my package, it will have the same configuration filter as the others. I would like to give the client an option to change the configuration filter before deploying, because for the new copy the source database can be different. I do not see an option to control this.
Is there a way to change the configuration filter from outside the package (without editing the executable .dtsx file)? Or is there a better approach that I can follow? I would prefer not to use XML configuration files, the primary reason being that my packages are deployed to SQL Server.
Any help would be greatly appreciated.
-Shahul
Your preferred solution does not align well with the way that SSIS package configurations are typically used. See Jamie Thomson's answer to a similar question on the MSDN forums.
I have created a package with the same requirements for my company. It loads data from different sources into different destinations, based on individual configurations for each instance. It is used as an internal ETL.
We have adapters that connect to different sources and pass data to a common staging table in XML format, and the IETL package loads this data into different tables depending on a number of different settings.
In other words, multiple instances of the same SSIS package can be executed with different configurations. You are on the right track: this can be achieved by using SQL Server to hold the configurations and an XML config file to hold the connection information for the database that contains those configurations. When an instance of the package executes, it loads the default values configured with the package, but then needs to update all variables to reflect the purpose of the new instance.
I have created a Windows app to configure these instances and their settings in the database, to make it really easy for the client or consultant to configure them without actually opening the package. A sketch of that idea follows below.
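A hedged sketch of what such a configuration app might do: update one instance's rows in the standard [SSIS Configurations] table that the package configuration wizard generates. The connection manager name "Source", the filter value, and the connection strings are placeholders.

```csharp
// Point one package instance (identified by its configuration filter) at a
// different source database by rewriting its row in the shared
// [SSIS Configurations] table. Names and paths are illustrative.
using System.Data.SqlClient;

class InstanceConfigurator
{
    static void SetSourceConnection(string configFilter, string newConnectionString)
    {
        const string sql =
            @"UPDATE [dbo].[SSIS Configurations]
              SET    ConfiguredValue = @value
              WHERE  ConfigurationFilter = @filter
                AND  PackagePath = '\Package.Connections[Source].Properties[ConnectionString]'";

        using (var con = new SqlConnection("Server=.;Database=CommonConfigDb;Integrated Security=SSPI;"))
        using (var cmd = new SqlCommand(sql, con))
        {
            cmd.Parameters.AddWithValue("@value", newConnectionString);
            cmd.Parameters.AddWithValue("@filter", configFilter);
            con.Open();
            cmd.ExecuteNonQuery();
        }
    }
}

// e.g. SetSourceConnection("CustomerA_Copy", "Data Source=srvA;Initial Catalog=SrcA;Integrated Security=SSPI;");
```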