What are good resources to learn advanced SSIS? - ssis

Please recommend books/blogs to learn advanced SSIS (2008).

There is good stuff over at SSIS Advanced Techniques; for ETL best practices in general, check out the Kimball Group (and related publications).

I would recommend this blog post, which describes an SSIS package design pattern for loading a data warehouse.
It describes in detail a solution for dividing your SSIS packages into the 3 phases (Extract, Transform, Load) so that you can control and run each of them individually if necessary.
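As a rough picture of that idea (this is not the design from the blog post, only a minimal sketch in Python rather than SSIS), the three phases can be kept as separate units that hand data to each other through a staging area, so any one of them can be re-run alone. The `staging` dictionary, the sample rows and the function names are all made up for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical in-memory staging area shared between the phases.
staging: Dict[str, List[dict]] = {}


def extract() -> None:
    """Phase 1: copy raw source rows into staging without changing them."""
    staging["raw"] = [{"id": 1, "name": " alice "}, {"id": 2, "name": "BOB"}]


def transform() -> None:
    """Phase 2: clean the staged rows; reads only from staging."""
    staging["clean"] = [
        {"id": row["id"], "name": row["name"].strip().title()}
        for row in staging["raw"]
    ]


def load() -> None:
    """Phase 3: write the cleaned rows to the (simulated) warehouse."""
    staging["warehouse"] = list(staging["clean"])


PHASES: Dict[str, Callable[[], None]] = {
    "extract": extract,
    "transform": transform,
    "load": load,
}


def run(phases: Iterable[str] = ("extract", "transform", "load")) -> None:
    """Run the named phases in the given order."""
    for name in phases:
        PHASES[name]()
```

Calling `run()` executes everything, while `run(["transform", "load"])` repeats only the later phases against whatever was already extracted, which is the kind of control the post is after.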
There are also many other great blogs about SSIS at http://sqlblog.com, so go there and search for SSIS.
Another advanced SSIS resource could be the enhancement to the existing Slowly Changing Dimension Wizard in SSIS, called the SSIS Dimension Merge SCD Component.

Related

Is there an automation tool to convert BIML to SSIS Package?

I have hundreds of BIML scripts and I have to convert each one into an SSIS package. The only process I have figured out is to manually right-click the BIML file and click Generate SSIS Package. (Please follow the link to visualize it.) How do I automate this process? In other words, how can I programmatically convert all the BIML scripts into their corresponding SSIS packages?
https://www.google.com/url?sa=i&url=http%3A%2F%2Fwww.erikhudzik.com%2Ftag%2Fssis%2F&psig=AOvVaw3vHH8scEdHu5w-JUDrHyLi&ust=1657797254349000&source=images&cd=vfe&ved=0CAkQjRxqFwoTCLDFmZfe9fgCFQAAAAAdAAAAABAR
You should be able to select multiple files, right click and generate them all at the same time.
You can also reference one biml script from another. So you can have your main entry point which contains a <packages> element and then reference other scripts within that which define each package.
Finally, if you have biml studio, this comes with a command line utility which would allow you to do it programmatically.
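If you go the command-line route, the batch part could look roughly like the Python sketch below. It only finds the .biml files and puts one command line together per file. The real executable path and its arguments are not given in this thread, so `command_prefix` is a placeholder you would have to fill in from the tool's own documentation, and `dry_run` defaults to printing the commands instead of running them.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_commands(root: Path, command_prefix: Sequence[str]) -> List[List[str]]:
    """Return one command line per .biml file found under root (recursively).

    command_prefix is a placeholder: the compiler path plus whatever
    arguments the tool actually requires (an assumption, not taken
    from any documentation).
    """
    return [
        list(command_prefix) + [str(path)]
        for path in sorted(root.rglob("*.biml"))
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the batch at the first failing build.
            subprocess.run(cmd, check=True)
```

Something like `run_all(build_commands(Path("."), ["<compiler>", "<options>"]))` would then list what would be executed before you commit to a real run.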
There are 3 tools for transforming Biml into DTSX packages: BimlExpress, BimlStudio (formerly known as Mist) and the precursor to BimlExpress, BidsHelper. That product no longer exists, and the Biml bits in it have been rebranded.
Under the covers, BimlStudio is going to invoke Bimlc.exe which is the Biml Compiler and that is how scripts become packages. Buy it outright or rent it monthly, depending on your needs. This is your only choice for unattended/automatic/automated builds.
BimlExpress is the free tool that can also transform scripts into packages. It requires mouse clicks to build packages.
The big difference between the two, for the beginner at least, is convenience. Say I have ScriptedPackageA and ScriptedTablesB, which together make Package1. In BimlStudio, I can set the properties so that one is live (tables) and just evaluate/expand the Package script. In BimlExpress, I need to shift/control click the scripts I want to be compiled/referenced.
Also, if you have hundreds of Biml scripts... you might not have understood the idea behind Biml. For reference, I have about 7000 .biml files on my machine, but I bet I have fewer than 30 describing package patterns. The only reason I have such a large number of Biml files is that I have scripted a number of databases with a file per table.
Generally speaking, you want to distill your approach down to distinct patterns and then throw your metadata against it. How many ways can you have a package that loads from file to database?
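To make the "one pattern, lots of metadata" point concrete, here is a minimal sketch in Python rather than in Biml itself. The table list and the package naming scheme are invented for illustration, and the generated XML is a simplified, Biml-like fragment, not a complete script that would compile as-is.

```python
from typing import Dict, List
from xml.sax.saxutils import quoteattr

# Hypothetical metadata: in practice this would come from a database
# catalog or a configuration table, not a hard-coded list.
TABLES: List[Dict[str, str]] = [
    {"schema": "dbo", "table": "Customer"},
    {"schema": "dbo", "table": "Orders"},
    {"schema": "sales", "table": "Invoice"},
]


def render_package(meta: Dict[str, str]) -> str:
    """Apply the single package pattern to one row of metadata."""
    name = f"Load_{meta['schema']}_{meta['table']}"
    return (
        f"  <Package Name={quoteattr(name)}>\n"
        "    <!-- pattern body: the same tasks for every table -->\n"
        "  </Package>"
    )


def render_all(tables: List[Dict[str, str]]) -> str:
    """Render one package element per table inside a single container."""
    body = "\n".join(render_package(t) for t in tables)
    return f"<Packages>\n{body}\n</Packages>"


if __name__ == "__main__":
    print(render_all(TABLES))
```

Adding a table means adding a row of metadata, not another script; the pattern itself stays in one place.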

MS BI Warehouse project

Is there any link or zip file where I could get a whole MS BI warehouse project (sample) from start to end? (2008)
Ideally it would cover incremental loads and possibly creating cubes too, as well as the kinds of problems one faces in real-world projects.
I could find things on YouTube in parts but couldn't link them together. Please help.
Rohan
I think the best reference implementation for the MS BI stack is Project REAL. According to Microsoft:
In Project REAL we are creating a reference implementation of a
business intelligence (BI) system using real large-scale data from a
real customer. The goal is to discover the best practices for creating
BI systems with SQL Server 2005 and to build a system that exhibits as
many of those best practices as we can. This project is not just a
demo —we are creating this system for ongoing operation. It is a
complete system, including daily incremental updates of the data,
large multiuser workloads, and system monitoring.
It contains:
A set of instructions for setting up the environment
Guidance on how to explore the implementation
A sample relational data warehouse database (a subset of the Project REAL data warehouse)
A sample source database (from which we pull incremental updates)
SSIS packages that implement the ETL operations
An SSAS cube definition and scripts for processing the cube from the sample warehouse
Sample SSRS reports
Sample data mining models for predicting out-of-stock conditions in stores
Sample client views in briefing books for the ProClarity and Panorama BI front-end tools
You can download it here - http://www.microsoft.com/download/en/details.aspx?id=12134
You can get the AdventureWorks database and data warehouse (with the cube) here: http://msftdbprodsamples.codeplex.com/
I'm not sure about the SSIS packages, though.

Branching and merging SSIS/SSAS Projects

I have a data warehousing solution formed of a series of databases, SSIS packages and an SSAS database. The SSIS packages and SSAS database all sit within source control using Team Foundation Server.
What I'd like to be able to do is branch the SSAS and SSIS projects to enable us to work on multiple streams of work and then be able to merge the projects back in prior to release to a production environment.
TFS allows me to branch my projects with little effort; however, merging them back together afterwards results in trawling through pages and pages of difficult-to-consume XML.
How are other people dealing with this situation? Are there any tools available on the market to deal with exactly these situations?
As documented in this blog post by Jamie Thomson, SSIS files are effectively binary files, so they should be treated as non-mergeable.
http://consultingblogs.emc.com/jamiethomson/archive/2007/08/06/SSIS_3A00_-Team-Development-Experiences.aspx
He also recommends making packages as modular as possible if you want to have multiple team members working on the same project - this is something we've adopted.
There is a tool called BIDS Helper which provides a 'smart diff' for SSIS files which can be useful for determining changes between versions.
http://bidshelper.codeplex.com/wikipage?title=Smart%20Diff&referringTitle=Documentation
But, generally, SSIS files should be treated as non-mergeable if you want to avoid hours of pain. We've switched on exclusive check-out for all .dtsx files in TFS so that people don't tread on each other's toes.

Stored procedures in SSIS Packages

We are doing a huge data migration project using SSIS packages. We have been instructed not to use stored procedures in SSIS packages. Can you please suggest whether we should be using stored procedures in SSIS packages or not? What are the advantages of using stored procedures?
It is true that MERGE statements can easily be used from SSIS, and your directive to encapsulate everything in SSIS is not necessary: SQL processes aggregations faster than SSIS, for example. Further, if you are not deploying to SSISDB and do not have proper logging wrappers or email alerts, troubleshooting your ETL via SQL Agent is going to be more difficult, as the errors are frequently more cryptic; hence SSISDB and its reports in 2012. SSIS can be extremely powerful, however.
Here is a fairly blatant benchmark that will tell you never to use the out-of-the-box SCD in SSIS. Task Factory, however, does have a nice deployable component that basically does merges behind the scenes.
SSIS has more powerful functions than Stored Procedures.
However you can easily use Execute T-SQL Statement tasks in SSIS for existing tasks, and then build out from there.
SSIS is superior at the vast majority of ETL.
The following is from Microsoft:
Microsoft Integration Services is a platform for building enterprise-level data integration and data transformations solutions. You use Integration Services to solve complex business problems by copying or downloading files, defining business logic, sending e-mail messages in response to events, updating data warehouses, cleaning and mining data, and managing SQL Server objects and data. The packages can work alone or in concert with other packages to address complex business needs. Integration Services can extract and transform data from a wide variety of sources such as XML data files, flat files, and relational data sources, and then load the data into one or more destinations.
Integration Services includes a rich set of built-in tasks and transformations; tools for constructing packages; and the Integration Services service for running and managing packages. You can use the graphical Integration Services tools to create solutions without writing a single line of code; or you can program the extensive Integration Services object model to create packages programmatically and code custom tasks and other package objects.
A stored procedure in SQL Server is a group of one or more Transact-SQL statements or a reference to a Microsoft .NET Framework common language runtime (CLR) method. Stored procedures can be called from within SSIS in just the same way as an unencapsulated SQL statement. For more information, please see: http://msdn.microsoft.com/en-us/library/ms190782(v=sql.110).aspx

DTS xChange tool

We have a large collection of DTS packages that need to be converted to SSIS as part of a SQL upgrade. How effective is this tool compared to the wizard? Some of the functionality that is available in DTS:
Import/Export
SQL operations
Copying/Renaming/moving files
ActiveX scripts (not complex; most of the business functionality is in stored procedures)
Any help in sharing documentation, web links, or any insight would be much appreciated.
You can find a full comparison here: http://www.pragmaticworks.com/products/business-intelligence/dtsxchange/DTSxChange-vs-MSWizard.htm
At a high level, the existing Microsoft wizard does not handle some common tasks like the Dynamic Properties Task. It also doesn't handle things like ODBC or all the flat file conditions. The DTS xChange tool will migrate pretty much all conditions, re-engineer the logging and auditing framework of the package, and turn on some of the new features in SSIS. It also includes BI xPress, which will help migrate ActiveX scripts post-migration with code snippets.