I need to build an ETL sync in SSIS that adds new data from a relational database to the data warehouse (DW) database. It should run every six hours.
How do I do it?
Which components do I need?
Where can I find more material about it on the Internet?
You'll need to build an SSIS package using Business Intelligence Development Studio (which is installed as part of the SQL Server development tools). You'll then need to execute that package on a server, using a scheduling mechanism of some kind (a SQL Server Agent job, an enterprise scheduler, cron, or a Windows scheduled task).
The Microsoft website has good information about SSIS. I would also suggest reading about ETL & Data Warehousing in books by Ralph Kimball or Bill Inmon.
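To make the "add only new data on each run" idea concrete, here is a minimal sketch of a watermark-based incremental load. It is plain Python with in-memory SQLite databases standing in for the source and the DW, not SSIS itself, and the table names (`orders`, `dw_orders`, `etl_watermark`) are made up for illustration. In SSIS the same pattern would usually be an Execute SQL Task that reads the watermark followed by a Data Flow Task, with the scheduler starting the package every six hours.

```python
import sqlite3


def make_demo_databases():
    """Create hypothetical source and DW databases in memory."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
    source.commit()

    dw = sqlite3.connect(":memory:")
    dw.execute("CREATE TABLE dw_orders (id INTEGER PRIMARY KEY, amount REAL)")
    # Single-row table that remembers the highest source id loaded so far.
    dw.execute("CREATE TABLE etl_watermark (last_id INTEGER NOT NULL)")
    dw.execute("INSERT INTO etl_watermark VALUES (0)")
    dw.commit()
    return source, dw


def incremental_load(source, dw):
    """Copy source rows above the stored watermark into the DW.

    Returns the number of rows copied in this run.
    """
    (last_id,) = dw.execute("SELECT last_id FROM etl_watermark").fetchone()
    rows = source.execute(
        "SELECT id, amount FROM orders WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    if rows:
        dw.executemany("INSERT INTO dw_orders (id, amount) VALUES (?, ?)", rows)
        # Advance the watermark to the last id that was just loaded.
        dw.execute("UPDATE etl_watermark SET last_id = ?", (rows[-1][0],))
        dw.commit()
    return len(rows)


if __name__ == "__main__":
    source, dw = make_demo_databases()
    print("first run copied:", incremental_load(source, dw))
    print("second run copied:", incremental_load(source, dw))
```

The sketch assumes the source has an ever-increasing key. If rows can also be updated in place, a last-modified timestamp column or change tracking would be needed instead of an id watermark.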
I know this isn't a new question, but has anybody found documentation and/or the SQL scripts and SSIS ETL used to create AdventureWorksDW (2014 at least)?
I'm not sure why Microsoft released so much AdventureWorks material for Analysis Services ( https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks-analysis-services ) but nothing for SSIS.
Any help will be much appreciated.
The AdventureWorksDW objects (tables and procedures) ship as part of an executable that installs the fact and dimension tables into the target SQL Server instance.
This dimensional model was never built with SSIS, since the data is static and loaded once. You may want to create SSIS artifacts on top of this DW; there are plenty of samples and workflows available online for reference.
One such repository on Git is - Repo link
We are planning to migrate an Oracle 10g database to SQL Server 2008 R2. At the moment nothing is implemented in the target database and this will give us the opportunity to change and improve the existing schema during the migration.
Not only the data, but also stored procedures and views have to be imported.
I have already worked with SSIS and found it an excellent product for data manipulation. A colleague mentioned SSMA for the migration. However, after some research on the net, it seems that SSMA is suitable mainly for data migration and conversion, while SSIS provides a wider set of functionality (tasks, custom scripts, etc.).
What are the pros and cons of the two products, and which one would best fit the task?
I would recommend a hybrid approach. Use SSMA to convert the schema and objects from Oracle to SQL Server. Then improve or change the objects as you see fit on the SQL Server end. Once you're satisfied with your new schema, use SSIS to move the data still waiting on the Oracle side into the new schema waiting for it on SQL Server.
As for a quick comparison of SSMA and SSIS... SSIS is by far superior for the ETL aspects of moving data from one place to another; but I wouldn't necessarily recommend it for the creation/modification phase of what you describe above. I think you'll find that process much easier with SSMA. On the flip side SSMA doesn't offer much in the way of transformation during the copy process.
I would go for a hybrid of the two.
Did you know you can trigger SSMA from the command line? This way you can execute the SSMA migration as part of the SSIS solution.
You can also save your SSMA project as an SSIS package.
Once migrated, keep doing the extra work with SSIS.
Is there a link or zip file where I could get a complete sample MS BI warehouse project (2008), from start to finish?
Ideally it would cover incremental loads and, if possible, cube creation too, along with the kinds of problems one faces in real-world projects.
I could find bits and pieces on YouTube but couldn't tie them together. Please help.
Rohan
I think the best reference implementation for the MS BI stack is Project REAL. According to Microsoft:
In Project REAL we are creating a reference implementation of a
business intelligence (BI) system using real large-scale data from a
real customer. The goal is to discover the best practices for creating
BI systems with SQL Server 2005 and to build a system that exhibits as
many of those best practices as we can. This project is not just a
demo —we are creating this system for ongoing operation. It is a
complete system, including daily incremental updates of the data,
large multiuser workloads, and system monitoring.
It contains:
A set of instructions for setting up the environment
Guidance on how to explore the implementation
A sample relational data warehouse database (a subset of the Project REAL data warehouse)
A sample source database (from which we pull incremental updates)
SSIS packages that implement the ETL operations
An SSAS cube definition and scripts for processing the cube from the sample warehouse
Sample SSRS reports
Sample data mining models for predicting out-of-stock conditions in stores
Sample client views in briefing books for the ProClarity and Panorama BI front-end tools
You can download it here - http://www.microsoft.com/download/en/details.aspx?id=12134
You can get the AdventureWorks database and data warehouse (with the cube) here: http://msftdbprodsamples.codeplex.com/
I'm not sure about the SSIS packages, though.
We are doing a huge data migration project using SSIS packages. We were instructed not to use stored procedures in SSIS packages. Can you please suggest whether we should be using stored procedures in SSIS packages or not? What are the advantages of using stored procedures?
It is correct that MERGE statements can easily be used in SSIS, and your directive to encapsulate everything in SSIS is not necessary; SQL processes aggregations faster than SSIS, for example. Further, if you are not deploying to SSISDB and do not have proper logging wrappers or email alerts, troubleshooting your ETL through SQL Server Agent is going to be harder, because the errors are frequently more cryptic. That is why SSISDB and its reports were introduced in 2012. SSIS can be extremely powerful, however.
Here is a fairly blunt benchmark that will tell you never to use the out-of-the-box SCD (Slowly Changing Dimension) transformation in SSIS. Task Factory, however, does have a nice deployable component that basically performs merges behind the scenes.
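As a rough illustration of what "doing the merge in SQL" means, here is a small sketch of a set-based upsert from a staging table into a dimension table. It uses Python with an in-memory SQLite database purely so it can be run anywhere; the table names (`staging_customer`, `dim_customer`) are hypothetical, and the two statements stand in for what a single T-SQL MERGE (called from an Execute SQL Task or a stored procedure) would do on SQL Server. It overwrites changed attributes in place (Type 1 behaviour) and does not keep history.

```python
import sqlite3


def make_demo_db():
    """Create a hypothetical dimension table and a staging table with changes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE staging_customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
    # Customer 2 has a changed name, customer 3 is new.
    conn.executemany("INSERT INTO staging_customer VALUES (?, ?)", [(2, "Robert"), (3, "Carol")])
    conn.commit()
    return conn


def merge_staging_into_dim(conn):
    """Upsert staging rows into the dimension using two set-based statements."""
    # Step 1: overwrite names of existing customers whose staged name differs.
    conn.execute(
        "UPDATE dim_customer "
        "SET name = (SELECT s.name FROM staging_customer s "
        "            WHERE s.customer_id = dim_customer.customer_id) "
        "WHERE EXISTS (SELECT 1 FROM staging_customer s "
        "              WHERE s.customer_id = dim_customer.customer_id "
        "                AND s.name <> dim_customer.name)"
    )
    # Step 2: insert staged customers that are not in the dimension yet.
    conn.execute(
        "INSERT INTO dim_customer (customer_id, name) "
        "SELECT s.customer_id, s.name FROM staging_customer s "
        "WHERE NOT EXISTS (SELECT 1 FROM dim_customer d "
        "                  WHERE d.customer_id = s.customer_id)"
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_demo_db()
    merge_staging_into_dim(conn)
    print(conn.execute("SELECT * FROM dim_customer ORDER BY customer_id").fetchall())
```

Both statements operate on whole sets of rows, which is the usual reason this approach outperforms a row-by-row transformation.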
SSIS has more powerful functions than stored procedures.
However, you can easily use Execute T-SQL Statement tasks in SSIS for existing tasks, and then build out from there.
SSIS is superior for the vast majority of ETL work.
The following is from Microsoft:
Microsoft Integration Services is a platform for building enterprise-level data integration and data transformations solutions. You use Integration Services to solve complex business problems by copying or downloading files, defining business logic, sending e-mail messages in response to events, updating data warehouses, cleaning and mining data, and managing SQL Server objects and data. The packages can work alone or in concert with other packages to address complex business needs. Integration Services can extract and transform data from a wide variety of sources such as XML data files, flat files, and relational data sources, and then load the data into one or more destinations.
Integration Services includes a rich set of built-in tasks and transformations; tools for constructing packages; and the Integration Services service for running and managing packages. You can use the graphical Integration Services tools to create solutions without writing a single line of code; or you can program the extensive Integration Services object model to create packages programmatically and code custom tasks and other package objects.
A stored procedure in SQL Server is a group of one or more Transact-SQL statements or a reference to a Microsoft .NET Framework common language runtime (CLR) method. Stored procedures can be called from within SSIS in the same way as an unencapsulated SQL statement. For more information, please see: http://msdn.microsoft.com/en-us/library/ms190782(v=sql.110).aspx
I have a lot of data in MS Access and need tools to analyze it. Can you suggest any tools for data mining and analysis (OLAP)?
Support for Access (and other various non-SQL Server data sources) will be included in the upcoming SQL Server 2008 R2 release (this release is focusing on self-service BI). You can follow how the project is progressing at http://blogs.msdn.com/gemini.
It depends on your data volumes and the complexity of the relationships that you want to investigate:
(1) Moderate volumes with low-complexity relationships - use queries, pivot tables, graphs and reports in MS Access.
(2) High volumes and/or high-complexity relationships - consider upsizing to SQL Server and using the more grown-up data cubes (OLAP), stored procedures, etc.
A possible solution is Excel 2010 with the new PowerPivot add-in.
It really depends on the type of analysis needed.
Federico
I guess your best bet would be to import your data into SQL Server using SQL Server Integration Services - should be pretty straightforward and painless.
Once the data is in SQL Server, you have Analysis Services at your disposal, which gives you all these capabilities for OLAP analysis.
I don't think there's much for MS Access directly.
Marc
If it is not too much data, import it into Excel and use the pivot table functionality.
If it is too much for that then SQL Server is the way to go.
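For anyone unsure what a pivot table actually computes, here is a minimal sketch in plain Python. It groups rows by one field, spreads a second field across columns, and sums a third. The field names (`region`, `year`, `amount`) and the sample rows are invented for the example; Excel or an OLAP cube does the same kind of aggregation, just interactively and at larger scale.

```python
from collections import defaultdict


def pivot_sum(rows, row_key, col_key, value_key):
    """Return {row_value: {col_value: summed_value}} for a list of dict rows."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    # Convert the nested defaultdicts to plain dicts for easier printing/comparison.
    return {rk: dict(cols) for rk, cols in table.items()}


if __name__ == "__main__":
    # Hypothetical rows, e.g. exported from an Access table.
    sales = [
        {"region": "North", "year": "2009", "amount": 10.0},
        {"region": "North", "year": "2009", "amount": 5.0},
        {"region": "North", "year": "2010", "amount": 7.0},
        {"region": "South", "year": "2009", "amount": 3.0},
    ]
    for region, by_year in pivot_sum(sales, "region", "year", "amount").items():
        print(region, by_year)
```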
An alternative OLAP solution is to use icCube to connect directly to your MS Access file.