Does it make sense to use Apache Airflow to orchestrate/automate ETLs, and subsequently dispatch reports via SSRS or SSIS? Put another way, I'm wondering if I can wire up an Airflow DAG to SSRS or SSIS. The MsSqlOperator Airflow operator looks like it interfaces with SQL Server, but I can't find any reference to a provider for SSRS or SSIS.
This is meant as a very general question. I'm looking for directional guidance rather than code examples (although I'll happily take those). I just want to know whether I'm about to pursue something that won't work or is otherwise a bad idea.
Assuming you're looking to orchestrate the running of a report or running of an SSIS package, yeah sure, that's doable.
SSIS
Assuming you're working with the project deployment model, a package run is a few stored procedure calls in the SSISDB catalog strung together:
create_execution creates an instance of an execution
set_execution_parameter_value allows you to configure values for an instance of that execution
start_execution begins the actual running of the package
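Depending on whether you want the package to run in synchronous or asynchronous (default) mode, you might want to set the SYNCHRONIZED system parameter to 1 through set_execution_parameter_value before starting the execution. In asynchronous mode, start_execution returns as soon as the package has been started, not when it finishes.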
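To make that sequence concrete, here is a minimal sketch in Python that only assembles the T-SQL batch as a string; it does not connect to anything. The function name and the example folder, project, and package names are hypothetical, and it assumes every package parameter is a string (other types would need properly typed values).

```python
def build_ssis_execution_sql(folder, project, package, parameters=None, synchronized=True):
    """Return a T-SQL batch that creates, configures, and starts an SSIS execution."""

    def quote(value):
        # Escape single quotes for a T-SQL Unicode string literal.
        return "N'" + str(value).replace("'", "''") + "'"

    def set_parameter(object_type, name, value_sql):
        return (
            "EXEC [SSISDB].[catalog].[set_execution_parameter_value] "
            "@execution_id = @execution_id, "
            f"@object_type = {object_type}, "
            f"@parameter_name = {quote(name)}, "
            f"@parameter_value = {value_sql};"
        )

    lines = [
        "DECLARE @execution_id BIGINT;",
        "EXEC [SSISDB].[catalog].[create_execution]",
        f"    @folder_name = {quote(folder)},",
        f"    @project_name = {quote(project)},",
        f"    @package_name = {quote(package)},",
        "    @use32bitruntime = 0,",
        "    @execution_id = @execution_id OUTPUT;",
    ]
    if synchronized:
        # object_type 50 marks a system parameter such as SYNCHRONIZED.
        lines.append(set_parameter(50, "SYNCHRONIZED", "1"))
    for name, value in (parameters or {}).items():
        # object_type 30 marks a package parameter (assumption: string values only).
        lines.append(set_parameter(30, name, quote(value)))
    lines.append("EXEC [SSISDB].[catalog].[start_execution] @execution_id = @execution_id;")
    return "\n".join(lines)
```

For example, build_ssis_execution_sql("Finance", "Etl", "Load.dtsx", {"EntityCode": "NHY"}) returns a batch that could be handed to whatever runs SQL for you, such as the MsSqlOperator mentioned in the question.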
SSRS
I'm not sure what you're looking for here, but we call the ReportServer procedure AddEvent to kick off a subscription, which then gets emailed out. I have also seen plenty of questions from people who want to run an SSIS package that pulls a report and exports it to CSV, PDF, etc.
https://businesswintelligence.com/content/26/manually-trigger-ssrs-subscription
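As a rough illustration of the subscription trigger, this Python sketch builds the EXEC statement for a given subscription ID. AddEvent is an internal procedure of the report server database rather than a documented API, so treat this as an assumption to verify against your own instance; the function name and the default database name "ReportServer" are also assumptions.

```python
import uuid


def build_subscription_trigger_sql(subscription_id, database="ReportServer"):
    """Return a T-SQL statement that queues an SSRS subscription for delivery."""
    # uuid.UUID raises ValueError for anything that is not a well-formed GUID,
    # which also keeps arbitrary text out of the generated SQL.
    subscription = uuid.UUID(str(subscription_id))
    return (
        f"EXEC [{database}].dbo.AddEvent "
        f"@EventType = N'TimedSubscription', "
        f"@EventData = N'{subscription}';"
    )
```

The subscription ID is the GUID of the subscription you want to fire; how you look it up depends on your report server setup.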
Docs/Learn
https://learn.microsoft.com/en-us/sql/integration-services/system-stored-procedures/catalog-create-execution-ssisdb-database?view=sql-server-ver16
https://learn.microsoft.com/en-us/sql/integration-services/system-stored-procedures/catalog-set-execution-parameter-value-ssisdb-database?view=sql-server-ver16
https://learn.microsoft.com/en-us/sql/integration-services/system-stored-procedures/catalog-start-execution-ssisdb-database?view=sql-server-ver16
https://andyleonard.blog/2015/11/the-synchronized-ssis-execution-parameter/
I'm struggling to find an easy and simple way to validate the parameters of my SSIS package.
I feel this is a basic feature that any tool allowing parameters should provide: checking that the values are within a specific "allowed" range, and not allowing a user to pass an arbitrary value (or forget to provide a mandatory one).
This matters especially in SSIS, where variable values can be replaced easily (through parameters at call time or through XML configuration files).
I found some "solutions" involving VB6 scripting or constraints on tasks. But all of this feels like workarounds.
What is the best practice for checking the variable/parameter values before executing the main SSIS package tasks?
Example: my package can process data for 3 entities of my company ('NHY', 'JIO', 'NTL') in 2 modes ('Q' and 'M'). The caller specifies which data to process by passing the values when calling the SSIS package. But if the caller specifies an entity that doesn't exist or a mode that isn't supported, I want the package to fail immediately.
Example of command line that should fail BEFORE doing anything:
/FILE "Path\To\File\MyPackage.dtsx" /CHECKPOINTING OFF /REPORTING EWCDI /SET EntityCode;ZZZZZ /SET ProcessMode;Z
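To state the rule precisely, here is a minimal Python sketch of the check I have in mind, written as something a caller could run before invoking the package. It is not an SSIS feature and does not validate inside the package; the function and constant names are hypothetical.

```python
ALLOWED_ENTITIES = {"NHY", "JIO", "NTL"}
ALLOWED_MODES = {"Q", "M"}


def validate_package_arguments(entity_code, process_mode):
    """Raise ValueError listing every problem; return None when both values are valid."""
    errors = []
    if not entity_code:
        errors.append("EntityCode is mandatory")
    elif entity_code not in ALLOWED_ENTITIES:
        errors.append(f"EntityCode '{entity_code}' is not one of {sorted(ALLOWED_ENTITIES)}")
    if not process_mode:
        errors.append("ProcessMode is mandatory")
    elif process_mode not in ALLOWED_MODES:
        errors.append(f"ProcessMode '{process_mode}' is not one of {sorted(ALLOWED_MODES)}")
    if errors:
        # Fail before any real work starts, reporting all problems at once.
        raise ValueError("; ".join(errors))
```

With the command line above, validate_package_arguments("ZZZZZ", "Z") would raise and report both bad values, while ("NHY", "Q") would pass.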
I need to read a column from a database table depending upon some parameter. If the database table has two columns, status and ID, then I have to read the ID if the status is true. Then I have to pass this ID to a C# method.
How can I achieve this in SSIS? So basically my database package will read the data from SQL Server and pass it to a C# method.
SSIS is an ETL tool for moving and transforming large quantities of data. If you need to do a lot of C# work and you only have one record, or a few records, SSIS may not be the right tool for this purpose. You might do better writing an ASP.NET web application or a Windows application. These applications can also use SQL to get data for processing in C#.
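The shape of that plain-application approach (query the IDs whose status is true, then hand each one to a method) can be sketched in a few lines. This example uses Python with an in-memory SQLite database purely as a stand-in for C# and SQL Server; the table name items and the function names are hypothetical.

```python
import sqlite3


def process_id(record_id):
    """Hypothetical stand-in for the method that should receive each ID."""
    return f"processed {record_id}"


def process_active_ids(connection):
    # Read only the IDs whose status flag is true (stored as 1 here).
    rows = connection.execute(
        "SELECT ID FROM items WHERE status = 1 ORDER BY ID"
    ).fetchall()
    return [process_id(row[0]) for row in rows]


def demo():
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute(
            "CREATE TABLE items (ID INTEGER PRIMARY KEY, status INTEGER NOT NULL)"
        )
        connection.executemany(
            "INSERT INTO items (ID, status) VALUES (?, ?)",
            [(1, 1), (2, 0), (3, 1)],
        )
        return process_active_ids(connection)
    finally:
        connection.close()
```

Calling demo() processes IDs 1 and 3 and skips ID 2, whose status is false.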
If you are determined to do this in SSIS and C#, here are two possible approaches:
You could use an Execute SQL Task to perform your query and save the rowset into a variable. Then you would use a C# Script Task to do something with the contents of the variable.
You could create a Data Flow Task. The dataflow should have the structure Source -> Transformation -> Destination, and can include several transformation components.
You would use, for example, an OLE DB Source Component to perform your query. Then you would use a C# Script Component to transform each record that is returned by the query. Finally, you would use an OLE DB Destination Component to do something with the output for each record.
What is the difference between Variables and Parameters in SSIS Denali?
If there is a difference, what can Parameters do that Variables cannot, or vice versa?
When should one go with SSIS Parameters and Variables?
I tried searching on Google, but I failed to get some information.
Thanks In Anticipation!
I think a little background will help in understanding the Parameter concept, so I will explain it by comparison with Variables. To fully grasp Parameters, you might also need to look up the new Project Deployment Model, Environments, and Build Configurations.
Usage Of Variable
With SSIS prior to 2012, if we needed to pass external values to the package before execution (as we do all the time), I would normally use a configuration file (or one of a couple of other ways). Say we have a file server that is used to access a shared file: I would use a variable to store the server name and expose this variable to the configuration file. If the actual file server changes (dev environment to test environment, etc.), we just need to change the value of that variable in the configuration file, and the SSIS package remains intact.
Everything looked good, but there are a couple of things I always wondered about and could never figure out:
100% of the time when I expose a variable to a configuration file, I expose only its "Value" property. Why does SSIS allow all the other variable properties to be exposed?
Why does SSIS not have "private" variables? By "private" I mean that when I choose the variables to configure, the "private" ones would not be shown in the pick list. An SSIS package can have dozens of variables that are internal value holders, so what's the point of exposing them? Why do I have to scroll all the way through to find the only one I need to expose?
New Project Deployment Model
SSIS 2012 introduces a new deployment model, the Project Deployment Model. In short, this model deploys an SSIS project as a single unit to the SQL Server SSIS catalog, and package configuration is NOT available in this model (it is available in the old model, now referred to as the Package Deployment Model; with SSIS 2012 you can choose which one to use, and 2012 defaults to the new model).
If we want to pass values into SSIS packages, we have to pass them in via Parameters and use the SSIS catalog in SSMS to configure the parameter values (only the value; there is nothing else to configure). Parameters and connection managers are exposed automatically in the SSIS catalog and can be configured there; nothing else that was previously available via configuration files can be configured in the Project Deployment Model (the world is much cleaner). Inside an SSIS package, parameters can be used in the same way as variables when building expressions. However, parameters can NOT be modified within the SSIS package, which makes perfect sense. (Why would we need to change a value that is passed in from outside? If we have to, copy the value to a variable and make the changes there.)
Sum Up
Parameters are only available in the Project Deployment Model, and they provide the only mechanism for passing values from outside into SSIS packages in this model. If we think of an SSIS package as an OO class, Parameters can be thought of as public properties, which outside callers can access and assign values to (the class itself can and will use them, but cannot modify them), whereas Variables can be thought of as private variables, which are used internally and which the outside world does not need to know anything about.
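The class analogy can be written down directly. This is only an illustration of the analogy in Python, not SSIS code; the class, attribute, and server names are made up for the example.

```python
class Package:
    """Toy model of a package: one parameter, one internal variable."""

    def __init__(self, server_name):
        # "Parameter": supplied by the caller when the package is created.
        self._server_name = server_name
        # "Variable": internal state that the package is free to change.
        self._rows_loaded = 0

    @property
    def server_name(self):
        # Read-only from the outside and never reassigned inside the class.
        return self._server_name

    def run(self, batch_sizes):
        for size in batch_sizes:
            self._rows_loaded += size  # variables can change while running
        return f"loaded {self._rows_loaded} rows from {self._server_name}"
```

Here the caller chooses server_name up front, the package reads it but never changes it, and _rows_loaded is a value holder the caller never needs to see.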
For the old Package Deployment model, there is no Parameter, and the world remains the same.
FYI, in short: a variable's value can be changed at runtime, but a parameter's cannot. Parameters support project deployment and can be set in the SSISDB catalog, while variables cannot.
There are many differences between variables and parameters; a few of them are listed below:
Variable values can change at run time, but parameter values cannot.
Variables can be used only within the package that defines them, not by other packages in the solution, whereas project parameters can be used by multiple packages in the same project (the packages listed under the project in Solution Explorer).
Variables and parameters are similar to their counterparts in Java.
We pass values to a method or task in the form of parameters and use them within that task; we can't change those values, since they are external to the method. Similarly, in SSIS, project parameters are used to set certain variables or connections dynamically in the package, whereas variables are internal to the package.
It works like this:
Say you have a project parameter called ServerName.
Let's say you deploy an SSIS package to two Integration Services catalog environments, one configured for the prod server and the other for the test server.
Then your ServerName parameter will be set to the prod server address in the prod environment and to the test server address in the test environment.
If a variable in your SSIS package needs a runtime value (say the variable is used to set a connection at run time for the prod or test server), then the variable will use the parameter above to find the right server to connect to.
So parameters are usually needed in environment specific scenarios.
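A small sketch of that flow in Python: each environment supplies its own ServerName, and a value computed inside the package is derived from it. The server addresses, the database name, and the function name are invented for the example, and the dictionary only stands in for catalog environments.

```python
# Hypothetical stand-in for two catalog environments and their parameter values.
ENVIRONMENTS = {
    "prod": {"ServerName": "prod-sql.example.com"},
    "test": {"ServerName": "test-sql.example.com"},
}


def build_connection_string(environment, database="Staging"):
    """Derive an internal value (the 'variable') from the environment's parameter."""
    server_name = ENVIRONMENTS[environment]["ServerName"]  # the parameter, read-only
    # The connection string plays the role of a variable computed at run time.
    return f"Data Source={server_name};Initial Catalog={database};Integrated Security=SSPI;"
```

The same package logic runs in both places; only the parameter value supplied by the environment differs.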
There are two types of parameters based on how you've configured your solution in Visual Studio: Project parameters or Package parameters. Project parameters are accessible to all packages in the project.
Parameters are used to send data into the package from outside, such as usernames, passwords, or connection strings. Variables are used inside the package: you define a variable in one of your SSIS packages and use it at package level.
Parameters in SSIS are like global constants in programming, so if they are defined at project level they can be used anywhere in the project. Their widest scope is the whole project.
Variables, as the name suggests, can be assigned, for example from a query or even from a parameter. Their widest scope is the package.
Rather than generate my RS reports by directly accessing a SQL database, I'd like to take advantage of Domain Objects I've already written in another application, where complex business rules and calculations already exist so that I don't have to duplicate that logic in stored procedures and other code. I want to keep it DRY.
It would be nice to treat the reporting concern as just another type of view.
Is that possible with Reporting Services? It seems logical that it should be, but I'm not finding much information out there.
Yes. You can use the ReportViewer control in Local Processing mode. In this mode, you can just pass a DataSource instead of directly accessing the Database.
Keep in mind that there are certain things that you cannot do in local mode that you can in server mode. One that I recall is exporting to anything other than PDF or Excel.