Dynamically assign value to variable in SSIS

I have an SSIS package where I need to get the date the package last ran from an ADO NET Source, then assign it to a variable so that I can use it in a query for another ADO NET Source. I can't find an example on the Googles that actually works. I'm running VS 2012 and connecting to a SQL Server 2012 instance. If more information is needed, let me know.

Create a variable @User::LastRanDate.
Create an Execute SQL task.
Set the ConnectionType property to ADO.NET.
Set the Connection property to your ADO.NET connection.
Set the SQLStatement property to the statement which will return the date you want. Make sure the first column returned is the date.
Set the ResultSet property to Single row.
On the Result Set tab of the Task editor, hit Add and set the Result Name value to 0 and the Variable Name value to @User::LastRanDate. (ADO.NET result sets are returned as indexed arrays.)
Upon completion of the Task, @User::LastRanDate will now be set to whatever the query returned and you can use it to build up your query for your other ADO.NET source.

Working with parameterized queries in an ADO.NET source in SSIS is not as easy as in an OLE DB one. Basically, you're going to have to build the query with the SSIS expression language and pray your source doesn't lend itself to SQL injection.
How to Pass parameter in ADO.NET Source SSIS
how to pass parameters to an ado.net source in ssis?
I created a package with 3 variables as shown below
Package
Variables
I have LastRunDate as a DateTime and QueryAdo as a String. QueryAdo is evaluated as an expression, the expression being "SELECT RD.* FROM dbo.RunData AS RD WHERE RD.InsertDate > '" + (DT_WSTR, 25) @[User::LastRunDate] + "';"
Execute SQL Task
I create an Execute SQL Task that uses a query and is set to return a single row. I assign this value to my SSIS variable.
On the Result Set tab, I assign the zeroth column to my variable LastRunDate.
Data flow
Note there is an expression here. On the ADO.NET source, I originally used SELECT RD.* FROM dbo.RunData AS RD to get my metadata set.
After I was happy with my data flow, I then went to the control flow and substituted my Query variable in as the expression on the ADO.NET Source component (see the referenced questions).
Try it, try it, you will see
I used the following script to build out my demo environment
create table dbo.RussJohnson
(
LastRunDate datetime NOT NULL
);
create table dbo.RunData
(
SomeValue int NOT NULL
, InsertDate datetime NOT NULL
);
insert into dbo.RussJohnson
SELECT '2014-08-01' AS LastRunDate
INSERT INTO
dbo.RunData
(
SomeValue
, InsertDate
)
SELECT
D.rc AS Somevalue
, dateadd(d, D.rc, '2014-07-30') AS InsertDate
FROM
(
SELECT TOP 15 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rc
FROM sys.all_columns AS SC
) D;
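If you don't have a SQL Server instance handy, the logic of the package can be approximated outside SSIS. The following Python sketch is only an illustration, not part of the package: it assumes an in-memory SQLite database as a stand-in for SQL Server, stores the dates as ISO text, and the helper name build_query is made up for this example. It loads the same demo data, reads the last run date from column 0 of a single-row result, builds the query by string concatenation the way the QueryAdo expression does, and counts the rows that come back.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")  # stand-in for the SQL Server database (assumption)
conn.execute("CREATE TABLE RussJohnson (LastRunDate TEXT NOT NULL)")
conn.execute("CREATE TABLE RunData (SomeValue INTEGER NOT NULL, InsertDate TEXT NOT NULL)")
conn.execute("INSERT INTO RussJohnson VALUES ('2014-08-01 00:00:00')")

# 15 rows, one per day, counting forward from 2014-07-30 like the demo script.
base = datetime(2014, 7, 30)
rows = [(rc, (base + timedelta(days=rc)).strftime("%Y-%m-%d %H:%M:%S"))
        for rc in range(1, 16)]
conn.executemany("INSERT INTO RunData VALUES (?, ?)", rows)

# Execute SQL Task equivalent: single row, column 0 goes into the variable.
last_run_date = conn.execute(
    "SELECT MAX(LastRunDate) FROM RussJohnson").fetchone()[0]


def build_query(last_run: str) -> str:
    """Hypothetical helper mirroring the QueryAdo expression (plain concatenation)."""
    return ("SELECT RD.* FROM RunData AS RD WHERE RD.InsertDate > '"
            + last_run + "';")


# Data flow equivalent: run the generated query and count the rows.
query_ado = build_query(last_run_date)
new_rows = conn.execute(query_ado).fetchall()
row_count = len(new_rows)

# For contrast: the same filter with a bound parameter, which is what the
# ADO.NET source in the data flow does not offer directly.
bound_rows = conn.execute(
    "SELECT RD.* FROM RunData AS RD WHERE RD.InsertDate > ?",
    (last_run_date,)).fetchall()

print(row_count)  # 13
```

With the demo data, only the 13 rows dated after 2014-08-01 come through. The concatenated and the bound versions return the same rows here; the difference only matters when the value being spliced in is not something you control.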
Since I have BIDS Helper installed, I used the following Biml to generate this package as described. For those playing along at home, you will need to edit the third line so that the ADO.NET connection manager is pointing to a valid server and database.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<AdoNetConnection Name="CM_ADO_DB" ConnectionString="Data Source=localhost\dev2014;Integrated Security=SSPI;Connect Timeout=30;Database=tempdb;" Provider="SQL" />
</Connections>
<Packages>
<Package Name="so_25125838" ConstraintMode="Linear">
<Variables>
<Variable DataType="DateTime" Name="LastRunDate" >2014-01-01</Variable>
<Variable DataType="Int32" Name="RowCountOriginal" >0</Variable>
<Variable DataType="String" Name="QueryAdo" EvaluateAsExpression="true">"SELECT RD.* FROM dbo.RunData AS RD WHERE RD.InsertDate > '" + (DT_WSTR, 25) @[User::LastRunDate] + "';"</Variable>
</Variables>
<Tasks>
<ExecuteSQL
Name="SQL GetLastRunDate"
ConnectionName="CM_ADO_DB"
ResultSet="SingleRow"
>
<DirectInput>SELECT MAX(RJ.LastRunDate) AS LastRunDate FROM dbo.RussJohnson AS RJ;</DirectInput>
<Results>
<Result Name="0" VariableName="User.LastRunDate" />
</Results>
</ExecuteSQL>
<Dataflow Name="DFT POC">
<Transformations>
<AdoNetSource Name="ADO_SRC Get New Data" ConnectionName="CM_ADO_DB">
<DirectInput>SELECT RD.* FROM dbo.RunData AS RD</DirectInput>
</AdoNetSource>
<RowCount Name="CNT Original rows" VariableName="User.RowCountOriginal" />
</Transformations>
<Expressions>
<Expression ExternalProperty="[ADO_SRC Get New Data].[SqlCommand]">@[User::QueryAdo]</Expression>
</Expressions>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>

Related

How to use the same SSIS Data Flow with different Date Values?

I have a very straightforward SSIS package containing one data flow which is comprised of an OLEDB source and a flat file destination. The OLEDB source calls a query that takes 2 sets of parameters. I've mapped the parameters to Date/Time variables.
I would like to know how best to pass 4 different sets of dates to the variables and use those values in my query?
I've experimented with the For Each Loop Container using an item enumerator. However, that does not seem to work and the package throws a System.IO.IOException error.
My container is configured as follows:
Note that both variables are of the Date/Time data type.
How can I pass 4 separate value sets to the same variables and use each variable pair to run my data flow?
Setup
I created a table and populated it with contiguous data for your sample set
DROP TABLE IF EXISTS dbo.SO_67439692;
CREATE TABLE dbo.SO_67439692
(
SurrogateKey int IDENTITY(1,1) NOT NULL
, ActionDate date
);
INSERT INTO
dbo.SO_67439692
(
ActionDate
)
SELECT
TOP (DATEDIFF(DAY, '2017-12-31', '2021-04-30'))
DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '2017-12-31') AS ActionDate
FROM
sys.all_columns AS AC;
In my SSIS Package, I added two Variables, startDate and endDAte2018, both of type DateTime. I added an OLE DB Connection manager pointed to the database where I made the above table.
I added a Foreach Loop Container, configured it to use the Item Enumerator and defined the columns there as DateTime as well.
I populated it (what a clunky editor) with the year ranges from 2018 to 2020 as shown and 2021-01-01 to 2021-04-30.
I wired the variables up as shown in the problem definition and ran it as is. No IO error reported.
Once I knew my foreach container was working, the data flow was trivial.
I added a data flow inside the foreach loop with an OLE DB Source using a parameterized query like so
DECLARE @StartDate date, @EndDate date;
SELECT @StartDate = ?, @EndDate = ?;
SELECT *
FROM
dbo.SO_67439692 AS S
WHERE
S.ActionDate >= @StartDate AND S.ActionDate <= @EndDate;
I mapped my two variables in as parameter names of 0 and 1 and ran it.
The setup you described works great. Either there is more to your problem than stated or there's something else misaligned. Follow along with my repro and compare it to what you've built and you should see where things are "off"
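To sanity-check what each iteration should return, the loop can be modelled outside SSIS. This Python sketch is an illustration under assumptions: the sample table is represented as a plain list of dates, the 2018 to 2020 windows are assumed to run from January 1 to December 31 of each year, and rows_in_window is a made-up helper standing in for the parameterized query.

```python
from datetime import date, timedelta

# Same contents as dbo.SO_67439692: one row per day, 2018-01-01 through 2021-04-30.
first, last = date(2018, 1, 1), date(2021, 4, 30)
action_dates = [first + timedelta(days=i) for i in range((last - first).days + 1)]

# The four value sets typed into the Foreach Item Enumerator (assumed boundaries).
windows = [
    (date(2018, 1, 1), date(2018, 12, 31)),
    (date(2019, 1, 1), date(2019, 12, 31)),
    (date(2020, 1, 1), date(2020, 12, 31)),
    (date(2021, 1, 1), date(2021, 4, 30)),
]


def rows_in_window(start: date, end: date) -> list:
    """Hypothetical stand-in for the query: ActionDate >= start AND ActionDate <= end."""
    return [d for d in action_dates if start <= d <= end]


# Each pass of the loop assigns one pair to the two variables and runs the data flow.
counts = []
for start_date, end_date in windows:
    counts.append(len(rows_in_window(start_date, end_date)))

print(counts)  # [365, 365, 366, 120]
```

The four counts add up to the 1216 rows the setup script inserts, so if your flat file row counts differ per iteration, the variable mapping is the first place to look.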

SSRS reports using Report Parameter in two datasets

I'm fixing some really old SSRS reports. I'm doing it by editing the rdl files directly in notepad++ which so far has been working well since the format is easily readable XML.
However, I have run into a problem trying to use the same ReportParameter in two datasets.
I have this in my rdl file:
<ReportParameters>
<ReportParameter Name="Fromdate">
<DataType>DateTime</DataType>
<Prompt>From date:</Prompt>
</ReportParameter>
<ReportParameter Name="Todate">
<DataType>DateTime</DataType>
<Prompt>To date:</Prompt>
</ReportParameter>
</ReportParameters>
This makes it so that before the report is opened, the user enters two dates that are then used in the queries that show data in the report.
The first query is working well with this. It's an Oracle query and looks like this:
<DataSet Name="OracleDS">
<Query>
<DataSourceName>Oracle</DataSourceName>
<QueryParameters>
<QueryParameter Name="Fromdate">
<Value>=Parameters!Fromdate.Value</Value>
<rd:UserDefined>true</rd:UserDefined>
</QueryParameter>
<QueryParameter Name="Todate">
<Value>=Parameters!Todate.Value</Value>
<rd:UserDefined>true</rd:UserDefined>
</QueryParameter>
</QueryParameters>
<CommandText>SELECT myColumn FROM myTable WHERE myDate BETWEEN :Fromdate AND :Todate</CommandText>
</Query>
<Fields>
<Field Name="myColumn">
<DataField>MyColumn</DataField>
<rd:TypeName>System.String</rd:TypeName>
</Field>
</Fields>
</DataSet>
Getting the :Fromdate and :Todate works in this query. However, when I try to do the same in my other query, which is an MSSQL query, I can't open the report because SSRS tells me there's a problem with the query. Here's what the second dataset looks like:
<DataSet Name="MSSQLDS">
<Query>
<DataSourceName>MSSQL</DataSourceName>
<CommandText>SELECT myColumn FROM myOtherTableOnAnotherDatabase WHERE myDate BETWEEN :Fromdate AND :Todate</CommandText>
</Query>
<Fields>
<Field Name="myColumn">
<DataField>myColumn</DataField>
<rd:TypeName>System.String</rd:TypeName>
</Field>
</Fields>
</DataSet>
Seems pretty straightforward. I have even tried switching places between the two datasets in the file, but the Oracle dataset works while the MSSQL dataset doesn't.
I thought maybe I need to add ' around the dates, so I tried some stuff like:
[...]
WHERE myDate BETWEEN ':Fromdate' AND ':Todate'
[...]
WHERE myDate BETWEEN '& :Fromdate &' AND '& :Todate &'
DECLARE @startdate datetime DECLARE @enddate datetime
SET @startdate = CAST(:Fromdate as datetime)
SET @enddate = CAST(:Todate as datetime)
SELECT myColumn FROM myTable
WHERE myDate BETWEEN @startdate AND @enddate
But SSRS keeps telling me that there's an error getting the data for this dataset. It works if I hard code the dates in, like so:
<CommandText>SELECT myColumn FROM myOtherTableOnAnotherDatabase WHERE myDate BETWEEN '2020-01-01' AND '2020-01-31'</CommandText>
How can I use the same report parameters in the command texts of two datasets?
Just as I was about to post my question I realized that I was assuming that the report parameter should be passed with :ParameterName since that's how it was done for the Oracle part (and I'm not super familiar with Oracle syntax since I've been working mostly with MSSQL in the past). So I changed : to @ for the MSSQL query and it worked:
WHERE myDate BETWEEN @Fromdate AND @Todate
D'oh!

BIML PropertyName for "Direct Input"/SqlCommand of Lookup Task

<Dataflow Name="Load Tables">
<Expressions>
<Expression PropertyName="[Lookup].[SqlCommand]">"SELECT * FROM " + @[$User::DBSchema] + ".Table1" </Expression>
</Expressions>
[...]
In reference to:
<Lookup Name="Lookup1" CacheMode="Partial" NoMatchBehavior="RedirectRowsToNoMatchOutput" OleDbConnectionName="abc123">
This is part of a much larger package, but when I try to generate it, it gives me an error:
"Could not resolve reference to '[Lookup1].[SqlCommand]' in property 'Property'. '' is invalid. Provide valid scoped name."
In the .dtsx, the needed Property is called "[Lookup1].[SqlCommand]" (and given the same Expression); when I change it in the .dtsx file manually, it works as expected, but I am at a loss as to how to translate this into Biml, specifically what reference name to use so it knows where to put the expression.
My question is what is the property name to reference the DirectInput/SqlCommand of the Lookup task? I cannot seem to figure it out.
Here is a picture of how it looks in the dtsx when I change it manually:
Note:
I can't put the expression into the Lookup Task directly because the parameters are dynamically passed into the expression e.g.
<Lookup Name="Lookup1" CacheMode="Partial" NoMatchBehavior="RedirectRowsToNoMatchOutput">
<DirectInput>
SELECT * [etc.]
</DirectInput>
I think what you're missing is when you define your Lookup in the data flow, provide a valid query against a default schema. That will allow the engine to derive the properties of the reference table/set and then once the package is emitted the data flow overrides should take over.
<Dataflow Name="Data Flow Task">
<Expressions>
<Expression ExternalProperty="[Lookup].[SqlCommand]">"SELECT *
FROM
(
VALUES (1, 'b')
,(100, 'a')
,(11, 'c')
) D(colRef, colVal)"</Expression>
</Expressions>
<Transformations>
<OleDbSource Name="OLE DB Source" ConnectionName="SourceConnectionOLEDB">
<DirectInput>SELECT 100 aS col
union all select 11</DirectInput>
</OleDbSource>
<Lookup Name="Lookup" OleDbConnectionName="SourceConnectionOLEDB">
<Outputs>
<Column SourceColumn="colVal" TargetColumn="colVal" />
</Outputs>
<Parameters>
<Parameter SourceColumn="col" />
</Parameters>
<Inputs>
<Column SourceColumn="col" TargetColumn="colRef" />
</Inputs>
<DirectInput>SELECT *
FROM
(
VALUES (1, 'b')
,(100, 'a')
) D(colRef, colVal)</DirectInput>
<ParameterizedQuery>select * from (SELECT *
FROM
(
VALUES (1, 'b')
,(100, 'a')
) D(colRef, colVal)) [refTable]
where [refTable].[colRef] = ?</ParameterizedQuery>
</Lookup>
<RowCount Name="Row Count" VariableName="User.Variable" />
</Transformations>
</Dataflow>
In the above snippet, I generate a pair of numbers, 100 and 11, as col and then route them to a Lookup component that has hardcoded values of 1 and 100. Since the Lookup expects every row to match, running it as-is would blow up on the unmatched 11 value.
My Dataflow's ExternalProperty override then injects the "missing" value at run time to save my lookup from blowing up.
In your case,
I didn't attempt to do this for a partial cache, but I can't imagine the syntax will be much different. If it is, I'd fix the code by hand one time and then reverse engineer the package. BimlExpress now provides that functionality for free and it's awesome for answering the "how do I express Y in Biml?" question. Right click on the package and there's a Convert to Biml option (name approximate).
Billinkc's answer is correct. It was, in essence:
use ExternalProperty=[etc], not "PropertyName", and in order for it to work, the Lookup requires a valid query at generation time.

How to use 'Return Value' parameter in Execute SQL task

I know there are 3 types of parameter in 'Parameter mapping' - Input Parameter, Output Parameter and Return Parameter. I understand how to use Input and Output parameter. But when I try to set the parameter type as 'Return Parameter', it doesn't work. Below is my SQL Server stored procedure.
ALTER Procedure [dbo].[spRandomReturn]
As
Begin
Return Convert(int, rand() * 10)
End
In SSIS Execute SQL task, I have set
connection type: OLE DB
parameter mapping: variable name: User::@random (I created an SSIS user variable: random, Int32), Direction: ReturnValue, Type: Numeric, Parameter Name: @random
SQL statement:
Declare @r int = @random EXEC @r = spRandomReturn
I created a return parameter in SSIS, but it doesn't work and throws an error.
Since you're using OLE DB Connection Manager, you need to use the ? to indicate where parameters are.
Thus, your query becomes
EXECUTE ? = [dbo].[spRandomReturn]
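And within your parameter mapping, you'd have your variable mapped with a Direction of ReturnValue and a Parameter Name of 0, as in the Parameter element of the Biml below.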
Reproduction
Biml, the Business Intelligence Markup Language, describes the platform for business intelligence. Here, we're going to use it to describe the ETL. BIDS Helper is a free add-on for Visual Studio/BIDS/SSDT that addresses a host of shortcomings with it. Specifically, we're going to use the ability to transform a Biml file describing ETL into an SSIS package. This has the added benefit of providing you a mechanism for generating exactly the solution I'm describing versus clicking through many tedious dialog boxes.
You can see in the following bit of XML, I create a connection called CM_OLE and this points to localhost\dev2014 at tempdb. You would need to modify this to reflect your environment.
I create a package named so_28419264. This package contains 2 variables. One is Query which contains the first bit of code. The second is ReturnValue which we will use to capture the return value on the Mapping tab. I initialize this one to -1 as the provided stored procedure would never generate a negative value.
I add two Tasks, both Execute SQL Tasks. The second one does nothing, it simply serves as a point for me to put a breakpoint on. The first Execute SQL Task is where we invoke our Stored Procedure and assign the results into our variable
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="CM_OLE" ConnectionString="Data Source=localhost\dev2014;Initial Catalog=tempdb;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;" />
</Connections>
<Packages>
<Package ConstraintMode="Linear" Name="so_28419264">
<Variables>
<Variable DataType="String" Name="Query">EXECUTE ? = [dbo].[spRandomReturn];</Variable>
<Variable DataType="Int32" Name="ReturnValue">-1</Variable>
</Variables>
<Tasks>
<ExecuteSQL ConnectionName="CM_OLE" Name="SQL Demonstrate Return Value">
<VariableInput VariableName="User.Query" />
<Parameters>
<Parameter DataType="Int32" VariableName="User.ReturnValue" Name="0" Direction="ReturnValue" />
</Parameters>
</ExecuteSQL>
<ExecuteSQL ConnectionName="CM_OLE" Name="Put Breakpoint on me">
<DirectInput>SELECT 1;</DirectInput>
</ExecuteSQL>
</Tasks>
</Package>
</Packages>
</Biml>
Results
It works

BIML assigns wrong metadata to NUMBER columns from Oracle

I've successfully created a BIML script on BIDS 2008 with BIDS Helper 1.6.6.0 which automates the creation of SSIS packages to import data from an Oracle database (11g Enterprise Edition Release 11.2.0.3.0 - 64bit) into SQL Server 2008 R2. I am having an issue at package run-time which causes the package to fail at Data Flow validation with:
Warning: The external columns for component "Source" (1) are out of synchronization with the data source columns. The external column "LIMIT_AMOUNT" needs to be updated.
The external column "LIMIT_BASE_AMOUNT" needs to be updated.
The external column "GROSS_BASE_AMOUNT" needs to be updated.
Error: The OLE DB provider used by the OLE DB adapter cannot convert between types "DT_BYTES" and "DT_NUMERIC" for "LIMIT_AMOUNT".
Error: The OLE DB provider used by the OLE DB adapter cannot convert between types "DT_BYTES" and "DT_NUMERIC" for "LIMIT_BASE_AMOUNT".
Error: The OLE DB provider used by the OLE DB adapter cannot convert between types "DT_BYTES" and "DT_NUMERIC" for "GROSS_BASE_AMOUNT".
Error: There were errors during task validation.
Upon inspection, it appears that NUMBER columns without scale and precision in Oracle are mapped to DT_BYTES in the generated SSIS package. The description of the above object (a view) in Oracle is as follows:
Name Null Type
--------------------- ---- ------------
ID NUMBER(12)
CURRENCY VARCHAR2(3)
LIMIT_AMOUNT NUMBER
LIMIT_BASE_AMOUNT NUMBER
GROSS_BASE_AMOUNT NUMBER
STATUS VARCHAR2(15)
Checking in all_tab_columns shows the three NUMBER columns as having a DATA_LENGTH of 22 and NULL DATA_PRECISION and DATA_SCALE.
COLUMN_ID COLUMN_NAME DATA_TYPE DATA_LENGTH DATA_PRECISION DATA_SCALE
---------- ---------------------- ------------- ----------- -------------- ----------
1 ID NUMBER 22 12 0
2 CURRENCY VARCHAR2 3
3 LIMIT_AMOUNT NUMBER 22
4 LIMIT_BASE_AMOUNT NUMBER 22
5 GROSS_BASE_AMOUNT NUMBER 22
6 STATUS VARCHAR2 15
The Oracle documentation states that this is the equivalent of a float
Specify a floating-point number using the following form:
NUMBER
The absence of precision and scale designators specifies the maximum range and precision for an Oracle number.
The workaround so far has been to implement a custom SELECT which casts these fields to the desired type, but that's not very elegant or maintainable. I would like to understand why BIML seems to get the data type mapping wrong, whereas SSIS is able to determine that the metadata is wrong when the package is first opened after it has been created. I get a pop-up in BIDS stating that
The metadata of the following output columns does not match the metadata of the external columns with which the output columns are associated:
Output "Output": "LIMIT_AMOUNT", "LIMIT_BASE_AMOUNT", "GROSS_EXP_BASE_AMOUNT"
Do you want to replace the metadata of the output columns with the metadata of the external columns?
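One way to make the casting workaround less of a maintenance burden is to generate the SELECT from the column metadata instead of writing it by hand. The Python sketch below only illustrates that idea and is not how Biml resolves types: the Column tuple, the build_select helper, the object name MY_VIEW and the target type NUMBER(38,10) are all assumptions chosen for this example. The rows mirror the all_tab_columns output shown above.

```python
from typing import List, NamedTuple, Optional


class Column(NamedTuple):
    """One row of all_tab_columns, reduced to the fields that matter here."""
    name: str
    data_type: str
    precision: Optional[int]
    scale: Optional[int]


def build_select(object_name: str, columns: List[Column],
                 precision: int = 38, scale: int = 10) -> str:
    """Hypothetical helper: cast NUMBER columns that have no precision/scale."""
    parts = []
    for col in columns:
        unconstrained = (col.data_type == "NUMBER"
                         and col.precision is None and col.scale is None)
        if unconstrained:
            # Give the provider an explicit type so it does not fall back to DT_BYTES.
            parts.append(
                f"CAST({col.name} AS NUMBER({precision},{scale})) AS {col.name}")
        else:
            parts.append(col.name)
    return f"SELECT {', '.join(parts)} FROM {object_name}"


columns = [
    Column("ID", "NUMBER", 12, 0),
    Column("CURRENCY", "VARCHAR2", None, None),
    Column("LIMIT_AMOUNT", "NUMBER", None, None),
    Column("LIMIT_BASE_AMOUNT", "NUMBER", None, None),
    Column("GROSS_BASE_AMOUNT", "NUMBER", None, None),
    Column("STATUS", "VARCHAR2", None, None),
]

select_sql = build_select("MY_VIEW", columns)  # MY_VIEW is a placeholder name
print(select_sql)
```

In a BimlScript setting the same loop could live in the C# nugget that already iterates over the staging tables, so the cast list stays in sync with the view definition. The precision and scale would need to be chosen to suit the actual data.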
EDIT: Adding relevant Biml details regarding connections & dataflow
<#
string OraConnectionStr = @"Provider=OraOLEDB.Oracle;Data Source=(In-line TNS);User Id=redacted;Password=redacted;Persist Security Info=True;";
string StagingConnectionStr = "Data Source=SVR;Initial Catalog=DB;Integrated Security=SSPI;Provider=SQLNCLI10;";
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<Connection Name="<#=StagingConnectionName#>"
ConnectionString="<#=StagingConnectionStr#>" />
<Connection Name="<#=OraConnectionName#>"
ConnectionString="<#=OraConnectionStr#>" />
</Connections>
<Packages>
<!-- Assume object stagingTables is populated and methods have been defined -->
<# foreach (DataRow row in stagingTables.Rows) { #>
<Package Name="<#= GetChildPackageName(row) #>"
ConstraintMode="Linear" AutoCreateConfigurationsType="None">
<Tasks>
<Dataflow Name="<#=GetStagingTableDescriptiveName(row)#>" >
<Transformations>
<OleDbSource Name="Source - <#=GetStagingTableDescriptiveName(row)#>"
ConnectionName="<#=OraConnectionName#>"
AlwaysUseDefaultCodePage="true"
DefaultCodePage="1252">
<DirectInput>SELECT * FROM <#=GetOracleObjectName(row)#></DirectInput>
</OleDbSource>
<OleDbDestination Name="Destination - <#=GetStagingTableDescriptiveName(row)#>"
ConnectionName="<#=DataLoadConnectionName#>">
<ExternalTableOutput Table="<#= GetStagingTableObjectName(row) #>" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
<# } #>
</Packages>
</Biml>
Thanks in advance.