<Dataflow Name="Load Tables">
<Expressions>
<Expression PropertyName="[Lookup].[SqlCommand]">"SELECT * FROM " + @[$User::DBSchema] + ".Table1" </Expression>
</Expressions>
[...]
In reference to:
<Lookup Name="Lookup1" CacheMode="Partial" NoMatchBehavior="RedirectRowsToNoMatchOutput" OleDbConnectionName="abc123">
This is part of a much larger package, but when I try to generate it, it gives me an error:
"Could not resolve reference to '[Lookup1].[SqlCommand]' in property 'Property'. '' is invalid. Provide valid scoped name."
In the .dtsx, the needed Property is called "[Lookup1].[SqlCommand]" (and given the same Expression). When I change it manually in the .dtsx file, it works as expected, but I am at a loss as to how to translate this into Biml, specifically what reference name to use so it knows where to put the expression.
My question is what is the property name to reference the DirectInput/SqlCommand of the Lookup task? I cannot seem to figure it out.
Here is a picture of how it looks in the dtsx when I change it manually:
Note:
I can't put the expression into the Lookup Task directly because the parameters are dynamically passed into the expression e.g.
<Lookup Name="Lookup1" CacheMode="Partial" NoMatchBehavior="RedirectRowsToNoMatchOutput">
<DirectInput>
SELECT * [etc.]
</DirectInput>
I think what you're missing is when you define your Lookup in the data flow, provide a valid query against a default schema. That will allow the engine to derive the properties of the reference table/set and then once the package is emitted the data flow overrides should take over.
<Dataflow Name="Data Flow Task">
<Expressions>
<Expression ExternalProperty="[Lookup].[SqlCommand]">"SELECT *
FROM
(
VALUES (1, 'b')
,(100, 'a')
,(11, 'c')
) D(colRef, colVal)"</Expression>
</Expressions>
<Transformations>
<OleDbSource Name="OLE DB Source" ConnectionName="SourceConnectionOLEDB">
<DirectInput>SELECT 100 AS col
UNION ALL SELECT 11</DirectInput>
</OleDbSource>
<Lookup Name="Lookup" OleDbConnectionName="SourceConnectionOLEDB">
<Outputs>
<Column SourceColumn="colVal" TargetColumn="colVal" />
</Outputs>
<Parameters>
<Parameter SourceColumn="col" />
</Parameters>
<Inputs>
<Column SourceColumn="col" TargetColumn="colRef" />
</Inputs>
<DirectInput>SELECT *
FROM
(
VALUES (1, 'b')
,(100, 'a')
) D(colRef, colVal)</DirectInput>
<ParameterizedQuery>select * from (SELECT *
FROM
(
VALUES (1, 'b')
,(100, 'a')
) D(colRef, colVal)) [refTable]
where [refTable].[colRef] = ?</ParameterizedQuery>
</Lookup>
<RowCount Name="Row Count" VariableName="User.Variable" />
</Transformations>
</Dataflow>
In the above snippet, I generate a pair of numbers, 100 and 11, as col and then route them to a Lookup component that has hardcoded values of 1 and 100. Since the Lookup expects every row to match, running it as-is would blow up on the unmatched 11 value.
My Dataflow's ExternalProperty override then injects the "missing" 11 row at run time to save my lookup from blowing up.
In your case, I didn't attempt to do this for a partial cache, but I can't imagine the syntax will be much different. I'd fix the code by hand one time and then reverse engineer the package. BimlExpress now provides that functionality for free, and it's awesome for answering the "how do I express Y in Biml?" question. Right-click on the package and there's a Convert to Biml option (name approximate).
Billinkc's answer is correct. It was, in essence:
Use ExternalProperty=[etc], not "PropertyName", and for the package to generate, the Lookup itself needs a valid query.
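For reference, here is a minimal sketch of how that might look when applied to the snippet from the question. It is a fragment, not a complete package: the source component is omitted, DBSchema is assumed to be a User-namespace variable, dbo is assumed to be a schema in which Table1 exists at build time, and KeyColumn/ValueColumn are hypothetical column names. It uses the default full cache, since the answer did not test the partial-cache case.

```xml
<Dataflow Name="Load Tables">
    <Expressions>
        <!-- ExternalProperty, not PropertyName; the bracketed name must match the Lookup's Name -->
        <Expression ExternalProperty="[Lookup1].[SqlCommand]">"SELECT * FROM " + @[User::DBSchema] + ".Table1"</Expression>
    </Expressions>
    <Transformations>
        <!-- Source component omitted for brevity -->
        <Lookup Name="Lookup1" NoMatchBehavior="RedirectRowsToNoMatchOutput" OleDbConnectionName="abc123">
            <Outputs>
                <!-- Hypothetical column name -->
                <Column SourceColumn="ValueColumn" TargetColumn="ValueColumn" />
            </Outputs>
            <Inputs>
                <!-- Hypothetical column name -->
                <Column SourceColumn="KeyColumn" TargetColumn="KeyColumn" />
            </Inputs>
            <!-- Assumption: a static query against a default schema, so the metadata can be derived at build time -->
            <DirectInput>SELECT * FROM dbo.Table1</DirectInput>
        </Lookup>
    </Transformations>
</Dataflow>
```

The static DirectInput only has to return the same column set as the dynamic query; the expression on the Dataflow replaces it at run time.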
I'm just getting into BIML and have written some scripts to create a few DTSX packages. In general, most things are working, but one thing is driving me crazy.
I have an ODBC source (PostgreSQL) and read data from a table through it. The table has a text column named "description". I cast this column to varchar(4000) in the ODBC source query (I know there will be truncation, but that's OK). If I do this manually in Visual Studio, the Advanced Editor of the ODBC source shows "Unicode string [DT_WSTR]" with a length of 4000 for both the external and the output column, so everything is fine there. But if I do the same thing with BIML and generate the SSIS package, the external column still says "Unicode string [DT_WSTR]" with a length of 4000, while the output column says "Unicode text stream [DT_NTEXT]". So the mapping done by BIML differs from the mapping done by SSIS (manually). This causes two warnings:
A Warning that metadata has changed and should be synced
And a Warning that the Source uses LOB-Columns and is set to Row by Row-Fetch..
Both warnings are not cool, but the second one also causes a drastic degradation in performance! If I set the cast to varchar(255), the mapping is fine (the external and output columns are then both "Unicode string [DT_WSTR]" with a length of 255). But as soon as I go higher, e.g. varchar(256), it is again treated as [DT_NTEXT] in the output.
Is there anything I can do about this? I have invested days in evaluating BIML and find that many things improve quality of life, but this issue is killing it. It defeats the purpose of BIML if I have to correct its errors manually after every build.
Does anyone know how I can solve this issue? A correct automatic mapping between external and output columns would be great, but at least an option to define the mapping myself would be OK.
Any Help is appreciated!
Greetings
Marco
Edit As requested a Minimal Example for better understanding:
The column in the ODBC source (Postgres) has the type "text" (column name: description)
I select it in an ODBC source with this query (DirectInput):
SELECT description::varchar(4000) from mySourceTable
The ODBC-Source in Biml looks like this:
<OdbcSource Name="mySource" Connection="mySourceConnection"> <DirectInput>SELECT description::varchar(4000) from mySourceTable</DirectInput></OdbcSource>
If I now generate the dtsx-Package the ODBC-Source throws the above mentioned warnings with the above mentioned Datatypes for External and Output-Column
As mentioned in the comment before I got an answer from another direction:
You have to use DataflowOverrides in the ODBC-Source in BIML. For my example you have to do something like this:
<OdbcSource Name="mySource" Connection="mySourceConnection">
<DirectInput>SELECT description::varchar(4000) from mySourceTable</DirectInput>
<DataflowOverrides>
<OutputPath OutputPathName="Output">
<Columns>
<Column ColumnName="description" SsisDataTypeOverride="DT_WSTR" DataType="String" Length="4000" />
</Columns>
</OutputPath>
<OutputPath OutputPathName="Error">
<Columns>
<Column ColumnName="description" SsisDataTypeOverride="DT_WSTR" DataType="String" Length="4000" />
</Columns>
</OutputPath>
</DataflowOverrides>
</OdbcSource>
You won't have to do the Overrides for all columns, only for the ones you have mapping-Issues with.
Hope this solution can help anyone who passes by.
Cheers
I know there are 3 types of parameter in 'Parameter mapping' - Input Parameter, Output Parameter and Return Parameter. I understand how to use Input and Output parameter. But when I try to set the parameter type as 'Return Parameter', it doesn't work. Below is my SQL Server stored procedure.
ALTER Procedure [dbo].[spRandomReturn]
As
Begin
Return Convert(int, rand() * 10)
End
In SSIS Execute SQL task, I have set
connection type: OLE DB
parameter mapping: variable name: User::@random (I created a User variable in SSIS: random INT32), Direction: ReturnValue, Type: Numeric, Parameter Name: @random
SQL statement:
Declare @r int = @random EXEC @r = spRandomReturn
I created a return parameter in SSIS, but it doesn't work and throws an error.
Since you're using OLE DB Connection Manager, you need to use the ? to indicate where parameters are.
Thus, your query becomes
EXECUTE ? = [dbo].[spRandomReturn]
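And within your parameter mapping, you'd have a single entry pointing at your SSIS variable, with Direction set to ReturnValue and Parameter Name set to 0 (the zero-based ordinal of the ? placeholder).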
Reproduction
Biml, the Business Intelligence Markup Language, describes the platform for business intelligence. Here, we're going to use it to describe the ETL. BIDS Helper is a free add-on for Visual Studio/BIDS/SSDT that addresses a host of its shortcomings. Specifically, we're going to use its ability to transform a Biml file describing ETL into an SSIS package. This has the added benefit of giving you a mechanism to generate exactly the solution I'm describing, versus clicking through many tedious dialog boxes.
You can see in the following bit of XML, I create a connection called CM_OLE and this points to localhost\dev2014 at tempdb. You would need to modify this to reflect your environment.
I create a package named so_28419264. This package contains 2 variables. One is Query which contains the first bit of code. The second is ReturnValue which we will use to capture the return value on the Mapping tab. I initialize this one to -1 as the provided stored procedure would never generate a negative value.
I add two Tasks, both Execute SQL Tasks. The second one does nothing; it simply serves as a point for me to put a breakpoint on. The first Execute SQL Task is where we invoke our stored procedure and assign the return value to our variable.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="CM_OLE" ConnectionString="Data Source=localhost\dev2014;Initial Catalog=tempdb;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;" />
</Connections>
<Packages>
<Package ConstraintMode="Linear" Name="so_28419264">
<Variables>
<Variable DataType="String" Name="Query">EXECUTE ? = [dbo].[spRandomReturn];</Variable>
<Variable DataType="Int32" Name="ReturnValue">-1</Variable>
</Variables>
<Tasks>
<ExecuteSQL ConnectionName="CM_OLE" Name="SQL Demonstrate Return Value">
<VariableInput VariableName="User.Query" />
<Parameters>
<Parameter DataType="Int32" VariableName="User.ReturnValue" Name="0" Direction="ReturnValue" />
</Parameters>
</ExecuteSQL>
<ExecuteSQL ConnectionName="CM_OLE" Name="Put Breakpoint on me">
<DirectInput>SELECT 1;</DirectInput>
</ExecuteSQL>
</Tasks>
</Package>
</Packages>
</Biml>
Results
It works
I've successfully created a BIML script on BIDS 2008 with BIDS Helper 1.6.6.0 which automates the creation of SSIS packages to import data from an Oracle database (11g Enterprise Edition Release 11.2.0.3.0 - 64bit) into SQL Server 2008 R2. I am having an issue at package run time which causes the package to fail at Data Flow validation with:
Warning: The external columns for component "Source" (1) are out of synchronization with the data source columns. The external column "LIMIT_AMOUNT" needs to be updated.
The external column "LIMIT_BASE_AMOUNT" needs to be updated.
The external column "GROSS_BASE_AMOUNT" needs to be updated.
Error: The OLE DB provider used by the OLE DB adapter cannot convert between types "DT_BYTES" and "DT_NUMERIC" for "LIMIT_AMOUNT".
Error: The OLE DB provider used by the OLE DB adapter cannot convert between types "DT_BYTES" and "DT_NUMERIC" for "LIMIT_BASE_AMOUNT".
Error: The OLE DB provider used by the OLE DB adapter cannot convert between types "DT_BYTES" and "DT_NUMERIC" for "GROSS_BASE_AMOUNT".
Error: There were errors during task validation.
Upon inspection, it appears that the metadata for NUMBER columns without scale and precision in Oracle are mapped to DT_BYTES in the generated SSIS. The description of the above object (a view) in Oracle is as follows:
Name                  Null Type
--------------------- ---- ------------
ID                         NUMBER(12)
CURRENCY                   VARCHAR2(3)
LIMIT_AMOUNT               NUMBER
LIMIT_BASE_AMOUNT          NUMBER
GROSS_BASE_AMOUNT          NUMBER
STATUS                     VARCHAR2(15)
Checking in all_tab_columns shows the three NUMBER columns as having a DATA_LENGTH of 22 and NULL DATA_PRECISION and DATA_SCALE.
COLUMN_ID  COLUMN_NAME            DATA_TYPE     DATA_LENGTH DATA_PRECISION DATA_SCALE
---------- ---------------------- ------------- ----------- -------------- ----------
1          ID                     NUMBER                 22             12          0
2          CURRENCY               VARCHAR2                3
3          LIMIT_AMOUNT           NUMBER                 22
4          LIMIT_BASE_AMOUNT      NUMBER                 22
5          GROSS_BASE_AMOUNT      NUMBER                 22
6          STATUS                 VARCHAR2               15
The Oracle documentation states that this is the equivalent of a float
Specify a floating-point number using the following form:
NUMBER
The absence of precision and scale designators specifies the maximum range and precision for an Oracle number.
The workaround so far was to implement a custom SELECT which casts these fields to the desired type, but that's not very elegant or maintainable. I would like to understand why BIML seems to get the data type mapping wrong, whereas SSIS is able to determine that the metadata is wrong when the package is first opened after it has been created: I get a pop-up in BIDS stating that
The metadata of the following output columns does not match the metadata of the external columns with which the output columns are associated:
Output "Output": "LIMIT_AMOUNT", "LIMIT_BASE_AMOUNT", "GROSS_EXP_BASE_AMOUNT"
Do you want to replace the metadata of the output columns with the metadata of the external columns?
EDIT: Adding relevant Biml details regarding connections & dataflow
<#
string OraConnectionStr = @"Provider=OraOLEDB.Oracle;Data Source=(In-line TNS);User Id=redacted;Password=redacted;Persist Security Info=True;";
string StagingConnectionStr = "Data Source=SVR;Initial Catalog=DB;Integrated Security=SSPI;Provider=SQLNCLI10;";
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<Connection Name="<#=StagingConnectionName#>"
ConnectionString="<#=StagingConnectionStr#>" />
<Connection Name="<#=OraConnectionName#>"
ConnectionString="<#=OraConnectionStr#>" />
</Connections>
<Packages>
<!-- Assume object stagingTables is populated and methods have been defined -->
<# foreach (DataRow row in stagingTables.Rows) { #>
<Package Name="<#= GetChildPackageName(row) #>"
ConstraintMode="Linear" AutoCreateConfigurationsType="None">
<Tasks>
<Dataflow Name="<#=GetStagingTableDescriptiveName(row)#>" >
<Transformations>
<OleDbSource Name="Source - <#=GetStagingTableDescriptiveName(row)#>"
ConnectionName="<#=OraConnectionName#>"
AlwaysUseDefaultCodePage="true"
DefaultCodePage="1252">
<DirectInput>SELECT * FROM <#=GetOracleObjectName(row)#></DirectInput>
</OleDbSource>
<OleDbDestination Name="Destination - <#=GetStagingTableDescriptiveName(row)#>"
ConnectionName="<#=DataLoadConnectionName#>">
<ExternalTableOutput Table="<#= GetStagingTableObjectName(row) #>" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
<# } #>
</Packages>
</Biml>
Thanks in advance.
I have an SSIS package where I need to get the date the package last ran from an ADO NET Source, then assign it to a variable so that I can use it in a query for another ADO NET Source. I can't find an example on the Googles that actually works. I'm running VS 2012 and connecting to a SQL Server 2012 instance. If more information is needed, let me know.
Create a variable @User::LastRanDate.
Create an Execute SQL task.
Set the ConnectionType property to ADO.NET.
Set the Connection property to your ADO.NET connection.
Set the SQLStatement property to the statement which will return the date you want. Make sure the first column returned is the date.
Set the ResultSet property to Single row.
On the Result Set tab of the Task editor, hit Add and set the Result Name value to 0 and the Variable Name value to @User::LastRanDate. (ADO.NET result sets are returned as indexed arrays.)
Upon completion of the Task, @User::LastRanDate will now be set to whatever the query returned and you can use it to build up your query for your other ADO.NET source.
Working with parameterized queries in an ADO.NET Data Source in SSIS is not as easy as in an OLE DB one. Basically, you're going to have to write the query with the expression language and pray your source doesn't lend itself to SQL injection.
How to Pass parameter in ADO.NET Source SSIS
how to pass parameters to an ado.net source in ssis?
I created a package with 3 variables as shown below
Package
Variables
I have LastRunDate as a DateTime and QueryAdo as a String. The latter is evaluated as an expression, the expression being "SELECT RD.* FROM dbo.RunData AS RD WHERE RD.InsertDate > '" + (DT_WSTR, 25) @[User::LastRunDate] + "';"
Execute SQL Task
I create an Execute SQL Task that uses a query and is set to return a single row. I assign this value to my SSIS variable.
In my results tab, I assign the zeroth column to my variable LastRunDate.
Data flow
Note there is an expression here. On the ADO.NET source, I originally used SELECT RD.* FROM dbo.RunData AS RD to get my meta data set.
After I was happy with my data flow, I then went to the control flow and substituted my Query variable in as the expression on the ADO.NET Source component (see the referenced questions).
Try it, try it, you will see
I used the following script to build out my demo environment
create table dbo.RussJohnson
(
LastRunDate datetime NOT NULL
);
create table dbo.RunData
(
SomeValue int NOT NULL
, InsertDate datetime NOT NULL
);
insert into dbo.RussJohnson
SELECT '2014-08-01' AS LastRunDate
INSERT INTO
dbo.RunData
(
SomeValue
, InsertDate
)
SELECT
D.rc AS Somevalue
, dateadd(d, D.rc, '2014-07-30') AS InsertDate
FROM
(
SELECT TOP 15 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rc
FROM sys.all_columns AS SC
) D;
Since I have BIDS Helper installed, I used the following Biml to generate this package as described. For those playing along at home, you will need to edit the third line so that the ADO.NET connection manager is pointing to a valid server and database.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<AdoNetConnection Name="CM_ADO_DB" ConnectionString="Data Source=localhost\dev2014;Integrated Security=SSPI;Connect Timeout=30;Database=tempdb;" Provider="SQL" />
</Connections>
<Packages>
<Package Name="so_25125838" ConstraintMode="Linear">
<Variables>
<Variable DataType="DateTime" Name="LastRunDate" >2014-01-01</Variable>
<Variable DataType="Int32" Name="RowCountOriginal" >0</Variable>
<Variable DataType="String" Name="QueryAdo" EvaluateAsExpression="true">"SELECT RD.* FROM dbo.RunData AS RD WHERE RD.InsertDate > '" + (DT_WSTR, 25) @[User::LastRunDate] + "';"</Variable>
</Variables>
<Tasks>
<ExecuteSQL
Name="SQL GetLastRunDate"
ConnectionName="CM_ADO_DB"
ResultSet="SingleRow"
>
<DirectInput>SELECT MAX(RJ.LastRunDate) AS LastRunDate FROM dbo.RussJohnson AS RJ;</DirectInput>
<Results>
<Result Name="0" VariableName="User.LastRunDate" />
</Results>
</ExecuteSQL>
<Dataflow Name="DFT POC">
<Transformations>
<AdoNetSource Name="ADO_SRC Get New Data" ConnectionName="CM_ADO_DB">
<DirectInput>SELECT RD.* FROM dbo.RunData AS RD</DirectInput>
</AdoNetSource>
<RowCount Name="CNT Original rows" VariableName="User.RowCountOriginal" />
</Transformations>
<Expressions>
<Expression ExternalProperty="[ADO_SRC Get New Data].[SqlCommand]">@[User::QueryAdo]</Expression>
</Expressions>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>
I have a Liquibase XML script. When I run it on Postgres I don't face any problem, but when I run it on MySQL it gives an error when the structure is of the following type:
<insert tableName="user_table">
<column name="id" valueComputed="(select max(id)+1 from user_table)"/>
<column name="name" value="someName"/>
</insert>
When the above script is executed on MySQL it gives the error:
You can't specify target table 'user_table' for update in FROM clause.
I found a solution to this by using an alias like this:
<insert tableName="user_table">
<column name="id" valueComputed="(select max(id)+1 from (Select * from user_table) t)" />
<column name="name" value="someName"/>
</insert>
But there are thousands of entries like this. Is there a generic way of doing it so that I don't have to change the script in so many places? Thanks.
The easiest approach would be to just update the XML, either with a simple XML parser program or even a regexp search and replace in your text editor.
Alternately, you can override the standard liquibase logic to look for that particular valueComputed pattern and replace it. There are a couple points you could make the change at:
Override the liquibase.parser.core.xml.XMLChangeLogSAXParser class, probably the parseToNode() method, to search through the generated ParsedNode for valueComputed nodes
Override the liquibase.change.core.InsertDataChange class generateStatements() method or addColumn() method to replace valueComputed fields.
See http://liquibase.org/extensions for more info on writing extensions.