FuzzyLookup in BIML - ssis

I'm trying to do the following in BIML:
I'm at a bit of a loss on how to do this in BIML. Here is what I've tried:
<FuzzyLookup
Name="Fuzzy Lookup"
ConnectionName="WO7"
Exhaustive="true"
AutoPassThroughInputColumns="true"
>
<ExternalReferenceTableInput Table="map.AgencyWO7" />
<Inputs>
<Column SourceColumn="AgencyName" TargetColumn="AgencyName" />
</Inputs>
<Outputs>
<Column SourceColumn="AgencyId" TargetColumn="AgencyIdWO7" />
<Column SourceColumn="AgencyName" TargetColumn="AgencyNameWO7" />
</Outputs>
</FuzzyLookup>
The result is the following error:
(-1,-1) : Error 5 : The input column for the Fuzzy Lookup Fuzzy Lookup references external column that cannot be found in the reference table. Verify that the input mapping references a valid column in the reference table. Property TargetColumn. EmitSsis.
There were errors during compilation. See compiler output for more information.

I think you are missing a reference to the previous transform: the InputPath element, which is the equivalent of the connecting arrow you would draw in SSDT.
Also, I set pass-through on a per-column basis (PassThrough="true" on each input column).
<FuzzyLookup Name="Fuzzy Lookup" MatchIndexName="" ConnectionName="WO7">
<InputPath OutputPathName="[Previous Transform Name].Output" />
<ExternalReferenceTableInput Table="map.AgencyWO7" />
<Inputs>
<Column MinSimilarity="85" MatchTypeExact="true" PassThrough="true" SourceColumn="AgencyName" TargetColumn="AgencyName" />
</Inputs>
<Outputs>
<Column SourceColumn="AgencyId" TargetColumn="AgencyIdWO7" />
<Column SourceColumn="AgencyName" TargetColumn="AgencyNameWO7" />
</Outputs>
</FuzzyLookup>
Try the above code. If all else fails, you can design the Fuzzy Lookup in SSDT and then import it into Biml using Mist/BimlStudio, which is pretty reliable.
https://varigence.com/Mist
Cheers

Related

SSIS BIML Derived Column syntax for expressions

I am defining a Derived Column transformation in BIML, but I am having trouble referencing the output from the previous Flat File Source in my Derived Column transformation.
I receive the error below when opening the package after successfully generating it. It suggests that the Derived Column transformation cannot find the output from the Flat File Source.
Error 2 Error loading AFR_ShareTableBIML.dtsx: The object
"/DTS:Executable/DTS:Executables/DTS:Executable/DTS:ObjectData/pipeline/components/component/inputs/input/inputColumns/inputColumn/properties/property"
references ID "#{Package\Data Flow {Import Share Table CSV}\Source
{Flat File Share Table}.Outputs[Output].Columns[Div c per share]}",
but no object in the package has this ID.
Here is a code snippet:
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<FileFormats>
<FlatFileFormat Name="FFF_AFRShareTable" ColumnNamesInFirstDataRow="true"
FlatFileType="Delimited" IsUnicode="false" TextQualifer="None" HeaderRowsToSkip="6">
<Columns>
<Column Name="Quote Buy" ColumnType="Delimited" DataType="AnsiString" Length ="50" Delimiter=","></Column>
<Column Name="Quote Sell" ColumnType="Delimited" DataType="AnsiString" Length ="50" Delimiter=","></Column>
<Column Name="Div c per share" ColumnType="Delimited" DataType="AnsiString" Length ="50" Delimiter=","></Column>
</Columns>
</FlatFileFormat>
</FileFormats>
<Connections>
<FlatFileConnection Name="FF_AFRShareTable" FileFormat="FFF_AFRShareTable"
FilePath="C:\Temp\Stocks.csv"></FlatFileConnection>
<OleDbConnection Name="CMD DB"
ConnectionString="Data Source=Localhost;Initial Catalog=DB;Provider=SQLNCLI11.1;Integrated Security=SSPI;" CreateInProject="true">
</OleDbConnection>
</Connections>
<Packages>
<Package Name="AFR_ShareTableBIML" ConstraintMode="Linear" ProtectionLevel="DontSaveSensitive">
<Tasks>
<ExecuteSQL Name="SQLTask {OLE_DB} Truncate Security Share Table" ConnectionName="CMD DB">
<DirectInput>truncate table Staging.SecurityShareTable</DirectInput>
</ExecuteSQL>
<Dataflow Name="Data Flow {Import Share Table CSV}">
<Transformations>
<FlatFileSource Name="Source {Flat File Share Table}" ConnectionName="FF_AFRShareTable"></FlatFileSource>
<DerivedColumns Name="DER_NullifyColumns">
<Columns>
<Column Name ="DER_DPS" DataType = "Decimal" Precision="4">
[Div c per share] == "-" ? NULL(DT_DECIMAL, 4) : (DT_DECIMAL, 4)[Div c per share]
</Column>
</Columns>
</DerivedColumns>
</Transformations>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>
I have already defined the column name via the FlatFileFormat, and I have confirmed that the expression in the DER_DPS column is syntactically correct. I found that by replacing the square brackets "[" and "]" with double quotes, the SSIS package can be opened. For example:
"Div c per share" == "-" ? NULL(DT_DECIMAL, 4) : (DT_DECIMAL, 4) "Div c per share"
However, the Derived Column transformation then reports errors about incorrect syntax. Are square brackets special characters in BIML that I need to escape?
That was ... interesting.
It appears that your use of curly braces in your component names causes the Biml expansion to go haywire.
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<FileFormats>
<FlatFileFormat Name="FFF_AFRShareTable" ColumnNamesInFirstDataRow="true"
FlatFileType="Delimited" IsUnicode="false" TextQualifer="None" HeaderRowsToSkip="6">
<Columns>
<Column Name="Quote Buy" ColumnType="Delimited" DataType="AnsiString" Length ="50" Delimiter=","></Column>
<Column Name="Quote Sell" ColumnType="Delimited" DataType="AnsiString" Length ="50" Delimiter=","></Column>
<!-- Change -->
<Column Name="Div c per share" ColumnType="Delimited" DataType="AnsiString" Length ="50" Delimiter="CRLF"></Column>
</Columns>
</FlatFileFormat>
</FileFormats>
<Connections>
<FlatFileConnection Name="FF_AFRShareTable" FileFormat="FFF_AFRShareTable"
FilePath="C:\ssisdata\so\input\Stocks.csv"></FlatFileConnection>
<OleDbConnection Name="CMD DB"
ConnectionString="Data Source=Localhost\dev2014;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;"
CreateInProject="false">
</OleDbConnection>
</Connections>
<Packages>
<Package Name="so_37641290_AFR_ShareTableBIML" ConstraintMode="Linear" ProtectionLevel="DontSaveSensitive">
<Tasks>
<ExecuteSQL Name="SQLTask OLE_DB Truncate Security Share Table" ConnectionName="CMD DB">
<DirectInput>truncate table Staging.SecurityShareTable</DirectInput>
</ExecuteSQL>
<Dataflow Name="Data Flow Import Share Table CSV">
<Transformations>
<FlatFileSource Name="Source Flat File Share Table" ConnectionName="FF_AFRShareTable"></FlatFileSource>
<DerivedColumns Name="DER_NullifyColumns">
<Columns>
<Column Name="DER_DPS" DataType="Decimal" Precision="4"><![CDATA[[Div c per share] == "-" ? NULL(DT_DECIMAL, 4) : (DT_DECIMAL, 4)[Div c per share]]]></Column>
</Columns>
</DerivedColumns>
</Transformations>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>
The above Biml works for me. Changes I made:
Removed { and } from the task and component names.
Updated the last Column definition within your FlatFileFormat Columns collection to have a delimiter of CRLF instead of a comma.
Used a CDATA section for the expression. It is not needed here, but if you had a > or < in there, you would need to either escape them as &gt; and &lt; or use CDATA as I did.
Cleaned up the Derived Column's attribute assignments. There were spaces around the equals signs and I don't believe those are supposed to be there.
Updated the flat file and OLE DB paths to work with my setup.
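As a minimal sketch of the escaping point in the list above, the two columns below carry the same expression, once written with the &lt; entity and once inside a CDATA section. The transformation name DER_Compare, the column names DER_BuyBelowSell and DER_BuyBelowSellCdata, and the (DT_I4) casts are assumptions made up for illustration; they are not part of the original package.

```xml
<DerivedColumns Name="DER_Compare">
<Columns>
<!-- Hypothetical column: &lt; is the XML entity for the less-than sign -->
<Column Name="DER_BuyBelowSell" DataType="Boolean">(DT_I4)[Quote Buy] &lt; (DT_I4)[Quote Sell]</Column>
<!-- Same expression inside CDATA, so the less-than sign needs no escaping -->
<Column Name="DER_BuyBelowSellCdata" DataType="Boolean"><![CDATA[(DT_I4)[Quote Buy] < (DT_I4)[Quote Sell]]]></Column>
</Columns>
</DerivedColumns>
```

Either form should reach the SSIS expression as the same text. CDATA tends to be easier to read once an expression contains several comparison operators.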
Source data
0
1
2
3
4
5
Quote Buy,Quote Sell,Div c per share
1,1,1
2,2,2
3,3,-
Results

Unable to read CSV file with double quotes in cell value

I'm trying to read a CSV file with lines like this:
"A text";"Another text";"A text with ""quotes"""
In my Flat File connection, I set the Text qualifier to ".
When I click the Preview button, the lines are shown properly: A text with ""quotes"" (shouldn't it show only one double quote, by the way?)
But as soon as I try to execute the package, an error occurs saying that the column delimiter cannot be found:
[Flat File Source [1313]] Error: The column delimiter for column "COL3" was not found.
If I remove the doubled double quotes within the cell value, it works fine.
Is there any way to make SSIS read cells with double quotes in them?
For the same data, you can see how 2008 versus 2012 will preview the data. Observe that Col2 either does or does not unescape the doubled quote (A text with "quotes" vs A text with ""quotes"").
Using the 2008 version, the package fails with the following error messages:
The column delimiter for column "Col2" was not found.
An error occurred while processing file "c:\ssisdata\so\input\so_36033443.txt" on data row 1.
A reproduction of the problem using Biml follows
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<FlatFileConnection
FilePath="c:\ssisdata\so\input\so_36033443.txt"
FileFormat="FFF_36033443"
Name="FFSRC" />
</Connections>
<FileFormats>
<FlatFileFormat
Name="FFF_36033443"
IsUnicode="false"
HeaderRowDelimiter=";"
CodePage="1252"
TextQualifer="&quot;"
>
<Columns>
<Column Name="Col0" DataType="AnsiString" Length="10" Delimiter=";" CodePage="1252"/>
<Column Name="Col1" DataType="AnsiString" Length="20" Delimiter=";" CodePage="1252"/>
<Column Name="Col2" DataType="AnsiString" Length="20" Delimiter="CRLF" CodePage="1252"/>
</Columns>
</FlatFileFormat>
</FileFormats>
<Packages>
<Package Name="so_36033443">
<Tasks>
<Dataflow Name="DFT Demo Delimiter">
<Transformations>
<FlatFileSource
ConnectionName="FFSRC"
Name="FFSRC so_36033443" />
<DerivedColumns Name="DER Placeholder" />
</Transformations>
</Dataflow>
</Tasks>
</Package>
</Packages>
</Biml>

EF4: ObjectContext inconsistent when inserting into a view with triggers

I get an Invalid Operation Exception when inserting records in a View that uses “Instead of” triggers in SQL Server with ADO.NET Entity Framework 4.
The error message says:
{"The changes to the database were committed successfully, but an error occurred while updating the object context. The ObjectContext might be in an inconsistent state. Inner exception message: The key-value pairs that define an EntityKey cannot be null or empty. Parameter name: record"}
at System.Data.Objects.ObjectContext.SaveChanges(SaveOptions options)
at System.Data.Objects.ObjectContext.SaveChanges()
In this simplified example I created two tables, Contacts and Employers, and one view, Contacts_x_Employers, which allows me to insert rows into or retrieve rows from these two tables at once. The tables only have a Name and an ID attribute each, and the view is based on a join of both:
CREATE VIEW [dbo].[Contacts_x_Employers]
AS
SELECT dbo.Contacts.ContactName, dbo.Employers.EmployerName
FROM dbo.Contacts INNER JOIN dbo.Employers
ON dbo.Contacts.EmployerID = dbo.Employers.EmployerID
And has this trigger:
Create TRIGGER C_x_E_Inserts
ON Contacts_x_Employers
INSTEAD of INSERT
AS
BEGIN
SET NOCOUNT ON;
insert into Employers (EmployerName)
select i.EmployerName
from inserted i
where not i.EmployerName in
(select EmployerName from Employers)
insert into Contacts (ContactName, EmployerID)
select i.ContactName, e.EmployerID
from inserted i inner join employers e
on i.EmployerName = e.EmployerName;
END
GO
The .NET Code follows:
using (var Context = new TriggersTestEntities())
{
Contacts_x_Employers CE1 = new Contacts_x_Employers();
CE1.ContactName = "J";
CE1.EmployerName = "T";
Contacts_x_Employers CE2 = new Contacts_x_Employers();
CE1.ContactName = "W";
CE1.EmployerName = "C";
Context.Contacts_x_Employers.AddObject(CE1);
Context.Contacts_x_Employers.AddObject(CE2);
Context.SaveChanges(); // line with error
}
 
SSDL and CSDL (the view nodes):
<EntityType Name="Contacts_x_Employers">
<Key>
<PropertyRef Name="ContactName" />
<PropertyRef Name="EmployerName" />
</Key>
<Property Name="ContactName" Type="varchar" Nullable="false" MaxLength="50" />
<Property Name="EmployerName" Type="varchar" Nullable="false" MaxLength="50" />
</EntityType>
<EntityType Name="Contacts_x_Employers">
<Key>
<PropertyRef Name="ContactName" />
<PropertyRef Name="EmployerName" />
</Key>
<Property Name="ContactName" Type="String" Nullable="false" MaxLength="50" Unicode="false" FixedLength="false" />
<Property Name="EmployerName" Type="String" Nullable="false" MaxLength="50" Unicode="false" FixedLength="false" />
</EntityType>
The Visual Studio solution and the SQL Scripts to re-create the whole application can be found in the TestViewTrggers.zip at ftp://JulioSantos.com/files/TriggerBug/.
I appreciate any assistance that can be provided. I already spent days working on this problem.
I stumbled on the same problem when I tried to insert a row into a view with "instead of insert" and "instead of update" triggers.
I think I found a solution: when Visual Studio's wizard drops your view into your model, it adds StoreGeneratedPattern="Identity" to some properties (probably the keys of your entity).
When generating requests against a regular table, this property tells Entity Framework to expect an ID in return, so it appends a select scope_identity() to the end of the insert.
With updatable views, scope_identity() does not help because the insert happens in another scope, so it returns null and the insert fails.
If you remove StoreGeneratedPattern="Identity" from the model, Entity Framework doesn't append select scope_identity() and the insert works fine.
I hope this solves your problem and that it doesn't come too late.
Cheers
More details here : http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/9fe80b08-0b67-4163-9cb0-41dee5115148/
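As a rough sketch of the edit described above, this is what the change might look like in the SSDL part of the .edmx file. Which property the designer puts the attribute on is an assumption here; ContactName is used only because it is one of the view's key properties in the question.

```xml
<!-- Before (assumed designer output): EF expects a store-generated key back -->
<Property Name="ContactName" Type="varchar" Nullable="false" MaxLength="50" StoreGeneratedPattern="Identity" />

<!-- After: attribute removed, so EF should no longer append select scope_identity() -->
<Property Name="ContactName" Type="varchar" Nullable="false" MaxLength="50" />
```

Updating the model from the database may regenerate the storage model and bring the attribute back, so the edit may need to be repeated afterwards.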

Linq to SQL Foreign Keys & Collections

I'm still trying to wrap my head around LINQ to SQL and foreign key relationships.
I have a table called "Parent" and another called "Children". I have a one-to-many relationship, where a Parent can have multiple Children. My DBML looks something like:
<Table Name="" Member="Parents">
<Type Name="Parent">
<Column Member="ParentID" Type="System.String" IsPrimaryKey="true" CanBeNull="false" />
<Column Member="ChildID" Type="System.String" CanBeNull="false" />
<Association Name="Parent_Child" Member="Childs" ThisKey="ParentID" OtherKey="ParentID" Type="Child" />
</Type>
</Table>
<Table Name="" Member="Childs">
<Type Name="Child">
<Column Member="ChildID" Type="System.String" IsPrimaryKey="true" CanBeNull="false" />
<Column Member="ParentID" Type="System.String" CanBeNull="false" />
<Association Name="Parent_Child" Member="Parent" ThisKey="ParentID" OtherKey="ParentID" Type="Parent" IsForeignKey="true" />
</Type>
</Table>
In my code, I would like to do something like:
// parent has already been loaded from the datacontext
parent.Childs = <some collection of children>
db.SubmitChanges();
But when I do that I get the following error:
A member defining the identity of the object cannot be changed.
Consider adding a new object with new identity and deleting the existing one instead.
Can anyone tell me how to properly accomplish this?
This is actually a DataContext error; I think you have multiple instances of DataContext.
You cannot add an entity to a DataContext if it was already fetched through another DataContext instance.
You can, however, set ObjectTrackingEnabled to false on the DataContext that fetches the entity; then you can add it to another DataContext and it will work.

Linq To SQL : Error "Database node not found"

I am attempting to experiment with linq to sql using this site as a guide.
When running a test I keep getting an error parsing the mapping file I created. The error:
System.Xml.Schema.XmlSchemaException : Database node not found. Is the mapping namespace (http://schemas.microsoft.com/linqtosql/mapping/2007) correctly specified?
Here is the mapping file:
<?xml version="1.0" encoding="utf-8"?>
<Database Name="Test" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
<Table Name="dbo.Categories" Member="Category">
<Type Name="Category">
<Column Name="ID" Member="ID" Storage="id" DbType="Char(32) NOT NULL" CanBeNull="false" IsPrimaryKey="true" />
<Column Name="ParentID" Member="ParentID" Storage="parentID" DbType="Char(32)" />
<Column Name="Name" Member="Name" Storage="name" DbType="VarChar(50) NOT NULL" CanBeNull="false" />
</Type>
</Table>
</Database>
Can anyone point me in the right direction?
Figured it out!
The line:
<Table Name="dbo.Categories" Member="Category">
needed to be changed to:
<Table Name="dbo.Categories" Member="Categories">
and now it's working.