How to add multiple DataFlow tasks to a Foreach container - ssis

How can I add multiple data flow tasks to a single Foreach container using EzAPI? Basically, I need to do the following.
I am new to EzAPI. Can anyone give me a code sample for this kind of scenario? Thanks in advance.

Your question can really be distilled down to two questions: How do I create the various Containers and Tasks? How do I define precedence constraints between them?
As you can see in the code below, I create instances of EzPackage, EzForEachLoop, EzExecSqlTask and EzDataFlow. EzAPI tasks and containers all accept a parent object in their constructor. This is how you specify the scope an object should exist at. Thus, the For Each Loop takes the base package as its argument, but the Data Flows and the Execute SQL Task take the For Each Loop so that they are created inside that container.
There are different mechanisms for defining the Precedence Constraint between objects and it's up to you which version you use: object.AttachTo vs package.PrecedenceConstraints.Add
public static void GimmieDaCodez()
{
    EzPackage ezPackage = null;
    EzForEachLoop ezLoop = null;
    string packageName = @"so_22533130";
    string outputFile = string.Format("{0}.dtsx", System.IO.Path.Combine(@"C:\Dropbox\Sandbox\UtumnoSandbox\EzAPIDemo\EzAPIDemo", packageName));
    EzDataFlow df1 = null;
    EzDataFlow df2 = null;
    EzDataFlow df3 = null;
    EzExecSqlTask t4 = null;

    // Instantiate and configure our package
    ezPackage = new EzPackage();
    ezPackage.Name = packageName;
    ezPackage.Description = "A package with a foreach enumerator and multiple data flows";

    // Lazy initialization of FELC
    ezLoop = new EzForEachLoop(ezPackage);
    ezLoop.Name = "FELC Enumerate stuff";
    ezLoop.Description = "EzAPI still does not allow configuration of FELC beyond file enumerator";

    // Instantiate our tasks inside the loop. Details left to the implementer
    df1 = new EzDataFlow(ezLoop);
    df1.Name = "DFT 1";
    df2 = new EzDataFlow(ezLoop);
    df2.Name = "DFT 2";
    df3 = new EzDataFlow(ezLoop);
    df3.Name = "DFT 3";
    t4 = new EzExecSqlTask(ezLoop);
    t4.Name = "SQL Do all the things";

    // Precedence constraints: DFT 1 -> DFT 2 and DFT 3, both -> Execute SQL Task
    df2.AttachTo(df1);
    df3.AttachTo(df1);
    t4.AttachTo(df2);
    t4.AttachTo(df3);

    ezPackage.SaveToFile(outputFile);
}
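Using that code, I generate a package in which the Foreach Loop contains DFT 1, which fans out to DFT 2 and DFT 3, both of which lead into the Execute SQL Task.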

Unmatched Files Processing through ForEach Loop Container

I have some processed and unprocessed files in my source folder, and the file names of all the processed files are stored in a table. How can I match the file names in the source folder against the table prior to the ForEach Loop Container and process only the unmatched files?
The solution below is a bit elaborate but it's the best I could think of.
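STEP 1: Create two variables, both strings.
a) CurrentFile: This will be used for your Foreach Loop Container collection value.
b) ToProcess: This will be used to map the result set of an Execute SQL Task, explained below.
STEP 2: Add an Execute SQL Task into your Foreach Loop Container.
Configure Parameter Mapping so that CurrentFile is passed to the ? placeholder in the statement (with an OLE DB connection, the parameter name is 0).
Use the script below as your SQL Statement. The ELSE branch keeps the result from being NULL, which cannot be assigned to a string variable:
DECLARE @ToProcess VARCHAR(1)
IF NOT EXISTS(SELECT [FileNames] FROM [YourFilesTable] WHERE FileNames = ?)
    SET @ToProcess = 'Y'
ELSE
    SET @ToProcess = 'N'
SELECT @ToProcess AS ToProcess
Set ResultSet to Single Row.
Configure Result Set so that the ToProcess column is mapped to the ToProcess variable.
On the Execute SQL Task, configure the Precedence Constraint so that the tasks after it run only when ToProcess is 'Y'.
Your Foreach Loop Container then holds the Execute SQL Task followed by the task that processes the file.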
Before the Foreach Loop, use a Script Task to store the names of unprocessed files in an SSIS object variable, then iterate through this variable to load the new files as you already are.
Create an object variable and add it to the ReadWriteVariables field of the Script Task. If you're using an SSIS variable to hold the folder path of the source files, as done below, add it to the ReadOnlyVariables field.
The Foreach Loop will need to use the Foreach From Variable Enumerator. In the Variable field on the Collection page, select the object variable that is populated in the Script Task. As you're probably already doing, add a string variable at Index 0 of the Variable Mappings pane and set this variable as the expression of the ConnectionString property on the connection manager, assuming this is a flat file connection. If this is Excel, change the ExcelFilePath property to use this variable as the expression instead.
The example code and referenced namespaces for the Script Task are below, in C#.
using System.Linq;
using System.Data.SqlClient;
using System.IO;
using System.Collections.Generic;
using System.Data;

string connString = @"Data Source=YourSQLServer;Initial Catalog=YourDatabase;Integrated Security=SSPI;";
string cmdText = @"SELECT DISTINCT ColumnWithFileNames FROM YourDatabase.YourSchema.YourTable";
string sourceFolder = Dts.Variables["User::SourceFilePath"].Value.ToString();

//create DirectoryInfo object from source folder
DirectoryInfo di = new DirectoryInfo(sourceFolder);
List<string> processedFiles = new List<string>();
List<string> newFiles = new List<string>();

//get names of already processed files stored in table
using (SqlConnection conn = new SqlConnection(connString))
{
    conn.Open();
    //data set name does not need to relate to name of table storing processed files
    DataSet ds = new DataSet("ProcessedFiles");
    SqlDataAdapter da = new SqlDataAdapter(cmdText, conn);
    da.Fill(ds, "ProcessedFiles");
    foreach (DataRow dr in ds.Tables["ProcessedFiles"].Rows)
    {
        processedFiles.Add(dr[0].ToString());
    }
}

foreach (FileInfo fi in di.EnumerateFiles())
{
    //only add files not already processed
    //(this compares full paths, so the table must store full paths too)
    if (!processedFiles.Contains(fi.FullName))
    {
        newFiles.Add(fi.FullName);
    }
}

//populate SSIS object variable with unprocessed files
Dts.Variables["User::ObjVar"].Value = newFiles.ToList();
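The matching step can also be tried outside SSIS. The following is a minimal sketch, assuming the table stores bare file names rather than full paths; in that case the Contains check against fi.FullName above would never match, and the name has to be extracted with Path.GetFileName first. FileFilter and GetUnprocessed are hypothetical names used only for illustration, and the case-insensitive comparison is an assumption that suits Windows file systems.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileFilter
{
    // Returns the paths from candidatePaths whose file name does not
    // appear in processedNames (compared case-insensitively).
    public static List<string> GetUnprocessed(
        IEnumerable<string> candidatePaths,
        IEnumerable<string> processedNames)
    {
        // A HashSet gives constant-time lookups instead of scanning a list per file.
        var processed = new HashSet<string>(processedNames, StringComparer.OrdinalIgnoreCase);

        return candidatePaths
            .Where(path => !processed.Contains(Path.GetFileName(path)))
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data standing in for the table rows and the folder listing.
        var processed = new[] { "jan.csv", "FEB.csv" };
        var found = new[]
        {
            Path.Combine("in", "jan.csv"),
            Path.Combine("in", "feb.csv"),
            Path.Combine("in", "mar.csv")
        };

        // Only the path for mar.csv is printed.
        foreach (var path in GetUnprocessed(found, processed))
        {
            Console.WriteLine(path);
        }
    }
}
```

In the Script Task, the returned list would be assigned to the object variable in place of newFiles.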

Receiving warning messages with EZAPI EzDerivedColumn and input columns

I am working with EzAPI to create a package that stages data for transformation. It works in terms of data movement. When I open the package in the designer, however, there are warning messages about the Derived Column and its input columns being set to read only.
Warning 148 Validation warning. Staging TableName:
{AA700319-FC05-4F06-A877-599E826EA833}: The "Additional
Columns.Inputs[Derived Column Input].Columns[DataSourceID]" on
"Additional Columns" has usage type READONLY, but is not referenced by
an expression. Remove the column from the list of available input
columns, or reference it in an expression. StageFull.dtsx 0 0
I can manually change them in the designer to be Read/Write or unselect them and the warning goes away. I am unable to get this to work programmatically however.
I have tried removing the columns from the metadata which works but doesn't remove them from the component so the columns are still created in the xml.
XML section
<externalMetadataColumn refId="Package\Full\Staging TableName\DestinationStaging TableName.Inputs[OLE DB Destination Input].ExternalColumns[DataSourceID]" dataType="i4" name="DataSourceID" />
When I try to go to the underlying object and delete the column using component.DeleteInput(id) I get an error message stating that the input column cannot be removed.
0xC0208010
-1071611888
DTS_E_CANTDELETEINPUT
An input cannot be deleted from the inputs collection.
Here is the code I am using to create a data flow task with an OLEDB Source, Derived Column, and OLE DB Destination.
Note that the input columns are not present until after the derived column is attached to the Source: dc.AttachTo(source);
public class EzMyDataFlow : EzDataFlow
{
    public EzMyDataFlow(EzContainer parent, EzSqlOleDbCM sourceconnection,
        EzSqlOleDbCM destinationconnection, string destinationtable, string sourcecomannd, string dataflowname)
        : base(parent)
    {
        Name = dataflowname;

        EzOleDbSource source = new EzOleDbSource(this);
        source.Connection = sourceconnection;
        source.SqlCommand = sourcecomannd;
        source.AccessMode = AccessMode.AM_SQLCOMMAND;
        source.Name = string.Format("Source_{0}", dataflowname);

        EzDerivedColumn dc = new EzDerivedColumn(this);
        dc.Name = "Additional Columns";

        // Setup DataSourceID
        string columnName = DBSchema.ReportFoundationalColumns.DataSourceID;
        dc.InsertOutputColumn(columnName);
        dc.SetOutputColumnDataTypeProperties(columnName, DataType.DT_I4, 0, 0, 0, 0);
        var c = dc.OutputCol(columnName);
        var property = c.CustomPropertyCollection["Expression"];
        property.Name = "Expression";
        property.Value = "@[TM::SourceDatabaseID]";
        property = c.CustomPropertyCollection["FriendlyExpression"];
        property.Name = "FriendlyExpression";
        property.Value = "@[TM::SourceDatabaseID]";
        dc.AttachTo(source);

        EzOleDbDestination destination = new EzOleDbDestination(this);
        destination.Table = destinationtable;
        destination.Connection = destinationconnection;
        destination.Name = string.Format("Destination{0}", dataflowname);
        destination.AttachTo(dc);
    }
}

Retrieving column mapping info in T4

I'm working on a T4 file that generates .cs classes based on an entity model, and one of the things I'm trying to get to is the mapping info in the model. Specifically, for each field in the model I'm trying to retrieve the name of the database field it is mapped to.
I've found that the mapping info is apparently stored in StorageMappingItemCollection, but am having an impossible time figuring out how to query it and retrieve the data I need. Has anyone worked with this class and can maybe provide guidance?
The code I have so far goes something like this (I've pasted everything up to the problematic line):
<#
    System.Diagnostics.Debugger.Launch();
    System.Diagnostics.Debugger.Break();
#>
<#@ template language="C#" debug="true" hostspecific="true"#>
<#@ include file="EF.Utility.CS.ttinclude"#>
<#@ output extension=".cs"#><#
CodeGenerationTools code = new CodeGenerationTools(this);
MetadataLoader loader = new MetadataLoader(this);
CodeRegion region = new CodeRegion(this, 1);
MetadataTools ef = new MetadataTools(this);
string inputFile = @"MyModel.edmx";
EdmItemCollection ItemCollection = loader.CreateEdmItemCollection(inputFile);
StoreItemCollection storeItemCollection = null;
loader.TryCreateStoreItemCollection(inputFile, out storeItemCollection);
StorageMappingItemCollection storageMappingItemCollection = null;
loader.TryCreateStorageMappingItemCollection(
    inputFile, ItemCollection, storeItemCollection, out storageMappingItemCollection);
var item = storageMappingItemCollection.First();
storageMappingItemCollection has methods like GetItem() and such, but I can't for the life of me get it to return data on fields that I know exist in the model.
Thanks in advance!
Parsing the MSL isn't really that hard with LINQ to XML:
string mslManifestResourceName = GetMslName(ConfigurationManager.ConnectionStrings["Your Connection String"].ConnectionString);
var stream = Assembly.GetExecutingAssembly().GetManifestResourceStream(mslManifestResourceName);
XmlReader xreader = new XmlTextReader(stream);
XDocument doc = XDocument.Load(xreader);
XNamespace xmlns = "http://schemas.microsoft.com/ado/2009/11/mapping/cs";
var items = from entitySetMap in doc.Descendants(xmlns + "EntitySetMapping")
            let entityTypeMap = entitySetMap.Element(xmlns + "EntityTypeMapping")
            let mappingFragment = entityTypeMap.Element(xmlns + "MappingFragment")
            select new
            {
                EntitySet = entitySetMap.Attribute("Name").Value,
                TypeName = entityTypeMap.Attribute("TypeName").Value,
                TableName = mappingFragment.Attribute("StoreEntitySet").Value
            };
It may be easier to parse the EDMX file as XML rather than using the StorageMappingItemCollection.
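Since the question asks for the column behind each property, the same LINQ to XML approach can be pointed at the ScalarProperty elements, which carry Name and ColumnName attributes in the mapping schema. What follows is a minimal sketch, not a complete MSL reader: it assumes the mapping namespace quoted above, ignores properties nested inside ComplexProperty elements, and uses the raw TypeName attribute as part of the key. MslReader, GetColumnMap and the sample mapping are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MslReader
{
    static readonly XNamespace Msl = "http://schemas.microsoft.com/ado/2009/11/mapping/cs";

    // Made-up mapping section, shaped like the one found inside an .edmx file.
    public const string SampleMsl =
@"<Mapping Space='C-S' xmlns='http://schemas.microsoft.com/ado/2009/11/mapping/cs'>
  <EntityContainerMapping StorageEntityContainer='StoreContainer' CdmEntityContainer='Entities'>
    <EntitySetMapping Name='People'>
      <EntityTypeMapping TypeName='Model.Person'>
        <MappingFragment StoreEntitySet='Person'>
          <ScalarProperty Name='Id' ColumnName='person_id' />
          <ScalarProperty Name='FullName' ColumnName='full_name' />
        </MappingFragment>
      </EntityTypeMapping>
    </EntitySetMapping>
  </EntityContainerMapping>
</Mapping>";

    // Maps "TypeName.PropertyName" to the store column name.
    public static Dictionary<string, string> GetColumnMap(XDocument doc)
    {
        var pairs =
            from typeMap in doc.Descendants(Msl + "EntityTypeMapping")
            from fragment in typeMap.Elements(Msl + "MappingFragment")
            from prop in fragment.Elements(Msl + "ScalarProperty")
            select new
            {
                Key = (string)typeMap.Attribute("TypeName") + "." + (string)prop.Attribute("Name"),
                Column = (string)prop.Attribute("ColumnName")
            };

        return pairs.ToDictionary(p => p.Key, p => p.Column);
    }

    public static void Main()
    {
        var map = GetColumnMap(XDocument.Parse(SampleMsl));
        foreach (var entry in map)
        {
            Console.WriteLine(entry.Key + " -> " + entry.Value);
        }
    }
}
```

Because Descendants searches the whole tree, the same method should work on a document loaded from the full .edmx file as well as on an extracted .msl resource, provided the namespace matches the EDMX version in use.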

linq2sql best way to update POCO having many columns

I am trying to update a POCO using LINQ to SQL. I can also use Entity Framework. For updating objects I follow the routine below.
// The GridView control gives me some updated POCOs, for example: Person updated;
void UpdatePerson(Person myUpdatedPersonfromUI)
{
    using (Entity con = new Entity())
    {
        var recordFromdB = from obj in con.Person
                           where obj.PK == myUpdatedPersonfromUI.PK
                           select obj;
        Person personOnDB = recordFromdB.Single();
        // now for each column I update personOnDB
        personOnDB.Property1 = myUpdatedPersonfromUI.Property1;
        personOnDB.Property2 = myUpdatedPersonfromUI.Property2;
        personOnDB.Property3 = myUpdatedPersonfromUI.Property3;
        personOnDB.Property4 = myUpdatedPersonfromUI.Property4;
        // continue updating fields ...
        personOnDB.Property124 = myUpdatedPersonfromUI.Property124;
        con.SaveChanges();
    }
}
Do I have to update each property manually? Please help.
You can use an object mapping tool like AutoMapper which pretty much will do the work for you - in this simple case (property names match between source and target) it would be a one-liner to map these.
Given that you are using the same object type in the UI, why not just plug it into the DAL and call update directly?
This link gives details on how to do it.
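If adding a mapping library is not an option, the property-by-property copy can also be written once with reflection. This is a minimal sketch of that idea, not AutoMapper's API; PropertyCopier, CopyTo and the cut-down Person class are hypothetical names for illustration. On a real entity, the key and any navigation properties would need to be listed in the skip argument.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Cut-down stand-in for the entity in the question.
public class Person
{
    public int PK { get; set; }
    public string Property1 { get; set; }
    public string Property2 { get; set; }
}

public static class PropertyCopier
{
    // Copies every public instance property that is readable and writable
    // from source to target, except the ones named in skip.
    public static void CopyTo<T>(T source, T target, params string[] skip)
    {
        var properties = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead
                     && p.CanWrite
                     && p.GetIndexParameters().Length == 0
                     && !skip.Contains(p.Name));

        foreach (PropertyInfo p in properties)
        {
            p.SetValue(target, p.GetValue(source, null), null);
        }
    }

    public static void Main()
    {
        var fromUi = new Person { PK = 1, Property1 = "new", Property2 = "values" };
        var onDb = new Person { PK = 1, Property1 = "old", Property2 = "old" };

        // Leave the key alone; copy everything else.
        CopyTo(fromUi, onDb, "PK");

        Console.WriteLine(onDb.Property1 + " " + onDb.Property2); // prints "new values"
    }
}
```

Inside UpdatePerson, the block of assignments would then shrink to a single CopyTo(myUpdatedPersonfromUI, personOnDB, "PK") call before SaveChanges. Reflection is slower than direct assignment, which rarely matters for a single-row update from a grid.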

Modify column attribute using ADOX [ vc++ and MS Access]

I have to add new columns to an existing table. I am able to add a new column successfully, but the following exception occurs while trying to modify the column attribute to nullable.
Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done
Here is my code:
HRESULT hr = S_OK;
ADOX::_CatalogPtr pCatalog = NULL;
ADOX::_TablePtr pTable = NULL;
ADOX::TablesPtr pTables = NULL;
hr = pCatalog.CreateInstance(__uuidof(Catalog));
pCatalog->PutActiveConnection("Provider='Microsoft.JET.OLEDB.4.0';data source='C:\\sample.mdb';");
pTables = pCatalog->GetTables();
pTable = pTables->Item["sampletable"];
hr = pTable->Columns->Append( "age", ADOX::adInteger, 0);
ASSERT(hr == S_OK);
pTable->Columns->Item["age"]->Attributes = ADOX::adColNullable;
The equivalent code in VBA works for me without error (assuming I have translated it faithfully).
Something perhaps to try is to create a Column object, set its properties including NULLable then append it to the Table object's Columns collection e.g. this in VBA:
Set oColumn = New ADOX.Column
oColumn.Name = "age"
oColumn.Type = ADOX.adInteger
oColumn.Attributes = ADOX.adColNullable
oTable.Columns.Append oColumn