SSIS package merge conflict resolved but causing ETL package load failure

I was trying to merge two Git branches and hit a merge conflict.
I resolved the conflicts and saved the file, but now the whole package fails to load.
The error shown is "An item with the same key is already added".
The stack trace is below. I cannot work out where exactly I should make the change.
Could anyone help me resolve it?
Please let me know if more information would make troubleshooting easier.
at Microsoft.SqlServer.Dts.Runtime.Project.OpenProject(IProjectStorage storage, String projectPassword, IDTSEvents events)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.<>c__DisplayClass21_0.<LoadProject>b__0(String password, IDTSEvents events)
at Microsoft.DataTransformationServices.Controls.ProjectProtectionUtils.LoadProjectWithPassword(Boolean askedPasswordOnce, ProjectLoader loader, IWin32Window dialogParent, String& password, ProjectProtectionEvents errorListener)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.LoadProject(XmlNode manifestNode, String& projectPassword, ProjectProtectionEvents errorListener)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.DeserializeManifestInProjectMode(XmlNode manifestNode)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.ConstructProjectHierarchyFrom(ProjectSerialization projectSerialization)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.Deserialize(TextReader reader)
at Microsoft.DataWarehouse.VsIntegration.Shell.Project.Serialization.BaseProjectLoader.Load(IFileProjectHierarchy projectHierarchy)
at Microsoft.DataWarehouse.VsIntegration.Shell.Project.FileProjectHierarchy.Load(String pszFilename, UInt32 grfMode, Int32 iReadOnly)

An SSIS package is an XML based file format.
Yes, you should absolutely use source control to version your packages. But you would be best off treating them as binaries because no source tool I am aware of knows how to merge XML documents.
The error you're experiencing is that you have an invalid package declaration. Without seeing the two files and the merge record, it's super hard to guess what's been done, much less rectify it.
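Without the files, one rough way to narrow this down is to scan the merged XML for the two things a hand-resolved merge most often leaves behind. The following is only a sketch in Python, under two assumptions that are not confirmed by the question: that the "same key" error comes from either leftover conflict markers or an identifier that the merge duplicated, and that the identifiers worth checking are the DTS:DTSID and DTS:refId attributes used by newer .dtsx files. The function name find_merge_damage is hypothetical. Since the stack trace fails inside Project.OpenProject, the project file may be worth scanning as well as the packages.

```python
import re
from collections import Counter

# Assumption: these attributes should be unique within a single file.
ID_ATTR = re.compile(r'DTS:(DTSID|refId)="([^"]*)"')
# Lines git leaves behind when a conflict was not fully resolved.
CONFLICT_MARKER = re.compile(r'^(<{7}|={7}|>{7})( |$)', re.MULTILINE)


def find_merge_damage(xml_text):
    """Return (conflict_marker_count, {(attribute, value): count}) for repeated ids."""
    markers = len(CONFLICT_MARKER.findall(xml_text))
    counts = Counter(ID_ATTR.findall(xml_text))
    duplicates = {key: n for key, n in counts.items() if n > 1}
    return markers, duplicates
```

A non-zero marker count or a non-empty duplicates dictionary points at the lines to inspect by hand. An empty result does not prove the file is valid.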
SSIS Source Control guidance
After doing SSIS for nearly 20 years, I have a few thoughts on the matter.
Design your packages to be as small and tightly focused as possible, each solving a single business problem (Populate Sales table from Excel).
Use package orchestration to solve the dependent package problem (Run the Employee Package, then Customer, then Sales).
Only one developer works on a package at a time. If the business problem supports it, decompose the package into smaller packages so that more developers can work on a task.
If adding new packages to a project/solution, have a captain/leader create empty/shell packages and commit the project to source control, because the SSDT project artifacts are also XML and subject to the same botched merge logic.

Related

Data Factory SSIS IR Locale settings

We are trying to move a legacy dtsx package to Data Factory and have an SSIS IR running, but when the package runs, the date format is MM/dd/yyyy instead of the expected dd/MM/yyyy (Australian). The package could most definitely have been written a little better, but we just want to lift-and-shift with as little change as possible. I've looked for locale settings in Data Factory and the SSIS IR but have been unable to find anything. The developer has shown me that the locale setting on the dtsx file is set to Australian. I'm hoping not to have to go through each package to update it. Ideally there would be a "global" setting somewhere that applies the change across the board. Is this possible, and if so, where would I configure it?
Thanking you in advance.

Is there a function in SSIS to extract functionality from a package to be added to other packages?

I just added error handling functionality to an SSIS package that I am upgrading, and I need to add this same error handling to about 30 more packages. Is there a way to extract the error handling control flow, parameters, variables, etc. so that I can easily add them to the rest of the packages?
I am using Visual Studio Enterprise 2019 and SSIS 15.0.
I found a bunch of articles on BIML, but it looks like that is only for creating new packages. I am aware that copy and paste exists, but I would like to try to find a solution that is easy to apply across future packages as well as the current packages being updated. Apologies if this question has already been asked, I searched, but I'm not sure that I even really know what search terms would be applicable.
Yes, Biml is an excellent choice for creating consistent packages going forward. Even if you're only generating empty packages with error handling logic, that's a pattern and that's the power of Biml.
With the change to BimlExpress and the now free ability to reverse engineer packages, one approach is to reverse engineer the existing packages to Biml. The result would all be static tier. You would then select all of those files and, in a new BimlScript file, add the error handling like so:
<#
// Add an empty OnError handler to every package that does not already have one.
foreach (AstPackageNode apn in this.RootNode.Packages)
{
    if (!apn.Events.Any(x => x.EventType == EventType.OnError))
    {
        AstTaskEventHandlerNode onError = new AstTaskEventHandlerNode(null);
        onError.EventType = EventType.OnError;
        onError.Name = "OnError";
        // TODO: add tasks and such
        apn.Events.Add(onError);
    }
    // Uncomment to inspect the Biml generated for each package.
    //WriteLine(apn.GetBiml());
}
#>
Once that's looking good, select everything at once, right-click and generate the packages.
A non-Biml approach is going to test your C# (or VB.NET) skills. I've not touched this type of SSIS dev in more than a decade but the concept will remain the same. https://billfellows.blogspot.com/2016/10/what-packages-still-use-configuration.html
You'll need to find all the SSIS packages. For each one of those, use a reference to the DTS Runtime application to load it. Then look at the package's Events collection and if there isn't an OnError, you're going to have to add one to the collection and then add all the associated tasks, configure them and then save.
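Before writing anything against the DTS Runtime, it may help to list which packages lack a package-level OnError handler at all. Below is a minimal read-only sketch in Python that inspects the package XML directly rather than loading it through the runtime. It assumes the newer .dtsx layout, where handlers appear as DTS:EventHandler elements with a DTS:EventName attribute under the root's DTS:EventHandlers element, and it assumes the namespace URI shown. Both are assumptions to verify against one of your own files, and the function names are hypothetical. It only reports; adding the handler and its tasks still needs Biml or the runtime API described above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: namespace URI used by .dtsx files.
DTS_NS = "www.microsoft.com/SqlServer/Dts"


def has_package_on_error(root):
    """True if the package root element declares an OnError event handler."""
    # Only handlers directly under the package count, not task-level ones.
    path = "{%s}EventHandlers/{%s}EventHandler" % (DTS_NS, DTS_NS)
    for handler in root.findall(path):
        if handler.get("{%s}EventName" % DTS_NS) == "OnError":
            return True
    return False


def packages_missing_on_error(folder):
    """Return the .dtsx paths under folder that have no package-level OnError."""
    missing = []
    for path in sorted(Path(folder).rglob("*.dtsx")):
        root = ET.parse(str(path)).getroot()
        if not has_package_on_error(root):
            missing.append(path)
    return missing
```

Calling packages_missing_on_error on the project folder gives the worklist of packages that still need the handler.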

Razor exceptions

I have undoubtedly set something up wrong but frequently I get exceptions thrown by my Razor templates even though there is no problem with the templates. These are usually fixed by my doing a build.
If I do actually have an error in the template I get a popup asking me to debug in VS, but of course this does not actually allow me to debug the template.
Errors in my log are not all that helpful (see below).
Is it possible to both avoid spurious errors and get better information when there is actually a problem?
ServiceStack.Razor.Templating.TemplateCompilationException: Unable to compile template. Check the Errors list for details.
at ServiceStack.Razor.Templating.TemplateService.CreateTemplate(String template, Type modelType)
at ServiceStack.Razor.Templating.TemplateService.Compile(ViewPageRef viewPageRef, String template, Type modelType, String name)
at ServiceStack.Razor.Templating.TemplateService.Compile(ViewPageRef viewPageRef, String template, String name)
at ServiceStack.Razor.ViewPageRef.Compile(Boolean force)
I was having similar problems. I found that the "easiest" way to find out what the error was, was to download all of ServiceStack, build a debug version of the Razor library and link it into my project. I then set a breakpoint in the ServiceStack.Razor.Templating.TemplateService.CreateTemplate method and was able to see the full exception details. From there I learnt that I had included an import in my Razor page that was not referenced in my project.
Since I solved this it's been very reliable.
I had trouble with this myself, because ServiceStack swallowed the exceptions, and the logs, as you said, don't show the Errors collection. There are two ways to get that information:
Uncheck Enable Just My Code in the debugging options in Visual Studio (Debug -> Options and Settings). If you have checked Thrown for Common Language Runtime Exceptions in Debug -> Exceptions, you will get the exceptions, and be able to view the Errors collection.
A merge was committed some days ago to the ServiceStack repository, which makes it log the Errors collection. Demis Bellot apparently pushes new versions to NuGet fairly often, so it'll probably be there in a week or two.
I had the same problem. In my case, I had removed some libraries from the project but the references to them remained (even though I thought I had removed them), and that was the problem.
After I deleted the references to libraries that no longer existed in the project, it worked immediately.

Redeploying SSIS packages - Cache?

We have noticed an issue recently where redeployed SSIS packages sometimes don't seem to include the latest changes... When I search the dtsx using Notepad I see the amended script in the code, so the changes are definitely there.
My assumption was that script components of SSIS packages are eventually compiled into an assembly somewhere in the process - this is quite likely since I would imagine C# code cannot run without something compiling it first. So in theory if these assemblies would then end up being cached and not immediately overwritten (for some reason) that would explain this issue.
The only "evidence" that makes me think that my theory is correct is if I keep running the package at some point it suddenly shifts to the new code.
However, so far I haven't found why and how this is happening, if it is... Can anybody help?
UPDATE:
MSDN says: "Unlike earlier versions where you could indicate whether the scripts were precompiled, all scripts are precompiled in SQL Server 2008 Integration Services (SSIS) and later versions." - If by pre-compiled they mean that instead of the actual package a pre-compiled version runs (I think this because the package itself does not seem to be compiled since the code is visible in Notepad) there must be a way to force the engine to overwrite the pre-compiled assembly... but how?
UPDATE:
One of the four core components of SSIS is the SQL Server Integration Services service, which is a Windows service. Apparently this service caches component/task metadata so that the SSIS runtime engine can poll the cache to see what is installed, which may help speed up package load times. However, if the packages are stored in the file system (not in SQL Integration Services) and executed by Agent Jobs, the agent job will use the 64-bit version of DTEXEC to execute the packages. I haven't yet found evidence that any caching is involved there, but there are certainly options to check a number of parameters, such as version numbers, in the validation phase of the execution. Maybe that is for a reason.
Have you looked at sysssispackages to compare the version build number of the package in msdb to your build number in Visual Studio / SSIS?
SELECT name, verbuild
FROM msdb.dbo.sysssispackages
WHERE name LIKE '%bla%'
(Adjust WHERE-clause as necessary to find your package. Do NOT ever "SELECT * FROM msdb.dbo.sysssispackages" as it contains the package XML in one of the columns.)
And in Visual Studio, open the package, then right-click at the background of the package and select "Properties" from the context menu. Look at the field VersionBuild. It should match the number from the SELECT above!
I know this is not an actual solution to your problem but it may help locate where the cause of the problem is. If the number is older, it means that your package deployment did not work.
This sounds similar to something I ran into a while back. Unfortunately, I don't remember exactly when I ran into it (so I can't check for sure), but I believe the fix I found was to make sure that I explicitly invoked the Build | Build st_5bd541c294054c25b9e7eb55b92bd0e2 command from the script editor (VSTA) menu before closing the window. (The specific project name will be different for each script, obviously, since it's based on a GUID; however, there will only be one possible submenu under Build.)
Explicitly invoking the Build command ensures that the binary code for the script gets encoded as ASCII text (the sample below looks like Base64) and saved in the XML of the resulting .dtsx file. I'd gotten used to SSIS 2005 always building for me whenever I closed the script editor. Apparently, there are bizarre edge cases where SSIS 2008 doesn't always build the script project when the editor closes.
BTW, the precompiled binaries appear to be stored in a tag of the source XML called BinaryItem:
<DTS:Executable DTS:ExecutableType="Microsoft.SqlServer.Dts.Tasks.ScriptTask.ScriptTask, Microsoft.SqlServer.ScriptTask, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" DTS:ThreadHint="0">
<DTS:Property DTS:Name="ObjectName">SCR_StepOne</DTS:Property>
<DTS:ObjectData>
<ScriptProject Name="ST_5bd541c294054c25b9e7eb55b92bd0e2" VSTAMajorVersion="2" VSTAMinorVersion="1" Language="CSharp" EntryPoint="Main" ReadOnlyVariables="User::FileOneName,User::OutputFolder" ReadWriteVariables="">
<BinaryItem Name="\bin\release\st_5bd541c294054c25b9e7eb55b92bd0e2.csproj.dll">
TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAgAAAAA4fug4AtAnNIbgBTM0hVGhpcyBwcm9ncmFtIGNhbm5vdCBiZSBydW4gaW4gRE9TIG1v
ZGUuDQ0KJAAAAAAAAABQRQAATAEDADuOb04AAAAAAAAAAOAAAiELAQgAABAAAAAIAAAAAAAAPi8A
AAAgAAAAQAAAAABAAAAgAAAAAgAABAAAAAAAAAAEAAAAAAAAAACAAAAAAgAAAAAAAAMAQIUAABAA
It might be worth checking your source code control system history to see if that was getting updated for some of those screwy errors.
Caveat: I haven't found official Microsoft documentation on this.
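If source control history is awkward to read for those long encoded blocks, one option is to hash each BinaryItem payload in two versions of the package and compare the hashes. The following Python sketch assumes the payload is Base64 (the sample above appears to be, but given the caveat this is unconfirmed) and that BinaryItem elements carry a Name attribute as shown. The function name binary_item_hashes is hypothetical.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def binary_item_hashes(dtsx_xml):
    """Map each BinaryItem Name to the SHA-256 hex digest of its decoded bytes."""
    root = ET.fromstring(dtsx_xml)
    hashes = {}
    for element in root.iter():
        # Compare the local tag name so a namespace prefix does not matter.
        if element.tag.rsplit("}", 1)[-1] != "BinaryItem":
            continue
        # Assumption: the text is Base64, wrapped across several lines.
        encoded = "".join((element.text or "").split())
        payload = base64.b64decode(encoded)
        hashes[element.get("Name")] = hashlib.sha256(payload).hexdigest()
    return hashes
```

If the script source changed between two commits but the hash for the matching BinaryItem did not, that would be consistent with the script not having been rebuilt before the package was saved.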
This doesn't specifically solve the mystery you have, but if you are running file system-based packages and want to verify that the package that is running is the package you deployed, there is a way to do that.
Build your package.
Open the properties on your package and note down the "Version Build" property (alternatively, open the .dtsx in Notepad and find the DTS:VersionBuild attribute).
Deploy your package.
In your SQL Agent job step, go to the Verification tab.
Enter the Version Build in the "Verify package build" input box.
Execute the job step.
I don't know if this will force SSIS to throw out its cache and get the newly deployed package. I do know that if you modify the .dtsx package's build number by hand and then re-run the job step, it fails because the package build doesn't match what it's looking for, so it is definitely doing a run-time check of that value.
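Reading the build number out of the deployed file can also be scripted instead of opening each package in Notepad. This is a small Python sketch, not a definitive parser: it assumes the value is stored either as a DTS:VersionBuild attribute on the root element or, in the older layout, as a DTS:Property element whose DTS:Name is "VersionBuild", and it assumes the namespace URI shown. The function name read_version_build is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI used by .dtsx files.
DTS_NS = "www.microsoft.com/SqlServer/Dts"


def read_version_build(dtsx_xml):
    """Return the package's VersionBuild as an int, or None if it is absent."""
    root = ET.fromstring(dtsx_xml)
    # Attribute form: <DTS:Executable DTS:VersionBuild="12" ...>
    value = root.get("{%s}VersionBuild" % DTS_NS)
    if value is None:
        # Element form: <DTS:Property DTS:Name="VersionBuild">12</DTS:Property>
        for prop in root.findall("{%s}Property" % DTS_NS):
            if prop.get("{%s}Name" % DTS_NS) == "VersionBuild":
                value = prop.text
                break
    return int(value) if value is not None else None
```

The returned number can then be compared with the value entered on the Verification tab or with verbuild from msdb.dbo.sysssispackages.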

SSIS Common rownumber for both outputs on a flatfile source

I have a small problem (I assume...)
I'm loading a flat file (csv) and I want to add a row number to the data flow. Using the RowNumber transformation works well for both output paths (source and error) individually. But what if you want to use the same row number in both paths, to be able to track where (in the file) an error occurred? I have scratched my head long enough now and I'm just throwing it out here, since I'm pretty sure other people have stumbled across this one...
I have tried the script transformation which seems to work for a while but then it hangs the load.
Any suggestion on how to solve this issue is greatly appreciated.
If I understand you correctly, dynamically generating the number with a script component for the dataflow is not a problem for you.
What I would recommend is to adopt the following philosophy for stable ETL processes that load from files:
Never cast anything in the connector; just import the fields as nvarchars of the maximum length they can reach.
Cast and control each column to your specification.
If a row cannot be read, you will not know the index, but you will know that the file is malformed (extremely rare in my experience, for half transferred files), and it should be rejected anyway.
A quick screenshot of part of a file loading process shows how the rejection (after assigning row_id) can work (link to dataflow image). To this you can add countless further checks (duplicates...) and even keep a repository of the loaded files to check the rejects and whatever else you might want to control (link to control flow image).
In some of my processes, I even use a flat file connector that imports each row as a single block of text and then split it into columns with an intermediate script component, which allows for different versions of the columns in the files.
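The idea of numbering each raw row first and only then splitting and validating it can be illustrated outside SSIS. The Python sketch below is not the script component itself, only the same logic under simplifying assumptions: a plain delimiter split (no quoted fields) and a fixed expected column count. The names load_rows and expected_columns are hypothetical.

```python
def load_rows(lines, expected_columns, delimiter=","):
    """Number every raw row, then route it to accepted or rejected.

    Both outputs carry the same row_id, so a reject can be traced
    back to its position in the file.
    """
    accepted, rejected = [], []
    for row_id, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        # Naive split: does not handle delimiters inside quoted fields.
        columns = text.split(delimiter)
        if len(columns) == expected_columns:
            accepted.append((row_id, columns))
        else:
            rejected.append((row_id, text))
    return accepted, rejected
```

Because the number is assigned before any casting or splitting, the good path and the reject path share one sequence, which is what the separate RowNumber transformations on each output cannot give you.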
Anyway, sorry not to be more detailed (due to my status I can't add more links or any images), but I hope that you understand the concept.
Regards,
Francisco.