Redeploying SSIS packages - Cache? - ssis

We have recently noticed that redeployed SSIS packages sometimes don't seem to include the latest changes. When I search the .dtsx file in Notepad I see the amended script in the code, so the changes are definitely there.
My assumption was that script components of SSIS packages are eventually compiled into an assembly somewhere in the process - this is quite likely since I would imagine C# code cannot run without something compiling it first. So in theory if these assemblies would then end up being cached and not immediately overwritten (for some reason) that would explain this issue.
The only "evidence" that makes me think that my theory is correct is if I keep running the package at some point it suddenly shifts to the new code.
However, so far I haven't found out why and how this is happening, if it is happening at all. Can anybody help?
UPDATE:
MSDN says: "Unlike earlier versions where you could indicate whether the scripts were precompiled, all scripts are precompiled in SQL Server 2008 Integration Services (SSIS) and later versions." - If by pre-compiled they mean that instead of the actual package a pre-compiled version runs (I think this because the package itself does not seem to be compiled since the code is visible in Notepad) there must be a way to force the engine to overwrite the pre-compiled assembly... but how?
UPDATE:
One of the four core components of SSIS is the SQL Server Integration Services service, which is a Windows service. Apparently this service caches component/task metadata so that the SSIS runtime engine can poll the cache to see what is installed, which may help speed up package load times. However, if the packages are stored in the file system (not in SQL Integration Services) and executed by Agent jobs, the Agent job will use the 64-bit version of DTEXEC to execute the packages. I haven't yet found evidence that any caching is involved there, but there are certainly options to check a number of parameters, such as version numbers, in the validation phase of the execution. Maybe that is for a reason.

Have you looked at sysssispackages to compare the version build number of the package in msdb to your build number in Visual Studio / SSIS?
SELECT name, verbuild
FROM msdb.dbo.sysssispackages
WHERE name LIKE '%bla%'
(Adjust WHERE-clause as necessary to find your package. Do NOT ever "SELECT * FROM msdb.dbo.sysssispackages" as it contains the package XML in one of the columns.)
And in Visual Studio, open the package, then right-click at the background of the package and select "Properties" from the context menu. Look at the field VersionBuild. It should match the number from the SELECT above!
I know this is not an actual solution to your problem but it may help locate where the cause of the problem is. If the number is older, it means that your package deployment did not work.

This sounds similar to something I ran into a while back. Unfortunately, I don't remember exactly when I ran into this (so I can't check for sure), but I believe the fix I found was to make sure that I explicitly invoked the Build | Build st_5bd541c294054c25b9e7eb55b92bd0e2 command from the script editor (VSTA) menu before closing the window. (The specific project name will be different for each script, obviously, since it's based on a GUID; however, there will only be one possible submenu under Build.)
Explicitly invoking the Build command ensures that the binary code for the script gets ASCII-encoded and saved in the XML of the resulting .dtsx file. I'd gotten used to SSIS 2005 always building for me whenever I closed the script editor. Apparently, there are bizarre edge cases where SSIS 2008 doesn't always build the script project when the editor closes.
BTW, the precompiled binaries appear to be stored in a tag of the source XML called BinaryItem:
<DTS:Executable DTS:ExecutableType="Microsoft.SqlServer.Dts.Tasks.ScriptTask.ScriptTask, Microsoft.SqlServer.ScriptTask, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" DTS:ThreadHint="0">
<DTS:Property DTS:Name="ObjectName">SCR_StepOne</DTS:Property>
<DTS:ObjectData>
<ScriptProject Name="ST_5bd541c294054c25b9e7eb55b92bd0e2" VSTAMajorVersion="2" VSTAMinorVersion="1" Language="CSharp" EntryPoint="Main" ReadOnlyVariables="User::FileOneName,User::OutputFolder" ReadWriteVariables="">
<BinaryItem Name="\bin\release\st_5bd541c294054c25b9e7eb55b92bd0e2.csproj.dll">
TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAgAAAAA4fug4AtAnNIbgBTM0hVGhpcyBwcm9ncmFtIGNhbm5vdCBiZSBydW4gaW4gRE9TIG1v
ZGUuDQ0KJAAAAAAAAABQRQAATAEDADuOb04AAAAAAAAAAOAAAiELAQgAABAAAAAIAAAAAAAAPi8A
AAAgAAAAQAAAAABAAAAgAAAAAgAABAAAAAAAAAAEAAAAAAAAAACAAAAAAgAAAAAAAAMAQIUAABAA
It might be worth checking your source code control system history to see if that was getting updated for some of those screwy errors.
Caveat: I haven't found official Microsoft documentation on this.
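If you want to check whether the embedded binary actually changed between two saved copies of a package, something like the following might help. It is only a sketch: it assumes the compiled script is stored as base64 text inside BinaryItem elements, as in the excerpt above, and the function names are my own, not part of any SSIS tooling.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def binary_item_hashes(dtsx_xml: str) -> dict:
    """Map each BinaryItem Name to the SHA-256 of its decoded payload."""
    root = ET.fromstring(dtsx_xml)
    hashes = {}
    for elem in root.iter():
        # Compare local names only, so this works with or without a namespace.
        if elem.tag.rsplit("}", 1)[-1] != "BinaryItem":
            continue
        # The payload is wrapped across lines, so strip all whitespace first.
        payload = base64.b64decode("".join((elem.text or "").split()))
        hashes[elem.get("Name", "")] = hashlib.sha256(payload).hexdigest()
    return hashes


def changed_binaries(old_xml: str, new_xml: str) -> list:
    """Return the BinaryItem names whose compiled payload differs."""
    old, new = binary_item_hashes(old_xml), binary_item_hashes(new_xml)
    return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))
```

If the script source in the .dtsx changed but changed_binaries reports nothing for that script, the binary was probably not rebuilt before the package was saved.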

This doesn't specifically solve the mystery you have, but if you are running file system-based packages and want to verify that the package that is running is the package you deployed, there is a way to do that.
Build your package.
Open the properties on your package and note down the "Version Build" property (alternatively, open the .dtsx in notepad and find the DTS:VersionBuild attribute.)
Deploy your package.
In your SQL Agent job step, go to the Verification tab.
Enter the Version Build in the "Verify package build" input box.
Execute the job step.
I don't know if this will force SSIS to throw out its cache and get the newly deployed package, but I do know if you modify the .dtsx package's build number by hand and then try to re-run the job step it fails because the package build doesn't match what it's looking for so it is definitely doing a run-time check of that value.
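Reading the build number out of the .dtsx file can also be scripted instead of done by hand in Notepad. The sketch below assumes the usual DTS namespace URI and handles two layouts: VersionBuild as an attribute on the root element, and (an assumption on my part for older packages) as a DTS:Property child element. Check it against one of your own files before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI used by the DTS: prefix in .dtsx files.
DTS_NS = "www.microsoft.com/SqlServer/Dts"


def read_version_build(dtsx_xml: str):
    """Return the package's VersionBuild as an int, or None if absent."""
    root = ET.fromstring(dtsx_xml)
    # Attribute form: <DTS:Executable DTS:VersionBuild="7" ...>
    value = root.get(f"{{{DTS_NS}}}VersionBuild")
    if value is None:
        # Property form: <DTS:Property DTS:Name="VersionBuild">7</DTS:Property>
        for prop in root.findall(f"{{{DTS_NS}}}Property"):
            if prop.get(f"{{{DTS_NS}}}Name") == "VersionBuild":
                value = prop.text
                break
    return int(value) if value is not None else None
```

The returned number is what you would type into the "Verify package build" box, or compare with the deployed copy of the file.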

Related

Is there a function in SSIS to extract functionality from a package to be added to other packages?

I just added error handling functionality to an SSIS package that I am upgrading, and I need to add this same error handling to about 30 more packages. Is there a way to extract the error handling control flow, parameters, variables, etc. so that I can easily add them to the rest of the packages?
I am using Visual Studio Enterprise 2019 and SSIS 15.0.
I found a bunch of articles on BIML, but it looks like that is only for creating new packages. I am aware that copy and paste exists, but I would like to try to find a solution that is easy to apply across future packages as well as the current packages being updated. Apologies if this question has already been asked, I searched, but I'm not sure that I even really know what search terms would be applicable.
Yes, Biml is an excellent choice for creating consistent packages going forward. Even if you're only generating empty packages with error handling logic, that's a pattern and that's the power of Biml.
With the change to BimlExpress and the now free ability to reverse engineer packages, one approach could be to reverse engineer the packages to Biml. The result would all be static Biml, so you'll need to select all of those files together with a new BimlScript file in which you add the error handling, like so:
<#
foreach (AstPackageNode apn in this.RootNode.Packages)
{
    if (!apn.Events.Where(x => x.EventType == EventType.OnError).Any())
    {
        AstTaskEventHandlerNode onError = new AstTaskEventHandlerNode(null);
        onError.EventType = EventType.OnError;
        onError.Name = "OnError";
        // TODO: add tasks and such
        apn.Events.Add(onError);
    }
    //WriteLine(apn.GetBiml());
}
#>
Once that's looking good, you right click on everything at once and generate packages.
A non-Biml approach is going to test your C# (or VB.NET) skills. I've not touched this type of SSIS dev in more than a decade but the concept will remain the same. https://billfellows.blogspot.com/2016/10/what-packages-still-use-configuration.html
You'll need to find all the SSIS packages. For each one of those, use a reference to the DTS Runtime application to load it. Then look at the package's Events collection and if there isn't an OnError, you're going to have to add one to the collection and then add all the associated tasks, configure them and then save.
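Before modifying anything, it may be useful just to list which packages lack a package-level OnError handler. The sketch below reads the .dtsx XML directly rather than going through the DTS runtime, so it only reports and never edits. It assumes handlers are serialized as DTS:EventHandler elements with a DTS:EventName attribute under a DTS:EventHandlers child of the root; verify that against one of your own packages, since the layout may differ between SSIS versions.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed namespace for the DTS: prefix in .dtsx files.
DTS = "{www.microsoft.com/SqlServer/Dts}"


def has_package_on_error(root: ET.Element) -> bool:
    """True if the package root declares an OnError event handler."""
    for handler in root.findall(f"{DTS}EventHandlers/{DTS}EventHandler"):
        if handler.get(f"{DTS}EventName") == "OnError":
            return True
    return False


def packages_missing_on_error(folder: str) -> list:
    """List .dtsx files under folder without a package-level OnError handler."""
    return sorted(
        str(path)
        for path in Path(folder).rglob("*.dtsx")
        if not has_package_on_error(ET.parse(path).getroot())
    )
```

The resulting list is the set of packages you would then need to open and change, whether through Biml or the object model.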

SSIS package merge conflict resolved but causing etl Package load failure

I was trying to merge two git branches and encountered Merge conflict error.
I tried to resolve them and saved it. But now the whole package is unable to load.
The error shown is "An item with the same key is already added".
The error msg is as below. I am unable to find out where exactly I should make the change.
Could anyone help me to resolve it?
Please let me know if I need to add more info for the troubleshooting to be easier.
at Microsoft.SqlServer.Dts.Runtime.Project.OpenProject(IProjectStorage storage, String projectPassword, IDTSEvents events)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.<>c__DisplayClass21_0.<LoadProject>b__0(String password, IDTSEvents events)
at Microsoft.DataTransformationServices.Controls.ProjectProtectionUtils.LoadProjectWithPassword(Boolean askedPasswordOnce, ProjectLoader loader, IWin32Window dialogParent, String& password, ProjectProtectionEvents errorListener)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.LoadProject(XmlNode manifestNode, String& projectPassword, ProjectProtectionEvents errorListener)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.DeserializeManifestInProjectMode(XmlNode manifestNode)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.ConstructProjectHierarchyFrom(ProjectSerialization projectSerialization)
at Microsoft.DataTransformationServices.Project.DataTransformationsProjectLoader.Deserialize(TextReader reader)
at Microsoft.DataWarehouse.VsIntegration.Shell.Project.Serialization.BaseProjectLoader.Load(IFileProjectHierarchy projectHierarchy)
at Microsoft.DataWarehouse.VsIntegration.Shell.Project.FileProjectHierarchy.Load(String pszFilename, UInt32 grfMode, Int32 iReadOnly)
An SSIS package is an XML based file format.
Yes, you should absolutely use source control to version your packages. But you would be best off treating them as binaries because no source tool I am aware of knows how to merge XML documents.
The error you're experiencing is that you have an invalid package declaration. Without seeing the two files and the merge record, it's super hard to guess what's been done, much less rectify it.
SSIS Source Control guidance
After doing SSIS for nearly 20 years, I have a few thoughts on the matter.
Design your packages to be as small and tightly focused on solving a single business problem (Populate Sales table from Excel)
Use package orchestration to solve the dependent package problem (Run the Employee Package, then Customer, then Sales)
Only one developer works on a package at a time. Decompose the package into smaller packages if the business problem supports it to get more developers working on a task
If adding new packages to a project/solution, have a captain/leader create empty/shell packages and commit the project to source control - because the SSDT project artifacts are also XML and subject to the same botched merge logic.

ivy publish multi modules - how to continue on publishing others if one fails

I have an Ant project with over 100 modules. I cycle through all modules to compile, package, and publish them in one build run. However, when one ivy:publish fails (due to a random connection issue), the entire build exits.
I would like the build process to continue compiling and publishing the remaining modules even if one module fails to publish for whatever reason.
Is there some settings in ivy:publish to prevent exiting upon error or some other way to achieve this?
thanks
Since you appear to be using ANT to call multiple sub-builds, I would submit that this is a control loop problem rather than something specific to ivy. In other words, you are best advised to make each module's build as stand-alone as you can, and then in your loop each module's build should succeed or fail on its own.
You have not indicated what your main build file looks like. I would highly recommend using the subant task, as this has a "failonerror" flag that will give you your desired behaviour when set to false (the build will continue if a module fails).
<subant failonerror="false">
<fileset dir="." includes="**/build.xml" excludes="build.xml"/>
<target name="clean"/>
<target name="build"/>
</subant>
This should be enough to solve your problem. Any build that fails can be manually re-run. In practice this might be difficult, since one failing module might cause a subsequent build to fail due to missing dependencies. You need to judge the risks of this for yourself.
You can even further complicate your solution later, by using an embedded script to run module builds. If you have lots and lots of errors you might want to add some bespoke error handling logic.
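To illustrate the control loop idea outside of Ant, here is a rough sketch of a driver script that runs every module's build, keeps going on failure, and reports the failures at the end. The per-module build.xml layout and the "publish" target name are assumptions for illustration only; substitute whatever your modules actually use.

```python
import subprocess


def build_all(modules, run=None):
    """Run each module's build; keep going on failure and return failed modules."""
    if run is None:
        # Assumed layout: <module>/build.xml with a "publish" target (hypothetical).
        def run(module):
            cmd = ["ant", "-f", f"{module}/build.xml", "publish"]
            return subprocess.run(cmd).returncode

    failed = []
    for module in modules:
        try:
            code = run(module)
        except OSError:
            code = 1  # e.g. the ant executable could not be started
        if code != 0:
            failed.append(module)  # remember it, but carry on with the rest
    return failed
```

The returned list is what you would re-run manually, or feed into a retry step if the failures are just connection blips.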

Do client applications still need to reference Log4Net if using Castle LoggingFacility?

The reason I ask is that if log4net.dll is not in the GAC and not in the currently executing assembly's directory, the Castle logging facility will not find it.
I have built a class library that is going to be used by multiple client applications to do logging using Castle and Windsor, I have noticed that I don't need a reference to the log4net.dll in my class library at all, it just needs to be able to see the dll at runtime.
So I am just wondering where the reference should truly be because if I put in my class library it is not copied over to clients even though copy local is true, I think because it actually is never used directly.
You definitely need to provide log4net.dll to get Castle logging facility to work (assuming you have configured the logging facility to use log4net). You are correct that you no longer need to reference log4net directly in your projects, because you will now be using Castle.Core's ILogger interface to actually write log messages. Your application still depends on log4net, albeit indirectly.
Visual Studio normally handles these kinds of "indirect" references (A depends on B depends on C) properly (it copies C over to A's output directory). However, it does not copy the indirect reference (it does not copy C to A's output directory) when C is in the GAC on the machine performing the build.
My guess is that you have log4net.dll GAC'd on your development machine.
To resolve this, you either need to remove log4net.dll from your GAC (and from the GAC on any machine where you'll be building the app), or you must explicitly reference log4net.dll in your top-level executable (project A in the example above) and set "copy local" to true. This forces the compiler to copy the dll to the output directory.
This issue was also discussed in this SO question

What should NOT be under source control?

It would be nice to have a more or less complete list of which files and/or directories shouldn't (in most cases) be under source control. What do you think should be excluded?
Suggestion so far:
In general
Config files with sensitive information (passwords, private keys etc.)
Thumbs.db, .DS_Store and desktop.ini
Editor backups: *~ (emacs)
Generated files (for instance DoxyGen output)
C#
bin\*
obj\*
*.exe
Visual Studio
*.suo
*.ncb
*.user
*.aps
*.cachefile
*.backup
_UpgradeReport_Files
Java
*.class
Eclipse
I don't know, and this is what I'm looking for right now :-)
Python
*.pyc
Temporary files
- .*.sw?
- *~
Anything that is generated. Binary, bytecode, code/documents generated from XML.
From my commenters, exclude:
Anything generated by the build, including code documentations (doxygen, javadoc, pydoc, etc.)
But include:
3rd party libraries that you don't have the source for OR don't build.
FWIW, at my work for a very large project, we have the following under ClearCase:
All original code
Qt source AND built debug/release
(Terribly outdated) specs
We do not have built modules for our software. A complete binary is distributed every couple weeks with the latest updates.
OS specific files, generated by their file browsers such as
Thumbs.db and .DS_Store
Some other Visual Studio typical files/folders are
*.cachefile
*.backup
_UpgradeReport_Files
My tortoise global ignore pattern for example looks like this
bin obj *.suo *.user *.cachefile *.backup _UpgradeReport_Files
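To see roughly what a pattern list like that would exclude, here is a small sketch that applies it to paths using Python's fnmatch. It only approximates how a real client matches (case handling and path rules may differ), and the function name is made up for the example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# The space-separated ignore pattern from above, split into individual globs.
IGNORE = "bin obj *.suo *.user *.cachefile *.backup _UpgradeReport_Files".split()


def is_ignored(path: str, patterns=IGNORE) -> bool:
    """True if any component of path matches one of the ignore patterns."""
    return any(
        fnmatchcase(part, pattern)
        for part in PurePosixPath(path).parts
        for pattern in patterns
    )
```

For example, is_ignored("proj/bin/app.exe") is true because the bin directory matches, while is_ignored("proj/src/main.cs") is false.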
files that get built should not be checked in
I would approach the problem a different way; what things should be included in source control? You should only source control those files that:
( need revision history OR are created outside of your build but are part of the build, install, or media ) AND
can't be generated by the build process you control AND
are common to all users that build the product (no user config)
The list includes things like:
source files
make, project, and solution files
other build tool configuration files (not user related)
3rd party libraries
pre-built files that go on the media like PDFs & documents
documentation
images, videos, sounds
description files like WSDL, XSL
Sometimes a build output can be a build input. For example, an obfuscation rename file may be an output and an input to keep the same renaming scheme. In this case, use the checked-in file as the build input and put the output in a different file. After the build, check out the input file and copy the output file into it and check it in.
The problem with using an exclusion list is that you will never know all the right exclusions and might end up source controlling something that shouldn't be source controlled.
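The inclusion rule at the top of this answer can be written down as a predicate, which makes the AND/OR grouping explicit. This is just a sketch; the field names are mine and someone still has to answer the four questions per file.

```python
from dataclasses import dataclass


@dataclass
class FileFacts:
    needs_history: bool                      # needs revision history
    created_outside_build_but_shipped: bool  # part of build, install, or media
    generated_by_build: bool                 # can be produced by the build you control
    common_to_all_users: bool                # not per-user configuration


def belongs_in_source_control(f: FileFacts) -> bool:
    """(history OR external input) AND not generated AND shared by everyone."""
    return (
        (f.needs_history or f.created_outside_build_but_shipped)
        and not f.generated_by_build
        and f.common_to_all_users
    )
```

A hand-written source file passes all three clauses, a compiled binary fails the second, and a per-user settings file fails the third.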
As Corey D has said, anything that is generated, specifically anything generated by the build process or the development environment, is a good candidate for exclusion. For instance:
Binaries and installers
Bytecode and archives
Documents generated from XML and code
Code generated by templates and code generators
IDE settings files
Backup files generated by your IDE or editor
Some exceptions to the above could be:
Images and video
Third party libraries
Team specific IDE settings files
Take third party libraries, if you need to ship or your build depends on a third party library it wouldn't be unreasonable to put it under source control, especially if you don't have the source. Also consider some source control systems aren't very efficient at storing binary blobs and you probably will not be able to take advantage of the systems diff tools for those files.
Paul also makes a great comment about generated files and you should check out his answer:
Basically, if you can't reasonably expect a developer to have the exact version of the exact tool they need, there is a case for putting the generated files in version control.
With all that being said ultimately you'll need to consider what you put under source control on a case by case basis. Defining a hard list of what and what not to put under it will only work for some and only probably for so long. And of course the more files you add to source control the longer it will take to update your working copy.
Anything that can be generated by the IDE, build process or binary executable process.
An exception:
4 or 5 different answers have said that generated files should not go under source control. That's not quite true.
Files generated by specialist tools may belong in source control, especially if particular versions of those tools are necessary.
Examples:
parsers generated by bison/yacc/antlr,
autotools files such as configure or Makefile.in, created by autoconf, automake, libtool etc,
translation or localization files,
files may be generated by expensive tools, and it might be cheaper to only install them on a few machines.
Basically, if you can't reasonably expect a developer to have the exact version of the exact tool they need, there is a case for putting the generated files in version control.
This exception is discussed by the svn guys in their best practices talk.
Temp files from editors.
.*.sw?
*~
etc.
desktop.ini is another windows file I've seen sneak in.
Config files that contain passwords or any other sensitive information.
Actual config files such as web.config in ASP.NET, because people can have different settings. Usually the way I handle this is by having a web.config.template that is on SVN. People get it, make the changes they want and rename it as web.config.
Aside from this and what you said, be careful of sensitive files containing passwords (for instance).
Avoid all the annoying files generated by Windows (Thumbs.db) or Mac OS (.DS_Store)
*.bak produced by WinMerge.
additionally:
Visual Studio
*.ncb
The best way I've found to think about it is as follows:
Pretend you've got a brand-new, store-bought computer. You install the OS and updates; you install all your development tools including the source control client; you create an empty directory to be the root of your local sources; you do a "get latest" or whatever your source control system calls it to fetch out clean copies of the release you want to build; you then run the build (fetched from source control), and everything builds.
This thought process tells you why certain files have to be in source control: all of those necessary for the build to work on a clean system. This includes .designer.cs files, the outputs of T4 templates, and any other artifact that the build will not create.
Temp files, config for anything other than global development and sensitive information
Things that don't go into source control come in 3 classes
Things totally unrelated to the project (obviously)
Things that can be found on installation media, and are never changed (eg: 3rd-party APIs).
Things that can be mechanically generated, via your build process, from things that are in source control (or from things in class 2).
Whatever the language :
cache files
generally, imported files should not either (like images uploaded by users, on a web application)
temporary files ; even the ones generated by your OS (like thumbs.db under windows) or IDE
config files with passwords ? Depends on who has access to the repository
And for those who don't know about it : svn:ignore is great!
If you have a runtime environment for your code (e.g. dependency libraries, specific compiler versions etc.) do not put the packages into source control. My approach is brutal, but effective. I commit a makefile whose role is to download the stuff (via wget), unpack it, and build my runtime environment.
I have a particular .c file that does not go in source control.
The rule is nothing in source control that is generated during the build process.
The only known exception is if a tool requires an older version of itself to build (bootstrap problem). In that case you will need a known good bootstrap copy in source control so you can build from blank.
Going out on a limb here, but I believe that if you use task lists in Visual Studio, they are kept in the .suo file. This may not be a reason to keep them in source control, but it is a reason to keep a backup somewhere, just in case...
A lot of time has passed since this question was asked, and I think a lot of the answers, while relevant, don't have hard details on .gitignore on a per language or IDE level.
GitHub came out with a very useful, community-maintained list of .gitignore files for all sorts of projects and IDEs that is worth taking a look at.
Here's a link to that git repo: https://github.com/github/gitignore
To answer the question, here are the related examples for:
C# -> see Visual Studio
Visual Studio
Java
Eclipse
Python
There are also OS-specific .gitignore files, including the following:
Windows
OS X
Linux
So, assuming you're running Windows and using Eclipse, you can just concatenate Eclipse.gitignore and Windows.gitignore to a .gitignore file in the top level directory of your project. Very nifty stuff.
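The concatenation step is a one-liner in most shells, but as a sketch here is a small script that does the same thing and labels each section so you can tell later where a rule came from. It assumes you have already downloaded the template files locally; the function name and the section comment format are my own.

```python
from pathlib import Path


def combine_gitignores(templates, target=".gitignore"):
    """Concatenate template files into target, labelling each section."""
    sections = []
    for template in templates:
        body = Path(template).read_text(encoding="utf-8").rstrip("\n")
        # Lines starting with '#' are comments in .gitignore files.
        sections.append(f"# --- {Path(template).name} ---\n{body}\n")
    out = Path(target)
    out.write_text("\n".join(sections), encoding="utf-8")
    return out
```

Calling combine_gitignores(["Eclipse.gitignore", "Windows.gitignore"]) from the top-level directory of the project would write the combined .gitignore there.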
Don't forget to add the .gitignore to your repo and commit it!
Chances are, your IDE already handles this for you. Visual Studio does anyway.
And for the .gitignore files, If you see any files or patterns missing in a particular .gitignore, you can open a PR on that file with the proposed change. Take a look at the commit and pull request trackers for ideas.
I always use www.gitignore.io to generate a proper .gitignore file.
Opinion: everything can be in source control, if you need to, unless it brings significant repository overhead such as frequently changing or large blobs.
3rd party binaries, hard-to-generate (in terms of time) generated files to speed up your deployment process, all are ok.
The main purpose of source control is to match one coherent system state to a revision number. If it were possible, I'd freeze the entire universe with the code: build tools and the target operating system.