SGE error can't open output file - sungridengine

I'm trying to run PsN with run_on_sge but keep getting this error when I submit multiple runs simultaneously. I'm also getting "error: can't chdir to directory".
I suspect the PsN scripts may not be creating the run directories before submitting the job but am not certain. I'm also a little unsure of how to test this.
The nodes have access to the NFS mounted directories in question and the permissions look fine. Any pointers appreciated!!

Note to self: don't mess up the UIDs when creating user accounts on the compute nodes.
The UIDs did not match between 3 compute nodes and the head node, which resulted in intermittent failures with the error mentioned. Whenever a job was submitted to a node with a mismatched UID, the job failed with what now seems to be a pretty obvious error message.
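A quick way to look for this kind of mismatch is to compare the numeric UID that each host reports for the same account. The sketch below is only an illustration and is not part of PsN or SGE: the function name, the host names, and the UID values are hypothetical, and it assumes you have already collected the UIDs yourself (for example by running `id -u <user>` on each node).

```python
def find_uid_mismatches(reference_uid, node_uids):
    """Return the hosts whose UID differs from the head node's UID.

    reference_uid: UID of the account on the head node.
    node_uids: dict mapping a host name to the UID reported on that host.
    """
    return sorted(host for host, uid in node_uids.items() if uid != reference_uid)


# Hypothetical values for illustration only.
head_uid = 1001
reported = {"node01": 1001, "node02": 1003, "node03": 1001}
mismatched = find_uid_mismatches(head_uid, reported)
print("Mismatched nodes:", mismatched)
```

Any host that shows up in the result is one where a job running as that user would not own its NFS-mounted working directory, which would fit the "can't chdir" and "can't open output file" errors.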

Error: The process '/usr/bin/git' failed with exit code 1 (terratest CI fetching module from terraform cloud)

Getting this error
I have all the necessary tokens included in the yaml.
The code ran with no issues before.
Check whether the issue persists, considering there was an outage on the GitHub side (regarding Codespaces), while there was no recent incident on Terraform Cloud.
The first one might have affected access to the second one.

SSIS logging doesn't work in absence of folder

I have set up SSIS logging to a text file. In the connection manager I have selected "Create file" and given the path as c:\logs\log.txt
I notice that the log file is not generated if the logs folder is absent. How can I ensure that the folder is created if it does not exist? I tried choosing "Create folder" on the connection manager, but that also does not create the log file in the absence of the c:\logs folder.
How can I ensure the folder is auto-created and the log is always generated?
You have a chicken and egg scenario here. Consider the following replication of your problem.
I have the connection manager driven by a variable LogFileName which generates the date and time. That file lives in whatever folder is specified by LogPath and the first thing my package does is create the folder if it does not already exist. "This thing can run anywhere and all is good." I've said that plenty and have the scars to show for it.
The following shows the events you can choose to log (based on what is in my package).
I am only logging OnPostExecute events. So I'm good, right? Because the post execute event won't fire until after that File System Task has completed.
If that were the case, you wouldn't have posted a question.
The first event that a package generates is a PackageStart event. Look at that list of events - no ability to filter that out. It doesn't matter whether you want that event logged or not, the logging handlers hear the PackageStart event and record it. Always.
The specified Text file logger should be used to record the data and it's ready to record PackageStart to file... "oh that path doesn't exist."
It will exist once the very first task (File System Task, Create Folder) has completed but alas, it is too late. You either get the complete sequence of events or none.
In your Output window, you would see something like the following:
SSIS package "C:\Users\bfellows\source\repos\PackageDeploymentModel\PackageDeploymentModel\ChickenAndEgg_Logging.dtsx" starting.
Error: 0xC001404B at ChickenAndEgg_Logging, Log provider "SSIS log provider for Text files": The SSIS logging provider has failed to open the log. Error code: 0x80070003.
The system cannot find the path specified.
Error: 0xC001404B at ChickenAndEgg_Logging, Log provider "SSIS log provider for Text files": The SSIS logging provider has failed to open the log. Error code: 0x80070003.
The system cannot find the path specified.
SSIS package "C:\Users\bfellows\source\repos\PackageDeploymentModel\PackageDeploymentModel\ChickenAndEgg_Logging.dtsx" finished: Success.
The package will show your Control Flow objects as all having gone green/OK and the status message will say "Package execution completed with success", but on the results tab you'll have a red X showing the log provider couldn't open the log.
What do I do?
Preconfigure your environments as part of the package deployment process. When we used the native logger as you're inquiring about, we had a document that laid out all that new developers/new servers needed to have done to ensure all of this stuff was laid out and configured as it needed to be.
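One way to do that preconfiguration is a small deployment-time script that runs before the package is ever executed, so the folder already exists when the log provider handles PackageStart. This is a minimal sketch in Python, not an SSIS feature; the function name `ensure_log_folder` and the demonstration path are assumptions made for illustration.

```python
import os
import tempfile


def ensure_log_folder(log_path):
    """Create the folder that will hold the log file, if it is missing.

    Returns the folder path. Safe to run repeatedly.
    """
    folder = os.path.dirname(os.path.abspath(log_path))
    os.makedirs(folder, exist_ok=True)
    return folder


# Demonstration against a temporary directory rather than a real server path.
base = tempfile.mkdtemp()
example_log = os.path.join(base, "logs", "log.txt")
created = ensure_log_folder(example_log)
print(os.path.isdir(created))
```

On a real server you would pass the same path the connection manager uses, and run the script as part of whatever process deploys the package.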
Unless a client has a strong business case for using the classic logging methodology, I would encourage them not to use it and instead rely on the SSISDB's native logging. It's cleaner, easier to manage, and needs no special setup. To quote the fine folks in Cupertino: it just works.

Error while executing Work Item "Cannot find the addin file"

I am new to the Design Automation API, so please excuse and correct me if I am using the wrong terms. I am setting up the wiring for my very first Design Automation AppBundle, and I have almost all of it working. I followed the patterns in the "Delete Walls" tutorial.
I have a working add-in DLL that I can test locally and it runs under the "design.automation-csharp-revit.local.debug.tool".
I also have all of the REST API connections set up, and I can successfully submit a WorkItem that will download a Revit file from BIM 360 and start processing it in the Design Automation sandbox. But I am getting an error during execution in the sandbox, where it seems it can't find my add-in file. Here is an excerpt from the WorkItem log:
[07/21/2020 18:02:26] Resolving location of Revit/RevitCoreEngine installation...
[07/21/2020 18:02:26] Running user application....
[07/21/2020 18:02:31] Cannot find the addin file:
[07/21/2020 18:02:31] Fail to deploy Addon DLL(s) in AppPackages.
[07/21/2020 18:02:31] RESULT: Failure
I have looked through the "bundle" ZIP file many times looking for typos that could cause this, but I can't find anything; it looks identical to the "Delete Walls" example. So I'm wondering if there is somewhere else that I need to look, or any other way I could debug this to find out where the connection is missing. I can only assume that the AppBundle and Activity items are set up correctly, since I am getting this far and the error does not mention either of those items.
Any tips on where to look?
This turned out to be a misspelling of the .bundle folder extension, which triggered the issue.
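A misspelling like this is easy to miss by eye, so a small script that inspects the ZIP before upload can help. The following is a rough sketch rather than an official Autodesk tool: it assumes the layout used by the "Delete Walls" sample (a single top-level folder whose name ends in ".bundle", with PackageContents.xml directly inside it), and the function name and the "MyAddin" folder name are hypothetical.

```python
import io
import zipfile


def check_bundle_layout(zip_file):
    """Return a list of problems found in an AppBundle ZIP (empty if none).

    Assumes one top-level folder ending in ".bundle" that contains
    PackageContents.xml, as in the "Delete Walls" sample.
    """
    problems = []
    with zipfile.ZipFile(zip_file) as archive:
        names = archive.namelist()
    top_level = {name.split("/", 1)[0] for name in names if name}
    bundles = [folder for folder in top_level if folder.endswith(".bundle")]
    if not bundles:
        problems.append(
            "no top-level folder ends in '.bundle': %s" % sorted(top_level)
        )
    for folder in bundles:
        if folder + "/PackageContents.xml" not in names:
            problems.append("%s is missing PackageContents.xml" % folder)
    return problems


# Build a deliberately misspelled bundle in memory to show the check firing.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("MyAddin.bundel/PackageContents.xml", "<ApplicationPackage/>")
buffer.seek(0)
print(check_bundle_layout(buffer))
```

For a real bundle you would pass the path of the ZIP file instead of the in-memory buffer; an empty list means the folder naming matches the assumed layout.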

Ejabberd not sending presence stanza to other roster members

I have an internal web chat application implemented in python/django and using ejabberd 2.1.3 and Strophe.js.
When you open the website with a special access link, a cmd line call is executed to add you to a shared roster group:
ejabberdctl srg_user_add 00024-540-1mCYpYTTCRcjJK5OWI7cWs xmpp.mydomain.com 1mCYpYTTCRcjJK5OWI7cWs xmpp.mydomain.com
This executes successfully, and if I run this command manually the members are shown fine:
ejabberdctl srg_get_members 1mCYpYTTCRcjJK5OWI7cWs xmpp.mydomain.com
00024-540-1mcypyttcrcjjk5owi7cws#xmpp.mydomain.com
01114-540-1mcypyttcrcjjk5owi7cws#xmpp.mydomain.com
However, in the Strophe.js presence handler I only receive a presence stanza for myself, meaning if I open two different links in two different browsers for two different members, I do not receive a presence stanza for the other member.
If I set the ejabberd log level to debug, this is reflected in the log:
<presence xmlns='jabber:client' from='0031-666-vjuuogji8mxvo5edtsvre#xmpp.mydomain.com/3189433061352390311794558' to='0031-666-vjuuogji8mxvo5edtsvre#xmpp.mydomain.com/3189433061352390311794558'/>
<presence xmlns='jabber:client' from='0030-666-vjuuogji8mxvo5edtsvre#xmpp.mydomain.com/3319710041135238652858307' to='0030-666-vjuuogji8mxvo5edtsvre#xmpp.mydomain.com/3319710041135238652858307'/>
It's missing the stanza to the other member:
<presence xmlns='jabber:client' from='0031-666-vjuuogji8mxvo5edtsvre#xmpp.mydomain.com/3189433061352390311794558' to='0030-666-vjuuogji8mxvo5edtsvre#xmpp.mydomain.com/3319710041135238652858307'/>
I have the same setup on a second server where it works fine, and I see the stanzas exactly as expected.
Before, it also worked on the problem server, but last week we had a total power failure in the data center and the server went down. Since then I cannot seem to get it to work again.
Could it be that some files might have been corrupted due to the power loss? Is there something I should clean up?
Unfortunately the previously responsible person is no longer in our company...
I found the problem - due to some stupid internal reasons, the
ejabberdctl srg_create ...
was missing, so the members were added to a non-existent group.
I just do not understand why the srg_user_add and srg_get_members commands worked without error. If they had given some hint about the group not existing, I would have found the problem sooner.
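Since srg_user_add does not complain about a missing group, the calling code can check for the group itself before adding anyone. The sketch below is one possible approach, not something taken from the original setup: it assumes the `srg_list` command (from mod_admin_extra) is available in the installed ejabberd version and prints one group name per line, and the function names and the stand-in runner are hypothetical. The argument order for `srg_user_add` follows the command shown in the question.

```python
import subprocess


def run_ejabberdctl(args):
    """Run ejabberdctl with the given arguments and return its stdout."""
    result = subprocess.run(
        ["ejabberdctl"] + list(args),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def group_exists(group, host, run=run_ejabberdctl):
    """Return True if `group` is listed among the shared roster groups of `host`."""
    listed = run(["srg_list", host]).splitlines()
    return group in (line.strip() for line in listed)


def add_user_checked(user, host, group, grouphost, run=run_ejabberdctl):
    """Add a user to a shared roster group, refusing if the group is missing."""
    if not group_exists(group, grouphost, run):
        raise RuntimeError(
            "shared roster group %r does not exist on %s" % (group, grouphost)
        )
    run(["srg_user_add", user, host, group, grouphost])


# Stand-in runner so the example executes without an ejabberd installation.
def fake_run(args):
    return "othergroup\n" if args[0] == "srg_list" else ""


try:
    add_user_checked("alice", "xmpp.mydomain.com", "missinggroup",
                     "xmpp.mydomain.com", run=fake_run)
except RuntimeError as exc:
    print(exc)
```

In the Django view, the default runner would be used and the RuntimeError could be logged, which would have surfaced the missing srg_create step immediately.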

Error handling. How should a program do it?

How should a program handle errors? Example:
A program needs the file text.txt. It must exist and be writable. What should it do if it's not writable or doesn't exist? Should it try to chown/chmod the file? Should it try to create it or just display an error message?
Or: Should it try to find a solution or just display an error message?
It's up to you how to handle it. You have to define your scenarios, user interactions, and other parts of the program. Once you define those it is time to implement and test those scenarios.
Some questions to ask:
What data is being written to the file?
How critical is it that the data get saved?
If an error is reported, who will see the error?
If an error is reported, how do you expect a user to react? And what are their options?
I would go with Eilon's answer for the most part but would add the following caveat: I would not try to chown/chmod a file unless you really need to, i.e. if the purpose of your program is managing file permissions or acting as an installer of some sort. This is because a) your attempted chmod/chown may not work and b) your application should respect the user privileges with which it is run. If these are not sufficient, you should inform the user via whatever mechanism you use for reporting errors.
Your program should output an error on STDERR and return a non-zero exit code.
For more information:
http://en.wikipedia.org/wiki/Exit_status
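As an illustration of the "report and exit" approach from the answers above, here is a minimal Python sketch. It checks the file without trying to repair it, writes the problem to STDERR, and returns a non-zero status. The function names and the demonstration path are assumptions; `text.txt` is the file name from the question.

```python
import os
import sys


def check_writable(path):
    """Return an error message if `path` is missing or not writable, else None."""
    if not os.path.exists(path):
        return "%s does not exist" % path
    if not os.access(path, os.W_OK):
        return "%s is not writable" % path
    return None


def main(argv):
    """Return the exit status: 0 on success, 1 if the file cannot be used."""
    path = argv[1] if len(argv) > 1 else "text.txt"
    error = check_writable(path)
    if error is not None:
        # Report on STDERR and signal failure through the exit status.
        print("error: " + error, file=sys.stderr)
        return 1
    return 0


# Demonstration with a path that is not expected to exist.
exit_code = main(["prog", os.path.join("no-such-dir", "text.txt")])
print("exit status:", exit_code)
```

A real script would end with `sys.exit(main(sys.argv))` so that the shell or a calling program sees the status.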