Receive email only when all the tasks are completed - sungridengine

I am launching a lot of jobs on a cluster as an array job (similar to what is explained in http://www3.imperial.ac.uk/bioinfsupport/help/cluster_usage/submitting_array_jobs).
If I use the -m ea option I receive hundreds of emails, one per task.
How can I receive an email only when all the tasks are completed? Is it possible to receive an email when all the tasks are completed, and also an email when any of the tasks is aborted?

As far as I know, this is not possible directly; others may have more experience, so I defer the final word to them.
However, what you can do is:
1. Submit your job array without the -m option (or with -m a to track aborted tasks).
2. Submit a second, single dummy job with -hold_jid_ad <job_id_of_job_array> and the -m e option.
This will send an email when the hold on the single job (step 2) is satisfied, i.e. when all tasks in your job array (step 1) have completed.
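A minimal shell sketch of those two submissions (task range, script names, and email address are placeholders). Note that it uses a plain -hold_jid hold on the array's job id, which is released only once every task of that array has finished, the same mechanism used in the hold_jid questions further down; -hold_jid_ad, as suggested above, instead sets up task-by-task dependencies between two array jobs.

# 1) the array job itself: no completion mail, only abort notifications
qsub -t 1-100 -m a -M you@example.com array_job.sh

# 2) a lightweight dummy job held on the array job (12345 = job id printed in step 1);
#    -m e sends a single mail when this dummy job ends, i.e. once all array tasks are done
qsub -hold_jid 12345 -m e -M you@example.com -b y /bin/true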

Related

Execution ID on Google Cloud Run

I am wondering whether there is an execution ID in Cloud Run like the one in Google Cloud Functions.
An ID that identifies each invocation separately is very useful with "Show matching entries" in Cloud Logging to get all logs related to one execution.
I understand the execution model is different (Cloud Run allows concurrency), but is there a workaround to assign each log entry to a particular execution?
What I ultimately need is to group a request and its response on the same line, because right now I print them separately, and when a few requests arrive at the same time I can't tell which response corresponds to which request...
Thank you for your attention!
OpenTelemetry looks like a great solution, but the learning curve and setup time aren't negligible,
so I'm going with a custom ID created in before_request, stored in Flask's g, and included in every print().

import uuid
from flask import Flask, g

app = Flask(__name__)

@app.before_request
def before_request_func():
    # one ID per incoming request, available via g for the whole request lifetime
    g.execution_id = uuid.uuid4()

How to complete a converging parallel gateway in a test

I wrote some JUnit tests for my process. In some cases I used
runtimeService
    .createProcessInstanceByKey("ID")
    .startBeforeActivity("taskID")
    .setVariables(map)
    .execute();
to start a process from a given task (not from the beginning).
This works well so far. In one case, the starting task is in one of two flows after a parallel gateway. The process now just executes until it reaches the 'end' gateway of this parallel flow.
Is there a way to 'mock' that missing token on the second incoming sequence flow?
I hope you understood me ;-)
You can execute
runtimeService
    .createProcessInstanceModification(processInstanceId)
    .startBeforeActivity(idOfGateway)
    .execute();
If there are n missing tokens, make sure to call #startBeforeActivity n times.

SGE hold_jid and catching failed jobs

I have a script that submits a number of jobs to run in parallel on an SGE queue, and another gathering script that is executed when this list of jobs is finished. I am using -hold_jid wc_job_list to hold the execution of the gathering script while the parallel jobs are running.
I just noticed that sometimes some of the parallel jobs fail and the gathering script still runs. The documentation states that:
If any of the referenced jobs exits with exit code 100, the submitted
job will remain ineligible for execution.
How can I catch the parallel failed jobs exit status so that if any of them fail for any reason, the gathering script is not executed or gives an error message?
In the case of bash, you could check the exit status of your program (available as $?) and, if it is not 0 (the exit status for normal termination), call exit 100 at the end of your job script.
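A minimal sketch of such a job script, assuming your real command is called my_program (a placeholder):

#!/bin/bash
my_program "$@"          # the actual work of this job
status=$?
if [ "$status" -ne 0 ]; then
    # exit code 100 keeps jobs held with -hold_jid ineligible for execution
    exit 100
fi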
The problem with this is that your job will remain in the queue in state Eqw and has to be deleted manually.
UPDATE: for every job you set to Eqw, your administrators get an email...

How to send an email notification based on a different job

I have job A, which triggers a build on each commit; after job A completes, job B is triggered as a downstream job.
In my case both job A and job B send email notifications.
How can I configure job A to trigger job B so that, after successful completion of job B, an email notification is sent to the committer of job A as well as to a global recipient?
Can anyone help me with this?
Thanks in advance.
1. Try the Blame Upstream Committers plugin.
2. If you trigger Job-B via "trigger parameterized build" from Job-A, then you can pass any information from Job-A to Job-B.
(For that one, I am not sure how the committer is saved;
please say which plugin you use and whether the committer is available as an environment variable of some sort.)

Avoid printing job exit codes in SGE with option -sync yes

I have a Perl script which submits a bunch of array jobs to SGE. I want all the jobs to be run in parallel to save me time, and the script to wait for them all to finish, then go on to the next processing step, which integrates information from all SGE output files and produces the final output.
In order to send all the jobs into the background and then wait, I use Parallel::ForkManager and a loop:
use Parallel::ForkManager;

my $fork_manager = Parallel::ForkManager->new(scalar @as);
# scalar @as: max nb of processes to run simultaneously
for my $a (@as) {
    $fork_manager->start and next;              # start a child process
    system "qsub <qsub_options> ./script.plx";  # submit one qsub job per element of @as
    $fork_manager->finish;                      # terminate the child process
}
$fork_manager->wait_all_children;
<next processing step, local>
In order for the "waiting" part to work, however, I have had to add "-sync yes" to the qsub options. But as a side effect of this, SGE prints the exit code of each task in each array job, and since there are many jobs and the individual tasks are light, it basically renders my shell unusable because of all those interrupting messages while the qsub jobs are running.
How can I get rid of those messages? If anything, I would be interested in checking qsub's exit code for the jobs (so I can check that everything went OK before the next step), but not in one exit code per task (I log the tasks' errors via option -e anyway, in case I need them).
The simplest solution would be to redirect the output from qsub somewhere, e.g.
system("qsub <qsub options> ./script.plx >/dev/null 2>&1");
but this masks errors that you might want to see. Alternatively, you can use open() to start the subprocess and read its output, only printing something if the subprocess generates an error.
I do have an alternate solution for you, though. You could submit the jobs to SGE without -sync y, and capture the job id when qsub prints it. Then, turn your summarization and results collection code into a follow on job and submit it with a dependency on the completion of the first jobs. You can submit this final job with -sync y so your calling script waits for it to end. See the documentation for -hold_jid in the qsub man page.
Also, rather than making your calling script decide when to submit the next job (up to your maximum), use SGE's -tc option to specify the maximum number of simultaneous jobs (note that -tc isn't in the man page, but it is in qsub's -help output). This depends on you using a new enough version of SGE to have -tc, of course.
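A rough shell sketch of that alternate flow (task range, -tc limit, and script names are placeholders; -terse makes qsub print only the job id, and if your SGE version lacks it you can parse the normal "Your job ... has been submitted" line instead):

# submit the array job without -sync; -tc caps how many tasks run at once
jobid=$(qsub -terse -t 1-100 -tc 10 ./script.plx | cut -d. -f1)

# submit the gathering step held on the whole array; -sync y makes this qsub
# call block until the gathering job itself has finished
qsub -hold_jid "$jobid" -sync y ./gather_and_summarize.sh

The same two commands can of course be issued from the Perl script via system(), with only the final qsub checked for its exit code.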