How to copy the environment variables to a cluster job using qsub? - sungridengine

I use SUN's SGE to submit my jobs to a cluster system. The problem is how to let the
compute node find the environment variables of the submit host, or how to configure the qsub script so that the compute node loads the submit host's environment variables.
The following is an example script, but it reports errors such as libraries not being found:
#!/bin/bash
#
#$ -V
#$ -cwd
#$ -j y
#$ -o /home/user/jobs_log/$JOB_ID.out
#$ -e /home/user/jobs_log/$JOB_ID.err
#$ -S /bin/bash
#
echo "Starting job: $SGE_TASK_ID"
# Modify this to use the path to matlab for your system
/home/user/Matlab/bin/matlab -nojvm -nodisplay -r matlab_job
echo "Done with job: $SGE_TASK_ID"

The technique you are using (adding -V) should work. One possibility, since you are specifying the shell with -S, is that grid engine is configured to launch /bin/bash as a login shell and your profile scripts are stomping all over the environment you are trying to pass to the job.
Try using qstat -xml -j on the job while it is queued/running to see which environment variables grid engine is trying to pass to the job.
Try adding an env command to the script to see which variables are actually set.
Try adding shopt -q login_shell;echo $? in the script to tell you whether it is being run as a login shell.
To list out shells that are configured as login shells in grid engine try:
SGE_SINGLE_LINE=true qconf -sconf|grep ^login_shells
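For convenience, here is a minimal diagnostic block (a sketch; LD_LIBRARY_PATH is just an example variable, chosen because missing libraries are the reported symptom) that you could paste near the top of the job script to capture what the job actually sees:
echo "--- environment seen by the job ---"
env | sort
echo "--- running as a login shell? (0 = yes) ---"
shopt -q login_shell; echo $?
echo "--- LD_LIBRARY_PATH = ${LD_LIBRARY_PATH:-<unset>} ---"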

I think this issue is because you didn't configure bash in the login_shells of SGE.
Check your login_shells with qconf -sconf and see whether bash is in there.
login_shells
UNIX command interpreters like the Bourne-Shell (see sh(1)) or the C-Shell (see csh(1)) can be used by Grid Engine to start job scripts. The command interpreters can either be started as login-shells (i.e. all system and user default resource files like .login or .profile will be executed when the command interpreter is started and the environment for the job will be set up as if the user has just logged in) or just for command execution (i.e. only shell specific resource files like .cshrc will be executed and a minimal default environment is set up by Grid Engine - see qsub(1)). The parameter login_shells contains a comma separated list of the executable names of the command interpreters to be started as login-shells. Shells in this list are only started as login shells if the parameter shell_start_mode (see above) is set to posix_compliant.
Changes to login_shells will take immediate effect. The default for login_shells is sh,csh,tcsh,ksh.
This value is a global configuration parameter only. It cannot be overwritten by the execution host local configuration.
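If bash is missing from that list, a sketch of how to add it (assuming you have manager rights on the qmaster; qconf -mconf opens the global configuration in an editor):
# Show the current list of login shells
qconf -sconf | grep '^login_shells'
# Edit the global configuration and extend the line, e.g.:
#   login_shells          sh,csh,tcsh,ksh,bash
qconf -mconf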

Related

Does a GitHub action step use `set -e` semantics by default?

A common pattern in GitHub action workflows is to run something like this:
- name: Install and Build 🔧
  run: |
    npm ci
    npm run build
Clearly the intention is to run the second command only if the first command succeeds.
When running on Linux, the question becomes whether the shell runs with set -e semantics. This answer suggests that set -e semantics are the default.
I'm trying to find that information in the documentation, but I'm a bit confused how it is specified. The section on exit codes contains the following for shell/sh shells:
Fail-fast behavior using set -eo pipefail: This option is set when shell: bash is explicitly specified. It is not applied by default.
This seems to contradict the other answer (and question!), and would mean that the above pattern actually is invalid, because the second line would be executed even if the first line fails.
Am I just misreading the documentation, or is it really necessary to either always specify set -e manually or add the shell: bash explicitly to get the desired behavior?
Does a GitHub action step use set -e semantics by default?
Yes, it does.
According to jobs.<job_id>.steps[*].shell, the sh and bash invocations do include -e whether or not the shell is specified:
unspecified: bash -e {0}
with shell: bash: bash --noprofile --norc -eo pipefail {0}
with shell: sh: sh -e {0}
However, the sentence under Exit codes and error action preference:
bash/sh: Fail-fast behavior using set -eo pipefail: This option is set when shell: bash is explicitly specified. It is not applied by default.
applies to the -o pipefail part for Bash only. It could have been more explicit though.
An issue has been created on the GitHub docs repo to revise this:
https://github.com/github/docs/issues/23853
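To see the practical difference between the two invocations locally, here is a quick sketch you can run in any Bash shell (it is not part of the workflow itself):
# Plain -e: the pipeline's status is that of its last command, so the failing
# `false` is masked and execution continues.
bash -ec 'false | true; echo "still running under plain -e"'
# -e plus pipefail: the pipeline now reports the failure of `false`, and -e
# aborts before the echo is reached.
bash -ec 'set -o pipefail; false | true; echo "never printed"'
echo "second invocation exited with status $?"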

What is the right way to increase the hard and soft ulimits for a singularity-container image?

The task I want to complete: I need to run a python package inside of a singularity-container that is asking to open at least some 9704 files. This is the first I have heard of such a requirement, and from searching around it has something to do with the system's ulimit.
What I currently have is the following def file.
I am setting the * hard nofile and * soft nofile limits to 15000. The sed line does edit the conf file, but within the singularity shell my ulimit is still the default 1024.
Bootstrap: docker
From: fedora
%post
dnf -y update
dnf -y install nano pip wget libXcomposite libXcursor libXi libXtst libXrandr alsa-lib mesa-libEGL libXdamage mesa-libGL libXScrnSaver
wget -c https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
/bin/bash Anaconda3-2020.02-Linux-x86_64.sh -bfp /usr/local
conda config --file /.condarc --add channels defaults
conda config --file /.condarc --add channels conda-forge
conda update conda
sed -i '2s/#/\n* hard nofile 15000\n* soft nofile 15000\n\n#/g' /etc/security/limits.conf
bash
%runscript
python /Users/lamsal/count_of_monte_cristo/orthofinder_run/OrthoFinder_source/orthofinder.py -f /Users/lamsal/count_of_monte_cristo/orthofinder_run/concatanated_FAs/
I am following the “official” instructions to change the ulimits for a RHEL-based system from IBM’s webpage here: https://www.ibm.com/docs/en/rational-clearcase/9.0.2?topic=servers-increasing-number-file-handles-linux-workstations
Is the sed line not the right way to change ulimits for a singularity image?
Short answer:
Change the value on the host OS.
Long answer:
In this instance, running a singularity container is best thought of as any other binary you're executing in your host OS. It creates its own separate environment, but otherwise it follows the rules and restrictions of the user running it. Here, the ulimit is taken from the host kernel and completely ignores any configs that may exist in the container itself.
Compare the output from the following:
# check the ulimit on the host
ulimit -n
# check the ulimit in the singularity container
singularity exec -e image.sif ulimit -n
# docker only cares about container config settings
docker run --rm fedora:latest ulimit -n
# change your local ulimit
ulimit -n 4096
# verify it has changed
ulimit -n
# singularity has changed
singularity exec -e image.sif ulimit -n
# ... but docker hasn't
docker run --rm fedora:latest ulimit -n
To have a persistent fix, you'll need to modify the setting on your host OS. Assuming you're on MacOS this answer should take care of that.
If you don't have root privs or you're only using this intermittently, you can run ulimit before running singularity. Alternatively, you could use a wrapper script to run the image and set it in there, as sketched below.
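A minimal sketch of such a wrapper (the script name, image name, and the 15000 limit are placeholders; a non-root user can only raise the soft limit up to the host's hard limit):
#!/bin/bash
# run_in_container.sh - raise the open-files limit for this shell, then run the image
ulimit -n 15000 || { echo "could not raise nofile limit" >&2; exit 1; }
singularity run image.sif "$@"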

How to execute a command inside a singularity container that does not interact with/source options from the host OS?

I have a binary installed on a Docker container that I have been trying to run via Singularity:
singularity run docker://repo/container_image ./repository/bin --flag
The problem is that with this command it sources my .bashrc, which is causing some problems with the binary.
So I tried running it with --no-home and flagged the repositories to be mounted with -B:
singularity run --no-home -B /hostrepo01:/data,/hostrepo02:/results docker://repo/container_image ./repository/bin --flag
This still imports some paths from my host OS; for instance, if I open a Singularity shell with the options below and do a cd, the shell tries to access the path I have for my home on the host OS.
singularity run --no-home -B /hostrepo01:/data,/hostrepo02:/results docker://repo/container_image
How can I execute a command inside a singularity container that does not interact with or source options from the host OS, other than what I specify with the -B flag?
You can use the --contain flag
-c, --contain use minimal /dev and empty other
directories (e.g. /tmp and $HOME) instead
of sharing filesystems from your host
singularity run --contain -B /hostrepo01:/data,/hostrepo02:/results docker://repo/container_image
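For a one-off command rather than the image's runscript, the same flag works with exec (a sketch reusing the paths from the question):
# Run a single command with a minimal /dev, an empty $HOME and /tmp,
# and only the explicitly requested bind mounts.
singularity exec --contain -B /hostrepo01:/data,/hostrepo02:/results docker://repo/container_image ./repository/bin --flag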

Accessing environment variables in Docker containers linked with --link

I'm setting up the development environment for my application inside Docker containers; at the moment I have these containers:
myapp-data - Holds application source code and log files
myapp-phpfpm - Runs the php5-fpm process for Nginx
myapp-nginx - Runs the Nginx web server that serves the application
This setup works beautifully, I'm really happy with it. But my application needs a MySQL database to connect to, so I'm using the official MySQL image, and running it like so:
sudo docker run --name myapp-mysql -e "MYSQL_ROOT_PASSWORD=iamroot" -e "MYSQL_USER=redacted" -e "MYSQL_PASSWORD=redacted" -e "MYSQL_DATABASE=redacted" -d mysql
This also works great. But my myapp-phpfpm container needs to be linked to the myapp-mysql container in order to expose MySQL's connection details to my application. So I restart my myapp-phpfpm container:
sudo docker run --privileged=true --name myapp-phpfpm --volumes-from myapp-data --link myapp-mysql:mysql -d readr/phpfpm
So now my myapp-phpfpm container is linked to my myapp-mysql container so I should be able to access the database within my PHP application.
The problem is I can't. The environment variables don't exist inside the PHP application. If I do:
die(var_dump(`printenv`));
I don't get the MySQL environment variables. To try to debug I did a whoami to find out what user PHP is running as, which is www-data. I then created a bash process inside the container, used su www-data to become the www-data user and did printenv there. Sure enough, the MySQL environment variables do exist there:
MYSQL_PORT_3306_TCP_PORT=3306
MYSQL_PORT_3306_TCP=tcp://172.17.1.118:3306
MYSQL_ENV_MYSQL_ROOT_PASSWORD=iamroot
... etc ...
So, how can I access the environment variables that Docker exposes about my myapp-mysql container within PHP?
I solved this by creating a custom start.sh script that then gets called from my Dockerfile:
#!/bin/bash
# bash (not plain sh) is required for the ${!var} indirect expansion used below.

# Function to update the fpm configuration to make the service environment variables available
setEnvironmentVariable() {
    if [ -z "$2" ]; then
        echo "Environment variable '$1' not set."
        return
    fi

    # Check whether variable already exists
    if grep -q "$1" /etc/php5/fpm/pool.d/www.conf; then
        # Reset variable; use | as the sed delimiter because values such as
        # tcp://172.17.1.118:3306 contain slashes
        sed -i "s|^env\[$1.*|env[$1] = $2|g" /etc/php5/fpm/pool.d/www.conf
    else
        # Add variable
        echo "env[$1] = $2" >> /etc/php5/fpm/pool.d/www.conf
    fi
}

# Grep for variables that look like MySQL (MYSQL)
for _curVar in $(env | grep MYSQL | awk -F = '{print $1}'); do
    # awk has split them by the equals sign
    # Pass the name and value to our function
    setEnvironmentVariable "${_curVar}" "${!_curVar}"
done

# start php-fpm
exec /usr/sbin/php5-fpm
This then adds the environment variables to the PHP5-FPM config so they can be accessed from within PHP scripts.
php-fpm clears all environment variables by default; from /etc/php5/fpm/pool.d/www.conf:
; Setting to "no" will make all environment variables available to PHP code
; via getenv(), $_ENV and $_SERVER.
; Default Value: yes
;clear_env = no
You can fix this by uncommenting that setting in your Dockerfile:
RUN sed -i -e "s/;clear_env\s*=\s*no/clear_env = no/g" /etc/php5/fpm/pool.d/www.conf
I'd recommend using something like fig and just passing the env vars to both containers at startup. If you really want to, you could docker inspect any container from any other container if you bind-mount the docker socket, then do something like this:
docker inspect -f {{.Config.Env}} myapp-mysql
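A sketch of that socket-bind-mount idea (the docker:cli client image is an assumption; any image containing a docker client would do):
# From the host: inspect the linked container's environment directly.
docker inspect -f '{{.Config.Env}}' myapp-mysql
# From inside another container: bind-mount the Docker socket and use a client
# image so that container can query the daemon about its siblings.
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker inspect -f '{{.Config.Env}}' myapp-mysql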
The problem may not be the environment variables - it may be your PHP installation.
TL;DR environment variables that are accessible when you're running your application under Apache & PHP may not be available if you're using nginx or lighttpd and fastcgi.
The longer version
Here's the way I understand it (and it's probably wrong or incomplete because my experience with this is quite limited). Because PHP is not running as part of the web server process under nginx with FastCGI, it does not have access to the shell in which the server was started and therefore does not have access to the environment variables set in that shell.
The solution is to declare the variables you're interested in as part of the configuration. This answer is kind of terse, but it contains the basic answer to this problem.
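As an illustration of declaring the variables as part of the configuration, a sketch for the php-fpm pool file used above (the variable names come from the linked-container output in the question; $VAR on the right-hand side passes the value through from the master process environment):
# Append explicit env[] entries to the pool config, then restart/reload
# php-fpm so the workers pick up the change.
cat >> /etc/php5/fpm/pool.d/www.conf <<'EOF'
env[MYSQL_PORT_3306_TCP] = $MYSQL_PORT_3306_TCP
env[MYSQL_PORT_3306_TCP_PORT] = $MYSQL_PORT_3306_TCP_PORT
env[MYSQL_ENV_MYSQL_ROOT_PASSWORD] = $MYSQL_ENV_MYSQL_ROOT_PASSWORD
EOF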

Google Compute Engine: how to set hostname permanently?

How do I set the hostname of an instance in GCE permanently? I can set it via hostname, but after a reboot it is gone again.
I tried to feed in metadata (hostname:f.q.d.n), but that did not do the job. But it should work via metadata (https://github.com/GoogleCloudPlatform/compute-image-packages/tree/master/google-startup-scripts).
Anybody an idea?
The simplest way to achieve this is with a small script, and that's what I have done.
I store the hostname in the instance metadata and then retrieve it every time the system restarts, setting the hostname from a cron job.
$ gcloud compute instances add-metadata <instance> --metadata hostname=<new_hostname>
$ sudo crontab -e
And this is the line that must be appended to the crontab:
@reboot hostname $(curl --silent "http://metadata.google.internal/computeMetadata/v1/instance/attributes/hostname" -H "Metadata-Flavor: Google")
After these steps, every time you restart your instance it will have the hostname <new_hostname>.
You can check it in the prompt or with the command: hostname
You need to remove the file /etc/dhcp/dhclient.d/google_hostname.sh
rm -rf /etc/dhcp/dhclient.d/google_hostname.sh
rm -rf /etc/dhcp/dhclient-exit-hooks.d/google_set_hostname
It's worth noting that this script is needed in order to run gcloud beta compute instances create with the --hostname flag. If this script is absent on a base image, new VM instances will preserve the source hostname/FQDN!
Edit rc.local
sudo nano /etc/rc.local
Add your line under the rest:
hostname *your.hostname.com*
Make sure to run the following afterwards so that the script gets executed:
chmod +x /etc/rc.d/rc.local
Reboot, and profit.
That isn't possible. Please take a look at this answer. The following article explains that the "hostname" is part of the default metadata entries and it is not possible to manually edit any of the default metadata pairs. As such, you would need to use a script or something else to change the hostname every time the system restarts, otherwise it will automatically get re-synced with the metadata server on every reboot.
You can find information on startup scripts for GCE in this article. You can visit this one for info on how to apply the script to an instance.
You can also create a simple startup-script to do the jobs:
$ gcloud compute instances add-metadata <instance-name> --zone <instance-zone> --metadata startup-script='#! /bin/bash
hostname <hostname>'
Notice that if you already have a startup-script, you need to append the command below to the existing startup-script, or you will replace the whole startup-script (you can check the current value first, as sketched after the command):
$ hostname instance-name
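A way to check what is already stored before overwriting it, using the same metadata server the other answers rely on (run from inside the instance):
# Print the existing startup-script metadata (empty output means none is set).
curl --silent -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/attributes/startup-script"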
I managed to set the hostname on a GCE instance running CentOS.
Source: desantolo.com
Click EDIT on your instance
Go to "Custom metadata" section
Add key hostname with value your.hostname.tld (change "your.hostname.tld" to your actual hostname)
run curl --silent "http://metadata.google.internal/computeMetadata/v1/instance/attributes/hostname" -H "Metadata-Flavor: Google"
run sudo env EDITOR=nano crontab -e to edit crontab
add the line @reboot hostname $(curl --silent "http://metadata.google.internal/computeMetadata/v1/instance/attributes/hostname" -H "Metadata-Flavor: Google")
On your keyboard Ctrl + X
On your keyboard hit Y
On your keyboard hit Enter
run reboot
after the system has rebooted, run hostname and see if your changes were applied
Good luck!
If anyone finds this solution does not work for them on a GCE instance, then I suggest trying exit hooks as described by Google Support.
In fact, some distributions of Linux like CentOS and Debian use
dhclient-script script to configure the network parameters of the
machine. This script is invoked from time to time by dhclient which is
dynamic host configuration protocol client and provides a means for
configuring one or more network interfaces using the DHCP protocol,
BOOTP protocol, or if these protocols fail, by statically assigning an
address.
The following text is a quote from the man (manual) page of
dhclient-script:
After all processing has completed, /usr/sbin/dhclient-script checks for the presence of an executable /etc/dhcp/dhclient-exit-hooks script, which if present is invoked using the '.' command. The exit status of dhclient-script will be passed to dhclient-exit-hooks in the exit_status shell variable, and will always be zero if the script succeeded at the task for which it was invoked. The rest of the environment as described previously for dhclient-enter-hooks is also present. The /etc/dhcp/dhclient-exit-hooks script can modify the value of exit_status to change the exit status of dhclient-script.
That being said, by taking a look at the code snippet of dhclient-script, we can see the script checks for the existence of an executable /etc/dhcp/dhclient-exit-hooks script and of all scripts in the /etc/dhcp/dhclient-exit-hooks.d/ directory.
ETCDIR="/etc/dhcp"

exit_with_hooks() {
    exit_status="${1}"

    if [ -x ${ETCDIR}/dhclient-exit-hooks ]; then
        . ${ETCDIR}/dhclient-exit-hooks
    fi

    if [ -d ${ETCDIR}/dhclient-exit-hooks.d ]; then
        for f in ${ETCDIR}/dhclient-exit-hooks.d/*.sh ; do
            if [ -x ${f} ]; then
                . ${f}
            fi
        done
    fi

    exit ${exit_status}
}
Therefore, in order to modify the hostname of your Linux VM you can create a custom script with a .sh extension and place it in the /etc/dhcp/dhclient-exit-hooks.d/ directory. If this directory does not exist, you can create it. The content of the custom script will be:
hostname YourFQDN
Be sure to make this new .sh file executable:
chmod +x YourFQDN.sh
Source: (https://groups.google.com/d/msg/gce-discussion/olG_nXZ-Jaw/Y9HMl4mlBwAJ)
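Putting those steps together as a sketch (the file name set_hostname.sh is just an example; this follows the answer's approach of calling hostname directly):
# Create the hooks directory if needed, drop in a script that re-applies the
# hostname after every DHCP renewal, and make it executable.
sudo mkdir -p /etc/dhcp/dhclient-exit-hooks.d
echo 'hostname YourFQDN' | sudo tee /etc/dhcp/dhclient-exit-hooks.d/set_hostname.sh
sudo chmod +x /etc/dhcp/dhclient-exit-hooks.d/set_hostname.sh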
I'm not sure I understand Adrián's answer. It seems overly complex: since you have to run a script each boot, why not just use hostname?
vi /etc/rc.local
add:
hostname your_hostname
That's it. Tested and working; no need to fiddle with metadata and such.
Non-cron/metadata/script solution.
Edit /etc/dhclient-(network-interface).conf or create one if it doesn't exist.
Example:
sudo nano /etc/dhclient-eth0.conf
Then add the following line, replacing the desired FQDN between the double quotes:
supersede host-name "hostname.domain-name";
This persists between reboots, and both hostname and hostname -f work as intended.
Tested on Debian.
The dhclient sets the hostname using DHCP
You can override this by creating a custom hook script in /etc/dhcp/dhclient-exit-hooks.d/custom_set_hostname that would read the hostname from /etc/hostname:
if [ -f "/etc/hostname" ]; then
new_host_name=$(cat /etc/hostname)
fi
The script must have the execute permission.
It's important to set the new_host_name variable rather than calling the hostname command directly, as any direct call to hostname will be overridden by another hook or by dhclient-script itself, which uses this variable.
When creating a VM, you can specify a custom FQDN hostname as an optional parameter. This feature is currently in Beta.
$ gcloud beta compute instances create INSTANCE_NAME --hostname example.hostname
This should work across OSes, and prevent the need for workaround scripts.
More info in the docs.
-- Sirui (Product Manager, Google Compute Engine)
In my CentOS VMs I found that the script /etc/dhcp/dhclient.d/google_hostname.sh, installed by the google-compute-engine RPM, actually changed the hostname. This happens when the instance gets its IP address during boot.
While it's not the long-term solution I really want, for now I simply deleted this script. The hostname I set with hostnamectl now persists after a reboot.
The script is likely to be in exactly the same place in Debian/Ubuntu VMs, but of course I don't run any of those.
There is a hack you can do to achieve this, as I did. Just do:
sudo chattr +i /etc/hosts
This command actually makes the file "(i)mmutable", which means even root can't change it (unless root does chattr -i /etc/hosts first, of course).
As above, you can undo this with sudo chattr -i /etc/hosts
Cheers!
An easy way to fix this is to set up a startup script with custom metadata.
Key: startup-script
Value:
#! /bin/bash
hostname <desired hostname>