How do I set the hostname of an instance in GCE permanently? I can set it via hostname, but after a reboot it is gone again.
I tried to feed in metadata (hostname:f.q.d.n), but that did not do the job. Yet it should work via metadata (https://github.com/GoogleCloudPlatform/compute-image-packages/tree/master/google-startup-scripts).
Does anybody have an idea?
The simplest way to achieve this is with a small script, and that's what I have done.
I store the hostname in the instance metadata and then retrieve it every time the system restarts, setting the hostname via a cron job.
$ gcloud compute instances add-metadata <instance> --metadata hostname=<new_hostname>
$ sudo crontab -e
And this is the line that must be appended to the crontab (note the @reboot directive, which runs the command once at every startup):
@reboot hostname $(curl --silent "http://metadata.google.internal/computeMetadata/v1/instance/attributes/hostname" -H "Metadata-Flavor: Google")
After these steps, every time you restart your instance it will have the hostname <new_hostname>.
You can check it in the prompt or with the command: hostname
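For reference, a minimal sketch of both steps in one go (the instance name and hostname are placeholders; the crontab entry is appended non-interactively instead of via crontab -e):
# On your workstation: store the desired hostname in instance metadata
gcloud compute instances add-metadata my-instance --metadata hostname=host.example.com
# On the instance: append the @reboot job to root's crontab
(sudo crontab -l 2>/dev/null; echo '@reboot hostname $(curl --silent "http://metadata.google.internal/computeMetadata/v1/instance/attributes/hostname" -H "Metadata-Flavor: Google")') | sudo crontab -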
You need to remove the script that resets the hostname on each DHCP renewal. Depending on the image, it lives at one of these paths:
rm -f /etc/dhcp/dhclient.d/google_hostname.sh
rm -f /etc/dhcp/dhclient-exit-hooks.d/google_set_hostname
It's worth noting that this script is needed in order to run gcloud beta compute instances create with the --hostname flag. If this script is absent on a base image, new VM instances will preserve the source hostname/FQDN!
Edit rc.local
sudo nano /etc/rc.local
Add your line below the existing content:
hostname your.hostname.com
Make sure the file is executable afterwards so it runs at boot (on RHEL-based systems /etc/rc.local is a symlink to /etc/rc.d/rc.local):
chmod +x /etc/rc.d/rc.local
Reboot, and profit.
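For example, a complete /etc/rc.local might look like this (a sketch of a Debian-style rc.local, which must end with exit 0; replace the hostname with your own):
#!/bin/sh -e
# /etc/rc.local: executed at the end of each multi-user runlevel
hostname your.hostname.com
exit 0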
That isn't possible. Please take a look at this answer. The following article explains that the hostname is part of the default metadata entries, and it is not possible to manually edit any of the default metadata pairs. As such, you would need a script or something similar to change the hostname every time the system restarts; otherwise it will automatically be re-synced with the metadata server on every reboot.
You can find information on startup scripts for GCE in this article. You can visit this one for info on how to apply the script to an instance.
You can also create a simple startup-script to do the job:
$ gcloud compute instances add-metadata <instance-name> --zone <instance-zone> --metadata startup-script='#! /bin/bash
hostname <hostname>'
Notice that if you already have a startup-script, you need to append the command below to the existing startup-script; otherwise you will replace the whole startup-script:
hostname <hostname>
I managed to set the hostname of a GCE instance running CentOS as follows.
Source: desantolo.com
Click EDIT on your instance
Go to "Custom metadata" section
Add hostname + your.hostname.tld (change "your.hostname.tld" to your actual hostname)
run curl --silent "http://metadata.google.internal/computeMetadata/v1/instance/attributes/hostname" -H "Metadata-Flavor: Google"
run sudo env EDITOR=nano crontab -e to edit crontab
add the line @reboot hostname $(curl --silent "http://metadata.google.internal/computeMetadata/v1/instance/attributes/hostname" -H "Metadata-Flavor: Google") (the @reboot directive runs it at startup)
On your keyboard Ctrl + X
On your keyboard hit Y
On your keyboard hit Enter
run reboot
after the system has rebooted, run hostname and see if your changes were applied
Good luck!
If anyone finds that this solution does not work for them on a GCE instance, I suggest trying exit hooks as described by Google Support.
In fact, some Linux distributions such as CentOS and Debian use the dhclient-script script to configure the network parameters of the machine. This script is invoked from time to time by dhclient, the Dynamic Host Configuration Protocol client, which provides a means for configuring one or more network interfaces using the DHCP protocol, BOOTP protocol, or, if those protocols fail, by statically assigning an address.
The following text is a quote from the man page of dhclient-script:
After all processing has completed, /usr/sbin/dhclient-script checks for the presence of an executable /etc/dhcp/dhclient-exit-hooks script, which if present is invoked using the '.' command. The exit status of dhclient-script will be passed to dhclient-exit-hooks in the exit_status shell variable, and will always be zero if the script succeeded at the task for which it was invoked. The rest of the environment as described previously for dhclient-enter-hooks is also present. The /etc/dhcp/dhclient-exit-hooks script can modify the value of exit_status to change the exit status of dhclient-script.
That being said, by taking a look at the code snippet of dhclient-script, we can see that the script checks for the existence of an executable /etc/dhcp/dhclient-exit-hooks script and of all scripts in the /etc/dhcp/dhclient-exit-hooks.d/ directory:
ETCDIR="/etc/dhcp"

exit_with_hooks() {
    exit_status="${1}"

    if [ -x ${ETCDIR}/dhclient-exit-hooks ]; then
        . ${ETCDIR}/dhclient-exit-hooks
    fi

    if [ -d ${ETCDIR}/dhclient-exit-hooks.d ]; then
        for f in ${ETCDIR}/dhclient-exit-hooks.d/*.sh ; do
            if [ -x ${f} ]; then
                . ${f}
            fi
        done
    fi

    exit ${exit_status}
}
Therefore, in order to modify the hostname of your Linux VM you can create a custom script with a .sh extension and place it in the /etc/dhcp/dhclient-exit-hooks.d/ directory. If this directory does not exist, you can create it. The content of the custom script will be:
hostname YourFQDN
(The .sh extension belongs to the script's filename, not to the hostname argument.)
Be sure to make the new .sh file executable:
chmod +x YourFQDN.sh
Source: (https://groups.google.com/d/msg/gce-discussion/olG_nXZ-Jaw/Y9HMl4mlBwAJ)
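Putting that together, a minimal sketch (the script name and FQDN are placeholders):
sudo mkdir -p /etc/dhcp/dhclient-exit-hooks.d
# the hook is sourced by dhclient-script, so a single hostname call is enough
echo 'hostname host.example.com' | sudo tee /etc/dhcp/dhclient-exit-hooks.d/set_hostname.sh
sudo chmod +x /etc/dhcp/dhclient-exit-hooks.d/set_hostname.sh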
I'm not sure I understand Adrián's answer. It seems overly complex: since you have to run a script on each boot anyway, why not just use hostname?
vi /etc/rc.local
add:
hostname your_hostname
That's it. Tested and working; no need to fiddle with metadata and such.
Non-cron/metadata/script solution.
Edit /etc/dhclient-(network-interface).conf, or create it if it doesn't exist.
Example:
sudo nano /etc/dhclient-eth0.conf
Then add the following line, putting the desired FQDN between the double quotes:
supersede host-name "hostname.domain-name";
This persists between reboots, and both hostname and hostname -f work as intended.
Tested on Debian.
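To apply the change without a full reboot, you could release and renew the lease (a sketch; adjust the interface name to match your system):
sudo dhclient -r eth0 && sudo dhclient eth0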
dhclient sets the hostname via DHCP.
You can override this by creating a custom hook script /etc/dhcp/dhclient-exit-hooks.d/custom_set_hostname that reads the hostname from /etc/hostname:
if [ -f "/etc/hostname" ]; then
new_host_name=$(cat /etc/hostname)
fi
The script must have the execute permission.
It's important to set the new_host_name variable rather than call the hostname command directly: any direct call to hostname would be overridden by another hook or by dhclient-script itself, which uses this variable.
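A complete sketch of the hook, with comments (same path as above):
#!/bin/sh
# /etc/dhcp/dhclient-exit-hooks.d/custom_set_hostname
# dhclient-script applies new_host_name itself, so setting the variable
# (instead of calling hostname) survives the rest of the hook chain.
if [ -f "/etc/hostname" ]; then
    new_host_name=$(cat /etc/hostname)
fi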
When creating a VM, you can specify a custom FQDN hostname as an optional parameter. This feature is currently in Beta.
$ gcloud beta compute instances create INSTANCE_NAME --hostname example.hostname
This should work across OSes, and prevent the need for workaround scripts.
More info in the docs.
-- Sirui (Product Manager, Google Compute Engine)
In my CentOS VMs I found that the script /etc/dhcp/dhclient.d/google_hostname.sh, installed by the google-compute-engine RPM, actually changed the hostname. This happens when the instance gets its IP address during boot.
While it's not the long-term solution I really want, for now I simply deleted this script. The hostname I set with hostnamectl now persists after a reboot.
The script is likely to be in exactly the same place in Debian/Ubuntu VMs, but of course I don't run any of those.
There is a hack you can use to achieve this, as I did. Just do:
sudo chattr +i /etc/hosts
This command makes the file "(i)mmutable", which means even root can't change it (unless root runs chattr -i /etc/hosts first, of course).
As above, you can undo this with sudo chattr -i /etc/hosts.
Cheers!
An easy way to fix this is to set up a startup script with custom metadata.
Key: startup-script
Value:
#! /bin/bash
hostname <desired hostname>
Disclaimer:
On an old machine running Ubuntu 14.04 with Upstart as the init system, I enabled the HTTP API by defining DOCKER_OPTS in /etc/default/docker. It works.
$ docker version
Client:
Version: 1.11.2
(...)
Server:
Version: 1.11.2
(...)
Problem:
This solution does not work on a recent machine with Ubuntu 16.04 and systemd.
As stated at the top of the file installed on the recent machine, /etc/default/docker:
# Docker Upstart and SysVinit configuration file
#
# THIS FILE DOES NOT APPLY TO SYSTEMD
#
# Please see the documentation for "systemd drop-ins":
# https://docs.docker.com/engine/articles/systemd/
#
(...)
When I checked this information on the Docker documentation page for systemd, I learned I need to fill in a daemon.json file, but as stated in the reference, some properties are self-explanatory while others are under-explained.
That being said, I'm looking for help converting this:
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock -G myuser --debug"
into the daemon.json object.
Notes
PS1: I'm aware that daemon.json has debug: true as a default.
PS2: Probably group: "myuser" will work as-is, or with an array of strings.
PS3: My main concern is using the socket and HTTP simultaneously.
EDIT (8/08/2017)
After reading the accepted answer, check @white_gecko's answer for more input on the matter.
With a lot of fragmented documentation, it was difficult to solve this.
My first attempt was to create daemon.json with:
{
"hosts": [
"unix:///var/run/docker.sock",
"tcp://127.0.0.1:2376"
]
}
This did not work; it produced the error docker[5586]: unable to configure the Docker daemon with file /etc/docker/daemon.json after I tried to restart the daemon with service docker restart.
Note: there was more to the error message that I failed to copy.
What this error meant is that at daemon start there was a conflict between a command-line flag and the configuration in daemon.json.
When I looked into it with service docker status, the parent process was: ExecStart=/usr/bin/docker daemon -H fd://.
That was strange, because it differs from the configuration in /etc/init.d/docker, which I thought held the service configuration.
The strange part was that the file in init.d contains no reference to the daemon argument, nor to -H fd://.
After some research and a lot of searching through system directories, I found the right directory (with help from the discussion in docker GitHub issue #22339).
Solution
Edited ExecStart in /lib/systemd/system/docker.service to this new value:
/usr/bin/docker daemon
And created the /etc/docker/daemon.json with
{
"hosts": [
"fd://",
"tcp://127.0.0.1:2376"
]
}
Finally, I restarted the service with service docker start, and now I get the "green light" from service docker status.
Tested the new configurations with:
$ docker run hello-world
Hello from Docker!
(...)
And,
$ curl http://127.0.0.1:2376/v1.23/info
[JSON]
I hope that this will help someone with a similar problem as mine! :)
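For completeness, a daemon.json that mirrors the original DOCKER_OPTS might look like the following. This is a sketch: hosts, group, and debug are the documented counterparts of -H, -G, and --debug, and it assumes the -H flag has been removed from ExecStart as described above:
sudo tee /etc/docker/daemon.json <<'EOF'
{
    "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"],
    "group": "myuser",
    "debug": true
}
EOF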
I had the same problem, and in my eyes the easiest solution, which doesn't touch any existing files managed by the system update process, is to use a systemd drop-in:
Just create a file /etc/systemd/system/docker.service which overrides the specific part of the service in /lib/systemd/system/docker.service.
In this case the content of /etc/systemd/system/docker.service would be:
[Service]
ExecStart=/usr/bin/dockerd --tlsverify --tlscacert=/etc/docker/ca.pem --tlscert=/etc/docker/server-cert.pem --tlskey=/etc/docker/server-key.pem -H=tcp://127.0.0.1:2375 -H=fd://
(You could even create a directory docker.service.d containing multiple files to override different parameters.)
After adding the file you just run:
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
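A sketch of the drop-in directory variant (the file name override.conf is arbitrary; the empty ExecStart= line clears the packaged value before setting the new one, as explained in the next answer):
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://127.0.0.1:2375
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker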
The solution described at https://docs.docker.com/engine/admin/#troubleshoot-conflicts-between-the-daemonjson-and-startup-scripts works for me:
One notable example of a configuration conflict that is difficult to
troubleshoot is when you want to specify a different daemon address
from the default. Docker listens on a socket by default. On Debian and
Ubuntu systems using systemd, this means that a -H flag is always
used when starting dockerd. If you specify a hosts entry in the
daemon.json, this causes a configuration conflict (as in the above
message) and Docker fails to start.
To work around this problem, create a new file
/etc/systemd/system/docker.service.d/docker.conf with the following
contents, to remove the -H argument that is used when starting the
daemon by default.
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd
Note that the line with ExecStart= is actually required, otherwise it'll fail with the error:
docker.service: Service has more than one ExecStart= setting, which is only allowed for Type=oneshot services. Refusing.
After creating the file you must run:
sudo systemctl daemon-reload
sudo systemctl restart docker
For me, on Ubuntu 18.04.1 LTS with Docker 18.06.0-ce, it worked to create
/etc/systemd/system/docker.service.d/remote-api.conf
with the following content:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock
then run sudo systemctl daemon-reload and sudo systemctl restart docker
See the result by calling:
curl http://localhost:2376/info
You might need to configure a proxy if your Docker host is behind one.
To achieve this, paste the following into the /etc/default/docker file:
http_proxy="http://85.22.53.71:8080/"
https_proxy="http://85.22.53.71:8080/"
HTTP_PROXY="http://85.22.53.71:8080/"
HTTPS_PROXY="http://85.22.53.71:8080/"
# below you can list some *.enterprise_domain.com as well
NO_PROXY="localhost,127.0.0.1,::1"
Or Create
/etc/systemd/system/docker.service.d/remote-api.conf with following content:
[Service]
Environment="HTTP_PROXY=http://<you_proxy_ip>:<port>"
Environment="HTTPS_PROXY=https://<you_proxy_ip>:<port>/"
Environment="NO_PROXY=localhost,127.0.0.1,::1"
I hope it helps someone...
I am trying to mount a remote filesystem on Google Container Engine. I am following this tutorial: https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh
Using the following sshfs command:
sudo sshfs -o sshfs_debug,allow_other <instance-name>.<region>.<project_id>:/home/<user_name> /mnt/gce-container
I am getting this error:
SSHFS version 2.5
read: Connection reset by peer
I referred to this link https://cloud.google.com/sdk/gcloud/reference/compute/config-ssh
and could log in using ssh via the following commands:
$ gcloud compute config-ssh
$ ssh <instance-name>.<region>.<project_id>
Any ideas what might be going wrong here? I can't figure out which keys and username I should use for the sshfs login.
Update (11/5):
I am using the following command:
sshfs -o IdentityFile=~/.ssh/google_compute_engine <user>@<ip>:~/ /mnt/gce
I have chowned the /mnt/gce folder to my user. I checked that the IP matches the entry in the ~/.ssh/config file. However, I still get the error read: Connection reset by peer.
The problems with the command below are that:
1) unless you have a static IP, it keeps changing on machine reboot, and
2) you need to use the .pub file
sshfs -o IdentityFile=~/.ssh/google_compute_engine <user>@<ip>:~/ /mnt/gce
I finally got it working with the following commands:
sudo mkdir /mnt/gce
sudo chown <user> /mnt/gce
sshfs -o IdentityFile=~/.ssh/google_compute_engine.pub <user_name>@<instance-name>.<region>.<project_id>:/home/<user_name> /mnt/gce
A few things that might be the cause of the problem:
Don't run sshfs as root. It's a FUSE filesystem and is meant to be mounted by the user.
Don't specify a full path as the remote FS. It's SSH, so by default the $PWD on the remote side is the login user's $HOME.
If ssh works, sshfs will work. The easiest way is to make sure that ~/.ssh/config has an entry for the remote host with the user, port, etc. provided; see the sketch below.
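For example, with a ~/.ssh/config entry like this (the host alias, IP, and user are placeholders):
Host gce-box
    HostName 203.0.113.10
    User myuser
    IdentityFile ~/.ssh/google_compute_engine
the mount reduces to:
sshfs gce-box: /mnt/gce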
If you get this from sshfs:
read: Connection reset by peer
it may help to make your key file read-only:
chmod 400 /{{path_to_your_key}}/keypair.pem
and connect again.
I want to ping an external IP from all of my servers that run the Zabbix agent.
I searched and found some articles about Zabbix user parameters.
In /etc/zabbix/zabbix_agentd.conf.d/ I created a file named userparameter_ping.conf with the following content:
UserParameter=checkip[*],ping -c4 8.8.8.8 && echo 0 || echo 1
I created an item named checkip on the Zabbix server, with a graph, but got no data. After some more digging I found zabbix_get and tested my user parameter, but I got the error ZBX_NOTSUPPORTED:
# zabbix_get -s 172.20.4.43 -p 10050 -k checkip
My Zabbix version:
Zabbix Agent (daemon) v2.4.5 (revision 53282) (21 April 2015)
Does anybody know what I can do to address this?
After some changes and talks with folks on the mailing list, it finally worked. Here is how:
First I created a file in:
/etc/zabbix/zabbix_agentd.conf.d/
and added this line:
UserParameter=checkip[*],ping -W1 -c2 $1 >/dev/null 2>&1 && echo 0 || echo 1
Then I ran this command:
./sbin/zabbix_agentd -t checkip["8.8.8.8"]
checkip[8.8.8.8] [t|0]
So everything works, but the Timeout option is very important here. Add a timeout in /etc/zabbix/zabbix_agentd.conf:
Timeout=30
The Timeout default is 3s, so if you run
time ping -W1 -c2 8.8.8.8
you may see it take more than 3s, in which case you get the error:
ZBX_NOTSUPPORTED
It can be anything: for example, a timeout (the default timeout is 3 seconds, and ping -c4 requires at least 3 seconds), the permissions or path of ping, an agent that wasn't restarted, ...
Increase the debug level, restart the agent, and check the Zabbix logs. You can also test zabbix_agentd directly:
zabbix_agentd -t checkip[]
[m|ZBX_NOTSUPPORTED] [Timeout while executing a shell script.] => Timeout problem. Edit zabbix_agentd.conf and increase the Timeout setting. The default of 3 seconds is not enough for your ping, which needs 3+ seconds.
If you need more than 30s for the execution, you can use the nohup (command..) & combo to work around the timeout restriction.
That way, if you write the results to a file, on the next pass you can read the file and get the results back without any need to wait.
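A sketch of that pattern (the script path and result file are hypothetical; the script itself is responsible for writing its result file when it finishes):
# start the slow job detached, then immediately return the previous run's result
UserParameter=slowcheck,nohup /etc/zabbix/scripts/slow_job.sh >/dev/null 2>&1 & cat /var/tmp/slow_job.result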
For those who may be experiencing other issues with the same error message:
It is important to run zabbix_agentd with the -c parameter:
./sbin/zabbix_agentd -c zabbix_agentd.conf --test checkip["8.8.8.8"]
Otherwise Zabbix might not pick up the command and will thus yield ZBX_NOTSUPPORTED.
It also helps to isolate the command into a script file, as Zabbix will butcher in-line commands in UserParameter= much more than you'd expect; see the sketch below.
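For instance, a sketch of moving the ping check into a script file (the path is illustrative):
#!/bin/bash
# /etc/zabbix/scripts/checkip.sh -- prints 0 if the target answers, 1 otherwise
ping -W1 -c2 "$1" >/dev/null 2>&1 && echo 0 || echo 1
Make it executable with chmod +x, and the user parameter then becomes:
UserParameter=checkip[*],/etc/zabbix/scripts/checkip.sh $1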
I defined two user parameters like this for sync checking between two Samba DCs.
/etc/zabbix/zabbix_agentd.d/userparameter_samba.conf:
UserParameter=syncma, sudo samba-tool drs replicate smb1 smb2 cn=schema,cn=configuration,dc=domain,dc=com
UserParameter=syncam, sudo samba-tool drs replicate smb2 smb1 cn=schema,cn=configuration,dc=domain,dc=com
and also granted the zabbix user sudo access to execute the command, in /etc/sudoers.d/zabbix:
Defaults:zabbix !syslog
Defaults:zabbix !requiretty
zabbix ALL=(ALL) NOPASSWD: /usr/bin/samba-tool
zabbix ALL=(ALL) NOPASSWD: /usr/bin/systemctl
And "EnableRemoteCommands" is enabled on my zabbix_aganetd.conf, sometimes when I run
zabbix_get -s CLIENT_IP -p10050 -k syncma or
zabbix_get -s CLIENT_IP -p10050 -k syncam
I get the error ZBX_NOTSUPPORTED: Timeout while executing a shell script.
But after executing /sbin/zabbix_agentd -t syncam on the client, the Zabbix server just responds normally:
Replicate from smb2 to smb1 was successful.
And when it has a problem I get the error below in my zabbix.log:
failed to kill [ sudo samba-tool drs replicate smb1 smb2 cn=schema,cn=configuration,dc=domain,dc=com]: [1] Operation not permitted
It looks like a permission error, yet it resolved itself after executing /sbin/zabbix_agentd -t syncam. I am not sure whether the error is gone permanently or will reappear at the next Zabbix item check interval.
I use Sun's SGE to submit my jobs to a cluster system. The problem is how to let the compute machine find the environment variables of the host machine, or how to configure the qsub script so that the compute machine loads the host machine's environment variables.
The following is an example script, but it reports errors such as libraries not found:
#!/bin/bash
#
#$ -V
#$ -cwd
#$ -j y
#$ -o /home/user/jobs_log/$JOB_ID.out
#$ -e /home/user/jobs_log/$JOB_ID.err
#$ -S /bin/bash
#
echo "Starting job: $SGE_TASK_ID"
# Modify this to use the path to matlab for your system
/home/user/Matlab/bin/matlab -nojvm -nodisplay -r matlab_job
echo "Done with job: $SGE_TASK_ID"
The technique you are using (adding -V) should work. One possibility, since you are specifying the shell with -S, is that grid engine is configured to launch /bin/bash as a login shell, and your profile scripts are stomping all over the environment you are trying to pass to the job.
Try using qstat -xml -j on the job while it is queued/running to see what environment variables grid engine is trying to pass to it.
Try adding an env command to the script to see what variables are set.
Try adding shopt -q login_shell; echo $? in the script to tell you whether it is being run as a login shell. (A combined sketch of these checks follows below.)
To list the shells that are configured as login shells in grid engine, try:
SGE_SINGLE_LINE=true qconf -sconf|grep ^login_shells
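Putting those checks together, you could temporarily add something like this to the top of the job script (a sketch; the log path mirrors the one in the question):
# dump the job's environment and report whether this is a login shell
env | sort > /home/user/jobs_log/${JOB_ID}.env
shopt -q login_shell && echo "login shell" || echo "not a login shell"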
I think this issue is because you didn't configure bash in SGE's login_shells.
Check your login_shells with qconf -sconf and see if bash is in there; a sketch of checking and changing it follows.
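A sketch of checking and, if needed, adding bash (qconf -mconf opens the global configuration in an editor):
SGE_SINGLE_LINE=true qconf -sconf | grep ^login_shells
qconf -mconf global    # append ",bash" to the login_shells line, then save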
The relevant man page entry:
login_shells
UNIX command interpreters like the Bourne shell (see sh(1)) or the C shell (see csh(1)) can be used by Grid Engine to start job scripts. The command interpreters can either be started as login shells (i.e. all system and user default resource files like .login or .profile will be executed when the command interpreter is started, and the environment for the job will be set up as if the user had just logged in) or just for command execution (i.e. only shell-specific resource files like .cshrc will be executed and a minimal default environment is set up by Grid Engine - see qsub(1)). The parameter login_shells contains a comma-separated list of the executable names of the command interpreters to be started as login shells. Shells in this list are only started as login shells if the parameter shell_start_mode (see above) is set to posix_compliant.
Changes to login_shells will take immediate effect. The default for login_shells is sh,csh,tcsh,ksh.
This value is a global configuration parameter only. It cannot be overwritten by the execution host local configuration.
I'm new to Hadoop. I installed Ubuntu 12.10 on my computer and I want to install Hadoop in pseudo-distributed mode on a single node. I searched and got lots of tutorials, but I have a problem with SSH. I did what the tutorial said.
I am sure the problem is with SSH. I installed openssh-server, and had done this:
hadoop00@WebsoftStation:~$ ssh-keygen -t dsa -P "" -f ~/.ssh/id_dsa
hadoop00@WebsoftStation:~/.ssh$ cat ~/.ssh/id_dsa.pub >> authorized_keys
Then I could successfully ssh to localhost like this:
hadoop00@WebsoftStation:~$ ssh localhost
It worked.
So I changed into the Hadoop directory and ran:
hadoop00@WebsoftStation:/usr/local/hadoop$ sudo bin/start-all.sh
[sudo] password for hadoop00:
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-root-namenode-WebsoftStation.out
root@localhost's password:
root@localhost's password: localhost: Permission denied, please try again.
So, what's the problem?
You have set up password-less ssh only for your current account. Since you can use ssh localhost without any problem, the next thing to do is give execution permission to your scripts.
Execute the following commands:
chmod +x bin/*.sh ---> assigns execution permission to all the scripts
./start-all.sh ----> executes the script
Note: Hadoop can also be run without a password-less ssh setup, using the hadoop-daemon.sh script. The only advantage of password-less ssh is that the start-all.sh script will take the trouble of doing that on your behalf on each of the nodes; see the sketch below.
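For instance, a sketch of starting the daemons individually with hadoop-daemon.sh (run from the Hadoop directory on the relevant node; the daemon names match the Hadoop 1.x era of this question):
bin/hadoop-daemon.sh start namenode
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start jobtracker
bin/hadoop-daemon.sh start tasktracker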
You need to change permissions for your Hadoop folder to be owned by the hadoop00 user:
cd /usr/local/
sudo chown -R hadoop00:hadoop00 /usr/local/hadoop
Then you can cd into the sbin folder and run things without sudo. If you use sudo, you're running the scripts as root, which has different environment variables etc., which is why you see different behavior.
Why are you using sudo? This is clearly a permission problem.
Try running this without sudo:
bin/start-all.sh