pods can't resolve DNS after 'oc cluster up' - openshift

On a fresh install of RHEL7.4:
# install the oc client and docker
[root@openshift1 ~]# yum install atomic-openshift-clients.x86_64 docker
# configure and start docker
[root@openshift1 ~]# sed -i '/^\[registries.insecure\]/!b;n;cregistries = ['\''172.30.0.0/16'\'']' /etc/containers/registries.conf
[root@openshift1 ~]# systemctl start docker; systemctl enable docker
# these links recommend running 'iptables -F' as a workaround for pod DNS issues
# https://github.com/openshift/origin/issues/12110
# https://github.com/openshift/origin/issues/10139
[root@openshift1 ~]# iptables -F; iptables -F -t nat
[root@openshift1 ~]# oc cluster up --public-hostname 192.168.146.200
Attempting a test apache build gives me this error:
Cloning "https://github.com/openshift/httpd-ex.git " ...
WARNING: timed out waiting for git server, will wait 1m4s
error: fatal: unable to access 'https://github.com/openshift/httpd-ex.git/': Could not resolve host: github.com; Unknown error
A DNS server is present:
[root@openshift1 ~]# cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 192.168.146.2
I can confirm that the host machine can resolve names:
[root@openshift1 ~]# host github.com
github.com has address 192.30.255.113
github.com has address 192.30.255.112
However, this DNS server didn't make its way down to the pods:
[root@openshift1 ~]# oc get pods
NAME READY STATUS RESTARTS AGE
docker-registry-1-rqm9h 1/1 Running 0 38s
persistent-volume-setup-fdbv5 1/1 Running 0 50s
router-1-m6z8w 1/1 Running 0 31s
[root@openshift1 ~]# oc rsh docker-registry-1-rqm9h
sh-4.2$ cat /etc/resolv.conf
nameserver 172.30.0.1
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
Is there anything I am missing?

You should not flush the rules; instead, create a new zone and open the additional ports, e.g.:
firewall-cmd --permanent --new-zone dockerc
firewall-cmd --permanent --zone dockerc --add-source $(docker network inspect -f "{{range .IPAM.Config }}{{ .Subnet }}{{end}}" bridge)
firewall-cmd --permanent --zone dockerc --add-port 8443/tcp --add-port 53/udp --add-port 8053/udp
firewall-cmd --reload
Source:
https://github.com/openshift/origin/blob/release-3.7/docs/cluster_up_down.md#linux
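To confirm the new zone took effect after the reload, a quick check (the exact output format varies by firewalld version):
firewall-cmd --zone dockerc --list-all    # should list the docker bridge subnet as a source, plus ports 8443/tcp, 53/udp and 8053/udp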
EDIT:
Also the DNS server in your /etc/resolv.conf should be routable from your OCP instance.
Source: kubernetes skydns failure to forward request
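If you want to see what the pods themselves can resolve, a quick sanity check (a sketch reusing the pod name from the question; it assumes nslookup is available inside the image):
[root@openshift1 ~]# oc rsh docker-registry-1-rqm9h
sh-4.2$ nslookup github.com                   # resolved via the cluster DNS at 172.30.0.1
sh-4.2$ nslookup github.com 192.168.146.2     # queries the upstream nameserver directly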

Related

Port being used by container docker, but the container is not running

I created a MySQL container running on port 3306; it worked for a few days until I shut down my computer. Now the port is still in use, but the container is not running.
$ sudo netstat -ano -p tcp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 127.0.0.1:33060 0.0.0.0:* LISTEN 6112/mysqld off (0.00/0/0)
tcp 0 0 127.0.0.1:3306 0.0.0.0:* LISTEN 6112/mysqld off (0.00/0/0)
But when I check whether the container is running, I get this result:
$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0c3005d205fe mysql:5.7 "docker-entrypoint.s…" 11 days ago Exited (128) 2 days ago mysql
And if I try to start it:
$ sudo docker start mysql
Error response from daemon: driver failed programming external connectivity on endpoint mysql (c127ccaac80d4f80ffaf9a825112e07d718ee022fbcac3bc1cfc01faf6f9ebf1): Error starting userland proxy: listen tcp4 0.0.0.0:3306: bind: address already in use
Error: failed to start containers: mysql
I have tried to:
restart docker:
$ sudo /etc/init.d/docker restart
Restarting docker (via systemctl): docker.service.
kill container
$ sudo docker kill mysql
Error response from daemon: Cannot kill container: mysql: Container 0c3005d205fe3d833dc34a5b57e2dca4d72de82855c57cbdf2062b56aab3e0e1 is not running
stop container
$ sudo docker stop mysql
mysql
kill the PID
$ sudo kill -9 6112
And after all of these, the result is the same...
I created the container WITHOUT the detached option.
Thank you!
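A hedged note on the netstat output above: the listener (PID 6112) is a host mysqld process, not the container, and on a systemd distro kill -9 alone won't keep it down because systemd respawns the unit. A minimal sketch, assuming the host unit is named mysql (it may be mysqld or mariadb depending on the distro):
sudo systemctl stop mysql       # stop the host's own MySQL service instead of killing the PID
sudo systemctl disable mysql    # optional: keep it from claiming 3306 again at boot
sudo docker start mysql         # the container can now bind port 3306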

Can't start docker container 3306 is busy [duplicate]

When I run docker-compose up in my Docker project it fails with the following message:
Error starting userland proxy: listen tcp 0.0.0.0:3000: bind: address already in use
netstat -pna | grep 3000
shows this:
tcp 0 0 0.0.0.0:3000 0.0.0.0:* LISTEN -
I've already tried docker-compose down, but it doesn't help.
In your case it was some other process that was using the port, and as indicated in the comments, sudo netstat -pna | grep 3000 helped you solve the problem.
In other cases (I have encountered this many times myself) it is often the same container running in another instance. There, docker ps was very helpful, as I often left the same containers running in other directories and then tried running them again elsewhere, where the same container names were used.
How docker ps helped me:
docker rm -f $(docker ps -aq) is a short command which I use to remove all containers.
Edit: Added how docker ps helped me.
This helped me:
docker-compose down # Stop container on current dir if there is a docker-compose.yml
docker rm -fv $(docker ps -aq) # Remove all containers
sudo lsof -i -P -n | grep <port number> # List who's using the port
and then:
kill -9 <process id> (macOS) or sudo kill <process id> (Linux).
Source: comment by user Rub21.
I had the same problem. I fixed this by stopping the Apache2 service on my host.
You can easily kill the process listening on that port with the single command below:
kill -9 $(lsof -t -i tcp:<port#>)
e.g.:
kill -9 $(lsof -t -i tcp:8000)
or, for Ubuntu:
sudo kill -9 `sudo lsof -t -i:8000`
Man page for lsof: https://man7.org/linux/man-pages/man8/lsof.8.html
-9 is for a hard kill, without checking any dependencies.
(Not directly related, but might be useful if you are facing the port 5000 mystery): the culprit is macOS Monterey.
Port 5000 is commonly used to serve local development servers. After updating to the latest macOS, I was unable to get Docker to bind to port 5000 because it was already in use. (You may find a message along the lines of Port 5000 already in use.)
By running lsof -i :5000, I found that the process using the port was named ControlCenter, which is a native macOS application. If this is happening to you, even if you kill the application by brute force, it will restart itself. On my laptop, lsof -i :5000 showed Control Center using the port under process id 433. I could kill that PID, but macOS keeps restarting the process.
The process running on this port turns out to be an AirPlay server. You can deactivate it under
System Preferences › Sharing by unchecking AirPlay Receiver, which releases port 5000.
I had the same problem;
docker-compose down --rmi all (in the same directory where you ran docker-compose up)
helped.
UPD: CAUTION - this will also delete the local docker images you've pulled (from comment)
For Linux/Unix:
Find the process using the port with the following command:
netstat -nlp | grep 8888
It will show the process running on this port; then kill that process using its PID (look for the PID in its row):
kill <PID>
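If fuser (from the psmisc package) is available, the lookup and the kill can be collapsed into one step:
sudo fuser -k 8888/tcp    # sends SIGKILL to every process bound to TCP port 8888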
In some cases it is critical to perform a more in-depth debugging to the problem before stopping a container or killing a process.
Consider following the checklist below:
1) Check your current docker compose environment
Run docker-compose ps. If the port is in use by another container, stop it with docker-compose stop <service-name-in-compose-file> or remove it by replacing stop with rm.
2) Check the containers running outside your current workspace
Run docker ps to see list of all containers running under your host.
If you find the port is in use by another container, you can stop it with docker stop <container-id>.
(*) Because you're not within the scope of the original compose environment, it is good practice to first use docker inspect to gather more information about the container you're about to stop (a short sketch follows this checklist).
3) Check whether the port is used by other processes running on the host
For example, if the port is 6379, run:
$ sudo netstat -ltnp | grep ':6379'
tcp 0 0 127.0.0.1:6379 0.0.0.0:* LISTEN 915/redis-server 12
tcp6 0 0 ::1:6379 :::* LISTEN 915/redis-server 12
(*) You can also use the lsof command, which is mainly used to retrieve information about files that are opened by various processes (I suggest running netstat before that).
So, in the case of the output above, the PID is 915. Now you can run:
$ ps j 915
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
1 915 915 915 ? -1 Ssl 123 0:11 /usr/bin/redis-server 127.0.0.1:6379
And see the ID of the parent process (PPID) and the execution command.
You can also run $ pstree -s <PID> for a visual display of the process and its related processes.
In our case we can see that the process is probably a daemon (its PPID is 1). In that case, consider running: A) $ cat /proc/<PID>/status to get more in-depth information about the process, such as the number of threads it has spawned and its capabilities.
B) $ systemctl status <PID> to see the systemd unit that caused the creation of the specific process. If the service is not critical, you can stop and disable it.
4) Restart the Docker service
Run: sudo service docker restart.
5) You reached this point and...
Only if it does not place your system at risk, consider restarting the server.
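As an example of step (2), here is a short sketch of gathering information before stopping a container (the container ID and port are placeholders):
docker ps --format '{{.ID}}  {{.Names}}  {{.Ports}}' | grep 6379    # find which container publishes the port
docker inspect --format '{{.Name}}  {{.HostConfig.PortBindings}}' <container-id>    # look before you stop
docker stop <container-id>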
In my case it was:
Error starting userland proxy: listen tcp 0.0.0.0:9000: bind: address already in use
and all I needed was to turn off debug listening in PhpStorm.
Most probably this is because you are already running a web server on your host OS, so it conflicts with the web server that Docker is attempting to start.
So try this one-liner before trying anything else:
sudo service apache2 stop; sudo service nginx stop; sudo nginx -s stop;
I had Apache running on my Ubuntu machine. I used this command to kill it!
sudo /etc/init.d/apache2 stop
I was getting the error below when I was trying to launch a new container:
listen tcp 0.0.0.0:8080: bind: address already in use.
To check which process is running on port 8080, run the command below:
netstat -tulnp | grep 8080
I got the output below:
[root@ip-112-x6x-2x-xxx.xxxxx.compute.internal (aws_main) ~]# netstat -tulnp | grep 8080
tcp   0   0 0.0.0.0:8080   0.0.0.0:*   LISTEN   12749/java
run
kill -9 12749
Then try to relaunch the container; it should work.
If the Redis server was started as a service, it will restart itself when you use kill -9 <process_id> or sudo kill -9 `sudo lsof -t -i:<port_number>`. In that case you will need to stop the Redis service using the following command:
sudo service redis-server stop
I upgraded my docker this afternoon and ran into the same problem. I tried restarting docker but no luck.
Finally, I had to restart my computer and it worked. Definitely a bug.
Check docker-compose.yml; it might be that the port is specified twice.
version: '3'
services:
  registry:
    image: mysql:5.7
    ports:
      - "3306:3306"              # <--- remove either this line or the next
      - "127.0.0.1:3306:3306"
Changing network_mode: "bridge" to "host" did it for me. This, together with:
version: '2.2'
services:
  bind:
    image: sameersbn/bind:latest
    dns: 127.0.0.1
    ports:
      - 172.17.42.1:53:53/udp
      - 172.17.42.1:10000:10000
    volumes:
      - "/srv/docker/bind:/data"
    environment:
      - 'ROOT_PASSWORD=secret'
    network_mode: "host"
I ran into the same issue several times; restarting Docker seems to do the trick.
A variation of @DmitrySandalov's answer: I had Tomcat/Java running on 8080, which needed to keep going. I looked at the docker-compose.yml file and altered the entry for 8080 to another port of my choosing:
nginx:
  build: nginx
  ports:
    #- '8080:80'   <-- original entry
    - '8880:80'
    - '8443:443'
Worked perfectly. (The only wrinkle is the change will be wiped if I ever update the project, since it's coming from an external repo.)
First, make sure which service is running on your specific port. In your case, you are already using port number 3000. On Windows:
netstat -aof | findstr :3000
Now stop the process that is running on that port. On macOS/Linux you can find it with:
lsof -i tcp:3000
I resolved the issue by restarting Docker.
It makes more sense to change the Docker container's port mapping than to shut down the other services that use port 80.
Just a side note if you have the same issue and are on Windows:
In my case the process in my way was just grafana-server.exe, because I had first downloaded the binary version and double-clicked the executable; it then started as a service under the SYSTEM user, which I could not taskkill (no permission).
I had to go to the Windows "Service manager", search for the service "Grafana", and stop it. After that, port 3000 was no longer occupied.
Hope that helps.
The one that was using port 8888 was Jupyter, and I had to change the Jupyter Notebook configuration file to make it run on another port.
To list who is using that specific port:
sudo lsof -i -P -n | grep 9
You can specify the port you want Jupyter to run on by uncommenting/editing the following line in ~/.jupyter/jupyter_notebook_config.py:
c.NotebookApp.port = 9999
In case you don't have a jupyter_notebook_config.py, try running jupyter notebook --generate-config. See this for further details on Jupyter configuration.
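Alternatively, the port can be chosen per run on the command line instead of in the config file:
jupyter notebook --port 9999    # overrides c.NotebookApp.port for this invocation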
Before, it was running with: docker run -d --name oracle -p 1521:1521 -p 5500:5500 qa/oracle
I just changed the ports: docker run -d --name oracle -p 1522:1522 -p 5500:5500 qa/oracle
It worked fine for me!
On my machine no PID was shown by netstat -tulpn for the in-use port (8080), so I could not kill it, and killing the containers and restarting the computer did not work. The service docker restart command restarted Docker for me (Ubuntu), the port was no longer in use, and I am a happy chap and off to lunch.
Maybe it is too crude, but it works for me: restart the Docker service itself.
sudo service docker restart
Hope it works for you too!
I just ran the container on another port, like... 8082 :-)
I came across this problem; my simple solution was to remove MongoDB from the system.
Commands to remove MongoDB on Ubuntu:
sudo apt-get purge mongodb mongodb-clients mongodb-server mongodb-dev
sudo apt-get purge mongodb-10gen
sudo apt-get autoremove
Let me add one more case, because I had the same error and none of the solutions listed so far worked:
serv1:
  ...
  networks:
    privnet:
      ipv4_address: 10.10.100.2
  ...
serv2:
  ...
  # no IP assignment, no dependencies
networks:
  privnet:
    ipam:
      driver: default
      config:
        - subnet: 10.10.100.0/24
Depending on the init order, serv2 may get assigned the IP 10.10.100.2 before serv1 is started, so I just assign IPs manually for all containers to avoid the error. Maybe there are more elegant ways.
I had the same problem, and stopping the Docker container resolved it:
sudo docker container stop <container-name>
I solved it with: sudo service redis-server stop

GKE: how to close the connection to a pod after port-forwarding via bastion-host

I am working from my local machine with a database deployed in a pod in Kubernetes. To connect to it, it is first necessary to connect to a bastion host VM.
Basically, it is a double SSH tunnel: port 3306 is mapped to port 3306 of the bastion host VM and then to local port 3306 via
gcloud beta compute ssh my-bastion-host --project my-gcp-project --zone us-west1-b --command "kubectl -n mynamespace port-forward app-mysqldb-12345-abcde 3306" -- -L3306:127.0.0.1:3306
However, when I terminate the command, the connection between the VM and the MySQL pod is not terminated, and I currently have to close it manually: first, on the bastion host,
ps -ef | grep port-forward
to find the PROCESS_NUMBER,
then
kill -9 <PROCESS_NUMBER>
to terminate the connection.
Is there a way to automatically close the connection between the bastion host and the MySQL pod when the gcloud beta compute ssh command is terminated?
Try this:
gcloud beta compute ssh my-bastion-host --project my-gcp-project --zone us-west1-b --command "bash -c 'kubectl -n mynamespace port-forward app-mysqldb-12345-abcde 3306'; kill -9 $(pgrep -f port-forward)" -- -L3306:127.0.0.1:3306
As @Saxon suggested, the idea is to run kill after the kubectl command. However, any lingering process should be killed first, before running another port-forwarding operation.
So you should perform another kill prior to calling kubectl, so that any lingering connection is killed before a new port-forward is created; this will resolve the error you are getting:
gcloud beta compute ssh my-bastion-host --project my-gcp-project --zone us-west1-b --command "bash -c 'kill -9 $(pgrep -f port-forward); kubectl -n mynamespace port-forward app-mysqldb-12345-abcde 3306'; kill -9 $(pgrep -f port-forward)" -- -L3306:127.0.0.1:3306
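A hedged alternative sketch: let the remote shell clean up after itself with a trap, so nothing lingers no matter how the SSH session ends (the single quotes keep the command from being expanded on your local machine; it assumes pkill is available on the bastion host):
gcloud beta compute ssh my-bastion-host --project my-gcp-project --zone us-west1-b --command 'trap "pkill -f port-forward" EXIT; kubectl -n mynamespace port-forward app-mysqldb-12345-abcde 3306' -- -L3306:127.0.0.1:3306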

Docker: Error response from daemon: Bind for 0.0.0.0:3306 failed: port is already allocated

I'm new to Docker and I can't seem to get my MariaDB container running. I have just freshly installed Docker on a MacBook Pro running High Sierra.
I've simply used this command:
docker run --name db -e MYSQL_ROOT_PASSWORD=test -d -p 3306:3306 mariadb
This is supposed to pull the image and run the container from it, but I get the following error:
docker: Error response from daemon: driver failed programming external connectivity on endpoint db (d4d6631ae53d644b5c28a803d5814a792c7af6925ebcf84b61b49b4a0fe30f4b): Error starting userland proxy: Bind for 0.0.0.0:3306 failed: port is already allocated.
I may have used MySQL in the far past, but I'm pretty sure I don't have anything running on port 3306.
I have also tried omitting the -p flag; the container then runs, but docker ps shows 3306/tcp and NOT 0.0.0.0:3306->3306/tcp as the PORTS entry.
I have also tried passing just the container port as -p 3306, but then docker ps shows 0.0.0.0:32769->3306/tcp as the PORTS entry.
I would love some help. Thanks in advance.
Use the lsof command to check whether a service/process is using port 3306:
$ lsof -i tcp:3306
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mysqld 721 krisnik 34u IPv4 0x348c24a60c9d72a9 0t0 TCP localhost:mysql (LISTEN)
Now kill/stop the service:
kill -9 <PID>
Re-run your Docker container; it should work fine now that the required port has been released.
Edit - 1
If lsof doesn't catch the process, netstat can also be used.
sudo netstat -lpn | grep :3306
kill -9 <PID>    # the PID is the process ID using port 3306
Ref - Port 3306 busy but no process using it
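One macOS-specific caveat (hedged, since it depends on how MySQL was installed): a local mysqld is often managed by Homebrew's launchd integration, in which case a plain kill just lets it respawn; stop the service instead:
brew services stop mysql    # assumes MySQL was installed via Homebrew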

Can't save iptables rule on Google Cloud VM instance (CentOS 7)

I'm running Tomcat 8 on CentOS 7 in a Google VM instance, on port 8080.
I set up an iptables rule to map all external connections to port 80 to port 8080:
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
After that I save the rule with:
service iptables save
Tomcat works fine and is accessible from outside via port 80.
The rule is saved in /etc/sysconfig/iptables.
...
-A PREROUTING -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 8080
...
but after a server reboot the rule is no longer applied.
It is still in the file /etc/sysconfig/iptables, but not in effect when I run
iptables-save
It seems that the iptables rules are restored from somewhere else.
How can I persist the rule properly so that it is preserved after a reboot?
In order to resolve the issue with iptables, you can do the following:
yum install iptables-services
systemctl mask firewalld
systemctl enable iptables
systemctl enable ip6tables
systemctl stop firewalld
systemctl start iptables
systemctl start ip6tables
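With iptables-services enabled in place of firewalld, the rule already saved in /etc/sysconfig/iptables should survive a reboot. A quick way to verify without rebooting (a sketch; the output format varies):
systemctl restart iptables            # reloads the rules from /etc/sysconfig/iptables
iptables -t nat -L PREROUTING -n      # the REDIRECT rule from the question should be listed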
However, CentOS 7 now uses firewalld instead. To apply the firewall rules, first check which zones are available and which are active in firewalld by running these commands:
firewall-cmd --list-all-zones
firewall-cmd --get-active-zones
If the public zone is active, for example, you can run these commands to enable port forwarding (port 80 to 8080 in your case):
firewall-cmd --zone=public --add-masquerade --permanent
firewall-cmd --zone=public --add-forward-port=port=80:proto=tcp:toport=8080 --permanent
Once done, you can reload the rules to make sure everything is OK by running this command:
firewall-cmd --reload
You can check man firewall-cmd for more information.
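To confirm the forward is active after the reload (the exact output varies by firewalld version):
firewall-cmd --zone=public --list-forward-ports    # should print port=80:proto=tcp:toport=8080
firewall-cmd --zone=public --query-masquerade      # should answer 'yes'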