k3s not able to pull from a docker registry on my lan - k3s

So I have a registry on my LAN. From other machines and from the host, curl, nslookup, docker pull/run and podman pull/run all work, as does just curling the v2 manifests address. From within a container, curling the address https://docker.infrastructure.lan.mydomain/v2/my-image/manifests/latest also works. So how does k3s/containerd do DNS lookups? My guess is that k3s is using an internet DNS like 8.8.8.8 instead of CoreDNS for the equivalent of docker pulls? I want it to use mine (or even CoreDNS).
Anyway, here's the error I see (the domain suffix was changed):
Pulling image "docker.infrastructure.lan.mydomain/my-image:latest"
Warning Failed 27m (x4 over 29m) kubelet, infrastructure.lan.mydomain Failed to pull image "docker.infrastructure.lan.mydomain/my-image:latest": rpc error: code = Unknown desc = failed to pull and unpack image "docker.infrastructure.lan.mydomain/my-image:latest": failed to resolve reference "docker.infrastructure.lan.mydomain/my-image:latest": failed to do request: Head https://docker.infrastructure.lan.mydomain/v2/my-image/manifests/latest: dial tcp: lookup docker.infrastructure.lan.mydomain: no such host
Again, inside a container this is fine (I can curl the URL), and it's fine on the host. It's also fine from other non-k3s machines on my network. But things like kubectl run --image docker.infrastructure.lan.mydomain/my-image:latest testing give the above error.
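For what it's worth, image pulls are done by containerd on the node itself, so the registry hostname is resolved with the node's own resolver (/etc/resolv.conf), not with the cluster's CoreDNS and not with a hard-coded public DNS. A minimal sketch of checks and a stop-gap workaround, reusing the hostname from the error above (the 192.168.1.10 address is a placeholder for your registry's real LAN IP):
# on the k3s node: which resolver does the node itself use?
cat /etc/resolv.conf
nslookup docker.infrastructure.lan.mydomain
# stop-gap while the node's DNS is sorted out: pin the registry in /etc/hosts
echo "192.168.1.10 docker.infrastructure.lan.mydomain" | sudo tee -a /etc/hosts
If the node's /etc/resolv.conf points at an upstream resolver that doesn't know your LAN zone, that would produce exactly this "no such host" error even though other machines resolve the name fine.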

Related

go-ethereum - geth - puppeth - ethstat remote server : docker: command not found

I'm trying to set up a private Ethereum test network using Puppeth (as Péter Szilágyi demoed at Ethereum Devcon 3, 2017). I'm running it on a MacBook Pro (macOS Sierra).
When I try to set up the ethstats network component I get a "docker configured incorrectly: bash: docker: command not found" error. I have Docker running and I can use it fine in the terminal, e.g. docker ps.
Here are the steps I took:
What would you like to do? (default = stats)
1. Show network stats
2. Manage existing genesis
3. Track new remote server
4. Deploy network components
> 4
What would you like to deploy? (recommended order)
1. Ethstats - Network monitoring tool
2. Bootnode - Entry point of the network
3. Sealer - Full node minting new blocks
4. Wallet - Browser wallet for quick sends (todo)
5. Faucet - Crypto faucet to give away funds
6. Dashboard - Website listing above web-services
> 1
Which server do you want to interact with?
1. Connect another server
> 1
Please enter remote server's address:
> localhost
DEBUG[11-15|22:46:49] Attempting to establish SSH connection server=localhost
WARN [11-15|22:46:49] Bad SSH key, falling back to passwords path=/Users/xxx/.ssh/id_rsa err="ssh: cannot decode encrypted private keys"
The authenticity of host 'localhost:22 ([::1]:22)' can't be established.
SSH key fingerprint is xxx [MD5]
Are you sure you want to continue connecting (yes/no)? yes
What's the login password for xxx at localhost:22? (won't be echoed)
>
DEBUG[11-15|22:47:11] Verifying if docker is available server=localhost
ERROR[11-15|22:47:11] Server not ready for puppeth err="docker configured incorrectly: bash: docker: command not found\n"
Here are my questions:
Is there any documentation / tutorial describing how to set up this remote server properly, or just on puppeth in general?
Can I not use localhost as the "remote server address"?
Any ideas on why the docker command is not found (it is installed and running and I can use it fine in the terminal)?
Here is what I did.
For the docker you have to use the docker-compose binary. You can find it here.
Furthermore, you have to be sure that an ssh server is running on your localhost and that keys have been generated.
I didn't find any documentation for puppeth whatsoever.
I think I found the root cause of this problem. The SSH daemon is compiled with a default path. If you ssh to a machine with a specific command (other than a shell), you get that default path. This does not include /usr/local/bin for example, which is where docker lives in my case.
I found the solution here: https://serverfault.com/a/585075:
edit /etc/ssh/sshd_config and make sure it contains PermitUserEnvironment yes (you need to edit this with sudo)
create a file ~/.ssh/environment with the path that you want, in my case:
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
When you now run ssh localhost env you should see a PATH that matches whatever you put in ~/.ssh/environment.
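A minimal sketch of those two steps as shell commands (assuming a Linux-style sshd; on macOS, restart the SSH daemon by toggling Remote Login instead of using systemctl):
# 1. allow ssh sessions to read ~/.ssh/environment
echo "PermitUserEnvironment yes" | sudo tee -a /etc/ssh/sshd_config
# 2. give non-interactive ssh commands a PATH that includes /usr/local/bin
echo "PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin" > ~/.ssh/environment
# 3. restart sshd so the change takes effect, then verify
sudo systemctl restart sshd
ssh localhost env | grep ^PATH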

Openshift 3 , 503 Error (No server is available to handle this request)

I have created a web application using JSP/Tiles/Struts/MySQL/Tomcat. I created a new project on the OpenShift 3 console (OpenShift Online) https://console.preview.openshift.com/console/ and then added Tomcat/MySQL. I was getting a 503 error sometimes, and other times the same page was working as expected. The 503 error came randomly for any page from my project. When I get a 503 error, I refresh a number of times and it goes away, and my page is displayed correctly.
Error that I see is:
"503 Service Unavailable
No server is available to handle this request. "
I did some research:
What I understand from this OpenShift 2 link:
https://blog.openshift.com/how-to-host-your-java-ee-application-with-auto-scaling/
is that to correct the 503 error:
SSH into your application gear using rhc ssh --app <app_name>
Change directory to haproxy/conf
change the following in haproxy.cfg option httpchk GET / to option httpchk GET /api/v1/ping
Restart the HAProxy cartridge from your local machine using RHC rhc cartridge-restart --cartridge haproxy
I don't know if it is also applicable to OpenShift 3. In OpenShift 3, where are haproxy.log, haproxy.cfg and haproxy/conf, or is it slightly different in OpenShift 3? (But thanks to Warren's comments, yes, he saw a 503 error in OpenShift related to HAProxy.)
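For reference, here are those OpenShift 2 steps put together (commands taken from the list above; <app_name> is a placeholder):
# from your local machine: open a shell in the application gear
rhc ssh --app <app_name>
# inside the gear: switch the HAProxy health check (edit haproxy.cfg by hand, or for example:)
cd haproxy/conf
sed -i 's|option httpchk GET /|option httpchk GET /api/v1/ping|' haproxy.cfg
exit
# back on your local machine: restart the HAProxy cartridge
rhc cartridge-restart --cartridge haproxy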
Now, one week after posting this question:
I am getting a Quota Reached error. I am able to build my project but all deployments are failing. I wonder if the 503 error that I was getting earlier (either completely or partially) was related to the quota being reached. How should I proceed now?
curl -i localhost:8080/GEA
HTTP/1.1 302 Found
Server: Apache-Coyote/1.1
Location: http://localhost:8080/GEA/
Transfer-Encoding: chunked
Date: Tue, 11 Apr 2017 18:03:25 GMT
Tomcat logs do not show any application error.
Will Readiness Probe and Liveness Probe help me? I have not set them yet.
Nor do I know how to set them.
Will scaling help me? (I don't know how to set it either.)
Do I have to set memory/... all at the maximum allowed to ensure the project runs smoothly?
For me, I had a similar situation of getting 503s sometimes and sometimes getting my actual page. The reason was that you have HAProxy on the frontend handling the requests. Depending on your setup you may even have a few HAProxy pods, and your request could be funneled to any one of those pods. So, as in my case, one pod was working and the other was not.
So basically
oc get pods -n default
NAME READY STATUS RESTARTS AGE
docker-registry-7-i02rh 1/1 Running 0 75d
registry-console-12-wciib 1/1 Running 0 67d
router-1-533cg 1/1 Running 3 76d
router-1-9utld 1/1 Running 1 76d
router-1-uwf64 1/1 Running 1 76d
As you can see in my output, the default namespace is where my router (HAProxy) pods live. If I change to that namespace
oc project default
Then run
oc logs -f router-1-533cg
on each of the pods, you will most likely find a specific pod that is behaving badly. You can simply delete it, and the replication controller will create a new one.
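A minimal sketch of that last step, reusing the pod name from the example output above (yours will differ):
# delete the misbehaving router pod; its replication controller recreates it
oc delete pod router-1-533cg -n default
# watch the replacement come up
oc get pods -n default -w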

Openshift: Error pulling image from remote, secure docker registry using certificates

I use the all-in-one VM of OpenShift Origin.
I am trying to pull images from a private, secure registry using an Image Stream. This is the ImageStream definition:
apiVersion: v1
kind: ImageStream
metadata:
  name: my-image-stream
  annotations:
    description: Keeps track of changes in the application image
  name: my-image
spec:
  dockerImageRepository: "my.registry.net/myproject/my-image"
The repository is secured with a certificate. On my local machine, I have the certificates in /etc/docker/certs.d/my.registry.net and I can log in with docker login my.registry.net.
When I run oc import-image, however, I get the following error:
The import completed with errors.
Name: my-image
Namespace: myproject
Created: About an hour ago
Labels: <none>
Description: Keeps track of changes in the application image
Annotations: openshift.io/image.dockerRepositoryCheck=2017-01-27T08:09:49Z
Docker Pull Spec: 172.30.53.244:5000/myproject/my-image
Unique Images: 0
Tags: 1
latest
tagged from my.registry.net/myproject/my-image
! error: Import failed (InternalError): Internal error occurred: Get https://my.registry.net/v2/: remote error: handshake failure
About an hour ago
I have copied the certificates to the vagrant machine and restarted the docker daemon, but the problem remains. I have not found any documentation on how to properly add the certificates, so I just put them in the usual docker folder.
What is the appropriate way to make this work?
Update in response to rezie's answer:
There is no file /etc/origin/master/ca-bundle.crt on my vagrant box. I found the following ca-bundle.crt files:
$ find / -iname ca-bundle.crt
/etc/pki/tls/certs/ca-bundle.crt
##multiple lines like
/var/lib/docker/devicemapper/mnt/something-hash-like/rootfs/etc/pki/tls/certs/ca-bundle.crt
/var/lib/origin/openshift.local.config/master/ca-bundle.crt
I appended the root certificate to /etc/pki/tls/certs/ca-bundle.crt and to /var/lib/origin/openshift.local.config/master/ca-bundle.crt, but that did not change anything.
Please note, however, that I do not need to have this root certificate in /etc/docker/certs.d/... in order to log in directly using docker login my.registry.net.
I cannot comment due to low karma, so I'll write an answer saying almost the same as rezie.
The error:
! error: Import failed (InternalError): Internal error occurred: Get https://my.registry.net/v2/: remote error: handshake failure
About an hour ago
Comes from OpenShift, not from docker, therefore adding it to /etc/docker/certs.d/my.registry.net doesn't prevent the error from happening.
You should add the CA certificate at the OS level; my guess is the steps failed for some reason, so do it this way:
openssl s_client -connect my.registry.net:443 </dev/null |
sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' \
> /etc/pki/ca-trust/source/anchors/my.registry.net.crt &&
update-ca-trust check && update-ca-trust extract
Finally, test if it worked by running
curl https://my.registry.net/v2
If it doesn't give you a certificate error and you still can't do the oc import, restart the atomic-openshift-master-api service.
Try appending your CA (the same one you said was used in the my.registry.net directory) into OpenShift's CA bundle (e.g. /etc/origin/master/ca-bundle.crt). Then restart the service and reattempt import-image (making sure that you do not include the --insecure flag).
For reference, check out this issue from the Origin project. As you've mentioned, there's currently no way to supply certificates along with the dockercfg secret, and the suggestion from that issue is to add the CA as a trusted root CA across all the hosts.
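A minimal sketch of that suggestion (my-registry-ca.crt is a placeholder filename, and the master service name varies by install, e.g. origin-master on Origin vs. atomic-openshift-master-api as mentioned above):
# append the registry's CA to the bundle OpenShift uses
cat my-registry-ca.crt >> /etc/origin/master/ca-bundle.crt
# restart the master API so the new CA is picked up, then retry the import
systemctl restart atomic-openshift-master-api
oc import-image my-image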

Apache Geode Configuration

I had a problem trying to get Apache Geode (v1.0.0-incubating.M2) running on Linux.
The problem was: while I was trying to run the gfsh start server --name=server1 example command from the documentation, it gave me the following error:
Exception in thread "main" com.gemstone.gemfire.InternalGemFireError: Cannot resolve local host name to an IP address.
It turns out that you need to have your hostname (given by the output of the hostname command) present in the /etc/hosts file.
In my case, hostname gives an alias as its output (let's say my_alias), so I solved the problem by adding a my_ip my_full_domain my_alias line to /etc/hosts.
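A minimal sketch of that fix (the IP address and domain are placeholders for your machine's real values):
# hostname prints the name Geode is trying to resolve, e.g. my_alias
hostname
# map it to the machine's address so local resolution works
echo "192.168.1.42 my_full_domain my_alias" | sudo tee -a /etc/hosts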

UnknownHostException while formatting HDFS

I have installed CDH4 on CentOS 6.3 64-bit in Pseudo-Distributed mode using the following instructions. Everything is set to localhost in the Hadoop configuration files. But still, when I format the name node, the exception below appears. When I add a 192.168.1.101 CentOSHost entry to the /etc/hosts file, the exception goes away and I am able to format/start HDFS and run MR jobs.
I want to run MR jobs even when I am not connected to the network, without adding an entry to the /etc/hosts file. How can I get this done?
12/08/27 22:17:15 WARN net.DNS: Unable to determine address of the host-falling back to "localhost" address
java.net.UnknownHostException: CentOSHost: CentOSHost
at java.net.InetAddress.getLocalHost(InetAddress.java:1360)
at org.apache.hadoop.net.DNS.resolveLocalHostIPAddress(DNS.java:283)
at org.apache.hadoop.net.DNS.(DNS.java:59)
at org.apache.hadoop.hdfs.server.namenode.NNStorage.newBlockPoolID(NNStorage.java:1017)
at org.apache.hadoop.hdfs.server.namenode.NNStorage.newNamespaceInfo(NNStorage.java:565)
at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:145)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:724)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1095)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1193)
It looks like somewhere the configuration is returning/using the hostname as CentOSHost.
What does hostname --fqdn return for you?
For Hadoop, it is important that name lookup and reverse lookup work successfully. You should be able to resolve the hostname to an IP address and resolve the hostname from the IP address (reverse resolution). This can be tested using the command above.
The entry in /etc/hosts is required for the reverse resolution to work, unless the entry and the configuration are pointing to localhost. Even in that case, hostname --fqdn should return localhost.
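A minimal sketch of that check (getent is used here just as one way to exercise the resolver; substitute the IP your forward lookup returns):
hostname --fqdn                     # should print a resolvable name
getent hosts "$(hostname --fqdn)"   # forward lookup: name -> IP
getent hosts 192.168.1.101          # reverse lookup: IP -> name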