Unable to mount volumes for pod - mysql

EDITED:
I've an OpenShift cluster with one master and two nodes. I've installed NFS on the master and NFS client on the nodes.
I've followed the wordpress example with NFS: https://github.com/openshift/origin/tree/master/examples/wordpress
I did the following on my master as: oc login -u system:admin:
mkdir /home/data/pv0001
mkdir /home/data/pv0002
chown -R nfsnobody:nfsnobody /home/data
chmod -R 777 /home/data/
# Add to /etc/exports
/home/data/pv0001 *(rw,sync,no_root_squash)
/home/data/pv0002 *(rw,sync,no_root_squash)
# Enable the new exports without bouncing the NFS service
exportfs -a
So exportfs shows:
/home/data/pv0001
<world>
/home/data/pv0002
<world>
$ setsebool -P virt_use_nfs 1
# Create the persistent volumes for NFS.
# I did not change anything in the yaml-files
$ oc create -f examples/wordpress/nfs/pv-1.yaml
$ oc create -f examples/wordpress/nfs/pv-2.yaml
$ oc get pv
NAME LABELS CAPACITY ACCESSMODES STATUS CLAIM REASON
pv0001 <none> 1073741824 RWO,RWX Available
pv0002 <none> 5368709120 RWO Available
This is also what I get.
Then I go to my node:
oc login
test-admin
And I create a wordpress project:
oc new-project wordpress
# Create claims for storage in my project (same namespace).
# The claims in this example carefully match the volumes created above.
$ oc create -f examples/wordpress/pvc-wp.yaml
$ oc create -f examples/wordpress/pvc-mysql.yaml
$ oc get pvc
NAME LABELS STATUS VOLUME
claim-mysql map[] Bound pv0002
claim-wp map[] Bound pv0001
This looks exactly the same for me.
Launch the MySQL pod.
oc create -f examples/wordpress/pod-mysql.yaml
oc create -f examples/wordpress/service-mysql.yaml
oc create -f examples/wordpress/pod-wordpress.yaml
oc create -f examples/wordpress/service-wp.yaml
oc get svc
NAME LABELS SELECTOR IP(S) PORT(S)
mysql name=mysql name=mysql 172.30.115.137 3306/TCP
wpfrontend name=wpfrontend name=wordpress 172.30.170.55 5055/TCP
So everything seemed to work! But when I ask for my pod status I get the following:
[root@ip-10-0-0-104 pv0002]# oc get pod
NAME READY STATUS RESTARTS AGE
mysql 0/1 Image: openshift/mysql-55-centos7 is ready, container is creating 0 6h
wordpress 0/1 Image: wordpress is not ready on the node 0 6h
The pods are in pending state and in the webconsole they're giving the following error:
12:12:51 PM mysql Pod failedMount Unable to mount volumes for pod "mysql_wordpress": exit status 32 (607 times in the last hour, 41 minutes)
12:12:51 PM mysql Pod failedSync Error syncing pod, skipping: exit status 32 (607 times in the last hour, 41 minutes)
12:12:48 PM wordpress Pod failedMount Unable to mount volumes for pod "wordpress_wordpress": exit status 32 (604 times in the last hour, 40 minutes)
12:12:48 PM wordpress Pod failedSync Error syncing pod, skipping: exit status 32 (604 times in the last hour, 40 minutes)
So the volumes cannot be mounted, and the mount times out. But when I go to my node and run the following (/test is a directory I created on the node):
mount -t nfs -v masterhostname:/home/data/pv0002 /test
When I place a file in /test on my node, it appears in /home/data/pv0002 on my master, so that seems to work.
What's the reason that it's unable to mount in OpenShift?
I've been stuck on this for a while.
LOGS:
Oct 21 10:44:52 ip-10-0-0-129 docker: time="2015-10-21T10:44:52.795267904Z" level=info msg="GET /containers/json"
Oct 21 10:44:52 ip-10-0-0-129 origin-node: E1021 10:44:52.832179 1148 mount_linux.go:103] Mount failed: exit status 32
Oct 21 10:44:52 ip-10-0-0-129 origin-node: Mounting arguments: localhost:/home/data/pv0002 /var/lib/origin/openshift.local.volumes/pods/2bf19fe9-77ce-11e5-9122-02463424c049/volumes/kubernetes.io~nfs/pv0002 nfs []
Oct 21 10:44:52 ip-10-0-0-129 origin-node: Output: mount.nfs: access denied by server while mounting localhost:/home/data/pv0002
Oct 21 10:44:52 ip-10-0-0-129 origin-node: E1021 10:44:52.832279 1148 kubelet.go:1206] Unable to mount volumes for pod "mysql_wordpress": exit status 32; skipping pod
Oct 21 10:44:52 ip-10-0-0-129 docker: time="2015-10-21T10:44:52.832794476Z" level=info msg="GET /containers/json?all=1"
Oct 21 10:44:52 ip-10-0-0-129 docker: time="2015-10-21T10:44:52.835916304Z" level=info msg="GET /images/openshift/mysql-55-centos7/json"
Oct 21 10:44:52 ip-10-0-0-129 origin-node: E1021 10:44:52.837085 1148 pod_workers.go:111] Error syncing pod 2bf19fe9-77ce-11e5-9122-02463424c049, skipping: exit status 32

The logs showed:
Oct 21 10:44:52 ip-10-0-0-129 origin-node: Output: mount.nfs: access denied by server while mounting localhost:/home/data/pv0002
So the mount was attempted against localhost and failed there.
To create my persistent volume I had used this definition:
{
"apiVersion": "v1",
"kind": "PersistentVolume",
"metadata": {
"name": "registry-volume"
},
"spec": {
"capacity": {
"storage": "20Gi"
},
"accessModes": [ "ReadWriteMany" ],
"nfs": {
"path": "/home/data/pv0002",
"server": "localhost"
}
}
}
So I was mounting /home/data/pv0002, but this path is not on localhost (the node itself); it is on my master server (ose3-master.example.com). So I had created my PV incorrectly. The corrected definition:
{
"apiVersion": "v1",
"kind": "PersistentVolume",
"metadata": {
"name": "registry-volume"
},
"spec": {
"capacity": {
"storage": "20Gi"
},
"accessModes": [ "ReadWriteMany" ],
"nfs": {
"path": "/home/data/pv0002",
"server": "ose3-master.example.com"
}
}
}
This was in a training environment. Outside of training, it's recommended to have an NFS server outside of your cluster to mount from.
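As a minimal sketch of a check that would have caught this (not part of the original example; the helper names `is_loopback_server` and `nfs_server_warnings` are hypothetical), the following Python snippet flags a PersistentVolume whose nfs.server is localhost or a loopback address, since each node would then try to mount the export from itself:

```python
from __future__ import annotations

import ipaddress

# Host names that point back at the machine doing the mount.
LOOPBACK_NAMES = {"localhost", "localhost.localdomain"}


def is_loopback_server(server: str) -> bool:
    """Return True for a loopback host name or a loopback IP literal."""
    if server.lower() in LOOPBACK_NAMES:
        return True
    try:
        return ipaddress.ip_address(server).is_loopback
    except ValueError:
        # Not an IP literal, so treat it as a regular host name.
        return False


def nfs_server_warnings(pv: dict) -> list[str]:
    """Return warnings for an NFS PersistentVolume definition (parsed JSON/YAML)."""
    nfs = pv.get("spec", {}).get("nfs")
    if nfs is None:
        return []  # not an NFS volume, nothing to check
    server = nfs.get("server", "")
    if is_loopback_server(server):
        name = pv.get("metadata", {}).get("name", "<unnamed>")
        return [f"{name}: nfs.server is '{server}', so every node mounts from itself"]
    return []
```

Running `nfs_server_warnings` on the first definition above should produce one warning, and none on the corrected one.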

Linux capabilities for container to update file atime programmatically

I have a container running in non-privileged mode. I'd like to update a file's atime from Python code, but I cannot because of a permission error, even though I can write to that file.
I tried adding Linux capabilities to the container, but even with SYS_ADMIN it still does not work.
Does anyone know which capabilities to add, or what I missed?
Thank you!
bash-5.1$ id
uid=1000(contest) gid=1000(contest) groups=1000(contest)
bash-5.1$ ls -l
total 250
-rwxrwxrwx 1 root contest 0 Oct 27 07:16 anotherfile
-rwxrwxrwx 1 root contest 254823 Oct 27 07:37 outfile
-rwxrwxrwx 1 root contest 0 Oct 24 03:52 test
-rwxrwxrwx 1 root contest 364 Oct 27 07:16 test.py
-rwxrwxrwx 1 root contest 18 Oct 24 05:25 testfile
bash-5.1$ python3 test.py
1666854988.190472
1666851388.190472
Traceback (most recent call last):
File "/mnt/azurefile/test.py", line 19, in <module>
os.utime(myfile, (atime - 3600.0, mtime))
PermissionError: [Errno 1] Operation not permitted
bash-5.1$ capsh --print
Current: =
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap
Ambient set =
Current IAB: !cap_dac_read_search,!cap_linux_immutable,!cap_net_broadcast,!cap_net_admin,!cap_ipc_lock,!cap_ipc_owner,!cap_sys_module,!cap_sys_rawio,!cap_sys_ptrace,!cap_sys_pacct,!cap_sys_boot,!cap_sys_nice,!cap_sys_resource,!cap_sys_time,!cap_sys_tty_config,!cap_lease,!cap_audit_control,!cap_mac_override,!cap_mac_admin,!cap_syslog,!cap_wake_alarm,!cap_block_suspend,!cap_audit_read
Securebits: 00/0x0/1'b0 (no-new-privs=0)
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
secure-no-ambient-raise: no (unlocked)
uid=1000(contest) euid=1000(contest)
gid=1000(contest)
groups=1000(contest)
Guessed mode: HYBRID (4)
My Python code to update atime:
from datetime import datetime
import os
import time
myfile = "anotherfile"
current_time = time.time()
"""
Set the access time of a given filename to the given atime.
atime must be a datetime object.
"""
stat = os.stat(myfile)
mtime = stat.st_mtime
atime = stat.st_atime
print(mtime)
mtime = mtime - 3600.0
print(mtime)
os.utime(myfile, (atime - 3600.0, mtime))
Pod YAML:
---
kind: Pod
apiVersion: v1
metadata:
name: nginx-azurefile
spec:
securityContext:
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
nodeSelector:
"kubernetes.io/os": linux
containers:
- image: acheng.azurecr.io/capsh
name: nginx-azurefile
securityContext:
capabilities:
add: ["CHOWN","SYS_ADMIN","SYS_RESOURCES"]
command:
- "/bin/bash"
- "-c"
- set -euo pipefail; while true; do echo $(date) >> /mnt/azurefile/outfile; sleep 10; done
volumeMounts:
- name: persistent-storage
mountPath: "/mnt/azurefile"
imagePullSecrets:
- name: acr-secret
volumes:
- name: persistent-storage
persistentVolumeClaim:
claimName: pvc-azurefile
I tried adding the SYS_ADMIN capability, but it didn't work.
If the container runs in privileged mode, the code is able to update the file access time as expected.
Answering my own question here.
After searching around, I found that Kubernetes does not support capabilities for non-root users: the capabilities added in the container spec apply to the root user only and do not take effect for non-root users.
See this GitHub issue for details: https://github.com/kubernetes/kubernetes/issues/56374
A workaround is to add the capability directly to the executable file using the setcap command (from libcap).
The capability needed is CAP_FOWNER.
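To see why CAP_FOWNER is the relevant one: on Linux, setting explicit timestamps with os.utime requires that the caller's effective UID owns the file or that the process holds CAP_FOWNER; write permission alone is not enough. Here is a rough sketch of how one might check that from inside the container. The function names are hypothetical, and it assumes a Linux /proc filesystem:

```python
import os

CAP_FOWNER = 3  # bit number of CAP_FOWNER in <linux/capability.h>


def has_capability(cap_mask_hex: str, cap_number: int) -> bool:
    """Test one bit of a hex capability mask such as the CapEff value."""
    return bool((int(cap_mask_hex, 16) >> cap_number) & 1)


def effective_caps_mask(status_path: str = "/proc/self/status") -> str:
    """Read the effective capability mask of the current process."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff not found in " + status_path)


def can_set_arbitrary_times(path: str) -> bool:
    """True if os.utime(path, (atime, mtime)) is expected to be permitted."""
    owns_file = os.stat(path).st_uid == os.geteuid()
    return owns_file or has_capability(effective_caps_mask(), CAP_FOWNER)
```

In the session above, the files are owned by root, the process runs as uid 1000, and capsh reports an empty "Current" set, so this check would come out false, which matches the PermissionError.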

Unable to start nginx-ingress-controller Readiness and Liveness probes failed

I installed the NGINX ingress controller using the instructions at this link, following the "Install NGINX using NodePort" option.
When I do ks logs -f ingress-nginx-controller-7f48b8-s7pg4 -n ingress-nginx I get:
W0304 09:33:40.568799 8 client_config.go:614] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0304 09:33:40.569097 8 main.go:241] "Creating API client" host="https://10.96.0.1:443"
I0304 09:33:40.584904 8 main.go:285] "Running in Kubernetes cluster" major="1" minor="23" git="v1.23.1+k0s" state="clean" commit="b230d3e4b9d6bf4b731d96116a6643786e16ac3f" platform="linux/amd64"
I0304 09:33:40.911443 8 main.go:105] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0304 09:33:40.916404 8 main.go:115] "Enabling new Ingress features available since Kubernetes v1.18"
W0304 09:33:40.918137 8 main.go:127] No IngressClass resource with name nginx found. Only annotation will be used.
I0304 09:33:40.942282 8 ssl.go:532] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0304 09:33:40.977766 8 nginx.go:254] "Starting NGINX Ingress controller"
I0304 09:33:41.007616 8 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"ingress-nginx-controller", UID:"1a4482d2-86cb-44f3-8ebb-d6342561892f", APIVersion:"v1", ResourceVersion:"987560", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/ingress-nginx-controller
E0304 09:33:42.087113 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:43.041954 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:44.724681 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:48.303789 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:59.113203 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:34:16.727052 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
I0304 09:34:39.216165 8 main.go:187] "Received SIGTERM, shutting down"
I0304 09:34:39.216773 8 nginx.go:372] "Shutting down controller queues"
E0304 09:34:39.217779 8 store.go:178] timed out waiting for caches to sync
I0304 09:34:39.217856 8 nginx.go:296] "Starting NGINX process"
I0304 09:34:39.218007 8 leaderelection.go:243] attempting to acquire leader lease ingress-nginx/ingress-controller-leader-nginx...
I0304 09:34:39.219741 8 queue.go:78] "queue has been shutdown, failed to enqueue" key="&ObjectMeta{Name:initial-sync,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ClusterName:,ManagedFields:[]ManagedFieldsEntry{},}"
I0304 09:34:39.219787 8 nginx.go:316] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0304 09:34:39.242501 8 leaderelection.go:253] successfully acquired lease ingress-nginx/ingress-controller-leader-nginx
I0304 09:34:39.242807 8 queue.go:78] "queue has been shutdown, failed to enqueue" key="&ObjectMeta{Name:sync status,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ClusterName:,ManagedFields:[]ManagedFieldsEntry{},}"
I0304 09:34:39.242837 8 status.go:84] "New leader elected" identity="ingress-nginx-controller-7f48b8-s7pg4"
I0304 09:34:39.252025 8 status.go:204] "POD is not ready" pod="ingress-nginx/ingress-nginx-controller-7f48b8-s7pg4" node="fbcdcesdn02"
I0304 09:34:39.255282 8 status.go:132] "removing value from ingress status" address=[]
I0304 09:34:39.255328 8 nginx.go:380] "Stopping admission controller"
I0304 09:34:39.255379 8 nginx.go:388] "Stopping NGINX process"
E0304 09:34:39.255664 8 nginx.go:319] "Error listening for TLS connections" err="http: Server closed"
2022/03/04 09:34:39 [notice] 43#43: signal process started
I0304 09:34:40.263361 8 nginx.go:401] "NGINX process has stopped"
I0304 09:34:40.263396 8 main.go:195] "Handled quit, awaiting Pod deletion"
I0304 09:34:50.263585 8 main.go:198] "Exiting" code=0
When I do ks describe pod ingress-nginx-controller-7f48b8-s7pg4 -n ingress-nginx I get:
Name: ingress-nginx-controller-7f48b8-s7pg4
Namespace: ingress-nginx
Priority: 0
Node: fxxxxxxxx/10.XXX.XXX.XXX
Start Time: Fri, 04 Mar 2022 08:12:57 +0200
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/name=ingress-nginx
pod-template-hash=7f48b8
Annotations: kubernetes.io/psp: 00-k0s-privileged
Status: Running
IP: 10.244.0.119
IPs:
IP: 10.244.0.119
Controlled By: ReplicaSet/ingress-nginx-controller-7f48b8
Containers:
controller:
Container ID: containerd://638ff4d63b7ba566125bd6789d48db6e8149b06cbd9d887ecc57d08448ba1d7e
Image: k8s.gcr.io/ingress-nginx/controller:v0.48.1@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899
Image ID: k8s.gcr.io/ingress-nginx/controller@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899
Ports: 80/TCP, 443/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--election-id=ingress-controller-leader
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 04 Mar 2022 11:33:40 +0200
Finished: Fri, 04 Mar 2022 11:34:50 +0200
Ready: False
Restart Count: 61
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-7f48b8-s7pg4 (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zvcnr (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-admission
Optional: false
kube-api-access-zvcnr:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 23m (x316 over 178m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Warning BackOff 8m52s (x555 over 174m) kubelet Back-off restarting failed container
Normal Pulled 3m54s (x51 over 178m) kubelet Container image "k8s.gcr.io/ingress-nginx/controller:v0.48.1@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899" already present on machine
When I try to curl the health endpoints I get Connection refused.
The state of the pods shows that neither of them is ready:
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-4hzzk 0/1 Completed 0 3h30m
ingress-nginx-controller-7f48b8-s7pg4 0/1 CrashLoopBackOff 63 (91s ago) 3h30m
I have tried to increase the values for initialDelaySeconds in /etc/nginx/nginx.conf, but when I attempt to exec into the container (ks exec -it -n ingress-nginx ingress-nginx-controller-7f48b8-s7pg4 -- bash) I get this error: error: unable to upgrade connection: container not found ("controller")
I am not really sure where I should be looking in the overall setup.
You wrote that you installed the controller using the instructions for the "Install NGINX using NodePort" option.
The problem is that you are using outdated k0s documentation:
https://docs.k0sproject.io/v1.22.2+k0s.1/examples/nginx-ingress/
Your pod runs controller v0.48.1, which watches the v1beta1 Ingress API; as your log shows, your v1.23.1 cluster no longer serves that API.
You should use this link instead:
https://docs.k0sproject.io/main/examples/nginx-ingress/
Following the current documentation installs the controller-v1.0.0 version on your Kubernetes cluster:
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.0/deploy/static/provider/baremetal/deploy.yaml
The result is:
$ sudo k0s kubectl get pods -n ingress-nginx
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-dw2f4 0/1 Completed 0 11m
ingress-nginx-admission-patch-4dmpd 0/1 Completed 0 11m
ingress-nginx-controller-75f58fbf6b-xrfxr 1/1 Running 0 11m
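The version mismatch can be checked up front. The v1beta1 Ingress APIs were removed in Kubernetes v1.22, so a controller that still watches *v1beta1.Ingress (as the v0.48.1 log above does) cannot sync its caches on v1.23. A small sketch, with hypothetical function names, that takes the git version string the controller prints at startup:

```python
from __future__ import annotations

import re


def parse_major_minor(git_version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as 'v1.23.1+k0s'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def serves_v1beta1_ingress(git_version: str) -> bool:
    """The v1beta1 Ingress APIs are no longer served from Kubernetes v1.22 on."""
    return parse_major_minor(git_version) < (1, 22)
```

For the cluster in the question, `serves_v1beta1_ingress("v1.23.1+k0s")` is false, so a controller release that uses networking.k8s.io/v1 (such as controller-v1.0.0) is needed.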

How to pretty format the Traefik log in JSON?

I have a Traefik service with the following configuration:
version: "3.9"
services:
reverse-proxy:
image: traefik:v2.3.4
networks:
common:
ports:
- target: 80
published: 80
mode: host
- target: 443
published: 443
mode: host
command:
- "--providers.docker.endpoint=unix:///var/run/docker.sock"
- "--providers.docker.swarmMode=true"
- "--providers.docker.exposedbydefault=false"
- "--providers.docker.network=common"
- "--entrypoints.web.address=:80"
# - "--entrypoints.websecure.address=:443"
- "--global.sendAnonymousUsage=true"
# Set a debug level custom log file
- "--log.level=DEBUG"
- "--log.format=json"
- "--log.filePath=/var/log/traefik.log"
- "--accessLog.filePath=/var/log/access.log"
# Enable the Traefik dashboard
- "--api.dashboard=true"
# - "traefik.constraint-label=common" TODO
deploy:
placement:
constraints:
- node.role == manager
labels:
# Expose the Traefik dashboard
- "traefik.enable=true"
- "traefik.http.routers.dashboard.service=api@internal"
- "traefik.http.services.traefik.loadbalancer.server.port=888" # A port number required by Docker Swarm but not being used in fact
- "traefik.http.routers.dashboard.rule=Host(`traefik.learnintouch.com`)"
- "traefik.http.routers.traefik.entrypoints=web"
# - "traefik.http.routers.traefik.entrypoints=websecure"
# Basic HTTP authentication to secure the dashboard access
- "traefik.http.routers.traefik.middlewares=traefik-auth"
- "traefik.http.middlewares.traefik-auth.basicauth.users=stephane:$$apr1$$m72sBfSg$$7.NRvy75AZXAMtH3C2YTz/"
volumes:
# So that Traefik can listen to the Docker events
- "/var/run/docker.sock:/var/run/docker.sock:ro"
- "~/dev/docker/projects/common/volumes/logs/traefik.service.log:/var/log/traefik.log"
- "~/dev/docker/projects/common/volumes/logs/traefik.access.log:/var/log/access.log"
Then I watch the log with the command:
stephane@stephane-pc:~$ tail -f dev/docker/projects/common/volumes/logs/traefik.service.log
{"level":"info","msg":"I have to go...","time":"2021-07-03T10:18:10Z"}
{"level":"info","msg":"Stopping server gracefully","time":"2021-07-03T10:18:10Z"}
{"entryPointName":"web","level":"debug","msg":"Waiting 10s seconds before killing connections.","time":"2021-07-03T10:18:10Z"}
{"entryPointName":"web","level":"error","msg":"accept tcp [::]:80: use of closed network connection","time":"2021-07-03T10:18:10Z"}
I expected the log to be formatted in JSON with indentation.
So I copy-pasted the non indented JSON output in an online JSON formatter but it only indented part of it, making the whole thing useless.
Your problem is that Traefik does not output a single JSON document, but one JSON document per line. You could beautify all documents using xargs and jq:
tail -f dev/docker/projects/common/volumes/logs/traefik.service.log | xargs -n 1 -d "\n" -- bash -c 'echo "$1" | jq' _
In your example, this will result in this output (even with syntax highlighting if your terminal supports that):
{
"level": "info",
"msg": "I have to go...",
"time": "2021-07-03T10:18:10Z"
}
{
"level": "info",
"msg": "Stopping server gracefully",
"time": "2021-07-03T10:18:10Z"
}
{
"entryPointName": "web",
"level": "debug",
"msg": "Waiting 10s seconds before killing connections.",
"time": "2021-07-03T10:18:10Z"
}
{
"entryPointName": "web",
"level": "error",
"msg": "accept tcp [::]:80: use of closed network connection",
"time": "2021-07-03T10:18:10Z"
}
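If jq is not available, the same per-line idea can be sketched in Python with only the standard library. This is an illustration rather than anything Traefik provides, and the names `pretty_json_lines` and `main` are made up for it: each input line is parsed as its own JSON document and re-emitted with indentation, and lines that are not valid JSON are passed through unchanged.

```python
import json
import sys
from typing import Iterable, Iterator


def pretty_json_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each non-empty line re-indented as its own JSON document."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            yield json.dumps(json.loads(line), indent=2)
        except json.JSONDecodeError:
            yield line  # not JSON: pass the line through unchanged


def main() -> None:
    """Read log lines from standard input and print them pretty-printed."""
    for block in pretty_json_lines(sys.stdin):
        print(block, flush=True)
```

Saved to a script that calls main() at the end, it could be used as tail -f dev/docker/projects/common/volumes/logs/traefik.service.log | python3 script.py, where script.py is whatever name you give the file.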

How to manually recreate the bootstrap client certificate for OpenShift 3.11 master?

Our origin-node.service on the master node fails with:
root@master> systemctl start origin-node.service
Job for origin-node.service failed because the control process exited with error code. See "systemctl status origin-node.service" and "journalctl -xe" for details.
root@master> systemctl status origin-node.service -l
[...]
May 05 07:17:47 master origin-node[44066]: bootstrap.go:195] Part of the existing bootstrap client certificate is expired: 2020-02-20 13:14:27 +0000 UTC
May 05 07:17:47 master origin-node[44066]: bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May 05 07:17:47 master origin-node[44066]: certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
May 05 07:17:47 master origin-node[44066]: server.go:262] failed to run Kubelet: cannot create certificate signing request: Post https://lb.openshift-cluster.mydomain.com:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: EOF
So it seems that kubelet-client-current.pem and/or kubelet-server-current.pem contains an expired certificate and the service tries to create a CSR using an endpoint which is probably not yet available (because the master is down). We tried redeploying the certificates according to the OpenShift documentation Redeploying Certificates, but this fails while detecting an expired certificate:
root@master> ansible-playbook -i /etc/ansible/hosts openshift-master/redeploy-openshift-ca.yml
[...]
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] *******************************************************************************************************************************************
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 60 days of expiring. You may view the report at /root/cert-expiry-report.20200505T042754.html or /root/cert-expiry-report.20200505T042754.json.\n"}
[...]
root@master> cat /root/cert-expiry-report.20200505T042754.json
[...]
"kubeconfigs": [
{
"cert_cn": "O:system:cluster-admins, CN:system:admin",
"days_remaining": -75,
"expiry": "2020-02-20 13:14:27",
"health": "expired",
"issuer": "CN=openshift-signer@1519045219 ",
"path": "/etc/origin/node/node.kubeconfig",
"serial": 27,
"serial_hex": "0x1b"
},
{
"cert_cn": "O:system:cluster-admins, CN:system:admin",
"days_remaining": -75,
"expiry": "2020-02-20 13:14:27",
"health": "expired",
"issuer": "CN=openshift-signer@1519045219 ",
"path": "/etc/origin/node/node.kubeconfig",
"serial": 27,
"serial_hex": "0x1b"
},
[...]
"summary": {
"expired": 2,
"ok": 22,
"total": 24,
"warning": 0
}
}
There is a guide for OpenShift 4.4 for Recovering from expired control plane certificates, but that does not apply for 3.11 and we did not find such a guide for our version.
Is it possible to recreate the expired certificates without a running master node for 3.11? Thanks for any help.
OpenShift Ansible: https://github.com/openshift/openshift-ansible/releases/tag/openshift-ansible-3.11.153-2
Update 2020-05-06: I also executed redeploy-certificates.yml, but it fails at the same TASK:
root@master> ansible-playbook -i /etc/ansible/hosts playbooks/redeploy-certificates.yml
[...]
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] ******************************************************************************
Wednesday 06 May 2020 04:07:06 -0400 (0:00:00.909) 0:01:07.582 *********
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 60 days of expiring. You may view the report at /root/cert-expiry-report.20200506T040603.html or /root/cert-expiry-report.20200506T040603.json.\n"}
Update 2020-05-11: Running with -e openshift_certificate_expiry_fail_on_warn=False results in:
root@master> ansible-playbook -i /etc/ansible/hosts -e openshift_certificate_expiry_fail_on_warn=False playbooks/redeploy-certificates.yml
[...]
TASK [Wait for master API to come back online] *****************************************************************************************************************
Monday 11 May 2020 03:48:56 -0400 (0:00:00.111) 0:02:25.186 ************
skipping: [master.openshift-cluster.mydomain.com]
TASK [openshift_control_plane : restart master] ****************************************************************************************************************
Monday 11 May 2020 03:48:56 -0400 (0:00:00.257) 0:02:25.444 ************
changed: [master.openshift-cluster.mydomain.com] => (item=api)
changed: [master.openshift-cluster.mydomain.com] => (item=controllers)
RUNNING HANDLER [openshift_control_plane : verify API server] **************************************************************************************************
Monday 11 May 2020 03:48:57 -0400 (0:00:00.945) 0:02:26.389 ************
FAILED - RETRYING: verify API server (120 retries left).
FAILED - RETRYING: verify API server (119 retries left).
[...]
FAILED - RETRYING: verify API server (1 retries left).
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"attempts": 120, "changed": false, "cmd": ["curl", "--silent", "--tlsv1.2", "--max-time", "2", "--cacert", "/etc/origin/master/ca-bundle.crt", "https://lb.openshift-cluster.mydomain.com:8443/healthz/ready"], "delta": "0:00:00.182367", "end": "2020-05-11 03:51:52.245644", "msg": "non-zero return code", "rc": 35, "start": "2020-05-11 03:51:52.063277", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
root@master> systemctl status origin-node.service -l
[...]
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: E0511 04:23:28.077964 109972 bootstrap.go:195] Part of the existing bootstrap client certificate is expired: 2020-02-20 13:14:27 +0000 UTC
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: I0511 04:23:28.078001 109972 bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: I0511 04:23:28.080555 109972 certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: F0511 04:23:28.130968 109972 server.go:262] failed to run Kubelet: cannot create certificate signing request: Post https://lb.openshift-cluster.mydomain.com:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: EOF
[...]
I had the same case in a customer environment. The error occurs because the certificate has expired. I "cheated" by setting the OS date back to before the expiry date, and the origin-node service then started on my masters:
systemctl status origin-node
● origin-node.service - OpenShift Node
Loaded: loaded (/etc/systemd/system/origin-node.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2021-02-20 20:22:21 -02; 6min ago
Docs: https://github.com/openshift/origin
Main PID: 37230 (hyperkube)
Memory: 79.0M
CGroup: /system.slice/origin-node.service
└─37230 /usr/bin/hyperkube kubelet --v=2 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-token-webhook=true --authentication-token-webhook-cache-ttl=5m --authorization-mode=Webhook --authorization-webhook-c...
You have mail in /var/spool/mail/okd
The openshift_certificate_expiry role uses the openshift_certificate_expiry_fail_on_warn variable to determine if the playbook should fail when the days left are less than openshift_certificate_expiry_warning_days.
So try running the redeploy-certificates.yml with this additional variable set to "False":
ansible-playbook -i /etc/ansible/hosts -e openshift_certificate_expiry_fail_on_warn=False playbooks/redeploy-certificates.yml
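To make the effect of that variable concrete, here is a small Python sketch of the decision as described above. It is an illustration only, not the role's actual code: the function names are hypothetical, and the 60-day threshold is taken from the error message in the question.

```python
from datetime import datetime
from typing import Iterable


def days_remaining(expiry: str, now: datetime) -> int:
    """Whole days until expiry (negative once expired), for 'YYYY-MM-DD HH:MM:SS'."""
    expires_at = datetime.strptime(expiry, "%Y-%m-%d %H:%M:%S")
    return (expires_at - now).days


def should_fail(days_left: Iterable[int], warning_days: int, fail_on_warn: bool) -> bool:
    """Fail only if fail_on_warn is set and some certificate is inside the window."""
    return fail_on_warn and any(days < warning_days for days in days_left)


# Expiry date from the report, evaluated at the report's timestamp.
report_time = datetime(2020, 5, 5, 4, 27, 54)
expired = days_remaining("2020-02-20 13:14:27", report_time)
print(expired)                                                       # -75
print(should_fail([expired], warning_days=60, fail_on_warn=True))    # True
print(should_fail([expired], warning_days=60, fail_on_warn=False))   # False
```

With the variable set to False the expiry check no longer stops the playbook, which is consistent with the 2020-05-11 update in the question, where the run gets past that task and fails later at "verify API server".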

Container keeps crashing for Pod in minikube after the creation of PV and PVC

I have a REST application integrated with Kubernetes for testing REST queries. When I execute a POST query on my client side, the status of the job that is created automatically remains PENDING indefinitely. The same happens with the pod, which is also created automatically.
When I looked deeper into the events in the dashboard, I saw that it attaches the volume but is unable to mount it, and gives this error:
Unable to mount volumes for pod "ingestion-88dhg_default(4a8dd589-e3d3-4424-bc11-27d51822d85b)": timeout expired waiting for volumes to attach or mount for pod "default"/"ingestion-88dhg". list of unmounted volumes=[cdiworkspace-volume]. list of unattached volumes=[cdiworkspace-volume default-token-qz2nb]
I have defined the persistent volume and persistent volume claim manually using the following definitions, but did not connect them to any pods. Should I do that?
PV
{
"kind": "PersistentVolume",
"apiVersion": "v1",
"metadata": {
"name": "cdiworkspace",
"selfLink": "/api/v1/persistentvolumes/cdiworkspace",
"uid": "92252f76-fe51-4225-9b63-4d6228d9e5ea",
"resourceVersion": "100026",
"creationTimestamp": "2019-07-10T09:49:04Z",
"annotations": {
"pv.kubernetes.io/bound-by-controller": "yes"
},
"finalizers": [
"kubernetes.io/pv-protection"
]
},
"spec": {
"capacity": {
"storage": "10Gi"
},
"fc": {
"targetWWNs": [
"50060e801049cfd1"
],
"lun": 0
},
"accessModes": [
"ReadWriteOnce"
],
"claimRef": {
"kind": "PersistentVolumeClaim",
"namespace": "default",
"name": "cdiworkspace",
"uid": "0ce96c77-9e0d-4b1f-88bb-ad8b84072000",
"apiVersion": "v1",
"resourceVersion": "98688"
},
"persistentVolumeReclaimPolicy": "Retain",
"storageClassName": "standard",
"volumeMode": "Block"
},
"status": {
"phase": "Bound"
}
}
PVC
{
"kind": "PersistentVolumeClaim",
"apiVersion": "v1",
"metadata": {
"name": "cdiworkspace",
"namespace": "default",
"selfLink": "/api/v1/namespaces/default/persistentvolumeclaims/cdiworkspace",
"uid": "0ce96c77-9e0d-4b1f-88bb-ad8b84072000",
"resourceVersion": "100028",
"creationTimestamp": "2019-07-10T09:32:16Z",
"annotations": {
"pv.kubernetes.io/bind-completed": "yes",
"pv.kubernetes.io/bound-by-controller": "yes",
"volume.beta.kubernetes.io/storage-provisioner": "k8s.io/minikube-hostpath"
},
"finalizers": [
"kubernetes.io/pvc-protection"
]
},
"spec": {
"accessModes": [
"ReadWriteOnce"
],
"resources": {
"requests": {
"storage": "10Gi"
}
},
"volumeName": "cdiworkspace",
"storageClassName": "standard",
"volumeMode": "Block"
},
"status": {
"phase": "Bound",
"accessModes": [
"ReadWriteOnce"
],
"capacity": {
"storage": "10Gi"
}
}
}
Result of journalctl -xe _SYSTEMD_UNIT=kubelet.service
Jul 01 09:47:26 rehan-B85M-HD3 kubelet[22759]: E0701 09:47:26.979098 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:47:40 rehan-B85M-HD3 kubelet[22759]: E0701 09:47:40.979722 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:47:55 rehan-B85M-HD3 kubelet[22759]: E0701 09:47:55.978806 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:48:08 rehan-B85M-HD3 kubelet[22759]: E0701 09:48:08.979375 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:48:23 rehan-B85M-HD3 kubelet[22759]: E0701 09:48:23.979463 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:48:37 rehan-B85M-HD3 kubelet[22759]: E0701 09:48:37.979005 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:48:48 rehan-B85M-HD3 kubelet[22759]: E0701 09:48:48.977686 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:49:02 rehan-B85M-HD3 kubelet[22759]: E0701 09:49:02.979125 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:49:17 rehan-B85M-HD3 kubelet[22759]: E0701 09:49:17.979408 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:49:28 rehan-B85M-HD3 kubelet[22759]: E0701 09:49:28.977499 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:49:41 rehan-B85M-HD3 kubelet[22759]: E0701 09:49:41.977771 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:49:53 rehan-B85M-HD3 kubelet[22759]: E0701 09:49:53.978605 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:50:05 rehan-B85M-HD3 kubelet[22759]: E0701 09:50:05.980251 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:50:16 rehan-B85M-HD3 kubelet[22759]: E0701 09:50:16.979292 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:50:31 rehan-B85M-HD3 kubelet[22759]: E0701 09:50:31.978346 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:50:42 rehan-B85M-HD3 kubelet[22759]: E0701 09:50:42.979302 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:50:55 rehan-B85M-HD3 kubelet[22759]: E0701 09:50:55.978043 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:51:08 rehan-B85M-HD3 kubelet[22759]: E0701 09:51:08.977540 22759 pod_workers.go:190] Error syncing pod 6577b694-f18d-4d7b-9a75-82dc17c908ca ("myplanet-d976447c6-dsfx9_default(6577b694-f18d-4d7
Jul 01 09:51:24 rehan-B85M-HD3 kubelet[22759]: E0701 09:51:24.190929 22759 remote_image.go:113] PullImage "friendly/myplanet:0.0.1-SNAPSHOT" from image service failed: rpc error: code = Unknown desc = E
Jul 01 09:51:24 rehan-B85M-HD3 kubelet[22759]: E0701 09:51:24.190971 22759 kuberuntime_image.go:51] Pull image "friendly/myplanet:0.0.1-SNAPSHOT" failed: rpc error: code = Unknown desc = Error response
Jul 01 09:51:24 rehan-B85M-HD3 kubelet[22759]: E0701 09:51:24.191024 22759 kuberuntime_manager.go:775] container start failed: ErrImagePull: rpc error: code = Unknown desc = Error response from daemon:
Deployment Yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: back
spec:
  replicas: 1
  selector:
    matchLabels:
      app: back
  template:
    metadata:
      labels:
        app: back
    spec:
      containers:
      - name: back
        image: back:latest
        ports:
        - containerPort: 8081
          protocol: TCP
        volumeMounts:
        - mountPath: /data
          name: back
      volumes:
      - name: back
        hostPath:
          # directory location on host
          path: /back
          # this field is optional
          type: Directory
Dockerfile
FROM python:3.7-stretch
COPY . /code
WORKDIR /code
CMD exec /bin/bash -c "trap : TERM INT; sleep infinity & wait"
RUN pip install -r requirements.txt
ENTRYPOINT ["python", "ingestion.py"]
python file 1
import os
import shutil
import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(name)s - %(message)s')
logger = logging.getLogger("ingestion")
import requests
import datahub
scihub_username = os.environ["scihub_username"]
scihub_password = os.environ["scihub_password"]
result_url = "http://" + os.environ["CDINRW_BASE_URL"] + "/jobs/" + os.environ["CDINRW_JOB_ID"] + "/results"
logger.info("Searching the Copernicus Open Access Hub")
scenes = datahub.search(username=scihub_username,
                        password=scihub_password,
                        producttype=os.getenv("producttype"),
                        platformname=os.getenv("platformname"),
                        days_back=os.getenv("days_back", 2),
                        footprint=os.getenv("footprint"),
                        max_cloud_cover_percentage=os.getenv("max_cloud_cover_percentage"),
                        start_date=os.getenv("start_date"),
                        end_date=os.getenv("end_date"))
logger.info("Found {} relevant scenes".format(len(scenes)))
job_results = []
for scene in scenes:
    # do not download a scene that has already been ingested
    if os.path.exists(os.path.join("/out_data", scene["title"]+".SAFE")):
        logger.info("The scene {} already exists in /out_data and will not be downloaded again.".format(scene["title"]))
        filename = scene["title"]+".SAFE"
    else:
        logger.info("Starting the download of scene {}".format(scene["title"]))
        filename = datahub.download(scene, "/tmp", scihub_username, scihub_password, unpack=True)
        logger.info("The download was successful.")
        shutil.move(filename, "/out_data")
    result_message = {"description": "test",
                      "type": "Raster",
                      "format": "SAFE",
                      "filename": os.path.basename(filename)}
    job_results.append(result_message)
res = requests.put(result_url, json=job_results, timeout=60)
res.raise_for_status()
python file 2
import logging
import os
import urllib.parse
import zipfile
import requests
# constructing URLs for querying the data hub
_BASE_URL = "https://scihub.copernicus.eu/dhus/"
SITE = {}
SITE["SEARCH"] = _BASE_URL + "search?format=xml&sortedby=beginposition&order=desc&rows=100&start={offset}&q="
_PRODUCT_URL = _BASE_URL + "odata/v1/Products('{uuid}')/"
SITE["CHECKSUM"] = _PRODUCT_URL + "Checksum/Value/$value"
SITE["SAFEZIP"] = _PRODUCT_URL + "$value"
logger = logging.getLogger(__name__)


def _build_search_url(producttype=None, platformname=None, days_back=2, footprint=None, max_cloud_cover_percentage=None, start_date=None, end_date=None):
    search_terms = []
    if producttype:
        search_terms.append("producttype:{}".format(producttype))
    if platformname:
        search_terms.append("platformname:{}".format(platformname))
    if start_date and end_date:
        search_terms.append(
            "beginPosition:[{}+TO+{}]".format(start_date, end_date))
    elif days_back:
        search_terms.append(
            "beginPosition:[NOW-{}DAYS+TO+NOW]".format(days_back))
    if footprint:
        search_terms.append("footprint:%22Intersects({})%22".format(
            footprint.replace(" ", "+")))
    if max_cloud_cover_percentage:
        search_terms.append("cloudcoverpercentage:[0+TO+{}]".format(max_cloud_cover_percentage))
    url = SITE["SEARCH"] + "+AND+".join(search_terms)
    return url


def _unpack(zip_file, directory, remove_after=False):
    with zipfile.ZipFile(zip_file) as zf:
        # This assumes that the zipfile only contains the .SAFE directory at root level
        safe_path = zf.namelist()[0]
        zf.extractall(path=directory)
    if remove_after:
        os.remove(zip_file)
    return os.path.normpath(os.path.join(directory, safe_path))


def search(username, password, producttype=None, platformname=None, days_back=2, footprint=None, max_cloud_cover_percentage=None, start_date=None, end_date=None):
    """ Search the Copernicus SciHub

    Parameters
    ----------
    username : str
        user name for the Copernicus SciHub
    password : str
        password for the Copernicus SciHub
    producttype : str, optional
        product type to filter for in the query (see https://scihub.copernicus.eu/userguide/FullTextSearch#Search_Keywords for allowed values)
    platformname : str, optional
        platform name to filter for in the query (see https://scihub.copernicus.eu/userguide/FullTextSearch#Search_Keywords for allowed values)
    days_back : int, optional
        number of days before today that will be searched. Default are the last 2 days. If start and end date are set the days_back parameter is ignored
    footprint : str, optional
        well-known-text representation of the footprint
    max_cloud_cover_percentage: str, optional
        percentage of cloud cover per scene. Can only be used in combination with Sentinel-2 imagery.
        (see https://scihub.copernicus.eu/userguide/FullTextSearch#Search_Keywords for allowed values)
    start_date: str, optional
        start point of the search extent has to be used in combination with end_date
    end_date: str, optional
        end_point of the search extent has to be used in combination with start_date

    Returns
    -------
    list
        a list of scenes that match the search parameters
    """
    # xml.etree.cElementTree was removed in Python 3.9; ElementTree behaves the same here
    import xml.etree.ElementTree as ET
    scenes = []
    search_url = _build_search_url(producttype, platformname, days_back, footprint, max_cloud_cover_percentage, start_date, end_date)
    logger.info("Search URL: {}".format(search_url))
    offset = 0
    rowsBreak = 5000
    name_space = {"atom": "http://www.w3.org/2005/Atom",
                  "opensearch": "http://a9.com/-/spec/opensearch/1.1/"}
    while offset < rowsBreak:  # Next pagination page:
        response = requests.get(search_url.format(offset=offset), auth=(username, password))
        root = ET.fromstring(response.content)
        if offset == 0:
            rowsBreak = int(
                root.find("opensearch:totalResults", name_space).text)
        for e in root.iterfind("atom:entry", name_space):
            uuid = e.find("atom:id", name_space).text
            title = e.find("atom:title", name_space).text
            # attribute predicates use "@name" in ElementTree's XPath subset
            begin_position = e.find(
                "atom:date[@name='beginposition']", name_space).text
            end_position = e.find(
                "atom:date[@name='endposition']", name_space).text
            footprint = e.find("atom:str[@name='footprint']", name_space).text
            scenes.append({
                "id": uuid,
                "title": title,
                "begin_position": begin_position,
                "end_position": end_position,
                "footprint": footprint})
        # Ultimate DHuS pagination page size limit (rows per page).
        offset += 100
    return scenes


def download(scene, directory, username, password, unpack=True):
    """ Download a Sentinel scene based on its uuid

    Parameters
    ----------
    scene : dict
        the scene to be downloaded
    directory : str
        the path where the file will be downloaded to
    username : str
        username for the Copernicus SciHub
    password : str
        password for the Copernicus SciHub
    unpack: boolean, optional
        flag that defines whether the downloaded product should be unpacked after download. defaults to true

    Raises
    ------
    ValueError
        if the size of the downloaded file does not match the Content-Length header
    ValueError
        if the checksum of the downloaded file does not match the checksum provided by the Copernicus SciHub

    Returns
    -------
    str
        path to the downloaded file
    """
    import hashlib
    md5hash = hashlib.md5()
    md5sum = requests.get(SITE["CHECKSUM"].format(
        uuid=scene["id"]), auth=(username, password)).text
    download_path = os.path.join(directory, scene["title"] + ".zip")
    # overwrite if path already exists
    if os.path.exists(download_path):
        os.remove(download_path)
    url = SITE["SAFEZIP"].format(uuid=scene["id"])
    rsp = requests.get(url, auth=(username, password), stream=True)
    cl = rsp.headers.get("Content-Length")
    size = int(cl) if cl else -1
    # Actually fetch now:
    with open(download_path, "wb") as f:  # Do not read as a whole into memory:
        written = 0
        for block in rsp.iter_content(8192):
            f.write(block)
            written += len(block)
            md5hash.update(block)
    written = os.path.getsize(download_path)
    if size > -1 and written != size:
        raise ValueError("{}: size mismatch, {} bytes written but expected {} bytes to write!".format(
            download_path, written, size))
    elif md5sum:
        calculated = md5hash.hexdigest()
        expected = md5sum.lower()
        if calculated != expected:
            raise ValueError("{}: MD5 mismatch, calculated {} but expected {}!".format(
                download_path, calculated, expected))
    if unpack:
        return _unpack(download_path, directory, remove_after=False)
    else:
        return download_path
How can I mount the volume properly and automatically onto the pod? I do not want to create the pods manually for each REST service and assign volumes to them.
I went through the logs of the pod again and realized that the parameters required by python file 1 were not being provided, which was causing the container to crash. I tested this by supplying all the missing parameters pointed out in the logs in the deployment.yaml for the pod, which now looks like this:
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: back
spec:
  replicas: 1
  selector:
    matchLabels:
      app: back
  template:
    metadata:
      creationTimestamp:
      labels:
        app: back
    spec:
      containers:
      - name: back
        image: back:latest
        imagePullPolicy: Never
        env:
        - name: scihub_username
          value: test
        - name: scihub_password
          value: test
        - name: CDINRW_BASE_URL
          value: 10.1.40.11:8081/swagger-ui.html
        - name: CDINRW_JOB_ID
          value: 3fa85f64-5717-4562-b3fc-2c963f66afa6
        ports:
        - containerPort: 8081
          protocol: TCP
        volumeMounts:
        - mountPath: /data
          name: test-volume
      volumes:
      - name: test-volume
        hostPath:
          # directory location on host
          path: /back
          # this field is optional
          type: Directory
This started downloading the data and solved the problem for now. However, this is not how I want it to run: I want it to be triggered through a REST API that provides all the parameters and starts and stops this container. I'll create a separate question for that and link it below for anyone to follow.
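The crash came from the os.environ[...] lookups in python file 1, which raise a KeyError as soon as one variable is missing. A minimal sketch of a fail-fast check that could run at the top of that script is shown below. The four variable names are the ones read in python file 1; the helper names missing_env_vars and require_env are hypothetical and not part of the original code.

```python
import os

# Names taken from the os.environ[...] lookups in python file 1.
REQUIRED_VARS = ("scihub_username", "scihub_password",
                 "CDINRW_BASE_URL", "CDINRW_JOB_ID")


def missing_env_vars(environ, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


def require_env(environ=None, required=REQUIRED_VARS):
    """Exit with a readable message instead of a bare KeyError traceback."""
    environ = os.environ if environ is None else environ
    missing = missing_env_vars(environ, required)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Calling require_env() before the first os.environ[...] access would make the reason for the crash visible in the first line of the pod log.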
i have defined the persistent volume and persistent volume claim
manually using following codes but did not connect to any pods. Should
i do that?
So you haven't referred to it anywhere in your Pod definition so far, right? At least I cannot see it anywhere in your Deployment. If so, the answer is: yes, you must do that so that Pods in your cluster can use it.
Let's start from the beginning. Basically, the whole process of configuring a Pod (this also applies to the Pod template in a Deployment definition) to use a PersistentVolume for storage consists of 3 steps [source]:
1. A cluster administrator creates a PersistentVolume that is backed by physical storage. The administrator does not associate the volume with any Pod.
2. A cluster user creates a PersistentVolumeClaim, which gets automatically bound to a suitable PersistentVolume.
3. The user creates a Pod (it can also be a Deployment in which you define a Pod template specification) that uses the PersistentVolumeClaim as storage.
There is no point in describing all of the above steps in detail here, as that has already been done very well here.
You can verify the PV/PVC availability using the following commands:
kubectl get pv volume-name at this stage should show the status of your volume as Bound
The same goes for kubectl get pvc task-pv-claim (in your case kubectl get pvc cdiworkspace; however, I would recommend using a different name, e.g. cdiworkspace-claim, for the PersistentVolumeClaim so that it can be easily distinguished from the PersistentVolume itself). This command should also show the status Bound.
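If you would rather script this check than read the table output, here is a small sketch. It assumes kubectl is on the PATH and uses the standard -o json output; the function names phase_of and kubectl_phase are made up for illustration.

```python
import json
import subprocess


def phase_of(resource_json):
    """Extract status.phase from the JSON text of a PV or PVC object."""
    return json.loads(resource_json).get("status", {}).get("phase")


def kubectl_phase(kind, name, namespace="default"):
    """Run `kubectl get <kind> <name> -o json` and return its phase.

    The namespace only matters for PVCs; PVs are cluster-scoped.
    """
    cmd = ["kubectl", "get", kind, name, "-n", namespace, "-o", "json"]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return phase_of(result.stdout)
```

For the objects in the question, kubectl_phase("pvc", "cdiworkspace") and kubectl_phase("pv", "cdiworkspace") would both be expected to return "Bound", since that is the phase shown in the posted definitions.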
Please notice that the Pod’s configuration file specifies only PersistentVolumeClaim, but it does not specify a PersistentVolume itself. From the Pod’s point of view, the claim is a volume. Here is a nice description which clearly marks the difference between those two objects [source]:
A PersistentVolume (PV) is a piece of storage in the cluster that has
been provisioned by an administrator or dynamically provisioned using
Storage Classes. It is a resource in the cluster just like a node is a
cluster resource. PVs are volume plugins like Volumes, but have a
lifecycle independent of any individual pod that uses the PV. This API
object captures the details of the implementation of the storage, be
that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It
is similar to a pod. Pods consume node resources and PVCs consume PV
resources. Pods can request specific levels of resources (CPU and
Memory). Claims can request specific size and access modes (e.g., can
be mounted once read/write or many times read-only).
Below is an example of a specification in a Pod / Deployment definition that refers to an existing PersistentVolumeClaim:
spec:
  volumes:
  - name: task-pv-storage
    persistentVolumeClaim:
      claimName: task-pv-claim
  containers:
  - name: task-pv-container
    image: nginx
    ports:
    - containerPort: 80
      name: "http-server"
    volumeMounts:
    - mountPath: "/usr/share/nginx/html"
      name: task-pv-storage
As to your question:
How can i mount the volume properly and automatically onto the pod? i
do not want to create the pods manually for each REST service and
assign volumes to them
You don't have to create them manually. You may specify the PersistentVolumeClaim they use in the Pod template specification of your Deployment definition.
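If the REST application creates these workloads programmatically, the manifest can be generated once per service instead of being written by hand. The following is only a sketch under a few assumptions: the helper name deployment_with_claim and the "-storage" volume naming are hypothetical, and it assumes a claim with volumeMode: Filesystem (a claim with volumeMode: Block, like the one posted in the question, is attached through volumeDevices rather than volumeMounts).

```python
import json


def deployment_with_claim(name, image, claim_name, mount_path):
    """Build an apps/v1 Deployment whose Pod template mounts an existing PVC."""
    labels = {"app": name}
    volume_name = name + "-storage"  # hypothetical naming scheme
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # The container refers to the volume by name only.
                        "volumeMounts": [{"name": volume_name,
                                          "mountPath": mount_path}],
                    }],
                    # The Pod refers to the claim, never to the PV itself.
                    "volumes": [{
                        "name": volume_name,
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                },
            },
        },
    }


manifest = deployment_with_claim("back", "back:latest", "cdiworkspace", "/data")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to kubectl apply -f, since kubectl accepts JSON manifests as well as YAML.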
Documentation resources:
A detailed step-by-step description of how to configure a Pod to use a PersistentVolumeClaim for storage can be found here.
More about the concept of Persistent Volumes in Kubernetes can be found in this article.
If you want to share some data available on your minikube host with every Pod in your cluster, there is a much simpler approach than a PersistentVolume. It is called hostPath. You can find a detailed description here, and below is an example that may be useful in your particular case:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /data
          name: test-volume
      volumes:
      - name: test-volume
        hostPath:
          # directory location on host
          path: /directory/with/python/files
          # this field is optional
          type: Directory
The examples that you posted are actually in JSON, not in YAML format. You should be able to convert them easily to the required format on this page. You should place your files in /directory/with/python/files on your minikube host, and they will be available in the /data directory in each Pod created by your deployment.
Below is your deployment in YAML format, with the /directory/with/python/files directory on your host mounted at /data using hostPath:
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: back
  namespace: default
  selfLink: "/apis/extensions/v1beta1/namespaces/default/deployments/back"
  uid: 9f21717c-2c04-459f-b47a-95fd8e11728d
  resourceVersion: '298987'
  generation: 1
  creationTimestamp: '2019-07-16T13:16:15Z'
  labels:
    run: back
  annotations:
    deployment.kubernetes.io/revision: '1'
spec:
  replicas: 1
  selector:
    matchLabels:
      run: back
  template:
    metadata:
      creationTimestamp:
      labels:
        run: back
    spec:
      containers:
      - name: back
        image: back:latest
        ports:
        - containerPort: 8080
          protocol: TCP
        volumeMounts:
        - mountPath: /data
          name: test-volume
        resources: {}
        terminationMessagePath: "/dev/termination-log"
        terminationMessagePolicy: File
        imagePullPolicy: Never
      # volumes belongs to the Pod spec, next to containers
      volumes:
      - name: test-volume
        hostPath:
          # directory location on host
          path: /directory/with/python/files
          # this field is optional
          type: Directory
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 1
  replicas: 1
  updatedReplicas: 1
  unavailableReplicas: 1
  conditions:
  - type: Progressing
    status: 'True'
    lastUpdateTime: '2019-07-16T13:16:34Z'
    lastTransitionTime: '2019-07-16T13:16:15Z'
    reason: NewReplicaSetAvailable
    message: ReplicaSet "back-7fd9995747" has successfully progressed.
  - type: Available
    status: 'False'
    lastUpdateTime: '2019-07-19T08:32:49Z'
    lastTransitionTime: '2019-07-19T08:32:49Z'
    reason: MinimumReplicasUnavailable
    message: Deployment does not have minimum availability.