Error while loading shared libraries when running prehook pod - OpenShift

I am new to OpenShift and I am deploying my Flask app onto it, but I have encountered a problem. My app/container name is flog.
I set up a lifecycle prehook to ensure the database is created correctly for the app deployment. Here is my config (the critical part):
spec:
  replicas: 1
  selector:
    deploymentconfig: flog
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      pre:
        execNewPod:
          command:
            - flask
            - init
          containerName: flog
          env:
            - name: FLASK_APP
              value: wsgi.py
        failurePolicy: Abort
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
The build works correctly, but it breaks in the prehook:
--> pre: Running hook pod ...
/opt/app-root/bin/python3: error while loading shared libraries: libpython3.5m.so.rh-python35-1.0: cannot open shared object file: No such file or directory
However, when I open a terminal in the container and run the python3 command, it works fine.
Thanks in advance for any help.

You will need to add a shell script to your image which in turn runs your command. The shell script wrapper is needed because initialisation of the shell environment has the side effect of enabling the Python environment, including setting the environment variables needed to find the Python shared library.
So change:
command:
- flask
- init
to:
command:
- somescript
And in somescript have:
#!/bin/bash
flask init
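For reference, a slightly more explicit variant of somescript enables the software collection by hand instead of relying on the shell initialisation side effect. This is only a sketch; it assumes the image is based on the rh-python35 software collection (as the error message suggests) and that the scl_source helper from scl-utils is present in the image:
#!/bin/bash
# Explicitly enable the rh-python35 collection so LD_LIBRARY_PATH and PATH
# point at the collection's Python before running the hook command.
source scl_source enable rh-python35
exec flask init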

Related

mkdir /.gitlab-runner: permission denied running GitLab Runner in Kubernetes deployed via Helm

I'm trying to deploy the GitLab Runner (15.7.1) onto an on-premise Kubernetes cluster and getting the following error:
PANIC: loading system ID file: saving system ID state file: creating directory: mkdir /.gitlab-runner: permission denied
This is occurring with both the 15.7.1 image (Ubuntu?) and the alpine3.13-v15.7.1 image. Looking at the deployment, it looks like it should be trying to use /home/gitlab-runner, but for some reason it is trying to use the root directory (/), which is a protected directory.
Anyone else experience this issue or have a suggestion as to what to look at?
I am using the Helm chart (0.48.0) with a copy of the images from Docker Hub (simply moved into a local repository, as internet access is not available from the cluster). Connectivity to GitLab appears to be working, but the error causes the overall startup to fail. The full logs are:
Registration attempt 4 of 30
Runtime platform arch=amd64 os=linux pid=33 revision=6d480948 version=15.7.1
WARNING: Running in user-mode.
WARNING: The user-mode requires you to manually start builds processing:
WARNING: $ gitlab-runner run
WARNING: Use sudo for system-mode:
WARNING: $ sudo gitlab-runner...
Created missing unique system ID system_id=r_Of5q3G0yFEVe
PANIC: loading system ID file: saving system ID state file: creating directory: mkdir /.gitlab-runner: permission denied
I have tried the 15.7.1 image, the alpine3.13-v15.7.1 image, and the gitlab-runner-ocp:amd64-v15.7.1 image and searched the values.yaml for anything relevant to the path. Looking at the deployment template, it appears that it ought to be using /home/gitlab-runner as the directory (instead of /) [though the docs suggested it was /home].
As for "what was I expecting", of course I was expecting that it would "just work" :)
So, I resolved this (and other) issues with:
Updated the Helm deployment template to mount an emptyDir volume at /.gitlab-runner.
[separate issue] Explicitly added builds_dir and environment [per gitlab-org/gitlab-runner#3511 (comment 114281106)]; a rough sketch of those settings follows below.
These two steps appeared to be sufficient to get the Helm chart deployment working.
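For the second step, a rough sketch of what those settings look like in the runner's config.toml (the values here are illustrative, not taken from the linked issue comment; adjust them to your environment):
[[runners]]
  # Run builds somewhere the runner user can actually write to.
  builds_dir = "/home/gitlab-runner/builds"
  # Give jobs a writable HOME as well (illustrative value).
  environment = ["HOME=/home/gitlab-runner"]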
You can easily create and mount the emptyDir volume (in case you are creating the gitlab-runner with a Kubernetes manifest *.yml file):
volumes:
  - emptyDir: {}
    name: gitlab-runner
volumeMounts:
  - name: gitlab-runner
    mountPath: /.gitlab-runner
-------------------- OR --------------------
volumeMounts:
  - name: root-gitlab-runner
    mountPath: /.gitlab-runner
volumes:
  - name: root-gitlab-runner
    emptyDir:
      medium: "Memory"

Deploying chart using helmfile returns exit code 1

I'm trying to deploy a chart using helmfile. It works just fine locally using the same version and the same cluster.
The helmfile
environments:
  dev:
    values:
      - kubeContext: nuc
      - host: urbantz-api.dev.fitfit.dk
  prod:
    values:
      - kubeContext: nuc
      - host: urbantz-api.fitfit.dk
releases:
  - name: urbantz-api
    namespace: urbantz-api-{{ .Environment.Name }}
    chart: helm/
    kubeContext: "{{ .Values.kubeContext }}"
    # verify: true
    values:
      - image:
          tag: '{{ requiredEnv "IMAGE_TAG" }}'
      - ingress:
          enabled: true
          hosts:
            - host: {{ .Values.host }}
              paths:
                - path: /
The complete pipeline can be found here, but the relevant command can be seen below:
[ "$IMAGE_TAG" == "latest" ] && ./helmfile --debug -e dev sync
The complete output from the pipeline can be found here, but the relevant part can be seen below:
...
NOTES:
1. Get the application URL by running these commands:
http://urbantz-api.dev.fitfit.dk/
helm:whTHc> WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/runner/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/runner/.kube/config
helm:whTHc> NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
urbantz-api urbantz-api-dev 4 2021-03-13 12:07:01.111013559 +0000 UTC deployed urbantz-api-0.1.0 1.16.0
getting deployed release version failed:Failed to get the version for:helm
Removed /tmp/helmfile212040489/urbantz-api-dev-urbantz-api-values-569bd76cf
Removed /tmp/helmfile850374772/urbantz-api-dev-urbantz-api-values-57897fc66b
UPDATED RELEASES:
NAME CHART VERSION
urbantz-api helm/
urbantz-api urbantz-api-dev 4 2021-03-13 12:07:01.111013559 +0000 UTC deployed urbantz-api-0.1.0 1.16.0
Error: Process completed with exit code 1.
Please be aware that I'm also getting the message "getting deployed release version failed:Failed to get the version for:helm" when running locally, but there the exit code is still 0.
UPDATE: I made it work by adding an ls at the end of my pipeline. The expression [ "$IMAGE_TAG" == "latest" ] && ./helmfile --debug -e dev sync exits with 1 if the test fails. Does anyone have a better solution than adding an ls on the following line?
Change the file permissions on the Kubernetes configuration file:
chmod 600 ~/.kube/config
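As for the exit-code behaviour described in the update: cmd1 && cmd2 makes the whole line exit with the status of the failed test whenever IMAGE_TAG is not latest, and most CI systems treat a non-zero step as a failure. A sketch of an alternative that avoids the trailing ls is an explicit if:
# Only run helmfile for the latest tag; the step itself still exits 0
# when the condition is false.
if [ "$IMAGE_TAG" == "latest" ]; then
  ./helmfile --debug -e dev sync
fi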

How do I load a dockerimage in eclipse-che?

I'm trying to load a Docker image on openshift.io.
So I attempted to just use 'hello-world' as my Docker image. This is my devfile:
metadata:
  name: test
attributes:
  persistVolumes: 'false'
components:
  - mountSources: true
    endpoints:
      - name: hello
        port: 4200
    memoryLimit: 1Gi
    type: dockerimage
    image: 'hello-world'
    alias: hello-world
apiVersion: 1.0.0
However, I get this error: Error: Failed to run the workspace: "The following containers have terminated: hello-world: reason = 'Completed', exit code = 0, message = 'null'"
This doesn't happen with the custom images provided by Eclipse, so what do I need to change in order to get a Docker image to work on openshift.io? As far as I know, I can't edit the "Dockerfile"; I can only pull images from a Docker registry.
The command attribute of the dockerimage component, along with the other arguments, is used to modify the entrypoint command of the container created from the image. In Eclipse Che the container needs to run indefinitely so that you can connect to it and execute arbitrary commands in it at any time. Because the availability of the sleep command, and of support for its infinity argument, differs from one base image to another, Che cannot insert this behaviour automatically on its own. However, you can take advantage of this feature to, for example, start necessary servers with modified configurations, and so on.
For the dockerimage component to have access to the project sources, you must set the mountSources attribute to true.
metadata:
  name: test
attributes:
  persistVolumes: 'false'
components:
  - mountSources: true
    endpoints:
      - name: hello
        port: 4200
    memoryLimit: 1Gi
    type: dockerimage
    image: 'hello-world'
    alias: hello-world
    command: ['sleep', 'infinity']
It looks like the entry process of the hello-world image exits. Your image should not exit by default, or you should override the default entry command in your devfile with a command that will not exit. You can try adding something like the following to the dockerimage component:
command: ['tail']
args: ['-f', '/dev/null']
Check out this example as well.

Cronjob of existing Pod

I have a Django app running on OpenShift 3. I need to run certain manage.py commands on a regular basis. In OpenShift 2 I used the cron gear, and now in OpenShift 3 I want to use the CronJob pod type.
I want to create a pod for the cron job that uses the same source as the Django app, but is not exposed.
For example:
W1 - Django app
D1 - Postgres DB
M1 - django app for manage.py jobs, run as a cronjob pod.
Any help is appreciated.
You want to use a scheduled job.
https://docs.openshift.com/container-platform/3.5/dev_guide/cron_jobs.html
https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
https://blog.openshift.com/openshift-jobs/
Note that at this time (OpenShift 3.5), you have to use batch/v2alpha1 as the API version. Be careful of out-of-date documentation showing older version labels.
What I am not sure of is how you can easily reference the image associated with the existing image stream, produced when you used the S2I builder to build your application, when you want to use the same image. The base Kubernetes object for this expects you to refer to the image from the image registry. You would thus need to work that out by looking at the image stream and copying the image registry IP and image details over by hand.
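As a rough illustration of that hand-copied reference (registry host, project, image stream name, and management command below are placeholders, not taken from the question):
apiVersion: batch/v2alpha1
kind: CronJob
metadata:
  name: django-manage
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: django-manage
              # Full pullspec from the internal registry, copied from
              # 'oc describe is <imagestream>' (placeholder values below).
              image: docker-registry.default.svc:5000/myproject/django-app:latest
              command: ["python", "manage.py", "clearsessions"]
          restartPolicy: Never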
UPDATE 1
See:
https://stackoverflow.com/a/45227960/128141
for details of how from OpenShift 3.6 you can have it resolve the imagestream name automatically. That mechanism is still alpha status in 3.6, but does work.
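In short, that 3.6 mechanism lets a bare image stream name in a CronJob resolve to the internal registry image once local lookup is enabled on the image stream, for example (the image stream name here is a placeholder):
# Enable local image stream resolution (OpenShift 3.6+, alpha at that time),
# so a plain reference like django-app:latest in the CronJob is resolved
# to the internal registry image.
oc set image-lookup django-app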
I've gotten it to work by specifying the image name in the YAML, but when I then tried to get it to work as part of the template, I ran into an error when trying to use the batch/v1 API version on this server:
Cannot create cron job "djangomanage". The API version batch/v1 for kind CronJob is not supported by this server.
My template code is
- apiVersion: batch/v1
  kind: CronJob
  metadata:
    name: djangomanage
  spec:
    schedule: "*/5 * * * *"
    jobTemplate:
      spec:
        template:
          spec:
            containers:
              - name: djangomanage
                image: '${NAME}:latest'
                env:
                  - name: APP_SCRIPT
                    value: "/opt/app-root/src/cron.sh"
            restartPolicy: Never
CRON.SH
python /opt/app-root/src/manage.py
You need to update the first line (the apiVersion) to this:
- apiVersion: batch/v1beta1
See the link below:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#cronjob-v1beta1-batch
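So the start of the template entry becomes (the rest of the spec stays exactly as above):
- apiVersion: batch/v1beta1
  kind: CronJob
  metadata:
    name: djangomanage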

Container-VM Image with GPD Volumes fails with "Failed to get GCE Cloud Provider. plugin.host.GetCloudProvider returned <nil> instead"

I am currently trying to switch from the "Container-Optimized Google Compute Engine Images" (https://cloud.google.com/compute/docs/containers/container_vms) to the "Container-VM" image (https://cloud.google.com/compute/docs/containers/vm-image/#overview). In my containers.yaml, I define a volume and a container using the volume.
apiVersion: v1
kind: Pod
metadata:
  name: workhorse
spec:
  containers:
    - name: postgres
      image: postgres:9.5
      imagePullPolicy: Always
      volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: postgres-storage
      gcePersistentDisk:
        pdName: disk-name
        fsType: ext4
This setup worked fine with the "Container-Optimized Google Compute Engine Images", but fails with the "Container-VM" image. In the logs, I can see the following error:
May 24 18:33:43 battleship kubelet[629]: E0524 18:33:43.405470 629 gce_util.go:176]
Error getting GCECloudProvider while detaching PD "disk-name":
Failed to get GCE Cloud Provider. plugin.host.GetCloudProvider returned <nil> instead
Thanks in advance for any hint!
This happens only when the kubelet is run without the --cloud-provider=gce flag. The problem, unless it is something different, depends on how GCP is launching Container-VMs.
Please contact the Google Cloud Platform folks.
Note, if this happens to you when using GCE: add the --cloud-provider=gce flag to the kubelet on all your workers. This only applies to 1.2 cluster versions because, if I'm not wrong, there is an ongoing attach/detach redesign targeted for 1.3 clusters which will move this business logic out of the kubelet.
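For illustration, the flag is simply added to the kubelet command line on each worker; where that command line lives (a systemd unit, an /etc/default file, instance metadata) depends on how the node was provisioned, so treat this as a sketch only:
# On every worker node, make sure the kubelet starts with the GCE cloud
# provider enabled (all other flags stay whatever your nodes already use).
kubelet --cloud-provider=gce <existing flags>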
In case someone is interested in the attach/detach redesign, here is the corresponding GitHub issue: https://github.com/kubernetes/kubernetes/issues/20262