Annotation Validation Error when trying to install Vault on OpenShift - openshift

Following this tutorial on installing Vault with Helm on OpenShift, I encountered the following error after executing the command:
bash
helm install vault hashicorp/vault -n $VAULT_NAMESPACE -f my_values.yaml
For the config:
values.yaml
bash
echo '# Custom values for the Vault chart
global:
# If deploying to OpenShift
openshift: true
server:
dev:
enabled: true
serviceAccount:
create: true
name: vault-sa
injector:
enabled: true
authDelegator:
enabled: true' > my_values.yaml
The error:
$ helm install vault hashicorp/vault -n $VAULT_NAMESPACE -f values.yaml
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "vault-agent-injector-clusterrole" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "my-vault-1": current value is "my-vault-0"
What exactly is happening, or how can I reset this specific name space to point to the right release namespace?

Have you by chance tried the exact same thing before, because that is what the error is hinting.
If we dissect the error, we get the to the root of the problem:
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists.
So something on the cluster already exists that you were trying to deploy via the helm chart.
Unable to continue with install:
Helm is aborting due to this failure
ClusterRole "vault-agent-injector-clusterrole" in namespace "" exists
So the cluster role vault-agent-injector-clusterrole that the helm chart is supposed to put onto the cluster already exsits. ClusterRoles aren't namespace specific, hence the "namespace" is blank.
and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "my-vault-1": current value is "my-vault-0"
The default behavior is to try to import existing resources that this chart requires, but it is not possible, because the owner of that ClusterRole is different from the deployment.
To fix this, you can remove the existing deployment of your chart and then give it an other try and it should work as expected.
Make sure all resources are gone. For this particular one you can check with kubectl get clusterroles

Related

mkdir /.gitlab-runner: permission denied running GitLab Runner in Kubernetes deployed via Helm

I'm trying to deploy the GitLab Runner (15.7.1) onto an on-premise Kubernetes cluster and getting the following error:
PANIC: loading system ID file: saving system ID state file: creating directory: mkdir /.gitlab-runner: permission denied
This is occurring with both the 15.7.1 image (Ubuntu?) and the alpine3.13-v15.7.1 image. Looking at the deployment, it looks likes it should be trying to use /home/gitlab-runner, but for some reason it is trying to use root (/), which is a protected directory.
Anyone else experience this issue or have a suggestion as to what to look at?
I am using the Helm chart (0.48.0) using a copy of the images from dockerhub (simply moved into a local repository as internet access is not available from the cluster). Connectivity to GitLab appears to be working, but the error causes the overall startup to fail. Full logs are:
Registration attempt 4 of 30
Runtime platform arch=amd64 os=linux pid=33 revision=6d480948 version=15.7.1
WARNING: Running in user-mode.
WARNING: The user-mode requires you to manually start builds processing:
WARNING: $ gitlab-runner run
WARNING: Use sudo for system-mode:
WARNING: $ sudo gitlab-runner...
Created missing unique system ID system_id=r_Of5q3G0yFEVe
PANIC: loading system ID file: saving system ID state file: creating directory: mkdir /.gitlab-runner: permission denied
I have tried the 15.7.1 image, the alpine3.13-v15.7.1 image, and the gitlab-runner-ocp:amd64-v15.7.1 image and searched the values.yaml for anything relevant to the path. Looking at the deployment template, it appears that it ought to be using /home/gitlab-runner as the directory (instead of /) [though the docs suggested it was /home].
As for "what was I expecting", of course I was expecting that it would "just work" :)
So, resolved this (and other) issues with:
Updated helm deployment template to mount an empty volume at /.gitlab-runner
[separate issue] explicitly added builds_dir and environment [per gitlab-org/gitlab-runner#3511 (comment 114281106)].
These two steps appeared to be sufficient to get the Helm chart deployment working.
You can easily create and mount the emptyDir (in case you are creating gitlab-runner with kubernetes manifest *.yml file):
volumes:
- emptyDir: {}
name: gitlab-runner
volumeMounts:
- name: gitlab-runner
mountPath: /.gitlab-runner
-------------------- OR --------------------
volumeMounts:
- name: root-gitlab-runner
mountPath: /.gitlab-runner
volumes:
- name: root-gitlab-runner
emptyDir:
medium: "Memory"

OC cluster UP on Fedora not started correctly

I am trying to run openshift on Fedora 36 using Origin-Client or OC.
I have updated fedora to the latest version.
I have installed oc .
whenever I tried to do oc cluster up
it shows below error :
[root#fedora ridhoswasta]# oc cluster up
Getting a Docker client ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Checking type of volume mount ...
Determining server IP ...
Checking if OpenShift is already running ...
Checking for supported Docker version (=>1.22) ...
Checking if insecured registry is configured properly in Docker ...
Checking if required ports are available ...
Checking if OpenShift client is configured properly ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Starting OpenShift using openshift/origin-control-plane:v3.11 ...
I0825 12:11:14.411027 50887 flags.go:30] Running "create-kubelet-flags"
I0825 12:11:16.391985 50887 run_kubelet.go:49] Running "start-kubelet"
I0825 12:11:17.200056 50887 run_self_hosted.go:181] Waiting for the kube-apiserver to be ready ...
E0825 12:16:17.201364 50887 run_self_hosted.go:571] API server error: Get "https://127.0.0.1:8443/healthz?timeout=32s": dial tcp 127.0.0.1:8443: connect: connection refused ()
Error: timed out waiting for the condition
Then I checked the logs for kubelet container it shows :
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-min-version has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-private-key-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --file-check-frequency has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --cluster-dns has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
I0825 05:13:19.249680 51788 server.go:417] Version: v1.11.0+d4cacc0
I0825 05:13:19.249928 51788 plugins.go:97] No cloud provider specified.
F0825 05:13:19.253892 51788 server.go:261] failed to run Kubelet: mountpoint for cpu not found
I have tried to reinstall docker with latest version but still I face this issue.
Could someone give me another thing to try?
Thanks!
oc cluster up is using the deprecated version of OpenShift, this has been superseded by OpenShift Local now: https://developers.redhat.com/products/openshift-local/overview. Although OpenShift Local uses a good deal more resources than oc cluster up ever did. There's a spiritual successor that might be worth checking out, and that's MicroShift: https://microshift.io/

Error installing artifactory-oss helm chart in openshift

I am trying to install artifactory oss in a openshift cluster. I am using this helm chart https://charts.jfrog.io/artifactory-oss-107.39.4.tgz (Warning I am very new to openshift etc.. I am on a steep learning curve )
I am running the helm chart as the openshift cluster-admin account
However I am getting this error
pods "artifactory-artifactory-nginx-5c66b8c948-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{107}: 107 is not an allowed group, spec.initContainers[0].securityContext.runAsUser: Invalid value: 104: must be in the ranges: [1000970000, 1000979999], spec.containers[0].securityContext.runAsUser: Invalid
I think it is a openshift permissions error .. in that it requires a more permissive security constraint. However given I am running as cluster-admin I find that a little suprising.
Can anyone offer a suggestion how to resolve this issue and get artifactory-oss running in openshift?
Thanks in advance !
--
Tried passing some options to set the uid and gild..
I tried starting with this
helm upgrade --install artifactory --set artifactory.uid=1001010042,artifactory.gid=1001010042,nginx.uid=1001010042,nginx.gid=1001010042,artifactory.masterKey=${MASTER_KEY},artifactory.joinKey=${JOIN_KEY},artifactory.postgresql.postgresqlPassword=$POSTGRES_PASSWORD --namespace artifactory jfrog/artifactory-oss
The options should have set the uids and gids.. but I still got.. Seems the helm chart ignores efforts to overwrite the values
pods "artifactory-artifactory-nginx-5c66b8c948-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{107}: 107 is not an allowed group, spec.initContainers[0].securityContext.runAsUser: Invalid value: 104: must be in the ranges: [1000930000, 1000939999], spec.containers[0].securityContext.runAsUser: Invalid
Regarding the JFrog Artifactory OSS Helm Chart, its documentation Installing Artifactory points out to some prerequisites.
When installing Artifactory, you must run the installation as a root user or provide sudo access to a non-root user.
For Helm
Create a unique Master Key (Artifactory requires a unique master key) pass it to the template during installation.
Create a secret containing the key. The key in the secret must be named master-key
kubectl create secret generic my-masterkey-secret -n artifactory --from-literal=master-key=${MASTER_KEY}
make sure to pass the same master key on all future calls to Helm install and Helm upgrade.
This means always passing --set artifactory.masterKey=${MASTER_KEY} (for the custom master key) or --set artifactory.masterKeySecretName=my-masterkey-secret (for the manual secret) and verifying that the contents of the secret remain unchanged.
create a unique join key: By default the chart has one set in the values.yaml (artifactory.joinKey).
However, this key is for demonstration purposes only and should not be used in a production environment
The point is: it depends on the exact command used to install the Helm Chart.
helm upgrade --install artifactory --set artifactory.masterKey=${MASTER_KEY} \
--set artifactory.joinKey=${JOIN_KEY} \
--namespace artifactory jfrog/artifactory
As illustrated here, the value for "runAsUser" and "fsGroup" in values.yaml can have an influence on the error message..
Unlike other installations, Helm Chart configurations are made to the values.yaml and are then applied to the system.yaml.
Follow these steps to apply the configuration changes.
Make the changes to values.yaml.
Run the command.
helm upgrade --install artifactory -n artifactory -f values.yaml
See Managing security context constraints for more.

Google Cloud Build windows builder error "Failed to get external IP address: Could not get external NAT IP from list"

I am trying to implement automatic deployments for my Windows Kubernetes container app. I'm following instructions from the Google's windows-builder, but the trigger quickly fails with this error at about 1.5 minutes in:
2021/12/16 19:30:06 Set ingress firewall rule successfully
2021/12/16 19:30:06 Failed to get external IP address: Could not get external NAT IP from list
ERROR
ERROR: build step 0 "gcr.io/[my-project-id]/windows-builder" failed: step exited with non-zero status: 1
The container, gcr.io/[my-project-id]/windows-builder, definitely exists and it's located in the same GCP project as the Cloud Build trigger just as the windows-builder documentation commanded.
I structured my code based off of Google's docker-windows example. Here is my repository file structure:
repository
cloudbuild.yaml
builder.ps1
worker
Dockerfile
Here is my cloudbuild.yaml:
steps:
# WORKER
- name: 'gcr.io/[my-project-id]/windows-builder'
args: [ '--command', 'powershell.exe -file build.ps1' ]
# OPTIONS
options:
logging: CLOUD_LOGGING_ONLY
Here is my builder.ps1:
docker build -t gcr.io/[my-project-id]/test-worker ./worker;
if ($?) {
docker push gcr.io/[my-project-id]/test-worker;
}
Here is my Dockerfile:
FROM gcr.io/[my-project-id]/test-windows-node-base:onbuild
Does anybody know what I'm doing wrong here? Any help would be appreciated.
Replicated the steps from GitHub and got the same error. It is throwing Failed to get external IP address... error because the External IP address of the VM is disabled by default in the source code. I was able to build it successfully by adding '--create-external-ip', 'true' in cloudbuild.yaml.
Here is my cloudbuild.yaml:
steps:
- name: 'gcr.io/$PROJECT_ID/windows-builder'
args: [ '--create-external-ip', 'true',
'--command', 'powershell.exe -file build.ps1' ]

Openshift: Error pulling image from remote, secure docker registry using certificates

I use the all-in-one VM of Openshift origin.
I am trying to pull images from a private, secure registry using an Image Stream. This is the ImageStream definition:
apiVersion: v1
kind: ImageStream
metadata:
name: my-image-stream
annotations:
description: Keeps track of changes in the application image
name: my-image
spec:
dockerImageRepository: "my.registry.net/myproject/my-image"
The repository is secured with a certificate. On my local machine, i have them in /etc/docker/certs.d/my.registry.net and I can login with docker login my.registry.net.
When I run oc import-image, however, I get the following error:
The import completed with errors.
Name: my-image
Namespace: myproject
Created: About an hour ago
Labels: <none>
Description: Keeps track of changes in the application image
Annotations: openshift.io/image.dockerRepositoryCheck=2017-01-27T08:09:49Z
Docker Pull Spec: 172.30.53.244:5000/myproject/my-image
Unique Images: 0
Tags: 1
latest
tagged from my.registry.net/myproject/my-image
! error: Import failed (InternalError): Internal error occurred: Get https://my.registry.net/v2/: remote error: handshake failure
About an hour ago
I have copied the certificates to the vagrant machine and restarted the docker daemon, but the problem remains. I have not found any documentation on how to properly add the certificates, so I just put them in the usual docker folder.
What is the appropriate way to make this work?
Update in response to rezie's answer:
There is no file etc/origin/master/ca-bundle.crt on my vagrant box. I found the following ca-bundle.crt files :
$ find / -iname ca-bundle.crt
/etc/pki/tls/certs/ca-bundle.crt
##multiple lines like
/var/lib/docker/devicemapper/mnt/something-hash-like/rootfs/etc/pki/tls/certs/ca-bundle.crt
/var/lib/origin/openshift.local.config/master/ca-bundle.crt
I appended the root certificate to /etc/pki/tls/certs/ca-bundle.crt and to var/lib/origin/openshift.local.config/master/ca-bundle.crt, but that did not change anything.
Please note, however, that I do not need to have this root certificate in /etc/docker/certs.d/... in order to login directly using docker login my.registry.net
I have appended
I cannot comment due tow lo karma so I'll write an answer saying almost the same as rezie.
The error:
! error: Import failed (InternalError): Internal error occurred: Get https://my.registry.net/v2/: remote error: handshake failure
About an hour ago
Comes from OpenShift, not from docker, therefore adding it to /etc/docker/certs.d/my.registry.net doesn't prevent the error from happening.
You should add the CA certificate at OS level, my guess is the steps failed for some reason so do it this way:
openssl s_client -connect my.registry.net:443 </dev/null |
sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' \
> /etc/pki/ca-trust/source/anchors/my.registry.net.crt &&
update-ca-trust check && update-ca-trust extract
Finally test if it worked running
curl https://my.registry.net/v2
If it doesn't give you a certificate error and you still can't do the oc import restart the atomic-openshift-master-api service
Try appending your CA (the same one you said you said that was used in the my.registry.net directory) into Openshift's ca bundle (e.g. /etc/origin/master/ca-bundle.crt. Then restart the service and reattempt import-image (making sure that you do not include the --insecure flag).
For reference, check out this issue from the Origin project. As you've mentioned, there's currently no way to supply certificates along with the dockercfg secret, and the suggestion from that issue is to add the CA as a trusted root CA across all the hosts.