Error installing artifactory-oss helm chart in openshift - openshift

I am trying to install artifactory oss in a openshift cluster. I am using this helm chart https://charts.jfrog.io/artifactory-oss-107.39.4.tgz (Warning I am very new to openshift etc.. I am on a steep learning curve )
I am running the helm chart as the openshift cluster-admin account
However I am getting this error
pods "artifactory-artifactory-nginx-5c66b8c948-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{107}: 107 is not an allowed group, spec.initContainers[0].securityContext.runAsUser: Invalid value: 104: must be in the ranges: [1000970000, 1000979999], spec.containers[0].securityContext.runAsUser: Invalid
I think it is a openshift permissions error .. in that it requires a more permissive security constraint. However given I am running as cluster-admin I find that a little suprising.
Can anyone offer a suggestion how to resolve this issue and get artifactory-oss running in openshift?
Thanks in advance !
--
Tried passing some options to set the uid and gild..
I tried starting with this
helm upgrade --install artifactory --set artifactory.uid=1001010042,artifactory.gid=1001010042,nginx.uid=1001010042,nginx.gid=1001010042,artifactory.masterKey=${MASTER_KEY},artifactory.joinKey=${JOIN_KEY},artifactory.postgresql.postgresqlPassword=$POSTGRES_PASSWORD --namespace artifactory jfrog/artifactory-oss
The options should have set the uids and gids.. but I still got.. Seems the helm chart ignores efforts to overwrite the values
pods "artifactory-artifactory-nginx-5c66b8c948-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{107}: 107 is not an allowed group, spec.initContainers[0].securityContext.runAsUser: Invalid value: 104: must be in the ranges: [1000930000, 1000939999], spec.containers[0].securityContext.runAsUser: Invalid

Regarding the JFrog Artifactory OSS Helm Chart, its documentation Installing Artifactory points out to some prerequisites.
When installing Artifactory, you must run the installation as a root user or provide sudo access to a non-root user.
For Helm
Create a unique Master Key (Artifactory requires a unique master key) pass it to the template during installation.
Create a secret containing the key. The key in the secret must be named master-key
kubectl create secret generic my-masterkey-secret -n artifactory --from-literal=master-key=${MASTER_KEY}
make sure to pass the same master key on all future calls to Helm install and Helm upgrade.
This means always passing --set artifactory.masterKey=${MASTER_KEY} (for the custom master key) or --set artifactory.masterKeySecretName=my-masterkey-secret (for the manual secret) and verifying that the contents of the secret remain unchanged.
create a unique join key: By default the chart has one set in the values.yaml (artifactory.joinKey).
However, this key is for demonstration purposes only and should not be used in a production environment
The point is: it depends on the exact command used to install the Helm Chart.
helm upgrade --install artifactory --set artifactory.masterKey=${MASTER_KEY} \
--set artifactory.joinKey=${JOIN_KEY} \
--namespace artifactory jfrog/artifactory
As illustrated here, the value for "runAsUser" and "fsGroup" in values.yaml can have an influence on the error message..
Unlike other installations, Helm Chart configurations are made to the values.yaml and are then applied to the system.yaml.
Follow these steps to apply the configuration changes.
Make the changes to values.yaml.
Run the command.
helm upgrade --install artifactory -n artifactory -f values.yaml
See Managing security context constraints for more.

Related

OC cluster UP on Fedora not started correctly

I am trying to run openshift on Fedora 36 using Origin-Client or OC.
I have updated fedora to the latest version.
I have installed oc .
whenever I tried to do oc cluster up
it shows below error :
[root#fedora ridhoswasta]# oc cluster up
Getting a Docker client ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Checking type of volume mount ...
Determining server IP ...
Checking if OpenShift is already running ...
Checking for supported Docker version (=>1.22) ...
Checking if insecured registry is configured properly in Docker ...
Checking if required ports are available ...
Checking if OpenShift client is configured properly ...
Checking if image openshift/origin-control-plane:v3.11 is available ...
Starting OpenShift using openshift/origin-control-plane:v3.11 ...
I0825 12:11:14.411027 50887 flags.go:30] Running "create-kubelet-flags"
I0825 12:11:16.391985 50887 run_kubelet.go:49] Running "start-kubelet"
I0825 12:11:17.200056 50887 run_self_hosted.go:181] Waiting for the kube-apiserver to be ready ...
E0825 12:16:17.201364 50887 run_self_hosted.go:571] API server error: Get "https://127.0.0.1:8443/healthz?timeout=32s": dial tcp 127.0.0.1:8443: connect: connection refused ()
Error: timed out waiting for the condition
Then I checked the logs for kubelet container it shows :
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-min-version has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-private-key-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --file-check-frequency has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --cluster-dns has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
I0825 05:13:19.249680 51788 server.go:417] Version: v1.11.0+d4cacc0
I0825 05:13:19.249928 51788 plugins.go:97] No cloud provider specified.
F0825 05:13:19.253892 51788 server.go:261] failed to run Kubelet: mountpoint for cpu not found
I have tried to reinstall docker with latest version but still I face this issue.
Could someone give me another thing to try?
Thanks!
oc cluster up is using the deprecated version of OpenShift, this has been superseded by OpenShift Local now: https://developers.redhat.com/products/openshift-local/overview. Although OpenShift Local uses a good deal more resources than oc cluster up ever did. There's a spiritual successor that might be worth checking out, and that's MicroShift: https://microshift.io/

Unable to connect worker node to master using K3S

I am trying to setup a K3S cluster for learning purposes but I am having trouble connecting the master node with agents. I have looked several tutorials and discussions on this but I can't find a solution. I know I am probably missing something obvious (due to my lack of knowledge), but still help would be much appreciated.
I am using two AWS t2.micro instances with default configuration.
When ssh into the master and installed K3S using
curl -sfL https://get.k3s.io | sh -s - --no-deploy traefik --write-kubeconfig-mode 644 --node-name k3s-master-01
with kubectl get nodes, I am able to see the master
NAME STATUS ROLES AGE VERSION
k3s-master-01 Ready control-plane,master 13s v1.23.6+k3s1
So far it seems I am doing things right. From what I understand, I am supposed to configure the kubeconfig file. So, I accessed it by using
cat /etc/rancher/k3s/k3s.yaml
I copied the configuration file and the server info to match the private IP I took from AWS console, resulting in something like this
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: <lots_of_info>
server: https://<master_private_IP>:6443
name: default
contexts:
- context:
cluster: default
user: default
name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
user:
client-certificate-data: <my_certificate_data>
client-key-data: <my_key_data>
Then, I ran vi ~/.kube/config, and there I pasted the kubeconfig file
Finally, I grabbed the token with cat /var/lib/rancher/k3s/server/node-token, ssh into the other machine and then run the following
curl -sfL https://get.k3s.io | K3S_NODE_NAME=k3s-worker-01 K3S_URL=https://<master_private_IP>:6443 K3S_TOKEN=<master_token> sh -
The output is
[INFO] Finding release for channel stable
[INFO] Using v1.23.6+k3s1 as release
[INFO] Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.23.6+k3s1/sha256sum-amd64.txt
[INFO] Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.23.6+k3s1/k3s
[INFO] Verifying binary download
[INFO] Installing k3s to /usr/local/bin/k3s
[INFO] Skipping installation of SELinux RPM
[INFO] Creating /usr/local/bin/kubectl symlink to k3s
[INFO] Creating /usr/local/bin/crictl symlink to k3s
[INFO] Creating /usr/local/bin/ctr symlink to k3s
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO] systemd: Enabling k3s-agent unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
[INFO] systemd: Starting k3s-agent
By this output, it looks like I have created an agent. However, when I run kubectl get nodes in the master, I still get
NAME STATUS ROLES AGE VERSION
k3s-master-01 Ready control-plane,master 12m v1.23.6+k3s1
What is the thing I was supposed to do in order to get the agent connected to the master? I am guess I am probably missing something simple, but I just can't seem to find the solution. I've read all the documentation but it is still not clear to me where I am making the mistake. I've tried saving the private master IP and token into the agent as environmental variables with export K3S_TOKEN=master_token and K3S_URL=master_private_IP and then simply running curl -sfL https://get.k3s.io | sh - but I still can't see the worker nodes when running kubectl get nodes
Any help would be appreciated.
It might be your VM instance firewall that prevents appropriate connection from your master to the worker node (and vice versa). Official rancher documentation advise to disable firewall for (Red Hat/CentOS) Enterprise Linux:
It is recommended to turn off firewalld:
systemctl disable firewalld --now
If enabled, it is required to disable nm-cloud-setup and reboot the node:
systemctl disable nm-cloud-setup.service nm-cloud-setup.timer reboot
If you are using Ubuntu on your VM's, there is a different firewall tool (ufw).
In my case, allowing 6443 and 443(not sure if required) port TCP connections worked fine.
Allow port 6443 and TCP connection in all of your cluster machines:
sudo ufw allow 6443/tcp
Then apply k3s installation script in your worker node(s):
curl -sfL https://get.k3s.io | K3S_NODE_NAME=k3s-worker-1 K3S_URL=https://<k3s-master-1 IP>:6443 K3S_TOKEN=<k3s-master-1 TOKEN> sh -
This should work. If not, you can try adding additional allow rule for 443 tcp port as well.
A few options to check.
Check Journalctl for errors
journalctl -u k3s-agent.service -n 300 -xn
If using RaspberryPi for a worker node, make sure you have
cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1
as the very end of your /boot/cmdline.txt file. DO NOT PUT THIS VALUE ON A NEW LINE! Should just be appended to the end of the line.
If your master node(s) have self-signed certs, make sure you copy the master node's self signed cert to your worker node(s). In linux or raspberry pi copy cert to /usr/local/share/ca-certificates, then issue an
sudo update-ca-certificates
on the worker node
Don't forget to reboot the worker node after you make these changes!
Hope this helps someone!

how to delete project in redhat openshift web ui without permissions?

I tried openshift redhat k8s distro and now there are 2 projects that i need to delete. I can only login as user 'erjcan', this is my primary acc and it seems not to be allowed to do admin actions.
The 'delete button' is inactive in gui console, i tried to create a role for myself but can't.
I tried to create admin-like role and assume it as a user, but it is not allowed either.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: all-stuff
namespace: erjcan-stage
rules:
- apiGroups:
- ''
resources:
- '*'
verbs:
- '*'
This code above gives me RBAC not allowed error:
An error occurred
roles.rbac.authorization.k8s.io "all-stuff" is forbidden: user "erjcan"
(groups=["system:authenticated:oauth" "system:authenticated"]) is
attempting to grant RBAC permissions
not currently held: {APIGroups:[""], Resources:["*"],
Verbs:["*"]}
I tried to delete via cli, but i can only login as erjcan user.
Logged into "https://api.sandbox-m2.ll9k.p1.openshiftapps.com:6443" as "erjcan" using the token provided.
You have access to the following projects and can switch between them with 'oc project <projectname>':
erjcan-dev
* erjcan-stage
Using project "erjcan-stage".
bash-4.4 ~ $
bash-4.4 ~ $ oc delete project erjan-dev
Error from server (Forbidden): projects.project.openshift.io "erjan-dev" is forbidden: User "erjcan" cannot delete resource "projects" in API group "project.openshift.io" in the namespace "erjan-dev"
bash-4.4 ~ $ oc delete project erjcan-dev
Error from server (Forbidden): projects.project.openshift.io "erjcan-dev" is forbidden: User "erjcan" cannot delete resource "projects" in API group "project.openshift.io" in the namespace "erjcan-dev"
How to delete a project in redhat openshift gui console?
You appear to be talking about using Red Hat's developer sandbox. Which, indeed, does not allow you to delete projects. There's no way around that: RBAC is specifically set up to not allow you to create or delete projects.
You don't say why you need to delete the projects. They will go away eventually do to inactivity. But, if you just want a clean slate, or just need to remove what you have inside that project you do have permission to delete everything in the project (just not the project itself).
oc delete all --all will remove everything inside the current project. Obviously use that command with strict care: there is no confirmation or warning. (BTW, the first "all" is saying all types of objects: pods/deployments/routes/etc, the second --all is saying "yes, I'm deliberately not providing a filter or any other subset, I really mean delete all of the objects I'm specifying".
Similarly, the following two commands should clean up both of your projects. (Although they will still exist.)
oc delete all --all -n erjcan-stage
oc delete all --all -n erjcan-dev

Annotation Validation Error when trying to install Vault on OpenShift

Following this tutorial on installing Vault with Helm on OpenShift, I encountered the following error after executing the command:
bash
helm install vault hashicorp/vault -n $VAULT_NAMESPACE -f my_values.yaml
For the config:
values.yaml
bash
echo '# Custom values for the Vault chart
global:
# If deploying to OpenShift
openshift: true
server:
dev:
enabled: true
serviceAccount:
create: true
name: vault-sa
injector:
enabled: true
authDelegator:
enabled: true' > my_values.yaml
The error:
$ helm install vault hashicorp/vault -n $VAULT_NAMESPACE -f values.yaml
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "vault-agent-injector-clusterrole" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "my-vault-1": current value is "my-vault-0"
What exactly is happening, or how can I reset this specific name space to point to the right release namespace?
Have you by chance tried the exact same thing before, because that is what the error is hinting.
If we dissect the error, we get the to the root of the problem:
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists.
So something on the cluster already exists that you were trying to deploy via the helm chart.
Unable to continue with install:
Helm is aborting due to this failure
ClusterRole "vault-agent-injector-clusterrole" in namespace "" exists
So the cluster role vault-agent-injector-clusterrole that the helm chart is supposed to put onto the cluster already exsits. ClusterRoles aren't namespace specific, hence the "namespace" is blank.
and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "my-vault-1": current value is "my-vault-0"
The default behavior is to try to import existing resources that this chart requires, but it is not possible, because the owner of that ClusterRole is different from the deployment.
To fix this, you can remove the existing deployment of your chart and then give it an other try and it should work as expected.
Make sure all resources are gone. For this particular one you can check with kubectl get clusterroles

Kubernetes google cloud composer with gitlab ci yaml file

I am working on the deployment of a gitlab CI pipeline to trigger a google cloud composer DAG
Below is the .yaml I wrote :
stages:
- deploy
deploy:
stage: deploy
image: google/cloud-sdk
script:
- apt-get update && apt-get --only-upgrade install kubectl google-cloud-sdk
- gcloud config set project $GCP_PROJECT_ID
- gsutil cp plugins/*.py ${PLUGINS_BUCKET}
- gsutil cp dags/*.py ${DAGS_BUCKET}
- kubectl get pods
- gcloud composer environments run ${COMPOSER_ENVIRONMENT} --location ${ENVIRONMENT_LOCATION} trigger_dag -- ${DAG_NAME}
Unfortunately, the execution of the pipleine fails with the error below :
$ gcloud config set project $GCP_PROJECT_ID
Updated property [core/project].
$ gsutil cp plugins/*.py ${PLUGINS_BUCKET}
Copying file://plugins/dataproc_custom_operators.py [Content-Type=text/x-python]...
/ [0 files][ 0.0 B/ 2.3 KiB]
/ [1 files][ 2.3 KiB/ 2.3 KiB]
Operation completed over 1 objects/2.3 KiB.
$ gsutil cp dags/*.py ${DAGS_BUCKET}
copying file://dags/frrm_infdeos_workflow.py [Content-Type=text/x-python]...
/ [0 files][ 0.0 B/ 3.3 KiB]
/ [1 files][ 3.3 KiB/ 3.3 KiB]
Operation completed over 1 objects/3.3 KiB.
$ gcloud composer environments run ${COMPOSER_ENVIRONMENT} --location ${ENVIRONMENT_LOCATION} trigger_dag -- ${DAG_NAME}
kubeconfig entry generated for europe-west1-nameenvironment-a5456e0c-gke.
ERROR: (gcloud.composer.environments.run) No running GKE pods found. If the environment was recently started, please wait and retry.
ERROR: Job failed: command terminated with exit code 1
Do you have any idea about how to fix this please ?
Best regards
I had the same problem as #scalacode. For me, the solution was that the gitlab-runner was running in a different GCP Project than the Composer Environment, so it failed without specifying that error. Running a gitlab-runner in the same project as the Composer Environment fixed the issue.
It seems Composer is unable to retrieve information about the pods/GKE cluster. This could be for a number of reasons ranging from the GKE cluster not creating the nodes to the pods being in a crash loop.
I notice in the script you did not “get-credentials” to authenticate to the cluster. When running commands on a GKE cluster through CLI, traditionally you would first have to authenticate to the cluster first with command. To do this with composer:
gcloud composer environments describe ${COMPOSER_ENVIRONMENT} --location ${ENVIRONMENT_LOCATION} --format="get(config.gkeCluster)"
This will return something of the form: projects/PROJECT/zones/ZONE/clusters/CLUSTER Then run:
gcloud container clusters get-credentials ${CLUSTER} --zone ${ZONE}
Once you have authenticated to the cluster in the script, see if it is now able to complete. If not, try running kubectl get pods to see what is happening with the pods/if they exist.
If you see many pods restarting or generally not in the “running/completed” state, the issue could be with the pod configuration.
If you don’t see pods at all, the deployment may have failed. Check the deployment with command kubectl get deployments.
The deployments airflow-scheduler, airflow-sqlproxy, & airflow-worker should be present. If those three deployments are not present, the environment was likely tampered with, & it would be easiest to make a new environment.