Compute Engine Deploy Container - google-compute-engine

I am using golang to programmatically create and destroy one-off Compute Engine instances using the Compute Engine API.
I can create an instance just fine, but what I'm really having trouble with is launching a container on startup.
You can do it from the Console UI:
But as far as I can tell it's extremely hard to do it programmatically, especially with Container Optimized OS as the base image. I tried doing a startup script that does a docker pull us-central1-docker.pkg.dev/project/repo/image:tag but it fails because you need to do gcloud auth configure-docker us-central1-docker.pkg.dev first for that to work and COOS doesn't have gcloud nor a package manager to get it.
All my workarounds seem hacky:
Manually create a VM template that has the desired container and create instances of the template
Put container in external registry like docker hub (not acceptable)
Use Ubuntu instead of COOS with a package manager so I can programmatically install gcloud, docker, and the container on startup
Use COOS to pull down an image from dockerhub containing gcloud, then do some sort of docker-in-docker mount to pull it down
Am I missing something or is it just really cumbersome to deploy a container to a compute engine instance without using gcloud or the Console UI?

To have a Compute Engine start a container when the Compute Engine starts, one has to define meta data for the description of the container. When the COOS starts, it appears to run an application called konlet which can be found here:
https://github.com/GoogleCloudPlatform/konlet
If we look at the documentation for this, it says:
The agent parses container declaration that is stored in VM instance metadata under gce-container-declaration key and starts the container with the declared configuration options.
Unfortunately, I haven't found any formal documentation for the structure of this metadata. While I couldn't find documentation, I did find two possible solutions:
Decipher the source code of konlet and break it apart to find out how the metadata maps to what is passed when the docker container is started
or
Create a Compute Engine by hand with the desired container definitions and then start the Compute Engine. SSH into the Compute Engine and then retrieve the current metadata. We can read about retrieving meta data here:
https://cloud.google.com/compute/docs/metadata/overview

It turns out, it's not too hard to pull down a container from Artifact Registry in Container Optimized OS:
Run docker-credential-gcr configure-docker --registries [region]-docker.pkg.dev
See: https://cloud.google.com/container-optimized-os/docs/how-to/run-container-instance#accessing_private_images_in_or
So what you can do is put the above line along with docker pull [image] and docker run ... into a startup script. You can specify a startup script when creating an instance using the metadata field: https://cloud.google.com/compute/docs/instances/startup-scripts/linux#api
This seems the least hacky way of provisioning an instance with a container programmatically.

You mentioned you used docker-credential-gcr to solve your problem. I tried the same in my startup script:
docker-credential-gcr configure-docker --registries us-east1-docker.pkg.dev
But it returns:
ERROR: Unable to save docker config: mkdir /root/.docker: read-only file system
Is there some other step needed? Thanks.

I recently ran into the other side of these limitations (and asked a question on the topic).
Basically, I wanted to provision a COOS instance without launching a container. I was unable to, so I just launched a container from a base image and then later in my CI/CD pipeline, Dockerized my app, uploaded it to Artifact Registry and replaced the base image on the COOS instance with my newly built app.
The metadata I provided to launch the initial base image as a container:
spec:
containers:
- image: blairnangle/python3-numpy-ta-lib:latest
name: containervm
securityContext:
privileged: false
stdin: false
tty: false
volumeMounts: []
restartPolicy: Always
volumes: []
I'm a Terraform fanboi, so the metadata exists within some Terraform configuration. I have a public project with the code that achieves this if you want to take a proper look: blairnangle/dockerized-flask-on-gce.

Related

OpenShift single node PersistentVolume with hostPath requires privileged pods, how to set as default?

I am fairly new to OpenShift and have been using CRC (Code Ready Containers) for a little while, and now decided to install the single server OpenShift on bare metal using the Assisted-Installer method from https://cloud.redhat.com/blog/deploy-openshift-at-the-edge-with-single-node-openshift and https://console.redhat.com/openshift/assisted-installer/clusters/. This has worked well and I have a functional single-server.
As a single server in a test environment (without NFS available) I need/want to create PersistentVolumes with hostPath (localhost storage) - these work flawlessly in CRC. However on the full install, I run into an issue when mounting PVC's to pods as the pods were not running privileged. I edited the deployment config and added the lines below (within the containers hash)
- resources: {}
...
securityContext:
privileged: true
... however still had errors as the restricted SCC has 'allowPrivilegedContainer: false'. I have done a horrible hack of changing this to true, so adding the lines above to the deployment yaml works. However there must be an easier way as none of these hacks seem present in CRC. I checked and CRC pods run restricted, the restricted SCC has privileged set to false, and the Persistent Volume is also using hostPath. I also do not have to edit the deployment yaml as above in CRC - it just works (tm).
Guidance here shows that the containers must run privileged, however the containers in CRC are running restricted and the SCC still has 'allowPrivilegedContainer: false'.
https://docs.openshift.com/container-platform/4.8/storage/persistent_storage/persistent-storage-hostpath.html
An example app creation as below (from the RedHat DO280 course) works without any massaging of privileges or deployment config in CRC, but on a real OS server requires the massaging above. As my server is purely for testing, I would like to make it easier without doing the hackjob and deployment changes above.
oc new-app --name mysql --docker-image registry.access.redhat.com/rhscl/mysql-57-rhel7:5.7
oc create secret generic mysql --from-literal password=r3dh4t123
oc set env deployment mysql --prefix MYSQL_ROOT_ --from secret/mysql
oc set volumes deployment/mysql --name mysql-storage --add --type pvc --claim-size 2Gi --claim-mode rwo --mount-path /var/lib/mysql/data
oc get pods -l deployment=mysql
oc get pvc
Any help appreciated.
EDIT: I have overcome this now by enabling nfs-server and adding entries to /etc/exports. However I'm still interested to understand how CRC manages the above issue when using hostPath
The short answer to this is: don't use hostPath.
You are using hostPath to make use of arbitrary disk space available on the underlying host's volume. hostPath can also be used to read/write any directory path on the underlying host's volume -- which, as you can imagine, should be used with great care.
Have a look at this as an alternative -- https://docs.openshift.com/container-platform/4.8/storage/persistent_storage/persistent-storage-local.html

Github Action Service Container from Dockerfile in same repo

I'm learning Github Actions and designing a workflow with a job that requires a Service Container.
The documentation states that configuration must specify "The Docker image to use as the service container to run the action. The value can be the Docker base image name or a public docker Hub or registry". All of the examples in the docs use publicly-available Docker images, however I want to create a Service Container from a Dockerfile contained within my repo.
Is it possible to use a local Dockerfile to create a Service Container?
Because the job depends on a Service Container, that image must exist when the job begins, and therefore the image cannot be created by an earlier step in the same job. The image could be built in a separate job, but because jobs execute in separate runners I believe that Job 2 will not have access to the image created in Job 1. If this is true then could I follow this approach, using upload/download-artifact so provide Job 1's image to Job 2?
If all else fails, I could have Job 1 create the image and upload it to Docker Hub, then have Job 2 download it from Docker Hub, but surely there is a better way.
The GitHub Actions host machine (runner) is a fully loaded Linux machine, with everything everybody needs already installed.
You can easily launch multiple containers - either your own images, or public images - by simply running docker and docker-compose commands.
My advice to you is: Describe your service(s) in a docker-compose.yml file, and in one of your GitHub Actions steps, simply do docker-compose up -d.
You can create a docker image with the Dockerfile or docker-compose.yml residing inside the repo. Refer to this public gist, it might be helpful.
Instead of building multiple docker-images, you can use docker-compose. Docker-compose is the preferred way to deal with this kind of scenario.

Openshift - API to get ARTIFACT_URL parameter of a pod or the version of its deployed app

What I want to do is to make a web app that lists in one single view the version of every application deployed in our Openshift (a fast view of versions). At this moment, the only way I have seen to locate the version of an app deployed in a pod is the ARTIFACT_URL parameter in the envirorment view, that's why I ask for that parameter, but if there's another way to get a pod and the version of its current app deployed, I'm also open to that option as long as I can get it through an API. Maybe I'd eventually also need an endpoint that retrieves the list of the current pods.
I've looked into the Openshift API and the only thing I've found that may help me is this GET but if the parameter :id is what I think, it changes with every deploy, so I would need to be modifying it constantly and that's not practical. Obviously, I'd also need an endpoint to get the list of IDs or whatever that let me identify the pod when I ask for the ARTIFACT_URL
Thanks!
There is a way to do that. See https://docs.openshift.com/enterprise/3.0/dev_guide/environment_variables.html
List Environment Variables
To list environment variables in pods or pod templates:
$ oc env <object-selection> --list [<common-options>]
This example lists all environment variables for pod p1:
$ oc env pod/p1 --list
I suggest redesigning builds and deployments if you don't have persistent app versioning information outside of Openshift.
If app versions need to be obtained from running pods (e.g. with oc rsh or oc env as suggested elsewhere), then you have a serious reproducibility problem. Git should be used for app versioning, and all app builds and deployments, even in dev and test environments should be fully automated.
Within Openshift you can achieve full automation with Webhook Triggers in your Build Configs and Image Change Triggers in your Deployment Configs.
Outside of Openshift, this can be done at no extra cost using Jenkins (which can even be run in a container if you have persistent storage available to preserve its settings).
As a quick workaround you may also consider:
oc describe pods | grep ARTIFACT_URL
to get the list of values of your environment variable (here: ARTIFACT_URL) from all pods.
The corresponding list of pod names can be obtained either simply using 'oc get pods' or a second call to oc describe:
oc describe pods | grep "Name: "
(notice the 8 spaces needed to filter out other Names:)

Container Optimized VM - How to force pulling from registry (and not using local image:tag)?

I am using Google Cloud Build to build containers run on Container Optimized OS VM's on several projects
A typical cloudbuild.yaml file looks like this:
steps:
- name: 'gcr.io/cloud-builders/docker'
args: [ "build",
"-t", "gcr.io/${PROJECT_ID}/core-app-${BRANCH_NAME}:latest",
"."]
- name: 'gcr.io/cloud-builders/docker'
args: ["push", "gcr.io/${PROJECT_ID}/core-app-${BRANCH_NAME}:latest"]
- name: 'gcr.io/cloud-builders/gcloud'
args: ["beta", "compute", "instances", "update-container", "core-app-${BRANCH_NAME}", "--container-image", "gcr.io/${PROJECT_ID}/core-app-${BRANCH_NAME}:latest", "--zone", "${_ZONE}"]
images:
- "gcr.io/${PROJECT_ID}/core-app-${BRANCH_NAME}:latest"
A trigger is defined with some branch condition
In essence, on a commit to a given branch, an image with tag latest is built and used to run a container of a given VM.
It worked great until a couple of weeks ago. And suddenly, on all projects, it stopped working well. Instead of pulling latest the VM keeps using the local one. The only workaround I found was to use a SHA as tag (gcr.io/${PROJECT_ID}/core-app-${BRANCH_NAME}:${SHORT_SHA}), but this results in several images accumulating on the VM and at some point, there is not enough space anymore and the deployment fails.
So, how can I force the container optimized VM to pull an image:tag when it has one with the same name on the local disk?
You can delete the old images before you pull in the new ones, by utilizing a command like
docker image prune -a -f
You would get a higher downtime while updating from one version to another, but if that is not really an issue for you then this should work just fine.

How to run base centos image in minishift?

I try to learn about Open Shift, how it works, how to run apps, build images etc.
To start with something, which I thought will be rather simple, I decided to run a pod with pure centos7 OS, based on this image. I installed locally minishift v1.11.0+4459917, I created a new project, and performed command:
oc new-app openshift/base-centos7 in this project. As a result I received the following message:
--> Found Docker image bb81a09 (11 months old) from Docker Hub for "openshift/base-centos7"
* An image stream will be created as "pon3:latest" that will track this image
* This image will be deployed in deployment config "pon3"
* The image does not expose any ports - if you want to load balance or send traffic to this component
you will need to create a service with 'expose dc/pon3 --port=[port]' later
* WARNING: Image "openshift/base-centos7" runs as the 'root' user which may not be permitted by your cluster administrator
--> Creating resources ...
imagestream "pon3" created
deploymentconfig "pon3" created
--> Success
Run 'oc status' to view your app.
As I can see in the warning this image runs as root, which is clearly not a good practice, but it may be worked around, as described here and here. I tried both approaches - I have created a new service account with anyuid scc, and I assigned anyuid scc to default sa. Unfortunately I'm still not able to run a pod based on this image. The result looks like this:
oc get pods
mycentos-1-deploy 1/1 Running 0 32s
mycentos-1-p1vh5 0/1 CrashLoopBackOff 1 30s
I try to troubleshoot this way:
oc logs -p mycentos-1-p1vh5
This image serves as the base image for all OpenShift v3 S2I builder images.
It provides all essential libraries and development tools needed to
successfully build and run an application.
To use this image as a base image, you need to have 's2i/bin' directory in the
same directory as your S2I image Dockerfile. This directory should contain S2I
scripts.
This base image also provides the default user you should use to run your
application. Your Dockerfile should include this instruction after you finish
installing software:
USER default
The default directory for installing your application sources is
'/opt/app-root/src' and the WORKDIR and HOME for the 'default' user is set
to this directory as well. In your S2I scripts, you don't have to use absolute
path, but rather rely on the relative path.
To learn more about S2I visit: https://github.com/openshift/source-to-image
Additionally I tried to troubleshoot with oc adm diagnostics but to be honest I didn't see anything relevant to this issue.
I'm clearly missing something here. Can someone give me a hint how this should be handled or how can I try to troubleshoot this? Is there a different way to run pure centos OS?
Thank you for any help.
You need the image you want to deploy using oc new-app to have an actual application in it. The openshift/base-centos7 image is a base image only on which other images are built and doesn't have an application in it.
If you just want to spin up a container and be presented with a shell environment in which you can play in use the oc run command instead.
OpenShift isn't like a traditional VPS where you just spin up permanent shell environments which you then access to set up your application manually. The idea is that you build your application into an image and deploy the application.
I would suggest you go read:
https://www.openshift.com/promotions/for-developers.html
https://www.openshift.com/promotions/devops-with-openshift.html
and work through the exercises at:
https://learn.openshift.com
to learn more about what OpenShift is and how to use it.