StatefulSet unable to roll back if the pods are not in Running state - OpenShift

I have deployed MongoDB as a StatefulSet with a rolling update strategy; the template is below. The deployment is successful and the pods are in the Running state.
- apiVersion: apps/v1beta1
  kind: StatefulSet
  metadata:
    name: mongo
  spec:
    serviceName: "mongo"
    podManagementPolicy: Parallel
    replicas: 3
    strategy:
      type: Rolling
    template:
      metadata:
        labels:
          role: mongo
          environment: test
      spec:
        terminationGracePeriodSeconds: 10
        containers:
        - name: mongo
          image: mongo:4.0
          imagePullPolicy: Always
          command:
          - mongod
          - "--replSet"
          - rs0
          - "--bind_ip"
          - 0.0.0.0
          - "--smallfiles"
          - "--noprealloc"
          ports:
          - containerPort: 27017
          volumeMounts:
          - name: mongo-persistent-storage
            mountPath: /data/db
        - name: mongo-sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
          - name: MONGO_SIDECAR_POD_LABELS
            value: "role=mongo,environment=test"
    updateStrategy:
      type: RollingUpdate
I am trying to update the mongo image using the following set command:
oc set image statefulset/mongo mongo=mongo:4.2 -n mongo-replica
After the update, the pods go into the "CrashLoopBackOff" state. I expected them to be rolled back automatically to the previous running version.
Instead, the pods are stuck in "CrashLoopBackOff". I want the pods to be rolled back to the previous running version. Any suggestions would be appreciated.

Unfortunately, a StatefulSet has no automatic rollback, but you can protect your service with probes: with well-configured liveness and readiness probes, the rolling update moves on to the next pod only after the updated pod reports ready.
That way only one of your 3 replicas crashes when an update fails, and you can either fix the problem or roll back your change manually without interrupting the service.
The Kubernetes documentation has more detail:
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#forced-rollback
For a good explanation of probes, see:
https://www.openshift.com/blog/liveness-and-readiness-probes
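Since nothing rolls a StatefulSet back for you, the manual rollback comes down to the two steps the linked "forced rollback" section describes: revert the pod template, then delete the pods that were already started from the bad revision. Here is a minimal Python sketch that only builds the corresponding `oc` command lines and does not run them. The helper name is hypothetical, and the pod name `mongo-2` is an assumption based on the usual `<statefulset>-<ordinal>` naming.

```python
from typing import List


def manual_rollback_commands(
    statefulset: str,
    container: str,
    previous_image: str,
    namespace: str,
    broken_pods: List[str],
) -> List[List[str]]:
    """Build (but do not execute) the commands for a manual StatefulSet rollback."""
    commands = [
        # Step 1: point the pod template back at the last known-good image.
        ["oc", "set", "image", f"statefulset/{statefulset}",
         f"{container}={previous_image}", "-n", namespace],
    ]
    # Step 2: delete the pods that were created from the broken revision so
    # the controller recreates them from the reverted template.
    for pod in broken_pods:
        commands.append(["oc", "delete", "pod", pod, "-n", namespace])
    return commands


if __name__ == "__main__":
    # Values taken from the question; "mongo-2" is an assumed pod name.
    for argv in manual_rollback_commands(
        "mongo", "mongo", "mongo:4.0", "mongo-replica", ["mongo-2"]
    ):
        print(" ".join(argv))
```

Check which pods are actually in CrashLoopBackOff with `oc get pods` before deleting anything; the list passed in here is just an example.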

Related

GKE Kubernetes MySQL Input/output error Ext4Error

I have deployed a MySQL database (statefulset) on Kubernetes zonal cluster, running as a service (GKE) in Google Cloud Platform.
The zonal cluster consists of 3 instances of type e2-medium.
The MySQL container cannot start due to the following error.
kubectl logs mysql-statefulset-0
2022-02-07 05:55:38+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.35-1debian10 started.
find: '/var/lib/mysql/': Input/output error
Last seen events.
4m57s Warning Ext4Error gke-cluster-default-pool-rnfh kernel-monitor, gke-cluster-default-pool-rnfh EXT4-fs error (device sdb): __ext4_find_entry:1532: inode #2: comm mysqld: reading directory lblock 0 40d 8062 gke-cluster-default-pool-rnfh
3m22s Warning BackOff pod/mysql-statefulset-0 spec.containers{mysql} kubelet, gke-cluster-default-pool-rnfh Back-off restarting failed container
Nodes.
kubectl get node -owide
gke-cluster-default-pool-ayqo Ready <none> 54d v1.21.5-gke.1302 So.Me.I.P So.Me.I.P Container-Optimized OS from Google 5.4.144+ containerd://1.4.8
gke-cluster-default-pool-rnfh Ready <none> 54d v1.21.5-gke.1302 So.Me.I.P So.Me.I.P Container-Optimized OS from Google 5.4.144+ containerd://1.4.8
gke-cluster-default-pool-sc3p Ready <none> 54d v1.21.5-gke.1302 So.Me.I.P So.Me.I.P Container-Optimized OS from Google 5.4.144+ containerd://1.4.8
I also noticed that rnfh node is out of memory.
kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
gke-cluster-default-pool-ayqo 117m 12% 992Mi 35%
gke-cluster-default-pool-rnfh 180m 19% 2953Mi 104%
gke-cluster-default-pool-sc3p 179m 19% 1488Mi 52%
MySQL manifest:
# HEADLESS SERVICE
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless-service
  labels:
    kind: mysql-headless-service
spec:
  clusterIP: None
  selector:
    tier: mysql-db
  ports:
    - name: 'mysql-http'
      protocol: 'TCP'
      port: 3306
---
# STATEFUL SET
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-statefulset
spec:
  selector:
    matchLabels:
      tier: mysql-db
  serviceName: mysql-statefulset
  replicas: 1
  template:
    metadata:
      labels:
        tier: mysql-db
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: my-mysql
          image: my-mysql:latest
          imagePullPolicy: Always
          args:
            - "--ignore-db-dir=lost+found"
          ports:
            - name: 'http'
              protocol: 'TCP'
              containerPort: 3306
          volumeMounts:
            - name: mysql-pvc
              mountPath: /var/lib/mysql
          env:
            - name: MYSQL_ROOT_USER
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: mysql-root-username
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: mysql-root-password
            - name: MYSQL_USER
              valueFrom:
                configMapKeyRef:
                  name: mysql-config
                  key: mysql-username
            - name: MYSQL_PASSWORD
              valueFrom:
                configMapKeyRef:
                  name: mysql-config
                  key: mysql-password
            - name: MYSQL_DATABASE
              valueFrom:
                configMapKeyRef:
                  name: mysql-config
                  key: mysql-database
  volumeClaimTemplates:
    - metadata:
        name: mysql-pvc
      spec:
        storageClassName: 'mysql-fast'
        resources:
          requests:
            storage: 120Gi
        accessModes:
          - ReadWriteOnce
          - ReadOnlyMany
MySQL storage class manifest:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mysql-fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate
Why is Kubernetes trying to schedule the pod on an out-of-memory node?
UPDATES
I've added requests and limits to the MySQL manifest to improve the QoS class. The QoS class is now Guaranteed.
Unfortunately, Kubernetes is still trying to schedule the pod on the out-of-memory rnfh node.
kubectl describe po mysql-statefulset-0 | grep node -i
Node: gke-cluster-default-pool-rnfh/So.Me.I.P
kubectl describe po mysql-statefulset-0 | grep qos -i
QoS Class: Guaranteed
I ran a few more tests but couldn't replicate this.
Answering this correctly would require many more logs, and I'm not sure you still have them. If I had to guess the root cause, I would say it was connected with the PersistentVolume.
In the GitHub issue "Volume was remounted as read only after error #752" I found behavior very similar to the OP's.
You created a dedicated StorageClass for MySQL and set reclaimPolicy: Retain, so the PV was not removed. When the StatefulSet pod (with the same suffix -0) was recreated (restarted because of a connectivity error or some issue in the DB, hard to say), it tried to re-claim this volume. The user in that GitHub issue had a very similar situation: they also got the inode #262147: comm mysqld: reading directory lblock error, but further down there was also the entry [ +0.003695] EXT4-fs (sda): Remounting filesystem read-only. Maybe the permissions changed when it was remounted?
Another thing is that your volumeClaimTemplates contained
accessModes:
  - ReadWriteOnce
  - ReadOnlyMany
So one PersistentVolume could be mounted as ReadWriteOnce by one node, or as ReadOnlyMany by many nodes. It is possible that the pod was recreated on a different node with the read-only access mode.
[ +35.912075] EXT4-fs warning (device sda): htree_dirblock_to_tree:977: inode #2: lblock 0: comm mysqld: error -5 reading directory block
[ +6.294232] EXT4-fs error (device sda): ext4_find_entry:1436: inode #262147: comm mysqld: reading directory lblock ...
[ +0.005226] EXT4-fs error (device sda): ext4_find_entry:1436: inode #2: comm mysqld: reading directory lblock 0
[ +1.666039] EXT4-fs error (device sda): ext4_journal_check_start:61: Detected aborted journal
[ +0.003695] EXT4-fs (sda): Remounting filesystem read-only
This would fit the OP's comment:
Two days ago for reasons unknown to me Kubernetes restarted the container and was keep trying to run it on rnfa machine. The container was probably evicted from another node.
Another possibility is that the node or cluster was upgraded (if the auto-upgrade option was turned on), which could force a restart of the pod.
The '/var/lib/mysql/': Input/output error might also point to database corruption, as mentioned here.
In the end, the issue was resolved by cordoning the affected node. Additional information about the difference between cordon and drain can be found here.
As an aside, to assign pods to a specific node, or to nodes with a specific label, you can use affinity.
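On the original question of why the pod kept landing on a node that `kubectl top` shows at 104% memory: the scheduler's resource check compares the sum of pod memory requests with the node's allocatable memory and does not look at live usage. That may or may not be what happened here, but it explains how a node can look full in `kubectl top` and still be a valid scheduling target. A rough Python sketch of the comparison follows; every number in it is hypothetical and none of them come from the cluster above.

```python
from typing import Iterable


def fits_by_requests(
    allocatable_mib: int,
    existing_requests_mib: Iterable[int],
    new_request_mib: int,
) -> bool:
    """Return True if the new pod's memory request fits into what is still unrequested.

    Live usage (what `kubectl top node` reports) is deliberately not an input.
    """
    requested = sum(existing_requests_mib)
    return requested + new_request_mib <= allocatable_mib


if __name__ == "__main__":
    # Hypothetical node: 2800 MiB allocatable, pods requesting 1200 MiB in total,
    # while actual usage might be far higher because some pods exceed their requests.
    print(fits_by_requests(2800, [400, 800], 1024))   # request-based check passes
    print(fits_by_requests(2800, [400, 800], 2048))   # request-based check fails
```

Setting realistic requests on the other workloads on the node is what keeps the scheduler's view close to real usage.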

Can't Share a Persistent Volume Claim for an EBS Volume between Apps

Is it possible to share a single persistent volume claim (PVC) between two apps (each using a pod)?
I read: Share persistent volume claims amongst containers in Kubernetes/OpenShift but didn't quite get the answer.
I added a PHP app and a MySQL app (with persistent storage) to the same project. I deleted the original persistent volume (PV) and created a new one with the ReadWriteMany access mode. I set the root password of the MySQL database, and the database works.
Then I added storage to the PHP app using the same persistent volume claim with a different subpath. I found that I can't turn on both apps: after I turn one on, the next one gets stuck at creating container when I try to turn it on.
MySQL .yaml of the deployment step at openshift:
...
template:
  metadata:
    creationTimestamp: null
    labels:
      name: mysql
  spec:
    volumes:
      - name: mysql-data
        persistentVolumeClaim:
          claimName: mysql
    containers:
      - name: mysql
        ...
        volumeMounts:
          - name: mysql-data
            mountPath: /var/lib/mysql/data
            subPath: mysql/data
        ...
        terminationMessagePath: /dev/termination-log
        imagePullPolicy: IfNotPresent
    restartPolicy: Always
    terminationGracePeriodSeconds: 30
    dnsPolicy: ClusterFirst
PHP .yaml from deployment step:
template:
  metadata:
    creationTimestamp: null
    labels:
      app: wiki2
      deploymentconfig: wiki2
  spec:
    volumes:
      - name: volume-959bo <<----
        persistentVolumeClaim:
          claimName: mysql
    containers:
      - name: wiki2
        ...
        volumeMounts:
          - name: volume-959bo
            mountPath: /opt/app-root/src/w/images
            subPath: wiki/images
        terminationMessagePath: /dev/termination-log
        imagePullPolicy: Always
    restartPolicy: Always
    terminationGracePeriodSeconds: 30
    dnsPolicy: ClusterFirst
    securityContext: {}
The volume mount names are different, but that shouldn't prevent the two pods from sharing the PVC. Or is the problem that they can't both mount the same volume at the same time? I can't get the termination log at /dev because, if the volume can't be mounted, the pod doesn't start and there is no log.
The PVC's .yaml (oc get pvc -o yaml)
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
      volume.beta.kubernetes.io/storage-class: ebs
      volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
    creationTimestamp: YYYY-MM-DDTHH:MM:SSZ
    name: mysql
    namespace: abcdefghi
    resourceVersion: "123456789"
    selfLink: /api/v1/namespaces/abcdefghi/persistentvolumeclaims/mysql
    uid: ________-____-____-____-____________
  spec:
    accessModes:
    - ReadWriteMany
    resources:
      requests:
        storage: 1Gi
    volumeName: pvc-________-____-____-____-____________
  status:
    accessModes:
    - ReadWriteMany
    capacity:
      storage: 1Gi
    phase: Bound
kind: List
metadata: {}
resourceVersion: ""
selfLink: ""
Suspicious Entries from oc get events
Warning FailedMount {controller-manager }
Failed to attach volume "pvc-________-____-____-____-____________"
on node "ip-172-__-__-___.xx-xxxx-x.compute.internal"
with:
Error attaching EBS volume "vol-000a00a00000000a0" to instance
"i-1111b1b11b1111111": VolumeInUse: vol-000a00a00000000a0 is
already attached to an instance
Warning FailedMount {kubelet ip-172-__-__-___.xx-xxxx-x.compute.internal}
Unable to mount volumes for pod "the pod for php app":
timeout expired waiting for volumes to attach/mount for pod "the pod".
list of unattached/unmounted volumes=
[volume-959bo default-token-xxxxx]
I tried to:
1. Turn on the MySQL app first, and then try to turn on the PHP app.
   Found that the PHP app can't start.
2. Turn off both apps.
3. Turn on the PHP app first, and then try to turn on the MySQL app.
   Found that the MySQL app can't start.
The strange thing is that the event log never says it can't mount the volume for the MySQL app.
The volume still waiting to be mounted is either default-token-xxxxx or volume-959bo (the volume name in the PHP app), but never mysql-data (the volume name in the MySQL app).
So the error seems to be caused by the underlying storage you are using, in this case EBS. The OpenShift docs specifically state that this is the case for block storage, see here.
I know this works with both NFS and GlusterFS storage, and I have done it in numerous projects using those storage types, but unfortunately it's not supported in your case.
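To make the storage point concrete: the access mode written on a PVC does not change what the backing volume type can do. The sketch below is a small Python lookup of the access modes the Kubernetes documentation lists for the volume types mentioned in this thread. The dictionary keys and the helper name are my own labels for illustration, not an API.

```python
from typing import Dict, Set

# Access modes per volume type, as listed in the Kubernetes
# persistent-volumes documentation (only the types discussed here).
SUPPORTED_ACCESS_MODES: Dict[str, Set[str]] = {
    "awsElasticBlockStore": {"ReadWriteOnce"},
    "gcePersistentDisk": {"ReadWriteOnce", "ReadOnlyMany"},
    "nfs": {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany"},
    "glusterfs": {"ReadWriteOnce", "ReadOnlyMany", "ReadWriteMany"},
}


def can_share_read_write(volume_type: str) -> bool:
    """Return True if pods on different nodes can mount the volume read-write."""
    return "ReadWriteMany" in SUPPORTED_ACCESS_MODES.get(volume_type, set())


if __name__ == "__main__":
    for name in ("awsElasticBlockStore", "nfs", "glusterfs"):
        print(name, can_share_read_write(name))
```

Note that ReadWriteOnce is a per-node restriction, so two pods scheduled onto the same node can still mount the same EBS-backed claim; the VolumeInUse event in the question is what you would expect when the second pod lands on a different node.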

Kubernetes + MySQL : Creating custom database and user in a Kubernetes container

I am trying to create a Django + MySQL app using Google Container Engine and Kubernetes. Following the docs from official MySQL docker image and Kubernetes docs for creating MySQL container I have created the following replication controller
apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    name: mysql
  name: mysql
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: mysql
    spec:
      containers:
        - image: mysql:5.6.33
          name: mysql
          env:
            # Root password is compulsory
            - name: "MYSQL_ROOT_PASSWORD"
              value: "root_password"
            - name: "MYSQL_DATABASE"
              value: "custom_db"
            - name: "MYSQL_USER"
              value: "custom_user"
            - name: "MYSQL_PASSWORD"
              value: "custom_password"
          ports:
            - name: mysql
              containerPort: 3306
          volumeMounts:
            # This name must match the volumes.name below.
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          gcePersistentDisk:
            # This disk must already exist.
            pdName: mysql-disk
            fsType: ext4
According to the docs, when the environment variables MYSQL_DATABASE, MYSQL_USER, and MYSQL_PASSWORD are passed, a new user is created with that password and granted rights to the newly created database. But this does not happen. When I SSH into the container, the root password is set, but neither the user nor the database is created.
I have tested this locally by passing the same environment variables like this:
docker run -d --name some-mysql \
  -e MYSQL_USER="custom_user" \
  -e MYSQL_DATABASE="custom_db" \
  -e MYSQL_ROOT_PASSWORD="root_password" \
  -e MYSQL_PASSWORD="custom_password" \
  mysql
When I SSH into that container, the database and user are created and everything works fine.
I am not sure what I am doing wrong here. Could anyone please point out my mistake? I have been at this the whole day.
EDIT: 20-sept-2016
As Requested
#Julien Du Bois
The disk is created. It appears in the Cloud Console, and when I run the describe command I get the following output:
Command : gcloud compute disks describe mysql-disk
Result:
creationTimestamp: '2016-09-16T01:06:23.380-07:00'
id: '4673615691045542160'
kind: compute#disk
lastAttachTimestamp: '2016-09-19T06:11:23.297-07:00'
lastDetachTimestamp: '2016-09-19T05:48:14.320-07:00'
name: mysql-disk
selfLink: https://www.googleapis.com/compute/v1/projects/<details-withheld-by-me>/disks/mysql-disk
sizeGb: '20'
status: READY
type: https://www.googleapis.com/compute/v1/projects/<details-withheld-by-me>/diskTypes/pd-standard
users:
- https://www.googleapis.com/compute/v1/projects/<details-withheld-by-me>/instances/gke-cluster-1-default-pool-e0f09576-zvh5
zone: https://www.googleapis.com/compute/v1/projects/<details-withheld-by-me>
I referred to a lot of tutorials and Google Cloud examples. To run the MySQL Docker container locally, my main reference was the official image page on Docker Hub:
https://hub.docker.com/_/mysql/
This works for me, and locally the created container has a new database and a user with the right privileges.
For Kubernetes, my main reference was the following:
https://cloud.google.com/container-engine/docs/tutorials/persistent-disk/
I am just trying to connect to it from a Django container.
I was facing the same issue when I was using volumes and mounting them into MySQL pods.
As mentioned in the documentation of the MySQL Docker image:
When you start the mysql image, you can adjust the configuration of the MySQL instance by passing one or more environment variables on the docker run command line. Do note that none of the variables below will have any effect if you start the container with a data directory that already contains a database: any pre-existing database will always be left untouched on container startup.
So after spinning my wheels for a while, I managed to solve the problem by changing the hostPath of the volume I was creating from "/data/mysql-pv-volume" to "/var/lib/mysql".
Here is a snippet that might help with creating the volumes:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv-volume
  labels:
    type: local
spec:
  persistentVolumeReclaimPolicy: Delete  # for development purposes only
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/var/lib/mysql"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Hope that helped.
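The documentation quote above is the key point: the MYSQL_* variables only matter on first initialization. The Python sketch below shows that decision, on the assumption that an existing database is detected by a `mysql` subdirectory inside the data directory. That detail is my reading of how the image's entrypoint behaves, so treat it as an assumption rather than the image's documented contract.

```python
import os
import tempfile


def will_initialize(datadir: str) -> bool:
    """Return True if the data directory holds no existing MySQL system database.

    Only in that case would MYSQL_DATABASE / MYSQL_USER / MYSQL_PASSWORD be applied.
    Assumption: an existing database shows up as a "mysql" subdirectory.
    """
    return not os.path.isdir(os.path.join(datadir, "mysql"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as datadir:
        print(will_initialize(datadir))  # fresh volume: variables would be applied
        os.mkdir(os.path.join(datadir, "mysql"))
        print(will_initialize(datadir))  # reused volume: variables are ignored
```

For the question above, this suggests checking whether the persistent disk already held data from an earlier run started without those variables.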
You set mysql-disk in your deployment and the disk you have is custom-disk. Change pdName to custom-disk and it will work.

How to setup error reporting in Stackdriver from kubernetes pods?

I'm a bit confused about how to set up error reporting in Kubernetes so that errors are visible in Google Cloud Console / Stackdriver "Error Reporting".
According to the documentation
https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine
we need to enable fluentd's "forward input plugin" and then send exception data from our apps. I think this approach would have worked if we had set up fluentd ourselves, but it's already pre-installed on every node, in a pod that just runs the gcr.io/google_containers/fluentd-gcp Docker image.
How do we enable forward input on those pods and make sure that the HTTP port is available to every pod on the nodes? We also need to make sure this config is used by default when we add more nodes to our cluster.
Any help would be appreciated. Maybe I'm looking at all this from the wrong angle?
The basic idea is to start a separate pod that receives structured logs over TCP and forwards them to Cloud Logging, similar to a locally running fluentd agent. See below for the steps I used.
(Unfortunately, the logging support that is built into Docker and Kubernetes cannot be used - it just forwards individual lines of text from stdout/stderr as separate log entries which prevents Error Reporting from seeing complete stack traces.)
Create a docker image for a fluentd forwarder using a Dockerfile as follows:
FROM gcr.io/google_containers/fluentd-gcp:1.18
COPY fluentd-forwarder.conf /etc/google-fluentd/google-fluentd.conf
Where fluentd-forwarder.conf contains the following:
<source>
  type forward
  port 24224
</source>
<match **>
  type google_cloud
  buffer_chunk_limit 2M
  buffer_queue_limit 24
  flush_interval 5s
  max_retry_wait 30
  disable_retry_limit
</match>
Then build and push the image:
$ docker build -t gcr.io/###your project id###/fluentd-forwarder:v1 .
$ gcloud docker push gcr.io/###your project id###/fluentd-forwarder:v1
You need a replication controller (fluentd-forwarder-controller.yaml):
apiVersion: v1
kind: ReplicationController
metadata:
  name: fluentd-forwarder
spec:
  replicas: 1
  template:
    metadata:
      name: fluentd-forwarder
      labels:
        app: fluentd-forwarder
    spec:
      containers:
      - name: fluentd-forwarder
        image: gcr.io/###your project id###/fluentd-forwarder:v1
        env:
        - name: FLUENTD_ARGS
          value: -qq
        ports:
        - containerPort: 24224
You also need a service (fluentd-forwarder-service.yaml):
apiVersion: v1
kind: Service
metadata:
  name: fluentd-forwarder
spec:
  selector:
    app: fluentd-forwarder
  ports:
  - protocol: TCP
    port: 24224
Then create the replication controller and service:
$ kubectl create -f fluentd-forwarder-controller.yaml
$ kubectl create -f fluentd-forwarder-service.yaml
Finally, in your application, instead of using 'localhost' and 24224 to connect to the fluentd agent as described on https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine, use the values of the environment variables FLUENTD_FORWARDER_SERVICE_HOST and FLUENTD_FORWARDER_SERVICE_PORT.
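As a small illustration of that last step, here is a Python sketch of how an application might resolve the forwarder address from those two variables. The fallback to localhost:24224 when the variables are missing is my own assumption (handy for running outside the cluster), and the function name is hypothetical.

```python
import os
from typing import Mapping, Tuple


def fluentd_forwarder_address(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Resolve the fluentd forwarder host and port from the service environment variables.

    Falls back to localhost:24224 (an assumption, for local runs) when they are unset.
    """
    host = env.get("FLUENTD_FORWARDER_SERVICE_HOST", "localhost")
    port = int(env.get("FLUENTD_FORWARDER_SERVICE_PORT", "24224"))
    return host, port


if __name__ == "__main__":
    host, port = fluentd_forwarder_address()
    print(f"sending structured logs to {host}:{port}")
```

Keep in mind that Kubernetes only injects these service variables into pods created after the service exists, so create the service before (re)starting the application pods.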
To add to Boris' answer: As long as errors are logged in the right format (see https://cloud.google.com/error-reporting/docs/troubleshooting) and Cloud Logging is enabled (you can see the errors in https://console.cloud.google.com/logs/viewer) then errors will make it to Error Reporting without any further setup.
Boris' answer was great but a lot more complicated than it needed to be (there is no need to build a Docker image). If you have kubectl configured on your local box (or you can use the Google Cloud Shell), copy and paste the following and it will install the forwarder in your cluster (I updated the version of fluentd-gcp from the above answer). My solution uses a ConfigMap to store the file so it can be changed easily without rebuilding.
cat << EOF | kubectl create -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-forwarder
data:
  google-fluentd.conf: |+
    <source>
      type forward
      port 24224
    </source>
    <match **>
      type google_cloud
      buffer_chunk_limit 2M
      buffer_queue_limit 24
      flush_interval 5s
      max_retry_wait 30
      disable_retry_limit
    </match>
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: fluentd-forwarder
spec:
  replicas: 1
  template:
    metadata:
      name: fluentd-forwarder
      labels:
        app: fluentd-forwarder
    spec:
      containers:
      - name: fluentd-forwarder
        image: gcr.io/google_containers/fluentd-gcp:2.0.18
        env:
        - name: FLUENTD_ARGS
          value: -qq
        ports:
        - containerPort: 24224
        volumeMounts:
        - name: config-vol
          mountPath: /etc/google-fluentd
      volumes:
      - name: config-vol
        configMap:
          name: fluentd-forwarder
---
apiVersion: v1
kind: Service
metadata:
  name: fluentd-forwarder
spec:
  selector:
    app: fluentd-forwarder
  ports:
  - protocol: TCP
    port: 24224
EOF

container is in waiting state, kubernetes, docker container

This is my .yaml content:
apiVersion: v1
kind: Pod
metadata:
  name: mysql
  labels:
    name: mysql
spec:
  containers:
    - resources:
        limits:
          cpu: 0.5
      image: imagelingga
      name: imagelingga
      ports:
        - containerPort: 80
          name: imagelingga
    - resources:
        limits:
          cpu: 0.5
      image: mysql
      name: mysql
      env:
        - name: MYSQL_ROOT_PASSWORD
          # change this
          value: pass
      ports:
        - containerPort: 3306
          name: mysql
      volumeMounts:
        - name: mysqlkuber
          mountPath: /var/lib/mysql
          readOnly: false
  volumes:
    - name: mysqlkuber
      hostPath:
        path: /home/mysqlkuber
I have two images:
- mysql
- imagelingga = microservice server for Java
The mysql logs show that it is already running,
but the imagelingga logs show: Pod "mysql" in namespace "default": container "imagelingga" is in waiting state.
The connection between the two images is that imagelingga needs a connection to mysql as its DB.
I have already run both images as Docker containers without Kubernetes and they run normally, but when I run them inside Kubernetes this problem appears.
How do I trigger the imagelingga container to start the service?
Thanks in advance!
The container is in the waiting state because it crashes or fails when the image is run.
Kubernetes then restarts the container, and it shows as waiting while the restart is in progress.
To check the pod status:
kubectl get pods
If the status is "CrashLoopBackOff", Kubernetes is repeatedly restarting the container.
To check the logs of a container inside the pod:
kubectl logs [pod] [container]
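As background on why the container sits in the waiting state for so long: Kubernetes documents the restart delay as an exponential back-off (10s, 20s, 40s, ...) capped at five minutes. A tiny Python sketch of that schedule follows; the function name and parameters are mine and only illustrate the documented behaviour.

```python
from typing import List


def crash_loop_delays(restarts: int, base: int = 10, cap: int = 300) -> List[int]:
    """Back-off delay in seconds before each of the first `restarts` restarts.

    The delay doubles every time and is capped at `cap` seconds.
    """
    return [min(base * 2 ** i, cap) for i in range(restarts)]


if __name__ == "__main__":
    print(crash_loop_delays(7))
```

After a handful of crashes the container waits the full five minutes between attempts, so reading the logs of the failed run is usually quicker than waiting for the next restart.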