How to trace restarts of a MySQL container?

I have MySQL running in k8s with a single-replica ReplicaSet, and it keeps crashing at random times with exit code 137. Memory consumption is at 82% when it crashes.
I couldn't find anything in syslog, the MySQL error log, or the kubelet log other than the restart message.
The instance already has 64 GB, and right after a restart it can handle the application requests again, so increasing memory should not be the actual solution.
Also, the monitoring tools say only 82% of the memory is in use at the time of the crash.
How does Kubernetes calculate the memory consumption of a pod?
How can I find out why it is crashing?

You can use kubectl logs your-pod -c container-name -n your-namespace to see the container's log, and kubectl describe pod your-pod -n your-namespace to see pod events.
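Exit code 137 is 128 + 9, i.e. the container received SIGKILL, which in practice usually means the kernel OOM killer. Note that the OOM killer fires on the container's cgroup memory limit, not on total node memory, so a node showing only 82% usage does not rule it out. A few checks that usually narrow this down (pod, container, and namespace names are placeholders):

# logs of the previous, crashed container instance
kubectl logs your-pod -c container-name -n your-namespace --previous
# the "Last State" block shows the termination reason, e.g. OOMKilled
kubectl describe pod your-pod -n your-namespace
# the same reason in machine-readable form
kubectl get pod your-pod -n your-namespace -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# current working-set memory usage (requires metrics-server)
kubectl top pod your-pod -n your-namespace

If the reason is OOMKilled, compare the pod's memory limit against MySQL's innodb_buffer_pool_size plus per-connection buffers, rather than against the node's 64 GB.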

Related

How to track disk usage on Container-Optimized OS

I have an application running on a Container-Optimized OS based Compute Engine instance.
My application runs every 20 minutes, fetches and writes data to a local file, then deletes the file after some processing. Note that each file is less than 100 KB.
My boot disk size is the default 10 GB.
I run into a "no space left on device" error every month or so while attempting to write the file locally.
How can I track disk usage?
I manually checked the size of the folders and it seems that the bulk of the space is taken by /mnt/stateful_partition/var/lib/docker/overlay2.
my-vm / # sudo du -sh /mnt/stateful_partition/var/lib/docker/*
20K /mnt/stateful_partition/var/lib/docker/builder
72K /mnt/stateful_partition/var/lib/docker/buildkit
208K /mnt/stateful_partition/var/lib/docker/containers
4.4M /mnt/stateful_partition/var/lib/docker/image
52K /mnt/stateful_partition/var/lib/docker/network
1.6G /mnt/stateful_partition/var/lib/docker/overlay2
20K /mnt/stateful_partition/var/lib/docker/plugins
4.0K /mnt/stateful_partition/var/lib/docker/runtimes
4.0K /mnt/stateful_partition/var/lib/docker/swarm
4.0K /mnt/stateful_partition/var/lib/docker/tmp
4.0K /mnt/stateful_partition/var/lib/docker/trust
28K /mnt/stateful_partition/var/lib/docker/volumes
TL;DR: Use Stackdriver Monitoring and create an alert for disk usage.
Since you are using COS images, you can enable the Stackdriver Monitoring agent simply by setting the "google-monitoring-enabled" key to "true" in the GCE instance metadata. To do so, run the command:
gcloud compute instances add-metadata instance-name --metadata=google-monitoring-enabled=true
Replace instance-name with the name of your instance. Remember to restart your instance for the change to take effect. You don't need to install the Stackdriver Monitoring agent, since it is already installed by default in COS images.
Then, you can use the disk usage metric to track the usage of your disk.
You can create an alert to get a notification each time the usage of the partition reaches a certain threshold.
Since you are in the cloud, it is usually best to use cloud resources to solve cloud issues.
Docker uses /var/lib/docker to store your images, containers, and local named volumes. Deleting this can result in data loss and possibly stop the engine from running. The overlay2 subdirectory specifically contains the various filesystem layers for images and containers.
To clean up unused containers and images, run:
docker system prune
You can monitor it with the watch command:
sudo watch "du -sh /mnt/stateful_partition/var/lib/docker/*"
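If the growth comes from accumulated stopped containers and dangling images, pruning on a schedule is a common fix. A minimal sketch, assuming that removing anything unused for the last 72 hours is acceptable in your environment:

# -a also removes images not referenced by any container,
# -f skips the confirmation prompt,
# --filter until=72h keeps anything created within the last three days
docker system prune -af --filter "until=72h"

If you want to run this on a schedule on COS, a systemd timer is one option, since the image is systemd-based.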

Trying to create two MySQL pods in kubernetes with same volume for high availability

I am trying to deploy two MySQL pods with the same PVC, but I get a CrashLoopBackOff state when I create the second pod, with this error in the logs: "InnoDB: check that you do not already have another mysqld process using the same InnoDB log files". How can I resolve this error?
There are different ways to achieve high availability. If you are running Kubernetes on infrastructure that can provision the volume to different nodes (e.g. in the cloud) and your pod/node crashes, Kubernetes will restart the database on a different node with the same volume. Aside from a short downtime, you will have the database back up and running relatively quickly.
The volume will be mounted to a single running MySQL pod to prevent data corruption from concurrent access. (This is what MySQL notices in your scenario as well, since it is not designed to use shared storage as an HA solution.)
If you need more, you can use MySQL's built-in replication to create a MySQL 'cluster' that keeps working even if one node/pod fails. Each instance of the MySQL cluster has its own volume in that case. See the Kubernetes StatefulSet example for this scenario: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
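The key piece of that example is the StatefulSet's volumeClaimTemplates, which gives every replica its own PersistentVolumeClaim instead of sharing one PVC. A trimmed-down illustration (names, image, and sizes are placeholders, not the tutorial's exact manifest):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "change-me"   # placeholder; use a Secret in practice
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  # each replica gets its own PVC: data-mysql-0, data-mysql-1, ...
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

With this layout, mysql-0 and mysql-1 never touch each other's InnoDB files, and replication keeps them in sync.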

gcloud compute: issue command and close terminal

Since it takes time to create a snapshot of a Google Compute Engine instance, I wonder whether it is possible to issue the gcloud compute disks snapshot command on my local machine and then close the terminal without interrupting the snapshot creation process?
From the documentation for gcloud compute disks snapshot:
FLAGS
--async
Display information about the operation in progress, without waiting for the operation to complete.
You can run gcloud compute disks snapshot --async and note the operation ID, then run gcloud compute operations describe <OPERATION ID> to check on the operation (you may also have to provide the zone of the operation, which should be the same as the zone of the disk).
Even if you don't use the --async flag, the operation is running asynchronously in the background (gcloud is just staying open until it finishes). If you close the terminal, the snapshot will finish. You'd just need to do some digging to find the operation ID if you're interested in following up on its status.
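A concrete sketch (disk, zone, and snapshot names are placeholders):

# start the snapshot and return immediately; gcloud prints the operation name
gcloud compute disks snapshot my-disk --zone=us-central1-a \
    --snapshot-names=my-snapshot --async
# later, check on the operation (same zone as the disk)
gcloud compute operations describe <OPERATION ID> --zone=us-central1-a
# or list recent zonal operations to find the ID again
gcloud compute operations list --zones=us-central1-a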

mysql in docker container hangs

Two MySQL (5.6.20) instances in two Docker containers (1.8.32),
master and slave, set up with semi-synchronous replication between them,
and users continuously run DML and DDL operations on the master.
After ten days or more, all the clients connected to the slave hang:
gdb -p / strace on the slave mysqld process hangs
pstack / perf top -p on the slave mysqld process shows nothing
kill -9 will not kill the mysqld process
docker stop will not stop the Docker container
What tools or methods can help locate the problem?
I had the same thing occur today. In my case, I was using Docker Compose to bring up MySQL and a range of consumers, using the current "latest" MySQL image from Docker Hub (5.7.16-1debian8).
I've launched a number of these, and within a week I've seen a couple of instances where MySQL has well over 100 threads, all the memory on the host is consumed, and the containers are hung. I can't stop anything; I can't even reboot. Only a power cycle of the VM recovers it.
I'll try to monitor it. I suspect it depends highly on infrastructure load (a slow VM host results in slow queries backing up). The fix probably involves both MySQL tuning and a Docker bug.
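If it happens again, a few host-side checks can help tell a MySQL-level stall from a kernel or storage-level one. A sketch, run as root on the Docker host and assuming a single mysqld process:

# threads stuck in uninterruptible sleep (D state) ignore signals, even kill -9
ps -eo pid,stat,wchan:30,cmd | awk 'NR==1 || $2 ~ /^D/'
# kernel hung-task warnings often accompany such stalls
dmesg | grep -i "hung task"
# the kernel stack of each mysqld thread shows where it is blocked
cat /proc/$(pidof mysqld)/task/*/stack

Threads stuck in D state point at the kernel, filesystem, or storage driver rather than MySQL itself, which would match kill -9 and docker stop having no effect.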

Docker containers, memory consumption and logs

I've been trying Docker for a few days. I'm using a Drupal image (docker4drupal) which basically contains MySQL (MariaDB), PHP (php-fpm) and NGINX.
Almost every time I do a database import into the database container on a VPS with 512 MB of RAM, the container with MariaDB dies and messages like "MySQL server has gone away" appear. This does not happen when my VPS has 1 GB or 2 GB of RAM.
So this seems to be a memory problem, but I need the evidence! I don't know where the log is that tells me my container died because there wasn't enough memory.
I checked the MariaDB logs but can't find anything; they only say something like "the database was not normally shutdown", then "starting", and then "waiting for connections".
So, independently of my MariaDB config (which is not appropriate for a 512 MB VPS): where can I find the explicit reason why the container with the database server died?
Any help is welcome.
Thanks a lot.
PS: I execute the mysql CLI from the PHP container; that's why, despite the database container dying, I can still see output indicating that something went wrong.
It could be the kernel terminating the most memory-consuming process on an out-of-memory event. There may be entries about this in the host's system log, though the lack of such entries doesn't guarantee it wasn't the kernel that killed your DB.
The exact filename depends on the host system configuration (meaning the VPS, in your case). It could be /var/log/{system.log, error.log, ...}.
Since a Docker container is not an isolated VM but a wrapper over kernel-driven cgroups, kernel events are handled by the host system's logging daemon.
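A few ways to check for an OOM kill from the host (the container name is a placeholder):

# kernel ring buffer: OOM-killer entries name the victim process
dmesg | grep -i -E "out of memory|oom"
# Docker records whether the container's cgroup hit its memory limit
docker inspect --format '{{.State.OOMKilled}}' mariadb-container
# the syslog path varies by distro, e.g. /var/log/syslog or /var/log/messages
grep -i "killed process" /var/log/syslog

Note that OOMKilled is only set when the container's own memory limit was hit; a host-level OOM kill can leave it false, which is why the dmesg check matters.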
Hi Beto, you can also see the logs through Docker itself. The docker logs --follow command will continue streaming the new output from the container's STDOUT and STDERR, as shown below.
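For example (the container name is a placeholder):

# find the container's name or ID
docker ps -a
# stream its output; the last lines before the crash are often telling
docker logs --follow mariadb-container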
That is probably too much to cram into a minuscule 512 MB. Do one of:
Increase the RAM available. ("And this does not happen when my VPS has 1GB")
Split the applications across multiple tiny Dockers.
Tune each app to use less RAM; see the sketch below. (Didn't I answer your question recently?)
How many tables do you have? Hopefully not a lot, as in https://dba.stackexchange.com/questions/60888/mysql-runs-out-of-memory-when-importing-innodb-database
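On the MariaDB side, the biggest single knob is usually the InnoDB buffer pool. A minimal sketch of low-memory settings for a 512 MB VPS (the file name and values are illustrative starting points, not recommendations):

# /etc/mysql/conf.d/lowmem.cnf (hypothetical file name)
[mysqld]
# default is 128M; shrink it further on a host shared with PHP and NGINX
innodb_buffer_pool_size = 64M
# each connection can allocate several per-thread buffers
max_connections = 30
# the performance schema reserves a sizable chunk of memory at startup
performance_schema = OFF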