gcloud compute: issue command and close terminal - google-compute-engine

Since it takes time to create a snapshot of a Google Compute Engine instance, I wonder whether it is possible to issue the gcloud compute disks snapshot command on my local machine and then close the terminal without interrupting the snapshot creation process?

From the documentation for gcloud compute disks snapshot:
FLAGS
--async
Display information about the operation in progress, without waiting for the operation to complete.
You can run gcloud compute disks snapshot --async and note the operation ID, then run gcloud compute operations describe <OPERATION ID> to check on the operation (you may also have to provide the zone of the operation, which should be the same as the zone of the disk).
Even if you don't use the --async flag, the operation is running asynchronously in the background (gcloud is just staying open until it finishes). If you close the terminal, the snapshot will finish. You'd just need to do some digging to find the operation ID if you're interested in following up on its status.
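For example, a sequence like the following should work (the disk name, snapshot name, and zone below are placeholders; substitute your own):
# Kick off the snapshot and return immediately; gcloud prints the operation ID
gcloud compute disks snapshot example-disk \
    --snapshot-names example-snapshot \
    --zone us-central1-a \
    --async

# Later, even from a new terminal, poll the operation until its status is DONE
gcloud compute operations describe OPERATION_ID --zone us-central1-a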

Related

How to track disk usage on Container-Optimized OS

I have an application running on a Container-Optimized OS based Compute Engine instance.
My application runs every 20min, fetches and writes data to a local file, then deletes the file after some processing. Note that each file is less than 100KB.
My boot disk size is the default 10GB.
I run into a "no space left on device" error every month or so while attempting to write the file locally.
How can I track disk usage?
I manually checked the size of the folders and it seems that the bulk of the space is taken by /mnt/stateful_partition/var/lib/docker/overlay2.
my-vm / # sudo du -sh /mnt/stateful_partition/var/lib/docker/*
20K /mnt/stateful_partition/var/lib/docker/builder
72K /mnt/stateful_partition/var/lib/docker/buildkit
208K /mnt/stateful_partition/var/lib/docker/containers
4.4M /mnt/stateful_partition/var/lib/docker/image
52K /mnt/stateful_partition/var/lib/docker/network
1.6G /mnt/stateful_partition/var/lib/docker/overlay2
20K /mnt/stateful_partition/var/lib/docker/plugins
4.0K /mnt/stateful_partition/var/lib/docker/runtimes
4.0K /mnt/stateful_partition/var/lib/docker/swarm
4.0K /mnt/stateful_partition/var/lib/docker/tmp
4.0K /mnt/stateful_partition/var/lib/docker/trust
28K /mnt/stateful_partition/var/lib/docker/volumes
TL;DR: Use Stackdriver Monitoring and create an alert for DISK usage.
Since you are using COS images, you can enable the Stackdriver Monitoring agent by simply setting the "google-monitoring-enabled" metadata key to "true" on the GCE instance. To do so, run the command:
gcloud compute instances add-metadata instance-name --metadata=google-monitoring-enabled=true
Replace instance-name with the name of your instance, and remember to restart the instance for the change to take effect. You don't need to install the Stackdriver Monitoring agent, since it is already installed by default in COS images.
Then, you can use the disk usage metric to get the usage of your disk.
You can create an alert to get a notification each time the usage of the partition reaches a certain threshold.
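If you prefer the command line, the alerting policy can also be created from a file. A minimal sketch, assuming the Monitoring agent's agent.googleapis.com/disk/percent_used metric; the 80% threshold, duration, and display names are placeholders to adjust:
# policy.json - alert when disk usage stays above 80% for 5 minutes
{
  "displayName": "COS disk usage",
  "combiner": "OR",
  "conditions": [{
    "displayName": "disk/percent_used > 80%",
    "conditionThreshold": {
      "filter": "metric.type=\"agent.googleapis.com/disk/percent_used\" AND resource.type=\"gce_instance\"",
      "comparison": "COMPARISON_GT",
      "thresholdValue": 80,
      "duration": "300s"
    }
  }]
}

# Create the policy from the file
gcloud alpha monitoring policies create --policy-from-file=policy.json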
Since you are in the cloud, it is usually best to use cloud resources to solve cloud issues.
Docker uses /var/lib/docker to store your images, containers, and local named volumes. Deleting this can result in data loss and possibly stop the engine from running. The overlay2 subdirectory specifically contains the various filesystem layers for images and containers.
To clean up unused containers and images, run:
docker system prune
You can monitor the directory with the watch command:
sudo watch "du -sh /mnt/stateful_partition/var/lib/docker/*"
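docker system prune also accepts filters if you want to keep recent layers around; for example (a sketch - the one-week window is an assumption to tune for your 20-minute job cycle):
# -a removes unused (not just dangling) images, -f skips the confirmation
# prompt, and the until filter only touches items older than a week
docker system prune -af --filter "until=168h"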

Why restored VM from Google cloud snapshot doesn't have data in its database?

I'm running a web application in Google Compute Engine and have scheduled a snapshot for the VM [Ubuntu 16.04].
I tried restoring the VM from the last available snapshot. I'm able to bring up the web application from the restored VM, but the problem is that there is no data in the database [MongoDB]. The collections created by the application and the default data [data seeded during deployment] are present in MongoDB on the restored VM, but other than that, there is no data.
Is this how Google snapshots work? Isn't the restored VM supposed to have all the data up to the time of snapshot creation?
Creating a snapshot while all the apps are running may not be 100% accurate, because some data is still in buffers, caches, etc.
Your missing data might not yet have been written to the disk when the snapshot was being created.
Google documentation about creating snapshots is quite clear about it:
You can create a snapshot of a persistent disk even while your apps write data to the disk. However, you can improve snapshot consistency if you flush the disk buffers and sync your file system before you create a snapshot.
Pause apps or operating system processes that write data to that persistent disk. Then flush the disk buffers before you create the snapshot.
Try following these instructions and test the results.
If for some reason you can't completely stop the database, try to at least flush the buffers to disk, freeze the file system (if possible), and then create the snapshot.
You can freeze the file system by logging into the instance and typing sudo fsfreeze -f [example-disk_location], and unfreeze it with sudo fsfreeze -u [example-disk_location].
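Put together, the sequence could look like this (a sketch; the mount point, disk name, and zone are placeholders, and the snapshot command is run from another session or your workstation):
# Flush dirty buffers, freeze writes, snapshot, then unfreeze
sudo sync
sudo fsfreeze -f /mnt/data
gcloud compute disks snapshot example-data-disk \
    --snapshot-names consistent-snap --zone us-central1-a
sudo fsfreeze -u /mnt/data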
The perfect way (with guaranteed data integrity) is to either stop the VM or unmount the disk.

How to restore instance using snapshot in Google Compute Engine?

I created a snapshot of a VM instance via the Cloud Console. I would like to know how I can restore an instance using a snapshot. The documentation for Compute Engine is not very helpful. The instance runs on Ubuntu. Thanks.
To restore an instance from a snapshot without deleting/re-creating the instance:
1. Shut down the instance and detach the boot disk: gcloud beta compute instances detach-disk INSTANCE_NAME --disk BOOT_DISK_NAME
2. Create a new disk from the snapshot: gcloud compute disks create DISK_NAME --source-snapshot SNAPSHOT_NAME
3. Attach the disk created in step 2 as the boot disk: gcloud beta compute instances attach-disk INSTANCE_NAME --disk DISK_NAME --boot
Restoring the instance without deleting/re-creating the instance means that after restore, the instance will keep its IP address and other properties like tags, labels, etc.
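As a concrete end-to-end sketch (all names and the zone are placeholders):
# Stop the VM, swap the boot disk for one restored from the snapshot, start it again
gcloud compute instances stop example-vm --zone us-central1-a
gcloud beta compute instances detach-disk example-vm \
    --disk example-boot --zone us-central1-a
gcloud compute disks create example-boot-restored \
    --source-snapshot example-snap --zone us-central1-a
gcloud beta compute instances attach-disk example-vm \
    --disk example-boot-restored --boot --zone us-central1-a
gcloud compute instances start example-vm --zone us-central1-a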
In the Console, you can go to VM instances under the Compute Engine tab.
There, click '+ Create Instance',
and in the boot disk section you can navigate to the 'Snapshots' tab and select the snapshot you took.
This way you can restore your instance from the Console.
Let me know if this doesn't work for you!
Try using:
gcloud compute disks create <NEW_DISK_NAME> \
    --source-snapshot <SNAPSHOT_NAME> \
    --type pd-ssd \
    --zone <ZONE>
You can find useful instructions here.
If you want to do it in the web interface, it's pretty easy. Edit your VM instance, scroll down to your boot disk, click "x" next to the boot volume, then click the create button. In the next window, give your new volume a meaningful name, set up your snapshot schedule, and then under "Source type" select "Snapshot". Choose your snapshot from the dropdown. If you have a customer-managed/supplied key (not recommended), select it; otherwise leave it as the Google-managed key. Click create. Depending on disk size it might take a while. Be patient and let it finish the setup, then start your instance once it is all complete.

How do I automatically restart a GCE preemptible instance?

How do I automatically restart a preemptible Google Compute Engine instance? I only have one instance that doesn't need 100% uptime but that I would like to restart once the data center becomes unloaded again. The instance/server that I'm trying to automatically restart has its own boot disk that I'd like to use each time it restarts.
You could try using Instance Group Manager to set up a pool of size 1. It will then try to re-create instances after they are preempted.
You should be aware that there is no guarantee that there is going to be capacity for your instance. As the docs say:
Preemptible instances are available from a finite amount of Compute Engine resources, and might not always be available.
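A minimal sketch of that setup (machine type, names, and zone are assumptions):
# Template for the preemptible VM; --preemptible is the key flag
gcloud compute instance-templates create preemptible-template \
    --machine-type e2-small --preemptible

# Managed instance group of size 1 that re-creates the VM after preemption
gcloud compute instance-groups managed create preemptible-group \
    --template preemptible-template --size 1 --zone us-central1-a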
You could create an f1-micro instance, which is free for one instance per month in several data centers, and create a cron job:
*/10 * * * * /snap/bin/gcloud beta compute instances start --zone "yourzone" "yourinstance" --project "yourproject"
after running gcloud auth login once.
This will attempt to start your instance every 10 minutes. Of course, you can also set this to an hour or more. With a bit more scripting, things like exponential back-off can be done as well; see the sketch below.
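For instance, a small wrapper script for the cron job could issue the start request only when the VM is actually down (a sketch; instance name and zone are placeholders):
#!/bin/bash
# Query the current status of the instance (RUNNING, TERMINATED, ...)
STATUS=$(gcloud compute instances describe yourinstance \
    --zone yourzone --format='value(status)')

# Only issue a start request if the VM was preempted or stopped
if [ "$STATUS" = "TERMINATED" ]; then
    gcloud compute instances start yourinstance --zone yourzone
fi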
If you'd like to restart it less frequently, you can use Instance schedules, which are built into the Google Cloud Dashboard.
https://cloud.google.com/compute/docs/instances/schedule-instance-start-stop
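Under the hood these schedules are resource policies, so they can also be set up with gcloud (a sketch; region, names, and the cron expression are assumptions):
# Create a schedule that starts the VM every day at 08:00
gcloud compute resource-policies create instance-schedule daily-start \
    --region us-central1 \
    --vm-start-schedule "0 8 * * *"

# Attach the schedule to the instance
gcloud compute instances add-resource-policies yourinstance \
    --zone us-central1-a --resource-policies daily-start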

Google Compute Engine disk snapshot stuck at creating

I took a snapshot of a 50GB volume (non-boot) which is attached to an instance. The snapshot was successful.
I shut down the instance and tried taking another snapshot of the same volume. This time the command hung, and gcloud reports "CREATING" for this attempt. It has been hours since I issued the snapshot command. I tried the same thing using the Google Developers Console; the behaviour remains the same.
I restarted the instance, and the status of the snapshot changed to "READY" within seconds.
It seems that snapshots can only be taken while the volume is attached to a running instance; otherwise, the command is queued and executed when the volume/instance is live. Is this expected behaviour?
I replicated your issue and indeed the snapshot process halts when the instance is shut down. You may have also noticed that the Shutdown/Start feature has now been introduced - it was not available before.
I believe this is due to how snapshots are handled on the platform. Your first snapshot creates a full copy of the disk, while the second one is differential - the differential one will fail or stay pending because it cannot query the source disk to check what has changed while the instance is down. You can check this for further info.
Detaching the disk from the instance and then creating a snapshot works, so that could be a workaround for you.
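A sketch of that workaround (names and zone are placeholders):
# Detach the data disk, snapshot it offline, then reattach it
gcloud compute instances detach-disk example-vm \
    --disk example-data-disk --zone us-central1-a
gcloud compute disks snapshot example-data-disk \
    --snapshot-names offline-snap --zone us-central1-a
gcloud compute instances attach-disk example-vm \
    --disk example-data-disk --zone us-central1-a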
Hope this helps