Google cloud compute instance metrics taking up disk space - google-compute-engine

I have a Google Cloud compute instance set up, but it's getting low on disk space. It looks like the /mnt/stateful_partition/var/lib/metrics directory is taking up a significant amount of space (3+ GB). I assume this is compute metrics, but I can't find any way to safely remove these other than just deleting the files. Is this going to cause any issues?

The path you are referring to contains file system directories used by the GCE VM instance, and you are correct that the metrics folder is safe to remove. To learn more about these directories, see Disks and file system overview.
I would also suggest creating a snapshot first, so that if the changes affect your system you can easily revert the instance to its previous state.
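A rough sketch of both steps, assuming a boot disk named my-boot-disk in zone us-central1-a (hypothetical names):
# Snapshot the boot disk first so the cleanup can be reverted
$ gcloud compute disks snapshot my-boot-disk --zone=us-central1-a --snapshot-names=pre-cleanup
# Then, on the VM itself, remove the accumulated metrics files
$ sudo rm -rf /mnt/stateful_partition/var/lib/metrics/*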

Related

Is it safe to delete metric events file in google compute engine vm

So, I found a file called uma-events under the metrics folder in a Google Compute Engine VM (built using Container-Optimized OS) which is taking about 5 GB of space. I cannot extend the partition in the current condition and am running low on disk space. Also, the file mentioned above is owned by chronos (maybe it is a default user/group?). So, would it be safe to delete the file?
full path of the file is - /mnt/stateful_partition/var/lib/metrics/uma-events
I went through several documentations but didn't find anything useful.
The root file system is mounted as read-only to protect system integrity. However, home directories and /mnt/stateful_partition are persistent and writable.
You can remove the file as a temporary fix, or you can resize the disk.
To resolve your issue, do the following:
Create a snapshot of the disk: this ensures that if the changes you make to your instance cause problems, you can easily revert to the previous instance state.
Delete files that you don't need on the disk to free up space (the metrics folder you mentioned is safe to remove). To learn more about these directories, see Disks and file system overview.
If the disk still needs more space after this, resize it. Note that your VM can become inaccessible if its boot disk fills completely; to troubleshoot a full disk or resize the disk, see Troubleshooting full disks and disk resizing.
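For the resize step, the command might look roughly like this, assuming a disk named my-disk in zone us-central1-a (hypothetical names):
# Grow the persistent disk; the size can only be increased, never decreased
$ gcloud compute disks resize my-disk --size=50GB --zone=us-central1-a
On Container-Optimized OS the stateful partition is typically expanded automatically on the next reboot; on other images you may need to grow the partition and filesystem yourself.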

How do I make a snapshot of my boot disk?

I've read multiple times that I can cause read/write errors if I create a snapshot. Is it possible to create a snapshot of the disk my machine is booted off of?
It depends on what you mean by "snapshot".
A snapshot is not a backup; it is a way of temporarily capturing the state of a system so you can make changes, test the results, and revert to the previously known-good state if the changes cause issues.
How to take a snapshot varies depending on the OS you're using, whether you're talking about a physical or a virtual system, which virtualization platform you're using, which image types you're using for disks within that platform, and so on.
Once you have a snapshot, then you can make a real backup from the snapshot. You'll want to make sure that if it's a database server that you've flushed everything to disk and then write lock it for the time it takes to make the snapshot (typically seconds). For other systems you'll similarly need to address things in a way that ensures that you have a consistent state.
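One way to get that consistent state on Linux, sketched here with fsfreeze rather than a database-specific lock (the disk and mount names are hypothetical):
# Block writes to the filesystem while the snapshot is taken
$ sudo fsfreeze --freeze /mnt/data
$ gcloud compute disks snapshot data-disk --zone=us-central1-a --snapshot-names=data-backup
$ sudo fsfreeze --unfreeze /mnt/data
Avoid freezing the root filesystem this way, since the freeze can block the very shell you need to unfreeze it.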
If you want to make a complete backup of your system drive directly, rather than via a snapshot, then you want to shut down and boot off an alternate boot device such as a CD or an external drive.
If you don't do that, and try to directly back up a running system then you will be leaving yourself open to all manner of potential issues. It might work some of the time, but you won't know until you try and restore it.
If you can provide more details about the system in question, then you'll get more detailed answers.
As far as moving apps and data to different drives, data is easy provided you can shut down whatever is accessing the data. If it's a database, stop the database, move the data files, tell the database server where to find its files and start it up.
For applications, it depends. Often it doesn't matter and it's fine to leave it on the system disk. It comes down to how it's being installed.
It looks like that works a little differently. The first snapshot will create an entire copy of the disk and subsequent snapshots will act like ordinary snapshots. This means it might take a bit longer to do the first snapshot.
According to Google's documentation, you ideally want to shut down the system before taking a snapshot of your boot disk. If you can't do that for whatever reason, then you want to minimize the number of writes hitting the disk and then take the snapshot. Assuming you're using a journaling filesystem (ext3, ext4, XFS, etc.), it should be able to recover without issue.
You can use the GCE APIs. Use the disks.insert API to create the persistent disk. There are code examples on how to start an instance using Python, and Google provides client libraries for other programming languages such as Java and PHP.
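For example, calling disks.insert directly over REST might look roughly like this (the project, zone, and disk names are placeholders):
$ curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d '{"name": "disk-1", "sizeGb": "15"}' \
    "https://compute.googleapis.com/compute/v1/projects/my-project/zones/europe-west1-b/disks"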

Google Compute Engine: what is the difference between disk snapshot and disk image?

I've been using both for my startup and to me, the functionality is the same. Until now, the instances I've been creating are only for computation. I'm wondering how GCE disk images and snapshots are different in terms of technology, and in which situation it is better to use one over the other.
A snapshot reflects the contents of a persistent disk in a concrete instant in time. An image is the same thing, but includes an operating system and boot loader and can be used to boot an instance.
Images and snapshots can be public or private. In the case of images, public can mean official public images provided by Google or not.
Snapshots are stored as diffs (each snapshot is stored relative to the previous one, though that is transparent to you) while images are not. Snapshots are also cheaper: $0.026 per GB/month vs. $0.050 for images (note that snapshot pricing increases to $0.050/GB/month on October 1, 2022).
These days the two concepts are quite similar. It's now possible to start an instance using a snapshot instead of an image, which is an easy way of resizing your boot partition. Using snapshots may be simpler for most cases.
Snapshots:
Good for backup and disaster recovery
Lower cost than images
Smaller size than images, since they don't contain the OS, etc.
Differential backups - only the data changed since the last snapshot is recreated
Faster to create than images
Only available in the project where they were created (though it is now possible to share them between projects)
Can be created for disks even while they are attached to running instances
Images:
Good for reusing compute engine instance states with new instances
Available across different projects
Can't be created for disks attached to running instances (unless you use the --force flag)
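The difference shows up in the commands too; a minimal sketch with hypothetical names:
# Snapshot a disk, even while it is attached to a running instance
$ gcloud compute disks snapshot my-disk --zone=us-central1-a --snapshot-names=my-snapshot
# Create an image from a disk (needs --force if the disk is attached to a running instance)
$ gcloud compute images create my-image --source-disk=my-disk --source-disk-zone=us-central1-a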
Snapshots primarily target backup and disaster recovery scenarios; they are cheaper and easier to create (they can often be taken without stopping the VM). They are meant for frequent, regular uploads and rare downloads.
Images are primarily meant for boot disk creation. They are optimized for multiple downloads of the same data over and over. If the same image is downloaded many times, downloads after the first are very fast (even for large images).
Images do not have to be used for boot disks exclusively, they also can be used for data that need to be made quickly available to a large set of VMs (In a scenario where a shared read-only disk doesn't satisfy the requirements for whatever reason)
A snapshot is a copy of your disk that you can use to create a new persistent disk (PD) of any type (standard PD or SSD PD). You can use the snapshot to create a bigger disk, and you have the ability to create the new disk in any zone you might need. Pricing is a bit cheaper for the provisioned space used by a snapshot, and when used as a backup, you can create differential snapshots.
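As an illustration, restoring a snapshot to a larger SSD disk in another zone might look like this (names are hypothetical):
$ gcloud compute disks create new-disk --source-snapshot=my-snapshot --size=200GB --type=pd-ssd --zone=europe-west1-b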
When you use an existing disk to create an instance, you have to create the instance in the same zone where the disk exists and it will have the size of the disk.
An image resource is the pre-configured GCE operating system that you're running (CentOS, Debian, etc.). You can use the public images, available to all projects; private images for a specific project; or create your own custom image.
A snapshot is locked within a project, but a custom image can be shared between projects.
Simply put, a snapshot is basically a backup of the data on the disk. An important point is that snapshots are backed up differentially (so they are smaller), and they are used mostly for backup and DR.
An image contains a backup of the OS as well; custom images are often prepared to enforce organizational policies.
In cloud computing terms, images are used to launch multiple instances with the same configuration, and snapshots are mostly for backup.

Share a persistent disk between Google Compute Engine VMs

From Google's documentation:
It is possible to attach a persistent disk to more than one instance. However, if you attach a persistent disk to multiple instances, all instances must attach the persistent disk in read-only mode. It is not possible to attach the persistent disk to multiple instances in read-write mode.
If you attach a persistent disk in read-write mode and then try to attach the disk to subsequent instances, Google Compute Engine returns an error.
So, I need a shared persistent disk as the frontend for all my Compute Engine VMs. Good, but how can you write to this shared disk?
My guess (and hope) is that a read/write persistent disk can be attached to only one Compute Engine VM, but that same disk can be shared read-only with other VMs. Is that right?
Let's say I have 2 Compute Engine VMs and 2 persistent disks; is this flow possible?
compute1 read/write disk1 and read-only disk2
compute2 read/write disk2 and read-only disk1
Update: this is available as of 2020-06-16
As per another answer by Matthew Lenz, the functionality for creating multi-writer persistent disks is available, but it's still in alpha status (even though it's documented as being in the beta track) and requires special per-project enablement.
Note: This GitHub issue notes that the functionality is still in alpha, even though it's labelled as beta. You can submit feedback via Cloud Console to request it for your project if you'd like to get early access to this functionality, but it's not guaranteed to be enabled.
Assuming your project has the permissions to use this feature (or the feature becomes public-access), note that it comes with some caveats:
--multi-writer
Create the disk in multi-writer mode so that it can be attached with read-write access to multiple VMs. Can only be used with zonal SSD persistent disks. Disks in multi-writer mode do not support resize and snapshot operations.
You can use this via:
$ gcloud beta compute disks create DISK_NAME --multi-writer [...]
Note the caveats:
zonal SSD persistent disks only
no disk resizing
no snapshots
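Once created, the disk can be attached read-write to more than one VM; a sketch with hypothetical instance names (attach-disk defaults to read-write mode):
$ gcloud compute instances attach-disk vm-1 --disk=DISK_NAME --zone=us-central1-a
$ gcloud compute instances attach-disk vm-2 --disk=DISK_NAME --zone=us-central1-a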
If these trade-offs are not acceptable to you, see the original answer (below) which has a long list of recommended storage alternatives for sharing data between multiple GCE VMs.
Original answer (valid prior to 2020-06-16)
No, this is not possible, as the documentation that you cited at the time of writing said (since updated):
However, if you attach a persistent disk to multiple instances, all instances must attach the persistent disk in read-only mode.
The documentation has been re-arranged since then; the new docs are at a different URL but with the same content:
You can attach a non-root persistent disk to more than one virtual machine instance in read-only mode, which allows you to share static data between multiple instances. Sharing static data between multiple instances from one persistent disk is cheaper than replicating your data to unique disks for individual instances.
If you attach a persistent disk to multiple instances, all of those instances must attach the persistent disk in read-only mode. It is not possible to attach the persistent disk to multiple instances in read-write mode. If you need to share dynamic storage space between multiple instances, connect your instances to Cloud Storage or create a network file server.
If you have a persistent disk with data that you want to share between multiple instances, detach it from any read-write instances and attach it to one or more instances in read-only mode.
which means you cannot have one instance have write access while another has read-only access.
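In practice the read-only sharing flow looks roughly like this (hypothetical names; the default attach mode is read-write, so --mode=ro matters):
# Detach from the instance that had read-write access
$ gcloud compute instances detach-disk vm-1 --disk=data-disk --zone=us-central1-a
# Re-attach read-only to as many instances as needed
$ gcloud compute instances attach-disk vm-2 --disk=data-disk --mode=ro --zone=us-central1-a
$ gcloud compute instances attach-disk vm-3 --disk=data-disk --mode=ro --zone=us-central1-a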
If you want to share data between them, you need to use something other than Persistent Disk. Below are some possible solutions.
You can use any of the following hosted/managed services:
Google Cloud Filestore — perhaps closest to what you're looking for, as it provides an NFSv3 file system (see the mount sketch after this list)
You can also use Elastifile on GCP as a fully-managed service; note that GCP acquired Elastifile in July 2019
Google Cloud Datastore
Google Cloud Storage, which you can use via the GCS API (JSON or XML), or which you can mount as a file system using gcsfuse
Google Cloud Bigtable
Google Cloud SQL
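To give a flavor of the Filestore option mentioned above, mounting its NFS share from a client VM looks roughly like this (the server IP and share name are hypothetical):
# Install an NFS client and mount the Filestore share
$ sudo apt-get install -y nfs-common
$ sudo mount 10.0.0.2:/vol1 /mnt/filestore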
Alternatively, you can run your own:
self-managed or third-party managed file server solutions, including NetApp and Panzura
self-managed Elastifile storage deployment (for fully-managed, see previous section for the link)
database (whether SQL or NoSQL)
distributed filesystem such as Ceph, GlusterFS, OrangeFS, ZFS, etc.
file server such as NFS or SAMBA
single VM as a data storage node, and use sshfs to create a FUSE mount from other VMs that want to access that data
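For the last option, a minimal sshfs sketch (the host and paths are hypothetical; allow_other may require enabling user_allow_other in /etc/fuse.conf):
# On each client VM, mount the storage node's directory over SSH
$ sudo apt-get install -y sshfs
$ sshfs storage-vm:/data /mnt/shared -o allow_other,reconnect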
GCP has alpha functionality for 'multi-writer' persistent disks. It's been in alpha for quite a long time, so who knows if it'll make it to beta or GA any time soon. Here is a link to the documentation: https://cloud.google.com/sdk/gcloud/reference/beta/compute/disks/create#--multi-writer
EDIT: 2020-06-16. This has been promoted to beta.

Can I create an OS image of more than 10GB on Google Compute Engine?

I need to boot about 500 instances with a specific image to do a job with big files that requires POSIX access to more than 10 GB. According to this doc https://developers.google.com/compute/docs/images it's impossible to create a boot disk of more than 10 GB, and I need POSIX access to more than 10 GB. Does this mean I will need to create another, non-boot disk on each instance with the disk space I need? Is there another way to do this?
That doc refers to a limit to the size of the operating system Image, not the size of the boot disk.
You can create a boot disk of any size and then use it when creating the instance, e.g.:
gcutil adddisk "disk-1" --size_gb="15" --zone="europe-west1-b" --source_image="https://www.googleapis.com/compute/v1/projects/debian-cloud/global/images/debian-7-wheezy-v20140522"
gcutil addinstance "instance-1" --zone="europe-west1-b" --machine_type="n1-standard-1" --network="default" --external_ip_address="ephemeral" --metadata="sshKeys:" --disk="disk-1,deviceName=disk-1,mode=READ_WRITE,boot" --auto_delete_boot_disk="true"
See: https://developers.google.com/compute/docs/disks#create_disk
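gcutil has since been deprecated; a rough gcloud equivalent of the commands above might be (the image family and flags are assumptions, not the original answer's commands):
$ gcloud compute disks create disk-1 --size=15GB --zone=europe-west1-b \
    --image-family=debian-11 --image-project=debian-cloud
$ gcloud compute instances create instance-1 --zone=europe-west1-b \
    --machine-type=n1-standard-1 --disk=name=disk-1,boot=yes,auto-delete=yes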
You can use Packer to create images on GCE which will auto-resize on boot. I have created a repo specifically for this question which has a demo of an auto-resizing image using recent versions of either Debian-7 backports or container-optimized VM images.
It uses cloud-initramfs-growroot:
automatically resize the root partition on first boot
This package adds functionality to an initramfs built by initramfs-tools. When installed, the initramfs will repartition a disk to make the root volume consume all space that follows it.
You most likely do not want this package unless you know what you are doing. It is primarily interesting in a virtualized environment where a disk can be provisioned with a size larger than its original size. In this case, with this package installed, you can automatically use the new space without requiring a reboot to re-read the partition table.
See my script growroot.sh for details and how you can adapt it to your use case.
You have two options:
Create a boot disk larger than 10GB, but then you'll need to repartition it: by default, the provided VM images expand to 10GB, so you'll need to follow these instructions and run fdisk, reboot, and then run resize2fs to expand the usable space to the full size of the disk (see the sketch after this list). You can automate this so that it runs as part of instance creation by using startup scripts.
Another alternative is to create a separate persistent disk and attach it separately, but then it won't be a boot disk but just a data disk. For that you can use the instructions on the same page, namely:
gcutil adddisk [...]
gcutil attachdisk [...], unless the boot and data disks are added during instance creation via gcutil addinstance --disk=disk1 --disk=disk2 [...], in which case this is not needed
/usr/share/google/safe_format_and_mount [...] to automate the rest
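For the first option, the post-boot resize might look roughly like this (a sketch using growpart from cloud-guest-utils instead of interactive fdisk; the device names are assumptions):
# Grow the last partition to fill the disk, then grow the ext4 filesystem
$ sudo growpart /dev/sda 1
$ sudo resize2fs /dev/sda1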
You can easily do this without having to manually resize/partition/format a disk or any of the complications introduced in all the other answers on StackOverflow. Please see my answer here for how this can be done: How to get a bigger boot disk on Google Compute Engine