Google Compute Engine: what is the difference between a disk snapshot and a disk image?

I've been using both for my startup and to me, the functionality is the same. Until now, the instances I've been creating are only for computation. I'm wondering how GCE disk images and snapshots are different in terms of technology, and in which situation it is better to use one over the other.

A snapshot reflects the contents of a persistent disk at a specific point in time. An image is the same thing, but it includes an operating system and boot loader and can be used to boot an instance.
Images and snapshots can be public or private. For images, "public" may or may not mean one of the official public images provided by Google.
Snapshots are stored as diffs (each snapshot is stored relative to the previous one, though that is transparent to you), while images are not. Snapshots are also cheaper ($0.026 per GB/month vs. $0.050 for images), although the snapshot price is increasing to $0.050 per GB/month on October 1, 2022.
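To make the price difference concrete, here is a minimal Python sketch of the arithmetic. The two rates are the ones quoted in this answer and will go out of date; the 200 GB size and the function name are made up for illustration.

```python
# Rates quoted in the answer above (USD per GB per month); they change over time.
SNAPSHOT_RATE = 0.026
IMAGE_RATE = 0.050


def monthly_storage_cost(stored_gb: float, rate_per_gb_month: float) -> float:
    """Return the monthly storage cost in USD for the given stored size."""
    if stored_gb < 0:
        raise ValueError("stored_gb must be non-negative")
    return stored_gb * rate_per_gb_month


if __name__ == "__main__":
    size_gb = 200  # hypothetical amount of stored data
    print(f"snapshot: ${monthly_storage_cost(size_gb, SNAPSHOT_RATE):.2f}/month")
    print(f"image:    ${monthly_storage_cost(size_gb, IMAGE_RATE):.2f}/month")
```

Because snapshots are stored as diffs, the stored size of a later snapshot is usually much smaller than the disk, so the real gap can be wider than the per-GB rates suggest.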
These days the two concepts are quite similar. It's now possible to start an instance using a snapshot instead of an image, which is an easy way of resizing your boot partition. Using snapshots may be simpler for most cases.

Snapshots:
Good for backup and disaster recovery
Lower cost than images
Smaller size than images, since they don't contain the OS, etc.
Differential backups: only the data changed since the last snapshot is stored
Faster to create than images
Snapshots were originally only available in the project where they were created (it is now possible to share them between projects)
Can be created for running disks, even while they are attached to running instances
Images:
Good for reusing Compute Engine instance states with new instances
Available across different projects
Can't be created for running instances (unless you use the --force flag)
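The "differential backups" point can be illustrated with a toy model in Python. This is only a sketch of the idea, not how GCE actually stores snapshots: the block dictionary, `take_snapshot`, and `restore` are all hypothetical, and deleted blocks are not handled.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]  # block index -> block contents


def take_snapshot(disk: Blocks, previous: Blocks) -> Blocks:
    """Return only the blocks that differ from the previous snapshot state."""
    return {i: data for i, data in disk.items() if previous.get(i) != data}


def restore(chain: List[Blocks]) -> Blocks:
    """Rebuild the full disk state by applying the diffs oldest-first."""
    state: Blocks = {}
    for diff in chain:
        state.update(diff)
    return state


if __name__ == "__main__":
    disk = {0: b"boot", 1: b"data", 2: b"logs"}
    first = take_snapshot(disk, {})                  # full copy: every block is new
    disk[2] = b"logs-v2"                             # one block changes
    second = take_snapshot(disk, restore([first]))   # stores only the changed block
    print(len(first), len(second), restore([first, second]) == disk)
```

The first snapshot holds every block and each later one holds only what changed, which is why later snapshots tend to be smaller and quicker to create.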

Snapshots primarily target backup and disaster recovery scenarios. They are cheaper and easier to create (they can often be uploaded without stopping the VM). They are meant for frequent, regular uploads and rare downloads.
Images are primarily meant for boot disk creation. They are optimized for downloading the same data over and over. If the same image is downloaded many times, every download after the first is going to be very fast (even for large images).
Images do not have to be used for boot disks exclusively. They can also be used for data that needs to be made quickly available to a large set of VMs (in a scenario where a shared read-only disk doesn't satisfy the requirements for whatever reason).

A snapshot is a copy of your disk that you can use to create a new persistent disk (PD) of any type (standard PD or SSD PD). You can use the snapshot to create a larger disk, and you can create the new disk in any zone you need. Pricing is a bit cheaper for the provisioned space used by a snapshot. When used for backup, snapshots can be differential.
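As a rough sketch of the restore path described above, the following Python helper builds a `gcloud compute disks create` command that makes a new disk from a snapshot with a chosen size, type, and zone. The helper name and all resource names in the usage example are hypothetical; the function only builds the argument list and does not run anything.

```python
from typing import List


def create_disk_from_snapshot_cmd(
    disk_name: str,
    snapshot: str,
    zone: str,
    size_gb: int,
    disk_type: str = "pd-ssd",
) -> List[str]:
    """Build the gcloud argv; pass the result to subprocess.run to execute it."""
    return [
        "gcloud", "compute", "disks", "create", disk_name,
        f"--source-snapshot={snapshot}",
        f"--zone={zone}",
        f"--size={size_gb}GB",
        f"--type={disk_type}",
    ]


if __name__ == "__main__":
    # Hypothetical names: restore "nightly-backup" as a 200 GB SSD disk.
    print(" ".join(create_disk_from_snapshot_cmd(
        "data-restored", "nightly-backup", "us-central1-b", 200)))
```

The requested size has to be at least the size of the snapshot's source disk, and a larger boot disk may still need its partition and file system grown inside the guest.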
When you use an existing disk to create an instance, you have to create the instance in the same zone where the disk exists and it will have the size of the disk.
An image resource is the preconfigured operating system that your GCE instance runs (CentOS, Debian, etc.). You can use public images, which are available to all projects, private images that belong to a specific project, or a custom image that you create yourself.

A snapshot is locked within a project, but a custom image can be shared between projects.

Simply put, a snapshot is basically a backup of the data on the disk.
Another important point is that snapshots are backed up differentially (smaller size).
They are mostly used for backup and DR.
An image also includes a backup of the OS. Custom images are also prepared to enforce organizational policies.
In cloud computing terms, images are used to launch multiple instances with the same configuration, and snapshots are mostly for backup.

Related

Google cloud compute instance metrics taking up disk space

I have a Google Cloud compute instance set up, but it's getting low on disk space. It looks like the /mnt/stateful_partition/var/lib/metrics directory is taking up a significant amount of space (3+ GB). I assume these are the compute metrics, but I can't find any way to safely remove them other than just deleting the files. Is this going to cause any issues?
The paths you are referring to are file system directories used by the GCE VM instance, and you are correct that the metrics folder is safe to remove. To learn more about these directories, see Disks and file system overview.
I would also suggest creating a snapshot first if you want to make sure that your changes won't affect your system's performance. That way you can easily revert to the previous instance state.

Shrink a Dataproc worker boot disk

Due to some mix-up during planning we ended up with several worker nodes running 23TB drives which are now almost completely unused (we keep data on external storage). As the drives are only wasting money at the moment, we need to shrink them to a reasonable size.
Using weresync I was able to fully clone the drive to a much smaller one but apparently you can't swap the boot drive in GCE (which makes no sense to me). Is there a way to achieve that or do I need to create new workers using the images? If so, is there any other config I need to copy to the new instance in order for it to be automatically joined to the cluster?
Dataproc does not support VM configuration changes in running clusters.
I would advise you to delete the old cluster and create a new one with the worker disk size that you need.
I ended up creating a ticket with GCP support - https://issuetracker.google.com/issues/120865687 - to get an official answer to that question. I got an answer that this is not currently possible but should be available shortly (within months) in the beta GCP CLI, and possibly in the Console at a later date as well.
I went ahead with a complete rebuild of the cluster.

How do I make a snapshot of my boot disk?

I've read multiple times that I can cause read/write errors if I create a snapshot. Is it possible to create a snapshot of the disk my machine is booted off of?
It depends on what you mean by "snapshot".
A snapshot is not a backup. It is a way of temporarily capturing the state of a system so you can make changes, test the results, and revert to the previously known good state if the changes cause issues.
How to take a snapshot varies depending on the OS you're using, whether you're talking about a physical or a virtual system, what virtualization platform you're using, what image types you're using for disks within a given virtualization platform, and so on.
Once you have a snapshot, then you can make a real backup from the snapshot. You'll want to make sure that if it's a database server that you've flushed everything to disk and then write lock it for the time it takes to make the snapshot (typically seconds). For other systems you'll similarly need to address things in a way that ensures that you have a consistent state.
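The ordering described above (flush and lock, snapshot, unlock) can be sketched in Python as follows. The three callables are hypothetical placeholders for whatever your system needs, for example a database write lock or a filesystem freeze; the point of the sketch is only that the unlock step runs even if the snapshot fails.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def quiesced(freeze: Callable[[], None], thaw: Callable[[], None]) -> Iterator[None]:
    """Hold writes for the duration of the block and always release them afterwards."""
    freeze()
    try:
        yield
    finally:
        thaw()


def snapshot_consistently(
    freeze: Callable[[], None],
    take_snapshot: Callable[[], None],
    thaw: Callable[[], None],
) -> None:
    """Flush and lock, take the snapshot, then unlock even if the snapshot fails."""
    with quiesced(freeze, thaw):
        take_snapshot()
```

Keeping the lock inside a context manager keeps the write-locked window as short as the snapshot call itself, which matches the "typically seconds" expectation above.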
If you want to make a complete backup of your system drive, directly rather than via a snapshot then you want to shut down and boot off an alternate boot device like a CD or an external drive.
If you don't do that, and try to directly back up a running system then you will be leaving yourself open to all manner of potential issues. It might work some of the time, but you won't know until you try and restore it.
If you can provide more details about the system in question, then you'll get more detailed answers.
As far as moving apps and data to different drives, data is easy provided you can shut down whatever is accessing the data. If it's a database, stop the database, move the data files, tell the database server where to find its files and start it up.
For applications, it depends. Often it doesn't matter and it's fine to leave it on the system disk. It comes down to how it's being installed.
It looks like that works a little differently. The first snapshot will create an entire copy of the disk and subsequent snapshots will act like ordinary snapshots. This means it might take a bit longer to do the first snapshot.
According to this, you ideally want to shut down the system before taking a snapshot of your boot disk. If you can't do that for whatever reason, then you want to minimize the number of writes hitting the disk and then take the snapshot. Assuming you're using a journaling filesystem (ext3, ext4, xfs, etc.), it should be able to recover without issue.
You can use the GCE APIs. Use the Disks: insert API to create the persistent disk. There are code examples showing how to start an instance using Python, and Google also has libraries for other programming languages such as Java and PHP.

Share a persistent disk between Google Compute Engine VMs

From Google's documentation:
It is possible to attach a persistent disk to more than one instance. However, if you attach a persistent disk to multiple instances, all instances must attach the persistent disk in read-only mode. It is not possible to attach the persistent disk to multiple instances in read-write mode.
If you attach a persistent disk in read-write mode and then try to attach the disk to subsequent instances, Google Compute Engine returns an error.
So, I need a shared persistent disk as the frontend for all my Compute Engine VMs. Fine, but how can you write to this shared disk?
My guess (I hope) is that a read/write persistent disk can be attached to only one VM, but the same disk can be shared read-only with other VMs. Is that right?
Let's say I have 2 Compute Engine VMs and 2 persistent disks.
Is this flow possible?
compute1 read/write disk1 and read only disk2
compute2 read/write disk2 and read only disk1
Update: this is available as of 2020-06-16
As per another answer by Matthew Lenz, the functionality for creating multi-writer persistent disks is available, but it's still in alpha status (even though it's documented as being in the beta track) and requires special per-project enablement.
Note: This GitHub issue notes that the functionality is still in alpha, even though it's labelled as beta. You can submit feedback via Cloud Console to request it for your project if you'd like to get early access to this functionality, but it's not guaranteed to be enabled.
Assuming your project has the permissions to use this feature (or the feature becomes public-access), note that it comes with some caveats:
--multi-writer
Create the disk in multi-writer mode so that it can be attached with read-write access to multiple VMs. Can only be used with zonal SSD persistent disks. Disks in multi-writer mode do not support resize and snapshot operations.
You can use this via:
$ gcloud beta compute disks create DISK_NAME --multi-writer [...]
Note the caveats:
zonal SSD persistent disks only
no disk resizing
no snapshots
If these trade-offs are not acceptable to you, see the original answer (below) which has a long list of recommended storage alternatives for sharing data between multiple GCE VMs.
Original answer (valid prior to 2020-06-16)
No, this is not possible, as the documentation that you cited at the time of writing said (since updated):
However, if you attach a persistent disk to multiple instances, all instances must attach the persistent disk in read-only mode.
The documentation has been re-arranged since then; the new docs are at a different URL but with the same content:
You can attach a non-root persistent disk to more than one virtual machine instance in read-only mode, which allows you to share static data between multiple instances. Sharing static data between multiple instances from one persistent disk is cheaper than replicating your data to unique disks for individual instances.
If you attach a persistent disk to multiple instances, all of those instances must attach the persistent disk in read-only mode. It is not possible to attach the persistent disk to multiple instances in read-write mode. If you need to share dynamic storage space between multiple instances, connect your instances to Cloud Storage or create a network file server.
If you have a persistent disk with data that you want to share between multiple instances, detach it from any read-write instances and attach it to one or more instances in read-only mode.
which means you cannot have one instance have write access while another has read-only access.
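The rule quoted above can be written down as a small Python model. This is a toy illustration of the documented behavior before multi-writer disks, not GCE code; the class and instance names are hypothetical.

```python
from typing import Dict


class AttachError(Exception):
    """Raised when an attachment would violate the sharing rule."""


class PersistentDisk:
    """Toy model: one read-write attachment, or any number of read-only ones."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attachments: Dict[str, str] = {}  # instance name -> "rw" or "ro"

    def attach(self, instance: str, mode: str) -> None:
        if mode not in ("rw", "ro"):
            raise ValueError("mode must be 'rw' or 'ro'")
        if instance in self.attachments:
            raise AttachError(f"{self.name} is already attached to {instance}")
        # Sharing is only allowed when every attachment, old and new, is read-only.
        if self.attachments and (mode == "rw" or "rw" in self.attachments.values()):
            raise AttachError(f"{self.name} can only be shared in read-only mode")
        self.attachments[instance] = mode
```

In this model, the flow from the question fails at the second step: once `disk1.attach("compute1", "rw")` succeeds, `disk1.attach("compute2", "ro")` raises `AttachError`.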
If you want to share data between them, you need to use something other than Persistent Disk. Below are some possible solutions.
You can use any of the following hosted/managed services:
Google Cloud Filestore — perhaps closest to what you're looking for, as it provides an NFSv3 file system
You can also use Elastifile on GCP as a fully-managed service; note that GCP acquired Elastifile in July 2019
Google Cloud Datastore
Google Cloud Storage, which you can use via the GCS API (JSON or XML) or mount as a file system using gcsfuse
Google Cloud Bigtable
Google Cloud SQL
Alternatively, you can run your own:
self-managed or third-party managed file server solutions, including NetApp and Panzura
self-managed Elastifile storage deployment (for fully-managed, see previous section for the link)
database (whether SQL or NoSQL)
distributed filesystem such as Ceph, GlusterFS, OrangeFS, ZFS, etc.
file server such as NFS or SAMBA
single VM as a data storage node, and use sshfs to create a FUSE mount from other VMs that want to access that data
GCP has alpha functionality for 'multi-writer' persistent disks. It's been in alpha for quite a long time, so who knows if it'll make it to beta or GA any time soon. Here is a link to the documentation: https://cloud.google.com/sdk/gcloud/reference/beta/compute/disks/create#--multi-writer
EDIT: 2020-06-16. This has been promoted to beta.

Apache & MySQL with Persistent Disks to Multiple Instances

I plan on mounting persistent disks at the Apache (/var/www) and MySQL (/var/lib/mysql) folders to avoid having to replicate information between servers.
Has anyone tested whether the I/O performance of a persistent disk is similar when the same disk is attached to 100 instances as when it is attached to only 2? Also, is there a limit on how many instances one persistent disk can be attached to?
I'm not sure exactly what setup you're planning to use, so it's a little hard to comment specifically.
If you plan to attach the same persistent disk to all servers, note that a disk can only be attached to multiple instances in read-only mode, so you may not be able to use temporary tables, etc. in MySQL without extra configuration.
It's a bit hard to give performance numbers for a hypothetical configuration. I'd expect performance to depend on the amount of data stored (e.g. 1TB of data will behave differently than 100MB), instance size (larger instances have more memory for page cache and more CPU for processing I/O), and access pattern (random reads vs. sequential reads).
The best option is to set up a small test system and run an actual load test using something like apachebench, jmeter, or httperf. Failing that, you can try to construct an artificial load that's similar to your target benchmark.
Note that just running bonnie++ or fio against the disk may not tell you if you're going to run into problems; for example, it could be that a combination of sequential reads from one machine and random reads from another causes problems, or that 500 simultaneous sequential reads from the same block causes a problem, but that your application never does that. (If you're using Apache+MySQL, it would seem unlikely that your application would do that, but it's hard to know for sure until you test it.)
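If you do end up constructing an artificial load, a very rough starting point might look like the Python sketch below, which times the same set of blocks read in sequential and in shuffled order. It is an illustration under assumptions, not a substitute for fio or a real load test: the 4 KiB block size and the function names are arbitrary choices, and on a small file the page cache will dominate the numbers.

```python
import os
import random
import tempfile
import time
from typing import Dict, List, Tuple

BLOCK_SIZE = 4096  # assumed block size for the test reads


def timed_reads(path: str, offsets: List[int]) -> Tuple[int, float]:
    """Read one block at each offset; return (bytes read, elapsed seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(BLOCK_SIZE))
    return total, time.perf_counter() - start


def compare_patterns(path: str, seed: int = 0) -> Dict[str, Tuple[int, float]]:
    """Time the same set of blocks read in sequential and in shuffled order."""
    blocks = os.path.getsize(path) // BLOCK_SIZE
    sequential = [i * BLOCK_SIZE for i in range(blocks)]
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)
    return {
        "sequential": timed_reads(path, sequential),
        "random": timed_reads(path, shuffled),
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(BLOCK_SIZE * 256))  # 1 MiB scratch file
    try:
        for name, (nbytes, seconds) in compare_patterns(tmp.name).items():
            print(f"{name}: {nbytes} bytes in {seconds:.4f}s")
    finally:
        os.unlink(tmp.name)
```

For meaningful numbers, point `compare_patterns` at a large file on the persistent disk itself and run it from several instances at once, since the concern raised above is the combination of access patterns across machines.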