I have been using Google Compute Engine for my backend (debian-lamp). The instance was suddenly deleted automatically, without any user interaction, and the operations list does not show which user performed the deletion of the VM instance. I have also attached an image of the Google Compute Engine operations for further study.
I want to know why this happened and how I can restore the deleted instance.
Note: I am using the trial version of Google Compute Engine, and this was the second VM instance created in the current project.
It looks like the instance was deleted by the Instance Group Manager after you resized the instance group (most likely to zero). To learn why this happened, see the documentation pages for Instance Groups and the Instance Group Manager.
If you resize the Instance Group back up to 1, the Instance Group Manager will create a new VM automatically.
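As a rough sketch of that resize step, the snippet below assembles a `gcloud compute instance-groups managed resize` call from Python. It assumes the group is a zonal managed instance group; the group name `my-group` and the zone `us-central1-a` are placeholders rather than values from the question, and the helper names are hypothetical.

```python
import shlex
import subprocess


def build_resize_command(group, zone, size=1):
    """Return the gcloud argv that resizes a zonal managed instance group."""
    if size < 0:
        raise ValueError("size must be non-negative")
    return [
        "gcloud", "compute", "instance-groups", "managed", "resize",
        group,
        f"--size={size}",
        f"--zone={zone}",
    ]


def resize_group(group, zone, size=1):
    """Run the resize; needs an installed and authenticated gcloud CLI."""
    subprocess.run(build_resize_command(group, zone, size), check=True)


if __name__ == "__main__":
    # Placeholder names: print the command instead of running it.
    print(shlex.join(build_resize_command("my-group", "us-central1-a")))
```

Note that the VM created this way is a new one built from the group's instance template; it is not the deleted instance restored.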
I am trying to set up a custom dashboard for my Compute Engine instances. One of the metrics that I want to report on is the amount of free disk space available on each VM. I noticed that "disk bytes used" is one of the available metrics, but I cannot select it unless I disable the "Only Show Active" option.
I have the "Ops Agent" (recently released) installed and running on the VMs.
I can't seem to find any documentation referencing this particular metric and how to get it working.
Has anyone tried this and figured out the magic solution?
Here is what I did in order to get the metrics working in a replicated environment:
1.- I created 2 GCE instances (Debian and RedHat).
2.- Navigate to the Monitoring section, and select Dashboards.
3.- Select the VM Instances Dashboard from the Dashboard List.
4.- From the Instances section, I selected both instances and clicked on Install Agents; it will open the Cloud Shell VM and auto populate the command to install the Ops Agent.
5.- You might need to wait up to 10 minutes to get the agents connected to the Monitoring Dashboard.
6.- Once you see the Ops Agent running on the instances, select the Infrastructure Summary Dashboard.
7.- Scroll down the Dashboard, and you will see the Top Disk Used (Agent) section populated.
If you prefer, you can also create a custom Dashboard.
1.- On the left panel, navigate to the Metrics Explorer section.
2.- In the Resource type, select VM Instance (gce_instance), and, at the bottom, unselect the “Only show active” checkbox.
3.- In the Metric dropdown menu, select Disk Usage, and also unselect the “Only show active” checkbox.
4.- You need to wait at least 1 minute to see the chart populated.
Here is the full list of metrics accepted for gce_compute
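The Metrics Explorer selection above can also be written as a Cloud Monitoring filter string, which is handy when defining a custom dashboard chart. The following is only a sketch: it assumes the Ops Agent reports disk data under `agent.googleapis.com/disk/percent_used` with a `state` label whose values include `free` and `used`, and the helper name `disk_metric_filter` is hypothetical.

```python
def disk_metric_filter(metric="disk/percent_used", state="free", instance_id=None):
    """Build a Cloud Monitoring filter for an agent disk metric on GCE VMs."""
    parts = [
        f'metric.type="agent.googleapis.com/{metric}"',
        'resource.type="gce_instance"',
        # Assumed label: the agent splits disk usage by state (free, used, ...).
        f'metric.labels.state="{state}"',
    ]
    if instance_id is not None:
        # Optionally narrow the chart to a single VM.
        parts.append(f'resource.labels.instance_id="{instance_id}"')
    return " AND ".join(parts)


if __name__ == "__main__":
    print(disk_metric_filter())
    print(disk_metric_filter("disk/bytes_used", "used", "1234567890"))
```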
Is there any way for a GCP Compute Engine instance to know if it was created by the Instance Group auto-scaling policy or if it was manually created?
In the logs we generate on our instances, we include the instance ID. This is fine for manual instances that are started to test something, but it's not that useful for other instances, as it clutters graphs of machine metrics.
In other words, for test machines we need the instance's id, but for other machines we need to log something else that's common to them all.
You can see who performed the creation in Stackdriver Logging by using the following filter:
resource.type="gce_instance"
"create"
You can select a log and expand it to see if the VM was created by a user (email) or by the Instance group manager.
Note: Keep in mind that Stackdriver has retention periods for logs.
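To make the "expand the log entry" step concrete, here is a minimal sketch that pulls the creator out of an exported log entry. It assumes the entry is an audit log in JSON form with the caller stored under `protoPayload.authenticationInfo.principalEmail`; the sample entry is invented for illustration, the function names are hypothetical, and treating any `.gserviceaccount.com` address as "not a human user" is only a heuristic.

```python
def creation_filter():
    """Return the two-line filter from the answer as a single string."""
    return 'resource.type="gce_instance"\n"create"'


def creator_of(entry):
    """Return the principal email of an audit-log entry, or None if absent."""
    auth = entry.get("protoPayload", {}).get("authenticationInfo", {})
    return auth.get("principalEmail")


def is_service_account(email):
    """Heuristic: service accounts end in .gserviceaccount.com."""
    return bool(email) and email.endswith(".gserviceaccount.com")


if __name__ == "__main__":
    # Invented sample entry, shaped like an exported audit log.
    sample = {
        "protoPayload": {
            "authenticationInfo": {"principalEmail": "alice@example.com"}
        }
    }
    who = creator_of(sample)
    print(who, "service account" if is_service_account(who) else "user")
```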
For some reason I see under "Operations" in my "Compute Engine" the following:
I would like to know/understand why this is happening. What is this gae-default-* VM (assuming these are actually VMs)? What are they doing actually?
If you know a lot of stuff about GAE and the Compute Engine please consider taking a look at this question "Deploying a GWT application to Google Compute Engine - What is happening here?" as well.
The CPU is getting utilized as well even though there can't be anything that runs:
If I manually delete those VMs they simply re-appear.
GAE stands for Google App Engine. Looks like you have some App Engine jobs configured. If you use the flexible version then it would manage GCE instances on your behalf. I would imagine you should be able to find the running jobs in the web console.
Google Compute Engine guide says that Google may migrate a VM in order to do maintenance:
https://cloud.google.com/compute/docs/instances/setting-instance-scheduling-options
By default, standard instances are set to live migrate, where Google
Compute Engine automatically migrates your instance away from an
infrastructure maintenance event, and your instance remains running
during the migration. Your instance might experience a short period of
decreased performance, although generally most instances should not
notice any difference.
Is there a disruption during migration?
Is it possible that Google decides to migrate all instances within a zone at the same time? Is there a maximum to a number of concurrent migrations?
Q: Is there a disruption during migration?
A: Yes, there is a short period of time during which the instance is running on neither the old host nor the new one. You can see how the process works here [1].
Q: Is it possible that Google decides to migrate all instances within a zone at the same time?
A: It is very unlikely that this scenario happens, as it would imply that all the Google Compute Engine instances in your project are on the same physical host.
Q: Is there a maximum to a number of concurrent migrations?
A: I don't know the answer to that question, but I have forwarded it to the appropriate team, so they may be able to answer it.
You can find more about the live migration procedure here [2].
[1] https://cloud.google.com/compute/docs/instances/live-migration#how_does_the_live_migration_process_work
[2] https://cloud.google.com/compute/docs/instances/live-migration
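If an instance needs to react to an upcoming migration itself, one option is to poll the metadata server from inside the VM. The sketch below assumes the `maintenance-event` metadata key and its values `NONE`, `MIGRATE_ON_HOST_MAINTENANCE`, and `TERMINATE_ON_HOST_MAINTENANCE`; the function names are hypothetical, and the fetch only works when run on a Compute Engine VM.

```python
import urllib.request

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/maintenance-event"
)


def fetch_maintenance_event(timeout=2.0):
    """Read the current maintenance event; only works on a GCE VM."""
    req = urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def describe_maintenance_event(value):
    """Translate the metadata value into a short human-readable status."""
    if value == "NONE":
        return "no maintenance event pending"
    if value == "MIGRATE_ON_HOST_MAINTENANCE":
        return "live migration pending or in progress"
    if value == "TERMINATE_ON_HOST_MAINTENANCE":
        return "instance will be terminated for maintenance"
    return f"unrecognized value: {value}"


if __name__ == "__main__":
    try:
        print(describe_maintenance_event(fetch_maintenance_event()))
    except OSError as exc:
        # Expected when not running on a Compute Engine VM.
        print("metadata server unreachable:", exc)
```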
I originally created an instance with a persistent boot and data disk. I wanted to test that should something happen to an instance, I could just recreate one with the same boot and data disk and it would run as normal.
However, I'm getting this error when creating the instance from the developer console:
Invalid value for field 'resource.disks[1].source': 'site-data'. Must be a URL to a valid Compute resource of the correct type.
The only thing I'm doing differently is setting the boot disk to the previous site-boot disk rather than a new image, and attaching the site-data disk in read/write.
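For reference, the error message is asking for a fully qualified resource URL in place of the bare name `site-data`. A minimal sketch of what such a value looks like for a zonal persistent disk follows; the project `my-project` and zone `us-central1-a` are placeholders, the helper `disk_url` is hypothetical, and whether the console was expected to expand the name on its own is not something this sketch can establish.

```python
def disk_url(project, zone, disk):
    """Return the fully qualified Compute Engine URL of a zonal disk."""
    for name, value in (("project", project), ("zone", zone), ("disk", disk)):
        if not value or "/" in value:
            raise ValueError(f"{name} must be a plain, non-empty name")
    return (
        "https://www.googleapis.com/compute/v1/"
        f"projects/{project}/zones/{zone}/disks/{disk}"
    )


if __name__ == "__main__":
    # Placeholder project and zone; 'site-data' is the disk from the question.
    print(disk_url("my-project", "us-central1-a", "site-data"))
```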
I suggest you try again -- it looks like their web-based Developer Console was broken for a few days bracketing the time you put your question in. It seems to work correctly now.
I also received this error when attempting to create an instance that included an additional Persistent Disk. Creating an instance with only the boot drive worked fine, but attempting to create an instance with any additional disk (including a new, empty disk) resulted in the same error you reported above.
I used the "Need Help?" link at the bottom left of the 'Create a new instance' web form to report the problem yesterday (10/21/14). Although I did not receive any kind of reply (I have not paid for any support options), the issue was resolved within 24 hours. I am now able to successfully create instances with additional Persistent Disks again.