While attempting to start compute instances in us-east1-b, we're repeatedly but intermittently getting the error code ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS, which Google's documentation indicates is due to a lack of resource availability. We used to be able to fill our CPU quota, but now only a small fraction of it is usable at any given point. Other regions allowed us to start new instances, and we are nowhere near any of our project quotas, so it seems that Google itself doesn't have the resources available to allocate in this zone - but we're wondering whether other projects are hitting the same issue, to confirm.
This can happen to any project whenever there is a spike in demand for a particular machine type in a region.
Google Cloud Platform also offers a feature called reservations, which provides a very high level of assurance of obtaining capacity on Google Cloud Platform.
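If the workload is zone-flexible, a common mitigation in the meantime is to catch that error and retry the insert in a sibling zone. Below is a minimal sketch using the Python API client; the project ID, zone list, machine type, and instance body are placeholders, not a definitive recipe:

    from googleapiclient import discovery

    compute = discovery.build('compute', 'v1')

    PROJECT = 'my-project'                               # placeholder
    ZONES = ['us-east1-b', 'us-east1-c', 'us-east1-d']   # fallback order

    def insert_with_fallback(body):
        # 'body' is assumed to be a complete instance resource
        # (name, disks, networkInterfaces, ...) apart from machineType.
        for zone in ZONES:
            # Machine type URLs are zonal, so rewrite them per attempt.
            body['machineType'] = 'zones/%s/machineTypes/n1-standard-1' % zone
            op = compute.instances().insert(
                project=PROJECT, zone=zone, body=body).execute()
            op = compute.zoneOperations().wait(
                project=PROJECT, zone=zone, operation=op['name']).execute()
            errors = op.get('error', {}).get('errors', [])
            if not errors:
                return zone                              # created successfully
            if not any('RESOURCE_POOL_EXHAUSTED' in e.get('code', '')
                       for e in errors):
                raise RuntimeError(errors)               # unrelated failure
        raise RuntimeError('No capacity in any candidate zone')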
Related
I see that Google Cloud may terminate preemptible instances at any time, but have any unofficial, independent studies been reported, showing "preempt rates" (number of VMs preempted per hour), perhaps sampled in several different regions?
Given how little information I'm finding (as with similar questions), even anecdotes such as "Looking back over the past 6 months, I generally see 3% - 5% of instances preempted per hour in us-west1" would be useful (I presume this can be monitored similarly to instance count metrics in AWS).
Clients occasionally want to shove their existing, non-fault-tolerant code into the cloud "for cheap" (despite best practices), and without an expected failure rate in hand they're often blinded by the cheapness of preemptible instances, so I'd like to share some typical experiences from the GCP community, even if people's experiences vary, to help set safe expectations.
Regarding “unofficial, independent studies”, “even anecdotes such as:”, and “clients occasionally want to shove their existing, non-fault-tolerant code in the cloud for 'cheap'”: it ought to be said that no architect or sysadmin in their right mind would place production workloads with a defined SLA into an execution environment without an SLA. Hence the topic is rather speculative.
For those who are keen, Google does provide a preemption rate expectation:
For reference, we've observed from historical data that the average preemption rate varies between 5% and 15% per day per project, on a seven-day average, occasionally spiking higher depending on time and zone. Keep in mind that this is an observation only: Preemptible instances have no guarantees or SLAs for preemption rates or preemption distributions.
Besides that, there is an interesting edutainment approach to the task of “how to make the inapplicable applicable”.
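As one concrete example of that approach: a preemptible VM gets roughly a 30-second warning, and the instance metadata server exposes a preempted flag, so even non-fault-tolerant code can at least checkpoint before the machine goes away. A rough sketch (do_work_chunk and save_checkpoint are hypothetical stand-ins for your own workload):

    import time
    import requests

    # Instance metadata flag that flips to TRUE on preemption.
    PREEMPTED_URL = ('http://metadata.google.internal/computeMetadata/'
                     'v1/instance/preempted')

    def is_preempted():
        # The metadata server requires this header on every request.
        r = requests.get(PREEMPTED_URL, headers={'Metadata-Flavor': 'Google'})
        return r.text.strip() == 'TRUE'

    while True:
        do_work_chunk()        # hypothetical: one resumable unit of work
        if is_preempted():
            save_checkpoint()  # hypothetical: persist progress to disk/GCS
            break
        time.sleep(1)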
When publishing a large number of events to a topic (where the retry and time-to-live are in the minutes), many fail to get delivered to the subscribed functions. Does anyone know of any settings or approaches to ensure scaling reacts quickly without dropping them all?
I am creating an Azure Functions app that essentially passes events to an Event Grid topic at a high rate, and other functions subscribed to the topic will handle the events. These events are meant to be short-lived and not persist longer than a specified number of minutes. Ideally I want to see the app scale to handle the load without dropping events. The overall goal is that each event will trigger an outbound call to an endpoint of my own API to test performance/load.
I have reviewed documentation on MSDN and elsewhere, but not much fits my scenario (most of it talks in terms of incoming events rather than outbound HTTP events).
For scaling, I have looked into the host.json settings for HTTP (there are none for Event Grid events, and Event Grid triggers look to be similar to HTTP triggers), and setting those seems to have made some improvements; see the snippet below.
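For reference, what I tuned looks roughly like this (a Functions v2 host.json; the values are just ones I experimented with, not recommendations):

    {
      "version": "2.0",
      "extensions": {
        "http": {
          "maxOutstandingRequests": 200,
          "maxConcurrentRequests": 100,
          "dynamicThrottlesEnabled": false
        }
      }
    }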
The end result I expect is that every event published to the topic endpoint gets delivered to a function and executed, with a low failed-delivery/drop rate.
What I am seeing instead is that when publishing many events to the topic (at a consistent rate), the majority of events get dead-lettered/dropped.
The Consumption plan is limited by the computing power assigned to your function. In essence, there are limits up to which it can scale, and beyond that it becomes the bottleneck.
I suggest having a look at the limitations.
And here you can find some insights into the differences in computing power.
If you want to enable automatic scaling, or to scale the number of VM instances yourself, I suggest using an App Service plan. The cheapest option where scaling is supported is the Standard pricing tier.
I am using the Google free trial with $300 in credit. I recently tried to launch a GPU instance as per this.
I have configured the right region, but the message is "Quota 'NVIDIA_K80_GPUS' exceeded. Limit: 0.0". Does this mean that GPUs are not available in the free trial, or is it some kind of error on GCP's side?
By default the GPU quota is zero for everyone. You need to request additional quota if you want to use GPUs, and the request form is only available once you upgrade your account. The quota increase form says: "Please note that projects using free trial credits are not eligible for quota increases until the free trial period has ended."
Update
GPUs on Google Cloud are no longer in beta and are shown as available in the free trial, but you can't start the machine in free trial mode because the quota is 0.
Refer here for information on regions and restrictions.
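You can check the current limit for yourself before (or after) requesting an increase; regional quotas are visible via gcloud compute regions describe or the API. A small sketch with the Python API client (project and region are placeholders):

    from googleapiclient import discovery

    compute = discovery.build('compute', 'v1')
    region = compute.regions().get(
        project='my-project', region='us-east1').execute()

    for quota in region['quotas']:
        if 'GPU' in quota['metric']:
            # On a free trial project this shows a limit of 0.0.
            print(quota['metric'],
                  'usage:', quota['usage'], 'limit:', quota['limit'])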
I'm unable to understand the free bandwidth/traffic allowed per Google Compute Engine instance. I'm using DigitalOcean, where every server comes with free bandwidth/transfer, e.g. with the $0.015 (1 GB / 1 CPU) droplet, 2 TB of transfer is allowed.
Hence: is there any free bandwidth per Compute Engine instance, or will Google charge for every bit transferred to/from the VM?
As documented on our Network Pricing page, traffic prices depend on the source and destination. There is no "bucket of bits up to X GB" that is free, like on a cellphone plan or something. Rather, certain types of traffic are always free and other types are charged. For example, anything coming in from the internet is free, as is anything sent to another VM in the same zone (using internal IPs).
If you are in Free Trial, then of course we give you usage credits, so you can use up to that total amount, in dollars, "for free."
If I am within my quota, am I guaranteed to be able to have my instance created (assuming all other inputs are valid)? I am trying to determine whether my quota equates to reserved capacity that I can count on being available when needed.
Google Compute Engine is engineered for scale, and a fundamental design goal is enabling all users to scale their workloads up (and down) on demand.
However, quotas are not private reservations and instances.insert() can fail in rare cases.
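In practice that means treating instances.insert() as a request rather than a guarantee: the returned operation can still carry a capacity error even when you are within quota, so it is worth checking. A minimal sketch with the Python API client (project, zone, and the instance body are placeholders):

    from googleapiclient import discovery

    compute = discovery.build('compute', 'v1')

    # 'body' is assumed to be a complete instance resource definition.
    op = compute.instances().insert(
        project='my-project', zone='us-east1-b', body=body).execute()
    op = compute.zoneOperations().wait(
        project='my-project', zone='us-east1-b',
        operation=op['name']).execute()

    for err in op.get('error', {}).get('errors', []):
        # e.g. ZONE_RESOURCE_POOL_EXHAUSTED despite available quota
        print(err.get('code'), err.get('message'))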