Inter data center traffic of Google Compute VMs - google-compute-engine

Does anyone know if there is a limit on network traffic between VMs in different data centers in Google Compute Engine?
Specifically, are there any performance limits if VMs in different DCs are frequently (every 5 ms) communicating with each other?
Thanks in advance.

I'm sure there are some performance limits, but they should be fairly high if you're within the same region (>100 Mbps, possibly >1 Gbps). Between regions, bandwidth is likely to be somewhat more variable, but I'd expect it to be >100 Mbps on the same continent.
Note that there are also egress fees for traffic between VMs in different GCP zones, so you may want to keep an eye on the total data transferred; 130 Mbps works out to around 1 GB every minute, or about $6/hour.
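The back-of-the-envelope math above can be reproduced like this; the $0.10/GB rate is an assumption for illustration, so check the current network pricing page for real numbers:

```python
# Back-of-the-envelope egress cost from sustained bandwidth.
# $0.10/GB is a hypothetical inter-region rate used for illustration;
# consult the current GCP network pricing page for actual rates.
def egress_cost_per_hour(mbps: float, dollars_per_gb: float = 0.10) -> float:
    bytes_per_sec = mbps * 1e6 / 8          # megabits/s -> bytes/s
    gb_per_hour = bytes_per_sec * 3600 / 1e9  # decimal GB per hour
    return gb_per_hour * dollars_per_gb

# 130 Mbps ~ 1 GB/minute ~ 58.5 GB/hour
print(f"${egress_cost_per_hour(130):.2f}/hour")  # → $5.85/hour, roughly $6
```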

Related

Compute Engine - Automatic scale

I have one Compute Engine VM hosting simple apps. My apps are growing, and so is the number of users.
My users work basically from 08:00 AM to 07:00 PM; during this period CPU and memory usage is high, and working speed is very important.
I'm preparing to expand the memory and processor in the next few days, but I'm looking for a more scalable and cost-effective approach.
Is there a way to add resources automatically when I need them and reduce them when I no longer do?
Thanks
The cost of running your VMs is directly related to a number of factors, e.g. the network tier in use (Premium vs. Standard), the machine type, the boot disk image you use (premium vs. open-source images), and the region/zone where your workloads run, among other things.
Your use case seems to fit managed instance groups (MIGs). With MIGs you essentially configure a template for VMs that share the same attributes. When configuring your MIG, you can specify the CPU/memory utilization target beyond which the MIG autoscaler kicks in. When the reading drops back below that threshold, the MIG scales your VMs down toward the minimum number of instances set in the autoscaling policy.
You can also use requests per second as an autoscaling signal; I'd recommend exploring the docs to learn more about it.
See docs
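The threshold behavior described above can be sketched as a toy scaling rule. This is an illustration only, not the real MIG autoscaler; the target, minimum, and maximum values are made-up defaults:

```python
# Toy sketch of threshold-based autoscaling as described above
# (illustrative only -- the real MIG autoscaler uses utilization
# targets, cooldown periods, and predictive logic).
def autoscale(current_vms: int, avg_cpu: float,
              target_cpu: float = 0.75,
              min_vms: int = 1, max_vms: int = 10) -> int:
    if avg_cpu > target_cpu:
        return min(current_vms + 1, max_vms)  # scale out, capped at max
    if avg_cpu < target_cpu:
        return max(current_vms - 1, min_vms)  # scale in, floored at min
    return current_vms

print(autoscale(3, 0.90))  # → 4 (scale out during the 08:00-19:00 peak)
print(autoscale(3, 0.20))  # → 2 (scale in overnight)
```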

N2D/C2 quota for europe

I'm a Google Cloud user who built a cloud HPC system. The web application is really demanding in terms of hardware resources [it's quite common for me to have 4-5 N1 instances allocated with 96 cores each, for a total of more than 400 cores].
I found that in certain zones it's currently possible to use N2D and C2 instances, the former offering more CPU and the latter being dedicated to compute workloads. Unluckily, I can't use these two instance types because, for some reason, I have trouble increasing the N2D_CPUS and C2_CPUS quotas above the default value of 24 [which is nothing considering my needs].
Can anyone help me?
The only way to increase the quota is to submit a Quota Increase request. Once you submit the request, you should receive an email saying that the request has been submitted and that it is being reviewed.

How often does Google Cloud Preemptible instances preempt (roughly)?

I see that Google Cloud may terminate preemptible instances at any time, but have any unofficial, independent studies been reported, showing "preempt rates" (number of VMs preempted per hour), perhaps sampled in several different regions?
Given how little information I'm finding (as with similar questions), even anecdotes such as: "Looking back the past 6 months, I generally see 3% - 5% instances preempt per hour in uswest1" would be useful (I presume this can be monitored similarly to instance count metrics in AWS).
Clients occasionally want to shove their existing, non-fault-tolerant code into the cloud "for cheap" (despite best practices), and without an expected rate of failure they're often blinded by the cheapness of preemptible instances. So I'd like to share some typical experiences of the GCP community, even if individual experiences vary, to help set safe expectations.
Regarding "unofficial, independent studies", "even anecdotes such as:", and "Clients occasionally want to shove their existing, non-fault-tolerant code in the cloud for 'cheap'": it ought to be said that no architect or sysadmin in their right mind would place production workloads with a defined SLA into an execution environment without one. Hence the topic is rather speculative.
For those who are keen, Google provides a preemption rate expectation:
For reference, we've observed from historical data that the average preemption rate varies between 5% and 15% per day per project, on a seven-day average, occasionally spiking higher depending on time and zone. Keep in mind that this is an observation only: preemptible instances have no guarantees or SLAs for preemption rates or preemption distributions.
Besides that, there is an interesting edutainment approach to the task of "how to make the inapplicable applicable".
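To get a feel for what the quoted 5%-15% per day range implies, here is a toy survival calculation. It assumes independent, uniform daily preemption, which the quote explicitly does not guarantee, so treat it as a rough expectation-setter only:

```python
# Toy model: probability that a single preemptible VM survives N days,
# assuming an independent daily preemption probability p. This is a
# hypothetical simplification -- real preemptions are neither
# independent nor uniform across time and zones.
def survival_probability(daily_rate: float, days: int) -> float:
    return (1 - daily_rate) ** days

for p in (0.05, 0.15):
    print(f"daily rate {p:.0%}: 7-day survival "
          f"{survival_probability(p, 7):.1%}")
```

At the quoted extremes, a VM has somewhere between a ~32% and ~70% chance of lasting a full week, which is why fault tolerance matters for anything long-running.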

Benefit of running Kubernetes in bare metal and cloud with idle VM or machines?

I want to know the high level benefit of Kubernetes running in bare metal machines.
So let's say we have 100 bare metal machines ready, with the kubelet deployed on each. Doesn't that mean that when the application only runs on 10 machines, we are wasting the other 90, which just stand by unused?
For cloud, does Kubernetes launch new VMs as needed, so that clients do not pay for idle machines?
How does Kubernetes handle the extra machines that are needed at the moment?
Yes, if you have 100 bare metal machines and use only 10, you are wasting money. You should only deploy the machines you need.
The Node Autoscaler works on certain cloud providers, e.g. AWS, GKE, or OpenStack-based infrastructures.
Now, Node Autoscaler is useful if your load is not very predictable and/or scales up and down widely over the course of a short period of time (think Jobs or cyclic loads like a Netflix type use case).
If you're running services that just need to scale gradually as your customer base grows, it's less useful, since it's just as easy to add new nodes manually.
Kubernetes will handle some amount of auto-scaling with a fixed number of nodes (i.e. you can run many Pods on one node, and you would usually size your machines to run in a safe range while still absorbing traffic spikes by spinning up more Pods on those nodes).
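Pod-level scaling of this kind is what the Horizontal Pod Autoscaler does; its core rule (per the Kubernetes docs) can be sketched as:

```python
import math

# Core rule of the Kubernetes Horizontal Pod Autoscaler:
#   desired = ceil(current_replicas * current_metric / target_metric)
# (The real HPA adds tolerances, stabilization windows, etc.)
def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    return math.ceil(current_replicas * current_metric / target_metric)

# e.g. 4 Pods averaging 90% CPU against a 60% target -> scale to 6
print(desired_replicas(4, 0.90, 0.60))  # → 6
```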
As a side note: with bare metal, you typically gain in performance, since you don't have the overhead of a VM / hypervisor, but you need to supply distributed storage, which a cloud provider would typically provide as a service.

What's the free bandwidth given with Google compute engine instances

I'm unable to understand the free bandwidth/traffic allowed per Google Compute Engine instance. I'm using DigitalOcean, and there every server comes with free bandwidth/transfer, e.g. for $0.015 you get 1 GB / 1 CPU with 2 TB of transfer allowed.
So, is there any free bandwidth per Compute Engine instance, or does Google charge for every bit transferred to/from the VM?
As documented on our Network Pricing page, traffic prices depend on the source and destination. There is no free "bucket of bits up to x GB" like a cellphone plan. Rather, certain types of traffic are always free and other types are charged. For example, anything coming in from the internet is free, as is anything sent to another VM in the same zone (using internal IPs).
If you are in Free Trial, then of course we give you usage credits, so you can use up to that total amount, in dollars, "for free."
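The free-vs-charged distinction described above can be sketched as a tiny classifier. The rules are a simplification of the two examples given in the answer, not the full pricing matrix; consult the Network Pricing page for the real rates:

```python
# Simplified sketch of the GCE traffic rules described above.
# Only the two "always free" cases from the answer are modeled;
# everything else is treated as (potentially) charged egress.
def traffic_is_free(direction: str, same_zone: bool, internal_ip: bool) -> bool:
    if direction == "ingress":       # anything coming in from the internet
        return True
    if same_zone and internal_ip:    # VM-to-VM within one zone, internal IPs
        return True
    return False                     # other egress is charged per GB

print(traffic_is_free("ingress", same_zone=False, internal_ip=False))  # True
print(traffic_is_free("egress", same_zone=True, internal_ip=True))     # True
print(traffic_is_free("egress", same_zone=False, internal_ip=True))    # False
```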