Benefits of running Kubernetes on bare metal and in the cloud with idle VMs or machines?

I want to understand the high-level benefit of running Kubernetes on bare metal machines.
Say we have 100 bare metal machines ready, each with the kubelet deployed. If the application only runs on 10 of them, doesn't that mean the remaining 90 machines are wasted, just standing by and not being used for anything?
In the cloud, does Kubernetes launch new VMs as needed, so that clients do not pay for idle machines?
How does Kubernetes handle extra machines when they are needed?

Yes, if you have 100 bare metal machines and use only 10, you are wasting money. You should only deploy the machines you need.
The Node Autoscaler works with certain cloud providers, such as AWS, GKE, or OpenStack-based infrastructures.
The Node Autoscaler is useful if your load is not very predictable and/or swings widely up and down over a short period of time (think Jobs, or cyclic loads like a Netflix-type use case).
If you're running services that just need to scale gradually as your customer base grows, it is not so useful, since it is just as easy to add new nodes manually.
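For example, on GKE the cluster autoscaler is enabled per node pool; a minimal sketch (the cluster and node pool names are placeholders):

```bash
# Enable the cluster autoscaler on an existing GKE node pool.
# "my-cluster" and "default-pool" are placeholder names.
gcloud container clusters update my-cluster \
    --node-pool default-pool \
    --enable-autoscaling \
    --min-nodes 1 \
    --max-nodes 10
```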
Kubernetes will handle a fair amount of scaling on a fixed number of nodes: you can run many Pods on one node, and you would usually size your machines to run in a safe range while still handling spikes in traffic by spinning up more Pods on those nodes.
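A minimal sketch of that Pod-level scaling with the Horizontal Pod Autoscaler (the Deployment name and thresholds are made up):

```bash
# Scale the hypothetical "my-app" Deployment between 2 and 10 replicas,
# targeting 70% average CPU utilization across its Pods.
kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=70

# Inspect the resulting HorizontalPodAutoscaler.
kubectl get hpa my-app
```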
As a side note: with bare metal you typically gain performance, since you don't have the overhead of a VM/hypervisor, but you need to supply your own distributed storage, which a cloud provider would typically offer as a service.

Compute Engine - Automatic scaling

I have one Compute Engine VM hosting simple apps. My apps are growing, and so is the number of users.
My users work basically from 08:00 AM to 07:00 PM; during this period CPU and memory usage is high, and speed is very important.
I'm preparing to expand the memory and processor in the next few days, but I'm looking for a more scalable and cost-effective approach.
Is there a way to automatically add resources when I need them and reduce them when they are no longer needed?
Thanks
The cost of running your VMs is directly related to a number of different factors, e.g. the type of network in use (premium vs. standard), the machine type, the boot disk image you use (premium vs. open-source images), and the region/zone where your workloads are running, among other things.
Your use case seems to fit managed instance groups (MIGs). With MIGs you essentially configure a template for VMs that share the same attributes. When configuring your MIG, you can specify the CPU/memory threshold beyond which the MIG autoscaler kicks in. When your CPU/memory reading drops below that threshold, the MIG scales your VMs back down toward the minimum number of instances you configured.
You can also use requests per second as an autoscaling threshold; I would recommend exploring the docs to learn more about it.
See docs
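As a rough sketch of such a setup (all names, the machine type, and the image are illustrative, not recommendations):

```bash
# Template describing the identical VMs in the group.
gcloud compute instance-templates create app-template \
    --machine-type e2-medium \
    --image-family debian-12 \
    --image-project debian-cloud

# Managed instance group built from that template.
gcloud compute instance-groups managed create app-mig \
    --template app-template --size 1 --zone us-central1-a

# Autoscale between 1 and 5 VMs on a 75% average CPU target.
gcloud compute instance-groups managed set-autoscaling app-mig \
    --zone us-central1-a \
    --min-num-replicas 1 --max-num-replicas 5 \
    --target-cpu-utilization 0.75
```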

Looking for a simple cluster configuration

I am using Compute Engine for embarrassingly parallel scientific calculations. Some of my calculations require a single core and some require 64-core machines. I am currently using my own scripts: I have a qsub-like command that creates a new instance with the required number of cores, boots it from a custom image with the pre-installed software, connects to a storage bucket via gcsfuse, runs the required command, and then kills the instance after it's done.
Do I really need to do all of that with my own scripts, or is there a tool I should use instead? I'd much rather use some ready-made tool for all of the management.
My usage fluctuates widely (hundreds of cores in parallel for 3 hours, then 2 days with nothing, etc.), so I don't want constant-sized machines: I like being billed by the minute for my computations.
You may want to use the autoscaling feature for managed instance groups in Google Compute Engine (GCE). This feature adds more instances to your instance group when there is more load (upscaling) and removes instances when there is less load (downscaling). Moreover, you can define the autoscaling policy based on CPU utilization, load balancer utilization, or requests per second. Please refer to the autoscaler decisions document to understand the decisions the autoscaler may make when scaling instance groups.
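A hedged sketch of attaching such a policy to an existing group (the group name and values are illustrative):

```bash
# Autoscale the hypothetical "compute-mig" group on average CPU.
# --cool-down-period delays autoscaling decisions for fresh instances,
# so a booting 64-core VM isn't judged by its startup CPU spike.
gcloud compute instance-groups managed set-autoscaling compute-mig \
    --zone us-central1-a \
    --min-num-replicas 1 --max-num-replicas 50 \
    --target-cpu-utilization 0.8 \
    --cool-down-period 120
```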

What is the difference between distributed computing, microservices, and parallel computing?

My basic understanding:
Distributed computing is a model of connected nodes - from a hardware perspective they share only a network connection - that communicate through messages. Each node can be responsible for one part of the business logic; for example, in an ERP system there is a node for HR and a node for accounting. Communication could be HTTP, SOA, RPC.
A microservice is a service responsible for one part of the business logic; microservices communicate with each other, usually over HTTP. Microservices can share hardware resources and are accessed through their APIs.
Parallel systems are systems that optimize the use of resources, for example a multithreaded app running on several threads that share memory resources.
I am a little bit confused, since microservices are distributed systems, but when running multiple microservices on a single machine they are also parallel systems. Am I getting this right?
Microservices are one way to do distributed computing. There are many more distributed computing models, like MapReduce and Bulk Synchronous Parallel.
However, as you pointed out, you don't need multiple machines for microservices: you can put all your services on one machine. It's like using a screwdriver to hammer a nail ;). Yes, you'll get parallel computation on a single multi-core machine, but are microservices the right way to achieve it? They might be, if you plan to move those services onto separate machines later. However, if those services require co-location, then microservices were the wrong tool.
Distributed systems are one way to do parallel computing. There are many different ways to achieve parallel computation, like grid computing, multi-core machines, etc. Many of them are listed in the article I linked.

Kubernetes on GCE / Preventing pods from being evicted with "The node was low on compute resources."

A painful investigation into aspects that, so far, are not well highlighted by the documentation (at least from what I've googled).
My cluster's kube-proxy pods were evicted (more experienced users can probably imagine the issues this causes). I searched a lot, but found no clues about how to bring them up again.
Eventually, describing the affected pod gave a clear reason: "The node was low on compute resources."
Still not that experienced with balancing resources between pods/deployments and "physical" compute, how would one prioritize (or use a similar approach) to make sure specific pods never end up in such a state?
The cluster was created with fairly low resources in order to get our hands dirty while keeping costs low and, eventually, witnessing such problems (gcloud container clusters create deemx --machine-type g1-small --enable-autoscaling --min-nodes=1 --max-nodes=5 --disk-size=30). Is using g1-small prohibitive?
If you are using iptables-based kube-proxy (the current best practice), then kube-proxy being killed should not immediately cause your network connectivity to fail, but new services and updates to endpoints will stop working. Still, your apps should continue to work, but degrade slowly.
If you are using userspace kube-proxy, you might want to upgrade.
The error message sounds like it was due to memory pressure on the machine.
When there is memory pressure, Kubelet tries to terminate things in order of lowest to highest QoS level.
If your kube-proxy pod is not using Guaranteed resources, then you might want to change that.
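As an illustration of giving a workload Guaranteed resources (the Deployment name, label, and values here are invented; the key point is that requests must equal limits for every container):

```bash
# Requests == limits for every container puts the Pod in the
# Guaranteed QoS class, which is evicted last under memory pressure.
kubectl set resources deployment critical-app \
    --requests=cpu=100m,memory=128Mi \
    --limits=cpu=100m,memory=128Mi

# Verify the resulting QoS class (assumes the Pods carry an
# app=critical-app label).
kubectl get pod -l app=critical-app \
    -o jsonpath='{.items[0].status.qosClass}'
```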
Other things to look at:
if kube-proxy suddenly used a lot more memory, it could be terminated. If you made a huge number of pods or services or endpoints, this could cause it to use more memory.
if you started processes on the machine that are not under kubernetes control, that could cause kubelet to make an incorrect decision about what to terminate. Avoid this.
It is possible that on a machine as small as a g1-small, the amount of node resources held back for the system is insufficient, so that too much guaranteed work gets placed on the machine -- see allocatable vs. capacity (and the sketch after the documentation link below). This might need tweaking.
Node oom documentation
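To see that gap on a node (the node name is a placeholder):

```bash
# Capacity is what the node physically has; allocatable is what the
# scheduler may hand out. The difference is held back for the system.
kubectl get node <node-name> \
    -o jsonpath='{.status.capacity}{"\n"}{.status.allocatable}{"\n"}'
```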

Is hosting my multiplayer HTML5 game on a free Heroku dyno hurting my network performance?

I've recently built a multiplayer game in HTML5 using the TCP-based WebSocket protocol for the networking. I have already taken steps in my code to minimize lag (using interpolation, minimizing the number and size of messages sent), but I occasionally run into lag and choppiness that I believe comes from a combination of packet loss and TCP's policy of in-order delivery.
To elaborate: my game sends frequent WebSocket messages to players to update them on the positions of the enemy players. If a packet gets dropped or delayed, my understanding is that it will prevent later packets from being received in a timely manner, which causes the enemy players to appear frozen in the same spot and then zoom to the correct location once the delayed packet is finally received.
I confess that my understanding of networking/bandwidth/congestion is quite weak. I've been wondering whether running my game on a single free Heroku dyno, which is basically a VM on another virtual server (Heroku dynos run on EC2 instances), could be exacerbating this problem. Do Heroku dynos, and multi-tenant servers in general, tend to have worse network congestion due to noisy neighbors or other reasons?
Yes. You don't get dedicated networking performance from Heroku dynos. Some classes of EC2 instances in a VPC can have "Enhanced Networking" enabled, which is supposed to give you dedicated performance.
Ultimately, though, the best thing to do before jumping to a new solution is benchmarking. Benchmark what level of throughput you can get from a Heroku dyno, then benchmark an Amazon instance to see what kind of difference it makes.
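A rough sketch of that comparison using an HTTP load generator (the tool choice and URLs are assumptions; for a WebSocket game you would ideally also measure message round-trip times in the client):

```bash
# Throughput/latency against the Heroku deployment (hypothetical URL).
wrk -t4 -c100 -d30s https://your-game.herokuapp.com/

# The same app deployed on a dedicated EC2 instance (hypothetical URL).
wrk -t4 -c100 -d30s http://your-ec2-host.example.com/
```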