Google Container Engine Architecture - google-compute-engine

I was exploring the architecture of Google's IaaS/PaaS offerings, and I am confused as to how GKE (Google Container Engine) runs in Google data centers. From this article (http://www.wired.com/2012/07/google-compute-engine/) and also from some of the Google IO 2012 sessions, I gathered that GCE (Google Compute Engine) runs the provisioned VMs using KVM (Kernel-based Virtual Machine); these VMs run inside Google's cgroups-based containers (this allows Google to schedule user VMs the same way they schedule their existing container-based workloads, probably using Borg/Omega). Now how does Kubernetes figure into this, given that it makes you run Docker containers on GCE-provisioned VMs, and not on bare metal? If my understanding is correct, then Kubernetes-scheduled Docker containers run inside KVM VMs, which themselves run inside Google cgroups containers scheduled by Borg/Omega...
Also, how does Kubernetes networking fit into Google's existing GCE Andromeda software-defined networking?
I understand that this is a very low-level architectural question, but I feel that understanding the internals will improve my understanding of how user workloads eventually run on bare metal. Also, I'm curious whether the whole arrangement of running containers on VMs inside containers is necessary from a performance point of view. E.g. doesn't networking performance degrade by having multiple layers? Google mentions in its Borg paper (http://research.google.com/pubs/archive/43438.pdf) that they run their container-based workloads without a VM (they don't want to pay the "cost of virtualization"); I understand the logic of running public external workloads in VMs (better isolation, a more familiar model, heterogeneous workloads, etc.), but with Kubernetes, couldn't our workloads be scheduled directly on bare metal, just like Google's own workloads?

It is possible to run Kubernetes on both virtual and physical machines; see this link. Google's Cloud Platform only offers virtual machines as a service, and that is why Google Container Engine is built on top of virtual machines.
In Borg, containers can be of arbitrary size, and Google doesn't pay any resource penalty for odd-sized tasks.

Related

What is the difference between application console vs cluster console?

What is the difference between the application console and the cluster console in the OpenShift enterprise version? I am new to OpenShift and confused by the terminology. I feel that OpenShift is like the Linux kernel in our system (an analogy): on top of that are containers, and to orchestrate them we have Kubernetes. However, the architecture of OpenShift seems to be the exact opposite. Please correct me.
OpenShift is just one of the available Kubernetes distributions, which adds enterprise-level services like authentication, authorization and multitenancy.
The web console provides two perspectives: Administrator and Developer. The Developer perspective provides workflows specific to developer use cases such as creating, deploying and monitoring applications, while the Administrator perspective is responsible for managing cluster resources, users, and projects. Depending on the user's role, you will see a different set of views available in the main menu.

What is the difference between Serverless containers and App Engine flexible with custom runtimes?

I came across the article Bringing the best of serverless to you,
where I learned about an upcoming product called Serverless containers on Cloud Functions, which is currently in alpha.
As described in the article:
Today, we’re also introducing serverless containers, which allow you
to run container-based workloads in a fully managed environment and
still only pay for what you use.
and on the GCP solutions page:
Serverless containers on Cloud Functions enables you to run your own containerized workloads on
GCP with all the benefits of serverless. And you will still pay only
for what you use. If you are interested in learning more about
serverless containers, please sign up for the alpha.
So my question is: how are these serverless containers different from App Engine flexible with custom runtimes, which also uses a Dockerfile?
It's also my suspicion, since the product is named Serverless containers on Cloud Functions, that the differentiation may involve the role of Cloud Functions. If so, what role do Cloud Functions play in serverless containers?
Please clarify.
What are Cloud Functions?
From the official documentation:
Google Cloud Functions is a serverless execution environment for building and connecting cloud services. With Cloud Functions you write simple, single-purpose functions that are attached to events emitted from your cloud infrastructure and services. Your function is triggered when an event being watched is fired. Your code executes in a fully managed environment. There is no need to provision any infrastructure or worry about managing any servers.
In simple words, the Cloud Function is triggered by some event (HTTP request, PubSub message, Cloud Storage file insert...), runs the code of the function, returns a result and then the function dies.
Currently there are four runtime environments available:
Node.js 6
Node.js 8 (Beta)
Python (Beta)
Go (Beta)
The Serverless containers on Cloud Functions product is intended to let you provide your own custom runtime environment as a Docker image. But the life cycle of the Cloud Function will be the same:
It is triggered > Runs > Outputs Result > Dies
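To make this lifecycle concrete, here is what such a single-purpose function can look like in the Python runtime (a minimal sketch; the function name hello_http is purely illustrative):

```python
# Minimal HTTP-triggered Cloud Function in the Python runtime.
# The platform passes a Flask request object; the function runs,
# returns a response, and the instance is then free to be torn down.
def hello_http(request):
    """Respond to an HTTP request with a short greeting."""
    name = request.args.get("name", "world")  # optional query parameter
    return "Hello, {}!".format(name)
```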
App Engine Flex applications
Applications running in the App Engine flexible environment are deployed to virtual machines, i.e. Google Compute Engine instances. You can choose the type of machine you want to use and the resources (CPU, RAM, disk space). The App Engine flexible environment automatically scales your app up and down while balancing the load.
As in the case of Cloud Functions, there are runtimes provided by Google, but if you would like to use an alternative implementation of Python, Java, Node.js, Go, Ruby, PHP or .NET, you can use custom runtimes. You can even work with another language like C++ or Dart; you just need to provide a Docker image for your application.
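As an illustration of what a custom runtime boils down to (a hedged sketch, assuming Flask; the file name app.py is hypothetical), the image just has to start a web server listening on the port App Engine passes in the PORT environment variable, and the Dockerfile you provide installs the dependencies and launches this script:

```python
# app.py - minimal web app a custom-runtime image could run on App Engine flex.
# The container must listen on the port given by the PORT environment variable
# (8080 by default); the custom runtime's Dockerfile would start this script.
import os

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from a custom runtime on App Engine flexible"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```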
What are the differences between Cloud Functions and App Engine Flex apps?
The main differences between them are the life cycle and the use case.
As noted above, a Cloud Function has a defined life cycle and dies when its task concludes. It should be used to do one thing and do it well.
On the other hand, an application running in the GAE flexible environment will always have at least one instance running. The typical use case for these applications is to serve several endpoints where users can make REST API calls. But they provide more flexibility, as you have full control over the Docker image provided. You can do "almost" whatever you want there.
What is a Serverless Container?
As stated on the official blog post (search for "serverless containers"), it's basically a Cloud Function running inside a custom environment defined by the Dockerfile.
It is stated on the official blog post:
With serverless containers, we are providing the same underlying
infrastructure that powers Cloud Functions, but you’ll be able to
simply provide a Docker image as input.
So, instead of deploying your code on the CF, you could also just deploy the Docker image with the runtime and the code to execute.
What's the difference between these Cloud Functions with custom runtimes and App Engine flexible?
There are 5 basic differences:
Network: On GAE flexible you can customize the network the instances run in. This lets you add firewall rules to restrict egress and ingress traffic, block specific ports or specify the SSL you wish to run.
Time-out: Cloud Functions can run for a maximum of 9 minutes; Flexible, on the other hand, can run indefinitely.
Read-only environment: The Cloud Functions environment is read-only, while Flexible's can be written to (this is only intended for ephemeral data, as once the Flexible instance is restarted or terminated, all the stored data is lost).
Cold boot: Cloud Functions are fast to deploy and fast to start compared to Flexible. This is because Flexible runs inside a VM, so extra time is needed for the VM to start.
How they work: Cloud Functions are event-driven (e.g. an upload of a photo to Cloud Storage executing a function), while Flexible is request-driven (e.g. handling a request coming from a browser); see the sketch just below.
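To make the event-driven point of the last item concrete, a background Cloud Function in the Python runtime reacting to a Cloud Storage upload looks roughly like this (a sketch; handle_upload is an illustrative name):

```python
# Background (event-driven) Cloud Function in the Python runtime, deployed with
# a Cloud Storage trigger: it runs once per uploaded object and then exits.
def handle_upload(event, context):
    """Log the bucket and object name of the file that triggered the event."""
    print("New object {} in bucket {}".format(event["name"], event["bucket"]))
```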
As you can see, being able to deploy a small amount of code without having to take care of all the things listed above is a feature.
Also, take into account that Serverless containers are still in alpha, so many things could change in the future and there is still not a lot of documentation explaining their behavior in depth.

OpenShift 3.5 Architecture - VM Provisioning

I have been tasked with recommending the VM provisioning for an OpenShift production environment. The OpenShift installation documents don't really detail a lot of different options. I know that we want High Availability (which means multiple masters) but some of the things that I'm a bit confused by are:
separate hosts for etcd
infrastructure nodes
Do I need separate hosts/nodes for etcd? (advantages seem to be performance related but would like to better understand)
Do I need separate hosts/nodes for the infrastructure components (registry, router, etc.) or can these just be hosted on the master nodes?
AFAIK etcd can be on the same host as the master unless you really have a big cluster and want to maintain etcd separately from the OpenShift cluster.
Running routers on dedicated nodes helps with high availability and reduces the chance of nodes running into health issues due to other container workloads running on the same machine. Applications inside the OpenShift cluster can keep running even if all masters go down (which may be rare), but router nodes need to be available all the time to serve traffic.
There are many reference architectures published by Red Hat; check out blog.openshift.com and also the official docs at redhat.com.
etcd and masters can be installed on the same node or separately. Here you can find some best practices for etcd. As you see, it is recommended there that etcd be installed separately, and that is what I would suggest if you can "afford" more servers. If not, co-locating masters and etcd is somewhat symbiotic, in that masters are CPU-intensive whereas etcd uses a lot of disk I/O and memory.
Regarding infrastructure deployments such as routers, the docker-registry, the EFK stack, metrics and so forth, the recommended deployment configuration (if it is within your possibilities) is that masters are not schedulable and worry only about serving the API and controlling the nodes. You can then split your schedulable nodes into infrastructure and compute nodes:
Infrastructure nodes will only host applications used by the cluster itself or by other applications (i.e. Gitlab or Nexus)
Worker/Compute nodes will host business applications
Having a multi-master installation with HA routers is of course the best solution, but then you have to decide how you want to provide this HA: with an external load balancer or with IP failover?
As @debianmaster mentioned, there are several reference architecture documents you can read, like this one here.

Differences between OpenShift and Kubernetes

What's the difference between OpenShift and Kubernetes and when should you use each? I understand that OpenShift is running Kubernetes under the hood but am looking to determine when running OpenShift would be better than Kubernetes and when OpenShift may be overkill.
In addition to the additional API entities mentioned by @SteveS, OpenShift also has advanced security concepts.
This can be very helpful when running in an Enterprise context with specific requirements regarding security.
As much as this can be a strength for real-world applications in production, it can be a source of much frustration in the beginning.
One notable example is the fact that, by default, containers run as root in Kubernetes but run as an arbitrary user with a high ID (e.g. 1000090000) in OpenShift. This means that many containers from Docker Hub do not work as expected. For some popular applications, the Red Hat Container Catalog supplies images built with this feature/limitation in mind. However, this catalog contains only a subset of popular containers.
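A small hypothetical snippet (not from any particular image) shows the kind of assumption that breaks: code that looks up its own user account fails under an arbitrary UID, because no passwd entry exists for it.

```python
# Many images assume the running UID maps to a named user in /etc/passwd.
# Under OpenShift's default restricted policy the container gets a high,
# unnamed UID, so this lookup raises KeyError and such images misbehave.
import os
import pwd

uid = os.getuid()
try:
    entry = pwd.getpwuid(uid)  # succeeds when the UID has a passwd entry
    print("Running as {} (uid={})".format(entry.pw_name, uid))
except KeyError:
    print("Running as unnamed uid={}; no passwd entry exists".format(uid))
```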
To get an idea of the system, I strongly suggest starting out with Kubernetes. Minikube is an excellent way to quickly set up a local, one-node Kubernetes cluster to play with. When you are familiar with the basic concepts, you will better understand the implications of the OpenShift features and design decisions.
OpenShift includes a distribution of Kubernetes, so if you don't need any of OpenShift's added features, such as the web console, builds, advanced deployment models and much, much more, you can simply choose to ignore them.
Here's a summary of items available on the OpenShift website.
Kubernetes comes with Ingress rules, while OpenShift comes with Routes.
Kubernetes has an IngressController, while OpenShift has a Router based on HAProxy.
Switching namespaces in the CLI is very easy in OpenShift, but in Kubernetes you need to create a context and switch between contexts.
The OpenShift UI is more interactive and informative than the Kubernetes one.
To build Docker images inside the cluster, OpenShift has BuildConfig, but Kubernetes has nothing comparable; you need to build the image and push it to a registry yourself.
OpenShift has Pipelines, so you don't need a separate Jenkins to deploy an app, while Kubernetes does not.
The easiest way to differentiate between them is to understand that while vanilla K8s is a community project, OpenShift is more focused on being an enterprise-ready product. Resources like ImageStreams, BuildConfigs (BC), Builds, DeploymentConfigs (DC) and Routes, along with functionality like S2I and the Router, make it easier for developers and admins alike to use OCP for development, deployment and lifecycle management. You can refer to https://cloud.redhat.com/learn/topics/kubernetes/ for more information on the key differences between them.
OCP makes your life much easier by providing easy actions via the oc CLI command and a fine-grained web console.
You can try OCP and get first-hand experience of its features using https://developers.redhat.com/developer-sandbox, where you can quickly get access to a sandboxed environment in a shared cluster.

Can I install MySQL on the VMs provided in Azure Cloud Services?

From what I gather, the only way to use a MySQL database with Azure websites is to use ClearDB, but can I install MySQL on the VMs provided in Azure Cloud Services? And if so, how?
This question might get closed and moved to ServerFault (where it really belongs). That said: ClearDB provides MySQL-as-a-Service in Azure. It has nothing to do with what you can install in your own Virtual Machines. You can absolutely do a VM-based MySQL install (or any other database engine that you can install on Linux or Windows). In fact, the Azure portal even has a tutorial for a MySQL installation on OpenSUSE.
If you're referring to installing in web/worker roles: This simply isn't a good fit for database engines, due to:
the need to completely script/automate the install with zero interaction (which might take a long time). This includes all necessary software being downloaded/installed to the vm images every time a new instance is spun up.
the likely inability for a database cluster to cope with arbitrary scale-out (the typical use case for web/worker roles). Database clusters may or may not work well when a scale-out occurs (adding an additional vm). Same thing when scaling in (removing a vm).
less-optimal attached-storage configuration
inability to use Linux VMs
So, assuming you're still OK with Virtual Machines (vs stateless Cloud Service VMs): you'll need to carefully plan your deployment, with decisions such as:
Distro (Ubuntu, CentOS, etc). Azure-supported Linux distro list here
Selecting proper VM size (the DS series provide SSD attached disk support; the G series scale to 448GB RAM)
Azure Storage attached disks being non-Premium or Premium (premium disks are SSD-backed, durable disks scaling to 1TB/5000 IOPS per disk, up to 32 disks per VM depending on VM size)
Virtual network configuration (for multi-node cluster)
Accessibility of database cluster (whether your app is in the vnet or accesses it through a public endpoint; and if the latter, setting up ACL's)
Backup / HA / DR planning
Someone else mentioned using a pre-built VM image from VM Depot. Just realize that, if you go that route, you're relying on someone else to configure the database engine install for you. This may or may not be optimal for what you're trying to achieve. And the images may or may not be up-to-date with the latest versions, patches, etc.
Of course, what I wrote applies to any database engine you install in your own virtual machines, where a service provider (such as ClearDB) tends to take care of most of these things for you.
If you are talking about standard VMs, then you can use a pre-built image from VM Depot for that.
If you are talking about web or worker roles (PaaS), I wouldn't recommend it, but if you really want to you could. You would need to fully script the install of the solution on the host. The only downside (and it's a big one) is that the role will be moved to a new host at some point, which would mean your MySQL data files would be lost; if you backed up frequently and were happy to lose some data, then this option may work for you.
I think the main question is "What do you want to achieve?". As I see it, you want to use a PaaS solution with Web Apps or Cloud Services and you need a MySQL database. If so, you have two options (both technically possible, as David Makogon said). The first is to deploy your own (single) server with MySQL and connect to it from the outside (the internet side). The second is to create a MySQL server or cluster and connect your application internally over an Azure virtual network. With Cloud Services this is simple, but with Web Apps it is not: you must create a VPN gateway on an Azure VM and connect your Web App to that gateway. This way you will have an internal connection from your application to your own MySQL cluster.