Is it possible to get GCP's ANY distribution for Kubernetes GKE node pool? - google-compute-engine

I have a GKE Kubernetes cluster running on GCP. This cluster has multiple node pools set with autoscale ON and placed at us-central1-f.
Today we started getting a lot of errors on these Node pools' Managed Instance Groups saying that us-central1-f had run out of resources. The specific error: ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS
I've found another topic on Stackoverflow with a similar question, where the answer points to a discussion on Google Groups with more details. I know that one of the recommended ways of avoiding this is to use multiple zones and/or regions.
When I first faced this issue I wondered if there is a way to set multiple region as a fallback system, instead of redundancy system. In that sense, I would set my VMs to be placed wherever zone that has available resources prioritizing the ones closer to, lets say, us-central1-f.
Then, reading the discussion on the Google Group I found a feature that caught my attentions which is the ANY distribution method for Managed Instance Groups. It seems that this feature does exactly what I need - the zone fallback.
So, my question: Does the ANY distribution method resolve my issue? Can I use it for GKE Node Pools? If not, is there any other solution other than using multiple zones?

It is possible to get a regional (i.e. multi-zonal) GKE deployment, however this will use multiple zonal MIGs as the underlying compute layer. So technically speaking you will not use the ANY distribution method, but you should achieve pretty much the same result.

Related

Shrink a Dataproc worker boot disk

Due to some mix-up during planning we ended up with several worker nodes running 23TB drives which are now almost completely unused (we keep data on external storage). As the drives are only wasting money at the moment, we need to shrink them to a reasonable size.
Using weresync I was able to fully clone the drive to a much smaller one but apparently you can't swap the boot drive in GCE (which makes no sense to me). Is there a way to achieve that or do I need to create new workers using the images? If so, is there any other config I need to copy to the new instance in order for it to be automatically joined to the cluster?
Dataproc does not support VMs configuration changes in running clusters.
I would advise you to delete old cluster and create new one with workers disk size that you need.
I ended up creating a ticket with GCP support - https://issuetracker.google.com/issues/120865687 - to get an official answer to that question. Got an that this is not possible currently but should be available shortly (within months) in the beta GCP CLI, possibly in the Console on a later data as well.
Went on with a complete rebuild of the cluster.

Kubernetes cluster on GCE from Instances/Group

Have Kubernetes computation cluster running on GCE, reasonable happy so far. I know if I created K-cluster, I'll get to see nodes as VM Instances and cluster as Instance group. I would like to do other way around - create instances/group and make K-cluster out of it so it could be managed by Kubernetes. Reason I want to do so is to try and make nodes preemptible, which might better fit my workload.
So question - Kubernetes cluster with preemptible nodes how-to. I could do either one or another now, but not together
There is a patch out for review at the moment (#12384) that makes a configuration option to mark the nodes in the instance group as preemptible. If you are willing to build from head, this should be available as a configuration option in the next couple of days. In the meantime, you can see from the patch how easy it is to modify the GCE startup scripts to make your VMs preemptible.

My servers status suddenly got PROVISIONING in Google Compute Engine

After restarting some of my servers in Google Compute Engine and try to connect them via ssh they are all in PROVISIONING status for more than 4 hours!
According to google documentation:
https://cloud.google.com/compute/docs/instances#checkmachinestatus
PROVISIONING - Resources are being reserved for the instance. The
instance isn't running yet.
Well, they were working for more than one month.
I tried several time to turn them off via gcloud command-line tool but it didn't work.
check for any problem in Google Cloud Status, nothing is mentioned there for today:
https://status.cloud.google.com
Any idea?
In some cases where your "PROVISIONING" takes too long, depending on the region/zone/time you try to create your instances in, the issue could be related to (generally, but not limited to. I'm just giving some ideas):
Limited resources in that zone, at that time, could cause the instances to hang in the provisioning status. Google usually takes care of this pretty quickly, within a few days or something. I don't have exact information but they add more resources to the zone and the issue disappears. They also add new zones and such, so it could be worth moving to a new zone if more resources are readily available there.
Temporary issue that should resolve itself.
A few questions I have for you:
Is this still happing? Considering the fact that you posted about a month ago, I assume you've gotten past this issue and everything is working as expected at this time. If so, I'd recommend posting an answer yourself with details on what happened or what you did to fix it. You can then accept your own answer so that others can see what fixed it.
Have you tried creating instances in different zones to see if you have the problem everywhere or just within a specific one?
All in all, this is usually a transient issue, based on my experience with the Google Cloud Platform. If this is still causing trouble for you, give us some more information on what is currently happening and the community might be able to help better.

mySQL authoritative database - combined with Firebase

We have built a LAMP-stack API application via PHP Laravel. This currently uses a local mySQL instance. We have mostly implemented views in AngularJS.
In order to use Firebase, we need to sync data between the authoritative store in mySQL with anything relevant that exists on Firebase, as close to real-time as possible. This means that other parts of the app which are not real-time and don't use Firebase can also serve up fresh content that's very recently been entered into the system.
I know that Firebase is essentially a noSQL database in the cloud. My question is - how do I write a wrapper or a means to sync the canonical version of my Firebase into my database of record - mySQL?
Update to answer - our final decision - ditching Firebase as an option
We have decided against this, as we can easily have a socket.io instance on the same server with an extremely low latency connection to mySQL, so that the two can remain in sync. There's no need to go across the web when resources and endpoints can exist on localhost. It also gives us the option to run our app without any internet connection, which is important if we sell an on-premise appliance to large companies.
A noSQL sync platform like Firebase is really just a temporary store that makes reads/writes faster in semi-real-time. If they attempt to get into the "we also persist everything for you" business - that's a whole different ask with much more commitment required.
The guarantee on eventual consistency between mySQL and Firebase is more important to get right first - to prevent problems down the line. Also, an RDMS is essential to our app - it's the only way to attack a lot of data-heavy problems in our analytics/data mappings - there's very strong reasons most of the world still uses a RDMS like mySQL, etc. You can make those very reliable too - through Amazon RDS and Google Cloud SQL.
There's no specific problem beyond scaling real-time sync that Firebase actually solves for us, which other open source frameworks don't already solve. If their JS lib actually handled offline scenarios (when you START offline) elegantly, I might have considered it, but it doesn't do that yet.
So, YMMV - but in our specific case, we're not considering Firebase for the reasons given above.
The entire topic is incredibly broad, definitely too broad to provide a simple answer to.
I'll stick to the use-case you provided in the comments:
Imagine that you have a checklist stored in mySQL, comprised of some attributes and a set of steps. The steps are stored in another table. When someone updates this checklist on Firebase - how would I sync mySQL as well?
If you insist on combining Firebase and mySQL for this use-case, I would:
Set up your Firebase as a work queue: var ref = new Firebase('https://my.firebaseio.com/workqueue')
have the client push a work item into Firebase: ref.push({ task: 'id-of-state', newState: 'newstate'})
set up a (nodejs) server that:
monitors the work queue (ref.on('child_added')
updates the item in the mySQL database
removes the task from the queue
See this github project for an example of a work queue on top of Firebase: https://github.com/firebase/firebase-work-queue

how do you add additional nics to a compute engine vm?

how do I add a NIC to a compute engine instance? I need more then one NIC so I can build out an environment...I've looked all over and there is nothing on how to do it...
I know it's probably some API call through the SDK, but I have no idea, and I can't find anything on it.
EDIT:
It's the rhel6 image. figured I should clarify.
The question is probably old and a lot has changed since. Now it's definitely possible to add more nics to an instance but only at creation time (you can find a networking tab on the create instance page on the portal - corresponding rest api exists too). Each nic has to connect to a different virtual network, so you need to create more before creating the instance (if you don't have already).
Do you need an external address or an internal address? If external, you can use gcutil to add an IP address to an existing instance. If internal, you can configure a static network address on the instance, and add a route entry to send traffic for that address to that instance.
I was looking for similiar thing (to have a VM which runs Apache and nginx simultaneously on different IPs), but it seems like although you can have multiple networks (up to 5) in a project and each network can belong to multiple instances, you can not have more than one network per instance. From the documentation:
A project can contain multiple networks and each network can have multiple instances attached to it. [...] A network belongs to only one project and each instance can only belong to one network.