Add additional disks through metadata on Google Compute Engine - google-compute-engine

I want to build a Google Cloud image using Packer, but I can't seem to find a way for Packer to add additional disks. I need this because I want a persistent disk for the application to store its data on.
Can this be done through a startup_script, or in some other way?

From a quick glance at the GCE API documentation, it looks like images can only contain one device: the boot device.
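Since the extra disk can't be baked into the image itself, one common workaround is to create the persistent disk separately and attach it when instances are launched from the Packer-built image, then format and mount it from a startup script. A rough sketch (the image name, disk name, size, and zone are placeholders, and flag details may vary by gcloud version):

```shell
# Create a standalone persistent disk (it outlives any single instance).
gcloud compute disks create data-disk --size=200GB --zone=us-central1-a

# Boot an instance from the Packer-built image and attach the disk.
gcloud compute instances create app-instance \
    --image=my-packer-image \
    --disk name=data-disk,device-name=data-disk \
    --zone=us-central1-a

# In the instance's startup script, something like:
#   mkfs.ext4 -F /dev/disk/by-id/google-data-disk   # first boot only!
#   mkdir -p /mnt/data
#   mount /dev/disk/by-id/google-data-disk /mnt/data
```

This keeps the image itself single-device, which matches what the API allows, while still giving every instance a persistent data disk.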

Related

GCE autoscaling by GKE resource reservation

According to the Kubernetes documentation:
If you are using GCE, you can configure your cluster so that the number of nodes will be automatically scaled based on:
CPU and memory utilization.
Amount of CPU and memory requested by the pods (also called reservation).
Is this actually true?
I am running mainly Jobs on my cluster, and would like to spin up new instances to service them on demand. CPU usage doesn't work well as a scaling metric for this workload.
From Google's GKE documentation, however, this only appears to be possible by using Cloud Monitoring metrics, i.e. relying on a third-party service that you then have to customize. This seems like a perplexing gap in basic functionality that Kubernetes itself claims to support.
Is there any simpler way to achieve the very simple goal of having the GCE instance group autoscale based on the CPU requirements that I'm quite explicitly specifying in my GKE Jobs?
The disclaimer at the bottom of that section explains why it won't work by default in GKE:
Note that autoscaling will work properly only if node metrics are accessible in Google Cloud Monitoring. To make the metrics accessible, you need to create your cluster with KUBE_ENABLE_CLUSTER_MONITORING equal to google or googleinfluxdb (googleinfluxdb is the default value). Please also make sure that you have Google Cloud Monitoring API enabled in Google Developer Console.
You might be able to get it working by standing up a heapster instance in your cluster configured with --sink=gcm (like this), but I think it was more of an older proof of concept than a well-maintained, production-grade configuration.
The community is working hard on a better, more-fully-supported version of node autoscaling in the upcoming 1.3 release.
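In the meantime, if you do get heapster exporting node metrics to Google Cloud Monitoring, the underlying managed instance group can be autoscaled on a custom GCM metric from the gcloud side. A sketch only: the group name, zone, metric name, and target are placeholders, and the exact flag syntax may differ between gcloud releases:

```shell
# Autoscale the node instance group on a Cloud Monitoring metric
# (e.g. one exported by heapster) instead of raw CPU usage.
gcloud compute instance-groups managed set-autoscaling my-node-group \
    --zone=us-central1-a \
    --min-num-replicas=1 \
    --max-num-replicas=10 \
    --custom-metric-utilization \
      metric=custom.cloudmonitoring.googleapis.com/my-reservation-metric,utilization-target=0.8,utilization-target-type=GAUGE
```

This is a workaround rather than the reservation-aware autoscaling you're asking for; that is what the 1.3 work is meant to address.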

What is the difference between Google Container Engine and Container-Optimized Compute Engine?

I have a little bit of an idea about their differences, but it would be great to have expert opinions.
Container-Optimized Google Compute Engine Images
Google Container Engine
Thanks in advance :)
Google Container Engine is a Kubernetes-backed cluster manager. It makes managing simple or complex Docker-based applications easy: easy to configure, update, and scale.
The container-optimized Compute Engine image lets you run Docker containers on a single node. Note that you can build your own containerized cluster with this image if you wish, but if you're going down that path you should really reconsider Container Engine.
It's worth noting that the container-optimized image also includes aspects of Kubernetes, in the form of a kubelet.
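To make that concrete, the kubelet on the container-optimized image runs containers declared in instance metadata. A sketch of such a manifest (the image name, ports, and even the exact field names are illustrative; they follow the old pod-manifest format, which has changed over time):

```yaml
# Passed as the google-container-manifest metadata key at instance creation.
version: v1beta2
containers:
  - name: my-app
    image: gcr.io/my-project/my-app
    ports:
      - name: http
        hostPort: 80
        containerPort: 8080
```

So a single container-VM instance gives you declarative container startup, but none of the cluster-level scheduling, scaling, or service discovery that Container Engine adds on top.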

What is the recommended way to watch for changes in a Couchbase document?

I want to use Couchbase but I want to implement change tracking in a few areas similar to the way RethinkDB does it.
There appears to be a handful of ways to have changes pushed to me from a Couchbase server:
DCP
TAP
XDCR
Which one is the correct choice, or is there a better method?
UPDATE
Thanks @Kirk!
It looks like DCP does not have a 100% production-ready API today (5/19/2015). Your blog reference helped me decide to use XDCR now and migrate to DCP as soon as an official API is ready.
For XDCR this GitHub Repo has been helpful.
Right now the only fully supported way is XDCR, as Kirk already mentioned. If you want to save time implementing it, you might want to base your code on this: https://github.com/couchbaselabs/couchbase-capi-server - it implements the server side of the XDCR protocol (v1). The ElasticSearch plugin is based on this CAPI server, for example. XDCR is a good choice if your application is a server/service that can wait for incoming connections, so Couchbase (or the administrator) controls how and when Couchbase replicates data to your service.
Depending on what you want to accomplish, DCP might end up being a better choice later, because it's conceptually different from XDCR. Any DCP-based solution would be pull-based (from your code's side), so you have more fine-grained, programmatic control over how and when to connect to a Couchbase bucket, and how to distribute your connections across different processes if necessary. For a more in-depth example of using DCP, take a look at the Couchbase-Kafka connector here: https://github.com/couchbase/couchbase-kafka-connector
DCP is the proper choice if the way it works fits your use case and you can write an application to consume the stream, as there is no official API... yet. Here is a blog post about doing this in Java by one of the Couchbase Solutions Engineers: http://nosqlgeek.blogspot.de/2015/05/dcp-magic.html
TAP is basically deprecated at this point. It is still in the product, but DCP is far superior to it in almost every way.
XDCR could be used, as it uses DCP, but you would have to write an XDCR plug-in, so you would be better off writing a consumer for the DCP stream directly.

Google cloud load balancing instance status

I've tried a few different setups of HTTP load balancing on Google Compute Engine.
I used this as a reference:
https://cloud.google.com/compute/docs/load-balancing/http/cross-region-example
I'm at the scenario with 3 instances where I simulate an outage on one of them.
I can see that one instance is not healthy, which is great. My question is: how can I see which one is down? In a real scenario I want to know immediately which instance it is.
Any suggestions?
You can use the gcloud tool to get detailed health information. Based on that tutorial, I would run:
gcloud compute backend-services get-health NAME
I am not sure how to view this information in the developer console.
See more:
https://cloud.google.com/compute/docs/load-balancing/http/backend-service#health_checking
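To pick out the failing instance directly from the command line, the JSON output can be filtered. A sketch (`web-service` is a placeholder, and the exact output field names may vary by gcloud version):

```shell
# Print each unhealthy state together with the following line,
# which (in alphabetically-ordered JSON output) names the instance.
gcloud compute backend-services get-health web-service --format=json \
    | grep -A1 UNHEALTHY
```

That gives you the instance URL of whichever backend is down, without clicking through the console.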

OpenShift scaling on a specific (software) condition

I'm looking for a scaling mechanism on the OpenStack cloud, and then I found OpenShift. My scenario is something like this: we have a distributed system with many agents running on many nodes. One node contains a Message Broker that directs the traffic. We want to monitor the Message Broker node: if a queue fills up, we scale out the agent nodes that handle that queue. In brief, we monitor one node in order to scale other nodes.
We use OpenStack now. In OpenStack I found Heat and Ceilometer, which can create alarms and scale out nodes. However, alarms are based only on general metrics like CPU, RAM, and network usage (not on information from inside the VM).
Then I searched for a layer above: PaaS. I found that OpenShift can handle scaling apps. But as far as I know, OpenShift's scaling mechanism is to duplicate the app based on network traffic and put an HAProxy in front.
Am I right that OpenShift can't monitor software-specific data? Is there any other tool that suits our scenario?
You can try using this script (https://github.com/openshift/origin-server/blob/master/cartridges/openshift-origin-cartridge-haproxy/usr/bin/haproxy_ctld.rb) to control how your gears are scaled, but I believe that it is still experimental. Make sure that you read through all of the comments and understand what you are doing before making any changes. You might also consider spinning up a second scaled application to test this on before messing with your production application.
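Another option is to keep the monitor-one-node-to-scale-others loop outside the platform entirely: a small script polls the broker's queue depth and resizes the agent tier through whatever API your cloud exposes. The sizing rule itself is trivial; in the sketch below, only the `desired_agents` function is real, while the queue query and the scaling call are hypothetical stand-ins for your broker's and cloud's actual APIs:

```shell
#!/bin/bash
# Decide how many agent nodes are needed for a given queue depth.
#   depth:     current number of messages in the queue
#   per_agent: messages one agent can comfortably handle
#   min/max:   hard bounds on the fleet size
desired_agents() {
  local depth=$1 per_agent=$2 min=$3 max=$4
  local n=$(( (depth + per_agent - 1) / per_agent ))  # ceiling division
  (( n < min )) && n=$min
  (( n > max )) && n=$max
  echo "$n"
}

# In a real loop you would do something like:
#   depth=$(query_broker_queue_depth)                       # hypothetical
#   scale_agent_tier "$(desired_agents "$depth" 100 1 10)"  # hypothetical
```

This sidesteps both Ceilometer's VM-external metrics and OpenShift's traffic-based scaling, at the cost of running one more small service yourself.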