Enable autoscaling on GKE cluster creation - google-compute-engine

I am trying to create an autoscaled container cluster on GKE.
When I use the "--enable-autoscaling" option (as the documentation indicates here: https://cloud.google.com/container-engine/docs/clusters/operations#create_a_cluster_with_autoscaling):
$ gcloud container clusters create mycluster --zone $GOOGLE_ZONE --num-nodes=3 --enable-autoscaling --min-nodes=2 --max-nodes=5
the MIG (Managed Instance Group) is not displayed as 'autoscaled', as shown both in the web interface and in the output of the following command:
$ gcloud compute instance-groups managed list
NAME SIZE TARGET_SIZE AUTOSCALED
gke-mycluster... 3 3 no
Why?
Then I tried the other way indicated in the Kubernetes docs (http://kubernetes.io/docs/admin/cluster-management/#cluster-autoscaling) but got an error, apparently caused by the '=true':
$ gcloud container clusters create mytestcluster --zone=$GOOGLE_ZONE --enable-autoscaling=true --min-nodes=2 --max-nodes=5 --num-nodes=3
usage: gcloud container clusters update NAME [optional flags]
ERROR: (gcloud.container.clusters.update) argument --enable-autoscaling: ignored explicit argument 'true'
Is the doc wrong on this?
Here are my gcloud version results:
$ gcloud version
Google Cloud SDK 120.0.0
beta 2016.01.12
bq 2.0.24
bq-nix 2.0.24
core 2016.07.29
core-nix 2016.03.28
gcloud
gsutil 4.20
gsutil-nix 4.18
kubectl
kubectl-linux-x86_64 1.3.3
One last detail: the autoscaler seems to be 'on' in the description of the cluster:
$ gcloud container clusters describe mycluster | grep auto -A 3
- autoscaling:
enabled: true
maxNodeCount: 5
minNodeCount: 2
Any idea what could explain this behaviour, please?

Kubernetes cluster autoscaling does not use the Managed Instance Group autoscaler. It runs a cluster-autoscaler controller on the Kubernetes master that uses Kubernetes-specific signals to scale your nodes. The code is in the autoscaler repo if you want more info.
I've also sent out a PR to fix the invalid flag usage in the autoscaling docs. Thanks for catching that!
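For reference, autoscaling can also be switched on after creation; a minimal sketch using the cluster name and zone from the question:
$ gcloud container clusters update mycluster --zone $GOOGLE_ZONE \
    --enable-autoscaling --min-nodes=2 --max-nodes=5
Either way, the AUTOSCALED column of gcloud compute instance-groups managed list is expected to keep showing 'no', since scaling is driven by the cluster-autoscaler on the master rather than by a Compute Engine autoscaler attached to the MIG.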

Related

Cannot create dataproc cluster due to SSD label error

I've been creating dataproc clusters successfully over the past couple of weeks using the following gcloud command:
gcloud dataproc --region us-east1 clusters create test1 \
  --subnet default --zone us-east1-c \
  --master-machine-type n1-standard-4 --master-boot-disk-size 250 \
  --num-workers 10 --worker-machine-type n1-standard-4 --worker-boot-disk-size 200 \
  --num-worker-local-ssds 1 --image-version 1.2 \
  --scopes 'https://www.googleapis.com/auth/cloud-platform' \
  --project MyProject \
  --initialization-actions gs://MyBucket/MyScript.sh
But today I'm getting the following error when I try to create a dataproc cluster from either the gcloud CLI or the GCP web console:
ERROR: (gcloud.dataproc.clusters.create) Operation
[projects/MyProject/regions/us-east1/operations/SOMELONGIDHERE]
failed: Invalid value for field
'resource.disks[1].initializeParams.labels': ''. Cannot specify
initializeParams.labels for local SSD..
I tried changing the cluster name and the zone (not region), without any success.
Thanks in advance
There was an issue on Google's end that was corrected.
It should be working now.
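If something similar shows up again, the failed operation referenced in the error can be inspected for details; a sketch, where SOMELONGIDHERE stands for the operation ID from your own error message:
$ gcloud dataproc operations list --region us-east1
$ gcloud dataproc operations describe SOMELONGIDHERE --region us-east1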

How to make oc cluster up persistent?

I'm using "oc cluster up" to start my Openshift Origin environment. I can see, however, that once I shutdown the cluster my projects aren't persisted at restart. Is there a way to make them persistent ?
Thanks
Persisting resources isn't the primary use case of oc cluster up, but there are a couple of ways to do it:
Leverage capturing etcd as described in the oc cluster up README.
There is a wrapper tool that makes this easier.
There is now an example in the oc cluster up --help output; it is bound to stay up to date, so check that first:
oc cluster up --help
...
Examples:
# Start OpenShift on a new docker machine named 'openshift'
oc cluster up --create-machine
# Start OpenShift using a specific public host name
oc cluster up --public-hostname=my.address.example.com
# Start OpenShift and preserve data and config between restarts
oc cluster up --host-data-dir=/mydata --use-existing-config
So specifically, in v1.3.2, use --host-data-dir and --use-existing-config.
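A rough sketch of the restart cycle with those flags (the /mydata path is the one from the help text above; any host directory works):
$ oc cluster up --host-data-dir=/mydata --use-existing-config
# ... use the cluster, then shut it down
$ oc cluster down
# start again against the same data dir so the stored etcd state is reused
$ oc cluster up --host-data-dir=/mydata --use-existing-config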
Assuming you are using Docker Machine with a VM such as VirtualBox, the easiest way I found is to take a VM snapshot WHILE the VM and the OpenShift cluster are up and running. The snapshot backs up memory in addition to disk, so you can restore the entire cluster later on by restoring the VM snapshot and then running docker-machine start ...
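A rough sketch of that workflow with the VirtualBox CLI, assuming the docker-machine VM is named 'openshift' (the snapshot name is arbitrary):
# take a live snapshot (memory + disk) while the cluster is running
$ VBoxManage snapshot openshift take cluster-running --live
# later: power the VM off, roll back to the snapshot, then start it again
$ VBoxManage controlvm openshift poweroff
$ VBoxManage snapshot openshift restore cluster-running
$ docker-machine start openshift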
By the way, as of the latest Origin image openshift/origin:v3.6.0-rc.0 and oc CLI, --host-data-dir=/mydata as suggested in the other answer doesn't work for me.
I'm using:
VirtualBox 5.1.26
Kubernetes v1.5.2+43a9be4
openshift v1.5.0+031cbe4
It didn't work for me using --host-data-dir (and the other host directory flags):
oc cluster up --logging=true --metrics=true --docker-machine=openshift --use-existing-config=true --host-data-dir=/vm/data --host-config-dir=/vm/config --host-pv-dir=/vm/pv --host-volumes-dir=/vm/volumes
With output:
-- Checking OpenShift client ... OK
-- Checking Docker client ...
Starting Docker machine 'openshift'
Started Docker machine 'openshift'
-- Checking Docker version ...
WARNING: Cannot verify Docker version
-- Checking for existing OpenShift container ... OK
-- Checking for openshift/origin:v1.5.0 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... OK
-- Checking type of volume mount ...
Using Docker shared volumes for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ...
Using docker-machine IP 192.168.99.100 as the host IP
Using 192.168.99.100 as the server IP
-- Starting OpenShift container ...
Starting OpenShift using container 'origin'
FAIL
Error: could not start OpenShift container "origin"
Details:
Last 10 lines of "origin" container log:
github.com/openshift/origin/vendor/github.com/coreos/pkg/capnslog.(*PackageLogger).Panicf(0xc4202a1600, 0x42b94c0, 0x1f, 0xc4214d9f08, 0x2, 0x2)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/pkg/capnslog/pkg_logger.go:75 +0x16a
github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend.newBackend(0xc4209f84c0, 0x33, 0x5f5e100, 0x2710, 0xc4214d9fa8)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend/backend.go:106 +0x341
github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend.NewDefaultBackend(0xc4209f84c0, 0x33, 0x461e51, 0xc421471200)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend/backend.go:100 +0x4d
github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver.NewServer.func1(0xc4204bf640, 0xc4209f84c0, 0x33, 0xc421079a40)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver/server.go:272 +0x39
created by github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver.NewServer
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver/server.go:274 +0x345
OpenShift writes to the /vm/... directories (also defined in VirtualBox) but does not start successfully.
See https://github.com/openshift/origin/issues/12602
Worked for me too, using VirtualBox snapshots and restoring them.
To make it persistent across shutdowns you need to provide the --base-dir parameter.
$ mkdir ~/openshift-config
$ oc cluster up --base-dir=~/openshift-config
From the help:
$ oc cluster up --help
...
Options:
--base-dir='': Directory on Docker host for cluster up configuration
--enable=[*]: A list of components to enable. '*' enables all on-by-default components, 'foo' enables the component named 'foo', '-foo' disables the component named 'foo'.
--forward-ports=false: Use Docker port-forwarding to communicate with origin container. Requires 'socat' locally.
--http-proxy='': HTTP proxy to use for master and builds
--https-proxy='': HTTPS proxy to use for master and builds
--image='openshift/origin-${component}:${version}': Specify the images to use for OpenShift
--no-proxy=[]: List of hosts or subnets for which a proxy should not be used
--public-hostname='': Public hostname for OpenShift cluster
--routing-suffix='': Default suffix for server routes
--server-loglevel=0: Log level for OpenShift server
--skip-registry-check=false: Skip Docker daemon registry check
--write-config=false: Write the configuration files into host config dir
But you shouldn't use it, because "cluster up" was removed in version 4.0.0. More here: https://github.com/openshift/origin/pull/21399

List instances in subnetwork

Hi, I am trying to list compute instances in a specific network and subnetwork, and can't seem to get the filtering right. For example, I have a network named "prod-net" with a subnetwork named "app-central". When I run the search I just get "Listed 0 items".
~ gcloud compute instances list --filter='network:prod-net'
Listed 0 items.
Any suggestions?
The --filter flag doesn't operate on the table data, but rather on the underlying rich resource object. To see this object, run gcloud compute instances list --format=json.
What you're looking for in this case is:
$ gcloud compute instances list --filter='networkInterfaces.network=prod-net'
(I switched the : to = because the former means "contains" and the latter means an exact match. See gcloud topic filters for more).
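If you also want the matched network to show up in the table, a format expression along these lines should do it (a sketch; the column list is just an example):
$ gcloud compute instances list \
    --filter='networkInterfaces.network=prod-net' \
    --format='table(name,zone,networkInterfaces[0].network)'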
You can indeed filter GCE instances by subnetwork using gcloud.
You need to filter by networkInterfaces.subnetwork, and the literal value to compare with is the full subnet resource URL, not just the subnet name.
The resource URL for your subnet can be obtained with:
gcloud compute networks subnets list <YOUR_SUBNET_NAME> --format=flattened
Example:
$ gcloud compute networks subnets list sg-zk-1 --project my-gcp-project --format=flattened
---
creationTimestamp: 2017-04-20T02:22:17.853-07:00
gatewayAddress: 10.9.19.33
id: 6783412628763296550
ipCidrRange: 10.9.19.32/28
kind: compute#subnetwork
name: sg-zk-1
network: valkyrie
privateIpGoogleAccess: True
region: asia-southeast1
selfLink: https://www.googleapis.com/compute/v1/projects/my-gcp-project/regions/asia-southeast1/subnetworks/sg-zk-1
In the above example, the subnet-name is sg-zk-1.
The corresponding resource URL for the subnet is the value of the selfLink which is https://www.googleapis.com/compute/v1/projects/my-gcp-project/regions/asia-southeast1/subnetworks/sg-zk-1.
Now that I have the subnet_url, I can filter the instances belonging to it:
$ subnet_url="https://www.googleapis.com/compute/v1/projects/my-gcp-project/regions/asia-southeast1/subnetworks/sg-zk-1"
$ gcloud compute instances list --filter="networkInterfaces.subnetwork=${subnet_url}"
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
sg-zookeeper-4 asia-southeast1-b n1-standard-2 10.9.19.37 RUNNING
sg-zookeeper-5 asia-southeast1-b n1-standard-2 10.9.19.38 RUNNING
sg-zookeeper-1 asia-southeast1-a n1-standard-2 10.9.19.34 RUNNING
sg-zookeeper-2 asia-southeast1-a n1-standard-2 10.9.19.35 RUNNING
sg-zookeeper-3 asia-southeast1-a n1-standard-2 10.9.19.36 RUNNING
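If you want to script this, the selfLink can be captured directly instead of copy-pasting it; a sketch using the same subnet name and project as above:
$ subnet_url=$(gcloud compute networks subnets list \
    --filter='name=sg-zk-1' --project my-gcp-project \
    --format='value(selfLink)')
$ gcloud compute instances list --filter="networkInterfaces.subnetwork=${subnet_url}"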

gcloud compute instances create command fails when creating an instance

Creating an instance using gcloud does not seem to work:
google-cloud> gcloud compute instances create minecraft-instance --image ubuntu-14-10 --tags minecraft
NAME ZONE MACHINE_TYPE INTERNAL_IP EXTERNAL_IP STATUS
ERROR: (gcloud.compute.instances.create) Unable to fetch a list of zones. Specifying [--zone] may fix this issue:
- Project marked for deletion.
Adding the zone name fails differently:
google-cloud> gcloud compute instances create minecraft-instance --image ubuntu-14-10 --zone us-central1-a --tags minecraft
NAME ZONE MACHINE_TYPE INTERNAL_IP EXTERNAL_IP STATUS
ERROR: (gcloud.compute.instances.create) Failed to find image for alias [ubuntu-14-10] in public image project [ubuntu-os-cloud].
- Project marked for deletion.
Providing a different image name fails too:
google-cloud> gcloud compute instances create minecraft-instance --image ubuntu-1410-utopic --zone us-central1-a --tags minecraft
NAME ZONE MACHINE_TYPE INTERNAL_IP EXTERNAL_IP STATUS
ERROR: (gcloud.compute.instances.create) Could not fetch image resource:
- Project marked for deletion.
What is the exact command to create an instance using gcloud?
Did you authenticate before and set the default project?
gcloud auth login
gcloud config set project PROJECT
The base setup of gcloud is in the Google Cloud documentation.
Or did you delete your project?
Project marked for deletion.
You have several things going on, one of which is reading the docs:
https://cloud.google.com/compute/docs/gcloud-compute/#creating
Your syntax should be:
gcloud compute instances create minecraftinstance \
--image ubuntu-14-10 \
--zone [SOME-ZONE-ID] \
--machine-type [SOME-MACHINE-TYPE]
Where SOME-ZONE-ID is a geographic zone to create the instance in, found by running:
gcloud compute zones list
SOME-MACHINE-TYPE is the machine type to create. Valid types are found by running:
gcloud compute machine-types list
But specifically, you seem to be creating an instance in a Project that has been deleted:
- Project marked for deletion.
Also, you need to authenticate and set a default project:
gcloud auth login
and
gcloud config set project [ID]
Billable resources cannot be created for projects that have been flagged for deletion. For a project to be deletable, billing must be disabled first, and so instances cannot be created. As for the error messages, it seems the gcloud command is not handling this situation correctly and is replying with bogus error messages instead.
The only compulsory arguments to gcloud compute instances create are the name, the zone and the project. A valid working project must be set either by passing the --project PROJECT flag to gcloud commands, or by running gcloud config set project PROJECT beforehand. Similarly, to choose the zone you can either use the --zone ZONE flag or run gcloud config set compute/zone ZONE beforehand.
Undeleting your current project and enabling billing on it will work too. To figure out which project and zone the gcloud command is running in by default, use this:
gcloud config list
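A minimal sketch of that setup, assuming a live (not deleted) project with billing enabled; my-other-project and the zone are placeholders:
$ gcloud config set project my-other-project
$ gcloud config set compute/zone us-central1-a
$ gcloud compute instances create minecraft-instance --tags minecraft
With the defaults set, the create call no longer needs --project or --zone, and gcloud picks its default image when --image is omitted.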
In my case I had to specify --image-project; that got me going:
gcloud compute instances create core --image ubuntu-1604-xenial-v20180126 --machine-type f1-micro --zone us-east4-a --image-project ubuntu-os-cloud
In my case, I created a managed instance group using an instance template:
gcloud compute instance-groups managed create nginx-group \
--base-instance-name nginx \
--size 2 \
--template nginx-template \
--target-pool nginx-pool \
--zone us-central1-c
You have to specify --image-project and --image-family.
Refer to https://cloud.google.com/compute/docs/images#os-compute-support.
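For example, something along these lines (a sketch; the image family, zone and machine type here are just examples):
$ gcloud compute instances create minecraft-instance \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud \
    --zone us-central1-a \
    --machine-type n1-standard-1 \
    --tags minecraft
Using an image family instead of a concrete image name keeps the command working as new image versions are published.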

Google Compute Engine: how to delete access config with whitespace in name ("External NAT")?

I'm trying to delete the access config for one of my Google Compute Engine instances, and as described in some of the documentation, the access config for my instance is named "External NAT" rather than the default "external-nat". When I try to run:
gcloud compute instances delete-access-config my-instance-name --access-config-name="External NAT"
I get the following error:
ERROR: (gcloud.compute.instances.delete-access-config) unrecognized arguments: NAT
I'm assuming the error is because of the space in "External NAT". Seems like this should be a simple fix but I can't figure it out. Any help would be much appreciated!
You added "=" when in fact it is not needed. It worked as follows:
$ gcloud compute instances delete-access-config test-instance --access-config-name "External NAT"
Output:
Updated [https://www.googleapis.com/compute/v1/projects/test-project/zones/europe-west1-c/instances/test-instance].
Another variant, passing the network interface and zone explicitly:
gcloud compute instances delete-access-config test-instance --access-config-name="External NAT" --network-interface="nic0" --zone="us-east1-b"
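If you are unsure of the exact access config name (including the space), it can be read from the instance first; a sketch, with the instance name and zone as placeholders:
$ gcloud compute instances describe my-instance-name --zone us-east1-b \
    --format='value(networkInterfaces[0].accessConfigs[0].name)'
External NAT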