Azure CLI - wait for operation to complete before next statement executes - azure-cli

How do I make my CLI script wait until a resource is provisioned before attempting the next operation?
For example, I am creating a WAF policy and then attempting to assign that WAF policy to an app gateway.
The issue is that the WAF policy is still being created:
# WAF Policy
az network application-gateway waf-policy create \
--name $wafPolicyName \
--resource-group $resourceGroupName
# App Gateway - won't allow creation without a private IP
az network application-gateway create \
--name $appGatewayName \
--resource-group $resourceGroupName \
--waf-policy $wafPolicyName
This results in an error: Another operation on this or dependent resource is in progress
How do I make it wait?

Try using this:
az network application-gateway waf-policy wait --name $wafPolicyName --resource-group $resourceGroupName --created
Here is a link to the documentation for az network application-gateway waf-policy wait.
The wait command works perfectly on my side.
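Putting it together, a minimal sketch of the full sequence with the wait in between (same variable names as in the question):
# WAF Policy
az network application-gateway waf-policy create \
--name $wafPolicyName \
--resource-group $resourceGroupName
# Block until the policy is fully provisioned
az network application-gateway waf-policy wait \
--name $wafPolicyName \
--resource-group $resourceGroupName \
--created
# The App Gateway can now reference the policy safely
az network application-gateway create \
--name $appGatewayName \
--resource-group $resourceGroupName \
--waf-policy $wafPolicyName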

Related

Private Azure Kubernetes cluster with nginx ingress can't be restarted

I have a private Azure Kubernetes cluster with Nginx Ingress installed (using an internal Load Balancer).
This is a non-production cluster, and during weekends we plan to stop it. But when we start it again, the operation never finishes successfully, and after 30 minutes the AKS cluster is in a Failed state.
After research I found that this happens only if the Ingress is installed on a private AKS cluster with restricted outbound access.
Any ideas how this can be solved?
One thing you can do is upgrade your Kubernetes cluster:
Check the upgrades available for your cluster:
az aks get-upgrades --resource-group <resource-group-name> --name <cluster-name> --output table
Then upgrade your cluster:
az aks upgrade \
--resource-group <resource-group-name> \
--name <cluster-name> \
--kubernetes-version <kubernetes_version>
Replace <kubernetes_version> with one of the versions returned by the first command.
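If you want to script the whole thing, something like this should pick the newest available version automatically (the --query path is my assumption about the shape of the get-upgrades output; verify it against your CLI version):
# Grab the last (newest) upgrade version offered for the control plane
latest=$(az aks get-upgrades \
--resource-group <resource-group-name> \
--name <cluster-name> \
--query "controlPlaneProfile.upgrades[-1].kubernetesVersion" \
--output tsv)
az aks upgrade \
--resource-group <resource-group-name> \
--name <cluster-name> \
--kubernetes-version $latest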

make az cli synapse resume/pause fault tolerant

In order to save money, I'm using az synapse sql pool pause and az synapse sql pool resume, so that the Synapse dedicated pool database only turns on to run tests when there is a Pull Request, then shuts down after.
The challenge is that:
az synapse sql pool resume fails if the database is already resumed, and
az synapse sql pool pause fails if the database is already paused
Below is an example of the output when one of the above situations occurs:
Command group 'synapse' is in preview and under development.
Reference and support levels: https://aka.ms/CLI_refstatus
Deployment failed.
Correlation ID: 062ea436-f0f0-4c11-a3e4-4df92cdaf6b5.
An unexpected error occured while processing the request.
Tracking ID: 'ba5ce906-5631-42ad-b3f4-a659095bdbe3'
Exited with code exit status 1
How can I make this command tolerate the state already being achieved?
Here's how you would resume the database if and only if it is currently paused, then wait for it to come back Online. It requires three separate commands:
# --output tsv strips the JSON quotes so the string comparison works
state=$(az synapse sql pool show \
  --name $DB_NAME \
  --resource-group $RG_NAME \
  --workspace-name $WKSPC_NAME \
  --query "status" \
  --output tsv)
if [ "$state" = "Paused" ]; then
  echo "Resuming pool!"
  az synapse sql pool resume \
    --name $DB_NAME \
    --resource-group $RG_NAME \
    --workspace-name $WKSPC_NAME
  # Block until the pool reports Online (note the quoted JMESPath string literal)
  az synapse sql pool wait \
    --sql-pool-name $DB_NAME \
    --resource-group $RG_NAME \
    --workspace-name $WKSPC_NAME \
    --custom "state=='Online'"
fi
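The pause direction is the mirror image of the same pattern: pause only when the pool is currently Online. A sketch reusing the variables above:
state=$(az synapse sql pool show \
  --name $DB_NAME \
  --resource-group $RG_NAME \
  --workspace-name $WKSPC_NAME \
  --query "status" \
  --output tsv)
if [ "$state" = "Online" ]; then
  echo "Pausing pool!"
  az synapse sql pool pause \
    --name $DB_NAME \
    --resource-group $RG_NAME \
    --workspace-name $WKSPC_NAME
fi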

OpenShift: deploy an application from a private registry using the "oc new-app" command

In OpenShift, I want to deploy an application using a Docker image that lives in a private Docker registry. To do this, I ran the following command from the terminal using the OpenShift Container Platform CLI (oc):
oc new-app --docker-image=myregistry.com/mycompany/myimage --name=private --insecure-registry=true
When I run the above command, I receive a 407 proxy authentication error, because pulling the image from my private registry requires authentication. I have a secret for this authentication, but I don't know how to add the secret to the above command.
Could you help me, please? Or suggest another way?
Finally, I solved it. The problem was missing steps while creating the secret for the private Docker registry. The full steps are:
1) If you do not already have a Docker credentials file for the secured registry, you can create a secret by running:
$ oc create secret docker-registry <pull_secret_name> \
--docker-server=<registry_server> \
--docker-username=<user_name> \
--docker-password=<password> \
--docker-email=<email>
2) To use a secret for pulling images for Pods, you must add the secret to your service account:
$ oc secrets link default <pull_secret_name> --for=pull
3) To use a secret for pushing and pulling build images, the secret must be mountable inside of a Pod. You can do this by running:
$ oc secrets link builder <pull_secret_name>
https://docs.openshift.com/container-platform/4.1/openshift_images/managing-images/using-image-pull-secrets.html
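Applied to the registry in the question, the sequence would look roughly like this (the secret name my-registry-secret is arbitrary):
$ oc create secret docker-registry my-registry-secret \
--docker-server=myregistry.com \
--docker-username=<user_name> \
--docker-password=<password> \
--docker-email=<email>
$ oc secrets link default my-registry-secret --for=pull
$ oc new-app --docker-image=myregistry.com/mycompany/myimage --name=private --insecure-registry=true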

How to make GCE instance stop when its deployed container finishes?

I have a Docker container that performs a single large computation. This computation requires lots of memory and takes about 12 hours to run.
I can create a Google Compute Engine VM of the appropriate size and use the "Deploy a container image to this VM instance" option to run this job perfectly. However once the job is finished the container quits but the VM is still running (and charging).
How can I make the VM exit/stop/delete when the container exits?
When the VM is in this zombie mode, only the Stackdriver containers are left running:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bfa2feb03180 gcr.io/stackdriver-agents/stackdriver-logging-agent:0.2-1.5.33-1-1 "/entrypoint.sh /u..." 17 hours ago Up 17 hours stackdriver-logging-agent
161439a487c2 gcr.io/stackdriver-agents/stackdriver-metadata-agent:0.2-0.0.17-2 "/bin/sh -c /opt/s..." 17 hours ago Up 17 hours 8000/tcp stackdriver-metadata-agent
I create the VM like this:
gcloud beta compute --project=abc instances create-with-container vm-name \
--zone=us-central1-c --machine-type=custom-1-65536-ext \
--network=default --network-tier=PREMIUM --metadata=google-logging-enabled=true \
--maintenance-policy=MIGRATE \
--service-account=xyz \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--image=cos-stable-69-10895-71-0 --image-project=cos-cloud --boot-disk-size=10GB \
--boot-disk-type=pd-standard --boot-disk-device-name=vm-name \
--container-image=gcr.io/abc/my-image --container-restart-policy=on-failure \
--container-command=python3 \
--container-arg="a" --container-arg="b" --container-arg="c" \
--labels=container-vm=cos-stable-69-10895-71-0
When you create the VM, you'll need to give it write access to Compute so you can delete the instance from within it. You should also set container environment variables like gce_zone and gce_project_id at this time; you'll need them to delete the instance.
gcloud beta compute instances create-with-container {NAME} \
--container-env=gce_zone={ZONE},gce_project_id={PROJECT_ID} \
--service-account={SERVICE_ACCOUNT} \
--scopes=https://www.googleapis.com/auth/compute,...
...
Then within the container, whenever YOU determine your task is finished:
Request an API token (I'm using curl for simplicity and the default GCE service account):
curl "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token" -H "Metadata-Flavor: Google"
This will respond with JSON that looks like:
{
"access_token": "foobarbaz...",
"expires_in": 1234,
"token_type": "Bearer"
}
Take that access token and hit the instances.delete API endpoint (notice the environment variables):
curl -XDELETE -H 'Authorization: Bearer {TOKEN}' https://www.googleapis.com/compute/v1/projects/$gce_project_id/zones/$gce_zone/instances/$HOSTNAME
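Putting the two calls together in the container's shutdown step might look like this (assumes jq is available in the image for parsing the token):
TOKEN=$(curl -s -H "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token" | jq -r '.access_token')
curl -X DELETE -H "Authorization: Bearer $TOKEN" \
"https://www.googleapis.com/compute/v1/projects/$gce_project_id/zones/$gce_zone/instances/$HOSTNAME"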
Having grappled with the problem for some time, here's a full solution that works pretty well.
This solution doesn't use the "start machine with a container image" option. Instead it uses a startup script, which is more flexible. You still use a Container-Optimized OS instance.
Create a startup script:
#!/usr/bin/env bash
# get image name and container parameters from the metadata
IMAGE_NAME=$(curl http://metadata.google.internal/computeMetadata/v1/instance/attributes/image_name -H "Metadata-Flavor: Google")
CONTAINER_PARAM=$(curl http://metadata.google.internal/computeMetadata/v1/instance/attributes/container_param -H "Metadata-Flavor: Google")
# This is needed if you are using private images in GCP Container Registry
# (possibly also for the gcp log driver?)
sudo HOME=/home/root /usr/bin/docker-credential-gcr configure-docker
# Run! The logs will go to stack driver
sudo HOME=/home/root docker run --log-driver=gcplogs ${IMAGE_NAME} ${CONTAINER_PARAM}
# Get the zone
zoneMetadata=$(curl "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")
# Split on / and get the 4th element to get the actual zone name
IFS=$'/'
zoneMetadataSplit=($zoneMetadata)
ZONE="${zoneMetadataSplit[3]}"
# Run compute delete on the current instance. Need to run in a container
# because COS machines don't come with gcloud installed
docker run --entrypoint "gcloud" google/cloud-sdk:alpine compute instances delete ${HOSTNAME} --delete-disks=all --zone=${ZONE}
Put the script somewhere public. For example put it on Cloud Storage and create a public URL. You can't use a gs:// URI for a COS startup script.
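For example, with a Cloud Storage bucket (the bucket name is a placeholder, and this assumes the bucket allows per-object ACLs):
gsutil cp startup-script.sh gs://my-bucket/startup-script.sh
gsutil acl ch -u AllUsers:R gs://my-bucket/startup-script.sh
# The public URL is then:
# https://storage.googleapis.com/my-bucket/startup-script.sh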
Start an instance using startup-script-url, passing the image name and parameters, e.g.:
gcloud compute --project=PROJECT_NAME instances create INSTANCE_NAME \
--zone=ZONE --machine-type=TYPE \
--metadata=image_name=IMAGE_NAME,\
container_param="PARAM1 PARAM2 PARAM3",\
startup-script-url=PUBLIC_SCRIPT_URL \
--maintenance-policy=MIGRATE --service-account=SERVICE_ACCOUNT \
--scopes=https://www.googleapis.com/auth/cloud-platform --image-family=cos-stable \
--image-project=cos-cloud --boot-disk-size=10GB --boot-disk-device-name=DISK_NAME
(You probably want to limit the scopes; the example uses full access for simplicity.)
I wrote a self-contained Python function based on Vincent's answer.
def kill_vm():
    """
    If we are running inside a GCE VM, kill it.
    """
    # based on https://stackoverflow.com/q/52748332/321772
    import json
    import logging
    import requests
    # get the token
    r = json.loads(
        requests.get("http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token",
                     headers={"Metadata-Flavor": "Google"})
        .text)
    token = r["access_token"]
    # get instance metadata
    # based on https://cloud.google.com/compute/docs/storing-retrieving-metadata
    project_id = requests.get("http://metadata.google.internal/computeMetadata/v1/project/project-id",
                              headers={"Metadata-Flavor": "Google"}).text
    name = requests.get("http://metadata.google.internal/computeMetadata/v1/instance/name",
                        headers={"Metadata-Flavor": "Google"}).text
    zone_long = requests.get("http://metadata.google.internal/computeMetadata/v1/instance/zone",
                             headers={"Metadata-Flavor": "Google"}).text
    zone = zone_long.split("/")[-1]
    # shut ourselves down
    logging.info("Calling API to delete this VM, {zone}/{name}".format(zone=zone, name=name))
    requests.delete("https://www.googleapis.com/compute/v1/projects/{project_id}/zones/{zone}/instances/{name}"
                    .format(project_id=project_id, zone=zone, name=name),
                    headers={"Authorization": "Bearer {token}".format(token=token)})
A simple atexit hook gets me my desired behavior:
import atexit
atexit.register(kill_vm)
Another solution is to not use GCE and instead use AI Platform's custom job service, which automatically shuts down the VM after the Docker container exits.
gcloud ai-platform jobs submit training $JOB_NAME \
--region $REGION \
--master-image-uri $IMAGE_URI
You can specify --master-machine-type.
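For example (the machine type here is just an illustration; as far as I know, --master-machine-type requires the CUSTOM scale tier):
gcloud ai-platform jobs submit training $JOB_NAME \
--region $REGION \
--master-image-uri $IMAGE_URI \
--scale-tier CUSTOM \
--master-machine-type n1-highmem-8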
See the GCP documentation on custom containers.
The simplest way, from within the container, once it's finished:
ZONE=`gcloud compute instances list --filter="name=($HOSTNAME)" --format 'csv[no-heading](zone)'`
gcloud compute instances delete $HOSTNAME --zone=$ZONE -q
-q skips the interactive confirmation
$HOSTNAME is already exported
Just use curl and the local metadata server (no need for Python scripts or gcloud). Add the following to the end of your Docker entrypoint script, so it runs when the container finishes:
# Note: inside the container the name is exposed as $HOSTNAME
INSTANCE_NAME=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/name" -H "Metadata-Flavor: Google")
INSTANCE_ZONE=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")
echo "Terminating instance [${INSTANCE_NAME}] in zone [${INSTANCE_ZONE}]"
TOKEN=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token" -H "Metadata-Flavor: Google" | jq -r '.access_token')
curl -X DELETE -H "Authorization: Bearer ${TOKEN}" "https://www.googleapis.com/compute/v1/$INSTANCE_ZONE/instances/$INSTANCE_NAME"
For security's sake, and following the principle of least privilege, you can run the VM with a custom service account and give that service account a custom role containing just this permission:
compute.instances.delete
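A sketch of that least-privilege setup (the role ID, service account name, and project are placeholders):
# Custom role containing only the delete permission
gcloud iam roles create instanceDeleter --project=my-project \
--permissions=compute.instances.delete
# Service account the VM will run as
gcloud iam service-accounts create vm-self-deleter --project=my-project
# Grant the custom role to the service account
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:vm-self-deleter@my-project.iam.gserviceaccount.com" \
--role="projects/my-project/roles/instanceDeleter"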

Cannot set maintenance policy on an instance template from the command line

I have tried to set the maintenance policy to "MIGRATE" and automatic restart to "On" in a new instance template (as the Web Console does), but the flags are ignored.
This is the command I am using:
gcloud compute instance-templates create \
$TEMPLATE_NAME \
--boot-disk-size 50GB \
--image coreos-beta-681-0-0-v20150527 \
--image-project coreos-cloud \
--machine-type n1-standard-2 \
--metadata-from-file user-data=my-cloud-config.yml \
--scopes compute-rw,storage-full,logging-write \
--tags web-minion \
--maintenance-policy MIGRATE \
--boot-disk-type pd-standard
But the template is created with automatic restart set to "Off" and on-host maintenance set to "Terminate VM instance". Instances created from this template also have the same settings.
When I log the HTTP requests and responses, this appears in the create request:
{"automaticRestart": true, "onHostMaintenance": "MIGRATE"}
so it does not seem to be a client error.
How can I create templates with the same settings the Web Console uses?
EDIT: Version of gcloud: 0.9.61; Version of compute: 2015.05.19.
EDIT 2: This now also occurs in the Developers Console; it's a regression, because I previously had a template with the correct values.
The issue has been fixed and now I can create templates with the correct settings.
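For anyone hitting something similar, you can check a template's scheduling settings from the CLI with something like this (the --format projection is my assumption; it should print automaticRestart and onHostMaintenance):
gcloud compute instance-templates describe $TEMPLATE_NAME \
--format="yaml(properties.scheduling)"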