CannotStartContainerError while submitting an AWS Batch Job

In AWS Batch I have a job definition, a job queue and a compute environment in which to execute my AWS Batch jobs.
After submitting a job, I find it in the list of failed jobs with this error:
Status reason
Essential container in task exited
Container message
CannotStartContainerError: API error (404): oci runtime error: container_linux.go:247: starting container process caused "exec: \"/var/application/script.sh --file= --key=.
and in the CloudWatch logs I have:
container_linux.go:247: starting container process caused "exec: \"/var/application/script.sh --file=Toulouse.json --key=out\": stat /var/application/script.sh --file=Toulouse.json --key=out: no such file or directory"
I have specified a correct Docker image that contains all the scripts (we already use it and it works), and I don't know where the error comes from.
Any suggestions are much appreciated.
The Dockerfile looks something like this:
# Pull base image.
FROM account-id.dkr.ecr.region.amazonaws.com/application-image.base-php7-image:latest
VOLUME /tmp
VOLUME /mount-point
RUN chown -R ubuntu:ubuntu /var/application
# Create the source directories
USER ubuntu
COPY application/ /var/application
# Register aws profile
COPY data/aws /home/ubuntu/.aws
WORKDIR /var/application/
ENV COMPOSER_CACHE_DIR /tmp
RUN composer update -o && \
rm -Rf /tmp/*
Here is the Job Definition:
{
"jobDefinitionName": "JobDefinition",
"jobDefinitionArn": "arn:aws:batch:region:accountid:job-definition/JobDefinition:25",
"revision": 21,
"status": "ACTIVE",
"type": "container",
"parameters": {},
"retryStrategy": {
"attempts": 1
},
"containerProperties": {
"image": "account-id.dkr.ecr.region.amazonaws.com/application-dev:latest",
"vcpus": 1,
"memory": 512,
"command": [
"/var/application/script.sh",
"--file=",
"Ref::file",
"--key=",
"Ref::key"
],
"volumes": [
{
"host": {
"sourcePath": "/mount-point"
},
"name": "logs"
},
{
"host": {
"sourcePath": "/var/log/php/errors.log"
},
"name": "php-errors-log"
},
{
"host": {
"sourcePath": "/tmp/"
},
"name": "tmp"
}
],
"environment": [
{
"name": "APP_ENV",
"value": "dev"
}
],
"mountPoints": [
{
"containerPath": "/tmp/",
"readOnly": false,
"sourceVolume": "tmp"
},
{
"containerPath": "/var/log/php/errors.log",
"readOnly": false,
"sourceVolume": "php-errors-log"
},
{
"containerPath": "/mount-point",
"readOnly": false,
"sourceVolume": "logs"
}
],
"ulimits": []
}
}
In the CloudWatch log stream /var/log/docker:
time="2017-06-09T12:23:21.014547063Z" level=error msg="Handler for GET /v1.17/containers/4150933a38d4f162ba402a3edd8b7763c6bbbd417fcce232964e4a79c2286f67/json returned error: No such container: 4150933a38d4f162ba402a3edd8b7763c6bbbd417fcce232964e4a79c2286f67"

This error was because the command was malformed. I was submitting the job from a Lambda function (Python 2.7) using boto3, and the command has to be passed as a list of separate strings, something like this:
'command' : ['sudo','mkdir','directory']
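For reference, a minimal sketch of the boto3 call from the Lambda function could look like this (the job name, queue and file/key values are placeholders; the point is that every argument is its own list element):

import boto3

batch = boto3.client('batch')

def handler(event, context):
    # Each argument is a separate list element. Passing the whole command
    # line as a single string makes Docker look for an executable literally
    # named "/var/application/script.sh --file=... --key=...", hence the
    # CannotStartContainerError above.
    response = batch.submit_job(
        jobName='application-job',      # placeholder
        jobQueue='my-job-queue',        # placeholder
        jobDefinition='JobDefinition',
        containerOverrides={
            'command': [
                '/var/application/script.sh',
                '--file=Toulouse.json',
                '--key=out',
            ],
        },
    )
    return response['jobId']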
Hope it helps somebody.

Related

Using Packer with the qemu builder in a JSON file to create a guest KVM VM, but getting an SSH timeout error

I have RHEL 8.5 as the KVM host. I want to create a guest VM through the Packer qemu builder, and I have a JSON file where all the configuration is defined.
{
"builders": [
{
"type": "qemu",
"iso_url": "/var/lib/libvirt/images/test.iso",
"iso_checksum": "md5:3959597d89e8c20d58c4514a7cf3bc7f",
"output_directory": "/var/lib/libvirt/images/iso-dir/test",
"disk_size": "55G",
"headless": "true",
"qemuargs": [
[
"-m",
"4096"
],
[
"-smp",
"2"
]
],
"format": "qcow2",
"shutdown_command": "echo 'siedgerexuser' | sudo -S shutdown -P now",
"accelerator": "kvm",
"ssh_username": "nonrootuser",
"ssh_password": "********",
"ssh_timeout": "20m",
"vm_name": "test",
"net_device": "virtio-net",
"disk_interface": "virtio",
"http_directory": "/home/azureuser/http",
"boot_wait": "10s",
"boot_command": [
"e inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/anaconda-ks.cfg"
]
}
],
"provisioners":
[
{
"type": "file",
"source": "/home/azureuser/service_status_check.sh",
"destination": "/tmp/service_status_check.sh"
},
{
"type": "file",
"source": "/home/azureuser/service_check.sh",
"destination": "/tmp/service_check.sh"
},
{
"type": "file",
"source": "/home/azureuser/azure.sh",
"destination": "/tmp/azure.sh"
},
{
"type": "file",
"source": "/home/azureuser/params.cfg",
"destination": "/tmp/params.cfg"
},
{
"type": "shell" ,
"execute_command": "echo 'siedgerexuser' | {{.Vars}} sudo -E -S bash '{{.Path}}'",
"inline": [
"echo copying" , "cp /tmp/params.cfg /root/",
"sudo ls -lrt /root/params.cfg",
"sudo ls -lrt /opt/scripts/"
],
"inline_shebang": "/bin/sh -x"
},
{
"type": "shell",
"pause_before": "5s",
"expect_disconnect": true ,
"inline": [
"echo runningconfigurescript" , "sudo sh /opt/scripts/configure-env.sh"
]
},
{
"type": "shell",
"pause_before": "200s",
"inline": [
"sudo sh /tmp/service_check.sh",
"sudo sh /tmp/azure.sh"
]
}
]
}
It works fine on RHEL 7.9, but the same template gives an SSH timeout error on RHEL 8.4.
When I create a guest VM with virt-install, the VM is created and I can see it in the Cockpit web UI. But when I initiate a Packer build, the guest VM is not visible in the Cockpit UI while the SSH timeout is occurring, so I am not able to see where it gets stuck.
Can anyone please help me fix this issue?

Failed to start minikube: Error while starting minikube. Error: X Exiting due to MK_USAGE: Container runtime must be set to "containerd" for rootless

I'm getting the error and I believe the way to solve it is by running: minikube start --container-runtime=containerd
but the extension seems to run plain minikube start. So how am I supposed to add the flag?
Here's the launch.json file:
{
"configurations": [
{
"name": "Cloud Run: Run/Debug Locally",
"type": "cloudcode.cloudrun",
"request": "launch",
"build": {
"docker": {
"path": "Dockerfile"
}
},
"image": "dai",
"service": {
"name": "dai",
"containerPort": 8080,
"resources": {
"limits": {
"memory": "256Mi"
}
}
},
"target": {
"minikube": {}
},
"watch": true
}
]
}
Cloud Code for VS Code doesn't support such settings at the moment. But you can configure minikube to apply these settings with minikube config set.
The Cloud Run emulation creates a separate minikube profile called cloud-run-dev-internal. So you should be able to run the following:
minikube config set --profile cloud-run-dev-internal container-runtime containerd
You then have to delete that minikube profile for the setting to take effect on your next launch:
minikube delete --profile cloud-run-dev-internal

AWS step function: how to pass InputPath to OutputPath unchanged in Fargate task

I have an AWS Step Functions state machine defined using this Serverless plugin, with 3 steps (FirstStep -> Worker -> EndStep -> Done):
stepFunctions:
  stateMachines:
    MyStateMachine:
      name: "MyStateMachine"
      definition:
        StartAt: FirstStep
        States:
          FirstStep:
            Type: Task
            Resource:
              Fn::GetAtt: [ FirstStep, Arn ]
            InputPath: $
            OutputPath: $
            Next: Worker
          Worker:
            Type: Task
            Resource: arn:aws:states:::ecs:runTask.sync
            InputPath: $
            OutputPath: $
            Parameters:
              Cluster: "#{EcsCluster}"
              TaskDefinition: "#{EcsTaskDefinition}"
              LaunchType: FARGATE
              Overrides:
                ContainerOverrides:
                  - Name: container-worker
                    Environment:
                      - Name: ENV_VAR_1
                        'Value.$': $.ENV_VAR_1
                      - Name: ENV_VAR_2
                        'Value.$': $.ENV_VAR_2
            Next: EndStep
          EndStep:
            Type: Task
            Resource:
              Fn::GetAtt: [ EndStep, Arn ]
            InputPath: $
            OutputPath: $
            Next: Done
          Done:
            Type: Succeed
I would like to propagate the input unchanged from the Worker step (Fargate) to EndStep, but when I inspect the step input of EndStep in the AWS Management Console, I see that the data associated with the Fargate task is passed instead:
{
"Attachments": [...],
"Attributes": [],
"AvailabilityZone": "...",
"ClusterArn": "...",
"Connectivity": "CONNECTED",
"ConnectivityAt": 1619602512349,
"Containers": [...],
"Cpu": "1024",
"CreatedAt": 1619602508374,
"DesiredStatus": "STOPPED",
"ExecutionStoppedAt": 1619602543623,
"Group": "...",
"InferenceAccelerators": [],
"LastStatus": "STOPPED",
"LaunchType": "FARGATE",
"Memory": "3072",
"Overrides": {
"ContainerOverrides": [
{
"Command": [],
"Environment": [
{
"Name": "ENV_VAR_1",
"Value": "..."
},
{
"Name": "ENV_VAR_2",
"Value": "..."
}
],
"EnvironmentFiles": [],
"Name": "container-worker",
"ResourceRequirements": []
}
],
"InferenceAcceleratorOverrides": []
},
"PlatformVersion": "1.4.0",
"PullStartedAt": 1619602522806,
"PullStoppedAt": 1619602527294,
"StartedAt": 1619602527802,
"StartedBy": "AWS Step Functions",
"StopCode": "EssentialContainerExited",
"StoppedAt": 1619602567040,
"StoppedReason": "Essential container in task exited",
"StoppingAt": 1619602553655,
"Tags": [],
"TaskArn": "...",
"TaskDefinitionArn": "...",
"Version": 5
}
Basically, if the initial input is
{
"ENV_VAR_1": "env1",
"ENV_VAR_2": "env2",
"otherStuff": {
"k1": "v1",
"k2": "v2"
}
}
I want it to be passed as is to FirstStep, Worker and EndStep inputs without changes.
Is this possible?
Given that you invoke the step function with an object (let's call that A), then a task's...
...InputPath specifies what part of A is handed to your task
...ResultPath specifies where in A to put the result of the task
...OutputPath specifies what part of A to hand over to the next state
Source: https://docs.aws.amazon.com/step-functions/latest/dg/input-output-example.html
So you are currently overwriting all content in A with the result of your Worker state (implicitly). If you want to discard the result of your Worker state, you have to specify:
ResultPath: null
Source: https://docs.aws.amazon.com/step-functions/latest/dg/input-output-resultpath.html#input-output-resultpath-null
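Applied to the definition above, the Worker state would then look something like this (a sketch; only the ResultPath line is new, everything else is unchanged from the question):

Worker:
  Type: Task
  Resource: arn:aws:states:::ecs:runTask.sync
  InputPath: $
  OutputPath: $
  # Discarding the ECS RunTask result makes the state's output equal its input
  ResultPath: null
  Parameters:
    Cluster: "#{EcsCluster}"
    TaskDefinition: "#{EcsTaskDefinition}"
    LaunchType: FARGATE
    Overrides:
      ContainerOverrides:
        - Name: container-worker
          Environment:
            - Name: ENV_VAR_1
              'Value.$': $.ENV_VAR_1
            - Name: ENV_VAR_2
              'Value.$': $.ENV_VAR_2
  Next: EndStep

If the Lambda results of FirstStep and EndStep should also be discarded, the same ResultPath: null can be added to those states.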

How to build and deploy an app from template files?

I'm trying to build and deploy an app using the CLI and a template taken from an example app.
My steps:
Download template
oc login <...>
oc new-project <...>
oc new-app -f ./nodejs.json
Result:
An app reachable from the outside world (built from the source code in the remote GitHub repo)
Problem:
It's all good, but I would like to use my own source files, located in my current working directory. As I understand it, in order to do this I need to modify the BuildConfig part of the template.
{
"kind": "BuildConfig",
"apiVersion": "v1",
"metadata": {
"name": "${NAME}",
"annotations": {
"description": "Defines how to build the application",
"template.alpha.openshift.io/wait-for-ready": "true"
}
},
"spec": {
"source": {
"type": "Git",
"git": {
"uri": "${SOURCE_REPOSITORY_URL}",
"ref": "${SOURCE_REPOSITORY_REF}"
},
"contextDir": "${CONTEXT_DIR}"
},
"strategy": {
"type": "Source",
"sourceStrategy": {
"from": {
"kind": "ImageStreamTag",
"namespace": "${NAMESPACE}",
"name": "nodejs:6"
},
"env": [
{
"name": "NPM_MIRROR",
"value": "${NPM_MIRROR}"
}
]
}
},
"output": {
"to": {
"kind": "ImageStreamTag",
"name": "${NAME}:latest"
}
},
"triggers": [
{
"type": "ImageChange"
},
{
"type": "ConfigChange"
},
{
"type": "GitHub",
"github": {
"secret": "${GITHUB_WEBHOOK_SECRET}"
}
},
{
"type": "Generic",
"generic": {
"secret": "${GENERIC_WEBHOOK_SECRET}"
}
}
],
"postCommit": {
"script": "npm test"
}
}
}
Can you please help me edit this file?
As far as I can see, you are developing for Node.js.
A possible solution is to build (i.e. do all the npm work) on your local machine (to skip the assemble phase in the s2i build container) and then start with a binary source deployment [1][2].
You can do this with these steps:
oc new-app <IMAGE-NAME>~/tmp/nocontent --name=<APPLICATION_NAME>
oc start-build <APPLICATION_NAME> --from-dir=<PATH_TO_DIR>/my-built-app
The <PATH_TO_DIR>/my-built-app dir has to contain the binary (or JavaScript files) at its root.
The command streams the files to a new build container in OpenShift (this also works on Minishift).
You can do further customization by adding a .s2i dir in <PATH_TO_DIR>/my-built-app,
e.g. <PATH_TO_DIR>/my-built-app/.s2i [3]
Note: you have to read the documentation and/or explore a pod of your s2i image to know where the files should be placed and where they are moved by the default s2i scripts shipped with the image itself.
[1]: https://access.redhat.com/documentation/en-us/reference_architectures/2017/html/build_and_deployment_of_java_applications_on_openshift_container_platform_3/build_and_deploy#binary_source_deployment
[2]: https://docs.openshift.com/container-platform/3.6/dev_guide/builds/basic_build_operations.html
[3]: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.6/html/using_images/source-to-image-s2i#customizing-s2i-images

How to specify metadata for GCE in packer?

I'm trying to create a GCE image from a Packer template.
Here is the part that I use for that purpose:
"builders": [
...
{
"type": "googlecompute",
"account_file": "foo",
"project_id": "bar",
"source_image": "centos-6-v20160711",
"zone": "us-central1-a",
"instance_name": "packer-building-image-centos6-baz",
"machine_type": "n1-standard-1",
"image_name": "centos6-some-box-name",
"ssh_username": "my_username",
"metadata": {
"startup-script-log-dest": "/opt/script.log",
"startup-script": "/opt/startup.sh",
"some_other_custom_metadata_key": "some_value"
},
"ssh_pty": true
}
],
...
I have also created the required files. Here is that part:
"provisioners": [
...
{
"type": "file",
"source": "{{user `files_path`}}/startup.sh",
"destination": "/opt/startup.sh"
},
...
{
"type": "shell",
"execute_command": "sudo sh '{{.Path}}'",
"inline": [
...
"chmod ugo+x /opt/startup.sh"
]
}
...
Everything works for me without the "metadata" field: I can create the image/instance with the provided parameters. But when I try to create an instance from the image, I can't find the provided metadata, and consequently I can't run my startup script, set the logging file, or use the other custom metadata.
Here is the documentation that I use: https://www.packer.io/docs/builders/googlecompute.html#metadata.
Any suggestion would be helpful.
Thanks in advance.
The metadata key startup-script should contain the actual script, not a path. Note also that provisioners run after the startup script has been executed (or at least started).
Instead, use startup_script_file in Packer and supply the path to a startup script, as sketched below.
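For example, the googlecompute builder from the question could look roughly like this (a sketch; the local path given to startup_script_file is an assumption, and the startup-script metadata key is dropped in its favour):

{
  "type": "googlecompute",
  "account_file": "foo",
  "project_id": "bar",
  "source_image": "centos-6-v20160711",
  "zone": "us-central1-a",
  "instance_name": "packer-building-image-centos6-baz",
  "machine_type": "n1-standard-1",
  "image_name": "centos6-some-box-name",
  "ssh_username": "my_username",
  "startup_script_file": "files/startup.sh",
  "metadata": {
    "startup-script-log-dest": "/opt/script.log",
    "some_other_custom_metadata_key": "some_value"
  },
  "ssh_pty": true
}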