I have the following output coming from a Step Functions task (ListObjectsV2):
{
"Contents": [
{
"ETag": "\"86c12c034bc6c30cb89b500b954c188f\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv",
"LastModified": "2023-02-09T13:46:20Z",
"Size": 796014,
"StorageClass": "STANDARD"
},
{
"ETag": "\"58e4a770e0f66073b00d185df500f07f\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv",
"LastModified": "2023-02-09T13:47:20Z",
"Size": 934038,
"StorageClass": "STANDARD"
},
{
"ETag": "\"460abd0de64d5cb67e8f0d46878cb1ef\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv",
"LastModified": "2023-02-09T13:46:57Z",
"Size": 794264,
"StorageClass": "STANDARD"
},
{
"ETag": "\"1bfedc3dc92e4ba8d04e24b9b5a0ed58\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv",
"LastModified": "2023-02-09T13:46:24Z",
"Size": 788756,
"StorageClass": "STANDARD"
},
{
"ETag": "\"9d6c434ce5ebdf203a790fbcf19338dc\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv",
"LastModified": "2023-02-09T13:47:07Z",
"Size": 831156,
"StorageClass": "STANDARD"
}
],
"IsTruncated": false,
"KeyCount": 5,
"MaxKeys": 1000,
"Name": "vita-internal-text-classification-dev-183576513728",
"Prefix": "55271f52fffe4461a2ee3228ebb97157"
}
I want an array containing only the Key field of each object, to pass to the next state, like so:
[
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv"
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv"
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv"
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv"
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv"
}
]
So far I've tried setting the ResultPath to:
$.Contents[*].Key
$.Contents[*].['Key']
What I get is:
[
"55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv"
]
But that is a flat array of strings rather than the array of objects I need. Any help?
The way I've solved this is to use an Inline Map state with a Pass state to build the required format. The pattern appears in the example below, which uses Step Functions Distributed Map to bulk delete objects from S3; see the inner Create Object Identifier Array Map state. In Standard Workflows this could be a cost concern given the number of state transitions involved. But because the Item Processor here runs as Express Workflows, which are billed by duration (and these runs are very fast), it works pretty well.
{
"Comment": "A state machine to bulk delete objects from S3 using Distributed Map",
"StartAt": "Confirm Bucket Provided",
"States": {
"Confirm Bucket Provided": {
"Type": "Choice",
"Choices": [
{
"Not": {
"Variable": "$.bucket",
"IsPresent": true
},
"Next": "Fail - No Bucket"
}
],
"Default": "Check for Prefix"
},
"Check for Prefix": {
"Type": "Choice",
"Choices": [
{
"Not": {
"Variable": "$.prefix",
"IsPresent": true
},
"Next": "Generate Parameters - Without Prefix"
}
],
"Default": "Generate Parameters - With Prefix"
},
"Generate Parameters - Without Prefix": {
"Type": "Pass",
"Parameters": {
"Bucket.$": "$.bucket",
"Prefix": ""
},
"ResultPath": "$.list_parameters",
"Next": "Delete Objects from S3 Bucket"
},
"Fail - No Bucket": {
"Type": "Fail",
"Error": "InsuffcientArguments",
"Cause": "No Bucket was provided"
},
"Generate Parameters - With Prefix": {
"Type": "Pass",
"Next": "Delete Objects from S3 Bucket",
"Parameters": {
"Bucket.$": "$.bucket",
"Prefix.$": "$.prefix"
},
"ResultPath": "$.list_parameters"
},
"Delete Objects from S3 Bucket": {
"Type": "Map",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "DISTRIBUTED",
"ExecutionType": "EXPRESS"
},
"StartAt": "Create Object Identifier Array",
"States": {
"Create Object Identifier Array": {
"Type": "Map",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "INLINE"
},
"StartAt": "Create Object Identifier",
"States": {
"Create Object Identifier": {
"Type": "Pass",
"End": true,
"Parameters": {
"Key.$": "$.Key"
}
}
}
},
"ItemsPath": "$.Items",
"ResultPath": "$.object_identifiers",
"Next": "Delete Objects"
},
"Delete Objects": {
"Type": "Task",
"Next": "Clear Output",
"Parameters": {
"Bucket.$": "$.BatchInput.bucket",
"Delete": {
"Objects.$": "$.object_identifiers"
}
},
"Resource": "arn:aws:states:::aws-sdk:s3:deleteObjects",
"Retry": [
{
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2,
"IntervalSeconds": 1,
"MaxAttempts": 6
}
],
"ResultSelector": {
"Deleted.$": "$.Deleted",
"RetryCount.$": "$$.State.RetryCount"
}
},
"Clear Output": {
"Type": "Pass",
"End": true,
"Result": {}
}
}
},
"ItemReader": {
"Resource": "arn:aws:states:::s3:listObjectsV2",
"Parameters": {
"Bucket.$": "$.list_parameters.Bucket",
"Prefix.$": "$.list_parameters.Prefix"
}
},
"MaxConcurrency": 5,
"Label": "S3objectkeys",
"ItemBatcher": {
"MaxInputBytesPerBatch": 204800,
"MaxItemsPerBatch": 1000,
"BatchInput": {
"bucket.$": "$.list_parameters.Bucket"
}
},
"ResultSelector": {},
"End": true
}
}
}
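As a rough illustration of what the inner Create Object Identifier Array Map and its Pass state produce, the same reshaping can be sketched in Python. This is only a model of the data transformation, not Step Functions code; the function name to_object_identifiers and the trimmed sample input are assumptions made for the example.

```python
import json


def to_object_identifiers(list_objects_output):
    """Keep only the Key of each entry in Contents, wrapped in its own object.

    Mirrors the Pass state's Parameters {"Key.$": "$.Key"} applied per item.
    """
    return [{"Key": item["Key"]} for item in list_objects_output.get("Contents", [])]


# Trimmed ListObjectsV2 output using two of the keys from the question.
sample = {
    "Contents": [
        {"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv", "Size": 796014},
        {"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv", "Size": 934038},
    ],
    "KeyCount": 2,
}

identifiers = to_object_identifiers(sample)
print(json.dumps(identifiers, indent=2))
```

Each element keeps its object wrapper, which is the difference from the $.Contents[*].Key path in the question: that path flattens the result to bare strings.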
Here is my use case:
I am trying to get the deployment details in JSON format using:
kubectl get deployment -o json depl_name
and insert the result into a MySQL column named meta_data, whose data type is JSON. The INSERT statement fails with this error:
ERROR 3140 (22032): Invalid JSON text: "Missing a comma or '}' after an object member." at position 1035 in value for column
Here is my entire JSON:
{
"uuid": {
"view": "demoBoard",
"demo": [
{
"serviceName": "wordpress-backend",
"configurations": {
"ec2_iam": {
"user": [],
"roles": null,
"permissions": null
}
},
"deployment_config": {
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"annotations": {
"deployment.kubernetes.io/revision": "6",
"kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"apps/v1\",\"kind\":\"Deployment\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"wordpress-backend\",\"wordpress_app_id\":\"w26\"},\"name\":\"wordpress-backend\",\"namespace\":\"wordpress\"},\"spec\":{\"selector\":{\"matchLabels\":{\"app\":\"wordpress-backend\"}},\"template\":{\"metadata\":{\"labels\":{\"app\":\"wordpress-backend\",\"wordpress_app_id\":\"w26\"}},\"spec\":{\"containers\":[{\"envFrom\":[{\"configMapRef\":{\"name\":\"wordpress-backend-config\"}}],\"image\":\"docker-image\",\"imagePullPolicy\":\"IfNotPresent\",\"name\":\"wordpress-backend\",\"ports\":[{\"containerPort\":8000}],\"resources\":{},\"volumeMounts\":[{\"mountPath\":\"/tmp/me/cloud\",\"name\":\"my-key\"}]}],\"imagePullSecrets\":[{\"name\":\"my-json\"}],\"volumes\":[{\"name\":\"my-cloud-key\",\"secret\":{\"defaultMode\":123,\"secretName\":\"my-key\"}}]}}}}\n"
},
"creationTimestamp": "2022-09-12T13:56:34Z",
"generation": 7,
"labels": {
"app": "wordpress-backend",
"wordpress_app_id": "w26"
},
"name": "wordpress-backend",
"namespace": "wordpress",
"resourceVersion": "v2",
"uid": "0da99b29"
},
"spec": {
"progressDeadlineSeconds": 600,
"replicas": 1,
"revisionHistoryLimit": 10,
"selector": {
"matchLabels": {
"app": "wordpress-backend"
}
},
"strategy": {
"rollingUpdate": {
"maxSurge": "25%",
"maxUnavailable": "25%"
},
"type": "RollingUpdate"
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "wordpress-backend",
"wordpress_app_id": "267"
}
},
"spec": {
"containers": [
{
"envFrom": [
{
"configMapRef": {
"name": "wordpress-backend-config"
}
}
],
"image": "docker.io/my-image",
"imagePullPolicy": "IfNotPresent",
"name": "wordpress-backend",
"ports": [
{
"containerPort": 8000,
"protocol": "TCP"
}
],
"resources": {},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"volumeMounts": [
{
"mountPath": "/my/path/cloud",
"name": "my-key"
}
]
}
],
"dnsPolicy": "ClusterFirst",
"imagePullSecrets": [
{
"name": "my-key"
}
],
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"terminationGracePeriodSeconds": 30,
"volumes": [
{
"name": "my-key",
"secret": {
"defaultMode": 123,
"secretName": "sampleKeyName"
}
}
]
}
}
},
"status": {
"availableReplicas": 1,
"conditions": [
{
"lastTransitionTime": "2022-09-29T15:11:14Z",
"lastUpdateTime": "2022-09-29T15:11:14Z",
"message": "Deployment has minimum availability.",
"reason": "MinimumReplicasAvailable",
"status": "True",
"type": "Available"
},
{
"lastTransitionTime": "2022-09-12T14:20:35Z",
"lastUpdateTime": "2022-09-30T14:13:08Z",
"message": "ReplicaSet \"wordpress-backend-abc123\" has successfully progressed.",
"reason": "NewReplicaSetAvailable",
"status": "True",
"type": "Progressing"
}
],
"observedGeneration": 7,
"readyReplicas": 1,
"replicas": 1,
"updatedReplicas": 1
}
}
}
]
}
}
I suspect the escape sequences in the line below are causing the failure:
"message": "ReplicaSet \"wordpress-backend-abc123\" has successfully progressed.",
I tried removing that line, but no luck.
I need help parsing the JSON data received from Oracle Integration Cloud. The expected output is shown below, along with the command I am trying to use.
jq command:
jq '[{id: .id},{integrations: [.integrations[]|{code: .code, version: .version, dependencies: .dependencies|{connections: .connections[]|{id: .id, status: .status}}, .dependencies|{lookups: .lookups}}]}]' output.json
Error :
jq: error: syntax error, unexpected FIELD (Unix shell quoting issues?) at , line 1:
[{id: .id},{integrations: [.integrations[]|{code: .code, version: .version, dependencies: .dependencies|{connections: .connections[]|{id: .id, status: .status}}, .dependencies|{lookups: .lookups}}]}]
Note: If I run the command below to fetch only the connections data, it works fine:
jq '[{id: .id},{integrations: [.integrations[]|{code: .code, version: .version, dependencies: .dependencies|{connections: .connections[]|{id: .id, status: .status}}}]}]' output.json
Expected Output:
[
{
"id": "SAMPLE_PACKAGE"
},
{
"integrations": [
{
"code": "HELLO_INTEGRATION",
"version": "01.00.0000",
"dependencies": {
"connections": {
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
}
},
{
"code": "HELLO_INTEGRATIO_LOOKUP",
"version": "01.00.0000",
"dependencies": {
"connections": {
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
},
"lookups": {
"name": "COMMON_LOOKUP_VARIABLES",
"status": "CONFIGURED"
}
}
},
{
"code": "HI_INTEGRATION",
"version": "01.00.0000",
"dependencies": {
"connections": {
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
}
}
]
}
]
The output.json file contains:
{
"bartaType": "DEVELOPED",
"countOfIntegrations": 3,
"id": "SAMPLE_PACKAGE",
"integrations": [
{
"code": "HELLO_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"lockedFlag": false,
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED",
"type": "rest",
"usage": 6
}
]
},
"description": "",
"eventSubscriptionFlag": false,
"filmstrip": [
{
"code": "HELLO_WORLD1",
"iconUrl": "/images/rest/rest_icon_46.png",
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED"
}
],
"id": "HELLO_INTEGRATION|01.00.0000",
"lockedFlag": false,
"name": "HELLO_INTEGRATION",
"pattern": "Orchestration",
"patternDescription": "Map Data",
"payloadTracingEnabledFlag": true,
"publishFlag": false,
"scheduleApplicable": false,
"scheduleDefined": false,
"status": "ACTIVATED",
"style": "FREEFORM",
"styleDescription": "Orchestration",
"tempCopyExists": false,
"tracingEnabledFlag": true,
"version": "01.00.0000",
"warningMsg": "ACTIVATE_PUBLISH_NO_CONN"
},
{
"code": "HELLO_INTEGRATIO_LOOKUP",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"lockedFlag": false,
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED",
"type": "rest",
"usage": 6
}
],
"lookups": [
{
"lockedFlag": false,
"name": "COMMON_LOOKUP_VARIABLES",
"status": "CONFIGURED",
"usage": 1
}
]
},
"description": "",
"eventSubscriptionFlag": false,
"filmstrip": [
{
"code": "HELLO_WORLD1",
"iconUrl": "/images/rest/rest_icon_46.png",
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED"
}
],
"id": "HELLO_INTEGRATIO_LOOKUP|01.00.0000",
"lockedFlag": false,
"name": "HELLO_INTEGRATION_LOOKUP",
"pattern": "Orchestration",
"patternDescription": "Map Data",
"payloadTracingEnabledFlag": true,
"publishFlag": false,
"scheduleApplicable": false,
"scheduleDefined": false,
"status": "ACTIVATED",
"style": "FREEFORM",
"styleDescription": "Orchestration",
"tempCopyExists": false,
"tracingEnabledFlag": true,
"version": "01.00.0000",
"warningMsg": "ACTIVATE_PUBLISH_NO_CONN"
},
{
"code": "HI_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"lockedFlag": false,
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED",
"type": "rest",
"usage": 6
}
]
},
"description": "",
"eventSubscriptionFlag": false,
"filmstrip": [
{
"code": "HELLO_WORLD1",
"iconUrl": "/images/rest/rest_icon_46.png",
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED"
}
],
"id": "HI_INTEGRATION|01.00.0000",
"lockedFlag": false,
"name": "HI_INTEGRATION",
"pattern": "Orchestration",
"patternDescription": "Map Data",
"payloadTracingEnabledFlag": true,
"publishFlag": false,
"scheduleApplicable": false,
"scheduleDefined": false,
"status": "ACTIVATED",
"style": "FREEFORM",
"styleDescription": "Orchestration",
"tempCopyExists": false,
"tracingEnabledFlag": true,
"version": "01.00.0000",
"warningMsg": "ACTIVATE_PUBLISH_NO_CONN"
}
],
"isCloneAllowed": false,
"isViewAllowed": false,
"name": "SAMPLE_PACKAGE",
"type": "DEVELOPED"
}
The problem is that the lookups key is not always present, so you cannot use [] on it. Instead, supply a default with // [] and pipe the result to the map function, like below:
[
{ id: .id },
{
integrations: [
.integrations[]|{
id: .id,
code: .code,
dependencies: {
connections: (.dependencies.connections//[]|map({id,status}))[0],
lookups: (.dependencies.lookups//[]|map({name,status}))[0]
}
}
]
}
]
When lookups is missing, (.dependencies.lookups//[]|map({name,status}))[0] passes an empty array to map, and taking the first element of the resulting empty array yields null.
See it in action: https://jqplay.org/s/zQBkHtnzOd1
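For readers less familiar with jq, the default-then-map-then-first-element idea can be sketched in Python. This is a loose analogue rather than a translation of jq semantics; the helper names first_projected and summarize and the trimmed sample integration are assumptions made for the example.

```python
def first_projected(items, fields):
    """Loosely mimic jq's (items // [] | map({fields}))[0].

    A missing or null list is treated as empty, and the first element of an
    empty result is None, just as [][0] is null in jq.
    """
    projected = [{f: item.get(f) for f in fields} for item in (items or [])]
    return projected[0] if projected else None


def summarize(integration):
    """Build the per-integration object from the jq filter above."""
    deps = integration.get("dependencies", {})
    return {
        "id": integration.get("id"),
        "code": integration.get("code"),
        "dependencies": {
            "connections": first_projected(deps.get("connections"), ["id", "status"]),
            "lookups": first_projected(deps.get("lookups"), ["name", "status"]),
        },
    }


# Trimmed integration without a lookups key, as in HELLO_INTEGRATION.
sample = {
    "id": "HELLO_INTEGRATION|01.00.0000",
    "code": "HELLO_INTEGRATION",
    "dependencies": {
        "connections": [{"id": "HELLO_WORLD1", "status": "CONFIGURED", "usage": 6}]
    },
}

summary = summarize(sample)
print(summary)
```

With no lookups key, the lookups entry comes out as None instead of raising, which is the behaviour the // [] default buys in the jq filter.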
The provided jq statement works when each array holds a single element, but when an array contains multiple elements it fetches only the first one. I also updated the dependencies object to capture all of the arrays (connections, lookups, certificates, libraries, integrations).
The modified version is below. Please suggest any better options.
[
{ id: .id },
{
integrations: [
.integrations[]|{
id: .id,
code: .code,
dependencies: {
connections: (.dependencies.connections//[]|map({id,status})),
lookups: (.dependencies.lookups//[]|map({name,status})),
certificates: (.dependencies.certificates//[]|map({id,status})),
libraries: (.dependencies.libraries//[]|map({code,status,version})),
integrations: (.dependencies.integrations//[]|map({code,version}))
}
}
]
}
]|del(..|select(.==[]))
Note: del is added to remove the empty arrays, which gives the output below:
[
{
"id": "SAMPLE_PACKAGE"
},
{
"integrations": [
{
"id": "HELLO_INTEGRATION|01.00.0000",
"code": "HELLO_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
},
{
"id": "HELLO_WORLD2",
"status": "CONFIGURED"
}
]
}
},
{
"id": "HELLO_INTEGRATIO_LOOKUP|01.00.0000",
"code": "HELLO_INTEGRATIO_LOOKUP",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
],
"lookups": [
{
"name": "COMMON_LOOKUP_VARIABLES",
"status": "CONFIGURED"
}
]
}
},
{
"id": "HI_INTEGRATION|01.00.0000",
"code": "HI_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
]
}
}
]
}
]
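The effect of del(..|select(.==[])) can be approximated in Python with a small recursive function. This is a sketch under the assumption that only values that were already empty arrays should be dropped (containers that become empty afterwards are kept); the name drop_empty_arrays and the sample data are made up for illustration.

```python
def drop_empty_arrays(value):
    """Recursively remove dict entries and list elements whose value is []."""
    if isinstance(value, dict):
        return {k: drop_empty_arrays(v) for k, v in value.items() if v != []}
    if isinstance(value, list):
        return [drop_empty_arrays(v) for v in value if v != []]
    return value


# Dependencies as they look before pruning: absent keys defaulted to [].
dependencies = {
    "connections": [{"id": "HELLO_WORLD1", "status": "CONFIGURED"}],
    "lookups": [],
    "certificates": [],
    "libraries": [],
    "integrations": [],
}

pruned = drop_empty_arrays({"dependencies": dependencies})
print(pruned)
```

An alternative that avoids the cleanup pass is to add a key only when its source array is non-empty, at the cost of a longer filter.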
I am trying to create EMR 5.30.1 clusters with applications such as Hadoop, Livy, Spark, ZooKeeper, and Hive using a CloudFormation template. The issue with this template is that I can create the cluster with only one application from the list above.
Below is the CloudFormation template:
{
"AWSTemplateFormatVersion": "2010-09-09",
"Description": "Best Practice EMR Cluster for Spark or S3 backed Hbase",
"Parameters": {
"EMRClusterName": {
"Description": "Name of the cluster",
"Type": "String",
"Default": "emrcluster"
},
"KeyName": {
"Description": "Must be an existing Keyname",
"Type": "String",
"Default": "keyfilename"
},
"MasterInstanceType": {
"Description": "Instance type to be used for the master instance.",
"Type": "String",
"Default": "m5.xlarge"
},
"CoreInstanceType": {
"Description": "Instance type to be used for core instances.",
"Type": "String",
"Default": "m5.xlarge"
},
"NumberOfCoreInstances": {
"Description": "Must be a valid number",
"Type": "Number",
"Default": 1
},
"SubnetID": {
"Description": "Must be Valid public subnet ID",
"Default": "subnet-ee15b3e0",
"Type": "String"
},
"LogUri": {
"Description": "Must be a valid S3 URL",
"Default": "s3://aws/elasticmapreduce/",
"Type": "String"
},
"S3DataUri": {
"Description": "Must be a valid S3 bucket URL ",
"Default": "s3://aws/elasticmapreduce/",
"Type": "String"
},
"ReleaseLabel": {
"Description": "Must be a valid EMR release version",
"Default": "emr-5.30.1",
"Type": "String"
},
"Applications": {
"Description": "Please select which application will be installed on the cluster this would be either Ganglia and spark, or Ganglia and s3 backed Hbase",
"Type": "String",
"AllowedValues": [
"Spark",
"Hbase",
"Hive",
"Livy",
"ZooKeeper"
]
}
},
"Mappings": {},
"Conditions": {
"Spark": {
"Fn::Equals": [
{
"Ref": "Applications"
},
"Spark"
]
},
"Hbase": {
"Fn::Equals": [
{
"Ref": "Applications"
},
"Hbase"
]
},
"Hive": {
"Fn::Equals": [
{
"Ref": "Applications"
},
"Hive"
]
},
"Livy": {
"Fn::Equals": [
{
"Ref": "Applications"
},
"Livy"
]
},
"ZooKeeper": {
"Fn::Equals": [
{
"Ref": "Applications"
},
"ZooKeeper"
]
}
},
"Resources": {
"EMRCluster": {
"DependsOn": [
"EMRClusterServiceRole",
"EMRClusterinstanceProfileRole",
"EMRClusterinstanceProfile"
],
"Type": "AWS::EMR::Cluster",
"Properties": {
"Applications": [
{
"Name": "Ganglia"
},
{
"Fn::If": [
"Spark",
{
"Name": "Spark"
},
{
"Ref": "AWS::NoValue"
}
]
},
{
"Fn::If": [
"Hbase",
{
"Name": "Hbase"
},
{
"Ref": "AWS::NoValue"
}
]
},
{
"Fn::If": [
"Hive",
{
"Name": "Hive"
},
{
"Ref": "AWS::NoValue"
}
]
},
{
"Fn::If": [
"Livy",
{
"Name": "Livy"
},
{
"Ref": "AWS::NoValue"
}
]
},
{
"Fn::If": [
"ZooKeeper",
{
"Name": "ZooKeeper"
},
{
"Ref": "AWS::NoValue"
}
]
}
],
"Configurations": [
{
"Classification": "hbase-site",
"ConfigurationProperties": {
"hbase.rootdir":{"Ref":"S3DataUri"}
}
},
{
"Classification": "hbase",
"ConfigurationProperties": {
"hbase.emr.storageMode": "s3"
}
}
],
"Instances": {
"Ec2KeyName": {
"Ref": "KeyName"
},
"Ec2SubnetId": {
"Ref": "SubnetID"
},
"MasterInstanceGroup": {
"InstanceCount": 1,
"InstanceType": {
"Ref": "MasterInstanceType"
},
"Market": "ON_DEMAND",
"Name": "Master"
},
"CoreInstanceGroup": {
"InstanceCount": {
"Ref": "NumberOfCoreInstances"
},
"InstanceType": {
"Ref": "CoreInstanceType"
},
"Market": "ON_DEMAND",
"Name": "Core"
},
"TerminationProtected": false
},
"VisibleToAllUsers": true,
"JobFlowRole": {
"Ref": "EMRClusterinstanceProfile"
},
"ReleaseLabel": {
"Ref": "ReleaseLabel"
},
"LogUri": {
"Ref": "LogUri"
},
"Name": {
"Ref": "EMRClusterName"
},
"AutoScalingRole": "EMR_AutoScaling_DefaultRole",
"ServiceRole": {
"Ref": "EMRClusterServiceRole"
}
}
},
"EMRClusterServiceRole": {
"Type": "AWS::IAM::Role",
"Properties": {
"AssumeRolePolicyDocument": {
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": [
"elasticmapreduce.amazonaws.com"
]
},
"Action": [
"sts:AssumeRole"
]
}
]
},
"ManagedPolicyArns": [
"arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"
],
"Path": "/"
}
},
"EMRClusterinstanceProfileRole": {
"Type": "AWS::IAM::Role",
"Properties": {
"AssumeRolePolicyDocument": {
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": [
"ec2.amazonaws.com"
]
},
"Action": [
"sts:AssumeRole"
]
}
]
},
"ManagedPolicyArns": [
"arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
],
"Path": "/"
}
},
"EMRClusterinstanceProfile": {
"Type": "AWS::IAM::InstanceProfile",
"Properties": {
"Path": "/",
"Roles": [
{
"Ref": "EMRClusterinstanceProfileRole"
}
]
}
}
},
"Outputs": {}
}
I also want to add a bootstrap script to this template. Can anyone please help me with this?
As per my knowledge and understanding, Applications in your case should be an array like below, as mentioned in the documentation. Your Applications parameter is a single String, so only one of the conditions can be true at a time and only that application is added.
"Applications" : [ Application, ... ],
In your case, you can list the applications like this:
"Applications" : [
{"Name" : "Spark"},
{"Name" : "Hbase"},
{"Name" : "Hive"},
{"Name" : "Livy"},
{"Name" : "ZooKeeper"}
]
Each application object accepts properties other than Name, such as Args and AdditionalInfo; see the documentation for details.
You can do it the following way. If you set "ReleaseLabel", there is no need to specify the versions of the applications:
"Applications": [{
"Name": "Hive"
},
{
"Name": "Presto"
},
{
"Name": "Spark"
}
]
For the bootstrap action:
"BootstrapActions": [{
"Name": "setup",
"ScriptBootstrapAction": {
"Path": "s3://bucket/key/Bootstrap.sh"
}
}]
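If the template is generated rather than written by hand, the two fragments above can be assembled with a few lines of Python. This is a minimal sketch, not part of CloudFormation itself: the helper name emr_cluster_properties is hypothetical, and the script path is the placeholder from the snippet above.

```python
import json


def emr_cluster_properties(app_names, bootstrap_script=None):
    """Build the Applications and BootstrapActions parts of AWS::EMR::Cluster Properties."""
    props = {"Applications": [{"Name": name} for name in app_names]}
    if bootstrap_script:
        props["BootstrapActions"] = [
            {
                "Name": "setup",
                "ScriptBootstrapAction": {"Path": bootstrap_script},
            }
        ]
    return props


fragment = emr_cluster_properties(
    ["Hive", "Presto", "Spark"],
    bootstrap_script="s3://bucket/key/Bootstrap.sh",  # placeholder path from the answer
)
print(json.dumps(fragment, indent=2))
```

The printed JSON can be merged into the Properties block of the cluster resource alongside ReleaseLabel, Instances, and the roles.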
Define it like this to install all the applications at once:
{
"Type": "AWS::EMR::Cluster",
"Properties": {
"Applications": [
{
"Name": "Ganglia"
},
{
"Name": "Spark"
},
{
"Name": "Livy"
},
{
"Name": "ZooKeeper"
},
{
"Name": "JupyterHub"
}
]
}
}
I created a new Docker image and used it in the Kubernetes StatefulSet definition for a MySQL pod. When I scale the MySQL pod to 1, it keeps crashing with this message:
unable to lock ./ibdata1 error: 11
I have searched a lot for this error but found no solution. I would appreciate any help!
Dockerfile:
FROM mysql/mysql-server
CMD [ "--max_connections=10000" ]
And I created the MySQL StatefulSet definition as below:
{
"kind": "StatefulSet",
"apiVersion": "apps/v1beta2",
"metadata": {
"name": "mysql-test",
"namespace": "test",
"selfLink": "/apis/apps/v1beta2/namespaces/test/statefulsets/mysql-test",
"uid": "e7768a0b-faf9-11e9-a989-d8c497367e2a",
"resourceVersion": "315823479",
"generation": 33,
"creationTimestamp": "2019-10-30T09:44:44Z",
"labels": {
"app": "mysql-test",
"release": "mysql-test"
}
},
"spec": {
"replicas": 1,
"selector": {
"matchLabels": {
"app": "mysql-test"
}
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "mysql-test"
}
},
"spec": {
"volumes": [
{
"name": "data",
"persistentVolumeClaim": {
"claimName": "mysqldata"
}
},
{
"name": "backup",
"persistentVolumeClaim": {
"claimName": "mysqlbackup"
}
}
],
"containers": [
{
"name": "latest",
"image": "<Enterprise repository>/careercompass/mysql:latest",
"ports": [
{
"name": "mysql",
"containerPort": 3306,
"protocol": "TCP"
}
],
"env": [
{
"name": "MYSQL_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "mysql",
"key": "mysql-password",
"optional": true
}
}
},
{
"name": "MYSQL_ROOT_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "mysql",
"key": "mysql-root-password",
"optional": true
}
}
},
{
"name": "MYSQL_USER",
"value": "mysql"
},
{
"name": "MYSQL_DATABASE"
}
],
"resources": {
"limits": {
"cpu": "10",
"memory": "10000Mi"
},
"requests": {
"cpu": "200m",
"memory": "3000Mi"
}
},
"volumeMounts": [
{
"name": "data",
"mountPath": "/var/lib/mysql"
},
{
"name": "backup",
"mountPath": "/var/lib/mysqlbackup"
}
],
"terminationMessagePath": "/var/lib/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {
"runAsUser": 999,
"fsGroup": 999
},
"schedulerName": "default-scheduler"
}
},
"serviceName": "",
"podManagementPolicy": "OrderedReady",
"updateStrategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"partition": 0
}
},
"revisionHistoryLimit": 10
},
"status": {
"observedGeneration": 33,
"replicas": 1,
"currentReplicas": 1,
"updatedReplicas": 1,
"currentRevision": "mysql-test-68cb64885c",
"updateRevision": "mysql-test-68cb64885c",
"collisionCount": 0
}
}