Related
I have the following output coming from a step function task: ListObjectsV2
{
"Contents": [
{
"ETag": "\"86c12c034bc6c30cb89b500b954c188f\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv",
"LastModified": "2023-02-09T13:46:20Z",
"Size": 796014,
"StorageClass": "STANDARD"
},
{
"ETag": "\"58e4a770e0f66073b00d185df500f07f\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv",
"LastModified": "2023-02-09T13:47:20Z",
"Size": 934038,
"StorageClass": "STANDARD"
},
{
"ETag": "\"460abd0de64d5cb67e8f0d46878cb1ef\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv",
"LastModified": "2023-02-09T13:46:57Z",
"Size": 794264,
"StorageClass": "STANDARD"
},
{
"ETag": "\"1bfedc3dc92e4ba8d04e24b9b5a0ed58\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv",
"LastModified": "2023-02-09T13:46:24Z",
"Size": 788756,
"StorageClass": "STANDARD"
},
{
"ETag": "\"9d6c434ce5ebdf203a790fbcf19338dc\"",
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv",
"LastModified": "2023-02-09T13:47:07Z",
"Size": 831156,
"StorageClass": "STANDARD"
}
],
"IsTruncated": false,
"KeyCount": 5,
"MaxKeys": 1000,
"Name": "vita-internal-text-classification-dev-183576513728",
"Prefix": "55271f52fffe4461a2ee3228ebb97157"
}
I want to have an array containing only the Key key, to pass to the next state, like so:
[
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv",
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv",
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv",
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv",
},
{
"Key": "55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv",
}
]
So far I've tried setting the ResultPath to:
$.Contents[*].Key
$.Contents[*].['Key']
What I get is:
[
"55271f52fffe4461a2ee3228ebb97157/input/batch_1.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_2.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_3.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_4.csv",
"55271f52fffe4461a2ee3228ebb97157/input/batch_5.csv",
]
But I've gotten bad output from that, any help?
The way I've solved this is to use an Inline Map state with a Pass state to build the necessary format. You can see this pattern in an example here for how to use Step Functions Distributed Map to bulk delete objects from S3. You can see this in the inner Create Object Identifier Array Map state. If you were doing this in Standard Workflows, this could be a cost concern given the number of state transitions involved. But since in the Item Processor I'm using Express Workflows, which are billed by duration (and these are super fast), it works pretty well.
{
"Comment": "A state machine to bulk delete objects from S3 using Distributed Map",
"StartAt": "Confirm Bucket Provided",
"States": {
"Confirm Bucket Provided": {
"Type": "Choice",
"Choices": [
{
"Not": {
"Variable": "$.bucket",
"IsPresent": true
},
"Next": "Fail - No Bucket"
}
],
"Default": "Check for Prefix"
},
"Check for Prefix": {
"Type": "Choice",
"Choices": [
{
"Not": {
"Variable": "$.prefix",
"IsPresent": true
},
"Next": "Generate Parameters - Without Prefix"
}
],
"Default": "Generate Parameters - With Prefix"
},
"Generate Parameters - Without Prefix": {
"Type": "Pass",
"Parameters": {
"Bucket.$": "$.bucket",
"Prefix": ""
},
"ResultPath": "$.list_parameters",
"Next": "Delete Objects from S3 Bucket"
},
"Fail - No Bucket": {
"Type": "Fail",
"Error": "InsuffcientArguments",
"Cause": "No Bucket was provided"
},
"Generate Parameters - With Prefix": {
"Type": "Pass",
"Next": "Delete Objects from S3 Bucket",
"Parameters": {
"Bucket.$": "$.bucket",
"Prefix.$": "$.prefix"
},
"ResultPath": "$.list_parameters"
},
"Delete Objects from S3 Bucket": {
"Type": "Map",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "DISTRIBUTED",
"ExecutionType": "EXPRESS"
},
"StartAt": "Create Object Identifier Array",
"States": {
"Create Object Identifier Array": {
"Type": "Map",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "INLINE"
},
"StartAt": "Create Object Identifier",
"States": {
"Create Object Identifier": {
"Type": "Pass",
"End": true,
"Parameters": {
"Key.$": "$.Key"
}
}
}
},
"ItemsPath": "$.Items",
"ResultPath": "$.object_identifiers",
"Next": "Delete Objects"
},
"Delete Objects": {
"Type": "Task",
"Next": "Clear Output",
"Parameters": {
"Bucket.$": "$.BatchInput.bucket",
"Delete": {
"Objects.$": "$.object_identifiers"
}
},
"Resource": "arn:aws:states:::aws-sdk:s3:deleteObjects",
"Retry": [
{
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2,
"IntervalSeconds": 1,
"MaxAttempts": 6
}
],
"ResultSelector": {
"Deleted.$": "$.Deleted",
"RetryCount.$": "$$.State.RetryCount"
}
},
"Clear Output": {
"Type": "Pass",
"End": true,
"Result": {}
}
}
},
"ItemReader": {
"Resource": "arn:aws:states:::s3:listObjectsV2",
"Parameters": {
"Bucket.$": "$.list_parameters.Bucket",
"Prefix.$": "$.list_parameters.Prefix"
}
},
"MaxConcurrency": 5,
"Label": "S3objectkeys",
"ItemBatcher": {
"MaxInputBytesPerBatch": 204800,
"MaxItemsPerBatch": 1000,
"BatchInput": {
"bucket.$": "$.list_parameters.Bucket"
}
},
"ResultSelector": {},
"End": true
}
}
}
need help to parse the JSON data received from Oracle Integration Cloud. The expected output is mentioned below alongwith the command i am trying to use.
JQ command
jq '[{id: .id},{integrations: [.integrations[]|{code: .code, version: .version, dependencies: .dependencies|{connections: .connections[]|{id: .id, status: .status}}, .dependencies|{lookups: .lookups}}]}]' output.json
Error :
jq: error: syntax error, unexpected FIELD (Unix shell quoting issues?) at , line 1:
[{id: .id},{integrations: [.integrations[]|{code: .code, version: .version, dependencies: .dependencies|{connections: .connections[]|{id: .id, status: .status}}, .dependencies|{lookups: .lookups}}]}]
Note : If i run below command to fetch only connections data it works fine
jq '[{id: .id},{integrations: [.integrations[]|{code: .code, version: .version, dependencies: .dependencies|{connections: .connections[]|{id: .id, status: .status}}}]}]' output.json
Expected Output:
[
{
"id": "SAMPLE_PACKAGE"
},
{
"integrations": [
{
"code": "HELLO_INTEGRATION",
"version": "01.00.0000",
"dependencies": {
"connections": {
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
}
},
{
"code": "HELLO_INTEGRATIO_LOOKUP",
"version": "01.00.0000",
"dependencies": {
"connections": {
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
},
"lookups": {
"name": "COMMON_LOOKUP_VARIABLES",
"status": "CONFIGURED"
}
}
},
{
"code": "HI_INTEGRATION",
"version": "01.00.0000",
"dependencies": {
"connections": {
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
}
}
]
}
]
output.json file contains
{
"bartaType": "DEVELOPED",
"countOfIntegrations": 3,
"id": "SAMPLE_PACKAGE",
"integrations": [
{
"code": "HELLO_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"lockedFlag": false,
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED",
"type": "rest",
"usage": 6
}
]
},
"description": "",
"eventSubscriptionFlag": false,
"filmstrip": [
{
"code": "HELLO_WORLD1",
"iconUrl": "/images/rest/rest_icon_46.png",
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED"
}
],
"id": "HELLO_INTEGRATION|01.00.0000",
"lockedFlag": false,
"name": "HELLO_INTEGRATION",
"pattern": "Orchestration",
"patternDescription": "Map Data",
"payloadTracingEnabledFlag": true,
"publishFlag": false,
"scheduleApplicable": false,
"scheduleDefined": false,
"status": "ACTIVATED",
"style": "FREEFORM",
"styleDescription": "Orchestration",
"tempCopyExists": false,
"tracingEnabledFlag": true,
"version": "01.00.0000",
"warningMsg": "ACTIVATE_PUBLISH_NO_CONN"
},
{
"code": "HELLO_INTEGRATIO_LOOKUP",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"lockedFlag": false,
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED",
"type": "rest",
"usage": 6
}
],
"lookups": [
{
"lockedFlag": false,
"name": "COMMON_LOOKUP_VARIABLES",
"status": "CONFIGURED",
"usage": 1
}
]
},
"description": "",
"eventSubscriptionFlag": false,
"filmstrip": [
{
"code": "HELLO_WORLD1",
"iconUrl": "/images/rest/rest_icon_46.png",
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED"
}
],
"id": "HELLO_INTEGRATIO_LOOKUP|01.00.0000",
"lockedFlag": false,
"name": "HELLO_INTEGRATION_LOOKUP",
"pattern": "Orchestration",
"patternDescription": "Map Data",
"payloadTracingEnabledFlag": true,
"publishFlag": false,
"scheduleApplicable": false,
"scheduleDefined": false,
"status": "ACTIVATED",
"style": "FREEFORM",
"styleDescription": "Orchestration",
"tempCopyExists": false,
"tracingEnabledFlag": true,
"version": "01.00.0000",
"warningMsg": "ACTIVATE_PUBLISH_NO_CONN"
},
{
"code": "HI_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"lockedFlag": false,
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED",
"type": "rest",
"usage": 6
}
]
},
"description": "",
"eventSubscriptionFlag": false,
"filmstrip": [
{
"code": "HELLO_WORLD1",
"iconUrl": "/images/rest/rest_icon_46.png",
"name": "Hello World1",
"role": "SOURCE",
"status": "CONFIGURED"
}
],
"id": "HI_INTEGRATION|01.00.0000",
"lockedFlag": false,
"name": "HI_INTEGRATION",
"pattern": "Orchestration",
"patternDescription": "Map Data",
"payloadTracingEnabledFlag": true,
"publishFlag": false,
"scheduleApplicable": false,
"scheduleDefined": false,
"status": "ACTIVATED",
"style": "FREEFORM",
"styleDescription": "Orchestration",
"tempCopyExists": false,
"tracingEnabledFlag": true,
"version": "01.00.0000",
"warningMsg": "ACTIVATE_PUBLISH_NO_CONN"
}
],
"isCloneAllowed": false,
"isViewAllowed": false,
"name": "SAMPLE_PACKAGE",
"type": "DEVELOPED"
}
The problem is that the lookups key is not always present so, you cannot use the [] on it. So, instead you can use the map function and provide a default before piping to the map function like below
[
{ id: .id },
{
integrations: [
.integrations[]|{
id: .id,
code: .code,
dependencies: {
connections: (.dependencies.connections//[]|map({id,status}))[0],
lookups: (.dependencies.lookups//[]|map({name,status}))[0]
}
}
]
}
]
The (.dependencies.lookups//[]|map({name,status}))[0] has the effect of passing an empty array to the map function which results in a null value when accessing the first element.
See in action https://jqplay.org/s/zQBkHtnzOd1
The provided JQ statement works fine for single elements in the array , but incase the array contains multiple elements it only fetches the first element. Also i updated the dependencies object to capture all the arrays ( connections,lookups,certificates,libraries,integrations)
Below is the modified one. Please suggest for any better options.
[
{ id: .id },
{
integrations: [
.integrations[]|{
id: .id,
code: .code,
dependencies: {
connections: (.dependencies.connections//[]|map({id,status})),
lookups: (.dependencies.lookups//[]|map({name,status})),
certificates: (.dependencies.certificates//[]|map({id,status})),
libraries: (.dependencies.libraries//[]|map({code,status,version})),
integrations: (.dependencies.integrations//[]|map({code,version}))
}
}
]
}
]|del(..|select(.==[]))
Note: To remove the empty arrays del function is added which is giving the below output :
[
{
"id": "SAMPLE_PACKAGE"
},
{
"integrations": [
{
"id": "HELLO_INTEGRATION|01.00.0000",
"code": "HELLO_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
},
{
"id": "HELLO_WORLD2",
"status": "CONFIGURED"
}
]
}
},
{
"id": "HELLO_INTEGRATIO_LOOKUP|01.00.0000",
"code": "HELLO_INTEGRATIO_LOOKUP",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
],
"lookups": [
{
"name": "COMMON_LOOKUP_VARIABLES",
"status": "CONFIGURED"
}
]
}
},
{
"id": "HI_INTEGRATION|01.00.0000",
"code": "HI_INTEGRATION",
"dependencies": {
"connections": [
{
"id": "HELLO_WORLD1",
"status": "CONFIGURED"
}
]
}
}
]
}
]
json data as given and have the names of the students in multiple instance like 100 (only 3 given). So, is there a way to give a #defs for a key and value to simplify the schema?
{
"student_id": {
"Alice": 0,
"Bob": 1,
"Charlie": 2,
"Derek": 3,
"Emily": 4,
"Florence": 5
},
"project": {
"Alice": "Science",
"Bob": "Math",
"Charlie": "Science",
"Derek": "Science",
"Emily": "Math",
"Florence": "Math"
},
"summer_camp": {
"Alice": true,
"Bob": false,
"Charlie": true,
"Derek": false,
"Emily": true,
"Florence": false
},
"Data":[
"student_id",
"project",
"summer_camp"
]
}
You can specify the property names in a reusable definition:
{
"$defs": {
"property_names_students": {
"propertyNames": {
"enum": [
"Alice",
"Bob",
...
]
]
}
},
"type": "object",
"properties": {
"student_id": {
"$ref": "#/$defs/property_names_students",
"additionalProperties": {
"type": "integer"
}
},
"project": {
"$ref": "#/$defs/property_names_students",
"additionalProperties": {
"enum": ["Science", "Math", ... ]
}
},
...
}
}
The entire JSON file is rather large so I've only taken out the subsection I've had an issue with.
{
"diagrams": {
"5f759d15cd046720c28531dd": {
"_id": "5f759d15cd046720c28531dd",
"offsetX": 320,
"offsetY": 42,
"zoom": 80,
"modified": 1604279356,
"nodes": {
"5f9f5c3ccd046720c28531e4": {
"nodeID": "5f9f5c3ccd046720c28531e4",
"type": "start",
"coords": [
360,
120
],
"data": {
"name": "Start",
"color": "standard",
"ports": [
{
"type": "",
"target": "5f9f5c3ccd046720c28531e6"
}
],
"steps": []
}
},
"5f9f5c3ccd046720c28531e5": {
"nodeID": "5f9f5c3ccd046720c28531e5",
"type": "block",
"coords": [
760,
120
],
"data": {
"name": "Help Message",
"color": "standard",
"steps": [
"5f9f5c3ccd046720c28531e6",
"5f9f5c3ccd046720c28531e7"
]
}
},
"5f9f5c3ccd046720c28531e6": {
"nodeID": "5f9f5c3ccd046720c28531e6",
"type": "speak",
"data": {
"randomize": false,
"dialogs": [
{
"voice": "Alexa",
"content": "You said help. Do you want to continue?"
}
],
"ports": [
{
"type": "",
"target": "5f9f5c3ccd046720c28531e7"
}
]
}
},
"5f9f5c3ccd046720c28531e7": {
"nodeID": "5f9f5c3ccd046720c28531e7",
"type": "interaction",
"data": {
"name": "Choice",
"else": {
"type": "path",
"randomize": false,
"reprompts": []
},
"choices": [
{
"intent": "",
"mappings": []
},
{
"intent": "",
"mappings": []
}
],
"reprompt": null,
"ports": [
{
"type": "else",
"target": null
},
{
"type": "",
"target": null
},
{
"type": "",
"target": "5f9f5c3ccd046720c28531e9"
}
]
}
},
"5f9f5c3ccd046720c28531e8": {
"nodeID": "5f9f5c3ccd046720c28531e8",
"type": "block",
"coords": [
1170,
260
],
"data": {
"name": "Exit",
"color": "standard",
"steps": [
"5f9f5c3ccd046720c28531e9"
]
}
},
"5f9f5c3ccd046720c28531e9": {
"nodeID": "5f9f5c3ccd046720c28531e9",
"type": "exit",
"data": {
"ports": []
}
}
},
"children": [],
"creatorID": 42661,
"variables": [],
"name": "Help Flow",
"versionID": "5f759d15cd046720c28531db"
}
}
}
The Current JSON Schema Definition I have is:
{
"$schema":"http://json-schema.org/schema#",
"type":"object",
"properties":{
"diagrams":{
"type":"object"
}
},
"required":[
"diagrams",
]
}
The problem I am having is that within diagrams contains multiple objects with a random string as the name e.g "5f759d15cd046720c28531dd".
Then within that object there are properties such as (_id, offsetX) which I want to express as well as a nodes object, which again contains multiple objects with arbitrary names e.g ("5f9f5c3ccd046720c28531e4", "5f9f5c3ccd046720c28531e5", ...) which have a unique node definition where some nodes have different properties to other nodes (nodeID, type, data vs nodeID, type, data, coords).
My question is with all these arbitrary things such as random names as well as different properties per each node. How do I turn it into 1 JSON schema definition which covers all the cases of how a diagram/node can be made.
You can do this with additionalProperties or patternProperties.
additionalProperties applies to any property that isn't declared in properties or patternProperties.
{
"type": "object",
"additionalProperties": {
"type": "object",
"properties": {
"_id": { ... },
"offsetX": { ... },
...
}
}
}
Your property names appear to always be hex numbers. If you want to enforce that those property names are always hex numbers, you can use patternProperties. Any property that matches the regex must conform to that schema.
{
"type": "object",
"patternProperties": {
"^[0-9a-f]{24}$": {
"type": "object",
"properties": {
"_id": { ... },
"offsetX": { ... },
...
}
}
},
"additionalProperties": false
}
I have the following JSON and I need to get id values for instances which do not have type = Jenkins
{
"data": [
{
"id": "35002399-6fd7-40b7-b0d0-8be64e4ec09c",
"name": "94Jenkins",
"url": "http://127.0.0.1:8084",
"authProvider": false,
"siteId": "cce1b6e2-4b5d-4455-ac96-6b5d4c0d901d",
"status": {
"status": "ONLINE"
},
"instanceStateReady": true,
"instanceState": {
"#type": "InstanceStateDto",
"version": "2.60.3"
},
"adminUser": "admin1",
"hasDRConfig": false,
"managed": true,
"type": "JENKINS",
"siteName": "City",
"lastRefreshTime": "2018-04-24T09:43:01.694Z"
},
{
"id": "5cd3caf6-bac1-4f07-8793-5f124b90eaf5",
"name": "RJO",
"url": "http://test.com",
"authProvider": false,
"status": {
"status": "UNAUTHORIZED"
},
"instanceStateReady": true,
"instanceState": {
"#type": "numberOfArtifacts",
"version": "5.5.2-m002",
"licenses": {
"RJO : artrjo-m": {
"type": "ENTERPRISE",
"validThrough": "Jun 12, 2021",
"licensedTo": "Test",
"licenseHash": "asdadsdb612bda1aae745bd2a3",
"expired": false
},
"RJO : artrjo-s1": {
"type": "ENTERPRISE",
"validThrough": "Jun 12, 2021",
"licensedTo": "JFrog",
"licenseHash": "asaswca236350205a3798c0fa3",
"expired": false
}
}
},
"adminUser": "jfmc",
"hasDRConfig": false,
"managed": false,
"warnings": [
"Site is missing",
"Failed to connect to the service. Please verify that the service information provided is correct."
],
"type": "ARTIFACTORY"
},
{
"id": "0727a49a-6c95-433e-9fc5-7e5c760cc76f",
"name": "NinetyTwo",
"url": "http:127.0.0.1:8081",
"authProvider": true,
"siteId": "cce1b6e2-4b5d-4455-ac96-6b5d4c0d901d",
"status": {
"status": "ONLINE"
},
"instanceStateReady": true,
"instanceState": {
"#type": "numberOfArtifacts",
"version": "5.9.0",
"licenses": {
"NinetyTwo": {
"type": "ENTERPRISE",
"validThrough": "Dec 30, 2018",
"licensedTo": "Test",
"licenseHash": "qweqwed95f712dbabee98184da52443",
"expired": false
}
}
},
"adminUser": "admin",
"hasDRConfig": false,
"managed": true,
"type": "ARTIFACTORY",
"serviceId": "jfrt#01c7g4c7hq0dpd0qa71r8c09sj",
"siteName": "Test",
"lastRefreshTime": "2018-04-24T09:43:01.698Z"
}
]
}
And I use $..[?(#.type!='JENKINS')].id path to receive id-s which relate to the instances with type NOT JENKINS, however JSON Extractor returns me the Jenkins's id too. The question is how can I receive ids for non-jenkins instances only?
Replace your JSON path expression with the below and it should work:
$.data.[*][?(#.type != "JENKINS")].id