Is there a way to process the result of the States.StringToJson intrinsic function directly?
Currently, in a step function, I am trying to handle the error from a synchronous call to another step function:
"OtherStepFunction": {
"Type": "Task",
"Resource": "arn:aws:states:::states:startExecution.sync:2",
"Parameters": {
"StateMachineArn": "otherstepFunctionCall",
"Input.$": "$"
},
"End": true,
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"Comment": "OtherStepFunctionFailed",
"Next": "StatusStepFunctionFailed",
"ResultPath": "$.error"
}
]
},
All errors go to a Pass state named StatusStepFunctionFailed, with the error output in the $.error path.
$.error is composed of the error type and the cause as an escaped JSON string.
"error": {
"Error": "States.TaskFailed",
"Cause": "{\"ExecutionArn\":\"otherfunctionarm:executionid\",\"Input\":\"foooooo\"}"
}
Is there any way to extract only the ExecutionArn from this input? In my Pass step I convert the Cause path to JSON, but I didn't find a way to select the ExecutionArn part directly. The following:
"reason.$": "States.JsonMerge($.error.Cause).ExecutionArn"
returns: The value for the field 'reason.$' must be a valid JSONPath or a valid intrinsic function call (at /States/HandleResource/Iterator/States/StatusStepFunctionFailedHandleJSON/Parameters)
My current workaround is to use two Pass states: first convert the output, then do the formatting.
I had a similar issue.
What I did was create a task that puts the Cause into a new path parameter using States.StringToJson. I set that task as the Next state of the error Catch and then called the subsequent task from it.
Using your variable names and values:
In the Catch, change the Next from StatusStepFunctionFailed to parseErrorCause
Then parseErrorCause is like this:
"parseErrorCause": {
"Type": "Pass",
"Parameters": {
"Result.$": "States.StringToJson($.error.Cause)"
},
"ResultPath": "$.parsedJSON",
"Next": "StatusStepFunctionFailed"
},
And StatusStepFunctionFailed accesses
"Variable": "$.parsedJSON.Result.Input",
to get foooooo
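The original question asked for the ExecutionArn rather than the Input; assuming the same parseErrorCause Pass state has already run, a minimal sketch of StatusStepFunctionFailed selecting it could look like this (the field name reason comes from the question, the rest is illustrative):
"StatusStepFunctionFailed": {
  "Type": "Pass",
  "Parameters": {
    "reason.$": "$.parsedJSON.Result.ExecutionArn"
  },
  "End": true
}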
{
  "metadata": {
    "id": "2",
    "uri": "3",
    "type": "2"
  },
  "Number": "2323600002913",
  "Date": "04/21/2009",
  "postingDate": "00/00/0000",
  "ata": {
    "results": [
      {
        "metadata": {
          "id": "r",
          "uri": "e2",
          "type": "s2"
        },
        "item": "000010",
        "data": "ad"
      }
    ]
  }
}
I want to remove the metadata property from the above JSON message, and the output should be like below:
{
  "Number": "2323600002913",
  "Date": "04/21/2009",
  "postingDate": "00/00/0000",
  "ata": {
    "results": [
      {
        "item": "000010",
        "data": "ad"
      }
    ]
  }
}
I tried removeProperty(), which works for the root-level metadata, but the metadata inside results is not removed.
How can I use replace() in this case, or anything else, to remove only the metadata properties?
The simplest way is to use inline code, because even with a removeProperty() expression to remove the metadata under results, it will return only the results array data, not the whole JSON; you would then have to combine them, which is not convenient.
With inline code you can refer to my picture below. The variable json holds the value from the trigger body; just delete the node or key and return the json variable. This way, even if you want to delete many metadata entries in the array, you can add a for loop to delete them and treat it as plain JS code.
Update: if you want to get the value from a variable, since no expression is supported to get a variable's value, use the expression below.
var json = workflowContext.actions.Initialize_variable.inputs.variables[0].value;
And for how to loop over the array in the JSON, refer to my picture below.
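Since the screenshots are not included here, a rough sketch of what the inline code could look like, reusing the variable expression above and assuming the JSON shape from the question:
// Sketch only: the Initialize_variable action name and the ata/results shape come from the posts above.
var json = workflowContext.actions.Initialize_variable.inputs.variables[0].value;
delete json.metadata;                        // remove the root-level metadata
for (var i = 0; i < json.ata.results.length; i++) {
    delete json.ata.results[i].metadata;     // remove metadata inside each result
}
return json;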
I am trying to use an example from
https://www.elastic.co/guide/en/elasticsearch/reference/6.4/modules-scripting-using.html
I have created a function and saved it.
POST http://localhost:9200/_scripts/calculate-score
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.added + params.my_modifier"
  }
}
Then I try to call the saved function:
POST http://localhost:9200/users/user/_search
{
  "query": {
    "script": {
      "script": {
        "id": "calculate-score",
        "params": {
          "my_modifier": 2
        }
      }
    }
  }
}
And it returns an error: Variable [ctx] is not defined. I tried to use doc['added'] but received the same error. Please help me understand how to call the function.
You should try using doc['added'].value; let me explain why and how. In short, it is because the painless scripting language is rather simple but obscure.
Why can't ES find the ctx variable?
The reason it cannot find the ctx variable is that this painless script runs in "filter context", and that variable is not available in filter context. (If you are curious, there were 18 types of painless contexts as of ES 6.4.)
In filter context there are only two variables available:
params (Map, read-only)
User-defined parameters passed in as part of the query.
doc (Map, read-only)
Contains the fields of the current document where each field is a List of values.
It should be enough to use doc['added'].value in your case:
POST /_scripts/calculate-score
{
  "script": {
    "lang": "painless",
    "source": "doc['added'].value + params.my_modifier"
  }
}
Should, because there will be another problem if we try to execute it (exactly like you did):
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"doc['added'].value + params.my_modifier",
"^---- HERE"
],
"script": "calculate-score",
"lang": "painless",
"caused_by": {
"type": "class_cast_exception",
"reason": "cannot cast def [long] to boolean"
}
Because of its context, this script is expected to return a boolean:
Return
boolean
Return true if the current document should be returned as a
result of the query, and false otherwise.
At this point we can understand why the script you were trying to execute did not make much sense for Elasticsearch: it is supposed to tell if a document matches a script query or not. If a script returns an integer, Elasticsearch wouldn't know if it is true or false.
How to make a stored script work in filter context?
As an example we can use the following script:
POST /_scripts/calculate-score1
{
  "script": {
    "lang": "painless",
    "source": "doc['added'].value > params.my_modifier"
  }
}
Now we can access the script:
POST /users/user/_search
{
  "query": {
    "script": {
      "script": {
        "id": "calculate-score1",
        "params": {
          "my_modifier": 2
        }
      }
    }
  }
}
And it will return all documents where added is greater than 2:
"hits": [
{
"_index": "users",
"_type": "user",
"_id": "1",
"_score": 1,
"_source": {
"name": "John Doe",
"added": 40
}
}
]
This time the script returned a boolean and Elasticsearch managed to use it.
If you are curious, a range query can do the same job without scripting, for example:
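A rough equivalent, using the same index and field as above (this request is an illustration, not part of the original answer):
POST /users/user/_search
{
  "query": {
    "range": {
      "added": {
        "gt": 2
      }
    }
  }
}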
Why do I have to put .value after doc['added']?
If you try to access doc['added'] directly you may notice that the error message is different:
POST /_scripts/calculate-score
{
  "script": {
    "lang": "painless",
    "source": "doc['added'] + params.my_modifier"
  }
}
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"doc['added'] + params.my_modifier",
" ^---- HERE"
],
"script": "calculate-score",
"lang": "painless",
"caused_by": {
"type": "class_cast_exception",
"reason": "Cannot apply [+] operation to types [org.elasticsearch.index.fielddata.ScriptDocValues.Longs] and [java.lang.Integer]."
}
Once again painless shows us its obscurity: when accessing the field 'added' of the document, we obtain an instance of org.elasticsearch.index.fielddata.ScriptDocValues.Longs, which the Java Virtual Machine refuses to add to an integer (we can't blame Java here).
So we actually have to call the .getValue() method, which, translated into painless, is simply .value.
What if I want to change that field in a document?
What if you want to add 2 to the field added of some document and save the updated document? The Update API can do this.
It operates in update context, which does have the ctx variable defined, and ctx in turn has access to the original JSON document via ctx['_source'].
We might create a new script:
POST /_scripts/add-some
{
  "script": {
    "lang": "painless",
    "source": "ctx['_source']['added'] += params.my_modifier"
  }
}
Now we can use it:
POST /users/user/1/_update
{
  "script": {
    "id": "add-some",
    "params": {
      "my_modifier": 2
    }
  }
}
Why the example from the documentation doesn't work?
Apparently, because it is wrong. This script (from this documentation page):
POST _scripts/calculate-score
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params.my_modifier"
  }
}
is later executed in filter context (in a search request, in a script query), and, as we now know, there is no _score variable available.
This script would only really make sense in score context, when running a function_score query, which allows you to tweak the relevance score of the documents.
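For illustration only (this request is not from the documentation page), a function_score query reusing the stored script in score context might look roughly like this:
POST /users/user/_search
{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "script_score": {
        "script": {
          "id": "calculate-score",
          "params": {
            "my_modifier": 2
          }
        }
      }
    }
  }
}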
Final note
I would like to mention that in general, it's recommended to avoid using scripts because their performance is poor.
In my Data Factory pipeline I have a Web activity which returns the JSON response below. In the next Stored Procedure activity I am unable to parse the output parameter. I tried a few methods.
I have set Content-Type to application/json in the Web activity.
Sample JSON:
Output
{
  "Response": "[{\"Message\":\"Number of barcode(s) found:1\",\"Status\":\"Success\",\"CCS Office\":[{\"Name\":\"Woodstock\",\"CCS Description\":null,\"BranchType\":\"Sub CFS Office\",\"Status\":\"Active\",\"Circle\":\"NJ\"}]}]"
}
For the parameter in the Stored Procedure activity:
@json(first(activity('Web1').output.Response))
output - System.Collections.Generic.List`1[System.Object]
@json(activity('Web1').output.Response[0])
output - cannot be evaluated because property '0' cannot be selected. Property selection is not supported on values of type 'String'
@json(activity('Web1').output.Response.Message)
output - cannot be evaluated because property 'Message' cannot be selected. Property selection is not supported on values of type 'String'
Here is what I did:
I created a new pipeline, and created a parameter of type 'object' using your 'output' in its entirety:
{ "Response": "[{\"Message\":\"Number of barcode(s) found:1\",\"Status\":\"Success\",\"CCS Office\":[{\"Name\":\"Woodstock\",\"CCS Description\":null,\"BranchType\":\"Sub CFS Office\",\"Status\":\"Active\",\"Circle\":\"NJ\"}]}]" }
I created a variable and a Set Variable activity. The variable is of type string. The dynamic expression I used is:
@{json(pipeline().parameters.output.response)[0]}
Let me break it down and explain. The {curly braces} were necessary because the variable is of type string. You may not want/need them.
json(....)
was necessary because the data type of the value of 'Response' was left as a string. Whether it being a string is correct behavior or not is a different discussion. By converting from string to JSON, I can now do the final piece.
[0]
now works because Data Factory sees the contents as objects rather than a string literal. This conversion seems to have been applied to the nested contents as well, because without the encapsulating {curly braces} to convert back to a string, I would get a type error from my Set Variable activity, as the variable is of type string.
Entire pipeline code:
{
  "name": "pipeline11",
  "properties": {
    "activities": [
      {
        "name": "Set Variable1",
        "type": "SetVariable",
        "dependsOn": [],
        "userProperties": [],
        "typeProperties": {
          "variableName": "thing",
          "value": {
            "value": "@{json(pipeline().parameters.output.response)[0]}",
            "type": "Expression"
          }
        }
      }
    ],
    "parameters": {
      "output": {
        "type": "object",
        "defaultValue": {
          "Response": "[{\"Message\":\"Number of barcode(s) found:1\",\"Status\":\"Success\",\"CCS Office\":[{\"Name\":\"Woodstock\",\"CCS Description\":null,\"BranchType\":\"Sub CFS Office\",\"Status\":\"Active\",\"Circle\":\"NJ\"}]}]"
        }
      }
    },
    "variables": {
      "thing": {
        "type": "String"
      }
    },
    "annotations": []
  }
}
I had a similar problem and this is how I resolved the issue.
I passed the value of Response as a string to a Lookup activity which calls a stored procedure in Azure SQL. The stored procedure parses the string using JSON_VALUE and returns the individual keys and values as a row. The output of the Lookup activity can then be accessed directly by subsequent activities.
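The stored procedure itself was not posted; a minimal sketch of what it could look like in Azure SQL (the procedure and column names are made up, the point is the OPENJSON/JSON_VALUE usage):
-- Hypothetical sketch: each element of the Response array string becomes one row
CREATE OR ALTER PROCEDURE dbo.ParseWebResponse
    @Response NVARCHAR(MAX)
AS
BEGIN
    SELECT
        JSON_VALUE(r.value, '$.Message') AS [Message],
        JSON_VALUE(r.value, '$.Status')  AS [Status]
    FROM OPENJSON(@Response) AS r;
END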
Suppose the JSON body returned from a call contains some dynamic keys, i.e.
{
  "message": "search results matching criteria",
  "permission": {
    "261ef70e-0a95-4967-b078-81e657e32699": {
      "device": {
        "read:own": [
          "*"
        ]
      },
      "account": {
        "read:own": [
          "*"
        ]
      },
      "user": {
        "read:own": [
          "*"
        ]
      }
    }
  }
}
I can validate the JSON as follows easily enough, although I am having a lot of trouble working out how to validate the objects BELOW the dynamic GUID level of the response.
pm.test("response body to have correct items", function () {
pm.expect(jsonData.message).to.eq("search results matching criteria");
pm.expect(jsonData).to.have.property('permission');
pm.expect(jsonData.permission).to.have.property(pm.variables.get("otherUserId"));
});
I would ideally like to verify the device, account, and user levels of the object.
Anyone with some tips?
I've tried a few ways to reference the otherUserId variable but nothing is working. Either it does not resolve the variable, failing the test because it looks for a level in the JSON literally called otherUserId, or it fails to run the test at all due to a syntax error.
This works:
pm.expect(jsonData.permission[pm.variables.get("otherUserId")]).to.have.property('device');
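Building on that, a sketch of checks for the device, account and user levels (same jsonData and otherUserId variables as above; the expected values are only assumptions based on the sample response):
var perms = jsonData.permission[pm.variables.get("otherUserId")];
pm.expect(perms).to.have.property('device');
pm.expect(perms).to.have.property('account');
pm.expect(perms).to.have.property('user');
pm.expect(perms.device['read:own']).to.eql(['*']);   // deep-equal check on the nested array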
I have the following flow in NiFi; the JSON has 1000+ objects in it.
InvokeHTTP -> SplitJson -> PutMongo
The flow works fine until I receive some keys in the JSON with "." in the name, e.g. "spark.databricks.acl.dfAclsEnabled".
My current solution is not optimal: I have jotted down the bad keys and I am using multiple ReplaceText processors to replace "." with "_". I am not using regex, I am using literal find/replace. So each time I get a failure in the PutMongo processor, I insert a new ReplaceText processor.
This is not maintainable. I am wondering if I can use JOLT for this? A couple of notes regarding the input JSON:
1) There is no set structure; the only thing that is confirmed is that everything will be in the events array, but the event objects themselves are free-form.
2) The maximum list size is 1000.
3) It is third-party JSON, so I can't ask for a change in the format.
Also, a key with "." can appear anywhere, so I am looking for a JOLT spec that can cleanse the keys at every level and then rename them.
{
  "events": [
    {
      "cluster_id": "0717-035521-puny598",
      "timestamp": 1531896847915,
      "type": "EDITED",
      "details": {
        "previous_attributes": {
          "cluster_name": "Kylo",
          "spark_version": "4.1.x-scala2.11",
          "spark_conf": {
            "spark.databricks.acl.dfAclsEnabled": "true",
            "spark.databricks.repl.allowedLanguages": "python,sql"
          },
          "node_type_id": "Standard_DS3_v2",
          "driver_node_type_id": "Standard_DS3_v2",
          "autotermination_minutes": 10,
          "enable_elastic_disk": true,
          "cluster_source": "UI"
        },
        "attributes": {
          "cluster_name": "Kylo",
          "spark_version": "4.1.x-scala2.11",
          "node_type_id": "Standard_DS3_v2",
          "driver_node_type_id": "Standard_DS3_v2",
          "autotermination_minutes": 10,
          "enable_elastic_disk": true,
          "cluster_source": "UI"
        },
        "previous_cluster_size": {
          "autoscale": {
            "min_workers": 1,
            "max_workers": 8
          }
        },
        "cluster_size": {
          "autoscale": {
            "min_workers": 1,
            "max_workers": 8
          }
        },
        "user": ""
      }
    },
    {
      "cluster_id": "0717-035521-puny598",
      "timestamp": 1535540053785,
      "type": "TERMINATING",
      "details": {
        "reason": {
          "code": "INACTIVITY",
          "parameters": {
            "inactivity_duration_min": "15"
          }
        }
      }
    },
    {
      "cluster_id": "0717-035521-puny598",
      "timestamp": 1535537117300,
      "type": "EXPANDED_DISK",
      "details": {
        "previous_disk_size": 29454626816,
        "disk_size": 136828809216,
        "free_space": 17151311872,
        "instance_id": "6cea5c332af94d7f85aff23e5d8cea37"
      }
    }
  ]
}
I created a template using ReplaceText and RouteOnContent to perform this task. The loop is required because the regex only replaces the first . in the JSON key on each pass. You might be able to refine this to perform all substitutions in a single pass, but after fuzzing the regex with the look-ahead and look-behind groups for a few minutes, re-routing was faster. I verified this works with the JSON you provided, and also with JSON where the keys and values are on different lines (with the : on either line):
...
"spark_conf": {
"spark.databricks.acl.dfAclsEnabled":
"true",
"spark.databricks.repl.allowedLanguages"
: "python,sql"
},
...
You could also use an ExecuteScript processor with Groovy to ingest the JSON, quickly filter all JSON keys that contain ., perform a collect operation to do the replacement, and re-insert the keys in the JSON data if you want a single processor to do this in a single pass.
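As a rough, untested sketch of that single-pass ExecuteScript (Groovy) approach, assuming the flow file content is the JSON shown above:
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets

def flowFile = session.get()
if (!flowFile) return

// Recursively rename every key containing "." to use "_" instead, at any depth
def clean
clean = { node ->
    if (node instanceof Map) {
        node.collectEntries { k, v -> [(k.toString().replace('.', '_')): clean(v)] }
    } else if (node instanceof List) {
        node.collect { clean(it) }
    } else {
        node
    }
}

flowFile = session.write(flowFile, { inStream, outStream ->
    def json = new JsonSlurper().parse(inStream)
    outStream.write(JsonOutput.toJson(clean(json)).getBytes(StandardCharsets.UTF_8))
} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)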