Search inside JSON with Elastic

I have an index/type in ES which has the following type of records:
body "{\"Status\":\"0\",\"Time\":\"2017-10-3 16:39:58.591\"}"
type "xxxx"
source "11.2.21.0"
The body field is a JSON string. I want to search, for example, for the records whose JSON body contains Status:0.
The query should look something like this (it doesn't work):
GET <host>:<port>/index/type/_search
{
"query": {
"match" : {
"body" : "Status:0"
}
}
}
Any ideas?

You have to change the analyser settings of your index.
For the JSON pattern you presented you will need to have a char_filter and a tokenizer which remove the JSON elements and then tokenize according to your needs.
Your analyser should contain a tokenizer and a char_filter like these ones here:
{
"tokenizer" : {
"type": "pattern",
"pattern": ","
},
"char_filter" : [ {
"type" : "mapping",
"mappings" : [ "{ => ", "} => ", "\" => " ]
} ],
"text" : [ "{\"Status\":\"0\",\"Time\":\"2017-10-3 16:39:58.591\"}" ]
}
Explanation: the char_filter will remove the characters: { } ". The tokenizer will tokenize by the comma.
These can be tested using the Analyze API. If you execute the above JSON against this API you will get these tokens:
{
"tokens" : [ {
"token" : "Status:0",
"start_offset" : 2,
"end_offset" : 13,
"type" : "word",
"position" : 0
}, {
"token" : "Time:2017-10-3 16:39:58.591",
"start_offset" : 15,
"end_offset" : 46,
"type" : "word",
"position" : 1
} ]
}
The first token returned by the Analyze API ("Status:0") is the term you were using in your search.
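To see what the char_filter and tokenizer above do to the body string without a running cluster, here is a minimal Python sketch that imitates the two steps. It is only an approximation of the analysis chain, not Elasticsearch itself, and the function name analyze is hypothetical.

```python
import re

def analyze(body: str) -> list[str]:
    # Imitate the mapping char_filter: drop '{', '}' and '"'.
    stripped = re.sub(r'[{}"]', "", body)
    # Imitate the pattern tokenizer: split on commas.
    return [token for token in stripped.split(",") if token]

body = '{"Status":"0","Time":"2017-10-3 16:39:58.591"}'
tokens = analyze(body)
print(tokens)
```

The resulting list contains "Status:0" as a single token, which is why a match on that exact term can succeed once the field is analysed this way.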

Related

How to retrieve all key-value pairs avoiding key duplication from JSON in Groovy script

I am totally new to Groovy scripting and would like some help sorting this out. I have a JSON response that I want to process to get the desired parameters back without duplicates. The JSON response does not have indexes like 0, 1, 2... that I can iterate through.
Here is the response that I want to work with:
{
"AuthenticateV2" : {
"displayName" : "Verification of authentication",
"description" : "notification ",
"smsTemplate" : "authentication.v2.0_sms",
"emailHeaderTemplate" : "v2.0_header",
"emailBodyTemplate" : "html",
"parameters" : {
"displayName" : "USER_DISPLAY_NAME",
"actionTokenURL" : "VERIFICATION_LINK",
"customToken" : "VERIFICATION_CODE"
},
"supportedPlans" : [
"connectGo"
]
},
"PasswordRecovery" : {
"displayName" : "Verification of password recovery",
"description" : "notification",
"smsTemplate" : "recovery.v1.0_sms",
"emailHeaderTemplate" : "recovery.v1.0_header",
"emailBodyTemplate" : "recovery.v1.0_body_html",
"parameters" : {
"displayName" : "USER_DISPLAY_NAME",
"actionTokenURL" : "VERIFICATION_LINK",
"customToken" : "VERIFICATION_CODE",
"adminInitiated" : false,
"authnId" : "AUTHENTICATION_IDENTIFIER",
"authnType" : "EMAIL",
"user" : {
"displayName" : "USER_DISPLAY_NAME"
}
},
"supportedPlans" : [
"connectGo"
]
},
"PasswordReset" : {
"displayName" : "password reset",
"description" : "notification",
"smsTemplate" : "recovery.v1.0_sms",
"emailHeaderTemplate" : "recovery.v1.0_header",
"emailBodyTemplate" : "html",
"parameters" : {
"displayName" : "USER_DISPLAY_NAME",
"user" : {
"displayName" : "USER_DISPLAY_NAME"
}
}
The expected output that I want to have:
{
"displayName" : "USER_DISPLAY_NAME",
"actionTokenURL" : "VERIFICATION_LINK",
"customToken" : "VERIFICATION_CODE",
"customToken" : "VERIFICATION_CODE",
"adminInitiated" : false,
"authnId" : "AUTHENTICATION_IDENTIFIER",
"authnType" : "EMAIL"
}
I need to retrieve all fields under the parameters tag and also want to avoid duplication.
You should first get familiar with parsing and producing JSON in Groovy.
Then, assuming the provided response is a valid JSON (it's not - there are 2 closing curlies (}) missing at the end) to get all the parameters keys merged into one JSON we have to convert the JSON string into a Map object first using JsonSlurper:
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
def validJsonResponse = '<your valid JSON string>'
Map parsedResponse = new JsonSlurper().parseText(validJsonResponse) as Map
Now, when we have a parsedResponse map we can iterate over all the root items in the response and transform them into the desired form (which is all the unique parameters keys) using Map::collectEntries method:
Map uniqueParameters = parsedResponse.collectEntries { it.value['parameters'] }
Finally, we can convert the uniqueParameters result back into a pretty printed JSON string using JsonOutput:
println JsonOutput.prettyPrint(JsonOutput.toJson(uniqueParameters))
After applying all the above we'll get the output:
{
"displayName": "USER_DISPLAY_NAME",
"actionTokenURL": "VERIFICATION_LINK",
"customToken": "VERIFICATION_CODE",
"adminInitiated": false,
"authnId": "AUTHENTICATION_IDENTIFIER",
"authnType": "EMAIL",
"user": {
"displayName": "USER_DISPLAY_NAME"
}
}
If you want to get rid of user entry from the final output just remove it from the resulting uniqueParameters map (uniqueParameters.remove('user')) before converting it back to JSON string.
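For comparison, the same merge can be sketched outside Groovy. The Python version below is an illustration only: it assumes the response has already been parsed into a dict, trims the sample to its parameters blocks, and uses a hypothetical helper name merge_parameters. Later keys overwrite earlier ones, which is what removes the duplicates.

```python
import json

# Trimmed version of the sample response: only the "parameters" blocks are kept.
response = {
    "AuthenticateV2": {"parameters": {
        "displayName": "USER_DISPLAY_NAME",
        "actionTokenURL": "VERIFICATION_LINK",
        "customToken": "VERIFICATION_CODE"}},
    "PasswordRecovery": {"parameters": {
        "displayName": "USER_DISPLAY_NAME",
        "actionTokenURL": "VERIFICATION_LINK",
        "customToken": "VERIFICATION_CODE",
        "adminInitiated": False,
        "authnId": "AUTHENTICATION_IDENTIFIER",
        "authnType": "EMAIL",
        "user": {"displayName": "USER_DISPLAY_NAME"}}},
    "PasswordReset": {"parameters": {
        "displayName": "USER_DISPLAY_NAME",
        "user": {"displayName": "USER_DISPLAY_NAME"}}},
}

def merge_parameters(parsed, drop=("user",)):
    merged = {}
    for item in parsed.values():
        # dict.update keeps one entry per key, so repeated keys collapse.
        merged.update(item.get("parameters", {}))
    for key in drop:
        merged.pop(key, None)  # optional: discard nested entries such as "user"
    return merged

unique_parameters = merge_parameters(response)
print(json.dumps(unique_parameters, indent=2))
```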

Splitting Json to multiple jsons in NIFI

I have the below json file which I want to split in NIFI
Input:
[ {
"id" : 123,
"ticket_id" : 345,
"events" : [ {
"id" : 3322,
"type" : "xyz"
}, {
"id" : 6675,
"type" : "abc",
"value" : "sample value",
"field_name" : "subject"
}, {
"id" : 9988,
"type" : "abc",
"value" : [ "text_file", "json_file" ],
"field_name" : "tags"
}]
}]
and my output should be 3 different jsons like below:
{
"id" : 123,
"ticket_id" : 345,
"events.id" :3322,
"events.type":xyz
}
{
"id" : 123,
"ticket_id" : 345,
"events.id" :6675,
"events.type":"abc",
"events.value": "sample value"
"events.field_name":"subject"
}
{
"id" : 123,
"ticket_id" : 345,
"events.id" :9988,
"events.type":"abc",
"events.value": "[ "text_file", "json_file" ]"
"events.field_name":"tags"
}
Can this be done with SplitJson? That is, can SplitJson split the JSON based on the array of JSON objects nested inside it?
Please let me know if there is a way to achieve this.
If you want 3 different flow files, each containing one JSON object from the array, you should be able to do it with SplitJson using a JSONPath of $ and/or $.*
Using reduce function:
function split(json) {
return json.reduce((acc, item) => {
const events = item.events.map((evt) => {
const obj = {id: item.id, ticket_id: item.ticket_id};
for (const k in evt) {
obj[`events.${k}`] = evt[k];
}
return obj;
});
return [...acc, ...events];
}, []);
}
const input = [{"id":123,"ticket_id":345,"events":[{"id":3322,"type":"xyz"},{"id":6675,"type":"abc","value":"sample value","field_name":"subject"},{"id":9988,"type":"abc","value":["text_file","json_file"],"field_name":"tags"}]}];
const res = split(input);
console.log(res);
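The same flattening idea can be sketched in Python, for instance for use in a scripted processor. This is an assumption-laden illustration rather than a NiFi recipe: the function name split_events is hypothetical, and unlike the JavaScript above it copies every top-level field except events instead of only id and ticket_id.

```python
def split_events(records):
    flattened = []
    for record in records:
        # Fields shared by every output object (everything except the array).
        parent = {k: v for k, v in record.items() if k != "events"}
        for event in record.get("events", []):
            row = dict(parent)
            # Prefix each event field with "events." as in the desired output.
            row.update({f"events.{k}": v for k, v in event.items()})
            flattened.append(row)
    return flattened

records = [{
    "id": 123,
    "ticket_id": 345,
    "events": [
        {"id": 3322, "type": "xyz"},
        {"id": 6675, "type": "abc", "value": "sample value", "field_name": "subject"},
        {"id": 9988, "type": "abc", "value": ["text_file", "json_file"], "field_name": "tags"},
    ],
}]

result = split_events(records)
for row in result:
    print(row)
```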

How to get entire parent node using jq json parser?

I am trying to find a value in the json file and based on that I need to get the entire json data instead of that particular block.
Here is my sample json
[{
"name" : "Redirect to Website 1",
"behaviors" : [ {
"name" : "redirect",
"options" : {
"mobileDefaultChoice" : "DEFAULT",
"destinationProtocol" : "HTTPS",
"destinationHostname" : "SAME_AS_REQUEST",
"responseCode" : 302
}
} ],
"criteria" : [ {
"name" : "requestProtocol",
"options" : {
"value" : "HTTP"
}
} ],
"criteriaMustSatisfy" : "all"
},
{
"name" : "Redirect to Website 2",
"behaviors" : [ {
"name" : "redirect",
"options" : {
"mobileDefaultChoice" : "DEFAULT",
"destinationProtocol" : "HTTPS",
"destinationHostname" : "SAME_AS_REQUEST",
"responseCode" : 301
}
} ],
"criteria" : [ {
"name" : "contentType",
"options" : {
"matchOperator" : "IS_ONE_OF",
"values" : [ "text/html*", "text/css*", "application/x-javascript*" ]
}
} ],
"criteriaMustSatisfy" : "all"
}]
I am trying to match "name" : "redirect" inside each behaviors array, and if it matches I need the entire enclosing block, including the "criteria" section, which as you can see sits in the same {} block.
I managed to find the values using select, but I am not able to get the parent section.
https://jqplay.org/s/BWJwVdO3Zv
Any help is much appreciated!
To avoid unwanted duplication:
.[]
| first(select(.behaviors[].name == "redirect"))
Equivalently:
.[]
| select(any(.behaviors[]; .name == "redirect"))
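The logic of the any-based filter above can be mirrored in Python to make the "keep the whole parent" behaviour explicit. This is a sketch only: the sample data is trimmed, the third rule is a hypothetical non-matching entry added for illustration, and the helper name rules_with_behavior is made up.

```python
rules = [
    {"name": "Redirect to Website 1",
     "behaviors": [{"name": "redirect", "options": {"responseCode": 302}}],
     "criteria": [{"name": "requestProtocol", "options": {"value": "HTTP"}}]},
    {"name": "Redirect to Website 2",
     "behaviors": [{"name": "redirect", "options": {"responseCode": 301}}],
     "criteria": [{"name": "contentType", "options": {"matchOperator": "IS_ONE_OF"}}]},
    # Hypothetical rule, not in the original sample, to show a non-match.
    {"name": "Cache everything",
     "behaviors": [{"name": "caching", "options": {}}],
     "criteria": []},
]

def rules_with_behavior(rules, behavior_name):
    # any() yields each parent at most once, even if several behaviors match.
    return [
        rule for rule in rules
        if any(b.get("name") == behavior_name for b in rule.get("behaviors", []))
    ]

matched = rules_with_behavior(rules, "redirect")
print([rule["name"] for rule in matched])
```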
You can try this jq command:
<file jq 'select(.[].behaviors[].name=="redirect")'

How to find all the json key-value pair by matching the value using json query

I have below JSON structure :
{
"key" : "value",
"array" : [
{ "key" : 1 },
{ "key" : 2, "misc": {
"a": "Apple",
"b": "Butterfly",
"c": "Cat",
"d": "Dog"
} },
{ "key" : 3 }
],
"tokenize" : {
"firstkey" : {
"token" : 0
},
"secondkey" : {
"token" : 1
},
"thirdkey" : {
"token" : 0
}
}
}
I am able to traverse the above structure down to array->dictionary->b with the syntax below:
$.array[?(@.key=2)].misc.b
Now I need to print all the tokens which have the value 0. In the same way as shown above, I can traverse down to $.array[?(@.key=2)].tokenize.
How can I query it to print all entries that have token: 0?
To be very precise, I want the output to be shown as :
[
"tokenize" : {
"firstkey" : {
"token" : 0
},
"thirdkey" : {
"token" : 0
}
}
]
The following query already showing something near to what I want but it does not show the keys ("firstkey" and "thirdkey" in this case).
$.tokenize[?(@.token == 0)]
Please help me to get this as well.
Thanks.
You can try this script.
$.tokenize[?(@.token == 0)].token
Result:
[
0,
0
]
$.tokenize[?(@.token == 0)]~
will output
[
"firstkey",
"thirdkey"
]
for the OP's sample JSON. Use https://jsonpath-plus.github.io/JSONPath/demo/ to verify against your own data.
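If keys and values are both needed in one result, as in the desired output, a plain filter over the parsed object is one option. The Python sketch below assumes the document has already been loaded into a dict and is not a JSONPath query; the variable names are illustrative.

```python
document = {
    "tokenize": {
        "firstkey": {"token": 0},
        "secondkey": {"token": 1},
        "thirdkey": {"token": 0},
    }
}

# Keep only the entries whose "token" equals 0, preserving their keys.
zero_tokens = {
    key: value
    for key, value in document["tokenize"].items()
    if value.get("token") == 0
}
result = {"tokenize": zero_tokens}
print(result)
```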

Mongolite group by/aggregate on JSON object

I have a json document like this on my mongodb collection:
Updated document:
{
"_id" : ObjectId("59da4aef8c5d757027a5a614"),
"input" : "hi",
"output" : "Hi. How can I help you?",
"intent" : "[{\"intent\":\"greeting\",\"confidence\":0.8154089450836182}]",
"entities" : "[]",
"context" : "{\"conversation_id\":\"48181e58-dd51-405a-bb00-c875c01afa0a\",\"system\":{\"dialog_stack\":[{\"dialog_node\":\"root\"}],\"dialog_turn_counter\":1,\"dialog_request_counter\":1,\"_node_output_map\":{\"node_5_1505291032665\":[0]},\"branch_exited\":true,\"branch_exited_reason\":\"completed\"}}",
"user_id" : "50001",
"time_in" : ISODate("2017-10-08T15:57:32.000Z"),
"time_out" : ISODate("2017-10-08T15:57:35.000Z"),
"reaction" : "1"
}
I need to group by the intent.intent field, and I'm using RStudio with the mongolite library.
What I have tried is:
pp = '[{"$unwind": "$intent"},{"$group":{"_id":"$intent.intent", "count": {"$sum":1} }}]'
stats <- chat$aggregate(
pipeline=pp,
options = '{"allowDiskUse":true}'
)
print(stats)
But it's not working; the output of the above code is:
_id count
1 NA 727
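Because intent is stored as a string rather than as an array of documents, $intent.intent resolves to nothing and every document falls into the NA group.
If you want to keep the attribute as a string, you can split it on \" and take the element at index 3 of the resulting array, which is the intent name.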
db.getCollection('test1').aggregate([
{ "$project": { intent_text : { $arrayElemAt : [ { $split: ["$intent", "\""] } ,3 ] } } },
{ "$group": {"_id": "$intent_text" , "count": {"$sum":1} }}
])
Result:
{
"_id" : "greeting",
"count" : 1.0
}
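As an alternative to splitting on quotes, the string can be parsed as JSON on the client and counted there. The Python sketch below illustrates that idea under stated assumptions: the documents are already fetched into a list of dicts, only the first sample value comes from the question, the other two are hypothetical, and count_intents is a made-up helper name.

```python
import json
from collections import Counter

def count_intents(documents):
    counts = Counter()
    for doc in documents:
        # "intent" holds a JSON-encoded array, so decode it before grouping.
        for entry in json.loads(doc["intent"]):
            counts[entry["intent"]] += 1
    return counts

documents = [
    {"intent": '[{"intent":"greeting","confidence":0.8154089450836182}]'},
    # Hypothetical extra documents for illustration.
    {"intent": '[{"intent":"greeting","confidence":0.5}]'},
    {"intent": '[{"intent":"goodbye","confidence":0.9}]'},
]

counts = count_intents(documents)
print(counts)
```

This handles documents whose intent array holds more than one entry, which the index-based split does not.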