Querying in Elasticsearch to get a JSON object as output

{
"key":
[
{
"text": "I noticed when you came up he's even",
"start": 0.0,
"duration": 4.68
},
{
"text": "playing with a little bit your your",
"start": 3.3,
"duration": 5.07
}
]
}
I have a huge number of files in this format, and I need to write a query that returns only the matching text and its start. But I'm getting the whole file in which the text is present as output.
If I write the query like this:
{
"query":
{
"term":
{
"key.text": "came"
}
}
}
then I should get only
{
"text": "I noticed when you came up he's even",
"start": 0.0,
"duration": 4.68
}
as output, but instead I'm getting the whole file.

You need to use inner hits with a nested query to fetch only the items in the array that match your search query. For this to work, the key field has to be mapped as the nested type.
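As a rough sketch of what such a request body could look like, the Python snippet below builds a nested query with inner_hits around the term query from the question. It assumes the index maps key as nested; the helper name build_inner_hits_query is made up for illustration, and turning off _source is an optional choice to keep the full document out of the response.

```python
import json

# Assumption: the index maps "key" as type "nested". With the default
# object mapping, Elasticsearch rejects a nested query on this path.
def build_inner_hits_query(word):
    """Build a search body that returns only the matching array items."""
    return {
        # Skip the full document; the matching items come back under inner_hits.
        "_source": False,
        "query": {
            "nested": {
                "path": "key",
                "query": {"term": {"key.text": word}},
                "inner_hits": {},
            }
        },
    }

body = build_inner_hits_query("came")
print(json.dumps(body, indent=2))
```

Each hit in the response should then carry an inner_hits section holding only the matching entries of the key array, rather than the whole file.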

Related

Robot Framework how to count item list in JSON

I would like to count the "start" items in a JSON API response using Robot Framework.
{
"result": {
"api": "xxx",
"timestamp": "14:41:18",
"series": [
{
"series_code": "test",
"series_name_eng": "test",
"unit_eng": "t",
"series_type": "e",
"frequency": "s",
"last_update_date": "2020",
"observations": [
{
"start": "2020-01",
"value": "999"
},
{
"start": "2020-02",
"value": "888"
},
{
"start": "2020-03",
"value": "777"
},
]
}
I tried this, but it is not working:
${json_string}    Get File    ./example.json
${json_object}    evaluate    json.loads('''${json_string}''')    json
#${value}=    get value from json    ${json_object}    $.result.series[0].observations
${x_count}    Get Length    ${json_object["$.result.series[0].observations"]}
Could you please guide me on how to do this?
The JSON provided in the example above is not valid. To fix it, remove the trailing comma after the last observation, close the series array with ], then close the result object with }, and finally close the outer object with }.
Valid JSON will look like this:
${Getjson}= {"result":{"api":"xxx","timestamp":"14:41:18","series":[{"series_code":"test","series_name_eng":"test","unit_eng":"t","series_type":"e","frequency":"s","last_update_date":"2020","observations":[{"start":"2020-01","value":"999"},{"start":"2020-02","value":"888"},{"start":"2020-03","value":"777"}]}]}}
You were close with the JSONPath $.result.series[0].observations. The correct usage is shown in the example below:
${json}=    Convert String to JSON    ${Getjson}
${Start}=    Get Value From Json    ${json}    $.result.series[?(@.observations)].observations[?(@.start)].start
${length}    Get Length    ${Start}
Log    ${length}
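For comparison, here is a minimal Python sketch of the same count outside Robot Framework. It uses the corrected JSON from the answer and plain dictionary access instead of JSONPath; the variable names are illustrative only.

```python
import json

# Corrected payload from the answer above (brackets closed, no trailing comma).
raw = (
    '{"result":{"api":"xxx","timestamp":"14:41:18","series":[{"series_code":"test",'
    '"series_name_eng":"test","unit_eng":"t","series_type":"e","frequency":"s",'
    '"last_update_date":"2020","observations":[{"start":"2020-01","value":"999"},'
    '{"start":"2020-02","value":"888"},{"start":"2020-03","value":"777"}]}]}}'
)

data = json.loads(raw)

# Collect every "start" value across all series, skipping observations without one.
starts = [
    obs["start"]
    for series in data["result"]["series"]
    for obs in series["observations"]
    if "start" in obs
]

print(len(starts))  # 3
```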

Json Hive: Need a function to find length of list in json

I have a JSON list that has to be flattened. Right now I am using get_json_object(), but it is a cumbersome process because I don't know the length of the list. In the sample I pasted it is 3, but it can go up to 50. How do I find the latest version number?
get_json_object(description,'$.data.info[3].version')
I need the number 3 or the version 1567954261.
{
"data":
{
"info": [
{ "version": 1567904215, "amount": 1987 },
{ "version": 1567943816, "amount": 169 },
{ "version": 1567954261, "amount": 3034 }
]
}
}

Elastic search filter for distinct categories

I made a simple mapping with three fields. One field is of type text and is analyzed; the other two are of type keyword.
Example fields: Category_one, Category_two, Category_three.
Now I am searching the documents:
Get _search/cat
{
"size": 4,
"query": {
"match": {
"Category_one.ngrams": {
"query": "Nice food place in XYZ location",
"analyzer": "standard"
}
},
"aggs":{
"distincr_values":{
"terms": {
"fields" : "Category_two"
}
}
}
}
}
It shows this error:
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[match] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 10,
"col": 5
}
],
"type": "parsing_exception",
"reason": "[match] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 10,
"col": 5
},
"status": 400
}
Kindly help me with this error. My main goal is to find distinct results according to the Category_two field.
Any help would be appreciated.
I believe you're getting this error because of your query structure.
The aggs keyword must be outside the query, at the same level as it. At the moment your aggs block is wrapped inside the query.
Follow this structure:
Get _search/cat
{
"size": 4,
"query": {
'query goes here'
},
"aggs":{
'aggregation go here'
}
}
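To make the structure concrete, here is a sketch in Python that builds the request body from the question with aggs moved up to the top level. Two further changes are my own assumptions rather than part of the answer: the terms aggregation uses the parameter field (the question has fields), and the aggregation is named distinct_values. The helper name build_search_body is hypothetical.

```python
import json

def build_search_body(text):
    """Build a search body with "query" and "aggs" as siblings."""
    return {
        "size": 4,
        "query": {
            "match": {
                "Category_one.ngrams": {
                    "query": text,
                    "analyzer": "standard",
                }
            }
        },
        # Same level as "query", not nested inside it.
        "aggs": {
            "distinct_values": {
                # Assumption: "field" (singular) is the terms aggregation parameter.
                "terms": {"field": "Category_two"}
            }
        },
    }

body = build_search_body("Nice food place in XYZ location")
print(json.dumps(body, indent=2))
```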

How can I improve raw JSON data in order to use it?

I'm trying to use some results exported as JSON by a script called "Mixed Content Scan" (a script that searches a website for mixed HTTP/HTTPS content and checks that all your pages are OK over HTTPS).
I'm a beginner with JSON. I have read and watched a lot of tutorials to understand how to structure JSON data, but I'm stumbling on something.
Here is a sample of my data (first 3 lines):
{"message":"Scanning https://mywebsite.com/","context":[],"level":250,"level_name":"NOTICE","channel":"MCS","datetime":{"date":"2018-10-05 23:48:50.268196","timezone_type":3,"timezone":"America/New_York"},"extra":[]}
{"message":"00000 - https://mywebsite.com/","context":[],"level":400,"level_name":"ERROR","channel":"MCS","datetime":{"date":"2018-10-05 23:48:50.760948","timezone_type":3,"timezone":"America/New_York"},"extra":[]}
{"message":"http://mywebsite.com/wp-content/uploads/2015/03/image.jpg","context":[],"level":300,"level_name":"WARNING","channel":"MCS","datetime":{"date":"2018-10-05 23:48:50.761082","timezone_type":3,"timezone":"America/New_York"},"extra":[]}
I know I need to wrap my data in {} or [] (I tried both), but I think I'm missing something. For example, every JSON validator website tells me that I have an error between two lines when I add a "," to try to include multiple results.
How can I fix this raw data so that a JSON validator accepts it?
Thanks!
How's this?
[{
"message": "Scanning https://mywebsite.com/",
"context": [],
"level": 250,
"level_name": "NOTICE",
"channel": "MCS",
"datetime": {
"date": "2018-10-05 23:48:50.268196",
"timezone_type": 3,
"timezone": "America/New_York"
},
"extra": []
}, {
"message": "00000 - https://mywebsite.com/",
"context": [],
"level": 400,
"level_name": "ERROR",
"channel": "MCS",
"datetime": {
"date": "2018-10-05 23:48:50.760948",
"timezone_type": 3,
"timezone": "America/New_York"
},
"extra": []
}, {
"message": "http://mywebsite.com/wp-content/uploads/2015/03/image.jpg",
"context": [],
"level": 300,
"level_name": "WARNING",
"channel": "MCS",
"datetime": {
"date": "2018-10-05 23:48:50.761082",
"timezone_type": 3,
"timezone": "America/New_York"
},
"extra": []
}]
Entries in an array need to be separated by commas, and the whole list has to be wrapped in [ ].
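If the export is large, doing this by hand gets tedious. Below is a minimal Python sketch that treats the input as one JSON object per line and collects the objects into a single array. The sample lines are shortened stand-ins for the log entries in the question, and the function name json_lines_to_array is made up for illustration.

```python
import json

def json_lines_to_array(text):
    """Parse one JSON object per non-empty line and return them as a list."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Shortened stand-ins for the log lines in the question.
raw = "\n".join([
    '{"message": "Scanning https://mywebsite.com/", "level": 250, "level_name": "NOTICE"}',
    '{"message": "00000 - https://mywebsite.com/", "level": 400, "level_name": "ERROR"}',
    '{"message": "http://mywebsite.com/wp-content/uploads/2015/03/image.jpg", "level": 300, "level_name": "WARNING"}',
])

records = json_lines_to_array(raw)

# json.dumps on the list emits the enclosing [ ] and the separating commas.
print(json.dumps(records, indent=2))
```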

JSON Slurper Offsets

I have a large JSON file that I'm trying to parse with JSON Slurper. The JSON file consists of information about bugs so it has things like issue keys, descriptions, and comments. Not every issue has a comment though. For example, here is a sample of what the JSON input looks like:
{
"projects": [
{
"name": "Test Project",
"key": "TEST",
"issues": [
{
"key": "BUG-1",
"priority": "Major",
"comments": [
{
"author": "a1",
"created": "d1",
"body": "comment 1"
},
{
"author": "a2",
"created": "d2",
"body": "comment 2"
}
]
},
{
"key": "BUG-2",
"priority": "Major"
},
{
"key": "BUG-3",
"priority": "Major",
"comments": [
{
"author": "a3",
"created": "d3",
"body": "comment 3"
}
]
}
]
}
]
}
I have a method that creates Issue objects based on the JSON parse. Everything works well when every issue has at least one comment, but, once an issue comes up that has no comments, the rest of the issues get the wrong comments. I am currently looping through the JSON file based on the total number of issues and then looking for comments using how far along in the number of issues I've gotten. So, for example,
parsedData.issues.comments.body[0][0][0]
returns "comment 1". However,
parsedData.issues.comments.body[0][1][0]
returns "comment 3", which is incorrect. Is there a way I can see if a particular issue has any comments? I'd rather not have to edit the JSON file to add empty comment fields, but would that even help?
You can do this:
parsedData.issues.comments.collect { it?.body ?: [] }
This checks for a body and, if none exists, returns an empty list.
UPDATE
Based on the update to the question, you can do:
parsedData.projects.collectMany { it.issues.comments.collect { it?.body ?: [] } }
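The underlying idea is to look up comments per issue and fall back to an empty list, instead of indexing into a flattened list where issues without comments have silently dropped out. For comparison only, here is that idea sketched in Python on a trimmed copy of the sample data; it is not Groovy or JsonSlurper code, and the function name comment_bodies_by_issue is hypothetical.

```python
# Trimmed version of the sample data from the question.
parsed = {
    "projects": [
        {
            "key": "TEST",
            "issues": [
                {"key": "BUG-1", "comments": [{"body": "comment 1"}, {"body": "comment 2"}]},
                {"key": "BUG-2"},
                {"key": "BUG-3", "comments": [{"body": "comment 3"}]},
            ],
        }
    ]
}

def comment_bodies_by_issue(data):
    """Map each issue key to its comment bodies; issues without comments get []."""
    return {
        issue["key"]: [comment["body"] for comment in issue.get("comments", [])]
        for project in data["projects"]
        for issue in project["issues"]
    }

bodies = comment_bodies_by_issue(parsed)
print(bodies["BUG-2"])  # []
print(bodies["BUG-3"])  # ['comment 3']
```

Keying the result by issue keeps each comment list attached to the issue it belongs to, so a missing comments field cannot shift later entries.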