Elasticsearch filter for distinct categories - json

I made a simple mapping with three fields. One field is of type text and is analyzed; the other fields are of type keyword.
Example fields: Category_one, Category_two, Category_three.
Now I am searching the documents:
Get _search/cat
{
"size": 4,
"query": {
"match": {
"Category_one.ngrams": {
"query": "Nice food place in XYZ location",
"analyzer": "standard"
}
},
"aggs":{
"distincr_values":{
"terms": {
"fields" : "Category_two"
}
}
}
}
}
It's showing this error
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[match] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 10,
"col": 5
}
],
"type": "parsing_exception",
"reason": "[match] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 10,
"col": 5
},
"status": 400
}
Kindly help me with this error. My main goal is to get distinct search results according to the Category_two field.
Any help would be appreciated.

I believe you're getting this error because of your query structure.
The aggs keyword must be outside the query, at the same level as it. At the moment your aggs is nested inside the query.
Follow this structure:
Get _search/cat
{
"size": 4,
"query": {
'query goes here'
},
"aggs":{
'aggregations go here'
}
}
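As a rough sketch of that structure applied to the query from the question, the request body can be built as a Python dict so the nesting is easy to check. The helper name `build_search_body` is made up for this example. Note also that the terms aggregation takes `field` (singular), not `fields` as written in the question, and the aggregation name is spelled `distinct_values` here.

```python
import json


def build_search_body(text, agg_field, size=4):
    """Build a search body with "query" and "aggs" as siblings."""
    return {
        "size": size,
        # The match query is closed before the aggregations start.
        "query": {
            "match": {
                "Category_one.ngrams": {
                    "query": text,
                    "analyzer": "standard",
                }
            }
        },
        # "aggs" sits at the top level, next to "query".
        "aggs": {
            "distinct_values": {
                "terms": {"field": agg_field}
            }
        },
    }


body = build_search_body("Nice food place in XYZ location", "Category_two")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the search request.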

Elasticsearch - Nested aggregation

I would like to form a nested aggregation query in Elasticsearch. Basically, the aggregation is nested four levels deep:
groupId.keyword
--direction
----billingCallType
------durationCallAnswered
Example:
"aggregations": {
"avgCallDuration": {
"terms": {
"field": "groupId.keyword",
"size": 10000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"call_direction": {
"terms" : {
"field": "direction"
},
"aggregations": {
"call_type" : {
"terms": {
"field": "billingCallType"
},
"aggregations": {
"avg_value": {
"terms": {
"field": "durationCallAnswered"
}
}
}
}
}
}
}
}
}
This is part of a query. While running it, I get this error:
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [direction] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
Can anyone shed light on this?
TL;DR
As the error states, you are performing an aggregation on a text field, the field direction.
Aggregations are not supported by default on text fields, because they are very expensive (CPU- and memory-wise).
There are three solutions to your issue:
1. Change the mapping from text to keyword (requires reindexing, but is the most efficient way to query the data)
2. Change the mapping to add fielddata: true to this field (flexible, but not optimised)
3. Don't do the aggregation on this field :)
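The first two options can be sketched as request bodies, built here as Python dicts. This assumes typeless mappings (Elasticsearch 7 or later). The helper names are made up for the example, and the index names are not known from the question. The first body would be sent when creating a new index to reindex into, the second to the `_mapping` endpoint of the existing index.

```python
import json


def keyword_index_body(field_name):
    """Body for creating a new index where field_name is a keyword."""
    return {"mappings": {"properties": {field_name: {"type": "keyword"}}}}


def fielddata_mapping_body(field_name):
    """Body for enabling fielddata on an existing text field."""
    return {"properties": {field_name: {"type": "text", "fielddata": True}}}


keyword_body = keyword_index_body("direction")
fielddata_body = fielddata_mapping_body("direction")
print(json.dumps(keyword_body))
print(json.dumps(fielddata_body))
```

If direction was mapped dynamically and has a keyword sub-field, as groupId.keyword suggests may be the case, aggregating on direction.keyword might work without any mapping change. That is an assumption worth checking against the actual mapping.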

How do I properly use deleteMany() with an $and query in the MongoDB shell?

I am trying to delete all documents in my collection infrastructure that have a type.primary property of "pipelines" and a type.secondary property of "oil".
I'm trying to use the following query:
db.infrastructure.deleteMany({$and: [{"properties.type.primary": "pipelines"}, {"properties.type.secondary": "oil"}] })
That returns: { acknowledged: true, deletedCount: 0 }
I expect my query to work because in MongoDB Compass, I can retrieve 182 documents that match the query {$and: [{"properties.type.primary": "pipelines"}, {"properties.type.secondary": "oil"}] }
My documents appear with the following structure (relevant section only):
"properties": {
"optional": {
"description": ""
},
"original": {
"Opername": "ENBRIDGE",
"Pipename": "Lakehead",
"Shape_Leng": 604328.294581,
"Source": "EIA"
},
"required": {
"unit": null,
"viz_dim": null,
"years": []
},
"type": {
"primary": "pipelines",
"secondary": "oil"
}
...
My understanding is that I just need to pass a filter to deleteMany() and that $and expects an array of objects. For some reason the two combined aren't working here.
I realized the simplest answer was the correct one -- I spelled my database name incorrectly.
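One way to rule out the filter itself is to check it offline against the sample document. The sketch below only mimics simple equality matching on dotted paths, not MongoDB's full query semantics, and the helpers `get_path` and `matches_all` are made up for this example.

```python
def get_path(doc, dotted):
    """Follow a dotted path such as "a.b.c" through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches_all(doc, conditions):
    """True when every {path: value} condition holds (like a basic $and)."""
    return all(
        get_path(doc, path) == expected
        for condition in conditions
        for path, expected in condition.items()
    )


sample = {"properties": {"type": {"primary": "pipelines", "secondary": "oil"}}}
conditions = [
    {"properties.type.primary": "pipelines"},
    {"properties.type.secondary": "oil"},
]
filter_matches = matches_all(sample, conditions)
print(filter_matches)
```

Since the filter matches the document shape, a deletedCount of 0 points at where the query runs (database or collection name) rather than at the filter.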

Why can't I submit a record to Zoho Sales Order API?

I'm trying to insert a record using the Zoho API, and I keep receiving a cryptic INVALID_DATA error message.
I've tried using their sample code which, of course, produces another error. And the sample code they provide for running in Postman also produces an error.
Their docs are lacking and inconsistent, and nobody is getting back to me on their message boards, and I'm getting desperate as I need to have this done today. Can anyone see what I'm doing wrong?
This is what I'm submitting via Postman
{
"data": [
{
"Owner": {
"id": "3938209039489388001"
},
"Contact_Name": {
"id": "398129039938498309"
},
"Subject": "Test",
"Product_Details": [
{
"product": {
"id": "1234567"
},
"quantity": 1
}
]
}
]
}
This is the error response
{
"data": [
{
"code": "INVALID_DATA",
"details": {
"api_name": "product",
"index": 0,
"parent_api_name": "Product_Details"
},
"message": "invalid data",
"status": "error"
}
]
}
The solution was to POST a product first, then grab that product ID and insert it under Product_Details. This is not documented, so I assumed the product would be created automatically, which it wasn't.
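A minimal sketch of the second step, assuming the product has already been created and its ID is at hand. It only builds the payload in the shape shown in the question; the HTTP calls are left out because the endpoints are not given here, and `build_sales_order` is a made-up helper name.

```python
def build_sales_order(owner_id, contact_id, product_id, subject="Test", quantity=1):
    """Build a sales order payload that references an existing product."""
    return {
        "data": [
            {
                "Owner": {"id": owner_id},
                "Contact_Name": {"id": contact_id},
                "Subject": subject,
                "Product_Details": [
                    # product_id must be the ID of a product created beforehand.
                    {"product": {"id": product_id}, "quantity": quantity}
                ],
            }
        ]
    }


# The IDs below are the placeholder values from the question.
payload = build_sales_order("3938209039489388001", "398129039938498309", "1234567")
print(payload)
```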

Querying Microsoft Academic graph by fields of study of references

I've been playing around with the Microsoft Academic API, trying to do graph queries using JSON formatted queries. I'm at the point where I think I can produce results, but for some reason I don't get the full set of results.
The query I am attempting to perform will retrieve all papers that reference a paper that has a FieldOfStudy that is one of the ones I'm looking for. Essentially, I'm trying to find out how well cited a field of study is.
I think the query should look something like this:
{
"path": "/paper/ReferenceIDs/reference/FieldOfStudyIDs/field",
"paper": {
"type": "Paper",
"match" : {
"PublishYear": 2017
},
"select": ["DOI","OriginalTitle","PublishYear"]
},
"reference" : {
"type" : "Paper",
"select" : "OriginalTitle"
},
"field": {
"type": "FieldOfStudy",
"select": [ "Name" ],
"return": { "id": [106686826,204641814] }
}
}
Unfortunately, I get only an incomplete subset of results. Funnily enough, if I further restrict the initial node by matching on a title, I get another set of results (disjoint from the first query's result set):
{
"path": "/paper/ReferenceIDs/reference/FieldOfStudyIDs/field",
"paper": {
"type": "Paper",
"match" : {
"OriginalTitle": "cancer",
"PublishYear": 2017
},
"select": ["DOI","OriginalTitle","PublishYear"]
},
"reference" : {
"type" : "Paper",
"select" : "OriginalTitle"
},
"field": {
"type": "FieldOfStudy",
"select": [ "Name" ],
"return": { "id": [106686826,204641814] }
}
}
So, what could be going on here? Is the query giving up because the very first node it hits on the broader search doesn't match the path? Is it even possible to query all papers published in a year like this?

JSON Slurper Offsets

I have a large JSON file that I'm trying to parse with JSON Slurper. The JSON file consists of information about bugs so it has things like issue keys, descriptions, and comments. Not every issue has a comment though. For example, here is a sample of what the JSON input looks like:
{
"projects": [
{
"name": "Test Project",
"key": "TEST",
"issues": [
{
"key": "BUG-1",
"priority": "Major",
"comments": [
{
"author": "a1",
"created": "d1",
"body": "comment 1"
},
{
"author": "a2",
"created": "d2",
"body": "comment 2"
}
]
},
{
"key": "BUG-2",
"priority": "Major"
},
{
"key": "BUG-3",
"priority": "Major",
"comments": [
{
"author": "a3",
"created": "d3",
"body": "comment 3"
}
]
}
]
}
]
}
I have a method that creates Issue objects based on the JSON parse. Everything works well when every issue has at least one comment, but, once an issue comes up that has no comments, the rest of the issues get the wrong comments. I am currently looping through the JSON file based on the total number of issues and then looking for comments using how far along in the number of issues I've gotten. So, for example,
parsedData.issues.comments.body[0][0][0]
returns "comment 1". However,
parsedData.issues.comments.body[0][1][0]
returns "comment 3", which is incorrect. Is there a way I can see if a particular issue has any comments? I'd rather not have to edit the JSON file to add empty comment fields, but would that even help?
You can do this:
parsedData.issues.comments.collect { it?.body ?: [] }
This checks each issue's comments for a body and returns an empty list for any issue that has none, so the positions stay aligned with the issues.
UPDATE
Based on the update to the question, you can do:
parsedData.projects.collectMany { it.issues.comments.collect { it?.body ?: [] } }
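The same idea, looking up comments issue by issue so that an issue without comments yields an empty list, can be sketched in Python for illustration. This is not Groovy, the data is a trimmed copy of the sample from the question, and the helper name `comment_bodies_per_issue` is made up here.

```python
data = {
    "projects": [
        {
            "name": "Test Project",
            "key": "TEST",
            "issues": [
                {"key": "BUG-1", "comments": [{"body": "comment 1"}, {"body": "comment 2"}]},
                {"key": "BUG-2"},  # no "comments" field at all
                {"key": "BUG-3", "comments": [{"body": "comment 3"}]},
            ],
        }
    ]
}


def comment_bodies_per_issue(parsed):
    """Return one list of comment bodies per issue, empty when there are none."""
    result = []
    for project in parsed.get("projects", []):
        for issue in project.get("issues", []):
            comments = issue.get("comments") or []
            result.append([comment["body"] for comment in comments])
    return result


bodies = comment_bodies_per_issue(data)
print(bodies)
```

Because every issue contributes exactly one entry, the entry at index 1 belongs to BUG-2 and is empty, and "comment 3" stays at index 2 with BUG-3.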