How to round up values in an Elasticsearch query?

I am trying to set up an automated Kibana alert that takes its data from a defined extraction query. I get all the information I want; however, the response returns unrounded values (up to 12 decimal places). Where in the extraction query, and what exactly, do I need to specify to round these values?
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "match_all": {
            "boost": 1
          }
        },
        {
          "range": {
            "@timestamp": {
              "from": "{{period_end}}||-24h",
              "to": "{{period_end}}",
              "include_lower": true,
              "include_upper": true,
              "format": "epoch_millis",
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "_source": {
    "includes": [],
    "excludes": []
  },
  "stored_fields": "*",
  "docvalue_fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    },
    {
      "field": "timestamp",
      "format": "date_time"
    }
  ],
  "script_fields": {},
  "aggregations": {
    "2": {
      "terms": {
        "field": "tag.country.keyword",
        "size": 20,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "order": [
          {
            "1": "desc"
          },
          {
            "_key": "asc"
          }
        ]
      },
      "aggregations": {
        "1": {
          "avg": {
            "field": "my_field"
          }
        }
      }
    }
  }
}
Here, I'm talking about the "avg" aggregation at the very bottom. As I understand it, right below the "field" key I should add a "script" key defining the rounding function I want to use, but I'm not sure what to put in it to make the rounding work. Can anybody help me come up with the correct script?
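A sketch of one way to do this, assuming a reasonably recent Elasticsearch with Painless scripting: rather than putting a "script" inside the avg itself, keep the avg as is and add a sibling bucket_script pipeline aggregation that rounds the computed average (here to two decimal places; the name "1_rounded" is only illustrative):

"aggregations": {
  "1": {
    "avg": {
      "field": "my_field"
    }
  },
  "1_rounded": {
    "bucket_script": {
      "buckets_path": {
        "avg_value": "1"
      },
      "script": "Math.round(params.avg_value * 100.0) / 100.0"
    }
  }
}

Note that a "script" key in the avg aggregation itself (used instead of "field", not below it) would round each document's value before averaging, which is usually not what you want; the bucket_script variant rounds the final average.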

Related

Elasticsearch: Aggregation of null fields in a facet bucket

I'm trying to implement facets with a date range aggregation in the current version of Amazon Elasticsearch Service (version 7.10). The field I want to group the article documents by is publishedAt, which is a date. I want one bucket where publishedAt is in the past, meaning the article is published; one where it is in the future, meaning it is scheduled; and one for all articles without a publishedAt, which are drafts. published and scheduled work as they should. For drafts I can't use a filter or date range, because the value is null, so I want to use the "Missing Values" feature, which should treat documents with publishedAt = null as if they had the date given in the missing field. Unfortunately it has no effect on the results, even if I change the missing date so that it falls into the published or scheduled range.
My request:
GET https://es.amazonaws.com/articles/_search
{
  "size": 10,
  "aggs": {
    "facet_bucket_all": {
      "aggs": {
        "channel": {
          "terms": {
            "field": "channel.keyword",
            "size": 5
          }
        },
        "brand": {
          "terms": {
            "field": "brand.keyword",
            "size": 5
          }
        },
        "articleStatus": {
          "date_range": {
            "field": "publishedAt",
            "format": "dd-MM-yyyy",
            "missing": "01-07-1886",
            "ranges": [
              { "key": "published", "from": "now-99y/M", "to": "now/M" },
              { "key": "scheduled", "from": "now+1s/M", "to": "now+99y/M" },
              { "key": "drafts", "from": "01-01-1886", "to": "31-12-1886" }
            ]
          }
        }
      },
      "filter": {
        "bool": {
          "must": []
        }
      }
    },
    "facet_bucket_publishedAt": {
      "aggs": {},
      "filter": {
        "bool": {
          "must": []
        }
      }
    },
    "facet_bucket_author": {
      "aggs": {
        "author": {
          "terms": {
            "field": "author",
            "size": 10
          }
        }
      },
      "filter": {
        "bool": {
          "must": []
        }
      }
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "publishedAt": {
              "lte": "2021-08-09T09:52:19.975Z"
            }
          }
        }
      ]
    }
  },
  "from": 0,
  "sort": [
    {
      "_score": "desc"
    }
  ]
}
And in the result, the drafts are empty:
"articleStatus": {
"buckets": [
{
"key": "published",
"from": -1.496448E12,
"from_as_string": "01-08-1922",
"to": 1.627776E12,
"to_as_string": "01-08-2021",
"doc_count": 47920
},
{
"key": "scheduled",
"from": 1.627776E12,
"from_as_string": "01-08-2021",
"to": 4.7519136E12,
"to_as_string": "01-08-2120",
"doc_count": 3
},
{
"key": "drafts",
"from": 1.67252256E13,
"from_as_string": "01-01-1886",
"to": 1.67566752E13,
"to_as_string": "31-12-1886",
"doc_count": 0
}
]
}
SearchKit added this part to the query:
"query": {
"bool": {
"filter": [
{
"range": {
"publishedAt": {
"lte": "2021-08-09T09:52:19.975Z"
}
}
}
]
}
}
This had to be removed, because it filters out the null values before the missing parameter can do its job.
Now I get the correct result:
"articleStatus": {
"buckets": [
{
"key": "drafts",
"from": -2.650752E12,
"from_as_string": "01-01-1886",
"to": -2.6193024E12,
"to_as_string": "31-12-1886",
"doc_count": 7
},
{
"key": "published",
"from": -1.496448E12,
"from_as_string": "01-08-1922",
"to": 1.627776E12,
"to_as_string": "01-08-2021",
"doc_count": 47920
},
{
"key": "scheduled",
"from": 1.627776E12,
"from_as_string": "01-08-2021",
"to": 4.7519136E12,
"to_as_string": "01-08-2120",
"doc_count": 3
}
]
}
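If the range filter is still needed to restrict the returned hits, one option (a sketch, not part of the original post) is to move it from query into post_filter, since post_filter only filters the hits and is not applied to aggregations:

"query": {
  "match_all": {}
},
"post_filter": {
  "range": {
    "publishedAt": {
      "lte": "2021-08-09T09:52:19.975Z"
    }
  }
}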

Unknown key for a START_OBJECT with multiple aggregations in Elasticsearch

I'm trying to build a query that lets me run multiple aggregations (on the same level, not sub-aggregations) in a single query. Here's the request I'm sending:
{
  "index": "index20",
  "type": "arret",
  "body": {
    "size": 0,
    "query": {
      "bool": {
        "must": [
          {
            "multi_match": {
              "query": "anim fore",
              "analyzer": "query_analyzer",
              "type": "cross_fields",
              "fields": [
                "doc_id"
              ],
              "operator": "and"
            }
          }
        ]
      }
    },
    "aggs": {
      "anim_fore": {
        "terms": {
          "field": "suggest_keywords.autocomplete",
          "order": {
            "_count": "desc"
          },
          "include": {
            "pattern": "anim.*fore.*"
          }
        }
      },
      "fore": {
        "terms": {
          "field": "suggest_keywords.autocomplete",
          "order": {
            "_count": "desc"
          },
          "include": {
            "pattern": "fore.*"
          }
        }
      }
    }
  }
}
However, I'm getting the following error when executing this query:
Error: [parsing_exception] Unknown key for a START_OBJECT in [fore]., with { line=1 & col=1351 }
I've tried rewriting this query in many forms, but I always end up with this error. That seems strange to me, as the query appears to match the format specified in the ES documentation. Maybe there is something specific about terms aggregations, but I haven't been able to sort it out.
The error is in your include settings, which should simply be strings:
"aggs": {
"anim_fore": {
"terms": {
"field": "suggest_keywords.autocomplete",
"order": {
"_count": "desc"
},
"include": "anim.*fore.*" <--- here
}
},
"fore": {
"terms": {
"field": "suggest_keywords.autocomplete",
"order": {
"_count": "desc"
},
"include": "fore.*" <--- and here
}
}
}
You also have trailing commas after "doc_id" and after the closing bracket of the must array; your query should look like this:
"must": [
{
"multi_match": {
"query": "anim fore",
"analyzer": "query_analyzer",
"type": "cross_fields",
"fields": [
"doc_id" // You have trailing comma here
],
"operator": "and"
}
}
] // And here

How to negate filter query in Kibana

I'm using the ELK stack and I'm trying to find out how to visualize all logs except those from specific IP ranges (for example 10.0.0.0/8). Is there any way to negate this filter query:
{"wildcard":{"src_address":"10.*"}}
I put it in Buckets -> Split Bars -> Aggregation -> Filters, and I would like to negate this query so that I get all logs except those from 10.0.0.0/8.
This is the whole JSON request:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "low_level_category:\"user_authentication_failure\" AND NOT src_address:\"10.*\"",
          "analyze_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "@timestamp": {
                  "gte": 1474384885044,
                  "lte": 1474989685044,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "3h",
        "time_zone": "Europe/Berlin",
        "min_doc_count": 200,
        "extended_bounds": {
          "min": 1474384885043,
          "max": 1474989685043
        }
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "src_address.raw",
            "size": 5,
            "order": {
              "_count": "desc"
            }
          }
        }
      }
    }
  }
}
Thanks
You can input this in the Kibana search box and it should get you what you need:
NOT src_address:10.*
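If you specifically need it as a filter in the Filters aggregation rather than in the search box, the equivalent query DSL should be the same wildcard wrapped in a bool must_not (a sketch using the field name from the question):

{
  "bool": {
    "must_not": {
      "wildcard": {
        "src_address": "10.*"
      }
    }
  }
}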

How to query an elasticsearch aggregation with a term and sum on different nested objects?

I have the following object whose value attribute is a nested object type:
{
  "metadata": {
    "tenant": "home",
    "timestamp": "2016-03-24T23:59:38Z"
  },
  "value": [
    { "key": "foo", "int_value": 100 },
    { "key": "bar", "str_value": "taco" }
  ]
}
This type of object has the following mapping:
{
  "my_index": {
    "mappings": {
      "my_doctype": {
        "properties": {
          "metadata": {
            "properties": {
              "tenant": {
                "type": "string",
                "index": "not_analyzed"
              },
              "timestamp": {
                "type": "date",
                "format": "dateOptionalTime"
              }
            }
          },
          "value": {
            "type": "nested",
            "properties": {
              "str_value": {
                "type": "string",
                "index": "not_analyzed"
              },
              "int_value": {
                "type": "long"
              },
              "key": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          }
        }
      }
    }
  }
}
With this setup, I would like to perform an aggregation that produces the following result:
Perform a terms aggregation on the str_value attribute of objects where the key is set to "bar"
In each bucket created from the above aggregation, calculate the sum of the int_value attributes where the key is set to "foo"
Have the results laid out in a date_histogram for a given time range.
With this goal in mind, I have been able to get the terms and date_histogram aggregations to work on my nested objects, but have not had any luck with the second level of calculation. Here is the query I am currently trying to get working:
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "filters": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "metadata.org": "gw"
              }
            },
            {
              "range": {
                "metadata.timestamp": {
                  "gte": "2016-03-24T00:00:00.000Z",
                  "lte": "2016-03-24T23:59:59.999Z"
                }
              }
            }
          ]
        }
      },
      "aggs": {
        "intervals": {
          "date_histogram": {
            "field": "metadata.timestamp",
            "interval": "1d",
            "min_doc_count": 0,
            "extended_bounds": {
              "min": "2016-03-24T00:00:00Z",
              "max": "2016-03-24T23:59:59Z"
            },
            "format": "yyyy-MM-dd'T'HH:mm:ss'Z'"
          },
          "aggs": {
            "nested_type": {
              "nested": {
                "path": "value"
              },
              "aggs": {
                "key_filter": {
                  "filter": {
                    "term": {
                      "value.key": "bar"
                    }
                  },
                  "aggs": {
                    "groupBy": {
                      "terms": {
                        "field": "value.str_value"
                      },
                      "aggs": {
                        "other_nested": {
                          "reverse_nested": {
                            "path": "value"
                          },
                          "aggs": {
                            "key_filter": {
                              "filter": {
                                "term": {
                                  "value.key": "foo"
                                }
                              },
                              "aggs": {
                                "amount_sum": {
                                  "sum": {
                                    "field": "value.int_value"
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
The result I am expecting to receive from Elasticsearch would look like the following:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "filters": {
      "doc_count": 2,
      "intervals": {
        "buckets": [
          {
            "key_as_string": "2016-03-24T00:00:00Z",
            "key": 1458777600000,
            "doc_count": 2,
            "nested_type": {
              "doc_count": 5,
              "key_filter": {
                "doc_count": 2,
                "groupBy": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "taco",
                      "doc_count": 1,
                      "other_nested": {
                        "doc_count": 1,
                        "key_filter": {
                          "doc_count": 1,
                          "amount_sum": {
                            "value": 100.0
                          }
                        }
                      }
                    }
                  ]
                }
              }
            }
          }
        ]
      }
    }
  }
}
However, the innermost aggregation (...groupBy.buckets.key_filter.amount_sum) returns a value of 0.0 instead of 100.0.
I think this is because nested objects are indexed as separate documents, so filtering on one key attribute's value does not let me query against another key.
Would anyone have any idea how to get this type of query to work?
For a bit more context: the reason for this document structure is that I do not control the content of the JSON documents that get indexed, so different tenants may have conflicting key names with different value types (e.g. {"tenant": "abc", "value": {"foo": "a"} } vs. {"tenant": "xyz", "value": {"foo": 1} }). The method I am trying to use is the one laid out in this Elasticsearch blog post, which recommends transforming objects you don't control into a structure that you do, using nested objects to help with this (specifically the "Nested fields for each data type" section of the article). I would also be open to a better way of handling documents whose JSON structure I don't control, if there is one, so that I can perform these aggregations.
Thank you!
EDIT: I am using Elasticsearch 1.5.
Solved this by using the reverse_nested aggregation correctly, as described here: http://www.shayne.me/blog/2015/2015-05-18-elasticsearch-nested-docs/
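For anyone landing here later, the essential change (a sketch based on the pattern from that post, with the illustrative name "value_nested"; not the poster's exact final query) is to step back out to the root document with an empty reverse_nested and then re-enter the nested path, instead of calling reverse_nested with the same path:

"other_nested": {
  "reverse_nested": {},
  "aggs": {
    "value_nested": {
      "nested": {
        "path": "value"
      },
      "aggs": {
        "key_filter": {
          "filter": {
            "term": {
              "value.key": "foo"
            }
          },
          "aggs": {
            "amount_sum": {
              "sum": {
                "field": "value.int_value"
              }
            }
          }
        }
      }
    }
  }
}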

Elasticsearch aggregations filtered result is not working properly

Two sample documents:
POST /aggstest/test/1
{
  "categories": [
    {
      "type": "book",
      "words": [
        { "word": "storm", "count": 277 },
        { "word": "pooh", "count": 229 }
      ]
    },
    {
      "type": "magazine",
      "words": [
        { "word": "vibe", "count": 100 },
        { "word": "sunny", "count": 50 }
      ]
    }
  ]
}
POST /aggstest/test/2
{
  "categories": [
    {
      "type": "book",
      "words": [
        { "word": "rain", "count": 160 },
        { "word": "jurassic park", "count": 150 }
      ]
    },
    {
      "type": "magazine",
      "words": [
        { "word": "tech", "count": 200 },
        { "word": "homes", "count": 30 }
      ]
    }
  ]
}
The aggs query:
GET /aggstest/test/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "categories.type": "book"
              }
            },
            {
              "term": {
                "categories.words.word": "storm"
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "categories.type": "book"
              }
            }
          ]
        }
      },
      "aggs": {
        "book_category": {
          "terms": {
            "field": "categories.words.word",
            "size": 10
          }
        }
      }
    }
  },
  "post_filter": {
    "term": {
      "categories.type": "book"
    }
  }
}
The result:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "filtered": {
      "doc_count": 1,
      "book_category": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "pooh",
            "doc_count": 1
          },
          {
            "key": "storm",
            "doc_count": 1
          },
          {
            "key": "sunny",
            "doc_count": 1
          },
          {
            "key": "vibe",
            "doc_count": 1
          }
        ]
      }
    }
  }
}
The expected aggs result set should not include "sunny" and "vibe", because they belong to the "magazine" type.
I used a filter query and a post_filter, but I couldn't get an aggs result for only the "book" type.
All the filters you apply (in the query and in the aggregation) still match the whole document, and that document, which contains all four words, is the scope the aggregation runs over. Hence you always get all four buckets.
As far as I understand, a way to manipulate buckets server-side will be introduced with reducers in Elasticsearch 2.0.
What you can use now is changing the mapping so that categories is a nested object. You'll then be able to query the nested documents independently and aggregate accordingly using a nested aggregation. Note that changing an object type to nested requires reindexing.
Also please note that post_filter is not applied to aggregations at all. It is used to filter the hits of the original query without affecting the aggregations, for when you need to aggregate over a wider scope than the returned hits.
And one more thing: if you already have the filter in the query, there's no need to repeat it in the aggregation; the scope is already filtered.
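A sketch of what the nested approach could look like (field names are taken from the sample documents above; the aggregation names are illustrative, and the index has to be recreated and the data reindexed with this mapping):

PUT /aggstest
{
  "mappings": {
    "test": {
      "properties": {
        "categories": {
          "type": "nested",
          "properties": {
            "type": { "type": "string", "index": "not_analyzed" },
            "words": {
              "type": "nested",
              "properties": {
                "word": { "type": "string", "index": "not_analyzed" },
                "count": { "type": "long" }
              }
            }
          }
        }
      }
    }
  }
}

The aggregation can then enter the nested scope, filter to book categories, and only then descend into their words:

"aggs": {
  "categories_nested": {
    "nested": { "path": "categories" },
    "aggs": {
      "books_only": {
        "filter": { "term": { "categories.type": "book" } },
        "aggs": {
          "words_nested": {
            "nested": { "path": "categories.words" },
            "aggs": {
              "book_category": {
                "terms": {
                  "field": "categories.words.word",
                  "size": 10
                }
              }
            }
          }
        }
      }
    }
  }
}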