Logstash Grok JSON error - mapper of different type - json

I have this log file:
2020-08-05 09:11:19 INFO-flask.model-{"version": "1.2.1", "time": 0.651745080947876, "output": {...}}
This is my Logstash filter configuration:
filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log.level}-%{DATA:model}-%{GREEDYDATA:log.message}"
    }
  }
  date {
    timezone => "UTC"
    match => ["timestamp", "ISO8601", "yyyy-MM-dd HH:mm:ss"]
    target => "@timestamp"
    remove_field => [ "timestamp" ]
  }
  json {
    source => "log.message"
    target => "log.message"
  }
  mutate {
    add_field => {
      "execution.time" => "%{[log.message][time]}"
    }
  }
}
I want to extract the "time" value from the message. But I receive this error:
[2020-08-05T09:11:32,688][WARN ][logstash.outputs.elasticsearch][main][81ad4d5f6359b99ec4e52c93e518567c1fe91de303faf6fa1a4d905a73d3c334] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"index-2020.08.05", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0xbe6a80>], :response=>{"index"=>{"_index"=>"index-2020.08.05", "_type"=>"_doc", "_id"=>"ywPjvXMByEqBCvLy1871", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper [log.message.input.values] of different type, current_type [long], merged_type [text]"}}}}

Please find below the filter part for your Logstash configuration:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level}-%{DATA:model}-%{GREEDYDATA:KV}" }
    overwrite => [ "message" ]
  }
  kv {
    source => "KV"
    value_split => ": "
    field_split => ", "
    target => "msg"
  }
}
Hope this will solve your problem.
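As a side note, the indexing error itself points to a mapping conflict: [log.message.input.values] is a long in some documents and text in others, so Elasticsearch refuses to merge the mappings once the whole parsed payload is indexed. If only the time value is needed, another option is to parse the JSON into a temporary field, copy that one value out, and drop the rest. This is only a sketch under that assumption; the loglevel, raw_json, parsed_message and execution_time names are illustrative, not from the original question:
filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel}-%{DATA:model}-%{GREEDYDATA:raw_json}"
    }
  }
  # parse the JSON payload into a temporary nested field
  json {
    source => "raw_json"
    target => "parsed_message"
  }
  # copy out only the value we care about
  mutate {
    add_field => { "execution_time" => "%{[parsed_message][time]}" }
  }
  # make it numeric and drop the variable-typed payload
  mutate {
    convert      => { "execution_time" => "float" }
    remove_field => [ "raw_json", "parsed_message" ]
  }
}
Keeping only the scalar value avoids indexing the variable-typed payload that triggered the mapper error.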

Related

Parse date inside input log file

I'm new to the ELK stack and I'm trying to create an index from an S3 file. The S3 file is in CSV format and has the following schema:
date: Date field with format yyyy-MM-dd HH:mm:ss
filename: Name of the input file that triggers some events
input_registers: count of lines in the file
wrong_registers: count of wrong registers
result_registers: count of result (validated) registers
I need to set date as the @timestamp field in Elasticsearch.
I've already tried a few things with the date filter plugin; here is my current configuration:
input{
s3 {
"id" => "rim-pfinal"
"access_key_id" => ""
"secret_access_key" => ""
"region" => "eu-west-3"
"bucket" => "practica.final.rim.elk"
"prefix" => "logs"
"interval" => "3600"
"additional_settings" => {
"force_path_style" => true
"follow_redirects" => false
}
sincedb_path => "/dev/null"
}
}
filter {
date {
match => [ "date", "ISO8601", "yyyy-MM-dd HH:mm:ss" ]
target => "date"
add_field => { "DummyField" => "Fecha cambiada" }
}
csv{
columns => ["date", "filename", "input_registers", "wrong_registers", "result_registers", "err_type"]
separator => ";"
}
mutate { convert => [ "input_registers", "integer"] }
mutate { convert => [ "wrong_registers", "integer"] }
mutate { convert => [ "result_registers", "integer"] }
#Remove first header line to insert in elasticsearch
if [PK] =~ "PK"{
drop {}
}
}
output{
elasticsearch {
hosts => ["localhost:9200"]
index => "practica-rim"
}
}
I tried setting target to the timestamp and adjusting match too, but it doesn't seem to work.
Thank you for the help!
{
"query":{
"range":{
"#timestamp":{
"gte":"2015-08-04T11:00:00",
"lt":"2015-08-04T12:00:00"
}
}
}
}
# datetimes will be serialized
es.index(index="my-index", doc_type="test-type", id=42, body={"any": "data", "timestamp": datetime.now()})
{u'_id': u'42', u'_index': u'my-index', u'_type': u'test-type', u'_version': 1, u'ok': True}
# but not deserialized
>>> es.get(index="my-index", doc_type="test-type", id=42)['_source']
{u'any': u'data', u'timestamp': u'2013-05-12T19:45:31.804229'}
https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html
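One likely issue in the configuration above is ordering: Logstash filters run in the order they appear, so the date filter executes before the csv filter has created the date field, and its target is set to date rather than to @timestamp. A minimal sketch of the reordered filter section, under those assumptions and untested against the original data:
filter {
  csv {
    columns => ["date", "filename", "input_registers", "wrong_registers", "result_registers", "err_type"]
    separator => ";"
  }
  mutate { convert => [ "input_registers", "integer"] }
  mutate { convert => [ "wrong_registers", "integer"] }
  mutate { convert => [ "result_registers", "integer"] }
  date {
    # parse the CSV date column and use it as the event timestamp
    match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }
}
target defaults to @timestamp, so it could be omitted; it is spelled out here only to make the intent explicit.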

Logstash cannot extract json key

I need help with a Logstash filter to extract a JSON key/value into a new field. The following is my Logstash conf:
input {
tcp {
port => 5044
}
}
filter {
json {
source => "message"
add_field => {
"data" => "%{[message][data]}"
}
}
}
output {
stdout { codec => rubydebug }
}
I have tried with mutate:
filter {
json {
source => "message"
}
mutate {
add_field => {
"data" => "%{[message][data]}"
}
}
}
I have tried with . instead of []:
filter {
json {
source => "message"
}
mutate {
add_field => {
"data" => "%{message.data}"
}
}
}
I have tried with index number:
filter {
json {
source => "message"
}
mutate {
add_field => {
"data" => "%{[message][0]}"
}
}
}
All with no luck. :(
The following json is sent to port 5044:
{"data": "blablabla"}
The problem is that the new field is not able to extract the value from the key of the JSON:
"data" => "%{[message][data]}"
The following is my stdout:
{
"#version" => "1",
"host" => "localhost",
"type" => "logstash",
"data" => "%{[message][data]}",
"path" => "/path/from/my/app",
"#timestamp" => 2019-01-11T20:39:10.845Z,
"message" => "{\"data\": \"blablabla\"}"
}
However if I use "data" => "%{[message]}" instead:
filter {
json {
source => "message"
add_field => {
"data" => "%{[message]}"
}
}
}
I will get the whole json from stdout.
{
"#version" => "1",
"host" => "localhost",
"type" => "logstash",
"data" => "{\"data\": \"blablabla\"}",
"path" => "/path/from/my/app",
"#timestamp" => 2019-01-11T20:39:10.845Z,
"message" => "{\"data\": \"blablabla\"}"
}
Can anyone please tell me what I did wrong?
Thank you in advance.
I use docker-elk stack, ELK_VERSION=6.5.4
add_field is applied when the filter succeeds (many filters have this option), but %{[message][data]} cannot resolve because message is still the raw JSON string rather than a parsed object. If you want to parse the JSON into a field, you should use target:
filter {
  json {
    source => "message"
    target => "data"   # parse into the data field
  }
}
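With that in place, the parsed value is nested under the target field, so for the sample event above it can be referenced as [data][data] in later filters. A small usage sketch; the extracted field name is just an illustration:
mutate {
  add_field => { "extracted" => "%{[data][data]}" }
}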

Error in logstash configuration file tomcat

I have a problem with my Logstash configuration.
My log lines look like this:
2017-07-26 14:31:03,644 INFO [http-bio-10.60.2.21-10267-exec-92] jsch.DeployManagerFileUSImpl (DeployManagerFileUSImpl.java:132) - passage par ficher temporaire .bindings.20170726-143103.tmp
My current pattern is
match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} \(%{DATA:class}\):%{GREEDYDATA:message}" }
Which pattern should I use for [http-bio-10.60.2.21-10267-exec-92] and for jsch.DeployManagerFileUSImpl?
Doesn't seem like the current pattern you've shown would work, as you don't have anything in your sample message that matches \(%{DATA:class}\):%{GREEDYDATA:message} and you're not dealing with the double space after the loglevel.
If you want to match some random stuff in the middle of a line, use %{DATA}, e.g.:
\[%{DATA:myfield}\]
and then you can use %{GREEDYDATA} to get the stuff at the end of the line:
\[%{DATA:myfield1}\] %{GREEDYDATA:myfield2}
If you need to break these items down into fields of their own, then be more specific with the pattern or use a second grok{} block.
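Applied to the sample line, capturing the bracketed thread name and the class could look roughly like this (a sketch only; the follow-up pattern below fleshes it out):
grok {
  match => {
    "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:log-level}\s+\[%{DATA:thread}\] %{JAVACLASS:class} %{GREEDYDATA:rest}"
  }
}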
In my logstash.conf I have changed my pattern to:
match => [ "message", "%{TIMESTAMP_ISO8601:logdate},%{INT} %{LOGLEVEL:log-level} \[(?<threadname>[^\]]+)\] %{JAVACLASS:package} \(%{JAVAFILE:file}:%{INT:line}\) - %{GREEDYDATA:message}" ]
with the help of the site https://grokdebug.herokuapp.com/.
But I cannot see my static log files from the /home/elasticsearch/static_logs/ directory in Kibana 5.4.3.
My Logstash configuration file with the "static" section:
input {
file {
type => "access-log"
path => "/home/elasticsearch/tomcat/logs/*.txt"
}
file {
type => "tomcat"
path => "/home/elasticsearch/tomcat/logs/*.log" exclude => "*.zip"
codec => multiline {
negate => true
pattern => "(^%{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM))"
what => "previous"
}
}
file {
type => "static"
path => "/home/elasticsearch/static_logs/*.log" exclude => "*.zip"
}
}
filter {
if [type] == "access-log" {
grok {
# Access log pattern is %a %{waffle.servlet.NegotiateSecurityFilter.PRINCIPAL}s %t %m %U%q %s %B %T "%{Referer}i" "%{User-Agent}i"
match => [ "message" , "%{IPV4:clientIP} %{NOTSPACE:user} \[%{DATA:timestamp}\] %{WORD:method} %{NOTSPACE:request} %{NUMBER:status} %{NUMBER:bytesSent} %{NUMBER:duration} \"%{NOTSPACE:referer}\" \"%{DATA:userAgent}\"" ]
remove_field => [ "message" ]
}
grok{
match => [ "request", "/%{USERNAME:app}/" ]
tag_on_failure => [ ]
}
date {
match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
remove_field => [ "timestamp" ]
}
geoip {
source => ["clientIP"]
}
dns {
reverse => [ "clientIP" ]
}
mutate {
lowercase => [ "user" ]
convert => [ "bytesSent", "integer", "duration", "float" ]
}
if [referer] == "-" {
mutate {
remove_field => [ "referer" ]
}
}
if [user] == "-" {
mutate {
remove_field => [ "user" ]
}
}
}
if [type] == "tomcat" {
if [message] !~ /(.+)/ {
drop { }
}
grok{
patterns_dir => "./patterns"
overwrite => [ "message" ]
# oK Catalina normal
match => [ "message", "%{CATALINA_DATESTAMP:timestamp} %{NOTSPACE:className} %{WORD:methodName}\r\n%{LOGLEVEL: logLevel}: %{GREEDYDATA:message}" ]
}
grok{
match => [ "path", "/%{USERNAME:app}.20%{NOTSPACE}.log"]
tag_on_failure => [ ]
}
# Aug 25, 2014 11:23:31 AM
date{
match => [ "timestamp", "MMM dd, YYYY hh:mm:ss a" ]
remove_field => [ "timestamp" ]
}
}
if [type] == "static" {
if [message] !~ /(.+)/ {
drop { }
}
grok{
patterns_dir => "./patterns"
overwrite => [ "message" ]
# 2017-08-03 16:01:11,352 WARN [Thread-552] pcf2.AbstractObjetMQDAO (AbstractObjetMQDAO.java:137) - Descripteur de
match => [ "message", "%{TIMESTAMP_ISO8601:logdate},%{INT} %{LOGLEVEL:log-level} \[(?<threadname>[^\]]+)\] %{JAVACLASS:package} \(%{JAVAFILE:file}:%{INT:line}\) - %{GREEDYDATA:message}" ]
}
# 2017-08-03 16:01:11,352
date{
match => [ "timestamp", "YYYY-MM-dd hh:mm:ss,SSS" ]
remove_field => [ "timestamp" ]
}
}
}
output {
elasticsearch { hosts => ["192.168.99.100:9200"]}
}
Where is my mistake?
Regards

Logstash Parsing and Calculations with CSV

I am having trouble parsing and calculating performance Navigation Timing data I have in a CSV.
I was able to parse the fields, but I am not sure how to properly approach the calculations (below). Some points to keep in mind:
1. Data sets are grouped together by the leading value (it is the timestamp of when the 21 data points were taken), e.g.:
ACMEPage-1486643427973,unloadEventEnd,1486643372422
2. Calculations need to be done with data points within the group.
I am assuming some tagging and grouping will need to be done, but I don't have a clear vision of how to implement it. Any help would be greatly appreciated.
Thanks,
---------------Calculations-----------------
Total First byte Time = responseStart - navigationStart
Latency = responseStart - fetchStart
DNS / Domain Lookup Time = domainLookupEnd - domainLookupStart
Server connect Time = connectEnd - connectStart
Server Response Time = responseStart - requestStart
Page Load time = loadEventStart - navigationStart
Transfer/Page Download Time = responseEnd - responseStart
DOM Interactive Time = domInteractive - navigationStart
DOM Content Load Time = domContentLoadedEventEnd - navigationStart
DOM Processing to Interactive = domInteractive - domLoading
DOM Interactive to Complete = domComplete - domInteractive
Onload = loadEventEnd - loadEventStart
-------Data in CSV-----------
ACMEPage-1486643427973,unloadEventEnd,1486643372422
ACMEPage-1486643427973,responseEnd,1486643372533
ACMEPage-1486643427973,responseStart,1486643372416
ACMEPage-1486643427973,domInteractive,1486643373030
ACMEPage-1486643427973,domainLookupEnd,1486643372194
ACMEPage-1486643427973,unloadEventStart,1486643372422
ACMEPage-1486643427973,domComplete,1486643373512
ACMEPage-1486643427973,domContentLoadedEventStart,1486643373030
ACMEPage-1486643427973,domainLookupStart,1486643372194
ACMEPage-1486643427973,redirectEnd,0
ACMEPage-1486643427973,redirectStart,0
ACMEPage-1486643427973,connectEnd,1486643372194
ACMEPage-1486643427973,toJSON,{}
ACMEPage-1486643427973,connectStart,1486643372194
ACMEPage-1486643427973,loadEventStart,1486643373512
ACMEPage-1486643427973,navigationStart,1486643372193
ACMEPage-1486643427973,requestStart,1486643372203
ACMEPage-1486643427973,secureConnectionStart,0
ACMEPage-1486643427973,fetchStart,1486643372194
ACMEPage-1486643427973,domContentLoadedEventEnd,1486643373058
ACMEPage-1486643427973,domLoading,1486643372433
ACMEPage-1486643427973,loadEventEnd,1486643373514
----------Output---------------
"path" => "/Users/philipp/Downloads/build2/logDataPoints_com.concur.automation.cge.ui.admin.ADCLookup_1486643340910.csv",
"#timestamp" => 2017-02-09T12:29:57.763Z,
"navigationTimer" => "connectStart",
"#version" => "1",
"host" => "15mbp-09796.local",
"elapsed_time" => "1486643372194",
"pid" => "1486643397763",
"page" => "ADCLookupDataPage",
"message" => "ADCLookupDataPage-1486643397763,connectStart,1486643372194",
"type" => "csv"
}
--------------logstash.conf----------------
input {
file {
type => "csv"
path => "/Users/path/logDataPoints_com.concur.automation.acme.ui.admin.acme_1486643340910.csv"
start_position => beginning
# to read from the beginning of file
sincedb_path => "/dev/null"
}
}
filter {
csv {
columns => ["page_id", "navigationTimer", "elapsed_time"]
}
if (["elapsed_time"] == "{}" ) {
drop{}
}
else {
grok {
match => { "page_id" => "%{WORD:page}-%{INT:pid}"
}
remove_field => [ "page_id" ]
}
}
date {
match => [ "pid", "UNIX_MS" ]
target => "#timestamp"
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}
I did the following to get my data to trend:
- I found it easier to pivot the data, rather than going down the column, so that the data goes along the rows per each "event" or "document".
- Each field needed to be mapped accordingly as an integer or string.
Once the data was in Kibana properly, I had problems using the ruby code filter to do simple math calculations, so I ended up using "scripted fields" to do the calculations in Kibana.
input {
file {
type => "csv"
path => "/Users/philipp/perf_csv_pivot2.csv"
start_position => beginning
# to read from the beginning of file
sincedb_path => "/dev/null"
}
}
filter {
csv {
columns => ["page_id","unloadEventEnd","responseEnd","responseStart","domInteractive","domainLookupEnd","unloadEventStart","domComplete","domContentLoadedEventStart","domainLookupstart","redirectEnd","redirectStart","connectEnd","toJSON","connectStart","loadEventStart","navigationStart","requestStart","secureConnectionStart","fetchStart","domContentLoadedEventEnd","domLoading","loadEventEnd"]
}
grok {
match => { "page_id" => "%{WORD:page}-%{INT:page_ts}" }
remove_field => [ "page_id", "message", "path" ]
}
mutate {
convert => { "unloadEventEnd" => "integer" }
convert => { "responseEnd" => "integer" }
convert => { "responseStart" => "integer" }
convert => { "domInteractive" => "integer" }
convert => { "domainLookupEnd" => "integer" }
convert => { "unloadEventStart" => "integer" }
convert => { "domComplete" => "integer" }
convert => { "domContentLoadedEventStart" => "integer" }
convert => { "domainLookupstart" => "integer" }
convert => { "redirectEnd" => "integer" }
convert => { "redirectStart" => "integer" }
convert => { "connectEnd" => "integer" }
convert => { "toJSON" => "string" }
convert => { "connectStart" => "integer" }
convert => { "loadEventStart" => "integer" }
convert => { "navigationStart" => "integer" }
convert => { "requestStart" => "integer" }
convert => { "secureConnectionStart" => "integer" }
convert => { "fetchStart" => "integer" }
convert => { "domContentLoadedEventEnd" => "integer" }
convert => { "domLoading" => "integer" }
convert => { "loadEventEnd" => "integer" }
}
date {
match => [ "page_ts", "UNIX_MS" ]
target => "#timestamp"
remove_field => [ "page_ts", "timestamp", "host", "toJSON" ]
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}
Hope this can help someone else,
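If you would rather keep the calculations inside Logstash instead of Kibana scripted fields, a ruby filter along these lines can compute one of the metrics from the pivoted fields. This is a minimal, untested sketch using a formula from the question; the page_load_time field name is just an illustration:
ruby {
  code => "
    nav  = event.get('navigationStart')
    load = event.get('loadEventStart')
    # Page Load time = loadEventStart - navigationStart
    event.set('page_load_time', load - nav) if nav && load
  "
}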

Access nested JSON Field in Logstash

I have a problem with accessing a nested JSON field in Logstash (latest version).
My config file is the following:
input {
http {
port => 5001
codec => "json"
}
}
filter {
mutate {
add_field => {"es_index" => "%{[statements][authority][name]}"}
}
mutate {
gsub => [
"es_index", " ", "_"
]
}
mutate {
lowercase => ["es_index"]
}
ruby {
init => "
def remove_dots hash
new = Hash.new
hash.each { |k,v|
if v.is_a? Hash
v = remove_dots(v)
end
new[ k.gsub('.','_') ] = v
if v.is_a? Array
v.each { |elem|
if elem.is_a? Hash
elem = remove_dots(elem)
end
new[ k.gsub('.','_') ] = elem
} unless v.nil?
end
} unless hash.nil?
return new
end
"
code => "
event.instance_variable_set(:@data,remove_dots(event.to_hash))
"
}
}
output {
stdout {
codec => rubydebug
}
elasticsearch {
hosts => "elasticsearch:9200"
index => "golab-%{+YYYY.MM.dd}"
}
}
I have a filter with mutate. I want to add a field that I can use as part of the index name. When I use "%{[statements][authority][name]}", the content in the brackets is treated as a literal string: %{[statements][authority][name]} is saved in the es_index field. Logstash seems to think this is a string, but why?
I've also tried the expression "%{statements}", and it works as expected: everything in the statements field is passed to es_index. If I use "%{[statements][authority]}", strange things happen: es_index is filled with the exact same output that "%{statements}" produces. What am I missing?
Logstash Output with "%{[statements][authority]}":
{
"statements" => {
"verb" => {
"id" => "http://adlnet.gov/expapi/verbs/answered",
"display" => {
"en-US" => "answered"
}
},
"version" => "1.0.1",
"timestamp" => "2016-07-21T07:41:18.013880+00:00",
"object" => {
"definition" => {
"name" => {
"en-US" => "Example Activity"
},
"description" => {
"en-US" => "Example activity description"
}
},
"id" => "http://adlnet.gov/expapi/activities/example"
},
"actor" => {
"account" => {
"homePage" => "http://example.com",
"name" => "xapiguy"
},
"objectType" => "Agent"
},
"stored" => "2016-07-21T07:41:18.013880+00:00",
"authority" => {
"mbox" => "mailto:info#golab.eu",
"name" => "GoLab",
"objectType" => "Agent"
},
"id" => "0771b9bc-b1b8-4cb7-898e-93e8e5a9c550"
},
"id" => "a7e31874-780e-438a-874c-964373d219af",
"#version" => "1",
"#timestamp" => "2016-07-21T07:41:19.061Z",
"host" => "172.23.0.3",
"headers" => {
"request_method" => "POST",
"request_path" => "/",
"request_uri" => "/",
"http_version" => "HTTP/1.1",
"http_host" => "logstasher:5001",
"content_length" => "709",
"http_accept_encoding" => "gzip, deflate",
"http_accept" => "*/*",
"http_user_agent" => "python-requests/2.9.1",
"http_connection" => "close",
"content_type" => "application/json"
},
"es_index" => "{\"verb\":{\"id\":\"http://adlnet.gov/expapi/verbs/answered\",\"display\":{\"en-us\":\"answered\"}},\"version\":\"1.0.1\",\"timestamp\":\"2016-07-21t07:41:18.013880+00:00\",\"object\":{\"definition\":{\"name\":{\"en-us\":\"example_activity\"},\"description\":{\"en-us\":\"example_activity_description\"}},\"id\":\"http://adlnet.gov/expapi/activities/example\",\"objecttype\":\"activity\"},\"actor\":{\"account\":{\"homepage\":\"http://example.com\",\"name\":\"xapiguy\"},\"objecttype\":\"agent\"},\"stored\":\"2016-07-21t07:41:18.013880+00:00\",\"authority\":{\"mbox\":\"mailto:info#golab.eu\",\"name\":\"golab\",\"objecttype\":\"agent\"},\"id\":\"0771b9bc-b1b8-4cb7-898e-93e8e5a9c550\"}"
}
You can see that authority is only part of es_index; it was not selected as a field of its own.
Many thanks in advance
I found a solution. Credits go to jpcarey (Elasticsearch Forum).
I had to remove codec => "json". That leads to a different data structure: statements is now an array rather than an object, so I needed to change %{[statements][authority][name]} to %{[statements][0][authority][name]}. That works without problems.
If you follow the given link you'll also find a better implementation of my mutate filters.
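Putting that fix together, the relevant part of the filter would look roughly like this, as a sketch based on the description above (same mutate chain as the original config, only the field reference changed):
filter {
  mutate {
    # statements arrives as an array once the json codec is removed,
    # so the first element is addressed explicitly
    add_field => { "es_index" => "%{[statements][0][authority][name]}" }
  }
  mutate {
    gsub => [ "es_index", " ", "_" ]
  }
  mutate {
    lowercase => [ "es_index" ]
  }
}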