Access nested JSON Field in Logstash - json

I have a problem accessing a nested JSON field in Logstash (latest version).
My config file is the following:
input {
  http {
    port => 5001
    codec => "json"
  }
}
filter {
  mutate {
    add_field => {"es_index" => "%{[statements][authority][name]}"}
  }
  mutate {
    gsub => [
      "es_index", " ", "_"
    ]
  }
  mutate {
    lowercase => ["es_index"]
  }
  ruby {
    init => "
      def remove_dots hash
        new = Hash.new
        hash.each { |k,v|
          if v.is_a? Hash
            v = remove_dots(v)
          end
          new[ k.gsub('.','_') ] = v
          if v.is_a? Array
            v.each { |elem|
              if elem.is_a? Hash
                elem = remove_dots(elem)
              end
              new[ k.gsub('.','_') ] = elem
            } unless v.nil?
          end
        } unless hash.nil?
        return new
      end
    "
    code => "
      event.instance_variable_set(:@data, remove_dots(event.to_hash))
    "
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "golab-%{+YYYY.MM.dd}"
  }
}
I have a filter with mutate. I want to add a field that I can use as part of the index name. When I use "%{[statements][authority][name]}", the content inside the brackets is treated as a literal string: %{[statements][authority][name]} is saved in the es_index field. Why does Logstash treat this as a string?
I've also tried the expression "%{statements}". That works as expected: everything in the statements field is passed to es_index. But if I use "%{[statements][authority]}", strange things happen: es_index is filled with exactly the same output that "%{statements}" produces. What am I missing?
Logstash Output with "%{[statements][authority]}":
{
"statements" => {
"verb" => {
"id" => "http://adlnet.gov/expapi/verbs/answered",
"display" => {
"en-US" => "answered"
}
},
"version" => "1.0.1",
"timestamp" => "2016-07-21T07:41:18.013880+00:00",
"object" => {
"definition" => {
"name" => {
"en-US" => "Example Activity"
},
"description" => {
"en-US" => "Example activity description"
}
},
"id" => "http://adlnet.gov/expapi/activities/example"
},
"actor" => {
"account" => {
"homePage" => "http://example.com",
"name" => "xapiguy"
},
"objectType" => "Agent"
},
"stored" => "2016-07-21T07:41:18.013880+00:00",
"authority" => {
"mbox" => "mailto:info#golab.eu",
"name" => "GoLab",
"objectType" => "Agent"
},
"id" => "0771b9bc-b1b8-4cb7-898e-93e8e5a9c550"
},
"id" => "a7e31874-780e-438a-874c-964373d219af",
"#version" => "1",
"#timestamp" => "2016-07-21T07:41:19.061Z",
"host" => "172.23.0.3",
"headers" => {
"request_method" => "POST",
"request_path" => "/",
"request_uri" => "/",
"http_version" => "HTTP/1.1",
"http_host" => "logstasher:5001",
"content_length" => "709",
"http_accept_encoding" => "gzip, deflate",
"http_accept" => "*/*",
"http_user_agent" => "python-requests/2.9.1",
"http_connection" => "close",
"content_type" => "application/json"
},
"es_index" => "{\"verb\":{\"id\":\"http://adlnet.gov/expapi/verbs/answered\",\"display\":{\"en-us\":\"answered\"}},\"version\":\"1.0.1\",\"timestamp\":\"2016-07-21t07:41:18.013880+00:00\",\"object\":{\"definition\":{\"name\":{\"en-us\":\"example_activity\"},\"description\":{\"en-us\":\"example_activity_description\"}},\"id\":\"http://adlnet.gov/expapi/activities/example\",\"objecttype\":\"activity\"},\"actor\":{\"account\":{\"homepage\":\"http://example.com\",\"name\":\"xapiguy\"},\"objecttype\":\"agent\"},\"stored\":\"2016-07-21t07:41:18.013880+00:00\",\"authority\":{\"mbox\":\"mailto:info#golab.eu\",\"name\":\"golab\",\"objecttype\":\"agent\"},\"id\":\"0771b9bc-b1b8-4cb7-898e-93e8e5a9c550\"}"
}
You can see that authority is just part of the es_index content, so it was not resolved as a field.
Many thanks in advance

I found a solution. Credits go to jpcarey (Elasticsearch Forum)
I had to remove codec => "json". That leads to a different data structure: statements is now an array rather than an object, so I needed to change %{[statements][authority][name]} to %{[statements][0][authority][name]}. That works without problems.
If you follow the given link you'll also find a better implementation of my mutate filters.
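For reference, a minimal sketch of the adjusted filter with that change, plus how es_index could then be used in the elasticsearch output (the index pattern here is only an illustration, not part of the original fix):
filter {
  mutate {
    # statements is an array once the json codec is removed,
    # so reference the first element explicitly
    add_field => { "es_index" => "%{[statements][0][authority][name]}" }
  }
  mutate {
    gsub => [ "es_index", " ", "_" ]
  }
  mutate {
    lowercase => ["es_index"]
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    # e.g. build the index name from the extracted, normalized authority name
    index => "%{es_index}-%{+YYYY.MM.dd}"
  }
}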

Related

Logstash Grok JSON error - mapper of different type

I have this log file:
2020-08-05 09:11:19 INFO-flask.model-{"version": "1.2.1", "time": 0.651745080947876, "output": {...}}
This is my Logstash filter setting:
filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log.level}-%{DATA:model}-%{GREEDYDATA:log.message}"
    }
  }
  date {
    timezone => "UTC"
    match => ["timestamp" , "ISO8601", "yyyy-MM-dd HH:mm:ss"]
    target => "@timestamp"
    remove_field => [ "timestamp" ]
  }
  json {
    source => "log.message"
    target => "log.message"
  }
  mutate {
    add_field => {
      "execution.time" => "%{[log.message][time]}"
    }
  }
}
I want to extract the "time" value from the message. But I receive this error:
[2020-08-05T09:11:32,688][WARN ][logstash.outputs.elasticsearch][main][81ad4d5f6359b99ec4e52c93e518567c1fe91de303faf6fa1a4d905a73d3c334] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"index-2020.08.05", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0xbe6a80>], :response=>{"index"=>{"_index"=>"index-2020.08.05", "_type"=>"_doc", "_id"=>"ywPjvXMByEqBCvLy1871", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper [log.message.input.values] of different type, current_type [long], merged_type [text]"}}}}
Please find the filter part for your logstash configuration:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level}-%{DATA:model}-%{GREEDYDATA:KV}" }
    overwrite => [ "message" ]
  }
  kv {
    source => "KV"
    value_split => ": "
    field_split => ", "
    target => "msg"
  }
}
Hope this will solve your problem.
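My reading of why this helps (not stated in the answer): the kv filter emits every value as a string, so Elasticsearch never sees conflicting numeric/text types for the same field, which is what the "mapper ... of different type" error complains about. If you only need the time value and don't want to index the rest of the payload, another option is a sketch like the following, replacing the json and mutate blocks from the question (not from the answer; the "parsed" field name and the float conversion are my own choices):
filter {
  json {
    # parse the JSON tail of the log line into a scratch field
    source => "log.message"
    target => "parsed"
  }
  mutate {
    add_field => { "execution.time" => "%{[parsed][time]}" }
  }
  mutate {
    # sprintf produces a string, so convert back to a number,
    # then drop the scratch field so its nested keys are never indexed
    convert => { "execution.time" => "float" }
    remove_field => [ "parsed" ]
  }
}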

How to fix "_geoip_lookup_failure" tag with logstash filter "geoip" when posting a json with http to logstash

I am posting JSON from an application to Logstash, wanting to get the location of an IP address with Logstash's geoip plugin. However, I get a _geoip_lookup_failure.
This is my Logstash config:
input {
  http {
    port => "4200"
    codec => json
  }
}
filter {
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
This is what I post to the port:
{'used_credentials': [
{'username': 'l', 'number_of_usages': 1, 'used_commands': {},
'get_access': 'false',
'timestamps': {'1': '04/15/2019, 21:08:54'}, 'password': 'l'}],
'clientip': '192.168.xxx.xx',
'unsuccessfull_logins': 1}
and this is what I get in Logstash:
{
"unsuccessfull_logins" => 1,
"#version" => "1",
"used_credentials" => [
[0] {
"username" => "l",
"used_commands" => {},
"password" => "l",
"timestamps" => {
"1" => "04/15/2019, 21:08:54"
},
"number_of_usages" => 1,
"get_access" => "false"
}
],
"clientip" => "192.168.xxx.xx",
"#timestamp" => 2019-04-15T19:08:57.147Z,
"host" => "127.0.0.1",
"headers" => {
"request_path" => "/telnet",
"connection" => "keep-alive",
"accept_encoding" => "gzip, deflate",
"http_version" => "HTTP/1.1",
"content_length" => "227",
"http_user_agent" => "python-requests/2.21.0",
"request_method" => "POST",
"http_accept" => "*/*",
"content_type" => "application/json",
"http_host" => "127.0.0.1:4200"
},
"geoip" => {},
"tags" => [
[0] "_geoip_lookup_failure"
]
}
I don't understand why the input is recognized correctly but geoip does not find it.
The problem is that your clientip is in the 192.168.0.0/16 network, which is a private network reserved for local use only; it is not present in the database used by the geoip filter.
The geoip filter will only work with public IP addresses.
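If the same pipeline can also receive public addresses, one way to avoid the failure tag on private ones is to guard the geoip filter with a conditional. A minimal sketch (the regex below is my own and only covers the RFC 1918 private ranges):
filter {
  # only attempt the lookup for non-private addresses;
  # 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 are never in the GeoIP database
  if [clientip] and [clientip] !~ /^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)/ {
    geoip {
      source => "clientip"
    }
  }
}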

Unable to output fields from logstash to csv

I'm trying to send Logstash output to CSV, but the columns are not being written to the file.
This is my logstash configuration:
input
{
http
{
host => "0.0.0.0"
port => 31311
}
}
filter
{
grok {
match => { "id" => "%{URIPARAM:id}?" }
}
kv
{
field_split => "&?"
source => "[headers][request_uri]"
}
}
output
{
stdout { codec => rubydebug }
csv
{
fields => ["de,cd,dl,message,bn,ua"]
path => "/tmp/logstash-bq/text.csv"
flush_interval => 0
csv_options => {"col_sep" => ";" "row_sep" => "\r\n"}
}
}
This is my input:
curl -X POST 'http://localhost:31311/?id=9decaf95-20a5-428e-a3ca-50485edb9f9f&uid=1-fg4fuqed-j0hzl5q2&ev=pageview&ed=&v=1&dl=http://dev.xxx.com.br/&rl=http://dev.xxxx.com.br/&ts=1491758180677&de=UTF-8&sr=1600x900...
This is Logstash's answer:
{
"headers" => {
"http_accept" => "*/*",
"request_path" => "/",
"http_version" => "HTTP/1.1",
"request_method" => "POST",
"http_host" => "localhost:31311",
"request_uri" => "/?id=xxx...",
"http_user_agent" => "curl/7.47.1"
},
"de" => "UTF-8",
"cd" => "24",
"dl" => "http://dev.xxx.com.br/",
"message" => "",
"bn" => "Chrome%2057",
"ua" => "Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_11_3)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/57.0.2987.133%20Safari/537.36",
"dt" => "xxxx",
"uid" => "1-fg4fuqed-j0hzl5q2",
"ev" => "pageview",
"#timestamp" => 2017-04-09T17:41:03.083Z,
"v" => "1",
"md" => "false",
"#version" => "1",
"host" => "0:0:0:0:0:0:0:1",
"rl" => "http://dev.xxx.com.br/",
"vp" => "1600x236",
"id" => "9decaf95-20a5-428e-a3ca-50485edb9f9f",
"ts" => "1491758180677",
"sr" => "1600x900"
}
[2017-04-09T14:41:03,137][INFO ][logstash.outputs.csv ] Opening file {:path=>"/tmp/logstash-bq/text.csv"}
But when I open /tmp/logstash-bq/text.csv I see this:
2017-04-09T16:26:17.464Z 127.0.0.1 abc2017-04-09T17:19:19.690Z 0:0:0:0:0:0:0:1 2017-04-09T17:23:12.117Z 0:0:0:0:0:0:0:1 2017-04-09T17:24:08.067Z 0:0:0:0:0:0:0:1 2017-04-09T17:31:39.269Z 0:0:0:0: 0:0:0:1 2017-04-09T17:38:02.624Z 0:0:0:0:0:0:0:1 2017-04-09T17:41:03.083Z 0:0:0:0:0:0:0:1
The CSV output is bugged in Logstash 5.x. I had to install Logstash 2.4.1.
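If you cannot downgrade, a possible workaround (a sketch, not part of the answer above) is to skip the csv output entirely and format the line yourself with the file output and a line codec. Note also that the fields option of the csv output is normally a list of separate field names ("de", "cd", ...) rather than one comma-joined string:
output {
  file {
    path => "/tmp/logstash-bq/text.csv"
    # build the delimited line by hand instead of relying on the csv output
    codec => line {
      format => "%{de};%{cd};%{dl};%{message};%{bn};%{ua}"
    }
  }
}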

Logstash Parsing and Calculations with CSV

I am having trouble parsing and calculating performance Navigation Timing data I have in a CSV.
I was able to parse the fields, but I am not sure how to approach the calculations (below) properly. Some points to keep in mind:
1. Data sets are grouped together by the value in the first column (it is the timestamp of when the 21 data points were taken), e.g.:
ACMEPage-1486643427973,unloadEventEnd,1486643372422
2. Calculations need to be done with data points within the same group.
I am assuming some tagging and grouping will need to be done, but I don't have a clear vision of how to implement it. Any help would be greatly appreciated.
Thanks,
---------------Calculations-----------------
Total First byte Time = responseStart - navigationStart
Latency = responseStart - fetchStart
DNS / Domain Lookup Time = domainLookupEnd - domainLookupStart
Server connect Time = connectEnd - connectStart
Server Response Time = responseStart - requestStart
Page Load time = loadEventStart - navigationStart
Transfer/Page Download Time = responseEnd - responseStart
DOM Interactive Time = domInteractive - navigationStart
DOM Content Load Time = domContentLoadedEventEnd - navigationStart
DOM Processing to Interactive = domInteractive - domLoading
DOM Interactive to Complete = domComplete - domInteractive
Onload = loadEventEnd - loadEventStart
-------Data in CSV-----------
ACMEPage-1486643427973,unloadEventEnd,1486643372422
ACMEPage-1486643427973,responseEnd,1486643372533
ACMEPage-1486643427973,responseStart,1486643372416
ACMEPage-1486643427973,domInteractive,1486643373030
ACMEPage-1486643427973,domainLookupEnd,1486643372194
ACMEPage-1486643427973,unloadEventStart,1486643372422
ACMEPage-1486643427973,domComplete,1486643373512
ACMEPage-1486643427973,domContentLoadedEventStart,1486643373030
ACMEPage-1486643427973,domainLookupStart,1486643372194
ACMEPage-1486643427973,redirectEnd,0
ACMEPage-1486643427973,redirectStart,0
ACMEPage-1486643427973,connectEnd,1486643372194
ACMEPage-1486643427973,toJSON,{}
ACMEPage-1486643427973,connectStart,1486643372194
ACMEPage-1486643427973,loadEventStart,1486643373512
ACMEPage-1486643427973,navigationStart,1486643372193
ACMEPage-1486643427973,requestStart,1486643372203
ACMEPage-1486643427973,secureConnectionStart,0
ACMEPage-1486643427973,fetchStart,1486643372194
ACMEPage-1486643427973,domContentLoadedEventEnd,1486643373058
ACMEPage-1486643427973,domLoading,1486643372433
ACMEPage-1486643427973,loadEventEnd,1486643373514
----------Output---------------
"path" => "/Users/philipp/Downloads/build2/logDataPoints_com.concur.automation.cge.ui.admin.ADCLookup_1486643340910.csv",
"#timestamp" => 2017-02-09T12:29:57.763Z,
"navigationTimer" => "connectStart",
"#version" => "1",
"host" => "15mbp-09796.local",
"elapsed_time" => "1486643372194",
"pid" => "1486643397763",
"page" => "ADCLookupDataPage",
"message" => "ADCLookupDataPage-1486643397763,connectStart,1486643372194",
"type" => "csv"
}
--------------logstash.conf----------------
input {
file {
type => "csv"
path => "/Users/path/logDataPoints_com.concur.automation.acme.ui.admin.acme_1486643340910.csv"
start_position => beginning
# to read from the beginning of file
sincedb_path => "/dev/null"
}
}
filter {
csv {
columns => ["page_id", "navigationTimer", "elapsed_time"]
}
if (["elapsed_time"] == "{}" ) {
drop{}
}
else {
grok {
match => { "page_id" => "%{WORD:page}-%{INT:pid}"
}
remove_field => [ "page_id" ]
}
}
date {
match => [ "pid", "UNIX_MS" ]
target => "#timestamp"
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}
I did the following to get my data trending:
- I found it easier to pivot the data, rather than going down the column, so that the data goes along the rows for each "event" or "document".
- Each field needed to be mapped accordingly as an integer or string.
Once the data was in Kibana properly, I had problems using the ruby code filter to make simple math calculations, so I ended up using "scripted fields" to make the calculations in Kibana.
input {
file {
type => "csv"
path => "/Users/philipp/perf_csv_pivot2.csv"
start_position => beginning
# to read from the beginning of file
sincedb_path => "/dev/null"
}
}
filter {
csv {
columns => ["page_id","unloadEventEnd","responseEnd","responseStart","domInteractive","domainLookupEnd","unloadEventStart","domComplete","domContentLoadedEventStart","domainLookupstart","redirectEnd","redirectStart","connectEnd","toJSON","connectStart","loadEventStart","navigationStart","requestStart","secureConnectionStart","fetchStart","domContentLoadedEventEnd","domLoading","loadEventEnd"]
}
grok {
match => { "page_id" => "%{WORD:page}-%{INT:page_ts}" }
remove_field => [ "page_id", "message", "path" ]
}
mutate {
convert => { "unloadEventEnd" => "integer" }
convert => { "responseEnd" => "integer" }
convert => { "responseStart" => "integer" }
convert => { "domInteractive" => "integer" }
convert => { "domainLookupEnd" => "integer" }
convert => { "unloadEventStart" => "integer" }
convert => { "domComplete" => "integer" }
convert => { "domContentLoadedEventStart" => "integer" }
convert => { "domainLookupstart" => "integer" }
convert => { "redirectEnd" => "integer" }
convert => { "redirectStart" => "integer" }
convert => { "connectEnd" => "integer" }
convert => { "toJSON" => "string" }
convert => { "connectStart" => "integer" }
convert => { "loadEventStart" => "integer" }
convert => { "navigationStart" => "integer" }
convert => { "requestStart" => "integer" }
convert => { "secureConnectionStart" => "integer" }
convert => { "fetchStart" => "integer" }
convert => { "domContentLoadedEventEnd" => "integer" }
convert => { "domLoading" => "integer" }
convert => { "loadEventEnd" => "integer" }
}
date {
match => [ "page_ts", "UNIX_MS" ]
target => "#timestamp"
remove_field => [ "page_ts", "timestamp", "host", "toJSON" ]
}
}
output {
elasticsearch { hosts => ["localhost:9200"] }
stdout { codec => rubydebug }
}
Hope this can help someone else,
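For anyone who would rather do the arithmetic inside Logstash than in Kibana scripted fields, a minimal ruby filter sketch is below. It assumes the integer fields from the pivoted config above and the event.get/event.set API (Logstash 5+); only two of the metrics from the question are shown, and the output field names are my own:
filter {
  ruby {
    code => "
      # derive a couple of the Navigation Timing metrics listed in the question
      nav  = event.get('navigationStart')
      resp = event.get('responseStart')
      load = event.get('loadEventStart')
      event.set('total_first_byte_time', resp - nav) if resp && nav
      event.set('page_load_time', load - nav) if load && nav
    "
  }
}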

Logstash 1.4.1 multiline codec not working

I'm trying to parse multiline data from a log file.
I have tried the multiline codec and the multiline filter, but neither works for me.
Log data
INFO 2014-06-26 12:34:42,881 [4] [HandleScheduleRequests] Request Entity:
User Name : user
DLR : 04
Text : string
Interface Type : 1
Sender : sdr
DEBUG 2014-06-26 12:34:43,381 [4] [HandleScheduleRequests] Entitis is : 1 System.Exception
and this is the configuration file:
input {
file {
type => "cs-bulk"
path =>
[
"/logs/bulk/*.*"
]
start_position => "beginning"
sincedb_path => "/logstash-1.4.1/bulk.sincedb"
codec => multiline {
pattern => "^%{LEVEL4NET}"
what => "previous"
negate => true
}
}
}
output {
stdout { codec => rubydebug }
if [type] == "cs-bulk" {
elasticsearch {
host => localhost
index => "cs-bulk"
}
}
}
filter {
if [type] == "cs-bulk" {
grok {
match => { "message" => "%{LEVEL4NET:level} %{TIMESTAMP_ISO8601:time} %{THREAD:thread} %{LOGGER:method} %{MESSAGE:message}" }
overwrite => ["message"]
}
}
}
and this is what I get when Logstash parses the multiline part.
It just gets the first line and tags it as multiline; the other lines are not parsed!
{
"#timestamp" => "2014-06-27T16:27:21.678Z",
"message" => "Request Entity:",
"#version" => "1",
"tags" => [
[0] "multiline"
],
"type" => "cs-bulk",
"host" => "lab",
"path" => "/logs/bulk/22.log",
"level" => "INFO",
"time" => "2014-06-26 12:34:42,881",
"thread" => "[4]",
"method" => "[HandleScheduleRequests]"
}
Place a (?m) at the beginning of your grok pattern. That will allow regex to not stop at \n.
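Applied to the grok filter from the question, that looks something like this (LEVEL4NET, THREAD, LOGGER and MESSAGE are the asker's custom patterns):
filter {
  if [type] == "cs-bulk" {
    grok {
      # (?m) lets the pattern match across the newlines inside the multiline event
      match => { "message" => "(?m)%{LEVEL4NET:level} %{TIMESTAMP_ISO8601:time} %{THREAD:thread} %{LOGGER:method} %{MESSAGE:message}" }
      overwrite => ["message"]
    }
  }
}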
Not quite sure what's going on, but using a multiline filter instead of a codec like this:
input {
stdin {
}
}
filter {
multiline {
pattern => "^(WARN|DEBUG|ERROR)"
what => "previous"
negate => true
}
}
Does work in my testing...
{
"message" => "INFO 2014-06-26 12:34:42,881 [4] [HandleScheduleRequests] Request Entity:\nUser Name : user\nDLR : 04\nText : string\nInterface Type : 1\nSender : sdr",
"#version" => "1",
"#timestamp" => "2014-06-27T20:32:05.288Z",
"host" => "HOSTNAME",
"tags" => [
[0] "multiline"
]
}
{
"message" => "DEBUG 2014-06-26 12:34:43,381 [4] [HandleScheduleRequests] Entitis is : 1 System.Exception",
"#version" => "1",
"#timestamp" => "2014-06-27T20:32:05.290Z",
"host" => "HOSTNAME"
}
Except... with the test file I used, it never prints out the last line (because it's still looking for more lines to follow).