Logstash JSON serialization fails on valid JSON (mapper_parsing_exception)

Given the following multiline log
{
  "code" : 429
}
And the following pipeline logstash.conf
filter {
  grok {
    match => {
      "message" => [
        "%{GREEDYDATA:json}"
      ]
    }
  }
  json {
    source => "json"
    target => "json"
  }
}
When the log is sent into Logstash through Filebeat
Then Logstash returns
[2018-08-07T10:48:41,067][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-to-logstash", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x2bf7b08d>], :response=>{"index"=>{"_index"=>"filebeat-to-logstash", "_type"=>"doc", "_id"=>"trAAFGUBnhQ5nUWmyzVg", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [json]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:3846"}}}}}
This is incorrect behavior, as the JSON is perfectly valid. How should this be solved?

I found out that in Logstash 6.3.0 this problem occurs when one tries to parse JSON into a field named "json". Changing this field name to anything else solves the issue (see the sketch below).
Since the Elastic JSON filter plugin documentation does not mention anything about this behaviour, and the error message is inaccurate, it can be assumed this is a bug.
A bug report has been filed: https://github.com/elastic/logstash/issues/9876
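For example, a minimal sketch of the fixed pipeline, with the target renamed to json_data (the name is arbitrary; per the answer above, anything other than "json" avoids the problem):
filter {
  grok {
    match => {
      "message" => [
        "%{GREEDYDATA:json}"
      ]
    }
  }
  json {
    source => "json"
    # writing the parsed object to a field not named "json" avoids the
    # mapper_parsing_exception shown above
    target => "json_data"
  }
}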

Related

Filtering JSON/non-JSON entries in Logstash

I have a question about filtering entries in Logstash. I have two different logs coming into Logstash. One log is just a std format with a timestamp and message, but the other comes in as JSON.
I use an if statement to test for a certain host, and if that host is present, I apply the JSON filter to the message... the problem is that when it encounters a non-JSON stdout message, it can't parse it and throws exceptions.
Does anyone know how to test whether an incoming entry is JSON, apply the filter if it is, and otherwise just ignore it?
Thanks
if [agent][hostname] == "some host"
# if an entry is not in json format how to ignore?
{
  json {
    source => "message"
    target => "gpfs"
  }
}
You can try with a grok filter as a first step.
grok {
  match => {
    "message" => [
      "{%{GREEDYDATA:json_message}}",
      "%{GREEDYDATA:std_out}"
    ]
  }
}
if [json_message] {
  mutate {
    replace => { "json_message" => "{%{json_message}}" }
  }
  json {
    source => "json_message"
    target => "gpfs"
  }
}
There is probably a cleaner solution than this, but it will do the job.
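One arguably cleaner variant (a sketch, not part of the original answer): in recent Logstash versions a failed parse only tags the event rather than throwing an exception, so you can let the json filter run on every event and branch on its failure tag afterwards.
filter {
  json {
    source => "message"
    target => "gpfs"
    # non-JSON lines are left untouched apart from this tag
    tag_on_failure => ["_jsonparsefailure"]
  }
  if "_jsonparsefailure" in [tags] {
    # plain stdout entry; handle or ignore it here
  }
}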

logstash not parsing all messages

I'm trying to parse syslog messages with Logstash, using this grok match:
"message", "\[%{TIMESTAMP_ISO8601:timestamp}\] \[%{WORD:module}\] \[%{LOGLEVEL:severity}\] (\[tid=%{DATA:tid}(\s)?\]) (\[%{DATA:auth}:%{NUMBER:linenum}:%{DATA:method}\]) %{GREEDYDATA:data}"
and I'm getting this error message:
[logstash.filters.json ] Error parsing json {:source=>"message", :raw=>"[2017-03-27 09:18:03,071] [WS-Server] [INFO] [WebSocketsHandler:81:on_message] [ON_MESSAGE] [140609651632016] [trn_id: 1062fed9-9ae3-4523-8817-657031e83af1] Received update-device-details message from box 38:b8:eb:50:00:a9", :exception=>#
What am I missing?
Thanks
It's likely because this:
[2017-03-27 09:18:03,071] [WS-Server] [INFO] [WebSocketsHandler:81:on_message] [ON_MESSAGE] [140609651632016] [trn_id: 1062fed9-9ae3-4523-8817-657031e83af1] Received update-device-details message from box 38:b8:eb:50:00:a9
is not valid JSON. Your hint is this line in the error:
[logstash.filters.json ] Error parsing json
That tells me there is a json {} filter being applied to this message in addition to the grok {} segment you've quoted. I suggest you review your filter { } statements and figure out how the raw syslog message (very likely still the message field) is ending up being parsed by a json {} filter. Grok has nothing to do with this. A guard like the sketch below keeps a json {} filter away from non-JSON lines.
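A sketch of such a guard (hypothetical; it assumes the stray filter reads the message field): only lines that at least look like JSON objects are handed to the json filter.
filter {
  # raw syslog lines do not start with "{" and skip the json filter entirely
  if [message] =~ /^\s*\{/ {
    json {
      source => "message"
    }
  }
}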

JSON Variants (Log4J) with LogStash

I'm not sure if this is a follow-up or a separate question to this one. There is some piece of LogStash that is not clicking for me, so I apologize for asking a related question. Still, I'm going out of my mind here.
I have an app that writes logs to a file. Each log entry is a JSON object. An example of my .json file looks like the following:
{
  "logger":"com.myApp.ClassName",
  "timestamp":"1456976539634",
  "level":"ERROR",
  "thread":"pool-3-thread-19",
  "message":"Danger. There was an error",
  "throwable":"java.Exception"
},
{
  "logger":"com.myApp.ClassName",
  "timestamp":"1456976539649",
  "level":"ERROR",
  "thread":"pool-3-thread-16",
  "message":"I cannot go on",
  "throwable":"java.Exception"
}
This format is what's created from Log4J2's JsonLayout. I'm trying my damnedest to get the log entries into LogStash. In an attempt to do this, I've created the following LogStash configuration file:
input {
  file {
    type => "log4j"
    path => "/logs/mylogs.log"
  }
}
output {
  file {
    path => "/logs/out.log"
  }
}
When I open /logs/out.log, I see a mess. There's JSON. However, I do not see the "level" property or "thread" property that Log4J generates. An example of a record can be seen here:
{"message":"Danger. There was an error","#version":"1","#timestamp":"2014-04-08T17:20:10.035Z","type":"log4j","host":"ip-myAddress","path":"/logs/mylogs.log"}
Sometimes I even get parse errors. I need my properties to still be properties. I do not want them crammed into the message portion or the output. I have a hunch this has something to do with Codecs. Yet, I'm not sure. I'm not sure if I should change the codec on the logstash input configuration. Or, if I should change the input on the output configuration. I would sincerely appreciate any help as I'm getting desperate at this point.
Can you change your log format?
After changing your log format to
{ "logger":"com.myApp.ClassName", "timestamp":"1456976539634", "level":"ERROR", "thread":"pool-3-thread-19", "message":"Danger. There was an error", "throwable":"java.Exception" }
{ "logger":"com.myApp.ClassName", "timestamp":"1456976539649", "level":"ERROR", "thread":"pool-3-thread-16", "message":"I cannot go on", "throwable":"java.Exception" }
so that there is one JSON log per line, without the "," at the end of each log, I can use the configuration below to parse the JSON message into the corresponding fields.
input {
  file {
    type => "log4j"
    path => "/logs/mylogs.log"
    codec => json
  }
}
input {
  file {
    codec => json_lines { charset => "UTF-8" }
    ...
  }
}
should do the trick
Use Logstash's log4j input.
http://logstash.net/docs/1.4.2/inputs/log4j
Should look something like this:
input {
  log4j {
    port => xxxx
  }
}
This worked for me, good luck!
I think @Ben Lim was right: your Logstash config is fine, you just need to format the input JSON properly so that each log event sits on a single line. This is very simple with Log4J2's JsonLayout: just set eventEol=true and compact=true. (reference)

Using JSON with LogStash

I'm going out of my mind here. I have an app that writes logs to a file. Each log entry is a JSON object. An example of my .json file looks like the following:
{"Property 1":"value A","Property 2":"value B"}
{"Property 1":"value x","Property 2":"value y"}
I'm trying desperately to get the log entries into LogStash. In an attempt to do this, I've created the following LogStash configuration file:
input {
  file {
    type => "json"
    path => "/logs/mylogs.log"
    codec => "json"
  }
}
output {
  file {
    path => "/logs/out.log"
  }
}
Right now, I'm manually adding records to mylogs.log to try to get it working. However, they appear oddly in stdout. When I open out.log, I see something like the following:
{"message":"\"Property 1\":\"value A\", \"Property 2\":\"value B\"}","#version":"1","#timestamp":"2014-04-08T15:33:07.519Z","type":"json","host":"ip-[myAddress]","path":"/logs/mylogs.log"}
Because of this, if I send the message to ElasticSearch, I don't get the fields. Instead I get a jumbled mess. I need my properties to still be properties. I do not want them crammed into the message portion or the output. I have a hunch this has something to do with Codecs. Yet, I'm not sure. I'm not sure if I should change the codec on the logstash input configuration. Or, if I should change the input on the output configuration.
Try removing the json codec and adding a json filter:
input {
  file {
    type => "json"
    path => "/logs/mylogs.log"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  file {
    path => "/logs/out.log"
  }
}
You do not need the json codec, because you do not want to decode the source JSON at the input stage; you want to filter the input so that the JSON data is extracted from the message field only.
By default, the tcp input puts everything into the message field if no json codec is specified. A workaround for the _jsonparsefailure on the message field that appears after specifying the json codec is the following:
input {
  tcp {
    port => '9563'
  }
}
filter {
  json {
    source => "message"
    target => "myroot"
  }
  json {
    source => "myroot"
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
  }
}
It will parse the message field, as a proper JSON string, into the field myroot, and then myroot is parsed to yield the JSON. We can remove the now-redundant message field like this:
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
}
Try with this one:
filter {
  json {
    source => "message"
    target => "jsoncontent" # useful for JSON with a multi-layer structure
  }
}
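With the sample lines from the question, the decoded properties land under the jsoncontent target, roughly like this (a hand-written illustration, not captured output; the elided fields are placeholders):
{"jsoncontent":{"Property 1":"value A","Property 2":"value B"},"message":"...","@timestamp":"..."}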

Having Logstash reading JSON

I am trying to use logstash for analyzing a file containing JSON objects as follows:
{"Query":{"project_id":"a7565b911f324a9199a91854ea18de7e","timestamp":1392076800,"tx_id":"2e20a255448742cebdd2ccf5c207cd4e","token":"3F23A788D06DD5FE9745D140C264C2A4D7A8C0E6acf4a4e01ba39c66c7c9cbd6a123588b22dc3a24"}}
{"Response":{"result_code":"Success","project_id":"a7565b911f324a9199a91854ea18de7e","timestamp":1392076801,"http_status_code":200,"tx_id":"2e20a255448742cebdd2ccf5c207cd4e","token":"3F23A788D06DD5FE9745D140C264C2A4D7A8C0E6acf4a4e01ba39c66c7c9cbd6a123588b22dc3a24","targets":[]}}
{"Query":{"project_id":"a7565b911f324a9199a91854ea18de7e","timestamp":1392076801,"tx_id":"f7f68c7fb14f4959a1db1a206c88a5b7","token":"3F23A788D06DD5FE9745D140C264C2A4D7A8C0E6acf4a4e01ba39c66c7c9cbd6a123588b22dc3a24"}}
Ideally, I'd expect Logstash to understand the JSON.
I used the following config:
input {
  file {
    type => "recolog"
    format => json_event
    # Wildcards work, here :)
    path => [ "/root/isaac/DailyLog/reco.log" ]
  }
}
output {
  stdout { debug => true }
  elasticsearch { embedded => true }
}
I built this file based on this Apache recipe
When running Logstash with debug = true, it reads the objects, but for now it looks like it understands only a very basic version of the data, not its structure.
How could I see stats in the Kibana GUI based on my JSON file, for example the number of Query objects, or even queries based on timestamp?
Thanks in advance
I found out that Logstash will automatically detect JSON by using the codec setting within the file input, as follows:
input {
  stdin {
    type => "stdin-type"
  }
  file {
    type => "prodlog"
    # Wildcards work, here :)
    path => [ "/root/isaac/Mylogs/testlog.log" ]
    codec => json
  }
}
output {
  stdout { debug => true }
  elasticsearch { embedded => true }
}
Then Kibana showed the fields of the JSON perfectly.
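To get the timestamp-based views the question asks for, the epoch timestamp inside the JSON would additionally need to be mapped onto @timestamp; a minimal sketch (the nested field name comes from the sample log, the rest is an assumption):
filter {
  if [Query] {
    date {
      # the sample events carry epoch seconds in Query.timestamp
      match => [ "[Query][timestamp]", "UNIX" ]
    }
  }
}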