Logstash remove field after JSON-parsing it

Using Logstash 1.4.2 with ElasticSearch 1.3 (I'm aware it's not the latest ES version available) on Ubuntu 14.04 LTS.
We have an event stream which contains JSON inside one of its fields named "message".
We'd like to replace the event's fields with the parsed JSON of that field, if present.
We'd also like to remove the ORIGINAL "message" field (the one which contains the JSON string) if found and parsed.
The problem is that the JSON object inside the field's text could define a new "message" field, which we have to retain.
The following always removes the "message" field after parsing it:
json {
  source => "message"
  remove_field => [ "message" ]
}
This is wrong: we want to keep it in case there was a "message" field inside the value of the original "message" field.
I tried to do the following trick, but it seems to still remove the "message" field from the result:
mutate {
  rename => [ "message", "___temp_logstash_filter_message___" ]
}
json {
  source => "___temp_logstash_filter_message___"
}
mutate {
  remove_field => [ "___temp_logstash_filter_message___" ]
}
i.e. I try to rename the original "message" field to an arbitrary internal name which I don't expect to appear in the input value, parse the JSON string using that temporary name as a source, then remove the renamed original field.
That way I was hoping to distinguish between the "original" message field and any "message" field which may be contained inside its JSON value. But this doesn't seem to make a difference - the "message" field is still missing from the result.
Is there a way to achieve what I need?
Thanks.

Instead of renaming the field, copy the original field into a new one.
This can be done with:
filter {
  ...
  ruby {
    code => "event['new_field'] = event['old_field']"
  }
  ...
}
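Applied to the question, here is a minimal sketch of the copy-then-parse approach (this uses the Logstash 1.x Ruby event API shown above; ___temp_message___ is an arbitrary placeholder name, not anything Logstash requires):
filter {
  # Copy the raw JSON string aside instead of renaming it
  ruby {
    code => "event['___temp_message___'] = event['message']"
  }
  # Drop the original "message"; the json filter below restores it
  # only if the parsed JSON itself defines a "message" field
  mutate {
    remove_field => [ "message" ]
  }
  json {
    source => "___temp_message___"
  }
  mutate {
    remove_field => [ "___temp_message___" ]
  }
}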

Filter json objects

Some of my logs contain JSON in their message field. I use the json filter as follows:
json {
  skip_on_invalid_json => true
  source => "message"
  target => "json"
}
This tries to parse the message field and, if it contains valid JSON, adds the result to the json field.
Unfortunately, from time to time I receive logs which contain a single string like "some random message" in the message field. In those logs the string from message ends up in the json field and messes up the index mapping.
I tried to filter this out by adding:
prune {
  blacklist_values => { "json" => "/.+/" }
}
But this seems to always remove the json field.
Is there a way to parse the message field or keep the json field only when it contains an object and not a single string?
You could do it with a ruby filter that tests the field you are interested in:
ruby {
  code => '
    s = event.get("json")
    if s and s.instance_of? String
      event.remove("json")
    end
  '
}
That will not remove [json] if it is a hash or array.
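Putting the two together, a minimal sketch of the full filter block (same field names as above; the ruby cleanup runs after the json filter):
filter {
  json {
    skip_on_invalid_json => true
    source => "message"
    target => "json"
  }
  # Drop [json] again when the parse produced a bare string
  ruby {
    code => '
      s = event.get("json")
      if s and s.instance_of? String
        event.remove("json")
      end
    '
  }
}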

Pretty-print JSON document in Logstash 5

I am using Logstash 5 to ingest a file containing a set of JSON documents, extract part of each document in each event, and write them out to another file.
I want to pretty-print the output JSON. I found this thread: How do I pretty-print JSON for an email body in logstash? which looked ideal, but Logstash 5 now prevents you from accessing event attributes directly, replacing direct access with the event.get and event.set methods (see https://www.elastic.co/guide/en/logstash/current/event-api.html).
I tried to convert the above as follows:
ruby {
  init => "require 'json'"
  code => "@pretty_body = JSON.pretty_generate(event.get('body'))
    event.set('body', @pretty_body)
  "
}
but get
ERROR logstash.filters.ruby - Ruby exception occurred: only generation of JSON objects or arrays allowed
What is the correct LS5 equivalent of the above?
I think you're looking for the following:
filter {
  ruby {
    init => "require 'json'"
    code => "event.set('message', JSON.pretty_generate(JSON.parse(event.get('message'))))"
  }
}
Here is my test and the result:
input: {"make_me_pretty":"test"}
{
    "@timestamp" => 2017-03-22T04:56:00.118Z,
      "@version" => "1",
       "message" => "{\n  \"make_me_pretty\": \"test\"\n}"
}
Thank you @fylie! I'm posting this as another answer as I needed to adjust what I was doing a bit.
My problem was that the input document is a JSON document wrapped in another JSON document, so the inner doc is one field in the outer doc, which contains metadata. So "message" contains more than I want to write to the file.
I was hung up on trying to reduce the document to only what I wanted, by parsing the inner doc field with the json filter up to the top level, removing everything else, and then pretty-printing what remained. So I had nothing left to use in the Ruby filter.
To fix this, I keep everything, don't parse the inner doc field with the json filter, pretty-print it with the ruby filter instead, and write just that field to the file using the format option in the file output plugin. It works as desired.
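A minimal sketch of that arrangement (inner_doc is a hypothetical name for the field holding the wrapped document; the line codec's format option substitutes that field into each output line):
filter {
  ruby {
    init => "require 'json'"
    code => "event.set('inner_doc', JSON.pretty_generate(JSON.parse(event.get('inner_doc'))))"
  }
}
output {
  file {
    path => "./output/outputFile.json"
    codec => line { format => "%{inner_doc}" }
  }
}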
The question says you want to pretty-print the JSON output to another file. Just use the File Output plugin with the rubydebug codec:
output {
  file {
    path => "./output/outputFile.json"
    codec => rubydebug
  }
}
(I'm using Logstash 7.15.2.)

Logstash json filter parsed fields cannot be read within logstash

I am parsing a JSON file with codec => json in the input and json { source => "message" } in the filter.
I have also tried alternating the two.
The parsed fields cannot be read by Logstash using "if [comment]". This does not work even though I can see the fields and their values with stdout { codec => rubydebug } as output.
I just found out that the fields I am trying to work with are actually subfields. I was trying to access them like top-level fields.
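For reference, nested fields use the [parent][child] syntax in conditionals. A minimal sketch (assuming comment ended up under a parsed json target; the field names here are illustrative):
filter {
  if [json][comment] {
    mutate {
      add_tag => [ "has_comment" ]
    }
  }
}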

kv filter in logstash

How does remove_field in kv work? I have a JSON file and need to remove fields that are deeply nested in it.
[url][queryString][404;http://hspb.homesearch.com:80/wcJV4LhTSmzJ1rX6FOq4RuiKeK49gUP2JvWtjdhhE] is one such field.
This filter doesn't work in Logstash:
filter {
  kv {
    source => [ "[url][queryString]" ]
    remove_field => [ "404;%{somefield}", "my_extraneous_field" ]
  }
}
remove_field will remove the named field(s) when the underlying filter (in your case kv) succeeds.
If you need to refer to nested fields, use the "[foo][bar]" syntax. You can also test whether field references like %{somefield} work inside the names...
NOTE: [foo][bar] is meant to illustrate how to refer to nested fields. If your fields are [myTopField][myNestedField], use that.
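A minimal sketch of the nested-field syntax from the answer ([url][queryString][badKey] is illustrative, standing in for the long 404;... key above):
filter {
  kv {
    source => "[url][queryString]"
    # remove_field runs only if the kv parse itself succeeds
    remove_field => [ "[url][queryString][badKey]", "my_extraneous_field" ]
  }
}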

Get json data from var

I get the following data into a variable fields:
{ data: [ '{"myObj":"asdfg"}' ] }
How do I get the value of myObj into another variable? I tried fields.myObj.
I am trying to upload a file to a server using MEANjs and node multiparty.
Look at your data.
fields only has one property: data. So fields.myObj isn't going to work.
So, let's start with fields.data.
The value of that is an array. You can see the []. It has only one member, so:
fields.data[0]
This is a string. You seem to want to treat it as an object. It happens to conform to the JSON syntax, so you can parse it:
JSON.parse(fields.data[0])
Parsing yields an object, so now you can access the myObj property:
JSON.parse(fields.data[0]).myObj
var fields = { data: [ '{"myObj":"asdfg"}' ] };
alert(JSON.parse(fields.data[0]).myObj);