Not able to use add_field in logstash - configuration

I would like to extract the interface name from a log line using Logstash.
Sample log -
2013 Aug 28 13:14:49 logFile: Interface Etherface1/9 is down (Transceiver Absent)
I want to extract "Etherface1/9" out of this and add it as a field called interface.
I am using the following conf file for this:
input
{
  file
  {
    type => "syslog"
    path => [ "/home/vineeth/logstash/mylog.log" ]
    #path => ["d:/New Folder/sjdc.show.tech/n5k-3a-show-tech.txt"]
    start_position => "beginning"
  }
}
filter {
  grok {
    type => "syslog"
    add_field => [ "port", "Interface %{WORD}" ]
  }
}
output
{
  stdout
  {
    debug => true
    debug_format => "json"
  }
  elasticsearch
  {
    embedded => true
  }
}
But I am always getting "_grokparsefailure" under tags, and none of the new fields appear.
Kindly let me know how I can get the required output.

The grok filter expects that you're trying to match some text. Since you're not passing any possible matches, it triggers the _grokparsefailure tag (per the manual, the tag is added "when there has been no successful match").
You might use a match like this:
grok {
  match => ["message", "Interface %{DATA:port} is down"]
}
This will still fail if the match text isn't present. Logstash is pretty good at parsing fields with a simple structure, but data embedded in a user-friendly string is sometimes tricky. Usually you'll need to branch based on the message format.
Here's a very simple example, using a conditional with a regex:
if [message] =~ /Interface .+ is down/ {
  grok {
    match => ["message", "Interface %{DATA:port} is down"]
  }
}
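Putting it together for the sample line above, and using the field name asked for in the question (a sketch, not tested against your full log; rename interface to whatever suits you), the filter section might look like:
filter {
  if [message] =~ /Interface .+ is down/ {
    grok {
      # captures "Etherface1/9" from the sample line into an "interface" field
      match => ["message", "Interface %{DATA:interface} is down"]
    }
  }
}
Note that %{WORD} on its own would stop at the slash in Etherface1/9, which is why %{DATA} (or something like %{NOTSPACE}) is the safer choice here.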

Related

Filtering JSON/non-JSON entries in Logstash

I have a question about filtering entries in Logstash. I have two different logs coming into Logstash. One log is in a standard format with a timestamp and a message, but the other comes in as JSON.
I use an if statement to test for a certain host, and if that host is present, I apply the JSON filter to the message. The problem is that when it encounters a non-JSON stdout message, it can't parse it and throws exceptions.
Does anyone know how to test whether an incoming entry is JSON, apply the filter if it is, and just ignore it if not?
thanks
if [agent][hostname] == "some host" {
  # if an entry is not in json format how to ignore?
  json {
    source => "message"
    target => "gpfs"
  }
}
You can try with a grok filter as a first step.
grok {
  match => {
    "message" => [
      "{%{GREEDYDATA:json_message}}",
      "%{GREEDYDATA:std_out}"
    ]
  }
}
if [json_message]
{
  mutate {
    replace => { "json_message" => "{%{json_message}}" }
  }
  json {
    source => "json_message"
    target => "gpfs"
  }
}
There is probably a cleaner solution than this, but it will do the job.
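For comparison, one arguably cleaner variant (a sketch, assuming the JSON entries arrive as complete lines starting with a brace) is to branch on the message shape directly and only run the json filter when the line looks like JSON:
filter {
  # only attempt JSON parsing when the line starts with "{"
  if [message] =~ /^\s*\{/ {
    json {
      source => "message"
      target => "gpfs"
    }
  }
}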

How do I pretty-print JSON for an email body in logstash?

I have a Logstash configuration that I've been using to forward log messages in emails. It uses json and json_encode to parse and re-encode JSON log messages.
json_encode used to pretty-print the JSON, which made for very nice looking emails. Unfortunately, with recent Logstash upgrades, it no longer pretty prints.
Is there any way I can get a pretty form of the event into a field that I can use for the email bodies? I'm fine with JSON, Ruby debug, or most other human readable formats.
filter {
  if [type] == "bunyan" {
    # Save a copy of the message, in case we need to pretty-print later
    mutate {
      add_field => { "@orig_message" => "%{message}" }
    }
    json {
      source => "message"
      add_tag => "json"
    }
  }
  # other filters that might add an "email" tag
  if "email" in [tags] {
    # pretty-print JSON for the email
    if "json" in [tags] {
      # re-parse the message into a field we can encode
      json {
        source => "@orig_message"
        target => "body"
      }
      # encode the message, but pretty this time
      json_encode {
        source => "body"
        target => "body"
      }
    }
    # escape the body for HTML output
    mutate {
      add_field => { "htmlbody" => "%{body}" }
    }
    mutate {
      gsub => [
        'htmlbody', '&', '&amp;',
        'htmlbody', '<', '&lt;'
      ]
    }
  }
}
output {
  if "email" in [tags] and "throttled" not in [tags] {
    email {
      options => {
        # config stuff...
      }
      body => "%{body}"
      htmlbody => "
        <table>
          <tr><td>host:</td><td>%{host}</td></tr>
          <tr><td>when:</td><td>%{@timestamp}</td></tr>
        </table>
        <pre>%{htmlbody}</pre>
      "
    }
  }
}
As said by approxiblue, this issue is caused by logstash's new JSON parser (JrJackson). You can use the old parser as a workaround until pretty-print support is added again. Here is how:
You need to change two lines of the plugin's ruby file. Path should be something like:
LS_HOME/vendor/bundle/jruby/1.9/gems/logstash-filter-json_encode-0.1.5/lib/logstash/filters/json_encode.rb
Change line 5
require "logstash/json"
into
require "json"
And change line 44
event[@target] = LogStash::Json.dump(event[@source])
into
event[@target] = JSON.pretty_generate(event[@source])
That's all. After restarting, Logstash should pretty-print again.
Supplement:
In case you don't like changing your ruby sources you could also use a ruby filter instead of json_encode:
# encode the message, but pretty this time
ruby {
  init => "require 'json'"
  code => "event['body'] = JSON.pretty_generate(event['body'])"
}
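If you are on a newer Logstash (5.x or later), the ruby filter has to use the event get/set API instead of the hash-style access above; a minimal sketch of the same idea:
# encode the message, but pretty this time (newer event API)
ruby {
  init => "require 'json'"
  code => "event.set('body', JSON.pretty_generate(event.get('body')))"
}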

Logstash JSON Parse - Ignore or Remove Sub-Tree

I'm sending JSON to logstash with a config like so:
filter {
  json {
    source => "event"
    remove_field => [ "event" ]
  }
}
Here is an example JSON object I'm sending:
{
  "#timestamp": "2015-04-07T22:26:37.786Z",
  "type": "event",
  "event": {
    "activityRecord": {
      "id": 68479,
      "completeTime": 1428445597542,
      "data": {
        "2015-03-16": true,
        "2015-03-17": true,
        "2015-03-18": true,
        "2015-03-19": true
      }
    }
  }
}
Because of the arbitrary nature of the activityRecord.data object, I don't want logstash and elasticsearch to index all these date fields. As is, I see activityRecord.data.2015-03-16 as a field to filter on in Kibana.
Is there a way to ignore this sub-tree of data? Or at least delete it after it has already been parsed? I tried remove_field with wildcards and whatnot, but no luck.
Though not entirely intuitive, it is documented that subfield references are made with square brackets, e.g. [field][subfield], so that's what you'll have to use with remove_field:
mutate {
  remove_field => "[event][activityRecord][data]"
}
To delete fields using wildcard matching you'd have to use a ruby filter.
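For illustration, a rough sketch of such a ruby filter, using the old event['field'] hash-style API seen elsewhere in this thread and assuming the parsed data ends up under [activityRecord][data] (adjust the path to match your actual event layout):
ruby {
  code => "
    record = event['activityRecord']
    if record && record['data']
      # drop any sub-key that looks like a date, e.g. 2015-03-16
      record['data'].delete_if { |key, value| key =~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ }
    end
  "
}
(On Logstash 5.x and later you would go through event.get / event.set instead, as in the earlier pretty-print example.)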

Using JSON with LogStash

I'm going out of my mind here. I have an app that writes logs to a file. Each log entry is a JSON object. An example of my .json file looks like the following:
{"Property 1":"value A","Property 2":"value B"}
{"Property 1":"value x","Property 2":"value y"}
I'm trying desperately to get the log entries into LogStash. In an attempt to do this, I've created the following LogStash configuration file:
input {
  file {
    type => "json"
    path => "/logs/mylogs.log"
    codec => "json"
  }
}
output {
  file {
    path => "/logs/out.log"
  }
}
Right now, I'm manually adding records to mylogs.log to try and get it working. However, they appear oddly in stdout. When I open out.log, I see something like the following:
{"message":"\"Property 1\":\"value A\", \"Property 2\":\"value B\"}","#version":"1","#timestamp":"2014-04-08T15:33:07.519Z","type":"json","host":"ip-[myAddress]","path":"/logs/mylogs.log"}
Because of this, if I send the message to Elasticsearch, I don't get the fields. Instead I get a jumbled mess. I need my properties to still be properties; I do not want them crammed into the message portion of the output. I have a hunch this has something to do with codecs, but I'm not sure whether I should change the codec on the Logstash input configuration or on the output configuration.
Try removing the json codec and adding a json filter:
input {
  file {
    type => "json"
    path => "/logs/mylogs.log"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  file {
    path => "/logs/out.log"
  }
}
You do not need the json codec, because you do not want to decode the source JSON at the input; instead, you want to filter the input so that the JSON data is parsed out of the message field.
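If you want to double-check the parsed fields on the console while testing, one option (a sketch; the rubydebug codec simply pretty-prints each event) is to add a stdout output alongside the file output:
output {
  # pretty-printed events on the console, handy while testing the filter
  stdout { codec => rubydebug }
  file {
    path => "/logs/out.log"
  }
}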
By default, the tcp input puts everything into the message field if the json codec is not specified.
The _jsonparsefailure on the message field, which can occur even after specifying the json codec, can be worked around by parsing explicitly, as follows:
input {
  tcp {
    port => '9563'
  }
}
filter {
  json {
    source => "message"
    target => "myroot"
  }
  json {
    source => "myroot"
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
  }
}
This parses the message field into a proper JSON string in the field myroot, and then myroot is parsed again to yield the JSON fields.
We can then remove the redundant message field like this:
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
}
Try with this one:
filter {
  json {
    source => "message"
    target => "jsoncontent" # nests the parsed fields under jsoncontent
  }
}
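With a target set, the parsed keys end up nested under that field. As a sketch based on the sample lines above, they would then be referenced like [jsoncontent][Property 1], for example:
output {
  # route events based on one of the parsed JSON properties
  if [jsoncontent][Property 1] == "value A" {
    stdout { codec => rubydebug }
  }
}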

Having Logstash reading JSON

I am trying to use logstash for analyzing a file containing JSON objects as follows:
{"Query":{"project_id":"a7565b911f324a9199a91854ea18de7e","timestamp":1392076800,"tx_id":"2e20a255448742cebdd2ccf5c207cd4e","token":"3F23A788D06DD5FE9745D140C264C2A4D7A8C0E6acf4a4e01ba39c66c7c9cbd6a123588b22dc3a24"}}
{"Response":{"result_code":"Success","project_id":"a7565b911f324a9199a91854ea18de7e","timestamp":1392076801,"http_status_code":200,"tx_id":"2e20a255448742cebdd2ccf5c207cd4e","token":"3F23A788D06DD5FE9745D140C264C2A4D7A8C0E6acf4a4e01ba39c66c7c9cbd6a123588b22dc3a24","targets":[]}}
{"Query":{"project_id":"a7565b911f324a9199a91854ea18de7e","timestamp":1392076801,"tx_id":"f7f68c7fb14f4959a1db1a206c88a5b7","token":"3F23A788D06DD5FE9745D140C264C2A4D7A8C0E6acf4a4e01ba39c66c7c9cbd6a123588b22dc3a24"}}
Ideally I'd expect Logstash to understand the JSON.
I used the following config:
input {
  file {
    type => "recolog"
    format => json_event
    # Wildcards work, here :)
    path => [ "/root/isaac/DailyLog/reco.log" ]
  }
}
output {
  stdout { debug => true }
  elasticsearch { embedded => true }
}
I built this file based on this Apache recipe
When running logstash with debug = true, it reads the objects, but it seems to understand only a very basic version of the data, not its structure.
How could I see stats in the Kibana GUI based on my JSON file, for example the number of Query objects, or queries filtered on timestamp?
Thanks in advance
I found out that Logstash will automatically detect JSON by using the codec setting within the file input, as follows:
input {
  stdin {
    type => "stdin-type"
  }
  file {
    type => "prodlog"
    # Wildcards work, here :)
    path => [ "/root/isaac/Mylogs/testlog.log" ]
    codec => json
  }
}
output {
  stdout { debug => true }
  elasticsearch { embedded => true }
}
Then Kibana showed the fields of the JSON perfectly.
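As a side note, the debug => true and embedded => true options used above come from older Logstash releases and have since been removed; on current versions the rough equivalent (a sketch, assuming a local Elasticsearch on localhost:9200) would be:
output {
  stdout { codec => rubydebug }
  elasticsearch { hosts => ["localhost:9200"] }
}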