Logstash - splitting the log into a csv file - csv

I want to use Logstash to pick out the relevant logs by a constant value that appears in them, then split each log on the separator ("|") and write it to a CSV file with headers. The logs I'm looking for are identified by the constant (WID2). I also noticed that the message extracted by GREEDYDATA gets cut off after about 85 characters.
Example log:
2022-01-02 10:32:30,0000001 | WID2 | 3313141414 | Request | STEP_1 | OK | Message
From these logs I want to create a CSV file with the headers: TIMESTAMP, VALUE, MESSAGE_TYPE, STEP, STATUS, MESSAGE. I do not want to save the constant value (WID2) in the CSV file; it only serves to find my logs among others.
I wrote it but it doesn't work:
input {
  file {
    path => ["path"]
    start_position => "beginning"
    sincedb_path => "path"
  }
}
filter {
  grok {
    match => {
      "message" => "%{GREEDYDATA:SYSLOGMESSAGE}"
    }
  }
  if ([SYSLOGMESSAGE] !~ "WID2") {
    drop {}
  }
  if ([SYSLOGMESSAGE] =~ 'WID2") {
    csv {
      separator => "|"
      columns => ["TIMESTAMP", "VALUE", "MESSAGE_TYPE", "STEP", "STATUS", "MESSAGE"]
    }
  }
}
output {
  file {
    path => ["path.csv"]
  }
}

If your log messages have this format:
2022-01-02 10:32:30,0000001 | WID2 | 3313141414 | Request | STEP_1 | OK | Message
and you want to parse every message that contains WID2, then the following filter will work:
filter {
  if "WID2" in [message] {
    csv {
      separator => "|"
      columns => ["TIMESTAMP", "[@metadata][wid2]", "VALUE", "MESSAGE_TYPE", "STEP", "STATUS", "MESSAGE"]
    }
  } else {
    drop {}
  }
}
The if conditional tests whether WID2 is present in the message; if it is, the csv filter parses it. Since the second column of your csv is the value WID2 and you do not want to save it, you can store it in the field [@metadata][wid2]; metadata fields are not passed on to the output block.
If the string WID2 is not present in the message field, the event is dropped.
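The parsed fields can then be written out as CSV as well. A minimal sketch of the output side, assuming the logstash-output-csv plugin is installed (the path is a placeholder, and you would typically have to add the header row yourself, e.g. by pre-creating the file):
output {
  csv {
    # absolute path to the target file (placeholder)
    path => "/path/to/output.csv"
    # column order of each written row
    fields => ["TIMESTAMP", "VALUE", "MESSAGE_TYPE", "STEP", "STATUS", "MESSAGE"]
  }
}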

Related

upload pytest json report to logstash

I have JSON files for test results generated by pytest using the pytest-json plugin, so each JSON file contains the results of one test run. I want to upload this single JSON file into Elasticsearch through Logstash. But when I try it with the Logstash conf file below, it splits the JSON file and posts it as multiple docs in Elasticsearch, where I expect it to be uploaded as only one doc. Because of this split my results data is distributed across multiple docs and gets corrupted.
logstash conf:
input {
  file {
    start_position => "beginning"
    path => "home/report.json"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "demo_ds"
  }
}
Configure your file input with a multiline codec:
input {
  file {
    path => "/home/user/report.json"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => multiline {
      pattern => "^Spalanzani"
      negate => true
      what => "previous"
      auto_flush_interval => 2
    }
  }
}
The codec takes every line that does not match the regular expression ^Spalanzani (i.e., it takes every line) and combines them into one event. The auto_flush_interval is required because otherwise it will wait forever for a line that does match ^Spalanzani.
Note that a file input only accepts absolute paths.
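Since the codec leaves the combined file contents in the message field, the json filter from the original config can then parse the whole report as a single document; it is unchanged from the question:
filter {
  json {
    # parse the combined multi-line JSON string into fields of one event
    source => "message"
  }
}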

elasticsearch delete documents using logstash and csv

Is there any way to delete documents from ElasticSearch using Logstash and a csv file?
I read the Logstash documentation and found nothing, and I tried a few configs, but nothing happened when using action => "delete":
output {
  elasticsearch {
    action => "delete"
    host => "localhost"
    index => "index_name"
    document_id => "%{id}"
  }
}
Has anyone tried this? Is there anything special that I should add to the input and filter sections of the config? I used the file plugin for input and the csv plugin for the filter.
It is definitely possible to do what you suggest, but if you're using Logstash 1.5, you need to use the transport protocol, as there is a bug in Logstash 1.5 when doing deletes over the HTTP protocol (see issue #195).
So if your delete.csv CSV file is formatted like this:
id
12345
12346
12347
And your delete.conf Logstash config looks like this:
input {
  file {
    path => "/path/to/your/delete.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["id"]
  }
}
output {
  elasticsearch {
    action => "delete"
    host => "localhost"
    port => 9300                        # <--- make sure you have this
    protocol => "transport"             # <--- make sure you have this
    index => "your_index"               # <--- replace this
    document_type => "your_doc_type"    # <--- replace this
    document_id => "%{id}"
  }
}
Then, when running bin/logstash -f delete.conf, you'll be able to delete all the documents whose ids are listed in your CSV file.
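On recent Logstash versions the transport protocol no longer exists and deletes over HTTP work fine, so a modern equivalent of that output would look roughly like this (the index name is a placeholder; document_type has been removed in current versions):
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    action => "delete"
    index => "your_index"      # replace with your index name
    document_id => "%{id}"     # the id column parsed from the CSV by the csv filter
  }
}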
In addition to Val's answer, I would add that if you have a single input that has a mix of deleted and upserted rows, you can do both if you have a flag that identifies the ones to delete. The output > elasticsearch > action parameter can be a "field reference," meaning that you can reference a per-row field. Even better, you can change that field to a metadata field so that it can be used in a field reference without being indexed.
For example, in your filter section:
filter {
  # [deleted] is the name of your field
  if [deleted] {
    mutate {
      add_field => {
        "[@metadata][elasticsearch_action]" => "delete"
      }
    }
    mutate {
      remove_field => [ "deleted" ]
    }
  } else {
    mutate {
      add_field => {
        "[@metadata][elasticsearch_action]" => "index"
      }
    }
    mutate {
      remove_field => [ "deleted" ]
    }
  }
}
Then, in your output section, reference the metadata field:
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "myindex"
    action => "%{[@metadata][elasticsearch_action]}"
    document_type => "mytype"
  }
}
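For this to work with a CSV input, the flag has to be parsed into the event first. A minimal sketch of that part, assuming a hypothetical deleted column holding true/false in the file:
filter {
  csv {
    # hypothetical column layout: the document id plus the delete flag
    columns => ["id", "deleted"]
  }
  mutate {
    # turn the "true"/"false" string into a real boolean so that if [deleted] behaves as expected
    convert => { "deleted" => "boolean" }
  }
  # ...followed by the if [deleted] logic shown above
}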

Working on JSON based logs using logstash

I have a log which contains entries in the following format:
{ "start_time" : "12-May-2011", "name" : "this is first heading", "message" : "HELLO this is first message" }
{ "start_time" : "13-May-2011", "name" : "this is second heading", "message" : "HELLO this is second message" }
{ "start_time" : "14-May-2011", "name" : "this is third heading", "message" : "HELLO this is third message" }
...
I am new to Logstash. I currently have an app that writes these log entries as JSON strings, one below the other, into that file (say at /root/applog/scheduler.log).
I'm looking for some help on how to parse this JSON from the logs into separate fields and print them to stdout. What should the conf file look like?
Note: the idea is to later use it in Kibana for visualization.
Example config:
input {
  file {
    path => ["/root/applog/scheduler.log"]
    codec => "json"
    start_position => "beginning" # If your file already exists
  }
}
filter { } # Add filters here (optional)
output {
  elasticsearch { } # pass the output to ES to prepare visualization with kibana
  stdout { codec => "rubydebug" } # If you want to see the result in stdout
}
Logstash includes a json codec that will split your json into fields for you.
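If you would rather keep the raw line and parse it explicitly, the same result can be achieved with the json filter instead of the codec; a minimal, equivalent sketch:
input {
  file {
    path => ["/root/applog/scheduler.log"]
    start_position => "beginning"
  }
}
filter {
  json {
    # parse the JSON string in the message field into top-level fields
    source => "message"
  }
}
output {
  stdout { codec => "rubydebug" }
}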

How can I remove field which are nil in CSV file

My CSV file produces events containing fields which are nil, like this:
{ "message" => [
[0] "m_FRA-LIENSs-R2012-1;\r"
],
"#version" => "1",
"#timestamp" => "2015-05-24T13:51:14.735Z",
"host" => "debian",
"SEXTANT_UUID" => "m_FRA-LIENSs-R2012-1",
"SEXTANT_ALTERNATE_TITLE" => nil
}
How can I remove all of these: the messages and the fields?
Here is my CSV file
SEXTANT_UUID|SEXTANT_ALTERNATE_TITLE
a1afd680-543c | ZONE_ENJEU
4b80d9ad-e59d | ZICO
800d640f-1f82 |
I want to delete the last line. I used the ruby filter, but it doesn't work! It removes just the field, not the entire message.
If you configure your Ruby filter like this, it will work:
filter {
  # let ruby check all fields of the event and remove any empty ones
  ruby {
    code => "event.to_hash.delete_if {|field, value| value.blank? }"
  }
}
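On newer Logstash versions event.to_hash returns a copy, so deleting from it does not change the event itself, and blank? may not be available inside the ruby filter. A variant that removes empty fields directly from the event might look like this (a sketch, not tested against your data):
filter {
  ruby {
    code => "
      event.to_hash.each do |field, value|
        # drop fields whose value is nil or an empty / whitespace-only string
        event.remove(field) if value.nil? || (value.is_a?(String) && value.strip.empty?)
      end
    "
  }
}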
I used if ([message] =~ "^;") { drop { } } and it works, for the CSV file.

json filter fails with >#<NoMethodError: undefined method `[]' for nil:NilClass>

I'm trying to process entries from a logfile that contains both plain messages and json formatted messages. My initial idea was to grep for messages enclosed in curly braces and have them processed by another chained filter. Grep works fine (as does plain message processing), but the subsequent json filter reports an exception. I attached the logstash configuration, input and error message below.
Do you have any ideas what the problem might be? Any alternative suggestions for processing plain and json formatted entries from the same file?
Thanks a lot,
Johannes
Error message:
Trouble parsing json {:key=>"@message", :raw=>"{\"time\":\"14.08.2013 10:16:31:799\",\"level\":\"DEBUG\",\"thread\":\"main\",\"clazz\":\"org.springframework.beans.factory.support.DefaultListableBeanFactory\",\"line\":\"214\",\"msg\":\"Returning cached instance of singleton bean 'org.apache.activemq.xbean.XBeanBrokerService#0'\"}", :exception=>#<NoMethodError: undefined method `[]' for nil:NilClass>, :level=>:warn}
logstash conf:
input {
  file {
    path => [ "plain.log" ]
    type => "plainlog"
    format => "plain"
  }
}
filter {
  # Grep json formatted messages and send them to following json filter
  grep {
    type => "plainlog"
    add_tag => [ "grepped_json" ]
    match => [ "@message", "^{.*}" ]
  }
  json {
    tags => [ "grepped_json" ]
    source => "@message"
  }
}
output {
  stdout { debug => true debug_format => "json" }
  elasticsearch { embedded => true }
}
Input from logfile (just one line):
{"time":"14.08.2013 10:16:31:799","level":"DEBUG","thread":"main","clazz":"org.springframework.beans.factory.support.DefaultListableBeanFactory","line":"214","msg":"Returning cached instance of singleton bean 'org.apache.activemq.xbean.XBeanBrokerService#0'"}
I had the same problem and solved it by adding a target to the json filter.
The documentation does say the target is optional but apparently it isn't.
Changing your example, you should have:
json {
  tags => [ "grepped_json" ]
  source => "@message"
  target => "data"
}