Logstash writes logs from Filebeat to two indexes - json

I installed the ELK stack on a server, and on another server I installed Filebeat to send syslog to the filebeat-[data] indexes, and it works fine.
Now, on the ELK server, I configured another input in Logstash to send a JSON file to the json_data index, and it works fine, but now I find the Filebeat logs in both indexes and I don't understand why.
I want the Filebeat logs only in the filebeat-[data] index and not in the json_data index.
Where am I going wrong?
This is my Logstash conf file:
input {
  file {
    path => "/home/centos/json/test.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    hosts => "http://10.xxx.xxx.xxx:9200"
    index => "json_data"
  }
}
input {
  beats {
    port => 5044
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => "http://10.xxx.xxx.xxx:9200"
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}
I tried different configurations. I also tried deleting json.conf, and in that case Filebeat writes only to the filebeat-[data] index.

For the logs coming from Filebeat to Logstash, you can set the index name in the Filebeat configuration. In this case, Logstash will not populate or manipulate the index name; of course, you also need to remove the index part from Logstash's Filebeat output config.
For the JSON file input, keep the config as is; no need to change anything there.
To set a custom index name in Filebeat, you can refer to: https://www.elastic.co/guide/en/beats/filebeat/current/change-index-name.html
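As background, Logstash merges every config file in the pipeline's path into a single pipeline, so events from both inputs flow through both outputs unless you separate them. If you prefer to keep the routing inside Logstash instead, here is a minimal sketch of that alternative (the "json_file" tag is an assumption added here, not part of the original config; the existing filters stay as they are, though the json filter can be wrapped in the same conditional):
input {
  file {
    path => "/home/centos/json/test.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => ["json_file"]   # hypothetical tag marking this stream
  }
  beats {
    port => 5044
  }
}
output {
  if "json_file" in [tags] {
    elasticsearch {
      hosts => "http://10.xxx.xxx.xxx:9200"
      index => "json_data"
    }
  } else {
    elasticsearch {
      hosts => "http://10.xxx.xxx.xxx:9200"
      index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    }
  }
}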

Related

Error when importing big csv file into ELK by Logstash

Each day I have a big CSV file (about 800 MB, more than 3.5 million rows) that needs to be imported into Elasticsearch by Logstash. Some days it works perfectly, but some days not all of the rows in the CSV file get imported. I watched the log files of Logstash (logstash-plain.log) and Elasticsearch and there are no errors.
Information about the ELK setup: Elasticsearch: 3 nodes. Logstash config:
input {
  file {
    path => "/home/importdata/pack_*.csv"
    start_position => "beginning"
    sincedb_path => "csv_data.db"
  }
}
filter {
  if [path] =~ "pack_" {
    csv {
      separator => ";"
      skip_header => "true"
      columns => ["id","cif","global_id","cus_name","cus_dob","cus_address","cus_email","cus_phone","cus_branch","cus_acct_exec_code","cus_acct_exec_name","cus_branch_name","created_by","created_time","updated_by","updated_time","is_deleted","status","deleted_by","deleted_time","route","client_type","client_group"]
    }
  }
  grok {
    match => [ "path", "/(?<filename>[^/]+).csv" ]
  }
}
output {
  elasticsearch {
    hosts => ["http://ip1:9200","http://ip2:9200","http://ip3:9200"]
    index => "%{filename}"
  }
}
Please help me.
Thanks

Add a header line before each json document

I have a JSON file with 1000 JSON objects.
Is there any way to add a header line before each JSON document? Is there an easy way?
Example: I have 1000 objects like this:
{"id":58,"first_name":"Louis","last_name":"Jordan","email":"ljordan1l@nature.com","gender":"Male","Latitude":"-15.93444","Longitude":"-50.14028"}
I want to add an index header like the one below before every JSON object so that I can use the Elasticsearch Bulk API:
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "unique_id" } }
{"id":58,"first_name":"Louis","last_name":"Jordan","email":"ljordan1l@nature.com","gender":"Male","Latitude":"-15.93444","Longitude":"-50.14028"}
If you are willing to leverage Logstash, you don't need to modify your file: you can simply read it line by line and stream it to ES using the elasticsearch output, which leverages the Bulk API.
Store the following Logstash configuration in a file named es.conf (make sure the file path and ES hosts match your settings):
input {
  file {
    path => "/path/to/your/json"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => "json"
  }
}
filter {
  mutate {
    remove_field => ["@version", "@timestamp"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test"
    document_type => "type1"
    document_id => "%{id}"
  }
}
Then you need to install Logstash, and you'll be able to run the following command to load your JSON file into your ES server:
bin/logstash -f es.conf
I found the best way to add a header line before each JSON document here:
https://stackoverflow.com/a/30899000/5029432

elasticsearch delete documents using logstash and csv

Is there any way to delete documents from Elasticsearch using Logstash and a CSV file?
I read the Logstash documentation, found nothing, and tried a few configs, but nothing happened when using the "delete" action:
output {
  elasticsearch {
    action => "delete"
    host => "localhost"
    index => "index_name"
    document_id => "%{id}"
  }
}
Has anyone tried this? Is there anything special I should add to the input and filter sections of the config? I used the file plugin for input and the csv plugin for filter.
It is definitely possible to do what you suggest, but if you're using Logstash 1.5, you need to use the transport protocol, as there is a bug in Logstash 1.5 when doing deletes over the HTTP protocol (see issue #195).
So if your delete.csv CSV file is formatted like this:
id
12345
12346
12347
And your delete.conf Logstash config looks like this:
input {
  file {
    path => "/path/to/your/delete.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["id"]
  }
}
output {
  elasticsearch {
    action => "delete"
    host => "localhost"
    port => 9300                       # make sure you have this
    protocol => "transport"            # make sure you have this
    index => "your_index"              # replace this
    document_type => "your_doc_type"   # replace this
    document_id => "%{id}"
  }
}
Then when running bin/logstash -f delete.conf you'll be able to delete all the documents whose id is specified in your CSV file.
In addition to Val's answer, I would add that if you have a single input with a mix of deleted and upserted rows, you can handle both, provided you have a flag that identifies the rows to delete. The output > elasticsearch > action parameter can be a field reference, meaning that you can reference a per-row field. Even better, you can move that flag into a metadata field so that it can be used in a field reference without being indexed.
For example, in your filter section:
filter {
  # [deleted] is the name of your field
  if [deleted] {
    mutate {
      add_field => {
        "[@metadata][elasticsearch_action]" => "delete"
      }
    }
    mutate {
      remove_field => [ "deleted" ]
    }
  } else {
    mutate {
      add_field => {
        "[@metadata][elasticsearch_action]" => "index"
      }
    }
    mutate {
      remove_field => [ "deleted" ]
    }
  }
}
Then, in your output section, reference the metadata field:
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "myindex"
    action => "%{[@metadata][elasticsearch_action]}"
    document_type => "mytype"
    document_id => "%{id}"   # needed so the delete action targets the right document
  }
}
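For context, here is a minimal sketch of an input and filter section that could produce the [deleted] field used above (the file path and column names are assumptions, not from the original answer):
input {
  file {
    path => "/path/to/your/upserts_and_deletes.csv"   # hypothetical file
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    # assumed columns: "deleted" holds true/false as text
    columns => ["id", "deleted"]
  }
  mutate {
    # convert the text value to a real boolean so `if [deleted]` behaves as expected
    convert => { "deleted" => "boolean" }
  }
}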

logstash : http input takes only first line (with csv filter)

I'm a newbie to the ELK stack and I'm trying to monitor logs sent through HTTP. I have the Logstash configuration below, but it only reads and sends the first line to Elasticsearch, although I send multiple lines in my HTTP POST request body (I'm using Chrome's DHC plugin to send the HTTP request to Logstash). Please help me read the full data and send it to Elasticsearch.
input {
  http {
    host => "127.0.0.1" # default: 0.0.0.0
    port => 8081        # default: 8080
    threads => 10
  }
}
filter {
  csv {
    separator => ","
    columns => ["posTimestamp","posCode","logLevel","location","errCode","errDesc","detail"]
  }
  date {
    match => ["posTimestamp", "ISO8601"]
  }
  mutate {
    strip => ["posCode", "logLevel", "location", "errCode", "errDesc" ]
    remove_field => [ "path", "message", "headers" ]
  }
}
output {
  elasticsearch {
    protocol => "http"
    host => "localhost"
    index => "temp"
  }
  stdout {
    codec => rubydebug
  }
}
Sample data:
2015-08-24T05:21:40.468,352701060205140,ERROR,Colombo,ERR_01,INVALID_CARD,Testing POS errors
2015-08-24T05:21:41.468,352701060205140,ERROR,Colombo,ERR_01,INVALID_CARD,Testing POS errors
2015-08-24T05:23:40.468,81021320,ERROR,Colombo,ERR_01,INVALID_CARD,Testing POS errors
2015-08-25T05:23:50.468,352701060205140,ERROR,Colombo,ERR_02,TIME_OUT,Testing POS errors
Managed to solve this by adding a split filter:
split {
}
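For reference, a minimal sketch of where that filter would sit in the configuration above: placed at the top of the filter block, split breaks the multi-line HTTP body (the message field, split on newlines by default) into one event per line before the csv filter parses each of them.
filter {
  split {
    # splits [message] on newlines, producing one event per CSV line
  }
  csv {
    separator => ","
    columns => ["posTimestamp","posCode","logLevel","location","errCode","errDesc","detail"]
  }
  # the date and mutate filters from the original config follow unchanged
}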

Logstash csv importing

I'm using Ubuntu 14.04 LTS, Kibana, Logstash and Elasticsearch. I tried the following config to import my CSV file through Logstash, but the data doesn't get picked up.
input {
  file {
    path => "/home/kibana/Downloads/FL_insurance_sample.csv"
    type => "FL_insurance_sample.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["policyID","statecode","country","eq_site_limit","hu_site_limit",
                "fl_sitelimit","fr_site_limit","tiv_2011","tiv_2012","eq_site_deductible",
                "hu_site_deductible","fl_site_deductible","fr_site_deductible","point_latitude",
                "point_longtitude","line","construction","point_granularity"]
    separator => ","
  }
}
output {
  elasticsearch {
    action => "index"
    host => "localhost"
    index => "promosms-%{+dd.MM.YYYY}"
    workers => 1
  }
  stdout {
    codec => rubydebug
  }
}
I even did
sudo service logstash restart
When I went into the index mapping in the Kibana GUI, I chose Logstash-* and couldn't find the data that I wanted.
P.S. my config file is stored in /etc/logstash/conf.d/simple.conf
In your question, you state that you went to Logstash-* in Kibana, but your configuration file says that you are putting data into promosms-%{+dd.MM.YYYY}.
You need to go into Kibana 4's settings section, put [promosms-]DD.MM.YYYY into the "Index name or pattern" box, and check both "Index contains time-based events" and "Use event times to create index names".
Then you might also want to set that as your default index.