I've got a question regarding JSON in Logstash.
I have a JSON input that looks something like this:
{
"2": {
"name": "name2",
"state": "state2"
},
"1": {
"name": "name1",
"state": "state1"
},
"0": {
"name": "name0",
"state": "state0"
}
}
Now, let's say I want to add fields in the Logstash config:
json {
source => "message"
add_field => {
"NAME" => "%{ What to write here ?}"
"STATE" => "%{ What to write here ?}"
}
}
Is there a way to access the JSON input so that I get a field with value name1, another with value name2, and a third with value name0? The outer key in the JSON changes, meaning there can be one entry or many more, so I don't want to hardcode it like
%{[0][name]}
Thanks for your help.
If you remove all newlines in your input, you can simply use the json filter; you don't need any add_field action.
Working config without newlines:
filter {
json { source => message }
}
If you can't remove the new lines in your input you need to merge the lines with the multiline codec.
Working config with new lines:
input {
file {
path => ["/path/to/your/file"] # I suppose your input is a file.
start_position => "beginning"
sincedb_path => "/dev/null" # just for testing
codec => multiline {
pattern => "^}"
what => "previous"
negate => "true"
}
}
}
filter {
mutate { replace => { "message" => "%{message}}" } }
json { source => message }
}
I suppose that you use the file input. In case you don't, just change it.
Output (for both):
"2" => {
"name" => "name2",
"state" => "state2"
},
"1" => {
"name" => "name1",
"state" => "state1"
},
"0" => {
"name" => "name0",
"state" => "state0"
}
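If you additionally want top-level NAME and STATE fields no matter what the outer keys are (which is what the question asks for), one hedged option is a ruby filter. This is only a sketch: it assumes the Logstash 5+ event API and uses the field names from the question:
filter {
  ruby {
    # Sketch: collect name/state from every hash-valued top-level key,
    # since the outer keys ("0", "1", ...) are not known in advance.
    code => '
      names  = []
      states = []
      event.to_hash.each do |key, value|
        next unless value.is_a?(Hash) && value.key?("name")
        names  << value["name"]
        states << value["state"]
      end
      event.set("NAME",  names)  unless names.empty?
      event.set("STATE", states) unless states.empty?
    '
  }
}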
Related
I have some JSON data sent into my Logstash filter and I wish to mask secrets so they don't appear in Kibana. My log looks like this:
{
"payloads":
[
{
"sequence": 1,
"request":
{
"url": "https://hello.com",
"method": "POST",
"postData": "{\"one:\"1\",\"secret:"THISISSECRET",\"username\":\"hello\",\"secret2\":\"THISISALSOSECRET\"}",
},
"response":
{
"status": 200,
}
}
],
...
My filter converts the payloads to payload and I then wish to mask the JSON in postData to be:
"postData": "{\"one:\"1\",\"secret\":\"[secret]\",\"username\":\"hello\",\"secret2\":\"[secret]\"}"
My filter now looks like this:
if ([payloads]) {
split {
field => "payloads"
target => "payload"
remove_field => ["payloads"]
}
}
# innerTmp is set to JSON here - this works
json {
source => "innerTmp"
target => "parsedJson"
if [parsedJson][secret] =~ /.+/ {
remove_field => [ "secret" ]
add_field => { "secret" => "[secret]" }
}
if [parsedJson][secret2] =~ /.+/ {
remove_field => [ "secret2" ]
add_field => { "secret2" => "[secret]" }
}
}
Is this a correct approach? I cannot see the filter replacing my JSON key/values with "[secret]".
Kind regards /K
The approach is good, but you are using the wrong field.
After the split, the secret field is part of postData, and that field is part of parsedJson (note that remove_field and add_field have to live inside a filter such as mutate):
if [parsedJson][postData][secret] {
  mutate { remove_field => [ "[parsedJson][postData][secret]" ] }
  mutate { add_field => { "[parsedJson][postData][secret]" => "[secret]" } }
}
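Note that in the sample log postData is itself a JSON-encoded string, so [parsedJson][postData][secret] only exists if postData is parsed as well. If it stays a string, one hedged alternative is to mask the values in place with mutate's gsub; this is just a sketch, with the field path and key names taken from the question:
filter {
  mutate {
    # Sketch: blank out the secret values inside the postData string.
    # The patterns assume the key/value layout shown in the log sample.
    gsub => [
      "[parsedJson][postData]", '"secret":"[^"]*"',  '"secret":"[secret]"',
      "[parsedJson][postData]", '"secret2":"[^"]*"', '"secret2":"[secret]"'
    ]
  }
}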
I am new to Logstash and I need help with the following JSON log format:
{
"field1" :[
{
"sub_field1": {
"sub_field2":"value X"
"sub_field3": {"sub_field4":"value Y"}
}
"sub_field5":"value W"
}
]
}
I want to know how to get value X, value Y and value W using mutate's add_field.
Thanks in advance!
Assuming that you will only have one array element under field1, it's just:
add_field => {
sub_field1 => '%{[field1][0][sub_field1]}'
sub_field2 => '%{[field1][0][sub_field1][sub_field2]}'
...
}
A good way to test this -- create a file called test.json
{ "field1" :[ { "sub_field1": { "sub_field2":"value X","sub_field3": {"sub_field4":"value Y"} }, "sub_field5":"value W" } ] }
Create a config file like test.conf:
input {
stdin { codec => 'json_lines' }
}
filter {
mutate {
add_field => {
sub_field1 => '%{[field1][0][sub_field1]}'
sub_field2 => '%{[field1][0][sub_field1][sub_field2]}'
}
}
}
output {
stdout { codec => "rubydebug" }
}
And then run it: cat test.json | bin/logstash -f test.conf
You'll get output like this:
{
"field1" => [
[0] {
"sub_field5" => "value W",
"sub_field1" => {
"sub_field3" => {
"sub_field4" => "value Y"
},
"sub_field2" => "value X"
}
}
],
"#timestamp" => 2020-02-17T17:26:59.471Z,
"#version" => "1",
"host" => "xxxxxxxx",
"sub_field2" => "value X",
"sub_field1" => "{\"sub_field3\":{\"sub_field4\":\"value Y\"},\"sub_field2\":\"value X\"}",
"tags" => []
}
Which shows sub_field2 and sub_field1.
If you don't have predictable field names then you'll need to resort to a ruby filter or something like that. And if you have more than one element that you need to pull out, you'll need to use a strategy that's discussed in the comments here: https://stackoverflow.com/a/45493411/2785358
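For illustration, a ruby filter sketch (assuming the Logstash 5+ get/set event API) that promotes whatever keys exist under the first field1 element to top-level fields:
filter {
  ruby {
    # Sketch: copy every key of the first field1 element to the top level,
    # so the field names don't have to be known in advance.
    code => '
      first = event.get("[field1][0]")
      first.each { |k, v| event.set(k, v) } if first.is_a?(Hash)
    '
  }
}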
I want to save CSV data in Elasticsearch using Logstash and end up with the following result:
"my_field": [{"col1":"AAA", "col2": "BBB"},{"col1":"CCC", "col2": "DDD"}]
So, it's important that CSV data gets saved as the array [...] in a specific document.
However, I get this result:
"path": "path/to/csv",
"#timestamp": "2017-09-22T11:28:59.143Z",
"#version": "1",
"host": "GT-HYU",
"col2": "DDD",
"message": "CCC,DDD",
"col1": "CCC"
It looks like only the last CSV row gets saved (because of overwriting). I tried using document_id => "1" in Logstash, but that is obviously what causes the overwriting. How can I save the data as an array?
Also, I don't understand how to define that the data gets saved in my_field.
input {
file {
path => ["path/to/csv"]
sincedb_path => "/dev/null"
start_position => beginning
}
}
filter {
csv {
columns => ["col1","col2"]
separator => ","
}
if [col1] == "col1" {
drop {}
}
}
output {
stdout { codec => rubydebug }
elasticsearch {
action => "update"
hosts => ["127.0.0.1:9200"]
index => "my_index"
document_type => "my_type"
document_id => "1"
workers => 1
}
}
I have an ELK stack that receives from filebeat structured JSON logs like these:
{"what": "Connected to proxy service", "who": "proxy.service", "when": "03.02.2016 13:29:51", "severity": "DEBUG", "more": {"host": "127.0.0.1", "port": 2004}}
{"what": "Service registered with discovery", "who": "proxy.discovery", "when": "03.02.2016 13:29:51", "severity": "DEBUG", "more": {"ctx": {"node": "igz0", "ip": "127.0.0.1:5301", "irn": "proxy"}, "irn": "igz0.proxy.827378e7-3b67-49ef-853c-242de033e645"}}
{"what": "Exception raised while setting service value", "who": "proxy.discovery", "when": "03.02.2016 13:46:34", "severity": "WARNING", "more": {"exc": "ConnectionRefusedError('Connection refused',)", "service": "igz0.proxy.827378e7-3b67-49ef-853c-242de033e645"}}
The "more" field which is a nested JSON is broken down (not sure by what part of the stack) to different fields ("more.host", "more.ctx" and such) in kibana.
This is my beats input:
input {
beats {
port => 5044
}
}
filter {
if [type] == "node" {
json {
source => "message"
add_field => {
"who" => "%{name}"
"what" => "%{msg}"
"severity" => "%{level}"
"when" => "%{time}"
}
}
} else {
json {
source => "message"
}
}
date {
match => [ "when" , "dd.MM.yyyy HH:mm:ss", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"]
}
}
And this is my output:
output {
elasticsearch {
hosts => ["localhost"]
sniffing => true
manage_template => false
index => "%{[#metadata][beat]}-%{+YYYY.MM.dd}"
document_type => "%{[#metadata][type]}"
}
stdout { codec => rubydebug }
}
Is there any way of making a field which will contain the entire "more" field without breaking it apart?
You should be able to use a ruby filter to take the hash and convert it back into a string.
filter {
ruby {
code => "event['more'] = event['more'].to_s"
}
}
You'd probably want to surround it with an if to make sure that the field exists first.
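Note that on Logstash 5.x and later the event can no longer be accessed like a plain hash, so the equivalent with the newer get/set API would be something like:
filter {
  if [more] {
    ruby {
      # Same idea, using the Logstash 5+ event API.
      code => "event.set('more', event.get('more').to_s)"
    }
  }
}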
I am using the latest ELK stack (Elasticsearch 1.5.2, Logstash 1.5.0, Kibana 4.0.2) and I have a question.
sample.json:
{ "field1": "This is value1", "field2": "This is value2" }
logstash.conf:
input {
stdin{ }
}
filter {
json {
source => "message"
add_field =>
{
"field1" => "%{field1}"
"field2" => "%{field2}"
}
}
}
output {
stdout { codec => rubydebug }
elasticsearch {
host => "localhost"
index => "scan"
}
}
Output:
{
"message" => "{ \"field1\": \"This is value1\", \"field2\": \"This is value2\" }",
"#version" => "1",
"#timestamp" => "2015-05-07T06:02:56.088Z",
"host" => "myhost",
"field1" => [
[0] "This is value1",
[1] "This is value1"
],
"field2" => [
[0] "This is value2",
[1] "This is value2"
]
}
My questions are: 1) Why do the fields appear twice in the result? 2) If there is a nested array, how should it be referenced in the Logstash configuration?
Thanks a lot!
..Petera
I think you have misunderstood what the json filter does. When you process a field through the json filter it will look for field names and corresponding values.
In your example, you have done that with this part:
filter {
json {
source => "message"
Then you have added a field called "field1" with the content of the field "field1". Since the field already exists, you have just added the same information to the field that was already there, so it has now become an array:
add_field =>
{
"field1" => "%{field1}"
"field2" => "%{field2}"
}
}
}
If you simplify your code to the following you should be fine:
filter {
json {
source => "message"
}
}
I suspect your question about arrays becomes moot at this point, as you probably don't need the nested array and therefore won't need to address it, but in case you do, I believe you can reference elements like so:
[field1][0]
[field1][1]
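For example, a minimal sketch that copies the first array element into its own field (the target name first_field1 is just an illustration):
filter {
  mutate {
    add_field => { "first_field1" => "%{[field1][0]}" }
  }
}
As in the rubydebug output shown earlier, referencing a whole object this way stores it as a JSON-encoded string.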