Logstash is not converting JSON correctly

Following is my JSON log file:
[
{
"error_message": " Failed to get line from input file (end of file?).",
"type": "ERROR",
"line_no": "2625",
"file": "GTFplainText.c",
"time": "17:40:02",
"date": "01/07/16",
"error_code": "GTF-00014"
},
{
"error_message": " Bad GTF plain text file header or footer line. ",
"type": "ERROR",
"line_no": "2669",
"file": "GTFplainText.c",
"time": "17:40:02",
"date": "01/07/16",
"error_code": "GTF-00004"
},
{
"error_message": " '???' ",
"type": "ERROR",
"line_no": "2670",
"file": "GTFplainText.c",
"time": "17:40:02",
"date": "01/07/16",
"error_code": "GTF-00005"
},
{
"error_message": " Failed to find 'event source'/'product detail' records for event source '3025188506' host event type 1 valid",
"type": "ERROR",
"line_no": "0671",
"file": "RGUIDE.cc",
"time": "15:43:48",
"date": "06/07/16",
"error_code": "RGUIDE-00033"
}
]
According to my understanding, as the log is already in JSON, we do not need a filter section in the Logstash configuration. Following is my Logstash config:
input {
  file {
    path => "/home/ishan/sf_shared/log_json.json"
    start_position => "beginning"
    codec => "json"
  }
}
and the output configuration is
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
  stdout { codec => rubydebug }
}
But it seems like the data is not going into ES, as I am not able to see the data when I query the index. What am I missing?

I think the problem is that the json codec expects a full JSON message on one line and won't work with a message spread over multiple lines.
A possible workaround would be to use the multiline codec together with the json filter.
The configuration for the multiline codec would be:
multiline {
  pattern => "]"
  negate => "true"
  what => "next"
}
All the lines that do not begin with ] will be regrouped with the next line, so you'll have one full json document to give to the json filter.
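Putting it together, the whole pipeline might look roughly like the sketch below (untested; the sincedb_path, the index name, and the json filter's target/split handling are assumptions, since the document root is an array rather than an object and the [@metadata][beat] fields from the question only exist for input coming from Beats):
input {
  file {
    path => "/home/ishan/sf_shared/log_json.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"          # assumption: re-read the file on every test run
    codec => multiline {
      pattern => "]"
      negate => true
      what => "next"
    }
  }
}
filter {
  json {
    source => "message"
    target => "entries"                  # the root of the document is a JSON array, so a target field is needed
  }
  split {
    field => "entries"                   # optional: emit one event per log entry
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "gtf-logs-%{+YYYY.MM.dd}"   # assumption: a literal index name instead of the unresolved [@metadata] pattern
  }
  stdout { codec => rubydebug }
}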

Related

How to filter only JSON from TEXT and JSON mixed format in logstash

We have input coming from one of the applications in TEXT + JSON format like the below:
<12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - {"event_type":"FilteredWebsites_Event","ipv4":"192.168.0.1","hostname":"9krkvs1","source_uuid":"11160173-r3bc-46cd-9f4e-99f66fc0a4eb","occured":"18-Oct-2022 10:48:37","severity":"Warning","event":"An attempt to connect to URL","target_address":"172.66.43.217","target_address_type":"IPv4","scanner_id":"HTTP filter","action_taken":"Blocked","handled":true,"object_uri":"https://free4pc.org","hash":"0E9ACB02118FBF52B28C3570D47D82AFB82EB58C","username":"CKFCVS1\\some.name","processname":"C:\\Users\\some.name\\AppData\\Local\\Programs\\Opera\\opera.exe","rule_id":"Blocked by internal blacklist"}
that is, <12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - is the TEXT part and the rest is JSON.
The TEXT part is always similar (only the date and time differ), so it is okay even if we delete the whole TEXT part.
The JSON part is random, but it contains useful information.
Currently, in Kibana, the logs appear in the message field, but the separate fields do not appear because of the improper JSON.
We actually tried pushing ONLY the required JSON part by putting it manually into the file, and that gives us the required output in Kibana.
So our question is how to achieve this through Logstash filters/grok.
Update:
@Val - We already have the below configuration:
input {
  syslog {
    port => 5044
    codec => json
  }
}
But the output in Kibana is appearing as [screenshot omitted], and we want it like [screenshot omitted].
Even though syslog seems like an appealing way of shipping data, it is a big mess in terms of standardization and everyone has a different way of shipping data. The Logstash syslog input only supports RFC3164, and your log format doesn't match that standard.
You can still bypass the normal RFC3164 parsing by providing your own grok pattern, as shown below:
input {
  syslog {
    port => 5044
    grok_pattern => "<%{POSINT:priority_key}>%{POSINT:version} %{TIMESTAMP_ISO8601:timestamp} %{HOSTNAME:[observer][hostname]} %{WORD:[observer][name]} %{WORD:[process][id]} - - %{GREEDYDATA:[event][original]}"
  }
}
filter {
  json {
    source => "[event][original]"
  }
}
output {
  stdout { codec => json }
}
Running Logstash with the above config, your sample log line gets parsed as this:
{
"#timestamp": "2022-10-18T10:48:40.163Z",
"#version": "1",
"action_taken": "Blocked",
"event": "An attempt to connect to URL",
"event_type": "FilteredWebsites_Event",
"facility": 0,
"facility_label": "kernel",
"handled": true,
"hash": "0E9ACB02118FBF52B28C3570D47D82AFB82EB58C",
"host": "0:0:0:0:0:0:0:1",
"hostname": "9krkvs1",
"ipv4": "192.168.0.1",
"message": "<12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - {\"event_type\":\"FilteredWebsites_Event\",\"ipv4\":\"192.168.0.1\",\"hostname\":\"9krkvs1\",\"source_uuid\":\"11160173-r3bc-46cd-9f4e-99f66fc0a4eb\",\"occured\":\"18-Oct-2022 10:48:37\",\"severity\":\"Warning\",\"event\":\"An attempt to connect to URL\",\"target_address\":\"172.66.43.217\",\"target_address_type\":\"IPv4\",\"scanner_id\":\"HTTP filter\",\"action_taken\":\"Blocked\",\"handled\":true,\"object_uri\":\"https://free4pc.org\",\"hash\":\"0E9ACB02118FBF52B28C3570D47D82AFB82EB58C\",\"username\":\"CKFCVS1\\\\some.name\",\"processname\":\"C:\\\\Users\\\\some.name\\\\AppData\\\\Local\\\\Programs\\\\Opera\\\\opera.exe\",\"rule_id\":\"Blocked by internal blacklist\"}\n",
"object_uri": "https://free4pc.org",
"observer": {
"hostname": "7VLX5D8",
"name": "ERAServer"
},
"occured": "18-Oct-2022 10:48:37",
"priority": 0,
"priority_key": "12",
"process": {
"id": "14016"
},
"processname": "C:\\Users\\some.name\\AppData\\Local\\Programs\\Opera\\opera.exe",
"rule_id": "Blocked by internal blacklist",
"scanner_id": "HTTP filter",
"severity": "Warning",
"severity_label": "Emergency",
"source_uuid": "11160173-r3bc-46cd-9f4e-99f66fc0a4eb",
"target_address": "172.66.43.217",
"target_address_type": "IPv4",
"timestamp": "2022-10-18T10:48:40.163Z",
"username": "CKFCVS1\\some.name",
"version": "1"
}

Can Filebeat parse JSON fields instead of the whole JSON object into kibana?

I am able to get a single JSON object in Kibana by having this in the filebeat.yml file:
output.elasticsearch:
  hosts: ["localhost:9200"]
How can I get the individual elements in the JSON string? Say I wanted to compare all the "pseudorange" fields of all my JSON objects. How would I:
Select the "pseudorange" field from all my JSON messages to compare them.
Compare them visually in Kibana. At the moment I can't even find the message, let alone the individual fields, in the visualisation tab...
I have heard of people using Logstash to parse the string somehow, but is there no way of doing this simply with Filebeat? If there isn't, then what do I do with Logstash to help filter the individual fields in the JSON, instead of having my message be just one big JSON string that I cannot interact with?
I get the following output from output.console, note I am putting some information in <> to hide it:
"#timestamp": "2021-03-23T09:37:21.941Z",
"#metadata": {
"beat": "filebeat",
"type": "doc",
"version": "6.8.14",
"truncated": false
},
"message": "{\n\t\"Signal_data\" : \n\t{\n\t\t\"antenna type:\" : \"GPS\",\n\t\t\"frequency type:\" : \"GPS\",\n\t\t\"position x:\" : 0.0,\n\t\t\"position y:\" : 0.0,\n\t\t\"position z:\" : 0.0,\n\t\t\"pseudorange:\" : 20280317.359730639,\n\t\t\"pseudorange_error:\" : 0.0,\n\t\t\"pseudorange_rate:\" : -152.02620448094211,\n\t\t\"svid\" : 18\n\t}\n}\u0000",
"source": <ip address>,
"log": {
"source": {
"address": <ip address>
}
},
"input": {
"type": "udp"
},
"prospector": {
"type": "udp"
},
"beat": {
"name": <ip address>,
"hostname": "ip-<ip address>",
"version": "6.8.14"
},
"host": {
"name": "ip-<ip address>",
"os": {
<ubuntu info>
},
"id": <id>,
"containerized": false,
"architecture": "x86_64"
},
"meta": {
"cloud": {
<cloud info>
}
}
}
In Filebeat, you can leverage the decode_json_fields processor in order to decode a JSON string and add the decoded fields into the root object:
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 2
      target: ""
      overwrite_keys: true
      add_error_key: false
Credit to Val for this. His answer worked; however, as he suggested, my JSON string had a \000 at the end, which stops it being valid JSON and prevented the decode_json_fields processor from working as it should...
Upgrading to version 7.12 of Filebeat (also ensure version 7.12 of Elasticsearch and Kibana because mismatched versions between them can cause issues) allows us to use the script processor: https://www.elastic.co/guide/en/beats/filebeat/current/processor-script.html.
Credit to Val here again, this script removed the null terminator:
- script:
    lang: javascript
    id: trim
    source: >
      function process(event) {
        event.Put("message", event.Get("message").trim());
      }
After the null terminator was removed, the decode_json_fields processor did its job as Val suggested, and I was able to extract the individual elements of the JSON field, which allowed Kibana visualisations to look at the elements I wanted!
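For reference, the Logstash route mentioned in the question would presumably look something like the sketch below (untested; the udp port is an assumption, and the trailing \000 byte still has to be stripped before the json filter can parse the message):
input {
  udp {
    port => 5555   # assumption: whatever port the device is sending to
  }
}
filter {
  mutate {
    # remove the trailing null terminator mentioned above
    gsub => [ "message", "\u0000", "" ]
  }
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
  stdout { codec => rubydebug }
}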

What logstash filter plugin to use for Elasticsearch?

I'm having trouble using Logstash to bring the following raw data into Elasticsearch. The raw data is abstracted below; I was hoping the JSON plugin would work, but it currently does not. I've viewed other posts regarding JSON to no avail.
{
"offset": "stuff",
"results": [
{
"key": "value",
"key1": null,
"key2": null,
"key3": "true",
"key4": "value4",
"key4": [],
"key5": value5,
"key6": "value6",
"key7": "value7",
"key8": value8,
"key9": "value9",
"key10": null,
"key11": null,
"key12": "value12",
"key13": "value13",
"key14": [],
"key15": "key15",
"key16": "value16",
"key17": "value17",
"key18": "value18",
"key19": "value19"
},
{
"key20": "value20",
"key21": null,
"key22": null,
"key23": "value23",
"key24": "value24",
<etc.>
My current conf file:
input {
  file {
    codec => multiline
    {
      pattern => '^\{'
      negate => true
      what => previous
    }
    #type => "json"
    path => <my path>
    sincedb_path => "/dev/null"
    start_position => "beginning"
  }
}
#filter
#{
#  json {
#    source => message
#    remove_field => message
#  }
#}
filter
{
  mutate
  {
    replace => [ "message", "%{message}}" ]
    gsub => [ 'message','\n','']
  }
  if [message] =~ /^{.*}$/
  {
    json { source => message }
  }
}
output {
  #stdout { codec => rubydebug }
  stdout { codec => json }
}
I get a long error that I can't read since it's full of
" \"key10\": null,\r \"key11\": \"value11\",\r
etc.
Does anyone know what I'm doing wrong or how to better see my error? This is valid json but maybe I'm using my regex for multiline codec wrong.
Can you use a different input plugin than file? Parsing a JSON file as multiline may be problematic; if possible, use a plugin with a JSON codec.
In the file input, you can set a real sincedb_path where Logstash can write.
In the line where you replace message, you have one curly bracket } too many:
replace => [ "message", "%{message}}" ]
I would write the output to Elasticsearch instead of stdout; of course for testing you don't have to, but when you write the output to Elasticsearch you can see the index being created and use Kibana to check whether the content is to your liking.
output {
  elasticsearch {
    hosts => "localhost"
    index => "stuff-%{+xxxx.ww}"
  }
}
I use these curl commands to read from Elasticsearch:
curl -s -XGET 'http://localhost:9200/_cat/indices?v&pretty'
and
curl -s -XGET 'http://localhost:9200/stuff*/_search?pretty=true'
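Putting those points together, the configuration might end up looking roughly like this (a sketch only: the sincedb location is an arbitrary writable path, the file path is left as the placeholder from the question, and the replace line is dropped entirely since removing the extra brace would turn it into a no-op):
input {
  file {
    codec => multiline
    {
      pattern => '^\{'
      negate => true
      what => previous
    }
    path => <my path>
    sincedb_path => "/var/lib/logstash/sincedb_stuff"   # assumption: any location Logstash can write to
    start_position => "beginning"
  }
}
filter {
  mutate {
    # the replace with the extra closing brace is gone; just strip the newlines
    gsub => [ 'message','\n','']
  }
  if [message] =~ /^{.*}$/ {
    json { source => message }
  }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "stuff-%{+xxxx.ww}"
  }
}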

React get JavaScript Object from JSON file?

The issue is that I'm trying to get a JavaScript object like the following:
[
"id" : 11,
"name" : "Peter"
"other": {
"id": 22,
"item": 534
},
"main": false
]
I want to get this via ReactJS, so I'm trying to do this:
http.get(API.BASE_URL + API.USER_INFO)
.accept('Application/json')
.end((err, res) => {
//console.log(x);
console.log(err);
console.log(res);
});
When I try a normal json string I get the right result, but with this javascript object I get:
Error: Parser is unable to parse the response
undefined
Has anyone come across this before? Any idea?
What you're trying to parse isn't valid JSON (nor valid JavaScript) because you've written it out as an array but still use key/value pairs as if it were an object. Try this instead:
{
"id": 11,
"name": "Peter",
"other": {
"id": 22,
"item": 534
},
"main": false
}

Logstash - won't parse JSON

I want to parse data into Elasticsearch using Logstash. So far this has worked great, but when I try to parse JSON files, Logstash just won't do... anything. I can start Logstash without any exceptions, but it won't parse anything.
Is there something wrong with my config? The path to the JSON file is correct.
my JSON:
{
"stats": [
{
"liveStatistic": {
"#scope": "21",
"#scopeType": "foo",
"#name": "minTime",
"#interval": "60",
"lastChange": "2011-01-11T15:19:53.259+02:00",
"start": "2011-01-18T14:19:48.333+02:00",
"unit": "s",
"value": 10
}
},
{
"liveStatistic": {
"#scope": "26",
"#scopeType": "bar",
"#name": "newCount",
"#interval": "60",
"lastChange": "2014-01-11T15:19:59.894+02:00",
"start": "2014-01-12T14:19:48.333+02:00",
"unit": 1,
"value": 5
}
},
...
]
}
my Logstash agent config:
input {
  file {
    path => "/home/me/logstash-1.4.2/values/stats.json"
    codec => "json"
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    host => localhost
    protocol => "http"
  }
  stdout {
    codec => rubydebug
  }
}
You should add the following line to your input:
start_position => "beginning"
Also put the complete document on one line and maybe add {} around your document to make it a valid json document.
Okay, two things:
First, the file input is by default set to start reading at the end of the file. If you want the file to start reading at the beginning, you will need to set start_position. Example:
file {
  path => "/mypath/myfile"
  codec => "json"
  start_position => "beginning"
}
Second, keep in mind that Logstash keeps a sincedb file which records how many lines of a file you have already read (so as to not parse information repeatedly!). This is usually a desirable feature, but for testing over a static file (which is what it looks like you're trying to do) you will want to work around this. There are two ways I know of.
One way is you can just make a new copy of the file every time you want to run logstash, and remember to tell logstash to read from that file.
The other way is to go and delete the sincedb file, wherever it is located. You can tell Logstash where to write the sincedb file with the sincedb_path setting.
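For example, pointing sincedb_path at /dev/null (as one of the configs earlier on this page does) makes Logstash forget its read position between runs, which is handy for this kind of static-file test; a sketch based on the file input above:
file {
  path => "/home/me/logstash-1.4.2/values/stats.json"
  codec => "json"
  start_position => "beginning"
  sincedb_path => "/dev/null"   # discard the read position so the file is re-read on every run
}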
I hope this all helped!