Fluentd wraps JSON log lines in another JSON object

I'm not a fluentd expert, but my source uses this parser:
<source>
  ...
  path ..../myFile.log
  <parse>
    @type json
  </parse>
</source>
The log file contains simple JSON, one object per line:
{ "id": 33, "name": "myName", "myData": { "etc": "etc", "etc1": "etc1" } }
...
This works: each line of my log is sent from the source to the output via an HTTP JSON POST. But fluentd adds another wrapper, so the receiver gets:
{{ "id": 33, "name": "myName", "myData": { "etc": "etc", "etc1": "etc1" }}}
which causes a parsing error on the other side.
Is there a way to avoid that?
I've tried with "Unescape_Key ON" but it still wraps it...
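One place the extra wrapper can come from is the output side rather than the parser. If the output is Fluentd's built-in http plugin (v1.7+), its <format> section controls how each record is serialized; below is a minimal sketch, in which the tag name and the endpoint URL are made-up placeholders:

<source>
  @type tail
  path ..../myFile.log
  tag my.logs              # placeholder tag
  <parse>
    @type json
  </parse>
</source>

<match my.logs>
  @type http
  endpoint http://receiver.example.com/logs   # placeholder receiver URL
  <format>
    # serialize the parsed record itself as the request body
    @type json
  </format>
</match>

Alternatively, if the receiver should get each raw line exactly as it appears in the file, one can skip parsing entirely (@type none keeps the whole line in a message field) and format the output with @type single_value, which sends just that field's value.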

Related

How to filter only JSON from TEXT and JSON mixed format in logstash

We have input coming from one of the applications in a mixed TEXT + JSON format, like below:
<12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - {"event_type":"FilteredWebsites_Event","ipv4":"192.168.0.1","hostname":"9krkvs1","source_uuid":"11160173-r3bc-46cd-9f4e-99f66fc0a4eb","occured":"18-Oct-2022 10:48:37","severity":"Warning","event":"An attempt to connect to URL","target_address":"172.66.43.217","target_address_type":"IPv4","scanner_id":"HTTP filter","action_taken":"Blocked","handled":true,"object_uri":"https://free4pc.org","hash":"0E9ACB02118FBF52B28C3570D47D82AFB82EB58C","username":"CKFCVS1\\some.name","processname":"C:\\Users\\some.name\\AppData\\Local\\Programs\\Opera\\opera.exe","rule_id":"Blocked by internal blacklist"}
That is, <12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - is the TEXT part and the rest is JSON.
The TEXT part is always similar; only the date and time differ, so it is fine even if we drop the whole TEXT part.
The JSON part varies, but it contains the useful information.
Currently, in Kibana, the logs appear in the message field, but the separate fields do not appear because the JSON is not parsed properly.
We verified that pushing ONLY the required JSON part (by placing it in the file manually) gives us the required output in Kibana.
So our question is: how can we achieve this through Logstash filters/grok?
Update:
@Val - we already have the configuration below:
input {
  syslog {
    port => 5044
    codec => json
  }
}
But the output in Kibana appears as one unparsed message (screenshot omitted), and we want the individual JSON fields broken out instead (screenshot omitted).
Even though syslog seems like an appealing way of shipping data, it is a big mess in terms of standardization, and everyone ships data in a different way. The Logstash syslog input only supports RFC3164, and your log format (which resembles RFC5424) doesn't match that standard.
You can still bypass the normal RFC3164 parsing by providing your own grok pattern, as shown below:
input {
  syslog {
    port => 5044
    # parse the syslog header ourselves and keep the JSON payload in [event][original]
    grok_pattern => "<%{POSINT:priority_key}>%{POSINT:version} %{TIMESTAMP_ISO8601:timestamp} %{HOSTNAME:[observer][hostname]} %{WORD:[observer][name]} %{WORD:[process][id]} - - %{GREEDYDATA:[event][original]}"
  }
}
filter {
  # parse the JSON payload captured above into top-level fields
  json {
    source => "[event][original]"
  }
}
output {
  stdout { codec => json }
}
Running Logstash with the above config, your sample log line gets parsed as this:
{
  "@timestamp": "2022-10-18T10:48:40.163Z",
  "@version": "1",
  "action_taken": "Blocked",
  "event": "An attempt to connect to URL",
  "event_type": "FilteredWebsites_Event",
  "facility": 0,
  "facility_label": "kernel",
  "handled": true,
  "hash": "0E9ACB02118FBF52B28C3570D47D82AFB82EB58C",
  "host": "0:0:0:0:0:0:0:1",
  "hostname": "9krkvs1",
  "ipv4": "192.168.0.1",
  "message": "<12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - {\"event_type\":\"FilteredWebsites_Event\",\"ipv4\":\"192.168.0.1\",\"hostname\":\"9krkvs1\",\"source_uuid\":\"11160173-r3bc-46cd-9f4e-99f66fc0a4eb\",\"occured\":\"18-Oct-2022 10:48:37\",\"severity\":\"Warning\",\"event\":\"An attempt to connect to URL\",\"target_address\":\"172.66.43.217\",\"target_address_type\":\"IPv4\",\"scanner_id\":\"HTTP filter\",\"action_taken\":\"Blocked\",\"handled\":true,\"object_uri\":\"https://free4pc.org\",\"hash\":\"0E9ACB02118FBF52B28C3570D47D82AFB82EB58C\",\"username\":\"CKFCVS1\\\\some.name\",\"processname\":\"C:\\\\Users\\\\some.name\\\\AppData\\\\Local\\\\Programs\\\\Opera\\\\opera.exe\",\"rule_id\":\"Blocked by internal blacklist\"}\n",
  "object_uri": "https://free4pc.org",
  "observer": {
    "hostname": "7VLX5D8",
    "name": "ERAServer"
  },
  "occured": "18-Oct-2022 10:48:37",
  "priority": 0,
  "priority_key": "12",
  "process": {
    "id": "14016"
  },
  "processname": "C:\\Users\\some.name\\AppData\\Local\\Programs\\Opera\\opera.exe",
  "rule_id": "Blocked by internal blacklist",
  "scanner_id": "HTTP filter",
  "severity": "Warning",
  "severity_label": "Emergency",
  "source_uuid": "11160173-r3bc-46cd-9f4e-99f66fc0a4eb",
  "target_address": "172.66.43.217",
  "target_address_type": "IPv4",
  "timestamp": "2022-10-18T10:48:40.163Z",
  "username": "CKFCVS1\\some.name",
  "version": "1"
}
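If the redundant copies bother you (the raw line is kept in both message and [event][original]), the json filter also accepts the common remove_field option, which is applied only when parsing succeeds; a small optional tweak:

filter {
  json {
    source => "[event][original]"
    # dropped only if the JSON parses successfully
    remove_field => ["message", "[event][original]"]
  }
}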

Can Filebeat parse JSON fields instead of the whole JSON object into Kibana?

I am able to get a single JSON object into Kibana by having this in the filebeat.yml file:
output.elasticsearch:
  hosts: ["localhost:9200"]
How can I get at the individual elements in the JSON string? Say I wanted to compare the "pseudorange" fields of all my JSON objects. How would I:
Select the "pseudorange" field from all my JSON messages to compare them.
Compare them visually in Kibana. At the moment I can't even find the message, let alone the individual fields, in the visualisation tab...
I have heard of people using Logstash to parse the string somehow, but is there no way of doing this simply with Filebeat? If there isn't, what do I do with Logstash to pull out the individual fields of the JSON, instead of having my message be one big JSON string that I cannot interact with?
I get the following output from output.console; note that I am putting some information in <> to hide it:
"#timestamp": "2021-03-23T09:37:21.941Z",
"#metadata": {
"beat": "filebeat",
"type": "doc",
"version": "6.8.14",
"truncated": false
},
"message": "{\n\t\"Signal_data\" : \n\t{\n\t\t\"antenna type:\" : \"GPS\",\n\t\t\"frequency type:\" : \"GPS\",\n\t\t\"position x:\" : 0.0,\n\t\t\"position y:\" : 0.0,\n\t\t\"position z:\" : 0.0,\n\t\t\"pseudorange:\" : 20280317.359730639,\n\t\t\"pseudorange_error:\" : 0.0,\n\t\t\"pseudorange_rate:\" : -152.02620448094211,\n\t\t\"svid\" : 18\n\t}\n}\u0000",
"source": <ip address>,
"log": {
"source": {
"address": <ip address>
}
},
"input": {
"type": "udp"
},
"prospector": {
"type": "udp"
},
"beat": {
"name": <ip address>,
"hostname": "ip-<ip address>",
"version": "6.8.14"
},
"host": {
"name": "ip-<ip address>",
"os": {
<ubuntu info>
},
"id": <id>,
"containerized": false,
"architecture": "x86_64"
},
"meta": {
"cloud": {
<cloud info>
}
}
}
In Filebeat, you can leverage the decode_json_fields processor in order to decode a JSON string and add the decoded fields to the root object:
processors:
  - decode_json_fields:
      fields: ["message"]   # the field holding the JSON string
      process_array: false
      max_depth: 2
      target: ""            # merge the decoded fields into the root of the event
      overwrite_keys: true
      add_error_key: false
Credit to Val for this. His answer worked; however, as he suggested, my JSON string had a \000 (null terminator) at the end, which stops it from being valid JSON and prevented the decode_json_fields processor from working as it should...
Upgrading to version 7.12 of Filebeat (and ensuring Elasticsearch and Kibana are also on version 7.12, because mismatched versions between them can cause issues) allows us to use the script processor: https://www.elastic.co/guide/en/beats/filebeat/current/processor-script.html.
Credit to Val here again; this script removed the null terminator:
- script:
    lang: javascript
    id: trim
    source: >
      function process(event) {
        event.Put("message", event.Get("message").trim());
      }
After the null terminator was removed, the decode_json_fields processor did its job as Val suggested, and I was able to extract the individual elements of the JSON field, which let Kibana visualisations look at the elements I wanted!
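For completeness, a sketch of the two processors combined; Filebeat runs processors in the order they are defined, so the trim must come before the JSON decode:

processors:
  - script:
      lang: javascript
      id: trim
      source: >
        function process(event) {
          event.Put("message", event.Get("message").trim());
        }
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true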

Docker logs interpreting JSON logs as string

Our Go server outputs logs to stdout in JSON, each line looking something like the following:
{"time": "2017-06-01 14:00:00", "message": "Something happened", "level": "DEBUG"}
Our docker-compose setup uses the standard json-file logging driver, which wraps each line in a log field as an escaped string, like so:
{
  "log": "{\"time\": \"2017-06-01 14:00:00\", \"message\": \"Something happened\", \"level\": \"DEBUG\"}",
  "timestamp": "<the time>",
  ...more fields...
}
But we don't want the log field to be an escaped string; we want it embedded as JSON at the same level:
{
  "log": {
    "time": "2017-06-01 14:00:00",
    "message": "Something happened",
    "level": "DEBUG"
  },
  "timestamp": "<the time>",
  ...more fields...
}
Can this be achieved?
It looks like this can't be done at the Docker level. But I can convert the JSON string to actual JSON in Filebeat, which we are using to pass logs to Kibana and Elastalert. To do that, I used the decode_json_fields option under processors in filebeat.yml.
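For reference, a minimal sketch of what that processor entry might look like here; the field name log matches the json-file driver's wrapper field, and the rest is an assumption to adapt:

processors:
  - decode_json_fields:
      fields: ["log"]   # the field the json-file logging driver puts the raw line in
      max_depth: 1
      # no "target" set: by default the decoded object replaces the string field,
      # so "log" becomes a real JSON object instead of an escaped string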

Parse.com File storage creating JSON with key:property - "__type":"File"

After downloading a Parse class, I found that it stores a file-type column as:
{ "results": [
{
"createdAt": "2015-10-27T15:06:37.324Z",
"file": {
"__type": "File",
"name": "uniqueidentifier1-filename.ext",
"url": "http://files.parsetfss.com/example-file-url.png"
},
"objectId": "8eBlOHHchQ",
"updatedAt": "2015-10-27T15:06:37.324Z"
},
{
"createdAt": "2015-10-27T14:35:02.853Z",
"file": {
"__type": "File",
"name": "uniqueidentifier2-filename.ext",
"url": "http://files.parsetfss.com/example-file-url.png"
},
"objectId": "B2tg7tBsHL",
"updatedAt": "2015-10-27T14:35:02.853Z"
}] }
For an app, I need to locally construct a JSON class like this and then manually upload it to the Parse app. So I first save the file to Parse, then get the file name and file URL via file.name() and file.url(), and then construct an object like this:
object.file.name = file.name();
object.file.url = file.url();
This works fine and sets the url and name keys as expected. However, if I then do
object.file['__type'] = 'file'
the object.file object gets converted into some weird Parse file object, and console.log(object) gives (notice the extra underscore and no __type key):
file: b.File
  _name: "uniqueidentifier1-filename.ext"
  _url: "http://files.parsetfss.com/example-file-url.png"
but console.log(object.file) properly gives
Object {url: "http://files.parsetfss.com/example-file-url.png", name: "uniqueidentifier1-filename.ext", __type: "File"}
Saving the object to a text file also gives the same result as console.log(object). However, I want the text file to match how Parse actually stores it, so that I can then upload the text file to a Parse class.
In JavaScript, call the toJSON() function on your Parse object, which returns a JSON object suitable for saving to Parse.
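A minimal sketch of that approach applied to the asker's case; it assumes the Parse JavaScript SDK with an already-saved Parse.File in the variable file:

// "file" is a Parse.File that has already been saved to Parse
var object = {};
object.file = file.toJSON();   // yields {"__type": "File", "name": "...", "url": "..."}
console.log(JSON.stringify(object, null, 2));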

Extracting data from a JSON file

I have a large JSON file that looks similar to the code below. Is there any way I can iterate through each object, look for the field "element_type" (it is not present in all objects in the file, if that matters), and write each object with the same element type out to a file? For example, each user would end up in a file called user.json and each book in a file called book.json.
I thought about using JavaScript, but to my knowledge JS can't write to files. I also tried to do it with Linux command-line tools by removing all newlines, inserting a newline after each "}," and then iterating through each line to find the element type and write it to a file. This worked for most of the data; however, for objects like the "problem_type" below, it inserted a newline in the middle of the data because of the nested JSON in the "times" element. I've run out of ideas at this point.
{
  "data": [
    {
      "element_type": "user",
      "first": "John",
      "last": "Doe"
    },
    {
      "element_type": "user",
      "first": "Lucy",
      "last": "Ball"
    },
    {
      "element_type": "book",
      "name": "someBook",
      "barcode": "111111"
    },
    {
      "element_type": "book",
      "name": "bookTwo",
      "barcode": "111111"
    },
    {
      "element_type": "problem_type",
      "name": "problem object",
      "times": "[{\"start\": \"1230\", \"end\": \"1345\", \"day\": \"T\"}, {\"start\": \"1230\", \"end\": \"1345\", \"day\": \"R\"}]"
    }
  ]
}
I would recommend Java for this purpose. It sounds like you're running on Linux, so it should be a good fit.
You'll have no problems writing to files, and you can use a library like json-lib (http://json-lib.sourceforge.net/) to get access to JSONArray and JSONObject. With those you can easily iterate through the data in your JSON, check what's in "element_type", and write each object to the right file accordingly.
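A rough sketch of that approach using json-lib; the input path input.json is a placeholder, and the code assumes the top-level {"data": [...]} structure shown in the question:

import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

import net.sf.json.JSONArray;
import net.sf.json.JSONObject;

public class SplitByElementType {
    public static void main(String[] args) throws IOException {
        // Read the whole file into a string; "input.json" is a placeholder path.
        String text = new String(Files.readAllBytes(Paths.get("input.json")));
        JSONArray data = JSONObject.fromObject(text).getJSONArray("data");

        // Group objects by "element_type", skipping objects that don't have it.
        Map<String, JSONArray> buckets = new HashMap<String, JSONArray>();
        for (int i = 0; i < data.size(); i++) {
            JSONObject obj = data.getJSONObject(i);
            if (!obj.containsKey("element_type")) {
                continue;
            }
            String type = obj.getString("element_type");
            if (!buckets.containsKey(type)) {
                buckets.put(type, new JSONArray());
            }
            buckets.get(type).add(obj);
        }

        // Write each group to <element_type>.json, e.g. user.json, book.json.
        for (Map.Entry<String, JSONArray> entry : buckets.entrySet()) {
            FileWriter writer = new FileWriter(entry.getKey() + ".json");
            writer.write(entry.getValue().toString());
            writer.close();
        }
    }
}

Note that the nested JSON inside the "times" string is left untouched: a real JSON parser treats it as an ordinary string value, which is exactly what the newline-splitting approach got wrong.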