How to parse multiline JSON in Promtail

I am using log4js to log data to a file in my app. I want to display some of this data in my Grafana dashboard, so I am using Promtail to read the logs from the file, pre-process them, and send them to Loki. In Loki, I want to filter the data by the parsed values.
Here is an example of my logs:
[2023-02-12T04:01:23.587] [DEBUG] default - {
"message_id": 123,
"from": {
"id": 123,
"is_bot": false,
"first_name": "XXX",
"last_name": "XXXXX",
"username": "XXXXX",
"is_premium": true
},
"chat": {
"id": 123,
"title": "XXXXX",
"username": "XXXXX",
"type": "supergroup"
},
"date": 123,
"message_thread_id": 123,
"text": "XXX XXXXX XXXXX"
}
Here is my current Promtail configuration:
server:
  http_listen_port: 80
  grpc_listen_port: 9095
  log_level: debug
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://192.168.1.64:3100/loki/api/v1/push
scrape_configs:
  - job_name: patriotbot
    pipeline_stages:
      - multiline:
          firstline: \[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\]
      - regex:
          expression: '^\[.*\]\s\w*\s-\s(\{.*\})'
          name: log_entry
      - json:
          source: log_entry
          name: only_json
          expressions:
            from_id: 'from.id'
            is_bot: 'from.is_bot'
            first_name: 'from.first_name'
            last_name: 'from.last_name'
            username: 'from.username'
            chat_id: 'chat.id'
            chat_title: 'chat.title'
            chat_type: 'chat.type'
      - output:
          source: only_json
    static_configs:
      - targets:
          - localhost
        labels:
          job: patriot
          type: all
          __path__: /logs/all.log
I have two issues with my current configuration:
My logs are saved as multiple lines, which makes them difficult to parse. I tried to fix this with the multiline stage, and the entries are now displayed properly in Loki, but I am not sure it works for the parsing stages that follow.
The timestamp and metadata in front of the JSON object prevent it from being parsed properly. Should I strip them before the line goes through the rest of the pipeline and gets parsed?
Can somebody suggest changes to my configuration that would allow me to properly parse these multiline logs and extract the relevant data?
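One way to reason about the two issues is to reproduce the parsing outside Promtail. The Python sketch below is an illustration only, not Promtail configuration: it assumes the multiline stage has already joined one entry into a single string, and the names ORIGINAL_RE, ENTRY_RE and extract_fields are hypothetical. It shows that the expression from the config stops matching once the entry contains newlines, because "." does not match a newline by default (true in Python's re and in Go's RE2 syntax alike), and that enabling the (?s) flag allows the prefix to be skipped and the JSON body to be captured in a named group.

```python
import json
import re

# The expression from the config above, unchanged: "." stops at newlines.
ORIGINAL_RE = re.compile(r"^\[.*\]\s\w*\s-\s(\{.*\})")

# Hypothetical variant: (?s) lets "." cross newlines, and the JSON body is
# captured in a named group so later steps can refer to it by name.
ENTRY_RE = re.compile(r"(?s)^\[.*?\]\s\[\w+\]\s\w*\s-\s(?P<log_entry>\{.*\})")


def extract_fields(entry: str) -> dict:
    """Skip the timestamp/level prefix and pull a few values from the JSON body."""
    match = ENTRY_RE.match(entry)
    if match is None:
        return {}
    payload = json.loads(match.group("log_entry"))
    return {
        "from_id": payload["from"]["id"],
        "is_bot": payload["from"]["is_bot"],
        "chat_id": payload["chat"]["id"],
        "chat_type": payload["chat"]["type"],
    }


# A shortened version of the sample entry, as one multiline string.
entry = (
    '[2023-02-12T04:01:23.587] [DEBUG] default - {\n'
    '"message_id": 123,\n'
    '"from": {\n"id": 123,\n"is_bot": false\n},\n'
    '"chat": {\n"id": 123,\n"type": "supergroup"\n}\n'
    '}'
)

print(ORIGINAL_RE.match(entry))  # no match: the closing brace is on another line
print(extract_fields(entry))
```

If the same idea is carried over to the pipeline, the prefix does not need to be removed beforehand; capturing only the JSON body is enough for the next parsing step.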

Related

Problem when parsing JSON with Go JSONPath

I used json_exporter to convert JSON data from the YARN REST API into Prometheus metrics:
https://github.com/prometheus-community/json_exporter.
However, it printed errors:
level=error ts=2021-07-08T06:31:03.712Z caller=collector.go:83 msg="Failed to extract value for metric" path={.capacity} err="capacity is not found" metric="Desc{fqName: "queues_capacity", help: "information on queues", constLabels: {}, variableLabels: [type]}"
I was wondering whether something is wrong in my YAML configuration (for example, in how it addresses the nested JSON) or whether the problem lies in the code itself.
My YAML config is:
metrics:
  - name: queues
    path: "{ .scheduler.schedulerInfo.queues.queue }"
    help: information on queues
    type: object
    labels:
      type: '{.type}'
    values:
      capacity: '{.capacity}'
and part of the json file is:
{
"scheduler": {
"schedulerInfo": {
"type": "capacityScheduler",
"capacity": 100,
"usedCapacity": 1.0526316,
"maxCapacity": 100,
"queueName": "root",
"queues": {
"queue": [
{
"type": "capacitySchedulerLeafQueueInfo",
"capacity": 10,
"usedCapacity": 10.526316,
"maxCapacity": 100,
Use [*] to select every object in the array first, so the path should be:
path: "{ .scheduler.schedulerInfo.queues.queue[*] }"
This follows the JSONPath syntax described here: https://kubernetes.io/docs/reference/kubectl/jsonpath/
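The difference between the two paths can be sketched in plain Python, without any JSONPath library. This is only an illustration of the selection semantics, not how json_exporter is implemented; the function extract_capacity is hypothetical, and the second queue entry is made up because the sample above is truncated.

```python
# Trimmed version of the YARN response; the second queue entry is invented
# for illustration because the original sample is cut off.
response = {
    "scheduler": {
        "schedulerInfo": {
            "type": "capacityScheduler",
            "capacity": 100,
            "queues": {
                "queue": [
                    {"type": "capacitySchedulerLeafQueueInfo", "capacity": 10},
                    {"type": "capacitySchedulerLeafQueueInfo", "capacity": 90},
                ]
            },
        }
    }
}

queue = response["scheduler"]["schedulerInfo"]["queues"]["queue"]

# Like "{ .scheduler.schedulerInfo.queues.queue }": one result, the list itself.
without_wildcard = [queue]

# Like "{ .scheduler.schedulerInfo.queues.queue[*] }": one result per element.
with_wildcard = list(queue)


def extract_capacity(nodes):
    """Read "capacity" from each selected node, as '{.capacity}' would."""
    values = []
    for node in nodes:
        if not isinstance(node, dict) or "capacity" not in node:
            raise KeyError("capacity is not found")
        values.append(node["capacity"])
    return values


print(extract_capacity(with_wildcard))  # [10, 90]
```

Calling extract_capacity(without_wildcard) raises the "capacity is not found" error, because the only selected node is the list rather than a queue object.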

Can Filebeat parse JSON fields instead of the whole JSON object into kibana?

I am able to get a single JSON object in Kibana:
By having this in the filebeat.yml file:
output.elasticsearch:
  hosts: ["localhost:9200"]
How can I get at the individual elements in the JSON string? Say I wanted to compare the "pseudorange" fields of all my JSON objects. How would I:
Select the "pseudorange" field from all my JSON messages to compare them.
Compare them visually in Kibana. At the moment I can't even find the message, let alone the individual fields, in the visualisation tab...
I have heard of people using Logstash to parse the string somehow, but is there no way of doing this with Filebeat alone? If there isn't, what do I do with Logstash to filter the individual fields in the JSON, instead of having my message be one big JSON string that I cannot interact with?
I get the following output from output.console (note that I have put some information in <> to hide it):
{
"@timestamp": "2021-03-23T09:37:21.941Z",
"@metadata": {
"beat": "filebeat",
"type": "doc",
"version": "6.8.14",
"truncated": false
},
"message": "{\n\t\"Signal_data\" : \n\t{\n\t\t\"antenna type:\" : \"GPS\",\n\t\t\"frequency type:\" : \"GPS\",\n\t\t\"position x:\" : 0.0,\n\t\t\"position y:\" : 0.0,\n\t\t\"position z:\" : 0.0,\n\t\t\"pseudorange:\" : 20280317.359730639,\n\t\t\"pseudorange_error:\" : 0.0,\n\t\t\"pseudorange_rate:\" : -152.02620448094211,\n\t\t\"svid\" : 18\n\t}\n}\u0000",
"source": <ip address>,
"log": {
"source": {
"address": <ip address>
}
},
"input": {
"type": "udp"
},
"prospector": {
"type": "udp"
},
"beat": {
"name": <ip address>,
"hostname": "ip-<ip address>",
"version": "6.8.14"
},
"host": {
"name": "ip-<ip address>",
"os": {
<ubuntu info>
},
"id": <id>,
"containerized": false,
"architecture": "x86_64"
},
"meta": {
"cloud": {
<cloud info>
}
}
}
In Filebeat, you can use the decode_json_fields processor to decode a JSON string and add the decoded fields to the root object:
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 2
      target: ""
      overwrite_keys: true
      add_error_key: false
Credit to Val for this. His answer worked; however, as he suggested, my JSON string had a \000 (null terminator) at the end, which made it invalid JSON and prevented the decode_json_fields processor from working as it should.
Upgrading to version 7.12 of Filebeat (make sure Elasticsearch and Kibana are also on 7.12, because mismatched versions can cause issues) allows us to use the script processor: https://www.elastic.co/guide/en/beats/filebeat/current/processor-script.html.
Credit to Val here again; this script removed the null terminator:
- script:
    lang: javascript
    id: trim
    source: >
      function process(event) {
        event.Put("message", event.Get("message").trim());
      }
After the null terminator was removed the decode_json_fields processor did its job as Val suggested and I was able to extract the individual elements of the JSON field which allowed Kibana visualisation to look at the elements I wanted!
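The effect of the trailing null terminator can be reproduced outside Filebeat. The Python sketch below is only an illustration of the two steps (clean the string, then decode it), not Filebeat code; decode_message is a hypothetical helper, and the message is a shortened copy of the one in the console output above. Note that in Python the NUL character has to be stripped explicitly, since it does not count as whitespace.

```python
import json

# Shortened copy of the "message" field from the console output above,
# including the trailing NUL character (\u0000).
raw_message = (
    '{\n\t"Signal_data" : \n\t{\n'
    '\t\t"pseudorange:" : 20280317.359730639,\n'
    '\t\t"svid" : 18\n\t}\n}\u0000'
)


def decode_message(message: str) -> dict:
    """Drop trailing NUL characters and whitespace, then parse the JSON."""
    cleaned = message.rstrip("\x00 \t\r\n")
    return json.loads(cleaned)


try:
    json.loads(raw_message)
except json.JSONDecodeError as err:
    # The NUL after the closing brace makes the string invalid JSON.
    print("undecoded message is rejected:", err.msg)

decoded = decode_message(raw_message)
print(decoded["Signal_data"]["pseudorange:"])
```

Once the string decodes, each key under "Signal_data" is available as a separate value, which is what the individual fields in Kibana correspond to.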

JSON int key issue data e2e (invalid character '1' looking for beginning of object key string)

My app uses Aerospike to store a Map in one of the bins.
I use endly for e2e testing, which uses JSON for data representation.
How do I populate the datastore with JSON where the key needs to be an int?
Since JSON does not allow int keys, I am getting the following error: invalid character '1' looking for beginning of object key string
Here is my data workflow (invoked by the regression workflow):
#data.yaml
defaults:
  datastore: db1
pipeline:
  register:
    action: dsunit:register
    config:
      driverName: aerospike
      descriptor: "tcp([host]:3000)/[namespace]"
      parameters:
        dbname: db1
        namespace: test
        host: 127.0.0.1
        port: 3000
        dateFormat: yyyy-MM-dd hh:mm:ss
  prepare:
    data:
      action: nop
      init:
        - key = data.db.mydaaset
        - mydaaset = $AsTableRecords($key)
    setup:
      action: dsunit:prepare
      URL: regression/db1/data/
      data: $mydaaset
Here is my use case level data:
#mydaaset.json
[
{
"Table": "myDataset",
"Value": [{
"id": "$id",
"created": "$timestamp.yesterday",
"fc":{
1191: "$timestamp.yesterday",
1192: "$timestamp.now",
}
}],
"AutoGenerate": {
"id": "uuid.next"
},
"Key": "${tagId}_myDataset"
}
]
The #mydaaset.json file in your example is invalid JSON, which is why you are getting the
'invalid character '1' looking for beginning of object key string'
parsing error.
To pre-seed your use case test data in Aerospike with a map[int]int bin, you can use the AsInt UDF:
#mydaaset.json
[
{
"Table": "myDataset",
"Value": [{
"id": "$id",
"created": "$timestamp.yesterday",
"fc":{
"$AsInt(1191)": "$timestamp.yesterday",
"$AsInt(1192)": "$timestamp.now"
}
}],
"AutoGenerate": {
"id": "uuid.next"
},
"Key": "${tagId}_myDataset"
}
]
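The underlying rule, that JSON object keys must be strings and any integer conversion has to happen after parsing, can be shown with a short Python sketch. This is only an illustration of the JSON rule, not of how endly or the AsInt UDF work internally; load_int_keyed is a hypothetical helper and the values are placeholders.

```python
import json

# Bare integer keys, as in the original #mydaaset.json: not valid JSON.
invalid_text = '{"fc": {1191: "yesterday", 1192: "now"}}'

# The same data with string keys: valid JSON.
valid_text = '{"fc": {"1191": "yesterday", "1192": "now"}}'


def load_int_keyed(text: str) -> dict:
    """Parse the document, then convert the keys of "fc" to integers."""
    document = json.loads(text)
    return {int(key): value for key, value in document["fc"].items()}


try:
    json.loads(invalid_text)
except json.JSONDecodeError as err:
    print("rejected:", err.msg)

print(load_int_keyed(valid_text))
```

The quoted "$AsInt(1191)" key plays the same role: it keeps the file valid JSON while still asking for an integer key once the data is loaded.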

Extract field from JSON response with Ansible

I have a task which performs a GET request to a page. The response's body is a JSON like the following.
{
"ips": [
{
"organization": "1233124121",
"reverse": null,
"id": "1321411312",
"server": {
"id": "1321411",
"name": "name1"
},
"address": "x.x.x.x"
},
{
"organization": "2398479823",
"reverse": null,
"id": "2418209841",
"server": {
"id": "234979823",
"name": "name2"
},
"address": "x.x.x.x"
}
]
}
I want to extract the fields id and address, and tried (for id field):
tasks:
  - name: get request
    uri:
      url: "https://myurl.com/ips"
      method: GET
      return_content: yes
      status_code: 200
      headers:
        Content-Type: "application/json"
        X-Auth-Token: "0010101010"
      body_format: json
    register: json_response
  - name: copy ip_json content into a file
    copy: content={{json_response.json.ips.id}} dest="/dest_path/json_response.txt"
but I get this error:
the field 'args' has an invalid value, which appears to include a variable
that is undefined. The error was: 'list object' has no attribute 'id'..
Where is the problem?
The error was: 'list object' has no attribute 'id'
json_response.json.ips is a list.
You either need to choose one element (the first?): json_response.json.ips[0].id,
or process the whole list, for example with the map or json_query filters, if you need all ids.
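The two options can be sketched in Python, where the same distinction between a list and its elements applies. This is an illustration of the data access only, not Ansible code; the variable names are hypothetical and the body is a shortened copy of the response from the question.

```python
# Shortened copy of the response body from the question.
body = {
    "ips": [
        {"id": "1321411312", "server": {"id": "1321411", "name": "name1"}, "address": "x.x.x.x"},
        {"id": "2418209841", "server": {"id": "234979823", "name": "name2"}, "address": "x.x.x.x"},
    ]
}

ips = body["ips"]

# Counterpart of json_response.json.ips[0].id: pick one element first.
first_id = ips[0]["id"]

# Roughly what mapping over the "id" attribute yields: one value per element.
all_ids = [ip["id"] for ip in ips]

# Both requested fields, paired per element.
id_and_address = [(ip["id"], ip["address"]) for ip in ips]

print(first_id)
print(all_ids)
```

Indexing the list directly with a field name, as in ips["id"], fails in Python for the same reason the Ansible task does: the list itself has no such field.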
Ansible command to copy to file:
copy:
  content: "{{ output.stdout[0] }}"
  dest: "~/ansible/local/facts/{{ inventory_hostname }}.txt"

How to reuse json in RAML example

I have the following files
user.json
"user": {
"id": 1,
"name": "nameuser",
"online": true,
"profile": {
"photo": "",
"validated": true,
"popular": true,
"suspect": false,
"moderator": false,
"age": "22 ani",
"gender_id": "M"
}
}
profile.raml
displayName: Profile
get:
  description: Get profile data
  queryParameters:
    userId:
      description: The user id for which we are requesting the profile data
      type: integer
      required: true
  responses:
    200:
      body:
        application/json:
          example: |
            {
              "user": !include user.json,
              "details": {
                "friend": true
              }
            }
The user json is present in more examples and I want to reuse it.
I'm using raml2html and it compiles it to
So how do I do this?
I have used parameters successfully in the past. You will not be able to put a parameter inside an included file because RAML views all included files as strings. But you can do something like this in your profile.raml:
example: |
  {
    "user": <<userItem>>,
    "details": {
      "friend": true
    }
  }
The RAML 200 Tutorial has a good explanation and code examples (see the snippets below) on how to declare parameters and then pass them in. I highly recommend reading the entire tutorial, though.
resourceTypes:
  - collection:
      description: Collection of available <<resourcePathName>> in Jukebox.
      get:
        description: Get a list of <<resourcePathName>>.
        responses:
          200:
            body:
              application/json:
                example: |
                  <<exampleCollection>>
/songs:
  type:
    collection:
      exampleCollection: !include jukebox-include-songs.sample