I have a Logstash instance that reads JSON events from a RabbitMQ queue:
input {
rabbitmq {
codec => json
...
}
}
I need to have two outputs. The first is a MongoDB output with the entire JSON document (no problem, it works), and the second is another RabbitMQ queue, but there I don't need the entire JSON; I just need a selection of fields.
How can I do that?
Thank you
You'd use a mutate filter with a remove_field entry to remove all of the fields you don't want. If you don't know up front which fields need to be removed, you'll need a ruby filter that iterates over the event and removes anything that isn't in your desired list.
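For the known-fields case, a minimal mutate sketch (the field names here are placeholders):
filter {
  mutate {
    # drop fields you don't want forwarded
    remove_field => [ "unwanted_field_1", "unwanted_field_2" ]
  }
}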
That ruby code would look something like this:
filter {
  ruby {
    code => "
      event.to_hash.each { |k, v|
        if !['a','b','c'].include?(k)
          event.remove(k)
        end
      }
    "
  }
}
where ['a','b','c'] is the list of fields you want to keep.
Or you can use the following codec: https://github.com/payscale/logstash-codec-fieldselect It selects some fields for output. Just like the above but without a ruby filter.
I'm consuming a Kafka topic published by another team (so I have very limited influence over the message format). The message has a field that holds an ARRAY of STRUCTS (an array of objects), but if the array has only one value then it just holds that STRUCT (no array, just an object). I'm trying to transform the message using Confluent KSQL. Unfortunately, I cannot figure out how to do this.
For example:
{ "field": {...} } <-- STRUCT (single element)
{ "field": [ {...}, {...} ] } <-- ARRAY (multiple elements)
{ "field": [ {...}, {...}, {...} ] } <-- ARRAY (multiple elements)
If I configure the field in my message schema as a STRUCT then all messages with multiple values error. If I configure the field in my message schema as an ARRAY then all messages with a single value error. I could create two streams and merge them, but then my error log will be polluted with irrelevant errors.
I've tried capturing this field as a STRING/VARCHAR which is fine and I can split the messages into two streams. If I do this, then I can parse the single value messages and extract the data I need, but I cannot figure out how to parse the multivalue messages. None of the KSQL JSON functions seem to allow parsing of JSON Arrays out of JSON Strings. I can use EXTRACTJSONFIELD() to extract a particular element of the array, but not all of the elements.
Am I missing something? Is there any way to handle this reasonably?
In my experience, this is one use-case where KSQL just doesn't work. You would need to use Kafka Streams or a plain consumer to deserialize the event as a generic JSON type, then check object.get("field").isArray() or isObject(), and handle accordingly.
Even if you used a UDF in KSQL, the STREAM definition would still need to know ahead of time whether you have field ARRAY<?> or field STRUCT<...>.
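With a plain consumer, the equivalent check after deserializing the value could look like this; a minimal Python sketch (the payload shapes come from the question, everything else is illustrative):
import json

def normalize_field(message_value: bytes) -> list:
    # Deserialize the raw message and make "field" always a list,
    # whether the producer sent a single object or an array of objects.
    event = json.loads(message_value)
    field = event.get("field")
    if isinstance(field, list):   # ARRAY variation
        return field
    if isinstance(field, dict):   # single STRUCT variation
        return [field]
    return []

print(normalize_field(b'{"field": {"a": 1}}'))               # [{'a': 1}]
print(normalize_field(b'{"field": [{"a": 1}, {"a": 2}]}'))   # [{'a': 1}, {'a': 2}]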
I finally solved this in a roundabout way...
First, I created an initial stream reading the transaction as a stream of bytes using the KAFKA format instead of the JSON format. This allows me to put a conditional filter on the data so I can fork the stream into a version for the single (STRUCT) variation and a version for the multiple (ARRAY) variation.
The initial stream looks like:
CREATE OR REPLACE STREAM `my-topic-stream` (
id STRING KEY,
data BYTES
)
WITH (
KAFKA_TOPIC='my-topic',
VALUE_FORMAT='KAFKA'
);
Forking that stream looks like this (the second stream, for the multiple variation, uses the same query but filters for IS NOT NULL):
CREATE OR REPLACE STREAM `my-single-stream`
WITH (
kafka_topic='my-single-topic'
) AS
SELECT *
FROM `my-topic-stream`
WHERE JSON_ARRAY_LENGTH(EXTRACTJSONFIELD(FROM_BYTES(data, 'utf8'), '$.field')) IS NULL;
At this point I can create a schema for both variations, explode field, and merge the two streams back together. I don't know if this can be refined to be more efficient, but this successfully processes the transactions as I wanted.
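For completeness, that follow-up could look roughly like this; this is a hedged sketch only, since the question doesn't show the real struct fields (the name/value fields and the my-multi-* / my-single-* / my-merged-* names below are made up). The forked topics are re-read with typed JSON schemas, the ARRAY variation is exploded, and the single variation is inserted into the same output stream:
-- Typed stream over the multi-element fork (hypothetical struct fields):
CREATE OR REPLACE STREAM `my-multi-typed` (
  id STRING KEY,
  field ARRAY<STRUCT<name STRING, value STRING>>
) WITH (
  KAFKA_TOPIC='my-multi-topic',
  VALUE_FORMAT='JSON'
);

-- Typed stream over the single-element fork:
CREATE OR REPLACE STREAM `my-single-typed` (
  id STRING KEY,
  field STRUCT<name STRING, value STRING>
) WITH (
  KAFKA_TOPIC='my-single-topic',
  VALUE_FORMAT='JSON'
);

-- Explode the array variation into one row per element:
CREATE OR REPLACE STREAM `my-merged-stream`
  WITH (KAFKA_TOPIC='my-merged-topic') AS
  SELECT id, EXPLODE(field) AS item
  FROM `my-multi-typed`;

-- Merge the single variation into the same stream:
INSERT INTO `my-merged-stream`
  SELECT id, field AS item
  FROM `my-single-typed`;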
I'm trying to set up a Discord bot that only lets people on a list in a JSON file use it. I'm wondering how to add data to the JSON array/list, but I'm not sure how to move forward and I've had no real luck looking for answers elsewhere.
This is an example of how the JSON file looks:
{
IDs: [
"2359835092385",
"4634637576835",
"3454574836835"
]
}
Now, what I'm looking to do is add a new ID to "IDs" without completely breaking it, and I'd also like to be able to have other entries in the JSON file, such as "AdminIDs" for people who can do more with the bot.
Yes, I know I can do this role-based in guilds/servers, but I'd like to be able to use the bot in DMs as well as in guilds/servers.
What I want/need is a short, simple-to-manipulate script that I can easily put into a new command, so I can add new people to the bot without having to open and edit the JSON file manually.
If you haven't already parsed your data via the json package, you can do the following (json.loads turns a JSON string into a Python dict; json.dumps does the opposite):
import json
json_code = '{"IDs": ["2359835092385", "4634637576835"]}'
parsed_json = json.loads(json_code)
print(parsed_json['IDs'])
Then you can simply use parsed_json['IDs'] like a normal list and append data to it.
All keys must be strings (surrounded by quotes).
In this case the key is "IDs", its value is the list, and the values of the list are the items inside it.
import json
data={
"IDs":[
"2359835092385",
"4634637576835",
"3454574836835"
]
}
Let's say your JSON data comes from a file. To load it so that you can manipulate it, do the following:
with open('filename.json', encoding='utf-8') as raw_json_data:
    j_data = json.load(raw_json_data)  # now j_data is basically the same as data, just with a different name
print(j_data)
# >> {'IDs': ['2359835092385', '4634637576835', '3454574836835']}
To add things inside the list IDs you use the append method
data['IDs'].append('adding something') #or j_data['IDs'].append("SOMEthing")
print(data)
# >> {'IDs': ['2359835092385', '4634637576835', '3454574836835', 'adding something']}
To add a new key
data['Names']=['Jack','Nick','Alice','Nancy']
print(data)
# >> {'IDs': ['2359835092385', '4634637576835', '3454574836835', 'adding something'], 'Names': ['Jack', 'Nick', 'Alice', 'Nancy']}
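Since the goal is to update the file from a bot command rather than editing it by hand, you can write the modified dict back out with json.dump. A minimal sketch, assuming the same filename.json layout as above:
import json

def add_id(new_id, path='filename.json'):
    # load the current contents
    with open(path, encoding='utf-8') as f:
        data = json.load(f)
    # append the new ID and write everything back
    data['IDs'].append(str(new_id))
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=4)

add_id('1234567890123')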
I receive logs from my Spring Boot app via Logback and want to transform several fields into a separate nested JSON element using Logstash filters. I've done that with the mutate filter:
filter {
mutate {
rename => {
"appName" => "[app][name]"
"appNode" => "[app][node]"
"appStatus" => "[app][status]"
... etc
}
}
}
and it works (I look at the result in Kibana).
But it's not convenient to list all those fields (I'm going to have a lot more than three). It'd be great to write a regexp (perhaps) that would match fields starting with 'app' and nest them under the 'app' element.
I know it's possible to achieve this with ruby and grok filters. I don't have enough time to learn ruby to write a script, so I want to know how I can group them using a grok template.
I have read the documentation on their website (https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html) and am still struggling to understand how to use this plugin. My goal was to take a JSON file and cut out everything except for a few particular fields that are spread out around a JSON log. Does the add_field option make it so only the added fields are passed on? If so, how would I specify what to pass on?
The JSON filter is for expanding JSON contained in a field. For reading a JSON file into Logstash you probably want to use the json codec with a file input, something like this:
input {
  file {
    path => "/path/to/file"
    codec => "json"
  }
}
That will read a JSON file into Logstash as one event, or, if the data at its root is a JSON array, multiple events will be created (one per element).
Then you could use a mutate filter to remove the fields you don't want.
mutate {
remove_field => [ "uneeded-field", "my_extraneous_field" ]
}
You could also look into using the community prune filter, which can remove everything except a whitelist.
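A minimal sketch of that approach (the field names are placeholders; whitelist_names takes an array of regular expressions, so anchor them for exact matches, and you may also need to whitelist built-in fields like @timestamp if downstream outputs need them):
filter {
  prune {
    # drop everything except the whitelisted fields
    whitelist_names => [ "^kept_field_1$", "^kept_field_2$" ]
  }
}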
I have a discovery rule which returns a JSON document:
{
"data":[
{"SIZE":9556},
{"SIZE_DIFFERENCE":0.00502302218501465},
{"DUPLICATES":0},
{"TODAY_ZERO_CLPRICE":9556},
{"LISTED_SYMBOLS":true}
]
}
Can I assign the values of these JSON objects to item prototypes? Or handle them in triggers, like "if SIZE < 1, a warning will appear"?
Thank you
The JSON document in the question is not really suitable for low-level discovery.
In that JSON, the data element has five objects, each with a distinct attribute. Something like this would be more appropriate (note the LLD macro syntax):
{
"data":[
{
"{#SIZE}":9556,
"{#SIZE_DIFFERENCE}":0.00502302218501465,
"{#DUPLICATES}":0,
"{#TODAY_ZERO_CLPRICE}":9556,
"{#LISTED_SYMBOLS}":true
}
]
}
If you wish to create items with fixed values, you could probably create calculated items with a constant expression, like so:
{#SIZE}
However, a better approach would be to create trapper items during LLD and send those values separately.
Please see official documentation on low-level discovery and trapper items for more information.