Using Logstash to strip XSSI prefix from JSON response

I have a fairly simple problem, but it's confusing me. I'm trying to use Logstash to fetch Gerrit data via the REST API. I'm using http_poller, and my configuration gets the right response, so I'm almost there.
Now I need to strip the XSSI prefix )]}' from the start of Gerrit's JSON response. The question is how: should I strip, split, or mutate it, or is there another way to proceed?
My input configuration:
input {
http_poller {
urls => {
gerrit_projects => {
method => get
url => "http://url.to/gerrit/a/projects/"
headers => { Accept => "application/json" }
auth => { user => "userid" password => "supresecret" }
}
}
target => "http_poller_data"
metadata_target => "http_poller_metadata"
request_timeout => 60
interval => 60
}
}
filter {
if [http_poller_metadata] {
mutate {
add_field => {
"http_poller_host" => "%{http_poller_metadata[host]}"
"http_poller" => "%{http_poller_metadata[name]}"
}
}
}
if [http_poller_metadata][runtime_seconds] and [http_poller_metadata][runtime_seconds] > 0.5 {
mutate { add_tag => "slow_request" }
}
if [http_request_failure] or [http_poller_metadata][code] != 200 {
mutate { add_tag => "bad_request" }
}
}
output {
stdout { codec => rubydebug }
}
And parts of the response:
Pipeline main started
JSON parse failure. Falling back to plain-text {:error=>#<LogStash::Json::ParserError: Unexpected character (')' (code 41)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at ... (bunch of lines)...
{
"http_poller_data" => {
"message" => ")]}'\n{\"All-Users\":{\"id\":\"All-Users\",....(more valid JSON)...",
"tags" => [
[0] "_jsonparsefailure"
],
"@version" => "1",
"@timestamp" => "2016-12-13T09:48:25.397Z"
},
"@version" => "1",
"@timestamp" => "2016-12-13T09:48:25.397Z",
"http_poller_metadata" => { ... }
This is my first question to StackOverflow. Thank you for being kind with your answers!

You can use the mutate filter with its gsub option to remove the )]}' prefix:
mutate {
gsub => [
"message", "\)]}'", ""
]
}
But gsub replaces all occurrences of the regex, so you have to be sure that the pattern appears only once.
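A slightly more defensive variant anchors the pattern to the start of the string, so only the leading prefix is removed, and then hands the cleaned text to the json filter. This is only a sketch under assumptions: the poller is set to codec => "plain" so the body arrives unparsed, the raw body sits in [http_poller_data][message] as in the rubydebug output in the question, and the target name gerrit_projects is made up for illustration.

```conf
filter {
  mutate {
    # \A anchors the match to the very start of the field, so a later
    # ")]}'" inside the JSON body would be left alone.
    gsub => [ "[http_poller_data][message]", "\A\)\]\}'\s*", "" ]
  }
  json {
    source => "[http_poller_data][message]"
    # hypothetical field name for the parsed project map
    target => "gerrit_projects"
  }
}
```

With the anchor in place, the concern about the pattern occurring more than once should no longer apply, because at most one match is possible.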

I use "sed 1d" to remove the ")]}'" prefix and "jq" to process the JSON output. For example, to get the state of a Gerrit project I execute:
curl -s --header 'Content-Type:application/json' --request GET --netrc https://<GERRIT-SERVER>/a/projects/?r=<GERRIT-PROJECT> | sed 1d | jq --raw-output ".[] | .state"
ACTIVE

Related

How to use JSON filter plugin to separate message in JSON in logstash?

This is my logstash conf file.
input {
http_poller{
urls =>{
urlname =>"http://ivivaanywhere.ivivacloud.com/api/Asset/Asset/All?apikey=SC:demo:64a9aa122143a5db&max=10&last=0"
}
request_timeout =>60
schedule => {every => "20s"}
codec => "line"
}
}
filter {
json {
source => "message"
}
}
output {
elasticsearch {
hosts => "http://127.0.0.1:9200"
index => "apilogs1"
}
stdout { codec => rubydebug }
}
I need to separate the JSON in "message" into fields to show in Kibana.
The JSON message looks something like this:
[{"AssetID":"12341234","AssetCategoryKey":"50","Description":"Test AC Asset","OperationalStatus":"Operational","OperationalStatusChangeComment":"","InstalledLocationKey":"5","Make":"","Model":"","SerialNumber":"","BarCode":"","InstalledDate":"","CommissionedDate":"","Ownership":"","IsMobile":"0","ParentAssetKey":"","PurchasedDate":"","CurrentAmount":"","CurrentDepreciationAmount":"","UpdatedTime":"","PurchasedAmount":"","SalvageValue":"","DisposalDate":"","WarrantyExpiry":"","WarrantyStatus":"0","ClassKey":"","Specification":"","OwnerKey":"0","OwnerType":"","AssigneeAddedDate":"","AssigneeKey":"","AssigneeType":"","IsSold":"0","IsBackup":"0","CurrentLocationKey":"","Manufacturer_VendorKey":"","Supplier_VendorKey":"","EndofUsefullLifeDate":"","Hidden":"0","CreatedDateTime":"20200430:124909","CreatedUserKey":"141","ModifiedDateTime":"","ModifiedUserKey":"","IsLocked":"0","LockedUserKey":"","LockedDateTime":"","AssetKey":"389","ObjectKey":"389","__key__":"389","ObjectID":"12341234","InstalledLocationName":"Singapore.Office","AssetCategoryID":"Access
Change the input codec to "json":
input {
http_poller{
urls =>{
urlname =>"http://ivivaanywhere.ivivacloud.com/api/Asset/Asset/All?apikey=SC:demo:64a9aa122143a5db&max=10&last=0"
}
request_timeout =>60
schedule => {every => "20s"}
codec => "json"
}
}
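If the response has to stay as plain text on the way in, a rough alternative is to parse it in the filter stage instead. Because the body is a JSON array rather than an object, the json filter needs a target to store the parsed value in, and the split filter can then emit one event per array element. The field name assets is an assumption for illustration, and the sketch assumes the whole body arrives as a single event in message (codec => "plain" rather than "line").

```conf
filter {
  json {
    source => "message"
    # hypothetical field that receives the parsed array
    target => "assets"
  }
  split {
    # one output event per element of the array
    field => "assets"
  }
}
```

After the split, each event carries a single asset object under assets, which Kibana can show as individual fields.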

Remove characters from JSON

I'm trying to parse some JSON with Logstash. The file I want to ingest currently has the following structure (simplified):
-4: {"audit":{"key1":"value1","key2":"value2"}}
-4: {"audit":{"key1":"value1","key2":"value2"}}
So I need to remove the -4: prefix in order to parse the file properly as JSON. Unfortunately, I cannot use the json codec for the input plugin, because the input is not in proper JSON format. My requirements for the pipeline are therefore:
Remove the -4: prefix
Decode the event as JSON
Do the proper mutation
I have tried with the following pipeline, which gives me a parse error:
input {
tcp {
port => 5001
codec => multiline {
pattern => "^-\d:."
what => previous
}
#codec => json_lines
type => "raw_input"
}
}
filter {
if [type] == "raw_input" {
mutate {
gsub => ["message", "^-\d:.", ""]
}
}
json {
source => "message"
}
mutate {
convert => { "[audit][sequenceNumber]" => "integer" }
add_field => { "test" => "%{[audit][sequenceNumber]}"}
}
}
output {
file {
path => "/var/log/logstash/debug-output.log"
codec => line { format => "%{message}" }
}
}
Is it possible to achieve this with Logstash? Any suggestions on how to do it?
I would use the dissect filter:
if [type] == "raw_input" {
dissect {
mapping => {
"message" => "-%{num}: %{msg}"
}
}
}
json {
source => "msg"
}
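Put together with the mutation from the question, the three required steps (strip the prefix, decode the JSON, mutate) could be sketched as one filter block. This assumes each record arrives as a single-line event, and it drops the temporary num and msg fields once parsing succeeds; the [audit][sequenceNumber] field is taken from the question's own config and does not appear in the simplified sample.

```conf
filter {
  if [type] == "raw_input" {
    # step 1: split "-4: {...}" into the number and the JSON text
    dissect {
      mapping => { "message" => "-%{num}: %{msg}" }
    }
  }
  # step 2: decode the JSON text into event fields
  json {
    source => "msg"
    # only applied when parsing succeeds
    remove_field => [ "msg", "num" ]
  }
  # step 3: type conversion on the decoded fields
  mutate {
    convert => { "[audit][sequenceNumber]" => "integer" }
  }
}
```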

Symfony - ELK - interpret json logged across Monolog

I'm logging and analysing my logs with the ELK stack on a Symfony 3 application.
From the Symfony application I want to log JSON objects that might be fairly deep.
Is there any way to make Kibana interpret my JSON as JSON and not as a string?
Here is an example of how I'm logging:
$this->logger->notice('My log message', array(
'foo' => 'bar',
'myDeepJson1' => $deepJson1,
'myDeepJson2' => $deepJson2
));
And here is my logstash.conf. I used the Symfony pattern that I found here: https://github.com/eko/docker-symfony
input {
redis {
type => "symfony"
db => 1
key => monolog
data_type => ['list']
host => "redis"
port => 6379
}
}
filter {
if [type] == "symfony" {
grok {
patterns_dir => "./patterns"
match => [ "message", "%{SYMFONY}" ]
}
date {
match => [ "date", "YYYY-MM-dd HH:mm:ss" ]
}
if [log_type] == "app" {
json {
source => "log_context"
}
}
}
}
output {
if [type] == "symfony" {
elasticsearch {
hosts => ["172.17.0.1:9201"]
index => "azureva-logstash"
}
}
}
Actually, everything I'm logging ends up in the log_context variable, but Monolog transforms the array into JSON, so my $deepJson variables are double-encoded, and there's no way to log a multidimensional array in the context...
Any help would be appreciated. Thanks!
Once you have retrieved requestJson from log_context, you have to replace every \" with " in order to be able to parse it with the json plugin.
You can do that with the mutate filter and its gsub option.
After that, you can parse the resulting JSON with the json plugin.
You can update your configuration with:
if [log_type] == "app" {
json {
source => "log_context"
}
mutate {
gsub => ["requestJson", '\\"', '"']
}
json {
source => "requestJson"
target => "requestJsonDecode"
}
}

How can I break up json data with logstash and kibana

I have a log file with a bunch of lines of json data. For example, here is one line:
{"name":"sampleApplicationName","hostname":"sampleHostName","pid":000000,"AppModule":"sampleAppModuleName","msg":"testMessage","time":"2016-02-23T19:33:10.468Z","v":0}
I want Logstash to break up the different components of the JSON string so that I can create visualizations in Kibana based on them. I have tried playing around with the indexer file and tried countless variations, using both the json filter and grok patterns, but I can't get anything to work. Any help is much appreciated.
Below is an example config that I use. Try pasting your JSON line into the command prompt to check that it works.
input {
stdin {}
}
filter {
json {
source => "message"
}
mutate {
add_field => {
"[@metadata][tenant-id]" => "%{[tenant-id]}"
"[@metadata][data-type]" => "%{[data-type]}"
"[@metadata][data-id]" => "%{[data-id]}"
}
}
if [data-type] == "build" {
mutate {
add_field => { "[@metadata][action]" => "index" }
}
}
}
output {
stdout { codec => rubydebug { metadata => true } }
file { path => "/tmp/jenkins-logstash.log" }
elasticsearch {
action => "%{[@metadata][action]}"
hosts => "XXX:9200"
index => "tenant-%{[@metadata][tenant-id]}"
document_type => "builds"
document_id => "%{[@metadata][data-id]}"
workers => 1
}
}

Logstash delete type and keep _type

I have a logstash client and server.
The client sends log files to the server through Logstash's udp output, and the server also runs Logstash to receive these logs. On the server, I have a json filter that extracts the JSON-formatted message into fields of the actual log event, so that Elasticsearch can index them.
Here is my code from the server:
input{
udp{}
}
filter{
json {
source => "message"
}
}
output{
elasticsearch{
}
}
And from the client:
input{
file{
type => "apache-access"
path => "/var/log/apache2/access.log"
}
}
output{
udp{
host => "192.168.0.3"
}
}
This code works fine except for one thing:
Somehow I get the type field twice, once as type and once as _type, with the same content.
I've tried to delete the type field with the mutate filter like this:
mutate{
remove_field => [ "type" ]
}
but this filter removes both type fields (the _type field is then set to its default, logs).
How can I keep the _type field and remove the type field?
It works for me this way:
input {
file {
add_field => { "[@metadata][type]" => "apache-access" }
path => "/var/log/apache2/access.log"
}
}
filter {
......
if [@metadata][type] == "xxx" {
}
......
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "logstash-%{+YYYY.MM.dd}"
document_type => "%{[@metadata][type]}"
}
}
See also: @metadata and document_type