How to include multiple JSON fields when using JSON logging with SLF4J? - logback

I'm working with Dropwizard 1.3.2, which does logging using SLF4J over Logback. I am writing logs for ingestion into Elasticsearch, so I thought I'd use JSON logging and build some Kibana dashboards. But I really want more than one JSON field per log message: if I am recording a status update with ten fields, I would ideally like to log the object and have its fields show up as top-level entries in the JSON log. I did get MDC working, but that is very clumsy and doesn't flatten objects.
That has turned out to be difficult! How can I do it? I have it logging in JSON, but I can't nicely log multiple JSON fields.
Things I've done:
My Dropwizard configuration has this appender:
appenders:
  - type: console
    target: stdout
    layout:
      type: json
      timestampFormat: "ISO_INSTANT"
      prettyPrint: false
      appendLineSeparator: true
      additionalFields:
        keyOne: "value one"
        keyTwo: "value two"
      flattenMdc: true
The additional fields show up, but their values seem to be fixed in the configuration file and don't change. There is a "customFieldNames" option, but there is no documentation on how to use it, and no matter what I put in there I get a "no String-argument constructor/factory method to deserialize from String value" error. The docs show an example value of "#timestamp" but no explanation, and even that generates the error. They also show examples like "(requestTime:request_time, userAgent:user_agent)", but again these are undocumented; I can't make anything similar work, and everything I've tried generates the error above.
I did get MDC to work, but it seems silly to put each item into MDC and then clear it.
And I can serialize an object and log it as nested JSON, but that also seems weird.
All the answers I've seen on this are old - does anyone have any advice on how to do this nicely inside Dropwizard?

You can use Logback explicitly in Dropwizard via a custom logger factory, set it up with logstash-logback-encoder, and configure it to write out to a JSON appender.
The JSON encoder might look like this:
<included>
  <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
    <providers>
      <pattern>
        <pattern>
          {
            "id": "%uniqueId",
            "relative_ns": "#asLong{%nanoTime}",
            "tse_ms": "#asLong{%tse}",
            "start_ms": "#asLong{%startTime}",
            "cpu": "%cpu",
            "mem": "%mem",
            "load": "%loadavg"
          }
        </pattern>
      </pattern>
      <timestamp>
        <!-- UTC is the best server consistent timezone -->
        <timeZone>${encoders.json.timeZone}</timeZone>
        <pattern>${encoders.json.timestampPattern}</pattern>
      </timestamp>
      <version/>
      <message/>
      <loggerName/>
      <threadName/>
      <logLevel/>
      <logLevelValue/><!-- numeric value is useful for filtering >= -->
      <stackHash/>
      <mdc/>
      <logstashMarkers/>
      <arguments/>
      <provider class="com.tersesystems.logback.exceptionmapping.json.ExceptionArgumentsProvider">
        <fieldName>exception</fieldName>
      </provider>
      <stackTrace>
        <!--
          https://github.com/logstash/logstash-logback-encoder#customizing-stack-traces
        -->
        <throwableConverter class="net.logstash.logback.stacktrace.ShortenedThrowableConverter">
          <rootCauseFirst>${encoders.json.shortenedThrowableConverter.rootCauseFirst}</rootCauseFirst>
          <inlineHash>${encoders.json.shortenedThrowableConverter.inlineHash}</inlineHash>
        </throwableConverter>
      </stackTrace>
    </providers>
  </encoder>
</included>
The full file is on GitHub, and it produces output like this:
{"id":"FfwJtsNHYSw6O0Qbm7EAAA","relative_ns":20921024,"tse_ms":1584163814965,"start_ms":null,"#timestamp":"2020-03-14T05:30:14.965Z","#version":"1","message":"Creating Pool for datasource 'logging'","logger_name":"play.api.db.HikariCPConnectionPool","thread_name":"play-dev-mode-akka.actor.default-dispatcher-7","level":"INFO","level_value":20000}

Related

Filtering with regex vs json

When filtering logs, Logstash may use grok to parse the received log lines (let's say they are Nginx logs). Parsing with grok requires you to properly set the field type, e.g., %{HTTPDATE:timestamp}.
However, if Nginx starts logging in JSON format, then Logstash does very little processing. It simply creates the index and outputs to Elasticsearch. This leads me to believe that only Elasticsearch benefits from the "way" it receives the index.
Is there any advantage for Elasticsearch in having index data that was processed with regex vs. JSON? E.g., does it impact query time?
For Elasticsearch it doesn't matter how you parse the messages; it has no knowledge of that. You only need to send a JSON document with the fields that you want to store and search on, according to your index mapping.
However, how you parse the message does matter for Logstash, since it directly impacts performance.
For example, consider the following message:
2020-04-17 08:10:50,123 [26] INFO ApplicationName - LogMessage From The Application
If you want to be able to search and apply filters on each part of this message, you will need to parse it into fields.
timestamp: 2020-04-17 08:10:50,123
thread: 26
loglevel: INFO
application: ApplicationName
logmessage: LogMessage From The Application
To parse this message you can use different filters. One of them is grok, which uses regex; but if your message always has the same format, you can use another filter, such as dissect. In this case both will achieve the same thing, but while grok uses regex to match the fields, dissect is purely positional. This makes a huge difference in CPU use when you have a high number of events per second.
Now consider that you have the same message, but in JSON format.
{ "timestamp":"2020-04-17 08:10:50,123", "thread":26, "loglevel":"INFO", "application":"ApplicationName","logmessage":"LogMessage From The Application" }
It is easier and faster for Logstash to parse this message: you can do it in your input using the json codec, or you can use the json filter in your filter block.
If you have control over how your log messages are created, choose a format that removes the need for grok.
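If the application writing the logs is in Java, one way to emit that JSON line directly (so Logstash only needs the json codec or filter) is to serialize the fields with Jackson. A minimal sketch; the class name and field layout are just illustrative:
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.LinkedHashMap;
import java.util.Map;

public class JsonLogLine {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Builds the same fields shown above and writes them as one JSON line,
    // ready for Logstash's json codec/filter.
    public static String format(String timestamp, int thread, String level,
                                String application, String message) throws Exception {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("timestamp", timestamp);
        fields.put("thread", thread);
        fields.put("loglevel", level);
        fields.put("application", application);
        fields.put("logmessage", message);
        return MAPPER.writeValueAsString(fields);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format("2020-04-17 08:10:50,123", 26, "INFO",
                "ApplicationName", "LogMessage From The Application"));
    }
}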

Camel - JSON body is consumed after using jsonpath

I'm using Camel in a REST context and I have to manipulate JSON received from a request. It's something like:
{
  "field1": "abc",
  "field2": "def"
}
All I have to do is extract field1 and field2 and put them into two properties, so I tried something like this:
<setProperty propertyName="Field1">
<jsonpath>$.field1</jsonpath>
</setProperty>
<setProperty propertyName="Field2">
<jsonpath>$.field2</jsonpath>
</setProperty>
but I get this error:
org.apache.camel.ExpressionEvaluationException:
com.jayway.jsonpath.PathNotFoundException: Expected to find an object with property ['field2'] in path $ but found 'java.lang.String'. This is not a json object according to the JsonProvider: 'com.jayway.jsonpath.spi.json.JsonSmartJsonProvider'.
After some tests I found out that my body was empty after the first use of jsonpath.
The same process applied to XML using xpath doesn't give any error, and I'm wondering whether it's possible to do the same with jsonpath instead of creating a mapper object in Java. Thank you in advance.
If the processed Camel message is of type InputStream, this stream can obviously be read only once.
To solve this:
either enable Camel stream caching (http://camel.apache.org/stream-caching.html),
or insert a step in your route, before the jsonpath queries, that converts the message body to a String so that it can be read multiple times, e.g. <convertBodyTo type="java.lang.String" charset="ISO-8859-1"/>.
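The same idea sketched in the Camel Java DSL, assuming a recent Camel 2.x with camel-jsonpath on the classpath (the route URIs here are placeholders):
import org.apache.camel.builder.RouteBuilder;

public class ExtractFieldsRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("direct:extractFields")
            // Read the InputStream body once and keep it as a String,
            // so both jsonpath expressions below can evaluate it.
            .convertBodyTo(String.class)
            .setProperty("Field1").jsonpath("$.field1")
            .setProperty("Field2").jsonpath("$.field2")
            .to("mock:result");
    }
}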

NetSuite connector in Mule "returnSearchColumns" attribute

I created a saved search of "items" in NetSuite.
<netsuite:search config-ref="NetSuite__Login_Authentication" searchRecord="ITEM_ADVANCED" bodyFieldsOnly="false" returnSearchColumns="true" doc:name="NetSuite"/>
<json:object-to-json-transformer doc:name="Object to JSON"/>
When 'returnSearchColumns' is set to "true", I receive the exception below. If this attribute is set to "false", there is no exception, but the response is missing the selected columns.
java.lang.IllegalArgumentException: No enum constant org.mule.module.netsuite.RecordTypeEnum.ITEM
Also, I received a 'ConsumerIterator' object as the response from NetSuite and used the "Object to JSON" transformer right after the NetSuite connector. The response received is an array of item objects.
1) Is there a way to convert this payload into XML format? Neither Object to XML nor JSON to XML gives the entire XML.
2) How can I avoid the above-mentioned IllegalArgumentException?
1) object-to-xml should convert all fields to XML, or you could try something like DataWeave. What exactly is missing?
2) There is no enum constant called 'ITEM'. You have to use one of those mentioned in this list: http://mulesoft.github.io/netsuite-connector/6.0.1/java/org/mule/module/netsuite/RecordTypeEnum.html, such as 'INVENTORY_ITEM'.

JSON-formatted extract-document-data options node throws "unbalanced pairs" error when using multiple extract-paths

MarkLogic REST Client API's default search endpoint results in a server error when using a query options node that contains more than one extract-path, even though the request succeeds when either extract-path is used individually within extract-document-data:
{"errorResponse":{"statusCode":500, "status":"Internal Server Error", "messageCode":"RESTAPI-INTERNALERROR", "message":"RESTAPI-INTERNALERROR: (err:FOER0000) Internal error: JSON build, unbalanced pairs: "}}
The offending paths:
<extract-path xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:FO="http://founders.archives.gov/">/tei:text/FO:metadata/FO:ProjectCode</extract-path>
<extract-path xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:FO="http://founders.archives.gov/">/tei:text/FO:metadata/FO:ShortProjectTitle</extract-path>
This only occurs when the format is JSON; XML format behaves as expected. The error can be reproduced across disparate datasets.
The entire options node:
<options xmlns="http://marklogic.com/appservices/search">
  <search-option>unfiltered</search-option>
  <quality-weight>0</quality-weight>
  <page-length>10</page-length>
  <extract-document-data selected="include">
    <extract-path xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:FO="http://founders.archives.gov/">/tei:text/FO:metadata/FO:ProjectCode</extract-path>
    <extract-path xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:FO="http://founders.archives.gov/">/tei:text/FO:metadata/FO:ShortProjectTitle</extract-path>
  </extract-document-data>
</options>
I would simply extract the parent element FO:metadata; however, that returns a string, implying a dependency on a parsing library (does it not?), which I would rather avoid if possible.
Any suggested workarounds are appreciated. Thanks.
There is a known bug with the inline response that should be fixed in 8.0-3.
In the interim, it should work to get the extracted fragments either as XML or as a multipart/mixed response (which, if the source documents are XML, would also be XML).

JSONPath expression for checking string in Apache Camel XML

Let's say I have a simple JSON file such as the following:
{
  "log": {
    "host": "blah",
    "severity": "INFO",
    "system": "1"
  }
}
I'm using Apache Camel and its Spring XML to process and route the JSON file. My routing code looks something like this:
<route>
  <from uri="file:/TESTFOLDER/input"/>
  <choice>
    <when>
      <jsonpath>$.log?(@.severity == 'WARNING')</jsonpath>
      <to uri="smtp://(smtpinfo...not important)"/>
    </when>
    <otherwise>
      <to uri="file:/TESTFOLDER/output"/>
    </otherwise>
  </choice>
</route>
The part that I'm really confused about is the JSONPath expression. The expression I have above isn't even syntactically correct, because it's hard to find examples for the case where you aren't trying to sort through a list of elements. My goal is to send an email only if the severity of the log is 'WARNING', but I can't come up with the expression.
This worked for me using Camel 2.13.1 (I checked for INFO as your JSON example has this severity; you may change this according to your needs):
<jsonpath>$..log[?(@.severity == 'INFO')]</jsonpath>
Note the .. and the []. However, using a single dot . at the beginning of the search path failed:
<jsonpath>$.log[?(@.severity == 'INFO')]</jsonpath>
The error messages said:
java.lang.IllegalArgumentException: Invalid container object
This may be a bug.
According to the JSONPath docs, .. stands for "recursive descent". This may not meet your requirements. However, as a single dot . didn't work, this was the only workaround I could figure out. Otherwise, you may raise a bug ticket.
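To sanity-check the expression outside of Camel, you can run it directly against Jayway JsonPath, the library the camel-jsonpath component delegates to. A minimal sketch using the sample document from the question:
import com.jayway.jsonpath.JsonPath;
import java.util.List;

public class SeverityCheck {
    public static void main(String[] args) {
        String json = "{ \"log\": { \"host\": \"blah\", \"severity\": \"INFO\", \"system\": \"1\" } }";

        // The filter returns the matching log objects; an empty list means no match.
        List<Object> warnings = JsonPath.read(json, "$..log[?(@.severity == 'WARNING')]");
        List<Object> infos = JsonPath.read(json, "$..log[?(@.severity == 'INFO')]");

        System.out.println("WARNING matches: " + warnings.size()); // 0
        System.out.println("INFO matches: " + infos.size());       // 1
    }
}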