NiFi non-Avro JSON Reader/Writer

It appears that the standard Apache NiFi readers/writers can only parse JSON input based on an Avro schema.
An Avro schema is limiting for JSON, e.g. it does not allow valid JSON properties starting with digits.
The JoltTransformJSON processor can help here (it doesn't impose Avro limitations on how the input JSON may look), but it seems that this processor does not support batch FlowFiles. It is also not based on the readers and writers (maybe because of that).
Is there a way to read arbitrary valid batch JSON input, e.g. in multi-line form
{"myprop":"myval","12345":"12345",...}
{"myprop":"myval2","12345":"67890",...}
and transform it to another JSON structure, e.g. one defined by a JSON schema, e.g. using a JSON Patch transformation, without writing my own processor?
Update
I am using Apache NiFi 1.7.1
Update 2
Unfortunately, @Shu's suggestion did not work. I am getting the same error.
I reduced the case to a single UpdateRecord processor that reads JSON with numeric properties and writes JSON without such properties, using the
myprop : /data/5836c846e4b0f28d05b40202
mapping. Still the same error :(

it does not allow valid JSON properties starting with digits?
This bug, NIFI-4612, was fixed in NiFi 1.5. We can use an AvroSchemaRegistry to define your schema and set
Validate Field Names
false
Then we can have Avro schema field names that start with digits.
For more details refer to this link.
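For example, with Validate Field Names set to false, a minimal schema sketch like the following (field names taken from the sample input above) can be registered even though 12345 starts with a digit:
{
  "type": "record",
  "name": "example",
  "fields": [
    {"name": "myprop", "type": "string"},
    {"name": "12345", "type": "string"}
  ]
}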
Is there a way to read arbitrary valid batch JSON input, e.g. in multi-line form?
This bug, NIFI-4456, was fixed in NiFi 1.7. If you are not using that version of NiFi, then as a workaround we can create an array of JSON messages with a comma (,) delimiter by using the flow below.
Flow:
1. SplitText // split the FlowFile with a line count of 1
2. MergeRecord // merge the FlowFiles into one
3. ConvertRecord
For more details regarding this particular issue refer to this link (I have explained it with the flow).
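For illustration, assuming the merge step is configured to join the messages with a comma and wrap them in a JSON array, the two-line input from the question
{"myprop":"myval","12345":"12345"}
{"myprop":"myval2","12345":"67890"}
would come out of the merge step as
[{"myprop":"myval","12345":"12345"},{"myprop":"myval2","12345":"67890"}]
which the record-based processors can then read as a single top-level array.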

Related

Spark from_avro 2nd argument is a constant string, any way to obtain schema string from some column of each record?

Suppose we are developing an application that pulls Avro records from a source
stream (e.g. Kafka/Kinesis/etc), parses them into JSON, then further processes that
JSON with additional transformations. Further assume these records can have a
varying schema (which we can look up and fetch from a registry).
We would like to use Spark's built-in from_avro function, but it is pretty clear that
Spark's from_avro wants you to hard-code a fixed schema into your code. It doesn't seem
to allow the schema to vary from row to incoming row.
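For reference, this is the fixed-schema shape of the call (a minimal Scala sketch for Spark 3.x; the DataFrame df and its value column are assumptions, and in Spark 2.4 the import path is org.apache.spark.sql.avro):
import org.apache.spark.sql.avro.functions.from_avro
import org.apache.spark.sql.functions.col

// The second argument is a single constant JSON schema string applied to the
// whole column; it cannot be taken from another column of each row.
val avroSchema = """{"type":"record","name":"r","fields":[{"name":"foo","type":"string"}]}"""
val parsed = df.select(from_avro(col("value"), avroSchema).as("record"))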
That sort of makes sense if you are parsing the Avro into Spark's internal row format; one would need
a consistent structure for the DataFrame. But what if we wanted something like
from_avro which grabbed the bytes from some column in the row, also grabbed the string
representation of the Avro schema from some other column in the row, and then parsed that Avro
into a JSON string?
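One way to sketch that idea, not as a built-in but as a hand-rolled UDF (the column names data and schema_json are hypothetical, and parsing the schema on every row is of course not free):
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
import org.apache.avro.io.DecoderFactory
import org.apache.spark.sql.functions.{col, udf}

// Decode each row's Avro payload with that row's own schema string and render
// the record as a JSON string (GenericRecord.toString emits JSON).
val avroToJson = udf { (payload: Array[Byte], schemaJson: String) =>
  val schema  = new Schema.Parser().parse(schemaJson)
  val reader  = new GenericDatumReader[GenericRecord](schema)
  val decoder = DecoderFactory.get().binaryDecoder(payload, null)
  reader.read(null, decoder).toString
}

val jsonDf = df.select(avroToJson(col("data"), col("schema_json")).as("json"))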
Does such a built-in method exist? Or is such functionality available in a third-party library?
Thanks!

NiFi Controller Service: What is the difference between JsonTreeReader and JsonPathReader?

I have a simple question. What is the difference between JsonTreeReader and JsonPathReader?
I am new to NiFi. I am trying to parse a string. I have another open question that no one seems to want to answer. For example:
Nifi JOLT Transform string delimited into different elements and subelements
JsonTreeReader reads the entire JSON as a record, or in the case of a top-level array, each element in the array is read (entirely) as a record. JsonPathReader lets you specify JSONPath expressions in order to get at particular objects/records/fields within the overall flow file.
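A minimal illustration of the difference (the sample JSON and property names are made up): for a flow file containing
{"id": 1, "person": {"name": "Ann", "age": 30}}
JsonTreeReader returns one record with the whole nested structure (id plus the nested person object), while JsonPathReader configured with user-defined properties such as
id = $.id
name = $.person.name
returns a flat record containing only the id and name fields.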

JMeter - What is the best extractor to use on a JSON message?

Currently testing a system where the output is in the form of formatted JSON.
As part of my tests I need to extract and validate two values from the JSON record.
The values both have individual identifiers on them but don't appear in the same part of the record, so I can't just grab a single long string.
Loose format of the information in both cases:
"identifier1": [{"identifier2":"idname","values":["bit_I_want!]}]
In the case of the bit I want, this can either be a single quoted value (e.g. "12345") or multiple quoted values (e.g. "12345","23456","98765").
In both cases I'm only interested in validating the whole string of values, not individual values from the set.
Can anyone recommend which of the various extractors in JMeter would be best to achieve this?
Many Thanks!
The most obvious choice seems to be the JSON Path Assertion (available via JMeter Plugins); it allows not only executing arbitrary JSONPath queries but also conditionally failing the sampler based on whether the actual and expected results match.
The recommended way of installing JMeter Plugins and keeping them up to date is using the JMeter Plugins Manager.
JMeter 3.1 comes with a JSON Extractor to parse JSON responses. You could use the expression $.identifier1[0].values
as the JSON Path to extract the values.
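For example, against a response shaped like the one in the question (the values are made up):
{"identifier1":[{"identifier2":"idname","values":["12345","23456","98765"]}]}
the JSON Path $.identifier1[0].values evaluates to the whole array ["12345","23456","98765"], so the complete set of values can be validated as one extracted string.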
If your JSON response is always going to be as simple as shown in your question, you could use the Regular Expression Extractor as well. The advantage is that it is faster than the JSON Extractor. The regular expression would be "values":\[(.*?)\]
Reference: http://www.testautomationguru.com/jmeter-response-data-extractors-comparison/

Library to convert JSON string to Erlang record

I have a large JSON string that I want to convert into an Erlang record.
I found the jiffy library but it doesn't completely convert to a record.
For example:
jiffy:decode(<<"{\"foo\":\"bar\"}">>).
gives
{[{<<"foo">>,<<"bar">>}]}
but I want the following output:
{ok,{obj,[{"foo",<<"bar">>}]},[]}
Is there any library that can be used to get the desired output?
Or is there any library that can be used in combination with jiffy to further modify its output?
Considering that the JSON string is large, I want the output in minimum time.
Take a look at ejson, from the documentation:
JSON library for Erlang on top of jsx. It gives a declarative interface for jsx by which we need to specify conversion rules and ejson will convert tuples according to the rules.
I made this library to make easy not just the encoding but rather the decoding of JSONs to Erlang records...
In order for ejson to take effect, the source files need to be compiled with the parse_transform ejson_trans. Any record which has a -json attribute can be converted to JSON later.

Specify type when converting from XML to JSON in MarkLogic

Using MarkLogic 8, I have a custom XML to JSON conversion with json:transform-to-json, and I've got it working just about right, except that the conversion is outputting numbers as strings.
Is there a way to specify that the value of a particular element should be a number value, not a string?
I don't see anything in the doc for json:config, but just in case there's something I've missed, or if you have a neat post-processing trick, I'd love to hear about how to solve this problem.
You can do that by defining an XML Schema for the non-string type elements. Just make sure it is available in the context (by loading it into xdmp:schemas-database()), and that it is recognized (your XML needs to have a namespace that matches the XML Schema, and you might want to use import schema).
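For instance, a minimal sketch of such a schema (the element and namespace names are hypothetical), declaring count as a numeric element:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/my-data"
           xmlns="http://example.com/my-data"
           elementFormDefault="qualified">
  <xs:element name="count" type="xs:integer"/>
</xs:schema>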
HTH!