Apache NiFi - Extract Attributes From Avro - JSON

I'm trying to get my head around extracting attributes from Avro and JSON. I'm able to extract attributes from JSON using the EvaluateJsonPath processor. I'm trying to do the same with Avro, but I'm not sure whether it is achievable.
Here is my flow: ExecuteSQL -> SplitAvro -> UpdateAttribute
UpdateAttribute is the processor where I want to extract the attributes.
So, my basic question is: can we extract attributes from Avro? If yes, please point me to the right approach. Or is it always necessary to use ConvertAvroToJSON before extracting the attributes?

Currently, there is no way in NiFi to extract attributes directly from Avro (there is not yet an AvroPath, like XPath for XML or JsonPath for JSON), so, as you said, you can use ConvertAvroToJSON before extracting the attributes.
Alternatively, I wrote a Groovy script for use in an ExecuteScript processor. It takes "Avro path" values as dynamic properties (each property name starting with avro.path and each value really a JsonPath expression), does the Avro-to-JSON conversion in memory, and requires you to download and point to the Avro JARs. I can post it here if you are interested, but its only real advantage is that the flow file content stays in Avro; although it might be annoying, you can use ConvertAvroToJson -> EvaluateJsonPath -> ConvertJsonToAvro as the workaround.
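For reference, the heart of that script boils down to something like this (a hedged sketch written against the Avro and json-path Java libraries rather than the actual Groovy code; the field name in the comment is made up):

    import java.io.ByteArrayInputStream;
    import java.io.IOException;

    import org.apache.avro.file.DataFileStream;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;

    import com.jayway.jsonpath.JsonPath;

    public class AvroPathSketch {

        // Convert the first record of an Avro datafile to JSON and evaluate a JsonPath against it.
        static Object evaluateAvroPath(byte[] avroBytes, String jsonPath) throws IOException {
            try (DataFileStream<GenericRecord> reader =
                     new DataFileStream<>(new ByteArrayInputStream(avroBytes),
                                          new GenericDatumReader<>())) {
                GenericRecord record = reader.next();   // first record of the flow file content
                String json = record.toString();        // GenericRecord.toString() emits the record as JSON
                return JsonPath.read(json, jsonPath);   // e.g. "$.customer_id" (hypothetical field)
            }
        }
    }

The returned value would then end up in a flow file attribute keyed by the avro.path.* property name.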

Related

How can I query JSON produced by XML parsing in a client-only app?

Sorry if the question is poorly phrased. What would you recommend for structuring a library that can run queries over the JSON generated by parsing XML based on TEI P5? I tried GraphQL, converting the interfaces of my Angular application (which hold the information parsed from XML into JSON) into types in order to define a GraphQL schema, but I don't think that's the right way.
What I have to do is query, client-side only, some data encoded in XML (or already parsed into JSON) and, for example, find all occurrences of a specific piece of data.
Do you have any roadmap to recommend, or a JSON query system that might be right for me?
You might take a look at https://www.npmjs.com/package/saxon-js. With SaxonJS you're able to run XPath expressions against XML using JavaScript.

Processing JSON data from Kafka using Structured Streaming

I want to convert incoming JSON data from Kafka into a dataframe.
I am using structured streaming with Scala 2.12
Most people add a hard-coded schema, but if the JSON can have additional fields, that requires changing the code base every time, which is tedious.
One approach is to write the data to a file and infer the schema from it, but I'd rather avoid doing that.
Is there any other way to approach this problem?
Edit: I found a way to turn a JSON string into a dataframe, but I can't extract the string from the streaming source. Is it possible to extract it?
One way is to store the schema itself in the message headers (not in the key or value).
Though this increases the message size, it makes it easy to parse the JSON value without needing any external resource like a file or a schema registry.
New messages can have new schemas while old messages can still be processed with their old ones, because each message carries its own schema.
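With Spark 3.0+ you can ask the Kafka source to surface the headers and pull the schema string out of each message. A rough sketch in Java (the Scala API has the same shape; the "schema" header key, topic and broker address are assumptions):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class SchemaFromHeaders {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder().appName("schema-from-headers").getOrCreate();

            // includeHeaders requires Spark 3.0+
            Dataset<Row> raw = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "events")
                .option("includeHeaders", "true")
                .load();

            // Pull the JSON payload and the schema header out of every message.
            Dataset<Row> withSchema = raw.selectExpr(
                "CAST(value AS STRING) AS json",
                "CAST(filter(headers, h -> h.key = 'schema')[0].value AS STRING) AS schemaJson");

            // Note: from_json needs one fixed schema per query, so a per-message schema
            // like this has to be applied inside foreachBatch or with your own JSON parser.
            withSchema.writeStream().format("console").start().awaitTermination();
        }
    }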
Alternatively, you can version the schemas and include an id for every schema in the message headers (or a magic byte in the key or value) and look the schema up from there.
This is the approach Confluent Schema Registry follows. It basically lets you keep different versions of the same schema and see how your schema has evolved over time.
Read the data as a string and then convert it to a Map[String, String]; this way you can process any JSON without even knowing its schema.
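For example, a minimal sketch in Java (the Scala API has the same functions; topic and broker address are placeholders):

    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.from_json;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.MapType;

    public class JsonAsMap {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder().appName("json-as-map").getOrCreate();

            Dataset<Row> raw = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "events")
                .load();

            // Parse every message into a map<string,string>: no fixed schema needed,
            // but every value comes back as a string (nested objects as raw JSON text).
            MapType anyJson = DataTypes.createMapType(DataTypes.StringType, DataTypes.StringType);
            Dataset<Row> parsed = raw
                .selectExpr("CAST(value AS STRING) AS json")
                .select(from_json(col("json"), anyJson).alias("kv"));

            parsed.writeStream().format("console").start().awaitTermination();
        }
    }

The trade-off is that you lose typing: everything is a string, so you typically have to cast or re-parse values downstream.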
Based on JavaTechnical's answer, the best approach would be to use a schema registry and Avro data instead of JSON; there is no way around hardcoding a schema (for now).
Include your schema name and id as headers and use them to read the schema from the schema registry.
Then use the from_avro function to turn that data into a DataFrame!
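Put together, that could look roughly like this (a hedged sketch in Java: fetchSchemaJson is a hypothetical helper for your registry's REST API, from_avro comes from the spark-avro package in Spark 3.0+, and the topic, broker and schema id are placeholders):

    import static org.apache.spark.sql.avro.functions.from_avro;
    import static org.apache.spark.sql.functions.col;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class AvroFromRegistry {

        // Hypothetical helper: look the writer schema up by id in your schema registry
        // (e.g. GET /schemas/ids/{id} on a Confluent registry) and return it as a JSON string.
        static String fetchSchemaJson(int schemaId) {
            throw new UnsupportedOperationException("call your schema registry here");
        }

        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder().appName("avro-from-registry").getOrCreate();

            Dataset<Row> raw = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "events")
                .load();

            // from_avro takes one schema per query, so pick the schema id you expect for
            // this topic. This also assumes the value is plain Avro; Confluent-serialized
            // messages carry a 5-byte magic/id prefix that must be stripped first.
            String schemaJson = fetchSchemaJson(42);
            Dataset<Row> df = raw
                .select(from_avro(col("value"), schemaJson).alias("data"))
                .select("data.*");

            df.writeStream().format("console").start().awaitTermination();
        }
    }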

Generate Angular2 forms from Swagger API specification

I'm looking for a way to generate a set of Angular2 form templates from a Swagger API definition file. I want a result that will allow me to test my POST/PUT requests, and even use it in my app.
After some research I found this Angular2 form library that takes a JSON schema as input: https://github.com/makinacorpus/angular2-schema-form
So if you know of a Swagger -> JSON Schema converter that will work too.
Cheers!
"So if you know of a Swagger -> JSON Schema converter that will work too."
Swagger 2.0 supports a subset of JSON Schema draft 4. This is what Swagger's Schema Object is. From the docs:
The following properties are taken directly from the JSON Schema definition and follow the same specifications:
$ref - As a JSON Reference
format (See Data Type Formats for further details)
title
description (GFM syntax can be used for rich text representation)
default (Unlike JSON Schema, the value MUST conform to the defined type for the Schema Object)
multipleOf
...
The following properties are taken from the JSON Schema definition but their definitions were adjusted to the Swagger Specification:
items
allOf
properties
additionalProperties
It should be a fairly simple exercise to manually extract the schema from your Swagger file, but I don't know of any automated tool to do this. I think the fact that some of the JSON Schema properties have been modified by Swagger may make automatic conversion problematic in certain circumstances.
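If you want to script that manual extraction, something along these lines should get you most of the way (a sketch using Jackson; swagger.json and the Pet model name are just placeholders):

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    import java.io.File;

    public class ExtractDefinition {
        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();
            JsonNode swagger = mapper.readTree(new File("swagger.json"));

            // Swagger 2.0 keeps its model schemas under "definitions". Each entry is close
            // to JSON Schema draft 4, but the Swagger-adjusted keywords (items, allOf,
            // properties, additionalProperties) may still need manual touch-up.
            JsonNode petSchema = swagger.path("definitions").path("Pet");
            System.out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(petSchema));
        }
    }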

JavaScript in place of JSON Input step

I am loading data from a mongodb collection to a mysql table through Kettle transformation.
First I extract the data using a MongoDB Input step and then I use a JSON Input step.
But since the JSON Input step has very low performance, I wanted to replace it with a JavaScript script.
I am a beginner in JavaScript, and even though I tried a few things, the Kettle JavaScript step is not recognizing any of the keywords.
Can anyone give me sample code to convert JSON data into separate columns using JavaScript?
To solve your problem you need to look at three aspects:
Reading from MongoDB
Reading from JSON
Reading from (probably) String
Reading from MongoDB: Unless you changed the interface, MongoDB returns not JSON but BSON (~binary JSON). You need to check the MongoDB documentation about reading and writing BSON: probably something like BSON.to() and BSON.from(), but I don't know it by heart.
Reading from JSON: Once you have your BSON in JSON format, you can serialize it with JSON.stringify(), which returns a String.
Reading from (probably) String: If you want to use the capabilities of JSON (why else would you use JSON?), you also want JSON.parse(), which returns a JSON object.
My experience is that passing a JSON object from one step to another as a String is not a bad idea: at the end of a JavaScript step you write your JSON object to a String, and at the beginning of the next JavaScript step (which can be further down the stream) you parse it back into a JSON object to work with it.
I hope this answers your question.
PS: writing JavaScript steps requires you to learn JavaScript. You don't have to be a master, but the basics are required. There is no way around it.
You could use the JSON Input step to get the values of this JSON and put them into regular rows.

Why should I use JavaScriptObject overlay classes instead of native Java classes when processing JSON data?

In my GWT project I need to process JSON data retrieved from a database via PHP. I have seen the Google examples using JavaScriptObject overlay classes. What I don't understand is why this seems to be the preferred method of processing the JSON data. Why shouldn't I use all native Java code to pull in the data?
Think about it the other way around: what does it mean to use POJOs (or native Java classes, as you call them)?
You have to:
parse the JSON into some Java-accessible structure (e.g. com.google.gwt.json.client.JSONObject, or elemental.json.JsonObject)
create POJOs
fill the POJOs with the data from the parsed JSON structure
now you can forget the parsed JSON structure from step 1
On the other hand, with JavaScriptObject, you use JsonUtils.safeEval and TA-DA! you get your JSON parsed right into a typed Java object!
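For illustration, a minimal overlay type could look like this (a sketch; the Customer type and its fields are made up):

    import com.google.gwt.core.client.JavaScriptObject;
    import com.google.gwt.core.client.JsonUtils;

    // Overlay type: its accessors read straight from the underlying JavaScript object,
    // so nothing is copied into a separate POJO.
    public class Customer extends JavaScriptObject {
        protected Customer() {}   // overlay types need a protected no-arg constructor

        public final native String getName()    /*-{ return this.name; }-*/;
        public final native double getBalance() /*-{ return this.balance; }-*/;

        public static Customer parse(String json) {
            return JsonUtils.safeEval(json);   // parses and types the JSON in one step
        }
    }

Client code then just does Customer c = Customer.parse(jsonFromServer) and reads c.getName() directly; there is no intermediate structure to build and throw away.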
Now, to deal with JSON, there's also AutoBeans.
Choose your poison.