Convert JSON to Avro in NiFi

I want to convert JSON data to Avro.
I used GenerateFlowFile with the dummy JSON value [{"firstname":"prathik","age":21},{"firstname":"arun","age":22}].
I then used a ConvertRecord processor with a JsonTreeReader and an AvroRecordSetWriter, backed by an AvroSchemaRegistry that holds the following schema: AvroSchema
But I am getting this as my output: Output (Avro Data)
I am new to Apache NiFi.
Thanks in advance.

But I am getting this as my output: Output (Avro Data)
That's to be expected. Avro is a binary format, and what you see is a best-effort attempt to display that binary data as text. It's supposed to look like that.
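If you want to check that the conversion worked, decode the file rather than eyeballing the raw bytes. Below is a minimal sketch using the Avro Java API from Scala; the file name output.avro is hypothetical, and it assumes the AvroRecordSetWriter embeds the schema (the Avro container format), so the reader can pick the schema up from the file header.

import java.io.File
import org.apache.avro.file.DataFileReader
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}

object AvroDump extends App {
  // Open the Avro container file; the reader takes the schema from the file header.
  val reader = new DataFileReader[GenericRecord](
    new File("output.avro"), new GenericDatumReader[GenericRecord]())
  while (reader.hasNext)
    println(reader.next()) // GenericRecord.toString renders each record as JSON
  reader.close()
}

The avro-tools jar offers the same thing from the command line via its tojson command.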

Related

Attaining the data in JSON format from a payload that shows as "org.mule.munit.common.util.ReusableByteArrayInputStream#53534c15" in Mule 3

I need the real JSON payload so I can assert it against another hardcoded JSON file in MUnit (Mule 3.9 and DataWeave 1). The issue is that the payload shows as "org.mule.munit.common.util.ReusableByteArrayInputStream#53534c15". When I convert it to Java I can see the data, but not in JSON format. How can I extract the JSON from this byte array stream so I can assert it against a hardcoded JSON file?
I resolved it by using the "Byte to String" block.
Then I added the "Assert Equals" block, but made sure to format both values like this:
#[payload.replaceAll("\\s+","")]
#[getResource('sample.json').asString().replaceAll("\\s+","")]
This did exactly what I needed.

How to convert a string into JSON format on the Flink side?

I receive a number as a string, for example "12".
What I want to do is convert those strings into JSON format and write them to a text file in my local directory.
However, I haven't found a proper solution other than manually rewriting the strings so that they look like JSON, which is tedious and laborious.
The completed JSON should look like this:
{"temperature": 12}
Is there a library that can do this?
Thanks.
Check out Gson. In particular, if you have a Java class with a single "temperature" field, then see this for how to convert it to JSON.
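As a rough sketch of that idea in Scala (the Reading wrapper and file name are made up for illustration): wrap the parsed value in a small class, let Gson serialize it, and write the result to a local file. In a Flink job, the same conversion would typically sit in a map function before a text sink.

import com.google.gson.Gson
import java.nio.file.{Files, Paths}

// Hypothetical wrapper class; Gson serializes its field by reflection.
case class Reading(temperature: Int)

object ToJson extends App {
  val raw = "12"                                   // the incoming string
  val json = new Gson().toJson(Reading(raw.toInt)) // {"temperature":12}
  Files.write(Paths.get("temperature.txt"), json.getBytes("UTF-8"))
}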

NiFi: Flow content (dynamic JSON format) to CSV

I have a case where the flow content is always JSON, but the data inside the JSON always changes (both keys and values). Is it possible to convert this flow content to CSV?
Please note that the keys in the JSON always change.
Many thanks,
To achieve this use case we need to generate an Avro schema dynamically for each JSON record first, then convert the record to Avro, and finally convert the Avro to CSV (a worked illustration follows the flow below).
Flow:
1. SplitJson // split the array of JSON records into individual records
2. InferAvroSchema // infer the Avro schema from each JSON record and store it in an attribute
3. ConvertJSONToAvro // convert each JSON record into an Avro data file
4. ConvertRecord // read the Avro data file dynamically and convert it into CSV format
5. MergeContent (or MergeRecord) // merge the split flowfiles back into one flowfile using the defragment strategy
Save this XML template, upload it to your NiFi instance, and adjust it to your requirements.
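For illustration only (made-up data): given the input [{"id":1,"name":"a"},{"id":2,"color":"red"}], SplitJson emits two flowfiles, InferAvroSchema derives a different schema for each (id/name vs. id/color), ConvertJSONToAvro and ConvertRecord turn each one into CSV, and MergeContent's defragment strategy stitches them back together, so the merged output carries one header-and-row pair per record:
id,name
1,a
id,color
2,red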

How to load JSON plus a value outside the JSON in Pig?

I have a line that contains a JSON object plus a value outside the JSON:
000000,{"000":{"phoneNumber":null,"firstName":"xyz","lastName":"pqr","email":"email#xyz.com","alternatePickup":true,"sendTextNotification":false,"isSendTextNotification":false,"isAlternatePickup":true}}
I'm trying to load this in Pig using the Elephant Bird JSON loader, but I am unable to do so.
I am able to load the following JSON
{"000":{"phoneNumber":null,"firstName":"xyz","lastName":"pqr","email":"email#xyz.com","alternatePickup":true,"sendTextNotification":false,"isSendTextNotification":false,"isAlternatePickup":true}}
using the following script:
REGISTER json-simple-1.1.1.jar;
REGISTER elephant-bird-pig-4.3.jar;
REGISTER elephant-bird-hadoop-compat-4.3.jar;
json_data = load 'ek.json' using com.twitter.elephantbird.pig.load.JsonLoader() AS (json_key: [(phoneNumber:chararray,firstName:chararray,lastName:chararray,email:chararray,alternatePickup:boolean,sendTextNotification:boolean,isSendTextNotification:boolean,isAlternatePickup:boolean)]);
dump json_data;
But when I include the value outside the JSON:
json_data = load 'ek.json' using com.twitter.elephantbird.pig.load.JsonLoader() AS (id:int,json_key: [(phoneNumber:chararray,firstName:chararray,lastName:chararray,email:chararray,alternatePickup:boolean,sendTextNotification:boolean,isSendTextNotification:boolean,isAlternatePickup:boolean)]);
it does not work! I'd appreciate any help.
JsonLoader can only load valid JSON, while your format is actually CSV (an id field followed by a JSON field). There are three options for you, ordered by increasing complexity:
Adjust your input format and make id part of the JSON
Load the data as CSV (as two fields, id and json), then use a custom UDF to parse the json field into a tuple
Write a custom loader that accepts your original format
You can use the built-in JsonStorage and JsonLoader():
a = load 'a.json' using JsonLoader('a0:int,a1:{(a10:int,a11:chararray)},a2:(a20:double,a21:bytearray),a3:[chararray]');
In this example data is loaded without a schema; it assumes there is a .pig_schema (produced by JsonStorage) in the input directory.
a = load 'a.json' using JsonLoader();

How to convert a nested JSON file into CSV in Scala

I want to convert my nested JSON into CSV. I used
df.write.format("com.databricks.spark.csv").option("header", "true").save("mydata.csv")
But it works for flat JSON, not nested JSON. Is there any way I can convert my nested JSON to CSV? Help will be appreciated, thanks!
When you ask Spark to convert a JSON structure to a CSV, Spark can only map the first level of the JSON.
This happens because of the simplicity of the CSV format: it just assigns a value to a name. That is why {"name1":"value1", "name2":"value2"...} can be represented as a CSV with this structure:
name1,name2, ...
value1,value2,...
In your case, you are converting a JSON with several levels, so the Spark exception is saying that it cannot figure out how to convert such a complex structure into a CSV.
If you add only a second level to your JSON, it will work, but be careful: it will drop the names of the second level and include only the values in an array.
You can have a look at this link to see an example with JSON datasets.
As I have no information about the nature of your data, I can't say much more. But if you need to write the information as a CSV, you will need to simplify the structure of your data, as in the sketch below.
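As a minimal sketch of that simplification (the column names and input file are hypothetical): select the nested fields into top-level columns, so every CSV cell holds a scalar, and only then write the CSV. It assumes a sqlContext as in the answer below.

import org.apache.spark.sql.functions.col

// Hypothetical nested input, one JSON object per line:
// {"name":"xyz","address":{"city":"x","zip":"y"}}
val df = sqlContext.read.json("people.json")

// Promote the nested fields to top-level columns.
val flat = df.select(
  col("name"),
  col("address.city").as("city"),
  col("address.zip").as("zip"))

flat.write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .save("people_flat.csv")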
Read the JSON file in Spark and create a DataFrame:
val path = "examples/src/main/resources/people.json"
val people = sqlContext.read.json(path)
Save the DataFrame using spark-csv:
people.write
.format("com.databricks.spark.csv")
.option("header", "true")
.save("newcars.csv")
Sources:
read json
save to csv