I am trying to process JSON data in Java. I have the data in the format below (it is a nested data structure with arrays, etc.):
person.name,person.friend[0],person.friend[1],person.address.city,person.address.country
1,x,y,kolkata,india
2,a,b,london,uk
The first line is a header denoting the nested object hierarchy. I want JSON in the format below:
{
"data" : [
{
"name" : "1",
"friend" : ["x","y"],
"address" : { "city" : "kolkata", "country" : "india" }
},
{
"name" : "2",
"friend" : ["a","b"],
"address" : { "city" : "london", "country" : "uk" }
} ]
}
The object structure is dynamic and I don't know the columns or header in advance, i.e. I cannot use a predefined POJO to hold the data. In this example it is a "Person" object, but it may be any object structure.
I have gone through the Jackson and Gson APIs, but neither seems to fulfill this requirement. Is there any API that can help, or any other way out?
Thanks
You need to do it in two steps.
First, parse your CSV. I recommend Super CSV. Parsing CSV can be tricky, so I really recommend using a library for it.
Second, serialize the result to JSON. For that you can use Gson, Jackson, flexjson, or whatever you prefer.
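The two steps can be sketched without committing to a particular library. The example below is in Python for brevity and uses only the standard library; set_path and csv_to_json are hypothetical helper names, and it assumes that every column shares one root segment ("person") which is dropped, as in the desired output. The same idea should carry over to Java with a CSV parser plus a tree model such as Jackson's ObjectNode/ArrayNode, but treat this as a minimal sketch rather than a complete solution:

```python
import csv
import io
import json
import re

# One path segment: a key, optionally followed by an array index, e.g. "friend[0]".
SEGMENT = re.compile(r"^([^\[\]]+)(?:\[(\d+)\])?$")


def set_path(target, path, value):
    """Store value in the nested dict `target` at a path like 'address.city' or 'friend[1]'."""
    segments = path.split(".")
    node = target
    for i, segment in enumerate(segments):
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unsupported header segment: {segment!r}")
        key, index = match.group(1), match.group(2)
        last = i == len(segments) - 1
        if index is None:
            if last:
                node[key] = value
            else:
                node = node.setdefault(key, {})
        else:
            items = node.setdefault(key, [])
            position = int(index)
            # Pad the list so that the requested index exists.
            while len(items) <= position:
                items.append(None if last else {})
            if last:
                items[position] = value
            else:
                node = items[position]
    return target


def csv_to_json(text):
    """Step 1: parse the CSV. Step 2: build the nested structure for serialization."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Assumption: each column starts with the same root segment, which is dropped.
    paths = [column.split(".", 1)[1] for column in header]
    records = []
    for row in reader:
        record = {}
        for path, value in zip(paths, row):
            set_path(record, path, value)
        records.append(record)
    return {"data": records}


sample = """person.name,person.friend[0],person.friend[1],person.address.city,person.address.country
1,x,y,kolkata,india
2,a,b,london,uk
"""
result = csv_to_json(sample)
print(json.dumps(result, indent=2))
```

Because the structure is driven entirely by the header, no predefined class is needed; all values stay strings, as in the desired output.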
After a long Google search, I found that the only option for representing a collection-based object structure in a flat file is repeated rows:
person.name,person.friends,person.address.city,person.address.country
1,x,kolkata,india
1,y,kolkata,india
2,a,london,uk
2,b,london,uk
where the non-array elements repeat. We need to build JSON from this and then merge the rows that belong to the same object by its ID (here person.name).
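The merging step might look like the following Python sketch (standard library only). group_repeated_rows is a hypothetical helper, and it assumes the caller already knows which columns are arrays, since the header in this layout does not mark them:

```python
import csv
import io
import json


def group_repeated_rows(text, id_column, array_columns):
    """Merge rows that share the same id; array columns collect one value per row."""
    reader = csv.DictReader(io.StringIO(text))
    grouped = {}  # dicts keep insertion order, so output follows first appearance
    for row in reader:
        key = row[id_column]
        if key not in grouped:
            # First row for this id: copy the scalars, start each array column empty.
            grouped[key] = {
                column: ([] if column in array_columns else value)
                for column, value in row.items()
            }
        for column in array_columns:
            grouped[key][column].append(row[column])
    return list(grouped.values())


sample = """person.name,person.friends,person.address.city,person.address.country
1,x,kolkata,india
1,y,kolkata,india
2,a,london,uk
2,b,london,uk
"""
merged = group_repeated_rows(sample, "person.name", {"person.friends"})
print(json.dumps(merged, indent=2))
```

The result still uses the flat dotted column names as keys; nesting them is a separate step.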
I have converted some columns to JSON using the Columns to JSON node. The output from that is:
{
"Material" : 101,
"UOM" : "GRAM",
"EAN" : 7698,
"Description" : "CHALK BOX"
}
I would like to add the value of the material property as a key to each JSON object. So, my desired output is:
"101": {
"Material" : 101,
"UOM" : "GRAM",
"EAN" : 7698,
"Description" : "CHALK BOX"
}
I have tried entering the following expression in the JSON transformer node but all I get is a question mark in the new column it generates:
$Material$:{"Material":$Material$,"UOM":$UOM$,"EAN":$EAN$,"Description":$Description$}
I have also tried replacing the $Material$ with "Material" but got the same result.
How would I go about this, please?
If you convert the Material column to String (for example with the String Manipulator node), you can easily configure the Columns to JSON node.
The Data bound key setting is the important part.
The String Manipulator node is configured with the expression string($Material$).
I finally managed to solve this by a different method.
I split the JSON data into several columns, then used the join function to create a string in the required order. I put the resulting string through the string to JSON node to create the new JSON object.
Thanks for all your tips and comments!
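JSON object keys are always strings, which is why the Material value has to be converted to a string before it can serve as a key. Outside of KNIME, the target shape can be illustrated with a few lines of Python; key_by is a hypothetical helper and this is only a sketch of the transformation, not of any KNIME node:

```python
import json


def key_by(record, key_field):
    """Wrap `record` in an object whose single key is the string form of record[key_field]."""
    return {str(record[key_field]): record}


row = {"Material": 101, "UOM": "GRAM", "EAN": 7698, "Description": "CHALK BOX"}
wrapped = key_by(row, "Material")
print(json.dumps(wrapped, indent=2))
```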
I'm trying to convert a NiFi flow file containing JSON to an AVRO record.
The problem I have is that I don't know how to deal with a fixed type in AVRO, i.e. how do I specify the proper JSON for converting to fixed?
Currently I'm using the ConvertJsonToAvro-processor.
The AVRO output schema:
{
"type" : "record",
"name" : "Message",
"namespace" : "com.example",
"fields" : [ {
"name" : "MAC",
"type" : {
"type" : "fixed",
"name" : "MY_FIXED_TYPE",
"size" : 6
}
}]
}
The input JSON forms I tried are:
{ "MAC": [ 0, 1, 2, 3, 4, 5] }
{ "MAC": "012345" }
{"MAC":"\u0000\u0001\u0002\u0003\u0004\u0005"}
{"MAC":{"MY_FIXED_TYPE": "\u0000\u0001\u0002\u0003\u0004\u0005"}}
Unfortunately none of them worked for me.
I also tried the ConvertRecord-processor instead of the ConvertJsonToAvro-processor. Also without any luck.
Any ideas?
After some further investigation, it looks like the ConvertJsonToAvro processor can't be used to generate Avro FIXED or BYTES datum. This is likely a bug with NiFi and how the processor uses Avro.
If I'm not mistaken:
The NiFi ConvertJsonToAvro uses the KiteSDK to interpret JSON into Avro data. This JSON-to-Avro conversion is not the same as Avro JSON encoding from the specification.
This processor reads the incoming string into a Jackson JsonNode.
FIXED and BYTES types need to correspond to a JsonNode where isBinary() is true.
As far as I can tell, parsing a JSON string with Jackson never generates such a JSON node.
I would raise a NiFi JIRA about this, or an issue on the KiteSDK.
Note: this answer does not apply to NiFi JSON to Avro conversion. My apologies for the mistaken assumption! I'm unsure of the best practice for an answer known to be wrong.
An example of a "correct" Avro JSON encoding for the bytes type is given in the spec. I think you're looking for:
{"MAC":"\u0000\u0001\u0002\u0003\u0004\u0005"}
Or alternatively (for a fixed schema in a union):
{"MAC":{"MY_FIXED_TYPE": "\u0000\u0001\u0002\u0003\u0004\u0005"}}
You can correctly parse this input string using your given Schema.
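For reference, the Avro JSON encoding represents bytes and fixed values as a JSON string in which each byte becomes the code point with the same value (0 to 255). A small Python sketch of producing that string form, using only the standard library; avro_json_fixed is a hypothetical helper, and as noted above this concerns the Avro JSON encoding, not the NiFi processor:

```python
import json


def avro_json_fixed(raw, size):
    """Render bytes as the string form the Avro JSON encoding uses for bytes/fixed."""
    if len(raw) != size:
        raise ValueError(f"fixed type expects {size} bytes, got {len(raw)}")
    # Each byte maps to the code point with the same value, i.e. ISO-8859-1.
    return raw.decode("latin-1")


mac = bytes([0, 1, 2, 3, 4, 5])
document = json.dumps({"MAC": avro_json_fixed(mac, 6)})
print(document)
```

Decoding works the other way around: encoding the parsed string as ISO-8859-1 gives back the original six bytes.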
Here are the desired schema and JSON for illustration purposes. Please see the link below.
JSON Schema and JSON
{
"id": "123" ,
"ts": "1234567890",
"complex_rules":
[
{
"type":"admin",
"rule":{
"rights":"all",
"remarks": "some admin remarks"
}
},
{
"type":"guest",
"rights": "limited"
},
{
"type":"anonymous",
"rights": "blocked"
}
]
}
'complex_rules' is an array of JSON objects:
The 'type' attribute is MANDATORY and must be one of "admin", "guest", or "anonymous".
Each object in the array can have its own structure, but no other value of the 'type' attribute is acceptable.
The conditions to evaluate:
The type of object in the array cannot re-occur in the array. (I know this seems to be not possible, so we can ignore this)
If the {type=admin object} has the attribute "rights" with any value, then the {type=guest object} must not have "rights": "limited" or any other value for "rights". The JSON Schema validation must complain about this.
Another twist: either the {"type":"guest"} object or the {"type":"anonymous"} object can exist. Both types cannot coexist along with other types.
----Update
The above link is the solution to this question.
In regards to 1 and 2:
You need to use a combination of if, then, and not keywords to construct the logic you require with the correct level of applicability.
In regards to 3:
The type of object in the array cannot re-occur in the array. (I know this seems to be not possible, so we can ignore this)
Right, that's correct, it's not possible as of draft-7 JSON Schema.
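Since that uniqueness condition cannot be expressed in the schema, one option is to check it in application code next to the schema validation. A minimal Python sketch under that assumption; check_rule_types is a hypothetical helper and covers only the 'type' rules, not the "rights" conditions:

```python
from collections import Counter

ALLOWED_TYPES = {"admin", "guest", "anonymous"}


def check_rule_types(document):
    """Return a list of problems with the 'type' values in complex_rules (empty if none)."""
    problems = []
    types = [rule.get("type") for rule in document.get("complex_rules", [])]
    for position, value in enumerate(types):
        if value is None:
            problems.append(f"rule {position}: missing mandatory 'type'")
        elif not isinstance(value, str) or value not in ALLOWED_TYPES:
            problems.append(f"rule {position}: unknown type {value!r}")
    # Count only string values so that the counter never sees unhashable input.
    counts = Counter(value for value in types if isinstance(value, str))
    for value, count in counts.items():
        if count > 1:
            problems.append(f"type {value!r} occurs {count} times")
    return problems


valid = {"complex_rules": [{"type": "admin"}, {"type": "guest", "rights": "limited"}]}
duplicated = {"complex_rules": [{"type": "admin"}, {"type": "admin"}]}
print(check_rule_types(valid))
print(check_rule_types(duplicated))
```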
I'm wondering if there is a JSON format for editing JSON?
E.g., if I had some JSON
{ "first name" : "Joe" }
and combined it with another JSON file ("dest" could perhaps be a JSON Pointer)
{ "action" : "add",
"dest" : "root",
"value" : { "surname" : "Blogs" }
}
the result would be
{ "first name" : "Joe", "surname" : "Blogs" }
and similarly for delete and change operations.
Is there a JSON format that does this? It may or may not be part of a NoSQL database or some JavaScript library, but I'm not after a JS library; I'm asking whether a JSON format for this exists. One would assume someone has done this before!
It doesn't work like that. JSON is just a data-interchange format (http://www.json.org/). There is no transactional semantic or anything that goes with it.
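JSON itself indeed carries no such semantics, but a separate standard, JSON Patch (RFC 6902), does define a JSON document format for describing edits: an array of operations such as "add", "remove", and "replace", each addressing its target with a JSON Pointer (RFC 6901). The Python sketch below applies a small subset of that format. apply_patch and parse_pointer are hypothetical helpers, only object members are supported (no arrays, no root path, no "move", "copy", or "test"), so it is an illustration rather than a conforming implementation:

```python
import copy


def parse_pointer(pointer):
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if pointer == "":
        return []
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    # "~1" is decoded before "~0" so that "~01" becomes "~1", not "/".
    return [token.replace("~1", "/").replace("~0", "~") for token in pointer[1:].split("/")]


def apply_patch(document, operations):
    """Apply add/remove/replace operations to object members; returns a new document."""
    result = copy.deepcopy(document)
    for operation in operations:
        tokens = parse_pointer(operation["path"])
        if not tokens:
            raise ValueError("this sketch does not support patching the root")
        *parents, last = tokens
        node = result
        for token in parents:
            node = node[token]
        op = operation["op"]
        if op == "add":
            node[last] = operation["value"]
        elif op == "replace":
            if last not in node:
                raise KeyError(last)
            node[last] = operation["value"]
        elif op == "remove":
            del node[last]
        else:
            raise ValueError(f"unsupported op: {op!r}")
    return result


original = {"first name": "Joe"}
patch = [{"op": "add", "path": "/surname", "value": "Blogs"}]
patched = apply_patch(original, patch)
print(patched)
```

A related format, JSON Merge Patch (RFC 7396), describes changes as a partial document instead of a list of operations.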
We are currently investigating JSON as a potential API data transfer language for our system and a question about using JSON Reference came up.
Consider the following example:
{
"invoice-address" : { "street": "John Street", "zip": "12345", "city": "Someville" },
"shipping-address": { "$ref": "#/invoice-address" }
}
According to our research, this is a valid usage of JSON Reference. We replace the instance of an object with another object containing the reference pointing to a different object using a JSON Pointer fragment.
Now, a JSON Reference always consists of a key-value pair and thus has to be enclosed in an object. This would mean that in order to reference a non-object data type (e.g. the zip and city strings in the example above) you would have to do the following:
{
"invoice-address" : { "street": "John Street", "zip": "12345", "city": "Someville" },
"shipping-address": { "street": "Doe Street", "zip": { "$ref": "#/invoice-address/zip" }, "city": { "$ref": "#/invoice-address/city" } }
}
Even though the JSON Pointers now correctly point to string values, we had to change the data type of zip and city from string to object, which makes them fail validation against our JSON Schema, because it declares them as strings.
However, the JSON Reference draft states:
Implementations MAY choose to replace the reference with the referenced value.
Does that mean that we are allowed to "preprocess" the file and replace the JSON Reference object with the resolved string value before validating against the JSON Schema? Or are references limited to object types only?
Thanks to anyone who can shed some light onto this.
I wouldn't expect most validators to resolve JSON References before validation. You could either:
resolve JSON References before validation
adapt the JSON Schemas to allow for JSON Reference objects in certain places
Personally, I think the first option is much neater.
You could end up with circular references I suppose - I don't know which validator/language you're using, but tv4 can definitely handle it.
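The first option, resolving references before validation, can be sketched in a few lines of Python. resolve_refs and lookup are hypothetical helpers; the sketch handles only local references of the form "#/..." (no remote documents, no percent-decoding of the fragment) and raises an error on circular references instead of trying to represent them:

```python
def lookup(root, pointer):
    """Follow a JSON Pointer such as '/invoice-address/zip' from the document root."""
    if pointer == "":
        return root
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    node = root
    for token in pointer[1:].split("/"):
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def resolve_refs(node, root, active=()):
    """Return a copy of `node` with every local {"$ref": "#..."} replaced by its target."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#"):
            if ref in active:
                raise ValueError(f"circular reference: {ref}")
            # The target may itself contain references, so resolve it recursively.
            return resolve_refs(lookup(root, ref[1:]), root, active + (ref,))
        return {key: resolve_refs(value, root, active) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, root, active) for item in node]
    return node


document = {
    "invoice-address": {"street": "John Street", "zip": "12345", "city": "Someville"},
    "shipping-address": {
        "street": "Doe Street",
        "zip": {"$ref": "#/invoice-address/zip"},
        "city": {"$ref": "#/invoice-address/city"},
    },
}
resolved = resolve_refs(document, document)
print(resolved["shipping-address"])
```

After this step zip and city are plain strings again, so the resolved document can be validated against a schema that declares them as strings.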