Circular references not handled in Avro JSON

There is a tool called avro-tools that ships with Avro and can be used to convert between JSON, Avro schema (.avsc), and binary formats.
But it does not work with circular references.
We have two files:
circular.avsc (generated by Avro)
circular.json (generated by Jackson, because the data has a circular reference and Avro cannot serialize it directly)
circular.avsc
{
  "type": "record",
  "name": "Parent",
  "namespace": "bigdata.example.avro",
  "fields": [
    {
      "name": "name",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "child",
      "type": [
        "null",
        {
          "type": "record",
          "name": "Child",
          "fields": [
            {
              "name": "name",
              "type": ["null", "string"],
              "default": null
            },
            {
              "name": "parent",
              "type": ["null", "Parent"],
              "default": null
            }
          ]
        }
      ],
      "default": null
    }
  ]
}
circular.json
{
  "#class": "bigdata.example.avro.Parent",
  "#circle_ref_id": 1,
  "name": "parent",
  "child": {
    "#class": "bigdata.example.avro.DerivedChild",
    "#circle_ref_id": 2,
    "name": "hello",
    "parent": 1
  }
}
Command to run avro-tools on the above
java -jar avro-tools-1.7.6.jar fromjson --schema-file circular.avsc circular.json
Output
2014-06-09 14:29:17.759 java[55860:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema?
{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"child","type":["null",{"type":"record","name":"Child","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"parent","type":["null","Parent"],"default":null}]}],"default":null}]}
Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
Some other JSON inputs I tried with the same schema, which also did not work:
JSON 1
{
  "name": "parent",
  "child": {
    "name": "hello",
    "parent": null
  }
}
JSON 2
{
  "name": "parent",
  "child": {
    "name": "hello",
  }
}
JSON 3
{
  "#class": "bigdata.example.avro.Parent",
  "#circle_ref_id": 1,
  "name": "parent",
  "child": {
    "#class": "bigdata.example.avro.DerivedChild",
    "#circle_ref_id": 2,
    "name": "hello",
    "parent": null
  }
}
Removing some of the "optional" elements:
circular.avsc
{
  "type": "record",
  "name": "Parent",
  "namespace": "bigdata.example.avro",
  "fields": [
    {
      "name": "name",
      "type": "string",
      "default": null
    },
    {
      "name": "child",
      "type": {
        "type": "record",
        "name": "Child",
        "fields": [
          {
            "name": "name",
            "type": "string",
            "default": null
          },
          {
            "name": "parent",
            "type": "Parent",
            "default": null
          }
        ]
      },
      "default": null
    }
  ]
}
circular.json
{
  "#class": "bigdata.example.avro.Parent",
  "#circle_ref_id": 1,
  "name": "parent",
  "child": {
    "#class": "bigdata.example.avro.DerivedChild",
    "#circle_ref_id": 2,
    "name": "hello",
    "parent": 1
  }
}
Output
2014-06-09 15:30:53.716 java[56261:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema?
{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":"string","default":null},{"name":"child","type":{"type":"record","name":"Child","fields":[{"name":"name","type":"string","default":null},{"name":"parent","type":"Parent","default":null}]},"default":null}]}
Exception in thread "main" java.lang.StackOverflowError
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:212)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
Does anyone know how I can make circular references work with Avro?

I ran into this same problem recently and resolved it with a workaround; hopefully it helps.
Based on the Avro specification:
JSON Encoding
Except for unions, the JSON encoding is the same as is used to encode field default values.
The value of a union is encoded in JSON as follows:
if its type is null, then it is encoded as a JSON null;
otherwise it is encoded as a JSON object with one name/value pair whose name is the type's name and whose value is the recursively encoded value. For Avro's named types (record, fixed or enum) the user-specified name is used, for other types the type name is used.
For example, the union schema ["null","string","Foo"], where Foo is a record name, would encode:
null as null;
the string "a" as {"string": "a"};
and a Foo instance as {"Foo": {...}}, where {...} indicates the JSON encoding of a Foo instance.
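Applying that rule to the schema in the question, every non-null union value has to be wrapped in a single-entry object keyed by the branch's type name. As a rough illustration (a minimal sketch; the class name is illustrative and not from the question), you can let Avro itself produce the JSON encoding for the first circular.avsc and see the shape that fromjson expects as its input:
import java.io.ByteArrayOutputStream;
import java.io.File;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

public class UnionEncodingDemo {
    public static void main(String[] args) throws Exception {
        Schema parentSchema = new Schema.Parser().parse(new File("circular.avsc"));
        // The Child record is the second branch of the ["null", Child] union.
        Schema childSchema = parentSchema.getField("child").schema().getTypes().get(1);

        GenericRecord child = new GenericData.Record(childSchema);
        child.put("name", "hello");
        child.put("parent", null); // the cycle has to be broken with null

        GenericRecord parent = new GenericData.Record(parentSchema);
        parent.put("name", "parent");
        parent.put("child", child);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        JsonEncoder encoder = EncoderFactory.get().jsonEncoder(parentSchema, out);
        DatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(parentSchema);
        writer.write(parent, encoder);
        encoder.flush();

        // Prints roughly:
        // {"name":{"string":"parent"},"child":{"bigdata.example.avro.Child":
        //   {"name":{"string":"hello"},"parent":null}}}
        System.out.println(out.toString("UTF-8"));
    }
}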
If the source file cannot be changed to follow that requirement, the code has to change instead. So I copied the original org.apache.avro.io.JsonDecoder class from the avro-1.7.7 package and created my own class, MyJsonDecoder.
Here is the key place I changed, besides adding new constructors and renaming the class:
@Override
public int readIndex() throws IOException {
    advance(Symbol.UNION);
    Symbol.Alternative a = (Symbol.Alternative) parser.popSymbol();
    String label;
    if (in.getCurrentToken() == JsonToken.VALUE_NULL) {
        label = "null";
    //***********************************************
    // Original code: according to the Avro documentation, "JSON Encoding":
    // it is encoded as a JSON object with one name/value pair whose name is
    // the type's name and whose value is the recursively encoded value.
    // Can't change the source data, so remove this rule.
    // } else if (in.getCurrentToken() == JsonToken.START_OBJECT &&
    //     in.nextToken() == JsonToken.FIELD_NAME) {
    //   label = in.getText();
    //   in.nextToken();
    //   parser.pushSymbol(Symbol.UNION_END);
    //***********************************************
    // Customized code:
    // Check whether the JSON token's type matches one of the types
    // declared in the union, and if so parse it that way.
    } else {
        label = findTypeInUnion(in.getCurrentToken(), a);
        // Field missing but not allowed to be null,
        // or field type is not in the union.
        if (label == null) {
            throw error("start-union, type may not be in UNION,");
        }
    }
    //***********************************************
    // Original code: directly error out otherwise
    // } else {
    //   throw error("start-union");
    // }
    //***********************************************
    int n = a.findLabel(label);
    if (n < 0)
        throw new AvroTypeException("Unknown union branch " + label);
    parser.pushSymbol(a.getSymbol(n));
    return n;
}
/**
 * Method to check whether the current JSON token type is declared in the union.
 * Does NOT support "record", "enum", "fixed":
 * these types require a user-defined name in the Avro schema, and if the
 * user-defined name cannot be found in the JSON file, we can't decode.
 *
 * @param jsonToken JsonToken
 * @param symbolAlternative Symbol.Alternative
 * @return String Parsing label, i.e. which way to decode.
 */
private String findTypeInUnion(final JsonToken jsonToken,
        final Symbol.Alternative symbolAlternative) {
    // Create a lookup map from JsonToken to Avro type
    final HashMap<JsonToken, String> json2Avro = new HashMap<>();
    for (int i = 0; i < symbolAlternative.size(); i++) {
        // Get the type declared in the union (symbolAlternative.getLabel(i))
        // and map the corresponding JsonToken to that Avro type.
        switch (symbolAlternative.getLabel(i)) {
            case "null":
                json2Avro.put(JsonToken.VALUE_NULL, "null");
                break;
            case "boolean":
                json2Avro.put(JsonToken.VALUE_TRUE, "boolean");
                json2Avro.put(JsonToken.VALUE_FALSE, "boolean");
                break;
            case "int":
                json2Avro.put(JsonToken.VALUE_NUMBER_INT, "int");
                break;
            case "long":
                json2Avro.put(JsonToken.VALUE_NUMBER_INT, "long");
                break;
            case "float":
                json2Avro.put(JsonToken.VALUE_NUMBER_FLOAT, "float");
                break;
            case "double":
                json2Avro.put(JsonToken.VALUE_NUMBER_FLOAT, "double");
                break;
            case "bytes":
                json2Avro.put(JsonToken.VALUE_STRING, "bytes");
                break;
            case "string":
                json2Avro.put(JsonToken.VALUE_STRING, "string");
                break;
            case "array":
                json2Avro.put(JsonToken.START_ARRAY, "array");
                break;
            case "map":
                json2Avro.put(JsonToken.START_OBJECT, "map");
                break;
            default:
                break;
        }
    }
    // Look up the Avro type that corresponds to the JsonToken
    return json2Avro.get(jsonToken);
}
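For reference, here is a minimal sketch of how such a customized decoder could be wired up with a GenericDatumReader. It assumes MyJsonDecoder exposes a public (Schema, InputStream) constructor (one of the "new constructors" mentioned above) and behaves like the original JsonDecoder; the demo class name is illustrative only:
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;

public class MyJsonDecoderDemo {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(new File("circular.avsc"));
        InputStream in = new FileInputStream("circular.json");

        // MyJsonDecoder is the customized copy of JsonDecoder described above
        // (assumed to provide a public (Schema, InputStream) constructor).
        MyJsonDecoder decoder = new MyJsonDecoder(schema, in);
        DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema);
        GenericRecord record = reader.read(null, decoder);

        System.out.println(record);
        in.close();
    }
}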
The general idea is to check whether the type found in the source file can be matched against a type in the union.
There are still some issues:
This solution doesn't support the "record", "enum", or "fixed" Avro types, because these types require a user-defined name. E.g. if you want the union "type": ["null", {"name": "abc", "type": "record", "fields": ...}], this code will not work. For primitive types it should work, but please test it before you use it in your project.
Personally, I think records should not be null, because records are the things I need to be sure exist; if one is missing, that means I have a bigger problem. If a structure can be omitted, I prefer to use "map" instead of "record" when defining the schema.
Hopefully this helps.

Related

How to remove a key-value pair from JSON using DataWeave in Mule 4

I want to remove a key from the JSON payload if it is empty, for example:
{
  "transactionDetails": {
    "maintenanceType": null,
    "transactionDate": "2021-10-07T05:38:38.44-05:00"
  },
  "account": {
    "agentOfRecord": {
      "type": "true",
      "rateType": ""
    },
    "subAccounts": {
      "subAccount": [{
        "agentOfRecord": []
      }]
    }
  }
}
In the above example two keys are empty: "rateType" and "agentOfRecord". How can I remove these two keys from the payload?
The expected result would be like this:
{
  "transactionDetails": {
    "maintenanceType": null,
    "transactionDate": "2021-10-07T05:38:38.44-05:00"
  },
  "account": {
    "agentOfRecord": {
      "type": "true"
    },
    "subAccounts": {
      "subAccount": [{
      }]
    }
  }
}
I tried the code below but it is not working; it is not filtering the actual key:
%dw 2.0
output application/json
---
payload filterObject ((value, key) -> (key as String != "Test"))
The variable filterList is not used directly inside the function so that the function is more reusable. The variable could be replaced by a list obtained from a configuration file or a database.
This script should remove all keys mentioned in filterList that are 'empty'. I used a custom empty function because the built-in isEmpty() function also treats empty objects as empty, and I wasn't sure if you wanted that. Otherwise you can use the built-in version.
%dw 2.0
output application/json
var filterList = ["rateType", "agentOfRecord"]
fun isEmptyCustom(x) =
    x match {
        case is Array -> sizeOf(x) == 0
        case is String -> sizeOf(x) == 0
        else -> false
    }
fun filterKey(k, v, f) = !isEmptyCustom(log("v", v)) or
    !(f contains (log("k", k) as String))
fun filterKeyRecursive(x, f) =
    x match {
        case is Object ->
            x
                filterObject ((value, key, index) -> filterKey(key, value, f))
                mapObject ($$): filterKeyRecursive($, f)
        case is Array -> x map filterKeyRecursive($, f)
        else -> x
    }
---
filterKeyRecursive(payload, filterList)
Update: fixed the condition.

How to search in a JSON object in Angular using a string?

JSON
{
  "rootData": {
    "test1": {
      "testData0": "Previous data",
      "testData1": "Earlier Data"
    },
    "test2": {
      "testData0": "Partial data",
      "testData1": "Services data"
    },
    "test3": {
      "testData0": "Regular data",
      "testData1": {
        "testData0": "Your package"
      }
    }
  }
}
Component.ts
import * as configData from './myData.json';

getData(data: string) {
  console.log(configData.rootData.test1.testData0); // returns "Previous data"
  return configData.rootData.{{data}}.testData0;
}
This getData method is called in a loop, passing a string with the value "test1" the first time, "test2" the second time, and "test3" the third time.
I want to do something like this:
return configData.rootData.{{data}}.testData0; // should return "Previous data", then "Partial data" when called again, because "test2" will be passed in the data string.
I know this is not possible the way I am doing it, because {{data}} is not defined in my JSON object.
The goal is to look up an object inside the object. The string data contains values that exist in the JSON object. I want to use that string to dynamically search the JSON and pull the values.
I know my attempt is not valid. I would like to know if there is an alternative to make this work as I intended.
To get a value by its key from an object, you can use Object[key] (here, key is a variable name), and this will return the value of the selected key.
return configData.rootData[data]?.testData0; // JavaScript case
So instead of using {{ }}, replace it with square brackets and you will get the result.
In the above code, rootData[data]?.testData0 behaves like rootData[data] ? rootData[data].testData0 : undefined, so it also covers the validation check (an unexpected data value).
In TypeScript:
if (data in configData.rootData && "testData0" in configData.rootData[data]) {
  return configData.rootData[data].testData0;
} else {
  return undefined;
}
const input = {
  "rootData": {
    "test1": {
      "testData0": "Previous data",
      "testData1": "Earlier Data"
    },
    "test2": {
      "testData0": "Partial data",
      "testData1": "Services data"
    },
    "test3": {
      "testData0": "Regular data",
      "testData1": {
        "testData0": "Your package"
      }
    }
  }
};

let data = 'test1';
console.log(input.rootData[data]?.testData0);
data = 'test2';
console.log(input.rootData[data]?.testData0);
data = 'test3';
console.log(input.rootData[data]?.testData0);
data = 'test4';
console.log(input.rootData[data]?.testData0);
data = 'test5';
if (data in input.rootData) {
  console.log('Existed', input.rootData[data].testData0);
} else {
  console.log('Not Existed');
}
I use ng2-search-filter.
By directive:
<tr *ngFor="let data of configData | filter:searchText">
  <td>{{ data.testData0 }}</td>
  <td>{{ data.testData1 }}</td>
  ...
</tr>
Or programmatically
let configDataFiltered = new Ng2SearchPipe().transform(this.configData, searchText);
Practical example: https://angular-search-filter.stackblitz.io

Remove a specific JSONObject from a JSONArray in Groovy

Say I have a JSON request payload like
{
  "workflow": {
    "approvalStore": {
      "sessionInfo": {
        "user": "baduser"
      },
      "guardType": "Transaction"
    }
  }
}
I get the value of user via
def user = req.get("workflow").get("approvalStore").get("sessionInfo").get("user")
Now, I get a RestResponse approvalList, which I store as a list and return to the caller via return approvalList.json as JSON. All good so far.
Suppose the response (approvalList.json) looks like the JSONArray below:
[
  {
    "objId": "abc2",
    "maker": "baduser"
  },
  {
    "objId": "abc1",
    "maker": "baduser"
  },
  {
    "objId": "abc4",
    "maker": "gooduser"
  }
]
Question: How can I filter approvalList.json so that it doesn't contain entries (objects) that have "maker": "baduser"? The value compared against maker should essentially be the user variable I got earlier.
Ideal required output:
It's not entirely clear if you always want a single object returned or a list of objects, but using findAll is going to be the key here:
// given this list
List approvalList = [
    [objId: "abc2", maker: "baduser"],
    [objId: "abc1", maker: "baduser"],
    [objId: "abc4", maker: "gooduser"]
]
// you mentioned you wanted to match a specific user
String user = "baduser"
List filteredList = approvalList.findAll { it.maker != user }
// wasn't sure if you wanted a single object or a list...
if (filteredList.size() == 1) {
    return filteredList[0] as JSON
} else {
    return filteredList as JSON
}
Pretty simple. First parse the JSON into an object, then walk through and test.
def json = JSON.parse(text)
json.each { entry ->
    entry.each { k, v ->
        if (v == 'baduser') {
            // throw exception or something
        }
    }
}

How to get the exact JSON node instance using Groovy?

Input JSON file:
{
  "menu": {
    "id": "file",
    "value": "File",
    "popup": {
      "menuitem": [
        {
          "value": "New",
          "onclick": ["CreateNewDoc()", "hai"],
          "newnode": "added"
        }
      ]
    }
  }
}
Groovy code:
import groovy.json.JsonSlurper

def newjson = new JsonSlurper().parse(new File('/tmp/test.json'))
def value = newjson.menu.popup.menuitem.value
def oneclick = newjson.menu.popup.menuitem.onclick
println value
println value.class
println oneclick
println oneclick.class
Output:
[New]
class java.util.ArrayList
[[CreateNewDoc(), hai]]
class java.util.ArrayList
Here, the JSON nodes which carry a String and a List return the same class with the Groovy code shown above.
How can I differentiate the nodes value and onclick? Logically I expect value to be an instance of String, but both come back as ArrayList.
How can I get the exact type of a JSON node using Groovy?
Update 1:
I don't know exactly whether this can be done as shown below. My expectation is to get results like this:
New
class java.lang.String
[CreateNewDoc(), hai]
class java.util.ArrayList
Here you go:
In the script below, a closure is used to show the details of each value and its type.
Another closure is used to show each map in the menuitem list.
def printDetails = { key, value -> println "Key - $key, its value is \"${value}\" and is of type ${value.class}" }
def showMap = { map -> map.collect { k, v -> printDetails(k, v) } }
def json = new groovy.json.JsonSlurper().parse(new File('/tmp/test.json'))
def mItem = json.menu.popup.menuitem
if (mItem instanceof List) {
    mItem.collect { showMap(it) }
}
println 'done'
You can quickly try the same in an online demo.
menuitem is a list, so you need to get the property on a concrete list element:
assert newjson.menu.popup.menuitem instanceof List
assert newjson.menu.popup.menuitem[0].value instanceof String
assert newjson.menu.popup.menuitem[0].onclick instanceof List
In your JSON, menuitem contains an array with one object:
"menuitem": [
  {
    "value": "New",
    "onclick": ["CreateNewDoc()", "hai"],
    "newnode": "added"
  }
]
When you try to access menuitem.value, Groovy actually returns a list of the value attributes of all objects in the menuitem array.
That's why menuitem.value returns the array ["New"].
In this case:
"menuitem": [
  {
    "value": "New",
    "onclick": ["CreateNewDoc()", "hai"],
    "newnode": "added"
  },
  {
    "value": "Old",
    "onclick": ["CreateOldDoc()", "hai"],
    "newnode": "added"
  }
]
menuitem.value will return the array ["New", "Old"],
but menuitem[0].value will return the string value "New".
So in your Groovy code, to get the attributes of the first menu item:
def value=newjson.menu.popup.menuitem[0].value
def oneclick=newjson.menu.popup.menuitem[0].onclick

Need help appending a JSON file using MergeContent in Apache NiFi

I'm trying to merge a JSON file which has multiple objects. Below is my original JSON file.
{
  "applicant": {
    "full-name": "Tyrion Lannister",
    "mobile-number": "8435739739",
    "email-id": "tyrionlannister_casterlyrock@gmail.com"
  },
  "product": {
    "product-category": "Credit Card",
    "product-type": "Super Value Card - Titanium"
  }
}
I will get some more JSON data, as below, from another source.
{
  "flags": {
    "duplicate-flag": "No",
    "contact-flag": "Yes"
  }
}
My task is to append the new JSON to the old JSON record as a new object, as below.
{
  "applicant": {
    "full-name": "Tyrion Lannister",
    "mobile-number": "8435739739",
    "email-id": "tyrionlannister_casterlyrock@gmail.com"
  },
  "product": {
    "product-category": "Credit Card",
    "product-type": "Super Value Card - Titanium"
  },
  "flags": {
    "duplicate-flag": "No",
    "contact-flag": "Yes"
  }
}
Can someone help guide me on how this can be achieved in NiFi?
I recommend accumulating your components as flowfile attributes, then forming a merged object with an ExecuteScript processor using JavaScript/ECMAScript. Sometimes there's just no substitute for JavaScript. Something like the following might work:
var flowFile = session.get();
if (flowFile != null) {
    var OutputStreamCallback = Java.type("org.apache.nifi.processor.io.OutputStreamCallback");
    var StandardCharsets = Java.type("java.nio.charset.StandardCharsets");
    // Get attributes
    var applicant = JSON.parse(flowFile.getAttribute("applicant"));
    var product = JSON.parse(flowFile.getAttribute("product"));
    var flags = JSON.parse(flowFile.getAttribute("flags"));
    // Combine
    var merged = {
        "applicant": applicant,
        "product": product,
        "flags": flags
    };
    // Write output content
    flowFile = session.write(flowFile, new OutputStreamCallback(function (outputStream) {
        outputStream.write(JSON.stringify(merged, null, "\t").getBytes(StandardCharsets.UTF_8));
    }));
    session.transfer(flowFile, REL_SUCCESS);
}