There is a tool called Avro-Tools which ships with Avro and can be used to convert between JSON, Avro-Schema (.avsc) and binary formats.
But it does not work with circular references.
We have two files:
circular.avsc (generated by Avro)
circular.json (generated by Jackson, because the object graph has a circular reference and Avro does not handle that).
circular.avsc
{
"type":"record",
"name":"Parent",
"namespace":"bigdata.example.avro",
"fields":[
{
"name":"name",
"type":[
"null",
"string"
],
"default":null
},
{
"name":"child",
"type":[
"null",
{
"type":"record",
"name":"Child",
"fields":[
{
"name":"name",
"type":[
"null",
"string"
],
"default":null
},
{
"name":"parent",
"type":[
"null",
"Parent"
],
"default":null
}
]
}
],
"default":null
}
]
}
circular.json
{
"#class":"bigdata.example.avro.Parent",
"#circle_ref_id":1,
"name":"parent",
"child":{
"#class":"bigdata.example.avro.DerivedChild",
"#circle_ref_id":2,
"name":"hello",
"parent":1
}
}
Command to run avro-tools on the above
java -jar avro-tools-1.7.6.jar fromjson --schema-file circular.avsc circular.json
Output
2014-06-09 14:29:17.759 java[55860:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema?
{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"child","type":["null",{"type":"record","name":"Child","fields":[{"name":"name","type":["null","string"],"default":null},{"name":"parent","type":["null","Parent"],"default":null}]}],"default":null}]}?'???K?jH!??Ė?Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
Some other JSON values I tried with the same schema; none of them worked:
JSON 1
{
"name":"parent",
"child":{
"name":"hello",
"parent":null
}
}
JSON 2
{
"name":"parent",
"child":{
"name":"hello",
}
}
JSON 3
{
"#class":"bigdata.example.avro.Parent",
"#circle_ref_id":1,
"name":"parent",
"child":{
"#class":"bigdata.example.avro.DerivedChild",
"#circle_ref_id":2,
"name":"hello",
"parent":null
}
}
Removing some of the "optional" elements:
circular.avsc
{
"type":"record",
"name":"Parent",
"namespace":"bigdata.example.avro",
"fields":[
{
"name":"name",
"type":
"string",
"default":null
},
{
"name":"child",
"type":
{
"type":"record",
"name":"Child",
"fields":[
{
"name":"name",
"type":
"string",
"default":null
},
{
"name":"parent",
"type":
"Parent",
"default":null
}
]
},
"default":null
}
]
}
circular.json
{
"#class":"bigdata.example.avro.Parent",
"#circle_ref_id":1,
"name":"parent",
"child":{
"#class":"bigdata.example.avro.DerivedChild",
"#circle_ref_id":2,
"name":"hello",
"parent":1
}
}
output
2014-06-09 15:30:53.716 java[56261:1607] Unable to load realm mapping info from SCDynamicStore
Objavro.codenullavro.schema?{"type":"record","name":"Parent","namespace":"bigdata.example.avro","fields":[{"name":"name","type":"string","default":null},{"name":"child","type":{"type":"record","name":"Child","fields":[{"name":"name","type":"string","default":null},{"name":"parent","type":"Parent","default":null}]},"default":null}]}?x?N??O"?M?`AbException in thread "main" java.lang.StackOverflowError
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:212)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
at org.apache.avro.io.parsing.Symbol.flattenedSize(Symbol.java:216)
at org.apache.avro.io.parsing.Symbol$Sequence.flattenedSize(Symbol.java:323)
Does anyone know how I can make circular references work with Avro?
I ran into the same problem recently and resolved it with a workaround; hopefully it helps.
Based on the Avro specification:
JSON Encoding
Except for unions, the JSON encoding is the same as is used to encode field default values.
The value of a union is encoded in JSON as follows:
if its type is null, then it is encoded as a JSON null;
otherwise it is encoded as a JSON object with one name/value pair whose name is the type's name and whose value is the recursively encoded value. For Avro's named types (record, fixed or enum) the user-specified name is used, for other types the type name is used.
For example, the union schema ["null","string","Foo"], where Foo is a record name, would encode:
null as null;
the string "a" as {"string": "a"};
and a Foo instance as {"Foo": {...}}, where {...} indicates the JSON encoding of a Foo instance.
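To make the rule above concrete, here is a minimal sketch of the wrapping it describes. The helper encodeUnion and the Json type are hypothetical names introduced only for illustration; they are not part of Avro. The Foo record's id field is also made up.

```typescript
// Hypothetical helper, not part of Avro: wraps a value the way the
// spec's JSON encoding describes for a union branch.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function encodeUnion(branch: string, value: Json): Json {
  // The "null" branch is written as a bare JSON null.
  if (branch === "null") {
    return null;
  }
  // Every other branch becomes a one-key object: { "<type name>": value }.
  return { [branch]: value };
}

// The spec's example union ["null", "string", "Foo"]:
console.log(JSON.stringify(encodeUnion("null", null)));        // null
console.log(JSON.stringify(encodeUnion("string", "a")));       // {"string":"a"}
console.log(JSON.stringify(encodeUnion("Foo", { id: 1 })));    // {"Foo":{"id":1}}
```

Read this way, a field declared as ["null","string"] would need to appear as {"string": "parent"} rather than a bare "parent", which is consistent with the "Expected start-union. Got VALUE_STRING" error in the question.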
If the source file cannot be changed to follow this requirement, we may have to change the code instead. So I customized the original org.apache.avro.io.JsonDecoder class from the avro-1.7.7 package and created my own class, MyJsonDecoder.
Here are the key places I changed, besides creating new constructors and renaming the class:
@Override
public int readIndex() throws IOException {
advance(Symbol.UNION);
Symbol.Alternative a = (Symbol.Alternative) parser.popSymbol();
String label;
if (in.getCurrentToken() == JsonToken.VALUE_NULL) {
label = "null";
//***********************************************
// Original code, following the Avro documentation ("JSON Encoding"):
// a non-null union value is encoded as a JSON object with one name/value
// pair whose name is the type's name and whose value is the recursively
// encoded value. The source data can't be changed, so this rule is removed.
// } else if (in.getCurrentToken() == JsonToken.START_OBJECT &&
// in.nextToken() == JsonToken.FIELD_NAME) {
// label = in.getText();
// in.nextToken();
// parser.pushSymbol(Symbol.UNION_END);
//***********************************************
// Customized code:
// look up the Avro type that matches the current JSON token
// among the types declared in the union.
} else {
label = findTypeInUnion(in.getCurrentToken(), a);
// Either the field is missing but null is not allowed,
// or the field's type is not in the union.
if (label == null) {
throw error("start-union, type may not be in UNION,");
}
}
//***********************************************
// Original code: fail immediately for any other token
// } else {
// throw error("start-union");
// }
//***********************************************
int n = a.findLabel(label);
if (n < 0)
throw new AvroTypeException("Unknown union branch " + label);
parser.pushSymbol(a.getSymbol(n));
return n;
}
/**
* Checks whether the type of the current JSON token is declared in the union.
* Does NOT support "record", "enum", or "fixed":
* these types require a user-defined name in the Avro schema, and if that
* name cannot be found in the JSON file, the value can't be decoded.
*
* @param jsonToken JsonToken
* @param symbolAlternative Symbol.Alternative
* @return String the parsing label, i.e. which union branch to decode as (null if no type matches).
*/
private String findTypeInUnion(final JsonToken jsonToken,
final Symbol.Alternative symbolAlternative) {
// Create a map for looking up: JsonToken and Avro type
final HashMap<JsonToken, String> json2Avro = new HashMap<>();
for (int i = 0; i < symbolAlternative.size(); i++) {
// Get the type declared in union: symbolAlternative.getLabel(i).
// Map the JsonToken with Avro type.
switch (symbolAlternative.getLabel(i)) {
case "null":
json2Avro.put(JsonToken.VALUE_NULL, "null");
break;
case "boolean":
json2Avro.put(JsonToken.VALUE_TRUE, "boolean");
json2Avro.put(JsonToken.VALUE_FALSE, "boolean");
break;
case "int":
json2Avro.put(JsonToken.VALUE_NUMBER_INT, "int");
break;
case "long":
json2Avro.put(JsonToken.VALUE_NUMBER_INT, "long");
break;
case "float":
json2Avro.put(JsonToken.VALUE_NUMBER_FLOAT, "float");
break;
case "double":
json2Avro.put(JsonToken.VALUE_NUMBER_FLOAT, "double");
break;
case "bytes":
json2Avro.put(JsonToken.VALUE_STRING, "bytes");
break;
case "string":
json2Avro.put(JsonToken.VALUE_STRING, "string");
break;
case "array":
json2Avro.put(JsonToken.START_ARRAY, "array");
break;
case "map":
json2Avro.put(JsonToken.START_OBJECT, "map");
break;
default: break;
}
}
// Looking up the map to find out related Avro type to JsonToken
return json2Avro.get(jsonToken);
}
The general idea is to check whether the type of the value in the source file can be found in the union.
There are still some issues:
This solution doesn't support the "record", "enum", or "fixed" Avro types, because these types require a user-defined name. E.g. for the union "type": ["null", {"name": "abc", "type": "record", "fields" : ...}], this code will not work. For primitive types it should work, but please test it before you use it in your project.
Personally, I think records should not be nullable: a record is something I need to be sure exists, and if it is missing, I have a bigger problem. If a value can be omitted, I prefer to use "map" rather than "record" as its type when defining the schema.
Hopefully this helps.
I want to remove a key from a JSON payload if its value is empty, e.g.
{
"transactionDetails": {
"maintenanceType": null,
"transactionDate": "2021-10-07T05:38:38.44-05:00"
},
"account": {
"agentOfRecord": {
"type": "true",
"rateType": ""
},
"subAccounts": {
"subAccount": [{
"agentOfRecord": []
}]
}
}
}
In the above example, two keys are empty: "rateType" and "agentOfRecord". How can I remove these two keys from the payload?
The expected result looks like this:
{
"transactionDetails": {
"maintenanceType": null,
"transactionDate": "2021-10-07T05:38:38.44-05:00"
},
"account": {
"agentOfRecord": {
"type": "true"
},
"subAccounts": {
"subAccount": [{
}]
}
}
}
I tried the code below, but it is not working; it does not filter out the actual keys:
%dw 2.0
output application/json
---
payload filterObject ((value, key) -> (key as String != "Test"))
The filterList variable is passed to the functions as a parameter rather than referenced directly, so that the functions are more reusable. The variable could be replaced by a list obtained from a configuration or a database.
This script should remove all keys mentioned in filterList whose values are 'empty'. I used a custom empty function because the built-in isEmpty() function also treats empty objects as empty, and I wasn't sure if you wanted that. Otherwise you can use the built-in version.
%dw 2.0
output application/json
var filterList=["rateType", "agentOfRecord"]
fun isEmptyCustom(x)=
x match {
case is Array -> sizeOf(x) == 0
case is String -> sizeOf(x) == 0
else -> false
}
fun filterKey(k, v, f)= !isEmptyCustom(log("v",v)) or
!(f contains (log("k", k) as String))
fun filteKeyRecursive(x, f) =
x match {
case is Object ->
x
filterObject ((value, key, index) -> filterKey(key, value, f))
mapObject ($$): filteKeyRecursive($, f)
case is Array -> x map filteKeyRecursive($, f)
else -> x
}
---
filteKeyRecursive(payload, filterList)
Update: fixed the condition.
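For readers who want to check the logic outside Mule, here is the same recursion sketched in TypeScript. It is only an illustration of the idea, not a replacement for the DataWeave script; the names Json and removeEmptyKeys are hypothetical, and the sample payload is a trimmed version of the one in the question.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Mirrors isEmptyCustom above: only empty strings and empty arrays are "empty".
function isEmptyCustom(value: Json): boolean {
  if (typeof value === "string" || Array.isArray(value)) {
    return value.length === 0;
  }
  return false;
}

// Recursively drops a key only when it is listed in keys AND its value is empty.
function removeEmptyKeys(value: Json, keys: string[]): Json {
  if (Array.isArray(value)) {
    return value.map((item) => removeEmptyKeys(item, keys));
  }
  if (value !== null && typeof value === "object") {
    const result: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      const child = value[key];
      if (keys.indexOf(key) !== -1 && isEmptyCustom(child)) {
        continue; // listed and empty: skip this key
      }
      result[key] = removeEmptyKeys(child, keys);
    }
    return result;
  }
  return value; // primitives and null pass through unchanged
}

const payload: Json = {
  account: {
    agentOfRecord: { type: "true", rateType: "" },
    subAccounts: { subAccount: [{ agentOfRecord: [] }] },
  },
};
const cleaned = removeEmptyKeys(payload, ["rateType", "agentOfRecord"]);
console.log(JSON.stringify(cleaned));
// {"account":{"agentOfRecord":{"type":"true"},"subAccounts":{"subAccount":[{}]}}}
```

Note that account.agentOfRecord survives because it is a non-empty object, while the empty array under subAccount is removed, matching the expected result in the question.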
Json
{
"rootData": {
"test1": {
"testData0": "Previous data",
"testData1": "Earlier Data"
},
"test2": {
"testData0": "Partial data",
"testData1": "Services data"
},
"test3": {
"testData0": "Regular data",
"testData1": {
"testData0": "Your package"
}
}
}
}
Component.ts
import * as configData from './myData.json';
getData(data: string){
console.log(configData.rootData.test1.testData0); // logs "Previous data"
return configData.rootData.{{data}}.testData0;
}
The getData method is called in a loop with a string argument: "test1" the first time, "test2" the second time, and "test3" the third time.
I want to do something like this:
return configData.rootData.{{data}}.testData0; // should return "Previous data", then "Partial data" on the next call, because "test2" is passed in data
I know this is not possible the way I am doing it, because {{data}} is not defined in my JSON object.
The goal is to look up an object inside the object. The string data holds values that exist as keys in the JSON object. I want to use it to dynamically search the JSON file and pull out the values.
I know my attempt is not valid. I would like to know if there is an alternative that makes this work as I intended.
To get a value from an object by key, use object[key] (here, key is a variable holding the key name); this returns the value stored under that key.
return configData.rootData[data]?.testData0; // JavaScript case
So replace the {{ }} with square brackets and you will get the result.
In the code above, rootData[data]?.testData0 means the same as rootData[data] != null ? rootData[data].testData0 : undefined, so it acts as a validation check against unexpected values of data.
In TypeScript,
if (data in configData.rootData && "testData0" in configData.rootData[data]) {
return configData.rootData[data].testData0;
} else {
return undefined;
}
const input = {
"rootData": {
"test1": {
"testData0": "Previous data",
"testData1": "Earlier Data"
},
"test2": {
"testData0": "Partial data",
"testData1": "Services data"
},
"test3": {
"testData0": "Regular data",
"testData1": {
"testData0": "Your package"
}
}
}
};
let data = 'test1';
console.log(input.rootData[data]?.testData0);
data = 'test2';
console.log(input.rootData[data]?.testData0);
data = 'test3';
console.log(input.rootData[data]?.testData0);
data = 'test4';
console.log(input.rootData[data]?.testData0);
data = 'test5';
if (data in input.rootData) {
console.log('Existed', input.rootData[data].testData0);
} else {
console.log('Not Existed');
}
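Under strict TypeScript settings, indexing rootData with an arbitrary string is typically rejected by the compiler. One possible way around that, sketched here with an inline copy of the JSON instead of the imported file, is a type guard that narrows the string to a known key. The names RootKey and isRootKey are hypothetical, introduced only for this example.

```typescript
// Inline stand-in for the imported myData.json.
const configData = {
  rootData: {
    test1: { testData0: "Previous data", testData1: "Earlier Data" },
    test2: { testData0: "Partial data", testData1: "Services data" },
    test3: { testData0: "Regular data", testData1: { testData0: "Your package" } },
  },
};

// Union of the known keys: "test1" | "test2" | "test3".
type RootKey = keyof typeof configData.rootData;

// Type guard: narrows an arbitrary string to one of the known keys.
// hasOwnProperty is used instead of `in` so inherited names such as
// "toString" are not treated as valid keys.
function isRootKey(key: string): key is RootKey {
  return Object.prototype.hasOwnProperty.call(configData.rootData, key);
}

function getData(data: string): string | undefined {
  return isRootKey(data) ? configData.rootData[data].testData0 : undefined;
}

console.log(getData("test1")); // Previous data
console.log(getData("test2")); // Partial data
console.log(getData("test4")); // undefined
```

The guard keeps the lookup dynamic while letting the compiler see that testData0 exists on whatever entry is selected.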
I use ng2-search-filter.
By directive
<tr *ngFor="let data of configData | filter:searchText">
<td>{{data.testData0}}</td>
<td>{{data.testData1}}</td>
...
</tr>
Or programmatically
let configDataFiltered = new Ng2SearchPipe().transform(this.configData, searchText);
Practical example: https://angular-search-filter.stackblitz.io
Say I have a JSON request payload like
{
"workflow": {
"approvalStore": {
"sessionInfo": {
"user": "baduser"
},
"guardType": "Transaction"
}
}
}
I get the value of user via
def user = req.get("workflow").get("approvalStore").get("sessionInfo").get("user")
Now, I get a RestResponse approvalList which I store as list and return to caller as return approvalList.json as JSON. All well so far.
Suppose the response (approvalList.json) looks like below JSONArray -
[
{
"objId": "abc2",
"maker": "baduser"
},
{
"objId": "abc1",
"maker": "baduser"
},
{
"objId": "abc4",
"maker": "gooduser"
}
]
Question : How may I filter the approvalList.json so that it doesn't contain entries (objects) that have "maker": "baduser" ? The value passed to maker should essentially be the user variable I got earlier.
Ideal required output -
It's not entirely clear if you always want a single object returned or a list of objects, but using findAll is going to be the key here:
// given this list
List approvalList = [
[objId: "abc2", maker: "baduser"],
[objId: "abc1", maker: "baduser"],
[objId: "abc4", maker: "gooduser"]
]
// you mentioned you wanted to match a specific user
String user = "baduser"
List filteredList = approvalList.findAll{ it.maker != user}
// wasn't sure if you wanted a single object or a list...
if (filteredList.size() == 1) {
return filteredList[0] as JSON
} else {
return filteredList as JSON
}
Pretty simple. First parse the JSON into an object, then walk through and test.
JSONObject json = JSON.parse(text)
json.each(){ it ->
it.each(){ k,v ->
if(v=='baduser'){
// throw exception or something
}
}
}
Input
Json file :
{
"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{
"value": "New",
"onclick": ["CreateNewDoc()","hai"],
"newnode":"added"
}
]
}
}
}
Groovy code :
def newjson = new JsonSlurper().parse(new File ('/tmp/test.json'))
def value=newjson.menu.popup.menuitem.value
def oneclick=newjson.menu.popup.menuitem.onclick
println value
println value.class
println oneclick
println oneclick.class
Output:
[New]
class java.util.ArrayList
[[CreateNewDoc(), hai]]
class java.util.ArrayList
Both the JSON node that holds a String and the one that holds a List report the same class name in the Groovy code shown above.
How can I differentiate between the nodes value and onclick? Logically I expect value to be an instance of String, but both are returned as ArrayList.
How can I get the exact type of a JSON node using Groovy?
Update 1:
I don't know exactly how to do this. I expect to get results like this:
New
class java.lang.String
[CreateNewDoc(), hai]
class java.util.ArrayList
Here you go:
The script below uses one closure to print each value and its type.
Another closure applies it to each map in the menuitem list.
def printDetails = { key, value -> println "Key - $key, its value is \"${value}\" and is of type ${value.class}" }
def showMap = { map -> map.collect { k, v -> printDetails (k,v) } }
def json = new groovy.json.JsonSlurper().parse(new File('/tmp/test.json'))
def mItem = json.menu.popup.menuitem
if (mItem instanceof List) {
mItem.collect { showMap it }
}
println 'done'
You can quickly try the same in an online demo.
menuitem is a list, so you need to read the property from a specific list element:
assert newjson.menu.popup.menuitem instanceof List
assert newjson.menu.popup.menuitem[0].value instanceof String
assert newjson.menu.popup.menuitem[0].onclick instanceof List
in your json the menuitem contains array of one object:
"menuitem": [
{
"value": "New",
"onclick": ["CreateNewDoc()","hai"],
"newnode":"added"
}
]
and when you try to access menuitem.value groovy actually returns a list of value attributes for all objects in menuitem array.
that's why menuitem.value returns array ["New"]
in this case
"menuitem": [
{
"value": "New",
"onclick": ["CreateNewDoc()","hai"],
"newnode":"added"
},
{
"value": "Old",
"onclick": ["CreateOldDoc()","hai"],
"newnode":"added"
}
]
menuitem.value will return array ["New", "Old"]
but menuitem[0].value will return the string value "New"
so in your groovy code to get attributes of first menu item:
def value=newjson.menu.popup.menuitem[0].value
def oneclick=newjson.menu.popup.menuitem[0].onclick
I'm trying to merge a JSON file which has multiple objects. Below is my original JSON file.
{
"applicant": {
"full-name": "Tyrion Lannister",
"mobile-number" : "8435739739",
"email-id" : "tyrionlannister_casterlyrock@gmail.com"
},
"product": {
"product-category" : "Credit Card",
"product-type" : "Super Value Card - Titanium"
}
}
I will get some more JSON data from another source, as below.
{
"flags": {
"duplicate-flag" : "No",
"contact-flag" : "Yes"
}
}
My task is to append the new JSON to the old JSON record as a new object, as below.
{
"applicant": {
"full-name": "Tyrion Lannister",
"mobile-number" : "8435739739",
"email-id" : "tyrionlannister_casterlyrock@gmail.com"
},
"product": {
"product-category" : "Credit Card",
"product-type" : "Super Value Card - Titanium"
},
"flags": {
"duplicate-flag" : "No",
"contact-flag" : "Yes"
}
}
Can someone guide me on how this can be achieved in NiFi?
I recommend accumulating your components as flowfile attributes, then forming a merged object with an ExecuteScript processor using JavaScript/ECMAScript. Sometimes there's just no substitute for JavaScript. Something like the following might work:
flowFile = session.get();
if (flowFile != null) {
var OutputStreamCallback = Java.type("org.apache.nifi.processor.io.OutputStreamCallback");
var StandardCharsets = Java.type("java.nio.charset.StandardCharsets");
// Get attributes
var applicant = JSON.parse(flowFile.getAttribute("applicant"));
var product = JSON.parse(flowFile.getAttribute("product"));
var flags = JSON.parse(flowFile.getAttribute("flags"));
// Combine
var merged = {
"applicant": applicant,
"product": product,
"flags": flags
};
// Write output content
flowFile = session.write(flowFile, new OutputStreamCallback(function(outputStream) {
outputStream.write(JSON.stringify(merged, null, "\t").getBytes(StandardCharsets.UTF_8));
}));
session.transfer(flowFile, REL_SUCCESS);
}