How to use Avro data with old and new namespaces

I am facing a problem after updating the namespace in my avsc schema file. We were using a common processor, written in Java, to convert XML to Avro based on that avsc file.
We have since separated the interfaces into two different namespaces, so we now have two avsc schemas that are identical except for the namespace.
Since some of our data was generated with the old namespace, I am unable to query it together with data generated under the new namespace.
Here is an example of my schemas.
Old schema:
"type" : "record",
"name" : "Message",
"namespace" : "com.myfirstavsc",
"fields" : [ {
"name" : "Header", ... other fields ...
New schema:
"type" : "record",
"name" : "Message",
"namespace" : "com.mysecondavsc",
"fields" : [ {
"name" : "Header", ... other fields ...
When I query my Hive table I get the exception below:
Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Found com.myfirstavsc.Property, expecting union

I am not sure how you are trying to read your data, but using GenericDatumReader should solve your issue; after that you can convert the generic records to your specific records. I found something similar here:
http://apache-avro.679487.n3.nabble.com/Deserialize-with-different-schema-td4032782.html
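For illustration, here is a minimal sketch of that approach (the file names and paths are assumptions): parse both avsc files, hand the old schema to GenericDatumReader as the writer schema and the new one as the reader schema, then iterate the records generically.

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

import java.io.File;

public class ReadWithBothSchemas {
    public static void main(String[] args) throws Exception {
        // Writer schema: what the data was written with (old namespace).
        Schema writerSchema = new Schema.Parser().parse(new File("message-old.avsc"));
        // Reader schema: what we want the records resolved to (new namespace).
        Schema readerSchema = new Schema.Parser().parse(new File("message-new.avsc"));

        GenericDatumReader<GenericRecord> datumReader =
                new GenericDatumReader<>(writerSchema, readerSchema);

        try (DataFileReader<GenericRecord> fileReader =
                     new DataFileReader<>(new File("messages.avro"), datumReader)) {
            for (GenericRecord record : fileReader) {
                System.out.println(record);
            }
        }
    }
}

Caveat: if the schemas contain records, enums, or fixed types whose full names differ only by namespace, resolution can still fail unless the reader schema declares aliases for the old names (see the next answer).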

The link mentioned above no longer works, so here is an explanation instead.
We got the same error in a project, Apache Hudi, and raised an issue about it: https://github.com/apache/hudi/issues/7284
After troubleshooting, the root cause of the exception org.apache.avro.AvroTypeException: Found hoodie.test_mor_tab.test_mor_tab_record.new_test_col.fixed, expecting union is an Avro schema resolution rule: Avro does not accept a namespace change when resolving a UNION type.
According to the Avro Schema Resolution doc, schema evolution is accepted if either the reader's or the writer's schema passed to GenericDatumReader(Schema writer, Schema reader) is a union. But the doc does not spell out a further restriction: if the type is RECORD, ENUM, or FIXED, the full name of the schema (namespace included) must be the same on both sides.
Code reference:
ResolvingGrammarGenerator#bestBranch
public class ResolvingGrammarGenerator extends ValidatingGrammarGenerator {
  ...
  private int bestBranch(Schema r, Schema w, Map<LitS, Symbol> seen) throws IOException {
    Schema.Type vt = w.getType();
    // first scan for exact match
    int j = 0;
    int structureMatch = -1;
    for (Schema b : r.getTypes()) {
      if (vt == b.getType())
        if (vt == Schema.Type.RECORD || vt == Schema.Type.ENUM ||
            vt == Schema.Type.FIXED) {
          String vname = w.getFullName();
          String bname = b.getFullName();
          // return immediately if the name matches exactly according to spec
          if (vname != null && vname.equals(bname))
            return j;
          if (vt == Schema.Type.RECORD &&
              !hasMatchError(resolveRecords(w, b, seen))) {
            String vShortName = w.getName();
            String bShortName = b.getName();
            // use the first structure match or one where the name matches
            if ((structureMatch < 0) ||
                (vShortName != null && vShortName.equals(bShortName))) {
              structureMatch = j;
            }
          }
        } else
          return j;
      j++;
    }

    // if there is a record structure match, return it
    if (structureMatch >= 0)
      return structureMatch;

    // then scan match via numeric promotion
    j = 0;
    for (Schema b : r.getTypes()) {
      switch (vt) {
      case INT:
        switch (b.getType()) {
        case LONG: case DOUBLE:
          return j;
        }
        break;
      case LONG:
      case FLOAT:
        switch (b.getType()) {
        case DOUBLE:
          return j;
        }
        break;
      case STRING:
        switch (b.getType()) {
        case BYTES:
          return j;
        }
        break;
      case BYTES:
        switch (b.getType()) {
        case STRING:
          return j;
        }
        break;
      }
      j++;
    }
    return -1;
  }
  ...
}
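In practice, a workaround (assuming you control the reader-side avsc) is to declare the old full name as an alias on the new schema; Avro's resolution rules match aliases in the reader schema against the writer's full names. Roughly:

"type" : "record",
"name" : "Message",
"namespace" : "com.mysecondavsc",
"aliases" : [ "com.myfirstavsc.Message" ],
"fields" : [ {
"name" : "Header", ... other fields ...

Note that every renamed named type needs its own alias, so nested records such as com.myfirstavsc.Property would have to be aliased as well.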

Related

Pass DataType of the Class

I have the following data class, which stores two values, JSON and dataType:
data class DataTypeItem(
    var json : String = "",
    var dataType : Class<*>? = null
)
I have the list defined in the following way:
val dataTypeList = mutableMapOf<String, DataTypeItem>()
dataTypeList.put( "item_1", DataTypeItem( json1, MyDataType::class.java ) )
dataTypeList.put( "item_2", DataTypeItem( json1, List<MyDataType>::class.java ) )
Please note that in one case I'm using MyDataType as the dataType and in the other List<MyDataType>.
Now I would like to loop through each of the dataTypeList items and parse the JSON for the given data type into its model:
fun init()
{
    dataTypeList.forEach { dataTypeItem ->
        val model = Gson().fromJson( dataTypeItem.value.json, dataTypeItem.value.dataType::class.java )
    }
}
I'm using the following model:
data class dataTypeItem(
    @SerializedName("sqlId")
    val sqlId: String,
    @SerializedName("name")
    val name: String
)
But I keep getting a runtime exception:
Attempted to deserialize a java.lang.Class. Forgot to register a type adapter?
In addition, in case it's a list, I need to call toList() on Gson().fromJson(..):
fun init()
{
    dataTypeList.forEach { dataTypeItem ->
        val model;
        if( dataTypeItem.value.dataType::class.java is Array )
            model = Gson().fromJson( dataTypeItem.value.json, dataTypeItem.value.dataType::class.java ).toList()
        else
            model = Gson().fromJson( dataTypeItem.value.json, dataTypeItem.value.dataType::class.java )
    }
}
How can I pass the dataType dynamically and distinguish if it's a List/Array or straight up class? In addition, whenever I try to call toList(), I get an error that it's undefined.
If I specify the class directly, then it's working fine
var model = Gson().fromJson( json, DataTypeItem::class.java )
or
var model = Gson().fromJson( json, Array<DataTypeItem>::class.java )
but I need to be able to specify it dynamically as an argument
This code works fine:
val dataTypeMap = mapOf(
    "item_1" to MyDataTypeItem("""{"sqlId" : "1", "name" : "a"}""", MyDataType::class.java),
    "item_2" to MyDataTypeItem("""[{"sqlId" : "1", "name" : "a"}, {"sqlId" : "2", "name" : "b"}]""", Array<MyDataType>::class.java)
)
val result = dataTypeMap.map{ Gson().fromJson(it.value.json, it.value.dataType) }
I renamed DataTypeItem to MyDataTypeItem and dataTypeItem to MyDataType.
Why do you need to call toList()? If it is really necessary, you can do the following instead:
val result = dataTypeMap.map {
    if (it.value.dataType?.isArray == true) Gson().fromJson<Array<*>>(it.value.json, it.value.dataType).toList()
    else Gson().fromJson(it.value.json, it.value.dataType)
}
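Side note: List<MyDataType>::class.java will not compile in Kotlin, and Array only works because it needs no generics at runtime. If you genuinely want a List, a common workaround (a sketch using hypothetical names alongside the types above) is to store a java.lang.reflect.Type built with Gson's TypeToken, since a plain Class cannot carry the element type through erasure:

import com.google.gson.Gson
import com.google.gson.reflect.TypeToken
import java.lang.reflect.Type

// Hypothetical variant of MyDataTypeItem that stores a full generic Type
data class TypedJsonItem(val json: String, val type: Type)

val typedItems = mapOf(
    "item_1" to TypedJsonItem("""{"sqlId" : "1", "name" : "a"}""", MyDataType::class.java),
    "item_2" to TypedJsonItem(
        """[{"sqlId" : "1", "name" : "a"}, {"sqlId" : "2", "name" : "b"}]""",
        object : TypeToken<List<MyDataType>>() {}.type // carries List<MyDataType> past erasure
    )
)

// fromJson(String, Type) deserializes item_2 straight to a List<MyDataType>
val parsed = typedItems.mapValues { Gson().fromJson<Any>(it.value.json, it.value.type) }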

Can I define a GraphQL field to be any valid json? [duplicate]

Is it possible to specify that a field in GraphQL should be a blackbox, similar to how Flow has an "any" type? I have a field in my schema that should be able to accept any arbitrary value, which could be a String, Boolean, Object, Array, etc.
I've come up with a middle-ground solution. Rather than trying to push this complexity onto GraphQL, I'm opting to just use the String type and JSON.stringify-ing my data before setting it on the field. So everything gets stringified, and later in my application, when I need to consume this field, I JSON.parse the result to get back the desired object/array/boolean/etc.
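The round trip itself is trivial; a quick sketch of the idea:

// Producer: collapse whatever shape we have into the String field
const stored = JSON.stringify({ mixed: ['values', true, 42] })

// Consumer: recover the original structure from the field
const value = JSON.parse(stored)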
@mpen's answer is great, but I opted for a more compact solution:
const { GraphQLScalarType } = require('graphql')
const { Kind } = require('graphql/language')
const ObjectScalarType = new GraphQLScalarType({
  name: 'Object',
  description: 'Arbitrary object',
  parseValue: (value) => {
    return typeof value === 'object' ? value
      : typeof value === 'string' ? JSON.parse(value)
      : null
  },
  serialize: (value) => {
    return typeof value === 'object' ? value
      : typeof value === 'string' ? JSON.parse(value)
      : null
  },
  parseLiteral: (ast) => {
    switch (ast.kind) {
      case Kind.STRING: return JSON.parse(ast.value)
      case Kind.OBJECT: throw new Error(`Not sure what to do with OBJECT for ObjectScalarType`)
      default: return null
    }
  }
})
Then my resolvers look like:
{
  Object: ObjectScalarType,
  RootQuery: ...
  RootMutation: ...
}
And my .gql looks like:
scalar Object

type Foo {
  id: ID!
  values: Object!
}
Yes. Just create a new GraphQLScalarType that allows anything.
Here's one I wrote that allows objects. You can extend it a bit to allow more root types.
import {GraphQLScalarType} from 'graphql';
import {Kind} from 'graphql/language';
import {log} from '../debug';
import Json5 from 'json5';

export default new GraphQLScalarType({
    name: "Object",
    description: "Represents an arbitrary object.",
    parseValue: toObject,
    serialize: toObject,
    parseLiteral(ast) {
        switch(ast.kind) {
            case Kind.STRING:
                return ast.value.charAt(0) === '{' ? Json5.parse(ast.value) : null;
            case Kind.OBJECT:
                return parseObject(ast);
        }
        return null;
    }
});

function toObject(value) {
    if(typeof value === 'object') {
        return value;
    }
    if(typeof value === 'string' && value.charAt(0) === '{') {
        return Json5.parse(value);
    }
    return null;
}

function parseObject(ast) {
    const value = Object.create(null);
    ast.fields.forEach((field) => {
        value[field.name.value] = parseAst(field.value);
    });
    return value;
}

function parseAst(ast) {
    switch (ast.kind) {
        case Kind.STRING:
        case Kind.BOOLEAN:
            return ast.value;
        case Kind.INT:
        case Kind.FLOAT:
            return parseFloat(ast.value);
        case Kind.OBJECT:
            return parseObject(ast);
        case Kind.LIST:
            return ast.values.map(parseAst);
        default:
            return null;
    }
}
For most use cases, you can use a JSON scalar type to achieve this sort of functionality. There are a number of existing libraries you can just import rather than writing your own scalar -- for example, graphql-type-json.
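For instance, the wiring with graphql-type-json looks roughly like this (a sketch; check the library's README, as the export shape has varied between versions):

const { makeExecutableSchema } = require('graphql-tools')
const { GraphQLJSON } = require('graphql-type-json')

const typeDefs = `
  scalar JSON

  type Query {
    config: JSON
  }
`

const resolvers = {
  // Hook the imported scalar up to the scalar declared in the SDL
  JSON: GraphQLJSON,
  Query: {
    config: () => ({ anything: ['goes', { here: true }] }),
  },
}

const schema = makeExecutableSchema({ typeDefs, resolvers })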
If you need a more fine-tuned approach, then you'll want to write your own scalar type. Here's a simple example that you can start with:
const { GraphQLScalarType, Kind } = require('graphql')
const Anything = new GraphQLScalarType({
  name: 'Anything',
  description: 'Any value.',
  parseValue: (value) => value,
  parseLiteral,
  serialize: (value) => value,
})
function parseLiteral (ast) {
  switch (ast.kind) {
    case Kind.BOOLEAN:
    case Kind.STRING:
      return ast.value
    case Kind.INT:
    case Kind.FLOAT:
      return Number(ast.value)
    case Kind.LIST:
      return ast.values.map(parseLiteral)
    case Kind.OBJECT:
      return ast.fields.reduce((accumulator, field) => {
        accumulator[field.name.value] = parseLiteral(field.value)
        return accumulator
      }, {})
    case Kind.NULL:
      return null
    default:
      throw new Error(`Unexpected kind in parseLiteral: ${ast.kind}`)
  }
}
Note that scalars are used both as outputs (when returned in your response) and as inputs (when used as values for field arguments). The serialize method tells GraphQL how to serialize a value returned in a resolver into the data that's returned in the response. The parseLiteral method tells GraphQL what to do with a literal value that's passed to an argument (like "foo", or 4.2 or [12, 20]). The parseValue method tells GraphQL what to do with the value of a variable that's passed to an argument.
For parseValue and serialize we can just return the value we're given. Because parseLiteral is given an AST node object representing the literal value, we have to do a little bit of work to convert it into the appropriate format.
You can take the above scalar and customize it to your needs by adding validation logic as needed. In any of the three methods, you can throw an error to indicate an invalid value. For example, if we want to allow most values but don't want to serialize functions, we can do something like:
if (typeof value == 'function') {
  throw new TypeError('Cannot serialize a function!')
}
return value
Using the above scalar in your schema is simple. If you're using vanilla GraphQL.js, then use it just like you would any of the other scalar types (GraphQLString, GraphQLInt, etc.) If you're using Apollo, you'll need to include the scalar in your resolver map as well as in your SDL:
const resolvers = {
  ...
  // The property name here must match the name you specified in the constructor
  Anything,
}

const typeDefs = `
  # NOTE: The name here must match the name you specified in the constructor
  scalar Anything

  # the rest of your schema
`
Just send a stringified value via GraphQL and parse it on the other side, e.g. use this wrapper class.
export class Dynamic {
  @Field(type => String)
  private value: string;

  getValue(): any {
    return JSON.parse(this.value);
  }

  setValue(value: any) {
    this.value = JSON.stringify(value);
  }
}
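Usage is then symmetric on both ends; a trivial sketch:

const payload = new Dynamic();
payload.setValue({ tags: ['a', 'b'], count: 2 });
// ...send payload through GraphQL; the receiving side calls:
const restored = payload.getValue(); // { tags: ['a', 'b'], count: 2 }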
For a similar problem, I created a schema like this:
"""`MetadataEntry` model"""
type MetadataEntry {
"""Key of the entry"""
key: String!
"""Value of the entry"""
value: String!
}
"""Object with metadata"""
type MyObjectWithMetadata {
"""
... rest of my object fields
"""
"""
Key-value entries that you can attach to an object. This can be useful for
storing additional information about the object in a structured format
"""
metadata: [MetadataEntry!]!
"""Returns value of `MetadataEntry` for given key if it exists"""
metadataValue(
"""`MetadataEntry` key"""
key: String!
): String
}
And my queries can look like this:
query {
  listMyObjects {
    # fetch meta values by key
    meta1Value: metadataValue(key: "meta1")
    meta2Value: metadataValue(key: "meta2")

    # ... or list them all
    metadata {
      key
      value
    }
  }
}

How to handle nullable fields for CSV generation?

From a JSON source I create a CSV that I want to use to populate a MemSQL database with the help of LOAD DATA INFILE.
I have written a TypeScript script for the conversion and use the library json2csv.
It leaves the values for nulled entries empty though, creating a string like:
foo, bar, , barz, 11 ,
Yet I expect my output to be:
foo, bar, \N , barz, 11 , \N
for my nulled fields. Otherwise, my database will fill in different default values, such as 0 for a number that should be NULL.
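For reference, the load itself looks roughly like this (table name and path are made up); MySQL-style LOAD DATA reads an unescaped \N field as SQL NULL, which is exactly why I want that marker in the CSV:

LOAD DATA INFILE '/tmp/entities.csv'
INTO TABLE entities
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';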
I caught myself doing:
const result = someEntities.map((entity: Entity) => {
    entity.foo = entity.foo === null ? '\\N' : entity.foo;
    entity.bar = entity.bar === null ? '\\N' : entity.bar;
    ...
    return entity;
});
So basically I am hardcoding the approach for each entity, and it is error-prone, as I might forget to check a nullable property. And if I want to export another table, I have to repeat this all over again.
How can I generalize this, so I can use it on different entities, where the script "discovers" the nullable fields and sets the marker accordingly?
I created a function that iterates over an object's own properties and sets each value to \N if the corresponding value is null:
const handleNullCases = (record: any): any => {
    for (let key in record) {
        if (record.hasOwnProperty(key)) {
            const value = record[key];
            if (value === null) {
                record[key] = "\\N";
            }
        }
    }
    return record;
};
That way I can reuse that snippet for other entities as well:
const processedEntities = entities.map(handleNullCases);
const processedEntities2 = entities2.map(handleNullCases);
...
I find it a bit dirty, though, in that I just type-hint any and assign a string to properties that might be declared as another type.
I'm going to assume all properties in Entity may be null. If so, this typing is a bit safer:
type Nullable<T> = {[K in keyof T]: T[K] | null};
type CSVSafe<T> = {[K in keyof T]: T[K] | '\\N'};
const handleNullCases = <E>(record: Nullable<E>): CSVSafe<E> => {
    // Copy the input so the original record is not mutated
    let ret = Object.assign({}, record) as CSVSafe<E>;
    (Object.keys(ret) as (keyof E)[]).forEach((key) => {
        if (record[key] === null) {
            ret[key] = '\\N';
        }
    });
    return ret;
};
type Entity = Nullable<{ a: number, b: string, c: boolean, d: number, e: string }>;
const entity: Entity = { a: 1, b: null, c: false, d: null, e: 'e' };
const safeEntity = handleNullCases(entity);
// type CSVSafe<{ a: number; b: string; c: boolean; d: number; e: string; }>
The handleNullCases function will take any object whose values might be null, and return a new object which is just the same except that null values have been replaced with "\\N". The output type will be a CSVSafe<> version of the Nullable<> input type.
Hope that helps.

Querying a JSON structure in Angular: find the right index and get a certain property of it

Assume the following JSON structure exists:
[
    {
        "role_id": 1,
        "role_name": "Admin"
    },
    {
        "role_id": 2,
        "role_name": "Editor"
    }
]
and stored in $rootScope.roles.
What I need is:
$rootScope.roles[index -> where role_id == 2].role_name // gets -> Editor
How can I do that in Angular?
You will have to loop over the array and return the property of the object that matches the given id:
function getRoleName(roleId) {
    for (var i = 0; i < $rootScope.roles.length; i++) {
        if ($rootScope.roles[i].role_id == roleId) {
            return $rootScope.roles[i].role_name;
        }
    }
};
ng-lodash is an elegant way (note that pluck and where were removed in lodash 4, where map and filter replace them):
role_name = lodash.pluck(lodash.where($rootScope.roles,{'role_id': 2 }),'role_name');
If you are looking for a more "single-line" solution, you can use JS array function find:
($rootScope.roles.find(function (x) { return x.role_id == 2; }) || {}).role_name;
When no element matches, find returns undefined, so I substituted that possible result with {} so the expression does not throw when accessing .role_name. This way it returns undefined when the specified role_id is not found.
Note this method is relatively new and isn't available in every browser; more info at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find
Another "single-line" more stable solution would be to use filter:
($rootScope.roles.filter(function (x) { return x.role_id == 2; })[0] || {}).role_name;
This other method is more stable and works in every browser; more info at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/filter
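With modern JavaScript (arrow functions plus ES2020 optional chaining), independent of AngularJS itself, the same lookup shrinks to:

const roleName = $rootScope.roles.find(x => x.role_id == 2)?.role_name;
// "Editor", or undefined when no role with role_id 2 exists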

How do you return lower-cased JSON from a CFC in ColdFusion?

I have a ColdFusion component that will return some JSON data:
component
{
    remote function GetPeople() returnformat="json"
    {
        var people = entityLoad("Person");
        return people;
    }
}
Unfortunately, the returned JSON has all the property names in upper case:
[
    {
        FIRSTNAME: "John",
        LASTNAME: "Doe"
    },
    {
        FIRSTNAME: "Jane",
        LASTNAME: "Dover"
    }
]
Is there any way to force the framework to return JSON so that the property names are all lower-case (maybe a custom UDF/CFC that someone else has written)?
Yeah, unfortunately, that is just the way ColdFusion works. When setting some variables you can force lowercase, like with structs:
<cfset structName.varName = "test" />
will set the variable with an uppercase key name. But:
<cfset structName['varname'] = "test" />
Will force the lowercase (or camelcase depending on what you pass in).
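A small sketch of the difference as it shows up in serializeJSON output (variable names are made up):

<cfset person = structNew() />
<cfset person.FirstName = "John" />
<cfset person['lastname'] = "Doe" />
<!--- dot notation stores the key as FIRSTNAME; bracket notation preserves case --->
<cfoutput>#serializeJSON( person )#</cfoutput>
<!--- outputs: {"FIRSTNAME":"John","lastname":"Doe"} --->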
But with the ORM stuff you are doing, I don't think you are going to be able to have any control over it. Someone correct me if I am wrong.
From http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=functions_s_03.html
Note: ColdFusion internally represents structure key names using
all-uppercase characters, and, therefore, serializes the key names to
all-uppercase JSON representations. Any JavaScript that handles JSON
representations of ColdFusion structures must use all-uppercase
structure key names, such as CITY or STATE. You also use the
all-uppercase names COLUMNS and DATA as the keys for the two arrays
that represent ColdFusion queries in JSON format.
If you're defining the variables yourself, you can use bracket notation (as Jason's answer shows), but with built-in stuff like ORM I think you're stuck - unless you want to create your own struct, and clone the ORM version manually, lower-casing each of the keys, but that's not really a great solution. :/
This should work as you described.
component
{
    remote function GetPeople() returnformat="json"
    {
        var people = entityLoad("Person");
        var rtn = [];
        for ( var i = 1; i <= arrayLen( people ); i++ ) {
            arrayAppend( rtn, {
                "firstname" = people[i].getFirstname(),
                "lastname" = people[i].getLastname()
            } );
        }
        return rtn;
    }
}
If any of your entity properties return null, the struct key won't exist.
To work around that, try this:
component
{
    remote function GetPeople() returnformat="json"
    {
        var people = entityLoad("Person");
        var rtn = [];
        for ( var i = 1; i <= arrayLen( people ); i++ ) {
            var i_person = {
                "firstname" = people[i].getFirstname(),
                "lastname" = people[i].getLastname()
            };
            if ( !structKeyExists( i_person, "firstname" ) ) {
                i_person["firstname"] = ""; // your default value
            }
            if ( !structKeyExists( i_person, "lastname" ) ) {
                i_person["lastname"] = ""; // your default value
            }
            arrayAppend( rtn, i_person );
        }
        return rtn;
    }
}