Given an Avro schema, I create a JSON string which conforms to this schema.
How can I serialize the JSON string using Avro so I can pass it to a Kafka producer which expects an Avro-encoded message?
All examples I find don't have JSON as input.
BTW, the receiver will then deserialize the message to a POJO - we are working in different tech stacks. So, basically it's JSON -> Kafka -> POJO.
All examples I find don't have JSON as input
Unclear what examples you've found, but kafka-avro-console-producer does exactly what you want (assuming you're using the Confluent Schema Registry).
Otherwise, you want Avro's JsonDecoder (via DecoderFactory#jsonDecoder), which is what Confluent uses.
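For illustration, a minimal sketch of that approach with the plain Apache Avro API - decode the JSON string against the schema, then re-encode the record as Avro binary for the producer. (With the Schema Registry you would instead let KafkaAvroSerializer produce the wire format, magic byte and schema id included.)

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public static byte[] jsonToAvro(String json, Schema schema) throws IOException {
    // Decode the JSON string into a GenericRecord, validated against the schema
    DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
    Decoder decoder = DecoderFactory.get().jsonDecoder(schema, json);
    GenericRecord record = reader.read(null, decoder);

    // Re-encode the record as Avro binary; this byte[] is the producer value
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
    writer.write(record, encoder);
    encoder.flush();
    return out.toByteArray();
}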
Is there a way in Kafka to consume an XML source, convert it to JSON, and send the JSON data to a Kafka sink?
I have seen Avro and Protobuf used as converters in Kafka Connect. Are they capable of converting XML to JSON, or would they convert to Avro/Protobuf-specific formats rather than JSON?
Kafka Connect can use any data format. However, there is no built-in Converter for XML, so you can just use StringConverter.
Then you can use transforms to parse the XML into the internal format Connect works with, known as Struct. E.g. https://jcustenborder.github.io/kafka-connect-documentation/projects/kafka-connect-transform-xml/transformations/examples/FromXml.books.html
(the JSON "input" and "output" shown there are from a REST Proxy request; only look at the value field)
When written to a Connect sink, you can then use JsonConverter (or any other one) to deserialize the internal Struct object.
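As a rough sketch, the relevant connector properties could look like this (connector-specific settings omitted; the transform class and schema.path key come from the kafka-connect-transform-xml documentation linked above, and the XSD path is a placeholder):

# Parse the String value as XML into a Connect Struct...
transforms=fromXml
transforms.fromXml.type=com.github.jcustenborder.kafka.connect.transform.xml.FromXml$Value
transforms.fromXml.schema.path=file:///path/to/books.xsd
# ...then write it out as schemaless JSON
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false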
I have a producer that writes a JSON file to the topic, to be read by a Kafka consumer stream. It's a simple key-value pair.
I want to stream the topic and enrich the event by adding/concatenating more JSON key-value pairs, then publish to another topic.
None of the values or keys have anything in common, by the way.
I am probably overthinking this, but how would I go about this?
I suppose you want to decode the JSON message on the consumer side.
If you are not concerned about a schema but just want to deal with the JSON as a Map, you can use the Jackson library to read the JSON string as a Map<String, Object>. You can then add the fields you want, convert it back to a JSON string, and push it to the new topic.
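A minimal sketch of this schemaless approach with Jackson (the added field name is made up for illustration):

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.util.Map;

static String enrich(String json) throws IOException {
    ObjectMapper mapper = new ObjectMapper();
    // Read the incoming JSON value as a plain Map
    Map<String, Object> event = mapper.readValue(json, new TypeReference<Map<String, Object>>() {});
    // Add/concatenate whatever extra key-value pairs you need
    event.put("enrichedAt", System.currentTimeMillis());
    // Serialize back to a JSON string; produce this to the new topic
    return mapper.writeValueAsString(event);
}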
If you want to have a schema, you need to store information about which class the message maps to (or the JSON schema itself, or some id that points to it); then the following could work.
Store the schema info in headers
For example, you can store the JSON schema or Java class name in the headers of the message while producing and write a deserializer to extract that information from the headers and decode it.
Deserializer#deserialize() has an overload that takes the Headers argument:
default T deserialize(java.lang.String topic, Headers headers, byte[] data)
and you can do something like:
objectMapper.readValue(data,
        Class.forName(new String(headers.lastHeader("classname").value())));
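Putting it together, a sketch of such a deserializer (the "classname" header key is just an example; use whatever key your producer sets):

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.Deserializer;
import java.io.IOException;

public class HeaderClassJsonDeserializer implements Deserializer<Object> {
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public Object deserialize(String topic, byte[] data) {
        // Without headers we cannot know the target class
        throw new SerializationException("Headers are required to pick the target class");
    }

    @Override
    public Object deserialize(String topic, Headers headers, byte[] data) {
        try {
            // Read the target class name from the message headers, then bind with Jackson
            String className = new String(headers.lastHeader("classname").value());
            return mapper.readValue(data, Class.forName(className));
        } catch (ClassNotFoundException | IOException e) {
            throw new SerializationException("Cannot deserialize message", e);
        }
    }
}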
Use schema registry
Apart from these, there is also the Schema Registry from Confluent, which can maintain different versions of a schema. You would need to run another process for that, though. If you are going to use this, you may want to look at the subject naming strategy and set it to RecordNameStrategy, since you have multiple schemas in the same topic.
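For example, a sketch of producer settings using the Confluent Avro serializer (broker and registry URLs are placeholders); value.subject.name.strategy is the relevant setting:

import io.confluent.kafka.serializers.KafkaAvroSerializer;
import io.confluent.kafka.serializers.subject.RecordNameStrategy;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");           // placeholder
props.put("key.serializer", StringSerializer.class.getName());
props.put("value.serializer", KafkaAvroSerializer.class.getName());
props.put("schema.registry.url", "http://localhost:8081");  // placeholder
// Derive the subject from the record name, so one topic can hold multiple schemas
props.put("value.subject.name.strategy", RecordNameStrategy.class.getName());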
I'm trying to figure out whether it's possible to transform JSON values that are stored as strings into actual JSON structures using Kafka Connect.
I tried looking for such a transformation but couldn't find one. As an example, this could be the source:
{
  "UserID": 2105058535,
  "DocumentID": 2105058535,
  "RandomJSON": "{\"Tags\":[{\"TagID\":1,\"TagName\":\"Java\"},{\"TagID\":2,\"TagName\":\"Kafka\"}]}"
}
And this is my goal:
{
  "UserID": 2105058535,
  "DocumentID": 2105058535,
  "RandomJSON": {
    "Tags": [
      {
        "TagID": 1,
        "TagName": "Java"
      },
      {
        "TagID": 2,
        "TagName": "Kafka"
      }
    ]
  }
}
I'm trying to make these transformations for the Elasticsearch sink connector, if that makes a difference.
I know I can use Logstash together with JSON filter in order to do this, but I'd like to know whether there's a way to do it using just Kafka Connect.
Sounds like this would be a Single Message Transform (and thus applicable to any connector, not just ES), but there isn't one out of the box that does what you describe. The API is documented here.
I had a similar issue, but in reverse. I had the data as JSON and needed to convert some of it into a JSON string representation in order to store it in Cassandra using the Cassandra sink. I ended up creating a Kafka Streams app that reads from the topic and then writes the JSON object to another topic that is read by the connector.
topic document <- read by your Kafka Streams app, with a call to mapValues (or by mapping to a Jackson POJO that serializes the way you want), and then write the value to -> topic document.elasticsearch
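A minimal sketch of such a streams app, using the topic names above and the RandomJSON field from the question, and assuming String serdes are the configured default:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

ObjectMapper mapper = new ObjectMapper();
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> docs = builder.stream("document");
docs.mapValues(value -> {
    try {
        ObjectNode node = (ObjectNode) mapper.readTree(value);
        // Parse the string-encoded field into a real JSON object and replace it
        JsonNode parsed = mapper.readTree(node.get("RandomJSON").asText());
        node.set("RandomJSON", parsed);
        return mapper.writeValueAsString(node);
    } catch (Exception e) {
        throw new RuntimeException("Bad JSON value", e); // crude error handling for the sketch
    }
}).to("document.elasticsearch");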
You can use the FromJson transform.
Please check this link for more details:
https://jcustenborder.github.io/kafka-connect-documentation/projects/kafka-connect-json-schema/transformations/examples/FromJson.inline.html
Does Smooks support JSON output, or is there a third-party plugin for JSON?
For example, to do XML-to-JSON or EDI-to-JSON.
I see it has a JSON reader/parser, but I can't seem to find an output/writer.
TIA!
Not directly, but it does support populating a Java object model from e.g. XML or EDI, and then you could use something like Jackson to serialize the Java objects to JSON. So it should be easy enough to do once you get the data into a Java object model.
Also note that you do not need to create an actual Java object model. You can create what Smooks calls a "virtual object model", which is basically made up of collection types (Maps, Lists, etc.).
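A sketch of that approach, assuming Smooks 1.x package names (2.x moved to org.smooks.*); the config file name, bean id, and input variable are placeholders:

import com.fasterxml.jackson.databind.ObjectMapper;
import org.milyn.Smooks;
import org.milyn.payload.JavaResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;

// Let Smooks bind the XML/EDI input into a (possibly "virtual") object model
Smooks smooks = new Smooks("smooks-config.xml");      // placeholder config path
JavaResult result = new JavaResult();
// xmlOrEdiInput stands for your XML or EDI input string
smooks.filterSource(new StreamSource(new StringReader(xmlOrEdiInput)), result);
Object model = result.getBean("order");               // placeholder bean id; may be a Map/List
// Serialize the populated model to JSON with Jackson
String json = new ObjectMapper().writeValueAsString(model);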
I need to deserialize some JSON objects. I tried the Tiny-json library, but it's too slow. I tried Newtonsoft.Json, but it fails in the Web Player with this error:
MissingMethodException: Method not found: 'System.Collections.ObjectModel.KeyedCollection.
What JSON parser do you recommend?
You can try one of these open source solutions:
https://github.com/jacobdufault/fullserializer
https://github.com/mtschoen/JSONObject (https://www.assetstore.unity3d.com/en/#!/content/710) - I use this one most of the time; it's verbose but does its job well, though I'm not sure about its performance.
Or go with paid ones:
https://www.assetstore.unity3d.com/en/#!/content/11347
Unity 5.3 added native support for a JSON serializer. It is faster than the others.
JsonUtility.ToJson converts a class to JSON.
JsonUtility.FromJson converts JSON back to a class.
For a complete example and information regarding JSON arrays, see
Serialize and Deserialize Json and Json Array in Unity