I've been trying to find a solution to the following situation, to no avail:
I have a Kafka Streams application that should read a series of JSON objects from a single input topic, none of them exactly the same as one another. Practically speaking, each JSON object is a representation of an HTTP request, so not all records have the same headers, request parameters, cookies and so forth. Furthermore, the JSON objects are written …
Is there any way to achieve this? I'm not expecting a detailed how-to solution, just some leads on how I could approach it, as my searching around the internet has turned up nothing so far.
Here's an idea: use Jackson's tree model to dynamically parse your JSON into a JsonNode, and then use this tree representation in your Kafka Streams topology to process the requests.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper objectMapper = new ObjectMapper();
JsonNode rootNode = objectMapper.readTree(json); // 'json' is the raw record value as a String
// ... inspect rootNode with path(), get(), has(), etc.
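For a rough idea of how that could look inside a topology, here is a minimal sketch (the topic names, the classification logic and the surrounding setup are all made up for illustration), parsing each record value into a JsonNode inside mapValues:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

ObjectMapper mapper = new ObjectMapper();
StreamsBuilder builder = new StreamsBuilder();

// Consume the raw JSON strings from the (hypothetical) input topic
KStream<String, String> requests =
        builder.stream("http-requests", Consumed.with(Serdes.String(), Serdes.String()));

requests
        .mapValues(value -> {
            try {
                JsonNode root = mapper.readTree(value);
                // Each record may have a different shape, so just probe for what's there
                return root.path("cookies").isMissingNode() ? "no-cookies" : "has-cookies";
            } catch (Exception e) {
                return "unparseable";
            }
        })
        .to("classified-requests", Produced.with(Serdes.String(), Serdes.String()));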
I have a producer that writes a JSON file to a topic to be read by a Kafka Streams consumer. It's a simple key-value pair.
I want to stream the topic, enrich the event by adding/concatenating more JSON key-value rows, and publish it to another topic.
None of the values or keys have anything in common by the way.
I am probably overthinking this, but how would I get around this logic?
I suppose you want to decode the JSON message on the consumer side.
If you are not concerned about a schema but just want to deal with the JSON as a Map, you can use the Jackson library to read the JSON string into a Map<String,Object>. From there you can add the fields you want, convert it back to a JSON string, and push it to the new topic.
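A minimal sketch of that flow, assuming the incoming value arrives as a JSON string and leaving out checked-exception handling (the variable names are made up):

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;

ObjectMapper mapper = new ObjectMapper();

// Read the JSON value without any schema or bean class
Map<String, Object> event = mapper.readValue(incomingJson, new TypeReference<Map<String, Object>>() {});

// Add/concatenate the extra key-value pairs
event.put("enrichedAt", System.currentTimeMillis());

// Convert back to a JSON string to publish to the other topic
String outgoingJson = mapper.writeValueAsString(event);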
If you want a schema, you need to store some information about which class the JSON maps to (or the JSON schema itself, or some id that points to it); then the following could work.
Store the schema info in headers
For example, you can store the JSON schema or Java class name in the headers of the message while producing and write a deserializer to extract that information from the headers and decode it.
Deserializer#deserialize() has an overload that takes a Headers argument:
default T deserialize(java.lang.String topic,
Headers headers,
byte[] data)
and you can do something like:
objectMapper.readValue(data,
    Class.forName(
        new String(headers.lastHeader("classname").value())));
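Put together, such a deserializer might look roughly like this; it is only a sketch and assumes every record carries a "classname" header:

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.Deserializer;

public class HeaderAwareJsonDeserializer implements Deserializer<Object> {

    private final ObjectMapper objectMapper = new ObjectMapper();

    @Override
    public Object deserialize(String topic, byte[] data) {
        // Without headers there is no way to know the target class
        throw new SerializationException("Headers are required to pick the target class");
    }

    @Override
    public Object deserialize(String topic, Headers headers, byte[] data) {
        try {
            String className = new String(headers.lastHeader("classname").value());
            return objectMapper.readValue(data, Class.forName(className));
        } catch (Exception e) {
            throw new SerializationException("Could not deserialize record from topic " + topic, e);
        }
    }
}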
Use schema registry
Apart from these, there is also a schema registry from Confluent which can maintain different versions of the schema. You would need to run another process for that, though. If you are going to use this, you may want to look at the subject naming strategy and set it to RecordNameStrategy since you have multiple schemas in the same topic.
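For example, with Confluent's serializers the naming strategy is just a client config property; a sketch of the producer-side settings (the URLs are placeholders, and the property names assume the Confluent serializer configs):

import java.util.Properties;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("schema.registry.url", "http://localhost:8081");
// Register schemas per record name rather than per topic, so one topic can carry several schemas
props.put("value.subject.name.strategy", "io.confluent.kafka.serializers.subject.RecordNameStrategy");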
We need to build up a JSON payload to send to a REST endpoint. Our org uses Jackson. The only way to build up the JSON (without creating dozens of nested empty POJOs) is as follows:
ObjectMapper mapper = new ObjectMapper();
ObjectNode rootNode = mapper.createObjectNode();
ObjectNode content = mapper.createObjectNode();
rootNode.set("someContent", content);
content.put("somekey","someval");
... lots more nested objects created here...
OK, so now I have a complex JSON object built up in the mapper. How do I get it out?
E.g. how do I get a JSON string out of it, suitable for sending to a REST API as a POST payload?
There are various examples of setting up pretty printing, but they revolve around serializing a single Java object (which I am not doing), or outputting to streams or files, not producing a simple String.
Any suggestions?
Perhaps use ObjectMapper.writeValueAsString:
System.out.println(mapper.writeValueAsString(rootNode));
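And if you want the pretty-printed form as a plain String rather than writing to a stream or file, the same mapper can do that too:

String compact = mapper.writeValueAsString(rootNode);
String pretty = mapper.writerWithDefaultPrettyPrinter().writeValueAsString(rootNode);
// 'pretty' is ready to be used as the POST payload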
I need to parse a JAX-WS REST response, and I tried the following two ways of parsing it. Both work fine, but I need to know which implementation is more efficient. Please give me your view.
First Approach:
Use getEntity() to get the response as an InputStream.
Use Jackson's ObjectMapper readValue() to convert the InputStream to a Java object.
Use the getters and setters of the nested Java classes to get the response object's member values.
Second Approach:
Use getEntity() to get the response as an InputStream, and convert the InputStream to a String.
Using the Google JSON API, convert the String to a JSON object.
Use the JSON parser to get the nested objects' member values.
I would say the first approach is better for two reasons:
You don't go through the intermediate step of reading the response payload into a String.
The setter methods that are called during Jackson deserialization may perform validation on input and throw appropriate exceptions, so you do validation during deserialization.
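To illustrate the first approach, a small sketch (UserResponse is a hypothetical bean with the usual getters and setters, and 'response' is the JAX-RS response object):

import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.InputStream;

InputStream body = (InputStream) response.getEntity();                      // response entity as a stream
UserResponse user = new ObjectMapper().readValue(body, UserResponse.class); // bind straight onto the bean
String name = user.getName();                                               // read nested members via accessors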
Maybe not a general answer to this question but another variant of what you're describing under "First approach". I would start with a generic data structure and would only introduce an extra bean if necessary. I wouldn't use String to pass structured data around.
Use Jackson to convert the JSON response to a Map<String,Object> or JsonNode.
Advantages:
You don't need to implement a specialized bean class. Even a very simple bean can become unwieldy over time (if the format changes, new nested structures are added to the JSON response, etc.). It also introduces a kind of metaphor into your code, which sometimes helps but can also be misleading.
Map<String,Object> is in the JDK and offers a good interface to access data. You don't have to change any interfaces even if the JSON format changes.
You can always pass your data in form of a Map<String,Object>
Disadvantage:
Data encapsulation. The map is a very close representation of the input data and therefore doesn't offer the same level of abstraction as a bean.
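A sketch of this generic variant using the tree model (again, 'response' and the field names are hypothetical), so nested values can be read without a dedicated bean:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.InputStream;

ObjectMapper mapper = new ObjectMapper();
JsonNode root = mapper.readTree((InputStream) response.getEntity());

// Navigate without a bean; path() returns a "missing" node instead of null
String name = root.path("user").path("name").asText();
int age = root.at("/user/age").asInt();   // JSON Pointer style access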
I want to parse JSON data from a RESTful service.
Unlike a SOAP-based service, where a service consumer can create stubs and skeletons from the WSDL, with a RESTful service the consumer just gets a raw JSON string.
Since the service consumer does not have a Java object matching the JSON structure, we are not able to use JSON-to-Java mappers like GSON, Jackson, etc.
Another way is to use parsers like JsonPath, minimal-json, etc., which help traverse the JSON structure and read the data.
Is there any better way of reading JSON data?
The official docs for Jackson mention 3 different ways to parse a JSON doc from Java. The first 2 do not require a Java object matching the JSON structure. In summary:
Streaming API (aka "Incremental parsing/generation") reads and writes JSON content as discrete events.
Tree Model provides a mutable in-memory tree representation of a JSON document. ObjectMapper can build trees that consist of JsonNode nodes.
Data Binding converts JSON to and from POJOs based either on property accessor conventions or annotations.
With simple data binding you convert to and from Java Maps, Lists, Strings, Numbers, Booleans and nulls
With full data binding you convert to and from any Java bean type (as well as "simple" types mentioned above)
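For completeness, here is a small sketch of the streaming style, which never builds a full object model (the input string and field name are just examples):

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

JsonFactory factory = new JsonFactory();
try (JsonParser parser = factory.createParser(jsonString)) {
    while (parser.nextToken() != null) {
        if (parser.getCurrentToken() == JsonToken.FIELD_NAME && "name".equals(parser.getCurrentName())) {
            parser.nextToken();                   // advance to the value
            System.out.println(parser.getText()); // read it without mapping to a bean
        }
    }
}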
Another option is to generate Java beans from JSON documents. Your mileage may vary and you may (probably will) have to modify the generated files. There are at least 5 online tools for that purpose that you can try:
http://www.jsonschema2pojo.org/
http://pojo.sodhanalibrary.com/
https://timboudreau.com/blog/json/read
http://jsongen.byingtondesign.com/
http://json2java.azurewebsites.net/
There are also IDE plugins that you can use. For instance, this one for IntelliJ: https://plugins.jetbrains.com/idea/plugin/7678-jackson-generator-plugin
GSON supports working without mapped objects, too. Something like this:
JsonObject propertiesWrapper = new JsonParser().parse(responseContent).getAsJsonObject();
assertNotNull(propertiesWrapper);
propertiesWrapper = propertiesWrapper.getAsJsonObject("properties");
assertNotNull(propertiesWrapper);
JsonArray propertiesArray = propertiesWrapper.getAsJsonArray("property");
assertNotNull(propertiesArray);
assertTrue(propertiesArray.size()>0, "The list of properties should not be empty. ");
The problem is that working this way is so inconvenient that it is really better to create objects instead.
Jackson has exactly the same problem, and to an even greater extent: it is extremely inconvenient for direct JSON reading/creation. All its tutorials advise using POJOs instead, too.
The only really convenient way is to use Groovy. Groovy works as an envelope over Java: you can simply write Java code and use Groovy operators where needed. For reading and creating JSON or XML, Groovy is incomparably more powerful than Java with all its libraries combined! It is even much more convenient than a ready-made tree structure of POJOs prepared by somebody else.
I am new to Kafka, serialization, and JSON.
What I want is for the producer to send a JSON file via Kafka and for the consumer to consume and work with the JSON file in its original form.
I was able to get it working so the JSON is converted to a string, sent via a String serializer, and the consumer then parses the String and recreates a JSON object, but I am worried that this isn't efficient or the correct method (it might lose the field types of the JSON).
So I looked into making a JSON serializer and setting that in my producer's configurations.
I used the JsonEncoder here: Kafka: writing custom serializer
But when I try to run my producer now, it seems that in the toBytes function of the encoder the try block never returns anything like I want it to:
try {
bytes = objectMapper.writeValueAsString(object).getBytes();
} catch (JsonProcessingException e) {
logger.error(String.format("Json processing failed for object: %s", object.getClass().getName()), e);
}
It seems objectMapper.writeValueAsString(object).getBytes() takes my JSON object ({"name":"Kate","age":25}) and converts it to nothing.
This is my producer's run function:
List<KeyedMessage<String,JSONObject>> msgList=new ArrayList<KeyedMessage<String,JSONObject>>();
JSONObject record = new JSONObject();
record.put("name", "Kate");
record.put("age", 25);
msgList.add(new KeyedMessage<String, JSONObject>(topic, record));
producer.send(msgList);
What am I missing? Would my original method (convert to a string, send it, then rebuild the JSON object) be okay, or is it just not the correct way to go?
Thanks!
Hmm, why are you afraid that a serialize/deserialize step would cause data loss?
One option you have is to use the Kafka JSON serializer that's included in Confluent's Schema Registry, which is free and open source software (disclaimer: I work at Confluent). Its test suite provides a few examples to get you started, and further details are described at serializers and formatters. The benefit of this JSON serializer and the schema registry itself is that they provide transparent integration with producer and consumer clients for Kafka. Apart from JSON there's also support for Apache Avro if you need that.
IMHO this setup is one of the best options in terms of developer convenience and ease of use when talking to Kafka in JSON -- but of course YMMV!
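If you would rather roll your own instead, a minimal Jackson-based serializer/deserializer pair for newer Kafka client versions (where configure and close have default implementations) could look roughly like this; it is only a sketch, not the Confluent implementation, and each class would normally live in its own file:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serializer;

public class JsonNodeSerializer implements Serializer<JsonNode> {
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public byte[] serialize(String topic, JsonNode data) {
        try {
            return data == null ? null : mapper.writeValueAsBytes(data);
        } catch (Exception e) {
            throw new SerializationException("Error serializing JSON value", e);
        }
    }
}

public class JsonNodeDeserializer implements Deserializer<JsonNode> {
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public JsonNode deserialize(String topic, byte[] data) {
        try {
            return data == null ? null : mapper.readTree(data);
        } catch (Exception e) {
            throw new SerializationException("Error deserializing JSON value", e);
        }
    }
}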
I would suggest converting your event string (which is JSON) to a byte array, like:
byte[] eventBody = event.getBody();
This will increase your performance, and a JSON parser on the consumer side will help you get your JSON back.
Please let me know if any further information is required.