How to serialize JSON to binary Protocol Buffers (Protobuf) on the command line? (Assuming the JSON follows the Protobuf 3 JSON Mapping.)
The question How to encode protocol buffer string to binary using protoc has an answer for serializing to Protobuf binary from the Protobuf Text Format Language with protoc --encode, but if I have understood correctly, protoc doesn't support JSON. Given that, converting from JSON to Protobuf Text Format would also answer the need.
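For illustration, a minimal Python sketch of that conversion, assuming the message type is defined in my_message.proto and has been compiled with protoc --python_out (the my_message_pb2 module and MyMessage type are placeholder names, not something from the question):

# JSON -> binary Protobuf using the Python protobuf runtime.
# Assumes "protoc --python_out=. my_message.proto" was run beforehand;
# my_message_pb2 / MyMessage are placeholders for the actual schema.
import sys
from google.protobuf import json_format
import my_message_pb2  # generated by protoc

def json_to_protobuf_binary(json_text: str) -> bytes:
    msg = my_message_pb2.MyMessage()
    json_format.Parse(json_text, msg)  # applies the Proto3 JSON mapping
    return msg.SerializeToString()     # binary wire format

if __name__ == "__main__":
    sys.stdout.buffer.write(json_to_protobuf_binary(sys.stdin.read()))

Saved as, say, json2pb.py, this could be piped on the command line: cat payload.json | python json2pb.py > payload.bin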
Related
I'm reading a Kafka JSON file from an Azure ADLS Gen2 storage account on Azure Databricks. I don't seem to be able to convert the binary value payload to a string so that I can perform the from_json conversion. I've tried various flavours of the cast, and in all cases the original binary value is shown in my final transformation.
I've tried ...
df.selectExpr("CAST(value as STRING)")
as well as ...
from pyspark.sql.functions import col
df.select(col("value").cast("string"))
I know I'm doing something stupid, because this is a trivial transformation, but I can't work out what I'm doing wrong.
I'm using Azure Databricks Runtime 11.3 LTS ML
The sample data I'm using is a Databricks Academy dataset.
I'm expecting the above code to transform this into a human-readable string, but the final transformation of 'value' is identical to the original binary.
What you're doing is correct for getting it to a string, but perhaps the data simply isn't a string? Perhaps it is actually an encoded (base64, maybe) or encrypted string, or a binary payload such as Avro, Protobuf, etc., which are not human-readable.
Without knowing how it was produced, you cannot know how to deserialize it; and if you are reading a .json file, as you say, then Spark doesn't care about file extensions...
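For illustration, a minimal PySpark sketch along those lines (the value column name comes from the question; the base64 step and the JSON field names are assumptions about how the payload might have been produced):

# Cast the Kafka value to a string; if it still looks like gibberish,
# try a base64 decode first (assumption: the producer base64-encoded the JSON).
from pyspark.sql.functions import col, unbase64, from_json
from pyspark.sql.types import StructType, StructField, StringType

as_string = df.select(col("value").cast("string").alias("value_str"))

decoded = df.select(
    unbase64(col("value").cast("string")).cast("string").alias("value_str")
)

# Hypothetical schema for the JSON payload.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("event", StringType()),
])
parsed = decoded.select(from_json(col("value_str"), schema).alias("payload"))

If neither variant yields readable text, the payload is most likely Avro, Protobuf, or encrypted, and you need the producer's schema or key to deserialize it.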
Is there a way in Kafka to consume an XML source, convert it to JSON, and send the JSON data to a Kafka sink?
I have seen Avro and Protobuf as converters in Kafka Connect. Are they capable of converting XML to JSON, or would they convert to Avro/Protobuf-specific formats rather than JSON?
Kafka Connect can use any data format. However, there is no built-in Converter for XML, so you can just use StringConverter.
Then you can use transforms to parse the XML into an internal format Connect works with, known as Struct. E.g. https://jcustenborder.github.io/kafka-connect-documentation/projects/kafka-connect-transform-xml/transformations/examples/FromXml.books.html
(the json "input" and "output" shown is from a REST proxy request, only look at the value field)
When written to a Connect sink, you can then use JSONConverter (or any other one) to deserialize the internal Struct object
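As a rough sketch of how that might be wired up for a sink reading a topic that already contains raw XML strings (the connector class, topic name, and file path are placeholders; the transform class name and its schema.path option are taken from the linked kafka-connect-transform-xml project and should be verified against its documentation):

# Hypothetical sink connector: StringConverter hands the raw XML string to the
# FromXml transform, which parses it into the internal Struct the sink writes out.
name=xml-events-sink
# placeholder for your actual sink connector class
connector.class=YourSinkConnectorClass
topics=xml-events
value.converter=org.apache.kafka.connect.storage.StringConverter
transforms=parseXml
transforms.parseXml.type=com.github.jcustenborder.kafka.connect.transform.xml.FromXml$Value
transforms.parseXml.schema.path=file:///path/to/books.xsd

On the Kafka-facing side, setting value.converter=org.apache.kafka.connect.json.JsonConverter is what turns that Struct into JSON.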
I'm working with what is effectively a JSON API that requires the JSON payload to be base64-encoded.
I can't imagine a good reason for encoding JSON as base64; it should already be safely UTF-8 encoded as valid JSON, so am I missing something?
Perhaps it's used as a way to serialise / compare the JSON, but I'm assuming that canonicalisation / comparison of JSON surely is a solved problem?
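To make the question concrete, a small Python sketch of the round trip (the payload contents are made up for illustration):

import base64, json

payload = {"user": "alice", "note": "héllo"}  # made-up example payload
encoded = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
decoded = json.loads(base64.b64decode(encoded).decode("utf-8"))
assert decoded == payload  # the base64 layer adds ~33% size but no information

The encoding mainly buys transport safety (e.g. embedding the payload where quotes or newlines would be awkward); it does not canonicalise or otherwise normalise the JSON.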
Until now I was using C# Newtonsoft Json to parse Graph API JSON responses. Everything was OK except when the text in the JSON response includes emoticons.
In that situation the JSON is not valid and the parsing stops.
Is there any way to parse JSON with emoticons? Do I have to convert them to something?
I want to convert CSV to an Avro schema in Node.js. I was able to convert CSV to JSON and am now trying to convert JSON to an Avro schema. Is there a package available in Node.js for this? Thanks in advance.
This link might be helpful. It lets you encode/decode the Avro binary format to/from JSON; it supports both deflate and snappy compression and supports Node streams:
https://www.npmjs.com/package/node-avro-io