Random JSON file to DataStruct unmarshalling - json

I want to create a data struct (DS) in Go given an arbitrary JSON file; that is, take the JSON file's content and unmarshal it into the DS.
Looking around, I have found solutions for creating such a DS, but they require knowing the JSON format beforehand (key:value pairs, types of the values, etc.). You would also have to enter the struct fields 'manually' and then unmarshal the JSON content into the struct. Of course, you could write a small script to do that, but it seems impractical, even if not impossible or unimplementable.
Do you know a more straightforward way to achieve this?
I also found something about unmarshalling the JSON's content into an interface, but I am fairly sure (not 100%, though) that we will want to keep the data in a more static format, i.e. a DS. Is there a way to transform this hypothetical interface into a DS?

Maybe you can try to do this using https://github.com/golang/go/blob/e7f2e5697ac8b9b6ebfb3e0d059a8c318b4709eb/src/encoding/json/stream.go#L371
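For the arbitrary-JSON case, here is a minimal sketch (the payload and field names are illustrative) that unmarshals into a map[string]interface{} and type-asserts values as needed; json.Decoder from the link above works the same way when reading from a stream instead of a byte slice:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Illustrative payload; in practice this would come from reading the file.
	data := []byte(`{"name": "example", "count": 3, "tags": ["a", "b"]}`)

	// Unmarshal into a generic container; no struct definition needed.
	var doc map[string]interface{}
	if err := json.Unmarshal(data, &doc); err != nil {
		panic(err)
	}

	// Values come back as interface{}; type-assert to recover concrete types.
	// Note that JSON numbers decode to float64 by default.
	if count, ok := doc["count"].(float64); ok {
		fmt.Println("count:", int(count))
	}
	for key, value := range doc {
		fmt.Printf("%s -> %v (%T)\n", key, value, value)
	}
}

If you later settle on a concrete struct, you can marshal the map back to JSON and unmarshal that into the struct; tools that generate struct definitions from sample JSON also exist, but the generated fields still need to be reviewed by hand.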

Related

Adaptive Card: What is the best way to display an arbitrary JSON object?

I'd like to generate an adaptive card that contains an arbitrary JSON object inside.
I anticipate that the JSON object will be shallow.
Maybe it will contain just a list of key-value pairs.
But I won't know the structure of that JSON object until runtime.
For this reason, I can't templatize this portion of the adaptive card.
It would be ideal if I could embed the JSON inside of a codeblock, but I don't know if that's supported.
Alternatively, I'd be willing to embed the JSON inside of a monotype textbox.
Any help would be greatly appreciated.
Embedding JSON inside of a codeblock isn't yet supported, although that would indeed be a great way to solve this use case.
A template isn't necessarily static or fixed. If you're planning to "expand" your template with the AdaptiveCards.Templating NuGet package, for instance, you'll load the template into memory at some point; at that point, you can mutate the template according to the incoming JSON at runtime. This isn't an ideal solution, but it is a way to accommodate a truly arbitrary JSON object.
If, on the other hand, you anticipate that the JSON object will merely be a list of key-value pairs, you can use data binding.
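As a rough sketch of the key-value case (written in TypeScript; the card payload and names below are illustrative, not tied to any particular host), you can build a FactSet from the arbitrary object at runtime instead of templating it:

// Build an Adaptive Card payload from an arbitrary flat JSON object.
// Assumes a shallow object of key-value pairs, as described above.
function cardFromObject(obj: Record<string, unknown>): object {
  return {
    type: "AdaptiveCard",
    version: "1.3",
    body: [
      {
        type: "FactSet",
        facts: Object.entries(obj).map(([key, value]) => ({
          title: key,
          value: String(value), // FactSet values must be strings
        })),
      },
    ],
  };
}

// Usage: any shallow JSON object works without knowing its shape in advance.
const card = cardFromObject({ status: "open", count: 42 });
console.log(JSON.stringify(card, null, 2));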

Processing JSON data from Kafka using structured streaming

I want to convert incoming JSON data from Kafka into a dataframe.
I am using structured streaming with Scala 2.12
Most people add a hard-coded schema, but if the JSON can have additional fields, that requires changing the code base every time, which is tedious.
One approach is to write the data out to a file and infer the schema from that, but I would rather avoid doing so.
Is there any other way to approach this problem?
Edit: I found a way to turn a JSON string into a dataframe, but I can't extract the string from the stream source. Is it possible to extract it?
One way is to store the schema itself in the message headers (not in the key or value).
Though this increases the message size, it makes it easy to parse the JSON value without the need for any external resource like a file or a schema registry.
New messages can have new schemas while old messages can still be processed using their old schema, because the schema travels within the message itself.
Alternatively, you can version the schemas and include an id for every schema in the message headers, or a magic byte in the key or value, and infer the schema from there.
This is the approach followed by the Confluent Schema Registry. It lets you go through different versions of the same schema and see how your schema has evolved over time.
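A rough sketch of the header approach in Structured Streaming (the topic name, broker address, and header key are illustrative; Spark 3.0+ exposes Kafka headers via the includeHeaders option). Note that from_json still needs one schema per query, so truly per-message schemas would have to be dispatched by version, e.g. inside foreachBatch:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("kafka-json").getOrCreate()
import spark.implicits._

val raw = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "events")       // illustrative topic name
  .option("includeHeaders", "true")    // Spark 3.0+: adds a headers column
  .load()

// headers is an array<struct<key: string, value: binary>>;
// pull the schema header out alongside the JSON payload.
val withSchema = raw.select(
  $"value".cast("string").as("json"),
  expr("filter(headers, h -> h.key = 'schema')[0].value").cast("string").as("schemaJson")
)
// from_json(col, schema) requires a fixed schema per query, so to honor
// per-message schemas you would branch on schemaJson, e.g. in foreachBatch.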
Read the data as a string and then convert it to a Map[String, String]; this way you can process any JSON without even knowing its schema.
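A minimal sketch of that map-based approach, reusing the raw stream from the sketch above: cast the Kafka value to a string and parse it as a map, so no schema is hard-coded. Nested objects survive as their JSON text inside the map values:

import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{MapType, StringType}

// Parse each JSON message into a Map[String, String] with no fixed schema.
val parsed = raw
  .selectExpr("CAST(value AS STRING) AS json")
  .select(from_json($"json", MapType(StringType, StringType)).as("kv"))

// Individual fields can then be pulled out by key at runtime.
val names = parsed.selectExpr("kv['name'] AS name")  // 'name' is illustrative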
Based on JavaTechnical's answer, the best approach would be to use a schema registry and
Avro data instead of JSON; there is no way around hard-coding a schema (for now).
Include your schema name and id as a header and use them to read the schema from the schema registry.
Use the from_avro function to turn that data into a df!
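A rough sketch of that flow, again reusing the raw stream from above (the registry URL and subject name are illustrative; this uses Spark 3.x's built-in from_avro, which expects a plain Avro body, so the 5-byte Confluent wire-format prefix is stripped first):

import org.apache.spark.sql.avro.functions.from_avro
import org.apache.spark.sql.functions.expr

// Fetch the writer schema from the registry (sketch; any HTTP client works).
// The subject name follows the common "<topic>-value" convention.
val schemaJson = scala.io.Source
  .fromURL("http://localhost:8081/subjects/events-value/versions/latest/schema")
  .mkString

// Confluent framing: 1 magic byte + 4-byte schema id precede the Avro payload.
val avroBody = raw.select(
  expr("substring(value, 6, length(value) - 5)").as("body")
)

val df = avroBody
  .select(from_avro($"body", schemaJson).as("data"))
  .select("data.*")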

Apache Spark Read One Complex JSON File Per Record RDD or DF

I have an HDFS directory full of the following JSON file format:
https://www.hl7.org/fhir/bundle-transaction.json.html
What I am hoping to do is find an approach to flatten each individual file to become one df record or rdd tuple. I have tried everything I could think of using read.json(), wholeTextFiles(), etc.
If anyone has any best practices advice or pointers, it would be sincerely appreciated.
Load via wholeTextFiles, something like this:
sc.wholeTextFiles(...)    // RDD[(FileName, JSON)]
  .map(...processJSON...) // RDD[JsonObject]
Then you can simply call the .toDF method, and the schema will be inferred from your JsonObject.
As for the processJSON method, you could just use something like the Play JSON parser.
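A minimal sketch of that pipeline (the HDFS path is a placeholder); instead of a hand-written Play parser, this variant lets spark.read.json infer the nested schema from each whole file:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("bundle-flatten").getOrCreate()
import spark.implicits._

// Each element is (path, full file contents), so multi-line JSON stays intact.
val files = spark.sparkContext.wholeTextFiles("hdfs:///data/bundles/*.json")

// One JSON document per dataset row; Spark infers the (nested) schema.
val jsonDs = spark.createDataset(files.map(_._2))
val df = spark.read.json(jsonDs)

df.printSchema()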
mapPartitions is used when you have to deal with data structured so that a single element can span multiple lines. I've worked with both JSON and XML using mapPartitions.
mapPartitions works on an entire block of data at a time, as opposed to a single element, so while you should be able to use the DataFrameReader API with JSON, mapPartitions can definitely do what you'd like. I don't have the exact code to flatten a JSON file, but I'm sure you can figure it out. Just remember the output must be an iterable type.
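As a sketch of the mapPartitions route, reusing the files RDD from the sketch above (Jackson is used here rather than Play, purely for illustration; the extracted field names are illustrative): the parser is created once per partition instead of once per record:

import com.fasterxml.jackson.databind.ObjectMapper

val flattened = files.mapPartitions { iter =>
  // One mapper per partition, reused across all files in that block.
  val mapper = new ObjectMapper()
  iter.map { case (path, json) =>
    val root = mapper.readTree(json)
    // Pull a couple of top-level fields; adjust to the bundle structure.
    (path, root.path("resourceType").asText(), root.path("type").asText())
  }
}

// The tuples can then go to a DataFrame, e.g. flattened.toDF("path", "resourceType", "type").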

Javascript in place of json input step

I am loading data from a mongodb collection to a mysql table through Kettle transformation.
First I extract them using MongodbInput and then I use the JSON Input step.
But since the JSON Input step has very low performance, I wanted to replace it with a JavaScript script.
I am a beginner in JavaScript, and even though I tried some things, the Kettle JavaScript step is not recognizing any keywords.
Can anyone give me sample code to convert JSON data into different columns using JavaScript?
To solve your problem you need to look at three aspects:
Reading from MongoDB
Reading from JSON
Reading from (probably) String
Reading from MongoDB: unless you changed the interface, MongoDB returns not JSON but BSON (~binary JSON). You need to see the MongoDB documentation about reading and writing BSON: probably something like BSON.to() and BSON.from(), but I don't know it by heart.
Reading from JSON: once you have your BSON in JSON format, you can serialize it using JSON.stringify(), which returns a String.
Reading from (probably) String: if you want to use the capabilities of JSON (why else would you use JSON?), you also want to use JSON.parse(), which returns a JSON object.
My experience is that using a String to send a JSON object from one step to another is not a bad idea: at the end of a JavaScript step, you write your JSON object to a String, and at the beginning of the next JavaScript step (which can be further down the stream) you parse it back into a JSON object to work with it.
I hope this answers your question.
PS: writing JavaScript steps requires you to learn JavaScript. You don't have to be a master, but the basics are required. There is no way around it.
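A minimal sketch for a Modified Java Script Value step (the field names are illustrative; JSON.parse and JSON.stringify are available in reasonably recent versions of the Rhino engine that Kettle embeds):

// 'json_string' is an incoming row field holding the JSON document.
var obj = JSON.parse(json_string);

// Expose individual values as new output fields of the step.
var name = obj.name;
var city = obj.address ? obj.address.city : null;

// To pass the object to a later JavaScript step, serialize it back to a string.
var json_out = JSON.stringify(obj);

Each var you declare (name, city, json_out) then has to be added as an output field in the step's Fields grid so downstream steps see it as a column.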
You could use the JSON Input step to get the values of this JSON and put them into regular rows.

Send PDF as byte[] / JSON problem

I am trying to send a generated PDF file (Apache FOP) to the client. I know this can be done by writing the array to the response stream and by setting the correct content type, length and so on in the servlet. My problem is that the whole app was built based on the idea that it will only receive/send JSON. In the servlet's service() method, I have this:
response.setContentType("application/json");
reqBroker.process(request, response);
RequestBroker is the class that processes the JSON (Jackson processor); everything is generic and I cannot change it. On top of this, I have to receive the JSON from the request correctly to access the data and generate my PDF, so those two lines are necessary. But when I send the response, I need another content type so that the PDF displays correctly in the browser.
So far, I am able to send the byte array as part of the JSON, but then I don't know how to display the array as a PDF on the client (if something like this is even possible).
I would like some suggestions on how can I send my pdf and set the right header, without messing with the JSON. Thanks.
JSON and byte arrays don't mix.
Instead, you should create an <iframe> and point it to a URL that returns a raw PDF.
Take a look here: How to send pdf in json; it lists a couple of approaches you can consider. The easiest way is to convert the binary data into a string using Base64 encoding. In C#, this would mean a call to Convert.ToBase64String (and Convert.FromBase64String to decode). However, this has a space overhead, as Base64 encoding means around 33% more memory. If you can get away with it, this is the least complicated solution. In case the additional size is an issue, you can think about zipping it up first.
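A minimal sketch of the Base64 route on the Java side (the field names are illustrative; java.util.Base64 has been in the JDK since Java 8, and Jackson is assumed since the app already uses it):

import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;

public class PdfJson {
    public static void main(String[] args) throws Exception {
        // Stand-in for the Apache FOP output; these are the "%PDF" header bytes.
        byte[] pdfBytes = new byte[] {0x25, 0x50, 0x44, 0x46};

        // Encode the binary PDF as a Base64 string so it survives inside JSON.
        String encoded = Base64.getEncoder().encodeToString(pdfBytes);

        Map<String, String> payload = new HashMap<>();
        payload.put("fileName", "report.pdf"); // illustrative field names
        payload.put("pdf", encoded);

        ObjectMapper mapper = new ObjectMapper();
        String json = mapper.writeValueAsString(payload);
        System.out.println(json);

        // The client decodes the field back into bytes before saving or embedding.
        byte[] roundTrip = Base64.getDecoder().decode(encoded);
        System.out.println(roundTrip.length == pdfBytes.length);
    }
}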