I have a huge flat JSON string with some 1000+ fields. I want to restructure the JSON into a nested/hierarchical structure based on certain business logic, without doing a lot of object-to-JSON or JSON-to-object conversions, so that performance is not affected.
What are the ways to achieve this in scala?
Thanks in advance!
I suggest you have a look at the JSON transformers provided by the play-json library. They let you manipulate JSON (moving fields, creating nested objects) without doing any object mapping.
Check this out : https://www.playframework.com/documentation/2.5.x/ScalaJsonTransformers
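play-json expresses such moves declaratively in Scala with JsPath-based Reads; as a language-neutral sketch of the underlying idea (the `RULES` mapping and the field names below are invented for illustration), here is the same flat-to-nested restructuring in plain Python, with exactly one parse and one serialization:

```python
import json

# Hypothetical business rules: flat field name -> nested path.
RULES = {
    "street": ("address", "street"),
    "city":   ("address", "city"),
    "name":   ("name",),
}

def restructure(flat_json: str) -> str:
    flat = json.loads(flat_json)          # single parse of the whole string
    nested = {}
    for key, value in flat.items():
        path = RULES.get(key, (key,))     # unknown fields stay at the top level
        node = nested
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return json.dumps(nested)             # single serialization back out
```

The point of the transformer approach is the same: the restructuring is a pure JSON-to-JSON operation, so no intermediate object model needs to be built for 1000+ fields.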
I want to use Spark Streaming to read JSON messages from a single Kafka topic; however, not all the events have the same schema. If possible, what is the best way to check each event's schema and process it accordingly?
Is it possible to group similar-schema events into several in-memory groups and then process each group as a bulk?
I'm afraid you can't. You need to decode the JSON message somehow to identify its schema, and that would happen in your Spark code. However, you can try populating the Kafka message key with a different value per schema, and have Spark assign partitions per key.
Object formats like Parquet and Avro are good for this reason, since the schema is available in the header. If you absolutely must use JSON, you can do as you said and group by key while casting to the object you want. If you are using large JSON objects, you will see a performance hit, since the entire JSON "file" must be parsed before any object resolution can take place.
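A minimal language-neutral sketch of the group-then-process idea (the `type` discriminator field and the function name are assumptions for illustration, not part of any Kafka or Spark API):

```python
import json
from collections import defaultdict

def group_by_schema(raw_messages):
    """Bucket raw JSON events by a schema discriminator so each group
    can later be processed as a bulk with a single, known schema."""
    groups = defaultdict(list)
    for raw in raw_messages:
        event = json.loads(raw)   # with JSON, a full parse is unavoidable
        groups[event.get("type", "unknown")].append(event)
    return dict(groups)
```

In Spark this corresponds to a `groupBy`/`groupByKey` on the decoded discriminator, or directly on the Kafka message key if you populate it per schema as suggested above.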
I am using Python 3 for functional testing of a bunch of REST endpoints.
But I cannot figure out the best way to validate the JSON response (verifying the type, required, missing and additional fields).
I thought of the options below:
1. Writing custom code that validates the response while converting the data into Python class objects.
2. Validating using JSON Schema.
Option 1 would be difficult to maintain, and I would need to add separate validation functions for all the data models.
Option 2: I like it, but I don't want to write a schema for each endpoint in a separate file/object. Is there a way to put them all in a single object, like a Swagger YAML file? That would be easier to maintain.
I would like to know which option is the best and if there are other better options / libraries available.
I've been through the same process, but validating REST requests and responses with Java. In the end I went with JSON Schema (there's an equivalent Python implementation at https://pypi.python.org/pypi/jsonschema) because it was simple and powerful, and hand-crafting the validation for anything but a trivial payload soon became a nightmare. Also, reading a JSON Schema file is easier than reasoning about a long list of validation statements.
It's true you need to define the schema in a separate file, but this proved to be no big deal. And, if your endpoints share some common features you can modularise your schemas and reuse common parts. There's a good tutorial at Understanding JSON Schema.
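On the "single object" wish: nothing stops you keeping all endpoint schemas in one dict keyed by endpoint and dispatching on it. The checker below is a deliberately tiny stdlib stand-in for the real `jsonschema` library (which does far more and is what you should use in practice); the endpoint name and fields are made up:

```python
# All schemas in one place, keyed by endpoint -- the "single object" layout.
SCHEMAS = {
    "/users": {
        "required": {"id": int, "name": str},
        "optional": {"email": str},
    },
}

def validate(endpoint, payload):
    """Return a list of problems: missing fields, wrong types, additional fields."""
    schema = SCHEMAS[endpoint]
    allowed = {**schema["required"], **schema["optional"]}
    errors = []
    for field in schema["required"]:
        if field not in payload:
            errors.append(f"missing: {field}")
    for field, value in payload.items():
        if field not in allowed:
            errors.append(f"additional: {field}")
        elif not isinstance(value, allowed[field]):
            errors.append(f"type: {field}")
    return errors
```

With `jsonschema` itself you would get the same single-dict layout by using one document with a `definitions`/`$defs` section and `$ref` pointers, which also gives you the reuse across endpoints mentioned above.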
I have a 7000+ object JSON array that I would like to present as a list with live search and two-way sorting.
I've been looking at Angular, and have successfully made a list with live search, but it's obviously very slow with all this data.
What would be the most optimal way to handle this? Are there other libraries out there, that could handle the job better?
Why not use a streaming API for JSON parsing? Most JSON libraries offer one.
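To sketch what a streaming parser buys you, here is an incremental stdlib-Python reader (a language-neutral illustration, not a recommendation of a specific JS library) that yields array elements one at a time via `json.JSONDecoder.raw_decode`, so a search can start inspecting items before the whole 7000-object array is materialised:

```python
import json

def iter_json_array(text):
    """Yield elements of a top-level JSON array one by one,
    without building the whole list in memory first."""
    decoder = json.JSONDecoder()
    pos = text.index("[") + 1
    while True:
        # skip whitespace and commas between elements
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text) or text[pos] == "]":
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

Combined with a generator filter, a live search only pays for the items it actually inspects; on the browser side the usual complement is a virtualised or paginated list so the DOM never holds all 7000 rows at once.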
I have run into many problems serializing/deserializing Scala data types to/from JSON objects and then storing/retrieving them in MongoDB in BSON form.
1st question: why does the Play Framework use JSON while MongoDB uses BSON?
2nd question: if I am not mistaken, JavaScript does not have readers and writers for serializing/deserializing BSON from MongoDB. How does this work? JavaScript can handle JSON seamlessly, but for BSON I expect it would need some sort of readers and writers.
3rd question: (I read somewhere that) Salat and ReactiveMongo use different mechanisms to talk to MongoDB. Why is that?
JSON is a widely used format for transferring data these days, so it is good to have it available out of the box in a web framework. That is why Play has it.
For the same reason, Mongo stores data in essentially the format users query and save it in. So why does Mongo use BSON rather than JSON? BSON is the same as JSON but carries additional properties on every value: its data length and data type. The reason: when scanning a lot of data (as a DB query does), with plain JSON you have to read each whole object just to get to the next one. If the data length is known up front, the reader can simply skip it.
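A toy stdlib-Python illustration of why length prefixes matter (this is not the real BSON wire format, just the skipping idea): each value is stored with its length, so a scan can jump over fields it doesn't care about without ever decoding them.

```python
import struct

def encode(fields):
    """Encode a dict of name -> bytes as [name_len][name][value_len][value]..."""
    out = b""
    for name, value in fields.items():
        nb = name.encode()
        out += struct.pack("<B", len(nb)) + nb
        out += struct.pack("<I", len(value)) + value
    return out

def find_field(buf, target):
    """Scan for one field, skipping other values via their length prefix."""
    i = 0
    while i < len(buf):
        nlen = buf[i]; i += 1
        name = buf[i:i + nlen].decode(); i += nlen
        vlen = struct.unpack_from("<I", buf, i)[0]; i += 4
        if name == target:
            return buf[i:i + vlen]
        i += vlen                      # skip the value without decoding it
    return None
```

With plain JSON there is no `vlen` to jump by, so the scanner would have to parse every intermediate value character by character.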
So you just don't need any BSON reader in JS (one may exist somewhere, but it is rarely used), because BSON is a format for internal DB usage.
You can read this article for more information.
I am trying to learn HTML5's IndexedDB with Mozilla's tutorial Using IndexedDB.
I understand that IndexedDB is an object store implementation. But in all the examples I tried, they store simple objects with key:value pairs. How would I save nested or hierarchical objects, for example a parent object with a list of child objects? What is the best way to deal with complex object structures in IndexedDB?
I know the OOP and XML representations of parent-child objects.
How would I achieve this in IndexedDB? Any tutorial or source would be very helpful.
they are only storing key:value pairs. But how would I save a nested object?
What is a nested object? You can store any object that can be represented as JSON (or, more correctly, that is serializable by the structured clone algorithm). Is that what you mean by a nested object? You can convert any OOP object into JSON and reconstruct it through its constructor. For XML, just store the serialized string.
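A minimal sketch of that OOP-to-JSON round trip for a parent with child objects, in Python for illustration (the class names are made up; in IndexedDB itself the structured clone algorithm stores such nested objects directly, no manual serialization needed):

```python
import json

class Child:
    def __init__(self, name):
        self.name = name

class Parent:
    def __init__(self, name, children):
        self.name = name
        self.children = children

    def to_json(self):
        # serialize the whole parent/child tree as one nested document
        return json.dumps({
            "name": self.name,
            "children": [{"name": c.name} for c in self.children],
        })

    @classmethod
    def from_json(cls, text):
        # rebuild the objects through their constructors
        data = json.loads(text)
        return cls(data["name"], [Child(c["name"]) for c in data["children"]])
```

The stored value is a single nested document, which is exactly how a parent-with-children record would sit in one IndexedDB object store entry.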
If you mean relationships, that is a different question. I have written a bit about IndexedDB relationships. Modeling a relationship in IndexedDB is not a problem; in fact, it is supported very well.