I am trying to understand the recommended way of parsing JSON into an object, particularly from HttpClient responses (but my question may also apply to parsing JSON from streams in general).
I have scoured the Internet reading many blog posts, and this is what I have come up with:
I understand that reading a stream into a string and then parsing the string into an object is a big no-no in terms of memory usage.
According to many of the blog posts I have come across, the traditional way of doing it is (or used to be) working with streams, using the Newtonsoft.Json package as follows:
using var streamReader = new StreamReader(stream);
using var jsonTextReader = new JsonTextReader(streamReader);
var myDeserializedObject = new JsonSerializer().Deserialize<MyObject>(jsonTextReader);
But then I have come across another way of doing it:
If you are using .NET Core 3 or above (I am not so sure about the exact version), you have a built-in way of deserializing the stream using System.Text.Json:
var myDeserializedObject = await JsonSerializer.DeserializeAsync<MyObject>(stream);
and specifically for HttpClient requests (on .NET 5 and above, if I am not mistaken)
you can do the following, where requestUri is the address of the resource:
var myDeserializedObject = await httpClient.GetFromJsonAsync<MyObject>(requestUri);
Could someone please explain the pros and cons (if there are any) of each approach, especially in terms of performance and memory usage?
My company has been discussing a similar situation. We ended up with the following considerations.
Using Newtonsoft.Json:
Pros:
A widely used library with a lot of features, including serializing and deserializing JSON.
Provides good control over the serialization and deserialization process.
Cons:
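Can consume more memory, because it works on UTF-16 text internally, so UTF-8 input has to be transcoded first (it does not, however, require reading the whole stream into a string).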
May have performance overhead when serializing and deserializing large JSON payloads.
Using System.Text.Json:
Pros:
Built-in and optimized for performance in .NET Core 3 and above.
Generally consumes less memory than Newtonsoft.Json.
Generally performs better than Newtonsoft.Json.
Cons:
Has limited options for customizing the serialization and deserialization process.
May not have all the features available in Newtonsoft.Json.
For most cases, System.Text.Json should be sufficient for deserializing JSON payloads. However, for more complex JSON serialization/deserialization requirements, Newtonsoft.Json may still be the preferred choice. Ultimately, the choice depends on the specific requirements and constraints of your project.
I am very new to Swift programming, but fairly competent in programming in other languages.
I have a project which uses NSPersistentContainer for Core Data. I would like to export and re-import the data using JSON or XML.
I can manually generate a CSV file, but that’s limited in its usefulness, so I would prefer JSON; XML if I have to.
Everything I’ve found is dated, and requires extending NSManagedObject and using Codable. I gather this would not apply if I am using NSPersistentContainer.
Is there anything built in to modern Swift, or how would I go about doing this?
Codable is the thing built in to Swift to do what you want, and extending or subclassing NSManagedObject is how you use it with Core Data. Using NSPersistentContainer is orthogonal to the question. You almost certainly want it but it has no bearing on JSON import/export. It’s there to set up Core Data for you but you still use managed objects as your data model. Core Data doesn’t have built in support for JSON; it relies on the existence of Codable to provide that.
Scenario: I have arbitrary JSON documents, ranging in size from 300 KB to 6.5 MB, read from MongoDB. Since the data is arbitrary/dynamic, I cannot define a struct type for it in Go, so I am using the map[string]interface{} type. The JSON string is parsed using encoding/json's Unmarshal function, somewhat like what is described in "Generic JSON with interface{}".
Issue: The problem is that it takes a long time (around 30 ms to 180 ms) to parse the JSON string into map[string]interface{}, compared to PHP parsing JSON with json_encode/json_decode, igbinary, or msgpack.
Question: Is there any way to pre-process it and store it in a cache?
I mean: parse the string into map[string]interface{}, serialize that, and store it in some cache, so that when we retrieve it, deserializing it does not take much time and execution can proceed.
Note: I am a newbie to Go; any suggestions are highly appreciated. Thanks.
Update: I have already tried serialization using Gob, the built-in binary package, and a Msgpack implementation for Go. No luck: no improvement in deserialization time.
The standard library package for JSON is notoriously slow. There is a good reason for that: it uses reflection (run-time type information) to provide a really flexible interface that is really simple to use. Hence the unmarshalling being slower than PHP's…
Fortunately, there is an alternative, which is to implement the json.Unmarshaler interface on the types you want to use. If this interface is implemented, the package will use it instead of its standard reflection-based method, so you can see huge performance boosts.
To help you with that, a small group of tools has appeared, among which:
https://godoc.org/github.com/benbjohnson/megajson
https://godoc.org/github.com/pquerna/ffjson
(I am listing the main players from memory; there must be others.)
These tools generate tailored implementations of the json.Unmarshaler interface for the types you request. And with go:generate, you can even integrate them seamlessly into your build step.
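To show the mechanism, here is a minimal hand-written sketch. The interface is spelled json.Unmarshaler in the standard library. The Temperature type and the parseReadings helper are made up for illustration; the generators mentioned above emit far more elaborate code with their own lexers, so take this only as a picture of how encoding/json hands the raw bytes to your own method.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Temperature is a hypothetical type used only for illustration.
type Temperature struct {
	Celsius float64
}

// UnmarshalJSON implements json.Unmarshaler. encoding/json calls it with the
// raw bytes of the JSON value instead of filling the struct via reflection.
func (t *Temperature) UnmarshalJSON(data []byte) error {
	v, err := strconv.ParseFloat(string(data), 64)
	if err != nil {
		return fmt.Errorf("temperature: %w", err)
	}
	t.Celsius = v
	return nil
}

// parseReadings decodes a JSON array of numbers; each element goes through
// Temperature.UnmarshalJSON.
func parseReadings(raw []byte) ([]Temperature, error) {
	var out []Temperature
	err := json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	readings, err := parseReadings([]byte(`[21.5, 19, -3.25]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(readings)
}
```

Note that this only helps when you have concrete types; it does not apply directly to map[string]interface{}.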
There are many ways to parse JSON in the context of a Windows Store app,
regardless of the language (C#, JavaScript or C++).
For example: .NET 4.5's JsonObject and DataContractJsonSerializer, the JavaScript JSON parser, or an external library like Json.NET.
Does anybody know how they compare?
I have only read good things about Json.NET's performance.
But are they true, and does it matter for JSON documents that contain datasets of 100k JSON objects? Or would the user not notice a difference?
I only have experience using Json.NET ... it just works, fast and great! I have also used the library in enterprise projects and was never disappointed!
If it helps, and FWIW, I've recently been collecting some new JSON parsing / deserialization performance data, measured over various JSON payload "shapes" (and sizes) with four JSON libraries, here:
https://github.com/ysharplanguage/FastJsonParser#Performances
(.NET's out-of-the-box JavaScriptSerializer vs. JSON.NET vs. ServiceStack vs. JsonParser)
Please note:
these figures are for the full .NET only (i.e., the desktop / server tier; not mobile devices)
I was interested in getting new benchmark figures about parsing / deserialization performances only (i.e., not serialization)
finally, I was also especially interested (although not exclusively) in figures re: strongly typed deserialization use cases (i.e., deserializing into POCOs)
Hope this helps.
While reading tutorials on JSON/Gson, I've noticed most people prefer to download the file as a String and then parse that String as JSON. However, most XML tutorials prefer parsing directly from an InputStream.
Why the difference between the two? What is the best practice/does it even make a difference?
The lesson learned by XML users is that large object trees in-memory can take up lots of memory.
JSON parse trees don't intrinsically take up less memory than XML, but it's usually simpler. An XML DOM is quite featureful compared to a GSON JsonObject for instance. GSON may (I don't know) use a streaming parser (similar to SAX for XML) that loads only what is needed.
But the point I'm trying to make is that we've learned since then. Reasons JSON is typically loaded as a string include: parsers are more efficient, fewer features than a full DOM are needed in most cases, hardware is more powerful, JSON files are usually shorter, and programmers are lazier.
That said, I found this post when I realized I have to work in complex ways with JSON data sets that are too big to efficiently store in a single string. You shouldn't do that, but I'm grateful JsonParser.parse() has an implementation that can also take an InputStream.
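For what it's worth, here is a rough sketch of the streaming approach. It is written in Go with the standard encoding/json package rather than Gson, purely to illustrate the idea, and the Item type, countItems and countFromString are names I made up. It walks a top-level array and decodes one element at a time instead of holding the whole document in a string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Item is a hypothetical record type used only for illustration.
type Item struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// countItems reads a top-level JSON array from r and decodes one element at a
// time, so only the current element is held as a decoded value.
func countItems(r io.Reader) (int, error) {
	dec := json.NewDecoder(r)

	// Consume the opening '[' of the array.
	tok, err := dec.Token()
	if err != nil {
		return 0, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '[' {
		return 0, fmt.Errorf("expected array, got %v", tok)
	}

	n := 0
	for dec.More() {
		var it Item
		if err := dec.Decode(&it); err != nil {
			return n, err
		}
		n++ // a real program would process `it` here
	}

	// Consume the closing ']'.
	if _, err := dec.Token(); err != nil {
		return n, err
	}
	return n, nil
}

// countFromString wraps a string in a reader; in practice the reader would be
// a network response body or a file.
func countFromString(s string) (int, error) {
	return countItems(strings.NewReader(s))
}

func main() {
	n, err := countFromString(`[{"id":1,"name":"a"},{"id":2,"name":"b"}]`)
	if err != nil {
		panic(err)
	}
	fmt.Println(n)
}
```

Whether this is worth the extra code depends on the payload size; for small documents, parsing from a string is simpler and the difference is unlikely to matter.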
From a comment on the announcement blog post:
Regarding JSON: JSON is structured similarly to Protocol Buffers, but protocol buffer binary format is still smaller and faster to encode. JSON makes a great text encoding for protocol buffers, though -- it's trivial to write an encoder/decoder that converts arbitrary protocol messages to and from JSON, using protobuf reflection. This is a good way to communicate with AJAX apps, since making the user download a full protobuf decoder when they visit your page might be too much.
It may be trivial to cook up a mapping, but is there a single "obvious" mapping between the two that any two separate dev teams would naturally settle on? If two products supported PB data and could interoperate because they shared the same .proto spec, I wonder if they would still be able to interoperate if they independently introduced a JSON reflection of the same spec. There might be some arbitrary decisions to be made, e.g. should enum values be represented by a string (to be human-readable a la typical JSON) or by their integer value?
So is there an established mapping, and any open source implementations for generating JSON encoder/decoders from .proto specs?
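To make the enum ambiguity concrete, here is a small sketch of one possible convention: write the symbolic name, but accept either the name or the integer when reading. It is plain Go using only the standard encoding/json package and is not tied to any protobuf library; the Color enum, the colorNames table and decodeColor are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Color is a hypothetical enum standing in for an enum from a .proto spec.
type Color int

const (
	Red Color = iota
	Green
	Blue
)

var colorNames = map[Color]string{Red: "RED", Green: "GREEN", Blue: "BLUE"}

// MarshalJSON writes the symbolic name, the human-readable choice.
func (c Color) MarshalJSON() ([]byte, error) {
	name, ok := colorNames[c]
	if !ok {
		// Unknown values fall back to the integer form.
		return json.Marshal(int(c))
	}
	return json.Marshal(name)
}

// UnmarshalJSON accepts either the name or the integer value, so two peers
// that made different encoding choices can still read each other's output.
func (c *Color) UnmarshalJSON(data []byte) error {
	var n int
	if err := json.Unmarshal(data, &n); err == nil {
		*c = Color(n)
		return nil
	}
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return fmt.Errorf("color: expected string or integer: %w", err)
	}
	for v, name := range colorNames {
		if name == s {
			*c = v
			return nil
		}
	}
	return fmt.Errorf("color: unknown name %q", s)
}

// decodeColor is a small helper for trying out inputs.
func decodeColor(raw string) (Color, error) {
	var c Color
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

func main() {
	out, _ := json.Marshal(Green)
	fmt.Println(string(out))

	a, _ := decodeColor(`"BLUE"`)
	b, _ := decodeColor(`2`)
	fmt.Println(a == b)
}
```

Two teams could just as easily settle on integers only, or on lower-case names, which is exactly why I am asking whether an agreed mapping exists.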
Maybe this is helpful: http://code.google.com/p/protobuf-java-format/
From what I have seen, Protostuff is the project to use for any PB work on Java, including serializing it as JSON, based on protocol definition. I have not used it myself, just heard good things.
Yes, since Protocol Buffers version 3.0.0 (released July 28, 2016) there is "A well-defined encoding in JSON as an alternative to binary proto encoding", as mentioned in the release notes:
https://github.com/google/protobuf/releases/tag/v3.0.0
I needed to marshal from GeneratedMessageLite to a JSON object but did not need to unmarshal. I couldn't use the protobuf library in Pangea's answer because it doesn't work with the LITE_RUNTIME option. I also didn't want to burden our already large legacy system with generating more compiled code for the existing protocol buffers. For marshalling to JSON, I went with this simple solution:
final Person gpb = Person.newBuilder().setName("Bill Monroe").build();
final Gson gson = new Gson();
final String jsonString = gson.toJson(gpb);
One further thought: if protobuf objects have getters/setters, or appropriately named fields, one could simply use Jackson JSON processor's data binding. By default it handles public getters, any setters and public fields, but these are just default visibility levels and can be changed. If so, Jackson can serialize/deserialize protobuf generated POJOs without problems.
I have actually used this approach with Thrift-generated objects; the only thing I had to configure there was to disable serialization of various "isXXX()" methods that Thrift adds for checking if a field has been explicitly assigned or not.
First of all, I think one should think very carefully before putting effort into converting a dataset to protobufs. Here are my reasons for converting a dataset to protobufs:
Type safety: a guarantee on the format of the data being considered.
Uncompressed memory footprint of the data. The reason I mention uncompressed is that after compression there isn't much difference in size between compressed JSON and compressed protobuf, but compression has a cost associated with it. Also, the serialization/deserialization speed is about the same; in the benchmark linked below, Jackson JSON was in fact faster than protobufs. Please check out the following link for more information:
http://technicalrex.com/2014/06/23/performance-playground-jackson-vs-protocol-buffers/
The protobufs need to be transferred over the network a lot.
That said, once your dataset is in JSON that follows the structure of the protobuf definition, it can very easily be mapped directly to protobuf format using Protostuff's JsonIoUtil.mergeFrom function. Signature of the function:
public static <T> void mergeFrom(JsonParser parser, T message, Schema<T> schema, boolean numeric)
Reference to protostuff