What is the performance overhead of converting JSON to a Java object and vice versa?

In my project I need to convert JSON to Java objects and vice versa multiple times. The standard ObjectMapper can do this, but does it have considerable performance overhead? Are there any published numbers? I could not find much information on it. I ran a test myself, but for a high-throughput, low-latency framework I am not sure whether this is a good idea or whether I should try an alternative.
Any help would be really appreciated.

Already answered here.
All in all, Jackson is generally considered to be faster, but I've also read that Gson gives better performance when working with large objects.
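
For a rough idea on your own payloads, a minimal round-trip timing sketch with Jackson could look like the one below. The Order class, field values, and iteration counts are made up for illustration, and for serious numbers a harness like JMH is the safer choice, since a naive loop is easily distorted by JIT warm-up and GC.

    import com.fasterxml.jackson.databind.ObjectMapper;

    public class JsonRoundTripTiming {

        // Hypothetical payload class, purely for illustration
        public static class Order {
            public long id;
            public String customer;
            public double total;
        }

        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper(); // create once and reuse it

            Order order = new Order();
            order.id = 42L;
            order.customer = "ACME";
            order.total = 99.5;

            // Warm up the JIT before measuring
            for (int i = 0; i < 100_000; i++) {
                mapper.readValue(mapper.writeValueAsString(order), Order.class);
            }

            int iterations = 1_000_000;
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                String json = mapper.writeValueAsString(order);
                Order back = mapper.readValue(json, Order.class);
                if (back.id != order.id) {        // keep the result live so the loop is not optimised away
                    throw new IllegalStateException();
                }
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(iterations + " round trips in " + elapsedMs + " ms");
        }
    }
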

Related

Immutable.js performance slow with toJS()

We are using Immutable.js in our Flux architecture.
Our state object is a fairly complex tree with thousands of nodes in it. We have no choice but to send this as a JSON object back to the server.
We are using data.toJS() to convert it to JSON; however, this call takes about 2 seconds to finish, which is unfortunately too slow for us.
Is there a better solution to our problem?

CSV to JSON benchmarks

I'm working on a project that uses parallel methods to convert text from one form to another. We're going to implement a CSV to JSON converter to demonstrate the speedups that are possible using our parallel framework.
We want to benchmark our converter once it's finished. What are the fastest libraries/stand-alone programs/etc. out there that are capable of doing CSV-to-JSON conversion? I found a list of potential candidates here: Large CSV to JSON/Object in Node.js, but I'm not sure how fast the listed options are. In the worst case I'll benchmark them myself, but if someone already knows what the "best in class" converters are, it would save me some time.
It looks like the maintainer of csvtojson has developed a benchmark application. I think I can add my CSV-to-JSON converter to his benchmark project to test it.
If your project can consider in-browser apps, I suggest csvtojson, as it is by far the speediest converter on the market as of 2017.
I created it myself, so I may be a bit biased, but I specifically developed it for a bigger project that required heavy CSV-to-JSON crunching.
Let me know if it helps.
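
If a plain Java reference point is useful for the benchmark mentioned above, a minimal single-threaded CSV-to-JSON sketch using Jackson's jackson-dataformat-csv module might look like this. The file names are placeholders, and it collects all rows in memory, so it is only a baseline, not a contender.

    import com.fasterxml.jackson.databind.MappingIterator;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.dataformat.csv.CsvMapper;
    import com.fasterxml.jackson.dataformat.csv.CsvSchema;

    import java.io.File;
    import java.util.List;
    import java.util.Map;

    public class CsvToJsonBaseline {
        public static void main(String[] args) throws Exception {
            File input = new File("input.csv");    // placeholder paths
            File output = new File("output.json");

            // Treat the first CSV row as a header; each following row becomes a Map
            CsvMapper csvMapper = new CsvMapper();
            CsvSchema schema = CsvSchema.emptySchema().withHeader();
            MappingIterator<Map<String, String>> rows =
                    csvMapper.readerFor(Map.class).with(schema).readValues(input);

            // Collect the rows and write them out as a single JSON array
            List<Map<String, String>> all = rows.readAll();
            new ObjectMapper().writerWithDefaultPrettyPrinter().writeValue(output, all);
        }
    }
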

Apache Johnzon vs Jackson

Since Apache has released the first final version of Johnzon, it would be really interesting to see whether there are already comparisons between Johnzon and FasterXML Jackson, to judge whether it is worth switching. The most important topic is probably performance.
Has anyone already done performance tests? Can you share your result?
Best
There are some performance benchmarks up on GitHub.
But for each of them you really have to verify whether the benchmark is actually implemented correctly.
From what I've seen, most benchmarks use the official javax.* APIs in a sub-optimal way. Most call Json.createGenerator etc. directly, but they should actually look up JsonProvider.provider() once, store it away, and then call createGenerator etc. on that JsonProvider.
That way you can make sure that you really get comparable results.
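
As an illustration of that pattern, a minimal sketch with the plain javax.json API might look like this; the field name and value are arbitrary, and it assumes an implementation such as Johnzon or the reference implementation is on the classpath.

    import javax.json.spi.JsonProvider;
    import javax.json.stream.JsonGenerator;
    import java.io.StringWriter;

    public class CachedProviderExample {

        // Look the provider up once and reuse it; the static Json.createGenerator(...)
        // helpers resolve the provider again on every call, which can skew benchmarks.
        private static final JsonProvider PROVIDER = JsonProvider.provider();

        public static String writeSample() {
            StringWriter out = new StringWriter();
            try (JsonGenerator gen = PROVIDER.createGenerator(out)) {
                gen.writeStartObject()
                   .write("name", "johnzon")
                   .writeEnd();
            }
            return out.toString();
        }
    }
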
We have done quite a few tests, and to me Johnzon's numbers look really good, especially since it's much smaller than most other JSON libraries.
As mentioned in several other sources and mailing lists (TomEE, for example), the performance gain, if any, is negligible, especially when you compare it to the overall request-response processing chain.
If you use Spring Boot, you will find a lot more community support and flexibility in terms of features for Jackson.
Jackson has tons of different modules and good support for other JVM languages (for example, the KotlinModule), as sketched below.
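
For example, module registration in plain Java looks roughly like this; JavaTimeModule is just one module picked for illustration, and findAndRegisterModules() registers whatever modules (KotlinModule, Jdk8Module, ...) are found on the classpath.

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

    public class MapperSetup {

        // Explicit registration of a single module (java.time support from jackson-datatype-jsr310)
        static final ObjectMapper MAPPER = new ObjectMapper()
                .registerModule(new JavaTimeModule());

        // Or let Jackson discover and register every module on the classpath via ServiceLoader
        static final ObjectMapper AUTO = new ObjectMapper().findAndRegisterModules();
    }
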
In my project we also use quite a lot of Clojure, where we use Cheshire, which relies on Jackson under the hood.
In the end, it's up to you what to use and whether the cases I mentioned apply to your project, but so far I haven't seen any compelling performance reports about Johnzon, and until that happens I would go for the library with much higher adoption in the industry.

Jerkson JSON parser for Scala

I have used Jerkson for Scala to serialize my list of objects to a JSON file. I'm able to decompose the objects into JSON and write them to a file. Now, when I want to read the file back into my program for further processing, I get the error below. FYI, my file is 500 MB and might grow up to 1 GB in the future.
I saw a few forum posts that suggested increasing -XX:MaxPermSize=256M. I'm not sure this is going to solve my problem; even if it does for now, what is the guarantee that the problem won't surface again later when my JSON file grows to 1 GB? Is there a better alternative? Thanks!
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
at java.lang.String.intern(Native Method)
at org.codehaus.jackson.util.InternCache.intern(InternCache.java:41)
at org.codehaus.jackson.sym.CharsToNameCanonicalizer.findSymbol(CharsToNameCanonicalizer.java:506)
at org.codehaus.jackson.impl.ReaderBasedParser._parseFieldName(ReaderBasedParser.java:997)
at org.codehaus.jackson.impl.ReaderBasedParser.nextToken(ReaderBasedParser.java:418)
at com.codahale.jerkson.deser.ImmutableMapDeserializer.deserialize(ImmutableMapDeserializer.scala:32)
at com.codahale.jerkson.deser.ImmutableMapDeserializer.deserialize(ImmutableMapDeserializer.scala:11)
at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2704)
at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1315)
at com.codahale.jerkson.Parser$class.parse(Parser.scala:83)
at com.codahale.jerkson.Json$.parse(Json.scala:6)
at com.codahale.jerkson.Parser$class.parse(Parser.scala:14)
at com.codahale.jerkson.Json$.parse(Json.scala:6)
From the stack trace we can see that Jackson interns the Strings that are parsed as the names of fields in your document. When a String is interned, it is put in the PermGen, which is the part of the heap that you are running out of. I reckon this is because your document has many, many different field names - perhaps generated with some naming scheme? Whatever the case, increasing your MaxPermSize might help some, or at least delay the problem, but it won't solve it completely.
Disabling String interning in Jackson, on the other hand, should solve it completely. The Jackson FAQ has more information about what configuration options to tweak:
http://wiki.fasterxml.com/JacksonFAQ#Problems_with_String_intern.28.29ing
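
The stack trace above comes from Jackson 1.x (used by Jerkson), where the option names differ slightly; with Jackson 2.x, disabling field-name interning looks roughly like this:

    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class NoInternMapper {

        public static ObjectMapper create() {
            // Stop the parser from interning field names, so documents with huge numbers
            // of distinct field names no longer fill the intern pool.
            JsonFactory factory = new JsonFactory();
            factory.disable(JsonFactory.Feature.INTERN_FIELD_NAMES);
            // factory.disable(JsonFactory.Feature.CANONICALIZE_FIELD_NAMES); // even more aggressive option
            return new ObjectMapper(factory);
        }
    }
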
Adding memory will only treat the symptom rather than cure the disease. I would say this Jerkson memory issue is a blessing in disguise that exposes a fundamental design flaw.
As for how you cure the disease, I can't say for sure since I know nothing about your application or the use cases. I am pretty sure that you don't need 1 GB of information at a time. Consider streaming reads of your JSON file into a database or cache and then fetching only what you need to solve a particular problem.
Vague, I know, but I can't offer specifics without more details. The bottom line is streaming and persisting.
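
As a sketch of what "streaming and persisting" could look like with Jackson's streaming API, assuming the file is a large JSON array of objects (the file name and persist() are placeholders):

    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.core.JsonToken;
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    import java.io.File;

    public class StreamingLoad {
        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();

            try (JsonParser parser = mapper.getFactory().createParser(new File("data.json"))) {
                if (parser.nextToken() != JsonToken.START_ARRAY) {
                    throw new IllegalStateException("Expected a top-level JSON array");
                }
                // Only one element is materialised in memory at a time
                while (parser.nextToken() == JsonToken.START_OBJECT) {
                    JsonNode record = mapper.readTree(parser);
                    persist(record);
                }
            }
        }

        static void persist(JsonNode record) {
            // placeholder for the actual write to a database or cache
        }
    }
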

JSON library in Scala and distribution of the computation

I'd like to process very large JSON files (about 400 MB each) in Scala.
My use case is batch processing. I can receive several very big files (up to 20 GB, which are then split for processing) at the same moment, and I really want to process them quickly as a queue (but that's not the subject of this post!). So it's really about distributed architecture and performance issues.
My JSON file format is an array of objects, and each JSON object contains at least 20 fields. My flow is composed of two major steps. The first one is mapping the JSON objects into Scala objects. The second step is some transformations I'm making on the Scala object data.
To avoid loading the whole file into memory, I'd like a parsing library that supports incremental parsing. There are so many libraries (Play-JSON, Jerkson, Lift-JSON, the built-in scala.util.parsing.json.JSON, Gson) and I cannot figure out which one to take, with the requirement to minimize dependencies.
Do you have any ideas about a library I can use for high-volume parsing with good performance?
Also, I'm looking for a way to parallelize the mapping of the JSON file and the transformations made on the fields (across several nodes).
Do you think I can use Apache Spark to do it? Or are there alternative ways to accelerate/distribute the mapping/transformation?
Thanks for any help.
Best regards, Thomas
Considering a scenario without Spark, I would advise streaming the JSON with Jackson Streaming (Java) (see for example there), mapping each JSON object to a Scala case class, and sending them to an Akka router with several routees that do the transformation part in parallel.
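
A rough Java sketch of that pipeline, with a plain ExecutorService standing in for the Akka router and a made-up MyRecord type (the original suggestion maps each element to a Scala case class and hands it to Akka routees):

    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.core.JsonToken;
    import com.fasterxml.jackson.databind.ObjectMapper;

    import java.io.File;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ParallelJsonPipeline {

        // Hypothetical record type; the real data has at least 20 fields
        public static class MyRecord {
            public String id;
            public double value;
        }

        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();
            ExecutorService pool =
                    Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

            // Stream the top-level JSON array; each element is bound and handed off individually
            try (JsonParser parser = mapper.getFactory().createParser(new File("batch.json"))) {
                if (parser.nextToken() != JsonToken.START_ARRAY) {
                    throw new IllegalStateException("Expected a top-level JSON array");
                }
                while (parser.nextToken() == JsonToken.START_OBJECT) {
                    MyRecord record = mapper.readValue(parser, MyRecord.class);
                    pool.submit(() -> transform(record)); // the "routee" part of the original answer
                }
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }

        static void transform(MyRecord record) {
            // placeholder for the per-record transformation step
        }
    }

In a real pipeline you would also want a bounded work queue so the reader cannot outrun the workers and buffer the whole file in memory.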