We are using Immutable.js in our flux archtiecture.
Our state object is a fairly complex tree with thousands of nodes in it. We have no choice but to send this as a JSON object back to the server.
We are using data.toJS() to convert to JSON, however it takes about 2 seconds to finish this call. It is unfortunately too slow for us.
Is there a better solution to our problem?
Related
In my project I need to convert json to java object and vice versa multiple times.Using standard object mapper we can do this but does it have considerable performance overhead? Is there any number published ? I could not find much info it .I ran a test but in high throughput low latency framework not sure if it is good idea and should try some alternate.
Any help would be really appreciated.
Already answered here.
All in all, Jackson is considered to be faster, but I've also read that gson gives better performance when working with large objects.
I have a large JSON file, its size is 5.09 GB. I want to convert it to an XML file. I tried online converters but the file is too large for them. Does anyone know how to to do that?
The typical way to process XML as well as JSON files is to load these files completely into memory. Then you have a so called DOM which allows you various kinds of data processing. But neither XML nor JSON are really designed for storing that much data you have here. To my experience you typically will run into memory problems as soon as you exceed a 200 MByte limit. This is because DOMs are created that are composed from individual objects. This approach results in a huge memory overhead that far exceeds the amount of data you want to process.
The only way for you to process files like that is basically to take a stream approach. The basic idea: Instead of parsing the whole file and loading it into memory you parse and process the file "on the fly". As data is read it is parsed and events are triggered on which your software can react and perform some actions as needed. (For details on that have a look at the SAX API in order to understand this concept in more detail.)
As you stated you are processing JSON, not XML. Stream APIs for JSON should be available in the wild as wel. Anyway you could implement one fairly easily yourself: JSON is a pretty simple data format.
Nevertheless such an approach is not optimal: Typically such a concept will result in very slow data processing because of millions of method invocations involved: For every item encountered you typically need to call a method in order to perform some data processing task. This together with additional checks about what kind of information you currently have encountered in the stream will slow down data processing pretty much.
You really should consider to use a different kind of approach. First split your file into many small ones, then perform processing on them. This approach might not seem to be very elegant, but it helps to keep your task much simpler. This way you gain a main advantage: It will be much easier for you to debug your software. Unfortunately you are not very specific about your problem, so I can only guess, but large files typically imply that the data model is pretty complex. Therefor you will probably be much better off by having many small files instead of a single huge one. And later it allows you to dig into individual aspects of your data and the data processing process as needed. You will probably fail getting any detailed insights into that while having a single large file of 5 GByte to process. On errors you will have trouble to identify which part of the huge file is causing the problem.
As I already stated you unfortunately are very unspecific about your problem. Sorry, but because of having no more details about your problem (and your data in particular) I can only give you these general recommendations about data processing. I do not know any details about your data, so I can not give you any recommendation about which approach will work best in your case.
I am writing code to collect data from the Steam API (documentation: https://developer.valvesoftware.com/wiki/Steam_Web_API).
Particularly, I use the fromJSON() function from the jsonlite package in [R].
However, I have the feeling that the code is slow, and that the bottleneck is the actual calling of the API. I am currently able to do around 7.500-10.000 calls an hour, resulting in around 2-3 calls per second. This feels slow. Is it possible to speed this up, and if so, how?
Two things I found already is that it may be necessary to close the connection after opening it (cf. http://www.firaja.cc/steam-web-api-right-way.html). Also, the API allows json (what I use now) but also XML and CSV outputs, maybe it is better to use one of the latter two? Any other possible solutions?
Scenario: I am having a arbitrary JSON size ranging from 300 KB - 6.5 MB read from MongoDB. Since it is very much arbitrary/dynamic data I cannot have struct type defined in golang, So I am using map[sting]interface{} type. And string of JSON data is parsed by using encoding/json's Unmarshal method. Some what similar to what is mentioned in Generic JSON with interface{}.
Issue: But the problem is its taking more time (around 30ms to 180ms) to parse the string json into map[string]interface{}. (Comparing to php parsing json using json_encode/decode / igbinary/ msgpack)
Question: Is there any way to pre-process it and store in cache?
I mean parse string into map[string]interface{} and serialize it and store to some cache, then when we retrieve it should not take much time to unserialization and proceed with execution.
Note: I am Newbie for the golang any suggestion are highly appriciated. Thanks
Updates: Serialization using Gob, binary built-in package & Msgpack implementation for Golang package are already tried. No luck, No improvement in the time to unserialization.
The standard library package for JSON is notoriously slow. There is a good reason for that: it use RTTI to provide a really flexible interface that is really simple. Hence the unmarshalling being slower than PHP's…
Fortunately, there is an alternative way, which is to implement the json.Unmarshaller interface on the types you want to use. If this interface is implemented, the package will use it instead of its standard method so you can see huge performance boosts.
And to help you, there is a small group of tools that appeared, among which:
https://godoc.org/github.com/benbjohnson/megajson
https://godoc.org/github.com/pquerna/ffjson
(listing here the main players from memory, there must be others)
These tools will generate tailored implementations of the json.Unmarshaller interface for the types you requested. And with go:generate, you can even integrate them seamlessly to your build step.
I'd like to compute very large JSON files (about 400 MB each) in Scala.
My use-case is batch-processing. I can receive several very big files (up to 20 GB, then cut to be processed) at the same moment and I really want to process them quickly as a queue (but it's not the subject of this post!). So it's really about distributed architecture and performance issues.
My JSON file format is an array of objects, each JSON object contains at least 20 fields. My flow is composed of two major steps. The first one is the mapping of the JSON object into a Scala object. And the second step is some transformations I'm making on the Scala object data.
To avoid loading all the file in memory, I'd like a parsing library where I can have incremental parsing. There are so many libraries (Play-JSON, Jerkson, Lift-JSON, the built in scala.util.parsing.json.JSON, Gson) and I cannot figure out which one to take, with the requirement to minimize dependencies.
Do you have any ideas of a library I can use for high-volume parsing with good performances?
Also, I'm searching a way to process in parallel the mapping of the JSON file and the transformations made on the fields (between several nodes).
Do you think I can use Apache Spark to do it? Or are there alternative ways to accelerate/distribute the mapping/transformation?
Thanks for any help.
Best regards, Thomas
Considering a scenario without Spark, I would advise to stream the json with Jackson Streaming (Java) (see for example there), map each Json object to a Scala case class and send them to an Akka router with several routees that do the transformation part in parallel.