How do I measure JSON property-reading performance?

I have a static dataset inside a react-native app. There are 3 JSON files: 500 KB, 2 MB and 10 MB.
Each of them contains a JSON object with ~8000 keys.
I'm concerned this might cause performance issues. At the moment I'm using Redux Toolkit state mainly for ids, and I read data through selectors inside Redux's connect. In some cases I have to access those JSON files sequentially ~100 times in a row, pulling certain properties by key (no array find or anything like that).
To be honest, I still don't know whether 10 MB is a significant amount at all.
How do I measure performance, and where can I read more about using large JSON files in react-native apps?
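One way to get a concrete number is to time the lookups yourself around a representative chunk of your selector logic. Below is a minimal sketch (TypeScript), assuming the JSON is already parsed into a plain object (e.g. via `require('./data.json')`); `bigObject` and `keysToRead` are placeholders for your real data and the ~100 keys you read in a row.

```ts
// Minimal timing sketch (TypeScript). `bigObject` and `keysToRead` are
// placeholders for the parsed JSON (e.g. from require('./data.json')) and
// the ~100 keys read in a row; swap in your real data.
type BigObject = Record<string, unknown>;

// Date.now() is millisecond-resolution and always available; if your React
// Native version exposes performance.now(), that gives finer resolution.
export function measureKeyReads(bigObject: BigObject, keysToRead: string[]): number {
  const start = Date.now();
  let found = 0;
  for (const key of keysToRead) {
    // Property access on an already-parsed object is cheap; the expensive
    // part is usually the one-time parse, not these per-key reads.
    if (bigObject[key] !== undefined) {
      found += 1;
    }
  }
  const elapsedMs = Date.now() - start;
  console.log(`Read ${keysToRead.length} keys (${found} present) in ${elapsedMs} ms`);
  return elapsedMs;
}

// One-time parse cost, if you load the raw JSON string yourself:
export function measureParse(rawJson: string): BigObject {
  const start = Date.now();
  const parsed = JSON.parse(rawJson) as BigObject;
  console.log(`JSON.parse took ${Date.now() - start} ms`);
  return parsed;
}
```

In practice the one-time parse (or the initial require) is usually where the 10 MB shows up, both in time and in retained memory; per-key reads on an already-parsed object are cheap, which is what a measurement like this should make visible.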

Related

Right tech choice to work with JSON data?

We are a data team working with our data producers to store and process infrastructure log data. Workers running on our clients' systems generate log data, primarily in JSON format.
There is no fixed structure to the JSON, since it depends on factors like the number of clusters a client runs, the tokens generated, etc. The top-level JSON elements that contain metadata about where the logs were generated do have a definite structure, but the actual data can go multiple levels deep with varying key-value pairs.
I want to build a system to ingest these logs, parse them, and present the data in a way that engineers and PMs (product managers) can read for analytics use cases.
My initial plan is to set up a compute layer like Kinesis at the source and write parsing logic that stores the outcome in S3. However, this requires prior knowledge of the JSON structure itself.
I would define a parser module to process the data based on the log type. For every incoming log, my compute layer (Kinesis?) directs processing to the corresponding parser module and emits the result to S3.
However, I am starting to explore whether a different storage engine (Elasticsearch, etc.) would fit my use case better. I am wondering whether anyone has run into such use cases and what you found helpful in solving the problem.
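Not an answer to the storage-engine question, but to make the parser-module idea above concrete, here is a rough TypeScript sketch of a registry keyed by log type: the compute layer only needs to know the top-level metadata shape, and unknown types fall through to a passthrough parser. All names here (`LogRecord`, `parserRegistry`, the log type strings) are made up for illustration.

```ts
// Sketch of the "parser module per log type" idea (TypeScript).
// LogRecord, the registry entries and the log type names are all illustrative.
interface LogRecord {
  logType: string;      // top-level metadata is assumed to have a known shape
  clusterId?: string;
  payload: unknown;     // arbitrarily nested, varies per client
}

type Parser = (record: LogRecord) => Record<string, unknown>;

// One module/function per known log type; supporting a new type is one new entry here.
const parserRegistry: Record<string, Parser> = {
  cluster_event: (r) => ({ type: r.logType, cluster: r.clusterId, raw: r.payload }),
  token_audit: (r) => ({ type: r.logType, raw: r.payload }),
};

// Unknown types fall through untouched so nothing is silently dropped.
const passthrough: Parser = (r) => ({ type: r.logType, unparsed: r.payload });

// The compute layer (Kinesis consumer, Lambda, ...) calls this per record
// and writes the flattened result to S3 / Elasticsearch / wherever.
export function parseRecord(raw: string): Record<string, unknown> {
  const record = JSON.parse(raw) as LogRecord;
  const parser = parserRegistry[record.logType] ?? passthrough;
  return parser(record);
}
```

The useful property of this shape is that adding a log type changes the registry, not the ingestion path itself.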

What's stopping me from using a standalone JSON file instead of a local db?

I need to store data for a native mobile app I'm writing, and I was wondering: why do I need to bother with DB setup when I can just read/write a JSON file? All the interactions are basic, and the data could most likely be handled as parsed JSON objects rather than queried.
What are the advantages?
DBs are intended for standardized data or large data sets. If you know there are only a few properties to read and they don't change, JSON may be easier; but if you have a list of items, a DB can optimize queries with indexes and ensure consistency across multiple tables.
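To make that trade-off concrete, here is roughly what the "just a JSON file" approach tends to look like in practice (a Node/TypeScript sketch; the file name and store shape are made up). Every change is a read-modify-write of the whole file, which is exactly the work a DB's indexes, partial updates and transactions take off your hands once the data grows or is written concurrently.

```ts
// Rough sketch of a JSON-file "store" (Node/TypeScript); names are illustrative.
import { promises as fs } from "fs";

const STORE_PATH = "./store.json"; // hypothetical location

type Store = Record<string, unknown>;

async function readStore(): Promise<Store> {
  try {
    // The whole file is read and parsed on every access.
    return JSON.parse(await fs.readFile(STORE_PATH, "utf8")) as Store;
  } catch {
    return {}; // a missing or unreadable file starts the store empty
  }
}

async function writeKey(key: string, value: unknown): Promise<void> {
  // Read-modify-write of the entire file: simple, but no indexes,
  // no partial updates, and concurrent writers can clobber each other.
  const store = await readStore();
  store[key] = value;
  await fs.writeFile(STORE_PATH, JSON.stringify(store, null, 2), "utf8");
}

export { readStore, writeKey };
```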

Streaming very large json directly from a POST request in Django

I'm looking for efficient ways to handle very large JSON files (i.e. possibly several GB in size, amounting to a few million JSON objects) in requests to a Django 2.0 server (using Django REST Framework). Each row needs to go through some processing and then get saved to a DB.
The biggest pain point so far is the sheer memory consumption of the file itself, and the fact that memory use keeps climbing while the data is being processed in Django, with no way to manually release it.
Is there a recommended way of processing very large JSON files in requests to a Django app without blowing up memory consumption? Could this be combined with compression (gzip)? I'm thinking of uploading the JSON to the API as a regular file upload, streaming it to disk, and then streaming from the file on disk using ijson or similar. Is there a more straightforward way?
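Not Django-specific, but as an illustration of the constant-memory shape this question is after (process the payload record by record from a file on disk instead of materializing it all at once), here is a sketch using newline-delimited JSON and Node's readline purely as a stand-in for what ijson would do on the Python side; `processRow` and the file path are hypothetical.

```ts
// Stand-in sketch (TypeScript/Node): stream a large newline-delimited JSON file
// from disk and handle one record at a time, keeping memory roughly constant.
// On the Django side, ijson over a temp file plays the analogous role.
import { createReadStream } from "fs";
import { createInterface } from "readline";

async function processRow(row: Record<string, unknown>): Promise<void> {
  // Placeholder for "some processing, then save to a DB".
  void row;
}

export async function streamNdjson(path: string): Promise<number> {
  const rl = createInterface({
    input: createReadStream(path, { encoding: "utf8" }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  let count = 0;
  for await (const line of rl) {
    if (!line.trim()) continue;         // skip blank lines
    await processRow(JSON.parse(line)); // only one record is in memory at a time
    count += 1;
  }
  return count;
}
```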

Should I use JSONField or FileField to store JSON data?

I am wondering how I should store my JSON data to get the best performance and scalability.
I have two options:
The first would be to use JSONField, which probably gives me an advantage in simplicity when it comes to performance and handling the data, since I don't have to read it out of a file each time.
My second option would be to store the JSON in FileFields as JSON files. This seems appealing because the huge quantity of JSON wouldn't be stored in the database (only the location of the file). In my opinion it's the better option for scalability, but maybe not for user-facing performance, since the file has to be read each time before the data is displayed in the template.
I would like to know whether I am reasoning about this correctly: which of the two ways of storing JSON data makes it reusable as fast as possible, without complicating the database or hurting scalability?
JSONField will generally perform well because it can be indexed. A very good feature is native data access: you don't have to load and parse the JSON and then query it; you can query directly on the model field. Since you have a huge amount of JSON data, a file may seem like the better option, but the file's only advantage is storage.
Quoting from a random article found via a Google search:
A Postgres JSON field takes almost 11% more space than the JSON file on your file system; in a test, a 268 MB (formatted) JSON file came to 233 MB in a JSON field.
Storing in a file has some cons, including reading the file, parsing the JSON and then querying it, which is time consuming since these are disk-based operations. Scalability will not be an issue with JSONField, although your DB size will be larger, so moving the data around might become harder for you.
So unless you are short on database space, you should choose JSONField.

Using MongoDB to store a single but complex JSON object

I want to store a single, big, complex JSON object in MongoDB and be able to retrieve and modify specific parts of it. A simple solution would be to store it in a single document, but I'm not sure how that would play with multiple write requests. Another option would be to keep every node of the JSON in a different document, kind of like the pattern explained here in the MongoDB documentation. That way I could retrieve only parts of the whole object and work on them individually.
My question is: do I gain anything from the latter approach? I'm fairly new to MongoDB, but from what I've read it locks the database on concurrent write requests, so it would seem that taking my JSON apart like this achieves nothing when it comes to scaling.
If you are considering storing data larger than 16 MB, you should definitely use some sort of hashing/splitting across documents, as MongoDB has a 16 MB size limit on its documents.
From MongoDB Limits and Thresholds:
The maximum BSON document size is 16 megabytes.
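For the single-document route (assuming the object stays under that 16 MB limit), it may help to know that you can read and write just a sub-path of the document rather than round-tripping the whole thing. Here is a sketch with the official MongoDB Node.js driver in TypeScript; the database, collection, document and field names are all invented for illustration.

```ts
// Sketch: partial reads and updates on one big document (MongoDB Node.js driver).
// Database, collection, document and field names are illustrative only.
import { MongoClient } from "mongodb";

export async function touchOneBranch(uri: string): Promise<void> {
  const client = new MongoClient(uri);
  try {
    await client.connect();
    const docs = client.db("app").collection("bigObjects");

    // Read only one branch of the object instead of the whole document.
    const branch = await docs.findOne(
      { name: "the-one-big-object" },
      { projection: { "tree.branchA": 1 } }
    );
    console.log(branch);

    // Modify one nested field in place. Single-document updates are atomic,
    // and a $set on one path does not overwrite sibling branches, so two
    // writers touching different parts of the object don't clobber each other.
    await docs.updateOne(
      { name: "the-one-big-object" },
      { $set: { "tree.branchA.status": "done" } }
    );
  } finally {
    await client.close();
  }
}
```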