I have a .csv file with 400 million lines of data. If I were to convert it into a data API that returns JSON, would there be any limitations when consumers call GET on the API? Would it return the full content of the data set, and would it take long for the API to produce a response when called?
If you expose this as a GET API call, you might run into the following issues:
You might run into a maximum size limit, i.e. the maximum amount of data you can transfer over a GET request. It will depend on your server and the client's device; you can refer to this answer for details.
Latency will depend on the physical locations of your server and your clients; you can potentially reduce it by caching your data if it does not change frequently.
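As an illustration, here is a minimal sketch of a paginated endpoint that avoids sending all 400 million rows in one response. It assumes Python with Flask, and the path, parameter names, and page-size cap are placeholders rather than a definitive design:

    # Minimal sketch: serve the CSV-backed data page by page instead of all at once.
    # Assumes Flask; CSV_PATH and the query parameters are placeholders.
    import csv
    from itertools import islice
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    CSV_PATH = "data.csv"  # placeholder path to the source file

    def rows():
        with open(CSV_PATH, newline="") as f:
            yield from csv.DictReader(f)

    @app.route("/data")
    def get_data():
        offset = int(request.args.get("offset", 0))
        limit = min(int(request.args.get("limit", 100)), 1000)  # cap the page size
        page = list(islice(rows(), offset, offset + limit))
        return jsonify({"offset": offset, "limit": limit, "items": page})

In practice you would load the CSV into a database or build an index so a page can be fetched without scanning the whole file, and you could put a cache in front of the API (or add Cache-Control headers) if the data rarely changes.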
Hope that helps
While carrying out various tests on sending data to the ThingsBoard platform and measuring the respective transfer times, an important problem has arisen. When I send a single JSON file with 1,001 variables over HTTP to a ThingsBoard platform device, no variables arrive on the platform. However, when the JSON file has 1,000 variables, all 1,000 variables are delivered to the platform. Does the platform have a limit so that the JSON files that are sent cannot have more than 1,000 variables?
I don't know of any hard limit that could be restricting the data transfer in that case. The performance depends on the setup of your ThingsBoard instance (including the database, queue, etc.).
Please have a look at the ThingsBoard logs, or use the API usage dashboard in the ThingsBoard Web UI.
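If a 1,001-key payload is being dropped somewhere along the chain, one workaround is to split it into smaller batches before posting it to the device's HTTP telemetry endpoint. A rough Python sketch, where the host, access token, and batch size of 1,000 (based on your observation) are assumptions:

    # Rough sketch: split a large telemetry dictionary into batches and post each one
    # to ThingsBoard's HTTP device telemetry endpoint. Host and token are placeholders.
    import requests

    THINGSBOARD_URL = "https://thingsboard.example.com"  # placeholder
    ACCESS_TOKEN = "DEVICE_ACCESS_TOKEN"                  # placeholder
    BATCH_SIZE = 1000  # 1,000 keys worked in your test

    def post_telemetry_in_batches(telemetry: dict):
        items = list(telemetry.items())
        for i in range(0, len(items), BATCH_SIZE):
            batch = dict(items[i:i + BATCH_SIZE])
            resp = requests.post(
                f"{THINGSBOARD_URL}/api/v1/{ACCESS_TOKEN}/telemetry",
                json=batch,
                timeout=30,
            )
            resp.raise_for_status()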
I am using Google Cloud Datastore to store some data. When trying to extract all entities from my "Kind" through the API into GAS code, I realized that the API returns 300 entities at a time. To extract all of the entities, I used the "cursor" option to fetch each next batch where the previous one stopped.
Is there any way to extract all entities (or at least more than 300) at once?
While searching the web, I did not find any specific answer to this.
The maximum number of entities you can update/upsert in one go via Datastore's Data API is 500, but with a lookup operation you could potentially fetch 1,000 entities (as long as they are collectively under 10 MB for the transaction), as listed under the "Limits" section of Datastore's reference documentation.
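Your code is in GAS, but as an illustration of the cursor loop, here is a sketch using the Python client library (google-cloud-datastore); the kind name and page size are placeholders:

    # Sketch: page through all entities of a kind using query cursors.
    # Assumes the google-cloud-datastore client; "MyKind" and the page size are placeholders.
    from google.cloud import datastore

    client = datastore.Client()
    query = client.query(kind="MyKind")

    all_entities = []
    cursor = None
    while True:
        it = query.fetch(limit=500, start_cursor=cursor)
        page = list(next(it.pages, []))   # one batch of results
        all_entities.extend(page)
        cursor = it.next_page_token       # None once the last page is reached
        if not cursor or not page:
            break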
However, you might be able to leverage the export/import endpoints of Datastore's Admin API to export/import data in bulk. Check out the guide for more information.
We are about to build an API that deals with an HBase table. Let's say the API has two methods: api.com/get/ to get something out of HBase and api.com/put/ to put a matrix into HBase. We want to put and get matrices of around 200 MB.
We can't come to a conclusion on how to send the data to this API. Do you think it is OK to send an HTTPS request that represents the 200 MB input matrix as JSON in a POST parameter?
Can't find any best practices for this case. Thank you.
The payload limit depends on the client's and server's RAM and processor.
Theoretically there is no limit in the standard (RFC 2616). However, it is not a good idea to construct such a big payload, because it will probably fail for one of these reasons:
lost packets on data transmission
limits on server side
limits on client side
The best approach is to split your 200 MB input matrix into smaller inputs and make multiple requests.
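For example, here is a rough sketch of the splitting idea, assuming the matrix can be sliced into row blocks and that the server exposes a (hypothetical) endpoint accepting a chunk together with its row offset:

    # Sketch: upload a large matrix in row blocks instead of one 200 MB request.
    # The endpoint shape, matrix_id field, and chunk size are hypothetical.
    import requests

    API_URL = "https://api.com/put/"  # endpoint name taken from the question
    ROWS_PER_CHUNK = 1000             # tune so each request stays a few MB

    def upload_matrix(matrix_id, matrix):
        for start in range(0, len(matrix), ROWS_PER_CHUNK):
            chunk = matrix[start:start + ROWS_PER_CHUNK]
            payload = {"matrix_id": matrix_id, "row_offset": start, "rows": chunk}
            resp = requests.post(API_URL, json=payload, timeout=60)
            resp.raise_for_status()

Also note that JSON is a very verbose encoding for numeric matrices; a binary format would shrink the payload considerably before you even start chunking.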
I'm building a system which requires an Arduino board to send data to the server.
The requirements/constraints of the app are:
The server must receive data and store them in a MySQL database.
A web application is used to graph and plot historical data.
Data consumption is critical.
Web application must also be able to plot data in real time.
So far, the system is working fine, however, optimization is required.
The currently adopted steps are:
Accumulate data in Arduino board for 10 seconds.
Send the data to the server using a POST request whose body contains an XML string representing the 10 records.
The server parses the received XML and stores the values in the database.
This approach is good for historical data, but not for real-time monitoring.
My question is: Is there a difference between:
Accumulating the data and sending it as XML, and,
Sending the data each second.
In terms of data consumption, is sending a POST request each second too much?
Thanks
EDIT: Can anybody provide a mathematical formula benchmarking the two approaches in terms of data consumption?
For your data consumption question, you need to figure out how much each POST costs you given your cell phone plan. I don't know of a ready-made formula, but you could easily test and work it out.
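As a back-of-the-envelope comparison: sending n records in one POST costs roughly H + n*s bytes (H = per-request overhead from HTTP headers and TCP/IP, s = bytes per record), while sending one record per second costs n*(H + s), so batching saves about (n - 1)*H per window. The numbers below are assumptions you would need to measure on your own setup:

    # Back-of-envelope comparison; the byte counts are assumptions to measure.
    HEADER_OVERHEAD = 300  # bytes of HTTP headers + TCP/IP overhead per POST (assumed)
    RECORD_SIZE = 50       # bytes of XML per record (assumed)
    RECORDS = 10           # records per 10-second window

    batched = HEADER_OVERHEAD + RECORDS * RECORD_SIZE       # one POST per 10 s
    per_second = RECORDS * (HEADER_OVERHEAD + RECORD_SIZE)  # one POST per second

    print(batched, per_second)  # with these guesses: 800 vs 3500 bytes per window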
However, using 3G (or even Wi-Fi for that matter), power consumption will be an issue, especially if your circuit runs on a battery: each POST draws a burst of around 1.5 A, which is too much for sending data every second.
But again, why would you send data every second?
Real time doesn't mean sending data every second; it means being at least as fast as the system you are measuring.
For example, if you are sending temperatures, temperature doesn't change from 0° to 100° in one second. So all those POSTs will be a waste of power and data.
You need to know how fast the parameters in your system change and adapt your POST rate accordingly.
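For instance, one common pattern is to send a reading only when it changes by more than a threshold, or when a maximum interval has passed. The logic is shown in Python for readability (on the Arduino it would be C/C++), and the threshold and interval values are placeholders:

    # Sketch of adaptive reporting: post only on a significant change or after a
    # maximum silence interval. The threshold and interval are placeholders.
    import time

    THRESHOLD = 0.5    # e.g. degrees of change worth reporting
    MAX_INTERVAL = 60  # seconds: always report at least this often

    last_value = None
    last_sent = 0.0

    def maybe_send(value, send):
        global last_value, last_sent
        now = time.time()
        changed = last_value is None or abs(value - last_value) >= THRESHOLD
        if changed or (now - last_sent) >= MAX_INTERVAL:
            send(value)  # e.g. your existing POST to the server
            last_value = value
            last_sent = now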
I don't know whether I am asking a valid question.
My problem is that I want to send a large amount of data from one application to another, for which I am using JSONP to pass the data to a handler file that stores it in a database. As the data is large, I am dividing it into chunks and passing the packets in a loop; the more packets there are, the longer it takes to transfer the complete data, which ultimately results in a performance issue. (FYI, my web server is a bit slow.)
Is there any way I can compress my data and send it all at once rather than sending it in packets?
OR
Is there any other way I can pass my (large) data from one application to another?
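If you control both applications, one option to explore is replacing the JSONP chunk loop with a single POST whose body is gzip-compressed JSON. A minimal sketch, where the URL is a placeholder and the receiving handler is assumed to gunzip the body before storing it:

    # Sketch: compress the whole JSON payload and send it in one POST.
    # The URL is a placeholder; the handler must decompress the gzip body.
    import gzip
    import json
    import requests

    HANDLER_URL = "https://other-app.example.com/handler"  # placeholder

    def send_all(data):
        body = gzip.compress(json.dumps(data).encode("utf-8"))
        resp = requests.post(
            HANDLER_URL,
            data=body,
            headers={"Content-Type": "application/json",
                     "Content-Encoding": "gzip"},
            timeout=120,
        )
        resp.raise_for_status()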
Need this ASAP
Thanks in advance.