Fast, simple data storage with a JSON API - json

I need to store a small and simple dataset online that is retrievable as JSON. I'm thinking of something along the lines of Google Spreadsheets, had that service not been a bit unreliable from time to time.
I'm building a small web app that will display some numbers. Rather than hardcoding the numbers into the application, I need to go through the whole deployment process beforehand. When I get my hands on the numbers, I just want to pop them into the data storage service and have them available via a JSON API within a few minutes.
What is the easiest, simplest and cheapest (but reliable) way to do this?
Cheers!

have a look at :
couch: http://couchdb.apache.org/
mongodb: http://www.mongodb.org/

When you are hosting the small web app yourself anyway, you might just static serve the JSON document with the required parameters for your app and FTP it without a whole new deployment when update is needed.
But when you have less control about the deployment, then the cheapest but least elegant way might be a shared (read-only) text file in Dropbox or similar. But check about throttling and limits and TOS first before using such a way.

Probably the easiest and simplest is https://parse.com/ , there is a very easy to use jquery api for this storage: https://github.com/srhyne/jQuery-Parse
An example on how to put a new data into the storage:
$.parse.post('mykey',{ value : 'myvalue' }, function(json){console.log(json);});
An example on how to get this data from the storage:
$.parse.post('mykey', function(json){console.log(json);});

Related

'Search' endpoint seems unstable

When using Forge Data Management API endpoint
projects/:project_id/folders/:folder_id/search we have two problems.
It seems that we sometimes have to wait for several minutes (hours?)
after a model is uploaded until it can be found by the search.
We often get error 429 "Too Many Requests" even that we only call do
very few calls (less than 10 within an hour).
These issues makes the endpoint hard to use in production code. Is there anything we can do to improve the success rate? Is Autodesk going to improve the endpoint?
This question is related to How to find cloud Item id of a Revit model?
We are aware of those issues and working on improving things in those areas. However, I cannot tell when exactly those will become available.
In the meantime, depending on your workflow, there are two things that could be of help:
Use webhooks in order to be notified about new files being added to BIM 360/ACC
Use folder/contents endpoint to find the file you need, which also supports filtering just like the folder/search endpoint. You would have to iterate through subfolders though if you wanted to look for items in them as well. Newly added files should show up here straight away.

Best strategy to script a web journey in jmeter with too many json submits in requests

I am in process of developing a Jmeter script for a web journey which has close to 30 transactions, I observe that there are like 20 requests(API calls) which are submitting heavy Json load(with lots of fields) and this Json structure varies significantly with key data( Account number), it doesn't look like corelating 100s of fields of each of these json requests is an efficient way of scripting. I did give a try to submit predefined Json files in the requests, but that way I would need at least 10 Json( 1 json per request) for each Account number and considering I am looking to test like 200 Account numbers, this also doesn't look like a reasonable approach. Can someone please suggest pointers to approach this ? Should I test APIs independently ?
There is only one strategy: your load test must represent real life application usage with 100% accuracy so I would go for separate APIs testing in 2 cases:
As the last resort if for some reason realistic workload is not possible
When one particular API is known to be a bottleneck and you need to repeat the test focusing on one endpoint only
With regards to correlation there are some semi and fully automated correlation solutions like:
Correlations Recorder Plugin for JMeter where you can specify the correlation rules prior to recording and all matching values will be automatically replaced with the respective extractors/variables
BlazeMeter Proxy Recorder which is capable of exporting recorded scripts in "SmartJMX" mode with automatic detection and correlation of dynamic elements
If the next request can be "constructed" from the previous response but with some transformation of the JSON structure you can amend the response and convert it to the required form using JSR223 Test Elements - see Apache Groovy - Parsing and Producing JSON article for more information

REST API - file (ie images) processing - best practices

We are developing server with REST API, which accepts and responses with JSON. The problem is, if you need to upload images from client to server.
Note: and also I am talking about a use-case where the entity (user) can have multiple files (carPhoto, licensePhoto) and also have other properties (name, email...), but when you create new user, you don't send these images, they are added after the registration process.
The solutions I am aware of, but each of them have some flaws
1. Use multipart/form-data instead of JSON
good : POST and PUT requests are as RESTful as possible, they can contain text inputs together with file.
cons : It is not JSON anymore, which is much easier to test, debug etc. compare to multipart/form-data
2. Allow to update separate files
POST request for creating new user does not allow to add images (which is ok in our use-case how I said at beginning), uploading pictures is done by PUT request as multipart/form-data to for example /users/4/carPhoto
good : Everything (except the file uploading itself) remains in JSON, it is easy to test and debug (you can log complete JSON requests without being afraid of their length)
cons : It is not intuitive, you cant POST or PUT all variables of entity at once and also this address /users/4/carPhoto can be considered more as a collection (standard use-case for REST API looks like this /users/4/shipments). Usually you cant (and dont want to) GET/PUT each variable of entity, for example users/4/name . You can get name with GET and change it with PUT at users/4. If there is something after the id, it is usually another collection, like users/4/reviews
3. Use Base64
Send it as JSON but encode files with Base64.
good : Same as first solution, it is as RESTful service as possible.
cons : Once again, testing and debugging is a lot worse (the body can have megabytes of data), there is increase in size and also in processing time in both - client and server
I would really like to use solution no. 2, but it has its cons... Anyone can give me a better insight of "what is best" solution?
My goal is to have RESTful services with as much standards included as possible, while I want to keep it as simple as possible.
OP here (I am answering this question after two years, the post made by Daniel Cerecedo was not bad at a time, but the web services are developing very fast)
After three years of full-time software development (with focus also on software architecture, project management and microservice architecture) I definitely choose the second way (but with one general endpoint) as the best one.
If you have a special endpoint for images, it gives you much more power over handling those images.
We have the same REST API (Node.js) for both - mobile apps (iOS/android) and frontend (using React). This is 2017, therefore you don't want to store images locally, you want to upload them to some cloud storage (Google cloud, s3, cloudinary, ...), therefore you want some general handling over them.
Our typical flow is, that as soon as you select an image, it starts uploading on background (usually POST on /images endpoint), returning you the ID after uploading. This is really user-friendly, because user choose an image and then typically proceed with some other fields (i.e. address, name, ...), therefore when he hits "send" button, the image is usually already uploaded. He does not wait and watching the screen saying "uploading...".
The same goes for getting images. Especially thanks to mobile phones and limited mobile data, you don't want to send original images, you want to send resized images, so they do not take that much bandwidth (and to make your mobile apps faster, you often don't want to resize it at all, you want the image that fits perfectly into your view). For this reason, good apps are using something like cloudinary (or we do have our own image server for resizing).
Also, if the data are not private, then you send back to app/frontend just URL and it downloads it from cloud storage directly, which is huge saving of bandwidth and processing time for your server. In our bigger apps there are a lot of terabytes downloaded every month, you don't want to handle that directly on each of your REST API server, which is focused on CRUD operation. You want to handle that at one place (our Imageserver, which have caching etc.) or let cloud services handle all of it.
small 2023 update: If possible, but CDN in front of the pictures, it usually will save you a lot of money and make the pictures even more available (i.e. no issues when peaks happen).
Cons : The only "cons" which you should think of is "not assigned images". User select images and continue with filling other fields, but then he says "nah" and turn off the app or tab, but meanwhile you successfully uploaded the image. This means you have uploaded an image which is not assigned anywhere.
There are several ways of handling this. The most easiest one is "I don't care", which is a relevant one, if this is not happening very often or you even have desire to store every image user send you (for any reason) and you don't want any deletion.
Another one is easy too - you have CRON and i.e. every week and you delete all unassigned images older than one week.
There are several decisions to make:
The first about resource path:
Model the image as a resource on its own:
Nested in user (/user/:id/image): the relationship between the user and the image is made implicitly
In the root path (/image):
The client is held responsible for establishing the relationship between the image and the user, or;
If a security context is being provided with the POST request used to create an image, the server can implicitly establish a relationship between the authenticated user and the image.
Embed the image as part of the user
The second decision is about how to represent the image resource:
As Base 64 encoded JSON payload
As a multipart payload
This would be my decision track:
I usually favor design over performance unless there is a strong case for it. It makes the system more maintainable and can be more easily understood by integrators.
So my first thought is to go for a Base64 representation of the image resource because it lets you keep everything JSON. If you chose this option you can model the resource path as you like.
If the relationship between user and image is 1 to 1 I'd favor to model the image as an attribute specially if both data sets are updated at the same time. In any other case you can freely choose to model the image either as an attribute, updating the it via PUT or PATCH, or as a separate resource.
If you choose multipart payload I'd feel compelled to model the image as a resource on is own, so that other resources, in our case, the user resource, is not impacted by the decision of using a binary representation for the image.
Then comes the question: Is there any performance impact about choosing base64 vs multipart?. We could think that exchanging data in multipart format should be more efficient. But this article shows how little do both representations differ in terms of size.
My choice Base64:
Consistent design decision
Negligible performance impact
As browsers understand data URIs (base64 encoded images), there is no need to transform these if the client is a browser
I won't cast a vote on whether to have it as an attribute or standalone resource, it depends on your problem domain (which I don't know) and your personal preference.
Your second solution is probably the most correct. You should use the HTTP spec and mimetypes the way they were intended and upload the file via multipart/form-data. As far as handling the relationships, I'd use this process (keeping in mind I know zero about your assumptions or system design):
POST to /users to create the user entity.
POST the image to /images, making sure to return a Location header to where the image can be retrieved per the HTTP spec.
PATCH to /users/carPhoto and assign it the ID of the photo given in the Location header of step 2.
There's no easy solution. Each way has their pros and cons . But the canonical way is using the first option: multipart/form-data. As W3 recommendation guide says
The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.
We aren't sending forms,really, but the implicit principle still applies. Using base64 as a binary representation, is incorrect because you're using the incorrect tool for accomplish your goal, in other hand, the second option forces your API clients to do more job in order to consume your API service. You should do the hard work in the server side in order to supply an easy-to-consume API. The first option is not easy to debug, but when you do it, it probably never changes.
Using multipart/form-data you're sticked with the REST/http philosophy. You can view an answer to similar question here.
Another option if mixing the alternatives, you can use multipart/form-data but instead of send every value separate, you can send a value named payload with the json payload inside it. (I tried this approach using ASP.NET WebAPI 2 and works fine).

How to do Realtime/fast processing of GPS data in web application?

I am writing a web application for mapping Real-time GPS coordinates on Google maps coming from a GPS device, for fleet managment.
Since the flow of data is very fast from the GPS device to web application for database it becomes very heavy and the database is being queried every 5 seconds(via AJAX from web browser running the website) it becomes more heavy.
Keeping the updates in real-time is becoming very difficult a lagging of 30 seconds to 60 seconds is created between the actually update and its visibility on the website.
I am using Django + Apache + MySQL on CentOS 6.4 64 bit.
Any advice in what direction i should move to make the processing/visibility of data in more real-time would be helpful.
I would suggest you to use NoSql database like MongoDB. It would really help you to achieve real time application performance.
Have a look at Django-With-MonoDB.
And if possible try to replace default python interpreter to PyPy.
I think these two are enough to give you best performance. :)
Understanding Django-using-PyPy
Also for front-end you should use KnockoutJS or AngularJS.
Some tipps:
Avoid xml, especially a DOM based xml parser (this blows up data by a factor of 100). (A lat long coordinate without time, needs 8 bytes, not more)
favor a binary represenattion of the coordinates, and parse them by hand, instead using an slow generated parsing code taht probaly uses reflection to parse.
try to minimize the use of databases, especially relational ones.
raise the intervall that clients are sending: e.g evry 20min instead
evry 5.
if you use a db, minimize the transactions, try to do all processing
in one transaction.

What is the most efficient way to display lots of data on a website?

I have an optimization question.
Lets say that I'm making a website, and it has a JSON file with 5,000 pairs (about 582 kb) and through the combination of 3 sliders and some select tags it is possible to display every value. So the time to appear between every pair is in microseconds.
My question is: If the website is also made to run on mobile browsers, where is it more efficient to have the 5000 pairs of data - in a JSON file or in the data base? and why?
I am building a photo site with similar requirements and I can say after months of investigations and experimenting that there are no easy answer to that question. But I will try to give you some hints:
Try to divide the data in chunks, for example - if your sliders are selecting values between 1 through 100, instead of delivering exactly what the client selected, round up a bit maybe +-10 or maybe more, that way you can continue filtering on the client side without a server roundtrip. Save all data in client memory before querying.
Don't render more than what is visible on the screen, JSON storage and filtering is fast but DOM is very slow, minimize the visible elements.
Use 304 caching - meaning - whenever the client is requesting the same data twice; send a proper 304 response with etag. For example - a good rule of thumb here is to use something you know very easily, like the max ID in the database or so to see if any new data has been updated since the last call. If not, just send 304 and the client will use whatever he had the last time.
Use absolute positioning. Don't even try to use the CSS float or something like that, it will not work. Just calculate each position of each element. This will also help you to achieve tip nr 2 (by filtering out all elements that are outside of the visible screen). You can still use CSS transitions which gives nice animations when they change sliders.
You could experiment with IndexedDB to help with the client side querying but unfortunately the support in different browsers are still not good enough plus you hit the roof on storage, better to use the ordinary cache and with proper headings.
Good Luck!
A database like MongoDB would be good for this. It still uses the JSON syntax for storage so you can use the values from the JSON file. The querying is very fast too and you wouldn't have to parse the JSON file and store it in an object before using it.
Given the size of the data (just 582Kb) I will opt for the Json file.
The drawback is you will have a penalty starting the app and loading the data in memory, but then all queries will run very fast in memory as a good advantage.
You need to think about how much acceses will your app do to the database (how many queries) against load the file just once. And think if your main objective are mobile browsers or pcs.
For this volume of data I wouldn't try a database (another process consuming resources), just try how much resources (time, memory) are needed to load the JSON file.
If the data is going to grow... then you will need to rethink this, or maybe split your json file following some criteria.