Elasticsearch store user input as JSON document - json

I have a following architectural question - my application back-end is written at Java and client side at AngularJS. Right now I need to store the user input on the page in order to be able to share and bookmark my application urls and restore the state by this url.
I'm going to implement the following approach - every time the user interacts with my application via selecting the data and conditions at the page I'll collect all of his input at a complex JSON document and store this document at Elasticsearch. Key of this document from ES I'll send back to client application(AngularJS) and based on this key I'll update the page url. For example the original url looks like:
http://example.com/some-page
based on a key from server I'll update this url to following:
http://example.com/some-page/analysis/234532453455
where 234532453455 is a key of a document in ES.
Every time the users will try to access the following url - http://example.com/some-page/analysis/234532453455 AngularJS application will try to get a saved state by key (234532453455) via Java backend REST endpoint.
Will it work ?
Also, I'm in doubt right now how to prevent the duplication of the documents in ES. Right now I have no experience with ES so don't know what approach from ES can be used out of the box for this purpose.
For example is it a good idea to calculate some hash code of each JSON document and store this hash code as a key of a document.. so before storing the new document I can check the old document by hash code. Also performance is very important to me so please take this into account also.

For me it sounds you try to implement cache.
Yes you can do this but if you will use ES only for this solution then I think you should better look to redis or memcached.
I can't say that ES is bad solution for this but ES has some tricks which you must remember for instance its near realtime search. After you index data they are not immediately available to search it takes few seconds depends on configuration(But you can also call _refresh but I am not sure about performance if you index data very often).
Hash: I dont see reasons to use has I'd better create right id. So if you have types of reports per user than id could be "reporttype_{userid}", because if you will use hash as an ID then each new object will have new id and instead of rewriting you will end up with having many copies of old data for that user. if you go with pattern reporttype_{userid} then each time when user regenerate report with new data you will just override it.
As an option you could add to that option fields userid and expireat for future cleanup, for instance you could have job which will cleanup expired reports, but this is valid only if you go with ES, since in redis and memcached there is option to set expiration when you save data

Related

What is the use of all GET, PUT, DELETE when anything can be done by POST in the most secured way of communication in REST API calls

I read a lot about each and every function of those mentioned in the title,
But I always have doubt that what is the primary use of all individual functions. Can someone explain to me in detail? Thank you.
What is the use of all GET, PUT, DELETE when anything can be done by POST
This is a pretty important question.
Note that, historically, SOAP essentially did everything by POST; effectively reducing HTTP from an application protocol to a transport protocol.
The big advantage of GET/PUT/DELETE is that the additional semantics that they promise (meaning, the semantics that are part of the uniform interface agreed to by all resources) allow us to build general purpose components that can do interesting things with the meta data alone, without needing to understand anything specific about the body of the message.
The most important of these is GET, which promises that the action of the request is safe. What this means is that, for any resource in the world, we can just try a GET request to "see what happens".
In other words, because GET is safe, we can have web crawlers, and automated document indexing, and eventually Google.
(Another example - today, I can send you a bare URI, like https://www.google.com and it "just works" because GET is understood uniformly, and does not require that we share any further details about a payload or metadata.)
Similarly, PUT and DELETE have additional semantic constraints that allow general purpose components to do interesting things like automatically retry lost requests when the network is unreliable.
POST, however, is effectively unconstrained, and this greatly restricts the actions of general purpose components.
That doesn't mean that POST is the wrong choice; if the semantics of a request aren't worth standardizing, then POST is fine.
In API perspective,
GET - It is to retrieve record/data from a source. API would need no data from client/UI to retrieve all records , or would need query param / path param to filter records based on what is required - either record with a particular ID or other properties.
POST - it is to store a new record at a source . API would get that record from client/UI through request body and store it.
PUT - it is to update an existing record at a source . API would receive updated record along with Id and update it with existing record whose id match with one passed from UI.
DELETE - it is to delete a record present in source. UI would send nothing to delete whole all records at source or send id to remove a particular record.
Source refers to any database.

GET request in Django

I have a doubt. When is a GET request sent. I mean, I have seen a lot of people using if request.method == 'GET', when they render the form for the first time, but when the form is submitted, they do a `POST' request.
While they explicitly mention when defining the form in html that the method will be 'POST', they don't do the same for 'GET' request which is made when an empty form is requested.
How does django know it's a GET request?
And, why is it done so?
Thanks,
I'm not an expert but I think Django "knows" this because like all things on the Internet it uses the HTTP protocol. There are several HTTP methods. If not specified the default method will always be GET
GET
A GET is usually used to retrieve information. Typically a GET function has no side-effects (this means that data in the database is not changed, that no files in the filesystem are modified, etc.).
Strictly speaking this is not always true, since some webservers log requests (themselves) and thus add an entry to the database that a specific user visited a specific page at a specific timestamp, etc.
A typical GET request is idempotent. This means that there is no difference between doing a query one time, or multiple times (two times, three times, five times, thousand times).
GET queries are therefore typically used to provide static content, as well as pages that contain data about one or more entries, search queries, etc.
POST
POST on the other hand typically ships with data (in the POST parameters), and usually the idea is that something is done with this data that creates a change in the persistent structures of the webserver. For example creating a new entry in some table, or updating the table with the values that are provided. Since these operations are not always idempotent, it can be dangerous if the user refreshes the page in the browser (since that could for example create two orders, instead of the single order the user actually wanted to create).
Therefore in Django a POST request will typically result in some changes to the database, and a redirect as result. This means that the user will typically obtain a new address, and perform a GET request on that page (and that GET is idempotent, hence it will not construct a new order).
PUT, PATCH and DELETE
Besides the popular GET and POST, there are other typical requests a client can make to a webserver. For example PUT, PATCH and DELETE.
PUT
PUT is the twin of a POST request. The main difference is that the URI it hits, specifies what entry to construct or update. PUT is usually an idempotent operation.
This means that if we would for example perform a POST server.com/blog/create to create a blog, PUT will typically look like PUT server.com/blog/123/. So we specify the id in advance. In case the object does not yet exists, a webserver will typically construct one. In case the entity already exists, a new entity will typically be constructed for that URI. So performing the same PUT operation twice, should have no effect.
Note that in case of a PUT request, typically one should specify all fields. The fields that are not specified will typically be filled in with default values (in case such values exist). We thus do not really "update" the entity: we destroy the old entity and create a new one in case that entity already exists.
PATCH
PATCH is a variant of PUT that updates the entity, instead of creating a new one. The fields that are thus missing in a PATCH request, typically remain the same as the values in the "old" entity so to speak.
DELETE
Like the name already suggests, if we perform a DELETE server.com/blog/123/ request, then we will typically remove the corresponding element.
Some servers do not immediately remove the corresponding element. You can see it as scheduling the object for deletion, so sometimes the object is later removed. The DELETE request, thus typically means that you signal the server to eventually remove the entity.
Actually Django is based on HTTP responses-requests. HTTP is fully textuall. So Django parses each request and finds in its header information about what kind of request is it. I may be mistaken in details, but as I understand when server receives request - Django creates it's object request, which contains all the data from HTTP. And then you decide if you need a specific action on GET or POST and you check the type of request with request.method.
P.S. And yes, by default each request is GET.

is it possible to save the web page that is returned by a servlet on the server to access again?

Say I have a servlet that processes form data and returns a web page. Is it possible to store the web page that is returned on the server so that it can be accessed again without filling out the form?
I think its possible. But mostly it will be possible if the form submitted generated a unique page with unique URL.
Like let say when u submit a form and if it created a URL like
'http://example.com/result1.html'
then we can use the Expiration filter to set the Expiration HTTP header and make it cacheable.
But one change is that the caching has to be done by a separate caching server and i hope tomcat cannot do it by itself.
One possible way is -
You save the form object along with a unique url generated as key pair value. so key will be your unique url and value will be the form object which you have to display on the page and you got what you want.
Few more hint -
To generate unique url suppose your result page is 'http://example.com/result.html' you can add a auto increment number like 'http://example.com/result1.html' managed programatically. or you can also thing about UUID class available in java.

Sending data from one html file to another

I am creating a dashboard application in which i show information about the servers. I have a Servlet called "poller.java" that will collect information from the servers and send it back to a client.jsp file. In the client.jsp , i make AJAX calls every 2 minutes to call the poller.java servlet in order to get information about the servers.
The client.jsp file shows information in the form of a table like
server1 info
server 2 info
Now, i want to add one more functionality. when the user clicks on the server1, I should show a separate page (call it server1.jsp) containing the time stamps in which the AJAX call was made by calling.jsp and the server information that was retrieved. This information is available in my calling.jsp page. But, how do i show it in the next page.
Initially, i thought of writing to a file and then retrieving it in my server1.jsp file. But, I dont think it is a good approach. I am sure i am missing a much simpler way to do this. Can someone help me ?
You should name your servlet Poller.java not poller.java. Classes should always start with an uppercase. You can implement your servlet to forward to a different page for example if sombody clicks to server1 then the servlet will forward to server1.jsp. Have a look at RequestDispatcher for this. Passing information between request's should be done by request attributes. if you need to retain the information over several request you could think about using session.
In the .NET world, we use SessionState to maintain data that must persist between requests. Surely there's something similar for JSP? (The session object, perhaps.)
If you can't use session state in a servelet, you're going to have to fall back on a physical backing store. I'd use a database, or a known standard file format (like XML). Avoid home-brew file formats that require you to write your own parser.

Retrieve URL JSON data in MS Access

There is a web service that allows me to go to a URL, with my API-key, and request a page of data. The data is returned as JSON. The JSON is well-formed, I ran it through JSONLint and confirmed its OK.
What I would like to do is retrieve the JSON data from within MS Access (2003 or 2007), if possible, and build a table from that data (first time thru), then append/update the table on subsequent calls to that URL. I would settle for "pre-step" where I retrieve this information via another means. Since I have an API key in the URL, I do not want to do this server-side. I would like to keep it all within Access, run it on my PC at home (its for personal use anyway).
If I have to use another step before the database load then Javascript? But I dont know that very well. I dont even really know what JSON is other than what I have read in Wikipedia. The URL looks similar to:
http://www.SomeWebService.com/MyAPIKey?p=1&s=50
where: p = page number
s = records per page
Access DB is a JavaScript Lib for MS Access, quick page search says they play nicely with JSON, and you can input/output with. boo-ya.
http://www.accessdb.org/
EDIT:
dead url; wayback machine ftw:
http://web.archive.org/web/20131007143335/http://www.accessdb.org/
also sourceforge
http://sourceforge.net/projects/accessdb/