Chrome throwing multiple requests (HTTP pipelining) even for AJAX POST requests - google-chrome

The problem started when we noticed cases of duplicate orders on our website. When we started investigating, we couldn't narrow down anything on our page that could cause duplicate orders and yet explain the state of the duplicate data. The eeriest part was that those orders were created at the SAME instant (down to the millisecond). In the server access logs, too, the requests were received at the same instant.
To investigate further we called random customers. Most of them gave close to the same answers: they use a slow connection (some of them through a modem) and they use Chrome. Most feedback was along the lines of "the page was stuck, so I pressed the back button". After some searching we learnt of the HTTP pipelining feature in Chrome, which is an aggressive technique to fetch the page when the connection is slow.
So here is the flow: the user presses the Submit button --> a verify AJAX JSON call (GET) --> form data is POSTed through an AJAX JSON call --> some feedback is returned to the customer, who takes action and is then redirected appropriately.
I am not sure if this is the best use of AJAX or even of GET/POST calls, but this is what I am stuck with.
This problem occurs only under very specific conditions (a slow connection, and Chrome must fire duplicate requests) and, truth be told, I have not been able to replicate it. However, since nearly 95% of the feedback points towards Chrome, I am forced to think of HTTP pipelining. It is the only explanation I can find for requests being fired such that multiple records are created at the same instant.
I also learnt that http pipelining is done only for GET requests not POST requests. So I am not sure whether:
this covers AJAX POST requests (I use jQuery and I do use type:POST)
Chrome may somehow be (erroneously) firing multiple requests for all request types (refer: What to do with chrome sending extra requests?)
The only argument I could find in Chrome's favour is that HTTP pipelining is disabled by default.
I am not even sure what checks to put in place, since both requests are being served at the same instant. I could add a check at the backend to see whether a similar record has already been created, but that would be an expensive check, slowing down ordering, and the business may not welcome it.
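One cheaper alternative to searching for a similar record might be a one-time submit token: the page generates a random token when the form is rendered, sends it with the POST, and the orders table carries a unique constraint on it, so the database itself rejects the second of two simultaneous requests. The sketch below is only an illustration under assumptions: the `orders` table, the `submit_token` column and the `place_order` helper are hypothetical, and SQLite stands in for whatever database the site actually uses.

```python
import sqlite3
import uuid


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table; the UNIQUE constraint performs the duplicate check.
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "submit_token TEXT NOT NULL UNIQUE, "
        "item TEXT NOT NULL)"
    )


def place_order(conn: sqlite3.Connection, submit_token: str, item: str) -> bool:
    """Return True if the order was created, False if the token was already used."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (submit_token, item) VALUES (?, ?)",
                (submit_token, item),
            )
        return True
    except sqlite3.IntegrityError:
        # A row with this token already exists: treat it as a duplicate submit.
        return False


conn = sqlite3.connect(":memory:")
create_schema(conn)
token = str(uuid.uuid4())  # generated once, when the form is rendered
first = place_order(conn, token, "widget")
second = place_order(conn, token, "widget")  # the duplicated request
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The point of the design is that the check is a single indexed insert rather than a separate lookup, so it does not add a second round trip to the ordering path.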
I found something at http://www.chromium.org/developers/design-documents/network-stack/http-pipelining but I am not sure whether I must force/hack my requests to meet one of the criteria that stop HTTP pipelining.
Any pointers on how to test this would be appreciated.

Related

Why is there a difference between get and put requests? [duplicate]

Back when I first started developing client/server apps which needed to make use of HTTP to send data to the server, I was pretty naive when it came to HTTP methods. I literally used GET requests for EVERYTHING.
I later learned that I should use POST for sending data and GET for requesting data; however, I was slightly confused as to why this is best practice. From a functionality perspective, I was able to use either GET or POST to achieve the exact same thing.
Why is it important to use specific HTTP methods rather than using the same method for everything?
I understand that POST is more secure than GET (GET makes the data visible in the HTTP URL); however, couldn't we just use POST for everything then?
I'm going to take a stab at giving a short answer to this.
GET is used for reading information. It's the 'default' method, and everything uses this to jump from one link to the next. This includes browsers, but also crawlers.
GET is 'safe'. This means that a GET request should never change anything on the server. If a GET request could cause something to be deleted on the server, this can be very problematic, because a spider/crawler/search engine might assume that following links is safe and automatically delete things.
This is why we have a couple of different methods. GET is meant to allow you to 'get' things from the server. Likewise, PUT allows you to set something new on a server and DELETE allows you to remove something.
POST's biggest original purpose is submitting forms. You're posting a form to the server and asking the server to do something with that form.
Any client (a human/browser or machine/crawler) knows that POST is 'unsafe'. It won't do POST requests automatically on your behalf unless it really knows it's what you (the user) want. It's also used for things that are kind of similar to submitting forms.
So when you design your website, make sure you use GET only for getting things from the server, and use POST if your ajax request will cause 'something' to change on the server.
Fun fact: there are a lot of official HTTP methods. At least 30. You'll probably only use a very few of them though.
So to answer the question in the title more precisely:
Why are there multiple HTTP Methods available?
Different HTTP methods have different rules and restrictions. If everyone agrees on those rules, we can start making assumptions about what the intent is. Because these guarantees exist, HTTP servers, clients and proxies can make smart decisions without understanding your specific application.
Suppose you have a task app in which you can store and delete data, and the route of your web page is /xx. To get the web page, to store data using the add button, and to delete data using the delete button, you send requests to /xx. But how will the web server know whether you are asking for the web page, adding data, or deleting data, when /xx is the same for all of these requests? That is why we have different request methods: the browser always sends the method name (GET, POST, PUT, DELETE) at the start of the request, so the server can understand what you need.
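To make the /xx example concrete, here is a minimal sketch of a server-side dispatcher that picks a handler from the (method, path) pair. It does not use a real web framework; `handle`, `ROUTES` and the three handlers are hypothetical names invented for the illustration.

```python
from typing import Callable, Dict, List, Tuple

tasks: List[str] = []  # in-memory stand-in for the app's stored data


def show_tasks(body: str) -> Tuple[int, str]:
    return 200, ", ".join(tasks)


def add_task(body: str) -> Tuple[int, str]:
    tasks.append(body)
    return 201, "added " + body


def delete_task(body: str) -> Tuple[int, str]:
    if body in tasks:
        tasks.remove(body)
        return 200, "deleted " + body
    return 404, "no such task"


# Same path, different method, different handler.
ROUTES: Dict[Tuple[str, str], Callable[[str], Tuple[int, str]]] = {
    ("GET", "/xx"): show_tasks,
    ("POST", "/xx"): add_task,
    ("DELETE", "/xx"): delete_task,
}


def handle(method: str, path: str, body: str = "") -> Tuple[int, str]:
    handler = ROUTES.get((method, path))
    if handler is None:
        # Simplification: a real server would distinguish 404 from 405.
        return 405, "method not allowed"
    return handler(body)
```

For instance, `handle("POST", "/xx", "milk")` adds a task while `handle("GET", "/xx")` only reads the list, even though both target the same path.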

Failures in eventual consistent system and user experience [duplicate]

When using distributed and scalable architecture, eventual consistency is often a requirement.
Graphically, how to deal with this eventual consistency?
Users are used to clicking save and seeing the result instantaneously... with eventual consistency it's not possible.
How to deal with the GUI for such scenarios?
Please note the question applies both for desktop applications and web applications.
PS: I'm working with the Microsoft platform, but I imagine the question applies to any technology...
A Task Based UI fits this model great. You create and execute tasks from the UI. You can also have something like a task status monitor to show the user when a task has executed.
Another option is to use some kind of polling from the client. You send the command, and poll from the client until the command has completed and the new data is available. You will have a delay in some cases between when the user presses save and when he sees the new record, but in most cases it should be almost synchronous.
Another (good?) option is to assume/design commands that don't fail. This is not trivial but you can have a cache on the client and add the data from the command to that cache and display it to the user even before the command has been executed. If the command fails for some unexpected situation, well then just design a good "we are sorry" message for misleading the user for a few seconds.
You can also combine the methods above.
Usually eventual consistency is more of a business/domain problem, and you should have your domain experts handle it.
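The polling option mentioned above can be sketched roughly as follows. This is an illustration only: `poll_until_done` and the `"pending"` status value are assumptions, and `get_status` stands for whatever call the client uses to ask the server about the command.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    get_status: Callable[[], str],
    interval: float = 0.5,
    timeout: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll get_status() until it reports something other than 'pending'.

    Returns the final status, or None if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status != "pending":
            return status
        if clock() >= deadline:
            return None  # give up; the UI can show a "still processing" message
        sleep(interval)
```

Returning `None` on timeout lets the UI fall back to one of the other approaches, such as a task status monitor, instead of blocking indefinitely.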
I think that other answers mix together CQRS in general and eventual consistency in particular. Task-based UI is very suitable for CQRS but it does not resolve the issue with eventually consistent read model.
First, I would like to challenge your statement:
Users are used to clicking save and seeing the result instantaneously... with eventual consistency it's not possible.
What do you mean by this? Why is it not possible to see the result immediately? I think the issue here is your definition of result.
The result of any action is that the action has been performed. There are numerous ways to show this! It depends on what kind of action you want to complete. Examples:
Send an email: if user has entered a correct email address, it is almost guaranteed that the action will complete successfully. To prevent unexpected failures one might use durable queues since this kind of actions do not need to be done synchronously. So you just say "email sent". Typically you see this kind of response when you ask to reset your password.
Update some information in a user profile: after you have validated the new data on the client, most probably the command will succeed too, since the only thing that could go wrong is a database error (if you use a database). Again, even this can be mitigated by using durable queues. In this case you just show the updated field in the same form. The good practice for an SPA is to have a comprehensive data store on the client side, like Redux does. In this case you can safely update the server by sending a command and also update the client-side store, which will result in the UI showing the latest data. Disclaimer: some answers refer to this technique as "tricking the user", but I disagree with this definition.
If you have commands that are prone to error, you can use techniques that are already described in other answers, like WebSockets or Server-Sent Events, to communicate errors back. This requires quite a lot of additional work. You can also send a command and wait for a reply, or execute commands synchronously. Some would say "this is not CQRS" but this would be just another dogma to be challenged. Ensuring the command has completed its execution, in combination with the previous point (client-side data store), will be a good solution.
I am not sure if there is any 100% bulletproof technique that allows you to always show non-stale data from the read model. I think it goes against the principles of CQRS. Even with real-time events you will only get events that indicate that your write model has been updated. Still, your projections could have failed, and reacting to this is a whole other story.
However, I would not concentrate that much on this issue. The fact is that well-tested projections and almost-guaranteed commands will work very well. For error handling in 90% of situations it is enough to have some manual or half-manual process to recover from those errors. For the last 10% you can combine generic "error" messages pushed from the server saying "sorry, your action XXX has failed to execute" and the top priority actions could have some creative process behind them but in reality those situations would be very very rare.
There are 2 ways:
Trick the user (show that things have happened when they really haven't happened yet)
Show that the system is processing the request and use polling in the background (not good) or just a timer with the value of your SLA.
I prefer the 1st option.
As someone has already mentioned, task based UI's fit well for this, and what I would do is employ a technique that 'buys you time' for the command to propagate.
For example, imagine we are on a list screen, where the user can perform various actions, one of which being to add a new item to the list. After choosing to add an item you could display a "What would you like to do next?" which could have 'Add another item', 'Do this task', 'Do some other task', 'Go back to list'.
By the time they have clicked on an option, the data would have hopefully been refreshed.
Also, if you're using a task-based UI, you can analyse the patterns of task execution and use these "what would you like to do next" screens to streamline the UI, similar to Amazon's "other people also bought these items".
As previously stated, it is fine to tell the user that the request (command) has been acknowledged (successfully issued). In case of some failure, the system should communicate this to the requester, by means of:
email;
SMS;
custom inbox (e.g. like the SO inbox);
whatever.
E.g., mail client / service:
I am sending a mail to a wrong address;
the mail service says: "email sent successfully :)";
after a few minutes, I receive a mail from the service: "email could not be delivered".
I believe a great way to inform the user about a recent failure is to present him an error panel while he's navigating through the application. A user gesture might be required in order to dismiss that alert etc.
I wouldn't go with tricking the user or blocking him from performing other actions. I would rather stream data toward the UI after it has been acknowledged by the read side. Let's consider these two cases:
The user saves data and expects a result. A connection is established to the server. After the data has been acknowledged by the read side, it is streamed to the UI and the UI is updated.
The user saves data and refreshes the web page. Upon reload, data is fetched from the data store and a connection for streaming is established. If the read side didn't update the data store in the meantime, there is still an open stream, and the UI should be updated after the data reaches the read side.
Why streaming from read side and not directly from write side? Simply, that would be a confirmation that read side has been reached.
From technical aspect, Server-Sent Events could be used.
Disadvantage:
Results will still not be reflected immediately by a read side. But at least, in most cases, user will be able to continue with his work without being blocked by a UI.
There are several ways to handle eventual consistency. All of them are really to occupy the time from the User's action until the backend refresh.
User Reads: A given user can only read from the same database node that they write to. Other users read from the replicated nodes. PROS: UI is quick enough, and application stays in sync. CONS: Your service architecture has to track and route Users to specific database nodes.
Disable the UI until the action has completed, and refresh it. Java Server Faces has a classic example of this. One could create a modal with a loading spinner to cover the UI until the refresh is completed. PROS: UI stays in sync with application state. CONS: Almost every action creates a blocked UI. Users get very frustrated by the restricted UI, and will complain of application slowness.
Confirmation: Immediately thank the user for their submission. Then let them know later (email, SMS, in-app notification) whether or not the action was completed. PROS: It's fast up front. CONS: UI lags behind the system until refresh. Even with a notice, the User may get confused that they don't see the updates. It also requires integration of various communication channels. Users won't see their changes right away. If the action fails, they may not know until it's too late.
Fake it: Optimistically assume that the action will complete. Show the User the resulting UI (upvote, comment, credit card confirmation, etc) and allow them to continue as if it succeeded. If there were failures, immediately show them as contextual errors: alerts next to the undone upvotes, in-app alert on the post with the failed comment, email for the declined credit card. PROS: UI feels much faster. CONS: UI is temporarily out of sync with application state, and you must resolve that. One case: you might fake creation of content with temp IDs. But after the content is created, the temp IDs will be wrong until the refresh. Second case: you might need to store all state changes on the UI after the action until the refresh. Then you need some Resolver to apply all the local state changes since the action was issued. This resolution is non-trivial.
Web Sockets: Subscribe the UI to an event stream so that when the action is completed on the backend, it is pushed to the front end. Is it one-way or two-way streaming? PROS: UI feels fast, and it's in sync with the application state. CONS: Requires consistent browser support, a backend source of streaming events, and socket server scalability.
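The temp-ID bookkeeping behind the optimistic option in the list above can be sketched as a small client-side store. This is only an illustration under assumptions: `OptimisticStore` and its methods are hypothetical names, and a real resolver would also have to replay any local edits made while the item was pending.

```python
import itertools
from typing import Dict


class OptimisticStore:
    """Client-side store that shows items before the server confirms them."""

    def __init__(self) -> None:
        self.items: Dict[str, dict] = {}
        self._temp_ids = itertools.count(1)

    def add_optimistically(self, text: str) -> str:
        # Show the item right away under a temporary ID.
        temp_id = f"temp-{next(self._temp_ids)}"
        self.items[temp_id] = {"text": text, "pending": True, "error": None}
        return temp_id

    def confirm(self, temp_id: str, real_id: str) -> None:
        # The server accepted the action: swap the temp ID for the real one.
        item = self.items.pop(temp_id)
        item["pending"] = False
        self.items[real_id] = item

    def fail(self, temp_id: str, reason: str) -> None:
        # Keep the item visible but flagged, so the UI can show a contextual error.
        self.items[temp_id]["pending"] = False
        self.items[temp_id]["error"] = reason
```

A UI would call `add_optimistically` when the user acts, then `confirm` or `fail` once the backend result arrives by whichever channel is in use.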

reducing response size

I am working on a web application and I am using a polling approach to check whether any update is needed. These polling requests occur every 1 or 2 seconds. The size of the response is 240 bytes if no update is needed (an empty response is returned in that case) and around 10 KB, the size of the content itself, otherwise. My problem is: since it returns at least 240 B roughly every second, is there a way to optimize this response by pushing the boundaries a bit more?
When I checked the contents of the response, I saw that only 50 bytes are essential for me (session id and status code). However, there is some information in the headers such as connection type, timeout and content-type. These settings will be the same for each request of this type (i.e. it always requires the content type "text/html; charset=utf-8"). So, can I just assume these settings on the client side and prevent the server from sending this header info?
I am using Django on the server side and jQuery for sending AJAX requests, by the way. Also, any type of push technology is out of the question for now.
It does add up, but not as much as you think. If you polled every sec for a full hour, you'd have only used 864K, less than a typical webpage would require with an unprimed cache. Even if you did it for a full day, you're talking about ~20M. Maybe if you're someone like Twitter, you might need to be concerned about this, but I doubt you'll be getting anywhere near the traffic it would take for this to actually be problematic.
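The arithmetic behind those figures, assuming one poll per second and the 240-byte empty response from the question:

```python
EMPTY_RESPONSE_BYTES = 240  # size of a "no update" response, from the question
POLLS_PER_SECOND = 1        # assumed polling rate

bytes_per_hour = EMPTY_RESPONSE_BYTES * POLLS_PER_SECOND * 60 * 60
bytes_per_day = bytes_per_hour * 24

print(bytes_per_hour)  # 864000, i.e. 864 KB
print(bytes_per_day)   # 20736000, i.e. about 20.7 MB
```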
Nevertheless, you can of course customize the headers of the response, but what impact, if any, this will have on the client will be a matter of testing. Some headers can probably be dropped, but others may surprise you, and it technically could vary from browser to browser as well.
One solution to this kind of problem is "long polling". The polling client will send a request, and the webserver checks to see if there is an update. If there is not, the webserver sleeps for a second or two and then checks again in a loop, without sending a response. As soon as this loop sees an update, it sends a response. To the client web browser, it will look like the server is congested and taking a long time to respond, but actually the relevant data is being transmitted promptly and the "no data" responses are simply being skipped.
I'd recommend adding a timeout to the loop -- say 30 or 60 seconds -- after which the webserver would reply with "no data" as usual. Even just a 30 second cycle would cut your empty response load by a factor of 15-30.
Caveat: I've read about this kind of implementation but I haven't tried it myself. You will need to test compatibility with various web browsers to ensure that this fairly nonstandard method doesn't cause issues on the client side.
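The server-side loop described in this answer could look roughly like the sketch below. It is framework-independent and the names are hypothetical: `long_poll` is not a Django API, and `check_for_update` stands for whatever lookup the view would perform. Note that in a synchronous server, holding a request open like this ties up a worker for the duration of the wait.

```python
import time
from typing import Callable, Optional


def long_poll(
    check_for_update: Callable[[], Optional[str]],
    timeout: float = 30.0,
    check_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Hold the request open until an update appears or the timeout expires.

    Returns the update, or None to signal the usual "no data" response.
    """
    deadline = clock() + timeout
    while True:
        update = check_for_update()
        if update is not None:
            return update  # respond as soon as there is something to send
        if clock() >= deadline:
            return None  # timed out: reply with "no data"
        sleep(check_interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised with a fake clock instead of waiting in real time.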

Server-Sent Events vs Polling

Is there a big difference (in terms of performance, browser implementation availability, server load etc) between HTML5 SSEs and straight up Ajax polling? From the server side, it seems like an EventSource is just hitting the specified page every ~3 seconds or so (though I understand the timing is flexible).
Granted, it's simpler to set up on the client side than setting up a timer and having it $.get every so often, but is there anything else? Does it send fewer headers, or do some other magic I'm missing?
Ajax polling adds a lot of HTTP overhead since it is constantly establishing and tearing down HTTP connections. As HTML5 Rocks puts it "Server-Sent Events on the other hand, have been designed from the ground up to be efficient."
Server-sent events open a single long-lived HTTP connection. The server then unidirectionally sends data when it has it; there is no need for the client to request it or do anything but wait for messages.
One downside to Server-sent events is that since they create a persistent connection to the server you could potentially have many open connections to your server. Some servers handle massive numbers of concurrent connections better than others. That said, you would have similar problems with polling plus the overhead of constantly reestablishing those connections.
Server-sent events are quite well supported in most browsers, the notable exception of course being IE. But there are a couple of polyfills (and a jQuery plugin) that will fix that.
If you are doing something that only needs one-way communication, I would definitely go with Server-sent events. As you mentioned Server-sent events tend to be simpler and cleaner to implement on the client-side. You just need to set up listeners for messages and events and the browser takes care of low-level stuff like reconnecting if disconnected, etc. On the server-side it is also fairly easy to implement since it just uses simple text. If you send JSON encoded objects you can easily turn them into JavaScript objects on the client via JSON.parse().
If you are using PHP on the server you can use json_encode() to turn strings, numbers, arrays and objects into properly encoded JSON. Other back-end languages may also provide similar functions.
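As an illustration of how simple the text format is, here is a minimal sketch of serializing one message for a `text/event-stream` response, using Python's `json.dumps` in the role that `json_encode()` plays in PHP. The helper name `format_sse` is hypothetical.

```python
import json
from typing import Any, Optional


def format_sse(data: Any, event: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits no raw newlines by default, so a single data line suffices.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"
```

On the client, the string after `data: ` is what arrives in the message event, ready for `JSON.parse()`.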
I would only add a higher perspective to what's been said, and that is that SSE is publish-subscribe model as opposed to constant polling in case of AJAX.
Generally, both ways (polling and publish-subscribe) are trying to solve the problem how to maintain an up-to-date state on the client.
1) Polling model
It is simple. The client (browser) first gets an initial state (page) and, for it to update, it needs to periodically request the state (page or its part) and process the result into the current state (refresh the whole page, or render it intelligently into its part in the case of AJAX).
Naturally, one drawback is that if nothing happens with the server state, the resources (CPU, network, ...) are used unnecessarily. Another one is that even if the state changes, the client gets it only at the next poll period, not ASAP. One often needs to find a good poll period as a compromise between the two things.
Another example of polling is a spinwait in threading.
2) Publish-subscribe model
It works as follows:
(client first requests and shows some initial state)
client subscribes to the server (sends one request, possibly with some context like event source)
server stores a reference to the client in its repository of client references
in case of an update of the state, server sends a notification to the client based on the reference to the client it holds; i.e. it is not a response to a request but a message initiated by the server
good clients unsubscribe when they are no longer interested in the notifications
This is SSE, or within threading a waitable event, as another example.
A natural drawback, as stated, is that the server must know about all its subscribed clients which, depending on an implementation, can be an issue.
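The publish-subscribe steps listed above can be sketched in-process as follows. This is only an illustration of the model, not of SSE itself: `Publisher` is a hypothetical class, and callbacks stand in for the client references a real server would hold.

```python
from typing import Callable, List


class Publisher:
    """Holds references to subscribed clients and notifies them on updates."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        # The server stores a reference to the client.
        self._subscribers.append(callback)

    def unsubscribe(self, callback: Callable[[str], None]) -> None:
        # Good clients remove themselves when no longer interested.
        self._subscribers.remove(callback)

    def publish(self, state: str) -> None:
        # Server-initiated message: nobody asked for it with a request.
        for callback in list(self._subscribers):
            callback(state)
```

The `_subscribers` list is exactly the "server must know about all its subscribed clients" cost mentioned above.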

Using messaging to do writes as well as reads

I come from a web background where I only have to deal with HTTP so please excuse my ignorance.
I have an app where clients listen for changes on a message queue which uses STOMP. Previously the client only needed to listen to the relevant channels for messages telling them about changes on the server and update themselves accordingly. Simple stuff.
There is now a requirement for the client to be able to edit data and push those changes back to the server. The data on the server is already exposed via restful resources so my first thought was just to make REST put requests to change the data on the server, but then I started to wonder whether I could find a solution using messaging. I could just open up another channel which the clients could publish changes to and the server could subscribe to that channel and update itself accordingly. Implementing this would obviously be simple but I would love to have some of the potential pitfalls pointed out to me ahead of time.
I am familiar with REST so I want to ask some questions in the context of REST:
Would I map a group of queues to REST/CRUD verbs for each resource i.e. itemPostQueue, itemPutQueue, itemDeleteQueue?
What about GETs: how can I request data to read using a queue?
What do I use to replace my status code mechanism to catch problems? Do I just fire and forget (gulp), or use error/receipt headers in STOMP somehow?
Any answers and advice will be much appreciated.
Regards,
Chris
While I am not clear on why you must use messaging here, a few thoughts:
You could map to REST on the wire like itemPostQueue, but this would likely feel unnatural to a message-oriented person. If you are using some kind of queue with a guaranteed semantic and deliver-once built in, then go ahead and use that mechanism. For a shopping-cart example, then you could put an AddItem message on the wire, and you trust the infrastructure to deliver it once to the server.
There is no direct GET like concept here in message queuing. You can simulate it with a pair of messages, I send you a request and you send me back a response. This is much like RPC, but even further decoupled. So I send you a PublishCart request and later on, the server sends a CartContents message on a channel that the client is listening to.
Status codes are more complex, and generally fall into two camps. First are the actual queue-library messages - deal with them just as you would any normal system message. Second you may have your own messages you want to put on the wire that signal failure at some place in the chain.
One thing that messaging does do is significantly decouple your app. Unlike HTTP, where you know that something happened, with a queue, you send a letter to somebody. It may get there. The postman might drop it in the snow. The dog might eat it. If you don't get a response in some period of time, you try other means to contact your relatives, or to pull back the analogy, to contact the server. Monitoring of the health of the queue infrastructure and depth of queues and the like take on added importance, as they are the plumbing that you are now depending upon.
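The request/response pairing described above (a PublishCart request answered later by a CartContents message) can be sketched with in-memory queues standing in for the broker. Everything here is an assumption made for illustration: the message fields, the `correlation_id` convention and the function names are hypothetical, and no actual STOMP client is involved.

```python
import queue
import uuid
from typing import Dict, List

request_queue: "queue.Queue[dict]" = queue.Queue()  # client -> server channel
reply_queue: "queue.Queue[dict]" = queue.Queue()    # server -> client channel

CARTS: Dict[str, List[str]] = {"cart-1": ["apple", "pear"]}  # hypothetical server data


def send_publish_cart(cart_id: str) -> str:
    """Client side: enqueue a PublishCart request and return its correlation ID."""
    correlation_id = str(uuid.uuid4())
    request_queue.put({"type": "PublishCart", "cart_id": cart_id,
                       "correlation_id": correlation_id})
    return correlation_id


def serve_one_request() -> None:
    """Server side: answer one request with a CartContents message."""
    request = request_queue.get()
    reply_queue.put({"type": "CartContents",
                     "items": CARTS.get(request["cart_id"], []),
                     "correlation_id": request["correlation_id"]})


def wait_for_reply(correlation_id: str, timeout: float = 1.0) -> dict:
    """Client side: wait for the matching reply; raises queue.Empty on timeout.

    Simplification: replies for other correlation IDs are discarded.
    """
    while True:
        reply = reply_queue.get(timeout=timeout)
        if reply["correlation_id"] == correlation_id:
            return reply
```

The timeout is the code-level counterpart of the lost-letter analogy: when `queue.Empty` is raised, the client has to decide whether to retry or report a failure.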
Good Luck