Netty HttpServer Chrome Browser Multiple Requests

We use Netty, version 4.1.13. We create an HttpServer, HttpServerInitializer, and HttpServerHandler and start the server on a port. When we make a request from the Chrome browser, HttpServerInitializer is called 3 or 4 times (sometimes 3, sometimes 4), and it is called again after 10 seconds. When we make a request through Microsoft Edge or from the console, it is called once, as expected, and HttpServerHandler handles the rest.
What should we do to prevent HttpServerInitializer from handling these unnecessary extra connections? We have session operations attached to the pipeline in the initializer, so this is a critical issue for us.

The default behaviour of browsers over HTTP/1.x is to open several connections (how many depends on the browser) so that they can make requests in parallel. That way they can retrieve resources such as CSS, JS, images, and so on concurrently.
The number of connections is configurable in the browser. In general there are two preferences: the maximum number of connections per hostname and the total maximum number of open connections.
See also: http://www.browserscope.org/?category=network&v=0
So, when you start a request with Chrome, it opens several connections, even if it ends up using only one because there are not many requests to make. The idle, unused connections are closed after a few seconds.
I think that's why you see HttpServerInitializer being called several times: simply because there are several connections. So, on the server side, this is normal; you can't tell whether they are different clients or a single client with many connections.
I advise you not to do costly operations on the connection-opened event, but only when you receive a valid message/request. Your initializer should only configure the necessary handlers on the pipeline, which should be quick and simple, and nothing else.
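To illustrate the principle outside of Netty (a Netty example would be Java), here is a minimal sketch with Node's http module in TypeScript; the session lookup is made up for illustration. The point is that the per-connection hook stays cheap, while anything session-related happens per request:

```typescript
import * as http from "http";

// Hypothetical, illustration-only session lookup. In Netty the equivalent work
// would belong in your request handler, not in the channel initializer.
function lookupSession(cookieHeader: string | undefined): string {
  return cookieHeader ?? "anonymous";
}

const server = http.createServer((req, res) => {
  // Per-request work: runs only when a complete, valid HTTP request arrives.
  const session = lookupSession(req.headers.cookie);
  res.end(`hello, ${session}\n`);
});

server.on("connection", () => {
  // Per-connection work: browsers may open several speculative connections,
  // so keep this cheap (or empty) -- the analogue of a lean initializer.
});

server.listen(8080);
```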

Related

What does Chrome Network Timings really mean and what does affects each timing length?

I was looking at the Chrome dev tools' resource network timing to detect requests that need to be improved. In the link above there is a definition for each timing, but I don't understand what processes are happening behind the scenes that affect the length of each period.
Below are 3 different images, and here is my understanding of what's going on; please correct me if I'm wrong.
Stalled: Why do some requests get stalled for 1.17 s while others take less time?
Request Sent: the time our request took to reach the server
TTFB: the time taken until the server responds with the first byte of data
Content Download: the time until the whole response reaches the client
Thanks
Networking is an area where things vary greatly. A lot of different factors come into play, and they differ between locations, and even at the same location with different types of content.
Here is some more detail on the areas you need more understanding of:
Stalled: This depends on what else is going on in the network stack. One request might not be stalled at all, while others are stalled because six connections to the same host are already open. There are more reasons for stalling, but the maximum connection limit is an easy way to explain why it may occur.
The stalled state means the request simply can't be sent right now; it has to wait for some reason. Generally, this isn't a big deal. If you see it a lot and you are not on the HTTP/2 protocol, then you should look into minimizing the number of resources being pulled from a given location. If you are on HTTP/2, then don't worry too much about this, since it handles numerous requests differently (multiplexing them over a single connection).
Look around and see how many requests are going to a single domain. You can use the filter box to trim down the view. If you have a lot of requests going off to the same domain, then that is most likely hitting the connection limit. Domain sharding is one method to handle this with HTTP 1.1, but with HTTP 2 it is an anti-pattern and hurts performance.
If you are not hitting the max connection limit, then the problem is more nuanced and needs a more hands-on debugging approach to figure out what is going on.
Request Sent: This is not the time it takes to reach the server; that is part of the Time To First Byte. All "request sent" means is that the request was sent, and it took the network stack X amount of time to carry that out.
There is nothing you can do to speed this up; it is more for informational and internal debugging purposes.
Time to First Byte (TTFB): This is the total time for the sent request to get to the destination, then for the destination to process the request, and finally for the response to traverse the networks back to the client.
A high TTFB points to one of two issues. The first is a bad network connection between the client and the server, so data is slow to reach the server and come back. The second is a server that is slow to process the request, either because the hardware is weak or because the application running on it is slow. Both of these problems can also exist at once.
To address a high TTFB, first cut out as much network as possible. Ideally, host the application locally on a low-resource virtual machine and see if there is still a big TTFB. If there is, then the application needs to be optimized for response speed. If the TTFB is very low locally, then the networks between your client and the server are the problem. There are various ways to handle this that I won't get into, since it is an area of expertise unto itself. Research network optimization, and even try moving hosts to see if your server provider's network is the issue.
Remember that the entire server stack comes into play here. If nginx or Apache is configured poorly, your database is taking a long time to respond, or your cache is having trouble, these can all cause delays. They are also difficult to detect locally, since your local server configuration may differ from the remote stack.
Content Download: This is the total time, from the first byte arriving, for the client to receive the rest of the content from the server. This should be short unless you are downloading a large file. Look at the size of the file and the conditions of the network, and then judge roughly how long it should take.

Flash - Loader errors in Firefox

I'm writing an application which pulls up to several dozen images from a server using Loader objects. It works fine in all browsers except Firefox, where I'm finding that, with more than 6 or so connections, some simply never load, and I cease to get progress events (and can detect no errors/error events).
I extended the Loader class so that it will kill and reopen the transfer if it takes longer than ten seconds, but this temporary hack has created a new problem: when there are quite a few connections open, many of them will load 90-odd percent of the image, get killed for exceeding the time limit, open again, load 90-odd percent, and so on, until the traffic is low enough for the transfer to actually complete. This means I'm transferring many times the amount of data that is actually being requested!
It doesn't happen in any other browser (I was anticipating IE errors, so for Firefox to be the anomaly was unexpected!). I can write a class to manage Loaders, but I wondered if anyone else had seen this problem?
Thanks in advance for any help!
Maybe try to limit the number of concurrent connections.
Instead of loading all assets at once (in which case Flash Player or the browser manages the connections), try to build a queue.
Building a simple queue is fairly easy: just create an array of URLs and shift or pop a value every time the loader has finished loading the previous asset.
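The answer above is about ActionScript's Loader; as a rough sketch of the same queue idea in browser TypeScript (the URLs and the concurrency limit are made up for illustration):

```typescript
// Minimal limited-concurrency loader queue: each worker pops a URL, loads it
// fully, then moves on, so at most `maxParallel` transfers run at a time.
async function loadQueue(urls: string[], maxParallel = 2): Promise<void> {
  const queue = [...urls];

  async function worker(): Promise<void> {
    while (queue.length > 0) {
      const url = queue.shift()!;
      const response = await fetch(url); // one asset at a time per worker
      await response.blob();             // consume it fully before the next one
    }
  }

  // Start a small, fixed number of workers instead of one request per asset.
  await Promise.all(Array.from({ length: maxParallel }, worker));
}

// Example usage with made-up URLs:
loadQueue(["img/a.jpg", "img/b.jpg", "img/c.jpg"], 2);
```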
You might use an existing loader manager like LoaderMax or BulkLoader; they let you create a queue and limit the number of connections, and they are fairly robust. LoaderMax is my favourite.

Server-Sent Events vs Polling

Is there a big difference (in terms of performance, browser implementation availability, server load etc) between HTML5 SSEs and straight up Ajax polling? From the server side, it seems like an EventSource is just hitting the specified page every ~3 seconds or so (though I understand the timing is flexible).
Granted, it's simpler to set up on the client side than setting up a timer and having it $.get every so often, but is there anything else? Does it send fewer headers, or do some other magic I'm missing?
Ajax polling adds a lot of HTTP overhead since it is constantly establishing and tearing down HTTP connections. As HTML5 Rocks puts it "Server-Sent Events on the other hand, have been designed from the ground up to be efficient."
Server-sent events open a single long-lived HTTP connection. The server then unidirectionally sends data when it has it; there is no need for the client to request it or do anything but wait for messages.
One downside to Server-sent events is that since they create a persistent connection to the server you could potentially have many open connections to your server. Some servers handle massive numbers of concurrent connections better than others. That said, you would have similar problems with polling plus the overhead of constantly reestablishing those connections.
Server-sent events are quite well supported in most browsers, the notable exception of course being IE. But there are a couple of polyfills (and a jQuery plugin) that will fix that.
If you are doing something that only needs one-way communication, I would definitely go with Server-sent events. As you mentioned Server-sent events tend to be simpler and cleaner to implement on the client-side. You just need to set up listeners for messages and events and the browser takes care of low-level stuff like reconnecting if disconnected, etc. On the server-side it is also fairly easy to implement since it just uses simple text. If you send JSON encoded objects you can easily turn them into JavaScript objects on the client via JSON.parse().
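For example, a minimal client-side sketch (the endpoint URL is a placeholder):

```typescript
// Open one long-lived SSE connection; the browser reconnects automatically.
const source = new EventSource("/events"); // placeholder endpoint

source.onmessage = (event: MessageEvent) => {
  // If the server sends JSON in the data field, parse it into an object.
  const payload = JSON.parse(event.data);
  console.log("update:", payload);
};

source.onerror = () => {
  // Fired on connection problems; the browser will retry on its own.
  console.log("connection lost, retrying...");
};
```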
If you are using PHP on the server you can use json_encode() to turn strings, numbers, arrays and objects into properly encoded JSON. Other back-end languages may also provide similar functions.
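As a rough Node/TypeScript equivalent of such a server endpoint (the payload and port are made up), the response really is just text framed as "data:" lines followed by a blank line:

```typescript
import * as http from "http";

http.createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream", // tells the browser this is SSE
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });

  // Push a JSON-encoded message every 3 seconds until the client disconnects.
  const timer = setInterval(() => {
    const message = { time: Date.now() }; // made-up payload
    res.write(`data: ${JSON.stringify(message)}\n\n`);
  }, 3000);

  req.on("close", () => clearInterval(timer));
}).listen(8080);
```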
I would only add a higher-level perspective to what's been said: SSE follows a publish-subscribe model, as opposed to the constant polling of AJAX.
Generally, both approaches (polling and publish-subscribe) try to solve the same problem: how to maintain up-to-date state on the client.
1) Polling model
It is simple. The client (browser) first gets an initial state (page), and to update it, it needs to periodically request the state (the page or a part of it) and merge the result into the current state (refresh the whole page, or render it intelligently into the relevant part of the page in the case of AJAX).
Naturally, one drawback is that if nothing happens to the server state, the resources (CPU, network, ...) are used unnecessarily. Another is that even if the state changes, the client only gets it at the next poll, not as soon as it happens. One often has to find a polling period that is a good compromise between the two.
Another example of polling is a spinwait in threading.
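A minimal sketch of this model (the endpoint, interval, and render function are placeholders):

```typescript
// Naive polling: ask the server for the current state every few seconds,
// whether or not anything has changed.
const POLL_INTERVAL_MS = 3000; // placeholder period

function render(state: unknown): void {
  console.log("current state:", state); // stand-in for updating the page
}

async function poll(): Promise<void> {
  const response = await fetch("/state"); // placeholder endpoint
  render(await response.json());
}

setInterval(poll, POLL_INTERVAL_MS);
```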
2) Publish-subscribe model
It works as follows:
(client first requests and shows some initial state)
the client subscribes to the server (sends one request, possibly with some context, such as an event source)
the server stores a reference to that client in some client-reference repository
when the state changes, the server sends a notification to the client based on the reference it holds; i.e. it is not a response to a request but a message initiated by the server
well-behaved clients unsubscribe when they are no longer interested in the notifications
This is SSE; a waitable event in threading is another example.
A natural drawback, as stated, is that the server must know about all of its subscribed clients, which, depending on the implementation, can be an issue.

HTML5: shared web worker with multiple connections

From what I understand, the big benefit of HTML5's shared web workers is that they can accept multiple connections in a single separate thread of execution.
My question is: has anyone gotten multiple connections with a SharedWorker to work as a single thread with Google Chrome? I'm using the latest version, 12.0.742.112.
Demo: http://demos.zulius.com/html5/sharedworker
Source (in case demo is down): index.html, sharedworker.js
The demo establishes 2 separate event listeners. The expected output is:
foo got message: Hello World! You are connection #1
bar got message: Hello World! You are connection #2
In the demo, both event listeners fire correctly, but the connection count variable is not maintained in the SharedWorker script. This leads me to believe each connection to the SharedWorker is executing in a separate thread.
Am I doing something wrong? Or is Chrome support for SharedWorker not quite there?
UPDATE: the demo works now.
You have 2 listeners on the worker, but you only start the worker once, so it's 1 worker shared by 1 owner instead of 2 owners. Increasing the number of listeners doesn't affect the ownership.
You can see the example here:
http://weblog.bocoup.com/javascript-web-workers-chrome-5-supports-new-sharedworker
It has 2 documents: the page containing the iframe and the page inside the iframe. They both call the start method of the worker's port, so it's shared by 2 owners. Since start is called twice, the onconnect event should fire twice, making connection.count equal 2.
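A minimal sketch of that setup (file names are placeholders): the worker counts connections in onconnect, and each owning document creates the worker and starts its port.

```typescript
// sharedworker.ts -- one instance shared by every document that constructs it
let connectionCount = 0;

// onconnect fires once per owning document that connects to this worker.
(self as any).onconnect = (event: MessageEvent) => {
  connectionCount++;
  const port = event.ports[0];
  port.postMessage(`Hello World! You are connection #${connectionCount}`);
};
```

```typescript
// page.ts -- run in each document (e.g. the outer page and the iframe)
const worker = new SharedWorker("sharedworker.js"); // placeholder file name
worker.port.onmessage = (event) => console.log("got message:", event.data);
worker.port.start(); // each owning document starts its own port
```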
In shared web workers, the context stays alive until the last browser session ends. Shared web workers can maintain that context across browser tabs and respond to requests from all of them with the same data.
A change in that shared data affects all connections: you can update every connection with a single change, keep the data until the last connection ends, and propagate connection changes to all views.
Here is a demo of Shared web workers with multiple connections.
http://www.antkorp.in/sharedworkers/

How do download accelerators work?

We require all requests for downloads to have a valid login (non-http) and we generate transaction tickets for each download. If you were to go to one of the download links and attempt to "replay" the transaction, we use HTTP codes to forward you to get a new transaction ticket. This works fine for a majority of users. There's a small subset, however, that are using Download Accelerators that simply try to replay the transaction ticket several times.
So, in order to determine whether we want to or even can support download accelerators or not, we are trying to understand how they work.
How does having a second, third or even fourth concurrent connection to the web server delivering a static file speed up the download process?
What does the accelerator program do?
You'll get a more comprehensive overview of Download Accelerators at wikipedia.
Acceleration is multi-faceted
First
A substantial benefit of managed/accelerated downloads is that the tool in question remembers the start/stop offsets already transferred and uses Range headers (and the resulting 206 Partial Content responses) to request parts of the file instead of all of it.
This means that if something dies mid-transfer (e.g. a TCP timeout), it just reconnects where it left off and you don't have to start from scratch.
Thus, if you have an intermittent connection, the aggregate transfer time is greatly lessened.
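A rough sketch of that resume behaviour using fetch in TypeScript (the URL and offset are placeholders):

```typescript
// Resume a download from a known byte offset instead of starting over.
async function resumeDownload(url: string, alreadyHave: number): Promise<Blob> {
  const response = await fetch(url, {
    headers: { Range: `bytes=${alreadyHave}-` }, // ask only for the missing tail
  });

  if (response.status !== 206) {
    // 206 Partial Content means the server honoured the Range header;
    // anything else and we would have to restart from byte 0.
    throw new Error(`server did not return partial content: ${response.status}`);
  }
  return response.blob();
}

// Example: we already transferred 1,048,576 bytes before the connection died.
resumeDownload("https://example.com/big.iso", 1_048_576);
```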
Second
Download accelerators like to break a single transfer into several smaller segments of equal size, using the same start-range-stop mechanics, and perform them in parallel, which greatly improves transfer time over slow networks.
There's this annoying thing called the bandwidth-delay product: the throughput of a single TCP connection is roughly limited to the buffer (window) size at either end divided by the round-trip time, which in practice means large ping times will limit your speed regardless of how many megabits per second the links in between can carry. For example, with a 64 KB window and a 100 ms round trip, one connection tops out at roughly 640 KB/s no matter how fast the link is.
However, this limitation appears to be per connection, so multiple TCP connections to a single server can help mitigate the performance hit of a high-latency path.
Hence, people who live nearby are not so likely to need a segmented transfer, but people in far-away locations are more likely to benefit from going crazy with their segmentation.
Third
In some cases it is possible to find multiple servers that provide the same resource, sometimes a single DNS address round-robins to several IP addresses, or a server is part of a mirror network of some kind. And download managers/accelerators can detect this and apply the segmented transfer technique across multiple servers, allowing the downloader to get more collective bandwidth delivered to them.
Support
Supporting the first kind of acceleration is what I personally suggest as a "minimum" for support. Mostly because it makes a user's life easier, and it reduces the amount of aggregate data transfer you have to provide, since users don't have to fetch the same content repeatedly.
To facilitate this, it's recommended that you track how much they have transferred and not expire the ticket until they look "finished" (while binding traffic to the first IP that used the ticket), or until a given "reasonable" time to download it has passed; i.e. give them a grace window before requiring that they get a new ticket.
Supporting the second and third kinds earns you bonus points, and users generally want at least the second, mostly because international customers don't like being treated as second-class customers simply because of their greater ping time, and it doesn't objectively consume more bandwidth in any sense that matters. The worst that happens is that they might cause your total throughput to be undesirable for how your service operates.
It's reasonably straightforward to deliver the first kind of benefit without allowing the second, simply by restricting the number of concurrent transfers from a single ticket.
I believe the idea is that many servers limit or evenly distribute bandwidth across connections. By having multiple connections, you're cheating that system and getting more than your "fair" share of bandwidth.
It's all about Little's law. Specifically, each stream to the web server sees a certain amount of TCP latency and so will only carry so much data. Tricks like increasing the TCP window size and implementing selective acks help, but they are often poorly implemented and generally cause more problems than they solve.
Having multiple streams means that the latency seen by each stream is less important as the global throughput increases overall.
Another key advantage of a download accelerator, even when using a single thread, is that it's generally better than the web browser's built-in download tool. For example, if the web browser decides to die, the download tool will continue. And the download tool may support functionality like pausing/resuming that the built-in browser tool doesn't.
My understanding is that one method download accelerators use is opening many parallel TCP connections; each TCP connection can only go so fast, and is often limited on the server side.
TCP is implemented such that if a timeout occurs, the timeout period is increased. This is very effective at preventing network overloads, at the cost of speed on individual TCP connections.
Download accelerators can get around this by opening dozens of TCP connections and dropping the ones that slow to below a certain threshold, then opening new ones to replace the slow connections.
While effective for a single user, I believe it is bad etiquette in general.
You're seeing the download accelerator trying to re-authenticate using the same transaction ticket - I'd recommend ignoring these requests.
From: http://askville.amazon.com/download-accelerator-protocol-work-advantages-benefits-application-area-scope-plz-suggest-URLs/AnswerViewer.do?requestId=9337813
Quote:
The most common way of accelerating downloads is to open up parallel downloads. Many servers limit the bandwidth of one connection, so opening more in parallel increases the rate. This works by specifying an offset at which a download should start, which is supported for HTTP and FTP alike.
Of course this way of accelerating is quite "unsocial". The bandwidth limit is implemented so that the server can serve a higher number of clients, so using this technique lowers the maximum number of peers that are able to download. That's the reason why many servers limit the number of parallel connections (recognized by IP); e.g. many FTP servers do this, so you run into problems if you download a file and then try to continue browsing using your browser, since technically these are two parallel connections.
Another technique to increase the download rate is a peer-to-peer network, where different sources (e.g. each limited by asymmetric DSL on the upload side) are used for downloading.
Most download "accelerators" really don't speed up anything at all. What they are good at is congesting network traffic, hammering your server, and breaking custom scripts like you've seen. Basically, instead of making one request and downloading the file from beginning to end, they make, say, four requests: the first one downloads from 0-25%, the second from 25-50%, and so on, and they make them all at the same time. The only case where this helps is if their ISP or firewall does some kind of traffic shaping such that an individual download's speed is limited to less than their total download speed.
Personally, if it's causing you any trouble, I'd say just put up a notice that download accelerators are not supported, and have the users download normally, or using only a single thread.
They don't, generally.
To answer the substance of your question, the assumption is that the server is rate-limiting downloads on a per-connection basis, so simultaneously downloading multiple chunks will enable the user to make the most of the bandwidth available at their end.
Typically download accelerators depend on partial content downloads, i.e. status code 206. Just like streaming media players, which ask the server for a small chunk of the full file and then download and play it. The catch is that if a server restricts partial content downloads, the download accelerator won't work. It's easy to configure a server like Nginx to restrict partial content downloads.
How do you know if a file can be downloaded via ranges/partially?
Ans: check for the response header Accept-Ranges. If it exists, then you are good to go.
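For example, a quick check with a HEAD request (the URL is a placeholder):

```typescript
// Returns true if the server advertises byte-range support for this URL.
async function supportsRanges(url: string): Promise<boolean> {
  const response = await fetch(url, { method: "HEAD" });
  return response.headers.get("Accept-Ranges") === "bytes";
}

supportsRanges("https://example.com/big.iso").then((ok) =>
  console.log(ok ? "ranges supported" : "no range support"),
);
```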
How do you implement a feature like this in any programming language?
Ans: well, it's pretty easy. Just spin up some threads/coroutines (choose threads/coroutines over processes in an I/O- or network-bound system) to download the N chunks in parallel, and save each partial piece at the right position in the file; then you are technically done. Calculate the download speed by keeping a global variable downloaded_till_now = 0 and incrementing it as each thread finishes downloading a chunk. Don't forget about the mutex, since we are writing to a global resource from multiple threads, so do a thread.acquire() and thread.release(). Also keep a Unix-time counter and do math like
speed_in_bytes_per_sec = downloaded_till_now/(current_unix_time-start_unix_time)
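A hedged sketch of that approach in TypeScript, using concurrent fetches with Range headers instead of explicit threads (JavaScript's single-threaded event loop means no mutex is needed); the URL, chunk count, and reliance on a Content-Length header are assumptions:

```typescript
// Download a file in N parallel ranged requests and report an average speed.
async function acceleratedDownload(url: string, chunks = 4): Promise<Uint8Array> {
  const head = await fetch(url, { method: "HEAD" });
  const total = Number(head.headers.get("Content-Length")); // assumes the server sends it
  const size = Math.ceil(total / chunks);
  const buffer = new Uint8Array(total);

  let downloadedTillNow = 0; // shared progress counter
  const start = Date.now();

  await Promise.all(
    Array.from({ length: chunks }, async (_, i) => {
      const from = i * size;
      const to = Math.min(from + size, total) - 1;
      const part = await fetch(url, { headers: { Range: `bytes=${from}-${to}` } });
      const bytes = new Uint8Array(await part.arrayBuffer());
      buffer.set(bytes, from); // write this chunk at its offset in the file
      downloadedTillNow += bytes.length;
      const elapsedSec = (Date.now() - start) / 1000;
      console.log(`speed: ${(downloadedTillNow / elapsedSec).toFixed(0)} bytes/sec`);
    }),
  );
  return buffer;
}

acceleratedDownload("https://example.com/big.iso", 4);
```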