I'm trying to optimize the load time of an HTML5 video. Is there any way to make a browser treat each WebM video chunk as a separate TCP stream, to take advantage of HTTP/2's improved parallelization?
You cannot directly configure whether a browser reuses the same HTTP/2 connection for another request or opens a new connection. That is up to the browser to decide.
In theory, using just one HTTP/2 connection should give you optimal performance, since it avoids the overhead of opening new connections. In practice it can sometimes be worse than using multiple HTTP/1.1 connections, due to suboptimal flow-control windows or stream prioritization in some HTTP/2 implementations.
One workaround to force multiple connections might be to serve some of the chunks through a different URL (pointing to the same server), which prevents the browser from reusing the connection. That will, however, require some additional effort to set up.
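A minimal sketch of that workaround, assuming hypothetical hostnames (media1/media2.example.com) that point at the same server; note that if both hostnames resolve to the same IP and are covered by the same certificate, browsers may still coalesce them onto a single HTTP/2 connection:

```js
// Sketch only: spread chunk requests across several hostnames so the browser
// cannot reuse one connection for everything. Hostnames and paths are placeholders.
const hosts = ['https://media1.example.com', 'https://media2.example.com'];

function chunkUrl(index) {
  // Alternate hostnames per chunk; each distinct origin gets its own connection.
  const host = hosts[index % hosts.length];
  return `${host}/video/chunk-${index}.webm`;
}

async function fetchChunk(index) {
  const response = await fetch(chunkUrl(index));
  return response.arrayBuffer(); // e.g. to append to a MediaSource SourceBuffer
}
```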
Another option could be to try disabling HTTP/2 on the server that serves those chunks.
I've been working on making a near real-time voice chat app. The webpage will send packets to the server, the server will save the packets to disk, and then retransmit the packets to the other connected webpages. I've tried many other solutions, but they are either laggy or they do not play. I've realized that sending PCM samples would be optimal (the server will be recording these as well), but I'm not sure how to get them to play on another client's end. I'm using NodeJS with Socket.IO. Thanks in advance!
The webpage will send packets to the server, the server will save the packets to disk, and then retransmit the packets to the other connected webpages.
Already, this is not that efficient. It's better when possible to send data directly from peer to peer.
I've realized that sending PCM samples would be optimal
No, it wouldn't. This requires more bandwidth, which is going to need better buffering, which means higher latency. This is voice chat... no need to use a lossless encoding like PCM.
I've been working on making a near real-time voice chat app.
This is basically the de facto primary use case that WebRTC was built for. If you use WebRTC, you get:
Peer-to-peer streaming (where possible)
NAT traversal (to enable those P2P connections, where possible, or proxy them when not)
Low latency optimization, from end to end
Hardware acceleration (where available)
Opus audio codec
Automatic resampling, for compatibility and to keep latency low as things drop
In other words, this is already a solved problem with WebRTC.
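A rough client-side sketch of what that looks like, assuming signalling happens over your existing Socket.IO connection; the socket event names, STUN URL, and the answering side (which is omitted) are placeholders rather than a specific API:

```js
// Sketch: capture microphone audio and stream it to a peer via WebRTC.
// The browser negotiates Opus and handles resampling and jitter buffering.
async function startVoiceChat(socket) {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] // placeholder STUN server
  });

  // Send our microphone audio to the peer.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  stream.getTracks().forEach(track => pc.addTrack(track, stream));

  // Play whatever audio the remote peer sends.
  pc.ontrack = (event) => {
    const audio = new Audio();
    audio.srcObject = event.streams[0];
    audio.play();
  };

  // Relay ICE candidates through the signalling channel (Socket.IO here).
  pc.onicecandidate = (event) => {
    if (event.candidate) socket.emit('ice-candidate', event.candidate);
  };

  // Create and send the offer; handling the answer and remote candidates is omitted.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  socket.emit('offer', offer);

  return pc;
}
```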
I have an application in which I open a websocket from a browser (Chrome, in my case) to a server, then I start sending messages from the server side to the browser. What I am finding is that when I send messages too quickly from the server, messages start getting buffered up on the browser side. This means that the browser "falls behind" and ends up processing messages sent long ago by the server, which in my application is undesirable.
I have eliminated the following possible candidates for where this buffering is happening:
The server. I can kill the server process entirely and I see that messages continue to be received by my JavaScript code for several minutes, so the buffering is not happening inside the server process.
The network. I can reproduce the same issue when running the server on the same machine as my web browser, and the amount of data that I am sending is far below the bandwidth constraints for a TCP connection to localhost.
This leaves the browser. Is there any way I can (a) determine the size of the buffer Chrome is maintaining for my WebSocket, or (b) reduce the size of this buffer and cause Chrome to drop frames?
(a) Chrome buffers around 128 KB per WebSocket connection. The amount of buffered data is not visible to the application.
(b) Chrome will never intentionally drop frames (this would violate the standard).
When the processing done by Javascript is trivial, Chrome can handle over 50 MB per second over a WebSocket. So it sounds like the processing you are doing is non-trivial. You can drop messages that are too old in the onmessage handler (but please bear in mind that the clock on the client may be out-of-sync with the clock on the server).
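One hedged sketch of that approach, assuming each server message carries a sentAt timestamp (a field you would have to add yourself) and that the client and server clocks are roughly in sync:

```js
// Sketch only: drop messages older than a threshold before doing expensive work.
// "sentAt" is an assumed server-added field; handleMessage is your own code.
const MAX_AGE_MS = 500;
const ws = new WebSocket('wss://example.com/feed'); // placeholder URL

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  if (Date.now() - message.sentAt > MAX_AGE_MS) {
    return; // stale: skip it so the client can catch up
  }
  handleMessage(message); // application-specific processing
};
```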
If the main thread of the browser is always busy, even dropping messages may not be enough to keep up. I recommend the "Performance" tab in Chrome Devtools as a good way to see where your application is spending its time.
Many, if not all, modern browsers do not use pipelined HTTP requests. In theory, pipelining should speed up requests by reducing the number of round trips required to fetch a website.
According to the HTTP standard, all servers must handle pipelined requests, so the problem should not be a lack of support on the server side.
I have seen some security concerns, such as a layer 7 DoS attack if a client pushes as many pipelined requests as possible to a URL that's performance-intensive for the server, ignoring any answers that might be received.
That would be a reason to turn pipelining support off on the server (violating the standard), but I cannot find any reason to turn it off on the clients.
It is however turned on by default on Android browsers and Chrome mobile.
Why are Chrome, Firefox, IE, Opera and Safari not using pipelined HTTP requests in their desktop (and sometimes mobile) version? What is their reasoning behind turning it off?
Pipelining is disabled for the following reasons:
Firefox:
The bigger issue has frankly been head of line blocking and its impact on performance and robustness. Naïve pipelines simply make performance worse.
Chrome:
The option to enable pipelining has been removed from Chrome, as there are known crashing bugs and known front-of-queue blocking issues. There are also a large number of servers and middleboxes that behave badly and inconsistently when pipelining is enabled. Until these are resolved, it's recommended nobody uses pipelining. Doing so currently requires a custom build of Chromium.
In general:
Buggy proxies are still common and these lead to strange and erratic behaviors that Web developers cannot foresee and diagnose easily.
Pipelining is complex to implement correctly: the size of the resource being transferred, the effective RTT that will be used, as well as the effective bandwidth, have a direct incidence on the improvement provided by the pipeline. Without knowing these, important messages may be delayed behind unimportant ones. The notion of important even evolves during page layout! HTTP pipelining therefore brings a marginal improvement in most cases only.
Pipelining is subject to the HOL problem.
HTTP/2 offers an alternative:
With HTTP/1.x, the browser has limited ability to leverage above priority data: the protocol does not support multiplexing, and there is no way to communicate request priority to the server. Instead, it must rely on the use of parallel connections, which enables limited parallelism of up to six requests per origin. As a result, requests are queued on the client until a connection is available, which adds unnecessary network latency. In theory, HTTP Pipelining tried to partially address this problem, but in practice it has failed to gain adoption.
HTTP/2 resolves these inefficiencies: request queuing and head-of-line blocking is eliminated because the browser can dispatch all requests the moment they are discovered, and the browser can communicate its stream prioritization preference via stream dependencies and weights, allowing the server to further optimize response delivery.
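As a client-side illustration of that difference (not part of the quoted text), the sketch below dispatches every request as soon as it is known; over HTTP/2 they are multiplexed onto one connection, while over HTTP/1.x the same code queues behind the roughly six-connections-per-origin limit. The URLs are placeholders:

```js
// Illustrative sketch: the application code is identical under HTTP/1.x and HTTP/2;
// only the transport changes how many requests are actually in flight at once.
async function loadAll() {
  const urls = Array.from({ length: 20 }, (_, i) => `/assets/resource-${i}.json`);
  const responses = await Promise.all(urls.map(url => fetch(url)));
  return Promise.all(responses.map(res => res.json()));
}
```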
A proxy can be used as well:
You can try something I did to speed up Konqueror in KDE3. I was dissatisfied that Konqueror did not have HTTP pipelining, so after some searching, I installed Polipo as a local HTTP/HTTPS/FTP proxy and set Konqueror to use it (localhost on port 8123 if I remember correctly). In addition to HTTP pipelining, Polipo also provided improved caching, and since it was a proxy, I could set every browser to use it and the caching would be shared between the browsers. (This also means that it is a good idea to disable each browser's independent caching.)
Salesforce uses the following process:
Salesforce has a powerful and field-tested approach for mitigating HOLB at the TCP layer: we decouple the relation between an HTTP request and a TCP connection. Think about your transport as composed of multiple TCP connections (as many as the network context would need). Any part of the HTTP request can go over any TCP connection. So if you hit the HOLB in one connection, it not only helps in mitigating affected requests, it also minimizes impact to other application requests using healthy connections. The result is an ability to enjoy the benefits of multiplexing and pipelining at the HTTP layer while minimizing risks of HOLB.
References
Mozilla Bug 264354 – Enable HTTP pipelining by default
HTTP Pipelining - The Chromium Projects
Chromium Issue 364557: Remove pipelining code from Chrome
Understanding Connection Limits and New Proxy Connection Limits in WinInet and Internet Explorer – Http Client Protocol Issues (and other fun stuff I support)
HTTPS and Keep-Alive Connections – IEInternals
Changes in WinHttp on Windows 7 and onwards wrt HTTP/1.0 – HTTPContext
Content-Length and Transfer-Encoding Validation in the IE10 Download Manager – IEInternals
Use Sensible Long-Lived Cache headers – IEInternals
Web Performance : 2015 : March | Akamai Community
WebSockets, caution required!
HTTP: HTTP/2 - High Performance Browser Networking (O'Reilly)
HTTP Pipelining - Not So Fast...(Nor Slow!) – Guy's Pod
Persistent Connection Behavior of Popular Browsers
Connection management in HTTP/1.x - HTTP | MDN
Download Resumption in Internet Explorer – IEInternals
Networking Improvements in IE10 and Windows 8 – IEInternals
Konqueror very slowly (KDE4) • KDE Community Forums
HTTP Optimization: Multiple TCP Connections and Pipelining
SpeedGuide :: Internet Explorer, Chrome, Firefox Web Browser Tweaks
The Full Picture on HTTP/2 and HOL Blocking – Salesforce Engineering
The accepted answer may be somewhat out of date. Today I've seen Chrome desktop pipeline 10 requests in a single HTTPS connection against our server, which provided me with the pipeline counts.
Many Comet implementations like Caplin provide scalable server solutions.
The following is one of the statistics from the Caplin site:
A single instance of Caplin Liberator can support up to 100,000 clients each receiving 1 message per second with an average latency of less than 7 ms.
How does this compare to HTML5 WebSockets on any web server? Can anyone point me to any HTML5 WebSocket statistics?
Disclosure - I work for Caplin.
There is a bit of misinformation on this page, so I'd like to try to make it clearer.
I think we could split the methods we are talking about into three camps:
Comet HTTP polling - including long polling
Comet HTTP streaming - server to client messages use a single persistent socket with no HTTP header overhead after initial setup
Comet WebSocket - single bidirectional socket
I see them all as Comet, since Comet is just a paradigm, but since WebSocket came along some people want to treat it as if it were different or a replacement for Comet. It is just another technique, and unless you are happy only supporting the latest browsers, you can't rely on WebSocket alone.
As far as performance is concerned, most benchmarks concentrate on server to client messages - numbers of users, numbers of messages per second, and the latency of those messages. For this scenario there is no fundamental difference between HTTP Streaming and WebSocket - both are writing messages down an open socket with little or no header or overhead.
Long polling can give good latency if the frequency of messages is low. However, if you have two messages (server to client) in quick succession then the second one will not arrive at the client until a new request is made after the first message is received.
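As a rough illustration (the /poll endpoint is hypothetical), a long-polling client is essentially a loop of requests, and a message that arrives while no request is outstanding has to wait for the next iteration:

```js
// Minimal long-polling loop. The server is assumed to hold the /poll request
// open until it has a message (or times out), then respond; the client
// immediately re-requests. A second message arriving between responses waits here.
async function longPoll(onMessage) {
  for (;;) {
    try {
      const response = await fetch('/poll');
      if (response.ok) onMessage(await response.json());
    } catch (err) {
      await new Promise(resolve => setTimeout(resolve, 1000)); // back off on errors
    }
  }
}
```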
I think someone touched on HTTP keep-alive. This can obviously improve long polling; you still have the overhead of the round trip and headers, but not always the socket creation.
Where WebSocket should improve upon HTTP streaming is in scenarios with more client-to-server messages. Relating these scenarios to the real world creates slightly more arbitrary setups, compared to the simple-to-understand 'send lots of messages to lots of clients' case that everyone can grasp. For example, in a trading application it is easy to create a scenario that includes users executing trades (i.e. client-to-server messages), but the results are a bit less meaningful than the basic server-to-client scenarios. Traders are not trying to do 100 trades/sec, so you end up with results like '10,000 users receiving 100 messages/sec while also sending a client message once every 5 minutes'. The more interesting part of the client-to-server message is the latency, since the number of messages required is usually insignificant compared to the server-to-client messages.
Another point someone made above was about 64k clients. You do not need to do anything clever to support more than 64k sockets on a server, other than configuring the number of file descriptors, etc. If you were trying to make 64k connections from a single client machine, that is totally different, as each one needs a port number; on the server end it is fine, though, since that is the listening end, and you can go well above 64k sockets.
In theory, WebSockets can scale much better than HTTP but there are some caveats and some ways to address those caveats too.
The complexity of the handshake header processing of HTTP vs WebSockets is about the same. The HTTP (and initial WebSocket) handshake can easily be over 1 KB of data (due to cookies, etc.). The important difference is that with HTTP those headers are sent again with every request. Once a WebSocket connection is established, the overhead per message is only 2-14 bytes.
The excellent Jetty benchmark links posted in @David Titarenco's answer (1, 2) show that WebSockets can easily achieve more than an order of magnitude better latency compared to Comet.
See this answer for more information on scaling of WebSockets vs HTTP.
Caveats:
WebSocket connections are long-lived unlike HTTP connections which are short-lived. This significantly reduces the overhead (no socket creation and management for every request/response), but it does mean that to scale a server above 64k separate simultaneous client hosts you will need to use tricks like multiple IP addresses on the same server.
Due to security concerns with web intermediaries, browser to server WebSocket messages have all payload data XOR masked. This adds some CPU utilization to the server to decode the messages. However, XOR is one of the most efficient operations in most CPU architectures and there is often hardware assist available. Server to browser messages are not masked and since many uses of WebSockets don't require large amounts of data sent from browser to server, this isn't a big issue.
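For what that masking step looks like on the server, here is a sketch of the per-frame unmasking defined by RFC 6455 (frame parsing itself is omitted, and Node.js Buffers are assumed):

```js
// Unmask a client-to-server WebSocket frame payload: each payload byte is XORed
// with the corresponding byte of the 4-byte masking key from the frame header.
function unmask(payload, maskingKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskingKey[i % 4];
  }
  return out;
}
```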
It's hard to know how that compares to anything, because we don't know how big the (average) payload size is. Under the hood (as in how the server is implemented), HTTP streaming and WebSockets are virtually identical, apart from the initial handshake, which is obviously more complicated when done over HTTP.
If you wrote your own WebSocket server in C (à la Caplin), you could probably reach those numbers without too much difficulty. Most WebSocket implementations are done through existing server packages (like Jetty), so the comparison wouldn't really be fair.
Some benchmarks:
http://webtide.intalio.com/2011/09/cometd-2-4-0-websocket-benchmarks/
http://webtide.intalio.com/2011/08/prelim-cometd-websocket-benchmarks/
However, if you look at C event lib benchmarks, like libev and libevent, the numbers look significantly sexier:
http://libev.schmorp.de/bench.html
Ignoring any form of polling, which, as explained elsewhere, can introduce latency when the update rate is high, the three most common techniques for JavaScript streaming are:
WebSocket
Comet XHR/XDR streaming
Comet Forever IFrame
WebSocket is by far the cleanest solution, but there are still issues in terms of browser and network infrastructure not supporting it. The sooner it can be relied upon the better.
XHR/XDR & Forever IFrame are both fine for pushing data to clients from the server, but require various hacks to work consistently across all browsers. In my experience these Comet approaches are always slightly slower than WebSockets, not least because there is a lot more client-side JavaScript code required to make them work; from the server's perspective, however, sending data over the wire happens at the same speed.
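For a feel of the kind of hack involved, here is a rough XHR-streaming sketch; the /stream endpoint and the newline-delimited JSON framing are assumptions, and a real implementation needs more browser-specific workarounds:

```js
// Comet XHR streaming sketch: the server keeps the response open and flushes
// newline-delimited JSON messages; the client consumes complete lines as they
// arrive via progress events, then reconnects when the stream eventually ends.
function startXhrStream(onMessage) {
  const xhr = new XMLHttpRequest();
  let offset = 0; // how much of responseText has been consumed so far

  xhr.onprogress = () => {
    let newlineIndex;
    while ((newlineIndex = xhr.responseText.indexOf('\n', offset)) !== -1) {
      const line = xhr.responseText.slice(offset, newlineIndex);
      offset = newlineIndex + 1;
      if (line) onMessage(JSON.parse(line));
    }
  };
  xhr.onloadend = () => startXhrStream(onMessage); // simple reconnect

  xhr.open('GET', '/stream');
  xhr.send();
}
```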
Here are some more WebSocket benchmark graphs, this time for our product my-Channels Nirvana.
Skip past the multicast and binary data graphs down to the last graph on the page (JavaScript High Update Rate)
In summary - The results show Nirvana WebSocket delivering 50 events/sec to 2,500k users with 800 microsecond latency. At 5,000 users (total of 250k events/sec streamed) the latency is 2 milliseconds.
I am curious if anyone has any information about the scalability of HTML WebSockets. For everything I've read it appears that every client will maintain an open line of communication with the server. I'm just wondering how that scales and how many open WebSocket connections a server can handle. Maybe leaving those connections open isn't a problem in reality, but it feels like it is.
In most ways WebSockets will probably scale better than AJAX/HTML requests. However, that doesn't mean WebSockets is a replacement for all uses of AJAX/HTML.
Each TCP connection in itself consumes very little in terms of server resources. Often, setting up the connection can be expensive, but maintaining an idle connection is almost free. The first limitation that is usually encountered is the maximum number of file descriptors (sockets consume file descriptors) that can be open simultaneously. This often defaults to 1024 but can easily be configured higher.
Ever tried configuring a web server to support tens of thousands of simultaneous AJAX clients? Change those clients into WebSockets clients and it just might be feasible.
HTTP connections, while they don't create open files or consume port numbers for a long period, are more expensive in just about every other way:
Each HTTP connection carries a lot of baggage that isn't used most of the time: cookies, content type, content length, user-agent, server id, date, last-modified, etc. Once a WebSockets connection is established, only the data required by the application needs to be sent back and forth.
Typically, HTTP servers are configured to log the start and completion of every HTTP request, taking up disk and CPU time. It will become standard to log the start and completion of WebSockets data, but while the WebSockets connection is doing duplex transfer there won't be any additional logging overhead (except by the application/service if it is designed to do so).
Typically, interactive applications that use AJAX either continuously poll or use some sort of long-poll mechanism. WebSockets is a much cleaner (and lower resource) way of doing a more evented model where the server and client notify each other when they have something to report over the existing connection (see the sketch after this list).
Most of the popular web servers in production have a pool of processes (or threads) for handling HTTP requests. As pressure increases, the size of the pool will be increased because each process/thread handles one HTTP request at a time. Each additional process/thread uses more memory, and creating new processes/threads is quite a bit more expensive than creating new socket connections (which those processes/threads still have to do). Most of the popular WebSockets server frameworks are going the evented route, which tends to scale and perform better.
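A small contrast sketch of those two models (the endpoints, the polling interval, and the render function are placeholders):

```js
// AJAX polling: a full HTTP request every 2 seconds, whether or not anything changed.
setInterval(async () => {
  const updates = await (await fetch('/api/updates')).json();
  if (updates.length) render(updates);
}, 2000);

// Evented over WebSocket: one long-lived connection; work happens only when the
// server actually pushes an event.
const ws = new WebSocket('wss://example.com/updates');
ws.onmessage = (event) => render(JSON.parse(event.data));
```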
The primary benefit of WebSockets will be lower latency connections for interactive web applications. It will scale better and consume less server resources than HTTP AJAX/long-poll (assuming the application/server is designed properly), but IMO lower latency is the primary benefit of WebSockets because it will enable new classes of web applications that are not possible with the current overhead and latency of AJAX/long-poll.
Once the WebSockets standard becomes more finalized and has broader support, it will make sense to use it for most new interactive web applications that need to communicate frequently with the server. For existing interactive web applications it will really depend on how well the current AJAX/long-poll model is working. The effort to convert will be non-trivial so in many cases the cost just won't be worth the benefit.
Update:
Useful link: 600k concurrent websocket connections on AWS using Node.js
Just a clarification: the number of client connections that a server can support has nothing to do with ports in this scenario, since the server is [typically] only listening for WS/WSS connections on one single port. I think what the other commenters meant to refer to were file descriptors. You can set the maximum number of file descriptors quite high, but then you have to watch out for socket buffer sizes adding up for each open TCP/IP socket. Here's some additional info: https://serverfault.com/questions/48717/practical-maximum-open-file-descriptors-ulimit-n-for-a-high-volume-system
As for decreased latency via WS vs. HTTP, it's true since there's no more parsing of HTTP headers beyond the initial WS handshake. Plus, as more and more packets are successfully sent, the TCP congestion window widens, effectively reducing the RTT.
Any modern single server is able to serve thousands of clients at once. Its HTTP server software just has to be event-driven (IOCP) oriented (we are not in the old Apache one connection = one thread/process equation any more). Even the HTTP server built into Windows (http.sys) is IOCP oriented and very efficient (running in kernel mode). From this point of view, there won't be a lot of difference at scale between WebSockets and regular HTTP connections. One TCP/IP connection uses few resources (much less than a thread), and modern OSes are optimized for handling a lot of concurrent connections: WebSockets and HTTP are just OSI layer 7 application protocols, inheriting from the TCP/IP specification.
But, from experiment, I've seen two main problems with WebSockets:
They are not supported by CDNs;
They have potential security issues.
So I would recommend the following, for any project:
Use WebSockets for client notifications only (with a fallback mechanism to long-polling - there are plenty of libraries around);
Use RESTful / JSON for all other data, using a CDN or proxies for cache.
In practice, full WebSockets applications do not scale well. Just use WebSockets for what they were designed for: push notifications from the server to the client.
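A minimal sketch of that split, assuming a hypothetical notifications endpoint and REST resources served through the CDN:

```js
// Sketch: the WebSocket only tells the client that something changed; the actual
// data is fetched over plain REST so a CDN or proxy can cache it.
// URLs, message shapes, and renderOrder are placeholders.
const ws = new WebSocket('wss://example.com/notifications');

ws.onmessage = async (event) => {
  const note = JSON.parse(event.data);     // e.g. { resource: '/api/orders/42' }
  const res = await fetch(note.resource);  // cacheable GET through the CDN
  renderOrder(await res.json());           // application-specific rendering
};
```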
About the potential problems of using WebSockets:
1. Consider using a CDN
Today (almost 4 years later), web scaling involves using Content Delivery Network (CDN) front ends, not only for static content (HTML, CSS, JS) but also for your (JSON) application data.
Of course, you won't put all your data in your CDN cache, but in practice a lot of common content won't change often. I suspect that 80% of your REST resources may be cached... Even a one-minute (or 30-second) CDN expiration timeout may be enough to give your central server a new life and greatly enhance the application's responsiveness, since CDNs can be geographically tuned...
To my knowledge, there is no WebSocket support in CDNs yet, and I suspect there never will be. WebSockets are stateful, whereas HTTP is stateless and therefore much more easily cached. In fact, to make WebSockets CDN-friendly, you may need to switch to a stateless RESTful approach... which would not be WebSockets any more.
2. Security issues
WebSockets have potential security issues, especially regarding DoS attacks. For an illustration of new security vulnerabilities, see this set of slides and this WebKit ticket.
WebSockets bypass packet inspection at the OSI layer 7 (application) level, which has become pretty standard nowadays in business security. In fact, WebSockets obfuscate the transmission, so they may be a major security leak.
Think of it this way: what is cheaper, keeping an open connection, or opening a new connection for every request (with the negotiation overhead of doing so; remember, it's TCP)?
Of course it depends on the application, but for long-term realtime connections (e.g. an AJAX chat) it's far better to keep the connection open.
The max number of connections will be capped by the max number of free ports for the sockets.
No, it does not scale; it gives tremendous work to intermediate routers and switches. Then, on the server side, the page faults (you have to keep all those descriptors) reach high values, and the time to bring a resource into the working area increases. These are mostly Java-written servers, and it might be faster to hold on to those gazillions of sockets than to destroy and create them.
When you run such a server on a machine, no other process can move any more.