I'm reading about long polling, Comet, etc., and, working on .NET, I've read all I could find about SignalR and AspComet. I'm a newbie to Comet and similar techniques, but it is not clear to me what the advantages are of using SignalR or AspComet when I can simply use jQuery's $.ajax with a complete callback:
(function poll() {
    $.ajax({
        url: "server",
        dataType: "json",
        timeout: 30000,              // give up after 30s, then re-poll
        success: function (data) { doSomething(data); },
        complete: poll               // immediately issue the next request
    });
})();
But I am clearly missing something here. Can you help?
Also, from a system/server point of view, what are the main differences? I understand that with $.ajax I open a connection to the server and keep it open for a long time (with all the disadvantages of too many simultaneous open connections, etc.), but I assume SignalR does the same. Or not?
AspComet, meanwhile, says that it releases the thread back to the request pool.
I know, I'm a bit confused, and an intro to the advantages of using SignalR and/or AspComet vs. the $.ajax approach would be greatly appreciated :)
Thanks!
In your $.ajax example you are sending repeated requests to the server, and you are doing this for each client. So your web server gets constantly hammered by HTTP requests, and only a few of them actually serve a purpose, because the data that clients have subscribed to notifications for might not change that often. As soon as one AJAX request completes, another one is sent immediately.
With long polling, by contrast, you send a single request which the server holds open ("blocks") until it has something to notify the client about, at which point it writes to the response.
The advantage of long polling is that you are limiting the number of HTTP requests sent to your server.
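As a rough sketch of the server side of a long poll, here is what the "hold the request" idea looks like in Node.js (the /poll and /publish endpoints and the waiting array are made up for illustration):

var http = require("http");
var waiting = [];   // responses held open until there is news

http.createServer(function (req, res) {
    if (req.url === "/poll") {
        waiting.push(res);               // hold the request instead of answering
    } else if (req.url === "/publish") {
        waiting.forEach(function (r) {   // answer every held request at once
            r.writeHead(200, { "Content-Type": "application/json" });
            r.end(JSON.stringify({ news: "something changed" }));
        });
        waiting = [];
        res.end("ok");
    } else {
        res.writeHead(404);
        res.end();
    }
}).listen(8080);

Each client request just sits in the waiting list, consuming no CPU, until an event on the server releases it.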
Comet applications often require a custom server. IIS keeps a dedicated thread to handle each request, which obviously doesn't scale: there is a limit of a few thousand threads per CPU in IIS.
AspComet solves this problem by providing a server side solution to handle the thread lifetime (like you wrote, it returns the threads back to the pool). AspComet is compatible with the Bayeux Protocol so you can use any Bayeux JS client.
SignalR is a client/server solution that encapsulates the underlying communication protocol behind asynchronous calls. SignalR chooses the best transport available (WebSockets, long polling, or others), so you don't need to worry about it. There are clients for .NET, Silverlight, WP7, JS, etc.
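For comparison with the $.ajax snippet above, this is roughly what the classic ASP.NET SignalR JavaScript client looks like; chatHub, broadcastMessage, send and the arguments are illustrative names for a hub you would define on the server:

// Assumes the generated proxy script (signalr/hubs) is included on the page.
var chat = $.connection.chatHub;

// The server invokes this on every connected client; no polling loop needed.
chat.client.broadcastMessage = function (name, message) {
    doSomething(name, message);
};

$.connection.hub.start().done(function () {
    chat.server.send("me", "hello");     // call a hub method on the server
});

The transport negotiation (WebSockets, long polling, ...) happens inside hub.start(), which is exactly the part you would otherwise have to hand-roll.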
I'm looking for a way to send information from a server to a client, for example a song or an image.
Explanation: I want to send data from my server to the clients who downloaded the HTML5 application.
But I don't know how. I know I can send a request from the client to the server via PHP and answer it afterwards, but how can I send something from the server without the client asking for it?
Thanks.
You may want to try either Server-Sent Events or WebSockets:
http://www.html5rocks.com/en/tutorials/eventsource/basics/
http://www.html5rocks.com/en/tutorials/websockets/basics/
These technologies allow a client web application to remain open to communication from the server at any time. Server-Sent Events are exclusively server-to-client, whilst WebSockets can be used for bi-directional communication.
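For example, the client side of Server-Sent Events is just a few lines; the /updates endpoint here is hypothetical and must respond with Content-Type: text/event-stream:

// "/updates" is a made-up endpoint; the server keeps the connection open.
var source = new EventSource("/updates");

source.onmessage = function (event) {
    console.log("Server said:", event.data);
};

source.onerror = function () {
    // EventSource reconnects automatically after transient errors.
};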
Adding to jokeyrhyme's answer...
You want to asynchronously send data from the server to the client. This means the client doesn't know when to expect the data. In practical terms, on today's Web, you have the following options:
Some form of polling, long polling, Comet, etc.
WebSocket
The first option is better understood since those techniques have been around for a long time.
WebSocket is newer but is the better solution, as it alleviates the problems that plague HTTP-based techniques such as polling and long polling. For a small application, or one that polls infrequently, you can get away with polling, but when you want to scale, those solutions run into problems.
I would not bother with SSE (Server-Sent Events) as that is pretty much a subset of WebSocket. Anyone considering SSE usually ends up just using WebSocket since it's about the same amount of work and it gives you more (e.g. two-way interaction).
However WebSocket doesn't replace HTTP; an application can use both at the same time. Use the right tool for the right job.
In your case, have a client with a WebSocket connection. Then your backend application can notify the client at any time (asynchronously) that there is something to do (e.g. a new song or image is available, as you said in your original post).
I would not bother sending the song or image down the WebSocket connection, although you could. Instead, the client can fetch the song or image using traditional HTTP techniques, which are well understood and good at handling static content. For example, you can take advantage of caching if multiple people are downloading the same (song or image) file.
So, send the id or URL of the song/image to be downloaded via the WebSocket to the client. Then fetch the song/image via HTTP.
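A minimal sketch of that pattern; the endpoint URL and the message format are made up for illustration:

// Hypothetical notification channel: the server pushes only metadata.
var ws = new WebSocket("wss://example.com/notifications");

ws.onmessage = function (event) {
    var msg = JSON.parse(event.data);   // e.g. { "type": "new-song", "url": "/songs/42.mp3" }
    if (msg.type === "new-song") {
        var audio = new Audio(msg.url); // plain HTTP GET: cacheable, well understood
        audio.play();
    }
};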
That's an example of using both HTTP and WebSocket for their strengths. WebSocket for the efficient asynchronous interaction with virtually no bandwidth consumption, and HTTP for efficient fetching of static resources.
What is the difference between a simple Async servlet and the Comet / Bayeux protocol?
I am trying to implement a "Server Push" (or "Reverse Ajax") kind of webpage that will receive updates from the server as and when events occur on the server. So even without the client explicitly sending a request, I need the server to be able to send responses to the specific client browser.
I understand that Comet is the umbrella term for these kinds of technologies, with 'Bayeux' being the protocol. But when I looked through the servlet spec, even the 'Async servlet' seems to accomplish the same thing. I mean, I can define a simple servlet with the async-supported attribute set to true in the web.xml, and that servlet will be able to asynchronously send responses to the client. I can then have a jQuery or ExtJS based ajax client that just keeps making a long_polling() call into the servlet. Something like what is described in the link below:
http://www.ibm.com/developerworks/web/library/wa-reverseajax1/index.html#long
So my question is this:
What is the difference between a simple Async servlet and the Comet / Bayeux protocol?
Thanks
It is true that "Comet" is the term for these technologies, but the Bayeux protocol is used by only a few implementations. A Comet technique can use any protocol it wants; Bayeux is one of them.
Having said that, there are two main differences between an async servlet solution and a Comet+Bayeux solution.
The first difference is that the Comet+Bayeux solution is independent of the protocol that transports Bayeux.
In the CometD project, there are pluggable transports for both clients and servers that can carry Bayeux.
You can carry it using HTTP, with Bayeux being the content of a POST request, but you can also carry it using WebSocket, with Bayeux being the payload of the WebSocket message.
If you use async servlets, you cannot leverage WebSocket, which is way more efficient than HTTP.
The second difference is that async servlets only carry HTTP, and you need more than that to handle remote Comet clients.
For example, you may want to identify clients uniquely, so that 2 tabs for the same page result in 2 different clients. To do this, you need to add a "property" to the async servlet request; let's call it sessionId.
Next, you want to be able to authenticate a client; only authenticated clients can get a sessionId. But to differentiate between the first request, which authenticates, and subsequent, already-authenticated requests, you need another property, say messageType.
Next, you want to be notified quickly of disconnections due to network loss or other connectivity problems, so you need to come up with a heart-beat solution: if the heart beats you know the connection is alive, if it does not you know it's dead, and you must perform rescue actions.
Next you need disconnect features. And so on.
Quickly you realize that you're building another protocol on top of HTTP.
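For a flavor of where this ends up, here is roughly what those hand-rolled messages converge to, and what Bayeux already standardizes (the field names follow the Bayeux specification; the clientId value is made up):

// Your "messageType" becomes a Bayeux channel, your "sessionId" a clientId.
var handshake = {
    channel: "/meta/handshake",
    version: "1.0",
    supportedConnectionTypes: ["long-polling", "websocket"]
};

// The held "connect" request doubles as the heart-beat.
var connect = {
    channel: "/meta/connect",
    clientId: "abc123",                  // assigned by the server at handshake
    connectionType: "long-polling"
};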
At that point, it's better to reuse an existing protocol like Bayeux, and proven solutions like CometD (which is based on Comet techniques using the Bayeux protocol), which gives you:
Java and JavaScript client libraries with simple yet powerful APIs
Java server library to write your application logic via annotated services, without the need to handle low-level details such as HTTP or WebSocket
Transport pluggability, both client and server
Bayeux protocol extensibility
Lazy messages
Clustering
Top performance
Future proof: users of CometD before the advent of WebSocket did not change a line of code to take advantage of WebSocket - all the magic was implemented in the libraries
Based on standards
Designed and maintained by web protocols experts
Extended documentation
I can continue, but you get the point :)
You don't want to use a low-level solution that ties you to HTTP only. You want to use a higher-level solution that abstracts your application from the Comet technique used and from the protocol that transports Bayeux, so that your application can be written once and leverage future technology improvements. As an example of such an improvement: CometD was working well way before async servlets came into the picture, and with async servlets it simply became more scalable, and so did your application, without the need to change a single line of code.
By using a higher-level solution you can concentrate on your application rather than on the gory details of how to write an async servlet correctly (and it's not as easy as one may think).
The answer to your question could be: you use Comet+Bayeux because you want to stand on the shoulders of giants.
I am developing a web application that is connected to a server, and I need the server to push some information to the clients at given times.
Therefore I started to read about Server-Sent Events (SSE), because the website is being developed in HTML5 and SSE seemed to fit what I was looking for. But what a surprise when I read that what SSE really does is send requests FROM the client to the server, instead of the opposite way (yesterday I think I understood that long polling is a sort of push emulation). So I started to read about WebSockets (but it seemed that the standard was still a draft) and also had a look at Comet. I just can't fit all the pieces together in my mind.
Could someone give an overview of these technologies (and maybe other push techniques) that fit my problem, and explain which situations each one is most appropriate for?
Thanks so much, I think I am totally lost in this field.
This post gives a better explanation, discussing the differences/advantages/etc. of Long Polling, Comet, SSE and WebSockets.
For the most part, the client usually has to make the first request to the server to establish a connection. Once a connection is established, then the server can push data to the client. So even with WebSockets, the client will make the initial request to the server for establishing a reliable connection between the two.
Server-Sent Events uses a normal HTTP GET request to establish a connection with the server, and the connection is read-only for the client. It has the benefit of an easy implementation, since it doesn't define a new protocol. The issue is that HTTP connections, even persistent ones, are closed after around 15 seconds by most web servers; even for long-standing requests, the web server often has a timeout after which it closes the connection.
This is where the idea of long polling comes in. It's a simple idea: you make a normal ajax request to the server, and the server leaves it open for as long as possible. If the request is closed by the server for whatever reason, you immediately make the same request again. You can implement a long polling mechanism (i.e. Comet) pretty easily with a server such as Node.js and a normal Ajax request from the browser.
Server-Sent Events abstracts away the browser-side implementation of this with EventSource. So instead of you having to implement the client-side code for long polling/Comet, the browser handles it for you and provides a nice abstraction of what looks like a persistent connection. Your web server just needs to look out for GET requests that accept the "text/event-stream" content type, respond with that Content-Type, and leave the HTTP connection open as long as possible.
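A minimal sketch of such a server in Node.js, with no framework; the /events path and the 3-second timer are made up for illustration:

var http = require("http");

http.createServer(function (req, res) {
    if (req.url === "/events") {
        res.writeHead(200, {
            "Content-Type": "text/event-stream",   // what EventSource expects
            "Cache-Control": "no-cache",
            "Connection": "keep-alive"
        });
        // Push a message every few seconds without ever closing the response.
        var timer = setInterval(function () {
            res.write("data: " + JSON.stringify({ time: Date.now() }) + "\n\n");
        }, 3000);
        req.on("close", function () { clearInterval(timer); });
    } else {
        res.writeHead(404);
        res.end();
    }
}).listen(8080);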
I would suggest that you don't overcomplicate what Server-Sent Events are. If you understand a normal HTTP GET request, then you likely already have a 90% understanding of the idea behind them.
There is one difference between SSE/Comet and traditional long polling that might be worth highlighting. From my experience, the idea behind long polling is that your request doesn't return until you have an update, at which point the HTTP connection is closed and another request is made immediately afterwards. With SSE, though you could close the HTTP connection right after sending an updated message, the aim is to flush the data from the server to the client without actually closing/ending the HTTP request. This avoids the overhead of making a new GET request for every update. It can be achieved with a regular ajax request too, but again, SSE provides a nice/efficient implementation with EventSource.
Edit: clarify distinction between SSE and long polling.
Many Comet implementations, like Caplin, provide scalable server solutions.
The following is one of the statistics from the Caplin site:
A single instance of Caplin liberator can support up to 100,000 clients each receiving 1 message per second with an average latency of less than 7ms.
How does this compare to HTML5 WebSockets on any web server? Can anyone point me to any HTML5 WebSockets statistics?
Disclosure - I work for Caplin.
There is a bit of misinformation on this page, so I'd like to try to make it clearer.
I think we can split the methods we are talking about into three camps:
Comet HTTP polling - including long polling
Comet HTTP streaming - server to client messages use a single persistent socket with no HTTP header overhead after initial setup
Comet WebSocket - single bidirectional socket
I see them all as Comet, since Comet is just a paradigm, but since WebSocket came along some people want to treat it as if it were different, or as a replacement for Comet. It is just another technique, and unless you are happy supporting only the latest browsers, you can't rely on WebSocket alone.
As far as performance is concerned, most benchmarks concentrate on server-to-client messages: number of users, number of messages per second, and the latency of those messages. For this scenario there is no fundamental difference between HTTP streaming and WebSocket; both write messages down an open socket with little or no header overhead.
Long polling can give good latency if the frequency of messages is low. However, if you have two messages (server to client) in quick succession then the second one will not arrive at the client until a new request is made after the first message is received.
I think someone touched on HTTP KeepAlive. This can obviously improve Long polling - you still have the overhead of the roundtrip and headers, but not always the socket creation.
Where WebSocket should improve upon HTTP streaming is in scenarios where there are more client-to-server messages. Relating these scenarios to the real world creates slightly more arbitrary setups, compared to the simple-to-understand "send lots of messages to lots of clients" scenario that everyone can grasp. For example, in a trading application it is easy to create a scenario where you include users executing trades (i.e. client-to-server messages), but the results are a bit less meaningful than the basic server-to-client scenarios. Traders are not trying to do 100 trades/sec, so you end up with results like "10,000 users receiving 100 messages/sec while also sending a client message once every 5 minutes". The more interesting part for the client-to-server messages is the latency, since the number of messages required is usually insignificant compared to the server-to-client messages.
Another point someone made above was about 64k clients. You do not need to do anything clever to support more than 64k sockets on a server, other than configuring the number of file descriptors, etc. If you were trying to make 64k connections from a single client machine, that would be totally different, as each connection needs its own port number; on the server end it is fine though, since that is the listen end, and you can go well above 64k sockets.
In theory, WebSockets can scale much better than HTTP but there are some caveats and some ways to address those caveats too.
The complexity of the handshake header processing for HTTP vs. WebSockets is about the same, and the HTTP (and initial WebSocket) handshake can easily be over 1K of data (due to cookies, etc.). The important difference is that in HTTP that header overhead is repeated for every message, whereas once a WebSocket connection is established, the overhead per message is only 2-14 bytes.
The excellent Jetty benchmark links posted in @David Titarenco's answer (1, 2) show that WebSockets can easily achieve more than an order of magnitude better latency when compared to Comet.
See this answer for more information on scaling of WebSockets vs HTTP.
Caveats:
WebSocket connections are long-lived unlike HTTP connections which are short-lived. This significantly reduces the overhead (no socket creation and management for every request/response), but it does mean that to scale a server above 64k separate simultaneous client hosts you will need to use tricks like multiple IP addresses on the same server.
Due to security concerns with web intermediaries, browser to server WebSocket messages have all payload data XOR masked. This adds some CPU utilization to the server to decode the messages. However, XOR is one of the most efficient operations in most CPU architectures and there is often hardware assist available. Server to browser messages are not masked and since many uses of WebSockets don't require large amounts of data sent from browser to server, this isn't a big issue.
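As an illustration, the unmasking step the server has to perform is just a byte-wise XOR, sketched here in Node.js (the function and variable names are made up):

// RFC 6455 client-to-server masking: each payload byte is XORed with
// one of the 4 mask bytes, cycling through them.
function unmask(payload, mask) {         // payload: Buffer, mask: 4-byte Buffer
    var out = Buffer.alloc(payload.length);
    for (var i = 0; i < payload.length; i++) {
        out[i] = payload[i] ^ mask[i % 4];
    }
    return out;
}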
It's hard to know how that compares to anything, because we don't know how big the (average) payload size is. Under the hood (as in how the server is implemented), HTTP streaming and WebSockets are virtually identical, apart from the initial handshake, which is obviously more complicated when done over HTTP.
If you wrote your own WebSocket server in C (a la Caplin), you could probably reach those numbers without too much difficulty. Most WebSocket implementations are done through existing server packages (like Jetty), so the comparison wouldn't really be fair.
Some benchmarks:
http://webtide.intalio.com/2011/09/cometd-2-4-0-websocket-benchmarks/
http://webtide.intalio.com/2011/08/prelim-cometd-websocket-benchmarks/
However, if you look at C event lib benchmarks, like libev and libevent, the numbers look significantly sexier:
http://libev.schmorp.de/bench.html
Ignoring any form of polling (which, as explained elsewhere, can introduce latency when the update rate is high), the three most common techniques for JavaScript streaming are:
WebSocket
Comet XHR/XDR streaming
Comet Forever IFrame
WebSocket is by far the cleanest solution, but there are still issues in terms of browser and network infrastructure not supporting it. The sooner it can be relied upon the better.
XHR/XDR & Forever IFrame are both fine for pushing data to clients from the server, but require various hacks to work consistently across all browsers. In my experience these Comet approaches are always slightly slower than WebSockets, not least because there is a lot more client-side JavaScript code required to make them work; from the server's perspective, however, sending data over the wire happens at the same speed.
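For reference, the core of Comet XHR streaming looks roughly like this; the /stream endpoint and handleChunk function are made up, and older IE needs the XDR/IFrame variants instead:

var xhr = new XMLHttpRequest();
var seen = 0;                            // how much of the response we've consumed

xhr.open("GET", "/stream", true);
xhr.onreadystatechange = function () {
    // readyState 3 fires repeatedly as data arrives, before the request ends.
    if (xhr.readyState >= 3 && xhr.responseText.length > seen) {
        var chunk = xhr.responseText.substring(seen);  // only the new bytes
        seen = xhr.responseText.length;
        handleChunk(chunk);              // application-defined handler
    }
};
xhr.send();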
Here are some more WebSocket benchmark graphs, this time for our product my-Channels Nirvana.
Skip past the multicast and binary data graphs down to the last graph on the page (JavaScript High Update Rate)
In summary: the results show Nirvana WebSocket delivering 50 events/sec to 2,500k users with 800 microsecond latency. At 5,000 users (a total of 250k events/sec streamed) the latency is 2 milliseconds.
I am curious if anyone has any information about the scalability of HTML WebSockets. From everything I've read, it appears that every client will maintain an open line of communication with the server. I'm just wondering how that scales and how many open WebSocket connections a server can handle. Maybe leaving those connections open isn't a problem in reality, but it feels like it is.
In most ways WebSockets will probably scale better than AJAX/HTML requests. However, that doesn't mean WebSockets is a replacement for all uses of AJAX/HTML.
Each TCP connection in itself consumes very little in terms of server resources. Setting up the connection can often be expensive, but maintaining an idle connection is almost free. The first limitation that is usually encountered is the maximum number of file descriptors (sockets consume file descriptors) that can be open simultaneously. This often defaults to 1024 but can easily be configured higher.
Ever tried configuring a web server to support tens of thousands of simultaneous AJAX clients? Change those clients into WebSockets clients and it just might be feasible.
HTTP connections, while they don't create open files or consume port numbers for a long period, are more expensive in just about every other way:
Each HTTP connection carries a lot of baggage that isn't used most of the time: cookies, content type, content length, user-agent, server id, date, last-modified, etc. Once a WebSockets connection is established, only the data required by the application needs to be sent back and forth.
Typically, HTTP servers are configured to log the start and completion of every HTTP request, taking up disk and CPU time. It will become standard to log the start and completion of WebSockets data, but while the WebSockets connection is doing duplex transfer there won't be any additional logging overhead (except by the application/service if it is designed to do so).
Typically, interactive applications that use AJAX either continuously poll or use some sort of long-poll mechanism. WebSockets is a much cleaner (and lower resource) way of doing a more event'd model where the server and client notify each other when they have something to report over the existing connection.
Most of the popular web servers in production have a pool of processes (or threads) for handling HTTP requests. As pressure increases the size of the pool will be increased because each process/thread handles one HTTP request at a time. Each additional process/thread uses more memory and creating new processes/threads is quite a bit more expensive than creating new socket connections (which those process/threads still have to do). Most of the popular WebSockets server frameworks are going the event'd route which tends to scale and perform better.
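As a sketch of that event-driven route, here is a minimal WebSocket broadcast server using the popular Node.js ws package (the echo-to-everyone logic is just for illustration):

var WebSocket = require("ws");
var wss = new WebSocket.Server({ port: 8080 });

// One event-driven process handles every connection; no thread per client.
wss.on("connection", function (socket) {
    socket.on("message", function (data) {
        // Notify all connected clients over their existing connections.
        wss.clients.forEach(function (client) {
            if (client.readyState === WebSocket.OPEN) {
                client.send(data);
            }
        });
    });
});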
The primary benefit of WebSockets will be lower latency connections for interactive web applications. It will scale better and consume less server resources than HTTP AJAX/long-poll (assuming the application/server is designed properly), but IMO lower latency is the primary benefit of WebSockets because it will enable new classes of web applications that are not possible with the current overhead and latency of AJAX/long-poll.
Once the WebSockets standard becomes more finalized and has broader support, it will make sense to use it for most new interactive web applications that need to communicate frequently with the server. For existing interactive web applications it will really depend on how well the current AJAX/long-poll model is working. The effort to convert will be non-trivial so in many cases the cost just won't be worth the benefit.
Update:
Useful link: 600k concurrent websocket connections on AWS using Node.js
Just a clarification: the number of client connections that a server can support has nothing to do with ports in this scenario, since the server is [typically] only listening for WS/WSS connections on one single port. I think what the other commenters meant to refer to were file descriptors. You can set the maximum number of file descriptors quite high, but then you have to watch out for socket buffer sizes adding up for each open TCP/IP socket. Here's some additional info: https://serverfault.com/questions/48717/practical-maximum-open-file-descriptors-ulimit-n-for-a-high-volume-system
As for decreased latency via WS vs. HTTP, it's true since there's no more parsing of HTTP headers beyond the initial WS handshake. Plus, as more and more packets are successfully sent, the TCP congestion window widens, effectively reducing the RTT.
Any modern single server is able to serve thousands of clients at once; its HTTP server software just has to be event-driven (IOCP) oriented (we are no longer in the old Apache "one connection = one thread/process" equation). Even the HTTP server built into Windows (http.sys) is IOCP-oriented and very efficient (it runs in kernel mode). From this point of view, there won't be a lot of difference at scale between WebSockets and regular HTTP connections. One TCP/IP connection uses few resources (much less than a thread), and modern OSes are optimized for handling a lot of concurrent connections: WebSockets and HTTP are just OSI layer 7 application protocols, inheriting from the TCP/IP specification.
But, from experience, I've seen two main problems with WebSockets:
They are not supported by CDNs;
They have potential security issues.
So I would recommend the following, for any project:
Use WebSockets for client notifications only (with a fallback mechanism to long-polling - there are plenty of libraries around);
Use RESTful / JSON for all other data, using a CDN or proxies for cache.
In practice, full WebSockets applications do not scale well. Just use WebSockets for what they were designed to: push notifications from the server to the client.
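A sketch of that split, with a long-polling fallback; all the endpoint URLs and the renderItem function are hypothetical:

function subscribe(onNotify) {
    if (window.WebSocket) {
        var ws = new WebSocket("wss://example.com/notify");
        ws.onmessage = function (e) { onNotify(JSON.parse(e.data)); };
    } else {
        (function poll() {               // long-polling fallback
            $.ajax({ url: "/notify", dataType: "json", timeout: 30000,
                     success: onNotify, complete: poll });
        })();
    }
}

subscribe(function (note) {
    // Fetch the actual data over plain HTTP/JSON so a CDN or proxy can cache it.
    $.getJSON("/api/items/" + note.id, renderItem);
});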
About the potential problems of using WebSockets:
1. Consider using a CDN
Today (almost 4 years later), web scaling involves using Content Delivery Network (CDN) front ends, not only for static content (HTML, CSS, JS) but also for your (JSON) application data.
Of course, you won't put all your data in your CDN cache, but in practice a lot of common content won't change often. I suspect that 80% of your REST resources may be cacheable... Even a one-minute (or 30-second) CDN expiration timeout may be enough to give your central server a new life and enhance the application's responsiveness a lot, since CDNs can be geographically tuned...
To my knowledge, there is no WebSockets support in CDNs yet, and I suspect there never will be. WebSockets are stateful, whereas HTTP is stateless and therefore much more easily cached. In fact, to make WebSockets CDN-friendly, you may need to switch to a stateless RESTful approach... which would not be WebSockets any more.
2. Security issues
WebSockets have potential security issues, especially regarding DoS attacks. For an illustration of the new security vulnerabilities, see this set of slides and this WebKit ticket.
WebSockets avoid any chance of packet inspection at the OSI layer 7 application level, which has become pretty standard nowadays in corporate security. In effect, WebSockets obfuscates the transmission, so it may be a major security hole.
Think of it this way: which is cheaper, keeping an open connection, or opening a new connection for every request (with the negotiation overhead of doing so; remember, it's TCP)?
Of course it depends on the application, but for long-term realtime connections (e.g. an AJAX chat) it's far better to keep the connection open.
The max number of connections will be capped by the max number of free ports for the sockets.
No, it does not scale; it gives tremendous work to the intermediate routers and switches. Then, on the server side, the page faults (you have to keep all those descriptors) reach high values, and the time to bring a resource into the working area increases. These are mostly Java-written servers, and it might be faster to hold on to those gazillions of sockets than to destroy/create one.
When you run such a server on a machine, no other process can move any more.