WebRTC signaling uses STUN to exchange ICE candidates and SDPs with peers, but how does it know who to exchange that information with?
Obviously, it's not with just anyone, and merely using the same STUN server doesn't mean you'll get paired with some rando. However, I'd like to know how that selection is made. Probably the URI plays a role, but how big of a role? And is it possible to influence that?
It would seem like a problem if it were URI alone. Then someone could just bombard STUN servers with offers or requests to know who's currently on that webpage with WebRTC active.
WebRTC doesn't use STUN to exchange ICE candidates and Session Descriptions. STUN is used to create a NAT mapping and to get information about it. You can establish a WebRTC session without using STUN at all.
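To make the "create a NAT mapping and get information about it" part concrete, here is a rough sketch of what a STUN Binding exchange looks like on the wire, following the message layout in RFC 5389. The function names are made up for illustration, only IPv4 is handled, and nothing here is sent over the network. The point is that a STUN message carries a transaction ID and a reflected address, and no SDP or peer identity at all.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Return a 20-byte STUN Binding Request with no attributes."""
    tid = transaction_id if transaction_id is not None else os.urandom(12)
    if len(tid) != 12:
        raise ValueError("transaction id must be 12 bytes")
    # Header: message type, message length (0, no attributes), magic cookie.
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_mapped_address(response: bytes) -> Optional[Tuple[str, int]]:
    """Extract the IPv4 XOR-MAPPED-ADDRESS from a Binding Success Response."""
    if len(response) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + msg_len
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            # Port and address are XOR-ed with the magic cookie on the wire.
            port = xport ^ (MAGIC_COOKIE >> 16)
            address = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return address, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are 32-bit aligned
    return None
```

The address returned by `parse_mapped_address` is what ends up in a server-reflexive ICE candidate. Getting that candidate to the other peer is still the job of whatever signaling channel the application uses.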
A session is established because two WebRTC peers exchange Session Descriptions (Offers and Answers). These offers and answers can be exchanged with any protocol you like. Most commonly you see HTTP/Websockets used. This is known as a signaling server.
It is the signaling server's job to make sure that the Offer/Answer is routed properly. You can read more about what the values actually do in WebRTC for the Curious#Signaling.
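As a rough illustration of "routed properly": the pairing is decided entirely by application logic on the signaling server, for example a room ID derived from the page URL. The sketch below is a minimal in-memory router, not any particular product's implementation. The class name, the room concept, and the `deliver` callbacks (which would wrap a WebSocket send in a real server) are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict

Deliver = Callable[[dict], None]


class SignalingRouter:
    """Relays opaque signaling messages between peers that joined the same room."""

    def __init__(self) -> None:
        self._rooms: Dict[str, Dict[str, Deliver]] = defaultdict(dict)

    def join(self, room: str, peer_id: str, deliver: Deliver) -> None:
        # A real server would authenticate the peer before letting it join.
        self._rooms[room][peer_id] = deliver

    def leave(self, room: str, peer_id: str) -> None:
        self._rooms.get(room, {}).pop(peer_id, None)

    def relay(self, room: str, sender: str, message: dict) -> int:
        """Forward a message to every other peer in the room; return the recipient count."""
        peers = self._rooms.get(room, {})
        if sender not in peers:
            raise PermissionError(f"{sender} has not joined {room}")
        count = 0
        for peer_id, deliver in peers.items():
            if peer_id != sender:
                # The server stamps the sender, so a peer cannot spoof "from".
                deliver({**message, "from": sender})
                count += 1
        return count
```

Because a peer can only relay into a room it has joined, someone who merely knows the STUN server address learns nothing about who is on the page. Whether they can join a room is up to the access checks the signaling server applies.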
Related
Is there a way to stream a video and audio on a website just to the clients, using a camera installed on the server - for instance, like youtube does ?
I've started reading about WebRTC, but if I use WebRTC I would have to set up a STUN/TURN server and other things, which I think is not necessary for a one-way stream (this is just my understanding of things), because I don't need anything from the clients: neither their video nor their audio.
So is there a way to achieve this using html5, streaming just in one direction:
server (camera) -> clients
Is there something about this out there, or should I stick with webrtc ?
I'm going to explain a possible solution for this scenario, there might be others, but I hope mine gives you a rough idea of how you could do it and a start point to explore more about the amazing possibilities of WebRTC. Please let me know if something is not understood.
So, WebRTC is a free, open project that provides browsers and mobile applications with Real-Time Communications (RTC) capabilities via simple APIs. Sweet, that is: WebRTC has quite good browser support (not in every browser though, Safari just started supporting it a month ago with Safari 11). But in this case we want to use WebRTC on the server side. At the end of the day we can still think about peer-to-peer real-time communication, where one of our peers is the server.
I don't know if you are familiar with Node.js, but I recommend you to write your Server app with it (<3 Javascript!):
There are a few libraries that wrap WebRTC functionality to be used on the server side, like node-webrtc and node-rtc-peer-connection.
But I recommend you take a look at electron-webrtc, since
the others might be using deprecated methods or be incomplete.
electron-webrtc runs a headless Electron client in the background to
use Chromium's built-in WebRTC implementation. So with it you should
be able to access the Camera in your server and create a stream to be
served to the other peer (the browser).
All above would be the WebRTC related tasks, in this case: streaming video peer(server)-to-peer(browser).
Now, let's talk about signaling process, stun and turn.
Signaling: imagine now a scenario peer-to-peer with 2 browsers, they want to establish a direct connection and stream video and audio between each other. But they don't know each other, like if I don't know your home address, I can't send you a letter. So they need a service that helps them know each other, so they can have the other's IP. This should be done by what is called "a signaling server". If somehow you know the other peer IP, you wouldn't need a signaling server.
STUN/TURN: the scheme above works perfectly in a local area network where each peer has its own IP address and there are no firewalls or routers between them. Otherwise, you can have peers behind a NAT or firewalls, and then the signaling server alone won't be enough for the peers to reach each other. If you have peers behind a NAT, you'll need a STUN server, and if you have peers behind firewalls you'll need a TURN server. This is a bit simplified, but I just want you to have the general picture of when you might need STUN/TURN servers.
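One way to see which of these servers actually contributed to a connection is to look at the ICE candidates each peer gathers. The sketch below parses a candidate line and reports its type. The candidate types (`host`, `srflx`, `prflx`, `relay`) come from the ICE specification, while the function name, the tuple layout, and the sample lines in the usage note are made up for the example.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    transport: str
    address: str
    port: int
    kind: str


# Where each candidate type comes from.
DISCOVERED_VIA = {
    "host": "a local network interface (no server needed)",
    "srflx": "a STUN server (public side of the NAT mapping)",
    "prflx": "connectivity checks with the remote peer",
    "relay": "a TURN server (media is relayed through it)",
}


def parse_candidate(line: str) -> Candidate:
    """Parse 'candidate:<foundation> <component> <transport> <priority> <addr> <port> typ <type> ...'."""
    text = line.strip()
    if text.startswith("a="):          # as it appears inside an SDP
        text = text[2:]
    fields = text.split()
    if len(fields) < 8 or not fields[0].startswith("candidate:") or fields[6] != "typ":
        raise ValueError(f"not an ICE candidate line: {line!r}")
    return Candidate(fields[2].lower(), fields[4], int(fields[5]), fields[7])
```

For instance, `parse_candidate("candidate:1 1 udp 2122260223 192.168.1.2 50000 typ host").kind` is `"host"`. If only `host` candidates are needed for a connection to succeed, as on a plain LAN, neither STUN nor TURN was involved.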
To better understand Signaling, STUN and TURN, there is a very graphic article that explains them perfectly.
Now, for your scenario:
I think you prob don't need STUN/TURN servers and also you prob don't need to implement the signaling process, because the browsers that are supposed to receive the stream from the server will know that server address, right? So they can establish a WebRTC connection with it.
EDIT: it is likely that you will need to implement some sort of handshake between the server and the clients (browsers), so this will be the signaling process. This is not part of WebRTC and this is why you need to implement it yourself. As I said, it is the way 2 peers can discover each other, but they also exchange information such as their local media conditions, like codecs, resolutions they can handle, etc. For your case, your signaling server could be hosted on the same server you stream from: you can build a small Node.js app that runs there and manages the whole signaling process easily, it is not a big deal. I recommend you read this article, and especially the section "How can I build a signaling service?". In general all WebRTC articles from that site are very helpful.
Does this make sense to you? I think with it you can start digging a little bit more and see if with this is enough or you need to implement more stuff. Hope it helps!
In the absence of a signalling server for coordinating the initial exchange, does WebRTC provide any way to allow the responder to send information freely to the caller, if the responder has only received an offer and has no other methods of communication with the caller?
(There's no signalling server because the web app must be useable offline. Any method to establish a connection with only one exchange of information would also be useful.)
Sorry, it's a long and weird question.
I guess by offline you mean that you have two parties that will connect through a network not connected to the internet.
Signaling is just a way to transmit information between the two parties. For the sake of example it can even be manual copy and paste. Even one of the parties can play the role of a server if the other one has a way of connecting to it (doable within the same network).
Without some kind of signaling mechanism, a WebRTC connection is not possible. And signaling is not part of the WebRTC specification, nor of any implementation.
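As a sketch of the manual copy-and-paste option: a session description is just a small piece of text, so it can be packed into a single string that a person pastes into a chat message, an e-mail, or a QR code. The helper names below are hypothetical, and the `{"type", "sdp"}` dictionary shape mirrors what browsers expose as a session description. Note that this only changes the transport: the offer still has to travel one way and the answer back the other way, and with no channel for trickling candidates, each side would have to wait for ICE gathering to finish so that its candidates are already inside the SDP.

```python
import base64
import json
import zlib


def encode_description(description: dict) -> str:
    """Pack an offer/answer into one URL-safe string suitable for copy and paste."""
    raw = json.dumps(description, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(zlib.compress(raw)).decode("ascii")


def decode_description(blob: str) -> dict:
    """Reverse encode_description and check that the result looks like a description."""
    raw = zlib.decompress(base64.urlsafe_b64decode(blob.strip().encode("ascii")))
    description = json.loads(raw)
    if (
        not isinstance(description, dict)
        or description.get("type") not in ("offer", "answer")
        or "sdp" not in description
    ):
        raise ValueError("not a session description")
    return description
```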
WebRTC needs a signalling system to establish a peer-to-peer connection. The thing to notice is why it needs signalling.
While establishing a peer connection, the two parties exchange SDP, which contains information such as the IP address and port at each end where the media/data packets will be exchanged. It also contains the codecs to be used for encoding/decoding, plus many other useful things. Without this exchange between both parties, no communication is possible.
That is why, at least in the case of WebRTC, a peer connection cannot be established unless both sides communicate.
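To show what that information looks like, here is a minimal sketch that pulls the connection address, media port, and codec list out of an SDP. The sample SDP is hand-written and heavily shortened, and the function name and output layout are invented for the example. In SDPs produced by current browsers the `c=` line is often a placeholder such as 0.0.0.0, with the usable addresses carried in `a=candidate` lines instead.

```python
SAMPLE_SDP = """v=0
o=- 46117317 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 49170 UDP/TLS/RTP/SAVPF 111 0
c=IN IP4 203.0.113.5
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
"""


def summarize_sdp(sdp: str) -> dict:
    """Collect address, port, and codecs for each media section of an SDP."""
    summary = {"address": None, "media": []}
    current = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("c="):
            # c=IN IP4 203.0.113.5 (session level if it appears before any m= line)
            address = line.split()[-1]
            if current is None:
                summary["address"] = address
            else:
                current["address"] = address
        elif line.startswith("m="):
            # m=<kind> <port> <protocol> <payload types...>
            kind, port, protocol, *_formats = line[2:].split()
            current = {
                "kind": kind,
                "port": int(port),
                "protocol": protocol,
                "address": summary["address"],
                "codecs": {},
            }
            summary["media"].append(current)
        elif line.startswith("a=rtpmap:") and current is not None:
            # a=rtpmap:<payload type> <encoding>/<clock rate>[/<channels>]
            payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            current["codecs"][int(payload_type)] = encoding
    return summary
```

Calling `summarize_sdp(SAMPLE_SDP)` gives one audio section on port 49170 at 203.0.113.5 that offers Opus and PCMU. The answerer needs this information before it can pick a codec and reply.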
I am trying to test the SIP capabilities of firewalls using WebRTC. However, I noticed that using the servers needed for WebRTC (STUN, TURN, WebSockets, etc.) will give me a false positive, in that it won’t catch nuanced issues with the ALGs. For reference, this is being done from a Chrome app, so I can’t just run a native SIP stack in the browser.
My question: can I leverage WebRTC to just send SIP (INVITE, OPTIONS, REGISTER) and not use any other methods that would get around the firewall?
Your question doesn't make sense because WebRTC doesn't use SIP: SIP is a signaling protocol, and WebRTC doesn't do signaling. What that means is that SIP can be used to establish a WebRTC connection, but the two are separate things and neither is part of the other.
SIP is sent over a data connection, like a hard line from a phone to a PBX or a websocket from a browser to a server.
It is possible to set up a WebRTC connection using out of band mechanisms, but then that wouldn't be SIP.
Actually there might be a way around that.
Use the signalling server to do any sort of preconfiguration you might want before setting up the peer connection. This would allow you to specify the codecs and resolution of the feed as a SessionDescription beforehand, or even check whether the other peer is capable of WebRTC.
I'd recommend Socket.io =D
I'm tracing packets between 2 agents. One is Chrome on Mac, the other is Chrome Beta on Android. They're communicating through a reference site like apprtc.appspot.com and I managed to save some logs from it. (Please download the file, otherwise it is only displayed as source code.) I also captured packets in Wireshark while the 2 agents were communicating over WebRTC.
Using the filter stun||udp, lots of Binding requests and responses can be found.
The RFC says:
An agent can respond to an initial offer at any point while gathering candidates...
thus allowing the remote party to also start forming checklists and performing
connectivity checks.
But I just can't see any sign of SDP, such as an offer or answer being sent between them, even though both appear in the JS log above. For cross-reference, I hope to find the right order of the entire communication.
Here's the Wireshark file (it's kind of big).
Chrome uses TLS to encrypt the signaling packets. If the communication is directly between the peers, the only way to see the signaling is to look at Chrome's console logs, which should contain the offer/answer exchange of the SDP. I am assuming it's using SIP as the signaling protocol, so you should be seeing it in the console.
If there is an intermediary between the peers, like FreeSWITCH or any other SIP server, it could be easier to debug, because the intermediary has the keys to decode the traffic and show the raw text messages.
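For the capture itself, here is a rough sketch of how the UDP packets on the peer-to-peer flow can be told apart, based on the first-byte ranges described in RFC 7983. The function name and the returned labels are invented for the example. The SDP is not expected among any of these packet types, since it travels on the separate, TLS-protected signaling connection.

```python
STUN_MAGIC_COOKIE = bytes.fromhex("2112a442")


def classify_udp_payload(payload: bytes) -> str:
    """Guess what kind of packet a WebRTC UDP payload is from its first byte."""
    if not payload:
        return "empty"
    first = payload[0]
    if first <= 3:
        # STUN additionally carries a fixed magic cookie in bytes 4..7.
        if len(payload) >= 20 and payload[4:8] == STUN_MAGIC_COOKIE:
            return "stun"
        return "unknown"
    if 20 <= first <= 63:
        return "dtls"            # key exchange for SRTP and data channels
    if 64 <= first <= 79:
        return "turn-channel"    # TURN ChannelData framing
    if 128 <= first <= 191:
        return "rtp/rtcp"        # encrypted media (SRTP/SRTCP)
    return "unknown"
```

In a capture like the one described, the Binding requests and responses are the ICE connectivity checks. They would typically be followed by a DTLS handshake and then by SRTP media on the same port pair.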
We're playing around with WebRTC and trying to understand its benefits.
One reason Skype can serve hundreds of millions of people is because of its decentralized, peer-to-peer architecture, which keeps server costs down.
Does WebRTC allow people to build a video chat application similar to Skype in that the architecture can be decentralized (i.e., video streams are not routed from a broadcaster through a central server to listeners but rather routed directly from broadcaster to listener)?
Or, put another way, does WebRTC allow someone to essentially replicate the benefits of a P2P architecture similar to Skype's?
Or do you still need something similar to Skype's P2P architecture?
Yes, that's basically what WebRTC does. Calls using the RTCPeerConnection API don't send voice/video data through a centralized server, but rather use firewall traversal protocols like ICE, STUN and TURN to allow a direct, peer-to-peer connection. However, the initial call setup still requires a server (most likely something running a WebSocket implementation, but it could be anything that you can figure out how to get JavaScript to talk to), so that the two clients can figure out that they're both online, signal that they want to connect, and then figure out how to do it (this is where the ICE/STUN/TURN bit comes in).
However, there's more to Skype's P2P architecture than just passing voice/video data back and forth. The majority of Skype's IP isn't in the codecs or protocols (much of which they licensed from Global IP Solutions, which Google purchased two years ago and then open-sourced, and which forms the basis of Chrome's WebRTC implementation). Skype's real IP is all in the piece of WebRTC which still depends on a server: figuring out which people are online, and where they are, and how to get a hold of them, and doing that in a massively decentralized fashion. (See here for some rough details.) I think that you could probably use the data channel portion of the RTCPeerConnection API to do that sort of thing, if you were really, really smart, but it would be complicated, and would most likely stomp on a few Skype patents. Unless you want to be really, really huge, you'd probably just want to run your own centralized presence and location servers and handle all that stuff through standard WebSockets.
I should note that Skype's network architecture has changed since it was created; it no longer (from what I hear) uses random users as supernodes to relay data from client 1 to client 2; it didn't scale well and caused rampant variability in results (and annoyed people who had non-firewalled connections and bandwidth).
You definitely can build something Skype-like with WebRTC - and more. :-)