WebRTC behavior: NACK & FEC - google-chrome

We have a WebRTC application with two peers, and I see packet loss of around 5% (checked in webrtc-internals) while a call is ongoing. I see NACKs as well.
I want to know whether FEC is being used in my setup. I do see some FEC-related SDP parameters, shown below, but I am not sure whether they are actually used or not.
How can I check whether WebRTC is using FEC?
a=rtpmap:124 red/90000
a=rtpmap:123 ulpfec/90000
Also, are there any suggestions for improving the packet loss percentage by tweaking NACKs, FEC, etc.?
I have tried different bandwidths and resolutions, and the packet loss stays almost the same.

The easiest way to determine whether FEC is actually being used is to run a packet capture with Wireshark or tcpdump and look for RTP packets whose payload type matches the values in the SDP (123 and 124 in your example). If you see such packets, FEC is in use.
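Another option, if you can run script on the page (or just look for the same counters in chrome://webrtc-internals), is to inspect the inbound-rtp statistics. A rough sketch, assuming a reasonably recent browser that exposes the fecPacketsReceived counter and an RTCPeerConnection named pc:
async function logFecUsage(pc: RTCPeerConnection): Promise<void> {
  const report = await pc.getStats();
  report.forEach((stats: any) => {
    if (stats.type === "inbound-rtp" && stats.kind === "video") {
      // If FEC is only negotiated but never used, fecPacketsReceived stays at 0 (or undefined).
      console.log("fecPacketsReceived:", stats.fecPacketsReceived,
                  "fecPacketsDiscarded:", stats.fecPacketsDiscarded,
                  "nackCount:", stats.nackCount);
    }
  });
}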
One thing to note: FEC can make packet loss worse in some cases, essentially where you have bursts of back-to-back packets lost because of congestion. FEC works by transmitting additional packets, which allows any one or two packets in a group to be lost and recovered from the additional packets; when the loss is caused by congestion, that extra traffic only adds to the load, and a burst can take out both the media packets and the FEC packets protecting them.

Found the root cause of the packet loss. It was related to the setup of the network switches. We are using a dedicated leased line, and the leased line expects a fixed 100 Mbps duplex configuration instead of auto-negotiation on the switch ports. Because of auto-negotiation, the link went into half duplex, and hence the FEC errors.

Related

How to resolve tcpdump dropped packets?

I am using tcpdump to capture network packets and am running into an issue where it starts dropping packets. I ran an application that exchanges packets rapidly over the network, resulting in high network bandwidth.
>> tcpdump -i eno1 -s 64 -B 919400
126716 packets captured
2821976 packets received by filter
167770 packets dropped by kernel
Since I am only interested in the protocol-related part of each TCP packet, I want to collect TCP packets without the data/payload. I hope this strategy can also help in capturing more packets before any get dropped. It appears that I can only increase the buffer size (-B argument) up to a certain limit, and even with a higher limit I am dropping more packets than I capture.
Can you help me understand the messages above? My questions are:
What are "packets captured"?
What are "packets received by filter"?
What are "packets dropped by kernel"?
How can I capture all packets at high bandwidth without dropping any? My test application runs for 3 minutes and exchanges packets at a very high rate. I am only interested in the protocol-related information, not in the actual data/payload being sent.
From Guy Harris himself:
the "packets captured" number is a number that's incremented every time tcpdump sees a packet, so it counts packets that tcpdump reads from libpcap and thus that libpcap reads from BPF and supplies to tcpdump.
The "packets received by filter" number is the "ps_recv" number from a call to pcap_stats(); with BPF, that's the bs_recv number from the BIOCGSTATS ioctl. That count includes all packets that were handed to BPF; those packets might still be in a buffer that hasn't yet been read by libpcap (and thus not handed to tcpdump), or might be in a buffer that's been read by libpcap but not yet handed to tcpdump, so it can count packets that aren't reported as "captured".
And from the tcpdump man page:
packets ``dropped by kernel'' (this is the number of packets that were dropped, due to a lack of buffer space, by the packet capture mechanism in the OS on which tcpdump is running, if the OS reports that information to applications; if not, it will be reported as 0).
To attempt to improve capture performance, here are a few things to try (a combined example command follows this list):
Don't capture in promiscuous mode if you don't need to. That will cut down on the amount of traffic that the kernel has to process. Do this by using the -p option.
Since you're only interested in TCP traffic, apply a capture expression that limits the traffic to TCP only. Do this by appending "tcp" to your command.
Try writing the packets to a file (or files to limit size) rather than displaying packets to the screen. Do this with the -w file option or look into the -C file_size and -G rotate_seconds options if you want to limit file sizes.
You could try to improve tcpdump's scheduling priority via nice.
From Wireshark's Performance wiki page:
stop other programs running on that machine, to remove system load
buy a bigger, faster machine :)
increase the buffer size (which you're already doing)
set a snap length (which you're already doing)
write capture files to a RAM disk
Try using PF_RING.
You could also try using dumpcap instead of tcpdump, although I would be surprised if the performance was drastically different.
You could try capturing with an external, dedicated device using a TAP or Switch+SPAN port. See Wireshark's Ethernet Capture Setup wiki page for ideas.
Another promising possibility: Capturing Packets in Linux at a Speed of Millions of Packets per Second without Using Third Party Libraries.
See also Andrew Brown's Sharkfest '14 Maximizing Packet Capture Performance document for still more ideas.
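For example, combining the non-promiscuous-mode, TCP-only-filter and write-to-file suggestions above with your original interface, snap length and buffer size (the output file name is just an illustration):
>> tcpdump -p -i eno1 -s 64 -B 919400 -w /tmp/tcp_only.pcap tcp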
Good luck!
I would try actually lowering the value of your -B option.
The unit is 1 KiB (1024 bytes), thus the buffer size you specified (919400) is almost 1 gigabyte.
I suppose you would get better results by using a value closer to your CPU cache size, e.g. -B 16384.

How do I handle packet loss when recording video peer to server via WebRTC

We are using the Licode MCU to stream video from Google Chrome to the server and record it. There isn't a second instance of Google Chrome to handle the feedback, so the server must do this.
One thing we have encountered is that when there is packet loss, frames are dropped and the video gets out of sync. This causes very poor video quality.
In ExternalOutput.cpp there is a place where it detects that the sequence number of the current packet has not incremented monotonically. There you can see that it drops the current frame and resets the search state.
I would like to know how to modify this so that it can recover from the packet loss. Is sending a NACK packet for the missing sequence number the solution? I've also read that there is a mode where Google Chrome sends RED (redundant) packets to deal with packet loss.
Media processing apps have two principally different layers:
Transport layer (RTP/RTCP)
Codec layer
The transport layer is codec independent and deals with RTP/generic RTCP packets. On this layer there are a couple of mechanisms for coping with packet loss/delay/reordering:
Jitter buffer (handles packet delays and reordering)
Generic RTCP feedback (notifies the source peer of packet loss)
On the codec layer there are also a couple of mechanisms for coping with quality degradation:
Codec-layer RTCP feedback
Forward error correction (FEC/RED)
To overcome Licode's imperfections you should do the following:
First of all, it ignores any packet delays and reordering. So you should implement a mechanism (a jitter buffer) that handles packet reordering/network jitter and detects packet loss (you could probably reuse the webrtc/freeswitch mechanisms).
When your app detects packet loss, it should send feedback (an RTCP generic NACK) to the remote peer (a minimal NACK builder is sketched after this list).
You should also try to handle ffmpeg decoding errors (ffmpeg is used for decoding the video and saving it to file) and send a FIR (Full Intra Request)/PLI to the remote peer to request a keyframe when errors occur.
Note that points 2 and 3 require proper explicit negotiation (via SDP).
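For example, the negotiated video section of the SDP needs rtcp-fb attributes along these lines (payload type 96 here is just an illustration, not taken from your session):
a=rtcp-fb:96 nack
a=rtcp-fb:96 nack pli
a=rtcp-fb:96 ccm fir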
Only after covering all these cases should you take a look at FEC/RED, because it is definitely more difficult to handle and implement.
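If you do go the NACK route, the feedback packet itself is small. Below is a minimal sketch (TypeScript, not Licode code) of an RFC 4585 generic NACK for a single missing sequence number; the SSRC values are placeholders you would take from the connection state, and the resulting bytes still have to be protected and sent through your existing RTCP/SRTCP path:
function buildGenericNack(senderSsrc: number, mediaSsrc: number, lostSeq: number): Uint8Array {
  const packet = new Uint8Array(16);
  const view = new DataView(packet.buffer); // DataView writes are big-endian (network order) by default
  view.setUint8(0, 0x81);          // V=2, P=0, FMT=1 (generic NACK)
  view.setUint8(1, 205);           // PT=205, transport-layer feedback (RTPFB)
  view.setUint16(2, 3);            // length in 32-bit words minus one
  view.setUint32(4, senderSsrc);   // SSRC of the packet sender (the server)
  view.setUint32(8, mediaSsrc);    // SSRC of the media source being NACKed
  view.setUint16(12, lostSeq);     // PID: first lost RTP sequence number
  view.setUint16(14, 0);           // BLP: bitmask for the 16 following sequence numbers (none here)
  return packet;
}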

How should PLI packets be used in WebRTC video recording

We are using the Licode MCU to stream video from Google Chrome to the server and record it. The tricky part here is that there is only one Chrome browser involved, so the server-side code has to handle sending feedback to the client.
We added server-side code to send REMB (bandwidth) packets every 5 seconds to the client. This causes the client to increase its bitrate so that the video quality is good.
We did something similar with PLI packets to try to improve video quality. The recorded video had blocky artifacts and didn't look good. The current code sends a PLI every 0.8 seconds, which causes the client to send a keyframe (a full frame of video). This fixes the poor video quality because it forces a keyframe, but when there is packet loss (Wi-Fi network) it quickly gets bad again.
My question is how should these PLI packets be used?
I think PLI means:
PLI - Picture Loss Indication
Your application should send at least three kinds of RTCP feedback:
an accurate receiver report (RFC 3550) every second or so, indicating packet loss and jitter rates to the sender; this will cause the sender to adapt its throughput to the link characteristics;
a generic NACK (RFC 4585) whenever it misses a packet; this will avoid corruption by causing the sender to resend any packet that is lost;
a PLI (RFC 4585) whenever it hasn't seen a keyframe in a given interval, for example two seconds (a minimal PLI builder is sketched after this answer).
Sending REMB is only necessary to limit throughput if it grows too fast, for example if the feedback provided in receiver reports is inaccurate.
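For reference, a PLI is a tiny payload-specific RTCP feedback packet. A minimal sketch (TypeScript; the SSRCs are placeholders from your session state, and the bytes still need to go through your normal RTCP/SRTCP sending path):
function buildPli(senderSsrc: number, mediaSsrc: number): Uint8Array {
  const packet = new Uint8Array(12);
  const view = new DataView(packet.buffer); // big-endian (network order) by default
  view.setUint8(0, 0x81);         // V=2, P=0, FMT=1 (PLI)
  view.setUint8(1, 206);          // PT=206, payload-specific feedback (PSFB)
  view.setUint16(2, 2);           // length in 32-bit words minus one
  view.setUint32(4, senderSsrc);  // SSRC of the packet sender (the server)
  view.setUint32(8, mediaSsrc);   // SSRC of the media source whose picture was lost
  return packet;                  // a PLI carries no FCI payload
}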

High "Receiving Time" for HTTP Responses below 500 bytes in Chrome Devtools

While using the devtools Network tab in Chrome 15 (stable) on Windows 7 and Windows XP, I am seeing cases where the "receiving" time for an HTTP response is >100 ms even though the response is a 302 redirect or a small image (beacon) with a payload below 500 bytes (headers + content).
Capturing the TCP traffic in Wireshark clearly shows that the server sent the entire HTTP response in a single TCP packet, so the receiving time should have been 0. A good example is the CNN homepage, or any major website that has a lot of ads and tracking beacons.
This brings up a couple of questions:
What is defined as "receiving" in Chrome devtools? Is this the time from the first packet to the last packet?
What factors in the client machine/operating system impact the "receiving" time, outside of the network/server communication?
In my tests I used a virtual machine for Windows XP, while Windows 7 was on a desktop (quad core, 8 GB RAM).
The "receiving time" is the time between the didReceiveResponse ("Response headers received") and didReceiveData ("A chunk of response data received") WebURLLoaderClient events reported by the network layer, so some internal processing overhead may apply.
In a general case, keep in mind that the HTTP protocol is stream-oriented, so the division of data between TCP packets is not predictable (half of your headers may get into one packet, the rest and the response body may get into the next one, though this does not seem to be your case.)
Whenever possible, use the latest version of Chrome available. It is likely to contain fewer errors, including in the network layer :-)
The Nagle algorithm and the delayed ACK algorithm are two TCP mechanisms that are enabled by default on Windows machines. They introduce delays in the traffic of small payloads in an attempt to reduce some of the chattiness of TCP/IP.
Delayed ACK will cause ~200ms of additional "Receiving" time in Chrome's network tab when receiving small payloads. Here is a webpage explaining the algorithms and how to disable them on Windows: http://smallvoid.com/article/winnt-nagle-algorithm.html
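If you also control the origin server, Nagle can be disabled per socket; delayed ACK is a property of the receiving OS and is tuned system-wide (for example via the Windows registry settings described in that article), not per connection. A sketch of the server side, assuming a Node.js server (the port and the hard-coded response are just illustrations):
import net from "node:net";

const server = net.createServer((socket) => {
  socket.setNoDelay(true); // TCP_NODELAY: flush small writes immediately instead of waiting for ACKs
  socket.end("HTTP/1.1 302 Found\r\nLocation: https://example.com/\r\n\r\n");
});

server.listen(8080);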

Are there any protocols/standards on top of TCP optimized for high throughput and low latency?

Are there any protocols/standards that work over TCP that are optimized for high throughput and low latency?
The only one I can think of is FAST.
At the moment I have devised just a simple text-based protocol delimited by special characters. I'd like to adopt a protocol that is designed for fast transfer and perhaps supports compression and minification of the data that travels over the TCP socket.
Instead of using heavyweight TCP, we can get the connection-oriented/reliable features of TCP on top of UDP in either of the following ways:
UDP-based Data Transfer Protocol (UDT):
UDT is built on top of the User Datagram Protocol (UDP) by adding congestion control and reliability control mechanisms. UDT is an application-level, connection-oriented, duplex protocol that supports both reliable data streaming and partially reliable messaging.
Acknowledgment:
UDT uses periodic acknowledgments (ACKs) to confirm packet delivery, while negative ACKs (loss reports) are used to report packet loss. Periodic ACKs help to reduce control traffic on the reverse path when the data transfer speed is high, because in these situations the number of ACKs is proportional to time rather than to the number of data packets.
Reliable User Datagram Protocol (RUDP):
It aims to provide a solution where UDP is too primitive because guaranteed-order packet delivery is desirable, but TCP adds too much complexity/overhead.
It extends UDP by adding the following additional features:
Acknowledgment of received packets
Windowing and congestion control
Retransmission of lost packets
Overbuffering (Faster than real-time streaming)
en.wikipedia.org/wiki/Reliable_User_Datagram_Protocol
If layered on top of TCP, you won't get better throughput or latency than the 'barest' TCP connection.
There are other non-TCP high-throughput and/or low-latency connection-oriented protocols, usually layered on top of UDP.
Almost the only one I know of is UDT, which is optimized for networks where high bandwidth or long round-trip times (RTT) make typical TCP retransmissions suboptimal. These are called 'long fat networks' (LFN, pronounced 'elefan(t)').
You may want to consider JMS. JMS can run on top of TCP, and you can get reasonable latency with a message broker like ActiveMQ.
It really depends on your target audience, though. If you're building a game that must run anywhere, you pretty much need to use HTTP or HTTP streaming. If you are pushing around market data on a LAN, then something NOT using TCP would probably suit you better. Tibco RV and JGroups both provide reliable low-latency messaging over multicast.
As you mentioned FAST: it is intended for market data distribution, is used by leading stock exchanges, and runs on top of UDP multicast.
In general, with the current level of network reliability, it is always worth putting your protocol on top of UDP.
Anything with a session sequence number, NACK plus server-to-client heartbeats, and binary marshalling should get close to the theoretical performance (a minimal sketch of this idea follows).
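As an illustration of that idea, here is a minimal receiver-side sketch in TypeScript using Node's dgram module; the 4-byte big-endian sequence header and the NACK message format are invented for this example, and a real protocol would also need heartbeats, a sender-side retransmission buffer and congestion control:
import dgram from "node:dgram";

const receiver = dgram.createSocket("udp4");
let nextExpectedSeq = 0;

receiver.on("message", (msg, rinfo) => {
  const seq = msg.readUInt32BE(0);                  // session sequence number
  for (let missing = nextExpectedSeq; missing < seq; missing++) {
    const nack = Buffer.alloc(5);
    nack.writeUInt8(0x01, 0);                       // 0x01 = NACK (message type made up for this sketch)
    nack.writeUInt32BE(missing, 1);
    receiver.send(nack, rinfo.port, rinfo.address); // ask the sender to retransmit the gap
  }
  nextExpectedSeq = Math.max(nextExpectedSeq, seq + 1);
  // msg.subarray(4) would be the binary-marshalled payload
});

receiver.bind(41234);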
If you have admin/root privilege on the sending side, you can also try a TCP acceleration driver like SuperTCP.