tcpdump packets are captured before fragmentation

I have a setup as below.
[ Host A ] <-> [Rtr-A] <-> [Rtr-M] <-> [Rtr-B] <-> [ Host B]
I have set the MTU of Rtr-A's outgoing interface (towards 'Rtr-M') to 600.
I am capturing packets on 'Host A' and 'Rtr-A'.
I sent data of size 1000 from 'Host A' to 'Host B'.
Looking at the captured packets, I can see that an ICMP "fragmentation required" packet came from Rtr-A to 'Host A'. After that, the packet captured on 'Host A' is still 1000 bytes, whereas the packets that reach Rtr-A are smaller chunks. So I assume the packet is fragmented after it has been captured on 'Host A'.
Is this expected behaviour? Is there any way I can capture the fragmented packets on 'Host A' itself?
~S

Yes, this is normal and expected behavior when you capture locally. If you want to see the packets as they appear on the wire, then you'll need to capture externally using a TAP, the SPAN port of a switch or a hub if you can find one.
A good article I recommend reading is Jasper Bongertz's "The drawbacks of local packet captures", where this very issue is mentioned in "Sideeffect #2 -Woah, BIG packets! And small ones, too!". You might also want to refer to the Wireshark Ethernet capture setup wiki page.
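To make the expected on-the-wire sizes concrete, here is a minimal Python sketch of how an IPv4 datagram would be split for a 600-byte MTU. It assumes the "size 1000" in the question is the total length of an IPv4 datagram with a 20-byte header and no options; the function name `fragment_sizes` is made up for this illustration.

```python
def fragment_sizes(total_length, mtu, header_len=20):
    """Return (offset_in_8_byte_units, fragment_total_length, more_fragments) tuples."""
    payload = total_length - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes,
    # because the fragment offset field counts in 8-byte units.
    max_payload = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        chunk = min(max_payload, payload - offset)
        more = offset + chunk < payload
        fragments.append((offset // 8, chunk + header_len, more))
        offset += chunk
    return fragments


# Assumed scenario: 1000-byte datagram, 600-byte MTU.
print(fragment_sizes(1000, 600))  # [(0, 596, True), (72, 424, False)]
```

Under these assumptions an external capture would show two packets of 596 and 424 bytes, while a local capture may still show the single 1000-byte packet.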

Webrtc behavior Nack & FEC

We have a WebRTC application with two peers, and I experience packet loss of around 5% (checked in webrtc-internals) while a call is ongoing. I see NACKs as well.
I want to know whether FEC is actually being used in my setup. I do see some SDP parameters related to FEC, as below, but I am not sure whether they are used or not.
How can I check whether WebRTC is using FEC?
a=rtpmap:124 red/90000
a=rtpmap:123 ulpfec/90000
Also, are there any suggestions on how to reduce the packet loss by tweaking NACKs, FEC, etc.?
I tried different bandwidths and resolutions, and the packet loss is almost the same.
The easiest way to determine whether FEC is actually used is to run a packet capture using Wireshark or tcpdump and look for RTP packets whose payload type matches the values in the SDP (123 for ulpfec and 124 for red in your example). If you see these packets, you’re seeing FEC.
One thing to note: FEC can make packet loss worse in some cases, essentially where bursts of back-to-back packets are lost because of congestion. FEC transmits additional packets, which allow any one or two lost packets in a group to be recovered.
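As a rough illustration of the payload-type check described above, the following Python sketch reads the 7-bit payload type from the second byte of an RTP header and counts how often each value occurs. It assumes the input has already been filtered down to raw RTP packets; the helper `fake_rtp` and the sample packets are invented for the example and are not real capture data.

```python
from collections import Counter


def rtp_payload_type(packet: bytes) -> int:
    """Return the payload type from an RTP header (second byte, marker bit masked off)."""
    if len(packet) < 12:
        raise ValueError("an RTP header needs at least 12 bytes")
    if packet[0] >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    return packet[1] & 0x7F


def count_payload_types(packets):
    """Count packets per RTP payload type."""
    return Counter(rtp_payload_type(p) for p in packets)


def fake_rtp(payload_type, marker=False):
    """Build a hypothetical 12-byte RTP header with no payload (illustration only)."""
    second_byte = (0x80 if marker else 0x00) | payload_type
    return bytes([0x80, second_byte]) + bytes(10)


sample = [fake_rtp(96), fake_rtp(124, marker=True), fake_rtp(123), fake_rtp(124)]
counts = count_payload_types(sample)
print(counts[123], counts[124])  # 1 2
```

A non-zero count for the payload types taken from the SDP (123 and 124 in the question) would suggest that red/ulpfec packets are actually on the wire.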
I found the root cause of the packet loss. It was related to the setup of the network switches. We are using a dedicated leased line, and the leased line expects a fixed 100 Mbps full-duplex configuration instead of auto-negotiation on the network switch ports. Because of auto-negotiation, the link went into half duplex, hence the FEC errors.

Massive ICMP ping fair use policy

I made a (Java) tool capable of issuing tens of ICMP pings per second towards individual hosts. For example for 1000 different hosts it takes about one minute to send the pings and wait for and collect any responses from the individual hosts.
The purpose of this tool is to periodically monitor network connectivity for a collection of remote hosts.
Am I free to push this tool to its limits by setting any high value of pings per second and any total number of hosts? Or should I restrict this to avoid being banned or blocked by anyone? How costly or annoying are ICMP pings on networks?
A simple PING packet is 74 bytes long: the Ethernet frame header, the IP and ICMP headers, plus the 32 bytes of ICMP payload that the Windows ping command sends by default. So it's not that big, even 1000 of them.
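The 74-byte figure can be checked with a little arithmetic. This Python sketch adds up the usual header sizes (Ethernet without preamble and FCS, IPv4 without options) and estimates the average bit rate for the 1000 pings per minute mentioned in the question; it counts requests only, so replies would roughly double the number.

```python
ETHERNET_HEADER = 14  # bytes, without preamble and FCS
IPV4_HEADER = 20      # bytes, assuming no IP options
ICMP_HEADER = 8       # bytes, echo request/reply header
PAYLOAD = 32          # bytes, the payload size used in the answer

frame_bytes = ETHERNET_HEADER + IPV4_HEADER + ICMP_HEADER + PAYLOAD


def average_bits_per_second(frames, seconds, frame_size=frame_bytes):
    """Average bit rate when `frames` frames are spread over `seconds` seconds."""
    return frames * frame_size * 8 / seconds


print(frame_bytes)  # 74
print(round(average_bits_per_second(1000, 60)))  # 9867
```

At under 10 kbit/s on average, the traffic volume itself is small; the concern raised in the answer is about how the pattern looks to admins and intrusion detection systems.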
But you should not use PING too much in my opinion. Network admins can detect such abnormal network behaviour and try to contact you or block your IP. Also IDS and routers can cut you off because of their policy.
The purpose of ICMP PING packets is to help network admins to diagnose network infrastructure problems. Typical use is to send a small number of packets to a target machine, like:
$ ping stackoverflow.com
Pinging stackoverflow.com [151.101.129.69] with 32 bytes of data:
Reply from 151.101.129.69: bytes=32 time=72ms TTL=57
Reply from 151.101.129.69: bytes=32 time=73ms TTL=57
Reply from 151.101.129.69: bytes=32 time=73ms TTL=57
Reply from 151.101.129.69: bytes=32 time=72ms TTL=57
Ping statistics for 151.101.129.69:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 72ms, Maximum = 73ms, Average = 72ms
As you can see the system ping command sends four packets - it's enough to diagnose the problem. Of course you can change the size of packets and their number.
In my opinion any other usage of ICMP PING (a bigger number of packets, a high sending rate or a large packet size) is a sign of abnormal usage. It is very often related to a virus/trojan/worm network infection, aggressive port scanning by hackers or a DDoS attack.
You want to send ~1000 packets - IMHO that's way too many. You should make this number configurable.
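One way to make the rate configurable is to pace the sends. The sketch below is in Python rather than the asker's Java, and it only shows the pacing logic: the `paced` generator, the host list and the `print` placeholder (standing in for the real ping call) are all assumptions made for illustration.

```python
import time


def paced(items, per_second, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than `per_second` items per second."""
    if per_second <= 0:
        raise ValueError("per_second must be positive")
    interval = 1.0 / per_second
    next_slot = clock()
    for item in items:
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)  # wait until this item's time slot starts
        yield item
        next_slot += interval


# Hypothetical host list (documentation addresses); replace print with the real ping.
hosts = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
for host in paced(hosts, per_second=20):
    print("would ping", host)
```

Exposing `per_second` as a setting lets the operator of the tool pick a rate that the monitored networks tolerate.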

How to resolve tcpdump dropped packets?

I am using tcpdump to capture network packets and am running into an issue where packets are dropped. I ran an application which exchanges packets rapidly over the network, resulting in high bandwidth usage.
>> tcpdump -i eno1 -s 64 -B 919400
126716 packets captured
2821976 packets received by filter
167770 packets dropped by kernel
Since I am only interested in the protocol-related part of each TCP packet, I want to collect TCP packets without data/payload. I hope this strategy will also help in capturing more packets before drops start. It appears that I can only increase the buffer size (-B argument) up to a certain limit. Even with a higher limit I am dropping more packets than I capture.
Can you help me understand the above messages and answer the questions I have?
What are packets captured?
What are packets received by filter?
What are packets dropped by kernel?
How can I capture all packets at high bandwidth without dropping any? My test application runs for 3 minutes and exchanges packets at a very high rate. I am only interested in protocol-related information, not in the actual data/payload being sent.
From Guy Harris himself:
the "packets captured" number is a number that's incremented every time tcpdump sees a packet, so it counts packets that tcpdump reads from libpcap and thus that libpcap reads from BPF and supplies to tcpdump.
The "packets received by filter" number is the "ps_recv" number from a call to pcap_stats(); with BPF, that's the bs_recv number from the BIOCGSTATS ioctl. That count includes all packets that were handed to BPF; those packets might still be in a buffer that hasn't yet been read by libpcap (and thus not handed to tcpdump), or might be in a buffer that's been read by libpcap but not yet handed to tcpdump, so it can count packets that aren't reported as "captured".
And from the tcpdump man page:
packets ``dropped by kernel'' (this is the number of packets that were dropped, due to a lack of buffer space, by the packet capture mechanism in the OS on which tcpdump is running, if the OS reports that information to applications; if not, it will be reported as 0).
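To put the three counters from the question side by side, this Python sketch parses the tcpdump summary lines and expresses the captured and dropped counts as a share of "received by filter". How the counters relate to each other depends on the OS, so treat the percentages as a rough indicator only; the function names are made up for this example.

```python
import re

SUMMARY = """\
126716 packets captured
2821976 packets received by filter
167770 packets dropped by kernel
"""

PATTERN = re.compile(r"(\d+) packets (captured|received by filter|dropped by kernel)")


def parse_summary(text):
    """Map each tcpdump summary label to its integer count."""
    return {label: int(number) for number, label in PATTERN.findall(text)}


def share_percent(part, whole):
    """Return part as a percentage of whole (0.0 if whole is zero)."""
    return 100.0 * part / whole if whole else 0.0


stats = parse_summary(SUMMARY)
received = stats["received by filter"]
print(f"captured: {share_percent(stats['captured'], received):.2f}%")
print(f"dropped:  {share_percent(stats['dropped by kernel'], received):.2f}%")
```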
To attempt to improve capture performance, here are a few things to try:
Don't capture in promiscuous mode if you don't need to. That will cut down on the amount of traffic that the kernel has to process. Do this by using the -p option.
Since you're only interested in TCP traffic, apply a capture expression that limits the traffic to TCP only. Do this by appending "tcp" to your command.
Try writing the packets to a file (or files to limit size) rather than displaying packets to the screen. Do this with the -w file option or look into the -C file_size and -G rotate_seconds options if you want to limit file sizes.
You could try to improve tcpdump's scheduling priority via nice.
From Wireshark's Performance wiki page:
stop other programs running on that machine, to remove system load
buy a bigger, faster machine :)
increase the buffer size (which you're already doing)
set a snap length (which you're already doing)
write capture files to a RAM disk
Try using PF_RING.
You could also try using dumpcap instead of tcpdump, although I would be surprised if the performance was drastically different.
You could try capturing with an external, dedicated device using a TAP or Switch+SPAN port. See Wireshark's Ethernet Capture Setup wiki page for ideas.
Another promising possibility: Capturing Packets in Linux at a Speed of Millions of Packets per Second without Using Third Party Libraries.
See also Andrew Brown's Sharkfest '14 Maximizing Packet Capture Performance document for still more ideas.
Good luck!
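The first few suggestions above (no promiscuous mode, a "tcp" capture expression, writing to a file) can be combined into one command line. This Python sketch only assembles the argument list and prints it; the interface name comes from the question, while the output file name `capture.pcap` and the function name are placeholders chosen for the example.

```python
import shlex


def build_tcpdump_command(interface, output_file, snaplen=64,
                          promiscuous=False, capture_filter="tcp"):
    """Assemble a tcpdump argument list from the tuning suggestions."""
    args = ["tcpdump", "-i", interface, "-s", str(snaplen), "-w", output_file]
    if not promiscuous:
        args.append("-p")  # do not put the interface into promiscuous mode
    args.append(capture_filter)  # capture expression, e.g. "tcp"
    return args


cmd = build_tcpdump_command("eno1", "capture.pcap")
print(shlex.join(cmd))  # tcpdump -i eno1 -s 64 -w capture.pcap -p tcp
```

The resulting list could be passed to `subprocess.run` (typically with root privileges); whether it removes the drops entirely still depends on the traffic rate and the machine.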
I would try actually lowering the value of your -B option.
The unit is 1 KiB (1024 bytes), thus the buffer size you specified (919400) is almost 1 gigabyte.
I suppose you would get better results by using a value closer to your CPU cache size, e.g. -B 16384.
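The size comparison is easy to verify, given that -B is expressed in KiB. A short Python sketch, with helper names invented for the example:

```python
KIB = 1024


def buffer_size_bytes(b_option_kib):
    """Convert a tcpdump -B value (in KiB) to bytes."""
    return b_option_kib * KIB


def to_mib(size_bytes):
    """Convert bytes to MiB."""
    return size_bytes / KIB ** 2


for b_value in (919400, 16384):
    size = buffer_size_bytes(b_value)
    print(f"-B {b_value}: {size} bytes = {to_mib(size):.1f} MiB")
```

So the original setting asks for roughly 898 MiB of buffer, while the suggested -B 16384 is 16 MiB.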

Definition of Round Trip Time by using Ping ICMP messages

How is the RTT defined by the use of a "simple" ping command?
Example (Win7):
ping -l 600 www.google.de
My understanding is:
An ICMP message with a size of 600 bytes is sent to Google (request). Google copies that message (600 bytes) and sends it back to the sender (reply).
The RTT is the (latency) time for the whole procedure, that is, sending the 600-byte message and receiving it back.
Is that right?
Latency typically has two main causes:
1) Distance between two nodes; this plays a vital role in the latency. For example, consider a scenario where Node A and Node B need to communicate by sending ICMP messages to each other.
a) The fewer the hops, the lower the latency will be. More hops, more latency.
Solution: You can select an alternate path for the communication, for example a path with a shorter distance.
2) How busy the network is; whenever a packet is sent from one network to another, routers process the packet, which takes some milliseconds. The times taken in both directions add up to the total latency.
a) It depends on how busy the processing device is. If it is less busy, packets will be processed and forwarded faster; if it is busy, it will take longer.
Solution: one possible solution is to use QoS, with which you can prioritize traffic (not ICMP traffic of course, but some other kind of traffic).
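To tie this back to the question: the time ping reports is the interval between sending the echo request and receiving the matching echo reply, so it includes both directions plus any processing along the way. The Python sketch below shows only that timing logic; sending real ICMP packets usually needs raw-socket privileges, so the exchange is replaced by a stand-in function that sleeps, and all names are invented for the example.

```python
import time


def measure_rtt_ms(exchange, clock=time.perf_counter):
    """Time one request/reply exchange and return the elapsed milliseconds."""
    start = clock()
    exchange()  # send the request and block until the reply has arrived
    return (clock() - start) * 1000.0


def fake_exchange():
    """Stand-in for a real echo request/reply; pretends the round trip takes ~20 ms."""
    time.sleep(0.02)


print(f"{measure_rtt_ms(fake_exchange):.1f} ms")
```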

High "Receiving Time" for HTTP Responses below 500 bytes in Chrome Devtools

While using devtools Network tab on Chrome 15 (stable) on Windows 7 and
Windows XP, I am seeing cases where "receiving" time for an HTTP
response is >100ms but the response is a 302 redirect or a small image
(beacons) - with a payload below 500 bytes (header+content).
Capturing the TCP traffic on Wireshark clearly shows the server sent
the entire HTTP response in a single TCP packet, so receiving time should
have been 0. A good example is CNN homepage, or any major website that has a lot of
ads and tracking beacons.
This brings up a couple of questions:
What is defined as "receiving" in Chrome devtools? Is this the time
from 1st packet to last packet?
What factors in the client machine/operating systems impact
"receiving" time, outside of the network/server communication?
In my tests I used a virtual machine for Windows XP, while Windows 7
was on a desktop (quad core, 8gb ram).
The "receiving time" is the time between the didReceiveResponse ("Response headers received") and didReceiveData ("A chunk of response data received") WebURLLoaderClient events reported by the network layer, so some internal processing overhead may apply.
In the general case, keep in mind that HTTP runs over TCP, which is stream-oriented, so the division of data between TCP packets is not predictable (half of your headers may end up in one packet, and the rest plus the response body in the next one, though this does not seem to be your case).
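A small Python sketch of what "stream-oriented" means for a receiver: it keeps appending whatever chunks arrive until the blank line that ends the headers shows up, regardless of where the chunk boundaries fall. The sample response and the function name are invented for the illustration.

```python
def read_headers(chunks):
    """Accumulate chunks until the header terminator; return (headers, chunks_needed)."""
    buffer = bytearray()
    for count, chunk in enumerate(chunks, 1):
        buffer += chunk
        end = buffer.find(b"\r\n\r\n")
        if end != -1:
            return bytes(buffer[:end]), count
    raise ValueError("header terminator not found")


response = b"HTTP/1.1 302 Found\r\nLocation: /next\r\n\r\n"

# Same bytes, delivered as one chunk or split at an arbitrary position.
print(read_headers([response]))
print(read_headers([response[:10], response[10:]]))
```

Both calls return the same headers; only the number of chunks needed differs.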
Whenever possible, use the latest version of Chrome available. It is likely to contain fewer errors, including in the network layer :-)
The Nagle algorithm and the delayed ACK algorithm are two TCP optimizations that are enabled by default on Windows machines. They introduce delays in the traffic of small payloads in an attempt to reduce some of the chattiness of TCP/IP.
Delayed ACK will cause ~200ms of additional "Receiving" time in Chrome's network tab when receiving small payloads. Here is a webpage explaining the algorithms and how to disable them on Windows: http://smallvoid.com/article/winnt-nagle-algorithm.html
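Besides the system-wide Windows settings described in the linked page, an application that owns the sending socket can usually turn off the Nagle algorithm for that one connection with the standard TCP_NODELAY socket option. The Python sketch below shows only that per-socket option; it does not change delayed ACK behaviour, and it is of no help for traffic sent by servers you do not control.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with the Nagle algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# A non-zero value means small writes are sent immediately instead of being coalesced.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```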