ping - does it increase load on the network?

The network in my organisation is too slow. A specific internal site goes down frequently, and I wanted to understand the trend, so I am running a ping test against that site every 5 minutes, built as a Jenkins job.
Now my question: does that put load on the network? I read that it is not a good approach - do we have alternatives for testing such scenarios?

Let's assume a typical ping request and its reply are 84 bytes each, and you are pinging once a second (far more often than your every-five-minutes job). That is 2 * 86,400 * 84 bytes, or about 14 megabytes of traffic per day - roughly 1.3 kilobits per second. So unless you are running on a 9600 baud modem, you can probably handle it.
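As a sanity check, the arithmetic is easy to reproduce (a quick Python sketch; the 84-byte packet size is the assumption from above):
packet_bytes = 84          # one ICMP echo request or reply, including headers
pings_per_day = 86_400     # one ping per second
daily_bytes = 2 * pings_per_day * packet_bytes   # request + reply
print(daily_bytes / 1e6, "MB/day")               # ~14.5 MB/day
print(2 * packet_bytes * 8, "bit/s")             # ~1344 bit/s sustained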

Related

Massive ICMP ping fair use policy

I made a (Java) tool capable of issuing tens of ICMP pings per second towards individual hosts. For example, for 1000 different hosts it takes about one minute to send the pings and to wait for and collect any responses from the individual hosts.
The purpose of this tool is to periodically monitor network connectivity for a collection of remote hosts.
Am I free to push this tool to its limits, setting an arbitrarily high number of pings per second and an arbitrarily large total number of hosts? Or should I restrict it to avoid being banned or blocked by anyone? How costly or annoying are ICMP pings on networks?
A simple PING packet is 74 bytes long, including the Ethernet frame header, the IP and ICMP headers, and the minimum 32-byte ICMP payload - so even 1000 of them is not that much data.
But you should not overuse PING, in my opinion. Network admins can detect such abnormal network behaviour and try to contact you or block your IP. An IDS or a router can also cut you off because of its policy.
The purpose of ICMP PING packets is to help network admins to diagnose network infrastructure problems. Typical use is to send a small number of packets to a target machine, like:
$ ping stackoverflow.com
Pinging stackoverflow.com [151.101.129.69] with 32 bytes of data:
Reply from 151.101.129.69: bytes=32 time=72ms TTL=57
Reply from 151.101.129.69: bytes=32 time=73ms TTL=57
Reply from 151.101.129.69: bytes=32 time=73ms TTL=57
Reply from 151.101.129.69: bytes=32 time=72ms TTL=57
Ping statistics for 151.101.129.69:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 72ms, Maximum = 73ms, Average = 72ms
As you can see, the system ping command sends four packets - enough to diagnose a problem. Of course you can change the size and number of the packets.
In my opinion, any other usage of ICMP PING (a larger number of packets, a high sending rate, or very large packets) is a sign of abnormal usage. It is very often associated with a virus/trojan/worm infection, aggressive port scanning, or a DDoS attack.
You want to send ~1000 packets - IMHO that is way too many. You should make this number configurable, as in the sketch below.
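For illustration, a throttled send loop might look like this (a minimal Python sketch rather than the Java tool itself; the host list and rate cap are hypothetical, and -c/-W are Linux ping flags):
import subprocess
import time

HOSTS = ["host%d.example.com" % i for i in range(1000)]  # hypothetical targets
MAX_PINGS_PER_SECOND = 10                                # configurable cap

for host in HOSTS:
    slot_start = time.monotonic()
    # -c 1: send a single echo request; -W 1: wait at most 1 s for the reply
    subprocess.Popen(["ping", "-c", "1", "-W", "1", host],
                     stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    # Sleep off the rest of this host's time slot to respect the cap
    elapsed = time.monotonic() - slot_start
    time.sleep(max(0.0, 1.0 / MAX_PINGS_PER_SECOND - elapsed))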

How to resolve tcpdump dropped packets?

I am using tcpdump to capture network packets and am running into an issue where packets start being dropped. I ran an application which exchanges packets rapidly over the network, resulting in high network bandwidth.
>> tcpdump -i eno1 -s 64 -B 919400
126716 packets captured
2821976 packets received by filter
167770 packets dropped by kernel
Since I am only interested in the protocol-related part of each TCP packet, I want to capture TCP packets without the data/payload. I hope this strategy can also help capture more packets before any are dropped. It appears that I can only increase the buffer size (-B argument) up to a certain limit, and even with a higher limit I am dropping more packets than I capture.
Can you help me understand the messages above? My questions:
What are packets captured?
What are packets received by filter?
What are packets dropped by kernel?
How can I capture all packets at high bandwidth without dropping any? My test application runs for 3 minutes and exchanges packets at a very high rate. I am only interested in protocol-related information, not in the actual data/payload being sent.
From Guy Harris himself:
the "packets captured" number is a number that's incremented every time tcpdump sees a packet, so it counts packets that tcpdump reads from libpcap and thus that libpcap reads from BPF and supplies to tcpdump.
The "packets received by filter" number is the "ps_recv" number from a call to pcap_stats(); with BPF, that's the bs_recv number from the BIOCGSTATS ioctl. That count includes all packets that were handed to BPF; those packets might still be in a buffer that hasn't yet been read by libpcap (and thus not handed to tcpdump), or might be in a buffer that's been read by libpcap but not yet handed to tcpdump, so it can count packets that aren't reported as "captured".
And from the tcpdump man page:
packets ``dropped by kernel'' (this is the number of packets that were dropped, due to a lack of buffer space, by the packet capture mechanism in the OS on which tcpdump is running, if the OS reports that information to applications; if not, it will be reported as 0).
To attempt to improve capture performance, here are a few things to try:
Don't capture in promiscuous mode if you don't need to. That will cut down on the amount of traffic that the kernel has to process. Do this by using the -p option.
Since you're only interested in TCP traffic, apply a capture expression that limits the traffic to TCP only. Do this by appending "tcp" to your command.
Try writing the packets to a file (or files to limit size) rather than displaying packets to the screen. Do this with the -w file option or look into the -C file_size and -G rotate_seconds options if you want to limit file sizes.
You could try to improve tcpdump's scheduling priority via nice.
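Putting the capture-filter, snaplen, and file-writing suggestions together, the invocation might look like this (interface and buffer size are from the question; the capture filename and 100 MB rotation size are placeholders):
tcpdump -p -i eno1 -s 64 -B 919400 -w capture.pcap -C 100 tcp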
From Wireshark's Performance wiki page:
stop other programs running on that machine, to remove system load
buy a bigger, faster machine :)
increase the buffer size (which you're already doing)
set a snap length (which you're already doing)
write capture files to a RAM disk
Try using PF_RING.
You could also try using dumpcap instead of tcpdump, although I would be surprised if the performance was drastically different.
You could try capturing with an external, dedicated device using a TAP or Switch+SPAN port. See Wireshark's Ethernet Capture Setup wiki page for ideas.
Another promising possibility: Capturing Packets in Linux at a Speed of Millions of Packets per Second without Using Third Party Libraries.
See also Andrew Brown's Sharkfest '14 Maximizing Packet Capture Performance document for still more ideas.
Good luck!
I would try actually lowering the value of your -B option.
The unit is 1 KiB (1024 bytes), thus the buffer size you specified (919400) is almost 1 gigabyte.
I suppose you would get better results by using a value closer to your CPU cache size, e.g. -B 16384.

Definition of Round Trip Time by using Ping ICMP messages

How is the RTT defined by the use of a "simple" ping command?
Example (Win7):
ping -l 600 www.google.de
My understanding is:
An ICMP echo request with 600 bytes of payload is sent to Google (the request). Google copies that payload (600 bytes) and sends it back to the sender (the reply).
The RTT is the (latency) time for the whole procedure: from sending the 600-byte request to receiving the 600-byte reply.
Is that right?
Latency is typically caused by two main things:
1) The distance between two nodes; this plays a vital role in the measured latency. For example, consider a scenario where Node A and Node B communicate by sending ICMP messages to each other.
a) The fewer the number of hops, the lower the latency; more hops, more latency.
Solution: You can select an alternate path for the communication, perhaps a shorter one.
2) How busy the network is; whenever a packet is sent from one network to another, routers process it, which takes some milliseconds. All of that processing time, in both directions, adds up into the measured latency.
a) It depends on how busy each processing device is. If it is less busy, packets are processed and forwarded faster; if it is busy, they take longer.
Solution: one possible solution is QoS, which lets you prioritise certain traffic - not the ICMP traffic itself, of course, but some other kind of traffic.
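If you want to measure an RTT-like figure from application code without raw ICMP sockets (which require elevated privileges), timing a TCP handshake is a common approximation - a Python sketch, with host and port as placeholders; note this measures TCP connect time, not an ICMP echo:
import socket
import time

def tcp_rtt_ms(host="www.google.de", port=80):
    start = time.monotonic()
    # connect() returns once the SYN/SYN-ACK exchange completes, i.e. ~1 RTT
    with socket.create_connection((host, port), timeout=5):
        pass
    return (time.monotonic() - start) * 1000.0

print("%.1f ms" % tcp_rtt_ms())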

Performance comparison between ZeroMQ, RabbitMQ and Apache Qpid

I need a high-performance message bus for my application, so I am evaluating the performance of ZeroMQ, RabbitMQ and Apache Qpid. To measure performance, I run a test program that publishes, say, 10,000 messages using one of the message queue implementations, and another process on the same machine that consumes those 10,000 messages. I then record the time difference between the first message published and the last message received.
Following are the settings I used for the comparison.
RabbitMQ: I used a "fanout" type exchange and a queue with default configuration. I used the RabbitMQ C client library.
ZeroMQ: My publisher publishes to tcp://localhost:port1 with a ZMQ_PUSH socket, my broker listens on tcp://localhost:port1 and resends the messages to tcp://localhost:port2, and my consumer listens on tcp://localhost:port2 using a ZMQ_PULL socket. I am using a broker instead of peer-to-peer communication in ZeroMQ to make the performance comparison fair to the other message queue implementations, which use brokers.
Qpid C++ message broker: I used a "fanout" type exchange and a queue with default configuration. I used the Qpid C++ client library.
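For reference, the ZeroMQ broker topology described above might look roughly like this (a Python/pyzmq sketch; the port numbers are placeholders for port1 and port2):
import zmq

ctx = zmq.Context()

# Broker: receive from publishers on port1, forward to consumers on port2.
# Publishers connect a PUSH socket to port1; consumers connect PULL to port2.
frontend = ctx.socket(zmq.PULL)
frontend.bind("tcp://*:5551")
backend = ctx.socket(zmq.PUSH)
backend.bind("tcp://*:5552")

while True:
    backend.send(frontend.recv())  # forward each message unchanged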
Following is the performance result:
RabbitMQ: It takes about 1 second to receive 10,000 messages.
ZeroMQ: It takes about 15 milliseconds to receive 10,000 messages.
Qpid: It takes about 4 seconds to receive 10,000 messages.
Questions:
Has anyone run a similar performance comparison between these message queues? I would like to compare my results with yours.
Is there any way I could tune RabbitMQ or Qpid to perform better?
Note:
The tests were done on a virtual machine with two allocated processors. Results may vary on different hardware; however, I am mainly interested in the relative performance of the MQ products.
RabbitMQ is probably doing persistence on those messages. I think you need to set the per-message delivery mode (or another message option) so that messages are not persisted. Performance will improve roughly 10x then. You should expect at least 100K messages/second through an AMQP broker. In OpenAMQ we got performance up to 300K messages/second.
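In AMQP terms this is the delivery_mode property (1 = transient, 2 = persistent). With the Python pika client, for example, publishing a transient message looks roughly like this (the question used the C client; the exchange name here is a placeholder):
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
ch.exchange_declare(exchange="bench", exchange_type="fanout")
ch.basic_publish(
    exchange="bench",
    routing_key="",
    body=b"hello",
    # delivery_mode=1 marks the message transient (no disk persistence)
    properties=pika.BasicProperties(delivery_mode=1),
)
conn.close()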
AMQP was designed for speed (e.g. it does not unpack messages in order to route them) but ZeroMQ is simply better designed in key ways. E.g. it removes a hop by connecting nodes without a broker; it does better asynchronous I/O than any of the AMQP client stacks; it does more aggressive message batching. Perhaps 60% of the time spent building ZeroMQ went into performance tuning. It was very hard work. It's not faster by accident.
One thing I'd like to do, but am too busy, is to recreate an AMQP-like broker on top of ZeroMQ. There is a first layer here: http://rfc.zeromq.org/spec:15. The whole stack would work somewhat like RestMS, with transport and semantics separated into two layers. It would provide much the same functionality as AMQP/0.9.1 (and be semantically interoperable) but significantly faster.
Hmm, of course ZeroMQ will be faster - it is designed to be, and it does not have a lot of the broker-based functionality that the other two provide. The ZeroMQ site has a wonderful comparison of broker vs. brokerless messaging and the drawbacks and advantages of both.
RabbitMQ Blog:
RabbitMQ and 0MQ are focusing on different aspects of messaging. 0MQ puts much more focus on how the messages are transferred over the wire. RabbitMQ, on the other hand, focuses on how messages are stored, filtered and monitored.
(I also like the RabbitMQ post above, as it also talks about using ZeroMQ with RabbitMQ.)
So, what I'm trying to say is that you should decide on the tech that best fits your requirements. If the only requirement is speed, ZeroMQ. But if you need other aspects, such as persistence of messages, filtering, monitoring, failover, etc., then that's when you need to start considering RabbitMQ and Qpid.
I am using a broker instead of peer-to-peer communication in ZeroMQ to make the performance comparison fair to the other message queue implementations, which use brokers.
Not sure why you want to do that - if the only thing you care about is performance, there is no need to level the playing field. If you don't care about persistence, filtering, etc., then why pay the price?
I'm also very leery of running benchmarks on VMs - there are a lot of extra layers that can affect the results in ways that are not obvious. (Unless you're planning to run the real system on VMs, of course, in which case that is a very valid method.)
I've tested C++/Qpid.
I sent 50,000 messages per second between two different machines for a long time with no queuing.
I didn't use a fanout, just a simple exchange (non-persistent messages).
Are you using persistent messages?
Are you parsing the messages?
I suppose not, since 0MQ doesn't have message structs.
If the broker is mostly idle, you probably haven't configured prefetch on the sender and receiver. This is very important for sending many messages.
We have compared RabbitMQ with our SocketPro (http://www.udaparts.com/) persistent message queue at http://www.udaparts.com/document/articles/fastsocketpro.htm, with full source code. Here are the results we obtained for RabbitMQ:
Same machine enqueue and dequeue:
"Hello world" --
Enqueue: 30000 messages per second;
Dequeue: 7000 messages per second.
Text with 1024 bytes --
Enqueue: 11000 messages per second;
Dequeue: 7000 messages per second.
Text with 10 * 1024 bytes --
Enqueue: 4000 messages per second;
Dequeue: 4000 messages per second.
Cross-machine enqueue and dequeue with network bandwidth of 100 Mbps:
"Hello world" --
Enqueue: 28000 messages per second;
Dequeue: 1900 messages per second.
Text with 1024 bytes --
Enqueue: 8000 messages per second;
Dequeue: 1000 messages per second.
Text with 10 * 1024 bytes --
Enqueue: 800 messages per second;
Dequeue: 700 messages per second.
Try configuring prefetch on both sender and receiver with a value like 100. Prefetching on just the sender is not enough.
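With the Python pika client, for example, the consumer-side prefetch window is set via basic_qos (a sketch; the value 100 follows the suggestion above):
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
# Allow up to 100 unacknowledged messages in flight to this consumer
ch.basic_qos(prefetch_count=100)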
We've developed an open-source message bus built on top of ZeroMQ - we initially did this to replace Qpid. It's brokerless, so it's not a totally fair comparison, but it provides the same functionality as brokered solutions.
Our headline performance figure is 140K messages per second between two machines, but you can see more detail here: https://github.com/Abc-Arbitrage/Zebus/wiki/Performance

TCP Slow Start, Congestion Avoidance & Determining Bandwidth

Is there a formula somewhere that can be used to determine the minimum number of segments/bytes which need to be transferred across a TCP connection to determine its bandwidth, one which takes into account slow start and congestion avoidance? I'm aware of the pathrate tool, but I want, if possible, something a bit simpler that I can incorporate into an app to get a decent ballpark figure. One example of usage would be downloading some data from a web server in order to determine the optimum number of threads for downloading a bunch of small files automatically. This is related to a previous question I posted: TCP, HTTP and the Multi-Threading Sweet Spot
You can fire up scholar.google.com and search for "TCP chirp". However, that requires high-resolution timers, and unless you write a kernel TCP congestion control algorithm, you'd have to reimplement TCP in userspace - which by itself will probably not give good results (general-purpose OSes are not very good at real-time, high-resolution timer work running in userspace).
In theory, using TCP chirp you need as few as 4-5 segments (though typically you get better resolution with a longer train of segments) to determine the "optimal" bandwidth.
In any case, since you cannot know which path is used (e.g. a satellite link or TV broadcast in the forward direction), you may need a considerable amount of data (10+ MB, perhaps even 1 GB) to get a decent measurement over arbitrary paths. (Satellites can have many dozens of MB/s of bandwidth, but also latencies in the 1000-3000 ms range, and TCP takes a number of round-trip times to open up cwnd - I'd say around 10 RTTs before a measurement should be started.)
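To make the cwnd point concrete: during slow start, cwnd roughly doubles every RTT, so you can estimate how many RTTs and bytes are consumed before the window reaches the bandwidth-delay product (a back-of-the-envelope Python sketch; the satellite link numbers are made-up examples):
import math

def slow_start_estimate(bandwidth_bps, rtt_s, mss=1460, init_cwnd_segments=10):
    """Rough RTTs and bytes spent in slow start before cwnd reaches the BDP."""
    bdp = bandwidth_bps / 8 * rtt_s                   # bandwidth-delay product, bytes
    start = init_cwnd_segments * mss                  # initial window, bytes
    rtts = max(0, math.ceil(math.log2(bdp / start)))  # doublings needed
    total = start * (2 ** (rtts + 1) - 1)             # geometric sum of window sizes
    return rtts, total

# Hypothetical satellite path: 50 Mbit/s with a 600 ms RTT
rtts, total = slow_start_estimate(50e6, 0.6)
print(rtts, "RTTs,", total / 1e6, "MB before cwnd fills the pipe")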
I do not think there is a fixed number of bytes that must be sent to determine the bandwidth; the number depends on the network type and speed.
Bandwidth is a measure of some resource transferred over a time interval. To get real data you need to measure it. Here are some hints on how to do that