How to enable nanosecond resolution when capturing live packets in libpcap? - libpcap

How do I capture packets with nanosecond resolution using libpcap v1.6.1? Based on the changelog they added support for the nanosecond resolution in v1.5.0. When I execute tcpdump and view the cap files, it is still only in microseconds. I tried the previous method of changing
pcap_open_offline_with_tstamp_precision(
fname, PCAP_TSTAMP_PRECISION_MICRO, errbuf)
to
pcap_open_offline_with_tstamp_precision(
fname, PCAP_TSTAMP_PRECISION_NANO, errbuf)
recompiled, and re-installed it but still doesn't work. Now I'm wondering if this has to do with my Linux version (RedHat Enterprise 6.2). If someone could give me any other way or a step by step procedure, it would be very much appreciated.

How do I capture packets with nanosecond resolution using libpcap v1.6.1?
See the answers to the other question about this.
When I execute tcpdump and view the cap files, it is still only in microseconds.
Tcpdump, by default, requests that libpcap give it microsecond-resolution time stamps; newer versions of tcpdump (4.6 and later) support a --time-stamp-precision flag, which can be used to make it request nanosecond-resolution time stamps from libpcap. As there's currently no API to ask what the file's time stamp precision is, it will, when run with that flag, show 9 figures of time stamp, even if the file only has microsecond precision (so that the last 3 digits in the time stamp are always zero).
I tried the previous method of changing ...
What program did you change? That change will not affect live captures, and tcpdump, at least, doesn't make calls like that (older versions don't use pcap_open_offline_with_tstamp_precision() at all, because they were released before pcap_open_offline_with_tstamp_precision() existed; newer versions pass pcap_open_offline_with_tstamp_precision() a variable, which defaults to PCAP_TSTAMP_PRECISION_MICRO but which can be set to PCAP_TSTAMP_PRECISION_NANO by specifying the flag --time-stamp-precision nano.

Related

Timestamp in Chrome DevTool’s Timeline .json

For a research project I am analyzing recordings made using Google Chrome's DevTools Timeline, meaning I run my own software over the saved .json files. I am having trouble understanding their timestamp variable though, and tools such as the EpochConverter do not help. A typical line would be:
{"pid":14038,"tid":17939,"ts":176780856024,"ph":"X","cat":"ipc,toplevel","name":"ChannelReader::DispatchInputData","args":{"class":60,"line":70},"dur":11,"tdur":2,"tts":90016,"bind_id":"0xb35f6002","flow_in":true}
Neither the ts- nor the tts-value provide anything that makes sense. This recording was made with Chrome on Mac. I would much appreciate any help, as for my research it is essential that I'm able to correlate times of scripts. Cheers!
Empirically, ts is the time since OS boot in microseconds (millionths of a second), at least on *nix.
The size of the numbers suggested to me that ts was a fairly high-precision value. So I did a quick recording (roughly seven seconds) and compared the last ts value to the first; it was roughly seven million. Another quick recording confirmed it: Roughly three million for a roughly three-second recording.
Having established microseconds as the units, I wondered what it could be relative to. It clearly wasn't The Epoch. My first thought was "since browser start," but I quickly determined that wasn't the case. But when I looked at the initial number I got (which came out to about 72 hours), I thought "That sounds roughly like how long it's been since I rebooted." A quick reboot confirmed it.
I'm very surprised not to find this information in either of these pages:
How to Use the Timeline Tool
Timeline Event Reference

Why is there a 10 second delay in this http conversation?

I've got a weird situation. The first time I hit an embedded web server (uclinux/boa) at 10.1.10.29, I get a 10 second delay in the browser window before things start happening. "first time" means I haven't hit the machine in a few days. Browser type/OS doesn't matter (source is 10.1.10.20)
I've got a wireshark capture of it happening.
And here is the detail of frame 296:
Note packet 374 doesn't pop out for around 10 seconds after 296. The packets between those 2 aren't from the machine in question. It's just sitting there for 10 seconds and decides to retransmit. How's it supposed to work?
The main reason is most certainly because the code was swapped out from memory.
MS-Windows is really bad in that regard. If some program does not get used for "too long", it gets swapped out of memory. Period. When you come back at it, it has to re-read it back from the hard drive.
The one good thing (main reason) Windows does that is to defragment the kernel memory. For that, it is good.
You have similar problems under Linux, however, only if your server needs the memory. In other words, if you have tons of processes and they all fight for as much of memory as possible, then it is likely to swap out the least used software. Otherwise it will stay in place.
If you were to use the Cassandra database system, you would notice that on any computer that runs anything else than Cassandra. If you just run Cassandra, it remains fast all the time. If you run other software that use a lot of the memory, Cassandra is slow on first access. This is particularly noticeable.
I want to add the answer that solved our problem that had the problem with the 10 second delay, then working and after 5 minutes of inactivity adding another 10 seconds delay.
First of all, we wiresharked everything, and tried to find some kind of error in code, or in the way that the computer or server handled the network traffic. Found nothing out of the ordinary.
After much searching we found it was a DNS-"problem". In the DNS-server that the client computer used, there were dual entries for the domain name of the server. One was correct and one (the first one in the list) was wrong.
So removing the wrong dns pointer solved the problem.
This means the problem was that the computer tried the first address it got, waited 10seconds to get a reply, didnt get it and went to the second address in line. This creates no error messages as this is how DNS is supposed to work. And that is why all our wireshark logs showed up as just waiting 10 seconds with no error and no reason, and then just jump into life, work for as long as the DNS record is valid (5 minutes in our case) and then the procedure needs to be done again.
Hope this helps someone who has a similar problem.

How do download accelerators work?

We require all requests for downloads to have a valid login (non-http) and we generate transaction tickets for each download. If you were to go to one of the download links and attempt to "replay" the transaction, we use HTTP codes to forward you to get a new transaction ticket. This works fine for a majority of users. There's a small subset, however, that are using Download Accelerators that simply try to replay the transaction ticket several times.
So, in order to determine whether we want to or even can support download accelerators or not, we are trying to understand how they work.
How does having a second, third or even fourth concurrent connection to the web server delivering a static file speed the download process?
What does the accelerator program do?
You'll get a more comprehensive overview of Download Accelerators at wikipedia.
Acceleration is multi-faceted
First
A substantial benefit of managed/accelerated downloads is the tool in question remembers Start/Stop offsets transferred and uses "partial" and 'range' headers to request parts of the file instead of all of it.
This means if something dies mid transaction ( ie: TCP Time-out ) it just reconnects where it left off and you don't have to start from scratch.
Thus, if you have an intermittent connection, the aggregate transfer time is greatly lessened.
Second
Download accelerators like to break a single transfer into several smaller segments of equal size, using the same start-range-stop mechanics, and perform them in parallel, which greatly improves transfer time over slow networks.
There's this annoying thing called bandwidth-delay-product where the size of the TCP buffers at either end do some math thing in conjunction with ping time to get the actual experienced speed, and this in practice means large ping times will limit your speed regardless how many megabits/sec all the interim connections have.
However, this limitation appears to be "per connection", so multiple TCP connections to a single server can help mitigate the performance hit of the high latency ping time.
Hence, people who live near by are not so likely to need to do a segmented transfer, but people who live in far away locations are more likely to benefit from going crazy with their segmentation.
Thirdly
In some cases it is possible to find multiple servers that provide the same resource, sometimes a single DNS address round-robins to several IP addresses, or a server is part of a mirror network of some kind. And download managers/accelerators can detect this and apply the segmented transfer technique across multiple servers, allowing the downloader to get more collective bandwidth delivered to them.
Support
Supporting the first kind of acceleration is what I personally suggest as a "minimum" for support. Mostly, because it makes a users life easy, and it reduces the amount of aggregate data transfer you have to provide due to users not having to fetch the same content repeatedly.
And to facilitate this, its recommended you, compute how much they have transferred and don't expire the ticket till they look "finished" ( while binding traffic to the first IP that used the ticket ), or a given 'reasonable' time to download it has passed. ie: give them a window of grace before requiring they get a new ticket.
Supporting the second and third give you bonus points, and users generally desire it at least the second, mostly because international customers don't like being treated as second class customers simply because of the greater ping time, and it doesn't objectively consume more bandwidth in any sense that matters. The worst that happens is they might cause your total throughput to be undesirable for how your service operates.
It's reasonably straight forward to deliver the first kind of benefit without allowing the second simply by restricting the number of concurrent transfers from a single ticket.
I believe the idea is that many servers limit or evenly distribute bandwidth across connections. By having multiple connections, you're cheating that system and getting more than your "fair" share of bandwidth.
It's all about Little's Law. Specifically each stream to the web server is seeing a certain amount of TCP latency and so will only carry so much data. Tricks like increasing the TCP window size and implementing selective acks help but are poorly implemented and generally cause more problems than they solve.
Having multiple streams means that the latency seen by each stream is less important as the global throughput increases overall.
Another key advantage with a download accelerator even when using a single thread is that it's generally better than using the web browsers built in download tool. For example if the web browser decides to die the download tool will continue. And the download tool may support functionality like pausing/resuming that the built-in brower doesn't.
My understanding is that one method download accelerators use is by opening many parallel TCP connections - each TCP connection can only go so fast, and is often limited on the server side.
TCP is implemented such that if a timeout occurs, the timeout period is increased. This is very effective at preventing network overloads, at the cost of speed on individual TCP connections.
Download accelerators can get around this by opening dozens of TCP connections and dropping the ones that slow to below a certain threshold, then opening new ones to replace the slow connections.
While effective for a single user, I believe it is bad etiquette in general.
You're seeing the download accelerator trying to re-authenticate using the same transaction ticket - I'd recommend ignoring these requests.
From: http://askville.amazon.com/download-accelerator-protocol-work-advantages-benefits-application-area-scope-plz-suggest-URLs/AnswerViewer.do?requestId=9337813
Quote:
The most common way of accelerating downloads is to open up parllel downloads. Many servers limit the bandwith of one connection so opening more in parallel increases the rate. This works by specifying an offset a download should start which is supported for HTTP and FTP alike.
Of course this way of acceleration is quite "unsocial". The limitation of bandwith is implemented to be able to serve a higher number of clients so using this technique lowers the maximum number of peers that is able to download. That's the reason why many servers are limiting the number of parallel connection (recognized by IP), e.g. many FTP-servers do this so you run into problems if you download a file and try to continue browsing using your browser. Technically these are two parallel connections.
Another technique to increase the download-rate is a peer-to-peer-network where different sources e.g. limited by asynchron DSL on the upload-side are used for downloading.
Most download 'accelerators' really don't speed up anything at all. What they are good at doing is congesting network traffic, hammering your server, and breaking custom scripts like you've seen. Basically how it works is that instead of making one request and downloading the file from beginning to end, it makes say four requests...the first one downloads from 0-25%, the second from 25-50%, and so on, and it makes them all at the same time. The only particular case where this helps any, is if their ISP or firewall does some kind of traffic shaping such that an individual download speed is limited to less than their total download speed.
Personally, if it's causing you any trouble, I'd say just put a notice that download accelerators are not supported, and have the users download them normally, or only using a single thread.
They don't, generally.
To answer the substance of your question, the assumption is that the server is rate-limiting downloads on a per-connection basis, so simultaneously downloading multiple chunks will enable the user to make the most of the bandwidth available at their end.
Typically download-accelerators depend on partial content download - status code 206. Just like the streaming media players, media players ask for a small chunk of the full file to the server and then download it and play. Now the catch is if a server restricts partial-content-download then the download accelerator won't work!. It's easy to configure a server like Nginx to restrict partial-content-download.
How to know if a file can be downloaded via ranges/partially?
Ans: check for a header value Accept-Ranges:. If it does exist then you are good to go.
How to implement a feature like this in any programming language?
Ans: well, it's pretty easy. Just spin up some threads/co-routines(choose threads/co-routines over processes in I/O or network bound system) to download the N-number of chunks in parallel. Save the partial files in the right position in the file. and you are technically done. Calculate the download speed by keeping a global variable downloaded_till_now=0 and increment it as one thread completes downloading a chunk. don't forget about mutex as we are writing to a global resource from multiple thread so do a thread.acquire() and thread.release(). And also keep a unix-time counter. and do math like
speed_in_bytes_per_sec = downloaded_till_now/(current_unix_time-start_unix_time)

How would you go about reverse engineering a set of binary data pulled from a device?

A friend of mine brought up this questiont he other day, he's recently bought a garmin heart rate moniter device which keeps track of his heart rate and allows him to upload his heart rate stats for a day to his computer.
The only problem is there are no linux drivers for the garmin USB device, he's managed to interpret some of the data, such as the model number and his user details and has identified that there are some binary datatables essentially which we assume represent a series of recordings of his heart rate and the time the recording was taken.
Where does one start when reverse engineering data when you know nothing about the structure?
I had the same problem and initially found this project at Google Code that aims to complete a cross-platform version of tools for the Garmin devices ... see: http://code.google.com/p/garmintools/. There's a link on the front page of that project to the protocols you need, which Garmin was thoughtful enough to release publically.
And here's a direct link to the Garmin I/O specification: http://www.garmin.com/support/pdf/IOSDK.zip
I'd start looking at the data in a hexadecimal editor, hopefully a good one which knows the most common encodings (ASCII, Unicode, etc.) and then try to make sense of it out of the data you know it has stored.
As another poster mentioned, reverse engineering can be hairy, not in practice but in legality.
That being said, you may be able to find everything related to your root question at hand by checking out this project and its' code...and they do handle the runner's heart rate/GPS combo data as well
http://www.gpsbabel.org/
I'd suggest you start with checking the legality of reverse engineering in your country of origin. Most countries have very strict laws about what is allowed and what isn't regarding reverse engineering devices and code.
I would start by seeing what data is being sent by the device, then consider how such data could be represented and packed.
I would first capture many samples, and see if any pattern presents itself, since heart beat is something which is regular and that would suggest it is measurement related to the heart itself. I would also look for bit fields which are monotonically increasing, as that would suggest some sort of time stamp.
Having formed a hypothesis for what is where, I would write a program to test it and graph the results and see if it makes sense. If it does but not quite, then closer inspection would probably reveal you need some scaling factors here or there. It is also entirely possible I need to process the data first before it looks anything like what their program is showing, i.e. might need to integrate the data points. If I get garbage, then it is back to the drawing board :-)
I would also check the manufacturer's website, or maybe run strings on their binaries. Finding someone who works in the field of biomedical engineering would also be on my list, as they would probably know what protocols are typically used, if any. I would also look for these protocols and see if any could be applied to the data I am seeing.
I'd start by creating a hex dump of the data. Figure it's probably blocked in some power-of-two-sized chunks. Start looking for repeating patterns. Think about what kind of data they're probably sending. Either they're recording each heart beat individually, or they're recording whatever the sensor is sending at fixed intervals. If it's individual beats, then there's going to be a time delta (since the last beat), a duration, and a max or avg strength of some sort. If it's fixed intervals, then it'll probably be a simple vector of readings. There'll probably be a preamble of some sort, with a start timestamp and the sampling rate. You can try decoding the timestamp yourself, or you might try simply feeding it to ctime() and see if they're using standard absolute time format.
Keep in mind that lots of cheap A/D converters only produce 12-bit outputs, so your readings are unlikely to be larger than 16 bits (and the high-order 4 bits may be used for flags). I'd recommend resetting the device so that it's "blank", dumping and storing the contents, then take a set of readings, record the results (whatever the device normally reports), then dump the contents again and try to correlate the recorded results with whatever data appeared after the "blank" dump.
Unsure if this is what you're looking for but Garmin has created an API that runs with your browser. It seems OSX is supported, as well as Windows browsers... I would try it from Google Chromium to see if it can be used instead of this reverse engineering...
http://developer.garmin.com/web-device/garmin-communicator-plugin/
API Features
Auto-detection of devices connected to a computer Access to device
product information like product name and software version Read
tracks, routes and waypoints from supported recreational, fitness and
navigation devices Write tracks, routes and waypoints to supported
recreational, fitness and navigation devices Read fitness data from
supported fitness devices Geo-code address and save to a device as a
waypoint or favorite Read and write Garmin XML files (GPX and TCX) as
well as binary files. Support for most Garmin devices (USB, USB
mass-storage, most serial devices) Support for Internet Explorer,
Firefox and Chrome on Microsoft Windows. Support for Safari, Firefox
and Chrome on Mac OS X.
Can you synthesize a heart beat using something like a computer speaker? (I have no idea how such devices actually work). Watch how the binary results change based on different inputs.
Ripping apart the device and checking out what's inside would probably help too.

How can you tell when a user last pressed a key (or moved the mouse)?

In a Win32 environment, you can use the GetLastInputInfo API call in Microsoft documentation. Basically, this method returns the last tick that corresponds with when the user last provided input, and you have to compare that to the current tick to determine how long ago that was.
Xavi23cr has a good example for C# at codeproject.
Any suggestions for other environments?
As for Linux, I know that Pidgin has to determine idle time to change your status to away after a certain amount of time. You might open the source and see if you can find the code that does what you need it to do.
You seem to have answered your own question there Nathan ;-)
"GetLastInputInfo" is the way to go.
One trick is that if your application is running on the desktop, and the user connects to a virtual machine, then GetLastInputInfo will report no activity (since there is no activity on the host machine).
This can be different to the behaviour you want, depending on how you wish to apply the user input.