To Infinity and Beyond! Capturing Forever with Tshark

Over the last couple of years that I’ve been involved with Wireshark, one issue has reared its head many times, in a surprising variety of ways. These range from “Capturing with tshark uses more and more memory!” to “I set tshark to capture in the background, and it keeps crashing!” to “How do I set up tshark to capture forever?”

Historically we’ve had no good answer to these complaints – Wireshark and tshark both only do what is called stateful dissection. This means that they store what they’ve seen in memory and use that information to provide additional details about future packets, for example by matching requests with responses. While this provides substantial benefits — reassembly of protocols over TCP being probably the most obvious — it means that as the amount of traffic increases, so does the amount of memory needed to store all of that state. It also means that there’s no way for tshark to run forever unless you’ve got infinite memory (what’s your secret!?) or no traffic at all.

All of that has just changed. Wireshark and tshark have long had a feature that lets you rotate your packet capture across multiple files, preventing any one file from getting too large; to do this, check out the “-b” flag to tshark. This was handy for systems limited in disk space, but did nothing for the ever-growing memory usage. A few days ago, however, I landed a change in tshark’s master branch that makes it discard its internal state every time the capture rotates to a new file. This has one huge benefit: you can now capture (theoretically) forever with tshark by using the “-b” flag!
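
For example, a rotating capture that keeps twenty files of roughly 100 MB each might look something like this (the interface name, file name and sizes are just placeholders to adjust for your environment):

    tshark -i eth0 -w ring.pcapng -b filesize:100000 -b files:20

The filesize value is given in kilobytes, and you can rotate on time instead with something like “-b duration:3600”. With the new behaviour described above, each switch to the next file also discards the accumulated dissection state.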

Like any experimental change, however, it currently has a number of limitations:

  • State is lost when we switch to a new file, so if two fragments of the same message get split across that file boundary, they will not be reassembled. This is effectively unavoidable – while we considered discarding only very old state, this turns out to be extremely difficult with our current architecture. However, the result is that the dissection you get when capturing with “-b” should actually be closer now to the dissection you get when opening the individual files after the fact.
  • Discarding state can be a surprisingly expensive operation, so captures using this feature in high-traffic environments may see their dissection fall behind when switching to a new file. If you see this, please file a bug report with as much detail as you can provide, and we will try to smooth out the rough edges.
  • Memory usage might still grow a very small amount due to previously-hidden memory leaks now being much more obvious. Again, if you see anything problematic, please file a bug report.
  • This feature is limited to tshark; it is not available in the graphical Wireshark interface. The graphical interface has to permit scrolling backwards to look at previously dissected packets, so it can’t discard anything until the entire capture is closed. (Edit: This isn’t strictly true – now that I’ve actually checked my assumptions, it should be possible to make the GUI behave the same way. It will take some additional work though.)
  • This feature is limited to the master branch only; it will not be in the upcoming 1.12 release. Given the substantial nature of the change, it was decided that it needed a chance to cook before being released as a “stable” feature. It is, however, present in the 1.99 automated development builds as of this morning.

Please test it out and let us know what you think. Happy capturing!

6 thoughts on “To Infinity and Beyond! Capturing Forever with Tshark”

  1. Scott

    I can’t see the reasoning for continuous analysis. If there’s a problem, more often than not, in my experience, a multi-hour or multi-day trace will reveal the source of the problem.

    And more to the point, how the heck can poring through that much information – a continuous capture – produce any improvement? There are incremental opportunities, to be sure, but that much information is just not practical. I know. That’s how we were packaging the data just to appease some people asking for it. Not worth it.

  2. Jim Baxter - PacketIQ

    Great article – Thanks, Evan. Do I correctly interpret your discussion to mean that the graphical Wireshark interface can’t discard state to free memory when using the Ring Buffer option? Wasn’t aware of that.

  3. Evan Huus Post author

    @Scott People keep asking for it, so there must be use cases somewhere.

    @Jim My apologies – you correctly interpreted my post, but I hadn’t checked one of my assumptions, so it should be possible to make the GUI behave the same way. I’ll have to take a look at that.

  4. Anders Broman

    @Scott
    > I can’t see the reasoning for continuous analysis. If there’s a problem, more often than not, in my experience, a multi-hour or multi-day trace will reveal the source of the problem.

    That’s exactly it. That might not be possible without Evan’s change, as tshark might run out of memory quite quickly under heavy traffic. By choosing the ring buffer file size appropriately, you get chunks of a manageable size for a given time period.

  5. Paul Offord

    Hi Evan,

    Interesting article.

    We run a lot of continuous captures to catch intermittent problems. We have always used dumpcap directly to avoid the memory problem. What advantage would we get running tshark rather than dumpcap?

    Cheers…Paul

  6. Evan Huus Post author

    @Paul

    Instantaneous alerting. Dumpcap doesn’t do dissection, so if you want to actually find out what’s going on in the traffic you capture that way, you have to pull it off the server and run tshark/wireshark on it separately (in other words, it’s useful diagnostically after the fact, but can’t be used to detect errors itself). With tshark running continuous dissection you get the packets dissected right away, so you can actually raise alerts on that information if something is wrong, without waiting for the application proper to complain.
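
    As a rough sketch of what that could look like (the interface, filter and fields here are purely illustrative, not a recommendation), you could have tshark print selected fields for matching packets as they arrive and feed that into whatever alerting mechanism you use:

        tshark -i eth0 -l -Y "tcp.analysis.retransmission" -T fields -e frame.time -e ip.src -e ip.dst

    The -l flag flushes the output after each packet, so a downstream script sees the information immediately.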
