gdb added support for reverse debugging in 2009 (with gdb 7.0). I had never heard of it until 2012. Now I find it extremely useful for certain types of debugging problems, and I wish I had heard of it earlier.
Correct me if I’m wrong, but my impression is that the technique is still rarely used and that most people don’t know it exists. Why?
Do you know of any programming communities where the use of reverse debugging is common?
Background information:
- Stack Overflow: How does reverse debugging work?
- gdb uses the term “reverse debugging”, but other vendors use other terms for identical or similar techniques:
  - Microsoft calls it IntelliTrace or “Historical Debugging”
  - There’s a Java reverse debugger called Omniscient Debugger, though it probably no longer works in Java 6
  - There are other Java reverse debuggers
  - OCaml’s debugger (ocamldebug) calls it time travel
For one, running in debug mode with recording on is very expensive compared to even normal debug mode; it also consumes a lot more memory.
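To get a feel for where that cost comes from, here is a minimal sketch of a toy "recorder" in Python. It is only an illustration, not how gdb does it (gdb records at the machine-instruction level); the names `record` and `running_sum` are made up for this example. It uses `sys.settrace` to copy the local variables before every executed line, so each line pays for a callback plus a copy, and the history grows with the number of lines executed.

```python
import sys


def record(func, *args):
    """Run func(*args) and snapshot its local variables before each line runs."""
    history = []  # one (line_number, locals_copy) entry per executed line

    def tracer(frame, event, arg):
        # Only trace the frame of the function we were asked to record.
        if frame.f_code is not func.__code__:
            return None
        if event == "line":
            # Shallow copy: cheap, but mutable objects are not deep-copied.
            history.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, history


def running_sum(values):
    total = 0
    for v in values:
        total += v
    return total


result, history = record(running_sum, [3, 4, 5])

# "Stepping backwards" is then just walking the history in reverse.
for lineno, snapshot in reversed(history):
    print(lineno, snapshot)
```

Even for a three-element loop the history holds one snapshot per executed line, which hints at why recording a real program is so much heavier than plain stepping.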
It is easier to decrease the granularity from line level to function-call level. For example, the standard debugger in Eclipse allows you to “drop to frame,” which is essentially a jump back to the start of the function with a reset of all the parameters. Nothing done on the heap is reverted, and finally blocks are not executed, so it is not a true reverse debugger; be careful about that.
Note that this has been available for several years now and works hand in hand with hot-code replacement.
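The heap caveat can be shown with a small analogy in Python. This is only a sketch of the idea, not Eclipse's mechanism: `drop_to_frame`, `apply_discount` and `audit_log` are hypothetical names, and "dropping to frame" is modelled as simply calling the function again with the argument values saved at the original call.

```python
def drop_to_frame(func, saved_args):
    """Re-enter func with the argument values captured at the original call."""
    return func(*saved_args)


audit_log = []  # stands in for heap state shared across calls


def apply_discount(price, percent):
    audit_log.append(("discount", price, percent))  # side effect on shared state
    price = price - price * percent // 100  # local reassignment only
    return price


saved_args = (200, 10)
first = apply_discount(*saved_args)
second = drop_to_frame(apply_discount, saved_args)

print(first, second)   # the parameters were reset, so both runs agree
print(len(audit_log))  # but the side effect from the first run is still there
```

The parameters start fresh on the second run, but the entry appended to `audit_log` during the first run is never undone, which is exactly the kind of surprise to watch out for.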
As mentioned already, performance is key. For example, with gdb’s reversible debugging, running something like gzip sees a slowdown of 50,000x compared to running natively. There are commercial alternatives, however: I work for Undo (undo.io), and our UndoDB product does the same but with a slowdown of less than 2x. There are other commercial reversible debuggers available too.
For an overview of the technology choices and products, see a series of blog posts I wrote about a year ago (and some follow-ups since):
- http://jakob.engbloms.se/archives/1547 – the tech
- http://jakob.engbloms.se/archives/1554 – research
- http://jakob.engbloms.se/archives/1564 – products
- http://jakob.engbloms.se/archives/1768 – updates to the previous post
My feeling is that it is used so little because it requires special hardware, a special debugger, or setting up your system just right. Most people do not invest the time to get maximum value from their debug tools, unfortunately.
Another reason is that the “cheap default” of gdb is almost unusably slow and has quite a few stability problems on everything but the most common target systems.
From my experience as a sales engineer for the TotalView debugger, people do know that it exists, but they don’t think it works, regardless of the slowdown (acceptable or not).
The University of Cambridge recently did a survey entitled “Failure to Adopt Reverse Debugging Costs Global Economy $41 Billion Annually”.
And coming back to GDB, I’ve heard (a lot) that the slowdown makes it quite unusable on a “real-life” application.
I’d personally love to hear back from more people using reverse debugging on applications other than “Hello world!”
I think it is important to expand a little further on this “reverse” or “historic” debugging. To understand complex systems and their behavior, I think it is absolutely crucial to be able to replay “events” that make state explicit.
What I want to express is that you are not alone in wondering why this technique is not applied more widely today, or why the related problems are rarely discussed clearly.
So let’s emphasize two very important concepts here:
1. To understand a programming system, it is helpful to make state explicit.
2. To understand a programming system even further, replaying sequences of state (events) can help a lot.
Here are some sources that tackle the problem (dealing with state in complex systems) and propose or design solutions for it:
- Out of the Tar Pit, paper: http://shaffner.us/cs/papers/tarpit.pdf
Main ideas: avoid, isolate or make state explicit
- CQRS
http://www.cqrs.nu/
This is a combination of two concepts: Command Query Responsibility Segregation and Event Sourcing. There exist different implementations (Java, C#, Scala).
The replaying of state sequences and the evolving of a domain model are the crucial parts here.
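To make the replay idea concrete, here is a minimal event-sourcing sketch in Python. It is not taken from any particular CQRS framework; the `Event` type, the account-balance domain and the `replay` helper are assumptions chosen for illustration. The state is never stored directly: it is rebuilt by folding over the event log, so you can reconstruct it as of any earlier point.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn"
    amount: int


def apply(balance, event):
    """Pure transition function: old state + event -> new state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, upto=None):
    """Rebuild the state by folding over the first `upto` events (all if None)."""
    return reduce(apply, events[:upto], 0)


log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]

current = replay(log)          # state after every event
earlier = replay(log, upto=2)  # state as it was after the second event
print(current, earlier)
```

Looking at an earlier state is just a matter of replaying a shorter prefix of the log, which is the same idea a reverse debugger relies on, only applied at the level of the domain model.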
If you really zoom out and look at the very broad picture, you can see that with the “rise” of functional programming, people are (consciously or unconsciously) attracted to FP because it makes state explicit!
But that only deals with point one. To address the second one you need another concept, which could be “loosely” described as functional reactive programming.
So you might say: all well and good, but who actually uses CQRS and FRP? I would say (IMO, because I don’t have concrete numbers) that a lot of companies actually do; it’s just that they don’t know the work they do has this terminology. If you google around a bit, you will hear from enterprises that use CQRS; there are some success stories out there already.
FRP too is rising slowly. As an example I could give Netflix: http://techblog.netflix.com/2013/02/rxjava-netflix-api.html
They just released an implementation of Rx, which is originally .NET-based (but has a JavaScript implementation too). So people are already using these techniques today, in the large, to understand complex systems and to make them even better. That is why they use reverse debugging techniques.