I currently report my production exceptions to a MySQL database, where they have mostly been collecting dust. The problem I face is grouping the stack traces. I would like to see an aggregate of the stack traces that are the same, so that I can work them into my workflow.
One issue is that I get stack traces in different languages, which throws things off; otherwise I’d just use a SQL GROUP BY.
Does anyone know of any libraries that could assist me with this? As long as I have something where I can put in a collection of stack traces and get out the result, I’m fine. I don’t need any direct SQL interaction, reporting, or anything like that (but I’ll take any suggestions you have).
Thanks
Prepackaged library? Likely not. There are some tools, such as STAT (and a YouTube video on it), but they are not aimed at the scale we are interested in. STAT is interesting, but it is designed for systems of an entirely different scale than most people work on.
In previous searches for similar solutions, ultimately I ended up writing my own stack trace fingerprinting software.
You’ve got a stack trace from the system and have stuck it somewhere. It looks roughly like:
com.foo.bar.UserNotFoundException: User not found
    at com.foo.bar.ErrorHandler.handle(errorHandler.lang:42)
    at com.foo.bar.StuffDoer.doStuffA(StuffDoer.lang:314)
    at com.foo.bar.StuffDoer.doStuffB(StuffDoer.lang:271)
    at com.not.yours.base.Base(Base.lang:2357)
This is the general format of a stack trace: it has the exception handler at the top (which isn’t too useful) and the base of the system at the bottom (which isn’t that useful either).
As an application programmer, what you really want is the stuff in the middle. This may be one method, or a dozen, but it’s what actually distinguishes this stack trace from other ones. The specifics of the frames outside your own code aren’t always that useful.
As a systems programmer, the stuff at the bottom is likely the most interesting.
The first step is to parse through the stack trace line by line identifying the relevant information and further identifying if the given frame is of interest at all – if everything goes through ErrorHandler.handle() before the stack is reported, that frame is not useful in identifying or grouping the stack.
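The parsing and filtering step might look something like the sketch below. It assumes frames shaped like the example trace above (`at package.Class.method(File.ext:line)`); the regular expression, the `IGNORED` set, and the function name `interesting_frames` are illustrative, not part of the original system, and real traces from other languages would each need their own pattern.

```python
import re

# Assumed frame format, modeled on the example trace:
#   at package.Class.method(File.ext:123)
FRAME_RE = re.compile(r"\bat\s+([\w.$]+)\.([\w$<>]+)\(([^:()]*)(?::(\d+))?\)")

# Hypothetical ignore list: frames that every trace passes through
# and that therefore do not help tell traces apart.
IGNORED = {("com.foo.bar.ErrorHandler", "handle")}


def interesting_frames(trace, ignored=IGNORED):
    """Return (class_name, method_name) pairs for frames worth grouping on."""
    frames = []
    for match in FRAME_RE.finditer(trace):
        class_name, method_name = match.group(1), match.group(2)
        # File name and line number are parsed but dropped here.
        if (class_name, method_name) not in ignored:
            frames.append((class_name, method_name))
    return frames
```

Because the pattern only requires whitespace after `at`, it works whether the trace arrives on one line or with one frame per line.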
Next, I serialized the array of interesting bits of information (method name and class name; in the case I dealt with, line numbers varied too much between releases to be useful, but they may be for you) and stored the hash code of that in a database. Because different systems and languages within the application all had to identify a stack trace, this was a simple text join and md5sum (if I had to do it again, I’d be tempted to serialize it as JSON and hash that).
The hash code of the interesting part of the stack trace could be computed quickly and was a fixed size (nice for database indexes and/or keys, though not bad for directories either).
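As a rough illustration of the fingerprinting idea, here is a minimal sketch of the JSON-then-hash variant mentioned above, rather than the original text join. The function names `fingerprint` and `count_by_fingerprint` are made up for this example, and the input is assumed to be a list of (class name, method name) pairs with line numbers already dropped.

```python
import hashlib
import json
from collections import Counter


def fingerprint(frames):
    """Hash a list of (class_name, method_name) pairs into a fixed-size key."""
    # Compact JSON gives an unambiguous serialization that any language
    # with a JSON library can reproduce byte for byte.
    payload = json.dumps([list(frame) for frame in frames], separators=(",", ":"))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def count_by_fingerprint(all_frames):
    """Count how many traces share each fingerprint."""
    return Counter(fingerprint(frames) for frames in all_frames)
```

The 32-character hex digest can be stored in an indexed column, at which point the aggregate the question asks for is an ordinary GROUP BY on that column.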
Stack traces longer than a certain minimum size were examined multiple times. For example, if the minimum size for further examination was 3, and the stack trace was “a b c d e”, the following stack traces would be stored:
- a b c d e
- a b c d
- b c d e
- a b c
- b c d
- c d e
Yes, sometimes one entry was just a subset of another, but it let people look at smaller parts and identify patterns of misbehavior.
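The “a b c d e” example can be reproduced with a short sketch like the following. The function name `sub_traces` is hypothetical, and returning a too-short trace unchanged is an assumption; the description above only covers traces longer than the minimum.

```python
def sub_traces(frames, min_size=3):
    """Return every contiguous run of at least min_size frames, longest first."""
    frames = tuple(frames)
    n = len(frames)
    if n < min_size:
        # Assumption: traces shorter than the minimum are kept whole.
        return [frames]
    runs = []
    for size in range(n, min_size - 1, -1):   # longest runs first
        for start in range(n - size + 1):     # slide the window left to right
            runs.append(frames[start:start + size])
    return runs
```

Calling `sub_traces("a b c d e".split(), 3)` yields the six entries listed above, in the same order. Each run would then be fingerprinted and stored separately, so a shared middle section such as “b c d” shows up as one group even when the surrounding frames differ.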