Sometimes, although not often, I have to include math logic in my code. The concepts used are mostly very simple, but the resulting code is not: a lot of variables with unclear purpose, and some operations with not-so-obvious intent. I don’t mean that the code is unreadable or unmaintainable, just that it’s waaaay harder to understand than the actual math problem. I try to comment the parts which are hardest to understand, but comments have the same problem as the code itself: text does not have the expressive power of math.
I am looking for a more efficient and easier-to-understand way of explaining the logic behind some of the complex code, preferably in the code itself. I have considered TeX: writing the documentation and generating it separately from the code. But then I’d have to learn TeX, and the documentation would not be in the code itself. Another thing I thought of is taking a picture of the mathematical notation, equations and diagrams written on paper or a whiteboard, and including it in the Javadoc.
Is there a simpler and clearer way?
P.S. Giving descriptive names (timeOfFirstEvent instead of t1) to the variables actually makes the code more verbose and even harder to read.
The right thing to do in such circumstances is to implement the algorithm, formula or whatever with exactly the same variable names as in the primary real-world source (as far as the programming language allows this), and have a succinct comment above it saying something like “Levenshtein distance computation as described in [Knuth1968]”, where the citation links to a readily accessible description of the math.
(If you don’t have such a reference, but your math is sound and useful, maybe you should consider publishing it yourself. Just sayin’.)
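As a rough sketch of what that might look like in Java, here is the well-known haversine formula with the symbols kept as they usually appear on paper. The class name, the method name and the citation are placeholders I made up for illustration; the point is only that phi, lambda, a and R in the code match the symbols in whatever source you cite.

```java
final class GreatCircle {
    private GreatCircle() {}

    /**
     * Great-circle distance via the haversine formula, as written in
     * [YourSource, eq. N] (placeholder citation; point it at the text
     * your team actually uses).
     *
     * phi = latitude, lambda = longitude (both in radians), R = sphere radius.
     * The result is in the same unit as R.
     */
    static double haversine(double phi1, double lambda1,
                            double phi2, double lambda2, double R) {
        double dPhi = phi2 - phi1;
        double dLambda = lambda2 - lambda1;
        double a = Math.pow(Math.sin(dPhi / 2), 2)
                 + Math.cos(phi1) * Math.cos(phi2)
                 * Math.pow(Math.sin(dLambda / 2), 2);
        // Rounding can push a marginally above 1, which would make asin
        // return NaN, so clamp it before taking the root.
        return 2 * R * Math.asin(Math.sqrt(Math.min(1.0, a)));
    }
}
```

Someone who knows the formula can check the body against the source symbol by symbol, which is much harder once a has become something like squaredHalfChordLength.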
When I have had to implement algorithms like that, there are a couple of things I do.
- As much as possible, isolate the algorithm in its own method or, preferably, its own class. My current project has its own equivalent of the Math class to add complex algorithms to.
- Provide a summary of what the algorithm is supposed to do in lay terms, including any common acronyms or shorthand references to the term. I do this in the method itself, so it lives with the code.
- Provide a summary of the algorithm in technical / mathematical terms and include any external references that I know of. Again, I do this with the method itself so it has a better chance of staying relevant. Plain text isn’t great in this case, so I’ll cite the mathematical term as best I can and clarify in a parenthetical comment beside it. For example, x^y (x raised to the power y).
- Document how I’m breaking the algorithm apart into components and indicate what each variable represents in the algorithm, e.g. t1 is the time of the first event.
- Code up the algorithm and comment the complex parts. Essentially, I’ll add a comment anywhere I take a step that wasn’t obvious or straightforward within the algorithm itself. I especially make sure I comment any non-obvious shortcuts I take within the implementation, and why they are okay.
- Write up some unit tests that will validate the operation of the algorithm.
Finally, if it’s really, really, really complex then I resign myself to the fact that I own that code for the remainder of my time on that project.
I don’t like relying upon an external document for someone else to understand the code. Yes, it can be necessary sometimes especially when getting into arcane details. But whenever possible, I try to keep everything within the code itself so it has a chance of staying updated and easily located. In this case, I value accessibility to information over expressiveness of the documentation.
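To make the checklist above concrete, here is a minimal sketch for a deliberately small algorithm, integer x^y. ProjectMath and power are hypothetical names chosen for the example, and binary exponentiation is just a convenient case of a "non-obvious shortcut" that deserves a comment; this is not code from my project.

```java
/** Hypothetical home for isolated algorithms; the name is an assumption. */
final class ProjectMath {
    private ProjectMath() {}

    /**
     * Lay summary: multiplies x by itself y times ("x to the power y").
     *
     * Technical summary: computes x^y (x raised to the power y) for an
     * integer y >= 0 using binary exponentiation, also known as
     * exponentiation by squaring, which needs O(log y) multiplications.
     *
     * Variables:
     *   x       base
     *   y       exponent, must be >= 0
     *   result  product accumulated so far
     *
     * Overflow is not checked; results wrap like any other long arithmetic.
     */
    static long power(long x, int y) {
        if (y < 0) {
            throw new IllegalArgumentException("y must be >= 0");
        }
        long result = 1;
        while (y > 0) {
            // Shortcut: x^y = (x^2)^(y/2) when y is even, and
            // x * (x^2)^((y-1)/2) when y is odd, so each pass halves y.
            if ((y & 1) == 1) {
                result *= x;
            }
            x *= x;
            y >>= 1;
        }
        return result;
    }
}
```

The unit tests for something like this can be as plain as checking a few hand-computed values (2^10, 5^3, anything to the power 0) plus the rejected negative exponent.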
In our projects, which revolve around research in quantitative financial economics, we use a LOT of math, and we follow a combination of what has already been posted:
- Provide a link to the main source you’re using. For us, the easiest way of doing that is using the BibTeX handle, which is basically an ID for a paper that everybody involved can look up. Depending on the specific source, we regularly add the equation reference as well.
- Provide explanations for all variables. Again, we use TeX for that if the original paper uses Greek or other letters. The reason for this is that papers and books often use different notations. If someone needs to rework the math, this makes it a lot easier.
- Attempt to code the equation in one piece. It is much easier to recognize that way. DO NOT paste the TeX code of the full equation into the code: either the equation is very short, and the TeX is messy and superfluous, or the equation is huge, and the TeX code is useless unless you compile it (use a reference instead). Disassembling an equation into small pieces makes it really hard to understand what’s going on (if you’re good at math, at least).
IMHO, the most important realization is that formulas often depend on context. Every math paper I know takes its time to set up the environment of the model; you should do the same.
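A small sketch of how these conventions could look together, using the textbook CAPM relation purely as an illustration. The class name, the method name and the BibTeX key are placeholders, not something from our code base, and the model is picked only because it is short.

```java
final class Capm {
    private Capm() {}

    /**
     * Expected return of asset i under the CAPM.
     *
     * Source: [bibtex-key-goes-here], eq. (N)  (placeholder, not a real key)
     *
     * Context: single-period model; all rates are decimals per period
     * (0.02 means 2%).
     *
     * Variables (code name, notation in the source, meaning):
     *   rF     R_f      risk-free rate
     *   betaI  \beta_i  sensitivity of asset i to the market
     *   eRm    E[R_m]   expected return of the market portfolio
     */
    static double expectedReturn(double rF, double betaI, double eRm) {
        // Kept as one expression so it can be compared with the cited
        // equation at a glance.
        return rF + betaI * (eRm - rF);
    }
}
```

Note that the comment carries the reference, the context and the notation, but not the TeX source of the equation itself; the single return statement plays that role.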
“text does not have the expressive power of math”
You are right. Since you are already looking for a way to do it outside the code, and TeX is overkill with a steep learning curve, my recommendation is as follows:
Use OpenOffice.org/LibreOffice Math Equation Editor.
It’s free. It’s open.
You can use it either visually or you can write the equations in a special language.
You don’t have to learn the language right away because when you use the GUI, the “code” is generated in a panel for you to see.
In the upper panel you can “draw” the equations using a palette. In the lower panel the equivalent notation is generated. Once you’ve got a grasp of the notation, you can do it the other way around: write the notation in the lower panel and see the graphical output in the upper panel.