I have inherited a legacy web application, many years old, which:
- Does not make use of Object Oriented principles, even though the language would permit it
- Has no unit tests, nor any sort of test suite, automated or not
- Generally ignores best practices, making the code very hard to read and maintain
- Splits its workings (including code for business rules) across the web application, database views and stored procedures. Some of the code is even externalized, in that outside applications (not under my or my team’s control) often read from or write directly to the database
- Has only the barest documentation (including some I’ve created)
- Has no documented technical or functional specs to speak of, either
- Has changed rules so many times and so fast that no one truly knows how it’s supposed to work anymore. Rules are stacked on as missing features or specific cases are discovered on a daily basis.
- Happens to be the most business-critical application of the company
- Is still constantly evolving
Refactoring almost always comes with the main suggestion of adding unit tests to make sure you don’t break anything. How do you do this when you do not even know how the thing is supposed to work (and no one else can tell you for sure)?
Is this a case where you’d need to go back to square one and slowly, iteratively, as new features are requested, ask for specs focused on the modified section and then refactor what’s there? Is there any other way to deal with this situation?
This question focuses mostly on how the lack of specs conflicts with the refactoring safety rule of creating unit tests to ensure the rules are respected throughout the refactoring.
That said, a few more bits on the situation beyond lack of specs:
- The “best” programming structure found in the code so far is the function. Functions tend not to be self-contained and rely on many global variables (even variables that should have been local were erroneously declared global, risking unexpected behavior). I get the feeling the first step might be to refactor this just to make unit testing possible at all (see the sketch after this list).
- The issues mentioned apply to the whole application, the database procedures, and even sister applications. Everything is intermingled, and almost never through interfaces. It’s still possible to find …
- Code tends to be duplicated in various places. This happens with business-rule code as well, meaning you may get different results depending on where in the application you invoke a feature. That is an obvious maintenance complexity and risk.
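To make the testability point concrete, here is a minimal sketch of the kind of first-step refactoring I have in mind. The names (`calculate_discount`, `CUSTOMER_TYPE`, `ORDER_TOTAL`) are hypothetical stand-ins, not names from the actual code:

```python
# Before: the function silently depends on mutable global state.
CUSTOMER_TYPE = "retail"
ORDER_TOTAL = 0.0

def calculate_discount():
    if CUSTOMER_TYPE == "wholesale" and ORDER_TOTAL > 1000:
        return ORDER_TOTAL * 0.10
    return 0.0

# After: the same logic with its inputs made explicit. Behavior is unchanged,
# but the function is now deterministic and unit-testable in isolation.
def calculate_discount_v2(customer_type: str, order_total: float) -> float:
    if customer_type == "wholesale" and order_total > 1000:
        return order_total * 0.10
    return 0.0

# Old call sites can delegate to the new function until they are migrated.
def calculate_discount_shim():
    return calculate_discount_v2(CUSTOMER_TYPE, ORDER_TOTAL)
```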
As long as there isn’t an explicit list of known defects that management wants fixed anyway, you should assume that the behaviour as it is now is correct, and you should strive to keep it unchanged during refactoring.
In other words, absent an explicit spec, the current semantics of the code base are the requirements. That may not be true in every single respect, but since the application as a whole is perceived to have business value (otherwise no effort would be allocated to maintaining it), presumably it does more things right than wrong. Therefore you shouldn’t spend your time figuring out what was correct and what wasn’t; if someone is unhappy with what the new version does, they’ll let you know soon enough, and that’s your opportunity to request some actual requirements.
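One practical way to treat the current semantics as the requirements is to write characterization tests: tests whose expected values are simply whatever the code returns today. A minimal sketch, using a hypothetical `legacy_price` function as a stand-in for real legacy code:

```python
import unittest

# Stand-in for a real legacy function; possibly buggy, but that is the point:
# we pin what it does today, not what we believe it should do.
def legacy_price(quantity: int) -> float:
    price = quantity * 9.99
    if quantity > 10:
        price *= 0.95  # rule or bug? Unknown, so pin it anyway.
    return price

class TestLegacyPrice(unittest.TestCase):
    def test_pins_current_behaviour(self):
        # Expected values were captured by running the current code, not
        # derived from any spec. A failure after refactoring means behaviour
        # changed, intentionally or not.
        self.assertAlmostEqual(legacy_price(1), 9.99)
        self.assertAlmostEqual(legacy_price(11), 104.3955)

if __name__ == "__main__":
    unittest.main()
```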
This is pretty much word-for-word how I described a project I got about a year ago. And, as unfortunate as it is, it’s quite a common situation.
OK, so first things first: if it is a black-box kind of module that you don’t need to fix, don’t fix it. Try to document the behaviour, maybe add some tests, but don’t fix something when you don’t know how to fix it or even how it works.
Now, assuming that this thing does, in fact, need fixing. Your first step is to try to understand the piece of code you’re working with, but after that…
There’s a bug in it:
Unit tests are the first priority. Try to cover with tests both the code in question and the code that depends on it or uses its results, to make sure you change only the thing you intend to change.
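A sketch of what that fencing-in can look like, with hypothetical names: `parse_amount` holds the bug to be fixed, and `invoice_total` depends on it, so both get pinned before the change:

```python
import unittest

# The buggy function we intend to fix (say, it ignores currency symbols).
def parse_amount(raw: str) -> float:
    return float(raw.replace(",", ""))

# Code that depends on the buggy function; its behaviour must not drift
# accidentally while we fix parse_amount.
def invoice_total(raw_amounts: list) -> float:
    return sum(parse_amount(raw) for raw in raw_amounts)

class TestBeforeBugFix(unittest.TestCase):
    def test_parse_amount_current_behaviour(self):
        self.assertEqual(parse_amount("1,234.5"), 1234.5)

    def test_dependent_code_is_pinned_too(self):
        # If fixing parse_amount changes totals unexpectedly, this fails.
        self.assertEqual(invoice_total(["100", "1,234.5"]), 1334.5)

if __name__ == "__main__":
    unittest.main()
```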
There’s a new feature that needs to be put in:
In this case I would usually reuse as little of the existing code as possible. If something is clear, like DB calls, that’s fine to reuse. If the logic goes through a spaghetti grinder, try to have a single point where it makes a parallel call into your own, new logic. This will not produce good code overall, but it will produce several “cleared-out” islands of code where the logic is sound, simple, documented and tested. The more of these, the easier it will be later when…
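A sketch of that single-point parallel call, with hypothetical names (`quote_shipping` is the new, clean module; `checkout` is the legacy entry point):

```python
# new_shipping.py -- new, self-contained, documented and tested; no globals,
# no imports from the legacy tangle.
def quote_shipping(weight_kg: float, express: bool) -> float:
    """Shipping quote for the new feature."""
    base = 4.50 + 1.20 * weight_kg
    return base * 2 if express else base

# legacy_checkout.py -- left untouched except for this one seam.
def checkout(order: dict) -> dict:
    # ... existing tangled logic stays exactly as it is ...
    order["shipping"] = quote_shipping(order["weight_kg"], order["express"])
    # ... more existing logic ...
    return order
```

The new module can be unit-tested on its own, and the legacy code touches it through exactly one call site.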
It’s so tied up with everything that it just needs to go:
Try to identify the things you do understand about the code: a non-trivial function whose side effects and output you understand; a standalone module that is somewhat separated from everything else and easier to tangle with; a set of features and specs that you can get out of people or docs. Divide and conquer is your best tactic: the smaller the piece of code you deal with at any moment, the better your chances of understanding it well. Once you have knowledge, tests and docs about the entire thing, you can start working on its architecture.
Preface: +1 @Kilian_Foth
Quote from “Gold Master Testing: Automatically Validating Millions of Data Points”:

Gold Master Testing

Gold master testing is common when working with legacy code. Rather than trying to specify all of the logical paths through an untested module, you can feed it a varied set of inputs and turn the outputs into automatically verifying tests. There’s no guarantee the outputs are correct in this case, but at least you can be sure they don’t change (which, in some systems, is even more important).

For us, given that we have a relatively reliable and comprehensive set of unit tests for our analysis code, the situation is a bit different. In short, we find gold master testing valuable because of three key factors:

- The inputs and outputs of our analysis are extremely detailed. There are a huge number of syntactical structures in Ruby, and we derive a ton of information from them.
- Our analysis depends on external code that we do not control, but do want to update from time to time (e.g. RubyParser).
- We are extremely sensitive to any changes in results. For example, even a tiny variance in our detection of complex methods across the 20k repositories we analyze would ripple into changes of class ratings, resulting in incorrect metrics being delivered to our customers.

These add up to mean that traditional unit and acceptance testing is necessary but not sufficient. We use unit and acceptance tests to provide faster results and more localized detection of regressions, but we use our gold master suite (nicknamed Krylon) to sanity-check our results against a dozen or so repositories before deploying changes.
The Golden Master approach
Before making any change to the production code, do the following:
- Create X random inputs, always using the same random seed, so you can generate exactly the same set over and over again. You will probably want a few thousand random inputs.
- Bombard the class or system under test with these random inputs.
- Capture the output for each individual random input.
When you run it for the first time, record the outputs in a file (or database, etc.). From then on, you can start changing your code, run the test, and compare the execution output with the output data you originally recorded. If they match, keep refactoring; otherwise, revert your change and you should be back to green.
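A minimal golden-master harness along those lines, assuming a hypothetical `system_under_test` function in place of the real legacy entry point:

```python
import json
import random
from pathlib import Path

SEED = 42          # fixed seed: identical inputs on every run
N_CASES = 5000     # "a few thousand random inputs"
GOLDEN_FILE = Path("golden_master.json")

def system_under_test(x: int) -> int:
    # Stand-in for the real legacy computation being refactored.
    return x * x % 97

def generate_inputs() -> list:
    rng = random.Random(SEED)
    return [rng.randint(0, 10**6) for _ in range(N_CASES)]

def run() -> None:
    outputs = [system_under_test(x) for x in generate_inputs()]
    if not GOLDEN_FILE.exists():
        # First run: record the golden master.
        GOLDEN_FILE.write_text(json.dumps(outputs))
        print("Golden master recorded.")
        return
    golden = json.loads(GOLDEN_FILE.read_text())
    diffs = [i for i, (a, b) in enumerate(zip(golden, outputs)) if a != b]
    if diffs:
        print(f"{len(diffs)} outputs changed; first mismatch at case {diffs[0]}.")
    else:
        print("All outputs match the golden master; keep refactoring.")

if __name__ == "__main__":
    run()
```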