I’m faced with similar classes A1, A2, …, A100. Believe it or not, there are roughly a hundred classes that look almost the same. None of these classes are unit tested (of course 😉). Each of these classes is about 50 lines of code, which is not much by itself. Still, this is far too much duplicated code.
I consider the following options:
1. Writing tests for A1, …, A100. Then refactor by creating an abstract base class AA.
   Pro: The tests keep me (almost completely) safe from breaking anything.
   Con: Much effort. Duplication of test code.
2. Writing tests for A1, A2. Abstracting the duplicated test code and using the abstraction to create the rest of the tests. Then create AA as in 1.
   Pro: Less effort than in 1 while maintaining a similar degree of safety.
   Con: I find generalized test code weird; it often seems … incoherent (is this the right word?). Normally I prefer specialized test code for specialized classes. But that requires a good design, which is the goal of this whole refactoring.
3. Writing AA first, testing it with mock classes. Then making A1, …, A100 inherit from it one after another.
   Pro: Fastest way to eliminate duplicates.
   Con: Most Ax classes look very much alike, but where one does not, inheriting from AA risks changing its behaviour.
4. Other options …
At first I went for 3 because the Ax classes are really very similar to each other. But now I’m a bit unsure whether this is the right way (from a unit testing enthusiast’s perspective).
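To make option 2 more concrete, here is a minimal sketch of what "abstracting the duplicated test code" might look like. Everything in it is an assumption made for illustration: the `IAx` interface, the `Compute` method, the bodies of `A1` and `A2`, and the `AxChecks` helper are all hypothetical, and the real Ax classes may not share such a simple shape. It also avoids any particular test framework on purpose.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical common shape of the Ax classes; the real classes may differ.
public interface IAx
{
    int Compute(int input);
}

// Two invented stand-ins for the near-identical classes.
public class A1 : IAx
{
    public int Compute(int input) { return input * 2; }
}

public class A2 : IAx
{
    public int Compute(int input) { return input * 2 + 1; }
}

// The abstracted test code: one check, parameterized by the class under
// test and its expected input/output pairs.
public static class AxChecks
{
    // Returns one message per failing case; an empty list means all cases passed.
    public static List<string> Check(IAx subject, IDictionary<int, int> expectedByInput)
    {
        var failures = new List<string>();
        foreach (var pair in expectedByInput)
        {
            int actual = subject.Compute(pair.Key);
            if (actual != pair.Value)
            {
                failures.Add(subject.GetType().Name + ".Compute(" + pair.Key +
                             ") returned " + actual + ", expected " + pair.Value);
            }
        }
        return failures;
    }
}
```

Each class then only contributes its own table of expected values, which keeps the class-specific part of the test visible while the mechanics are written once.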
Your options look very well thought out.
The only way to be truly confident that your refactoring doesn’t break anything is option 1, because it is essential that refactoring doesn’t change any behaviour. However, option 3 sounds like the most sensible approach.
If I can see that the same block(s) of code/logic have been copy-pasted across multiple classes, then I know that the logic is the same whether it lives in one class or in many, so it makes sense to reduce the amount of work.
As a (very simplified) example, say we have 2 classes (C#):
    class Class1
    {
        public int Method1(int x, int y)
        {
            int result = 0;
            result = x + y;
            return result;
        }
    }

    class Class2
    {
        public int Method1(int x, int y)
        {
            int result = 0;
            // Class2 adjusts its inputs before adding them.
            x++;
            y = y + 2;
            result = x + y;
            return result;
        }
    }
In the classes above we can abstract `result = x + y;` into a base class:
    class BaseClass1
    {
        public int AddTwoNumbers(int x, int y)
        {
            return x + y;
        }
    }
After both classes inherit from the base class we can safely replace `result = x + y;` with `result = base.AddTwoNumbers(x, y);` and expect the same results in the sub-classes.
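Put together, the refactored classes might look like the following sketch. It reuses the names from the example above; the inheritance declarations and the shortened method bodies are my own reading of the replacement just described, not code taken from a real project.

```csharp
// Shared logic lives in one place.
public class BaseClass1
{
    public int AddTwoNumbers(int x, int y)
    {
        return x + y;
    }
}

public class Class1 : BaseClass1
{
    public int Method1(int x, int y)
    {
        // The formerly copy-pasted statement is now a call to the base class.
        return base.AddTwoNumbers(x, y);
    }
}

public class Class2 : BaseClass1
{
    public int Method1(int x, int y)
    {
        // Class-specific adjustments stay in the subclass.
        x++;
        y = y + 2;
        return base.AddTwoNumbers(x, y);
    }
}
```

Calling `new Class1().Method1(1, 2)` still adds the two arguments unchanged, and `new Class2().Method1(1, 2)` still applies its adjustments first, so the observable behaviour of both classes is the same as before the change.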
Obviously `result = x + y;` is oversimplified and is meant to represent blocks of code that repeat. A larger amount of code just makes it harder to reach the same conclusion: repeated code refactored into a base class works the same as it did when it was copy-pasted across many classes.
There will be caveats and what-ifs to what I’ve said, but the key question is: how comfortable are you with your own refactoring ability?
Lastly, I assume your end result will go through a test team/environment/colleagues where you perform regression tests; this will validate that your refactoring is good, and if not you’ll have the opportunity to fix it and re-test.
You’re not an island, and should be able to rely on testers/colleagues to check your work as part of the development lifecycle.
I like something Ian Cooper says: “test things you want to preserve”. If you don’t want to preserve the A classes, don’t unit test them individually; try to move your tests up one level of abstraction.
One option is to test your system through a facade or something similar (I don’t know your system well enough to be specific). Test the expected behaviour of the system, not of the individual classes. Once you have tests that check the system, you can change the implementation and refactor the A classes without breaking the tests.
It seems that covering all the classes with unit tests would produce a lot of duplicated code. I would add an analysis stage before starting to write unit tests. Probably some classes have identical implementations; this could be detected by a static code analyzer (such as Simian or PMD’s Copy/Paste Detector). These classes could be removed and the code refactored using plain search-and-replace or a simple regexp.
After finishing that stage, you can start covering with unit tests the classes that have genuinely different implementations.
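To show the idea behind that analysis stage, here is a deliberately naive sketch. It is not how Simian or PMD’s detector work; it only groups classes whose source text becomes identical once the class name and whitespace are normalized. The `DuplicateFinder` class and its method names are hypothetical, and the source text is assumed to be already loaded into a dictionary keyed by class name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DuplicateFinder
{
    // Replaces the class's own name with a placeholder and collapses
    // whitespace, so classes differing only in name and layout compare equal.
    public static string Normalize(string className, string source)
    {
        string renamed = Regex.Replace(
            source, @"\b" + Regex.Escape(className) + @"\b", "__CLASS__");
        return Regex.Replace(renamed, @"\s+", " ").Trim();
    }

    // Groups class names by normalized source text. Only groups with more
    // than one member are returned, i.e. candidates for merging.
    public static List<List<string>> GroupIdentical(
        IDictionary<string, string> sourceByClassName)
    {
        return sourceByClassName
            .GroupBy(pair => Normalize(pair.Key, pair.Value), pair => pair.Key)
            .Select(group => group.OrderBy(name => name, StringComparer.Ordinal).ToList())
            .Where(names => names.Count > 1)
            .ToList();
    }
}
```

Classes that end up in the same group are textually identical apart from their names, so they are the cheapest ones to collapse first. Anything that differs by even one token stays out of the groups and still needs a closer look, which is where a real clone detector does better.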
Unit tests are great for refactoring! Just make sure you write test code in such a way that you don’t have to refactor the tests along with the code you are testing. Refactoring the tests would undermine anything that the 100 cut-and-pasted tests would prove.
Write tests one layer removed from the code you are refactoring. Meaning: Test the functionality that should be the same before and after, so that the tests can’t tell which version of the underlying code you are testing. That strategy may save you from writing 100 cut-and-pasted tests as well.
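As a rough sketch of such a test, the helper below treats the old and the new code as two black boxes and compares their results for a set of inputs. `BehaviourCheck` and `FindDifferences` are names invented for this example, and wrapping each version in a `Func` is an assumption about how the code under test can be reached.

```csharp
using System;
using System.Collections.Generic;

public static class BehaviourCheck
{
    // Runs both implementations on every input and returns the inputs
    // where their results differ. An empty list means no difference was
    // observed for these inputs; it is not a proof of equivalence.
    public static List<TInput> FindDifferences<TInput, TResult>(
        Func<TInput, TResult> before,
        Func<TInput, TResult> after,
        IEnumerable<TInput> inputs)
    {
        var differing = new List<TInput>();
        foreach (TInput input in inputs)
        {
            TResult expected = before(input);
            TResult actual = after(input);
            if (!EqualityComparer<TResult>.Default.Equals(expected, actual))
            {
                differing.Add(input);
            }
        }
        return differing;
    }
}
```

Because the check only sees inputs and results, it does not need to change when the classes underneath are merged into a base class, which is the property the tests need to have here.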
With a big refactor, some manual testing is often required.