I’m currently reading Robert Martin’s Clean Code book. So far I’m enjoying it.
Martin talks a lot about separating long functions into short functions, each doing exactly one thing at one level of abstraction.
Generally I like the idea and I understand the concept and the benefits. But I have a simple question to ask:
How is taking a logical piece of code and extracting it to a new method different from simply ‘folding’ it? Most modern IDEs feature code-folding.
The advantage of code-folding is that when you need to look at that logical piece of code, you don’t have to scroll up and down to find it. You just press a button and it’s right there in front of you, in its linear place.
Extracting small logical pieces of code to new methods means that when a programmer wants to look at the method you extracted, they usually have to go scrolling up and down the page for it. That’s very annoying and makes it harder to concentrate.
I agree with most of what Uncle Bob writes in his book, and I really want to agree with this point. But with code-folding a part of most modern IDEs, I can’t find a real reason for this idea.
Please explain the advantages of extracting small logical pieces of code to new methods, as opposed to simply code-folding them.
(I remember there was another question on this site regarding this, but it was less focused, and the answers haven’t satisfied me.)
Code folding doesn’t say anything about scope. If you have a 2,000-line function[1] that folds nicely into 200 10-line blocks, you still don’t know what sort of dependencies those blocks might have with each other.
If you factor that function into lots of little sub functions (even if each sub function is only used once), you can see at a glance what variables are used where, what the dependencies are, what can be easily changed, etc.
In short, it’s a lot easier to think about little functions where you can see the dependencies than it is to think about an apparently small code block that may depend on a LOT of variables defined hundreds of lines away.
[1]: Yes, I’m exaggerating to make a point.
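To make the point about visible dependencies concrete, here is a minimal sketch in C. The names (sum_of_prices, apply_discount, order_total) are made up for illustration and are not from the question. Each helper can only touch what arrives through its parameter list, so the data flow between the steps is readable from the signatures alone:

```c
#include <stddef.h>

/* Hypothetical helper: depends only on the array and its length. */
static int sum_of_prices(const int *prices, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += prices[i];
    }
    return total;
}

/* Hypothetical helper: depends only on the subtotal and a percentage. */
static int apply_discount(int subtotal, int discount_percent)
{
    return subtotal - (subtotal * discount_percent) / 100;
}

/* The caller reads as a list of steps. What each step needs and
 * produces is spelled out in the arguments and the return value. */
int order_total(const int *prices, size_t count, int discount_percent)
{
    int subtotal = sum_of_prices(prices, count);
    return apply_discount(subtotal, discount_percent);
}
```

Had the two steps been folded blocks inside one long function, nothing would stop the second block from quietly reading or modifying a loop variable left over from the first.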
The point of having smaller methods is not to save scrolling effort. The point is to be able to abstract completely from what a block of code does, because that makes reasoning about the superordinate routine easier. (And to foster reuse of that small block, but here I’m talking about the aid to reasoning.)
If you feel the need of checking what it really is that the factored-out method does, then the refactoring has failed. The name of the refactored method should tell you all you need to know; if there is doubt, then we’ve failed to achieve the effect we wanted.
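As a small illustration of a name that tells you all you need to know, consider this sketch (both function names are invented for the example). A reader of days_in_february can reason about it without ever opening is_leap_year:

```c
#include <stdbool.h>

/* Hypothetical extracted helper: the name states the whole contract
 * (Gregorian calendar rule). */
static bool is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* The superordinate routine is understandable on its own. */
int days_in_february(int year)
{
    return is_leap_year(year) ? 29 : 28;
}
```

If the modulo arithmetic were inlined and merely folded, the reader would still have to unfold it to learn what the condition means.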
After reading the answers and comments to this question, and arguing with someone over this topic, I can now see two clear benefits of method extraction over code folding:
1- As @DanPichelman and @Doval said, if you have a 200-line function with lots of folded blocks, you have no way to see what the dependencies between them are. They can have lots of tangled dependencies.
Functions, however, define explicit input parameters and a return value, which makes the dependencies between them much more explicit. Functions can also read class members and global state, but since the functions are small, that will be visible to the programmer.
That’s the most important point.
2- Also, since it encourages creating methods where each one does one thing, it allows for code reuse in the future. Right now you might make that new method private, but later you might need this functionality somewhere else, so you might move the method to a shared utility class or something similar. This would be easy if it’s an extracted method, and much more dangerous if it’s a folded block: again, because of the implicit dependencies between blocks.
Thanks for helping me understand
In addition to the other answers, one word: DRY. With a long folded method, there is no way to easily see if a fix you need to make in one place also needs to be made elsewhere – possibly many times – other than unfolding the lot and wading through it. Your maintenance programmers will like you better if the code is structured so that they can be confident they only have to make a change once.
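A minimal sketch of what that looks like in C, with hypothetical names (clamp_percent, set_volume, set_brightness) chosen only for illustration. The range check lives in exactly one place, so a fix to it reaches every caller:

```c
/* Hypothetical shared helper: the only copy of the range check. */
static int clamp_percent(int value)
{
    if (value < 0) {
        return 0;
    }
    if (value > 100) {
        return 100;
    }
    return value;
}

/* Both callers reuse the helper instead of repeating the check. */
int set_volume(int requested)
{
    return clamp_percent(requested);
}

int set_brightness(int requested)
{
    return clamp_percent(requested);
}
```

In a long folded method, the same check would typically be pasted into each block, and a maintainer would have to unfold everything to be sure every copy was updated.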
There are two main reasons to put code into multiple functions:
- Abstraction/Encapsulation
- Reusability
Reusability is of course the simplest to explain. If I have a function:
int do_QWERTYalgo(int data1, int data2)
{ /*do stuff*/ }
Then obviously, if I need to do that work again, I can just call the function from elsewhere, and “tada!” it’s done, and I don’t need to write it again.
Of course you know that, but have you thought about implementing a program where you need to do it multiple times with different intermediate steps?
//...
int a1 = 5;
int a2 = 7;
int a = do_QWERTYalgo(a1, a2);
int b1 = a;
int b2 = 8;
int b = do_QWERTYalgo(b1, b2);
int c = do_QWERTYalgo(a, b);
Suddenly, we’ve cut down a lot on the amount of code we need to write.
Additionally, if do_QWERTYalgo was implemented incorrectly, we only need to fix it in one single location, which brings us nicely to our other point:
do_QWERTYalgo is abstracted and encapsulated from the rest of the code.
The code above has no need to know how do_QWERTYalgo works; it just plugs in the inputs and receives the output.
Also, should do_QWERTYalgo need to change, as long as the inputs and outputs don’t change, nothing else needs to change in the code.
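Here is a rough sketch of that last point. Since the body of do_QWERTYalgo isn’t shown, I’m using a made-up stand-in, sum_to_n, with an equally made-up caller, handshake_count. The body of sum_to_n was swapped from a loop to a closed-form formula, and the caller did not have to change:

```c
/* Hypothetical function whose implementation was replaced.
 * Same inputs, same outputs, so callers are unaffected. */
int sum_to_n(int n)
{
    /* Earlier body, kept as a comment for comparison:
     *   int total = 0;
     *   for (int i = 1; i <= n; i++) total += i;
     *   return total;
     */
    if (n <= 0) {
        return 0;   /* matches the loop, which never ran for n <= 0 */
    }
    return n * (n + 1) / 2;
}

/* Hypothetical caller: it only knows the signature of sum_to_n.
 * Handshakes among `people` persons = 1 + 2 + ... + (people - 1). */
int handshake_count(int people)
{
    return sum_to_n(people - 1);
}
```

The caller is written purely against the signature, which is what lets the body change freely.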
To explain how this is meant to help you:
Let’s say you are using a library to provide encrypted communication between yourself and a friend. Then news breaks that the encryption method it uses is extremely vulnerable to a new type of attack. Suddenly, there is a problem with your chat program. What do you do?
Well, you grab an updated version of the library. Done. There may be dozens of places in your code where you use functionality provided by the library, but because the encryption functions are not embedded in your application code, you don’t need to hunt down every single use and change it manually.
Another way of thinking about this:
The function you are writing is a person. Said person wants to drive from A to B. So the person gets into the car and puts input into the car via the push_pedals and turn_wheel methods. With the correct inputs, the person gets to where they want to go.
Writing all your code in one big super method is like getting your person to implement the fuel_injection functionality, the ignite_fuel functionality, the gearbox and clutch functionalities, the turn_wheels_on_road_when_steering_wheel_turned functionality, and the push_pedals, accelerate, and brake functionalities, while also having to drive the car.
Everything still happens as before, but there is a lot more work going on.
All code-folding does is hide some of that from your view; it’s no better than giving your navigator a blindfold.