I’ve always wondered what other programmers think about the idea of creating purely aesthetic functions.
Say I have a function that processes a chunk of data: Function ProcessBigData. Say I need several processing steps, only valid for that data: Step1, Step2, Step3.
The approach I see most often in source code is to write comments like so:
Function ProcessBigData:
    # Does Step1
    Step1..
    Step1..
    # Does Step2
    Step2..
    Step2..
What I usually do, although it has always felt wrong because I rarely see my peers code this way, is:
Function ProcessBigData:
    Function Step1:
        Step1..
        Step1..
    Function Step2:
        Step2..
        Step2..
    Step1() -> Step2()
I am mainly concerned about whether there are any drawbacks to this style in JavaScript and Python.
Are there any alternatives that I’m not seeing?
1
It’s not as strange as you might think. For example, in Standard ML it’s customary to limit the scope of helper functions. Granted, SML has syntax to facilitate it:
local
  fun recursion_helper (iteration_variable, accumulator) =
    ... (* implementation goes here *)
in
  fun recursive_function (arg) = recursion_helper(arg, 0);
end
I would consider this good style, given that 1) small functions facilitate reasoning about the program, and 2) it signals to the reader that these functions are not used outside of that scope.
I suppose it’s possible there’s some overhead in creating the inner functions whenever the outer function is called (I don’t know if JS or Python optimize that away) but you know what they say about premature optimization.
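For Python at least, the nested style from the question can be sketched as below. The names `process_big_data`, `step1`, `step2`, and `make_helper` are hypothetical stand-ins, not anything from the question's real code. In CPython, an inner `def` runs each time the outer function is called, so a new function object is created per call, while the compiled body is shared. That cost is usually small.

```python
def process_big_data(data):
    """Hypothetical example: the steps are local helpers, invisible outside."""

    def step1(values):
        # Drop missing entries.
        return [v for v in values if v is not None]

    def step2(values):
        # Double each remaining value.
        return [v * 2 for v in values]

    return step2(step1(data))


def make_helper():
    # Returns its inner function so the per-call behaviour can be observed.
    def helper():
        return None

    return helper


first, second = make_helper(), make_helper()

print(process_big_data([1, None, 3]))     # [2, 6]
print(first is second)                    # False: a new function object per call
print(first.__code__ is second.__code__)  # True in CPython: the compiled body is reused
```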
It is usually a good thing to do this whenever possible, but I like to think of this sort of work not as “steps”, but as subtasks.
A subtask is a specific unit of work that can be done: it has a specific responsibility, and defined input(s) and output(s) (think of the “S” in SOLID). A subtask need not be re-usable: some people tend to think “I’ll never have to call this from anything else so why write it as a function?” but that’s a fallacy.
I’ll try to outline the benefits, and how each one applies to nested functions (closures) versus just another function in the class. Generally speaking, I’d recommend not using closures unless you specifically need one (there are many uses, but separating code into logical chunks is not one of them).
Readability
200+ lines of procedural code (body of a function) are hard to read. 2-20 line functions are easy to read. Code is for humans.
Nested or not, you mostly get the benefit of readability, unless you’re using a lot of variables from the parent scope, in which case it can be just as difficult to read.
Limit variable scope
Having another function forces you to limit variable scope, and specifically pass what you need.
This often also makes you structure code better, because if you need some sort of state variable from an earlier “step”, you might find there’s actually another subtask that should be written and executed first to get that value. In other words, it makes it harder to write highly coupled chunks of code.
Having nested functions allows you to access variables in the parent scope from inside the nested function (closure). This can be very useful, but it can also lead to subtle, hard-to-find bugs, since the nested function may run at a different point than where it is written, when the parent’s variables no longer hold the values you expected. This is even more the case if you’re modifying variables in the parent scope (a very bad idea, generally).
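As a rough illustration of that risk, here is a minimal Python sketch. All function names are hypothetical. The first version shares `total` through the parent scope, so `scale` silently depends on `compute_total` having run first. The second version passes the value explicitly, so the dependency shows up in the signature.

```python
def normalize_with_closures(data):
    total = 0  # shared state living in the parent scope

    def compute_total():
        nonlocal total  # rebinds the parent's variable
        total = sum(data)

    def scale():
        # Hidden dependency: only correct if compute_total() has already run.
        return [value / total for value in data]

    compute_total()  # remove or reorder this call and scale() divides by zero
    return scale()


def compute_total(data):
    return sum(data)


def scale(data, total):
    # The dependency on `total` is now visible in the parameter list.
    return [value / total for value in data]


def normalize(data):
    return scale(data, compute_total(data))


print(normalize_with_closures([1, 1, 2]))  # [0.25, 0.25, 0.5]
print(normalize([1, 1, 2]))                # [0.25, 0.25, 0.5]
```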
Unit tests
Each subtask, implemented as a function (or even a class), is a stand-alone, testable piece of code. The benefits of unit testing and TDD are well documented elsewhere.
Nested functions/closures can’t be unit tested directly, because nothing outside the parent function can call them. To me, this is a deal-breaker and the reason that you should just use another function, unless there is a specific need for a closure.
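A small Python sketch of the difference, using hypothetical names: the nested helper has no name a test can reach in any straightforward way, while the module-level version can be called and asserted on directly.

```python
def process_big_data(data):
    def drop_missing_inner(values):
        # Nested: exists only while process_big_data is running.
        return [v for v in values if v is not None]

    return drop_missing_inner(data)


def drop_missing(values):
    # Module-level: a test can import and call this directly.
    return [v for v in values if v is not None]


def test_drop_missing():
    assert drop_missing([1, None, 2]) == [1, 2]
    assert drop_missing([]) == []


# The nested helper is not exposed as an attribute of the outer function.
print(hasattr(process_big_data, "drop_missing_inner"))  # False
test_drop_missing()
```

The nested version can still be exercised indirectly through `process_big_data`, but then a failing test no longer tells you which step is broken.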
Working on a team / Top-down design
Subtasks can be written by different people, independently, if needed.
Even by yourself, it can be useful when writing code to simply invoke some subtask that doesn’t yet exist, while building the main functionality, and worry about actually implementing the subtask only after you know it’s going to get the results you need in a meaningful way. This is also called Top-down design/programming.
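One way this can look in Python, as a sketch with made-up names (`load_records`, `summarize`, and the `load` parameter are assumptions for illustration): the top-level flow is written first, the unfinished subtask is a stub, and a stand-in is passed in so the rest can be tried out.

```python
def load_records(path):
    # Not written yet: a placeholder so the top-level flow can be designed first.
    raise NotImplementedError("load_records is still to be written")


def summarize(records):
    return {"count": len(records), "total": sum(records)}


def process_big_data(path, load=load_records):
    # The main flow reads top-down, even though one subtask is unfinished.
    records = load(path)
    return summarize(records)


# While load_records is unfinished, a stand-in loader exercises the flow.
result = process_big_data("ignored.txt", load=lambda path: [1, 2, 3])
print(result)  # {'count': 3, 'total': 6}
```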
Code re-use
Okay, so despite what I said earlier, sometimes a reason does come up later to re-use a subtask for something else. I’m not at all advocating “architecture astronaut”-ism, only pointing out that by writing loosely coupled code, you may end up benefiting later from re-use.
Often that re-use means some refactoring, which is perfectly expected, but refactoring the input parameters to a small standalone function is MUCH easier than extracting it from a 200+ line function months after it was written, which is really my point here.
If you use a nested function, re-using it is generally a matter of refactoring to a separate function anyway, which again, is why I’d argue that nested is not the way to go.
2