Here’s a programming/language problem I’d like to hear your thoughts on.
We have developed conventions that most programmers (should) follow that aren’t a part of the language’s syntax but serve to make code more readable. These are of course always a matter of debate, but there are at least some core concepts that most programmers find agreeable: naming your variables appropriately, naming in general, keeping your lines from getting outrageously long, avoiding long functions, encapsulation, those things.
However, there’s a problem that I have yet to find anyone commenting on and that just might be the biggest one of the bunch. It’s the problem of arguments being anonymous when you call a function.
Functions stem from mathematics, where f(x) has a clear meaning because a function has a much more rigorous definition than it usually does in programming. Pure functions in mathematics can do a lot less than they can in programming, and they are a much more elegant tool: they usually take only one argument (which is usually a number) and they always return one value (also usually a number). If a function takes multiple arguments, they are almost always just extra dimensions of the function’s domain. In other words, one argument isn’t more important than the others. They are explicitly ordered, sure, but other than that, they have no semantic ordering.
In programming, however, we have more freedom in defining functions, and in this case I’d argue it isn’t a good thing. A common situation: you have a function defined like this
func DrawRectangleClipped (rectToDraw, fillColor, clippingRect) {}
Looking at the definition, if the function is written correctly, it’s perfectly clear what’s what. When calling the function, you might even have some intellisense/code completion magic going on in your IDE/editor that will tell you what the next argument should be. But wait. If I need that when I’m actually writing the call, isn’t there something we’re missing here? The person reading the code doesn’t have the benefit of an IDE, and unless they jump to the definition, they have no idea which of the two rectangles passed as arguments is used for what.
The problem goes even further than that. If our arguments come from some local variable, there might be situations where we don’t even know what the second argument is since we only see the variable name. Take for example this line of code
DrawRectangleClipped(deserializedArray[0], deserializedArray[1], deserializedArray[2])
This is alleviated to various extents in different languages but even in strictly typed languages and even if you name your variables sensibly, you don’t even mention the type the variable is when you’re passing it to the function.
As it usually is with programming, there are a lot of potential solutions to this problem. Many are already implemented in popular languages. Named parameters in C# for example. However, all that I know have significant drawbacks. Naming every parameter on every function call can’t possibly lead to readable code. It almost feels like maybe we’re outgrowing possibilities that plain text programming gives us. We’ve moved from JUST text in almost every area, yet we still code the same. More information is needed to be displayed in the code? Add more text.
Anyways, this is getting a bit tangential so I’ll stop here.
One reply I got to the second code snippet is that you would probably first unpack the array to some named variables and then use those but the variable’s name can mean many things and the way it’s called doesn’t necessarily tell you the way it’s supposed to be interpreted in the context of the called function.
In the local scope, you might have two rectangles named leftRectangle and rightRectangle because that’s what they semantically represent, but it doesn’t need to extend to what they represent when given to a function.
In fact, if your variables are named in the context of the called function, then you’re introducing less information than you potentially could with that function call, and on some level it does lead to worse code. If you have a procedure that results in a rectangle you store in rectForClipping and then another procedure that provides rectForDrawing, then the actual call to DrawRectangleClipped is just ceremony: a line that means nothing new and is there just so the computer knows what exactly you want, even though you’ve explained it already with your naming. This isn’t a good thing.
I’d really love to hear fresh perspectives on this. I’m sure I’m not the first one to consider this a problem, so how is it solved?
16
I agree that the way functions are often used can be a confusing part of writing code, and especially reading code.
The answer to this problem partly depends on the language. As you mentioned, C# has named parameters. Objective-C’s solution to this problem involves more descriptive method names. For example, stringByReplacingOccurrencesOfString:withString: is a method with clear parameters.
In Groovy, some functions take maps, allowing for a syntax like the following:
restClient.post(path: 'path/to/somewhere',
body: requestBody,
requestContentType: 'application/json')
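The same call-site labelling can be sketched in Python, where the function author can require names for only the arguments that are easy to confuse. This is an illustration, not anything from the original question: the Rect class and the body of draw_rectangle_clipped are hypothetical stand-ins that describe the call instead of drawing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def draw_rectangle_clipped(rect_to_draw: Rect, *, fill_color: str, clipping_rect: Rect) -> str:
    """Hypothetical stand-in for DrawRectangleClipped; it only describes the call."""
    # Every parameter after the bare * must be passed by name.
    return f"draw {rect_to_draw} filled {fill_color} clipped to {clipping_rect}"


canvas = Rect(0, 0, 100, 100)
badge = Rect(90, 90, 30, 30)

# The reader can tell which rectangle plays which role without the definition.
description = draw_rectangle_clipped(badge, fill_color="red", clipping_rect=canvas)
print(description)

# Passing the clipping rectangle positionally is rejected at call time.
try:
    draw_rectangle_clipped(badge, "red", canvas)
except TypeError as error:
    print("rejected:", error)
```

The first argument stays positional because its role is obvious from the function name; only the ambiguous ones carry a label.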
In general, you can solve this issue by limiting the number of parameters you pass to a function. I think 2-3 is a good limit. If it appears that a function needs more parameters, it causes me to re-think the design. But, this can be harder to answer generally. Sometimes you are trying to do too much in a function. Sometimes it makes sense to consider a class for storing your parameters. Also, in practice, I often find that functions which take large numbers of parameters normally have many of them as optional.
Even in a language like Objective-C it makes sense to limit the number of parameters. One reason is that many parameters are optional. For an example, see rangeOfString: and its variations in NSString.
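As a rough sketch of the "class for storing your parameters" idea, here is how a family of optional arguments could be bundled in Python. The names SearchOptions and range_of_string are hypothetical and only loosely modelled on the rangeOfString: variations; this is not how NSString is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SearchOptions:
    """Hypothetical bundle of the optional knobs for a substring search."""
    case_sensitive: bool = True
    start: int = 0
    end: Optional[int] = None  # None means "to the end of the text"


def range_of_string(text: str, needle: str,
                    options: SearchOptions = SearchOptions()) -> Optional[range]:
    """Return the range of the first match, or None if there is none."""
    haystack, target = text, needle
    if not options.case_sensitive:
        haystack, target = text.lower(), needle.lower()
    end = len(text) if options.end is None else options.end
    index = haystack.find(target, options.start, end)
    return None if index == -1 else range(index, index + len(needle))


# The two required arguments stay positional; every optional one is named.
print(range_of_string("Hello World", "world"))
print(range_of_string("Hello World", "world", SearchOptions(case_sensitive=False)))
```

The function keeps two obvious inputs, and a caller only spells out the options that differ from the defaults.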
A pattern I often use in Java is to use a fluent-style class as a parameter. For example:
something.draw(new Box().withHeight(5).withWidth(20))
This uses a class as a parameter, and with a fluent-style class, makes for easily readable code.
The above Java snippet also helps where the ordering of parameters may not be so obvious. We normally assume with coordinates that X comes before Y. And I normally see height before width as a convention, but that is still not very clear (something.draw(5, 20)).
I’ve also seen some functions like drawWithHeightAndWidth(5, 20), but even these can’t take too many parameters, or you’d start to lose readability.
2
Mostly it’s solved by good naming of functions, parameters, and arguments. You already explored that and found it had deficiencies, however. Most of those deficiencies are mitigated by keeping functions small, with a small number of parameters, both in the calling context and the called context. Your particular example is problematic because the function you are calling is trying to do several things at once: specify a base rectangle, specify a clipping region, draw it, and fill it with a specific color.
This is kind of like trying to write a sentence using only the adjectives. Put more verbs (function calls) in there, create a subject (object) for your sentence, and it’s easier to read:
rect.clip(clipRect).fill(color)
Even if clipRect and color have terrible names (and they shouldn’t), you can still discern their types from the context.
Your deserialized example is problematic because the calling context is trying to do too much at once: deserializing and drawing something. You need to assign names that make sense and clearly separate the two responsibilities. At a minimum, something like this:
(rect, clipRect, color) = deserializeClippedRect()
rect.clip(clipRect).fill(color)
A lot of readability problems are caused by trying to be too concise, skipping intermediate stages that humans require to discern context and semantics.
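To make the two snippets above concrete, here is one way they might look in Python. Everything in it is an assumption made for illustration: the Rect class, its edge-based fields, the fill method that returns a description instead of painting, and the layout of the deserialized array.

```python
from dataclasses import dataclass
from typing import Any, Sequence, Tuple


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int
    bottom: int

    def clip(self, clip_rect: "Rect") -> "Rect":
        """Return the part of this rectangle that lies inside clip_rect."""
        return Rect(
            max(self.left, clip_rect.left),
            max(self.top, clip_rect.top),
            min(self.right, clip_rect.right),
            min(self.bottom, clip_rect.bottom),
        )

    def fill(self, color: str) -> str:
        """Stand-in for real drawing: describe what would be painted."""
        if self.left >= self.right or self.top >= self.bottom:
            return "nothing to draw"
        return f"fill ({self.left}, {self.top})-({self.right}, {self.bottom}) with {color}"


def deserialize_clipped_rect(values: Sequence[Any]) -> Tuple[Rect, Rect, str]:
    """Hypothetical decoder: the only place that knows the array layout."""
    rect, clip_rect, color = values
    return Rect(*rect), Rect(*clip_rect), str(color)


deserialized_array = [(0, 0, 50, 50), (10, 10, 100, 100), "blue"]

# Step 1: give the raw values names. Step 2: say what happens to them.
rect, clip_rect, color = deserialize_clipped_rect(deserialized_array)
result = rect.clip(clip_rect).fill(color)
print(result)
```

Each call now takes a single argument, so there is nothing left to mix up at the call site.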
2
In practice, it’s solved by better design. It is exceptionally uncommon for well-written functions to take more than 2 inputs, and when it does occur, it’s uncommon for those many inputs to not be able to be aggregated into some cohesive bundle. This makes it pretty easy to break up functions or aggregate parameters so you’re not making a function do too much. Once it has two inputs, it becomes easy to name and much clearer about which input is which.
My toy language had the concept of phrases to deal with this, and other more natural language focused programming languages have had other approaches to deal with it, but they all tend to have other downsides. Plus, even phrases are little more than a nice syntax around making functions have better names. It’s always going to be hard to make a good function name when it takes a bunch of inputs.
2
In JavaScript (or ECMAScript), for example, many programmers grew accustomed to passing parameters as a set of named object properties in a single anonymous object.
As a programming practice, it spread from programmers to their libraries, and from there to other programmers who grew to like it, use it, and write more libraries in the same style.
Example
Instead of calling
function drawRectangleClipped (rectToDraw, fillColor, clippingRect)
like this:
drawRectangleClipped(deserializedArray[0], deserializedArray[1], deserializedArray[2])
, which is a valid and correct style, you call the
function drawRectangleClipped (params)
like this:
drawRectangleClipped({
rectToDraw: deserializedArray[0],
fillColor: deserializedArray[1],
clippingRect: deserializedArray[2]
})
, which is valid and correct and nice with regard to your question.
Of course, there have to be suitable conditions for this: in JavaScript this is much more viable than in, say, C. In JavaScript, this even gave birth to a now widely used structured notation that grew popular as a lighter counterpart to XML. It’s called JSON (you may have already heard about it).
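For comparison, a similar "bundle of named values" call can be sketched in Python by unpacking a dictionary into keyword-only parameters. The function body here is a hypothetical stand-in that just reports which value landed in which role.

```python
from typing import Any, Dict


def draw_rectangle_clipped(*, rect_to_draw: Any, fill_color: Any,
                           clipping_rect: Any) -> Dict[str, Any]:
    """Hypothetical stand-in: returns the roles it received instead of drawing."""
    return {"draw": rect_to_draw, "fill": fill_color, "clip": clipping_rect}


deserialized_array = ["rect A", "green", "rect B"]

# The dictionary plays the part of the anonymous object: every value is labelled.
params = {
    "rect_to_draw": deserialized_array[0],
    "fill_color": deserialized_array[1],
    "clipping_rect": deserialized_array[2],
}
call = draw_rectangle_clipped(**params)
print(call)

# A misspelled or missing key raises TypeError when the call is made.
try:
    draw_rectangle_clipped(**{"rect_to_draw": "rect A", "fill_colour": "green"})
except TypeError as error:
    print("rejected:", error)
```

Because the parameters are declared in the signature, the labels at the call site are checked against it when the call runs.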
2
You should use Objective-C then. Here is a method definition:
- (id)performSelector:(SEL)aSelector withObject:(id)anObject withObject:(id)anotherObject
And here it is used:
[someObject performSelector:someSelector withObject:someObject2 withObject:someObject3];
I think Ruby has similar constructs, and you can simulate them in other languages with key-value lists.
For complex functions in Java, I like to define dummy variables named in the function’s wording. For your left-right example:
Rectangle referenceRectangle = leftRectangle;
Rectangle targetRectangle = rightRectangle;
doSomeWeirdStuffWithRectangles(referenceRectangle, targetRectangle);
Looks like more coding, but you can for example use leftRectangle and then refactor the code later with “Extract local variable” if you think it will not be understandable to a future maintainer of the code, which might or might not be you.
1
My approach is to create temporary local variables, but not just call them LeftRectangle and RightRectangle. Rather, I use somewhat longer names to convey more meaning. I often try to differentiate the names as much as possible, e.g. not call both of them something_rectangle, if their role is not very symmetric.
Example (C++):
auto& connector_source = deserializedArray[0];
auto& connector_target = deserializedArray[1];
auto& bounding_box = deserializedArray[2];
DoWeirdThing(connector_source, connector_target, bounding_box);
and I might even write a one-liner wrapper function or template:
// Thin wrapper: its name and parameter names document each argument's role.
template <typename T1, typename T2, typename T3>
void draw_bounded_connector(
    T1& connector_source, T2& connector_target, const T3& bounding_box)
{
    DoWeirdThing(connector_source, connector_target, bounding_box);
}
(ignore the ampersands if you don’t know C++).
If the function does several weird things with no good description – then it probably needs to be refactored!