Is there a reason why functions in most(?) programming languages are designed to support any number of input parameters but only one return value?
In most languages, it is possible to “work around” that limitation, e.g. by using out-parameters, returning pointers or by defining/returning structs/classes. But it seems strange that programming languages were not designed to support multiple return values in a more “natural” way.
Is there an explanation for this?
11
Some languages, like Python, support multiple return values natively, while some languages like C# support them via their base libraries.
But in general, even in languages that support them, multiple return values are not used often because they’re sloppy:
- Functions that return multiple values are hard to name clearly.
- It’s easy to mistake the order of the return values:
(password, username) = GetUsernameAndPassword()
(For this same reason, many people avoid having too many parameters to a function; some even take it as far as to say a function should never have two parameters of the same type!)
- OOP languages already have a better alternative to multiple return-values: classes.
They’re more strongly-typed, they keep the return values grouped as one logical unit, and they keep the names of (properties of) the return values consistent across all uses.
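To make the ordering point concrete, here is a small Python sketch. The `Credentials` type and `get_credentials` function are made up for illustration (they are not from any library), and the hard-coded values stand in for a real lookup.

```python
from typing import NamedTuple


class Credentials(NamedTuple):
    """Hypothetical return type; the field names are illustrative."""
    username: str
    password: str


def get_credentials() -> Credentials:
    # Hard-coded values stand in for a real lookup.
    return Credentials(username="alice", password="s3cret")


# Positional unpacking still runs if the order is swapped by mistake:
password, username = get_credentials()  # silently wrong: password == "alice"

# Access by field name cannot be mixed up this way:
creds = get_credentials()
print(creds.username, creds.password)
```

The swapped unpacking raises no error, which is exactly the problem; the named fields keep the meaning attached to each value.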
The one place they are pretty convenient is in languages (like Python) where multiple return values from one function can be used as multiple input parameters to another. But, the use-cases where this is a better design than using a class are pretty slim.
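A rough illustration of that convenience in Python, using argument unpacking. The functions `to_polar` and `describe` are invented for this example.

```python
import math


def to_polar(x: float, y: float) -> tuple[float, float]:
    # Returns two values packed into one tuple.
    return math.hypot(x, y), math.atan2(y, x)


def describe(radius: float, angle: float) -> str:
    return f"r={radius:.1f}, theta={angle:.2f}"


# The star spreads the returned tuple over describe's two parameters.
print(describe(*to_polar(3.0, 4.0)))  # r=5.0, theta=0.93
```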
24
Because functions are mathematical constructs that perform a calculation and return a result. Indeed, much of what’s “under the hood” of quite a few programming languages focuses solely on one input and one output, with multiple inputs being just a thin wrapper around the single input. When a single-value output doesn’t work, a single cohesive structure (or tuple, or Maybe) serves as the output, even though that “single” return value is composed of many values.
This has not changed because programmers have found out parameters to be awkward constructs that are useful in only a limited set of scenarios. Like with many other things, the support isn’t there because the need/demand isn’t there.
33
In mathematics, a “well-defined” function is one where there is only 1 output for a given input (as a side note, you can have only single input functions, and still semantically get multiple inputs using currying).
For multi-valued functions (e.g. the square root of a positive number), it’s sufficient to return a collection or sequence of values.
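Both ideas can be sketched in a few lines of Python. The names `add` and `square_roots` are illustrative only, and this is just one way to write them.

```python
import math


def add(x):
    # Curried addition: a one-argument function that returns
    # another one-argument function.
    def add_x(y):
        return x + y
    return add_x


def square_roots(n: float) -> set[float]:
    # Every real square root, returned as one collection value.
    if n < 0:
        return set()
    root = math.sqrt(n)
    return {root, -root}


print(add(2)(3))          # 5
print(square_roots(9.0))  # a set containing 3.0 and -3.0
```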
For the types of functions you’re talking about (i.e. functions that return multiple values of different types), I see it slightly differently than you seem to: I see the need for out params as a workaround for the lack of a better design or a more useful data structure. For example, I’d prefer if the *.TryParse(...) methods returned a Maybe<T> monad instead of using an out param. Think of this code in F#:
let s = "1"
match tryParse s with
| Some(i) -> // do whatever with i
| None -> // failed to parse
Compiler/IDE/analysis support is very good for these constructs. This would solve much of the “need” for out params. To be completely honest, I can’t think of any other methods off-hand where this wouldn’t be the solution.
For other scenarios – the ones I can’t remember – a simple Tuple suffices.
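For readers who don’t know F#, roughly the same shape can be written in Python with `Optional`. The helper `try_parse_int` is a made-up name for this sketch, not an existing API.

```python
from typing import Optional


def try_parse_int(text: str) -> Optional[int]:
    # One return value: either the parsed int or None, no out param needed.
    try:
        return int(text)
    except ValueError:
        return None


for s in ("1", "abc"):
    value = try_parse_int(s)
    if value is not None:
        print(f"parsed {value}")
    else:
        print(f"failed to parse {s!r}")
```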
9
In addition to what’s already been said: when you look at the paradigms used in assembly, a function that returns leaves a pointer to the returned object in a specific register. If functions used a variable number of registers, the calling function would not know where to get the returned value(s) when the callee lived in a library. That would make linking to libraries difficult, so instead of allowing an arbitrary number of returnable pointers, they went with one. Higher-level languages don’t quite have the same excuse.
15
A lot of the use cases where you would have used multiple return values in the past simply aren’t necessary anymore with modern language features. Want to return an error code? Throw an exception or return an Either<T, Throwable>. Want to return an optional result? Return an Option<T>. Want to return one of several types? Return an Either<T1, T2> or a tagged union.
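As a rough sketch of the Either / tagged-union idea in Python: the `Ok`, `Err`, `safe_divide` and `render` names below are assumptions made for this example, not part of any standard library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Ok:
    value: float


@dataclass
class Err:
    message: str


# A hand-rolled tagged union: a Result is exactly one of the two cases.
Result = Union[Ok, Err]


def safe_divide(a: float, b: float) -> Result:
    if b == 0:
        return Err("division by zero")
    return Ok(a / b)


def render(result: Result) -> str:
    # The caller has to look at the tag before it can touch the payload.
    if isinstance(result, Ok):
        return f"ok: {result.value}"
    return f"error: {result.message}"


print(render(safe_divide(6, 3)))  # ok: 2.0
print(render(safe_divide(1, 0)))  # error: division by zero
```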
And even in the cases where you genuinely need to return multiple values, modern languages usually support tuples or some kind of data structure (list, array, dictionary) or objects as well as some form of destructuring bind or pattern matching, which makes packaging up your multiple values into a single value and then destructuring it again into multiple values trivial.
Here are a few examples of languages that do not support returning multiple values. I don’t really see how adding support for multiple return values would make them significantly more expressive to offset the cost of a new language feature.
Ruby
def foo; return 1, 2, 3 end
one, two, three = foo
one
# => 1
three
# => 3
Python
def foo(): return 1, 2, 3
one, two, three = foo()
one
# >>> 1
three
# >>> 3
Scala
def foo = (1, 2, 3)
val (one, two, three) = foo
// => one: Int = 1
// => two: Int = 2
// => three: Int = 3
Haskell
let foo = (1, 2, 3)
let (one, two, three) = foo
one
-- > 1
three
-- > 3
Perl6
sub foo { 1, 2, 3 }
my ($one, $two, $three) = foo
$one
# > 1
$three
# > 3
12
The real reason that a single return value is so popular is the expressions that are used in so many languages. In any language where you can have an expression like x + 1 you are already thinking in terms of single return values, because you evaluate an expression in your head by breaking it up into pieces and deciding the value of each piece. You look at x and decide that its value is 3 (for example), and you look at 1, and then you look at x + 1 and put it all together to decide that the value of the whole is 4. Each syntactic part of the expression has one value, not any other number of values; that is the natural semantics of expressions that everyone expects. Even when a function returns a pair of values it’s still really returning one value that’s doing the job of two values, because the idea of a function that returns two values that aren’t somehow wrapped up into a single collection is too weird.
People don’t want to deal with the alternative semantics that would be required to have functions return more than one value. For example, in a stack-based language like Forth you can have any number of return values because each function simply modifies the top of the stack, popping inputs and pushing outputs at will. That’s why Forth doesn’t have the sort of expressions that normal languages have.
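A toy imitation of that stack discipline in Python may help. This is only a sketch of the idea, not how a real Forth is implemented; `push`, `dup`, `swap` and `add` are hand-written stand-ins for the corresponding Forth words.

```python
stack: list[int] = []


def push(n: int) -> None:
    stack.append(n)


def dup() -> None:
    # One input, two outputs: duplicates the top of the stack.
    stack.append(stack[-1])


def swap() -> None:
    # Two inputs, two outputs: exchanges the top two items.
    stack[-1], stack[-2] = stack[-2], stack[-1]


def add() -> None:
    # Two inputs, one output.
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)


push(3); dup(); add()   # 3 dup +
print(stack)            # [6]
push(1); swap()         # 1 swap
print(stack)            # [1, 6]
```

None of these words "returns" anything in the expression sense; each one just leaves however many values it likes on the shared stack.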
Perl is another language that can sometimes act like functions are returning multiple values, even though it’s usually just considered returning a list. The way lists “interpolate” in Perl gives us lists like (1, foo(), 3), which might have 3 elements as most people who don’t know Perl would expect, but could just as easily have only 2 elements, 4 elements, or any greater number of elements depending on foo(). Lists in Perl are flattened so that a syntactic list doesn’t always have the semantics of a list; it can be merely a piece of a larger list.
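Python does not flatten implicitly the way Perl does in list context, but the effect can be imitated with an explicit `*`. The function `foo` here is a stand-in that returns a list of whatever length it is asked for.

```python
def foo(n: int) -> list[int]:
    # Stand-in for a function that returns a variable number of values.
    return list(range(n))


for n in (0, 1, 2):
    items = [1, *foo(n), 3]  # the star splices foo's values into the list
    print(len(items), items)
# 2 [1, 3]
# 3 [1, 0, 3]
# 4 [1, 0, 1, 3]
```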
Another way to have functions return multiple values would be to have an alternative expression semantics where any expression can have multiple values and each value represents a possibility. Take x + 1 again, but this time imagine that x has two values {3, 4}; then the values of x + 1 would be {4, 5}, and the values of x + x would be {6, 8}, or maybe {6, 7, 8}, depending on whether one evaluation is allowed to use multiple values for x. A language like that might be implemented using backtracking, much like Prolog uses to give multiple answers to a query.
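The two readings of x + x can be spelled out with Python sets. This only enumerates the possibilities eagerly; it is not a backtracking implementation.

```python
from itertools import product

x = {3, 4}

# x + 1: apply the operation to every possible value of x.
x_plus_1 = {v + 1 for v in x}

# x + x where both occurrences must take the same value of x.
x_plus_x_same = {v + v for v in x}

# x + x where each occurrence may choose its value independently.
x_plus_x_independent = {a + b for a, b in product(x, repeat=2)}

print(sorted(x_plus_1))              # [4, 5]
print(sorted(x_plus_x_same))         # [6, 8]
print(sorted(x_plus_x_independent))  # [6, 7, 8]
```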
In short, a function call is a single syntactic unit and a single syntactic unit has a single value in the expression semantics that we all know and love. Any other semantics would force you into weird ways of doing things, like Perl, Prolog, or Forth.
As suggested in this answer, it is a matter of hardware support, though tradition in language design also plays a role.
when a function returns it leaves a pointer to the returning object in a specific register
Of the three first languages, Fortran, Lisp and COBOL, the first used a single return value as it was modeled on mathematics. The second returned an arbitrary number of parameters the same way it received them: as a list (it could also be argued that it only passed and returned a single parameter: the address of the list). The third returned zero or one value.
These first languages greatly influenced the design of the languages that followed them, though the only one which returned multiple values, Lisp, never gained much popularity.
When C came along, while influenced by the languages before it, it placed great emphasis on efficient use of hardware resources, keeping a close association between what the C language did and the machine code that implemented it. Some of its oldest features, such as “auto” vs “register” variables, are a result of that design philosophy.
It must also be pointed out that assembly language was widely popular until the 80s, when it finally started to be phased out of mainstream development. People who wrote compilers and created languages were familiar with assembly and, for the most part, kept to what worked best there.
Most of the languages that diverged from this norm never found much popularity, and, therefore, never played a strong role influencing the decisions of language designers (who, of course, were inspired by what they knew).
So let’s go examine assembly language. Let’s look first at the 6502, a 1975 microprocessor that was famously used by the Apple II and VIC-20 microcomputers. It was very weak compared to what was used in the mainframe and minicomputers of the time, though powerful compared to the first computers of 20, 30 years before, at the dawn of programming languages.
If you look at the technical description, it has 5 registers plus a few one-bit flags. The only “full” register was the Program Counter (PC), which points to the next instruction to be executed. The other registers were the accumulator (A), two “index” registers (X and Y), and a stack pointer (SP).
Calling a subroutine puts the PC in the memory pointed to by the SP, and then decrements the SP. Returning from a subroutine works in reverse. One can push and pull other values on the stack, but it is difficult to refer to memory relative to the SP, so writing re-entrant subroutines was difficult. This thing we take for granted, calling a subroutine at any time we feel like, was not so common on this architecture. Often, a separate “stack” would be created so that parameters and subroutine return address would be kept separate.
If you look at the processor that inspired the 6502, the 6800, it had an additional register, the Index Register (IX), as wide as the SP, which could receive the value from the SP.
On that machine, calling a re-entrant subroutine consisted of pushing the parameters on the stack, pushing PC, changing PC to the new address, and then the subroutine would push its local variables on the stack. Because the number of local variables and parameters is known, addressing them can be done relative to the stack. For example, a function receiving two parameters and having two local variables would look like this:
SP + 8: param 2
SP + 6: param 1
SP + 4: return address
SP + 2: local 2
SP + 0: local 1
It can be called any number of times because all the temporary space is on the stack.
The 8080, used on TRS-80 and a host of CP/M-based microcomputers, could do something similar to the 6800, by pushing SP on the stack and then popping it on its indirect register, HL.
This is a very common way of implementing things, and it got even more support on more modern processors, with the Base Pointer that makes dumping all local variables before returning easy.
The problem, then, is how do you return anything? Processor registers weren’t very numerous early on, and one often needed to use some of them even to find out which piece of memory to address. Returning things on the stack would be complicated: you’d have to pop everything, save the PC, push the returning parameters (which would be stored where meanwhile?), then push the PC again and return.
So what was usually done was reserving one register for the return value. The calling code knew the return value would be in a particular register, that would have to be preserved until it could be saved or used.
Let’s look at a language that does allow multiple return values: Forth. What Forth does is keep a separate return stack (RP) and data stack (SP), so that all a function has to do is pop all its parameters and leave the return values on the stack. Since the return stack is separate, it does not get in the way.
As someone who learned assembly language and Forth in the first six months of experience with computers, multiple return values look entirely normal to me. Operators such as Forth’s /mod, which returns the quotient and the remainder of an integer division, seem obvious. On the other hand, I can easily see how someone whose early experience was C might find that concept strange: it goes against their ingrained expectations of what a “function” is.
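For anyone who has not met /mod, Python’s built-in `divmod` is probably the closest everyday analogue: one call, two results. Python hands them back as a tuple rather than leaving them on a stack, and its rounding for negative operands may differ from a given Forth system, so the sketch sticks to positive numbers.

```python
# One operation, two results: quotient and remainder together.
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2

# The two results always satisfy the usual division identity.
print(quotient * 5 + remainder)  # 17
```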
As for math… well, I was programming computers way before I ever got to functions in mathematics classes. There is a whole section of CS and programming languages which is influenced by mathematics, but, then again, there’s a whole section which is not.
So we have a confluence of factors where math influenced early language design, where hardware constraints dictated what was easily implemented, and where the popular languages influenced how the hardware evolved (the Lisp machine and Forth machine processors were roadkills in this process).
13
The functional languages I know of can return multiple values easily through the use of tuples (in dynamically typed languages, you can even use lists). Tuples are also supported in other languages:
f :: Int -> (Int, Int)
f x = (x - 1, x + 1)
// Even C++ has tuples - see Boost.Graph for use
#include <utility>  // std::pair, std::make_pair
std::pair<int, int> f(int x) {
    return std::make_pair(x - 1, x + 1);
}
In the example above, f is a function returning 2 ints.
Similarly, ML, Haskell, F#, etc., can also return data structures (pointers are too low-level for most languages). I have not heard of a modern GP language with such a restriction:
data MyValue = MyValue Int Int
g :: Int -> MyValue
g x = MyValue (x - 1) (x + 1)
Finally, out parameters can be emulated even in functional languages by IORef. There are several reasons why there is no native support for out variables in most languages:
- Unclear semantics: Does the following function print 0 or 1? I know of languages that would print 0, and ones that would print 1. There are benefits to both (in terms of performance as well as matching the programmer’s mental model):
int x;
void f(out int y) {
    x = 0;
    y = 1;
    printf("%d\n", x);
}
f(out x);
- Non-localized effects: As in the example above, you can have a long chain of calls in which the innermost function affects the global state. In general, it makes it harder to reason about what the requirements of the function are, and whether the change is legal. Given that most modern paradigms try to either localize the effects (encapsulation in OOP) or eliminate the side effects (functional programming), it conflicts with those paradigms.
- Being redundant: If you have tuples, you have 99% of the functionality of out parameters and 100% of idiomatic use. If you add pointers to the mix you cover the remaining 1%.
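The "0 or 1?" question from the first bullet can be simulated in Python. This is only a model of the two possible semantics, under the assumption that the choice is between aliasing (by reference) and copy-out at return; `Cell` is a made-up helper standing in for a variable’s storage, and the global x is passed explicitly to keep the sketch self-contained.

```python
class Cell:
    """A mutable box standing in for a variable's storage location."""

    def __init__(self, value: int = 0) -> None:
        self.value = value


def f_by_reference(x: Cell, y: Cell) -> int:
    # y is an alias for whatever the caller passed (here: x itself).
    x.value = 0
    y.value = 1
    return x.value  # the value printf would show


def f_copy_out(x: Cell, y: Cell) -> int:
    # y is a fresh local; it is copied to the caller's variable on return.
    local_y = Cell()
    x.value = 0
    local_y.value = 1
    printed = x.value  # the value printf would show
    y.value = local_y.value  # copy-out happens at return
    return printed


x = Cell()
print(f_by_reference(x, x))  # 1
x = Cell()
print(f_copy_out(x, x))      # 0
```

In both cases x ends up holding 1 after the call; the two models differ only in what the function observes while it is still running.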
I have trouble naming one language which could not return multiple values by using a tuple, class or out parameter (and in most cases 2 or more of those methods are allowed).
7
I think it’s because of expressions, such as (a + b[i]) * c.
Expressions are composed of “singular” values. A function returning a singular value can thus be directly used in an expression, in place of any of the four variables shown above. A multi-output function is at least somewhat clumsy in an expression.
I personally feel that this is the thing that’s special about a singular return value. You could work around this by adding syntax for specifying which of the multiple return values you want to use in an expression, but it’s bound to be more clumsy than the good old mathematical notation, which is concise and familiar to everyone.
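A small Python illustration of that clumsiness, with arbitrary example values. `abs` stands in for any single-valued function and `divmod` for a two-valued one.

```python
a, c = 2, 10
b = [7, 8, 9]
i = 1

# Single-valued call: drops straight into the expression.
single = (a + abs(b[i])) * c

# Two-valued call: one result has to be picked out explicitly first.
picked = (a + divmod(b[i], 3)[0]) * c

print(single)  # 100
print(picked)  # 40
```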
It does complicate the syntax a little, but there’s no good reason at the implementation level not to allow it. Contrary to some of the other responses, returning multiple values, where available, leads to clearer and more efficient code. I can’t count how often I have wished I could return an X and a Y, or a “success” boolean and a useful value.
8
In most languages where functions are supported, you can use a function call anywhere a variable of that type can be used:
x = n + sqrt(y);
If the function returns more than one value, this will not work. Dynamically typed languages such as Python will let you write it, but in most cases it will raise a run-time error unless the language can work out something sensible to do with a tuple in the middle of an equation.
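This is easy to try in Python. The function `sqrt_pair` is hypothetical and exists only to return a tuple where a number is expected.

```python
import math


def sqrt_pair(y: float) -> tuple[float, float]:
    # Hypothetical function returning both square roots as a tuple.
    r = math.sqrt(y)
    return r, -r


n = 1.0
x = n + math.sqrt(4.0)  # fine: float + float
print(x)                # 3.0

try:
    x = n + sqrt_pair(4.0)  # float + tuple
except TypeError as exc:
    print("TypeError:", exc)
```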
4
I just want to build on Harvey’s answer. I originally found this question on a tech news site (Ars Technica) and found an amazing explanation that I feel really answers the core of this question and is lacking from all the other answers (except Harvey’s):
The origin of single return from functions lies in the machine code. At the machine code level, a function can return a value in the A (accumulator) register. Any other return values will be on the stack.
A language that supports two return values will compile it as machine code that returns one, and puts the second on the stack. In other words, the second return value would end up as an out parameter anyway.
It is like asking why assignment is one variable at a time. You could have a language that allowed
a, b = 1, 2
for instance. But it would end up at the machine code level being a = 1 followed by b = 2.
There is some rationale in having programming language constructs bear some semblance to what will actually happen when the code is compiled and running.
2
It started with math. FORTRAN, named for “Formula Translation”, was the first compiler. FORTRAN was and is oriented to physics/math/engineering.
COBOL, nearly as old, had no explicit return value; it barely had subroutines. Since then it’s been mostly inertia.
Go, for example, has multiple return values, and the result is cleaner and less ambiguous than using “out” parameters. After a little bit of use, it is very natural and efficient. I recommend multiple return values be considered for all new languages. Maybe for old languages, too.
4
It probably has more to do with the legacy of how function calls are made in processor machine instructions and the fact that all programming languages derive from machine code: for example, C -> Assembly -> Machine.
How Processors Perform Function Calls
The first programs were written in machine code and then later assembly. The processors supported function calls by pushing a copy of all of the current registers to the stack. Returning from the function would pop the saved set of registers from the stack. Usually one register was left untouched to allow the returning function to return a value.
Now, as to why the processors were designed this way… it was likely a question of resource constraints.
0