I'm wondering whether it would be theoretically possible for a compiler to determine the maximum stack depth at compile time if a limitation were placed on the program. The limitation being that you can't use recursion or cyclic calls. By cyclic calls I don't mean loops but function calls (a -> b, b -> a). The program flow would then basically have to follow a directed acyclic graph.
Under these circumstances, can every program still be written, or are there programs that only work without the above limitations?
18
The strict hierarchy of tasks that are computable under one theoretical model of computation vs. another really only exists "in the limit", i.e. when you allow unlimited resources. For example, there are problems that can be solved with a deterministic finite-state automaton, and there are problems that can't be, no matter how many states you allow.
In real life all resources are limited, so technically all programming paradigms are equivalent in power. However, some make the same task much, much smaller and simpler.
For instance, exploring a labyrinth is pretty simple with recursion and a call stack; you can still write essentially the same program without recursion, but you will have to re-implement the stack functionality yourself, which is almost always going to be a lot less efficient than what the compiler would have given you. This is why choosing the right paradigm matters even for real-life problems.
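For illustration, here is a minimal C sketch of the non-recursive version, with a hand-rolled stack of positions (the maze layout and all names are invented for the example):

```c
#include <stdbool.h>

#define W 4
#define H 4

/* 1 = wall, 0 = open; a made-up maze for illustration */
static const int maze[H][W] = {
    {0, 1, 0, 0},
    {0, 1, 0, 1},
    {0, 0, 0, 1},
    {1, 1, 0, 0},
};

/* Depth-first search from (sx, sy) to (tx, ty) without recursion:
   the pending positions live in an explicit array-backed stack. */
bool reachable(int sx, int sy, int tx, int ty) {
    bool seen[H][W] = {{false}};
    int stack[H * W][2];                 /* each cell is pushed at most once */
    int top = 0;

    stack[top][0] = sx; stack[top][1] = sy; top++;
    seen[sy][sx] = true;

    while (top > 0) {
        top--;
        int x = stack[top][0], y = stack[top][1];
        if (x == tx && y == ty)
            return true;
        static const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
        for (int d = 0; d < 4; d++) {
            int nx = x + dx[d], ny = y + dy[d];
            if (nx >= 0 && nx < W && ny >= 0 && ny < H
                && !maze[ny][nx] && !seen[ny][nx]) {
                seen[ny][nx] = true;
                stack[top][0] = nx; stack[top][1] = ny; top++;
            }
        }
    }
    return false;
}
```

The recursive version would be a few lines shorter; the explicit `seen` and `stack` bookkeeping is exactly the machinery the compiler would otherwise generate for you.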
10
I think it is clear we are talking about a programming language environment where all code is organized into functions with some local scope, all functions exist at compile time, and the execution environment uses a call stack to handle function calls.
In that case, "no direct or indirect recursive function calls" simply means the call graph is a directed acyclic graph (DAG). When one stops the program at any point in time (for example, with a debugger), one can see the call stack as a sequence of function calls, and no function may occur twice within that sequence; otherwise there would be a recursion.
Hence, a trivial compile-time-determinable upper limit for the call stack depth of such a program is the total number of functions in the program. When the only kind of function calls allowed are static ones, which can be determined at compile time, there is a better bound, namely the depth of the call graph, i.e. the longest call chain starting from the entry point. This can be derived easily from the information about which function calls which other function, for example as sketched below.
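To illustrate, here is a minimal sketch of that derivation (not what any particular compiler does; the call graph below is invented): a memoized depth-first search computes the longest possible call chain over the static call graph:

```c
#include <stdio.h>

#define NFUNCS 4
enum { MAIN, A, B, C };

/* calls[f][g] is 1 if function f contains a static call to g.
   Here: main -> a, b;  a -> b;  b -> c. */
static const int calls[NFUNCS][NFUNCS] = {
    [MAIN] = { [A] = 1, [B] = 1 },
    [A]    = { [B] = 1 },
    [B]    = { [C] = 1 },
};

static int memo[NFUNCS];                 /* 0 = not yet computed */

/* Longest call chain starting at f, counting f's own frame.
   (This analysis itself recurses, which is safe precisely because
   the analyzed call graph is acyclic.) */
static int depth(int f) {
    if (memo[f]) return memo[f];
    int best = 0;
    for (int g = 0; g < NFUNCS; g++)
        if (calls[f][g] && depth(g) > best)
            best = depth(g);
    return memo[f] = best + 1;
}

int main(void) {
    printf("max call depth from main: %d\n", depth(MAIN));  /* prints 4 */
    return 0;
}
```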
Unfortunately, as soon as you allow dynamic function calls, it cannot be determined reliably at compile time whether a program will make recursive calls at run time or not. Most major programming languages today have mechanisms for making dynamic function calls, deciding which function to call based on run-time criteria. Functions or function references may be passed around as parameters to other functions. This makes it possible to write functions which can call any other function in the program, including themselves (or not), and one cannot easily find out at compile time whether there is some restriction on that set of "any other functions" which forbids recursion. I don't have a formal proof at hand, but I am convinced this makes determining whether a program will make recursive calls equivalent to the halting problem.
Another issue I see is the following. As you probably know, any recursive program can be converted to a non-recursive one. In general, you have to introduce a stack structure of your own for this. Such a converted program may have a compile-time-determinable upper bound on the call stack size after the conversion, but now the custom stack can grow arbitrarily. Hence, though the call stack size may be limited, the total amount of required memory will not change. (Side note: in reality, such a conversion can make a lot of sense for programming environments where the call stack size cannot be changed dynamically, but custom stack sizes can.)
3
The max call stack depth for a program without recursion would be at most the number of functions in the program.
It is always possible to rewrite recursive functions to be non-recursive with the help of a data structure like a linked list.
So the immediate answer is yes, you can rewrite any program to have a fixed-size call stack or eliminate the call stack altogether. But it comes at the price of introducing another data structure of unbounded size.
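For example, here is a minimal C sketch of such a rewrite (all names invented): a recursive tree sum turned into a loop whose pending work lives in a linked list:

```c
#include <stdlib.h>

struct node { int value; struct node *left, *right; };

/* Recursive original, for comparison:
   int sum(struct node *t) {
       return t ? t->value + sum(t->left) + sum(t->right) : 0;
   } */

struct todo { struct node *tree; struct todo *next; };  /* linked-list stack */

static void push(struct todo **top, struct node *t) {
    struct todo *e = malloc(sizeof *e);   /* error handling omitted */
    e->tree = t;
    e->next = *top;
    *top = e;
}

int sum_iterative(struct node *t) {
    int total = 0;
    struct todo *top = NULL;
    if (t) push(&top, t);
    while (top) {
        struct todo *e = top;             /* pop one pending subtree */
        top = e->next;
        struct node *n = e->tree;
        free(e);
        total += n->value;
        if (n->left)  push(&top, n->left);   /* these two pushes replace */
        if (n->right) push(&top, n->right);  /* the two recursive calls  */
    }
    return total;
}
```

The call stack stays bounded, but the `todo` list can grow without bound, which is exactly the point made above.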
Or: use a compiler to translate your program into assembly. Then write a simulator that can execute assembly programs. Then store the assembly code as a huge array of static data and let the simulator run it.
Take a read of the Church–Turing thesis; in short, all recursion can be replaced with iteration. Now that your limitation is removed, the question is simply: can a compiler determine max stack usage?
The answer to that is unfortunately no. You could write a simple program that loops while a random bool is true, with the body of the loop pushing to the stack. There's no way to know at compile time whether your random bool will be true on every iteration, for eternity.
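A minimal C sketch of that idea (using the non-standard but widely available `alloca` to grow the stack inside a loop):

```c
#include <stdlib.h>
#include <string.h>
#include <alloca.h>   /* non-standard, but common on Unix-like platforms */

int main(void) {
    size_t used = 0;
    /* In a real program the condition would depend on input or true
       randomness; rand() stands in for "unknowable at compile time". */
    while (rand() % 2) {
        char *p = alloca(64);   /* 64 more bytes of stack; alloca'd memory
                                   is only released when main returns */
        memset(p, 1, 64);
        used += 64;
    }
    return used > 63;           /* use the result so it isn't optimized away */
}
```

No analyzer can put a finite bound on the stack usage of this program; it can only report the usage as dynamic.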
You could do what GCC does and classify each function's stack usage as static, bounded, or dynamic. But you will never be able to determine the max stack usage of all programs.
To answer your specific questions:
- Can every program still be written without recursion or cyclic calls? Yes, all recursion can be replaced with iteration.
- Is it possible for a compiler to determine maximum stack depth at compile time? No: as the example above shows, stack usage can depend on run-time data.
2
With a normal programming language, it’s theoretically possible to write a static analyzer that determines a program’s maximum stack usage, with the caveat that for some programs, the analyzer will answer “I don’t know”.
Proof that it's possible: "always answer 'I don't know'" is a correct (if boring) solution. Proof that the caveat is necessary if there is a way to write programs that overflow the stack: for any static analyzer, there exists a subprogram `undecidable()` such that the analyzer cannot decide whether it returns true or false. The analyzer cannot analyze the stack usage of `if undecidable() then cause_stack_overflow() else do_nothing()`.
So the interesting questions are:
- Are there programming languages that are still Turing-complete (i.e. that can express everything we think of as (sequential) computation), but don’t have a stack as such?
- If you want memory usage analysis to always be possible, what computing power do you lose?
- Can stack usage analysis be practical?
Programming without a stack
Arbitrary computations require the ability to write programs that can take arbitrarily long and use arbitrary amounts of memory. That memory doesn't have to be organized as a stack, however. Suppose you have a programming language with a stack (whatever exactly the stack is; typically at least a function call stack). Let's write an interpreter for it in the form of a while loop that executes one instruction at a time. If the interpreter needs to remember information about past instructions (for example a point to jump back to), it allocates an object on its heap. The interpreter itself doesn't have any function calls. Thanks to this interpreter, you now have a programming language that's executed without a stack.
Of course, if the original language had something like function calls, the abstract structure of its call stack still exists in the interpretation. But now it consists of links between heap objects, not directly a stack structure in memory.
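Here is a minimal sketch of that idea in C (the micro-VM and its instruction set are invented for the example). Apart from library calls, the interpreter's own control flow is one flat loop, and the interpreted language's call stack exists only as linked heap objects:

```c
#include <stdio.h>
#include <stdlib.h>

enum op { OP_CALL, OP_RET, OP_PRINT, OP_HALT };

struct insn { enum op op; int target; };

struct frame {                       /* a call frame, but on the heap */
    int return_pc;
    struct frame *caller;            /* link to the previous frame */
};

int main(void) {
    /* program: 0: CALL 2;  1: HALT;  2: PRINT;  3: RET */
    struct insn code[] = {
        { OP_CALL, 2 }, { OP_HALT, 0 }, { OP_PRINT, 0 }, { OP_RET, 0 },
    };
    int pc = 0;
    struct frame *top = NULL;

    for (;;) {                       /* one instruction per iteration */
        struct insn i = code[pc];
        if (i.op == OP_CALL) {
            struct frame *f = malloc(sizeof *f);  /* error handling omitted */
            f->return_pc = pc + 1;   /* the point to jump back to */
            f->caller = top;
            top = f;
            pc = i.target;
        } else if (i.op == OP_RET) {
            struct frame *f = top;
            pc = f->return_pc;
            top = f->caller;
            free(f);
        } else if (i.op == OP_PRINT) {
            puts("hello from the subroutine");
            pc++;
        } else {                     /* OP_HALT */
            return 0;
        }
    }
}
```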
What is theoretically necessary is the ability for memory usage to grow over the execution of the program, not that this growth involves a stack structure. Practically, though, most programming languages have something like function calls, so they have a function call stack. This is because [functions](https://en.wikipedia.org/wiki/Function_(computer_programming)) (or subroutines, procedures, whatever you want to call them) are a very convenient way to get composability, where you can write a piece of code and reuse it. If you've ever tried programming in an esoteric programming language without composability, you probably felt the pain.
What I've shown informally here is that imperative programming with heap allocation is as powerful (in terms of computability) as functional programming with recursion. The converse also holds: both are common ways of achieving Turing completeness.
Programming in bounded memory
We’ve seen that you can “hide” a stack in some memory that isn’t organized as a stack. Can we forbid that as well? What happens if we prevent programs from using arbitrary amounts of memory?
If a programming language only allows programs to use a fixed amount of memory, then every program is effectively a finite automaton. The theoretical computing power of finite automata is very restricted: all they can do is recognize whether inputs match a regular expression. Even very simple things, such as determining whether a string has balanced parentheses, become impossible.
Of course, in practice, we run programs on computers that have a finite amount of memory. But we don't reason about the finite memory as such, because it's intractable and extremely ad hoc. When we wonder "does my program work correctly?", we don't want to get into the details that some particularly convoluted input only works if the computer has at least that many GB of RAM, and only if such-and-such program isn't running at the same time, and only if the heap wasn't fragmented by some previous processing, etc. (The exception is some high-reliability systems where we're willing to give up flexibility for certainty, which I'll come back to later.) So practical reasoning about programs involves theoretical models where, if you run out of RAM, you just add more.
Coming back to theory, we’ve seen that if you impose that every program must use no more than some fixed amount of memory, your computing model is limited to finite automata. But you can have programs whose memory usage is statically analyzable, yet not statically bounded. For example, a common pattern is that a program will use a predictable amount of memory, but the amount depends on the input. If you follow this pattern that a program’s resource usage can be unbounded but must be predictable, you get primitive recursive functions. Primitive recursion allows nesting function calls and loops, but you have to announce the number of calls or iterations in advance. That number can itself be a runtime calculation, but that calculation has to be bounded as well, and so on.
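In C terms, the discipline looks like this (a made-up example): every loop bound is computed before the loop starts and never changes inside it:

```c
/* Multiplication by repeated addition: the iteration count b is
   known before the loop runs. */
unsigned mul(unsigned a, unsigned b) {
    unsigned result = 0;
    for (unsigned i = 0; i < b; i++)
        result += a;
    return result;
}

/* Nesting is allowed: the bound of the next loop is itself the
   result of a bounded calculation, and so on, as described above. */
unsigned square(unsigned n) {
    unsigned bound = mul(n, n);   /* announced in advance */
    unsigned total = 0;
    for (unsigned i = 0; i < bound; i++)
        total += 1;
    return total;
}
```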
Not all computation can be done with primitive recursion. Turing completeness requires primitive recursion plus some form of unbounded iteration, such as a while loop or unbounded recursion. However, a lot of algorithms are primitive recursive because the number of iterations can be bounded by the size of some data structure.
Bounding stack usage in practice
In embedded programming, it's common to run programs with very small stacks. It's also sometimes inconvenient to detect stack overflow, and even when a stack overflow can be detected, you can't recover from it; you can only halt the system. If this is a high-reliability system, you definitely don't want to do that. And so in embedded development, it's fairly common to run a static checker to ensure that the program will not overflow the stack. This is a whole-program check that of course depends on the exact platform you're deploying to and on the details of how the compiler optimizes your program.
In order for the stack usage analysis to be practical, recursion is forbidden. If you have a problem that lends itself to a stack, such as searching for a value in a tree, you have to manage memory manually. If this involves heap allocation, you have to handle out-of-memory errors explicitly. Function pointers and dynamic dispatch are forbidden or at least restricted to what your analyzer supports.
Such static checkers exist for languages typically used in low-level programming. For example, `gcc -fstack-usage` works for C, C++ and other languages supported by GCC. I don't know how practical it is with C++, where dynamic dispatch tends to happen far more often than in C.
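As a small illustration (the byte counts and the exact report format vary by platform, GCC version, and optimization level):

```c
/* demo.c -- compile with:  gcc -fstack-usage -c demo.c
   GCC then writes demo.su, a text file with one line per function
   giving its stack usage in bytes plus a qualifier such as
   "static" or "dynamic", roughly like:
       demo.c:8:6:leaf    32    static
   (the numbers here are invented and differ across targets). */
void leaf(void) {
    char buf[16];        /* fixed-size local: reported as "static" usage */
    buf[0] = 0;
}
```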
Brainfuck programs can be crammed into a single main procedure with a single dispatch loop, and everything can theoretically be compiled to Brainfuck. Memory usage is unbounded in the Turing-complete limit, but stack usage is constant and there are no function calls aside from those used for I/O. So the answer is yes.
3
Pigeonhole principle
I think this question is being asked so broadly and vaguely that it ceases to be useful as an actual code metric.
The direct answer here is that if you disallow cyclic calls, then the pigeonhole principle effectively puts an upper limit on your stack depth. You could not possibly have a stack with more levels than there are methods in your codebase.
What are you asking?
can every program still be written
It depends on what you mean by "every program" and whether you expect the programs to map one-to-one or not.
For example, recursion allows you to solve any maze, and you don't need to know the size and scope of the maze when building your program. For a given maze size, you could figure out how many levels of recursion there would be and hand-write all of them. It's not efficient, but it can theoretically be done.
But do you consider this a valid rewrite of the program, given that we're now limited to this size of maze and can't solve bigger ones? We could move the goalposts and hand-write more recursion levels, but there's always going to be some kind of upper limit to your code.
We could take the machine's resources as an upper boundary and write a program that fits within them. For example, the limits of memory inherently limit the size of the maze that we can process, so we could hand-write enough recursion levels to accommodate that upper-limit maze.
But still I wonder if you consider this a solution, because a recursive program would effectively be capable of solving bigger mazes if you give it more hardware resources, without needing to rewrite or even recompile the code.
The one limitation that even a recursive program has is the size of its data structures. An integer can only be so big, so even your recursive program is going to break at some size, even though up to that point we never had to touch the recursive program (while we've had to rewrite the non-recursive one several times now).
So is that your upper limit? If it is, then I have a sneaking suspicion that the amount of hand-written recursion is already going to yield a binary so huge that it itself runs into resource constraints when executed, which is a whole different level of problems to solve.
Conclusion
So, what it boils down to is what you’re asking:
Can we build a non-recursive program that follows a specific runtime’s callstack of a recursive program?
Yes. Run it recursively, record the callstack, copy/paste to your heart’s content.
Can we build a non-recursive program that is capable of solving dynamically sized problems just like a recursive program can?
No, because hand-writing your recursion levels inherently limits you to the finite amount of hand-writing you’ve done.
That being said, recursive programs don’t run on infinite resources or data types with infinite capacity either, so infinity is not a real world consideration in the first place. Since this is a theoretical question, the onus is on you (the asker) to define where you draw the line of what’s sufficiently equivalent between recursive and non-recursive programs.
8
I’m wondering if it was theoretically possible for a compiler to determine maximum stack depth at compile time
No, this is not possible. The call depth usually depends on the input data, so in the general case, you would always have to assume “worst case” regarding the input (i.e., arbitrary size). So in the general case, without any sort of “annotation” or constraints on the input, or further information (on a higher level than what a compiler usually works with), a compiler does not have enough information.
If you want a concrete, albeit somewhat pathological, example, think of the Ackermann function, for which a general compiler has no chance of guessing the number of calls, and hence the depth of the stack (see this question for the complexities involved for even small values).
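For reference, the function itself is tiny; it is the call depth that explodes:

```c
/* The Ackermann function: three lines of code, yet the recursion
   depth exceeds any real call stack already for inputs as small
   as ack(4, 2). */
unsigned long ack(unsigned long m, unsigned long n) {
    if (m == 0) return n + 1;
    if (n == 0) return ack(m - 1, 1);
    return ack(m - 1, ack(m, n - 1));
}
```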
can every program still be written or are there programs that can only work without above limitations?
Yes, every recursive program can be rewritten to work without a call stack. As long as your programming language is Turing complete, which it will be if it supports WHILE loops, you can (as a human) transform every recursive program into other code that uses only very basic control statements.
If you do this, you will in the general case be using some kind of data structure which serves the same overall purpose as the original call stack; it will, of course, in theory also need to grow without limit. You can see an example of this on the Wikipedia page for the Ackermann function, which is trivially transformed into a "term rewriting system": a WHILE loop without any actual function calls, as sketched below.
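A minimal C sketch of that transformation (the growth policy of the explicit stack is a choice made for this example): the call stack stays flat, and the pending outer calls live in a heap-allocated array instead:

```c
#include <stdlib.h>

unsigned long ack_iter(unsigned long m, unsigned long n) {
    size_t cap = 1024, top = 0;
    unsigned long *pending = malloc(cap * sizeof *pending);  /* error handling omitted */

    pending[top++] = m;
    while (top > 0) {
        m = pending[--top];
        if (m == 0) {
            n = n + 1;                     /* ack(0, n) = n + 1 */
        } else if (n == 0) {
            pending[top++] = m - 1;        /* ack(m, 0) = ack(m - 1, 1) */
            n = 1;
        } else {
            if (top + 2 > cap) {           /* grow the heap-based stack */
                cap *= 2;
                pending = realloc(pending, cap * sizeof *pending);
            }
            pending[top++] = m - 1;        /* outer call, resumed later   */
            pending[top++] = m;            /* inner call ack(m, n - 1)... */
            n = n - 1;                     /* ...continues immediately    */
        }
    }
    free(pending);
    return n;
}
```

The call depth of `ack_iter` is constant, but `pending` grows exactly as much as the call stack of the recursive version would have; the memory requirement has merely moved to the heap.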
Hence, by avoiding recursive function calls you can get past hard limits of the actual call stack; you cannot get past the problem that you may eventually run out of memory. But if you are in a context where the RAM available for general data structures is much larger than the separate call stack, this can absolutely be an applicable solution.