Suppose I have a very simple C program that just does this:
int i = 6;
int j = 4;
int k = 5;
int a = i + j + k;
Since i
, j
, and k
are on the stack, they will be located relative to the stack pointer. I am told the compiler determines these relative locations; my guess is the compiler translates int a = i + j + k
into “Add the values located at (stack pointer – 3x), (stack pointer – 2x), and (stack pointer – x) and push the result to the stack.” Am I correct?
3
It depends upon the C compiler, as the specification leaves this up to the compiler implementer. (There are specific requirements, though — such as if you take the address of a parameter — which may further constrain the compiler implementer.)
A typical C function generates some “prologue” code, the “body” code, and some “epilogue” code. The prologue will allocate local variable space. In general, a compiler will NOT go through the process you outlined. So let’s say we are talking about 16-bit x86 code and about this function in C:
int f( void ) {
int i = 6;
int j = 4;
int k = 5;
int a = i + j + k;
return a;
}
Note that in the context I mentioned, the variables will be 2 bytes in size. The compiler will count the number of bytes required for all of the local variables. In this case, you have four of them and since they require 2 bytes each, the compiler “computes” that 8 bytes are required for all of this. (Also note that even if you include variable definitions inside of additional code blocks within the above C function, the C compiler will count ALL OF THEM AT ONCE. It doesn’t just count those defined at the outer level.) So the compiler might generate the following prologue in assembly:
push bp
mov bp, sp
sub sp, 8
Usually, the BP register is used as a “frame pointer” for the current context of the function. BP is also known as the pointer to the “activation frame.” So the C compiler needs two instructions to save the old activation frame pointer on the stack and to then initialize it to point to the current one for this invocation of the function.
The third instruction is simple and allocates ALL of the necessary space for the 8 bytes needed for ALL of the local variables. It simply moves the stack pointer over the needed 8 bytes. BP will still point at the base of this activation frame. But the SP now is at the other side of that, so additional pushes and calls won’t write over the local variables.
The C compiler will also internally assign some offset value for each of the local variables. Perhaps something like IVAR=0, JVAR=2, KVAR=4, and AVAR=6. Now, the C compiler needs to create some BODY code. Because you set these local variable values, something like this (completely un-optimized):
mov IVAR[BP], 6
mov JVAR[BP], 4
mov KVAR[BP], 5
mov ax, IVAR[BP]
add ax, JVAR[BP]
add ax, KVAR[BP]
Note that in this 16-bit context, it’s also common for the return value of a function to be placed into AX, if it fits there. In this case, it does. So the answer is already in the right place at this time. So now an epilogue is required:
mov sp, bp
pop bp
ret
And that’s it.
Now, an optimizer will do a GREAT DEAL to improve the above code in this case. It will probably completely remove ALL of the local variables, as they aren’t needed at all. It can pre-compute the entire result as 15 and just do the following:
mov ax, 15
ret
No need to manage the activation frame, at all. Just return the value. (But then you wouldn’t get any idea about how the local variables might be managed.)
Hope that helps a little.
3
Yes, you are correct. The compiler never actually determines the exact memory locations, it only determines the relative offsets of these locations from the stack pointer. The value that the stack pointer will have during run-time is unknown to the compiler at compilation time.
(Actually, on 16-bit and 32-bit Intel architectures these locations are relative to the so-called “base pointer”, (register BP
, in other architectures known as “the frame pointer”,) whose value is determined from the stack pointer, but that’s irrelevant. They have fixed it in 64-bit and now only the stack pointer is necessary.)
Since i,j, and k are on the stack, they will be located relative to the stack pointer.
First of all, this statement is plainly wrong in a common case with modern compilers. You seem to get used to looking only at very old fashioned compilers, which really worked in the way they allocate positions on stack, map variables to them, and use registers only when explicitly requested (with register
keyword) or for temporary values. (Do you make some homework under MS-DOS and 16-bit compilers?) This is a really ancient way.
Modern compilers (GCC, Clang/LLVM, etc.) utilizes SSA and accompanying techniques. With SSA, there are no “variables” internally; there are nearly unchangeable values (except execution path merging points), each of them can get an own place, either in a register or on the stack, depending on how often it is used and how soon it will be used again. If you named some variable “int i”, it can be at EAX at one part of function, at EDI at another one, at stack when swapped out of the register pool…
Sometimes, a variable is optimized out at all, if compiler assumes it’s easier to use another ones, or even a constant. Sometimes, it is present, but changed (e.g. cycle by i
from 1 to N, if accesses an array as x[i-1], is changed to cycle from 0 to N-1 and access as x[i]). And so on and so on, optimization list is nearly infinite.
You can be sure a variable has its place on stack only when you get its address. Nevertheless, it is a variable address only when this address is valid. If you call another function as g(&i)
, i
is placed onto stack before g
called, but it can be immediately moved into register or used as an expression argument when g() finished, because this address is never used again.
Example. The C function is:
int f(int i, int j, int k) {
return i + j + k;
}
It’s translated for x86_64/Unix to:
addl %esi, %edi
leal (%rdi,%rdx), %eax
ret
No stack is used at all. The calling convention specifies registers for argument and result placing, and that’s how it is implemented. (Notice that the same adding is add
and then lea
– this is spanning instruction flow among different execution units.)
Add an external action for i
:
void g(int*);
int f(int i, int j, int k) {
g(&i);
return i + j + k;
}
Assembly output:
pushq %rbp
movl %esi, %ebp
pushq %rbx
movl %edx, %ebx
subq $24, %rsp
movl %edi, 12(%rsp) ; <-- storing i onto stack
leaq 12(%rsp), %rdi ; <-- getting &i
call g
addl 12(%rsp), %ebp ; <-- using possibly modified i
addq $24, %rsp
leal 0(%rbp,%rbx), %eax
popq %rbx
popq %rbp
ret
Here, i
comes in RDI, then is placed onto stack before g(&i)
, and then used as summand directly from stack, without returning to a register. (Also, frame pointer is adjusted because this isn’t a leaf function anymore.)
Consider again you are utilizing time machine and/or some very embedded environment. SSA is an expensive technique. The mapping of variable set to stack (or, earlier, memory) is how compilers had been worked from the very beginning, namely, early Fortran versions. For that case, you are right. And, the question in that case is how to keep track of possible address changes.
On S/360, a compiler would use a memory chunk after the function body for its variables, and utilize the following instruction sequence to get a base address for this chunk:
BALR 15,0
USING *,15
register 15 is a conventional one for this goal, and after this it gets the address just after the command. Then, offset-based memory access is used for variables, like:
LR 3,196(15)
here 4-byte value from this base plus 196 is loaded into register 3. For the entire function run, i
will be at 196(15).
Notice this approach is not stack-based; a function is not reentrant.
With PDP-11, but again not stack-based Fortran, this can be done with instruction pointer based access, like
MOV 196(PC), R3
because there is no need to allocate a register for such memory access. But offsets will vary (an immediately next instruction will have to use 192 instead of 196).
With stack, this changes to SP-based access and no variable area near the function body:
MOV 16(SP), R3
but see below for offset changing when SP changes.
With x86 before i386, BP-based stack access was mandatory; since i386, it’s still usable, but not unique. After BP setup at function prologue, offsets remain fixed, so you can use the same expression, like [EBP-20] (Intel), -20(%ebp) (AT&T), for a variable during function call lifetime. With ESP/RSP-based access (since i386), offsets are changed with any push/pop. If i
was [ESP+20], a single 4-byte push “converts” it to [ESP+24]; “converts” mean the memory address is the same, but formula for it is changed among with ESP adjusting. For example, if g(i,&i,&i)
is called, a compiler may produce the following sequence to form the argument list (x86_32 calling conventions are mainly stack-based):
## assume i is now at 24(%esp)
leal 24(%esp), %eax ; &i to eax
pushl %eax
leal 28(%esp), %eax ; &i to eax
pushl %eax
movl 32(%esp), %eax ; i to eax
pushl %eax
(It is obviously suboptimal, but expressing.)