In x86, where are the memory addresses of the values on the stack located?

Suppose I have a very simple C program that just does this:

int i = 6;
int j = 4;
int k = 5;
int a = i + j + k;

Since i, j, and k are on the stack, they will be located relative to the stack pointer. I am told the compiler determines these relative locations; my guess is the compiler translates int a = i + j + k into “Add the values located at (stack pointer – 3x), (stack pointer – 2x), and (stack pointer – x) and push the result to the stack.” Am I correct?

It depends upon the C compiler, as the specification leaves this up to the compiler implementer. (There are specific requirements, though — such as if you take the address of a parameter — which may further constrain the compiler implementer.)

A typical C function generates some “prologue” code, the “body” code, and some “epilogue” code. The prologue will allocate local variable space. In general, a compiler will NOT go through the process you outlined. So let’s say we are talking about 16-bit x86 code and about this function in C:

int f( void ) {
    int i = 6;
    int j = 4;
    int k = 5;
    int a = i + j + k;
    return a;
}

Note that in the context I mentioned, the variables will be 2 bytes in size. The compiler will count the number of bytes required for all of the local variables. In this case, you have four of them and since they require 2 bytes each, the compiler “computes” that 8 bytes are required for all of this. (Also note that even if you include variable definitions inside of additional code blocks within the above C function, the C compiler will count ALL OF THEM AT ONCE. It doesn’t just count those defined at the outer level.) So the compiler might generate the following prologue in assembly:

push bp
mov bp, sp
sub sp, 8

Usually, the BP register is used as a “frame pointer” for the current context of the function. BP is also known as the pointer to the “activation frame.” So the C compiler needs two instructions to save the old activation frame pointer on the stack and to then initialize it to point to the current one for this invocation of the function.

The third instruction is simple and allocates ALL of the necessary space for the 8 bytes needed for ALL of the local variables. It simply moves the stack pointer over the needed 8 bytes. BP will still point at the base of this activation frame. But the SP now is at the other side of that, so additional pushes and calls won’t write over the local variables.

The C compiler will also internally assign some offset value for each of the local variables. Perhaps something like IVAR=0, JVAR=2, KVAR=4, and AVAR=6. Now, the C compiler needs to create some BODY code. Because you set these local variable values, something like this (completely un-optimized):

mov IVAR[BP], 6
mov JVAR[BP], 4
mov KVAR[BP], 5
mov ax, IVAR[BP]
add ax, JVAR[BP]
add ax, KVAR[BP]

Note that in this 16-bit context, it’s also common for the return value of a function to be placed into AX, if it fits there. In this case, it does. So the answer is already in the right place at this time. So now an epilogue is required:

mov sp, bp
pop bp
ret

And that’s it.

Now, an optimizer will do a GREAT DEAL to improve the above code in this case. It will probably completely remove ALL of the local variables, as they aren’t needed at all. It can pre-compute the entire result as 15 and just do the following:

mov ax, 15
ret

No need to manage the activation frame, at all. Just return the value. (But then you wouldn’t get any idea about how the local variables might be managed.)

Hope that helps a little.

Yes, you are correct. The compiler never actually determines the exact memory locations, it only determines the relative offsets of these locations from the stack pointer. The value that the stack pointer will have during run-time is unknown to the compiler at compilation time.

(Actually, on 16-bit and 32-bit Intel architectures these locations are relative to the so-called “base pointer”, (register BP, in other architectures known as “the frame pointer”,) whose value is determined from the stack pointer, but that’s irrelevant. They have fixed it in 64-bit and now only the stack pointer is necessary.)

Since i,j, and k are on the stack, they will be located relative to the stack pointer.

First of all, this statement is plainly wrong in a common case with modern compilers. You seem to get used to looking only at very old fashioned compilers, which really worked in the way they allocate positions on stack, map variables to them, and use registers only when explicitly requested (with register keyword) or for temporary values. (Do you make some homework under MS-DOS and 16-bit compilers?) This is a really ancient way.

Modern compilers (GCC, Clang/LLVM, etc.) utilizes SSA and accompanying techniques. With SSA, there are no “variables” internally; there are nearly unchangeable values (except execution path merging points), each of them can get an own place, either in a register or on the stack, depending on how often it is used and how soon it will be used again. If you named some variable “int i”, it can be at EAX at one part of function, at EDI at another one, at stack when swapped out of the register pool…

Sometimes, a variable is optimized out at all, if compiler assumes it’s easier to use another ones, or even a constant. Sometimes, it is present, but changed (e.g. cycle by i from 1 to N, if accesses an array as x[i-1], is changed to cycle from 0 to N-1 and access as x[i]). And so on and so on, optimization list is nearly infinite.

You can be sure a variable has its place on stack only when you get its address. Nevertheless, it is a variable address only when this address is valid. If you call another function as g(&i), i is placed onto stack before g called, but it can be immediately moved into register or used as an expression argument when g() finished, because this address is never used again.

Example. The C function is:

int f(int i, int j, int k) {
  return i + j + k;
}

It’s translated for x86_64/Unix to:

    addl    %esi, %edi
    leal    (%rdi,%rdx), %eax
    ret

No stack is used at all. The calling convention specifies registers for argument and result placing, and that’s how it is implemented. (Notice that the same adding is add and then lea – this is spanning instruction flow among different execution units.)

Add an external action for i:

void g(int*);
int f(int i, int j, int k) {
  g(&i);
  return i + j + k;
}

Assembly output:

    pushq   %rbp
    movl    %esi, %ebp
    pushq   %rbx
    movl    %edx, %ebx
    subq    $24, %rsp
    movl    %edi, 12(%rsp) ; <-- storing i onto stack
    leaq    12(%rsp), %rdi ; <-- getting &i
    call    g
    addl    12(%rsp), %ebp ; <-- using possibly modified i
    addq    $24, %rsp
    leal    0(%rbp,%rbx), %eax
    popq    %rbx
    popq    %rbp
    ret

Here, i comes in RDI, then is placed onto stack before g(&i), and then used as summand directly from stack, without returning to a register. (Also, frame pointer is adjusted because this isn’t a leaf function anymore.)

Consider again you are utilizing time machine and/or some very embedded environment. SSA is an expensive technique. The mapping of variable set to stack (or, earlier, memory) is how compilers had been worked from the very beginning, namely, early Fortran versions. For that case, you are right. And, the question in that case is how to keep track of possible address changes.

On S/360, a compiler would use a memory chunk after the function body for its variables, and utilize the following instruction sequence to get a base address for this chunk:

    BALR 15,0
    USING *,15

register 15 is a conventional one for this goal, and after this it gets the address just after the command. Then, offset-based memory access is used for variables, like:

    LR 3,196(15)

here 4-byte value from this base plus 196 is loaded into register 3. For the entire function run, i will be at 196(15).

Notice this approach is not stack-based; a function is not reentrant.

With PDP-11, but again not stack-based Fortran, this can be done with instruction pointer based access, like

    MOV 196(PC), R3

because there is no need to allocate a register for such memory access. But offsets will vary (an immediately next instruction will have to use 192 instead of 196).

With stack, this changes to SP-based access and no variable area near the function body:

    MOV 16(SP), R3

but see below for offset changing when SP changes.

With x86 before i386, BP-based stack access was mandatory; since i386, it’s still usable, but not unique. After BP setup at function prologue, offsets remain fixed, so you can use the same expression, like [EBP-20] (Intel), -20(%ebp) (AT&T), for a variable during function call lifetime. With ESP/RSP-based access (since i386), offsets are changed with any push/pop. If i was [ESP+20], a single 4-byte push “converts” it to [ESP+24]; “converts” mean the memory address is the same, but formula for it is changed among with ESP adjusting. For example, if g(i,&i,&i) is called, a compiler may produce the following sequence to form the argument list (x86_32 calling conventions are mainly stack-based):

## assume i is now at 24(%esp)
    leal     24(%esp), %eax ; &i to eax
    pushl    %eax
    leal     28(%esp), %eax ; &i to eax
    pushl    %eax
    movl     32(%esp), %eax ; i to eax
    pushl    %eax

(It is obviously suboptimal, but expressing.)

Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa Dịch vụ tổ chức sự kiện 5 sao Thông tin về chúng tôi Dịch vụ sinh nhật bé trai Dịch vụ sinh nhật bé gái Sự kiện trọn gói Các tiết mục giải trí Dịch vụ bổ trợ Tiệc cưới sang trọng Dịch vụ khai trương Tư vấn tổ chức sự kiện Hình ảnh sự kiện Cập nhật tin tức Liên hệ ngay Thuê chú hề chuyên nghiệp Tiệc tất niên cho công ty Trang trí tiệc cuối năm Tiệc tất niên độc đáo Sinh nhật bé Hải Đăng Sinh nhật đáng yêu bé Khánh Vân Sinh nhật sang trọng Bích Ngân Tiệc sinh nhật bé Thanh Trang Dịch vụ ông già Noel Xiếc thú vui nhộn Biểu diễn xiếc quay đĩa Dịch vụ tổ chức tiệc uy tín Khám phá dịch vụ của chúng tôi Tiệc sinh nhật cho bé trai Trang trí tiệc cho bé gái Gói sự kiện chuyên nghiệp Chương trình giải trí hấp dẫn Dịch vụ hỗ trợ sự kiện Trang trí tiệc cưới đẹp Khởi đầu thành công với khai trương Chuyên gia tư vấn sự kiện Xem ảnh các sự kiện đẹp Tin mới về sự kiện Kết nối với đội ngũ chuyên gia Chú hề vui nhộn cho tiệc sinh nhật Ý tưởng tiệc cuối năm Tất niên độc đáo Trang trí tiệc hiện đại Tổ chức sinh nhật cho Hải Đăng Sinh nhật độc quyền Khánh Vân Phong cách tiệc Bích Ngân Trang trí tiệc bé Thanh Trang Thuê dịch vụ ông già Noel chuyên nghiệp Xem xiếc khỉ đặc sắc Xiếc quay đĩa thú vị

Filed under: softwareengineering - @ 14:54

Thẻ: memory, register, stack

Thiết kế website giá rẻ

Danh mục

In x86, where are the memory addresses of the values on the stack located?