Call Stack in C++ (ENG)

Call Stack

The call stack, also known as the execution stack or just the stack if you will, is a crucial component for running your program. Before we dive deep, it's important to understand the process memory layout. If you're not familiar with it, check the linked resources above.

When you call a function, the system sets aside some space in the stack memory for the function to do its necessary work. These chunks of space or memory are often called "stack frames" or "function frames." The purpose of the call stack is to manage the way procedures or functions call each other and pass parameters.

Each thread has its own call stack, also known as stack memory.

An Easy Example

Let's learn how a function stack frame is formed and what's inside a frame using the code below:

int add(int x, int y){
	return x + y;
}
int main(){
	int a = 32;
	int b = 64;
	int sum = add(a, b);
}

Here's what happens:

  1. The main() function is called first, and its function frame is created.
  2. The add() function is called by main(), and its function frame is added to the stack.
  3. After the add() function finishes executing, its function frame is popped off the stack, returning control to main().
  4. The main() finishes executing, the program finishes.

call_stack.png

Under the Hood

We now understand how stack frames are pushed and popped. Next, let's delve into what is inside the stack frame. We'll compile the code above into assembly language to examine the details. For this purpose, I'll use GCC with the -m32 option here, because on a 64-bit machine, parameters are typically passed using registers rather than the stack.

Here's the assembly:

_Z3addii:
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %edx
    movl    12(%ebp), %eax
    addl    %edx, %eax          ; <<<<< Calling add()
    popl    %ebp
    ret ; Return address -> EIP
main:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    $32, -12(%ebp)
    movl    $64, -8(%ebp)
    pushl   -8(%ebp)
    pushl   -12(%ebp)
    call    _Z3addii            ; <<<<< Before calling add()
    addl    $8, %esp
    movl    %eax, -4(%ebp) ; <- Next instruction address
    movl    $0, %eax            ; <<<<< After calling add()
    leave
    ret

Okay, in assembly, you would always see the annoying base pointer register (ebp in this case) and stack pointer register (esp in this case). But don't worry, you'll get there.
Stack_under_the_hood.png
Before calling add(), main() forms its stack frame using ebp and esp. It first saves the old base pointer using a pushl instruction, and then sets the new ebp to point to the old ebp, creating the base of the stack frame. After that, space is allocated on the stack for local variables. Then, the passing parameters are pushed, and add() is called.

When the call instruction is executed, a return address is automatically pushed into the stack frame. In the add() function, a new stack frame is formed using pushl %ebp and movl %esp, %ebp. After the calculation is done, the return value is passed back via eax. The frame is then cleaned up with popl %ebp.

After the frame is cleaned up, the return address is instantly passed to eip, which points to the next instruction address of call _Z3addii. The stack space allocated for passing parameters is then cleaned up. Finally, the return value 0 is set, and the stack is further cleaned up.

Copy and Reference

From the above, we have seen that passing parameters involves copying them to the stack (32-bit machine). If the object is large, this copying can be quite costly in terms of CPU time. To mitigate this, we can pass a pointer or a reference to the object instead, making the copy much smaller. You can consider references in C++ as syntactic sugar for pointers, they act the same way.

For large objects, it's always recommended to pass it by a reference. You might be concerned that directly operating on the original object could lead to unintended modifications. To avoid this, you can use the const keyword to ensure the object is not altered. This approach is exactly what we use in a copy constructor and copy assignment operator.

Pass by Value or Pass by Const Reference?

It is generally recommended to pass by const reference. On modern 64-bit machines, a pointer requires 8 bytes to store. Whether you use pass-by-pointer or pass-by-reference, you will always copy an 8-byte pointer to somewhere, typically a register (or the stack when there are many parameters).

When you pass a easy type no bigger than 8 bytes (e.g., int, which typically stores 4 bytes on most machines) by value, you only copy 4 bytes. However, if you use a pointer or reference, it would take 8 bytes. In this case, will pass this type no bigger than 8 bytes by value be better?

It looks so, because you copy 4 bytes must be two times faster than copy a 8 bytes pointer. But in most cases, these parameters will be passed to a register (the rule is in the next section). Copying a type no larger than 8 bytes to a register doesn't make much of a difference in terms of performance. But it's a good practice to do so.

It seems so, because copying 4 bytes must be two-times faster than copying an 8-byte pointer, right? However, in most cases, these parameters will be passed to a register. Copying a type no larger than 8 bytes to a register doesn't make much of a difference in terms of performance. But it's a good practice to do so.

Parameter Passing Rule in 64-Bit Machine (AI Generated)

On 64-bit machines, function parameters are usually passed through specific registers, and only when the number of parameters exceeds the available registers do parameters get passed on the stack. Here are the typical rules for x86-64 (System V ABI):

  1. Integer and Pointer Parameters:
    Passed through registers: RDI, RSI, RDX, RCX, R8, R9 (for the first six parameters).
    Any additional parameters are passed on the stack.
  2. Floating-point Parameters:
    Passed through registers: XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7 (for the first eight parameters).
    Any additional parameters are passed on the stack.

For custom complex types:

  • If passed by value and they do not fit into available registers, they are stored on the stack.
  • If passed by reference (pointer), the reference itself is passed through a register if there are enough available.

Following these rules ensures efficient and consistent parameter passing, helping to optimize function calls.