
Functions and Parameters

This entry is a day late because I started writing about pointers, but I realized I needed to discuss function calls first.

Now that we’ve got the foundation of stack frames and the debugger for exploring them, we can use the stack for more than just local variables. The stack is also used for passing and receiving parameters to functions.

Let’s say I’ve got this very-nearly-useless function:

int sum3(int a, int b, int c)
{
    return a + b + c;
}

int main()
{
    sum3(1, 2, 3);
    return 0;
}

Here’s what the assembly looks like (I swapped main() and sum3() for clarity):

  .text
.globl _main
_main:
  pushl %ebp            # save base pointer
  movl  %esp, %ebp      # set base for this function
  subl  $24, %esp       # 24 bytes of stack frame
  movl  $3, 8(%esp)     # store 3 at ESP plus 8 (third parameter)
  movl  $2, 4(%esp)     # store 2 at ESP plus 4 (second parameter)
  movl  $1, (%esp)      # store 1 at ESP (first parameter)
  call  _sum3           # call function
  movl  $0, %eax        # main() returns 0 in EAX
  leave
  ret
.globl _sum3
_sum3:
  pushl %ebp            # save base pointer
  movl  %esp, %ebp      # set base for this function
  subl  $8, %esp        # 8 bytes of stack frame
  movl  12(%ebp), %eax  # load b (at EBP plus 12) into EAX
  addl  8(%ebp), %eax   # add a (at EBP plus 8) to EAX
  addl  16(%ebp), %eax  # add c (at EBP plus 16) to EAX
  leave                 # Note: EAX already contains return val
  ret

We’ve seen that local variables are stored at the base pointer minus some offset, but now we’re storing stuff at the stack pointer plus some offset. Say what? It’s much clearer if you visualize the stack frames together. (Note: the memory addresses in this figure go 4 bytes at a time, which on i386 happens to be the size of both a pointer and an int.)

The bottom of main()’s stack frame is used for the function call parameters. ESP points to the bottom of main()’s stack frame, so main() can refer to the parameters by taking ESP plus an offset. In sum3(), EBP points to the top of its own stack frame, where the saved EBP lives, with the return address 4 bytes above it. The parameters come right after that, so sum3() can reach the same memory locations with EBP plus an offset: 8, 12, and 16 for a, b, and c.

Scoping

You might wonder, “does this mean sum3() can just poke at variables in main()’s stack frame? I thought C wouldn’t allow that kind of thing.” At least, that’s what I wondered. But main() doesn’t have any stack variables—it’s just passing constants to the function. Let’s change main() to use variables and see the effect:

int main()
{
    int a = 1;
    int b = 2;
    int c = 3;

    sum3(a, b, c);

    return 0;
}

Here’s the new assembly:

_main:
  pushl %ebp
  movl  %esp, %ebp
  subl  $40, %esp
  movl  $1, -12(%ebp)     # a is at EBP-12
  movl  $2, -16(%ebp)     # b is at EBP-16
  movl  $3, -20(%ebp)     # c is at EBP-20
  movl  -20(%ebp), %eax   # move c into EAX
  movl  %eax, 8(%esp)     # put EAX at ESP+8
  movl  -16(%ebp), %eax   # move b into EAX
  movl  %eax, 4(%esp)     # put EAX at ESP+4
  movl  -12(%ebp), %eax   # move a into EAX
  movl  %eax, (%esp)      # put EAX at ESP
  call  _sum3
  movl  $0, %eax
  leave
  ret

By making variables for the function parameters, main() must now store the variables at the top of its stack frame (EBP minus some offset) and then copy those values to the bottom of its stack frame (ESP plus some offset) to pass them to the function. Note that the i386 movl instruction can’t copy directly from one memory location to another, so each value goes into EAX first and then from EAX to its destination.

This demonstrates the scope of variables: main() and sum3() each get the values they need to run, but they’re separate. sum3() can modify a, b, or c to its heart’s content, and main()’s copies of those variables are unaffected.

Calling Convention

There’s an implicit contract between a function and its caller that specifies where the parameters start and what order they’re in. Just before the call instruction executes, the stack pointer (ESP) points to the first parameter. The call pushes the return address, and the function then saves EBP and establishes its own stack frame, so there’s a known offset between its EBP and the calling function’s ESP: 8 bytes, which is why sum3() finds its first parameter at EBP plus 8.

Furthermore, the function knows that the parameters, viewed left to right, are on the stack from bottom to top, that is, from lower addresses to higher ones. The exact spacing between parameters depends on their size. I’ve used ints here for simplicity, so they are 4 bytes apart.

This contract is called the function calling convention. There’s more than one way to do it, but in general compilers on the same operating system will use the same calling convention, so that you can link together code created by GCC with code created by other compilers, even compilers for other languages.

Act On It!

  • Write a simple recursive function to get the nth number in the Fibonacci Sequence. Compile this to assembly (gcc -m32 -S), then assemble into machine code (gcc -m32 -gstabs). For a small input (say 5), walk through this in GDB and observe how stack frames are created for each recursion.

  • As you do the above walkthrough, try the GDB commands up and down to go up to higher stack frames and back down.