Stack Frames
A stack frame
(aka activation record
) contains all
of the data required by a function (aka method, subroutine
) during
its execution. At a minimum, it includes:
-
space for the parameters
-
space to hold the return address/value and other book keeping value(s)
-
space for the local variables
The stack frame
is created before function execution begins, and
deleted after the function execution is complete. As such, it satisfies the
requirement of only allocating space during the execution of the routine. It
also handles recursion nicely, as a new stack frame
is created for
every call. All "data" is physically grouped together which aids caching.
Conceptually, the stack frame
is a C struct
.
A struct
is a complex data type declaration the defines a
physically grouped list of variables placed together in a contiguous block of
memory. (See
Wikipedia
for additional information.) Given a pointer
to a struct
, the individual
elements are referenced with the notation ptr→field
.
Conceptual Runtime Stack
The stack frames
are manged as a stack of stack frames
. The
stack frame
of the currently executing
function is on the top of the stack. Its caller is in the next stack frame. Its
caller is in the following stack frame and so on. A new stack frame
is
pushed on the stack when the function begins, and popped off when the
function returns. This corresponds to what happens when a function is called.
The currently executing function is suspended and the called function begins
execution. When it terminates, execution resumes in the calling function.
One way of implementing a stack is via a singly linked list.
The head
points to the top of the stack and each element of the
stack contains a next
pointer to successive elements.
So the entry to a function looks like:
StackFrame* sf = malloc(sizeof(StackFrame)); sf->next = head; // link it into the list head = sf; // reset the head (push) // initialize parameters
And exit from the function looks like:
// grab return value (if present) StackFrame* sf = head; head = sf->next; // reset the head (pop) free(sf); // clean up memory
In diagram form, the runtime stack looks like the following:
This represents three nested function calls, with the currently executing
function (function3
) at the top of the stack (and pointed to by head
).
Notice that you can access all of the fields of the stack frame (e.g.
head→param1
or head→local1
. And by following the next
field you can
access the fields of the function2
stack frame (e.g. head→next→local1
),
and so on.
While this diagram and the sample code is a good conceptual view of what is happening, there are several problems:
-
Stack frames differ in size, so how can the space be allocated?
-
The size of the stack frame for a function may differ from call to call. Think about
printf()
, which has a variable number of parameters. -
Only the caller of a function knows the values passed as parameters. Hence the idea of call-by-value.
-
Only the function knows how many locals it requires.
So building the activation record requires cooperation between the
caller/callee. And dynamically allocating space (i.e. malloc
)
is too much overhead for function calls.
The Hardware Stack
Some computers have special hardware devoted to the stack. However, many simply choose to use a portion of memory and use it as a stack. In this case, one need only maintain a pointer to the top of the stack. Allocating/deallocating space on the stack only requires moving the stack pointer (i.e. changing its value). Thus, it is fast and allows one to allocate/deallocate variable amounts of space easily and quickly. It also facilitates dividing the effort to create the activation record between the caller and callee.
In most cases, the stack is placed in high memory and grows downward
towards lower memory addresses. Placing the stack in high memory and dynamic
memory
(i.e. malloc()
) in lower memory, the two grow/shrink towards/away from
each other and only if they meet are we "out of memory". Pushing values onto the
stack involves decrementing the stack pointer, while popping values involves
incrementing the stack pointer. A simple add/subtract is all that is required.
Thus the allocation/deallocation of space is fast. The stack is referenced
so often that a register of the CPU is dedicated to holding the current
value. In the case of the LC3 computer, that register is R6 (SP)
.
Building the LC3 stack runtime stack
The stack frame
is built as a cooperative effort by
the calling function and the called function. Two registers are dedicated to
managing the runtime stack. Register R6 (SP)
is used as the stack pointer.
Its value contains the address of the word at the top of the stack. Register
R5
contains the "head" pointer of the linked list of stack frames
and is
referred to as the frame pointer (FP)
.
Since the caller know the values of the parameters, it passes them to
the called function by pushing them onto the stack, one at a time. The values
could be pushed either left to right, or right to left. The convention is to
push the values right to left. As a result, the first parameter ends
up at the top of the stack. Thus, the first parameter is at a known
location regardless of how many parameters were actually passed. Think of
the printf()
function. The format
parameter (the first)
tells how many other parameters there are (count of the %
specifiers).
By putting it at the top of the stack, the printf()
function knows
where to find the format
parameter and can thus find the others.
A function call begins by having the caller push
its parameters right
to left onto the stack. The result is that just before the JSR
to the
function, the stack looks like:
Then the code uses JSR
to being execution of the function. Now it is
time for the called function to take over. If the function has a return value,
it must allocate some space to store it. By placing it next on the stack, upon
return, the caller can get the value by simply popping it. It should also store
the return address so that this function may call other functions. To
reserve space for the return value, the function need only adjust the stack
pointer (ADD R6,R6,#-1
). The return address may be saved by
PUSH R7
. After these two steps, the stack looks like:
To complete the creation of the stack frame
, two things are
required. One is to make the new stack frame
the new
head of the list. The other is to allocate space for local variables. The first
step requires saving head as the "next" pointer of the new stack frame
This is accomplished
by PUSH R5
. Then, the head is reset. The head could point anywhere
in the stack frame
. Elements of the stack frame
are accessed
at fixed offsets from the frame pointer. Since LC3 instructions can have both
positive and negative offsets, it make sense that the frame pointer points
somewhere in the middle of the stack frame to handle the largest possible
stack frame. The convention used is to have the head (the frame pointer) point
to the first local variable.
Finally, space of the locals is allocated by decrementing the stack pointer. At this point, the stack looks like:
The stack with three nested function calls
Note how the three functions are contiguous on the stack. That is, the last parameter of the called function is adjacent to the last local of the calling function.
Unwinding the runtime stack
On completion of the function, the work of building the runtime stack must be
undone. Again, it is a cooperative effort split between the called and calling
functions. The work is done in the opposite order to the building of the
stack. The called function cleans up its part, and after RET
, the
calling function cleans up its part. Note that after this work is complete, the
memory used for the stack is not cleared. It still contains whatever was
stored there. Logically, however, it is no longer part of the stack and will be
overwritten on the next function call(s).
A summary of function call/return
There are a lot of steps involved in using the runtime stack. However, the steps are very mechanical and straightforward if you follow the steps precisely. Normally the code is created by a compiler.
-
caller pushes arguments right to left
-
caller executes
JSR
-
callee makes space for return value (
ADD R6,R6,#-1
) -
callee pushes return address (
PUSH R7
) -
callee pushes old frame pointer (
PUSH R5
) -
callee resets frame pointer (
ADD R5, R6 #-1
) This makes the frame pointer point to the first local. When the locals are removed in step 10 the old frame pointer is at the top of the stack and can be popped. -
callee allocates locals by decrementing stack pointer(
ADD R6, R6, #-numLocals
) -
callee does its work (includes saving return value
STR Rx,R5,#3
) -
callee removes locals by incrementing stack pointer (
ADD R6, R6, #numLocals
) -
callee restores old frame pointer (
POP R5
) -
callee pops return address (
POP R7
) -
callee executes
RET
-
caller pops return value
-
caller removes parameters by incrementing stack pointer (
ADD R6, R6, #numParams
) -
caller continues its execution
Notice the symmetry of the steps. The actions taken in the beginning to build
the stack frame are done in the reverse order to remove the stack frame.
See steps 7/9, 5&6/10, 4/11. Note that steps 2,3 and 13,14 are not symmetric.
If they were, one would have the caller allocate space for the return value
before the JSR
. But, it is easier to put the code in the callee once, rather
that before each call to the function each place it is called.
Takeaways
-
The runtime stack is a linked list of stack frames.
-
The elements of the stack frame are located at fixed offsets from the frame pointer for that frame, regardless of the actual address of the stack frame.
-
Allocating and deallocating the stack frame is a cooperative effort with some of the code in the caller and some of the code in the callee.
-
The stack frames occupy contiguous addresses in memory.
(c) Fritz Sieker - 2016-2021