Recitation - Stack Dump


No submission.


The recitation has these objectives:

  1. to learn about the memory layout and stack for C programs,
  2. to learn how to use C pointers to view memory,
  3. to write a C program prints a dump of the top of the stack,
  4. to analyze the stack layout in the stack dump,
  5. to learn about the assembly code created by C compilers,
  6. to see if you can follow directions!

About The Recitation

In this recitation you will learn about the memory layout for programs, how the programs use the stack to support function calls, and how to use pointers to view the contents of memory.

The goal of the recitation is to implement a small routine that prints the contents of the program stack so you can analyze the information. The knowledge you gain from the Stack Dump recitation will help you complete the Stack Trace programming assignment.


Memory Layout for C programs

C programs use the following memory layout where the high addresses are at the top and the low address are at the bottom of the diagram.

address    ________
ffffffff  |        | 
          |        | stack (function local data)
          |_ _  _ _|   
          |   vv   | (grows down)
          |        |  
          |        | free space
          |        |
          |_ _^^_ _| (grows up)
          |        |  
          |        | heap (dynamic)  
          |________| 
          |        |    
          |        | bss (uninitialized data)   
          |________| 
          |        |
          |        | data (initialized data)
          |________| 
          |        |
          |        | text (code)
00000000  |________|


The Stack

The program stack is a series of memory locations. The 64-bit and 32-bit Intel sytems use somewhat different stack architectures. For the purpose of this recitation we are studying the 32-bit Intel architecture since it is similar to the LC3 architecture we use later in the semester. Memory addresses and integers are 32 bits (4 bytes) so we will study the stack as an array of unsigned ints. The stack is often organized as depicted in this diagram.

top of stack
fn   address     memory  description
f2  fffeffc0  |--------| ^
f2  fffeffc4  |--------| ^
f2  fffeffc8  |--------| local variables
f2  fffeffcc  |--------| exception
f2  fffeffd0  |--------|  handler
f2  fffeffd4  |ffff0000| frame pointer
f2  fffeffd8  |--------| return address
f2  fffeffdc  |--------| return value
f2  fffeffe0  |--------| parameters
f2  fffeffe4  |--------| v
f2  fffeffe8  |--------| v
f1  fffeffec  |--------| ^
f1  fffefff0  |--------| ^
f1  fffefff4  |--------| local variables
f1  fffefff8  |--------| exception
f1  fffefffc  |--------|  handler
f1  ffff0000  |ffff002c| frame pointer
f1  ffff0004  |--------| return address
f1  ffff0008  |--------| return value
f1  ffff000c  |--------| parameters
f1  ffff0010  |--------| v
f1  ffff0014  |--------| v
... ffff0018  |--------| ...

Note we have shown the top of the stack at the top of the diagram. The addresses in the diagram increase as you go down, the opposite of the memory layout diagram. Pushing a 32 bit value on the stack subtracts 4 from the stack pointer. Popping a 32 bit value from the stack adds 4 to the stack pointer.

Each called function has a similar layout in the stack. In this sample, the function f1 called the function f2. The addresses labelled f2 denote the values accessible by the function f2, and similarly for f1. This information added to the stack for each function includes:

Compilers are responsible for the code generation that maintains the stack during execution. Viewing the assembly code for a particular architecture can help you understand how the stack works.

There are many variations depending on the hardware architecture and compiler optimization level. Parameters and return values may be passed in registers. Local variables may stored in registers. Frame pointers may not be maintained in the stack. In our case the value returned by a function is placed in a register and does not appear on the stack. On other architectures, the return value is placed on the stack between the parameters and the return address.


C Pointers

In C we can examine the contents of memory directly using pointers. Here is an example of an integer and a pointer to the integer. It shows how to:

We will use a simple program to learn about pointers. As we progress you can add each new code example after the previous code and before the return statement so it is one large program when you are finished. This is required since each example builds on the previous examples.
#include <stdio.h>

int main() {
    int i;  	// an integer value
    int *pi;	// a pointer to an integer value
    
    i = -1;
    pi = &i ;	// set pi to the address of i
    
    printf("i: %d, pi: %p > *pi: %d\n", i, pi, *pi);
    
    // place additional code here
    
    return 0;    
} 
We can use pointers to traverse consecutive memory locations to view their values rather than using array indices. After setting the pointer to the first element, all we need to do is increment the pointer to the next memory location as in the following example. Add this to the end of your program and try again.
    unsigned int memory[] = {1,2,3,4,5,6,7,8,9};
    unsigned int *pm;

    pm = memory;  // &memory[0] is an alternative.
    for (int i=0; i<9; i++) {
        printf("pm: %p > *pm: %d\n", pm, *pm);
        pm++;
    }
We can also assign a 32 bit integer from memory to a pointer with the appropriate type casting. Here we obtain a 32 bit integer using *pm, casting it with (unsigned int *) to assign to the unsigned int pointer nm. Add this to the end of your program and try again.
    unsigned int *nm;
    pm = memory;
    nm = (unsigned int *) *pm;
    printf ("pm: %p, nm: %p\n", pm, nm);
Pointers can be used with any type, including characters. When we increment the unsigned int pointer by 1 to the next unsigned int, the memory address is actually incremented by 4 since pointers refer to bytes, not words. When we increment a char pointer by 1 to the next char, the memory address is incremented by 1. This code prints the same values as above, except as 36 bytes instead of 9 words.
    char *cp;
    cp = (char *) pm;
    
    for (int i=0; i<36; i++) {
      printf ("cp: %p > *cp: 0x%02hhX\n", cp, *cp);
      cp++;
      }
}
To summarize, given these declarations:
  int word[50];
  int *cptr;
  cptr = word;
we can say the following:

cptrword&word[0] all refer to the address of the first element of the word array.
(cptr + n)(word + n)&word[n] all refer to the address of the nth element of the word array.
*cptr*wordword[0] all refer to the first value in the word array.
*(cptr+n)*(word+n)word[n] are refer to the nth value in the word array.

Dumping the Stack

You only need to complete the dumpStack function to print the stack starting at the top for a fixed number of memory locations. The function should print the header as given with a line from each memory location starting at the top of the stack, tos. Each line should contain the memory address, it's contents in hexadecimal, and it's contents in decimal. Remember that the local variables of the current function are on the top of the stack. My solution was several lines of code.

/* Recitation 3
 * dumpStack prints a memory dump starting at the top of the stack 
 * for a fixed number of locations.
 * For each memory location, dumpStack prints a line containing 
 * the memory address and its contents in both Hexadecimal and decimal.
 */
 
#include <stdio.h>

int depth = 0; // number of memory locations to print in stack dump

void dumpStack(char *label) {

    unsigned int *tos;  // literally the memory location at the top of the stack
                        // when this function is called.
    
    printf("Top of stack: %s\n       Address  Hexadecimal      Decimal\n", label);

    // @todo complete the function.
}

// Do not change the code below this comment 

int fact(int n, int parm2) {
    // the extra parameter and local variables are 
    // sentinel values to help analyze the stack.
    int local1 = -1;
    int local2 = 0x55555555;

    // base case
    if (n < 1) {
        dumpStack("at the base case for fact(n)");
        return 1;
    }

    // capture the return value for the stack dump
    local1 = n * fact(n-1,1-n); 

    return local1; 
}

int main(int argc, char *argv[]){

    int f;
    int n; 

    // initialize locals and global
    f = 65535;
    n = 4;
    depth = 100;

    // obtain n,depth from command line?

    f = fact(n,-n);  

    return f;
}

You will notice an extra parameter parm1 and local variables local1, local2 in this source. These sentinel values were added to aid your analysis of the values on the stack by giving you some know reference points.

Create a dump.c file and paste the code below into the file. To compile the C source file and create the program file:

    gcc -Wall -m32 -g -o dump dump.c  		
		
To run the program:
    ./dump		
		
Remember to use the -m32 option to get the correct object code. The default -m64 will produce results substantially different than this recitation. You may wish to try -m64 option to see the differences.


Analyzing the stack dump

If your function is implemented correctly, the first line should have the same address and value since it is tos. Things to identify in the stack dump include:

You will notice some significant differences if you repeat this with the default -m64 option.


Analyzing assembly code

The hardware does not actually implement C. The compiler translates the C source code into assembly code (machine instructions) that can be executed. We can view the assembly code produced by the compiler.

int fact(int n, int o) {
    // extraneous local variable to help view the stack.
    int f = 0x12346789; 
    int g = 0x55555555;

    // base case
    if (n < 1) {
        return 1;
    }

    // capture the return value for the stack dump
    f = n * fact(n-1,1-n); 

    return f; 
}

Now view the object code in the file fact.s. This cheat sheet might help you intrepret the assembly code listing. The listing contains the source instructions interspersed with the corresponding object code. Each line of object code contains the following information:

Some registers are reserved for special purposes while others can be used for computation. Special purpose registers include Draw a stack and execute the code by hand to get a better understanding how this function uses the stack. You can find more information on the internet about the registers and instruction set. Highlight the code and note what happens to the registers and stack: Which instructions change the stack? Push and pop are the obvious ones. What other instructions change the stack or stack pointer?

Try this again with the -m64 compiler option to compile for a 64 bit machine and note the differences.

Then try again when the -O or -O2 (letter O, not the number 0) compiler options to optimize the code at different levels and note the differences. You might find the second one quite interesting.


© 2018 CS270 Colorado State University. All Rights Reserved.