CS 270
|
In this recitation you will learn about the memory layout for programs, how the programs use the stack to support function calls, and how to use pointers to view the contents of memory.
The goal of the recitation is to implement a small routine that prints the contents of the program stack so you can analyze the information.
In C we can examine the contents of memory directly using pointers. Here is an example of an integer and a pointer to the integer. It shows how to:
int *pi;
,
pi = &i
,
*pi
.
pointer.c
.
gcc -m32 -o pointer pointer.c
.
./pointer
.
#include <stdio.h> int main() { int i; // an integer value int *pi; // a pointer to an integer value i = -1; pi = &i ; // set pi to the address of i printf("i: %d, pi: %p > *pi: %d\n", i, pi, *pi); // place additional code here return 0; }Pointers may also point to other pointers. This example shows how to:
int **ppi;
,
&pi
,
**ppi
.
int **ppi; ppi = &pi ; // set ppi to the address of pi printf("i: %d, pi: %p > *pi: %d, ppi: %p > **ppi: %d \n", i, pi, *pi, ppi, **ppi);We can use pointers to traverse consecutive memory locations to view their values rather than using array indices. After setting the pointer to the first element, all we need to do is increment the pointer to the next memory location as in the following example. Add this to the end of your program and try again.
int memory[] = {1,2,3,4,5,6,7,8,9}; int *pm; pm = memory; // &memory[0] is an alternative. for (int i=0; i<9; i++) { printf("pm: %p > *pm: %d\n", pm, *pm); pm++; }We can also assign a 32 bit integer from memory to a pointer with the appropriate type casting. Here we obtain a 32 bit integer using
*pm
, casting it with (int *)
to assign to the unsigned int pointer nm
. In general this is propably not a good idea as
an integer variable is unlikely to be a valid memory address, but there are times when this can be usefull.
Add this code to the end of your program and try again.
int *nm; pm = memory; nm = (int *) *pm; printf ("pm: %p, nm: %p\n", pm, nm);Pointers can be used with any type, including characters. When we increment the
int
pointer by 1 to the next int
,
the memory address is actually incremented by 4 since pointers refer to bytes, not words.
When we increment a char
pointer by 1 to the next char
,
the memory address is incremented by 1.
This code prints the same values as above, except as 36 bytes instead of 9 words.
char *cp; cp = (char *) pm; for (int i=0; i<36; i++) { printf ("cp: %p > *cp: 0x%02X\n", cp, *cp); cp++; }Try changing the first value in 'memory' to 0x12345678 by adding this line of code
memory[0] = 0x123454678;between the code
cp = (char *) pm; for (int i=0; i<36; i++);above, then run the code again. What do you notice about how the value is stored in memory? If you are curious about this google 'endianness'.
To summarize, given these declarations:
int word[50]; int *cptr; cptr = word;we can say the following:
cptr | word | &word[0] | all refer to the address of the first element of the word array. |
(cptr + n) | (word + n) | &word[n] | all refer to the address of the nth element of the word array. |
*cptr | *word | word[0] | all refer to the first value in the word array. |
*(cptr+n) | *(word+n) | word[n] | are refer to the nth value in the word array. |
C programs use the following memory layout where the high addresses are at the top and the low address are at the bottom of the diagram.
address ________ 0xffffffff | | | | stack (function local data) |_ _ _ _| | vv | (grows down) | | | | free space | | |_ _^^_ _| (grows up) | | | | heap (dynamic) |________| | | | | bss (uninitialized data) |________| | | | | data (initialized data) |________| | | | | text (code) 0x00000000 |________|
The program stack is a series of memory locations. The 64-bit and 32-bit Intel sytems use somewhat different stack architectures. For the purpose of this recitation we are studying the 32-bit Intel architecture since it is similar to the LC3 architecture we use later in the semester. Memory addresses and integers are 32 bits (4 bytes) so we will study the stack as an array of unsigned ints. The stack is often organized as depicted in this diagram.
top of stack fn address memory description f2 fffeffc0 |--------| ^ f2 fffeffc4 |--------| ^ f2 fffeffc8 |--------| local variables f2 fffeffcc |--------| exception f2 fffeffd0 |--------| handler f2 fffeffd4 |ffff0000| frame pointer f2 fffeffd8 |--------| return address f2 fffeffdc |--------| return value f2 fffeffe0 |--------| parameters f2 fffeffe4 |--------| v f2 fffeffe8 |--------| v f1 fffeffec |--------| ^ f1 fffefff0 |--------| ^ f1 fffefff4 |--------| local variables f1 fffefff8 |--------| exception f1 fffefffc |--------| handler f1 ffff0000 |ffff002c| frame pointer f1 ffff0004 |--------| return address f1 ffff0008 |--------| return value f1 ffff000c |--------| parameters f1 ffff0010 |--------| v f1 ffff0014 |--------| v ... ffff0018 |--------| ...
Note we have shown the top of the stack at the top of the diagram. The addresses in the diagram increase as you go down, the opposite of the memory layout diagram. Pushing a 32 bit value on the stack subtracts 4 from the stack pointer. Popping a 32 bit value from the stack adds 4 to the stack pointer.
Each called function has a similar layout in the stack. In this sample, the function f1 called the function f2. The addresses labelled f2 denote the values accessible by the function f2, and similarly for f1. The information added to the stack for each function includes:
There are many variations depending on the hardware architecture and compiler optimization level. Parameters and return values may be passed in registers. Local variables may stored in registers. Frame pointers may not be maintained in the stack. In our case the value returned by a function is placed in a register and does not appear on the stack. On other architectures, the return value is placed on the stack between the parameters and the return address.
You only need to complete the dumpStack function to print the stack
starting at the top for a fixed number of memory locations.
The function should print the header as given with a line from each
memory location starting at the top of the stack, tos
.
Each line should contain the memory address, it's contents in hexadecimal,
and it's contents in decimal.
Remember that the local variables of the current function are on the top of the stack.
My solution was several lines of code.
/* * dumpStack prints a memory dump starting at the top of the stack * for a fixed number of locations. * For each memory location, dumpStack prints a line containing * the memory address and its contents in both Hexadecimal and decimal. */ #include <stdio.h> int depth = 0; // number of memory locations to print in stack dump void dumpStack(char *label) { unsigned int *tos; // literally the memory location at the top of the stack // when this function is called. printf("Top of stack: %s\n Address Hexadecimal Decimal\n", label); // @todo complete the function. } // Do not change the code below this comment int fact(int n, int parm2) { // the extra parameter and local variables are // sentinel values to help analyze the stack. int local1 = -1; int local2 = 0x55555555; // base case if (n < 1) { dumpStack("at the base case for fact(n)"); return 1; } // capture the return value for the stack dump local1 = n * fact(n-1,1-n); return local1; } int main(int argc, char *argv[]){ int f; int n; // initialize locals and global f = 65535; n = 4; depth = 100; // obtain n,depth from command line? f = fact(n,-n); return f; } |
You will notice an extra parameter parm1
and
local variables local1
, local2
in this source.
These sentinel values were added to aid your analysis of the values on the stack
by giving you some known reference points.
Create a dump.c
file and paste the code below into the file.
To compile the C source file and create the program file:
gcc -Wall -m32 -g -o dump dump.cTo run the program:
./dumpRemember to use the -m32 option to get the correct object code. The default -m64 will produce results substantially different than this recitation. You may wish to try -m64 option to see the differences.
If your function is implemented correctly, the first line should have the same address and value
since it is tos
.
Things to identify in the stack dump include:
main
.
This is important for the programming assignment.
The hardware does not actually implement C. The compiler translates the C source code into assembly code (machine instructions) that can be executed. We can view the assembly code produced by the compiler.
fact.c
.
gcc -m32 -g -c fact.c
.
objdump -S --no-show-raw-insn fact.o >fact.s
.
int fact(int n, int o) { // extraneous local variable to help view the stack. int f = 0x12346789; int g = 0x55555555; // base case if (n < 1) { return 1; } // capture the return value for the stack dump f = n * fact(n-1,1-n); return f; } |
Now view the object code in the file fact.s
.
This cheat sheet might help you intrepret the assembly code listing.
The listing contains the source instructions interspersed with the corresponding object code.
Each line of object code contains the following information:
$0x18
, registers %ebp
,
and offsets from the registers -0x10(%ebp)
.
ebp
is the current frame pointer.
It points the the memory location containing the previous frame pointer.
esp
is the stack pointer.
It points to the memory location containing the value on the top of the stack.
eax
is the function return value.
It is set before the function returns and may be used for computation during the function.
Try this again with the -m64 compiler option to compile for a 64 bit machine and note the differences.
gcc -m64 -g -c fact.c
.
objdump -S --no-show-raw-insn fact.o >fact64.s
.
fact.c
.
gcc -m64 -O -g -c fact.c
.
objdump -S --no-show-raw-insn fact.o >fact64O.s
.
gcc -m64 -O2 -g -c fact.c
.
objdump -S --no-show-raw-insn fact.o >fact64O2.s
.