CS 270 PA10: Disassembler
due Sunday, May 10th at 10:00pm, no late submissions.
Goals
For this assignment, you will write a program, dis.c, that
implements a crude LC-3 disassembler.
Sample Run
% cat max.asm
;; File: max.asm
;; Description: Compute the maximum value of two numbers
;; Author: Jack Applin
;; Date: April 26, 2015
.orig x3000
max add r6,r6,#-3 ; Make space for return value, r7, and r5
str r7,r6,#1 ; Save return address
str r5,r6,#0 ; Save frame pointer address
add r5,r6,#0 ; Setup frame pointer
ldr r1,r5,#3 ; Get one argument
ldr r2,r5,#4 ; Get the other argument
;; Which is greater, R1 or R2?
not r0,r1 ; R0=~R1; first part of negation
add r0,r0,#1 ; R0=-R1
add r0,r2,r0 ; R0=R2-R1
brn over ; R0<0 => R2-R1<0 => R2<R1 => R1 is bigger
add r1,r2,#0 ; R1=R2
over str r1,r5,#2 ; Save our result
ldr r5,r6,#0 ; Restore frame pointer
ldr r7,r6,#1 ; Restore return address
add r6,r6,#2 ; Don’t need this stack space any more
ret
;; Some invalid instructions
.fill xdead
.fill x100f
.end
% lc3as max.asm
STARTING PASS 1
0 errors found in first pass.
STARTING PASS 2
0 errors found in second pass.
% c99 -Wall dis.c -o dis
% ./dis max.obj
.orig x3000
add r6,r6,#-3
str r7,r6,#1
str r5,r6,#0
add r5,r6,#0
ldr r1,r5,#3
ldr r2,r5,#4
not r0,r1
add r0,r0,#1
add r0,r2,r0
brn pc+1
add r1,r2,#0
str r1,r5,#2
ldr r5,r6,#0
ldr r7,r6,#1
add r6,r6,#2
ret
.fill xdead
.fill x100f
.end
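The last two words of the sample come back as .fill because they do not decode to valid instructions. The following is a minimal sketch of why, assuming the standard LC-3 encoding (opcode in bits 15..12, opcode 1101 reserved, and bits 4..3 required to be zero in the register form of ADD and AND). The helper names are hypothetical, and a complete disassembler would need similar checks for the other opcodes.

```c
#include <stdbool.h>

// Hypothetical helper: true if the word uses opcode 1101, which is
// reserved in the standard LC-3 ISA (assumption: no local extensions).
bool opcode_is_reserved(unsigned word) {
    return ((word >> 12) & 0xF) == 0xD;
}

// Hypothetical helper: true if the word is a well-formed ADD or AND.
// Assumes bit 5 selects immediate mode and that bits 4..3 must be
// zero in register mode.
bool add_and_is_valid(unsigned word) {
    unsigned opcode = (word >> 12) & 0xF;
    if (opcode != 0x1 && opcode != 0x5)
        return false;                // neither ADD (0001) nor AND (0101)
    if (word & 0x0020)
        return true;                 // immediate mode: all bits are used
    return (word & 0x0018) == 0;     // register mode: bits 4..3 must be 0
}
```

Under these assumptions, xdead is rejected because its opcode is 1101, and x100f is rejected because it has the ADD opcode in register mode with bit 3 set.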
Object file format
The *.obj file is a sequence of sixteen-bit words.
The first word is the location of the code, from .orig,
and all subsequent words are instructions or data.
Here’s how to read a sixteen-bit word:
#include <stdio.h>
#include <arpa/inet.h>
// Read a single 16-bit word from the given stream.
// Return the word (0..65535), or EOF if no more data is available.
int read_word(FILE *f) {
    unsigned short s;
    if (fread(&s, sizeof(s), 1, f) != 1)
        return EOF;
    return ntohs(s); // Convert network (big-endian) byte order to host byte order
}
// Read all the words from stdin, and display them in hexadecimal.
int main(void) {
    int w;
    while ((w = read_word(stdin)) != EOF)
        printf("%04x\n", w);
    return 0;
}
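To make the byte order explicit, here is a sketch of an equivalent reader that assembles each word from two bytes, high byte first. It assumes the *.obj file stores words in big-endian order, which is what the ntohs() call above implies. The name read_word_be is hypothetical.

```c
#include <stdio.h>

// Hypothetical alternative to read_word(): build the word by hand.
// Assumes the file is big-endian (high byte stored first).
// Returns 0..65535 on success, or EOF when no complete word remains.
int read_word_be(FILE *f) {
    int hi = fgetc(f);
    if (hi == EOF)
        return EOF;
    int lo = fgetc(f);
    if (lo == EOF)
        return EOF;          // odd trailing byte: treat as end of data
    return (hi << 8) | lo;
}
```

Because the result is an int in the range 0..65535, a data word of xffff is still distinguishable from EOF.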
Requirements
- Your output must match the sample output exactly for that input.
- All output is lower case.
- Invalid instructions are translated to .fill.
- .orig and .fill emit numbers in hexadecimal.
- All other numbers are decimal, prefixed with #.
- Label operands for br*, jsr, ld*, st*, etc., become pc+digits or pc-digits.
- Each output line begins with a tab, '\t'.
- If an opcode has operands, a tab separates the opcode from the operands.
- trap instructions are translated as trap #number, not as out, halt, etc.
- Treat our local instructions, push and pop, as invalid, and hence emit .fill for them.
- The n/p/z letters in the br* family of branch instructions
are emitted in alphabetical order:
Emit brnpz, not brpnz.
Emit brnp, not brpn.
- BR with nzp all 0's (i.e., 0000..01FF) is undefined behavior.
- BR with nzp all 1's (i.e., 0E00..0FFF) must emit brnpz, not br.
- Instruction C1C0 is ret.
- Emit .end at the end.
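Two of the rules above, PC-relative operands and the ordering of the branch condition letters, can be sketched as follows. This assumes the standard LC-3 layout for BR (n in bit 11, z in bit 10, p in bit 9, and a signed 9-bit offset in bits 8..0). The function names are hypothetical, and printing a zero offset as pc+0 is an assumption, since the requirements do not cover that case.

```c
#include <stdio.h>

// Sign-extend the low `bits` bits of `value` to a signed int.
int sign_extend(unsigned value, int bits) {
    unsigned mask = (1u << bits) - 1;
    unsigned sign = 1u << (bits - 1);
    value &= mask;
    return (int)(value ^ sign) - (int)sign;
}

// Write "pc+N" or "pc-N" for the `bits`-wide offset field of `word`.
// Assumption: a zero offset is written as pc+0.
void format_pc_offset(char *buf, size_t size, unsigned word, int bits) {
    int offset = sign_extend(word, bits);
    if (offset < 0)
        snprintf(buf, size, "pc-%d", -offset);
    else
        snprintf(buf, size, "pc+%d", offset);
}

// Format a BR instruction as "br<flags>\t<operand>", with the flags in
// alphabetical order (n, p, z) rather than bit order (n, z, p).
// The caller is expected to have checked that the opcode is 0000 and
// to print the leading tab itself.
void format_br(char *buf, size_t size, unsigned word) {
    char operand[16];
    format_pc_offset(operand, sizeof operand, word, 9);
    snprintf(buf, size, "br%s%s%s\t%s",
             (word & 0x0800) ? "n" : "",
             (word & 0x0200) ? "p" : "",
             (word & 0x0400) ? "z" : "",
             operand);
}
```

For the sample's brn over, which would assemble to x0801 under this encoding, format_br produces brn, a tab, and pc+1, matching the sample output. The same sign_extend helper would also serve the 5-bit immediates of add and and and the 6-bit offsets of ldr and str.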
How to submit your program
Submit dis.c to the Checkin tab.
How to receive negative points
Turn in someone else’s work.