assembler.c. You will reuse code that you wrote in previous assignments, add
new code, and integrate it with code provided to you. Some of that code is C
source code; other parts are provided only as a library. You are given header
files so that you know what functionality is provided and can call functions in
the library even though you do not have the source code. This assignment
serves several purposes:
An assembler translates an instruction such as ADD R1,R2,R3 to the hex code
0x1283.
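To make the translation of ADD R1,R2,R3 into 0x1283 concrete, here is a minimal
sketch of how the register form of ADD packs into a 16-bit LC-3 word. The
function name encode_add_reg is hypothetical and is not part of the provided
headers; only the bit layout comes from the LC-3 instruction format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from assembler.h): encode the register form of
 * the LC-3 ADD instruction.
 * Layout: opcode 0001 | DR (3 bits) | SR1 (3 bits) | 000 | SR2 (3 bits) */
uint16_t encode_add_reg(unsigned dr, unsigned sr1, unsigned sr2)
{
    assert(dr < 8 && sr1 < 8 && sr2 < 8);   /* registers are R0..R7 */
    return (uint16_t)((0x1u << 12) | (dr << 9) | (sr1 << 6) | sr2);
}
```

Calling encode_add_reg(1, 2, 3) gives 0x1000 | 0x0200 | 0x0080 | 0x0003, which
is 0x1283.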
It is like a compiler except that the language it deals with is much simpler
than a high level language like Java or C. Several things make assembly language
easier to deal with:
[label] opcode operands [; optional end of line comment]
The assembler reads the source code a line at a time, analyzes each line and
produces the output file(s) required to run the program. The assembler produces
two output files: 1) an object file containing the code; 2) a symbol table file.
Because the assembly code may contain references to things that have not yet
been encountered (e.g. a branch to a location later in the code), the assembler
normally makes two passes over the "code".
The first pass of the assembler must, at a minimum, do two things:
LD/ST/LDI/STI/BR/JSR/LEA operators can be computed in the second pass.
Alternatively, you can skip storing this information and only create the symbol
table. Then, in the second pass, the source file is re-read and the syntactic
analysis is done again. At this point there are no syntactic errors, because
they would have been found in the first pass. This approach requires reading
the source file twice.
The second pass of the assembler is responsible for generating the object code
from the .asm file. This pass should write a .obj or .hex file. The
actual work depends on how the second pass was structured. It may:
LD/ST/LDI/STI/BR/JSR/LEA instruction is encountered, the code needs to compute
the PCoffset, determine whether it is in range, and insert it into the bit
pattern. Offsets out of range are reported and are the only errors generated
during the second pass.
cd to it
PA10.tar file and unpack it, or copy the following files to the directory you
created. It is easiest to right click on the link, and do a Save Target As..
for each of the files.
assembler.c (complete this file)
assembler.h (do not modify)
field.h (do not modify)
make.
There should be two warnings about unused variables.
cs314.
open_read_or_error()
open_write_or_error()
get_reg_or_error()
get_comma_or_error()
get_immediate_or_error()
get_PCoffset_or_error()
Although there is no way to test them directly at this point, they are simple enough that you should be able to "test" them by inspection. You might want to implement them only if/when your other code requires them.
asm_pass_two()
asm_pass_one()
.
The second pass of the assembler will loop over the data structure created in
the first pass and generate code. So, write a loop that traverses the elements
of the linked list defined by infoHead and infoTail and calls
asm_print_line_info() on each element.
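The traversal described above has the usual singly linked list shape. The
sketch below uses a hypothetical node_t type and hypothetical function names as
stand-ins, since the real line_info_t fields and the signature of
asm_print_line_info() are defined in the provided headers and are not shown
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for line_info_t; only the link field matters here. */
typedef struct node {
    int address;
    struct node *next;
} node_t;

/* Hypothetical visitor, playing the role of asm_print_line_info(). */
void print_node(const node_t *n)
{
    printf("x%04X\n", (unsigned)n->address);
}

/* Walk the list from head to its end, calling visit() on each element.
 * The tail pointer is only used to check that the list is well formed.
 * Returns the number of elements visited. */
int for_each_node(const node_t *head, const node_t *tail,
                  void (*visit)(const node_t *))
{
    assert(tail == NULL || tail->next == NULL);   /* tail must be last */

    int count = 0;
    for (const node_t *cur = head; cur != NULL; cur = cur->next) {
        if (visit != NULL)
            visit(cur);
        count++;
    }
    return count;
}
```

With a three-element list, for_each_node(head, tail, print_node) prints one
line per element; an empty list (head and tail both NULL) visits nothing.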
asm_pass_one(), phase 1
print_tokens()
reference using the function strdup().
make the assembler and run it with one or more small assembly files. The name
of your assembler is mylc3as. What you should get is several things:
.sym) with a header, but no symbols
You can see from step 4.4 that this assembler is building a data structure that
will be re-used in the second pass. This will let you practice your C dynamic
memory management skills.
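As a small illustration of the dynamic memory management involved, the sketch
below keeps a private copy of a source line with strdup() and releases it with
free(). The type line_record_t and both function names are hypothetical; the
real structure is line_info_t from the provided headers. A copy is typically
needed because the buffer a line is read into gets overwritten by the next
line.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes strdup() under strict -std modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record holding one heap-allocated copy of a source line. */
typedef struct {
    char *reference;
} line_record_t;

/* Store a private copy of `line`; returns 0 on success, -1 if out of memory.
 * Any previously stored copy is freed first. */
int record_set_reference(line_record_t *rec, const char *line)
{
    assert(rec != NULL && line != NULL);

    char *copy = strdup(line);    /* malloc + copy, caller owns the result */
    if (copy == NULL)
        return -1;
    free(rec->reference);         /* free(NULL) is a no-op */
    rec->reference = copy;
    return 0;
}

/* Release the stored copy so the record can be discarded or reused. */
void record_clear(line_record_t *rec)
{
    assert(rec != NULL);
    free(rec->reference);
    rec->reference = NULL;
}
```

Every successful strdup() needs a matching free(); a record initialized with
reference set to NULL can be passed to either function safely.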
To help you understand what your output should look like, you may execute
the program ~cs270/PA10/phase1. This program is a reference
implementation of the code described above.
asm_pass_one(), phase 2
check_for_label()
check_line_syntax()
check_for_label() from its description.
Now start to code update_address(). Just handle the "default" case.
How should the "output" of your program change? Build it, run it, and
verify that your "output" has changed in the way you expected.
Now begin to code check_line_syntax(). Specifically, write
code for the first four steps. Step 5 is an error check and will be
deferred until later. In your asm_pass_one(), add the
call to check_line_syntax(). Once you have completed this, you can
test by running your assembler with sample file(s). If you have labels in your
assembly code, those labels should appear in the symbol table file. And the
opcode field printed out should correspond to the opcode on each line. Study the
sample code in file seeLC3.c for ideas.
To help you understand what your output should look like, you may execute
the program ~cs270/PA10/phase2.
Implementing asm_pass_one(), phase 3
This phase is where you must finish translating the assembly code
for each line into values in the line_info_t structure. First remove the
code that stores the source line in the reference field.
This phase will be much easier if you do incremental development. To do this,
you want a series of "small" assembly language programs that will be used to
test code in your assembler. The programs need not even be legal LC3 programs.
For example, you can test your handling of .ORIG with a file
that contains only a single statement. Develop the code necessary to handle
that single instruction and make sure it works. Create additional examples and
add the code necessary to support them. Instructions that take NO
operands are particularly easy.
Code the function scan_operands(). Study the documentation to
determine what it does. The basic structure is a loop. However, the loop
is a little different than what you are used to writing. How do the values
of the operand_t vary from one to the next? Once you understand
this, the loop is easy to write. For each operand,
call get_operand(). Once you have written this function, build
your assembler and test it with "small" assembly files. Do you understand
the output and does it match what you think should be output?
Once you have completed scan_operands(), code the function
get_operand(). Fill in the code for one of the cases
and build your assembler. Then test it with a simple assembly file that
will exercise that case. Code a little, build, test to perfection, and repeat
until you are done!
Write code incrementally to handle
a single LC3 instruction. You might do the ones with no operands first.
Test to make sure the simple instructions work correctly. Add more cases
and continue testing.
At some point you will want to turn off your debugging output, if you have
any. Do NOT remove this code. You may need it again. You may comment
it out or, if you are adventurous, learn to use conditional compilation to
turn your debugging output off and on without changing your code.
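For the conditional compilation option mentioned for debugging output, one
common pattern is a macro that expands to a print statement only when a symbol
is defined at build time. The macro name DEBUG_PRINT, the symbol DEBUG, and the
function count_tokens are illustrative choices, not names from the provided
code.

```c
#include <assert.h>
#include <stdio.h>

/* Build with -DDEBUG (for example: gcc -DDEBUG ...) to enable the output.
 * Without it, DEBUG_PRINT(...) expands to an expression that does nothing. */
#ifdef DEBUG
#define DEBUG_PRINT(...) fprintf(stderr, __VA_ARGS__)
#else
#define DEBUG_PRINT(...) ((void)0)
#endif

/* Illustrative function: count tokens separated by blanks, tabs, commas,
 * or newlines, reporting the result through the debug macro. */
int count_tokens(const char *line)
{
    assert(line != NULL);

    int count = 0;
    int in_token = 0;
    for (const char *p = line; *p != '\0'; p++) {
        int is_sep = (*p == ' ' || *p == '\t' || *p == ',' || *p == '\n');
        if (!is_sep && !in_token)
            count++;               /* first character of a new token */
        in_token = !is_sep;
    }
    DEBUG_PRINT("count_tokens(\"%s\") = %d\n", line, count);
    return count;
}
```

The debugging statements stay in the source file either way; whether they
produce output is decided by a compiler flag rather than by editing the code.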
Implementing asm_pass_two()
This phase is where you must use the values in the line_info_t
structure to generate machine code. This involves taking the various operands
discovered in the first pass and encoding them in the 16-bit LC3 word.
The only error checking performed is to make sure that the PCoffset
value is in range.
Testing your code
You should be able to test your assembler with any valid LC-3 program,
but we have provided several files, including some with one instruction
each to allow you to do incremental development.
The test files that are currently available are
here, and more may be added.
Disassembler Option
For the honors or extra credit option you can use all of the same utilities and
files to write a disassembler. A disassembler reads a .hex file (no .obj files!)
and produces LC-3 source code. Here is an overview of how this can be accomplished:
Label ADD R5,R6,#8
1234567890123456789012345678901234567890
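To illustrate the reverse direction a disassembler works in, here is a minimal
sketch that decodes a single ADD word back into text. The function name
disassemble_add is hypothetical, the output does not attempt the column layout
shown above, and a full disassembler would dispatch on all opcodes rather than
only ADD.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>   /* strcmp, handy when checking the produced text */

/* Hypothetical helper: decode an LC-3 ADD word into `out`.
 * Returns 0 on success, or -1 if the word is not an ADD instruction. */
int disassemble_add(uint16_t word, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);

    if ((word >> 12) != 0x1)
        return -1;                          /* opcode 0001 is ADD */

    unsigned dr  = (word >> 9) & 0x7u;
    unsigned sr1 = (word >> 6) & 0x7u;

    if (word & 0x20) {                      /* bit 5 set: immediate form */
        int imm = word & 0x1F;
        if (imm & 0x10)
            imm -= 32;                      /* sign-extend the 5-bit field */
        snprintf(out, out_size, "ADD R%u,R%u,#%d", dr, sr1, imm);
    } else {                                /* register form */
        snprintf(out, out_size, "ADD R%u,R%u,R%u", dr, sr1, word & 0x7u);
    }
    return 0;
}
```

Decoding 0x1BA8 this way yields the text ADD R5,R6,#8, and decoding 0x1283
yields ADD R1,R2,R3.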
Grading Criteria
Here is a list of final tests that you will be graded on, along with
the number of points for each. In each case you must exactly match the .hex
file produced by lc3as to get credit.
NOTE: The AND,ADD,NOT opcodes require only register operands.
NOTE: The AND,ADD,TRAP opcodes require an immediate value:
NOTE: The JSR,LD,LDI,LEA,ST,STI opcodes require a register and PC offset.
NOTE: The JMP,JSRR,RET opcodes require a single register:
NOTE: The LDR,STR opcodes require a register and offset.
NOTE: The BR opcode requires special handling of condition codes.
NOTE: The .FILL and .BLKW directives must generate the correct data:
NOTE: The following are full LC-3 programs:
Checking in Your Code
You will submit the single file assembler.c using the Checkin
tab under the PA10 assignment. Submit
only the assembler.c file; do not modify any other source files!