My Project
|
assembler.c
. You will reuse code
that you wrote in previous assignments, add
new code and integrate with code provided to you. Some of that code is C
source code. Other parts are just a library. You are given header files
so that you know what functionality is provided and can call functions in
the library even though you do not have the source code. This assignment
serves several purposes:
ADD R1,R2,R3
to the hex code 1283
.
It is like a compiler except that the language is deals with is much simpler
than a high level language like Java or C. Several things make assembly language
easier to deal with:
[label] opcode operand(s) [; optional end of line comment]
The assembler reads the source code a line at a time, analyzes each line and
produces the output file(s) required to run the program.
.obj, .hex
).sym
)The first pass of the assembler must, at a minimum, do two things:
LD/ST/LDI/STI/BR/JSR/LEA
opcodes can be computed in the
second pass, based on information from the symbol table.
The second pass of the assembler is responsible for generating the object code
from the .asm
file. This is done entirely by traversing the
linked list created in the first pass. The file does not need to be read again.
The second pass generates the LC3 hex value(s) that are required for each
instruction. This involves creating the correct 16-bit bit pattern(s) that
defines the instruction. When an LD/ST/LDI/STI/BR/JSR/LEA
instruction is encountered, the code needs to compute the PCoffset, determine
if it is in range and insert it into the bit pattern. Offsets out of range
are reported and are the only errors generated by during the second pass.
symbol.c
you wrote in previous
assignments. You may need to fix any remaining bugs in those files to complete
this assignment.
cd
therelc3asm.XXX.tar
file. WARNING: this tar ball
will spew files into the current directory. It does not unpack
into a subdirectory.tar -xvpf lc3asm.XXX.tar
symbol.c
with your version.Makefile
and make sure the variable GCC
is appropriate for your C compiler.make
. There should not be any
errors or warnings.cs314
.
asm_init(), asm_term()
asm_term()
.
asm_pass_two()
asm_pass_one()
The second pass of the assembler will loop over the data structure created in
the first pass and generate code. So, write a loop that traverses the elements
of the linked list passed as a parameter and calls asm_print_line_info()
on each element.
asm_pass_one()
, phase 1reference
using the function strdup()
.
To test your code, make
the assembler and run it with a small
assembly file(s). The name of your assembler is mylc3as
. What you
should get is two things:
.sym
) with a header, and symbols. The
addresses will all be 0.make testTokens
,
and run this program. Study the source of testTokens.c
to see
how to do it. Understand what happens with blank lines and lines that contain
only comments.
You can see from step 5.3 that this assembler is building a data structure that will be re-used in the second pass. This will let you practice your C dynamic memory management skills.
asm_pass_one()
, phase 2seeLC3
and run it. Try
different LC3 opcodes and see what the program prints. Then study the code in
seeLC3.c
to understand how it works.
You now have a model of how to determine the type(s) of operands expected by any LC3 instruction. Note that many LC3 instructions have common operands. For example, many instructions require one of more registers. Therefore, it may be appropriate to write a helper method that converts a token to a register, and reports an error if the token is not a register. Similarly, if may be useful to have a helper method that gets immediate values. Immediates are used in a variety of instructions. Those instructions differ in the number of bits used to store the value, and some values are signed, while others are unsigned. A helper method can take care of all these cases.
Add code to collect and store each operand into the fields of
line_info_t
. Create very short assembly language files to test
your code. These files will often have .ORIG, .END
and a single
additional LC3 instruction. Although you may be tempted to start with
ADD, AND
instructions, they are actually a little tricky. This
is because you do not know from the name whether or not the third operand will
be an immediate or a register. Only when you encounter the third operand will
you know which form the writer used. Compare this to JMP, RET
which use two different "names" for the two forms.
asm_pass_one()
, phase 3.BLKW, .STRINGZ
are the exceptions. And, .ORIG
is
handled a little differently. Ay any rate, your code should now put the
address of each label in the symbol table. The symbol table file that you
create can be compared to that produced by the regular LC3 assembler.
asm_pass_one()
, phase 4assembler.h
file has all of the possible errors that you
need to handle, and assembler.c
has a method that prints errors.
asm_pass_two()
asm_pass_one()
. Instead of printing the source line,
your code will now need to generate the machine code for each instruction. If
you have completed the first pass correctly, then each line_info_t
contains the information you need to generate the machine code.
The first step is to copy the prototype
from the
format
field of the information on this line into a variable that
will contain the final machine code. The next step is to insert the operand(s)
into the correct locations of the machine code. For example, if the line
was an ADD
with an immediate value, then the fields
DR, SR1, immediate
are inserted into the appropriate bit locations
of the macine code. Different instructions will have different number of
operands. When the machine code is complete, write it to the object file using
lc3_write_LC3_word()
. When you encounter an instruction that uses
a PCoffset
, you will need to look up the address of the reference
in the symbol table and compute the offset.
Note that the .BLKW
and .STRINGZ
opcodes may generate
multiple words in the object file. If you are ever unsure what should be put in the
output file, run ~cs270/lc3tools/lc3as -hex file.asm
and look at
the hex code that it generates.
ADD R1,R2,R3
instead of 1283
.
As you know, LC-3 assembly language is very limited. In order to make
LC-3 programming slightly simpler, we will introduce several pseudo instructions.
Pseudo instructions are instructions that are recognized by the assembler but
don't actually exist in the machine. For example, LC-3 does not actually support
a HALT
instruction, but we've been putting them in our assembly
code. The assembler recognizes that HALT
is not a real instruction
and generates a TRAP x25
. Note that pseudo instructions do not
give programmers any additional power, because anything that can be done with a
pseudo instruction can be done with 1 or more regular instructions. They just
make programming easier.
Two existing psuedo instructions that generate multiple LC3 instructions are
.BLKW
and .STRINGZ. You will add a couple of others
that will make programming a little simpler.
There are several things that you must do to extend the assembler.
opcode_t
defined in
lc3.h
. This adds an "opcode" for the pseudo instructrion.lc3_instructions[]
defined in
lc3.c
. This defines the number and types of operands for
the pseudo instruction.lc3_instruction_map[]
defined in
util.c
. This associates the "name" of the pseudo instruction
with its "opcode".asm_pass_one()
analyze the operands and determine
how many LC3 instructions this pseudo instruction will produce. This
may not even require any additional code.asm_pass_two()
to generate the LC3 instructions
correponding to the pseudo instruction and add them to the object file..SETR
pseudo instruction.SETR
pseudo instruction allows the programmer to initialize a
register with an immediate value. Thus, one can write
.SETR R1,#100
. The end result is that register R1
contains the value 100
and the condition code is set according to
the value. The following table shows the two cases.
small value (.SETR DR,#5 ) |
large value (.SETR DR,#1000 ) |
---|---|
|
|
When the value is "small", you can take advantage of the immediate field in
a ADD
instruction and use only two LC3 instructions. If
the immediate exceeds to size allowed in ADD
, three instructions
will be required.
.SUB
pseudo instruction.SUB
is just like ADD
except that it performs
subtraction. Like ADD
, the third operand can be a register or an
immediate. For reasons that will become clear soon, if the third operand is an
immediate, it can only have a value from decimal -15 to 16 (versus -16 to 15
for ADD). If the third operand is an immediate, just use an
ADD
with the value negated.
When the third operand is a register, one can generate multiple instructions. Consider this example.
using .SUB |
without using .SUB |
---|---|
|
|
Pretty neat how we performed the subtraction without an additional register,
eh? But... There's a problem: if the first and third registers are the
same register (.SUB R1,R2,R1
), because the second NOT
instruction will corrupt the result of the subtraction. In this case,
don't emit the second NOT
.
And there's another problem: if the second and third registers are the same
(.SUB R1,R2,R2
); when you negate the third register you will
corrupt the value of the second register (which is the same as the third). In
this case, rather than the 4-instruction sequence above, we'll simply emit an
instruction that puts 0 in the DR register (AND DR,DR,#0
).
P10.tar
using the checkin
program and the
P10> key, or the Checkin tab on the course website. The provided
Makefile has a target to create the archive file containing both source
files, as follows:
$ make package