CS253 HW1: Tokens!
Changes
Stress that input comes from cin.
                
Description
To compile any programming language (e.g., C++, Java, Python),
the first step is tokenizing, turning the “words” of the input
into tokens. For example, in C++, some tokens are:
while, int, ++, *=, 12, 34.567, "hello".
The process of breaking a source stream into tokens
is called Lexical Analysis.
                
For this assignment, you will write a C++ program called hw1
that reads from standard input and writes the corresponding
tokens to standard output, one per line.
                
Your program does not open a file. It reads from standard input
via cin. It does not use <fstream> or ifstream.
                
Input Format
The input will be any number of lines, containing tokens.
Here are the valid tokens:
                
+=
-=
*=
/=
=
!=
<
>
<=
>=
if
fi
print
return
- a variable: the 26 single letters a through z
- an integer: a non-empty series of digits (0 … 9)
- Use the Maximal munch method of lexical analysis:
Given the choice between two tokens (< and <=, or p and print),
always choose the longer token.
- Ignore whitespace (as determined by isspace()) before, after,
and between (but not inside) tokens. However, tokens are not always
separated by whitespace. a+=b is three separate tokens.
printa is two tokens, print and a.
- Ignore comments, which are defined as # followed by the rest of
the line. Comments do serve as breaks between tokens.
- Upper and lower case do not matter in the input. All output must be
lowercase. If the input is iFg>R, the output must be the four
tokens if, g, >, r.
- If you encounter invalid input, such as &&, produce an error
message containing the invalid input, and stop the program.
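The rules above can be sketched as a function that tokenizes a single line. This is one possible approach under stated assumptions, not the required design: the name tokenize_line is hypothetical, signaling invalid input with an exception is an illustrative choice, and treating "the invalid input" as the run of non-whitespace characters starting at the first bad character is an assumption the assignment does not spell out. Maximal munch is obtained by trying the fixed tokens longest first.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits one input line into lowercase tokens.
// Throws std::invalid_argument whose what() holds the invalid text.
std::vector<std::string> tokenize_line(const std::string& raw) {
    // Fixed tokens, longest first, so the first match is the longest one.
    static const std::vector<std::string> fixed = {
        "return", "print", "if", "fi",
        "+=", "-=", "*=", "/=", "!=", "<=", ">=",
        "=", "<", ">"};

    // Drop the comment (if any), then lowercase what remains.
    std::string line = raw.substr(0, raw.find('#'));
    for (char& c : line)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < line.size()) {
        const unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isspace(c)) { ++i; continue; }

        if (std::isdigit(c)) {  // integer: take every consecutive digit
            std::size_t j = i;
            while (j < line.size() &&
                   std::isdigit(static_cast<unsigned char>(line[j])))
                ++j;
            tokens.push_back(line.substr(i, j - i));
            i = j;
            continue;
        }

        bool matched = false;
        for (const std::string& f : fixed) {
            if (line.compare(i, f.size(), f) == 0) {
                tokens.push_back(f);
                i += f.size();
                matched = true;
                break;
            }
        }
        if (matched) continue;

        if (c >= 'a' && c <= 'z') {  // single-letter variable
            tokens.emplace_back(1, line[i]);
            ++i;
            continue;
        }

        // Assumption: report the non-whitespace run starting here.
        std::size_t j = i;
        while (j < line.size() &&
               !std::isspace(static_cast<unsigned char>(line[j])))
            ++j;
        throw std::invalid_argument(line.substr(i, j - i));
    }
    return tokens;
}
```

Keywords are tried before single-letter variables, which is why printa yields print and a rather than five variables. Integers are kept as strings, so their length is not limited by any numeric type.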
Sample Run
Here is a sample run, where % is my prompt.
                
% cat data
# Celsius to Fahrenheit converter
# °F = °C × 9/5 + 32
# 212°F = 100°C
# 32°F = 0°C
# Input: A contains Celsius temperature
A *= 9 # °C × 9
A /= 5 # °C × 9/5
A += 32 # °C × 9/5 + 32
f = a # Put it in F for Fahrenheit
Z = 0 # Constants have no sign, so compute a negative number.
Z-=459 # absolute zero (should really be −459.67°F)
iFF<=ZF=ZfI#If temperature is too low, make it absolute zero.
print F
% cmake .
… cmake output appears here …
% make
… make output appears here …
% ./hw1 <data
a
*=
9
a
/=
5
a
+=
32
f
=
a
z
=
0
z
-=
459
if
f
<=
z
f
=
z
fi
print
f
Hints
Debugging
If you encounter “STACK FRAME LINK OVERFLOW”, then try this:
export STACK_FRAME_LINK_OVERRIDE=ffff-ad921d60486366258809553a3db49a4a
Requirements
- Honest, == is not a valid token.
- Error messages:
- go to standard error
- include the program name as given by argv[0].
- Input format:
- The input may consist of any number of lines.
- Each input line may be arbitrarily long.
- Each input line may contain any number of tokens, including zero.
- Tokens may be arbitrarily long (e.g.,
01234567890123456789012345678901234567890123456789…).
- I didn’t really need to specify the previous requirements.
When an assignment doesn’t specify a limit, don’t create your own
limits.
- Creativity is a wonderful thing, but your output format is not
the place for it. Your output should look exactly like
the output shown above.
- UPPERCASE/lowercase matters.
- Spaces matter.
- Blank lines matter.
- Extra output matters.
- You may not use any external programs via system(),
fork(), popen(), execl(), execv(), etc.
- You may not use C-style I/O
such as printf(), scanf(), fopen(), and getchar().
- You may not use dynamic memory via new, delete,
malloc(), calloc(), realloc(), free(), strdup(), etc.
- It’s ok to implicitly use dynamic memory via containers
such as string or vector.
- You may not use the eof() method.
- No global variables.
- Except for an optional single global string containing argv[0].
- For readability, don’t use ASCII int constants (65) instead of
char constants ('A') for printable characters.
- We will compile your program like this:
cmake . && make
- If that generates warnings, you will lose a point.
- If that generates errors, you will lose all points.
- There is no automated testing/pre-grading/re-grading.
- Test your code yourself. It’s your job.
- Test with the CSU compilers, not just your laptop’s compiler.
- Even if you only change it a little bit.
- Even if all you do is add a comment.
If you have any questions about the requirements, ask.
In the real world, your programming tasks will almost always be
vague and incompletely specified. Same here.
                
Tar file
- For each assignment this semester, you will create a tar file,
and turn it in.
- The tar file for this assignment must be called:
hw1.tar
- It must contain:
- source files (*.cc)
- header files (*.h) (if any)
- CMakeLists.txt
- This command must produce the program hw1 (note the dot):
cmake . && make
- At least -Wall must be used every time g++ runs.
How to submit your work:
In Canvas, check in the file hw1.tar to the assignment “HW1”.
                
How to receive negative points:
Turn in someone else’s work.