Grading Scripts
Purpose
To grade a homework assignment, you write a grading script, a file
with the suffix .gs
. It unpacks, builds, executes the student code,
and evaluates its output.
Example
To test this simple C++ program, hello.cc
:
#include <iostream>
int main() {
int useless; // will cause a problem
std::cout << "Hello, world!\n";
}
This script, hello.gs
, might be used:
#! /s/parsons/d/fac/applin/bin/grader
# Rubric (5-point assignment):
#
# • 1.0 compiles
# • 1.0 no warnings
# • 2.0 produces correct output
# • 0.5 produces no standard error output
# • 0.5 no global variables
src=${1:?first argument must be source file}
setting MaxScore 5 # Value of this assignment
rm -f a.out # In case left over from previous run
export LANG=C # Vanilla compiler output, please.
run g++ -Wall $src
test 1.0 "compiles" [[ -x a.out ]]
if ((score<=0)) return # Give up if we couldn’t even build.
test 1.0 "no warnings" ! grep -qi warning stdout stderr
run ./a.out
test 2.0 "correct output" exact 'Hello, world!\n' stdout
test 0.5 "stderr is empty" empty stderr
globals 0.5 a.out
When executed like this:
./hello.gs hello.cc
It would produce this output, which contains three sections:
- The overall score and who did the grading
(so that students know who to complain to).
- The high-level summary, one line per test.
- The details of individual tests.
Most students will read only the first line.
Some students will read the summary (one line/test) section.
A few students will actually look up the failing tests
from the summary section in the details section.
Score: 4.00/5.00 points
Graded by Jack Applin <applin@CS.ColoState.Edu>
Summary of all tests:
Value Result Test Description
1.00 pass 1 compiles
1.00 FAIL 2 no warnings
2.00 pass 3 correct output
0.50 pass 4 stderr is empty
0.50 pass 5 globals
4.00 Total
Passed 4 tests, failed 1 test.
Details of individual tests:
Executing: g++ -Wall hello.cc
Exit code: 0
Standard output is empty
Standard error (4 lines):
hello.cc: In function 'int main()':
hello.cc:4:9: warning: unused variable 'useless' [-Wunused-variable]
4 | int useless;
| ^~~~~~~
Test 1: compiles
Status: pass
Condition: '[[' -x a.out ']]'
Value: 1.00
Test 2: no warnings
Status: FAIL
Condition: ! grep -qi warning stdout stderr
Value: 1.00
Executing: ./a.out
Exit code: 0
Standard output (1 line):
Hello, world!
Standard error is empty
Test 3: correct output
Status: pass
Condition: exact 'Hello, world!\n' stdout
Value: 2.00
Test 4: stderr is empty
Status: pass
Condition: empty stderr
Value: 0.50
Test 5: globals
Status: pass
Condition: No globals used
Value: 0.50
Overview
The most important functions of the grading script
are run
and test
.
-
run
-
This doesn’t do any testing; it just runs a command,
perhaps with options or redirected input, and saves the output.
-
test
-
Take the results of the previous
run
and assert some condition
about them. If the condition succeeds, great! Otherwise, the test
has failed.
Up/Down
A grading script can either count up or down.
- Counting up
-
The score is initially zero. A passing
test
adds its value to
the score. A failing test
does not change the current score.
Start a script that counts up like this:
setting MaxScore 10 # ten-point assignment, count up
- Counting down
-
The score is initially a non-zero value, the number of points for the
assignment. A failing
test
subtracts its value from the
score. A passing test
does not change the current score.
let score=5.0 # five-point assignment
The Grading Script
The grading script is a Bash script, with additional functions defined.
(Really, it’s a zsh
script, but treat it like a Bash script.)
Here are the functions:
-
setting
name value -
Set a parameter. These are the default settings:
setting MinScore 0.0 | minimum possible score |
setting MaxScore 0.0 | maximum possible score |
setting Visible true | make control chars visible in report |
setting ExpandTabs true | expand tabs in stdout & stderr |
setting TrimCR true | remove DOS carriage returns |
setting TrimWhitespace true | remove trailing horizontal whitespace |
setting TrimTrailingBlankLines true | remove trailing blank lines |
setting Merge false | merge stderr into stdout on output |
setting Flatten true | flatten unpacked archives |
setting StdinTermNull true | route terminal stdin from /dev/null |
setting ShowLines 10 | show this many lines in report |
setting MaxFileSize 1000000 | maximum file size in bytes (1MB) |
setting TimeLimit 30 | runtime CPU limit (30 seconds) |
setting Header '' | show this at the start |
setting Footer '' | show this at the end |
-
unpack
[ -C
directory ] archive -
Unpack the
tar
or zip
archive into the optional directory
(or the current directory if -C
not specified).
If the setting Flatten true
is in effect, which it is by default,
then any prefix directories are removed when unpacking.
Examples:
unpack -C Build hw4.tar
unpack foo.zip
-
test
value title condition -
Take the results of the previous
run
and assert some condition
about them. If the test succeeds (and we are counting up), add the
value to the current score. If the test fails (and we are
counting down), subtract the value from the current score.
The condition can be any of:
[[
condition ]]
((
arithmetic-condition ))
- any command, e.g.,
grep
.
A zero exit code indicates success.
The command should produce no output—just an exit code.
You can use >&/dev/null
to suppress output.
In bash
& zsh
, a leading !
negates the exit code of a
command. It reverses the sense of the test. grep -q "foo" stdout
succeeds if foo
is in the output. ! grep -q "foo" stdout
succeeds if foo
is not in the output.
The title names the test, and is shown in the summary & details
sections of the report.
Provide a positive title—describe the good condition
(e.g., "prompt found"
) as opposed to the bad condition
(e.g., "no prompt found"
).
Examples:
test 1.0 "foo is executable" [[ -x foo ]]
test 1.0 "output contains foo" grep -q foo stdout
test 0.5 "no warnings" ! grep -qi warning stdout stderr
-
run
[ -C
directory ] command arg arg … -
In the optional directory (or the current directory if
-C
not specified), execute command arg arg …, sending standard output
to the file stdout
, and standard error to the file stderr
.
If setting Merge true
is in effect (false
by default),
then send both streadms to stdout
.
The exit code, and contents of stdout
and stderr
, will be
displayed in the detailed results section of the report.
-
empty
file … -
Assert that all the given files are empty.
Note that
stdout
and stderr
may have been filtered
by the settings TrimWhitespace
and TrimTrailingBlankLines
.
Since these are enabled by default, spaces at the end of lines
and empty lines at the end of the output have already been removed.
Since such trailing whitespace is largely invisible, these settings
save a lot of arguments.
Examples:
test 0.5 "stderr is empty" empty stderr
-
exact
string file -
Assert that a file contains exactly these contents. Backslash escapes
(
\n
, \t
, \0
octal, \x
hex, \U
unicode, …)
are recognized, but no pattern matching. This is not a regular
expression.
Examples:
test 1.0 exact "Hello, world!\n" stdout
test 1.2 exact "2+2\n4\n" outfile
- If you want pattern matching, use
grep
with -q
to suppress the output:
test 1.0 grep -q "foo.*bar" stderr
-
badsyms
value executable title sym1 sym2 … -
Forbid certain symbols (functions). If the given symbols are
used in the executable, then report the offending symbols.
Example:
badsyms 1.0 a.out "No C I/O" printf fprintf scanf fscanf
-
globals
value executable exceptions … -
Forbid global variables, with exceptions. Examples:
globals 0.10 Build/a.out # no global variables permitted
globals 0.25 hello program_name # allow program_name, but no others
-
pity
value title -
If the current score is less than value, set the current score to
value. Do this after all the tests, at the end of the grading
script.
pity 0.2 "Minimum score for compilation"
-
score
-
It’s a variable, not a function. It contains the current score,
as a floating-point number. Most commonly used like this:
Unpack the archive
Compile the code
if ((score<=0)) return # give up if unpack/compile failed