Hash Lab                
A video introduction is available.
                
In this lab, we will look at a simple hashing container.
The files are in ~cs253/Lab/Hash.
Copy them somewhere local. Compile like this:
g++ -std=c++17 -Wall main.cc
, which must not produce
any warnings or errors.
                
You will create a tar file called results.tar
, and turn it in for
credit.
                
A Hash Template                
In main.cc, we have a program that uses a templated
container called hset
(for hash set), which is implemented
in the other files. An hset
is much like a set—it takes a single
(for now) template argument, that is, the type to be contained. An
hset
is not the standard C++ container unordered_set, though they
are both hash tables.
                
Compile main.cc and run it. Observe that the order of the
output is not the same as the order of the input, because the hash table
stores the data in hashed order.
                
Understanding the code                
Look at hset.h, hset_iter.h, and
hasher.h, and understand the following:
                
class hset
: a hash table implemented as a vector of lists.
class hset_iter
: an iterator class used by hset
,
containing a reference to the vector, and an integer.
class Hasher
: a class that only defines functors.
Exercises                
- An
hset
is a vector of lists, and the vector is stuck at
size 5
. Change that size to be an optional second template
parameter (not a constructor argument), which defaults to 5
.
We don’t have dynamic rehashing, but at least the user can set a
reasonable size for the vector.
Label your change with // Exercise 1
.
- Add
.empty()
to hset
. Like vector::empty() or
string::empty(), it returns true iff the
.size()
of this container is 0. You may not add any data
members. Make it more efficient than just .size()==0
. Keep in
mind that .empty()
doesn’t need to calculate the size—it just has
to know if the container is empty.
Label your changes with // Exercise 2
.
- Add an
hset<double>
to main(), and insert the values
(1.1, 1.2, 1.3, …, 1.8, 1.9). Observe that it works poorly, because
it calls Hasher::operator(int)
, and the doubles just get
truncated. Improve Hasher
to hash doubles better.
It must not put all of (1.1 … 1.9) in the same bucket.
Label your changes with // Exercise 3
.
- The design of
hset_iter
is horrible.
All that it has is an index into the entire hash table, so, whenever
hset_iter::operator*()
is called, it has to walk the vector,
looking for which list contains this item. The vector can be very
large for a big hash table.
Change hset_iter
so that, instead of having only a single index
into the entire hash table, it has two
indices: one that indexes into the vector,
and another that indexes into the list. We can index into a
vector quickly, using []
. Indexing into a list is slow, but
we can assume that each individual list is small. Both
indices start at 0. operator++
increments
the list index, and if that goes past the end of the list, change the
vector index to next list and reset the list index to 0. What should
you do if the next list is empty?
I suggest that you name the two indices
vector_index
and list_index
. I can hear you thinking: “Those
names are much too long, and, besides, I already I already have
the variable num
. I’ll just re-use num
and add another index
called something short, like n
.” Really? Consider how much
time you’ll spend trying to remember the difference between num
and n
, which is which? However, you saved nearly ten seconds on
typing. Good work!
Make sure that your new iterator works for an hset
that doesn’t
have anything in the first bucket. Don’t just assume it—try it.
We will assume that this changes many places in hset_iter.h
, so
don’t bother labeling those. However, label your changes in
hset.h
with // Exercise 4
.
- Create the tar file
results.tar
.
Type carefully, or copy & paste!
tar -cvf results.tar hasher.h hset.h hset_iter.h main.cc
How to submit your work:                
In Canvas, check in the
file
results.tar
to the assignment “Lab14”.
It’s due 11:59ᴘᴍ MT Saturday, with a 24-hour late period for a 25% penalty.
                
How to receive negative points:                
Turn in someone else’s work.