Hash Lab
In this lab, we will look at a simple hashing container.
The files are in ~cs253/Lab/Hash.
                
You will create a tar file called hash.tar
, and turn it in for
credit.
                
A Hash Template
In main.cc
, we have a program that uses a templated
container called hset
(for hash set), which is implemented
in the other files. An hset
is much like a set—it takes a single
(for now) template argument, that is, the type to be contained. An
hset
is not the standard C++ container unordered_set, though they
are both hash tables.
                
Compile main.cc
and run it. Observe that the order of the output is
not the same as the order of the input, because the hash table stores
the data in hashed order.
                
Understanding the code
Look at hset.h, hset_iter.h, and
hasher.h, and understand the following:
                
class hset
: a hash table implemented as a vector of lists.
class hset_iter
: an iterator class used by hset
,
containing a reference to the vector, and an integer.
class Hasher
: a class that only defines functors.
Exercises
- An
hset
is a vector of lists, and the vector is stuck at
size 5
. Change that size to be an optional second
template parameter, which defaults to 5
.
We don’t have dynamic rehashing, but at least the user can set a
reasonable size for the vector.
Label your change with // Exercise 1
.
- Add
.empty()
to hset
.
Make it more efficient than just .size()==0
, without adding a data
member to keep track of the size. Keep in mind that .empty()
doesn’t need to calculate the size—it just has to know if the
container is empty.
Label your changes with // Exercise 2
.
- Add an
hset<double>
to main(), and insert the values
(1.1, 1.2, 1.3, …, 1.8, 1.9). Observe that it works poorly, because
it calls Hasher::operator(int)
, and the doubles just get
truncated. Improve Hasher
to hash doubles better. It must not
put all of (1.1 … 1.9) in the same bucket.
Label your changes with // Exercise 3
.
- The design of
hset_iter
is horrible.
All that it has is an index into the entire hash table, so, whenever
hset_iter::operator*()
is called, it has to walk the vector,
looking for which list contains this item. The vector can be very
large for a big hash table.
Change hset_iter
so that, instead of having only a single index
into the entire hash table, it has two
indices: one that indexes into the vector,
and another that indexes into the list. We can index into a
vector quickly, using []
. Indexing into a list is slow, but
we can assume that each individual list is small. Both indices
start at 0. operator++
increments the list index, and if that
goes past the end of the list, change the vector index to next list
and reset the list index to 0. What should you do if the next
list is empty?
Make sure that your new iterator works for an hset
that doesn’t
have anything in the first bucket. Don’t just assume it—try it.
We will assume that this changes many places in hset_iter.h
, so
don’t bother labeling those. However, label your changes in
hset.h
with // Exercise 4
.
- Create the tar file
hash.tar
. Type carefully, or copy & paste!
tar -cvf hash.tar hasher.h hset.h hset_iter.h main.cc
How to submit your work:
In Canvas, check in the
file
hash.tar
to the assignment
“Lab14”.
                
How to receive negative points:
Turn in someone else’s work.