Hashing Implementation
-
Helping you to understand hashing algorithms
-
Implement hashing functions and an associative data structure
Create a project called P8
and import HashingImplementation-starter.jar
Your directory should now look like this:
P8/ ├── resources │ ├── UnitedStates.txt │ └── Dictionary.txt └── src ├── Debug.java ├── P8_Unit_Testing_HashTable.java ├── P8_Unit_Tests_FIRST_Specific.java ├── P8_Unit_Tests_JAVA_Specific.java ├── P8_Unit_Tests_PRIME_Specific.java ├── P8_Unit_Tests_SUM_Specific.java ├── Shell.java ├── IHash.java ├── TestCode.java ├── Hasher.java └── HashTable.java
-
UnitedStates.txt
is a list of the 50 states -
Dictionary.txt
is a dictionary of more than 84 thousand words -
You can make your own data file, one string per line
Debug
and Shell
are classes for printf
debugging and incremental
development and testing, provided by Fritz Sieker.
IHash
provides the interface for HashTable
, which is where you will
implement the hash table data structure. Hasher
is the implementation of the
hashing functions, which is used by HashTable
.
The hash table for this assignment has two distinct parts:
-
A set of hashing functions that compute a hash code from a string.
-
A data structure that uses the hashing functions to insert, remove, and search.
Complete and test the Hasher class.
-
This is where the calculations for the hash codes are done.
-
There are three distinct ways you can implement this class. Read through the additional documentation and then choose which way you would like to try. You might want to try all three!
IHash has the interface to the hashing implementation, and HashTable is the file with your implementation. Implement all of the methods in the interface.
Conceptually,
-
Each
hashTable
is a table (implemented as an array list) indexed byhashCode % size
. -
Each entry in the table contains a
hashBucket
.
You will need to decide which data structure you wish to use for the hashBuckets
. Good
candidates include an ArrayList
or LinkedList
.
If you are ambitious, you may want to create a custom list class. Implement the methods in
the interface to produce the data structure shown below, in which we used a LinkedList
for
each hashBucket, but you can use whatever you would like:
Warning
|
Create a working version of the assignment before trying to optimize |
This is the last part of the HashTable
class and will take time to conceptualize. Here is some
additional information to help get you started.
After completing the data structure, you will find it useful to have code that prints the entire data
structure. This is the purpose of the print()
method. This method is also used for most of the automated
testing. Here is an example run, with the output truncated for brevity. Commands are shown in
green, normal output (stdout) in black, and Debug.printf (stderr) in red:
The file TestCode.java
provides a test harness for your code. Instead of explicitly writing test code, you
test your implementation by typing in commands, in a similar fashion to the Linux command line. Each command
calls a method in your code, and results are printed. To get started, run java TestCode. When prompted enter
help
and you will see a list of the commands available to you. For this assignment, the first two commands
you must execute are shown below. These are required to define the hashing algorithm for the entire execution
of the program, and to initialize the HashTable
data structure.
hasher <FIRST|SUM|PRIME|JAVA> table <size>
After the HashTable
is initialized, you can insert a value in the table by using the "insert" command. To insert
many strings (keys) from a file, use the insertf
command followed by a file name. This command will read all
of the strings (keys) from the file, one per line, and insert them into the table. Other commands include "remove"
and "search" for removing and searching for individual strings, and removef
and searchf
for removing and search
for a set a strings from a file. To turn on debugging output use the debug
command with a value of 1. Issuing
the debug
command again with a value of 0 turns off debugging. Additional commands are documented by the
help
command.
The unit testing for this assignment is similar to the previous assignments with unit testing. For this assignment there are multiple testing files. This is due to the different hashing mehods the user can use in this program. The 'P8_Unit_Testing_HashTable' file is the main file with methods that are not HashFunction specific. The other files include two tests each that the outputs will vary for each HashFunction defined. Note that the tests given are rudementary and designed to keep you on track. If you want to test full code coverage you will have to write more tests on your own. If you are struggling to write your own tests, refer to the course content at the beggining of the semester. If you are still having trouble please visit help desk and the helpers will be able to guide you through what it means to unit test this program.
Your program must meet the following specifications:
-
Work on your own, as always.
-
Package
Hasher.java
andHashTable.java
intoP8.jar
for submission. -
Assignments should be implemented using Java 1.8.
-
Make sure your code runs on machines in the COMCS 120 lab.
-
Submit your program to the Checkin tab.
-
Read the syllabus for the late policy.
-
We will be checking programs for plagiarism, so please don’t copy from anyone else.
-
100 points for perfect submission.
-
0 points for no submission, will not compile, submitted class file, etc.
-
Preliminary Tests: a compilation test. Test using the provided code and any additional testing you create.
-
Final Tests: final testing will use an arbitrary and very large file of strings.
-
testHashFirst: Tests the FIRST hashing function in
Hasher.java
. (10 points) -
testHashSum: Tests the SUM hashing function in
Hasher.java
. (10 points) -
testHashPrime: Tests the PRIME hashing function in
Hasher.java
. (10 points) -
testHashJava: Tests the JAVA hashing function in
Hasher.java
. (10 points) -
testInsert: Tests the insert method. (15 points)
-
testRemove: Tests the remove method. (15 points)
-
testSearch: Tests the search method. (10 points)
-
testSize: Tests the size methods. (10 points)
-
testIterator: Tests the iterator methods. (10 points)
-
Note
|
All testing except testIterator depends on the print method working correctly!
You will have to test your HashIterator using your own main method.
|
Add a comment block with your name and email, and submit P8.jar to Checkin