Hashing Implementation

Objectives
  1. Helping you to understand hashing algorithms

  2. Implement hashing functions and an associative data structure

Getting Started

Create a project called P8 and import HashingImplementation-starter.jar

Your directory should now look like this:

P8/
├── resources
│   ├── UnitedStates.txt
│   └── Dictionary.txt
└── src
    ├── Debug.java
    ├── P8_Unit_Testing_HashTable.java
    ├── P8_Unit_Tests_FIRST_Specific.java
    ├── P8_Unit_Tests_JAVA_Specific.java
    ├── P8_Unit_Tests_PRIME_Specific.java
    ├── P8_Unit_Tests_SUM_Specific.java
    ├── Shell.java
    ├── IHash.java
    ├── TestCode.java
    ├── Hasher.java
    └── HashTable.java
  • UnitedStates.txt is a list of the 50 states

  • Dictionary.txt is a dictionary of more than 84 thousand words

  • You can make your own data file, one string per line

Debug and Shell are classes for printf debugging and incremental development and testing, provided by Fritz Sieker.

IHash provides the interface for HashTable, which is where you will implement the hash table data structure. Hasher is the implementation of the hashing functions, which is used by HashTable.

Directions

The hash table for this assignment has two distinct parts:

  1. A set of hashing functions that compute a hash code from a string.

  2. A data structure that uses the hashing functions to insert, remove, and search.

Hash Functions

Complete and test the Hasher class.

  • This is where the calculations for the hash codes are done.

  • There are three distinct ways you can implement this class. Read through the additional documentation and then choose which way you would like to try. You might want to try all three!

Hash Table

IHash has the interface to the hashing implementation, and HashTable is the file with your implementation. Implement all of the methods in the interface.

Conceptually,

  • Each hashTable is a table (implemented as an array list) indexed by hashCode % size.

  • Each entry in the table contains a hashBucket.

You will need to decide which data structure you wish to use for the hashBuckets. Good candidates include an ArrayList or LinkedList.

If you are ambitious, you may want to create a custom list class. Implement the methods in the interface to produce the data structure shown below, in which we used a LinkedList for each hashBucket, but you can use whatever you would like:

Warning
Create a working version of the assignment before trying to optimize
HashIterator

This is the last part of the HashTable class and will take time to conceptualize. Here is some additional information to help get you started.

Program Debugging

After completing the data structure, you will find it useful to have code that prints the entire data structure. This is the purpose of the print() method. This method is also used for most of the automated testing. Here is an example run, with the output truncated for brevity. Commands are shown in green, normal output (stdout) in black, and Debug.printf (stderr) in red:

TestCode
Program Testing

The file TestCode.java provides a test harness for your code. Instead of explicitly writing test code, you test your implementation by typing in commands, in a similar fashion to the Linux command line. Each command calls a method in your code, and results are printed. To get started, run java TestCode. When prompted enter help and you will see a list of the commands available to you. For this assignment, the first two commands you must execute are shown below. These are required to define the hashing algorithm for the entire execution of the program, and to initialize the HashTable data structure.

hasher <FIRST|SUM|PRIME|JAVA>
table <size>

After the HashTable is initialized, you can insert a value in the table by using the "insert" command. To insert many strings (keys) from a file, use the insertf command followed by a file name. This command will read all of the strings (keys) from the file, one per line, and insert them into the table. Other commands include "remove" and "search" for removing and searching for individual strings, and removef and searchf for removing and search for a set a strings from a file. To turn on debugging output use the debug command with a value of 1. Issuing the debug command again with a value of 0 turns off debugging. Additional commands are documented by the help command.

Unit Testing

The unit testing for this assignment is similar to the previous assignments with unit testing. For this assignment there are multiple testing files. This is due to the different hashing mehods the user can use in this program. The 'P8_Unit_Testing_HashTable' file is the main file with methods that are not HashFunction specific. The other files include two tests each that the outputs will vary for each HashFunction defined. Note that the tests given are rudementary and designed to keep you on track. If you want to test full code coverage you will have to write more tests on your own. If you are struggling to write your own tests, refer to the course content at the beggining of the semester. If you are still having trouble please visit help desk and the helpers will be able to guide you through what it means to unit test this program.

Specifications

Your program must meet the following specifications:

  • Work on your own, as always.

  • Package Hasher.java and HashTable.java into P8.jar for submission.

  • Assignments should be implemented using Java 1.8.

  • Make sure your code runs on machines in the COMCS 120 lab.

  • Submit your program to the Checkin tab.

  • Read the syllabus for the late policy.

  • We will be checking programs for plagiarism, so please don’t copy from anyone else.

Grading Criteria
  • 100 points for perfect submission.

  • 0 points for no submission, will not compile, submitted class file, etc.

  • Preliminary Tests: a compilation test. Test using the provided code and any additional testing you create.

  • Final Tests: final testing will use an arbitrary and very large file of strings.

    • testHashFirst: Tests the FIRST hashing function in Hasher.java. (10 points)

    • testHashSum: Tests the SUM hashing function in Hasher.java. (10 points)

    • testHashPrime: Tests the PRIME hashing function in Hasher.java. (10 points)

    • testHashJava: Tests the JAVA hashing function in Hasher.java. (10 points)

    • testInsert: Tests the insert method. (15 points)

    • testRemove: Tests the remove method. (15 points)

    • testSearch: Tests the search method. (10 points)

    • testSize: Tests the size methods. (10 points)

    • testIterator: Tests the iterator methods. (10 points)

Note
All testing except testIterator depends on the print method working correctly! You will have to test your HashIterator using your own main method.
Submission

Add a comment block with your name and email, and submit P8.jar to Checkin