Run-length Encoding (RLE) has been around since the 1967, and while it isn’t as used today in place of better encoders, it is still used as part of those encoders. In this assignment, you will develop a run-length encoder and decoder for String. You will also get to explore encoding very simple sections of DNA strands which is related to the field of Bioinformatics!
Step 0 - Provided Code
You will want to look through the provided code before starting, as for this assignment, you will be editing two different files. One file, RunLengthEncoder.java is where you will be doing most of your work. It is the RLE “library” that you are building, along with a few warm up methods. The second file, TestRunLength.java contains both your main method, and a method you will work on to help run some tests.
Why two files?
Often in programming, you will have both a test file and a main file you are working on. This becomes even more important as you have larger applications, as you will have multiple files (objects) interacting with each other. It is important to have tests for every method and every object, in addition to your full application. From this point on, you will be working with multiple files, and we will contain your test code in separate files.
As a reminder, when you call a static method from a different file, you need the name of the class in addition to the method call.
For example, if your method, myMethod(), that returns a String exists in File1.java, you would call myMethod by using the following code
String myVariableName = File1.myMethod();
This is specific for static methods as they exist in shared space.
Step 1 - hiddenNumber(String message)
hiddenNumber(String message) looks for all numbers in a String. It then returns those numbers as a single number. So for example:
hiddenNumber("4. Everyone should watch hitchhikers for 2 reasons..."); // returns 42
hiddenNumber("There were 6 original avengers."); // returns 6
hiddenNumber("What is the number 1 question for doctor who fans? Who is your favorite doctor. It is not 3."); // returns 13
It also need to handle very, very large numbers! As such, looking at the total size of an int, you may find that an int is not large enough and instead need to use a long!
Writing hiddenNumber
Go to RunLengthEncoder.java (there is a drop down arrow by the file name in Zybooks). You will see the method stub for hiddenNumber has already been created. Obviously, you will want to loop through the String message, and every time you find a digit, you add that digit to the String num. This is concatenating the characters, but that is fine. Once you have ran through the entire String, you will want to call the following method
rtn = Long.parseLong("123"); // replace 123 with your num variable
Think about what does Long.parseLong do? If you are unsure, ask your fellow classmates or the TA.
Testing Hidden number
Go to TestRunLength.java and find the main method. Go ahead and uncomment out the lines around hiddenNumber(), so you can see what it prints out. Also, you should feel free to change up the messages, to see if it works in other cases. For example, what happens if no number is in the message.
Step 2 - expand(char x, int count)
This method is a helper method for runLengthDecoder. However, your goal is to only focus on what this method does. What does it do? Great question!
Expand takes a character and a number. It then builds a String with that character being repeated the same number of times as the number. For example
expand('x', 5); // returns "xxxxx"
expand('L', 2); // returns "LL"
expand(' ', 3); // returns " "
Writing expand(char x, int count)
Go to RunLengthEncoder.java and find the method stub that is already built for you.
You will want to build a loop that starts at 0 and runs till count, and for every iteration, add the character from the parameter to the return string.
Yes, this method is just one loop, but focus on the quest, and not what else it will be used for.
Testing expand(char x, int count)
Go back to TestRunLength.java, and uncomment the lines in the main method for the expand tests. You may want to test it with your own numbers and characters also.
Step 3 - runLengthEncoding(String str)
You finally get to write the RLE! It takes in a String, and encodes it as follows:
runLengthEncoding("RRRRSTTT"); // should return 4R1S3T
runLengthEncoding("a"); // should return 1a
runLengthEncoding("aa"); // should return 2a
runLengthEncoding("aaBb"); // should return 2a1B1b
Basically, it counts the number of times it sees a character. Then displays the total number of times the character shows up and the character. For simplicity’s sake, we will never be encoding digits! (or decoding)
Writing runLengthEncoding(String str)
First, this is a perfect time to look back through the class slides, as last unit we wrote a variation of runLengthEncoding in class!
What we are doing is looping through the String str, and counting the characters as we go. You need to first create a variable of type int that will represent the number of same characters and until the character changes increment your counter variable (so, look at the current character, and the character at the position before it).
Once the character changes, you add the total count (as a number) and the character before it to the String rtn.
You will then reset the counter, and continue with your loop.
When your loop is done, you will also need to figure out how to add the last character to rtn since inside the loop you are adding the previous character to rtn! So keep that in mind.
A few hints for writing this
- Really, go to the slides
- Write it in steps, and print out what is going on. You may want to uncomment your tests before you start writing
- Write it out on paper, as the hard part of this isn’t the code, but the logic.
Testing runLengthEncoding(String str)
Go back to TestRunLength.java, and uncomment the lines in the main method associated with runLengthEncoding. You may want to do this before you write the method, as it will help you debug running this with printlns.
runLengthDecoding(String s)
While we can encode the String, we also need to take the encoded String and convert it back to the original String. This is what runLengthDecoding does.
Examples:
runLengthDecoding("4R1S3T"); // should return RRRRSTTT
runLengthDecoding("1a"); // should return a
runLengthDecoding("2a"); // should return aa
runLengthDecoding("10a1B1b"); // should return aaaaaaaaaaBb
Some tips:
- You can assume numbers are followed by letters or special characters (not another number)
- Calling the expand() method, once you know your exact number (for example 3) and the character (B) - will help generate part of the String BBB.
- Numbers can be any range, and often will go above 10 (consider building a String based on Character.isDigit()), before converting it using Integer.parseInt(String)
See your hiddenNumber method for an example.
It is very important to take this method in steps, and think about the warm-up activity.
writing runLengthDecoding(String s)
For this method, you will want to look at how you can exploit the pattern of digits and letters. (What is your quest!)
You first will need to declare a temp String that will contain the digits found
You want to loop through the entire String, looking at each character individually (hint, write a loop that does this).
If the character is a digit or the String that contains the digits is empty, add it to a temp String that keeps the digits as you see them. Then when you see a non-digit, call the expand method using the character and the digits.
Hint: expand takes a character and an int, so using Integer.parseInt() you will need to convert your String that contains the digits to an int. Also remember to rest your String that contains the digits back to an empty String.
If you repeat this pattern, you should be able to work through an entire encoded String. Take your time, and figure it out in steps. Maybe get it working on 3A (or other single digit and character) and then 10A (or other digit that is longer than a single character). Then focus on getting it working for all strings.
Testing runLengthDecoding
This method is important to test as you write it! Line by line with print statements will help. As such, we recommend going to the main method, and uncommenting the lines involving runLengthDecoding.
You will also notice unitTests method is a good method to see how it all works, but for it to work, you will need to implement the following method.
formattedInfo(String name, int rawLength, int encodedLength, double compression)
In effort to make sure we are DRY in our programming, you will notice there is a method in TestRunLength called formattedInfo. It simply returns a formatted String from using String.format. Need a review of String.format? Go here
You will build a String to return that takes in A String name, int rawLength, int encodedLength, and double compression.
You will want to print out a String that looks like the following
DNA raw length is 980 while encoded DNA length is 673. 31.33% compression.
or like
Image raw length is 1214 while encoded Image length is 299. 75.37% compression.
Note that the percent is only 2 decimal places intentionally, and there is a new line at the end of the String!
You will build this String using String.format.
formattedInfoformattedInfo
This method can be done by returning the value returned from String.format. Some useful printf/String.format rules are:
- %% allows a % sign in the string
- %d for a number/digit
- %s for a String in the String
- %.2f for a floating point, that only goes two decimal places
- %n new line
It is important to know that you can also use a parameter more than once if you need it to repeat (hint: name).
Testing formattedInfo
There aren’t any tests written using it. You may want to write your own! Remember, using String.format, not printf!
When you run the full program, you will see it in action, as it shows the compression rate for an image and DNA.