Work in an incremental fashion for debugging purposes. We will again be using the state capital machines. You must use these machines to measure and report performance. When studying performance, run your codes on a quiet capital machine.
We suggest the following steps:
For testing (and debugging for yourselves) your code should provide 3 verbose options. Use the code provided in verbose to print out the required details based on verbose option (the code for option 0 goes in main, and that for the other two goes in the recursive solve_kp function). Wherever you insert the code, do not make any changes to the print statements. We will use the exactly same output format for testing your submission. Here is a sample output file output.
You will have a recursive algorithm that will compute a series of values of C*. Print these values of C* and maximum profit possible at C*. Test your program with the same input files that were used for HW0.
NOTE: Since the solution to the knapsack problem may not be unique (multiple combinations of the object may have the same profit) you need to be careful about how to resolve ties so that the answer you produce is the same as what we generate. Here are the rules to follow.
Parallelize the implementation of HW3A using OpenMP's coarse grain parallelization (using the #pragma omp parallel section directive) where large chunks of work will be done in parallel, but the chunks themselves are sequential, e.g., parallelize the complete calls to solve_kp. Do not parallelize the for loops. An additional option to think about is that in each call itself, there are two sub-tables to fill up, so they too can be executed as two separate parallel chunks. Therefore, we have three variations possible as follows:
Do not expect to see much improvement in performance as yet. Observe that these recursive calls build a tree. Introduce a new command line argument called depth. Modify your code such that at a recursion level that exceeds depth the code executes sequentially. In other words, allow parallelism only until certain depth in the tree. For the performance results, we will ask you to experiment with this parameter.
Test your code and observe the performance on the standard test cases k120x4M.txt and k240x2M.txt(same as HW0). Observe how performance changes with depth. For your final report, you will be asked to study the performance on specific input data.
Parallelize the implementation of HW3A using OpenMP's parallelization directives (used in HW0A) where for loop is executed in parallel. For this part, parallelize the fill_table function only. Do not use any coarse grain parallelization as you did in HW3B.
Test your code and observe the performance on the standard test cases k120x4M.txt and k240x2M.txt (same as HW0). For your final report, you will be asked to study the performance on specific input data.
Combine the parallelizations of HW3B and HW3C using OpenMP's parallelization directives to reflect both coarse-grain and fine-grain parallelism. For this part, parallelize both the fill_table and solve_kp functions.
As before, test your code and observe the performance on the standard test cases k120x4M.txt and k240x2M.txt (same as HW0). Observe how performance changes with depth with the hybrid version. For your final report, you will be asked to study the performance on specific input data (to be posted later).
Your report should present the performance, i.e., speedup as a function of the number of processors, of the different versions of the code. In each case, you must report speedup with respect to the fastest sequential code, (the version you wrote for HW3A). In your report, we are going to ask you for specific performance experiments. All experiments are to be done on the capital machines in the CS department.
More details in report questions.
Submit, using the Checkin tab, your sequential and parallel codes, your time and speedup statistics, and observations as well as your complexity analysis in a report (maximum 4 pages). Your submission must contain a Makefile in addition to other files. The Makefile should generate executables named knapAlgoSeq, knapAlgoCoarse, knapAlgoFine and knapAlgoHybrid for HW3A, HW3B, HW3C and HW3D respectively. Submit HW3 files in HW3.tar (NOTE that there is separate submission for CODE and REPORT)..