Introduction

The objective of this PA is to write two sequential and two OpenMP programs for the knapsack problem using the dynamic programming approach: an unblocked and a blocked version. Check the Knapsack lecture, as well as the caching lecture on blocking the knapsack. In both cases, first write and debug a sequential version before you work on any parallelization. Work in a stepwise fashion for debugging purposes. When studying performance, run your codes on a quiet Capital machine.
  • PA3A: knapSeq and knap
    Start from the code skeleton knap.c provided in PA3.tar to create a sequential knapsack solver.
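
    For reference, the dynamic programming table holds in entry T[i][c] the best profit achievable using only the first i objects with capacity c: object i is either skipped or, if it fits, taken. Below is a minimal sketch of the table fill; the names (fill_table, T, weight, profit) are placeholders and will differ from those in the provided skeleton.

       /* sketch: fill the DP table; T has (n+1) x (cap+1) entries,
          weight[] and profit[] are 1-indexed                        */
       void fill_table(int n, int cap, int *weight, int *profit, int **T)
       {
           for (int c = 0; c <= cap; c++)
               T[0][c] = 0;                          /* no objects: profit 0 */
           for (int i = 1; i <= n; i++) {
               for (int c = 0; c <= cap; c++) {
                   T[i][c] = T[i-1][c];              /* skip object i */
                   if (weight[i] <= c &&
                       T[i-1][c - weight[i]] + profit[i] > T[i][c])
                       T[i][c] = T[i-1][c - weight[i]] + profit[i];   /* take it */
               }
           }
           /* optimal profit: T[n][cap]; the solution vector is found
              by walking back from T[n][cap]                          */
       }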

    Inputs are in kxx.txt files, where xx represents N, the number of objects.

    The file format is:

       //  first line:   # of objects, knapsack capacity
       //  next lines:   weight and profit of one object (1 object per line)
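
    The provided skeleton may already read this format; purely to illustrate the layout, here is a hedged sketch of a reader (read_input and the 1-indexed arrays are placeholder choices, not part of the skeleton):

       #include <stdio.h>
       #include <stdlib.h>

       /* sketch: read "N C" from the first line, then one
          "weight profit" pair per line for each object     */
       void read_input(const char *fname, int *N, int *C,
                       int **weight, int **profit)
       {
           FILE *fp = fopen(fname, "r");
           if (fp == NULL) { perror("fopen"); exit(1); }
           fscanf(fp, "%d %d", N, C);
           *weight = malloc((*N + 1) * sizeof(int));
           *profit = malloc((*N + 1) * sizeof(int));
           for (int i = 1; i <= *N; i++)
               fscanf(fp, "%d %d", &(*weight)[i], &(*profit)[i]);
           fclose(fp);
       }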
     

    The input file k50.txt has the solution given in k50.res.

    You can also use the simpler case, k10.txt, to debug your codes. k10.txt contains the following, and 'knapSeq k10.txt 1' should produce the output shown below:

    platte 69 # ty k10.txt 
    10 25
    9 10
    2 200
    7 100
    6 20
    4 15
    3 190
    7 159
    8 125 
    6 29
    4 200  
    platte 70 # knapSeq k10.txt 1 
    The number of objects is 10, and the capacity is 25.
    The optimal profit is 874 Time taken : 0.000003.
    Solution vector is: 
     --> 0 1 0 0 0 1 1 1 0 1 
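
    As a sanity check, this solution selects objects 2, 6, 7, 8, and 10, whose total weight is 2+3+7+8+4 = 24 (within the capacity of 25) and whose total profit is 200+190+159+125+200 = 874, matching the reported optimum.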
    

    Parallelize the loop that goes through all capacities for one object, as well as other easily parallelizable loops (initializations).
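
    Within one row of the table every entry depends only on the previous row, so the capacity loop for a given object can be split among threads. A hedged sketch, reusing the placeholder names from the earlier sequential sketch (the schedule clauses are reasonable defaults, not requirements):

       /* sketch: same table fill, with the capacity and initialization
          loops parallelized (compile with -fopenmp)                    */
       void fill_table_par(int n, int cap, int *weight, int *profit, int **T)
       {
           #pragma omp parallel for schedule(static)
           for (int c = 0; c <= cap; c++)
               T[0][c] = 0;
           for (int i = 1; i <= n; i++) {
               /* row i depends only on row i-1, so all capacities
                  for object i can be computed independently        */
               #pragma omp parallel for schedule(static)
               for (int c = 0; c <= cap; c++) {
                   T[i][c] = T[i-1][c];
                   if (weight[i] <= c &&
                       T[i-1][c - weight[i]] + profit[i] > T[i][c])
                       T[i][c] = T[i-1][c - weight[i]] + profit[i];
               }
           }
       }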

    Record times for the part of your code that computes the table, and speedups for 1, 2, 3, 4, 5, 6, 7, and 8 threads with the larger input files (k120.txt and k240.txt).
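
    One way to time only the table computation is to call omp_get_wtime() immediately before and after it; the speedup for p threads is then the sequential (or 1-thread) time divided by the p-thread time. A sketch, where fill_table_par() stands for whatever your table-fill routine is called:

       /* inside main(), around the table computation only
          (omp_get_wtime() needs #include <omp.h>)          */
       double t_start = omp_get_wtime();
       fill_table_par(n, cap, weight, profit, T);   /* the part being timed */
       double t_end = omp_get_wtime();
       printf("Table time: %f\n", t_end - t_start);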

    Write down your observations:

    • how does the parallel code compare to the sequential code?
    • at what number of threads does the speedup stop increasing?
    • what is your maximum speedup?

  • PA3B: Blocked knapBSeq and knapB
    Write a new sequential blocked knapsack code (starting from your previous sequential code). Note that blocks on a diagonal can be processed in parallel (see the cache lecture), and that on a diagonal the sum of the block row and column indices is constant.

    E.g., for a 4 by 4 blocked table we have:

    diagonal 0: (0,0)
    diagonal 1: (0,1) and (1,0)
    diagonal 2: (0,2), (1,1) and (2,0)
    diagonal 3: (0,3), (1,2), (2,1) and (3,0)
    diagonal 4: (1,3), (2,2) and (3,1)
    diagonal 5: (2,3) and (3,2)
    diagonal 6: (3,3)
    
    Only after convincing yourself that your blocked sequential code works should you parallelize it.
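
    One possible structure for the parallel version, shown only as a hedged sketch, is to sweep the diagonals in order and distribute the blocks of each diagonal among threads; nbi (number of block rows), nbc (number of block columns), and process_block() are placeholder names, and process_block(bi, bc) is assumed to fill its block with the unblocked code, rows in increasing order:

       for (int d = 0; d < nbi + nbc - 1; d++) {    /* one diagonal at a time */
           /* blocks (bi, bc) with bi + bc == d only need entries from
              the same block or from blocks on earlier diagonals,
              so they can be processed in parallel                      */
           #pragma omp parallel for schedule(dynamic)
           for (int bi = 0; bi < nbi; bi++) {
               int bc = d - bi;
               if (bc >= 0 && bc < nbc)
                   process_block(bi, bc);           /* fill this block of the table */
           }
       }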

    Record times for the part of your code that computes the table, and speedups for 1, 2, 3, 4, 5, 6, 7, and 8 threads with the larger input files (k120.txt and k240.txt).

    Write down your observations:

    • is your blocked sequential code faster than your unblocked sequential code?
    • how does the parallel code compare to the sequential code?
    • at what number of threads does the speedup stop increasing?
    • what is your maximum speedup?

    Submit your sequential and parallel codes, makefile, and timer code, i.e., all files necessary to compile your sequential and parallel codes by typing make. Your time and speedup statistics, along with your observations, are to be included in a lab report submitted to Canvas.

    How to tar files:

    • Enter the directory above the directory that contains the files you want to submit.
    • Use the command:
       tar cvf PA3.tar [dir]