The objective of this homework is to write three OpenMP programs, debug and test them on a Capital machine, and experimentally determine the gains from running them in parallel with 1, 2, 3, 4, 5, 6, 7, and 8 threads. The parallelizations are relatively simple, and the results should be interesting in terms of speedup. You should measure and plot the performance of each parallelization as a function of the number of threads, and analyze your observations. Can you see a correspondence between memory (re)use and maximum speedup?
You are responsible for two separate submissions for this assignment. First, submit your parallel code using the checkin program, either through the website or using Linux tools. Second, submit a report through Canvas. Both are detailed further below.
We will be performing automated testing on your output. Do not change the existing output format.
Below is an example:

    $ jacobi_1D 4000 200000
    data[400]: 1366.611240
    data[800]: 142.298734
    data[1200]: 5.074897
    data[1600]: 0.058852
    Data size : 4000 , #iterations : 200000 , time : 0.924085 sec

    $ jacobi_2D 800 2000
    Data : 800 by 800 , Iterations : 2000 , Time : 0.524361 sec
    Final data
    56111.442113 28472.061997 28460.727609 28460.727563
    28472.061997 23.332803 11.666517 11.666470
    28460.727609 11.666517 0.000094 0.000047
    28460.727563 11.666470 0.000047 0.000000

    $ mat_vec 25000 10000
    N=25000, M=10000
    c[0] = 49995000.000000
    c[3125] = 81245000.000000
    c[6250] = 112495000.000000
    c[9375] = 143745000.000000
    c[12500] = 174995000.000000
    c[15625] = 206245000.000000
    c[18750] = 237495000.000000
    c[21875] = 268745000.000000
    elapsed time = 0.050035

If you use the website to submit your assignment there will be initial tests performed. These tests do not indicate your final grade. They can however catch small mistakes in your submission.
You are responsible for submitting a report presenting and explaining your performance results.