Lab 2 OMP Tasks: Merge Sort
Download and untar the provided tasks.tar file. This is the
starting point for your experiments.
If needed be remind yourself what merge sort is with these fun animations
Time sequential code
- Start by running the sequential version on a quiet state capital machine (Albany to Trenton in the list)
Run it
a few times, and get a median run time for this.
- Test your code
by running it with a small input size, e.g., Merge_sort_debug 100. For all timing
experiments run your codes with the default (no explicit size on the command
line) size of 100,000,000.
Make it parallel
- Copy the sequential Merge_sort.c code into Merge_sort_parallel.c.
- Use OMP task parallelism to parallelize Merge_sort_parallel.c (leave Merge_sort.c alone for reference).
- Use an OMP parallel pragma to spawn threads in the main program. This is similar to the nfib(n) example given in the tasks lecture.
- Spawn two new
tasks, one for each recursive mergesort call.
- Adjust your Makefile. Make sure that you use
-O3
flag to compile all the files
- Keep in mind: the pragma parallel should only be executed once.
Also, don't merge the left and right halves of the array until they are
sorted, i.e. you need a taskwait.
First parallel version: no pruning
- Do not prune the task tree yet.
- Call this code Merge_sortNP (for not pruned).
- Run the parallel code for 1, 2, 3, 4, 5, 6, 7, 8 threads.
- See how the
run times compare to the sequential version.
- Why do you think you are getting
the results you are?
- Think about how many tasks are spawned. Bring this to the
discussion on this topic (later).
Second parallel version: pruning
- Prune your code by only spawning sufficiently large tasks.
- Call this code Merge_sortP.
- Run the parallel code for 1, 2, 3, 4, 5, 6, 7, 8
threads.
- Experiment with tasks sizes and create a speedup curve for your
preferred task size.
- See how the run times compare to the sequential and
unpruned versions.
- Think about how many tasks are spawned now.
- Bring your
observations to the discussion on this topic.
Third parallel version: spawning only one task
- Change your code to spawn only one task for one of the recursive calls,
and let the other recursive call be done by the parent task.
- Call this code
Merge_sortS.
- Experiment with it.
- What are your observations?
Fourth parallel version
For the sake of completeness, it would be
useful to gather data for the version where½ you spawn only one task, and do
no pruning.
Summarize your observations to take to the discussion on OMP
Tasks. You must complete the lab well in advance to leave enough time for
discussions with your group.