Foundations of Fine Grain Parallelism
CS/ECE 560: Spring 2011
Combined on-campus and online sections
Specific course details are being revised; the information below is subject
to change.
Class meets: Tu-Th 9:30-10:45; CS 420
Announcements:
Be sure to watch this space: late-breaking information will be posted here
rather than sent by email.
Also be sure to frequently check
the Class Schedule
Course Overview
Objectives:
CS/ECE 560, which teaches the foundations of a model of parallelism called
the polyhedral equational model, is unlike any other course offered anywhere
in the world. Here, you will
- learn how to optimize compute-intensive kernels by exploiting fine-grain
parallelism, memory and locality for a variety of target architectures
such as multi-core processors, GPUs, FPGAs, and vector units (SSE
instructions);
- understand the foundations of the polyhedral model, a
mathematical formalism to specify, develop, analyze and transform
high-level programs for parallelism, locality, memory, and complexity
optimization;
- appreciate the elegance of equational programming in the
polyhedral model, where computations are specified as high-level
mathematical equations (see the sketch after this list); and
- acquire the skills to do research in architectural and programming
abstractions for massive parallelism.
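To give a concrete flavor of the equational view, here is a minimal sketch
of our own (the kernel choice and function name are illustrative
assumptions, not course material): a matrix-vector product, written first
as an equation and then as the C loop nest a polyhedral compiler would
analyze. Equationally, for all i with 0 <= i < N,
    y[i] = sum over 0 <= j < N of A[i][j] * x[j]
The polyhedral model reads the loop nest the same way: its iteration space
is the set of integer points in the polyhedron
D = { (i, j) : 0 <= i < N, 0 <= j < N }.

    /* Matrix-vector product y = A*x: the canonical kind of kernel a
     * polyhedral compiler analyzes.  Iteration domain:
     *   D = { (i, j) : 0 <= i < N, 0 <= j < N }
     * Iterations with distinct i are independent (fine-grain parallel);
     * for a fixed i, the j loop is a reduction.
     */
    #include <stddef.h>

    void matvec(size_t n, const double A[n][n], const double x[n],
                double y[n])
    {
        for (size_t i = 0; i < n; i++) {    /* parallel across i */
            y[i] = 0.0;
            for (size_t j = 0; j < n; j++)  /* reduction over j */
                y[i] += A[i][j] * x[j];
        }
    }

Because the domain D and the array accesses A[i][j], x[j], and y[i] are all
affine functions of (i, j), a polyhedral compiler can manipulate them
symbolically to derive parallel, tiled, or vectorized schedules
automatically.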
The polyhedral model is now used for automatic parallelization in a number
of production compilers such as gcc, IBM's XL series, and Reservoir Labs'
R-Stream. There are active research groups on the polyhedral model at MIT
(Amarasinghe), Illinois (Padua), Utah (Hall), Ohio State (Sadayappan),
Louisiana State (Ramanujam), IBM Research (Renganarayana, Bondhugula, etc.),
Leiden (Kienhuis, Deprettere), and many groups in France: IRISA, Rennes
(Derrien, Quinton), ENS Lyon (Feautrier, Darte, Alias), Bordeaux (Barthou),
and INRIA (Cohen, Bastoul, etc.).
Rajopadhye is one of the inventors of the polyhedral model and has been
active in the field since his Ph.D. dissertation (Utah, 1986). At CSU, we
take a unique view of the polyhedral model, combining the analytic,
quantitative power of the model with the expressiveness and clean semantics
of declarative, functional programming, in the form of equational
programming.
Motivation
Computer architecture is in constant flux, with ever-increasing parallelism
that now comes in many forms: instruction-level parallelism, SIMD
instructions, and, more recently, multiple cores.
- Emerging "many-core" processor architectures in the general-purpose arena
are taking us towards tens to hundreds of processors on the same processor
die.
- Graphics Processing Units (GPUs) are specialized hardware devices,
previously used exclusively to accelerate display engines, that are finding
their way into general-purpose computing (GPGPU).
- Embedded systems offer yet other forms of parallelism, such as FPGAs
(Field Programmable Gate Arrays: chips that can be re-programmed "at the
circuit level," allowing tremendous performance gains for certain niche
applications) and ASICs (Application-Specific Integrated Circuits, which
provide highly tuned but specialized functionality).
Future processors, whether general or specialized, will have a large number
of cores, typically fine-grain, possibly with dedicated, often distributed
memories, and lower power consumption. Collectively, they will be much more
powerful than today's machines.
In the past, programmers were shielded from the details of processor
evolution. A single architectural abstraction (the von Neumann machine) and
a single programming/algorithmic abstraction (the random access machine, or
RAM) allowed a clean separation of algorithmic, programming, and
architectural advances: programmers could simply wait for the next
generation of processors to speed up their code, an era dubbed "La-Z-Boy
Programming." These abstractions are crumbling today; their demise has been
called the end of La-Z-Boy Programming. Programmers, library writers, and
application developers must be aware of parallelism and memory locality at
multiple levels, or risk losing much of the Moore's Law gains of modern
architectures. This raises a number of issues.
- The immediate goal is to develop highly tuned applications that best
exploit today's emerging architectures.
- The medium-term challenge is to ensure that the skills learned in doing
so carry over to tomorrow's architectures, even though we don't yet know
their details.
- The long-term goal is foundational research that renders the first two
challenges moot. This can be achieved through automatic compilation and
code-generation tools, and will enable the "return to La-Z-Boy
Programming."
Details