|
Parralel Computing Turorial |
|
|
|
|
Written by Hemanshu Patel
|
|
Sunday, 23 December 2007 |
|
Page 12 of 18
Designing Parallel ProgramsAutomatic vs. Manual Parallelization * Designing and developing parallel programs has characteristically been a very manual process. The programmer is typically responsible for both identifying and actually implementing parallelism.
* Very often, manually developing parallel codes is a time consuming, complex, error-prone and iterative process.
* For a number of years now, various tools have been available to assist the programmer with converting serial programs into parallel programs. The most common type of tool used to automatically parallelize a serial program is a parallelizing compiler or pre-processor.
* A parallelizing compiler generally works in two different ways:
o Fully Automatic + The compiler analyzes the source code and identifies opportunities for parallelism. + The analysis includes identifying inhibitors to parallelism and possibly a cost weighting on whether or not the parallelism would actually improve performance. + Loops (do, for) loops are the most frequent target for automatic parallelization.
o Programmer Directed + Using "compiler directives" or possibly compiler flags, the programmer explicitly tells the compiler how to parallelize the code. + May be able to be used in conjunction with some degree of automatic parallelization also.
* If you are beginning with an existing serial code and have time or budget constraints, then automatic parallelization may be the answer. However, there are several important caveats that apply to automatic parallelization: o Wrong results may be produced o Performance may actually degrade o Much less flexible than manual parallelization o Limited to a subset (mostly loops) of code o May actually not parallelize code if the analysis suggests there are inhibitors or the code is too complex o Most automatic parallelization tools are for Fortran
* The remainder of this section applies to the manual method of developing parallel codes.
Designing Parallel ProgramsUnderstand the Problem and the Program * Undoubtedly, the first step in developing parallel software is to first understand the problem that you wish to solve in parallel. If you are starting with a serial program, this necessitates understanding the existing code also.
* Before spending time in an attempt to develop a parallel solution for a problem, determine whether or not the problem is one that can actually be parallelized.
o Example of Parallelizable Problem:
Calculate the potential energy for each of several thousand independent conformations of a molecule. When done, find the minimum energy conformation.
This problem is able to be solved in parallel. Each of the molecular conformations is independently determinable. The calculation of the minimum energy conformation is also a parallelizable problem.
o Example of a Non-parallelizable Problem:
Calculation of the Fibonacci series (1,1,2,3,5,8,13,21,...) by use of the formula:
F(k + 2) = F(k + 1) + F(k)
This is a non-parallelizable problem because the calculation of the Fibonacci sequence as shown would entail dependent calculations rather than independent ones. The calculation of the k + 2 value uses those of both k + 1 and k. These three terms cannot be calculated independently and therefore, not in parallel.
* Identify the program's hotspots: o Know where most of the real work is being done. The majority of scientific and technical programs usually accomplish most of their work in a few places. o Profilers and performance analysis tools can help here o Focus on parallelizing the hotspots and ignore those sections of the program that account for little CPU usage.
* Identify bottlenecks in the program o Are there areas that are disproportionately slow, or cause parallelizable work to halt or be deferred? For example, I/O is usually something that slows a program down. o May be possible to restructure the program or use a different algorithm to reduce or eliminate unnecessary slow areas
* Identify inhibitors to parallelism. One common class of inhibitor is data dependence, as demonstrated by the Fibonacci sequence above.
* Investigate other algorithms if possible. This may be the single most important consideration when designing a parallel application.
|
|
Last Updated ( Sunday, 23 December 2007 )
|