*deep breath*
Before we can discuss the complexities of writing parallel programs one point needs to be made clear that seems to escape the notice of most programmers and even graduate students in computer science (myself included).
Parallelism is an optimization and not a correctness criterion.
You might ask yourself, why is parallelism not included in the list of criteria for a program to be correct? This makes no sense!
The answer is that the academic definition of program correctness only concerns itself with ensuring that a program terminates with the correct output for a given input. Notice that duration of computation is not part of this informal definition of correctness -- even though performance may be part of the list of requirements for a program to be considered useful. This is because a computation is nothing more than a ordered (total) set of deterministic transformations on a given set of inputs.
How often have you seen the following situation occur?
A software architect is given a set of requirements describing what the program must do to be considered a success (hopefully). Note this list can be extremely detailed to as a vague a directive as, "Customer X has Problem Y, solve it." Most requirements lists fall somewhere in between these two extremes resulting in many iterations of prototypes and specification changes before an acceptable result is produced and there are very good reasons why this state of affairs is quite acceptable.
The architect immediately begins to design a parallel program as the requirements have performance metrics included.
*ouch*
This is both problematic and perplexing as parallel programs are often so complex that they defy reasoning abilities of programmers and verification tools. This means that determining the cause of a perplexing result generated by a parallel program is difficult, or even impossible, leaving all parties in an uncomfortable lose-lose position.
A sane, albeit potentially more expensive, approach is to maintain a reference sequential (non-parallel) implementation against which the results can be verified. This approach has an additional advantage, benchmarking the sequential version often reveals the subsections that actually *need* parallelization -- most parallel implementation that I have seen are over architected.
So again, parallelism is an often needed optimization and not a correctness issue.




No comments:
Post a Comment