The model of parallel computation centers on the interactions among source code constructs, the runtime, and the hardware. It covers the aspects that drive the performance of parallel computations, and exposes the keys to high performance. It is not an analytic model; rather, it identifies elements and patterns that remain invariant across parallel languages, runtime systems, and parallel hardware. These elements and patterns have been defined precisely, and have been verified by instrumenting real runtime systems on actual hardware and comparing the measured results to the model's predictions.
The model was motivated by a desire to understand the basics of parallelism in a way that is independent of language features, runtime implementation strategies, and hardware architecture choices. The resulting model applies as much to hardware design as to language design. Intuition gained from abstract representations of hardware, constructs, and scheduling strategies remains valid when those representations are reduced to practice.
The model should provide useful guidance when evaluating different approaches to any of the three aspects: language, runtime, or hardware. It abstracts away non-essential details, allowing simulations that expose the essential behaviors across a wide range of implementations that share the same basic structure. Further, the model can be used to discover the best that can be done. Examples include the best scheduling strategy, and the resulting highest possible performance, for application code with a given structure running on hardware with a given structure; the performance delivered by the best possible network topology, given an application structure and a scheduling approach; and the minimum size of work unit within the application that can achieve, say, 50% utilization of given hardware when executed with a given runtime system.
Because of this abstraction, the space of possibilities can be explored quickly, giving confidence that a particular design or implementation is acceptably close to the best possible.
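As a rough illustration of the kind of question posed above (the minimum work-unit size needed to reach a target utilization), the following minimal sketch assumes a deliberately simple cost picture: every work unit pays one fixed runtime overhead, so utilization is work / (work + overhead). This formula and the names `utilization`, `min_work_unit`, and `sweep` are assumptions made for illustration only; they are not the model's own definitions.

```python
def utilization(work_per_unit, overhead_per_unit):
    """Fraction of core time spent on application work.

    Toy assumption (not taken from the model): each work unit incurs
    one fixed runtime overhead, in the same time units as the work.
    """
    return work_per_unit / (work_per_unit + overhead_per_unit)


def min_work_unit(target_utilization, overhead_per_unit):
    """Smallest work-unit size that reaches the target utilization.

    Solving u = w / (w + o) for w gives w = o * u / (1 - u).
    """
    if not 0.0 < target_utilization < 1.0:
        raise ValueError("target_utilization must be strictly between 0 and 1")
    return overhead_per_unit * target_utilization / (1.0 - target_utilization)


def sweep(overheads, target_utilization=0.5):
    """Explore several hypothetical runtime overheads in one pass."""
    return {o: min_work_unit(target_utilization, o) for o in overheads}


if __name__ == "__main__":
    # Hypothetical overheads, e.g. in cycles per scheduled work unit.
    for overhead, work in sweep([100, 400, 1600]).items():
        print(f"overhead={overhead}: work unit >= {work:g} for 50% utilization")
```

Under this assumption, 50% utilization requires each work unit to be at least as large as the per-unit overhead. A realistic exploration would replace the fixed-overhead formula with measured or simulated runtime and hardware behavior.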
Lastly, as a survey of papers on parallel languages, synchronization constructs, runtime systems, and parallel hardware shows, past research has tended to present individual attempts motivated by localized ideas, rather than guided by an invariant understanding of how parallel computation works that applies to all forms of parallelism. The model may provide a convenient framework within which to form a global picture of what is possible, and to place individual attempts within that larger context.