End-User View of CloudDSM
The CloudDSM platform consists of four parts:
- A set of pragma systems, "embedded" languages, and language features, designed to be convenient and productive. The application developer chooses one of these and uses it to supply information to the CloudDSM development tools.
- Development tools. They run as the first step of the standard build process and feed the rest of the toolchain; the end result is a standard executable. They use the information supplied through the pragmas or language features to transform the code into a form that runs efficiently on the target Cloud machines. Their main task is to rearrange the code so that each call to the DSM system transfers a large chunk of data, which lets computation on that data dominate the overhead of the DSM activity.
- A DSM runtime system. It handles machine-to-machine communication while an application runs, and maps addresses so that the same pointers are valid on all machines. Together, the communication and mapping make the collection of Cloud servers appear as a single desktop machine, which simplifies application development.
- A deployment tool. The end user interacts with this tool to start execution of their application. The deployment tool runs as an application inside the standard Cloud software stack and uses the API of that stack to provision machines dynamically. The end user supplies an executable that was previously created with the CloudDSM development tools, and the deployment tool then runs it on the Cloud infrastructure.
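The reason the development tools aim for large transfers per DSM call can be seen with a simple cost model. The sketch below is only an illustration, not part of CloudDSM: the function name `overhead_fraction`, the fixed per-call overhead, and the per-byte compute cost are all assumed values chosen to show the trend. The per-byte transfer cost is left out because it is the same regardless of how the data is chunked.

```python
def overhead_fraction(total_bytes, chunk_bytes, per_call_overhead_s, compute_s_per_byte):
    """Fraction of run time spent on fixed per-call DSM overhead.

    Hypothetical model: every DSM call pays a fixed overhead, and the
    computation cost is proportional to the number of bytes processed.
    """
    calls = -(-total_bytes // chunk_bytes)      # ceiling division
    overhead = calls * per_call_overhead_s      # total fixed cost of all calls
    compute = total_bytes * compute_s_per_byte  # useful work on the data
    return overhead / (overhead + compute)


# Assumed numbers, for illustration only.
TOTAL_BYTES = 1_000_000_000     # 1 GB of application data
PER_CALL_OVERHEAD_S = 1e-4      # 100 microseconds of fixed cost per DSM call
COMPUTE_S_PER_BYTE = 1e-8       # 10 nanoseconds of computation per byte

# Many small transfers: overhead dominates the run time.
fine = overhead_fraction(TOTAL_BYTES, 1_000, PER_CALL_OVERHEAD_S, COMPUTE_S_PER_BYTE)

# Few large transfers: computation dominates the run time.
coarse = overhead_fraction(TOTAL_BYTES, 10_000_000, PER_CALL_OVERHEAD_S, COMPUTE_S_PER_BYTE)

print(f"fine-grained overhead fraction:   {fine:.3f}")
print(f"coarse-grained overhead fraction: {coarse:.5f}")
```

Under these assumed numbers, moving the same data in 1 KB chunks leaves most of the run time in DSM overhead, while 10 MB chunks make the overhead negligible next to the computation.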
The CloudDSM deployment tool is launched as a SaaS application and appears as a portal on the web. It is itself a Cloud application, which sits on top of whatever Cloud stack the hosting provider uses. The deployment tool can be installed by the end user; alternatively, we foresee the partner who develops the tool making a permanent installation available as a service. Either way, once the deployment tool is up and running, the end user interacts with it: they upload their executable, and the tool then launches a run of their application.
Thus we have a three-layer stack: the end application runs on top of the deployment tool application, which in turn runs on top of the Cloud stack.
The application itself contains no awareness of the Cloud. The same executable runs on a desktop, on a cluster, and on the distributed Cloud. Future projects may also enable running on supercomputers and Grids without changing the source, although a recompile may be required so that the tools can tune the executable to the target machines. See http://opensourceresearchinstitute.org/pmwiki.php/PRT/HomePage for technology that helps enable this.
The one thing the application programmer must do to use the CloudDSM system is to include code that divides the work into pieces, and to make the data structures, plus the code that accesses them, known to the tools. The project will develop new OpenMP pragmas for this purpose. It also includes a language, called Reo, that will make this simpler, and a pattern, called Divider Kernel Undivider. The value of all three is that they feed the tools, which then have everything they need to adapt the code to different machines and deployment technologies.
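The general shape of a divide, compute, and recombine structure such as Divider Kernel Undivider can be sketched as follows. The text above names the pattern but does not define its interface, so everything here is an assumption made for illustration: the function names `divider`, `kernel`, `undivider`, and `run`, the sum-of-squares workload, and the local thread pool, which merely stands in for whatever machines the tools would target.

```python
from concurrent.futures import ThreadPoolExecutor


def divider(data, num_pieces):
    """Split the work into contiguous, roughly equal pieces (hypothetical interface)."""
    if num_pieces < 1:
        raise ValueError("num_pieces must be at least 1")
    if not data:
        return []
    size = -(-len(data) // num_pieces)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def kernel(piece):
    """Compute on one piece, independently of all other pieces."""
    return sum(x * x for x in piece)


def undivider(partials):
    """Combine the per-piece results into the final answer."""
    return sum(partials)


def run(data, num_pieces=4):
    """Drive the three stages; the kernel does not know where it executes."""
    pieces = divider(data, num_pieces)
    # A local thread pool is used here only as a stand-in for remote workers.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(kernel, pieces))
    return undivider(partials)


if __name__ == "__main__":
    print(run(list(range(10)), num_pieces=3))
```

Because the kernel touches only the piece it is given, the number of pieces and the place where each piece runs can change without touching the application logic, which is the kind of freedom the tools need to adapt code to different machines.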