Welcome to the CloudDSM project home page

This will be the main portal for the CloudDSM project during the proposal writing period. We will coordinate the proposal documents and discussions via this wiki.

Overview for Cloud Participants

The CloudDSM project makes multiple machines appear as a single entity to programmers, simplifying the effort of writing applications in which multiple Cloud-based machines cooperate to serve a single request. Current Cloud infrastructure takes advantage of multiple machines by distributing incoming requests among them. However, it cannot make multiple machines cooperate on a single request. This is a problem for Big Data or Analytics applications, for example, in which a single request often requires a large amount of computation. The single server assigned to the request takes too long to compute it, so the application cannot be used interactively. Currently, the only way to get fast response times on Cloud hardware is for application developers to rewrite their code to explicitly use multiple machines. Getting that code to use the machines efficiently is difficult and time-consuming.

The project will provide development tools for application developers that make development simple and highly productive. Developers will write their code as if it will run on a standard multi-core server, based on a shared memory model. The project will also provide a deployment tool that dynamically auto-provisions machines and coordinates them to work cooperatively on a given request.

The approach is related to Distributed Shared Memory (DSM), but takes advantage of advances in software tools to solve the performance issues that past DSM systems have suffered from. The resulting technology will enable domain experts to upload code from their desktop machines to the Cloud and make efficient use of multiple servers inside the Cloud provider's facilities.

An efficient DSM system that dynamically auto-provisions Cloud-based hardware will open up a new segment of customers: those who need the time-to-answer benefits of multiple machines cooperating effectively and who require the productivity benefits of a shared memory programming model.

Overview for Research Participants

The project creates a distributed shared memory system for use on Cloud hardware. It will address performance issues by auto-sizing work such that each inter-machine fetch transfers a large chunk of data. Each chunk will be sized so that the time spent computing on it exceeds the overhead of the network transfer that fetched it, thus ensuring efficient use of the machines.
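
The chunk-sizing idea can be illustrated with a small sketch. This is not the project's implementation. It assumes a simple cost model in which a fetch costs a fixed latency plus bytes divided by bandwidth, and computation costs a constant time per byte. The function name min_chunk_bytes and all of its parameters are hypothetical, chosen only for illustration.

```python
import math
from typing import Optional


def min_chunk_bytes(
    latency_s: float,
    bandwidth_bytes_per_s: float,
    compute_s_per_byte: float,
    work_to_transfer_ratio: float = 1.0,
) -> Optional[int]:
    """Smallest chunk size whose compute time covers its transfer time.

    Assumed cost model (an illustration, not taken from the project):
      transfer_time(b) = latency_s + b / bandwidth_bytes_per_s
      compute_time(b)  = b * compute_s_per_byte
    We want compute_time(b) >= work_to_transfer_ratio * transfer_time(b).
    """
    per_byte_transfer_s = 1.0 / bandwidth_bytes_per_s
    # Net compute time gained per byte once the per-byte transfer cost is paid.
    margin = compute_s_per_byte - work_to_transfer_ratio * per_byte_transfer_s
    if margin <= 0:
        # Each byte costs at least as much to move as to process, so no
        # chunk size can amortize the fixed latency under this model.
        return None
    return math.ceil(work_to_transfer_ratio * latency_s / margin)


if __name__ == "__main__":
    # Hypothetical numbers: 1 ms latency, 1 GB/s link, 10 ns of work per byte.
    size = min_chunk_bytes(1e-3, 1e9, 1e-8)
    print("minimum chunk size in bytes:", size)
```

Under this model, a workload that does little computation per byte has no chunk size that pays for its own transfer, which is why the function returns None in that case rather than a size.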