Innovations in EuroDSL and 3 example use cases
Innovations
- A primary innovation is the use of small "langlets", each focused on the common patterns within one area of programming, which are intermixed within the same application. A langlet is a collection of fewer than a dozen custom constructs that are mixed into sequential code such as C/C++, Java/Scala, or Python. This innovation increases productivity by embodying common patterns that are otherwise implemented by hand. It lowers the learning curve because each langlet is small and specific to one aspect of the problem. It also hides the complexities of parallel programming by presenting simple rules in patterns that are natural to the problem.
- In addition, innovations exist in the tools that back up the langlets. The information captured by the langlet constructs enables targeting strongly heterogeneous platforms in ways that reduce energy consumption. In the past, one of the hardest parts of creating such tools has been identifying the information they need, such as dependencies and the boundaries of the units of work to be scheduled onto parallel hardware. The innovation is to use custom constructs that do double duty: they fit the problem by embodying common patterns, and in doing so they capture the critical information needed by the toolchain and runtime system. (Albert and Armin, this is about your stuff, edit away!)
- Another innovation addresses the challenge of optimally scheduling work onto hardware resources during a run. This project applies innovative machine-learning-based approaches that take the critical information captured by the langlets as input and reduce both the amount of data communicated and the distance it travels during the run. In this way, energy consumption is reduced and computation is consequently sped up. (Erol, as I recall, this was where your interest lay?)
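As a rough illustration of the "double duty" idea, the sketch below shows how a construct mixed into ordinary Python code could both express a common pattern and record the dependencies and unit-of-work boundaries that a scheduler needs. Everything here is hypothetical: the `Langlet` class, its `task` construct, and the example task names are invented for illustration and are not part of any existing EuroDSL implementation. The ordering step uses the standard-library `graphlib` module (Python 3.9 or later) and stands in for the far more capable toolchain and runtime described above.

```python
from graphlib import TopologicalSorter


class Langlet:
    """Hypothetical langlet: each construct marks one unit of work and
    records which other units it consumes (illustrative only)."""

    def __init__(self):
        self.tasks = {}  # unit name -> function implementing the unit
        self.deps = {}   # unit name -> names of the units it consumes

    def task(self, *consumes):
        """Construct that declares a unit of work and its inputs."""
        def register(fn):
            self.tasks[fn.__name__] = fn
            self.deps[fn.__name__] = consumes
            return fn
        return register

    def order(self):
        """Derive a valid execution order from the captured dependencies."""
        return list(TopologicalSorter(self.deps).static_order())

    def run(self):
        """Run the units sequentially; a real runtime would instead hand
        the captured graph to a scheduler targeting parallel hardware."""
        results = {}
        for name in self.order():
            args = [results[d] for d in self.deps[name]]
            results[name] = self.tasks[name](*args)
        return results


ll = Langlet()


@ll.task()
def load():
    return [1, 2, 3]


@ll.task("load")
def square(xs):
    return [x * x for x in xs]


@ll.task("load", "square")
def total(xs, squares):
    return sum(xs) + sum(squares)


if __name__ == "__main__":
    print(ll.deps)   # dependency information captured by the constructs
    print(ll.run())  # results of every unit, keyed by unit name
```

The programmer writes only sequential-looking functions, while `ll.deps` holds the dependency graph as a side effect. A scheduler of the kind described above would take that graph as its input.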
3 use cases
- GUIs are ubiquitous in programming, and many libraries exist to help create effective graphical user interfaces productively. Yet despite this plethora, a library that is effective, easy to learn and use, and that inherently embeds a parallel execution model has arguably yet to be satisfactorily realized. One of this project's target outcomes is a langlet that makes the creation of GUIs simple and productive and that embodies a parallel computation model several times lighter weight than the best OS thread implementations. At the same time, it will eliminate the mental challenges associated with threading models and their derived or related programming models.
- Bio-informatics is a burgeoning field that promises deep benefits to society, such as custom cancer-fighting agents designed specifically for each individual's DNA. Researchers in this field routinely use sequential code because they focus on the biology and on modelling experimental behaviors. They have neither the time nor the interest to learn the details of parallel hardware or to undergo the painful activity of writing parallel code, and they do not have the funds to hire an expert to do this for them. As a result, their exploration of biological effects is less efficient. They must wait longer to see the result of a simulated experiment, and they compensate with a scatter-shot approach, blindly running many experiments at the same time in the hope of speeding up the search. This project will provide langlets targeted at common bio-informatics-specific patterns, which will make researchers more productive in creating their code while also being quick and easy for them to learn. The langlets will then take advantage of highly parallel hardware to reduce the time until a researcher gets an answer from a simulated experiment, making researchers more efficient and more effective in exploring the biology.
- Machine learning is exploding in usage and spreading to wide and varied areas of application. Yet data scientists and researchers in the field nearly universally write sequential code for machine learning algorithms and data visualization. This creates inefficiencies for data scientists similar to those that sequential code creates in bio-informatics and many other scientific fields. In this project we will create one or more langlets that capture the common patterns in machine learning code. These will bring similar benefits, increasing the productivity of data scientists as they search through alternative ways to visualize their data and to implement algorithms, or combinations of algorithms, that model the patterns embedded within data. Likewise, the implementation of the langlets will speed up machine learning calculations by effectively harnessing heterogeneous parallel hardware in ways that reduce the movement of data, which is the dominant factor in the energy consumed by calculations.