Integrating proto-runtime with Java Home Page
This description of the work was cut and pasted from an email, and will be updated periodically as the project evolves:
The project is to integrate the multi-lang version of proto-runtime with Java. What this means, from an application programmer's perspective, is that coders will be provided with classes that encapsulate parallelism behavior. For example, instead of using Java's built-in threads and locks, the programmer could call methods in our Vthread library. The programmer would use them the same way as Java's built-in thread constructs, except the Vthread constructs would be implemented with proto-runtime, and so would have lower overhead.
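To make the programmer's view concrete, here is a minimal sketch of what calling code might look like. The Vthread class and its start/join methods are assumptions for illustration, chosen to mirror java.lang.Thread; the real Vthread API is not specified here. So that the sketch runs on its own, Vthread is a stand-in that delegates to an ordinary Java thread. In the actual library those two methods would instead go through JNI into proto-runtime.

```java
/**
 * Hypothetical stand-in for a Vthread class (the name and methods are
 * assumptions). It mirrors the start/join shape of java.lang.Thread.
 */
final class Vthread {
    private final Runnable body;
    // Stand-in only: the real library would ask proto-runtime for a
    // virtual processor instead of creating a Java thread.
    private Thread standIn;

    Vthread(Runnable body) {
        this.body = body;
    }

    void start() {
        standIn = new Thread(body);
        standIn.start();
    }

    void join() throws InterruptedException {
        standIn.join();
    }
}

final class VthreadDemo {
    static long sum(int[] data, int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += data[i];
        }
        return total;
    }

    /** Sums the two halves of the array in two Vthreads, then combines them. */
    static long parallelSum(int[] data) {
        int mid = data.length / 2;
        long[] partial = new long[2];
        Vthread lower = new Vthread(() -> partial[0] = sum(data, 0, mid));
        Vthread upper = new Vthread(() -> partial[1] = sum(data, mid, data.length));
        lower.start();
        upper.start();
        try {
            lower.join();
            upper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return partial[0] + partial[1];
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(new int[] {1, 2, 3, 4, 5}));
    }
}
```

The point is only that the calling code reads like ordinary Java threading code; what changes is what sits underneath start and join.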
The real value, though, is that this will make a growing body of innovative parallel languages available to Java programmers. They will still write Java programs, in the normal Java environment, but they will gain new classes that contain methods for parallelism constructs. For example, I have a language called HWSim, which is good for event-based simulations of a large number of interacting entities. Also, we have StarSs, Cilk, OpenMP, and so on, all implemented on top of proto-runtime, so those all become available to Java programmers.
The most valuable part, though, is that proto-runtime makes it fast and easy to write new languages, such as domain-specific languages. This will open the door, for example, to parallel rendering engines written entirely in Java, which will take advantage of the multiple cores available. Other examples are parallel GUI toolkits, parallel implementations of modelling languages, and so on. It also makes C-based code for high-performance domain languages available to pure Java programmers, without them even having to learn JNI.
Okay, so that's the value of the project. Now, the technical side of how to implement it.
The first step will be to learn the proto-runtime, so that you understand the details of the code, especially the base concepts such as virtual-processor and task, and the way the proto-runtime primitives suspend those, and the way proto-runtime handles requests from them.
Now, I am about to describe many details about the project, but they involve aspects of the proto-runtime code, so the description won't make sense until you've completed step 1. Just the same, reading the details now will give you some sense of what it is you need to learn about proto-runtime, within step 1.
The key thing you need to know, in order to understand what I'm about to say, is that proto-runtime creates its own virtual processors. Each virtual processor (VP) has its own stack, and its own program-counter. The proto-runtime provides assembly language operations that suspend a virtual processor, by saving its program-counter, and saving its stack, and then switch the physical core over to a different program-counter and stack.
In fact, a proto-runtime instance is born into OS threads, where there is one OS thread pinned to each core, and each of those threads is born running the proto-runtime code. But the proto-runtime code quickly swaps out the program-counter and stack of the OS thread and replaces them with the program-counter and stack of virtual processors that proto-runtime created itself.
In other words, the OS manages a thread inside the kernel, and that thread has a stack and program counter, and the OS keeps a variable that records which OS thread is currently running on a given core. But that thread is no longer running there! The proto-runtime has created its own stack and program-counter, and switched the core over to the proto-runtime's stack and program-counter. Fortunately, when the OS suspends a thread, it just records the current value in the core's stack-pointer register, along with the program-counter value. So the OS doesn't know it, but when it thinks it is suspending its OS thread, it is actually suspending the proto-runtime created thread. That's not critical to understand, but hopefully it helps clear up the mystery of how proto-runtime can create its own threads without breaking the OS.
The second thing you need to know about proto-runtime is how it is started up. The way it works now, the application programmer writes a normal C program, and in the main, they call a "start proto-runtime" function, which starts the proto-runtime system. They then call a "create process" function, which causes the proto-runtime system to create a process inside the proto-runtime. The command that creates a process is handed a pointer to a function. The process is born with a single virtual processor inside the process, and that virtual processor is born running the function that was handed to the "create process" command. That is called the "seed" function. That function contains code that calls the "start" commands for the languages that will be used in the application. The seed function then uses constructs from those languages to create virtual processors and/or tasks. Those created VPs/tasks then execute application code that performs the work.
So, once you've seen the code and understood the details of the assembly code that switches the program counter and stack, then the second step will be to learn the JNI interface in Java, and do some experiments. The experiments will show how Java's JVM reacts to having the program-counter and stack of its main thread swapped out and replaced by a proto-runtime created virtual processor.
The idea will be to figure out whether Java's JVM is compatible with the proto-runtime mechanism of swapping out the program-counter and stack. If it is, and the JVM continues to function correctly, then the next experiment will be to figure out a way to communicate a Java function to the proto-runtime system, such that a new virtual processor can be born running that Java function, or an existing virtual processor can be made to jump to that Java function.
The JNI has mechanisms for these tasks -- it has ways for C functions to invoke Java functions -- so this experiment will be mostly about following the JNI tutorials and getting that working.
The third step will be to make the Java interface that starts up the proto-runtime system, and starts a process in it.
I believe there are two basic approaches that can be taken to starting up the proto-runtime system. One is to have a single JVM that all the threads share; the other is to have a separate JVM for each core.
Either way, during startup, the proto-runtime creates an OS thread for each core and pins that thread to the core. This happens in the existing C code. The OS thread is born running proto-runtime code. So, the task here is to write a birth function that either attaches the OS thread to the shared JVM or makes a new JVM for that pinned thread (we will have to experiment to find out which is better).
Now, if we go with a separate JVM for each core, then the birth function will make the JNI call that creates a JVM. That way the JVM is running in that pinned thread. The JVM will have to be born running a special Java program. This special Java program does one thing: it uses JNI to call proto-runtime code. The called C function uses the proto-runtime to suspend the thread, which consequently also suspends the JVM and switches the core over to the proto-runtime code.
The proto-runtime can create new proto-runtime virtual processors (VP), and cause the JVM to start executing Java code in those new VPs.
The end effect will be that the machine has an OS thread pinned to each core, and that pinned OS thread has a suspended JVM, and both are under proto-runtime control. All of these threads are sitting idle, part of the proto-runtime system. The proto-runtime system will put them to work once a language creates some work.
This link talks about creating a JVM inside a C thread: http://docs.oracle.com/javase/1.5.0/docs/guide/jni/spec/invocation.html
An alternative may be to have all the proto-runtime threads attached to the same JVM, via the JNI AttachCurrentThread() call. We will need to gain more understanding and do experiments in order to see which makes more sense. The thing that concerns me is the garbage collection.
Once the proto-runtime has the suspended JVM(s), it can cause a JVM to execute the "birth" function in a new virtual processor created by the proto-runtime. How to do this was worked out in the experiments in step 2, above. So, the last part of this step is to get the birth function for the new process running in the seed VP of that process.
After that is complete, the next step is to get suspend and resume working. If a VP is suspended, then to resume it, the proto-runtime just does the same as it currently does for a VP running a C application function. This will cause the VP to resume just past the point that it invoked the proto-runtime primitive that suspended it. For VPs that were running Java code, the first thing past the suspend will be inside the JNI call. This will just return from the JNI call to the Java code. You'll write both the Java side and C side of the JNI call that invokes the existing proto-runtime suspend, and when it resumes, cause it to return from the JNI call, back into the JVM.
The key thing to learn in this step is how to suspend and resume a VP that is executing Java application code, and along with that, learn whether it makes more sense to have one JVM, and attach the pinned threads to that, or rather to have multiple JVMs, one created inside each pinned thread (multiple JVMs may cause problems with garbage collection!).
Here's how it will work: a Java program will be executed in the normal way. This Java program will use classes provided as parallelism libraries. These libraries have methods that use the JNI interface to invoke C functions. One of these methods will use JNI to invoke the proto-runtime C function that starts the proto-runtime system. Another method will use JNI to invoke the proto-runtime C function that creates a new process. It will use the approach worked out in the steps above to hand a "seed" Java function to the "create new process" C function.
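The sequence above might look roughly like this from the Java side. Everything named here is hypothetical: the ProtoRuntimeLib class, the SeedFunction interface, the native method names, and the "protoruntime_jni" library name are placeholders, and the C adaptors behind the native methods still have to be written. When the native library is not present, the sketch falls back to running the seed on the calling thread, purely so that it can be compiled and tried without proto-runtime installed.

```java
/** Hypothetical type for the "seed" function handed to "create process". */
interface SeedFunction {
    void run(Object initData);
}

/** Hypothetical parallelism-library class; all names are assumptions. */
final class ProtoRuntimeLib {
    // "protoruntime_jni" is a placeholder name for the C adaptor library.
    private static final boolean NATIVE_LOADED = tryLoad("protoruntime_jni");
    private static boolean started = false;

    private static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // no native library: use the stand-in path below
        }
    }

    // Would be implemented by C adaptor functions that call the existing
    // proto-runtime "start" and "create process" functions.
    private static native void startNative();
    private static native void createProcessNative(SeedFunction seed, Object initData);

    static synchronized void start() {
        if (started) {
            return;
        }
        if (NATIVE_LOADED) {
            startNative();
        }
        started = true;
    }

    static synchronized void createProcess(SeedFunction seed, Object initData) {
        if (!started) {
            throw new IllegalStateException("call start() before createProcess()");
        }
        if (NATIVE_LOADED) {
            createProcessNative(seed, initData);
        } else {
            // Stand-in only: run the seed right here instead of in a seed VP.
            seed.run(initData);
        }
    }
}

final class SeedDemo {
    public static void main(String[] args) {
        ProtoRuntimeLib.start();
        ProtoRuntimeLib.createProcess(
                data -> System.out.println("seed running with " + data), "hello");
    }
}
```

On the C side, the adaptor behind createProcessNative would have to keep a reference to the seed object and call its run method from the seed VP, using the JNI mechanisms for invoking Java methods from C that the step 2 experiments cover.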
So, application programmers will write their code in Java, once, and then when it runs on a particular machine, that code will be dynamically linked to whatever proto-runtime and language plugins are on that machine.
The next thing needed is a class that acts as the gateway to the proto-runtime primitives that suspend a virtual processor and switch over to the proto-runtime. This class has static methods that use JNI to interface to the proto-runtime C functions. These will be used to create Java-based wrapper libraries. So, each proto-runtime call that is used in a C-based wrapper library needs an equivalent Java method, which in turn uses JNI to invoke the proto-runtime primitive. This is how the Java wrapper library will invoke the proto-runtime assembly code that suspends a virtual-processor and causes the proto-runtime request handling to start.
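A minimal sketch of such a gateway, and of one wrapper-library class built on it, follows. The names PRGateway, sendRequestAndSuspend, and VthreadMutex, the string encoding of requests, and the "protoruntime_jni" library name are all assumptions for illustration; the real set of gateway methods is whatever the C wrapper libraries already call. The Backend interface is an addition of this sketch, not part of the plan above. It is there only so the wrapper code can be exercised before the native side exists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical gateway to the proto-runtime primitives; names are assumptions. */
final class PRGateway {
    /** Seam that lets wrapper libraries run without the native library. */
    interface Backend {
        Object sendRequestAndSuspend(Object request);
    }

    // Would be implemented by a C adaptor that invokes the proto-runtime
    // primitive. The call returns only after the VP has been resumed.
    private static native Object sendRequestAndSuspendNative(Object request);

    // Default backend goes through JNI; it needs loadNative() to be called first.
    private static volatile Backend backend = PRGateway::sendRequestAndSuspendNative;

    static void loadNative() {
        System.loadLibrary("protoruntime_jni"); // placeholder library name
    }

    static void setBackend(Backend replacement) {
        backend = replacement;
    }

    static Object sendRequestAndSuspend(Object request) {
        return backend.sendRequestAndSuspend(request);
    }
}

/** Hypothetical wrapper-library class: each method is one request to the runtime. */
final class VthreadMutex {
    private final int id;

    VthreadMutex(int id) {
        this.id = id;
    }

    void lock() {
        PRGateway.sendRequestAndSuspend("mutex_lock:" + id);
    }

    void unlock() {
        PRGateway.sendRequestAndSuspend("mutex_unlock:" + id);
    }
}

final class GatewayDemo {
    public static void main(String[] args) {
        // Recording backend in place of the native one, for a dry run.
        List<Object> log = new ArrayList<>();
        PRGateway.setBackend(request -> {
            log.add(request);
            return null;
        });
        VthreadMutex mutex = new VthreadMutex(1);
        mutex.lock();
        mutex.unlock();
        System.out.println(log);
    }
}
```

From the Java caller's point of view, a suspend is just a native call that takes a while to return: the VP is switched out inside the C adaptor and, once resumed, execution continues with the return from the JNI call.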
That will be the last thing needed -- after that, each language just needs its wrapper library translated to Java, which is separate from the proto-runtime, and minimal work. Then, the language should work, from Java code, even though the language was implemented in C, and the application code is written in Java. The application code will run inside the JVM(s) that was (were) created when the proto-runtime started up.
All the parallelism related activity will happen in C, inside the proto-runtime code and the plugin code. But the application work will happen in Java, and the JNI will be used to move between the two worlds.
My goal is for the proto-runtime code to be used as-is, without changing it. All the adaptation to Java should be provided by additional C functions written as adaptors, along with Java functions that invoke JNI.
So, that's the project. I have no idea how long it will take, nor whether it is even possible to do in a remote setting like this. I expect if it is possible, that it will require many sessions of us talking live via Skype or Vyew, or some other net meeting technology, in which we walk through code and explain what's happening to each other.