Prof. Juan A. Gomez-Pulido
Computer Architecture and Logic Design Group (ARCO)
Dep. Technologies of Computers & Communications. Univ. of Extremadura
Escuela Politecnica, 10003 Caceres, Spain.
+34-927257264(tel)/187(fax)
jangomez@unex.es - http://arco.unex.es/jangomez
Since your proposal for us was machine learning, I've extended it a bit to include computational intelligence techniques (such as evolutionary computing), which are widely used to solve large combinatorial optimization problems, for two reasons: on the one hand, to open the application to more potential users; on the other hand, because we have strong, solid and proven expertise in solving combinatorial optimization problems. Accordingly, I've focused the following answers mainly on the field of optimization problems, where machine learning and computational intelligence (specifically Evolutionary Algorithms) are applied.
Please modify or correct anything you see fit, and tell me if you need more detail or clarification on any point.
Q What application will you create (or enhance) as part of the project (what does it do)?
A Many combinatorial optimization problems arise in engineering domains. These problems are characterized by a large set of possible solutions, each with a different degree of goodness. Finding the optimal solution among the many possible ones is a complex problem that requires heuristic techniques when the solution space is too large to be explored by direct search algorithms. The algorithms that tackle this kind of problem by means of machine learning or evolutionary computation share, in many cases, a common feature: repetitive computing tasks (usually fitness functions) performed on many similar units (individuals) in a wide set (population) of possible solutions. Identifying and coding these common operations can help create applications capable of accelerating the execution of the algorithms, since the repetitive tasks allow a high degree of parallelism.
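A minimal sketch in C of the common structure described above: one fitness function applied repeatedly to every individual of a population. The names (`individual`, `GENES`, `sphere_fitness`, `evaluate_population`) and the choice of a sum-of-squares fitness are assumptions made only for illustration; they are not part of any existing application.

```c
#include <assert.h>
#include <stddef.h>

#define GENES 8   /* hypothetical chromosome length */

/* One candidate solution (individual); this layout is an assumption. */
typedef struct {
    double genes[GENES];
    double fitness;
} individual;

/* Illustrative fitness: sum of squares of the genes (lower is better). */
double sphere_fitness(const individual *ind)
{
    double sum = 0.0;
    for (size_t g = 0; g < GENES; g++)
        sum += ind->genes[g] * ind->genes[g];
    return sum;
}

/* The repetitive task: the same function is applied to every individual.
   Each fitness value depends only on its own individual.
   Returns the index of the best (lowest-fitness) individual. */
size_t evaluate_population(individual *pop, size_t n)
{
    assert(pop != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 0; i < n; i++) {
        pop[i].fitness = sphere_fitness(&pop[i]);
        if (pop[i].fitness < pop[best].fitness)
            best = i;
    }
    return best;
}
```

Because each call to the fitness function reads and writes only its own individual, the evaluations could in principle be distributed over several cores; only the search for the best index would need to be combined afterwards.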
Q Who will use the application (how large is the audience, what areas.. will it be used by end users in the community, or people inside a company using it as a tool, or for advancing science..)
A There are many combinatorial optimization problems in different fields of engineering: civil, electronic, networks, communications, etc. The potential users of the application could therefore come from both academia and industry, especially for high-level scientific and research problems. In general, the application could be attractive to anyone looking for quick solutions to combinatorial optimization problems.
Q What are the computation needs that will benefit from parallel computation
A Users interested in solving large combinatorial optimization problems need more computation when they want to explore wider solution spaces than conventional computing resources can handle. The amount of computation needed depends on the accuracy and the time required by the engineer or scientist, and parallel computation can help with both.
Q What has blocked you from using parallel computation up to this point
A When an engineer or scientist wants to take advantage of parallel computation on current dual-core or quad-core CPU chips or on multiprocessor clusters, he must apply programming techniques (threads, message passing, etc.) that are more complex than his usual programming languages and tools, which increases the programming effort and extends the development time of the final application. This inconvenience discourages the scientist from using parallel computation, unless certain tools, integrated into his programming environment, allow him to launch parallel tasks easily.
Q How will your application provide higher benefit as a result of the parallel computation (what currently can't be done that will be enabled, or what aspect will be improved. For example, will weeks of waiting for simulation results drop to hours? Will a researcher be able to interactively search, rather than doing a scatter shot of simulations and hoping one of them was the right one? Will a product be producible with less material or less design effort? Will the graphics be richer, or render faster, or use less battery?)
A Parallel computation will have a quick and direct effect on the results obtained from running the algorithms that solve combinatorial optimization problems. These algorithms are characterized by intensive data management, where many operations (often involving computationally expensive floating-point arithmetic) are performed as repetitive, highly parallelizable tasks. Note that applying the evolutionary operators and processes to each individual of the population is essentially a parallel task. A typical task in this domain is the fitness function computation: the fitness of one individual can be evaluated at the same time as another individual is being evaluated, and likewise for the rest of the individuals in the population. This is only possible in practice if the computational resources permit parallel execution. Moreover, it is well known that fitness evaluation is the most time-consuming part of an evolutionary algorithm; speed considerations have even led some researchers to avoid good fitness functions in time-critical parts of a program. Parallel computation would therefore make it possible to reduce the time, increase the precision and find better solutions to the optimization problem.
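As a hedged illustration of evaluating individuals at the same time, the sketch below marks the fitness loop with an OpenMP directive. OpenMP is only one possible tool and is our assumption here, not a decision of the project; the names `DIM`, `fitness_of` and `evaluate_parallel`, the flat gene layout and the toy fitness function are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DIM 4   /* hypothetical number of genes per individual */

/* Illustrative fitness: squared distance of the genes to the point (1,...,1). */
double fitness_of(const double *genes)
{
    double sum = 0.0;
    for (int g = 0; g < DIM; g++)
        sum += (genes[g] - 1.0) * (genes[g] - 1.0);
    return sum;
}

/* genes holds n individuals stored back to back (n * DIM values).
   Each iteration reads one individual and writes only fitness[i],
   so the iterations are independent and can run concurrently. */
void evaluate_parallel(const double *genes, double *fitness, int n)
{
    assert(genes != NULL && fitness != NULL && n >= 0);
    /* Takes effect only when compiled with OpenMP support (e.g. -fopenmp);
       otherwise the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        fitness[i] = fitness_of(&genes[(size_t)i * DIM]);
}
```

The result is the same with or without OpenMP enabled, since no iteration depends on another; only the elapsed time changes.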
Q Who will receive this benefit and how (for example, will the application help cure cancer for millions of EU citizens by enabling doctors to use personalized genetics?)
A Nowadays, scientists are working on many large combinatorial optimization problems to obtain more accurate solutions that could have a direct impact on citizens' lives. Domains such as biochemistry, molecular biology or climatology provide problems where optimal solutions can lead to specific achievements, such as drug discovery or protein structure prediction, to name two real-world problems currently being tackled.
Q And, for logistics of the project, we would like to start by coding in C/C++ but are open to integrating into other languages, such as Python, Java, or even Javascript. Could you say a bit about your development process:
Q What language(s) do you plan to use for the application
A C
Q Is the application.. desktop based, Cloud (SAAS) based, browser based, or mobile
A Desktop or browser based
Q A little bit about the architecture (do you have a server with database, or a large data set that is churned through such as Big Data style, what parts of the computation are performed on the end-user device versus in a server, and so on)
A We are used to servers running applications where the data are continuously generated and replaced, so there is no need to store large data sets in huge databases. In our research, the end-user device is used to control and monitor the processes, which are performed on remote servers.
Q From what you mentioned, it sounds like your story will be strong as far as the computation parts. I'm thinking the end-user benefits part may need clarification, about the impact on the EU. Also, perhaps there is a GUI aspect.. do you need data visualization ?
A Data visualization could be an attractive aspect to improve. Usually, the problems are launched from a console to be solved on a remote server, and many hours or days later the results must be arranged graphically by hand (for example, as Pareto fronts for multiobjective optimization). It would be a good idea to integrate a GUI and data visualization at run time.
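To make the Pareto front example concrete, here is a small sketch in C of the step that is currently done by hand before plotting: filtering the non-dominated solutions out of a set of results. It assumes two objectives that are both minimized; the type `point` and the functions `dominates` and `pareto_front` are hypothetical names, and the quadratic-time filter is chosen for brevity rather than speed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One result with two objective values, both to be minimized (assumption). */
typedef struct {
    double f1;
    double f2;
} point;

/* a dominates b if a is no worse in both objectives
   and strictly better in at least one. */
bool dominates(point a, point b)
{
    return a.f1 <= b.f1 && a.f2 <= b.f2 && (a.f1 < b.f1 || a.f2 < b.f2);
}

/* Copies the non-dominated points of in[0..n) into out, keeping their
   original order, and returns how many were copied. out must have room
   for n points. Simple O(n^2) filter. */
size_t pareto_front(const point *in, size_t n, point *out)
{
    assert(in != NULL && out != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        bool dominated = false;
        for (size_t j = 0; j < n && !dominated; j++)
            dominated = (j != i) && dominates(in[j], in[i]);
        if (!dominated)
            out[count++] = in[i];
    }
    return count;
}
```

The points returned by such a filter are the ones a run-time visualization would draw as the current front while the optimization is still in progress.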