ParaM: Parallelizing and Optimizing MATLAB
Introduction
This project uses MATLAB (and its open-source counterpart, Octave) as an example language to study optimizations that might benefit scripting languages, specifically those used for computationally intensive programming.
Perhaps the best-known work on compiling MATLAB is the FALCON project at UIUC. Unfortunately, that work has been inactive for more than 10 years. In the meantime, MATLAB has evolved tremendously, both as a language and in its runtime and library support. MATLAB now has a (rapidly evolving) Just-In-Time (JIT) compiler that operates on the proprietary byte-code produced by the MATLAB parser. The libraries have been improved, and many have been parallelized. With the Parallel Programming and Distributed Computing Toolboxes, MATLAB now also supports writing parallel programs directly, albeit in a somewhat restricted way.
Even with a byte-code JIT and enhanced libraries, the performance of MATLAB code is no match for a well-written Fortran or C program, which indicates that there is room to improve MATLAB code. The ParaM effort aims to exploit some of this potential.
Goals
One area that we have identified as critically important is memory optimization. Our studies indicate potential for improving memory behavior at the source level that might be extremely difficult, if not impossible, to exploit at the byte-code or library level. On modern multi-core machines more and more applications are memory-bound, which makes traditional models of algorithmic complexity of very limited use in estimating performance on real machines. We are currently researching a theoretical model, based on defining reuse distances at the source level, to enable more accurate estimation of the effect of compiler optimizations on modern machines. We will use this research to inform ParaM, the MATLAB / Octave to C compiler that we are developing.
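For readers unfamiliar with the term, the sketch below illustrates reuse distance in its usual trace-based form: the number of distinct elements accessed between two consecutive accesses to the same element. It is not the source-level model under study in this project, which is not detailed here. The function names `reuse_distances` and `lru_hits` are hypothetical, and the example is written in Python only for brevity.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of each access in `trace`.

    The distance is the number of distinct elements accessed since the
    previous access to the same element, or None for a first access.
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Count the distinct elements touched after the last access to addr.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def lru_hits(trace, capacity):
    """Count hits in an idealized fully associative LRU cache.

    An access hits exactly when its reuse distance is smaller than the
    cache capacity (both measured in elements).
    """
    return sum(1 for d in reuse_distances(trace)
               if d is not None and d < capacity)
```

For the trace `a b c a b b`, the two repeated accesses to `a` and `b` each have distance 2 and the final access to `b` has distance 0, so an idealized cache holding two elements would hit only on the last access. Under this simplified model, a transformation that shortens reuse distances, such as reordering a loop nest, shows up directly as additional hits.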
Funded by NSF, award CCF-0811703