Sunday, May 31, 2009

Parallel Programming in Common Lisp

To kick off a series of posts about parallel programming in Common Lisp, let's first review what's available.

Two models for parallel programming

First, there are basically two models for parallel programming:

  • Parallelism based on threads (a.k.a. the shared-memory model). Multiple threads are spawned by a single process. This is the most common approach on a multi-core machine: for example, we can spawn one computational thread per core. Memory can be shared among threads, so communication between computations is usually implemented by having threads write to shared memory.

  • Parallelism based on message passing (a.k.a. the distributed-memory model). This is the most common and natural approach on a cluster, because with commodity machines there is no hardware support for sharing RAM across machines. Communication between computations is done by explicitly sending message structures according to some messaging protocol.

It is possible to mix the two models: we can use message passing among multi-core machines, and thread-based parallelism within each machine.

Threads in Common Lisp

When discussing threads, we first have to distinguish between native and non-native threads. Native threads are handled by the operating system and allow us to take advantage of multiple cores: on a 4-core machine, we can spawn 4 computational threads that execute in parallel. Non-native threads are handled by the Lisp implementation and do not allow true parallelism. If we spawn multiple non-native threads, they are executed sequentially (the threads yield control to each other, usually cooperatively). So, when talking about parallel processing, we are really interested in native threads.

Some languages such as Java provide a standardized threading library. The Common Lisp standard does not include threads, so it is up to the implementations to provide threading facilities. Although there is no standard thread API for Common Lisp, Bordeaux-Threads is an attempt to provide a portable layer on top of the implementation-specific APIs.
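To make the shared-memory model concrete, here is a minimal sketch using Bordeaux-Threads. The helper name parallel-sum and the thread/work counts are my own for illustration; the bt: operators (make-thread, join-thread, make-lock, with-lock-held) are the portable API Bordeaux-Threads provides on implementations with thread support.

```lisp
;; A minimal shared-memory sketch with Bordeaux-Threads.
;; Assumes the library is loaded, e.g. (asdf:oos 'asdf:load-op :bordeaux-threads).

(defparameter *lock* (bt:make-lock))
(defparameter *sum* 0)

(defun parallel-sum (n-threads n-per-thread)
  (setf *sum* 0)
  (let ((threads
         (loop repeat n-threads
               collect (bt:make-thread
                        (lambda ()
                          (dotimes (i n-per-thread)
                            ;; Writes to shared memory must be serialized,
                            ;; otherwise increments can be lost.
                            (bt:with-lock-held (*lock*)
                              (incf *sum*))))))))
    ;; Wait for all workers to finish before reading the result.
    (mapc #'bt:join-thread threads)
    *sum*))

;; (parallel-sum 4 1000) should return 4000 on an implementation
;; with native threads; without the lock, the result may come up short.
```

Note that the threads communicate implicitly, by mutating the same memory; the lock is what keeps that communication safe.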

Here's a table of threading support for most of the current Common Lisp implementations.

Implementation   Native threads       Non-native threads
ABCL             all platforms        -
Allegro          Windows              other platforms
CLISP            -                    -
CMUCL            -                    x86 (Linux, BSD)
Clozure CL       all platforms        -
Corman CL        Windows              -
GCL              -                    -
Scieneer         all platforms        -
SBCL             x86 (Linux, BSD)     -

So, native thread support is not universal. This is unfortunate if we want to parallelize computations in Common Lisp and an implementation of interest happens not to support native threading on our platform. You might think that since SBCL offers native threading on x86, and Clozure CL offers it everywhere, we're in pretty good shape. However, suppose we want to use CLISP because the computational bottleneck in our application is bignum arithmetic, and CLISP has much faster bignum arithmetic than the other implementations -- no dice. Likewise, an application may depend on non-portable code or libraries that tie it to an implementation without native threads on the required platform.

Message-Passing in Common Lisp

Message passing is less commonly used than threading in most languages (with the notable exception of Erlang). As far as I know, only one Common Lisp implementation supports message-passing parallelism out of the box: GCL, which includes an interface to MPI originally written by Gene Cooperman.

Coming up: A New, Portable Message-Passing Library for Common Lisp

I've been working on CL-MPI, a portable, CFFI-based binding for the MPI message-passing library. CL-MPI enables portable, parallel programming in Common Lisp using a message-passing model, on either a cluster of machines or a single multi-core machine. I have just made the most recent version (0.2.1) available for download. So far, I've tested CL-MPI with SBCL and CMUCL on Linux.

I initially developed CL-MPI to do massively parallel computations on a large cluster at my university. But since MPI can also run on a single multi-core machine, one interesting use for CL-MPI is as a "poor man's multiprocessing solution" for Common Lisp implementations that lack native threading, such as CMUCL and CLISP.

Unfortunately, CL-MPI is not well documented yet, and as a prerequisite you'll need a basic understanding of the MPI programming model (here is an excellent tutorial). I'll write a series of posts describing CL-MPI and some example applications, which will hopefully form the basis of a tutorial. Stay tuned...
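To give a flavor of the message-passing model, here is a hypothetical "hello world" in the SPMD style that MPI uses. The package prefix and binding names below (mpi-init, mpi-comm-rank, mpi-comm-size, mpi-finalize) are assumptions modeled on the standard MPI C API (MPI_Init, MPI_Comm_rank, etc.) -- check the CL-MPI sources for the actual names, which may differ.

```lisp
;; Hypothetical MPI-style "hello world" sketch.  The mpi: names are
;; assumptions patterned after the standard MPI C API, not confirmed
;; CL-MPI identifiers.

(mpi:mpi-init)                          ; start the MPI runtime
(let ((rank (mpi:mpi-comm-rank))        ; this process's id, 0..size-1
      (size (mpi:mpi-comm-size)))       ; total number of processes
  (format t "Hello from process ~d of ~d~%" rank size))
(mpi:mpi-finalize)                      ; shut the runtime down cleanly

;; Launched under an MPI runner, e.g.:
;;   mpirun -np 4 <lisp> --load hello.lisp
;; each of the 4 processes runs the same program and prints its own rank.
```

The key contrast with the threading example: there is no shared state at all here; each process has its own memory, and any data exchange would go through explicit send/receive calls.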

6 comments:

  1. Don't forget about ECL, it does support native threads and uses GMP for bignums, which is quite fast.

    ReplyDelete
  2. There's usually a third model of parallelism listed -- result-oriented parallelism. See http://marijn.haverbeke.nl/pcall

    ReplyDelete
  3. For my raytracer, I wrote a wrapper around some of OpenMPI. I never made it so far as the send-auto and recv-auto stuff. I may have to get mpich2 running on my machines.

    ReplyDelete
  4. stassats: thanks, I forgot to list ECL.

    ReplyDelete
  5. Marijn: Now that you mention it, I do remember seeing your article on pcall a while ago. Thanks for reminding me. I'm not sure whether it's a "third type", in contrast to shared memory or message passing (distributed memory). The primitives you provide, such as pexec, are higher-level constructs which can be implemented on top of some lower-level framework (in your case, you built on top of threads). Some variant of pcall could also be implemented on top of a distributed-memory infrastructure like MPI.

    ReplyDelete
  6. Patrick: I actually tried to test CL-MPI with OpenMPI, but couldn't get far due to mysterious memory corruption errors. Do you remember what FFI you used (CFFI? UFFI? or implementation-specific?)

    ReplyDelete