Twobit and Larceny

Twobit is a relatively simple optimizing compiler for Scheme. Four different back-ends now serve as the basis for three related implementations of Scheme.

Twobit
SPARC, IA32
C
CLR

In decreasing order of performance:

  1. Larceny is a research-quality implementation of Scheme that compiles directly to native machine code for SPARC or x86-32 architectures.
  2. Petit Larceny is a portable implementation that batch-compiles to C instead of machine code.
  3. Common Larceny runs on Microsoft .NET's Common Language Runtime, generating JIT-compiled IL instead of native or C code.

These design notes were written primarily for students and for the developers of Twobit and Larceny. They do not yet describe the current version of Twobit, and are based on the description of an earlier version of Twobit in

Clinger, W., and Hansen, L.T. Lambda, the ultimate label; or a simple optimizing compiler for Scheme. Proceedings of the 1994 ACM Conference on Lisp and Functional Programming, June 1994, pages 128-139.

Lambda, the Ultimate Label

Twobit is based on viewing lambda expressions as assembly language labels that have been augmented by a list of symbolic names for the registers that are live at that label. The simplicity of this view results from allocating all non-local, non-global variables on the heap. Aggressive lambda lifting makes this practical by eliminating all non-local variables except for those that would have to be allocated on the heap anyway. The eliminated non-local variables become additional arguments to lambda expressions, which means they will be allocated in registers.

Most optimizations used in Twobit are well-known, but a few are new or unusual:

The intermediate form used by the front end is a restricted subset of Scheme. The intermediate form used by the back end is assembly code for a hypothetical MacScheme machine.

Pass Structure