Incremental lambda lifting lifts the internal definitions of known local procedures to the next outer lambda expression. For example, the internal definition of `.loop_3` in
```scheme
((lambda ()
   (begin
    (set! reverse-map
          (lambda (.f_2 .l_2)
            (define .loop_3
              (lambda (.l_5 .x_5)
                (if (pair? .l_5)
                    (.loop_3 (cdr .l_5)
                             (cons (.f_2 (car .l_5)) .x_5))
                    .x_5)))
            (.loop_3 .l_2 '())))
    'reverse-map)))
```

is lifted to the outer lambda expression in

```scheme
((lambda ()
   (define .loop_3
     (lambda (.f_2 .l_5 .x_5)
       (if (pair? .l_5)
           (.loop_3 .f_2
                    (cdr .l_5)
                    (cons (.f_2 (car .l_5)) .x_5))
           .x_5)))
   (begin
    (set! reverse-map
          (lambda (.f_2 .l_2)
            (.loop_3 .f_2 .l_2 '())))
    'reverse-map)))
```

Because `.loop_3` is lifted out of the scope of `.f_2`, which occurs free within the original code for `.loop_3`, `.f_2` must become an additional formal parameter to `.loop_3`, and all calls to `.loop_3` must be changed to pass this additional parameter.
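
The two obligations described above — detecting that `.f_2` occurs free in the lifted code, and rewriting every call to pass it — can be sketched concretely. The following Python sketch is illustrative only: the AST encoding as nested lists and the helper names are assumptions, not Twobit's actual representation.

```python
# Illustrative sketch (not Twobit's representation): a Scheme-like AST
# encoded as nested Python lists, with variables as strings.

def free_vars(expr, bound=frozenset()):
    """Variables occurring free in expr."""
    if isinstance(expr, str):                      # variable reference
        return set() if expr in bound else {expr}
    if isinstance(expr, list) and expr and expr[0] == 'lambda':
        _, params, body = expr
        return free_vars(body, bound | set(params))
    if isinstance(expr, list):                     # application or other form
        fvs = set()
        for sub in expr:
            fvs |= free_vars(sub, bound)
        return fvs
    return set()                                   # literal

def add_parameter(expr, name, extra):
    """Rewrite every call (name arg ...) to (name extra arg ...),
    as required when `extra` becomes a new first parameter of `name`."""
    if isinstance(expr, list):
        out = [add_parameter(sub, name, extra) for sub in expr]
        if out and out[0] == name:
            out = [name, extra] + out[1:]
        return out
    return expr

# The code for .loop_3 from the example above:
loop_body = ['lambda', ['.l_5', '.x_5'],
             ['if', ['pair?', '.l_5'],
              ['.loop_3', ['cdr', '.l_5'],
               ['cons', ['.f_2', ['car', '.l_5']], '.x_5']],
              '.x_5']]

# Free variables of .loop_3, restricted to the parameters of the
# lambda expression it is lifted past, give the extra parameters:
extra = free_vars(loop_body) & {'.f_2', '.l_2'}    # -> {'.f_2'}
lifted = add_parameter(loop_body, '.loop_3', '.f_2')
```

Intersecting the free variables with the outer lambda expression's parameter set yields exactly the extra parameters; here `extra` is `{'.f_2'}`, and the recursive call inside `lifted` now passes `.f_2` first.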

Lambda lifting is one of the most complex optimizations performed by Twobit, mainly because of complications that arise with groups of mutually recursive known local procedures. Consider

```scheme
((lambda ()
   (begin
    (set! foo
          (lambda (n x y)
            (define (f n)
              (if (zero? n) x (g (- n 1))))
            (define (g n)
              (if (zero? n) y (f (- n 1))))
            (f n)))
    'foo)))
```

If `f` were lifted to the outer lambda expression, then it would have to take `g` and `x` as extra parameters. That means `g` would have to supply `g` and `x` as extra arguments when it calls `f`:

```scheme
((lambda ()
   (define (f g x n)
     (if (zero? n) x (g (- n 1))))
   (begin
    (set! foo
          (lambda (n x y)
            (define (g n)
              (if (zero? n) y (f g x (- n 1))))
            (f g x n)))
    'foo)))
```

Notice that `g` now escapes its scope, negating the earlier closure analysis. Worse yet, if `g` were lifted to the outer lambda expression, it would have to take an extra parameter `y`. That means `f` would have to supply `y` as an extra argument when it calls `g`. But `f` is now outside the scope of `y`!

Lifting `g` can therefore add extra arguments not only to `g` but to procedures that call it. Adding extra arguments to those procedures may in turn involve adding extra arguments to procedures that call the procedures that call `g`, and so on.

Eventually this process of adding arguments will stabilize. For this example the result is

```scheme
((lambda ()
   (define (f g x y n)
     (if (zero? n) x (g x y (- n 1))))
   (define (g x y n)
     (if (zero? n) y (f g x y (- n 1))))
   (begin
    (set! foo
          (lambda (n x y)
            (f g x y n)))
    'foo)))
```

This is suboptimal: `f` is back within the scope of `g`, so it is unnecessary to pass `g` to `f`.

The solution is to lift each group of known local procedures as a unit, after performing a flow analysis to determine precisely which parameters must be added to each procedure. Let V be the set of parameters for the lambda expression at whose head a group of known local procedures is defined, let f1, ..., fm be the names of those procedures, and let FV(fi) be the variables that occur free in the code for fi. The set of variables Ai that need to be added as parameters to fi is defined by the recursive flow equation

    Ai = (FV(fi) ∪ ⋃ { Aj : fj ∈ FV(fi) }) ∩ V

Compare this with the flow equation for conventional lambda lifting:

    Ai = FV(fi) ∪ ⋃ { Aj : fj ∈ FV(fi) }

The only difference between these flow equations is that, by intersecting with V, incremental lambda lifting limits the flow problem to the formal parameters of the lambda expression from which the known local procedures are being lifted. Another way of looking at it is that conventional lambda lifting takes V to be the set of all variables that occur within the program.
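
These flow equations can be solved by straightforward fixed-point iteration. Below is a minimal Python sketch, assuming sets of variable names as the representation (a simplifying assumption, not Twobit's), with the mutually recursive `f`/`g` example above as input:

```python
# Fixed-point solver for the incremental lambda-lifting flow equations
#   Ai = (FV(fi) ∪ ⋃ { Aj : fj ∈ FV(fi) }) ∩ V
# Representation (dicts of Python sets) is an illustrative assumption.

def solve_flow(fv, V):
    """fv maps each procedure's name to the free variables of its code;
    V is the parameter set of the enclosing lambda expression.
    Returns A, mapping each procedure to the parameters it must gain."""
    A = {p: set() for p in fv}
    changed = True
    while changed:                     # iterate until a fixed point
        changed = False
        for p in fv:
            new = set(fv[p])
            for q in fv:
                if q in fv[p]:         # fq occurs free in fp's code
                    new |= A[q]
            new &= V                   # restrict to the outer parameters
            if new != A[p]:
                A[p] = new
                changed = True
    return A

# The mutually recursive example: n is bound by each procedure's own
# lambda expression, so it is not free in the code for f or g.
fv = {'f': {'zero?', '-', 'x', 'g'},
      'g': {'zero?', '-', 'y', 'f'}}
A = solve_flow(fv, V={'n', 'x', 'y'})
# A == {'f': {'x', 'y'}, 'g': {'x', 'y'}}: both procedures gain x and y,
# matching the stabilized code above.
```

The iteration converges because each Ai only grows and is bounded above by V; the intersection with V is what keeps the problem confined to the lambda expression from which the group is being lifted.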

Solving these flow equations takes O(m^3 * n^2) time, where m is the number of procedures being lifted and n is the number of variables in the base set V. For incremental lambda lifting, m and n are usually quite small. On the other hand, incremental lambda lifting requires O(m) different flow problems to be solved.

Twobit currently represents sets of variables as lists, so it actually uses an O(m^3 * n^3) algorithm to solve these flow equations. When compiling Twobit itself, Pass 2 spends about 6% of its time solving flow equations. This is less than 1% of the total compilation time.

Experiments using an O(m^3 * n^2) algorithm on synthetic benchmarks have shown that incremental lambda lifting can be either faster or slower than conventional lambda lifting, depending on the structure of the program being compiled.

In short, compilation time is not a major issue for incremental lambda lifting.

Incremental lambda lifting is flexible because the lifting process can be halted at an intermediate scope. Heuristics are applied at each stage of lifting to determine whether further lifting is desirable.

Interprocedural register targeting is performed by reordering the formal parameters of known local procedures when they are lifted.

Incremental lambda lifting is performed at the very end of Pass 2.