Recursive closures ===================================================== Haldean Brown First draft: Nov 2016 Last updated: Jan 2018 Status: In progress, SELFREF not yet implemented ------------------------------------------------------------------------ The closure transformation turns closures into partially-applied functions, by turning values that have been closed over into arguments to the function itself. The specific algorithm for doing this is detailed greatly in the header closure.c, but the general gist is that: It transforms this: \x -> (\y -> + x y) To this: \x -> ((\x0 y -> + x0 y) x) It transforms this: \x -> (\y -> (\z -> x)) Into this: \x -> ((\x0 y -> ((\x1 z -> x1) x0)) x) It transforms this: \x -> { : y = + x 10 ! \z -> y } Into this: \x -> { : y = + x 10 ! (\y0 z -> y0) y } This closes these functions over the enclosing scope by making all enclosing scope information an explicit argument to the function, and then partially applying the function over that information. This document describes how the closure transformation interacts with recursion for values that are not bound to a global name. For example, this snippet: ! { : t = \x -> ? { . eq x 0 => "ok\n" . => t 0 } ! t 1 } In this example, t needs access to its own value from within its definition. However, since t is a local binding, it's never actually assigned to a name that persists past its own scope. To handle this, we transform t to take itself as a parameter; callers then call t by passing it itself. The resulting function then looks like: : t = \$t x -> ? { . eq x 0 => "ok\n" . => $t $t 0 } This poses a problem for the callers of this function, though; now all callers have to know that t needs to be passed itself as its first argument, so the immediate value of the block would be transformed to: ! t t 1 But this is clunky; we don't want to have to transform every call site that references t. Instead, we transform t to a partially-applied function, with itself already applied (for brevity, B is used to represent the body of the function t as given in the previous listing): : t = (\$t x -> B) (\$t x -> B) Note that this can be very succinctly represented in Ubik bytecode, without requiring a double-definition of the function, as the function must be its own value: 000 VALUE &B 001 APPLY 000 000 This seems nice in practice, but there's one more sticky bit: the original closure transformation. Let's take a slightly more complicated example: : top = \x -> { : inner = \y -> inner (+ x y) ! inner } Never mind that this is infinitely recursive: it's a nice simple example for our syntactic transformations. The closure transformation takes inner from that given in the previous listing to this: : inner = (\x y -> inner (+ x y)) x Which works fine. The recursion transformation would then take it to: : inner = (\inner x y -> inner inner (+ x y)) (\inner x y -> inner inner (+ x y)) x This looks okay, but doesn't properly work, because the reference to inner in the body of the function doesn't pass any of the closed-over parameters. What we want is something more like: : top = \x -> { : inner_ = \x inner y -> inner (+ x y) : inner_with_x_ = inner_ x : inner_with_inner_ = inner_with_x_ inner_with_x_ } But that doesn't work either. Instead, we introduce a secret atom type, SELFREF, which is a reference to the innermost enclosing function. So we end up rewriting inner to be: : inner = (\x y -> { : inner = SELFREF x ! inner (+ x y) }) x Now the "inner" on the inside doesn't actually reference a named binding on the outside. When this is compiled down to a graph, SELFREF will be transformed into a VALUE node that points at the value currently being compiled. With more-heavily-nested recursion, this ends up transforming: : outer = \a -> { : inner = \b -> outer (+ (inner a) b) ! inner a } Into: : outer = \a -> { : outer = SELFREF : inner = (\a outer b -> { : inner = SELFREF a outer ! outer (+ (inner a) b) } } Which actually ends up being pretty pleasantly readable.