6.10

## Assignment 1: A church-encoder

Your task on Assignment 1 is to write a full compiler for a core subset of scheme to the lambda calculus—the same grammar that our interpreter on assignment 0 supported. This will mean compiling the language:

 e ::= (letrec ([x (lambda (x ...) e)]) e) | (let ([x e] ...) e) | (lambda (x ...) e) | (e e ...) | x | (and e e) | (or e e) | (if e e e) | (uprim e) | (bprim e e) | datum datum ::= nat | (quote ()) | #t | #f nat ::= 0 | 1 | 2 | ... x is a symbol uprim ::= add1 | car | cdr | null? | not bprim ::= + | * | cons

Into the language:

 e ::= (lambda (x) e) | (e e) | x

Efficiency of compiled code notwithstanding, none of the above language forms are fundamental outside of abstractions, applications, and variables. All other forms may be compiled into nothing but functions. The pure lambda-calculus is Turing-complete.

The goal of this assignment is to continue to exercise your understanding of core functional programming, to understand the essentials of compiling to equivalent code in a core language, and to scale up your knowledge of Racket to a substantial (if not yet practical) compiler.

### Church encoding

Church encodings are representations of data types as pure functions. We can convert numbers, booleans, null, lists, and any other data type possible in real world languages into a lambda abstraction (a first-class function) built out of further lambdas. All the language forms we need can then be desugared into applications of lambdas.

At the end of this assignment, you’ll be able to encode a function that maps arithmetic functions over a list into pure lambda calculus; for example:

 > (require "solution.rkt") > (write (church-encode '(letrec ([map (lambda (f lst) (if (null? lst) lst (cons (f (car lst)) (map f (cdr lst)))))]) (map (lambda (x) (* 2 x)) (cons 0 (cons 5 (cons 3 '()))))))) ((((((((((((((lambda (Y-comb) (lambda (church:null?) (lambda (church:cons) (lambda (church:car) (lambda (church:cdr) (lambda (church:add1) (lambda (church:sub1) (lambda (church:+) (lambda (church:-) (lambda (church:*) (lambda (church:zero?) (lambda (church:=) (lambda (church:not) ((lambda (map) ((map (lambda (x) ((church:* (lambda (f) (lambda (x) (f (f x))))) x))) ((church:cons (lambda (f) (lambda (x) x))) ((church:cons (lambda (f) (lambda (x) (f (f (f (f (f x)))))))) ((church:cons (lambda (f) (lambda (x) (f (f (f x)))))) (lambda (when-cons) (lambda (when-null) (when-null when-null)))))))) (Y-comb (lambda (map) (lambda (f) (lambda (lst) (((church:null? lst) (lambda (_) lst)) (lambda (_) ((church:cons (f (church:car lst))) ((map f) (church:cdr lst))))))))))))))))))))))) ((lambda (u) (u u)) (lambda (y) (lambda (mk) (mk (lambda (x) (((y y) mk) x))))))) (lambda (p) ((p (lambda (a) (lambda (b) (lambda (tt) (lambda (ft) (ft ft)))))) (lambda (_) (lambda (tt) (lambda (ft) (tt tt))))))) (lambda (a) (lambda (b) (lambda (when-cons) (lambda (when-null) ((when-cons a) b)))))) (lambda (p) ((p (lambda (a) (lambda (b) a))) (lambda (_) (lambda (x) x)))))(lambda (p) ((p (lambda (a) (lambda (b) b))) (lambda (_) (lambda (x) x))))) (lambda (n0) (lambda (f) (lambda (x) (f ((n0 f) x)))))) (lambda (n0) (lambda (f) (lambda (z) (((n0 (lambda (g) (lambda (h) (h (g f))))) (lambda (u) z)) (lambda (x) x)))))) (lambda (n0) (lambda (n1) (lambda (f) (lambda (x) ((n1 f) ((n0 f) x))))))) (lambda (n0) (lambda (n1) ((n1 (lambda (n0) (lambda (f) (lambda (z) (((n0 (lambda (g) (lambda (h) (h (g f))))) (lambda (u) z)) (lambda (x) x)))))) n0)))) (lambda (n0) (lambda (n1) (lambda (f) (lambda (x) ((n0 (n1 f)) x)))))) (lambda (n0) ((n0 (lambda (b) (lambda (tt) (lambda (ft) (ft ft))))) (lambda (tt) (lambda (ft) (tt tt)))))) (lambda (n0) (lambda (n1) ((((lambda (n0) ((n0 (lambda (b) (lambda (tt) (lambda (ft) (ft ft))))) (lambda (tt) (lambda (ft) (tt tt))))) (((lambda (n0) (lambda (n1) ((n1 (lambda (n0) (lambda (f) (lambda (z) (((n0 (lambda (g) (lambda (h) (h (g f))))) (lambda (u) z)) (lambda (x) x)))))) n0))) n0) n1)) (lambda (_) ((((lambda (n0) ((n0 (lambda (b) (lambda (tt) (lambda (ft) (ft ft))))) (lambda (tt) (lambda (ft) (tt tt))))) (((lambda (n0) (lambda (n1) ((n1 (lambda (n0) (lambda (f) (lambda (z) (((n0 (lambda (g) (lambda (h) (h (g f))))) (lambda (u) z)) (lambda (x) x)))))) n0))) n1) n0)) (lambda (_) (lambda (tt) (lambda (ft) (tt tt))))) (lambda (_) (lambda (tt) (lambda (ft) (ft ft))))))) (lambda (_) (lambda (tt) (lambda (ft) (ft ft)))))))) (lambda (bool) ((bool (lambda (_) (lambda (tt) (lambda (ft) (ft ft))))) (lambda (_) (lambda (tt) (lambda (ft) (tt tt)))))))

If we ran this code in racket, we would get back a church-encoded list of church-encoded numbers ’(0 10 6). Further on, we will see how to turn these back into racket values for testing purposes.

#### Desugaring and currying

In writing a church encoder, or compiler pass generally, the name of the game is to reduce more complex language forms to simpler language forms while preserving the exact same program semantics (i.e., meaning or behavior).

An example of this is currying functions. Our input language allows lambdas to take zero or more arguments such as (lambda () 4) or (lambda (a b c) (+ a (+ b c))), and allows call sites to provide any number of arguments, for example (f) or (g 0 2 4). Our output grammar however, only supports lambdas and call sites with exactly one argument.

Because lambdas are first-class values we may return and pass around, they are closed over an environment with values for their free variables. A lambda such as (lambda (x y) (+ x y)) may be rewritten as (lambda (x) (lambda (y) (+ x y))) so long as all its call sites (f a b) are transformed into ((f a) b). The subexpression (f a) will return the syntactic lambda (lambda (y) (+ x y)) closed over the binding [x -> a]. It remembers the value for x wherever else it may be flow or propagate in the program.

In a currying transformation, we can safely ensure all curried functions get curried call sites by currying all forms in the program at once. So we can transform a call site (f) to (f f) or to (f (lambda (x) x)) without knowing which function(s) f may get bound to at runtime, because we know that a) our inputs are not malformed, so f must be a 0-arity/nullary function if this call is reached, and b) in the same pass we are going to transform all lambdas as well, so every (lambda () ...) will be turned into a lambda (lambda (_) ...) that takes some unused parameter.

For example, in your compiler, part of this pass could be written:

 ... (define (churchify e) (match e ... [`(lambda () ,e) `(lambda (_) ,(churchify e))] ... [`(,f) `(,(churchify f) (lambda (x) x))] ...)) ...

A nullary lambda is rewritten as one with an unused parameter (for this assignment using _ or unused or something like this will be just fine). The corresponding call sites are rewritten as the nullary function (now taking one unused argument) applied on an arbitrary value (as it will go unused).

In both these cases, we must recursively transform our expressions by applying our compiler (here named churchify). To do this we use unquote to insert a racket expression in a quasiquote and apply the compiler to the pattern variable representing a subexpression.

#### Church booleans and conditionals

So, how might we encode a boolean such as #t? We should ask, "how is a value like this used?" and "can we encode that behavior as a function?" As booleans are used by conditional/branching forms, how would we encode a conditional statement such as (if #t 0 1) or (if #f 0 1)?

We want if-expressions to evaluate to one value or another depending on the value of the guard expression.

 (if #t 0 1)  -->  0 (if #f 0 1)  -->  1

The lambda calculus gives us a way to take multiple parameters using abstractions:

 (((lambda (t) (lambda (f) ...)) 0) 1)  -->  0

And then a way to select one value among multiple to return using variable reference:

 (((lambda (t) (lambda (f) t)) 0) 1)  -->  0

Church booleans can then be encoded as lambdas that take two parameters in curried fashion and produce the correct branching behavior, while if-expressions are turned into applications of the church-boolean; i.e., (if b e0 e1) > ((b e0) e1).

This behavior is almost correct, but to get it perfect we must consider evaluation-order and the possibility of side effects. The pure lambda-calculus can’t write to files, connect to a network, make sounds, or mutate global variables; all it can do is warm up the CPU and terminate with a value... or non-terminate. Consider the program Omega which non-terminates:

 omega = ((lambda (u) (u u)) (lambda (u) (u u)))

What might go wrong in your compiler if one of the branches of an if-form is omega, and would non-terminate if executed, but is never reached when the program is run? For example:

 (if (null? '()) '() ((lambda (u) (u u)) (lambda (u) (u u))))

#### Church numerals

To design church numerals, we repeat this central question: what behavior are we encoding as a function?

In the case of numbers, a primary use is for iteration or looping. The number 4 may be useful for being able to iterate over a list of length 4, or in the context of (* 4 6), for adding 6 to an accumulator four times in a row. If we were to want a very different representation for numbers, a numbers-as-iteration representation should be sufficient to build up that representation inductively.

Our representation for a number n as a lambda is a function (lambda (f x) (f n x)), the function that takes two parameters f and x and applies f iterated on x, n times. Equivalently, you can think of this as simply the function that takes an input f and returns f n. So the number zero is the function (lambda (f) (lambda (x) x)) and the number two is (lambda (f) (lambda (x) (f (f x)))).

In this representation it is easy to formulate a church encoding of the increment function add1 as:

 (lambda (n0) (lambda (f) (lambda (x) (f ((n0 f) x)))))

This takes a church-numeral n0 as input, and returns an abstraction over parameters f and x which applies f to x one more time than n0 does. Addition of m and n can then be expressed as the composition of fn and fm, and multiplication as (f n)m (which is the same as f n*m):

 church-multiply = (lambda (n0 n1) (lambda (f x) ((n0 (n1 f)) x)))

#### Church lists

To develop a church encoding of null and pairs, and therefore linked lists, we again apply the approach of encoding, as lambdas, behaviors attending the values we want to encode.

For lists, this means checking to see if the value is null or a pair and, in the latter case, decomposing the value into car and cdr values. For this, we can reuse the trick used to encode booleans: we take multiple values as arguments and select one to use. So in the case of null, we can take two functions (think of them as two callbacks) and apply one that indicates the value is null:

 church-null = (lambda (when-cons when-null) (when-null))

This means a cons cell (pair) should invoke the other callback, when-cons. In this case, that a value is a pair, we also may want to decompose it into its car and cdr values. We can do this by providing the callback with these values as parameters:

 church-cons = (lambda (a b) (lambda (when-cons when-null) (when-cons a b)))

A church encoding of the cons operator is a function that takes two arguments a and b, and returns a pair. This pair is encoded as an abstraction taking two callbacks, when-cons and when-null, that invokes when-cons on a and b in that order. This means that to write a church operation to replace car, we would provide two callbacks, a when-cons that accepts a and b and returns a (the car value), and some thunk that will not be invoked unless the input program contained an error.

 church-car = (lambda (p) (p (lambda (a b) a) (lambda () omega)))

#### The Y-combinator

In the example of omega, we see one core principle repeating itself three times: self-application. Self-application is a term such as (x x), and the lambda term used in omega, U = (lambda (u) (u u)), is merely an abstraction of self-application! This function, known as the U-combinator, will self-apply any function it is applied on. What happens then when we self-apply an abstraction of self-application? Indefinite recursion.

Self-application is all that is necessary to simulate self-reference.

The evaluation substitutes U for u in the body of U, (u u):

 (U U) = ((lambda (u) (u u)) U) = (u u)[u <- U] = (U U)

Let’s consider how this may actually be useful in our church compiler.

Desugaring letrec requires that we provide to a function a binding to itself. For example, if we were to use a let instead of a letrec to bind a map function,

 (let ([mymap (lambda (f lst) (if (null? lst) '() (cons (f (car lst)) (mymap f (cdr lst)))))]) (mymap add1 '(3 4 5)))

the recursive use of mymap would be an unbound variable because let-forms only introduce a binding in their body (i.e., in (mymap add1 (3 4 5))). Try this code out in Racket and see why it fails.

As a first step to providing the binding we need, we can add a lambda that introduces it:

 (let ([mymap (lambda (mymap) (lambda (f lst) ...))]) ...)

But now, when we invoke mymap, we must first pass it a copy of itself:

 (let ... ((mymap mymap) add1 '(3 4 5)))

So mymap isn’t bound to a map function at all, but a map function maker. We can rename it to reflect this:

 (let ([makemap (lambda (makemap) (lambda (f lst) (if (null? lst) '() (cons (f (car lst)) (mymap f (cdr lst))))))]) ((makemap makemap) add1 '(3 4 5)))

So we can’t seem to hand mymap a copy of itself yet, but we have provided it with a makemap function which can return mymap when self-applied! So we do this in the inside of mymap as well:

 (let ([makemap (lambda (makemap) (lambda (f lst) (if (null? lst) '() (cons (f (car lst)) ((makemap makemap) f (cdr lst))))))]) ((makemap makemap) add1 '(3 4 5)))

Try this out in Racket!

At each iteration, we can self-apply a function that returns the function we want, in order to produce one more iteration. We may also think of this as entangling our recursive function with the omega expression and using it as the engine of our unbounded recursion.

We may also extract this behavior and encode it as a higher-order function Y that takes a function (lambda (mymap) (lambda (f lst) ...)) and calculates its fixed-point: the exact recursive mapping function we need!

 Y = ((lambda (u) (u u)) (lambda (y) (lambda (mk) (mk (lambda (x) (((y y) mk) x))))))

Now we can rewrite our mymap function using this Y-combinator like this:

 (let ([mymap (Y (lambda (mymap) (lambda (f lst) (if (null? lst) '() (cons (f (car lst)) (mymap f (cdr lst)))))))]) (mymap add1 '(3 4 5)))

Give this a try in Racket!

It is also possible to derive the Y-combinator equationally from its definition as a function that computes fixed-points of higher-order functions.

We start by observing that Y applied to a function yields a fixed-point exactly when (Y f) equals (f (Y f)), because a fixed-point for a function f is simply a point at which f is fixed (x where x = (f x)).

 (Y f) = (f (Y f)) Y = (λ (f) (f (Y f)))

If we use this directly as a definition for Y by turning it into an abstraction (as above), a) we would be immediately recuring on Y and looping infinitely without returning, and b) we could not refer to Y in its own body anyway without a letrec construct or using the trick we just used for mymap. If we address both these issues, we’ll have ourselves a valid Y-combinator.

This first problem can be solved by η-expansion; that is, expanding an expression g to (λ (x) (g x)). In the pure lambda-calculus where all values are single arity functions, this expansion is always sound.

 Y = (λ (f) (f         (Y f)    )) Y = (λ (f) (f (λ (x) ((Y f) x))))

This alone gives us a Y-combinator we can use in Racket, so long as we’re able to leverage the letrec or define constructs to give us a self-reference. Try it out:

 (define (Y f) (f (λ (x) ((Y f) x)))) (let ([fact (Y (lambda (fact) (lambda (n) (if (= 0 n) 1 (* n (fact (- n 1)))))))]) (fact 5))

But without define, we must use our previous U-combinator trick to induce self-reference using self-application (self-substitution). We turn Y into a makeY function by wrapping it in a lambda that takes a function y:

 makeY = (λ (y) (λ (f) (f (λ (x) ((y f) x)))))

And then self-apply makeY (which pushes it down as a binding to y) and self-apply y before invoking it on f (the function whose fixpoint we’re computing):

 Y = ((λ (u) (u u)) (λ (y) (λ (f) (f (λ (x) (((y y) f) x))))))

This definition will work, even if we only use let to bind it to Y! Give it a try in Racket.

#### Unchurching church-encoded values

We can directly use church-encoded values obtained by running our compiled programs in Racket code by using a set of "unchurching" functions. That is to say, assuming you have some function cv encoding a church value, we can convert it to a normal Racket value.

 (define (church->nat cv) ((cv add1) 0)) (define (church->list cv) ((cv (lambda (car) (lambda (cdr) (cons car (church->list cdr))))) (lambda (na) '()))) (define (church->bool cv) ((cv (lambda (_) #t)) (lambda (_) #f)))

For numbers, obtaining a Racket integer from a church-numeral requires we apply cv to the Racket function add1 and then to the Racket number 0. Because church-numbers allow repeated application of a function, using them to perform repeated application of add1 on zero returns the corresponding Racket number. Consider the church-numeral for 1:

 (lambda (f x) (f x))

Applying this to add1 will reduce to a function:

Then, subsequently applying this to 0—as church->nat does—will then reduce to 1.

With the help of these "unchurching" functions, we can convert between any church-encoded representation of some type and its Racket equivalent.

To test your compiler, we will apply the church encoder you write to some core scheme program, evaluate the result according to the lambda calculus interpreter you wrote for assignment 0 (or Rackets eval function), and then use these "unchurching" transformations on the result of the church-encoded program. You may write your own tests by simply generating some expressions in the source language, compiling them using your encoder, and then checking that the evaluation of this output yields a value that, unchurched, matches the evaluation of the input language.

### Getting started

For your assignment you may use the following stubbed out code to get started. You should turn in a solution.rkt file that provides a church-encode function.

 #lang racket (provide church-encode) (define (church-encode e) (define prims '(+ * add1 cons car cdr null? not)) (define (prim? prim) (if (member prim prims) #t #f)) (define (churchify-prim prim) (string->symbol (string-append "church:" (symbol->string prim)))) (define (churchify e) (match e ; Tagged expressions [`(letrec ([,f (lambda (,args ...) ,e0)]) ,e1) ] [`(let ([,xs ,e0s] ...) ,e1) ] [`(lambda () ,e0) ] [`(lambda (,x) ,e0) ] [`(lambda (,x . ,rest) ,e0) ] [(? symbol? x) ] [`(and ,e0 ,e1) ] [`(or ,e0 ,e1) ] [`(if ,e0 ,e1 ,e2) ] [`(,(? prim? prim) . ,args) ] ; Datums [(? natural? nat) ] ['(quote ()) ] [#t ] [#f ] ; Untagged application [`(,fun) ] [`(,fun ,arg) ] [`(,fun ,arg . ,rest) ])) ; Church primitives (define Y-comb `((lambda (u) (u u)) (lambda (y) (lambda (mk) (mk (lambda (x) (((y y) mk) x))))))) ; other definitions we'll put in scope below ... (churchify `(let ([Y-comb ,Y-comb] [church:null? ,church:null?] [church:cons ,church:cons] [church:car ,church:car] [church:cdr ,church:cdr] [church:add1 ,church:add1] [church:+ ,church:+] [church:* ,church:*] [church:not ,church:not]) ,e)))

The case ordering here is reasonable and need not be changed. For example, if you were to move the bottom cases for application to the top of the match form, what might go wrong?

Notice an idiom you may find useful in the initial call to churchifyperforming part of the translation and then making a tail call to continue desugaring further. We can bind a set of operations for church values, such as church:+, using let. Then we call churchify on our input wrapped in this small library of church operations and this can turn the let-form into applications of lambdas that will bind each church operation in turn. This is much easier than writing all these lambdas ourselves and much nicer to debug than inlining every church operation where it is encountered.

I recommend you get started by commenting out any cases you aren’t supporting yet and handling a single aspect of this reduction. For example, you might first comment out this let form and first try to get just currying working for the n-arity lambda calculus.

• Try writing some church-encodings manually in Racket to make sure you understand how they work before writing your compiler.

• It will be much easier to debug your code if you implement one feature at a time and check the output carefully against your expectations.

• The reference implementation is only around 100 lines long (spacing included). No single case of the match statement is more than three lines. Thus, if you find yourself writing more than (around) 6-8 lines in a given match clause, you are definitely writing an unnecessarily complicated solution.

• You should also consult the slides we used in class.

• Remember, the pure lambda calculus is a subset of Racket itself. You will find it helpful to copy-and-paste your compiler’s outputs into a Racket file and play around with them in that context to understand how they work. If you get an error when running an output, Racket will be helpful in pointing out exactly where it occurred and what went wrong.

#### Extra Credit

Extra credit opportunity: If you support a binary operation - and two unary operations zero? and sub1 correctly, you will receive 10% extra credit on the assignment.

Web Accessibility