In the last lecture, we built the CEK machine, which allowed us to define how our programs evolved as a state transition system. The CEK machine implements a small step semantics for the lambda calculus by starting with an initial state and taking incremental “steps” to evolve the computation. Like a board game, the CEK machine defines a set of configurations, and defines a step function, that allows us to transition from one configuration to the next.

The CEK machine that we defined last lecture operated on terms of the lambda calculus:

\[
\begin{align}
t & ::= & \\
& x & \text{Variables} \\
& \lambda x. t & \text{Lambda abstractions} \\
& t_1 t_2 & \text{Applications}\\
\end{align}
\]

States in the CEK machine are comprised of three components:

The control ( $C$ ), the program text currently being executed
The environment ( $E$ ), the current assignment of variables
The continuation ( $K$ ), what to do after we finish the current operation

The control is simply defined as a term. In OCaml that means we would have a type that looks something like:

type control = term

The control defines what the machine is working on right now. To make progress executing a term, we update the control. So for example, to execute the term $(\lambda x. x) (\lambda y. y)$ , we start out with the machine looking like this:

\[
\langle \underbrace{(\lambda x. x) (\lambda y. y)}_{\text{Currently working on this}}, \overbrace{[]}^{\text{With no variables defined yet}}, \underbrace{Done}_{\text{And after that we’re done}} \rangle
\]

And the way we start working is to start working on $(\lambda x. x)$ . The way we do that is to take a step to focus the control on that. But we have to remember what to do after that, and to talk about that, we’ll have to define environments and continuations.

Environments tell us what variables mean. When we built the CEK machine, we decided that the environment would map variables to closures. Closures are lambda terms paired with an environment we’ll use to look up variables. Remember, this is because when we “return” a function, we need to remember what the variables were at the time of its definition. The domain $D$ holds the values storable by our machine

\[
\begin{align}
D & = & Clo( \lambda x. t, E) \\
E & : & V \to D
\end{align}
\]

In other words, variables are mapped to closures, which have type $D$ . In our code for the CEK machine, we implemented these as mutually recursive datatypes. You can think of $D$ as the “internal representation” of functions in the CEK machine. Environments $E$ , map variables to closures.

Now, to think about continuations, we have to think about what the CEK machine is going to do when we execute an application. So let’s stop and ask ourself, what happens when we want to execute an application here:

\[
t_1 \, t_2
\]

The first thing we have to do is execute $t_1$ . We do this by updating the control $C$ to “focus in” on $t_1$ . When the machine takes a step to focus in on $t_1$ , it’s like modifying the program counter (in assembly) to point at the beginning of a procedure you wish to call. You update the control, and then the machine goes and does more stuff to compute on. But what happens when we’re doing working on $t_1$ ? What do we need to do next? If we think about it, we need to go back and work on $t_2$ . So while we’re working on $t_1$ , we need to remember that the next thing we need to do is work on $t_2$ . We represent this with a continuation $EArg(t,E,K)$ , which says “after you’re done with the current thing, evaluate an argument $t$ , using environment $E$ , and after that do $K$ .”

When we’re done working on $t_1$ , we end up with a lambda $\lambda x. t$ . Then we’re ready to work on $t_2$ . When we’re done working on $t_2$ and get a result, what do we need to do? We need to actually call that lambda expression with that result. While we’re working on $t_2$ , we need to remember to call $\lambda x. t$ after we finish. Inside the execution of $t_2$ , we represent this by using a continuation $Call(t,E,K)$ , which can be read “we’re evaluating an argument to a function, and after we’re done, we need to call term t (which will be of the form $(\lambda x. t)$ ) with the environment $E$ , and then do K.”

In other words, our continuations look like this:

\[
\begin{align}
K & ::= & \\
& | & Done \\
& | & EArg(t,E,K) \\
& | & Call(t,E,K) \\
\end{align}
\]

So, here’s what the evaluation of the term $(\lambda x. x) (\lambda y. y)$ looks like in the CEK machine:

\[
\begin{align}
& \langle \underbrace{(\lambda x. x) (\lambda y. y)}_{\text{Currently working on this}}, \overbrace{[]}^{\text{With no variables defined yet}}, \underbrace{Done}_{\text{And after that we’re done}} \rangle \\
\rightarrow & \langle \underbrace{(\lambda x. x)}_{\text{Evaluating the function}}, [], \underbrace{EArg(\lambda y. y, [], Done)}_{\text{And after that evaluate the argument}} \rangle \\
\rightarrow & \langle \underbrace{(\lambda y. y)}_{\text{Evaluating the argument}}, [], \underbrace{Call(\lambda x. x, [], Done)}_{\text{And after that call the function}} \rangle \\
\rightarrow & \langle \underbrace{x}_{\text{Evaluating the body}}, \overbrace{[ x \mapsto \{ (\lambda y. y), [] \}]}^{\text{With x pointing to the closure}}, \underbrace{Done}_{\text{After that we’re done}} \rangle \\
\rightarrow & \langle \underbrace{\lambda y. y}_{\text{Looked up $x$}}, \overbrace{[]}^{\text{Empty environment (from $x$’s closure)}}, \underbrace{Done}_{\text{Now we’re done}} \end{align}
\]

Reduction rules for the machine

The machine proceeds in steps. Each step looks at the current configuration and decides where to move. Where to move depends on the current step. There are four basic cases:

Evaluate a variable, by looking it up in the store
Start evaluating an application $t_1 t_2$ , by moving the focus to $t_1$ and remembering to evaluate $t_2$ next
Finish evaluating a function $t_1$ , and start to evalaute its argument
Finish evaluating an argument and call to evalaute the body

Now, let’s define these basic cases. To look up a variable, we look it up in the environment, returning a closure, and then step to evaluating the closure’s code (a function) in the closures environment. Note that the environment switches, to become the environment from the closure. This is because a closure $\{ \lambda x. t , E \}$ tells us where to resolve variables ( $E$ ) when executing $\lambda x. t$ .

\[
step \overbrace{\langle x, E, K \rangle}^{\text{When evaluating a variable}} = \underbrace{\langle \lambda x’. t, E’, K \rangle}_{\text{Evaluate function in its environment}} \text{, where, } \overbrace{E(x) = { \lambda x’. t, E’ }}^{\text{Look it up in E}}
\]

Next, we need to define what happens when we see an application. When we see $t_1 t_2$ , we need to put the focus on $t_1$ , and remember to go back and evaluate the argument after we finish evaluating that. We do this by adding on a continuation. Note that we capture the previous continuation: this creates a stack of continuations, similar to the stack in Java or C.

\[
step \overbrace{\langle t_1 t_2, E, K \rangle}^{\text{When evaluating an application}} = \langle \overbrace{t_1}^{\text{Evaluate the function}}, E, \underbrace{EArg(t_2,E,K)}_{\text{Evalaute its argument next}} \rangle
\]

Next, when we finish evaluating a function $t_1$ , we need to start to evaluate its argument. In this case, the control will have the form $\lambda x. t$ , and the continuation will be of the form $EArg(t',E',K)$ . In this case, we need to start to evalaute $t'$ , but remember that we need to go back and evalute the call next. Note that we need to switch to using the environment $E'$ :

\[
step \overbrace{\langle \lambda x. t , E , EArg(t’,E’,K) \rangle}^{\text{When done evaluating a function}} = \langle \overbrace{t’, E’}^{\text{Evaluate its argument using its environment}}, \underbrace{ECall(\lambda x. t, E, K)}_{\text{And call the result next}} \rangle
\]

For the last case, we actually have to call the function after evaluating its arugment. To do this, we have to add the argument to the environment, by extending the environment to map $x$ to the result $(\lambda x. t)$ of the argument. But we can’t just map $x$ to the result, because it needs to be able to look up variables, so we need to create a closure, and map $x$ to the closure:

\[
step \overbrace{\langle \lambda x. t , E , Call(\lambda y. t’, E’,K) \rangle}^{\text{When done evaluating argument}} = \langle \overbrace{t’}^{\text{Eval body}}, \underbrace{Clo(\lambda x. t,E)::E’}_{\text{By extending environment with binding for $x$}}, K \rangle
\]

And we’re done! Those are the four basic rules for the CEK machine. To run the machine, we repeatedly apply the $step$ function until we reach a state where the machine is finished. As part of project three, you’ll implement detecting this and exit the machine (returning the result).

If you haven’t gone through it yet, you can see the CEK machine in action in the files:

cek-lambda.ml, which contains the lecture notes for the CEK machine
cek.ml, which contains a stripped down version of the machine without all the comments

CMSC 330, Summer 2015

Organization of Programming Languages

A refresher on the CEK machine

Reduction rules for the machine