p3.zip, which contains
cekaml.ml, the file you will extend and use to implement your project. The math for our CEK machine is here (you should read this after this project description). The project zip also includes:
public_programs.ml– A set of example programs written in CEKaml
test_apply.ml/rb– Tests applying functions
test_arg.ml/rb– Tests evaluating arguments
test_call.ml/rb– Tests calling functions
test_compute.ml/rb– Tests computing results of whole programs
test_final.ml/rb– Tests the
test_fix.ml/rb– Tests evaluation of
test_if.ml/rb– Tests evaluation of
test_inject.ml/rb– Tests the
test_let.ml/rb– Tests implementation of
test_var.ml/rb– Tests variable lookup
outputs/*– Outputs for public tests
goTest.rb– Run all public tests
July 3, 2015: Fixed
EBArgto include environment as well, which must be passed along to evaluate each case for builtins. Added a few more builtins.
July 7, 2015: Error in the definition of the
fixrule caused error in public tests. Public test is fixed and change is noted on Piazza.
For the semantics portion of our class, we covered big step and small step semantics. Big step semantics (like the kind we implemented here) use an evaluation relation , which we implement in code as an
eval function. These interpreters are often simple to understand, becuase of their recursive top-down organization.
But while simple, big step interpreters have many shortcomings. For example, they can’t easily interpret large programs, since recursion in the target language will be mirrored in the
eval function. Because of this, we introduced small step semantics, and focused specifically on the CEK machine. A CEK machine evolves in steps, similar to a processor for a computer, pushing around various components to evolve the computation.
In this project you will implement a CEK style interpreter for the langauge CEKaml, a langauge based on OCaml but without types. Specifically, the CEKaml terms are:
- Integral and boolean literals,
- Variable references,
- Let binding,
let x = t1 in t2
- Lambda abstraction,
fun x -> t
- Integer and boolean literals,
- Polyadic builtin operations, e.g.,
+ 23 1 x,
if t1 then t2 else t3
fix f in ...
- Constructors (or variants),
- Match statements,
match Ctr1(2) with | Ctr1(x) -> x | Ctr2(x,y) -> y
In this project, we won’t work with concrete syntax for our programming language, we’ll work with abstract syntax. In other words, you’ll work with an OCaml based representation of the program text, represented as a
In this project you may assume only valid programs (or moves) will be made with your interpreter.
The only things matched by match statements are variants. This is in contrast to OCaml, where you can also match (e.g.,) integers.
For the built in operators, assume that you will always be passed at least one argument. For the case of
not, you may assume that there is only one argument.
Terms are represented in the
Syntax module of the project implementation. This module includes the term datatype and a utility to convert terms into strings
type builtin = | Plus | Times | And | Not | Or type term = | Var of var (* Variables *) | Let of var * term * term (* Let binding let x = ... in *) | Lam of var * term (* (\x. e) *) | App of term * term (* e1 e2 *) | NumLit of int (* literal integers *) | BoolLit of bool (* literal booleans *) | Builtin of builtin * term list (* built in operators e1 + e2 *) | Ifthenels of term * term * term (* if e1 then e2 else e3 *) | Fix of var * term (* fixpoints: let rec var = ... *) | Variant of name * term list (* constructors: C(e1,...,en) *) (* Match patterns. match t with | C1(v11,...,v1k) -> b1 | ... -> ... | Cn(vn1,...,vnk) -> bn represented as Match(t,[("C1",["v11",...,"v1k"],b1); ... ("C1",["v11",...,"v1k"],b1)]) *) | Match of term * ((name * var list * term) list) and (* A match pattern is a: - Constructor name - Variable list - Term When matched, the term will be executed with the variables substituted into the environment. *) match_pattern = (name * var list * term)
Interpreters in the CEK style
This week’s class covered small step semantics using the CEK machine. The CEK machine is an abstract machine for efficiently evaluating functional languages, that has three components:
- The control, which tells us what the current instruction is. In our case this is simply a term.
- The environment, which tells us how to look up variables
- The continuation, which tells us where to go next
We went over the details for how the CEK machine worked in class, but I reiterate here the elements germane to this project. Remember that if you get stuck, there are various notes and resources on the course webpage and on Piazza from which to draw inspiration
The CEK machine state () is then a three tuple of those components, . The machine executes by taking multiple steps from an initial state. If the program is
e, then the initial state is
\Sigma_0 = \langle e ,  , Done \rangle
You will implement an OCaml function,
inject e, that performs this operation. After the initial state, the machine steps to new states until there are no more possible moves:
\Sigma_0 \rightarrow \Sigma_1 \rightarrow \cdots \rightarrow \Sigma_f
To detect whether the machine is in a final state, you will implement an OCaml function
final_state s. This function will look at the state and detect whether or not any possible move could be made. This will happen when the continuation is in the
Done state and no more moves can be applied.
To define an interpreter for TypelessCaml, you will implement the
step function for the CEK machine, as we discuss in class. To correctly implement this project, you will have to translate the mathematics in the CEK machine’s definition into OCaml code.
We went over the mathematics for the enhanced CEK machine yesterday in class, but didn’t systematically write them down anywhere.
When you get stuck on the project, and to understand the rules for our enhanced CEK machine, read the guide linked at the top of this description.
Implement the following functions in
inject eThe inject function takes a term and creates an initial state out of it, setting up the control to point at the program, the empty environment, and a
final_state sDetects whether the machine is in a final state or not. The machine is in a final state when it can apply no more steps.
step sImplements the stepping behavior of the machine. There are many cases that need to be handled, which will be tested independently:
- Evaluating a variable
xby looking it up in the environment. Remember to handle each possible case for machine values (not just closures).
- Evaluating an application
t1 t2by evaluating
t1and setting up the correct continuation for
- Evaluating an argument
EArgand creating a
Callwith the proper structure.
- Evaluating a call
\x. t, by binding
xin the proper environment and stepping into the function body.
- Evaluating a let by treating
let x = e1 in e2in the same way as
(\x. e2) e1
- Evaluating a literal (integer or boolean) by correctly making
- Evaluating if correctly by evaluating the guard and then the correct branch.
- Evaluating fix correctly by creating a closure of the current environment and going into the function.
- (Extra credit) Evaluating constructors correctly by evaluating their arguments in order
- (Extra credit) Evaluating match correctly, by evaluating its term, and then choosing the correct case.
- Evaluating a variable
step, to compute the final value of a computation by applying the step function until a final state is reached. When a final state is reached,
computegives the value of the resulting term (in the control).
Extra credit (5%)
Implement constructors and match. If you plan to do this, let me know and we can talk about your planned implementation. The implementation is not difficult per se, but has some slight technical complications that we should talk about in person.
Specifically, to evaluate a variant, you need to evaluate its arguments in order: just like for built ins. But to then finish the variant, you can’t take it and step into the assembled variant, because you’d be calling the machine with an
mvalue in the control position. Things wouldn’t quite type check. To handle this, we have to add a special rule for calling functions where the argument is a variant, and also add a special case for if the program is done and computes a variant as its result.
To implement match, you perform a similar sequence of steps. You evaluate the discriminee (thing being matched against), then have a continuation that allows you to choose between a set of constructors. When you finish evaluating the discriminee, you decide which match body to jump into and instantiate the arguments appropriately.
These steps should not be too challenging if you’ve implemented the project, but might be overwhelming to complete if you don’t have a good familarity with how the CEK machine is working. If you’d like to do the extra credit, we can discuss over email, after class, or in office hours.
Tips / Hints
This project shouldn’t consist of all that much code. Each of the machine steps can be written succinctly.
Your step function should be a bunch of cases in a match statement. When this code starts to become too unwieldy, split it into various helper functions.
Debugging your interpreter should be easy if you write a little bit of utility code. Write a function that prints the input state (using the helper function for printing states), runs the step function, and then prints the output state. Then you can figure out what your step function is doing wrong.
In class we presented an implementation of the CEK machine that handled the basic lambda calculus. You should start the project by copying that over to your step function and seeing that it works for the basic lambda calculus.
If you get stuck on what the interpreter should do, look at the math for it in the companion document