Project 4 - Scheme Parser & Interpreter
Due 11:59pm November 9th, 2015 (Monday)
In this project, you will write an interpreter for a tiny subset of
programming language. As part of the project, you will also write a
parser that translates a plain text Scheme program into an abstract
syntax tree (AST), and an evaluator that executes the
code represented as an AST.
For a quick introduction to Scheme, check out these lecture
In order to test your project can try running small Scheme programs online
In order to test your project you can try running it on Grace using the Scheme interpreter guile. It behaves similarly to the OCaml top-level. You might try to get the logic of your functions right, first, by writing them in OCaml, and then translate them to Scheme.
If you want to find out more about Scheme, you can download the
Revised6 Report on the Algorithmic
Language Scheme. You may also try Teach
Yourself Scheme in Fixnum Days.
For purposes of this project, we will only test your interpreter
with valid input. Thus your code may do whatever you want on a
bad input. We do, however, recommend adding reasonable error handling
code to your project for cases of malformed or otherwise incorrect
Scheme input, because it will make developing your project easier. As
you are testing your program, you may inadvertently create incorrect
input data; substantial time may be lost in trying to debug the
program, only to find a few mistyped characters in your input data are
the source of the problem.
Also, even though you need all of these pieces written to have a
"complete" Scheme interpreter, we've structured the project so that
you can work on any of the separate phases independently and test them
on their own. So if you get stuck on an earlier part of the project,
you can still move on to a later part.
Download the following archive file p4.zip
and extract its contents.
Along with files used to make direct submissions to the
submit server (submit.jar,
submit.rb), you will
find the following project files:
- Your Scheme program - schemeTest.txt
- Your OCaml program - scheme.ml
- Public tests
- Expected outputs for public tests
- Utility Code
- testUtils.ml - utilities for printing ASTs & values
- parse.ml - uses your parser to produce Scheme ASTs
- interpret.ml - uses your parser and
evaluator interpret Scheme expressions
- Test Script - goTest.rb (does not
test your SchemeTest.txt file -- you must do that manually)
Part 1: Simple Scheme Programming
Put your solution to this part in the file schemeTest.txt.
Implement the following functions in regular Scheme. All of these
functions should operate over integers and/or lists of integers. The
purpose of this part is just to make sure you understand Scheme before
you try to start writing an interpreter for it. You will find that
Scheme is very much the same as OCaml, but with different syntax, and
with dynamic (rather than static), typing. You might find these lecture
- Write a function double x that returns two times x.
- Write a function powof2 x that returns true (which is written
#t in Scheme) if and only if x is a power of 2.
(Hint: Use the functions / and mod.)
- Write a function sum l that returns the sum of the
integer list l, using recursion. Hint: You can
use the function pair? to determine whether a
value is a cons cell or not; and in this problem, you can assume if
something is not a cons cell, then it's '().
- Write a function applyToList f l that returns a new list
containing all the elements of l, in the same order, but
with f applied to them. Your implementation should use
recursion. You may not use Scheme's built-in map or
for-each functions in writing your map function. Note applyToList
directly takes 2 arguments (i.e., it is not using currying).
Part 2: Parsing Scheme
Put your solution to this part in the top half of file scheme.ml.
Your next task is to write a parser for Scheme expressions,
which in this case will be
a function that turns a string into a Scheme abstract syntax tree
(AST). Your parser will take as input a sequence of
tokens, produced by a scanner, which are the terminals
of the Scheme grammar, and output the Scheme AST.
We've supplied you with a function
tokenize : string -> token list that acts as a scanner/lexer,
converting the string input into a list of tokens,
represented by the following data type:
type token =
| Tok_Id of string
| Tok_Num of int
| Tok_String of string
For example, when called as tokenize "(foo (bar 3) 4
\"baz\")", the return value is
[Tok_LParen; Tok_Id "foo"; Tok_LParen; Tok_Id "bar"; Tok_Num 3;
Tok_RParen; Tok_Num 4; Tok_String "baz"; Tok_RParen]
What to do: You must write a function parse : token list ->
ast that takes as input a list of tokens (returned from
tokenize) and returns an AST. Once you have done this, you
can run it using the code in the file parse.ml.
You should use the idea of a recursive descent parser, as we
discussed in class. Thus we suggest you write two functions:
parse_S, which parses the non-terminal S representing
a single Scheme expression, and parse_L, which parses the
non-terminal L representing a list of Scheme expressions.
context free grammar for Scheme -- and notably for nonterminals
S and L -- is given next, followed by the OCaml type
definition ast of Scheme abstract syntax trees.
The grammar for Scheme expressions you need to support is particularly simple:
- S -> id | n | str | #t | #f | ( L )
- L -> S L | epsilon
- id are Scheme identifiers. Identifiers may contain upper- and
lower-case letters, digits, =, *, +,
/, < >, !, ?, and
-. For example, valid identifiers are Foo,
set!, <three, and +=. Identifiers may
not begin with a digit. These correspond to the Tok_Id token.
Identifiers are distinguished from integers and strings by their first
- n are Integers (made up of digits). These correspond to the
- str are Strings (beginning and ending with quotes).
For purposes of this project, only alphanumeric and whitespace
characters can appear within a string. These correspond to the
- #t and #f are tokens representing true and
false. These correspond to the tokens Tok_True and
- ( and ) are tokens (marking the beginning and end of
a list); these correspond to the tokens Tok_LParen and
For this project, Scheme expressions are represented using
an AST (abstract syntax tree) defined using the following
OCaml data type:
type ast =
| Id of string
| Num of int
| Bool of bool
| String of string
| List of ast list
For example, the Scheme s-expression (foo (bar 3) 4 "baz")
is represented by the AST List [Id "foo"; (List [Id "bar"; Num 3]);
Num 4; String "baz"]. Your AST nodes for strings should
not include quotes. For example, parse ["\"a\""]
should return String "a", i.e., a String constructed
from a string with one character, a.
Part 3: Scheme Interpreter
For part 3 of this project, your task is to write a evaluator for
Scheme abstract syntax trees. This evaluator will form the basis of your Scheme
interpreter. Put the code for the Scheme interpreter is in the bottom
part of scheme.ml.
What you will do: You will write a function eval that,
given an environment and an AST, evaluates the expression
corresponding to that AST in the given environment, producing a final
value. The type of eval is (string * value) list
-> ast -> value, where the first argument is the environment, the
second is the AST, and the result is a value. We present the OCaml
definition (and meaning) of values and environments below.
Note that what you will implement for this part corresponds very
closely to the operational
semantics for OCaml-like programs give in lecture, so that may
serve as a good reference (and this project may serve as a good way to
understand that lecture better).
Your Scheme interpreter should represent the value of Scheme
expressions using the following value user-variant data type:
type value =
| Val_Num of int
| Val_Bool of bool
| Val_String of string
| Val_Cons of value * value
| Val_Define of (string * value) list
| Val_Closure of ...
where the part labeled ... is for you to fill in. However, you must not
change the part of value we have given you, because our
grading scripts will look for exactly those constructors, with exactly
those arguments as given, to test your interpreter.
When doing this project, be sure to keep straight the
difference between textual entities that the programmer has written
down (like the text "3") with the resulting value that your
interpreter produces (Val_Num 3).
Your interpreter will need to maintain an environment to
store bindings of values to symbols. The interpreter will represent
the environment as an associative list of type
(string * value) list. How the associative list
is organized is up to you. The interpreter will need to maintain
a top-level environment of all definitions encountered so
far in the input program.
For example, eval  (Num 3) should return Val_Num
3, meaning that in an empty environment, an AST node containing
the number 3 evaluates to the integer 3.
Calling eval [("x", Val_Num 3)] (Id "x") should
also return Val_Num 3, since x is bound to 3 in the
environment used to evaluate x.
Scheme Language Features
Here we present the language features your Scheme interpreter needs to
- Basic Expressions
- Values. Your evaluation function should evaluate
integers, booleans, and strings to the corresponding values.
- Null. For this project, we will vary slightly from
Scheme and make null the built-in keyword for the empty
list. Thus null should evaluate to Val_null. In
actual Scheme, null is written '().
- Functions Calls
- In Scheme function calls appear as lists, where the name of the function
is the first element in the list, and the remaining list arguments are the
arguments passed to the function. E.g., (+ 1 2 3) calls the built-in
+ function and passes it three arguments, 1, 2, and 3.
- Your implementation must support call-by-value, so that arguments are
evaluated before they are passed to a closure. For example, consider
the following Scheme expression: (+ (- 1 2) 3). The arguments
to the + function must be first evaluated before they are passed to +.
In this case, it means the expression (- 1 2) must be
evaluated first, and its resulting value passed as the first argument to +.
- Built-in Functions
- Your interpreter should support a number of built-in Scheme
functions. Some of these functions can take more than one argument,
but you don't have to implement them with currying. You can just
make these special cases inside your evaluator.
- You do not have to treat primitive functions as first-class
values; they will only be invoked and applied to arguments
(and not passed as arguments or used as return values).
- boolean?, number?, string?,
pair?, and null?.
These built-in functions return true if their single argument is a
boolean, integer, string, cons cell, or null, respectively, and false
- Integer operations. +, -, *, and = on
- + sums its arguments, and should accept one or more arguments.
- - may take one or more of arguments.
It subtracts its 2nd through last argument from
its first argument; given only one argument, it compute unary
negation. For example, (- 3) evaluates to
Val_Num (-3), while (- 4 3) evaluates to
Val_Num 1 and (- 4 3 1) evaluates to
- * multiplies its arguments, and should accept one or more arguments.
- = compares its two arguments and evaluates to either
Val_Bool true or Val_Bool false. You may use
the OCaml = operator.
- If expressions. You should allow both (if cond tr
fl), which evaluates to tr if
cond is true and fl otherwise, and
(if cond tr), which evaluates to
Val_Null if cond is false.
- display. Prints its single argument, which is either
a number or a string. Do not add a trailing newline. If a string,
do not include any quotes around the string when it is printed.
Display should return the value null.
- Your interpreter should support lists constructed from multiple
calls to the cons function. The car and cdr
functions may be used to extract the head and tail of a list.
- cons - Here
(cons x y) should evaluate to the cons cell
Val_Cons(a, b), where a and b are
whatever x and y evaluate to, respectively.
- car -
When applied to a cons cell Val_Cons(a, b),
car returns a.
- cdr -
When applied to a cons cell Val_Cons(a, b),
cdr returns b.
- Top-level Definitions
- When presented with an identifier, your
interpreter needs to look it up. The first place it should look it up
is in the "local" environment, consisting of parameters of enclosing
- If an identifier is not in the local environment, the interpreter
should look it up in the top-level environment (the one that
- If an identifier is in neither environment, then it is unbound,
and trying to evaluate it should produce an error. Since we are
testing your code with only valid Scheme expressions, there will
not be any unbound variables in the submit server test cases.
- Scheme expressions use static lexical scoping to determine
variable bindings in the local environment. For this project,
user functions defined with the "dynamic" keyword will use
dynamic scoping instead to determine where variables are bound
in the local environment.
- User Defined Functions
- You should support creating anonymous functions
of one argument. We will only test user-defined functions
called with a single argument (though functions may use currying).
You need to support functions using both static (lexical) and dynamic
scoping. Both types of functions produce a Val_Closure value that
you must define.
- There is one case where a lambda user-defined function foo
may use dynamic scoping. A free variable x in foo that is also
not in the top-level environment will not be saved in the closure
for foo. The function foo may still be valid, if a define expression
is used to bind a value to x before foo is called. In this case,
the binding for x behaves as if foo used dynamic scoping.
This behavior is different from how OCaml's top-level environment
works, and is a result of the requirement that a Scheme expression
look up variable bindings in both the local and top-level environment
(and that define can add bindings to the top-level environment).
For OCaml, a free variable in foo that is not in the top-level
environment will cause an Unbound Variable error immediately.
The public tests include an example of this situation. In the body
of the user-defined lambda function bound to fact is a
recursive call to fact. Similar code in OCaml would fail
unless a "let rec" was used instead of "let". This code works
in Scheme by relying on the assumption fact will have
been added to the top-level environment using define by the time
the function is actually called.
Testing and Submission
We will test your project by calling your parsing and evaluation
functions directly, so be sure to give those functions the types we
expect, as given above. You can work on the interpreter and parser in
any order, we will test each part independently.
You may assume that all input test cases are syntactically
correct. If the input Scheme code is not legal you may perform any
action (e.g., exit, throw an exception).
All your code should be in two files, schemeTest.txt & scheme.ml.
You can submit your project in two ways:
Submit your schemeTest.txt & scheme.ml files directly to the
You can submit multiple files by putting the files in a .zip archive first.
On Windows you can select the two files, then right click to
select the "Send to->Compressed (zipped) Folder" option
to create a .zip archive.
Once your files are in a single zip archive,
bring up the upload dialog box by clicking on the
submit link in the column "web submission".
Select your archive file using the "Browse" button,
then press the "Submit project!" button.
The submit server now allows multiple files (from
the same directory) to be selected.
Bring up the upload dialog box by clicking on the
submit link in the column "web submission".
Browse to the directory containing your project
files, then click on both schemeTest.txt and scheme.ml.
Now press the "Submit project!" button.
You may also submit directly by executing a Java program on a computer
with Java and network access. Use the submit.jar file
from the archive p4.zip,
To submit, go to the directory containing your project, then either
execute submit.rb by typing:
or use the java jar directly using the following command:
java -jar submit.jar
You will be asked to enter your class account and password, then
all files in the directory (and its subdirectories) will be
put in a jar file and submitted to the submit server.
If your submission is successful you will see the message:
Successful submission # received for project 4
The Campus Senate has adopted a policy asking students to include the
following statement on each assignment in every course: "I pledge on
my honor that I have not given or received any unauthorized assistance
on this assignment." Consequently your program is requested to
contain this pledge in a comment near the top.
Please carefully read the academic honesty section of the
course syllabus. Any evidence of impermissible cooperation on
projects, use of disallowed materials or resources, or unauthorized
use of computer accounts, will be submitted to the Student
Honor Council, which could result in an XF for the course, or
suspension or expulsion from the University. Be sure you understand
what you are and what you are not permitted to do in regards to
academic integrity when it comes to project assignments. These
policies apply to all students, and the Student Honor Council does not
consider lack of knowledge of the policies to be a defense for
violating them. Full information is found in the course
syllabus---please review it at this time.
[an error occurred while processing this directive]
This course project is copyright of Dr. Michael Hicks. ©Michael Hicks . All rights reserved. Any redistribution or reproduction of part or all of the contents in any form is prohibited without the express consent of the author.