CMSC 631, Fall 2010

Program Analysis and Understanding

Project 2

Due September 29, 2010
11:59:59pm

Updates

  • Sep 25 - Clarification: when printing out a counterexample, print out assignments to all symbolic integers used in the program up to that point.
  • Sep 24 - Change to project description - to keep things simpler, do not print out array values in counterexamples; see the forum for a posting about this.
  • Sep 20 - Clarification: all ints in this project are unsigned. Implement unary negation -x as (0-x).
  • Sep 15 - Reposted p2.tar.gz without .cmo, .cmi, .cmx, .cma, and .cmxa files.
  • Sep 14 - Reposted p2.tar.gz without .o or .a files.

Introduction

In this project, you will write a symbolic executor for a very simple language. The programming language is essentially three-address code, as might be found in the intermediate representation of a compiler. The grammar for statements in the language is as follows:

stmt ::=
| prestmt                /* unlabeled statement */
| l: prestmt             /* labeled statement */

prestmt ::=
|  x := a1 bop a2        /* binary operation */
|  x := uop a            /* unary operation */
|  x := a                /* copy statement */
|  x := newarray         /* array (of unsigned int) initailization */
|  x[a1] := a2           /* array write */
|  x := y[a]             /* array read */
|  goto l                /* unconditional branch */
|  if x rop y goto l     /* conditional branch */
|  assert x rop y        /* assertion */
|  x := symint           /* create a symbolic, 16-bit, unsigned integer */
|  x := symarray         /* create an symbolic array of 16-bit unsigned ints */

a, a1, a2 ::=            /* arguments */
| x                      /* variable */
| n                      /* natural number (16-bits) */

bop ::= + | - | * | /    /* binary operations */
uop ::= -                /* unary operations */
rop ::= == | != | < | > | >= | <=   /* relative operations */

x, y ::= variables (strings)
l ::= labels (strings)
n ::= natural numbers

Statements are optionally labeled (labels are just strings). Most of the staements are self-explanatory. The language includes arrays, which are created with the statement x := newarray, which creates a new array of unbounded size. Arrays may only contain numbers; there are no arrays of arrays. Integers are 16 bits and unsigned. Arrays are indexed from 0, and array indices are 16-bit unsigned integers. You can assume that all variables and array elements are initialized before use. (We use 16-bit numbers so they easily fit within an OCaml integer, which is 31 bits wide.)

Code Outline

Here is a code skeleton for this project: p2.tar.gz

We've supplied you with some basic code for working with the above language. We have provided a parser that takes text in the input language and turns it into an OCaml data structure, defined in se.ml, representing the code. We also have provided an unparser that takes OCaml data structures representing statements and prints them.

Here is a brief overview of the code we've given you:

  • camlidl/, ocamlstp/, stp/ - STP code and interface to OCaml
  • err.ml - keeps track of the current line during parsing
  • lex.mll - lexer for input language
  • parse.mly - parser for input language
  • unparse.ml, unparse.mli - an unparser, which turns an AST back into program text
  • se.ml, se.mli - a skeleton for your symbolic execution code
  • stp_example.ml - sample code that uses STP
  • main.ml - driver code that calls the analysis

To try out this skeleton code, cd into the top-level directory and type make. Then you should be able to run ./se -unparse test1.tac which should parse the file test1.tac and then unparse it to standard out. Note that each statement in the input file has been assigned a unique number, which you'll use in the output below. You can also run ./se -stp-example to run the example STP client in stp_example.ml. (Hint: you may find it useful to experiment with this file to learn about STP. Feel free to change this file however you like; we won't use this code in grading.)

Note: If you are building on junkfood, you have to use a more recent version of gcc by adding /usr/local/gcc-4.4.3/bin to your PATH and running the project as env LD_LIBRARY_PATH=/usr/local/gcc-4.4.3/lib ./se ....

Symbolic Executor

Your task is to implement a symbolic executor for this language. You must implement function se : stmt list -> unit in file se.ml. You need not put all the code you write in just that file; feel free to add additional files as you see fit, and modify the Makefile appropriately.

Your symbolic executor should start executing in the empty environment (no variables exist yet) with the first statement in the program. Execution of a program ends when there are no more statements to execute (i.e., when execution "falls off the end" of the program). Your executor should explore all possible paths through the program, forking whenever necessary. (But you shouldn't actually fork the executor, unless you really want to. I recommend maintaining a work-list of executions intsead.) Use STP to determine which branches are feasible, and whether assertions pass.

If all executions through program pass all assertions, your program should simply print Pass and exit. Otherwise, do not pass Pass. Each time the assertion at line n fails during symbolic execution, your program should print Fail n on one line. Then the next lines should list an assignment that triggers the failure. The assignment should be expressed as tab-indented lines either of the form x: v or x[j] : v, where x is the assigned-to variable name on the line where the symbolic integer or array was created, j is the array index (for arrays) and v is the value. For example, if the input program is

a := symint;
b := symint;
assert a < b;
the output could be
Fail 2
	a : 10
        b : 5

Since there can be many assertion failures, your program could print many Fail cases. You should print out an assignment to all symbolic integers used up to that point in the program. Your symbolic executor should stop executing the current path when it reaches an assertion failure.

Hint: I strongly suggest you start by only worrying about the concrete semantics of programs, handle symbolic integers next, and then finally handle symbolic arrays.

Hint: There are many ways to design the symbolic executor. One possibility is to write a function step : stmt list -> state -> state * (state option), where the first arugment is the program (you could also use a different representation than the statement list), and state captures the current program counter, path condition, and environment (variable bindings). Then the left output state component contains the new program counter, path condition, and environment after executing a single statement. The right output state component contains the same information for the other path to take, if any.

What to Submit

Submit a complete project that we can compile with make and run. You may add more files to your project.

Academic Integrity

The Campus Senate has adopted a policy asking students to include the following statement on each assignment in every course: "I pledge on my honor that I have not given or received any unauthorized assistance on this assignment." Consequently your program is requested to contain this pledge in a comment near the top.

Please carefully read the academic honesty section of the course syllabus. Any evidence of impermissible cooperation on projects, use of disallowed materials or resources, or unauthorized use of computer accounts, will be submitted to the Student Honor Council, which could result in an XF for the course, or suspension or expulsion from the University. Be sure you understand what you are and what you are not permitted to do in regards to academic integrity when it comes to project assignments. These policies apply to all students, and the Student Honor Council does not consider lack of knowledge of the policies to be a defense for violating them. Full information is found in the course syllabus---please review it at this time.

Valid HTML 4.01!