CMSC 631, Spring 2009

Program Analysis and Understanding

Project 2

Due February 25, 2009
11:59:59pm (via the submit server)


  • None


In this project, you will implement a data flow analysis for a simple imperative programming language.

What to do

Grab this file to get started. When you unpack the tarball and type make, this will build an executable impflow. The main function (in invokes the parser to build an abstract syntax tree. After the file is parsed successfully, it will either be printed or analyzed, depending on the command-line options provided. Invoke impflow -help for usage information.

You should submit the file that contains your implementation of the function analyze : Imp.command -> analysis -> unit. This function takes as its argument a command (which itself is possibly a compound list of commands) and an analysis to perform, according to the following type (defined in

type analysis =
  | ReachDefs
  | AvailExps
  | BusyExps
This function will then analyze the given command using the specified analysis and print out the results before returning. Which argument is provided depends on the command-line options. For example, if impflow is called with -live then it will invoke analyze with the argument LiveVars; similarly for the other three analyses.


The command language in has been modified slightly so that commands are defined using two types:
type raw_comm =
| Assign of string * exp
| If of exp * comm * comm
| While of exp * comm

and comm = 
  Stmt of int * raw_comm
| Block of comm list
Commands are either a single command or a list of commands (a block). Each single command is given a unique number, assigned starting from 0 based on the order of parsing (which is sometimes slightly different than the order in the file). You will use the numbers in producing output to be graded. For compound commands (i.e., "if" and "while"), the number for the command can be thought of as referring to its guard expression.

We have also extended expressions to include two relational operators.

The Analyses

The first step is to create a control flow graph; how to do this is up to you. I recommend that you create an explicit entry and exit node. The entry node precedes the first statement of the program, and the exit node follows the last statement.

You will implement four dataflow analyses that we discussed in class.

  • Live variables (option -live) -- for every variable, determine for every statement whether the variable may be subsequently used. Your output should be of the form {x,y,z} where x, y, and z are the variables live at that point, and no other variables are live.
  • Reaching definitions (option -reach) -- for each statement that assigns to a variable, determine which statements the assignment may reach. Your output should be of the form {n1,n2,n3}, where ni are the numbers of defining statements that reach that point.
  • Available expressions (option -avail) -- for each expression e1 + e2 or e1 * e2 determine which expressions are available at each statement. Your output should be of the form {e1,e2,e3}, where ei are the expressions available at that point.
  • Very busy expressions (option -busy) -- for each expression, determine which are very busy at each program point. Your output should be of the form {e1,e2,e3}, where ei are the expressions that are very busy at that point.
You'll notice that these four analyses cover the gamut of may vs. must and forward vs. backward. Thus it behooves you to make your dataflow analysis suitably generic to easily cover these four cases. Hint: the OCaml generic Hashtbl module is your friend ...


The output from your analysis should be of the following format: For each numbered command, output on a new line
#n <facts>
where n is the statement number and facts is the set of computed dataflow facts (following the format described above). For forward problems, the printed facts should be those that hold immediately at the point after the statement. For backward problems, the printed facts should be those that hold immediately at the point before the statement.

Below are the results of running the four analyses on test1.imp. Warning: I produced these results by hand, so there could be small typos in them. Use your own understanding of the analyses to verify the output.

Live variables:

#0 {b,c}
#1 {b,c,t1}
#2 {t2,b,c}
#3 {t2,t3,b}
#4 {t2,t4}
#5 {t5}

Reaching definitions:

#0 {0}
#1 {0,1}
#2 {0,1,2}
#3 {0,1,2,3}
#4 {0,1,2,3,4}
#5 {0,1,2,3,4,5}

Available expressions:

#0 {-1 * c}
#1 {-1 * c,b * t1}
#2 {-1 * c,b * t1}
#3 {-1 * c,b * t1,b * t3}
#4 {-1 * c,b * t1,b * t3,t2 + t4}
#5 {-1 * c,b * t1,b * t3,t2 + t4}

Very busy expressions:

#0 {-1 * c}
#1 {-1 * c,b * t1}
#2 {-1 * c}
#3 {b * t3}
#4 {t2 + t4}
#5 {}

The order in which you print the statements does not matter. You can also adjust spacing however you like. Please list the dataflow facts for each statement on a separate line, however.

Extra Credit

For extra credit, implement your dataflow analysis efficiently using bit-vectors. If you choose to do this, write a plain text file EXTRA-CREDIT and include in it an explanation of how your bit-vector code works. You must worry about the case where there are more facts than bits in the size of a word.

Academic Integrity

The Campus Senate has adopted a policy asking students to include the following statement on each assignment in every course: "I pledge on my honor that I have not given or received any unauthorized assistance on this assignment." Consequently your program is requested to contain this pledge in a comment near the top.

Please carefully read the academic honesty section of the course syllabus. Any evidence of impermissible cooperation on projects, use of disallowed materials or resources, or unauthorized use of computer accounts, will be submitted to the Student Honor Council, which could result in an XF for the course, or suspension or expulsion from the University. Be sure you understand what you are and what you are not permitted to do in regards to academic integrity when it comes to project assignments. These policies apply to all students, and the Student Honor Council does not consider lack of knowledge of the policies to be a defense for violating them. Full information is found in the course syllabus---please review it at this time.

Valid HTML 4.01!