CMSC 430, Spring 2011

Introduction to Compilers

Project 2 - C-- : Type Checker & AST

Due 11:59pm Fri, March 11th, 2011

Introduction

In this project you will add syntax-directed translation to your C-- parser in order to build symbol tables, perform type checking, and construct an abstract syntax tree (AST) intermediate representation of the input program.

Getting Started

Project 2 will be implemented in Java in Eclipse. Download the following file:
  • Eclipse project archive file p2.zip
You can import the project into Eclipse as an existing project in an archive file.

Changes

See project 1 description for description of project files and syntax/semantics of C--. Compared to project 1, there have been a few changes to mycc.cup:

  • Added a number of actions to build basic symbol table & AST.
  • Uncommented calls for building AST and classFile in header.
There are also a number of files that have been added/modified since project 1. For this project, you should replace the skeleton mycc.lex file with your mycc.lex file from project 1. You should compare the skeleton mycc.cup file with your mycc.cup file from project 1 and integrate the changes. For all other files, use the version from project 2.

Requirements

Assuming you passed all the test cases for project 1 (minus error checking), the only changes you need to make for project 2 should be to add/modify action routines in mycc.cup.

You may assume that all input C-- programs in the test cases are syntactically correct.

Symbol Tables

Symbol tables are used to collect and store information on all identifiers (variables, functions) in the program. A variable can be local, global, or passed as a parameter. To support nested lexical scoping, you must link together multiple symbol tables in a hierarchical manner. Use the symTab and symTabEntry classes to construct symbol tables in the C-- compiler. Each nested scope must have its own symbol table. References to symbol tables for each function are stored in its symTabEntry.

Goals:

  • Build a symbol table for each lexical scope
  • Link up symbol tables appropriately
    • Functions and global variables in top-level symbol table
    • Symbol table for function in its symTabEntry
    • Symbol tables for local scopes in program order (order seen in program text)
  • Have your symbol tables print out in the manner expected in the test*.out files

Abstract Syntax Tree

An abstract syntax tree (AST) is an intermediate representation of a program. It can be constructed during a bottom-up parse. Use the expNode, astNode, and astNodeList classes to construct ASTs in the C-- compiler.

You may assume ASTs will be examined only for C-- programs with no type errors.

Goals:

  • Build the AST for the input program
  • Have your AST print out in the manner expected in the test*.out files

Type Checker

The type checker makes sure that all variables accessed are legally declared and have proper types. It also provides warning of undeclared and redeclared variables. Use information from symbol tables to perform type checking in the C-- compiler.

Most type errors are straightforward. One tricky case in C-- is handling type errors for the "==" and "!=" operators, which can take either two integer operands, or two boolean operands. If types of the two operands differ, assume the type of the first operand is the intended type (i.e., "if (a < b) == 2" would assume "2" should be a boolean, while "if (2 == (a < b)" would assume "a < b" should be an integer.

If the forward declaration for a function does not match the actual function definition, go with the function definition once it is compiled.

Goals:

  • Report the following type errors using parser.msg():
    • "undeclared variable x" // x is used before it is declared
    • "redeclaration of x" // x is declared more than once in same scope
    • "requires integer" // for operators such as + * < ...
    • "requires boolean" // for operators such as &&, ||, IF, WHILE
    • "misuse of x" // misuse x[i], x(i), x
    • "illegal subscript" // non-integer array subscript
    • "parameter mismatch" // e.g., declared as void foo(int x), invoked as foo()
    • "return mismatch" // return type different from function declaration
    • "forward decl mismatch x" // mismatched return type or parameter

Main Classes

parser

Generated by CUP, this is the class which performs the actual parse. All action code in your CUP grammar are executed as methods of class "parser". Important public fields include globalSymTab, currSymTab, and curType. These fields are used to store information to be transferred between different actions in CUP. The "parser" class also contains main(), the starting point of the user code.

expNode

Stores information for individual instructions. Basic unit of abstract syntax tree. Contains a number of fields whose contents vary depending on node type.
  • Fields:
    • int nodeType; // type of the node: Const.INSTRUCTION, Const.IDENTIFIER, Const.INTEGER, Const.STRING, Const.ARRAY, Const.CALL
    • int expType; // type of the variable: sym.INT, sym.VOID, sym.BOOL
    • int val; // value of integer constant
    • String str; // content of literal string
    • symTabEntry ident; // pointer to the symbol table entry
    • int opcode; // opcode of the instruction
    • expNode op1; // operand 1
    • expNode op2; // operand 2
  • Node types:
    • Const.INTEGER // expType = sym.INT; val = value
    • Const.STRING // expType = sym.VOID; str = literal string
    • Const.ARRAY // expType = sym.INT; op1 = array subscript
    • Const.IDENTIFIER // expType = sym.INT or sym.VOID; ident = symTabEntry for identifier
    • Const.CALL // expType = sym.INT or sym.VOID; op1 = name of function; op2 = function argument (may be null)
    • Const.INSTRUCTION // expType = sym.INT, sym.VOID, or sym.BOOL; opcode = code for instruction ; op1 = first operand ; op2 = 2nd operand

astNode

Node of the abstract syntax tree. Contains a number of fields whose contents vary depending on node type.
  • Fields:
    • int treeType; // type of the tree
    • expNode treeExp; // useful for if, while, instructions
    • astNode node1; // useful for if and while
    • astNode node2; // useful for else
    • astNodeList nlist; // useful for block stmt
  • Node types:
    • TREE_INSTR // treeType = Const.TREE_INSTR expNode = instruction
    • TREE_IF // treeType = Const.TREE_IF treeExp = condition node1 = if block statement node2 = else block statement
    • TREE_WHILE // treeType = Const.TREE_WHILE treeExp = condition node1 = loop body
    • TREE_BLK // treeType = Const.TREE_BLK nlist = list of block statements
    • TREE_PROC // treeType = Const.TREE_PROC treeExp = name of function node1 = block statement body

Submission

You can submit your project in two ways:
  • Submit your mycc.lex and mycc.cup files by first putting both files in a single .zip or .jar archive. The files should be at the top level of the archive, not in a subdirectory.

    You may create a jar archive containing mycc.cup & mycc.lex by going to the directory containing your project, then either execute goJar (or goJar.bat) or type the following command directly:

    jar cf p2.jar mycc.cup mycc.lex

    You can submit the resulting p2.jar archive to to the submit server by clicking on the submit link in the column "web submission".

    Select the p2.jar archive using the "Browse" button, then press the "Submit project!" button.

  • Submit directly by executing a Java program on a computer with Java and network access. Use the submit.jar file from the archive p2.zip, To submit, go to the directory containing your project, then either execute goSubmit (or goSubmit.bat) or type the following command directly:

    java -jar submit.jar

    You will be asked to enter your class account and password, then all files in the directory (and its subdirectories) will be put in a jar file and submitted to the submit server. If your submission is successful you will see the message:

    Successful submission # received for project 3

Academic Integrity

Please carefully read the academic honesty section of the course syllabus. In particular, remember that you are not allowed to show or copy any code for your project, either from the web or other students. Any evidence of impermissible cooperation on projects, use of disallowed materials or resources, or unauthorized use of computer accounts, will be submitted to the Student Honor Council, which could result in an XF for the course, or suspension or expulsion from the University. Be sure you understand what you are and what you are not permitted to do in regards to academic integrity when it comes to project assignments. These policies apply to all students, and the Student Honor Council does not consider lack of knowledge of the policies to be a defense for violating them. Full information is found in the course syllabus---please review it at this time.