CMSC 330, Spring 2015

Organization of Programming Languages

Project 2 - Finite Automata Interpreter

Due 11:59pm Wed, Mar 4th, 2015 Thu, Mar 5th, 2015

Introduction

You will need to implement code to build up NFAs from regular expressions, convert them to DFAs, and test whether a string is accepted by a DFA. You will do this by extending a provided FiniteAutomaton class that operates as part of a simple finite automata interpreter.

Getting Started

Download the following archive file p2.zip and extract its contents.

Along with files used to make direct submissions to the submit server (submit.jar, .submit, submit.rb), you will find the following project files:

To test your FiniteAutomaton implementation, you can execute it on input files from the command line by typing commands like ruby fa.rb < public_size.in. When you compare your output to the output in files like public_size.out, be aware that for full credit you only need to match lines that do not begin with %; these lines are effectively comments to the grading script. Nevertheless, this output should help you debug your implementation, to make sure you are constructing automata that have the right number of states, transitions, etc. even if the state names differ.

The fa.rb file you downloaded already implements a simple interpreter for commands to build/transform finite automata, print out various properties about them (such as the number of states they have, or the strings they accept), and use them to accept/reject particular strings. The interpreter operates using a stack, where each object on the stack is a finite automaton. Commands pop zero, one, or two finite automata from the stack and push zero or one finite automata on stack as a result, along with potentially printing output. Each command is separated by a whitespace: there is nothing special about line breaks. The interpreter (implemented in top-level function interpreter) works by creating new FiniteAutomaton objects and invoking methods on them.

The interpreter will accept the following commands:

  • symbol where symbol is a lowercase letter or the uppercase E. In the former case this command creates finite automaton that accepts just that letter, and in the latter case it creates an automaton that accepts just the empty string. The new automaton is pushed onto the stack.
  • . pops off top 2 finite automata on the stack and creates a new NFA representing their concatenation, pushing this new NFA on to the stack
  • | pops off top 2 finite automaton on stack, creates a new NFA representing their union, pushing this new NFA on to the stack
  • * pops off top finite automaton on stack, creates a new NFA representing its closure, pushing this new NFA on to the stack
  • SIZE prints # of states in the finite automaton at the top of the stack (without popping it off)
  • PRINT prints the finite automaton at the top of the stack (without popping it off). All output from this command is preceded by %, so it is just for documentation & debugging
  • STAT prints some statistics about the finite automaton at the top of the stack (without popping it off)
  • DFA converts finite automaton (DFA or NFA) at the top of the stack to a DFA
  • "str" using finite automaton (must be a DFA) at the top of the stack, decide whether to accept or reject string str
  • GENSTR# prints all strings accepted by finite automaton (must be a DFA) at the top of the stack of length # or less
  • COMPLEMENT converts finite automaton (must be DFA) on top of stack to a DFA that is its complement (i.e., rejects strings accepted by the previous DFA, and accepts strings rejected by the previous DFA)
  • DONE interpreter exits

Example

Here is an example session with the interpreter:
a
b
.
PRINT
SIZE
c
|
*
DONE
The first line creates an FA that accepts exactly the string a, and pushes it on the stack. The second line creates an FA that accepts exactly the string b, and pushes it on the stack. The third line pops these two automata off of the stack, and constructs a new automaton that accepts the concatenation of strings accepted by the two, i.e., ab. This new automaton is pushed back on the stack, and is now the only automaton on the stack.

The next command prints out this automaton (leaving it on the stack), producing output like the following:

% Start 0
% Final { 3 }
% States { 0 1 2 3 }
% Alphabet { a b }
% Transitions {
%  (0 a 1)
%  (1  2)
%  (2 b 3)
% }
Note that different implementations might produce different output in this case, most notably because they could choose different names for states.

The next command prints the number of states in the topmost automaton (leaving it on the stack), in this case 4. The next command constructs an automaton that accepts the string c and pushes it on-stack. The subsequent command pops off this automaton and combines with the the ab automaton also on-stack to produce the automaton implementing ab|c, pushing it on-stack. The next command creates a new automaton from this one, implementing (ab|c)*. The final command terminates the session.

As mentioned, linebreaks are immaterial. Exactly the same results would be produced by the input

a b . PRINT SIZE c | * DONE

Part 1: Extend the FiniteAutomaton class to support NFAs

The FiniteAutomaton class in fa.rb is based on the DFA class used as an example during discussion section. The original DFA class is able to represent DFAs and run them to determine whether input strings are accepted by the DFA. You need to decide how to change/extend this FiniteAutomaton class to represent both NFAs and DFAs. In particular, your modified class should support additional features allowed by NFAs, such as epsilon-transitions and multiple transitions with the same label. You may choose to represent epsilon-transitions as transitions labeled with the empty string "" (as we do in our reference implementation used to produce the sample public outputs) or some other label.

If an automaton is deterministic (i.e., it represents a DFA), it should be able to run on input strings; i.e., the accept? method should return true if your FiniteAutomaton object is a DFA and it accepts the string; the accept? method returns false if the automaton does not accept the string. We will not test the accept? method on NFAs.

Part 2: Building NFAs out of other NFAs

The symbol! method for creating a new NFA for an individual symbol is provided. You need to add the ability to construct more complex NFAs from these single-symbol NFAs. Three of the commands of the interpreter require the ability to create new NFA from existing NFA, using one of the following actions: concatenate, union, and closure. You need to implement the same functionality for NFAs represented by the FiniteAutomaton class in the methods concat! union! closure!. You must construct the new NFA using the algorithm discussed in lecture. In particular, given two NFA a and b as follows:
You should be able to use the algorithm described in class to create a single NFA representing:

NFA

Figure

ab
a|b
a*

Note that though there are other algorithms for generating correct NFA for these operations, using them will yield a different NFA that will not pass the submit server tests. Your methods may (destructively) modify the current NFAs a and b. You can assume there is only one start state for each NFA.

Part 3: Implement utility functions

You should ensure that the existing code provided for the SIZE, PRINT, and GENSTR# commands will work with the extensions you made to the FiniteAutomaton class to support NFAs. Make whatever code changes are necessary. Implement code to support the STAT command. The STAT command should output the following statistics about the finite automaton: the number of states, the number of final states, the number of transitions, and the number of states found with n (outgoing) transitions (in ascending order by n), using the following format:
FiniteAutomaton
  4 states
  1 final states
  3 transitions
    1 states with 0 transitions
    3 states with 1 transitions
If no states are found with n transitions for some value n, do not output the line "0 states found with n transitions".

Make sure all utility functions except the GENSTR# command work for both DFAs and NFAs. These commands are useful for debugging your code during development, and are also used on the submit server to check the correctness of your solution automata.

Part 4: Reducing NFA to DFA

Once you have a NFA, implement code that can reduce it to a new finite automaton representing the DFA created by the subset reduction method presented in class. I.e., the method should return a new FiniteAutomaton object that will accept only the strings accepted by the current FiniteAutomaton object, but which does not have epsilon-transitions or multiple transitions from a state with the same label. You should probably not find it necessary to modify the existing NFA while reducing it to a DFA.

Part 5: Build complement of a DFA

Finally given a DFA x, implement code that can build a new DFA y which is its complement. I.e., y rejects all strings accepted by x, and accepts all strings rejected by x. You must use the algorithm discussed in lecture. First add an explicit dead state and make explicit all transitions to it. Second change all final states to non-final states, and all non-final states to final states. Your methods may (destructively) modify the current DFA when building its complement DFA.

Submission

You can submit your project in two ways:
  • Submit your fa.rb file directly to the submit server by clicking on the submit link in the column "web submission".

    Next, use the submit dialog to submit your fa.rb file directly.

    Select your file using the "Browse" button, then press the "Submit project!" button. You do not need to put it in a Jar or Zip file. Some students have mentioned problems with using Internet Explorer, because submissions being extracted in directories (e.g., "C:\My Documents\330\fa.rb") where the submit server could not find them. The problems went away when switching to the Mozilla Firefox browser.

  • Submit directly by executing a Java program on a computer with Java and network access. Use the submit.jar file from the archive p2.zip, To submit, go to the directory containing your project, then either execute submit.rb or type the following command directly:

    java -jar submit.jar

    You will be asked to enter your class account and password, then all files in the directory (and its subdirectories) will be put in a jar file and submitted to the submit server. If your submission is successful you will see the message:

    Successful submission # received for project 2

Hints and Tips

  • Be sure you have read and understand the project grading policies in the course syllabus. Do this well in advance of the project due date.

    Academic Integrity

    The Campus Senate has adopted a policy asking students to include the following statement on each assignment in every course: "I pledge on my honor that I have not given or received any unauthorized assistance on this assignment." Consequently your program is requested to contain this pledge in a comment near the top.

    Please carefully read the academic honesty section of the course syllabus. Any evidence of impermissible cooperation on projects, use of disallowed materials or resources, or unauthorized use of computer accounts, will be submitted to the Student Honor Council, which could result in an XF for the course, or suspension or expulsion from the University. Be sure you understand what you are and what you are not permitted to do in regards to academic integrity when it comes to project assignments. These policies apply to all students, and the Student Honor Council does not consider lack of knowledge of the policies to be a defense for violating them. Full information is found in the course syllabus---please review it at this time.

    Copyright Notice

    This course project is copyright of Dr. Michael Hicks. ©Michael Hicks [2015]. All rights reserved. Any redistribution or reproduction of part or all of the contents in any form is prohibited without the express consent of the author.

  • Web Accessibility