------------------------------------------------------------------------ CMSC 430 Project #1 You are to implement software that converts a regular expression into an NFA, converts the NFA into a DFA and then minimizes the DFA. The program is to execute the minimized DFA on a number of strings. For each string, it should print the list of states the minimized DFA goes through, and then print either "accept" or "reject". Half of the credit for the programming assignment will be based on handling provided examples correctly. The other half will be based on other examples. You should use the support software provided in the proj1 directory on the cluster. The only code you should need to modify is the TG.java file. You will be using Java class libraries from Java 1.5.0. Don't worry if your NFA or your non-minimized DFA don't exactly match the one generated by the ``official'' solution. You only need to worry that your minimized DFA and your execution match. Remember that all minimized DFA's are isomorphic (i.e., nodes and edges can be mapped onto each other exactly), but your algorithm may assign different state numbers to a node than the official solution. This is acceptable. ************ COMPILER NOTES Use Java version 1.5.0, since it is the version that will be used to grade your submissions. ********* FILES ---- You should look at these files ---- TG.java - The main program test*.in - Test case inputs test*.log - Results of running your DFA minimizer on test inputs test*.out - Results demonstrating correct operation goAll - Script to compile everything from scratch goTest - Script to run DFA minimizer on test input files ---- You don't have to worry about these files ---- re.lex - The scanner specification for regular expressions re.cup - The grammar specification for regular expressions Yylex.java - Code produced by scanner generator (JLex) sym.java parser.java - Code generated by parser generator (CUP) ************ GENERAL INFORMATION 1) The project will require both JLex and CUP to work. - If you are working on your own system (Windows/Linux), download and install Java 1.5.0. Links to Java files for Java 1.5.0 are available from the CMSC 430 web page. Next download proj1.tar or proj1.zip from the class web page. Untar/unzip the file, then install JLex & CUP and place them in a directory in CLASSPATH (e.g., C:\Java for Windows). Project 1 files are in proj1. Example (on Windows): Download & install Java 1.5.0 SDK Download proj1.zip and extract Install JLex and CUP to C:\Java, creating C:\Java\JLex and C:\Java\java_cup Put in your environment (e.g., autoexec.bat) set CLASSPATH=.;C:\Java set PATH=.;C:\\bin; Go to C:\Java and compile JLex and CUP by typing javac JLex\Main.java javac java_cup\*.java java_cup\runtime\*.java 2) All the code you need to write is in the file TG.java. I have already prepared most of the code. What you need to do is provide bodies to the seven functions at the bottom of the file. Start with union(), concat(), and closure(). Leave minimize() for last. 3) You can write additional functions/methods for TG.java (besides the original functions) to act as helpers for your code. Doing so may be very helpful for the minimize() function. 3) You can compile/build your DFA using the script "goAll" on UNIX and "goAll.bat" on DOS Windows systems. It will build the parser, 4) Once the DFA is built, you can test it by using the script "goTest" on UMIX and "goTest.bat" on DOS Windows systems. The script will test the DFA on some input files, putting the output in log files. 5) You should be able to test your code at each step of the way. The existing code will print out the status of the Transition Graph (TG) at each step of the process. The format is as follows: ------NFA Transition Graph 0: a -> 1 // state 0, edge from 0 to 1 (on 'a') 1: epsilon -> 2 // state 1, edge from 1 to 2 (on empty str) 2: epsilon -> 3 , b -> 1 // state 2, edges from 2 to 3 & 1 (on e, 'b') 3: // state 3, no edges start state: 0 final states: { 3 } alphabet: { b,a } 6) You can debug your minimizer by looking at the output of your .log file and comparing against the provided .out files. The results will probably be mostly different, since different TGs may be numbered differently. The main parts in which the two areas need to match are 1) number of states in the minimal DFA, and 2) accept/reject descision for each input string. ************ HINTS 1. You need to implement ALL of the functions in TG.java for the linking to be successful (to get the executable). 2. However, during development and testing, you needn't write COMPLETE implementations of the functions. Nevertheless, you should at least return valid instances of the return types. 3. Do the DFA minimization step last. You can test whether you are generating correct DFAs by testing different inputs, even if the DFA is not minimized. 4. Implementing the algorithms for this project involves the careful application of a few non-trivial data types, such as sets. You should learn how to use Java class libraries and tools. HashSets and HashMaps are particularly useful for the project.