Question:



Java Code
--------------------------------------------------------------------------------------------
static Set<String> getIdentifiers(String filename) throws Exception {
    Set<String> identifiers = new HashSet<String>();
    String state = "INIT"; // Initially it is in the INIT state.
    StringBuilder code = new StringBuilder();
    BufferedReader br = new BufferedReader(new FileReader(filename));
    String line;
    while ((line = br.readLine()) != null) {
        code = code.append(line + " ");
    } // read the text line by line.
    br.close();
    code = code.append('$'); // add a special symbol to indicate the end of file.
    int len = code.length();
    String token = "";
    for (int i = 0; i < len; i++) {
        char next_char = code.charAt(i); // look at the characters one by one
        if (state.equals("INIT")) {
            if (isLetter(next_char)) {
                state = "ID"; // go to the ID state
                token = token + next_char;
            } // ignore everything if it is not a letter
        } else if (state.equals("ID")) {
            if (isLetterOrDigit(next_char)) { // take letter or digit if it is in ID state
                token = token + next_char;
            } else { // end of ID state
                identifiers.add(token);
                token = "";
                state = "INIT";
            }
        }
    }
    return identifiers;
}
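To see what the two-state scan produces, the same INIT/ID logic can be run over an in-memory string instead of a file. This is only a minimal sketch: the class name IdentifierScanDemo and the method scan are hypothetical names introduced for illustration, and the ASCII-only letter and digit checks are an assumption about how isLetter and isLetterOrDigit behave.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class IdentifierScanDemo {
    // Assumed ASCII-only checks, standing in for the post's isLetter helpers.
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isLetterOrDigit(char c) {
        return isLetter(c) || (c >= '0' && c <= '9');
    }

    // Hypothetical helper: the INIT/ID scan applied to a String.
    static Set<String> scan(String text) {
        Set<String> identifiers = new LinkedHashSet<>(); // keeps first-seen order
        String code = text + "$"; // sentinel so the last token is flushed
        String state = "INIT";
        StringBuilder token = new StringBuilder();
        for (int i = 0; i < code.length(); i++) {
            char c = code.charAt(i);
            if (state.equals("INIT")) {
                if (isLetter(c)) { // a letter starts an identifier
                    state = "ID";
                    token.append(c);
                }
            } else { // state is "ID"
                if (isLetterOrDigit(c)) {
                    token.append(c);
                } else { // any other character ends the identifier
                    identifiers.add(token.toString());
                    token.setLength(0);
                    state = "INIT";
                }
            }
        }
        return identifiers;
    }

    public static void main(String[] args) {
        System.out.println(scan("sum = a1 + b2;")); // prints [sum, a1, b2]
    }
}
```

The trailing '$' matters: without a non-identifier character at the end, a token that runs to the last character of the input would never be added to the set.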

3.1 A11: coding from scratch

The first approach is to accomplish the task from scratch without using any tools. This approach also motivates the introduction of DFA in Assignment 2. Program A11.java is not supposed to use regular expressions: not the regex package, nor any methods involving regular expressions in the String class or other classes. Your program should use the most primitive method, i.e., look at the characters one by one, and write a loop to check whether they are quoted strings, identifiers, etc.

A simplified version of the algorithm is depicted in Algorithm 1. It gets a set of identifiers from an input string. The algorithm starts in the initial ("INIT") state and scans the characters one by one. Once it sees a letter, it goes to the "ID" state. In the "ID" state, it expects to see more letters or digits, until it sees a character other than a letter or digit. At this point, it exits the "ID" state and goes back to the initial state "INIT".

The algorithm needs to be expanded to deal with quoted strings and keywords. For quoted strings, you can remove them first before you pick the identifiers. For keywords, you can check whether a token belongs to the keyword set before adding it to the identifiers set.

Input: an input string.
Output: the set of identifiers in the input string.

state = "INIT";
token = "";
identifiers = {};
while (c = nextChar()) != end_of_string do
    if c is a letter then
        state = "ID";
        token = token + c;
    end
    else if state is "ID" then
        if c is a letter or digit then
            state = "ID";
            token = token + c;
        end
        else
            add token to identifiers;
            token = "";
            state = "INIT";
        end
    end
end

Algorithm 1: The algorithm for obtaining identifiers from an input string.

We provide the starter code for A11 as follows. You need to expand it to deal with quoted strings and keywords.

import java.io.FileReader;
import java.io.BufferedReader;
import java.util.Set;
import java.util.HashSet;

public class A11 {
    // check whether the char is a letter
    static boolean isLetter(int character) {
        return (character >= 'a' && character <= 'z') || (character >= 'A' && character <= 'Z');
    }

    // check whether the char is a letter or digit
    static boolean isLetterOrDigit(int character) {
        return isLetter(character) || (character >= '0' && character <= '9');
    }

    public static Set<String> getIdentifiers(String filename) throws Exception {
        String[] keywordsArray = { "IF", "WRITE", "READ", "RETURN", "BEGIN", "END", "MAIN", "INT", "REAL" };
        Set<String> keywords = new HashSet<String>();
        Set<String> identifiers = new HashSet<String>();
        for (String s : keywordsArray) {
            keywords.add(s);
        }
        String state = "INIT"; // Initially it is in the INIT state.
        StringBuilder code = new StringBuilder();
        BufferedReader br = new BufferedReader(new FileReader(filename));
        String line;
        while ((line = br.readLine()) != null) {
            code = code.append(line + " ");
        } // read the text line by line.
        br.close();
        code = code.append('$'); // add a special symbol to indicate the end of file.
        int len = code.length();
        String token = "";
        for (int i = 0; i < len; i++) {
            char next_char = code.charAt(i);
            if (state.equals("INIT")) {
                if (isLetter(next_char)) {
                    state = "ID"; // go to the ID state
                    token = token + next_char;
                } // ignore everything if it is not a letter
            } else if (state.equals("ID")) {
                if (isLetterOrDigit(next_char)) { // take letter or digit if it is in ID state
                    token = token + next_char;
                } else { // end of ID state
                    identifiers.add(token);
                    token = "";
                    state = "INIT";
                }
            }
        }
        return identifiers;
    }
}
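The two expansions the handout asks for (removing quoted strings first, then checking each token against the keyword set) could be sketched roughly as follows. This is not the official solution. The class name A11Sketch and the helper removeQuotedStrings are hypothetical, the scan takes a String rather than a filename to keep the example short, and it assumes that strings are delimited by double quotes with no escape sequences and that keywords are upper case only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class A11Sketch {
    // Assumed keyword list, taken from the starter code as far as it is legible.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "IF", "WRITE", "READ", "RETURN", "BEGIN", "END", "MAIN", "INT", "REAL"));

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isLetterOrDigit(char c) {
        return isLetter(c) || (c >= '0' && c <= '9');
    }

    // Pass 1 (hypothetical helper): copy the text character by character,
    // dropping everything between a pair of double quotes. No regex is used.
    static String removeQuotedStrings(String text) {
        StringBuilder out = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                inString = !inString; // entering or leaving a quoted string
                out.append(' ');      // keep the neighbouring tokens separated
            } else if (!inString) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Pass 2: the INIT/ID scan, with a keyword check before adding a token.
    static Set<String> getIdentifiers(String text) {
        String code = removeQuotedStrings(text) + "$"; // sentinel flushes the last token
        Set<String> identifiers = new TreeSet<>();     // sorted for stable output
        String state = "INIT";
        StringBuilder token = new StringBuilder();
        for (int i = 0; i < code.length(); i++) {
            char c = code.charAt(i);
            if (state.equals("INIT")) {
                if (isLetter(c)) {
                    state = "ID";
                    token.append(c);
                }
            } else if (isLetterOrDigit(c)) {
                token.append(c);
            } else {
                String t = token.toString();
                if (!KEYWORDS.contains(t)) { // keywords are not identifiers
                    identifiers.add(t);
                }
                token.setLength(0);
                state = "INIT";
            }
        }
        return identifiers;
    }

    public static void main(String[] args) {
        String program = "BEGIN WRITE(\"total is\"); total := x1 + 2; END";
        System.out.println(getIdentifiers(program)); // prints [total, x1]
    }
}
```

An alternative design would add a third "STRING" state to the same loop instead of a separate pass. The two-pass version is shown here only because it follows the handout's suggestion to remove quoted strings first.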