Question: Your assignment is to use Java to write a recursive descent parser for a simplified HTML language. The lexical syntax is specified by regular expressions

Your assignment is to use Java to write a recursive descent parser for a simplified HTML language. The lexical syntax is specified by the following regular expression definitions:

    Token      Extended regular expression
    STRING     (LETTER | DIGIT)+
    KEYWORD    [tag alternatives missing in the source]

In the above, LETTER is any lower or upper-case letter and DIGIT is any digit. An arbitrary amount of whitespace can appear between tokens. You may already know that <b> ... </b> is the tag for bolded text in HTML, <i> ... </i> for italicized text, <ul> ... </ul> for an unordered list, and <li> ... </li> for a list item.

Note that in the above syntax we use a notation for non-terminals that is different from the notation we used in lectures. The reason is that the angle brackets < and > are terminals in this language, so they cannot be used as symbols for non-terminals. Instead, we use names in upper-case letters for non-terminals. Therefore, in the above token syntax, KEYWORD is a non-terminal, while a tag such as <b> is a string of terminals that starts with the terminal <.

Using the same notation, the syntax of the simplified HTML language is specified by the following E-BNF grammar, where WEBPAGE is the start non-terminal:

    WEBPAGE  -> TEXT
    TEXT     -> STRING | <b> TEXT </b> | <i> TEXT </i> | <ul> { LISTITEM } </ul>
    LISTITEM -> <li> TEXT </li>

Note that { and } are meta-symbols in E-BNF. An example expression in the language is as follows: google yahoo/x/i/>

This programming project is broken down into the following series of tasks.
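As a rough illustration of how the token rules and the grammar above map onto code, here is a minimal sketch of a tokenizer plus recursive descent recognizer. It rests on several assumptions that are not confirmed by the question text: the KEYWORD tokens are taken to be exactly the eight tags <b>, </b>, <i>, </i>, <ul>, </ul>, <li> and </li>; LETTER and DIGIT are taken to be ASCII only; WEBPAGE is treated as a single TEXT; and the names SimpleHtmlParser, tokenize and accepts are hypothetical. It only accepts or rejects input and builds no parse tree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SimpleHtmlParser {
    // ASSUMPTION: the KEYWORD set is exactly these eight tags.
    private static final Set<String> KEYWORDS =
        Set.of("<b>", "</b>", "<i>", "</i>", "<ul>", "</ul>", "<li>", "</li>");

    private final List<String> tokens;
    private int pos = 0;

    SimpleHtmlParser(List<String> tokens) { this.tokens = tokens; }

    // ASSUMPTION: LETTER and DIGIT are restricted to ASCII.
    private static boolean isLetterOrDigit(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    /** Splits the input into STRING and KEYWORD tokens, skipping whitespace. */
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (c == '<') {
                int end = src.indexOf('>', i);
                if (end < 0) throw new IllegalArgumentException("unterminated tag at " + i);
                i = end + 1;
                String tag = src.substring(start, i);
                if (!KEYWORDS.contains(tag)) throw new IllegalArgumentException("unknown tag " + tag);
                out.add(tag);
            } else if (isLetterOrDigit(c)) {
                // STRING -> (LETTER | DIGIT)+
                while (i < src.length() && isLetterOrDigit(src.charAt(i))) i++;
                out.add(src.substring(start, i));
            } else {
                throw new IllegalArgumentException("illegal character '" + c + "' at " + i);
            }
        }
        return out;
    }

    /** Returns the current token, or null at end of input. */
    private String peek() { return pos < tokens.size() ? tokens.get(pos) : null; }

    /** Consumes the given keyword or fails. */
    private void expect(String keyword) {
        if (!keyword.equals(peek()))
            throw new IllegalStateException("expected " + keyword + " but found " + peek());
        pos++;
    }

    // WEBPAGE -> TEXT   (then the input must be exhausted)
    void webpage() {
        text();
        if (peek() != null) throw new IllegalStateException("unexpected trailing token " + peek());
    }

    // TEXT -> STRING | <b> TEXT </b> | <i> TEXT </i> | <ul> { LISTITEM } </ul>
    void text() {
        String t = peek();
        if (t == null) throw new IllegalStateException("unexpected end of input");
        switch (t) {
            case "<b>": pos++; text(); expect("</b>"); break;
            case "<i>": pos++; text(); expect("</i>"); break;
            case "<ul>":
                pos++;
                while ("<li>".equals(peek())) listItem(); // { LISTITEM }: zero or more
                expect("</ul>");
                break;
            default:
                if (KEYWORDS.contains(t)) throw new IllegalStateException("unexpected " + t);
                pos++; // any non-keyword token is a STRING
        }
    }

    // LISTITEM -> <li> TEXT </li>
    void listItem() { expect("<li>"); text(); expect("</li>"); }

    /** Convenience wrapper: true if src is a sentence of the grammar. */
    static boolean accepts(String src) {
        try {
            new SimpleHtmlParser(tokenize(src)).webpage();
            return true;
        } catch (IllegalArgumentException | IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("<b><i> yahoo </i></b>"));                  // true
        System.out.println(accepts("<ul><li>a1</li><li><b>b2</b></li></ul>")); // true
        System.out.println(accepts("<b> oops </i>"));                          // false
    }
}
```

Each non-terminal gets one method, and the first token alone decides which alternative of TEXT to follow, which is what makes the grammar suitable for recursive descent without backtracking. The E-BNF repetition { LISTITEM } becomes a while loop that keeps going as long as the next token is <li>.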