Question: # RegEx Parser (top-down) # # Grammar: (c is a terminal representing a single letter) # e -> alt # alt -> seq {'|' seq}

 # RegEx Parser (top-down) # # Grammar: (c is a terminal # RegEx Parser (top-down) # # Grammar: (c is a terminal representing a single letter) # e -> alt # alt -> seq {'|' seq} # seq -> rep {rep} # rep -> atom ['*'] # atom -> '(' e ')' | c # # Usage: linux> ./python3 regex.py 'RE string' # import sys

def regex(str): i = 0 # idx to input string

# lookahead the next char, return '$' if reaches the end def next(): if i

# match a char, advance the input idx def match(c): nonlocal i if str[i] == c: i += 1 else: raise Exception("expected " + c + " got " + str[i])

# alt -> seq {'|' seq} def alt(): seq() while next() == '|': match('|') seq() # seq -> rep {rep} def seq(): rep() while next() == '(' or next().isalpha(): rep()

# rep -> atom ['*'] def rep(): atom() if next() == '*': match('*') # atom -> '(' alt ')' | c def atom(): if next() == '(': match('(') alt() match(')') else: c = next() if not c.isalpha(): raise Exception("expected a letter, got " + c) match(c)

# parsing starts here # e -> alt alt() if i

if __name__ == "__main__": print(regex(sys.argv[1]))

3. [5 pts) The RE parser regex.py you worked on in this week's lab only checks whether an input string is a valid RE or not; it does not produce any output except for an acknowledgment True. A real parser always generates an AST. Your task is to augment the program, so that it produces an AST in the form of an S-expression. You should use the same AST as the one defined in the lab, i.e. with three nodes, alt, seq, rep. As a starting point, here is an augmented function for alt: # alt -> seq {'l' seq} def alt(): ast = seq() while next() match('|') ast = ['alt', ast, seq()] return ast == l': It now returns an AST in the nested list form, ['alt', ...]. 3. [5 pts) The RE parser regex.py you worked on in this week's lab only checks whether an input string is a valid RE or not; it does not produce any output except for an acknowledgment True. A real parser always generates an AST. Your task is to augment the program, so that it produces an AST in the form of an S-expression. You should use the same AST as the one defined in the lab, i.e. with three nodes, alt, seq, rep. As a starting point, here is an augmented function for alt: # alt -> seq {'l' seq} def alt(): ast = seq() while next() match('|') ast = ['alt', ast, seq()] return ast == l': It now returns an AST in the nested list form, ['alt', ...]

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!