Question: Implement the CYK Parser in Java (see algorithm below) for the context-free grammar and lexicon showed below: Algorithm: We assume an enumerated class NonTerms with

Implement the CYK Parser in Java (see algorithm below) for the context-free grammar and lexicon showed below:

Algorithm:

We assume an enumerated class NonTerms with all the non-terminals, both phrase structure symbols and parts of speech.

A node of the parse tree is an object with the following data fields (in Java-esque notation)

class Tree { NonTerm phrase % The Non-terminal int startPhrase, int endPhrase; % indices of starting and ending word String word; % If a leaf, then the word Tree left; Tree right; double prob; }

The fields left, right, and prob are the left-hand child, right-hand child and probability in the best parse found so far for this particular non-terminal from this start to this end (e.g. the best parse for an NP from word 3 to word 6). Note that the probability on a node is _not_ the probability of this parse tree given the phrase, it is the probability of the parse tree including the choice of word given the non-terminal at the root. However for any particular non-terminal and choice of sentence those two probabilities are proportional, by Bayes' rule, which we will come to later in the course.

The main data structure in the procedure is a chart which is an array P[M,I,J]. M is indexed on the non-terminal, I and J go from 1 to N (the length of the sentence). P[M,I,J] is the node with NonTerm==M, startPhrase==I, and endPhrase==J. Note that this uses 1-based indexing, like the textbook.

function CYK-PARSE(sentence,grammar) return P, a chart. { 1. N = length(sentence); 2. for (i = 1 to N) { 3. word = sentence[i]; 4. for (each rule "POS --> Word [prob]" in the grammar) 5. P[POS,i,i] = new Tree(POS,i,i,word,null,null,prob); 6. } % endfor line 2. 7. for (length = 2 to N) % length = length of phrase 8. for (i = 1 to N+1-length) { % i == start of phrase 9. j = i+length-1; % j == end of phrase 10. for (each NonTerm M) { 11. P[M,i,j] = new Tree(M,i,j,null,null,null,0.0); 12. for (k = i to j-1) % k = end of first subphrase 13. for (each rule "M -> Y,Z [prob]" in the grammar) { 14. newProb = P[Y,i,k].prob * P[Z,k+1,j].prob * prob; 15. if (newProb > P[M,i,j].prob) { 16. P[M,i,j].left = P[Y,i,k]; 17. P[M,i,j].right = P[Z,k+1,j]; 18. P[M,i,j].prob = newProb; 19. } % endif line 15 20. } % endfor line 13 21. } % endfor line 10 22. } % endfor line 8 23. return P; 24. } % end CYK-PARSE.

The value returned in P["S",1,N] is the root of the most probable parse.

The most probable parse tree can be printed out (not very elegantly) using pre-order search.

printTree(P) { % P is the chart returned by CYK-PARSE above. printTree1(P[S,1,N],0) % S is the top-level nonterminal in the grammar. } % N is the length of the sentence. printTree1(Tree tree, int INDENT) { if (tree != null) { print out INDENT spaces; print out tree.phrase if (tree.word != null) print out tree.word; printTree(tree.left, INDENT+3); printTree(tree.right, INDENT+3); }

Grammar

S -> Noun Verb [0.2] S -> Noun VerbAndObject [0.3] S -> Noun VPWithPPList [0.1] S -> NP Verb [0.2] S -> NP VerbAndObject [0.1] S -> NP VPWithPPList [0.1] NP -> Noun PP [0.8] NP -> Noun PPList [0.2] PP -> Prep Noun [0.6] PP -> Prep NP [0.4] PPList -> PP PP [0.6] PPList -> PP PPList [0.4] VerbAndObject -> Verb Noun [0.5] VerbAndObject -> Verb NP [0.5] VPWithPPList -> Verb PP [0.3] VPWithPPList -> Verb PPList [0.1] VPWithPPList -> VerbAndObject PP [0.4] VPWithPPList -> VerbAndObject PPList [0.2]

Noun -> amy [0.1] Noun -> dinner [0.2] Noun -> fish [0.2] Noun -> streams [0.1] Noun -> swim [0.2] Noun -> tuesday [0.2] Prep -> for [0.5] Prep -> in [0.3] Prep -> on [0.2] Verb -> ate [0.7] Verb -> streams [0.1] Verb -> swim [0.2]

The program should read a single sentence and print out the most probable parse tree with its probability. For instance given the input "Fish swim in streams" the program should print out

S Noun fish VPWithPPList Verb swim PP Prep in Noun streams Probability = 0.0000216 (2.16e-5).

If the input sentence cannot be parsed as a sentence e.g. "toy in" then the program should print out "This sentence cannot be parsed".

You may assume that the sentence is correctly formatted; that is, it consists of words in the lexicon in lower case separated by spaces.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

ttth Suppose that the sequence of bags {Bn | n N} is recursively enumerated by the computable function e(n, x) = fn(x), [7 marks] Hence prove that the set of all recursive bags cannot be recursively...

"Fortran, Algol and Lisp invented most programming language concepts 50 years ago; adding the concept of object-orientation suffices to explain all programming languages to date". To what extent is...

Write an alternative definition that is tail-recursive (iterative) and makes use of accumulator variables. [10 marks] Explain why your alternative definition executes more efficiently. [3 marks] 1...

Please download the parser template. 1. Copy the changes to the scanner template you made in the previous quiz into the parser template. Note that the scanner is now called from the parser, which is...

I am having difficulty with this question which is part two of two larger questions. The scanner template I edited and the parser template I will paste below. Parser Template: import re import sys...

Implement the un-implemented functions to make a working parser. An example of a parse from a properly working program would be: Enter expression: bcac S 2 AB 3 bAB 4 bcB 5 bcaA 4 bcac bcac | $ '''...

design and implement a syntax analyzer for the language specified by the grammar specified on the next page. Deliverables: Documentation 1 . Transformed Grammar into LL ( 1 ) : 1 . 1 . Remove all the...

pls help Final Assignment (due on Jan.10, 2021, 18:00) 1. Please design and implement a DFA, which can recognize the regular expression 1*(00|011")"00. Requirements: (1) Please submit the source code...

You are asked to implement 1. A simple shell that supports Linux shell commands in addition to the internal commands; 2. A multi-threading simple application that uses pthread library. You are...

''' This program implements a recursive descent parser for the CFG below: 1. S aA 2. S AB 3. A bA 4. A c 5. B aA 6. B b ''' import math class ParseError(Exception): pass w = '' # the input string i =...

Mortonsen Corporation has collected the following information after its first year of sales. Net sales were $2,000,000 on 100,000 units; selling expenses $400,000 (30% variable and 70% fixed); direct...

Hanson Inc. has the following direct material standard to manufacture one Widget: 1.5 pounds per Widget at $4.00 per pound Last week, 1,900 pounds of material were purchased and 1,650 pounds of...

10. Restful Sleep Mattress company is planning to expand into selling bedroom furniture. This expansion will require a loan from the bank. The bank has requested financial information. In a group,...

Seved Help 14 Wisconsin Snowmobile Corp. is considering a switch to level production Cost efficiencies would occur under level production, and aftertax costs would decline by $31,500, but inventory...

4. Move around while on a videoconference to improve interpersonal communication; sitting rigid will result in loss of persuasiveness.

2. Understand the cultural differences that may be experienced among team members.

Have roles been defined and assigned?