
The instructions are as follows:
Read Syntax Trees from File: Read all the syntax trees from the provided file train.trees.pre.unk into a list. Use the same code as in Step 02.
Convert Trees to CNF: Convert each tree into Chomsky Normal Form with the Tree.chomsky_normal_form method (from the nltk library) and the horzMarkov=2 parameter.
Extract Productions:
Having converted the trees to Chomsky Normal Form, our next step is to extract the grammatical productions from these CNF trees. Grammatical productions are essentially the rules that define how sentences in a language can be constructed and are pivotal for understanding the structure and syntax of the language.
To accomplish this, create a list named productions. This list will store all the production rules extracted from each tree. Traverse through each of your CNF trees, and for every tree, use the productions() method. This method will break down the tree into its constituent production rules. Accumulate these rules in the productions list, which will later be used for constructing our grammar.
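As a rough illustration of the three steps above, here is a minimal sketch that runs them on two made-up bracketed trees rather than on the real file. The example sentences and the variable name lines are assumptions for illustration only; Tree.fromstring, chomsky_normal_form and productions are the nltk calls the instructions refer to.

```python
from nltk.tree import Tree

# Two invented bracketed trees standing in for lines of train.trees.pre.unk
lines = [
    "(S (NP (DT the) (NN dog)) (VP (VBD saw) (NP (DT a) (NN cat)) (PP (IN in) (NP (DT the) (NN park)))))",
    "(S (NP (PRP she)) (VP (VBZ sleeps)))",
]

# Step 1: one Tree object per line, kept in an ordinary Python list
trees = [Tree.fromstring(line) for line in lines]

# Step 2: binarize each tree; the method changes the tree in place and returns None
for tree in trees:
    tree.chomsky_normal_form(horzMarkov=2)

# Step 3: gather the production rules of every tree into one flat list
productions = []
for tree in trees:
    productions.extend(tree.productions())

for production in productions:
    print(production)
```

Because chomsky_normal_form works in place, the trees list itself holds the converted trees afterwards, and no second list is needed.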
I have found that the chomsky_normal_form and productions methods don't work on lists, so I have been appending each tree to one big tree instead. But now I can't get the productions method to work at all.
The current state of my code is below:
from nltk.tree import chomsky_normal_form

with open('train.trees.pre1stline.unk', 'r') as first_file:  # read the first line of the file
    first_line = first_file.readline().strip()
    big_tree = Tree.fromstring(first_line)

with open('train.trees.pre.unk', 'r') as file:  # read lines 2+ and append to the original tree
    for line in file:
        line = file.readline().strip()
        tree = Tree.fromstring(line)
        big_tree.append(tree)
    #big_tree.append(tree) # <---- something like this, define big_tree as empty and then add the subtree
    #trees = [Tree.fromstring(line.strip()) for line in file]

tree_list = []
for tree in big_tree:
    Tree.chomsky_normal_form(tree, horzMarkov=2, vertMarkov=1)
    tree_for_list = [str(tree)]
    #tree_for_list = Tree.fromstring()
    tree_list.append(tree_for_list)
print(tree_list)

productions = []
#for each_tree in tree_list:
    #each_tree_t = Tree.fromstring(str(each_tree))
    #productions.extend(each_tree_t.productions())
    #each_tree_prod = each_tree.productions()
    #productions.append(each_tree_prod)
#for production in productions:
    #print(productions)
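A few things in the code above look likely to cause the trouble. chomsky_normal_form and productions are methods of a single Tree, so the usual pattern is a plain list of Tree objects and a loop over it, not one big tree. Tree is never imported. Calling file.readline() inside "for line in file" overwrites the loop variable with the following line, so every other tree appears to be skipped. Storing [str(tree)] puts strings into tree_list, and strings have no productions() method. Finally, vertMarkov=1 adds parent annotation that the instructions do not ask for (the default is 0). A minimal sketch of one possible fix follows; the helper names read_trees and collect_productions are made up for this example, and it assumes the file holds one bracketed tree per line.

```python
from nltk.tree import Tree


def read_trees(path):
    """Parse one bracketed tree per non-empty line of the file at `path`."""
    trees = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:        # iterate only; no extra readline() call
            line = line.strip()
            if line:               # ignore blank lines
                trees.append(Tree.fromstring(line))
    return trees


def collect_productions(trees, horz_markov=2):
    """Binarize each tree in place, then gather its production rules."""
    productions = []
    for tree in trees:
        # Modifies the tree itself and returns None
        tree.chomsky_normal_form(horzMarkov=horz_markov)
        # Call productions() on the Tree object, not on str(tree)
        productions.extend(tree.productions())
    return productions
```

With these helpers, something like productions = collect_productions(read_trees('train.trees.pre.unk')) should yield the flat list the assignment describes, without the separate first-line file.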
