Question: Your goal here is to modify the pre-processing in load_book one more time! Make a small modification to the input: load_book(book_id, pos = True, lemma

Your goal here is to modify the pre-processing in load_book one more time! Make a small modification to the input, load_book(book_id, pos = True, lemma = True), so that it accepts two boolean arguments, pos and lemma, specifying how to identify each word as a key term. In particular, each word will now be represented in both the document and the index as a tuple, heading = (text, tag), where text contains the word.text attribute from spaCy if lemma = False and the word.lemma_ attribute if lemma = True. Similarly, tag should be left empty as "" if pos = False and should otherwise contain word.pos_.

Note that this function's output should still consist of a document and an index in the same format, aside from the replacement of word with heading. This allows the output to be used in fast_kwic in the same way as before, with search terms now further specified by these textual features.
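The rule for building heading can be sketched on its own, without loading a book. This is a minimal illustration, not the assignment's reference solution: make_heading and FakeToken are hypothetical names introduced here, and FakeToken only mimics the three spaCy token attributes the task mentions (text, lemma_, pos_).

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy token; it only carries the three
# attributes named in the task (text, lemma_, pos_).
FakeToken = namedtuple("FakeToken", ["text", "lemma_", "pos_"])


def make_heading(word, pos=True, lemma=True):
    """Return the (text, tag) tuple described in the task (hypothetical helper)."""
    text = word.lemma_ if lemma else word.text  # lemma=True -> use the lemma
    tag = word.pos_ if pos else ""              # pos=False -> empty tag
    return (text, tag)


token = FakeToken(text="was", lemma_="be", pos_="AUX")

# Build the heading for every combination of the two flags.
headings = {
    (use_pos, use_lemma): make_heading(token, pos=use_pos, lemma=use_lemma)
    for use_pos in (True, False)
    for use_lemma in (True, False)
}

for flags, heading in headings.items():
    print(flags, heading)
```

With both flags on, the key is the lemma plus its part-of-speech tag; with both off, it is the surface text plus an empty string, so the tuple shape stays the same in every case.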

My code:

def load_book(book_id, pos = True, lemma = True):
    string_id = str(book_id)
    book_file = open("./data/books/"+string_id+".txt", "r")
    booktext = book_file.read().strip()
    paragraphs = re.split(' {2,}' , booktext)
    for i, paragraph in enumerate(paragraphs):
        doc = nlp(paragraph)
        paragraph_components = []
        for j, sentence in enumerate(doc.sents):
            sentence_components = []
            for k, word in enumerate(sentence):
                if lemma==False:
                    text_output = word.text
                else:
                    text_output = word.lemma_
                if pos==False:
                    tag_output = ""
                else:
                    tag_output = word.pos_
                heading = (text_output, tag_output)
                sentence_components.append(word.text)
                index[heading].append([i, j, k])
            paragraph_components.append(sentence_components)
        document.append(paragraph_components)
    return document, index


TypeError: sequence item 0: expected str instance, tuple found

# B6:SanityCheck
print("Sentence with ('cold', 'ADJ') :")
" ".join(fast_kwic(document, index, search_terms = {('cold', 'ADJ')})[('cold', 'ADJ')][0][1])

Sentence with ('cold', 'ADJ') :

IndexError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_24432/299715389.py in <module>
      1 # B6:SanityCheck
      2 print("Sentence with ('cold', 'ADJ') :")
----> 3 " ".join(fast_kwic(document, index, search_terms = {('cold', 'ADJ')})[('cold', 'ADJ')][0][1])

~\AppData\Local\Temp/ipykernel_24432/3068648249.py in fast_kwic(document, index, search_terms)
    l = index[i]
    for j in range(len(l)):
        lst1 = [l[j], document[l[j][0]][l[j][1]]]
        data[i].append(lst1)

IndexError: list index out of range
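Both errors can be reproduced and avoided on toy data. The sketch below is one possible reading, not a confirmed diagnosis. It assumes the TypeError comes from calling " ".join on a sentence that holds (text, tag) tuples, and that the IndexError may come from document and index falling out of sync, for example because the posted function never creates them itself. build_document_and_index and sentence_to_string are hypothetical names, and plain (text, lemma, pos) triples stand in for spaCy tokens.

```python
from collections import defaultdict


def build_document_and_index(paragraphs, pos=True, lemma=True):
    """Hypothetical stand-in for load_book's inner loops.

    paragraphs: list of paragraphs, each a list of sentences, each a list
    of (text, lemma, pos) triples playing the role of spaCy tokens.
    """
    document = []              # created fresh on every call
    index = defaultdict(list)  # created fresh on every call
    for i, paragraph in enumerate(paragraphs):
        paragraph_components = []
        for j, sentence in enumerate(paragraph):
            sentence_components = []
            for k, (text, lemma_, pos_) in enumerate(sentence):
                heading = (lemma_ if lemma else text, pos_ if pos else "")
                # Store the heading (not word.text) so that the document
                # and the index use the same representation.
                sentence_components.append(heading)
                index[heading].append([i, j, k])
            paragraph_components.append(sentence_components)
        document.append(paragraph_components)
    return document, index


def sentence_to_string(sentence):
    # str.join needs strings, so keep only the text part of each heading.
    return " ".join(text for text, _tag in sentence)


toy_book = [[[("The", "the", "DET"), ("night", "night", "NOUN"),
              ("was", "be", "AUX"), ("cold", "cold", "ADJ")]]]

document, index = build_document_and_index(toy_book)
i, j, k = index[("cold", "ADJ")][0]

# Joining the raw tuples reproduces the TypeError from the question.
try:
    " ".join(document[i][j])
    join_error = None
except TypeError as err:
    join_error = str(err)

sentence = sentence_to_string(document[i][j])
print(join_error)
print(sentence)
```

Because document and index are built together inside the function, every [i, j, k] stored in the index points at a position that exists in the returned document. Whether that removes the IndexError in the original notebook depends on how fast_kwic and the surrounding cells are written, which the post does not show in full.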
