Question: Please implement Step 2

Please implement Step 2.

In [1]:
    import nltk
    import re
    from nltk import pos_tag, word_tokenize, Tree
    from nltk.stem import WordNetLemmatizer

Regular expression practice: In this example, we show one regex pattern example for the Hearst pattern: NP such as {NP,}* {(or | and)} NP (https://docs.python.org/3/library/re.html)

In [2]:
    regex = r"(NP_\w+ (, )?such as (NP_\w+ ?(, )?(and |or )?)+)"
    test_str = "NP_1 such as NP_2, NP_3 and NP_4"
    matches = re.search(regex, test_str)
    if matches:
        # Match.group([group1, ...]) returns one or more subgroups of the match.
        # If there is a single argument, the result is a single string;
        # if there are multiple arguments, the result is a tuple with one item per argument.
        # Without arguments, group1 defaults to zero (the whole match is returned).
        print(matches.group())

NP_1 such as NP_2, NP_3 and NP_4

Step 1: Chunking Sentence. Note that the result is not the chunked NP; instead, it is the chunk tree structure.

In [3]:
    from nltk import ne_chunk

    def np_chunking(sentence):
        # your implementation
        result = ne_chunk(pos_tag(word_tokenize(sentence)))
        return result

    print(np_chunking("I like to listen to music from musical genres, such as blues, rock and jazz."))

(S I/PRP like/VBP to/TO listen/VB to/TO music/NN from/IN musical/JJ genres/NNS ,/, such/JJ as/IN blues/NNS ,/, rock/NN and/CC jazz/NN ./.)

Step 2: Prepare the chunked result for subsequent Hearst pattern matching. Traverse the chunked result; if the label is NP, merge all the words in this chunk and add the prefix NP_. All the tokens are separated with a white space (" "). Remember to lemmatize words using WordNetLemmatizer (from nltk.stem import WordNetLemmatizer).

In [4]:
    # prepare the chunked sentence by merging words and adding the prefix NP_
    def prepare_chunks(chunks):
        # If chunk is NP, start with NP_ and join tokens in chunk with _
        # Else just keep the token as it is
        terms = []
        for chunk in chunks:
            label = None
            try:
                # see if the chunk is simply a word or a NP. But non-NP fail on this method call
                label = chunk.label()
            except:
                pass
            # Based on the label, do processing, your implementation here....

In [5]:
    raw_text = "I like to listen to music from musical genres, such as blues, rock and jazz."
    chunk_res = np_chunking(raw_text)
    print(prepare_chunks(chunk_res))

I like to listen to NP_music from NP_musical_genre, such as NP_blue, NP_rock and NP_jazz
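
One possible way to fill in Step 2 is sketched below. It follows the skeleton from the question: each child of the chunk tree is either a labelled subtree or a plain (word, tag) pair, NP subtrees are merged into a single NP_-prefixed token joined with underscores, and everything else is kept as it is. The lemmatize parameter is a hypothetical addition (not part of the original signature) so the function can be tried without the WordNet data; when it is omitted, the sketch falls back to WordNetLemmatizer as the exercise asks. Handling of non-NP subtrees is also an assumption, since the question does not say what to do with them.

```python
def prepare_chunks(chunks, lemmatize=None):
    """Flatten a chunk tree into one string, merging NP chunks into NP_a_b tokens.

    `lemmatize` is a hypothetical hook (an assumption, not in the original
    skeleton): any callable mapping a word to its lemma.
    """
    if lemmatize is None:
        # Default requested by the exercise; needs NLTK and its WordNet data.
        from nltk.stem import WordNetLemmatizer
        lemmatize = WordNetLemmatizer().lemmatize

    terms = []
    for chunk in chunks:
        label = None
        try:
            # Subtrees expose label(); plain (word, tag) tuples do not.
            label = chunk.label()
        except AttributeError:
            pass

        if label == "NP":
            # Lemmatize every word in the noun phrase and join with "_".
            words = [lemmatize(word) for word, _tag in chunk.leaves()]
            terms.append("NP_" + "_".join(words))
        elif label is not None:
            # Assumption: other labelled subtrees keep their words unchanged.
            terms.extend(word for word, _tag in chunk.leaves())
        else:
            # A plain (word, tag) pair: keep the token as it is.
            terms.append(chunk[0])

    # All tokens are separated with a single white space.
    return " ".join(terms)
```

With NLTK and its data installed, prepare_chunks(np_chunking(raw_text)) can be called directly, provided the tree actually contains subtrees labelled NP. Note that ne_chunk, as used in Step 1, labels named entities rather than noun phrases; a grammar-based chunker such as nltk.RegexpParser("NP: {<JJ>*<NN.*>+}") applied to the POS-tagged tokens is one way to obtain NP subtrees, and that grammar is only an illustrative assumption. Because tokens are joined with single spaces, punctuation spacing may differ slightly from the sample output in the question.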
