Question: Fix this code so that it gives correct output

Fix this code so that it gives correct output:

Conduct punctuation removal, stop word removal, casefolding, lemmatization, and stemming on the documents.

import pandas as pd
import nltk
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
import re
from nltk.stem import PorterStemmer
from nltk.stem import WordNetLemmatizer
nltk.download("wordnet")

sentences=[
    "Can we go to Disney??!!!!!! Let's go on a plane!",
    "The New England Patriots won the Super Bowl..",
    "I HATE going to school so early",
    "When will I be considered an adult?",
    "I want to go to A&M, Baylor, or the University of Texas.",
]

#remove punctuation and stop words using nltk
tokens=[]
stop_words=[]
tokenizer = RegexpTokenizer(r'w+')
print("sentences after punctuation removal are :")
print(" ")
for i in range(len(sentences)):
    tokens.append(tokenizer.tokenize(sentences[i]))
    print(" ".join(list(tokens[i])))
print(" ")

print("sentences after stop word removal are :")
print(" ")
for i in range(len(sentences)):
    stop_words.append([w for w in tokens[i] if not w in stopwords.words('english')])
    print(" ".join(list(stop_words[i])))
print(" ")

#casefold string
print("sentences after casefold are :")
for i in range(len(stop_words)):
    for j in range(len(stop_words[i])):
        stop_words[i][j]=stop_words[i][j].casefold()
    print(" ".join(list(stop_words[i])))
print(" ")

print("lemmatization:")
#lemmatization of words
lemmatizer = WordNetLemmatizer()
for i in range(len(stop_words)):
    for j in range(len(stop_words[i])):
        print(stop_words[i][j],":",lemmatizer.lemmatize(stop_words[i][j]))
        stop_words[i][j]=lemmatizer.lemmatize(stop_words[i][j])

#stemming the documents
print(" ")
print("steming:")
ps = PorterStemmer()
for i in range(len(stop_words)):
    for j in range(len(stop_words[i])):
        print(stop_words[i][j],":",ps.stem(stop_words[i][j]))
        stop_words[i][j]=ps.stem(stop_words[i][j])

print(" ")
print("final output:")
#final output after completing above operations
for i in range(len(stop_words)):
    print(" ".join(list(stop_words[i])))
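Two likely causes of the incorrect output can be sketched as follows. This is a minimal illustration under stated assumptions, not a verified full answer. First, the pattern `r'w+'` matches only runs of the literal letter `w`; the intended pattern is probably `r'\w+'`, which matches runs of word characters and so drops punctuation. Second, NLTK's English stop word list is lowercase, so filtering before casefolding leaves capitalized words such as "The" or "I" in place. To stay runnable without NLTK downloads, the sketch uses only the standard library: `re.findall` stands in for `RegexpTokenizer.tokenize`, and `SAMPLE_STOP_WORDS` is a small hypothetical stand-in for `stopwords.words('english')`. The helper names `tokenize` and `remove_stop_words` are also illustrative.

```python
import re

# Hypothetical stand-in for stopwords.words('english'); the real NLTK list is longer.
SAMPLE_STOP_WORDS = {"the", "i", "to", "so", "we", "can", "on", "a", "s"}


def tokenize(sentence):
    """Split on runs of word characters, which drops punctuation."""
    return re.findall(r"\w+", sentence)


def remove_stop_words(tokens, stop_words=SAMPLE_STOP_WORDS):
    """Casefold first, then filter, so capitalized stop words are caught."""
    folded = [t.casefold() for t in tokens]
    return [t for t in folded if t not in stop_words]


if __name__ == "__main__":
    sentence = "The New England Patriots won the Super Bowl.."
    # Pattern from the question: matches only the literal letter 'w'.
    print(re.findall(r"w+", sentence))
    # Corrected pattern: whole words, punctuation removed.
    print(tokenize(sentence))
    # Casefolded and filtered against the sample stop word set.
    print(remove_stop_words(tokenize(sentence)))
```

With these two changes, the lemmatization and stemming loops from the question could run on the filtered tokens unchanged. When using the real NLTK list, a call to `nltk.download("stopwords")` may also be needed alongside the existing `nltk.download("wordnet")`.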
