Question: Python code. I need help with programming the highlighted instructions, or at least how it should be done to get the same result the instructions ask for.

Python code

I need help with programming the highlighted instructions, or at least an explanation of how it should be done to get the same result the instructions ask for. I am getting a different result.

Thanks in advance.

Objective: Use Python and NLTK to process text.

Turn in: Your Python .py file

Instructions:

1. Read in moby_dick.txt, which you can download from ...
2. Process the text:
   a. use a string function to replace all occurrences of ...
   b. use regex to remove all digits
   c. use regex to replace punctuation with a single space
3. Tokenize the text and print the number of tokens. Format the number with commas.
4. Create a set of unique tokens and print the number of unique tokens, formatted as above.
5. Calculate and print the lexical diversity, which is the number of unique tokens divided by the number of tokens. Format the number for floating point.
6. Create a list of important words by removing stop words from the unique tokens list. Display the number of important words, formatted.
7. Using the list of important words, create a list of tuples of the word and stemmed word, like this: [('remarkably', 'remark'), ('prevented', 'prevent'), ...]
8. Create a dictionary where the key is the stem and the value is a list of words with that stem, like this: 'achiev': ['achieved', 'achieve'], 'accident': ['accidentally', 'accidental'], ...
9. Print the number of dictionary entries.
10. For the 25 dictionary entries with the longest lists, print the stem and its list. Hint on sorting dict by length of values: for k in sorted(stem_dict, key=lambda k: len(stem_dict[k]), reverse=True):
11. Perform POS tagging on the words in the top 25 dictionary entries.
12. Create a dictionary of POS counts where the key is the POS and the count is the number of these words with that POS. Print the dictionary.
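One possible way to approach the steps above is sketched below. It is not the instructor's reference solution, so the exact counts may still differ from the expected output. Several things are assumptions: the substring replaced in step 2a is truncated in the instructions, so `old="--"` and `new=" "` are placeholders; the function names (`clean_text`, `token_stats`, `build_stem_dict`, `top_entries`, `pos_counts`, `run_with_nltk`) are made up for illustration; and the choice of `nltk.word_tokenize`, the English stop word list, and `PorterStemmer` is a guess, since the instructions do not name a tokenizer, stop word list, or stemmer. The instructions also do not say whether to lowercase the text, which is a common reason for counts that do not match.

```python
import re
import string
from collections import Counter, defaultdict


def clean_text(raw, old="--", new=" "):
    """Steps 2a-2c. `old`/`new` are placeholders: the source truncates step 2a."""
    text = raw.replace(old, new)                      # 2a: plain string function
    text = re.sub(r"\d+", "", text)                   # 2b: remove all digits
    punct = "[" + re.escape(string.punctuation) + "]+"
    return re.sub(punct, " ", text)                   # 2c: each punctuation run -> one space


def token_stats(tokens):
    """Steps 3-5: print token count, unique count, and lexical diversity."""
    unique = set(tokens)
    diversity = len(unique) / len(tokens)
    print(f"Tokens: {len(tokens):,}")                 # comma-separated thousands
    print(f"Unique tokens: {len(unique):,}")
    print(f"Lexical diversity: {diversity:.2f}")      # floating-point format
    return unique, diversity


def build_stem_dict(words, stem):
    """Steps 7-8: (word, stem) tuples, then a dict of stem -> list of words."""
    # Sorting is an addition here so the output order is repeatable.
    pairs = [(w, stem(w)) for w in sorted(words)]
    stem_dict = defaultdict(list)
    for word, s in pairs:
        stem_dict[s].append(word)
    return pairs, dict(stem_dict)


def top_entries(stem_dict, n=25):
    """Step 10: the n stems with the longest word lists (uses the hint's sort)."""
    keys = sorted(stem_dict, key=lambda k: len(stem_dict[k]), reverse=True)
    return [(k, stem_dict[k]) for k in keys[:n]]


def pos_counts(tagged):
    """Step 12: count the tags in a list of (word, tag) pairs."""
    return dict(Counter(tag for _, tag in tagged))


def run_with_nltk(path="moby_dick.txt"):
    """Wires steps 1-12 together. Needs NLTK plus its tokenizer, stop word,
    and tagger data (installed via nltk.download; names vary by NLTK version)."""
    import nltk
    from nltk.corpus import stopwords
    from nltk.stem import PorterStemmer

    with open(path, encoding="utf-8") as f:           # step 1
        text = clean_text(f.read())
    tokens = nltk.word_tokenize(text)                 # step 3
    unique, _ = token_stats(tokens)                   # steps 3-5
    stop_words = set(stopwords.words("english"))
    important = [w for w in unique if w not in stop_words]    # step 6
    print(f"Important words: {len(important):,}")
    _, stem_dict = build_stem_dict(important, PorterStemmer().stem)
    print(f"Dictionary entries: {len(stem_dict):,}")  # step 9
    top = top_entries(stem_dict)
    for s, words in top:                              # step 10
        print(s, words)
    top_words = [w for _, words in top for w in words]
    print(pos_counts(nltk.pos_tag(top_words)))        # steps 11-12
```

With NLTK and its data installed, calling `run_with_nltk("moby_dick.txt")` runs the whole pipeline. The helper functions take the tokens and the stemming function as arguments, so each step can be checked on a short sample string before running it on the full novel.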

