Question: Write a Python script named assignment3.py that imports the file variables.py. This file has variables named moby_dock and stop_words. The moby_dock variable is a string that contains the text of Moby-Dock by Herman Melville from Project Gutenberg. The stop_words variable contains a set of strings. The goal of this assignment is to report the ten most frequent words in Moby-Dock that are not stop words. The report should include the frequency of each word.
The words should be normalized before they are counted. For this assignment, that means removing all punctuation and converting every word to lower case. For example, the words "Don't" and "don't" are both normalized to the word "dont". For reference, the most frequent non-stop word after normalization is "whale", with a frequency of 907.
The Python string module has a variable that contains a string of punctuation characters.
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
There are many Python string methods that you can read about in the official documentation: https://docs.python.org/release/3.6.0/library/stdtypes.html#text-sequence-type-str
For this assignment, you may find the following string methods useful: lower, replace, and split.
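The normalization step could be sketched as follows, using only lower, replace, and split. The function name normalize and the sample sentence are illustrative assumptions, not part of the assignment. The sketch deletes each punctuation character outright, so a hyphenated word such as "whale-ship" would become "whaleship"; it also handles only the ASCII characters in string.punctuation.

```python
import string


def normalize(text):
    """Return text in lower case with every punctuation character removed.

    Hypothetical helper: the name and signature are not given by the assignment.
    """
    text = text.lower()
    # Delete each ASCII punctuation character listed in string.punctuation.
    for ch in string.punctuation:
        text = text.replace(ch, "")
    return text


# split() with no arguments breaks the string on any run of whitespace.
words = normalize("Don't stop, WHALE!").split()
print(words)  # ['dont', 'stop', 'whale']
```

In assignment3.py, the same call would presumably be applied to variables.moby_dock instead of the sample sentence.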
The built-in sorted function can be used to sort a dictionary's keys in descending order of their values. This is useful if you are using a dictionary to map words to counts.
>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> sorted(d, key=d.__getitem__, reverse=True)
['c', 'b', 'a']
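One way the counting and ranking might fit together is sketched below. The function name top_words, its parameters, and the sample word list are assumptions made for illustration; the assignment does not prescribe them. The sketch builds a dictionary of counts while skipping stop words, then uses the sorted call shown above to rank the keys.

```python
def top_words(words, stop_words, n=10):
    """Return the n most frequent non-stop words as (word, count) pairs.

    Hypothetical helper: 'words' is a list of already-normalized words and
    'stop_words' is a set of strings to ignore.
    """
    counts = {}
    for word in words:
        if word not in stop_words:
            # get() returns 0 the first time a word is seen.
            counts[word] = counts.get(word, 0) + 1
    # Sort the keys by their counts, largest first.
    ranked = sorted(counts, key=counts.__getitem__, reverse=True)
    return [(word, counts[word]) for word in ranked[:n]]


sample = "the whale the sea the whale ship sea whale".split()
for word, freq in top_words(sample, {"the"}, n=2):
    print(word, freq)
# whale 3
# sea 2
```

In assignment3.py, the word list would presumably come from normalizing and splitting variables.moby_dock, with variables.stop_words passed as the second argument.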
* Note: in the word "dock", replace the "o" with an "i"; it refers to the book's title, and the file variables.py contains the text of the book.
