Assignment # 13: Finding and Counting Unique Words
I have uploaded a file named tale4653.txt to Blackboard.
It contains the first four chapters of A Tale of Two Cities, by Charles Dickens.
As the file name implies, the file contains 4653 words. It will take a bit of time to process.
If you wish to work on something smaller to start with, create and save a file containing the well-known pangram "The quick brown fox jumps over the lazy dog."
We wish to determine how many unique words there are in this file as an estimate of the richness of Dickens's writing vocabulary.
Dividing the number of unique words by the total number of words and multiplying by 100 yields the percentage of unique words in the file.
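As a quick check of the formula, the pangram mentioned above has 9 words, of which 8 are unique once "The" and "the" are treated as the same word. The sketch below assumes Python 3; the variable names are illustrative only, and it uses a set purely to keep the arithmetic short (the assignment itself asks for a dictionary).

```python
# Worked example of the percentage formula, using the pangram as input.
sentence = "The quick brown fox jumps over the lazy dog"
words = sentence.lower().split()        # 9 words; "The" and "the" now match
unique_words = set(words)               # 8 distinct words
percent_unique = len(unique_words) / len(words) * 100
print(f"{percent_unique:.1f}% unique")  # 88.9% unique
```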
Use a dictionary to determine how many unique words are contained in the file.
Most of the code you need is in this week's notes.
Use .lower() so that capitalized words are not counted separately from their lowercase forms.
Use .translate(None, '!@#$%^&*()-|:;",.?/_|~`[]{}+=') to remove punctuation, except apostrophes.
After creating your dictionary, count your keys (the unique words).
Calculate and print the percentage of unique words in the file.
Remember to use good programming style and formatted output.
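The steps above might be sketched as follows. This is one possible approach, not the course's reference solution. It assumes Python 3, where the two-argument call .translate(None, chars) raises a TypeError, so a table built with str.maketrans is used instead. The function names count_words, percent_unique, and report are hypothetical.

```python
# Characters to strip, copied from the assignment; the apostrophe is left out on purpose.
PUNCTUATION = '!@#$%^&*()-|:;",.?/_|~`[]{}+='


def count_words(text):
    """Return a dict mapping each lowercased, punctuation-free word to its count."""
    # Python 3 replacement for .translate(None, chars): delete every listed character.
    table = str.maketrans('', '', PUNCTUATION)
    counts = {}
    for word in text.lower().translate(table).split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def percent_unique(counts):
    """Unique words divided by total words, times 100."""
    total = sum(counts.values())
    return len(counts) / total * 100 if total else 0.0


def report(filename):
    """Read a text file and print formatted word statistics."""
    with open(filename) as infile:
        counts = count_words(infile.read())
    print(f"Total words:  {sum(counts.values()):>6}")
    print(f"Unique words: {len(counts):>6}")
    print(f"Percentage of unique words: {percent_unique(counts):.2f}%")


# Small check with the pangram: 9 words, 8 of them unique.
demo = count_words("The quick brown fox jumps over the lazy dog.")
print(len(demo), f"{percent_unique(demo):.1f}%")  # 8 88.9%
```

To process the assignment file, call report("tale4653.txt") with the file saved in the working directory. Because the hyphen is in the removal list, hyphenated words are joined into a single word rather than split.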
