Question: Book Analysis (we are using Python)

1. Book Analysis

In this lab, we will parse text from a file and run some basic analysis on the content of the file. To parse the file, we will implement the following functions in the module book_analysis.py:

1. read_file(filename)
Given a filename, the function should read the text from the file and return a list of the lines of text in the file.
Input: filename (string type)
Output: list of lines in the file (list type)

2. get_words_from_string(line)
Given a line (string), the function should return a list of the words in the line by
i) converting each word to lower case
ii) replacing each punctuation character with a space
Hint: You can use string.punctuation to check whether a character is punctuation. string.punctuation is simply a string of punctuation characters strung together.
Input: line (a string)
Output: a list of strings/words, where each string/word is a sequence of alphanumeric characters
Example:
Input: "This is a Poorly @ written Line."
Output: ["this", "is", "a", "poorly", "written", "line"]

3. get_words_from_line_list(list_of_lines)
Given a list of lines, parse the lines into words. Return a list of all words found in the list of lines.
Input: ["This the First-Line.", "This is the Second@Line."]
Output: ["this", "the", "first", "line", "this", "is", "the", "second", "line"]
Note that the output contains no punctuation/special characters such as "-" and "@", and each word is in lower case.
Hint: You can make use of the get_words_from_string(...) function defined previously.

4. count_frequency(word_list)
Given a list of words, return a dictionary where each word is a key and each word's count is its corresponding value.
Input: ["this", "is", "the", "first", "is", "line", "second", "line", "this"]
Output: { "this": 2, "is": 2, "the": 1, "first": 1, "line": 2, "second": 1 }

5. find_most_common_word(freq_map)
Return the most common word in the dictionary.
Input: freq_map (dictionary)
Output: the word with the highest frequency (str type)
Example:
Input: freq_map = { "this": 2, "is": 2, "the": 1, "first": 10, "line": 2, "second": 1 }
Output: "first"

6. write_result(filename)
Given a filename, use the previously defined functions to find the following values:
1. number of lines in the file
2. number of words in the file
3. number of distinct words in the file
4. the most frequently occurring word in the file
Write the file name followed by these values, in the same order, to a text file called "result.txt" on 5 separate lines.
Example:
Input: filename -> "test.txt"
Output written to "result.txt":
File: test.txt
Number of lines: 1000
Number of words: 8000
Number of distinct words: 2000
Most commonly used word: to

Please edit the book_analysis.py starter file to write your code. There is also a module called function_caller.py where you can test the functions implemented in the book_analysis.py module. A text file called shakespeare.txt is provided in the starter code for testing the functions.

Starter contents of function_caller.py:
    import book_analysis
    import string
    # Use this to call different module in book_analysis.py
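
One possible way to fill in book_analysis.py is sketched below. It is not the official solution, and it rests on a few assumptions the lab text does not settle: the file is read as UTF-8, read_file drops the trailing newline from each line, only the ASCII characters in string.punctuation count as punctuation, ties for the most common word go to the word seen first, and an empty frequency map yields an empty string.

```python
import string


def read_file(filename):
    """Return the lines of the file as a list of strings."""
    # Assumption: UTF-8 text; splitlines() drops the trailing newlines.
    with open(filename, encoding="utf-8") as f:
        return f.read().splitlines()


def get_words_from_string(line):
    """Lower-case the line, replace punctuation with spaces, split on whitespace."""
    cleaned = "".join(
        " " if ch in string.punctuation else ch for ch in line.lower()
    )
    return cleaned.split()


def get_words_from_line_list(list_of_lines):
    """Return every word found in the given list of lines, in order."""
    words = []
    for line in list_of_lines:
        words.extend(get_words_from_string(line))
    return words


def count_frequency(word_list):
    """Map each word to the number of times it appears in word_list."""
    freq_map = {}
    for word in word_list:
        freq_map[word] = freq_map.get(word, 0) + 1
    return freq_map


def find_most_common_word(freq_map):
    """Return the key with the highest count."""
    # Assumption: ties go to the first key in the dict; empty map gives "".
    return max(freq_map, key=freq_map.get, default="")


def write_result(filename):
    """Write the five summary lines for filename to result.txt."""
    lines = read_file(filename)
    words = get_words_from_line_list(lines)
    freq_map = count_frequency(words)
    with open("result.txt", "w", encoding="utf-8") as out:
        out.write(f"File: {filename}\n")
        out.write(f"Number of lines: {len(lines)}\n")
        out.write(f"Number of words: {len(words)}\n")
        out.write(f"Number of distinct words: {len(freq_map)}\n")
        out.write(f"Most commonly used word: {find_most_common_word(freq_map)}\n")
```

From function_caller.py, the functions could then be exercised with calls such as book_analysis.write_result("shakespeare.txt"). collections.Counter would shorten count_frequency and find_most_common_word, but the plain-dictionary version stays closer to the wording of the lab.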
