Question: (PYTHON) PLEASE FOLLOW TEMPLATE PROVIDED IN THE BOTTOM Your task is to write a Python program that opens and reads a very large text file.

The program prompts the user to enter the file name.

The program then computes the following language statistics based on the contents of the file:

The longest word used in the file. If there is more than one, just print one of the longest. There is no need to find all the longest words.

The five most common words in the file with the number of times they appear in the file.

The word count of all the words in the file, sorted alphabetically. This last output has to be written to a file named out.txt in the current working directory. Open this file for writing in w mode.

Your program must work with any input text file that uses 'UTF-8' encoding. The program must also be efficient, computing statistics for a large file (pride.txt) in seconds.

Make sure that you include docstrings and comments where necessary, and follow the general Python style guidelines: no long lines in your code, pythonic variable and function names, etc.

Use functions to structure your code.

You don't have to worry about validating the input at this point. Assume that the file name provided by the user exists.

Assume that the file is really large, so you don't want to read it all at once. However, you only want to open the file once and read each line once.

Make sure that you ignore capitalization when you process the file, so that "This" and "this" are counted as the same word.

Also make sure that you take out leading and trailing punctuation characters and numbers from your words.

"here" and "here." should both be considered one word. The same goes for "Hi," and "Hi!".

Punctuation characters inside words should be kept so that hyphenated words such as 'arm-in-arm' and contractions such as "don't" are left intact.
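The cleaning rules above could be sketched as a small helper. This is only one possible approach: the name clean_word and the constant STRIP_CHARS are hypothetical and not part of the template, and string.punctuation covers ASCII punctuation only, so characters such as curly quotes in a UTF-8 file would need to be added to the set.

```python
import string

# Characters removed from both ends of a token: ASCII punctuation and digits.
# (Assumption: non-ASCII punctuation is not handled in this sketch.)
STRIP_CHARS = string.punctuation + string.digits


def clean_word(token):
    """Return token in lowercase with edge punctuation and digits removed."""
    # str.strip only touches the ends, so "don't" and "arm-in-arm" survive.
    return token.strip(STRIP_CHARS).lower()
```

A token made up entirely of punctuation or digits comes back as an empty string, so the caller should skip empty results.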

Testing:

Start testing with a small text file (5-10 lines). Once you have it working, try the larger files.

Note that your program must generate console output as well as a file output.

Below is the console output based on the text file provided - Pride and Prejudice novel by Jane Austen. Note that your program has to print one of the longest words, not all of them.

The longest word is: disinterestedness

or:

The longest word is: misrepresentation

or:

The longest word is: communicativeness

The 5 most common words are:

the: 4332

to: 4138

of: 3612

and: 3580

her: 2225

(*****USE THIS TEMPLATE*****)

"""
Docstring: Enter your one-line overview here

and your detailed description
"""
import string
import random


def count_words(filename):
    """ Enter your function docstring here """
    # build and return the dictionary for the given filename


def report(word_dict):
    """ Enter your function docstring here """
    # report on various statistics based on the given word count dictionary


def main():
    # get the input filename and save it in a variable
    # call count_words to build the dictionary for the given file
    # save the dictionary in the variable word_count
    # call report to report on the contents of the dictionary word_count
    # If you want to generate a word cloud, uncomment the line below.
    # draw_cloud(word_count)


if __name__ == '__main__':
    main()

Hints:

Write a function count_words that reads a given file, line by line, then word by word, and returns a dictionary. The dictionary should have an entry for each word in the file. The value corresponding to a given word should be the number of times the word appears in the file.

The dictionary will be of the form:

{'the': 20, 'ate': 1, 'morning': 2, etc...}

As you process a given word from a given line in the file you can check whether that word is already in the dictionary: if it is, you update the count. Otherwise you create a new dictionary entry and initialize it.
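A minimal sketch of count_words along these lines might look as follows. It assumes words are separated by whitespace and cleans each token with str.strip, which is one possible reading of the requirements rather than the expected solution; STRIP_CHARS is a hypothetical name.

```python
import string

# Hypothetical constant: ASCII punctuation and digits to trim from word edges.
STRIP_CHARS = string.punctuation + string.digits


def count_words(filename):
    """Return a dictionary mapping each word in the file to its count."""
    word_count = {}
    # The file is opened once; iterating over it yields one line at a time,
    # so the whole file is never held in memory.
    with open(filename, encoding="utf-8") as infile:
        for line in infile:
            for token in line.split():
                word = token.strip(STRIP_CHARS).lower()
                if not word:
                    continue  # token was only punctuation or digits
                if word in word_count:
                    word_count[word] += 1
                else:
                    word_count[word] = 1
    return word_count
```

The if/else at the end follows the hint directly; word_count.get(word, 0) + 1 would be an equivalent shorter form.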

Use the sorted function on the dictionary to get the different sorts. For some, you'll need to specify the key argument.

Take advantage of Python's built-in function max.

Take advantage of list slicing: my_list[0:10] is the list of the first 10 items in my_list.
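The hints about sorted, max, and slicing might combine as in the sketch below. The function names longest_word and most_common are hypothetical (in the template this logic would live inside report), and the sample dictionary is made up for illustration.

```python
def longest_word(word_dict):
    """Return one of the longest words in the dictionary."""
    # max iterates over the keys; key=len compares them by length.
    return max(word_dict, key=len)


def most_common(word_dict, how_many=5):
    """Return the how_many (word, count) pairs with the highest counts."""
    by_count = sorted(
        word_dict.items(), key=lambda pair: pair[1], reverse=True
    )
    return by_count[:how_many]  # slicing keeps only the first entries


sample = {'the': 20, 'ate': 1, 'morning': 2}
print("The longest word is:", longest_word(sample))
print("The 2 most common words are:")
for word, count in most_common(sample, 2):
    print(f"{word}: {count}")
```

When several words share the maximum length, max returns the first one it encounters, which matches the requirement to print just one of them.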

Make sure that you open the input file only once and read it one line at a time.

This is what the output in 'out.txt' is supposed to look like:

a: 1948

abatement: 1

abhorrence: 6

abhorrent: 1

abide: 1

abiding: 1

abilities: 6

able: 54

ablution: 1

abode: 8

abominable: 6

abominably: 4

abominate: 2

abound: 1

about: 122

above: 21

abroad: 4

abrupt: 1

abruptly: 2

etc..
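Writing the alphabetical listing could be sketched like this. The helper name write_counts and its out_name parameter are assumptions for illustration; the assignment itself only requires that the file be called out.txt and opened in w mode.

```python
def write_counts(word_dict, out_name="out.txt"):
    """Write one 'word: count' line per word, sorted alphabetically."""
    with open(out_name, "w", encoding="utf-8") as outfile:
        # sorted on a dictionary returns its keys in alphabetical order.
        for word in sorted(word_dict):
            outfile.write(f"{word}: {word_dict[word]}\n")
```

Calling write_counts(word_count) with the dictionary built earlier would create out.txt in the current working directory.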
