Question: Write a function create_dictionary(filename) that takes a string representing the name of a text file, and that returns a dictionary of key-value pairs in which:

each key is a word encountered in the text file

the corresponding value is a list of words that follow the key word in the text file.

For example, the dictionary produced for the text "I love roses and carnations. I hope I get roses for my birthday." would include the following key-value pairs, among others:

'I': ['love', 'hope', 'get']
'love': ['roses']
'roses': ['and', 'for']
'my': ['birthday.']
# as well as others!

Guidelines:

You should not try to remove the punctuation from the words of the text file.

The keys of the dictionary should include every word in the file except the sentence-ending words. A sentence-ending word is defined to be any word whose last character is a period ('.'), a question mark ('?'), or an exclamation point ('!'). A sentence-ending word should be included in the lists associated with the words that it follows (i.e., in the value parts of the appropriate key-value pairs), but it should not appear as its own key.
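The sentence-ending test defined above could be written as a small helper. This is only a minimal sketch: the name is_sentence_ending is hypothetical and is not required by the assignment.

```python
def is_sentence_ending(word):
    """Return True if the last character of word is '.', '?', or '!'."""
    # str.endswith accepts a tuple of suffixes and returns True
    # if the string ends with any one of them.
    return word.endswith(('.', '?', '!'))


print(is_sentence_ending('birthday.'))  # True
print(is_sentence_ending('roses'))      # False
```

Because punctuation is not removed, 'birthday.' and 'birthday' are treated as two different words.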

If a word w1 is followed by another word w2 multiple times in the text file, then w2 should appear multiple times in the list of words associated with w1. This will allow you to capture the frequency with which word combinations appear.

In addition to the words in the file, the dictionary should include the string '$' as a special key referred to as the sentence-start symbol. This symbol will be used when choosing the first word in a sentence. In the dictionary, the list of words associated with the key '$' should include:

the first word in the file

every word in the file that follows a sentence-ending word.

Doing this will ensure that the list of words associated with '$' includes all of the words that start a sentence. For example, the dictionary for the text "I scream. You scream. We all scream for ice cream." would include the following entry for the sentence-start symbol:

'$': ['I', 'You', 'We'] 

You may find it helpful to consult the word_frequencies function from lecture. We will also discuss some additional strategies for create_dictionary in lecture.
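One possible way to satisfy the guidelines above is sketched here. It is not the approach from lecture (which is not shown in this text); it assumes that words are separated by whitespace and that the file can be read in one pass. The variable name current_word is an illustrative choice.

```python
def create_dictionary(filename):
    """Return a dictionary mapping each word in the file to the list of
    words that follow it, with '$' as the sentence-start symbol."""
    with open(filename, 'r') as file:
        # split() with no arguments splits on any run of whitespace,
        # so punctuation stays attached to the words.
        words = file.read().split()

    d = {}
    # Start at '$' so that the first word in the file is recorded
    # as a sentence-starting word.
    current_word = '$'

    for next_word in words:
        # Record next_word as a follower of current_word. Repeats are
        # kept on purpose so that frequencies are preserved.
        if current_word not in d:
            d[current_word] = [next_word]
        else:
            d[current_word].append(next_word)

        # After a sentence-ending word, go back to '$' so the word is
        # never used as a key and the next word is filed under '$'.
        if next_word.endswith(('.', '?', '!')):
            current_word = '$'
        else:
            current_word = next_word

    return d
```

Note one consequence of this sketch: if the file's final word is not a sentence-ending word, nothing follows it, so it does not become a key. An empty file produces an empty dictionary.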

Examples:

To test your code, download the sample.txt file into the same directory that contains ps8pr3.py. This sample text file contains the following contents:

A B A. A B C. B A C. C C C. 

Once this file is in place, run your ps8pr3.py in IDLE and test your function from the Shell:

>>> word_dict = create_dictionary('sample.txt')
>>> word_dict
{'A': ['B', 'B', 'C.'], 'C': ['C', 'C.'], 'B': ['A.', 'C.', 'A'], '$': ['A', 'A', 'B', 'C']}

The order of the keys, or of the elements within a given key's list of values, may not be the same as what you see above, but the elements of the lists should appear in the quantities shown above for each of the four keys 'A', 'B', 'C', and '$'.
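Since ordering may differ, comparing a result to the expected dictionary with a plain == on the lists can fail even when the result is correct. A minimal sketch of an order-insensitive check follows; the helper name same_dictionary is hypothetical and is not part of the assignment.

```python
def same_dictionary(d1, d2):
    """Return True if d1 and d2 have the same keys and, for each key,
    the same words in the same quantities, ignoring list order."""
    # Key views compare like sets, so key order does not matter.
    if d1.keys() != d2.keys():
        return False
    # Sorting both lists ignores order but still counts duplicates.
    return all(sorted(d1[key]) == sorted(d2[key]) for key in d1)


expected = {'A': ['B', 'B', 'C.'], 'C': ['C', 'C.'],
            'B': ['A.', 'C.', 'A'], '$': ['A', 'A', 'B', 'C']}
reordered = {'$': ['B', 'A', 'C', 'A'], 'B': ['A', 'A.', 'C.'],
             'A': ['C.', 'B', 'B'], 'C': ['C.', 'C']}

print(same_dictionary(reordered, expected))  # True
```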

Here are some additional files you can use for testing:

edited_mission.txt - an edited version of BU's mission statement, and the dictionary that we derived from it.

brave.txt - lyrics from the song "Brave" by Sara Bareilles, and its dictionary.

Here again, the ordering that you obtain for the keys and list elements in the dictionaries may be different. In addition, we have edited the formatting of the dictionaries to make them easier to read.
