In the first part of this assignment you will implement a first-order Markov text generator. Writing this generator will involve two functions: (1) one to process a file and create a dictionary of legal word transitions and (2) another to actually generate the new text.

First function to create: createDictionary( filename )

createDictionary( filename ) takes in a string, the name of a text file containing some sample text. It should return a dictionary whose keys are words encountered in the text file and whose values are lists of the words that may legally follow the key word. Note that you should determine a way to keep track of frequency information. That is, if the word "cheese" is followed by the word "pizza" twice as often as it is followed by the word "sandwich", your dictionary should reflect this trend. For example, you might keep multiple copies of a word in the list.
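As a rough illustration of the "multiple copies" idea, the sketch below uses a hand-written dictionary (not one built from a file) in which "pizza" appears twice and "sandwich" once. The helper name sample_followers and its seed parameter are hypothetical, introduced only for this demonstration.

```python
import random
from collections import Counter

# Hypothetical transition list: "pizza" follows "cheese" twice as often as "sandwich".
followers = {"cheese": ["pizza", "sandwich", "pizza"]}

def sample_followers(d, word, trials, seed=0):
    """Draw `trials` followers of `word` and count how often each one appears."""
    rng = random.Random(seed)  # seeded only so the demonstration is repeatable
    return Counter(rng.choice(d[word]) for _ in range(trials))

counts = sample_followers(followers, "cheese", 3000)
print(counts)  # "pizza" should come up roughly twice as often as "sandwich"
```

Because random.choice picks each list position with equal probability, repeating a word in the list is enough to make it proportionally more likely; no separate count table is needed.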

The dictionary returned by createDictionary will allow you to choose word t+1 given a word at time t. But how do you choose the first word, when there is no preceding word to use to index into your dictionary?

To handle this case, your dictionary should include the string "$" representing the sentence-start symbol. The first word in the file should follow this string. In addition, each word in the file that follows a sentence-ending word should follow this string. A sentence-ending word will be defined to be any raw, space-separated word whose last character is a period ., a question mark ?, or an exclamation point !.

How do I determine if a word ends in a punctuation mark? The easiest way is to check the word's last character, w[-1]. We will only worry about '.', '?', and '!'.
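One possible way to package the w[-1] check is a small helper. The name ends_sentence is an assumption made for this sketch and is not part of the assignment.

```python
def ends_sentence(w):
    """Return True if the raw word w ends in '.', '?', or '!'."""
    # The length guard is only a precaution: words produced by str.split()
    # are never empty, but w[-1] would raise IndexError on an empty string.
    return len(w) > 0 and w[-1] in ".?!"

print(ends_sentence("spam."))      # True
print(ends_sentence("holidays?"))  # True
print(ends_sentence("spam"))       # False
```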

Checking your code...

To check your code, paste the following text into a plain-text file (for example, into a new file window in Sublime):

A B A. A B C. B A C. C C C.

Save this file as t.txt in the same directory where your your_name_project02.py lives. Then, see if your dictionary d matches the sample below:

>>> d = createDictionary( 't.txt' )
>>> d
{'A': ['B', 'B', 'C.'], 'C': ['C', 'C.'], 'B': ['A.', 'C.', 'A'], '$': ['A', 'A', 'B', 'C']}

The elements within each list need not be in the same order, but they should appear in the quantities shown above for each of the four keys, 'A', 'C', 'B', and '$'.

Here are the contents of the poptarts file, named a.txt, from class:

I like poptarts and 42 and spam. Will I get spam and poptarts for the holidays? I like spam poptarts!

You'll want to be sure that the output dictionary from this file is the same as the one in the class notes (note that the order of the keys can vary and they won't be separated line-by-line):

>>> d = cd( 'a.txt' )
>>> d
{'and': ['42', 'spam.', 'poptarts'], '$': ['I', 'Will', 'I'], 'for': ['the'], 'get': ['spam'], 'I': ['like', 'get', 'like'], 'spam': ['and', 'poptarts!'], '42': ['and'], 'Will': ['I'], 'poptarts': ['and', 'for'], 'the': ['holidays?'], 'like': ['poptarts', 'spam']}
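A minimal sketch of createDictionary that is consistent with the rules above might look like the following. It is one possible approach rather than the official solution, and it assumes that splitting on any whitespace (including newlines) with str.split() is an acceptable reading of "space-separated".

```python
def createDictionary(filename):
    """Map each word to the list of words that follow it in the file."""
    with open(filename) as f:
        # Assumption: any run of whitespace (spaces, newlines) separates words.
        words = f.read().split()

    d = {}
    prev = "$"  # the first word in the file follows the start symbol
    for w in words:
        # Appending without de-duplicating keeps the frequency information.
        d.setdefault(prev, []).append(w)
        # After a sentence-ending word, the next word follows "$" instead.
        prev = "$" if w[-1] in ".?!" else w
    return d
```

With this approach a sentence-ending word such as 'C.' appears inside the lists but never becomes a key, which matches the sample dictionaries shown for t.txt and a.txt.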

Second function to create: generateText( d, n )

generateText( d, n ) will take in a dictionary of word transitions d (generated in your createDictionary function, above) and a positive integer, n. Then, generateText should print a string of n words.

The first word should be randomly chosen from among those that can follow the sentence-starting string "$". Remember that random.choice will choose one item randomly from a list! The second word will be randomly chosen from the list of words that could possibly follow the first, and so on. When a chosen word ends in a period ., a question mark ?, or an exclamation point !, the generateText function should detect this and start a new sentence by again choosing a random word from among those that follow "$".

Don't include the '$' in the output text itself -- it will be a marker internal to your function.

For this problem, you should not strip the punctuation from the raw words of the text file. Leave the punctuation as it appears in the text -- and when you generate words, don't worry if your generated text does not end with legal punctuation, i.e., you might end without a period, which is ok. The text you generate won't be perfect, but you might be surprised how good it is!

Here are two examples that use the dictionary d, from above. Yours will differ because of the randomness, but should be similar in spirit.

>>> generateText( d, 20 )
B C. C C C. C C C C C C C C C C C. C C C. A
>>> generateText( d, 20 )
A B A. C C C. B A B C. A C. B A. C C C C C C.
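The generation loop described above could be sketched as follows. This is an illustrative approach, not the official solution. The helper generateWords is a hypothetical name added so the word list can be inspected separately from printing, and the fallback to "$" for a word with no recorded followers is an assumption that the assignment does not spell out.

```python
import random

def generateWords(d, n):
    """Return a list of n words drawn from the transition dictionary d."""
    words = []
    prev = "$"  # every sentence starts from the start symbol
    for _ in range(n):
        # Assumption: a word with no recorded followers (for example, an
        # unpunctuated final word of the file) also restarts at "$".
        if prev not in d:
            prev = "$"
        w = random.choice(d[prev])
        words.append(w)
        # A sentence-ending word sends the next choice back to "$".
        prev = "$" if w[-1] in ".?!" else w
    return words

def generateText(d, n):
    """Print n generated words; the "$" marker never appears in the output."""
    print(" ".join(generateWords(d, n)))

# Usage with the dictionary from the t.txt check; output varies from run to run.
d = {'A': ['B', 'B', 'C.'], 'C': ['C', 'C.'], 'B': ['A.', 'C.', 'A'], '$': ['A', 'A', 'B', 'C']}
generateText(d, 20)
```

Tracking only the previous key, rather than the whole output so far, is what makes this a first-order model: each choice depends on nothing but the word just generated.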
