Question: In Python: Natural language processing (NLP) refers to computational techniques involving language. It is a broad field. For this assignment, we will learn a bit

In Python:

Natural language processing (NLP) refers to computational techniques involving language. It is a broad field. For this assignment, we will learn a bit about NLP to build a predictive keyboard. The keyboard learns to make predictions from text given at startup and from what the user types into the program. As the user types, the program predicts the most likely word given the currently entered word prefix. For example, if the user has typed "th", the keyboard will probably predict "the".

We will use a hash map to implement the keyboard, which is crucial to making the keyboard's learning and predictions fast and efficient. For the hash map, we will use Python's built-in dictionary object and our own Map class.
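To make the prefix idea concrete, here is a minimal sketch that uses only Python's built-in dictionary. The function name build_prefix_table and the tie-breaking rule (keep the first word seen) are assumptions for illustration, not part of the assignment's API.

```python
def build_prefix_table(word_to_count):
    """Map every prefix of every word to the most frequent word with that prefix."""
    prefix_to_best = {}
    for word, count in word_to_count.items():
        for end in range(1, len(word) + 1):
            prefix = word[:end]
            best = prefix_to_best.get(prefix)
            # Assumption: on a tie, keep the word that was seen first.
            if best is None or count > word_to_count[best]:
                prefix_to_best[prefix] = word
    return prefix_to_best


# Hypothetical counts, as if gathered from training text.
counts = {"the": 5, "they": 2, "this": 3, "cat": 1}
table = build_prefix_table(counts)
print(table["th"])   # the
print(table["thi"])  # this
```

Because every prefix is stored as its own key, a prediction is a single dictionary lookup, which is why a hash map suits this problem.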

Note: for this assignment, use standard .py files.

Implement a Map class that inherits from a HashTable class (see the class notes for much of this code). Your HashTable class should implement chaining for collision handling.
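The class notes are not reproduced here, so the following is only a rough sketch of what a chained HashTable and a Map built on top of it might look like. The method names put, get, and keys, the default table size of 101, and the use of Python's built-in hash() are all assumptions; the course's own code may differ.

```python
class HashTable:
    """Fixed-size hash table that resolves collisions by chaining."""

    def __init__(self, size=101):  # 101 is an arbitrary, assumed default
        self.size = size
        # Each bucket is a list (chain) of [key, value] pairs.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        # Assumption: rely on Python's built-in hash() for the hash function.
        return hash(key) % self.size

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for pair in chain:
            if pair[0] == key:
                pair[1] = value  # key already present: overwrite its value
                return
        chain.append([key, value])  # new key: append to the chain

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default

    def __contains__(self, key):
        return any(k == key for k, _ in self.buckets[self._index(key)])

    def __len__(self):
        return sum(len(chain) for chain in self.buckets)


class Map(HashTable):
    """Adds dictionary-style access on top of HashTable."""

    def __getitem__(self, key):
        sentinel = object()
        value = self.get(key, sentinel)
        if value is sentinel:
            raise KeyError(key)
        return value

    def __setitem__(self, key, value):
        self.put(key, value)

    def keys(self):
        return [k for chain in self.buckets for k, _ in chain]


m = Map(size=4)  # a tiny table, so several keys must share a bucket
for w in ["the", "they", "this", "cat", "dog"]:
    m[w] = m.get(w, 0) + 1
m["the"] = m["the"] + 1
```

With only four buckets and five keys, at least two keys share a chain, yet every lookup still returns the right value because each chain is searched by key.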

The brains of this whole operation is the class WordPredictor. This class learns how to make word predictions based on being shown training data. It can learn from all the words in a file (via the train() method) or from a single individual word (via the train_word() method). After new training data is added, the build() method must be called so the class can recompute the most likely word for all possible prefixes. Here is the API for the WordPredictor class:

Model training occurs in the train() and train_word() methods. train() should parse out each word in the specified file on disk. If the file cannot be read, it should print out an error, "Could not open training file: file.txt". All training words should be converted to lowercase and stripped of any characters that are not a-z or the single apostrophe. During training, you need to update the instance variables:

word_to_count: Map of string (word key) to integer (count value) pairs

total: Integer

The word_to_count instance variable is a Map where the keys are the unique words encountered in the training data. The values are the integer count of how many times we've seen each word in the data. Only words seen in the training data will have an entry in the word_to_count Map. The total instance variable tracks how many words of training data we have seen. That is, total should equal the sum of all integer counts stored in your map. Training is cumulative for repeated calls to train() and train_word(), so you just keep increasing the counts stored in your word_to_count map and your total count of training words.
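The training rules above could be sketched roughly as follows. This is not the expert solution: a plain dict stands in for the Map class, and splitting on whitespace, skipping words that become empty after stripping, the helper name _normalize, and the UTF-8 encoding are all assumptions.

```python
import re


class WordPredictor:
    def __init__(self):
        # Assumption: a plain dict stands in for the assignment's Map class.
        self.word_to_count = {}
        self.total = 0

    @staticmethod
    def _normalize(word):
        """Lowercase, then drop everything except a-z and the apostrophe."""
        return re.sub(r"[^a-z']", "", word.lower())

    def train_word(self, word):
        word = self._normalize(word)
        if not word:
            return  # assumption: ignore tokens that strip to nothing, e.g. "--"
        self.word_to_count[word] = self.word_to_count.get(word, 0) + 1
        self.total += 1

    def train(self, training_file):
        try:
            # Assumption: the training file is UTF-8 text, split on whitespace.
            with open(training_file, "r", encoding="utf-8") as f:
                for line in f:
                    for word in line.split():
                        self.train_word(word)
        except OSError:
            print(f"Could not open training file: {training_file}")


wp = WordPredictor()
for w in ["The", "the", "Don't", "cat!", "--"]:
    wp.train_word(w)
wp.train("no_such_training_file.txt")  # prints the error message
```

Because both methods only ever add to word_to_count and total, repeated calls stay cumulative, and total always equals the sum of the stored counts.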
