Question:

Project Description

Develop a spelling checker (i.e., a best-word predictor) using a 3-gram language model. Each student must collect an Arabic corpus of at least 1 million words. Students may not share a corpus with each other, fully or partially, and may not reuse text from previous years.

Tokenize the corpus into tokens (words), then build a trigram language model from it. Each entry in the language model should contain the token, its count, and its probability (or log probability), and the model should be saved in a CSV file.

Develop an interface that lets the user write text and then click a "spell" button. If the text contains "#", the program should use the language model to suggest the top five words (and their probabilities) as replacements for the "#".

Each student should submit his/her project via Moodle. The submission should include the source code, the corpus, and the language model. The project must be written in Java.

Example: Spell # 0.81 0.4 0.38 0.21 0.75
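The core of the task (tokenize, count trigrams, turn counts into probabilities, rank candidates for "#") can be sketched as follows. This is only one possible reading of the assignment, not a reference solution. The class name `TrigramModel`, its method names, and the CSV column names are hypothetical. It assumes Java 9 or later, unsmoothed maximum-likelihood estimates P(w3 | w1 w2) = count(w1 w2 w3) / count(w1 w2), no sentence-boundary handling, and that the two words immediately before "#" form the context. The button interface and corpus loading are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class TrigramModel {
    // Hypothetical names: how often each two-word context occurs,
    // and how often each word follows that context.
    private final Map<String, Integer> contextCounts = new HashMap<>();
    private final Map<String, Map<String, Integer>> followerCounts = new HashMap<>();

    /** Keeps runs of Unicode letters, combining marks (e.g. Arabic diacritics) and digits. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{M}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Counts every trigram (w1 w2 w3) in the token sequence. */
    public void train(List<String> tokens) {
        for (int i = 0; i + 2 < tokens.size(); i++) {
            String context = tokens.get(i) + " " + tokens.get(i + 1);
            contextCounts.merge(context, 1, Integer::sum);
            followerCounts.computeIfAbsent(context, k -> new HashMap<>())
                          .merge(tokens.get(i + 2), 1, Integer::sum);
        }
    }

    /** One CSV line per trigram: trigram, count, probability (assumed column layout). */
    public List<String> toCsvLines() {
        List<String> lines = new ArrayList<>();
        lines.add("trigram,count,probability");
        for (Map.Entry<String, Map<String, Integer>> ctx : new TreeMap<>(followerCounts).entrySet()) {
            int total = contextCounts.get(ctx.getKey());
            for (Map.Entry<String, Integer> e : new TreeMap<>(ctx.getValue()).entrySet()) {
                double p = (double) e.getValue() / total;
                lines.add(ctx.getKey() + " " + e.getKey() + "," + e.getValue() + "," + p);
            }
        }
        return lines;
    }

    /** Returns up to k candidate words for the first '#', most probable first. */
    public List<Map.Entry<String, Double>> suggest(String text, int k) {
        int hash = text.indexOf('#');
        if (hash < 0) return List.of();
        List<String> before = tokenize(text.substring(0, hash));
        if (before.size() < 2) return List.of(); // not enough context for a trigram
        String context = before.get(before.size() - 2) + " " + before.get(before.size() - 1);
        Map<String, Integer> followers = followerCounts.getOrDefault(context, Map.of());
        double total = contextCounts.getOrDefault(context, 0);
        return followers.entrySet().stream()
                .map(e -> Map.entry(e.getKey(), e.getValue() / total))
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Double>comparingByKey()))
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        TrigramModel model = new TrigramModel();
        // Toy text for illustration only; a real run would read the Arabic corpus as UTF-8.
        model.train(tokenize("a b c a b d a b c"));
        for (Map.Entry<String, Double> s : model.suggest("a b #", 5)) {
            System.out.println(s.getKey() + " " + s.getValue());
        }
    }
}
```

In the toy text the context "a b" occurs three times, followed twice by "c" and once by "d", so the suggestions are "c" with probability 2/3 and then "d" with probability 1/3. The lines returned by `toCsvLines()` could be saved with `java.nio.file.Files.write(path, lines, StandardCharsets.UTF_8)`. For a real corpus, unseen contexts return no suggestions here, so a backoff to bigram or unigram counts, or some form of smoothing, may be worth adding.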
