Question: C++ Code: This programming homework is to develop a simple spelling checker. The file dict.txt contains 25,021

C++ Code:

This programming homework is to develop a simple spelling checker. The file dict.txt contains 25,021 frequently used words, each on a separate line in lowercase. Read the file and insert the words into a hash table with 1373 buckets. Remember to move dict.txt to the csegrid, then run the command dos2unix dict.txt to remove the carriage returns (\r) that Windows adds at the end of each line.

Prompt for the name of an input text file to check. This file will contain a number of words.

For this assignment, a word is any sequence of one or more characters separated by one or more spaces or newlines. You could be reading text from a book, so you have to delete leading and trailing quotation marks, and delete periods, question marks, exclamation marks and semicolons from the end of the string. Since a book may contain other unexpected characters, you should loop through the entire word and remove non-alpha characters. If you include cctype, you can test each character with isalnum. Also remember what the >> operator does when reading into a string: it skips whitespace and reads one token at a time. Also note that strings have an erase member function.
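The cleanup loop described above might look like the following minimal sketch. The helper name cleanWord is hypothetical, and keeping digits (because isalnum accepts them) is an assumption; swap in isalpha if only letters should survive.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: remove every character that is not a letter or digit.
// Keeping digits is an assumption that follows from using isalnum.
std::string cleanWord(std::string word) {
    for (std::size_t i = 0; i < word.size(); ) {
        // Cast to unsigned char first: passing a negative char to isalnum is undefined.
        if (std::isalnum(static_cast<unsigned char>(word[i]))) {
            ++i;               // keep this character and move on
        } else {
            word.erase(i, 1);  // drop it; the next character slides into position i
        }
    }
    return word;
}
```

Because the loop only advances when a character is kept, consecutive punctuation marks are all removed. A token that consists entirely of punctuation comes back as an empty string, so the caller has to decide whether that still counts as a word.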

Read the document and separate it into a sequence of words converted to lowercase. Use http://www.cplusplus.com/reference/locale/tolower/ as an example of how to convert. A for loop is useful for converting every character.
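As a rough illustration of the lowercase conversion, the sketch below uses the single-argument tolower from cctype rather than the locale overload on the linked page. That substitution and the name toLowerCopy are assumptions made for the example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: return a lowercase copy of the word.
std::string toLowerCopy(std::string word) {
    for (char& c : word) {
        // tolower works on int values in the unsigned char range,
        // so cast before the call and cast the result back to char.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return word;
}
```

Characters that have no lowercase form, such as digits and punctuation, pass through unchanged.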

Print out the words that could be misspelled, then print the number of words in the dictionary, the number of words in the file, and the number of words not in the dictionary.

Here are two files to check against (you may be a few off depending on how you coded):

check.txt

25021 dictionary words, 29 words in file, 4 misspelled (your algorithm should come up with the right answer for this one)

Potter.txt

25021 dictionary words, 78452 words in file, 16588 words misspelled (Note: you may have hundreds more words misspelled depending on which characters you delete in checking the database).

/********************************************************************************/

HASH TABLES

A hash table contains buckets into which an object (data item) can be placed. Applying a hash function to an object generates a hash value, which determines the bucket the object is assigned to. A bucket is a cluster (or sub-container) that holds the set of data items that hash to the same table location. Obviously, you cannot store 25K words in 1373 single-item slots, so collisions are unavoidable and you need a collision-resolution scheme. Open-addressing schemes such as linear probing or double hashing need at least one slot per item, so with far more words than buckets the table must use chaining, where each bucket holds a list of items. The number of buckets is chosen independently of the number of data items you put into the table. If you have too many buckets, the table will have few collisions, but you may waste storage and may have to deal with a more complex hash function and longer keys. If you have too few buckets, you have to deal with frequent collisions. Choosing a good bucket count plays an important role in reducing collisions, which is why we usually pick a prime number for it. We picked 1373 buckets.
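To put a number on the collision argument, the sketch below computes the load factor (items per bucket) for the figures in this assignment. The constant and function names are illustrative only.

```cpp
#include <cstddef>

// Figures taken from the assignment text.
constexpr std::size_t kWords = 25021;
constexpr std::size_t kBuckets = 1373;

// Load factor = number of stored items divided by number of buckets.
constexpr double loadFactor(std::size_t items, std::size_t buckets) {
    return static_cast<double>(items) / static_cast<double>(buckets);
}
```

Here loadFactor(kWords, kBuckets) is a little over 18, so even a hash function that spreads words perfectly evenly leaves about 18 words in every bucket. A lookup then scans a short list rather than the whole dictionary.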

/*************************************************************************************/

IMPLEMENTATION

Implement the table as an array (SIZE = 1373) of linked lists. Each linked list should contain the words that hashed to that array cell. When you land on a particular array cell (the one whose index equals the hash of the word), traverse the linked list until you either find the word or reach the nullptr, then add the word if it was not found. (You can use the STL list if you choose.) For your hash function, you will be hashing strings.
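A minimal sketch of such a table, assuming the STL list option and std::hash for the string hash, could look like this. The class name WordTable and its member names are hypothetical, and this is one possible layout rather than the required one.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <list>
#include <string>

class WordTable {
public:
    static constexpr std::size_t SIZE = 1373;  // bucket count from the assignment

    // Adds the word unless its chain already holds it.
    // Returns true when the word was actually inserted.
    bool insert(const std::string& word) {
        std::list<std::string>& chain = buckets_[indexOf(word)];
        for (const std::string& w : chain) {
            if (w == word) return false;  // found before reaching the end of the chain
        }
        chain.push_back(word);
        ++count_;
        return true;
    }

    // Walks only the one chain the word hashes to.
    bool contains(const std::string& word) const {
        for (const std::string& w : buckets_[indexOf(word)]) {
            if (w == word) return true;
        }
        return false;
    }

    std::size_t size() const { return count_; }

private:
    static std::size_t indexOf(const std::string& word) {
        return std::hash<std::string>{}(word) % SIZE;
    }

    std::array<std::list<std::string>, SIZE> buckets_;
    std::size_t count_ = 0;
};
```

Loading the dictionary then amounts to calling insert once per line of dict.txt, after which size() gives the dictionary word count for the final report.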

To get a hash function for strings, declare a variable something like this: hash<string> hashStr;

Then, when you want to hash a particular string (let's say it is called string1):

hashStr(string1);

This "function" produces a value of type size_t, an unsigned integer, which you should mod by 1373.
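Put together, the hashing step might be wrapped as in the sketch below, which uses std::hash from the standard library. The function name bucketFor and its default argument are assumptions made for the example.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: map a word to a bucket index in [0, bucketCount).
std::size_t bucketFor(const std::string& word, std::size_t bucketCount = 1373) {
    std::hash<std::string> hashStr;      // function object that hashes strings
    return hashStr(word) % bucketCount;  // size_t result reduced to a valid index
}
```

The exact value std::hash returns is implementation-defined, so bucket numbers can differ between compilers. Within a single run, the same word always maps to the same bucket, which is all the table needs.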

You don't have to interpret verb tense, plurals, conjugations, etc. All you have to do is check each word against the dictionary.
