Question: In C++ Please fix my code or create new code where it would get the expected output (shown below) Test Expected Got ./program 25 mobydick.txt

In C++ Please fix my code or create new code where it would get the expected output (shown below)

Test	Expected	Got
	./program 25 mobydick.txt ignoreWords.txt	Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 25 --------------------------------------- 0.00302 - other 0.00300 - over 0.00297 - been 0.00296 - these 0.00290 - sea 0.00285 - said 0.00282 - down 0.00276 - yet 0.00275 - any 0.00270 - whales	Array doubled: 8 Distinct non-common words: 13745 Total non-common words: 71939 Probability of next 10 words from rank 25 --------------------------------------- 0.00285 mobydick 0.00282 other 0.00281 over 0.00278 been 0.00277 these 0.00271 sea 0.00267 said 0.00264 down 0.00259 yet 0.00257 any
	./program 5 mobydick.txt ignoreWords.txt	Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 5 --------------------------------------- 0.00515 - were 0.00480 - some 0.00460 - now 0.00452 - no 0.00449 - then 0.00443 - upon 0.00437 - like 0.00437 - when 0.00428 - ye 0.00419 - are	Array doubled: 8 Distinct non-common words: 13745 Total non-common words: 71939 Probability of next 10 words from rank 5 --------------------------------------- 0.00566 had 0.00482 were 0.00449 some 0.00431 now 0.00423 no 0.00420 then 0.00414 upon 0.00409 like 0.00409 when 0.00400 ye
	./program 10 mobydick.txt ignoreWords.txt	Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 10 --------------------------------------- 0.00443 - upon 0.00437 - like 0.00437 - when 0.00428 - ye 0.00419 - are 0.00407 - more 0.00364 - them 0.00359 - man 0.00345 - into 0.00325 - though	Array doubled: 8 Distinct non-common words: 13745 Total non-common words: 71939 Probability of next 10 words from rank 10 --------------------------------------- 0.00420 then 0.00414 upon 0.00409 like 0.00409 when 0.00400 ye 0.00392 are 0.00381 more 0.00341 them 0.00336 man 0.00322 into
	./program 15 HarryPotter.txt ignoreWords.txt	Array doubled: 6 Distinct non-common words: 5985 Total non-common words: 50331 Probability of next 10 words from rank 15 --------------------------------------- 0.00393 - got 0.00389 - no 0.00387 - could 0.00387 - didnt 0.00385 - like 0.00374 - know 0.00358 - down 0.00358 - just 0.00358 - professor 0.00358 - see	Array doubled: 6 Distinct non-common words: 5985 Total non-common words: 50331 Probability of next 10 words from rank 15 --------------------------------------- 0.00393 got 0.00389 no 0.00387 didnt 0.00387 could 0.00385 like 0.00374 know 0.00358 professor 0.00358 just 0.00358 see 0.00358 down

Code

In C++ Please fix my code or create new code where it would get the expected output (shown below) Test Expected Got ./program 25 mobydick.txt ignoreWords.txt Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 25 --------------------------------------- 0.00302 - other 0.00300 - over 0.00297 - been 0.00296 - these 0.00290 #include #include #include #include #include using namespace std;

struct wordRecord { string word; int count; };

//Function prototypes void getIgnoreWords(const char* ignoreWordFileName, string ignoreWords[]); bool isIgnoreWord(string word, string ignoreWords[]); int getTotalNumberNonIgnoreWords(wordRecord distinctWords[], int length); void sortArray(wordRecord distinctWords[], int length); void printTenFromN(wordRecord distinctWords[], int N, int totalNumWords);

int main(int argc,char* argv[]) { //Command line argument error check if (argc != 4) { cout " > ignoreWords[i]) { i++; } in.close(); } //Function return true if the word found in array else false bool isIgnoreWord(string word, string ignoreWords[]) { for (int i = 0; i

Instructions In this assignment, we will write a program to analyze the word frequency of a document. As the number of words in the document may not be known a priori, we will implement a dynamically doubling array to store the information. Please read all the directions before writing code, as this write-up contains specific requirements for how the code must be written. Problem Overview: There are two files on Canvas. mobydick.txt - contains text to be read and analyzed by your program. The file contains the full text from Moby Dick. For your convenience, all the punctuation has been removed, words have been converted to lowercase, and the entire document can be read as if it were written on a single line. ignoreWords.txt contains the 50 of the most common words in the English language, which your program will ignore during analysis. Your program must take three command line arguments in the following order - a number N, the file name of the text to be read, and the file name containing the words to be ignored. It will read the text from the first file while ignoring the words from the second file and store all the unique words encountered in a dynamically doubling array. After necessary calculation, the program must print the following information: O . . The number of times array doubling was required to store all the unique words The number of unique non-ignore" words in the file The total word count of the file (excluding the ignore words) After calculating the probability of occurrence of each word and storing it in an array in the decreasing order of probability, starting from index N of the array, print the 10 most frequent words along with their probability (up to 5 decimal places) . For example, running your program with the command: ./Assignment2 25 mobydick.txt ignorewords.txt would print the next 10 words starting from index 25, i.e. your program must print the 25th-34th most frequent words, along with their respective probabilities. Keep in mind that these words must not be any of the words from ignoreWords.txt. The full results would be: Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 25 0.00302 - other 0.00300 - over 0.00297 - been 0.00296 - these 0.00290 - sea 0.00285 - said 0.00282 - down 0.00276 - yet 0.00275 - any 0.00270 - whales Specifications: 1. Use an array of structs to store the words and their counts You will store each unique word and its count (the number of times it occurs in the document) in an array of structs. As the number of unique words is not known ahead of time, the array of structs must be dynamically sized. The struct must be defined as follows: struct wordRecord { string word; int count; }; 2. Use the array-doubling algorithm to increase the size of your array Your array will need to grow to fit the number of words in the file. Start with an array size of 100, and double the size whenever the array runs out of free space. You will need to allocate your array dynamically and copy values from the old array to the new array. (Array-doubling algorithm must be implemented in the main() function). Note: Don't use the built-in std::vector class. This will result in a loss of points. You're actually writing the code that the built-in vector uses behind-the-scenes! 3. Ignore the top 50 most common words that are read from the ignoreWords.txt file To get useful information about word frequency, we will be ignoring the 50 most common words in the English language as noted in ignoreWords.txt 4. Take three command line arguments Your program must take three command line arguments 1. a number N which tells your program the starting index to print the next 10 most frequent words 2. the name of the text file to be read and analyzed 3. The name of the text file with the words to be ignored. 5. Output the Next 10 most frequent words starting from index N Your program must print out the next 10 most frequent words - not including the common words - starting index N in the text where N is passed as a command line argument. If two words have the same frequency, list them alphabetically. 6. Format your final output this way: Array doubled: Distinct non-common words: Total non-common words: Probability of next 10 words from rank - - - For example, using the command: ./Assignment2 25 mobydick.txt ignorewords.txt This function must read the stop words from the file with the name stored in ignoreWordFileName and store them in the ignoreWords array. You can assume there will be exactly 50 stop words. There is no return value. In case the file fails to open, print an error message using the below cout statement: std::cout header. However, you may write your own implementation of a sorting algorithm, such as Bubble Sort. Feel free to refer to the pseudocode for bubble sort that was given in Assignment 1. f. printTenFromN function void printTenFromN (wordRecord distinctWords [], int n, int totalNumWords); This function must print the next 10 words after the starting index N from the sorted array of distinctWords. These 10 words must be printed with their probability of occurrence up to 5 decimal places. The exact format of this printing is given below . The function does not return anything. Probability of occurrence of a word at position ind in the array is computed using the formula: (Don't forget to cast to float!) probability-of-occurrence = (float) unique Words[ind].count/ totalNumWords Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 25 - - 0.00302 - other 0.00300 - over 0.00297 been 0.00296 these 0.00290 sea 0.00285 said 0.00282 down 0.00276 yet 0.00275 any 0.00270 - whales - Instructions In this assignment, we will write a program to analyze the word frequency of a document. As the number of words in the document may not be known a priori, we will implement a dynamically doubling array to store the information. Please read all the directions before writing code, as this write-up contains specific requirements for how the code must be written. Problem Overview: There are two files on Canvas. mobydick.txt - contains text to be read and analyzed by your program. The file contains the full text from Moby Dick. For your convenience, all the punctuation has been removed, words have been converted to lowercase, and the entire document can be read as if it were written on a single line. ignoreWords.txt contains the 50 of the most common words in the English language, which your program will ignore during analysis. Your program must take three command line arguments in the following order - a number N, the file name of the text to be read, and the file name containing the words to be ignored. It will read the text from the first file while ignoring the words from the second file and store all the unique words encountered in a dynamically doubling array. After necessary calculation, the program must print the following information: O . . The number of times array doubling was required to store all the unique words The number of unique non-ignore" words in the file The total word count of the file (excluding the ignore words) After calculating the probability of occurrence of each word and storing it in an array in the decreasing order of probability, starting from index N of the array, print the 10 most frequent words along with their probability (up to 5 decimal places) . For example, running your program with the command: ./Assignment2 25 mobydick.txt ignorewords.txt would print the next 10 words starting from index 25, i.e. your program must print the 25th-34th most frequent words, along with their respective probabilities. Keep in mind that these words must not be any of the words from ignoreWords.txt. The full results would be: Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 25 0.00302 - other 0.00300 - over 0.00297 - been 0.00296 - these 0.00290 - sea 0.00285 - said 0.00282 - down 0.00276 - yet 0.00275 - any 0.00270 - whales Specifications: 1. Use an array of structs to store the words and their counts You will store each unique word and its count (the number of times it occurs in the document) in an array of structs. As the number of unique words is not known ahead of time, the array of structs must be dynamically sized. The struct must be defined as follows: struct wordRecord { string word; int count; }; 2. Use the array-doubling algorithm to increase the size of your array Your array will need to grow to fit the number of words in the file. Start with an array size of 100, and double the size whenever the array runs out of free space. You will need to allocate your array dynamically and copy values from the old array to the new array. (Array-doubling algorithm must be implemented in the main() function). Note: Don't use the built-in std::vector class. This will result in a loss of points. You're actually writing the code that the built-in vector uses behind-the-scenes! 3. Ignore the top 50 most common words that are read from the ignoreWords.txt file To get useful information about word frequency, we will be ignoring the 50 most common words in the English language as noted in ignoreWords.txt 4. Take three command line arguments Your program must take three command line arguments 1. a number N which tells your program the starting index to print the next 10 most frequent words 2. the name of the text file to be read and analyzed 3. The name of the text file with the words to be ignored. 5. Output the Next 10 most frequent words starting from index N Your program must print out the next 10 most frequent words - not including the common words - starting index N in the text where N is passed as a command line argument. If two words have the same frequency, list them alphabetically. 6. Format your final output this way: Array doubled: Distinct non-common words: Total non-common words: Probability of next 10 words from rank - - - For example, using the command: ./Assignment2 25 mobydick.txt ignorewords.txt This function must read the stop words from the file with the name stored in ignoreWordFileName and store them in the ignoreWords array. You can assume there will be exactly 50 stop words. There is no return value. In case the file fails to open, print an error message using the below cout statement: std::cout header. However, you may write your own implementation of a sorting algorithm, such as Bubble Sort. Feel free to refer to the pseudocode for bubble sort that was given in Assignment 1. f. printTenFromN function void printTenFromN (wordRecord distinctWords [], int n, int totalNumWords); This function must print the next 10 words after the starting index N from the sorted array of distinctWords. These 10 words must be printed with their probability of occurrence up to 5 decimal places. The exact format of this printing is given below . The function does not return anything. Probability of occurrence of a word at position ind in the array is computed using the formula: (Don't forget to cast to float!) probability-of-occurrence = (float) unique Words[ind].count/ totalNumWords Array doubled: 8 Distinct non-common words: 13744 Total non-common words: 67327 Probability of next 10 words from rank 25 - - 0.00302 - other 0.00300 - over 0.00297 been 0.00296 these 0.00290 sea 0.00285 said 0.00282 down 0.00276 yet 0.00275 any 0.00270 - whales

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

(I need help with coding this. I keep ending up with too many errors, so if you can, please comment as much as you can. All the steps come together, and it needs to be coded in C++) OBJECTIVES 1....

IN C++ the be to of and a in that have i it for not on with he as you do at this but his by from they we say her she or an will my one all would there their what so up out if about who get which go...

I'm not getting this. Please help (ClassData) CS 210 5 ANTER ZACHARY L 91.78 68.78 73.09 87.36 51.08 ARREOLA JUAN J 50.15 65.04 61.32 59.27 89.99 BARBER ANTONIO D 72.71 78.46 89.64 91.76 91.35 BURKE...

In C++ please. Project requires code to be executable. Base Code is included at the bottom for your convenience. Base Code for your convenience: You will write an nxn tic-tac-toe game/program that...

C++ Please write clear code so i could understand it. 11 function about >>1. Default Constructor ,2.Overloaded constructor, 3.GetSongName,4. SetSongName, 5.GetAlbumName, 6.SetAlbumName,...

Please help. This is a C++ assignment. My code is below and it is incomplete. It compiles, but doesn't display everything I need it to. Please fix my code. C++ Program Help (Learning Goals) Functions...

This is a C++ Program I need help with. My code is below and it is incomplete. It compiles, but doesn't display everything I need it to. Please fix my code. C++ Program Help (Learning Goals)...

[Java] I'm keep getting an error and having no idea why. Please fix my code. The grader bot will give you 90 points for a perfectly working app. RecordFormatException This class should extend...

Task: Implement reliable transport protocol using Selective Repeat in C by studying and extending an implementation of Go Back N running on a network simulator. This practical requires thought and...

can anyone fix this code? Test test_stropy_1 (./test_stropy - passed Test test_stropy_2 (/test_stropy "a") - passed Test test_stropy_6 (/test_stropy "test test what") - passed Test test_stropy_4 (...

When beats occur at a rate higher than about 20 per second, they are not heard individually but rather as a steady hum, called a combination tone. The player of a typical pipe organ can press a...

Hamad Corporation began operations on January 1, 2011. Recently the corporation has had several unusual accounting problems related to the presentation of its income statement for financial reporting...

Upon identifying any form of noncompliance, Reed must communicate this to Smith Company's management and, if the situation warrants, to those overseeing governance. Depending on the jurisdiction,...

please dont use chat gpt or other AI 8 6 5 .

How would redundant storage of values in columns in a Table be eliminated?

What do Primary Keys, along with Third Normal Form Design in a Database Model, achieve?

Describe Table Structures in RDMSs.