Question: Program Specifications

The following are requirements for your program:
o Read in the name of the file to process from the second command-line argument.
o Read in the number of most common words to process from the first command-line argument.
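A minimal sketch of the command-line handling described above, assuming the count is a positive integer. The helper name parseArgs and its parameter names are hypothetical; the assignment does not name them.

```cpp
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper (not named in the assignment): extracts the two
// required values from argv. Returns false if either is missing or invalid.
bool parseArgs(int argc, char* argv[], int& topN, std::string& fileName) {
    // argv[0] is the program name, so two user arguments means argc >= 3.
    if (argc < 3) {
        return false;
    }
    // First argument: the number of most common words to report.
    char* end = nullptr;
    long value = std::strtol(argv[1], &end, 10);
    // Reject non-numeric text, trailing junk, and out-of-range counts.
    if (end == argv[1] || *end != '\0' || value <= 0 || value > INT_MAX) {
        return false;
    }
    topN = static_cast<int>(value);
    // Second argument: the name of the text file to process.
    fileName = argv[2];
    return true;
}
```

A main function would typically call parseArgs(argc, argv, topN, fileName) first and print a usage message if it returns false.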

Write a function named getStopwords that takes the name of the ignore-words file and a reference to a vector as parameters (returns void). Read in the file for a list of the top 50 most common words to ignore (e.g., Table 1). These are commonly referred to as stopwords in NLP (Natural Language Processing). (Create this file yourself.)
o The file will have one word per line, and always have exactly 50 words in the file. We will test with files having different words in it!
o Your function will update the vector passed to it with a list of the words from the file.
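One possible shape for getStopwords, as a sketch only. The function name, the void return type, and the two parameter kinds come from the requirements above; the parameter names and the decision to return silently when the file cannot be opened are assumptions.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Reads one stopword per line from the named file and appends each
// word to the vector supplied by the caller.
void getStopwords(const std::string& ignoreWordFileName,
                  std::vector<std::string>& stopwords) {
    std::ifstream in(ignoreWordFileName);
    if (!in.is_open()) {
        return;  // assumption: leave the vector unchanged on failure
    }
    std::string word;
    // operator>> skips whitespace, so blank lines and stray carriage
    // returns are not stored as words.
    while (in >> word) {
        stopwords.push_back(word);
    }
}
```

Because the file always holds exactly 50 words, the caller could check that the vector has 50 entries afterwards to catch a bad file name early.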

My question is below.

Store the unique words found in the file that are not in the stopword list in a dynamically allocated array.
o Call a function to check if the word is a stopword first, and if it is, then ignore that word.
o Use an array of structs to store each unique word (variable name word) and a count (variable name count) of how many times it appears in the text file.
o Use the array-doubling algorithm to increase the size of your array.
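The struct and the stopword check could look roughly like this. The field names word and count come from the requirements; the struct name wordItem and the function names isStopword and findWord are assumptions, since the starter code is not shown.

```cpp
#include <string>
#include <vector>

// One entry per unique non-stop word. Field names follow the
// assignment; the struct name is an assumption.
struct wordItem {
    std::string word;
    int count;
};

// Hypothetical name: returns true if word appears in the stopword list.
bool isStopword(const std::string& word,
                const std::vector<std::string>& stopwords) {
    for (const std::string& s : stopwords) {
        if (s == word) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: returns the index of word among the first
// numUnique entries, or -1 if it has not been stored yet.
int findWord(const wordItem* items, int numUnique, const std::string& word) {
    for (int i = 0; i < numUnique; ++i) {
        if (items[i].word == word) {
            return i;
        }
    }
    return -1;
}
```

For each word read from the file, the loop would skip it when isStopword returns true, increment items[i].count when findWord returns a valid index, and otherwise store it as a new entry with a count of 1.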

We don't know ahead of time how many unique words the input file will have, so you don't know how big the array should be. Start with an array size of 100 (use the constant declared in the starter code), and double the size as words are read in from the file and the array fills up with new words.
o Use dynamic memory allocation to create your array.
o Copy the values from the current array into the new array.
o Free the memory used for the current array.
(Index of any given word in the array after resizing must match index in array before resizing.)
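The allocate, copy, free steps above can be sketched as follows. INITIAL_CAPACITY stands in for the starter-code constant, whose real name is not given here; wordItem, doubleArray, and addWord are likewise hypothetical names.

```cpp
#include <string>

struct wordItem {
    std::string word;
    int count;
};

// Stand-in for the constant declared in the starter code (name assumed).
const int INITIAL_CAPACITY = 100;

// Replaces items with an array twice as large. Elements are copied
// index by index, so every word keeps the same position after resizing.
void doubleArray(wordItem*& items, int& capacity, int& timesDoubled) {
    int newCapacity = capacity * 2;
    wordItem* bigger = new wordItem[newCapacity]();  // value-initialized
    for (int i = 0; i < capacity; ++i) {
        bigger[i] = items[i];
    }
    delete[] items;  // free the memory used for the old array
    items = bigger;
    capacity = newCapacity;
    ++timesDoubled;
}

// Appends a new unique word with a count of 1, doubling first if the
// array is already full.
void addWord(wordItem*& items, int& numUnique, int& capacity,
             int& timesDoubled, const std::string& word) {
    if (numUnique == capacity) {
        doubleArray(items, capacity, timesDoubled);
    }
    items[numUnique].word = word;
    items[numUnique].count = 1;
    ++numUnique;
}
```

The caller would start with wordItem* items = new wordItem[INITIAL_CAPACITY](); and release it with delete[] items; once the results have been printed. The timesDoubled counter supplies the "number of times you had to double the array" output.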

Output the top n most frequent words.
o Write a function named printTopN that takes a reference to the array of structs and the value of n to determine the top n words in the array.
o Generate an array of the n top items sorted from most frequent to least frequent and print these out from most to least.
o Array MUST be sorted before calling printTopN.
Output the number of times you had to double the array.
Output the number of unique non-stop words.
Output the total number of non-stop words.
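A sketch of the sorting and printing steps, assuming the array is sorted in place before printTopN is called. The helper names sortByCount and totalNonStopWords, the struct name wordItem, and the "count - word" output format are assumptions; only printTopN and its two parameters come from the requirements.

```cpp
#include <algorithm>
#include <iostream>
#include <string>

struct wordItem {
    std::string word;
    int count;
};

// Hypothetical helper: sorts the first numUnique entries from most to
// least frequent. stable_sort keeps tied words in their original order.
void sortByCount(wordItem* items, int numUnique) {
    std::stable_sort(items, items + numUnique,
                     [](const wordItem& a, const wordItem& b) {
                         return a.count > b.count;
                     });
}

// Prints the first topN entries. Assumes the array is already sorted
// and that topN does not exceed the number of unique words stored.
void printTopN(wordItem items[], int topN) {
    for (int i = 0; i < topN; ++i) {
        std::cout << items[i].count << " - " << items[i].word << '\n';
    }
}

// Hypothetical helper: total non-stop words is the sum of all counts.
int totalNonStopWords(const wordItem* items, int numUnique) {
    int total = 0;
    for (int i = 0; i < numUnique; ++i) {
        total += items[i].count;
    }
    return total;
}
```

The number of unique non-stop words is simply the number of entries stored in the array, so no extra pass is needed for that output.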
