Question: The following code counts how often different words occur in a text file. In this program, bstmap has an interface and behavior identical to a

The following code counts how often different words occur in a text file. In this program, bstmap has an interface and behavior identical to a std::map but is implemented using simple binary search trees.

char toLC (char c) { return (c >= 'A' && c <= 'Z') ? c-'A'+'a' : c; } void countWordsInFile (istream& inFile, bstmap& counts) { string word; while (inFile >> word) { // strip away any non-alphabetic characters string::size_type n = word.find_first_not_of (wordCharacters); if (n != string::npos) word = word.substr (0, n); if (word.length() > 0) { // if there's anything left, count it word = transform(word.begin(), word.end(), word.begin(), toLC); /** ... increment the appropriate counter in counts ... **/ map::iterator pos = counts.find (word); if (pos == counts.end()) // if this is the 1st time we've seen // this word counts.insert (map::value_type(word, 1)); else // else if we've seen this word before ++((*pos).second); } } }

Let W denote the number of words in the document file being processed. Many of these words, in a typical document, will be duplicates. Let D denote the number of distinct (non-duplicate) words, D <= W. A study of several large text files has suggested that, for sufficiently large W, authors will use an average vocabulary of D=c*log(W) distinct words, where c is some constant. Assume the length of any individual word is bounded by some constant (e.g., no single word is longer than "antidisestablishmentarianism"). What is the average-case complexity for countWordsInFile?

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!