Question: Write a C program to find the ten most frequently used words in a given file.

Your executable file must be named wordcount.

It should take the name of a file as its argument and output ten lines, one for each of the ten most frequently used words in the file, in sorted order.

If two words w1 and w2 have the same occurrence count and w1 is alphabetically smaller than w2, then w1 precedes w2. Each line shows the occurrence count, followed by a single whitespace character, followed by the corresponding word.

For example, suppose the contents of example.txt are as follows:

$ cat example.txt
A potential victory for Donald J. Trump may hinge on one important (and large) group of Americans: whites who did not attend college. Polls have shown a deep division between whites of different education levels and economic circumstances. A lot rides on how large these groups will be on Election Day: All pollsters have their own assessment of who will show up, and their predictions rely on these evaluations.

Then, the expected output of the ./wordcount example.txt command should be:

$ ./wordcount example.txt
2 large
2 their
2 these
2 whites
2 who
2 will
3 a
3 and
3 of
4 on
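The ordering rule can be expressed as a qsort comparator. The following is a minimal sketch, not a reference solution. It assumes counts are listed in ascending order, as in the sample output, and that the ten lines printed are the last ten entries of that ordering, which is what the sample appears to show. The names struct entry, compare_entries and print_top_ten are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 100 /* the assignment caps word length at 100 characters */

/* Hypothetical record: one distinct word and its occurrence count. */
struct entry {
    char word[MAX_WORD + 1];
    int count;
};

/* qsort comparator: ascending by count, then alphabetical for equal counts. */
int compare_entries(const void *a, const void *b)
{
    const struct entry *x = a;
    const struct entry *y = b;

    if (x->count != y->count)
        return (x->count > y->count) - (x->count < y->count);
    return strcmp(x->word, y->word);
}

/* Sort all entries, then print the last ten (or fewer if n < 10). */
void print_top_ten(struct entry *entries, size_t n)
{
    qsort(entries, n, sizeof entries[0], compare_entries);

    size_t start = n > 10 ? n - 10 : 0;
    for (size_t i = start; i < n; i++)
        printf("%d %s\n", entries[i].count, entries[i].word);
}
```

Because strcmp compares bytes, "alphabetical" here means byte order, which matches dictionary order for lowercase ASCII words but places digits before letters.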

To count the words, your program needs to transform the given text file into a collection of words. First, you need to split the text file into words. Each word is a consecutive sequence of alphanumeric characters. Words are separated by one or more non-alphanumeric characters. (Hint: use the C library function isalnum. For usage, type man isalnum.) Second, you need to "canonicalize" (also known as "normalize") each word by converting any uppercase letters to lowercase.
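The splitting and canonicalizing steps can be combined in one pass over the file. The sketch below is one possible way to do it, under the assumption that characters beyond the 100-character limit are simply dropped. The function name next_word and the constant MAX_WORD are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>

#define MAX_WORD 100 /* the assignment caps word length at 100 characters */

/*
 * Read the next word from fp into buf, which must hold MAX_WORD + 1 bytes.
 * The word is stored in lowercase. Returns 1 if a word was read and 0 once
 * the end of the file is reached.
 */
int next_word(FILE *fp, char *buf)
{
    int c;
    size_t len = 0;

    /* Skip any run of separator (non-alphanumeric) characters. */
    while ((c = fgetc(fp)) != EOF && !isalnum(c))
        ;
    if (c == EOF)
        return 0;

    /* Collect the run of alphanumeric characters, lowercasing each one. */
    do {
        if (len < MAX_WORD) /* assumption: silently drop any excess */
            buf[len++] = (char)tolower(c);
    } while ((c = fgetc(fp)) != EOF && isalnum(c));

    buf[len] = '\0';
    return 1;
}
```

fgetc returns each byte as an unsigned char converted to int, so its result can be passed to isalnum and tolower directly. A caller would open the file named in argv[1] with fopen and call next_word in a loop until it returns 0.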

To count the occurrences of words, there are several strategies.

You can implement a hash table to store the mapping from each word (a C string) to its occurrence counter. C's standard library has neither hash tables nor dictionaries, so you'll have to implement your own. After you've counted all the words, you'll need to sort them by their occurrence counters.
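A hash table for this purpose can be quite small. The sketch below assumes separate chaining with a fixed number of buckets and a djb2-style string hash; resizing and freeing the table are omitted. The names struct table, table_add, table_get and NBUCKETS are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 65536 /* hypothetical fixed bucket count; no resizing */

struct node {
    char *word;        /* heap-allocated copy of the word */
    int count;         /* occurrences seen so far */
    struct node *next; /* next node in the same bucket */
};

struct table {
    struct node *buckets[NBUCKETS]; /* must start out all NULL */
    size_t distinct;                /* number of distinct words stored */
};

/* djb2-style string hash. */
static unsigned long hash_word(const char *s)
{
    unsigned long h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/*
 * Count one occurrence of word, inserting it if it is not yet present.
 * Returns the updated count, or -1 if memory allocation fails.
 */
int table_add(struct table *t, const char *word)
{
    struct node **slot = &t->buckets[hash_word(word) % NBUCKETS];

    for (struct node *n = *slot; n != NULL; n = n->next)
        if (strcmp(n->word, word) == 0)
            return ++n->count;

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    size_t len = strlen(word) + 1;
    n->word = malloc(len);
    if (n->word == NULL) {
        free(n);
        return -1;
    }
    memcpy(n->word, word, len);
    n->count = 1;
    n->next = *slot; /* push onto the front of the bucket's chain */
    *slot = n;
    t->distinct++;
    return 1;
}

/* Return the count recorded for word, or 0 if it was never added. */
int table_get(const struct table *t, const char *word)
{
    for (const struct node *n = t->buckets[hash_word(word) % NBUCKETS];
         n != NULL; n = n->next)
        if (strcmp(n->word, word) == 0)
            return n->count;
    return 0;
}
```

The table is large, so it is best allocated with calloc(1, sizeof(struct table)), which also zeroes the buckets. After counting, the nodes can be copied into an array of distinct entries and sorted with qsort. With a fixed bucket count, lookups slow down once the number of distinct words grows far beyond NBUCKETS, so a production version would likely grow the table.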

Alternatively, you can sort all the words first. For sorting, you should learn to use the library function qsort (type man qsort to learn how to use it). You can then sum consecutive identical words in the sorted list to count them. Finally, you need to sort the words by their occurrence counts.
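The sort-first strategy can be sketched as follows. This version assumes the words have already been read into an array of fixed-size buffers, which is simple but uses about 100 bytes per word; the names count_runs, compare_words and MAX_WORD are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 100 /* the assignment caps word length at 100 characters */

/* qsort comparator for an array of fixed-size word buffers. */
static int compare_words(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/*
 * Sort words[0..n-1], then collapse each run of identical words in place.
 * On return, words[i] is the i-th distinct word and counts[i] is how many
 * times it occurred. counts must have room for n entries.
 * Returns the number of distinct words.
 */
size_t count_runs(char (*words)[MAX_WORD + 1], size_t n, int *counts)
{
    size_t distinct = 0;

    qsort(words, n, sizeof words[0], compare_words);

    for (size_t i = 0; i < n; i++) {
        if (distinct > 0 && strcmp(words[distinct - 1], words[i]) == 0) {
            counts[distinct - 1]++; /* same word as the current run */
        } else {
            if (distinct != i)      /* start a new run, compacting */
                strcpy(words[distinct], words[i]);
            counts[distinct] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

The first sort costs O(n log n) comparisons and the run-counting pass is linear, so the overall cost stays within the stated bound. A second qsort on (count, word) pairs then produces the final ordering.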

You should write a Makefile to compile your program, and write your own unit tests to check its correctness.

Note that we will test your program using large text files. Your program must not run slower than O(n log n); otherwise, it will not pass the test. You may assume that the maximum size of a word is no more than 100 characters.

I have a few days to complete this assignment, so please take your time when answering this question. Please make sure to attach the main() code as well as the Makefile. I look forward to reading your answer.
