Question:

Write a program to construct a dictionary of all “words,” defined to be runs of consecutive nonwhitespace, in a given text file. We might then compress the file (ignoring the loss of whitespace information) by representing each word as an index in the dictionary. Retrieve the file rfc791.txt from the RFC repository, and run your program on it.

Give the size of the compressed file, assuming first that each word is encoded with 12 bits (this should be sufficient) and then that the 128 most common words are encoded with 8 bits and the rest with 13 bits. Assume that the dictionary itself can be stored by using, for each word, length(word) + 1 bytes.
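The quantities asked for can be written out as follows. This is only a restatement of the exercise in symbols; the names $V$, $c(w)$, $N$, $T$, $D$ and $S$ are introduced here for illustration and do not come from the original text. Let $V$ be the set of distinct words, $c(w)$ the number of occurrences of word $w$, $N = \sum_{w \in V} c(w)$ the total number of word occurrences, and $T \subseteq V$ the 128 most common words.

```latex
\begin{align*}
D &= \sum_{w \in V} \bigl(\operatorname{length}(w) + 1\bigr)
  && \text{dictionary size, in bytes} \\
S_{\text{fixed}} &= 12\,N
  && \text{encoded text, in bits} \\
S_{\text{var}} &= 8 \sum_{w \in T} c(w) \;+\; 13 \Bigl( N - \sum_{w \in T} c(w) \Bigr)
  && \text{encoded text, in bits}
\end{align*}
```

Under this reading, the total for each scheme would be $\lceil S/8 \rceil + D$ bytes if the dictionary is counted as part of the compressed file. Note that 12-bit indices can address at most $2^{12} = 4096$ distinct words, so the first scheme works only if $|V| \le 4096$.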

Step by Step Solution

There are several steps to answer your question. First, we need to build a program that constructs the ...
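Since the full answer is not shown here, the following is a minimal sketch of one way to approach the exercise, not the expert's solution. It assumes rfc791.txt has already been downloaded to a local file, that word length is measured in characters (reasonable for ASCII RFC text), and that encoded sizes are rounded up to whole bytes. The function names `build_dictionary` and `compressed_sizes` are hypothetical.

```python
import sys
from collections import Counter


def build_dictionary(text):
    """Return a Counter mapping each word to its number of occurrences.

    A "word" is a maximal run of non-whitespace characters, which is
    what str.split() with no arguments produces.
    """
    return Counter(text.split())


def compressed_sizes(text, fixed_bits=12, short_bits=8, long_bits=13, top_n=128):
    """Compute the sizes asked for in the exercise, in bytes."""
    counts = build_dictionary(text)
    total_words = sum(counts.values())

    # Dictionary cost: length(word) + 1 bytes per distinct word.
    # Assumption: length is counted in characters.
    dict_bytes = sum(len(word) + 1 for word in counts)

    # Scheme 1: every word occurrence costs fixed_bits bits.
    fixed_total_bits = total_words * fixed_bits

    # Scheme 2: occurrences of the top_n most common words cost short_bits;
    # every other occurrence costs long_bits.
    common_occurrences = sum(n for _, n in counts.most_common(top_n))
    variable_total_bits = (common_occurrences * short_bits
                           + (total_words - common_occurrences) * long_bits)

    def to_bytes(bits):
        # Round up to a whole number of bytes.
        return (bits + 7) // 8

    return {
        "distinct_words": len(counts),
        "total_words": total_words,
        "dictionary_bytes": dict_bytes,
        "fixed_bytes": to_bytes(fixed_total_bits),
        "variable_bytes": to_bytes(variable_total_bits),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python wordsizes.py rfc791.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for key, value in compressed_sizes(f.read()).items():
            print(f"{key}: {value}")
```

Running the script on the downloaded file prints the number of distinct and total words, the dictionary size, and the encoded-text size under both schemes; adding `dictionary_bytes` to either text size gives a total that includes the dictionary. Ties among equally frequent words at the 128-word cutoff are broken by order of first appearance, which does not change the resulting size.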
