Question: Implement this in C++ please! I'll appreciate it very much!!

Huffman coding is used to compress data. The idea is straightforward: represent the more common, longer strings with shorter ones via a basic translation matrix. The translation matrix is easily computed from the data itself by counting tokens and sorting them by frequency.

For example, in a well-known corpus used in Natural Language Processing called the "Brown" corpus (see nltk.org), the top-20 most frequent tokens, which are words or punctuation marks, are listed below with their frequency and code. The word "and", for example, requires writing three characters. However, if I encoded it differently, say, using the word "5" (yes, I called "5" a word on purpose), then I save having to write two extra characters! Note that the word "and" is so frequent that I save those two extra characters many times over!

Code  Token      Frequency
1     the        62713
2     (missing)  58334
3     (missing)  49346
4     of         36080
5     and        27932
6     to         25732
7     a          21881
8     in         19536
9     that       10237
10    is         10011
11    was        9777
12    for        8841
13    (missing)  8837
14    (missing)  8789
15    The        7258
16    with       7012
17    it         6723
18    as         6706
19    he         6566
20    his        6466

So the steps of Huffman coding are relatively straightforward:

1. Pass through the data once, collecting a list of token-frequency counts.
2. Sort the token-frequency counts by frequency, in descending order.
3. Assign codes to tokens using a simple counter, for example by incrementing over the integers; this is just to keep things simple.
4. Store the new mapping (token -> code) in a hashtable called "encoder".
5. Store the reverse mapping (code -> token) in a hashtable called "decoder".
6. Pass through the data a second time. This time, replace all tokens with their codes.

Now, be amazed at how much you've shrunk your data!

Delivery Notes:

(1) Implement your own hashtable from scratch; you are not allowed to use existing hash table libraries.
(2) To be useful, your output should include the coded data as well as the decoder (code -> token) mapping file.
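The six steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution: tokens are taken to be whitespace-separated, the hash table uses a fixed number of buckets with chaining and never resizes, and every name here (HashTable, Codebook, tokenize, buildCodebook, encode, decode, decoderFile) is made up for the example. It follows the question's simplified integer-counter scheme, which is not the variable-length prefix-code construction that the name "Huffman coding" usually refers to.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hand-rolled string -> V hash table using separate chaining.
// Assumption: a fixed bucket count is enough for a sketch (no resizing).
template <typename V>
class HashTable {
    std::vector<std::vector<std::pair<std::string, V>>> buckets_;
    std::size_t index(const std::string& key) const {
        std::uint64_t h = 14695981039346656037ULL;  // 64-bit FNV-1a offset basis
        for (unsigned char c : key) { h ^= c; h *= 1099511628211ULL; }
        return static_cast<std::size_t>(h % buckets_.size());
    }
public:
    explicit HashTable(std::size_t bucketCount = 4096) : buckets_(bucketCount) {}
    // Returns the value stored for key, inserting a default V if it is absent.
    V& operator[](const std::string& key) {
        auto& chain = buckets_[index(key)];
        for (auto& kv : chain) if (kv.first == key) return kv.second;
        chain.emplace_back(key, V());
        return chain.back().second;
    }
    // Returns a pointer to the stored value, or nullptr if key is absent.
    const V* find(const std::string& key) const {
        for (const auto& kv : buckets_[index(key)]) if (kv.first == key) return &kv.second;
        return nullptr;
    }
    // Copies every (key, value) pair out, in unspecified order.
    std::vector<std::pair<std::string, V>> items() const {
        std::vector<std::pair<std::string, V>> out;
        for (const auto& chain : buckets_) out.insert(out.end(), chain.begin(), chain.end());
        return out;
    }
};

struct Codebook {
    HashTable<int> encoder;          // token -> code
    HashTable<std::string> decoder;  // code (as a decimal string) -> token
};

// Assumption: tokens are separated by whitespace.
std::vector<std::string> tokenize(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;) tokens.push_back(t);
    return tokens;
}

Codebook buildCodebook(const std::vector<std::string>& tokens) {
    HashTable<int> counts;
    for (const auto& t : tokens) ++counts[t];             // step 1: count tokens
    auto freq = counts.items();
    std::sort(freq.begin(), freq.end(), [](const auto& a, const auto& b) {
        // step 2: descending frequency; ties broken alphabetically for determinism
        return a.second != b.second ? a.second > b.second : a.first < b.first;
    });
    Codebook cb;
    int code = 1;                                         // step 3: simple counter
    for (const auto& kv : freq) {
        cb.encoder[kv.first] = code;                      // step 4
        cb.decoder[std::to_string(code)] = kv.first;      // step 5
        ++code;
    }
    return cb;
}

// Step 6: replace each token with its code, separated by single spaces.
std::string encode(const std::vector<std::string>& tokens, const Codebook& cb) {
    std::string out;
    for (const auto& t : tokens) {
        const int* code = cb.encoder.find(t);
        if (!code) continue;  // unknown tokens are skipped in this sketch
        if (!out.empty()) out += ' ';
        out += std::to_string(*code);
    }
    return out;
}

// Inverse of encode; whitespace layout of the original text is not preserved.
std::string decode(const std::string& coded, const Codebook& cb) {
    std::string out;
    for (const auto& c : tokenize(coded)) {
        const std::string* tok = cb.decoder.find(c);
        if (!tok) continue;
        if (!out.empty()) out += ' ';
        out += *tok;
    }
    return out;
}

// Contents of the decoder mapping file: one "code<TAB>token" line per entry.
std::string decoderFile(const Codebook& cb) {
    auto rows = cb.decoder.items();
    std::sort(rows.begin(), rows.end(), [](const auto& a, const auto& b) {
        return std::stoi(a.first) < std::stoi(b.first);
    });
    std::string out;
    for (const auto& kv : rows) out += kv.first + "\t" + kv.second + "\n";
    return out;
}
```

To use it, a main function would read the input text, call tokenize and buildCodebook, then write the result of encode to one file and the result of decoderFile to another, which covers both items in the delivery notes. Keying the decoder by the code's decimal string keeps a single string-keyed table type for both mappings; an integer-keyed table would work equally well.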

