Question: Huffman coding is a common compression technique that assigns fewer bits to frequent items, using a binary tree. The input is a document consisting of

Huffman coding is a common compression technique that assigns fewer bits to frequent items, using a binary tree. The input is a document consisting of a bunch of items (e.g., characters). Huffman coding outputs a set of codewords for all items.

Below, we use an example to illustrate the process.

Given input string aabcabdacb

1. Huffman coding first determines the frequencies of each item. Here, a occurs 4 times, b 3, c 2, and d 1. Total is 10.

2. When building the tree, each item is first set as a "leaf node":

3. The pair of nodes yielding the lowest sum is found, and merged into a new node formed with that sum. Here, c and d yield 2 + 1 = 3.

4. The merging continues. The lowest sum is b's 3 plus the new node's 3, yielding 6. (Note that c and d are no longer eligible nodes).

5. The merging ends when only 1 node exists.

6. Each leaf node's encoding is obtained by traversing from the top node to the left. Each left branch appends a 0, and each right branch appends a 1, to the code. Below are the codewords for all four items:

Item Codeword
a 0
b 10
c 110
d 111

7. Then the input string aabcabdacb can be decoded to 0010110010111011010.

Huffman coding is a common compression technique that assigns fewer bits to frequent items, using a binary tree. The input is a document consisting of a bunch of items (e.g., characters). Huffman coding outputs a set of codewords for all items.

Below, we use an example to illustrate the process.

Given input string aabcabdacb

1. Huffman coding first determines the frequencies of each item. Here, a occurs 4 times, b 3, c 2, and d 1. Total is 10.

2. When building the tree, each item is first set as a "leaf node":

3. The pair of nodes yielding the lowest sum is found, and merged into a new node formed with that sum. Here, c and d yield 2 + 1 = 3.

4. The merging continues. The lowest sum is b's 3 plus the new node's 3, yielding 6. (Note that c and d are no longer eligible nodes).

5. The merging ends when only 1 node exists.

6. Each leaf node's encoding is obtained by traversing from the top node to the left. Each left branch appends a 0, and each right branch appends a 1, to the code. Below are the codewords for all four items:

Item Codeword
a 0
b 10
c 110
d 111

7. Then the input string aabcabdacb can be decoded to 0010110010111011010.

Suppose we have a 100-character text has these character frequencies:

A: 50

C: 40

B: 4

D: 3

E: 3

Fill in the blanks the answer for each of the question

  1. What is the first merge?
    1. D and E: 6
    2. B and D: 7
    3. B and D and E: 10
  2. What is the second merge?
    1. B and D: 7
    2. DE and B: 10
    3. C and A: 90
  3. What is the third merge?
    1. DEB and C: 40
    2. DEB and C: 50
  4. What is the fourth merge?
    1. None
    2. DEBC and A: 100
  5. What is the fifth merge?
    1. None
    2. DEBCA and F
  6. What is the code for A?
    1. 0
    2. 1
  7. What is the code for B?
    1. 001
    2. 110
  8. What is the code for C?
    1. 1
    2. 01
    3. 10
  9. What is the code for D?
    1. 1110
    2. 1111
  10. What is the code for E?
    1. 1110
    2. 1111
  11. 5 unique characters (A, B, C, D, E) can each be uniquely encoded in 3 bits (like 000, 001, 010, 011, and 100). With such a fixed-length code, how many bits are needed for the 100-character text?
    1. 100
    2. 300
  12. For the Huffman code determined in the above questions, the number of bits per character is A: 1, C: 2, B: 3, D: 4, and E: 4. How many bits are needed for the 100-character text?
    1. 14
    2. 166
    3. 300

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Algorithms Questions!