Question: Basic Part (100 points): Huffman's Algorithm. In this homework assignment, I would like you to implement Huffman's algorithm.
The Huffman Algorithm: Given an input text file, do the following in C:
1. Perform a linear scan to gather the frequencies of all the letters that occur in the file. Do not consider letters with zero frequency. Save the frequencies in a list L of binary tree nodes, where each node contains a letter and its frequency.
2. Sort the list L by frequency in increasing order.
3. Remove the first two nodes N1 and N2, which have the lowest frequencies. Build a new node N3 with a hypothetical letter (a dummy) and a frequency equal to the sum of the frequencies of N1 and N2, then add N1 as the left child of N3 and N2 as the right child of N3. Insert N3 into L so that L stays in sorted order. Repeat this process until L has only one node T.
4. The node T obtained from Step 3 is the Huffman code tree. For any node in the tree, the edge pointing to its left child, if there is one, can be interpreted as one binary digit (conventionally 0). Similarly, the edge pointing to its right child, if there is one, can be interpreted as the other binary digit (conventionally 1). The binary string along the edge path from the root to a letter at a leaf node is thus the Huffman code for that letter.
5. Use the Huffman codes from Step 4 to encode the input text file and write the result to an output file, which is the encoded file.
6. Take the encoded file obtained in Step 5, decode it using the Huffman codes from Step 4, and save the result in another output file, which is the decoded file.
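The tree-building and code-assignment steps above can be sketched in C as follows. This is a minimal illustration, not the required solution: it reads from an in-memory string rather than a file, builds the sorted list by repeated sorted insertion instead of a separate sort, and assumes left edges mean '0' and right edges mean '1'. The names `Node`, `sorted_insert`, `build_huffman_tree`, `assign_codes`, and `free_tree` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SYMBOLS 256

/* A binary tree node; leaves hold a letter, internal nodes hold a dummy. */
typedef struct Node {
    unsigned char letter;
    unsigned long freq;
    struct Node *left, *right;
    struct Node *next;              /* link used by the sorted list L */
} Node;

static Node *new_node(unsigned char letter, unsigned long freq,
                      Node *left, Node *right) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->letter = letter; n->freq = freq;
    n->left = left; n->right = right; n->next = NULL;
    return n;
}

/* Insert n into the list so frequencies stay in increasing order. */
static Node *sorted_insert(Node *head, Node *n) {
    if (head == NULL || n->freq < head->freq) { n->next = head; return n; }
    Node *p = head;
    while (p->next != NULL && p->next->freq <= n->freq) p = p->next;
    n->next = p->next;
    p->next = n;
    return head;
}

/* Count letters, build the sorted list, then merge the two lowest
   nodes until one node remains. Returns NULL for empty text. */
Node *build_huffman_tree(const char *text) {
    unsigned long freq[MAX_SYMBOLS] = {0};
    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; p++)
        freq[*p]++;
    Node *list = NULL;
    for (int c = 0; c < MAX_SYMBOLS; c++)
        if (freq[c] > 0)            /* skip letters with zero frequency */
            list = sorted_insert(list, new_node((unsigned char)c, freq[c], NULL, NULL));
    while (list != NULL && list->next != NULL) {
        Node *n1 = list, *n2 = list->next;
        list = n2->next;
        n1->next = n2->next = NULL;
        list = sorted_insert(list, new_node(0, n1->freq + n2->freq, n1, n2));
    }
    return list;
}

/* Walk the tree, appending '0' for a left edge and '1' for a right edge
   (assumed convention). A one-letter tree yields an empty code. */
void assign_codes(const Node *t, char *prefix, int depth,
                  char codes[MAX_SYMBOLS][MAX_SYMBOLS]) {
    if (t == NULL) return;
    if (t->left == NULL && t->right == NULL) {
        prefix[depth] = '\0';
        strcpy(codes[t->letter], prefix);
        return;
    }
    prefix[depth] = '0'; assign_codes(t->left, prefix, depth + 1, codes);
    prefix[depth] = '1'; assign_codes(t->right, prefix, depth + 1, codes);
}

void free_tree(Node *t) {
    if (t == NULL) return;
    free_tree(t->left);
    free_tree(t->right);
    free(t);
}
```

For the text "aaaabbc", calling `build_huffman_tree` and then `assign_codes` with a scratch `prefix` buffer of `MAX_SYMBOLS` characters gives the codes a = "1", b = "01", c = "00" under this sketch's tie-breaking. A file-based version would replace the string scan with a loop over `fgetc`.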
An important note for Basic Part: For Huffman coding, say a letter is coded with some string of bits. You may treat this coded string as a bytestring, i.e., a string of characters. You can output this bytestring as the compressed file, and then use the bytestring to uncompress the compressed file and recover the original file. You do not have to encode the length of the bytestring in the compressed file. In fact, when you use such a bytestring, your compressed file may not actually be smaller than the original.
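To make the bytestring idea concrete, here is a minimal sketch that writes each code bit as a '0' or '1' character and decodes by matching runs of characters against the code table. The table `TABLE` is a hypothetical, hand-written prefix code used only for illustration (a real table would come from the Huffman tree), and the function names `encode` and `decode` are likewise assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix-code table, for illustration only. */
typedef struct { char letter; const char *code; } CodeEntry;
static const CodeEntry TABLE[] = { {'a', "1"}, {'b', "01"}, {'c', "00"} };
#define TABLE_LEN (sizeof TABLE / sizeof TABLE[0])

/* Write the code of each letter into out as '0'/'1' characters.
   Returns 0 on success, -1 if a letter has no code or out is too small. */
int encode(const char *text, char *out, size_t out_size) {
    size_t used = 0;
    assert(out != NULL && out_size > 0);
    for (; *text != '\0'; text++) {
        const char *code = NULL;
        for (size_t i = 0; i < TABLE_LEN; i++)
            if (TABLE[i].letter == *text) code = TABLE[i].code;
        if (code == NULL) return -1;
        size_t len = strlen(code);
        if (used + len + 1 > out_size) return -1;
        memcpy(out + used, code, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}

/* Grow the current run of characters until it equals some code.
   Because the table is a prefix code, the first match is the only one.
   Returns 0 on success, -1 on leftover characters or a full buffer. */
int decode(const char *bits, char *out, size_t out_size) {
    size_t used = 0, start = 0, n = strlen(bits);
    assert(out != NULL && out_size > 0);
    for (size_t end = 1; end <= n; end++) {
        for (size_t i = 0; i < TABLE_LEN; i++) {
            size_t len = strlen(TABLE[i].code);
            if (len == end - start &&
                strncmp(bits + start, TABLE[i].code, len) == 0) {
                if (used + 1 >= out_size) return -1;
                out[used++] = TABLE[i].letter;
                start = end;
                break;
            }
        }
    }
    if (start != n) return -1;      /* trailing characters match no code */
    out[used] = '\0';
    return 0;
}
```

With this table, "abca" encodes to the six-character string "101001", which decodes back to "abca". The encoded form is longer than the four-character input, which is exactly why the bytestring file may not really be compressed. No length field is needed here because decoding simply stops at the end of the string.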
In computer science and information theory, a Huffman code is an optimal prefix code found using the algorithm developed by David A. Huffman while he was a Ph.D. student at MIT, and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes". The process of finding and/or using such a code is called Huffman coding and is a common technique in entropy encoding, including in lossless data compression. The algorithm's output can be viewed as a variable-length code table for encoding a source symbol (such as a character in a file). Huffman's algorithm derives this table based on the estimated probability or frequency of occurrence (weight) for each possible value of the source symbol. As in other entropy encoding methods, more common symbols are generally represented using fewer bits than less common symbols. Huffman's method can be implemented efficiently, finding a code in time linear in the number of input weights if these weights are sorted. However, although optimal among methods that encode symbols separately, Huffman coding is not always optimal among all compression methods.
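One well-known way to get the linear-time behavior mentioned above, once the weights are sorted, is to keep two queues: the sorted leaves and the merged nodes, which are produced in non-decreasing order. The sketch below uses that two-queue approach (not the sorted-list insertion the assignment asks for) to compute only the total encoded length in bits, which equals the sum of all merged node weights. The function name `huffman_cost_sorted` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Total encoded length in bits for n weights sorted in increasing order.
   Each merge takes the two smallest items from the fronts of two queues:
   the unread leaves in w and the previously merged nodes. */
unsigned long huffman_cost_sorted(const unsigned long *w, size_t n) {
    if (n < 2) return 0;            /* a one-symbol tree has no edges */
    unsigned long *merged = malloc((n - 1) * sizeof *merged);
    assert(merged != NULL);
    size_t li = 0;                  /* next unread leaf */
    size_t mi = 0, mn = 0;          /* head and size of the merged queue */
    unsigned long total = 0;
    for (size_t step = 0; step + 1 < n; step++) {
        unsigned long pair = 0;
        for (int k = 0; k < 2; k++) {
            /* take the smaller front; prefer the leaf on ties */
            if (mi == mn || (li < n && w[li] <= merged[mi]))
                pair += w[li++];
            else
                pair += merged[mi++];
        }
        merged[mn++] = pair;        /* merged weights never decrease */
        total += pair;
    }
    free(merged);
    return total;
}
```

For the sorted weights {1, 2, 4} this returns 10, matching code lengths of 2, 2, and 1 bits (1·2 + 2·2 + 4·1). Each step does a constant amount of work, so the loop runs in time linear in `n`.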
