Question: 1 . File compression with Huffman coding As discussed in class on Nov. 1 4 , tries and Huffman coding can be used to compress

1. File compression with Huffman coding
As discussed in class on Nov. 14,triesandHuffman codingcan be used to compress files/messages by analyzing character frequencies and choosing bitcodes accordingly.
A nearly-completeHuffmanclass is defined for you inhuffman.hh(code here). It uses aTrieNodeclass that stores parent and child links as well as the character being encoded and its frequency in the file being compressed.
The main work ofHuffmanhappens in the following functions:
compress()
Reads input file character by character and counts occurrences (aka computes frequency) for each, storing the results by ASCII index in thechar_frequencyvectorApplies the Huffman trie-building algorithm presented in class 11/14 inbuild_optimal_trie(). This function is TO BE WRITTEN BY YOUCreates code look-up tables (stringtochardecompression_mapandchartostringcompression_map) from the optimal trie incompute_all_codes_from_trie(). This function is TO BE WRITTEN BY YOUWritesdecompression_mapas a header to the output fileRe-reads input file character by character and, usingcompression_map, writes encoded versions to the output file body (in ASCII or binary depending ondo_binaryflag)
decompress()
Reads the code look-up table from header of input file todecompression_mapReads rest of input file bitstream, gets character for each variable-length code read usingdecompression_map, and writes decoded versions to output file. The ASCII part of this function,ascii_decompress_body()is TO BE WRITTEN BY YOU
The "forest of tries" discussed in lecture is implemented in theHuffmanclass with anSTL priority queueofTrieNodepointers (each representing the root of a subtrie), ordered on the combined frequency of each subtrie. This priority queue is calledtrie_pq.
The compiled executable is run as follows:huff
If the input filename has the suffix.huf, it will bedecompressedto a plain-text file with an identical name except the suffix is changed to.HUF.
If the input filename has any other suffix, it will becompressedto a file with the suffix.huf.
The default compressor creates a binary file as output and the default decompressor expects a binary file as input. Feel free to compare the sizes of compressed files to the originals. However, for grading, compressed output will be written in ASCII and compressed input will also be expected to be ASCII. This is indicated by the optional flag-ascii.
The output files will be echoed to the terminal for grading for both compression (to test yourbuild_optimal_trie()+compute_all_codes_from_trie()combination) and decompression (to test yourascii_decompress_body()in isolation). In order to testbuild_optimal_trie()alone, another optional flag-triewill print only a representation of the built trie and not echo the output file.
2. Sample inputs/outputs
This tarfilecontains the following files:
seashells.txt(20 total chars because there is a newline, 7 different chars)
seashells.txt.huf
output1,output2,output3(these are what is written to the terminal, not what is saved in a file)
output1results from running./huff -ascii seashells.txt
output2results from running./huff -ascii -trie seashells.txt
output3results from running./huff -ascii seashells.txt.huf
3. Programming tasks
TheHuffmanmember functions to finish are inmain.cpp-- don't make any changes inhuffman.hh:
[1 point]Huffman::ascii_decompress_body(): This function should read one "bit" (a '0' or '1' char) at a time from the input stream until a legal code is assembled, then write the corresponding decoded character to the output stream, and repeat until reaching the end of file
[2 points]Huffman::build_optimal_trie(): This function should create a forest of one-node tries (akatrie_pq) representing the characters in the input file and their frequencies, then repeatedly combine the two smallest tries until only one is left according to the following instructions:
1. Remove trie intrie_pqwith SMALLEST frequency, assign tofirst, remove trie in PQ with NEXT SMALLEST frequency, assign tosecond2. Create newTrieNodeand makefirstandsecondits left and right children, respectively. Set new node's frequency to the sum offirstandsecond's frequencies, and then put the whole thing back into the priority queue
[2 points]Huffman::compute_all_codes_from_trie(): The function should recursively traverse the trie rooted atTto determine thehuffcodestring at every leaf and initializecompression_mapanddecompression_mapwith this information
1 . File compression with Huffman coding As

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!