Design and implement the Huffman Tree using Queue and PriorityQueue from the previous assignments. The Huffman Tree
Question:
Design and implement the Huffman Tree using Queue and PriorityQueue from the previous assignments. The Huffman Tree will be further used to encode input strings. Your task is to develop a simple, elegant implementation that does not sacrifice efficiency or waste space. Theoretically solving any problem includes progressive construction of a solution:
Part I: Design & Implement the TreeNode class
The TreeNode is the basic building block of the HuffmanTree. It will consist of elem, priority value, and two pointers, left for left child and right for right child.
- TreeNode(int priority): Initializes the TreeNode with the given priority value and sets all other internal states to be null.
- TreeNode(Character elem, int priority): Initializes the TreeNode with the given elem and priority values and sets all other internal states to be null. The TreeNode object is Comparable
in nature and should implement the compareTo() method: - int compareTo(TreeNode n): Takes TreeNode as a parameter and returns 1 if the current node has higher priority than the given node and -1 otherwise.
Part II: Design & Implement the HuffmanTree class
The HuffmanTree is a type of binary tree that consists of the internal state root, which is the entry point to the HuffmanTree.
- HuffmanTree(TreeNode left, TreeNode right): Initializes a new Huffman Tree by assigning a new root node and assigning the parameters as left and right children of the root.
- TreeNode getRoot(): Returns the reference of the root node.
Part III: Build Huffman Tree and create Huffman codes
What is Huffman Coding? Huffman coding is an algorithm devised by David A. Huffman of MIT in 1952 for compressing text data to make a file occupy a smaller number of bytes. This relatively simple compression algorithm is powerful enough that variations of it are still used today in computer networks, fax machines, modems, HDTV, and other areas. Normally text data is stored in a standard format of 8 bits per character, commonly using an encoding called ASCII that maps every character to a binary integer value from 0-255. The idea of Huffman coding is to abandon the rigid 8-bits-percharacter requirement and use different-length binary encodings for different characters. The advantage of doing this is that if a character occurs frequently in the file, such as the letter 'e', it could be given a shorter encoding (fewer bits), making the file smaller. The tradeoff is that some characters may need to use encodings that are longer than 8 bits, but this is reserved for characters that occur infrequently, so the extra cost is worth it.
Implement the BuildHuffmanTree class and break down the implementation into following methods:
- static TreeNode buildTreeQueue (HashMap
freqTbl, Queue q): Builds and returns the root node of the final
HuffmanTree using a Queue.
Note: Use your Queue implementation from Assignment_1b.
- static TreeNode buildTreePQ (HashMap
freqTbl, PriorityQueue pq): Builds and returns the root node of the final HuffmanTree using a PriorityQueue.
Note: Use your PriorityQueue implementation from Assignment_2a.
- static void encodeTraversal(TreeNode root, String code,
HashMap
Testing:
To test the implementation, a driver program is provided in the assignment folder: HuffmanTester.java. This tester file is incomplete. You need to provide implementation for the following methods:
static String encode(String test, HashMap
encodTbl): Convert the test string into encoded bits based on the Huffman codes generated from Part III. For each of the character in the test string, look up the Huffman codes in the HashMap and replace the character with the code and return the final encoded string.
• static void printOutput(String test, String encodedStrQ,
String encodedStrPQ): Print the output of all 5 test strings to an output file (output.txt) in the following format:
- [Test String] [Total # of uncompressed bits (8-bits per character)]
- [Encoded Bitstream using HuffmanTree_Q: compressed bits] [Total # of compressed bits] [Space Saving(%):?]
- [Encoded Bitstream using HuffmanTree_PQ: compressed bits] [Total # of compressed bits] [Space Saving(%):?]
Example:
[eeyjjjj][56]
[Encoded Bitstream using HuffmanTree_Q: 1010110000][10][Space
Saving(%):82.14]
[Encoded Bitstream using HuffmanTree_PQ: 0101001111][10] [Space
Saving(%):82.14]
Additional Notes: o As a programmer, you have the freedom to include additional operations that you deem necessary. o Refer to the example to understand how the Huffman Tree is built using both Queue and Priority Queue.
- For the set of characters that occur the same number of times, there may be more than one possible set of codes. Similarly, a non-leaf node and a leaf node can have same number of frequencies and it can cause more than one possible set of codes. o Input (data) string can consist of any character (alphabets, punctuations, or empty string). o Space saving (how much compression achieved) is defined as
Space Saving = 1 - compressed bits/ uncompressed bits
- You will be using your implementation of Queue and PriorityQueue from previous assignments. Use of Java's Queue & PriorityQueue will lead to deduction in points. o Though Javadoc style of commenting is not mandatory, your program should have necessary comments (inline and block) for each source code file.
Additional Files:
- Queue.java/.cpp - the Queue class file.
- SLL.java/.cpp - the parent class file.
- PriorityQueue.java/.cpp - the PriorityQueue class file.
- ArrayHeap.java/.cpp - the parent class file.
Auditing and Assurance services an integrated approach
ISBN: 978-0132575959
14th Edition
Authors: Alvin a. arens, Randal j. elder, Mark s. Beasley