Overview Reads a plain text file Builds a Huffman tree from the character frequencies in the file Outputs the original length in bits, Huffman compression length in bits, and the alphabet frequency code length table Input The program accepts the input filename as a command line argument a out input filename txt The input filename will contain less than 100,000 characters with ASCII values in the range 0,127 (ASCII Table Reference Link) Every single character in the input file will be included in the frequency counts and encoded message To match the examples, be sure to use nix style line endings ( LF not CRLF ) for the input text files The line endings can easily be changed in VS Code by clicking the CRLF LF on the status bar on the bottom right of the editor Two example input files are downloadable below under Submission Instructions Implementation Details Compute the frequency of each character in the file Remember that each character is also an int with the int value equal to its ASCII value Consider making an array to count each character's appearances frequencies 'a' is the same as frequencies 97 If you want to know the frequency of a character, then frequencies ch could give you the frequency Build a Huffman tree with each character that appears in the file as a leaf node in the tree You will need to build a Min Heap to implement the Huffman tree algorithm An efficient array based min heap ( O(lg n) insert and extract min) is required The min heap should contain Huffman tree nodes with the frequency as the keys The min heap can be implemented as either a 0 based or 1 based array The formulae are as follows 0 based parent (child 1) 2 leftChild parent 2 1 rightChild parent 2 2 1 based (like the slides) parent child 2 leftChild parent 2 rightChild parent 2 1 Construct a table containing the string binary encoding for each character The encoding for each character is not guaranteed to be unique and will depend on the implementation however, the compressed length of each file should be the same if the implementation is correct Compute the binary encoding of the input file Build all the data structures yourself, following the pseudocode from the slides (and potentially the ZyBook) Use good object based organization, using classes in an appropriate way You are encouraged to (and will likely) use recursion however, be sure the recursion works and does not crash the program when run on large inputs from a stack overflow Only include iostream, string, fstream, sstream, vector Output Output the following information in the same format as the examples below The input file's uncompressed length in bits (string length 8) The input file's compressed length in bits (string length of encoded string of 1s and 0s) The input file's alphabet, frequency, and code table in the following format 'char' frequency binary encoding length The binary encoding length of a character is the encoding's number of bits If the binary code is 0101 , then the length is 4

The Answer is in the image, click to view ...

Question: Overview Reads a plain text file Builds a Huffman tree from the character frequencies in the file Outputs the original length in bits, Huffman compression

Overview

Reads a plain text file
Builds a Huffman tree from the character frequencies in the file
Outputs the original length in bits, Huffman compression length in bits, and the alphabet|frequency|code_length table.

Input

The program accepts the input filename as a command-line argument: ./a.out input_filename.txt
The input filename will contain less than 100,000 characters with ASCII values in the range [0,127] (ASCII Table Reference Link).
Every single character in the input file will be included in the frequency counts and encoded message.
To match the examples, be sure to use *nix-style line endings (LF not CRLF) for the input text files. The line endings can easily be changed in VS Code by clicking the CRLF/LF on the status bar on the bottom right of the editor.
Two example input files are downloadable below under Submission Instructions.

Implementation Details

Compute the frequency of each character in the file.
- Remember that each character is also an int with the int value equal to its ASCII value.
- Consider making an array to count each character's appearances:
  - frequencies['a'] is the same as frequencies[97]
  - If you want to know the frequency of a character, then frequencies[ch] could give you the frequency.
Build a Huffman tree with each character that appears in the file as a leaf node in the tree.
You will need to build a Min-Heap to implement the Huffman tree algorithm.
- An efficient array-based min-heap (O(lg n) insert and extract-min) is required.
- The min-heap should contain Huffman tree nodes with the frequency as the keys.
- The min-heap can be implemented as either a 0-based or 1-based array. The formulae are as follows:
  - 0-based:
    - parent = (child-1)/2
    - leftChild = parent*2 + 1
    - rightChild = parent*2 + 2
  - 1-based (like the slides):
    - parent = child/2
    - leftChild = parent*2
    - rightChild = parent*2 + 1
Construct a table containing the string binary encoding for each character.
- The encoding for each character is not guaranteed to be unique and will depend on the implementation; however, the compressed length of each file should be the same if the implementation is correct.
Compute the binary encoding of the input file.
Build all the data structures yourself, following the pseudocode from the slides (and potentially the ZyBook).
- Use good object-based organization, using classes in an appropriate way.
- You are encouraged to (and will likely) use recursion; however, be sure the recursion works and does not crash the program when run on large inputs from a stack overflow.
Only include iostream, string, fstream, sstream, vector.

Output

Output the following information in the same format as the examples below:

The input file's uncompressed length in bits (string_length * 8).
The input file's compressed length in bits (string length of encoded string of 1s and 0s).
The input file's alphabet, frequency, and code table in the following format: 'char'|frequency|binary_encoding_length
- The binary encoding length of a character is the encoding's number of bits. If the binary code is 0101, then the length is 4.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Mathematics Questions!

Design and implement the Huffman Tree using Queue and PriorityQueue from the previous assignments. The Huffman Tree will be further used to encode input strings. Your task is to develop a simple,...

9. a) Find the Huffiman code for the following symbols with the given frequencies. a-6, b- 2, d-1, m-5, s-10. Show the code tree. Show the symbols and their codes b) Encode the word madsbad into...

This assignment is an exercise in C programming, specifically in using the C library to do file input and to do some basic string processing. It probably introduces you to at least one C library...

You are to write a program that reads a plain text file, computes a Huffman code tree for that text file, and writes out the encoded version of the text. Your program should read the text from the...

Huffman Code Trees CS101 Project 4 Due 11/21 You are to write a program that reads a plain text file, computes a Huffman code tree for that text file, and writes out the encoded version of the text....

Java: The purpose of this project is to determine the frequency of characters/key-strokes in any writing. By performing an analysis, you may determine the number of key strokes required for the type...

Information Technology 1. Introduction Aim In this assignment, you'll build your own simple 1/0 library called myio, vaguely similar to stdio.h, by writing your own functions for performing various...

ava project Please help me with this java project. link to the website:http: //cs.boisestate.edu/~cs121/projects/p4/...

Please help me with this java project. link to the website:http: //cs.boisestate.edu/~cs121/projects/p4/...

Now that you have read the chapter on estate planning, what do you recommend to Juliana and Fernando on the subject of retirement and estate planning regarding? 1. How much in Social Security...

Find the vertical force under the front tires (point A) and rear tires (point B) due to the the 2 forces shown. As always, include a FBD and clearly label all variables used in your calculations. 500...

view 4/6 When a baby nurses, the stimulation negatively feeds back to decrease milk flow to the nipple. O True O False Check

Julian Birkinshaw, James Manktelow, Vittorio DAmato, Elena Tosca, and Francesca Macchi In recent years, one of the most profound transformations in the workplace has been the increase in age...