Question:

Given the two files DNA_SEQ and PROT_SEQ, compress them using (i) binary coding and
(ii) Huffman coding.
For binary coding, if the text has n unique letters, you can assign each letter a fixed-length
binary code of ceil(log2 n) bits. For example, if the text has 8 unique letters, you can code the
letters using three (log2 8 = 3) bits each: 000, 001, 010, 011, 100, 101, 110, 111. For Huffman
coding, use the method discussed in the slides. As usual, for either method you can use or adapt
code available online.
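As a starting point for the comparison, the two schemes can be sketched in Python as below. This is only a minimal sketch under stated assumptions: the function names are hypothetical, the Huffman routine is the textbook greedy merge (the slides' variant may differ in tie-breaking), and the sample strings are made-up stand-ins, not the contents of DNA_SEQ or PROT_SEQ. It reports the payload size in bits only and ignores the cost of storing the code table.

```python
import heapq
import math
from collections import Counter


def fixed_length_bits(text):
    """Total bits when every symbol gets ceil(log2 n) bits, n = unique symbols."""
    n = len(set(text))
    if n == 0:
        return 0
    # A one-symbol alphabet still needs at least 1 bit per symbol here.
    bits_per_symbol = max(1, math.ceil(math.log2(n)))
    return bits_per_symbol * len(text)


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} from the standard greedy merge."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {sym: 1 for sym in freq}
    # Heap entries are (weight, tiebreaker, {symbol: depth so far}).
    # The unique integer tiebreaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_bits(text):
    """Total bits of the Huffman-coded text (code table not included)."""
    freq = Counter(text)
    lengths = huffman_code_lengths(text)
    return sum(freq[s] * lengths[s] for s in freq)


if __name__ == "__main__":
    # Made-up sample strings for illustration only.
    for name, sample in [("skewed", "AAAAAAAABBBBCCD"), ("uniform", "ACGT" * 4)]:
        print(name, fixed_length_bits(sample), huffman_bits(sample))
```

To apply this to the assignment, read each file into a string (stripping newlines and any header lines, depending on the file format) and pass it to both functions; dividing the results by 8 gives the size in bytes.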
(i) Compare the memory required to store (a) DNA_SEQ using binary and Huffman coding, and
(b) PROT_SEQ using binary and Huffman coding. (10)
(ii) Discuss why DNA_SEQ did not show savings as significant as PROT_SEQ. (10)
(iii) Develop an algorithm that modifies how you store DNA_SEQ so that you obtain better
savings than the binary method. This has to be lossless compression, similar to Huffman
coding. You only have to describe the algorithm in detail (no code needed) and explain why
it will improve the storage. (20)
