Question: Given the two files DNA_SEQ and PROT_SEQ, compress them using (i) binary coding and (ii) Huffman coding.
For binary coding, if you have n unique letters, then you can use a fixed-length binary representation for
each letter. For example, if the text has eight unique letters, then you can code the letters using three
(log2 8 = 3) bits each, as 000, 001, ..., 111. For Huffman coding, use the method discussed
in the slides. As usual, for either method you can use or adapt code available online.
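The fixed-length binary coding described above can be sketched roughly as follows. This is a minimal illustration, not the course's reference code: the function names `fixed_length_code` and `fixed_length_bits` are hypothetical, and the sketch assumes the sequence is already loaded as a Python string.

```python
def fixed_length_code(text):
    """Assign each unique letter a fixed-width binary codeword."""
    alphabet = sorted(set(text))
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2;
    # use at least 1 bit so a one-letter alphabet still gets a codeword.
    width = max(1, (len(alphabet) - 1).bit_length())
    codes = {ch: format(i, f"0{width}b") for i, ch in enumerate(alphabet)}
    return codes, width


def fixed_length_bits(text):
    """Total number of bits needed to store text with the fixed-width code."""
    _, width = fixed_length_code(text)
    return width * len(text)
```

With four unique letters every codeword is two bits wide, and with eight unique letters it is three bits wide, matching the example in the question.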
(i) Compare the memory required to store (a) DNA_SEQ using binary and
Huffman coding, and (b) PROT_SEQ using binary and Huffman coding.
(ii) Discuss why DNA_SEQ did not show as significant a saving as PROT_SEQ.
(iii) Develop an algorithm by which you can modify how you store DNA_SEQ so that you
obtain a better saving than with the binary method. This has to be a lossless compression, similar to
Huffman coding. You only have to describe the algorithm in
detail (no code needed) and explain why it will improve the storage.
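For the comparison in part (i), one possible sketch is shown below. It builds Huffman codeword lengths with the standard greedy merge on a min-heap and compares the total bit count with fixed-length binary coding. The function names are hypothetical, the method in the slides may differ in detail, and the strings in the usage note are made-up stand-ins rather than the contents of the actual files.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: codeword length in bits} for a Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gets one level deeper.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_bits(symbols):
    """Total bits to store the sequence with its Huffman code (code table excluded)."""
    lengths = huffman_code_lengths(symbols)
    freq = Counter(symbols)
    return sum(freq[s] * lengths[s] for s in freq)


def compare(symbols):
    """Return (fixed_length_bits, huffman_bits) for the same sequence."""
    width = max(1, (len(set(symbols)) - 1).bit_length())
    return width * len(symbols), huffman_bits(symbols)
```

On a made-up string with four equally frequent letters, `compare("ACGT" * 4)` gives the same total for both methods, while on a skewed five-letter string such as `compare("AAAAAAAABBBBCCDE")` the Huffman total is smaller. This is the kind of contrast part (ii) asks about. The functions accept any sequence of hashable symbols, so a list of fixed-size blocks of letters could be passed instead of single letters, which is one possible starting point for part (iii).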
