Question:
Write a function hash() that takes as its argument a \(k\)-gram (a string of length \(k\)) whose characters are all \(\mathrm{A}\), \(\mathrm{C}\), \(\mathrm{G}\), or \(\mathrm{T}\) and returns an int value between 0 and \(4^{k}-1\) that corresponds to treating the string as a base-4 number with \(\{\mathrm{A}, \mathrm{C}, \mathrm{G}, \mathrm{T}\}\) replaced by \(\{0, 1, 2, 3\}\), respectively. Next, write a function unHash() that reverses the transformation. Use your methods to create a class Genome that is like Sketch (Program 3.3.4), but is based on exact counting of \(k\)-grams in genomes. Finally, write a version of CompareDocuments (Program 3.3.5) for Genome objects and use it to look for similarities among the set of genome files on the booksite.
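One possible way to approach the first three parts is sketched below in Java. This is a minimal sketch under stated assumptions, not the book's solution: unHash() is given a second argument k (an assumption, needed because leading A characters are zero digits and cannot be recovered from the number alone), the method name similarTo and the unit-vector normalization are illustrative choices modeled on a cosine-similarity profile, and k is assumed to be at most 15 so that \(4^{k}\) fits in an int.

```java
class Genome {
    private static final String ALPHABET = "ACGT";  // digit values 0, 1, 2, 3
    private final double[] unit;                    // unit-length k-gram frequency vector

    // Treat the k-gram as a base-4 number with A, C, G, T as the digits 0, 1, 2, 3.
    public static int hash(String kgram) {
        int h = 0;
        for (int i = 0; i < kgram.length(); i++) {
            int digit = ALPHABET.indexOf(kgram.charAt(i));
            if (digit < 0)
                throw new IllegalArgumentException("not A, C, G, or T: " + kgram.charAt(i));
            h = 4 * h + digit;
        }
        return h;
    }

    // Reverse hash(). The length k is an assumed extra parameter:
    // without it, "A", "AA", and "AAA" (all hashing to 0) are indistinguishable.
    public static String unHash(int h, int k) {
        char[] chars = new char[k];
        for (int i = k - 1; i >= 0; i--) {
            chars[i] = ALPHABET.charAt(h % 4);  // least significant digit goes last
            h /= 4;
        }
        return new String(chars);
    }

    // Count every k-gram exactly (one slot per possible k-gram), then normalize.
    public Genome(String text, int k) {
        if (k < 1 || k > 15)
            throw new IllegalArgumentException("k must be between 1 and 15");
        double[] freq = new double[1 << (2 * k)];   // 4^k entries
        for (int i = 0; i + k <= text.length(); i++)
            freq[hash(text.substring(i, i + k))]++;

        double sumOfSquares = 0.0;
        for (double f : freq) sumOfSquares += f * f;
        double norm = Math.sqrt(sumOfSquares);
        if (norm > 0.0)                              // leave all zeros if text is shorter than k
            for (int i = 0; i < freq.length; i++) freq[i] /= norm;
        unit = freq;
    }

    // Dot product of the two unit vectors (cosine similarity, between 0 and 1).
    public double similarTo(Genome other) {
        if (unit.length != other.unit.length)
            throw new IllegalArgumentException("genomes were built with different k");
        double sum = 0.0;
        for (int i = 0; i < unit.length; i++) sum += unit[i] * other.unit[i];
        return sum;
    }
}
```

For example, hash("ACGT") evaluates to \(0 \cdot 64 + 1 \cdot 16 + 2 \cdot 4 + 3 = 27\), and unHash(27, 4) gives back "ACGT". A comparison client for the last part could read each genome file into a string, build one Genome per file with the same k, and print the table of pairwise similarTo() values.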
