6* (extra credit: 15 points). A bioinformatics problem. A protein is a chain molecule whose chemical identity (and, consequently, the structure) is determined by the sequence of its amino acids. There are 20 standard amino acids. In case you have not taken a biochemistry class, it does not matter for the purpose of this problem: you may think of a protein as a string of letters taken from an alphabet that contains a total of 20 letters. If two proteins have similar sequences, they are likely to have similar structure and/or have an evolutionary relation, so there is a lot of interest in trying to find out how similar sequences of different proteins are.

To measure the similarity between two protein sequences of the same length $N$, $a_1 a_2 \ldots a_N$ and $b_1 b_2 \ldots b_N$, we will use the following procedure (note that this is not how it's really done in bioinformatics). Align the two sequences against each other,

$$\begin{array}{l} a_1 a_2 a_3 \ldots a_N \\ b_1 b_2 b_3 \ldots b_N \end{array}$$

and count the number of times $n$ you see a match between the letters at the respective positions, $a_i = b_i$. We then calculate the sequence similarity as the ratio

$$r = n/N.$$

For example, the similarity between VYPTQ and VPYTQ is $r = 3/5 = 60\%$ (matches at positions 1, 4, and 5), and the similarity between VYPTQ and QTPYV is $r = 1/5 = 20\%$. If two sequences are identical, their similarity is 1.

A. Somebody gave you a protein with a sequence $A = a_1 \ldots a_N$ that is of length $N = 10$. Now suppose that you have randomly picked another sequence $R$ of the same length. What is the probability that the sequence similarity between $A$ and $R$ is equal to or higher than 20%? (In other words, among all possible sequences $R$ of length $N$, what is the fraction of those that have a sequence similarity with $A$ of 0.2 or higher?)

B. Answer the question of Part A assuming $N = 20$.
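The similarity measure and the counting argument hinted at in Part A can be sketched in Python. This is a minimal illustration, not an official solution. The function names `similarity` and `fraction_with_at_least` are hypothetical. The sketch reads "randomly picked" as every one of the $20^N$ sequences being equally likely, as the parenthetical in Part A suggests. It also takes $r \ge 0.2$ to mean at least 2 matches for $N = 10$ and at least 4 matches for $N = 20$.

```python
from fractions import Fraction
from math import comb

ALPHABET_SIZE = 20  # number of standard amino acids, as stated in the problem


def similarity(a: str, b: str) -> Fraction:
    """Return r = n/N, the fraction of positions where a and b agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return Fraction(matches, len(a))


def fraction_with_at_least(n_len: int, min_matches: int) -> Fraction:
    """Fraction of all length-n_len sequences R that match a fixed
    sequence A in at least min_matches positions.

    Assumption: all ALPHABET_SIZE ** n_len sequences are equally likely.
    For exactly k matches, choose the k matching positions (comb(n_len, k))
    and give each remaining position one of the 19 non-matching letters.
    """
    favorable = sum(
        comb(n_len, k) * (ALPHABET_SIZE - 1) ** (n_len - k)
        for k in range(min_matches, n_len + 1)
    )
    return Fraction(favorable, ALPHABET_SIZE ** n_len)


if __name__ == "__main__":
    # The two worked examples from the problem statement.
    print(similarity("VYPTQ", "VPYTQ"))  # 3/5
    print(similarity("VYPTQ", "QTPYV"))  # 1/5

    # Part A: N = 10, r >= 0.2 means at least 2 matches.
    print(float(fraction_with_at_least(10, 2)))
    # Part B: N = 20, r >= 0.2 means at least 4 matches.
    print(float(fraction_with_at_least(20, 4)))
```

Using `Fraction` keeps the count exact, so the result can be compared directly with a hand-derived binomial expression before converting to a decimal.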
