Question 1: Most Likely Mutation Tree

This question is inspired by this research article: Spada et al. J Clin Microbiol. 2004 Sep; 42(9): 4230-4236, and by this episode of the erstwhile popular TV show Numb3rs: https://www.hulu.com/watch/315041 (needs a Hulu subscription).

Viruses have RNA, which mutates rapidly. Let us assume that the RNA of a viral species can be written as an $l$-letter string made up of A, C, T and G. While replicating, a virus can mutate through random changes in $k$ out of these $l$ positions with probability proportional to $2^{-k^2}$.

We collect viral samples starting from a single individual and, after sequencing, we observe $n$ mutants $A_1, \dots, A_n$, but we do not know which individual mutated into which, nor what the order of these mutations was. We wish to reconstruct the mutation tree that connects $A_i$ to $A_j$ if $A_i$ mutated into $A_j$ or vice versa. Assume that $l$ is large enough that the same RNA sequence cannot be obtained through two different sequences of mutations.

You are given a weighted undirected graph $G$ whose vertices are the RNA sequences $A_1, \dots, A_n$, with an edge between every pair of nodes $(A_i, A_j)$ of weight $2^{-(d(i,j))^2}$, where $d(i,j)$ is the number of positions at which $A_i$ and $A_j$ differ. A spanning tree $T$ of $G$ then represents a possible history of mutations, the likelihood of which is given by the product of the edge weights of $T$. Show how to compute the most likely spanning tree $T$ in this graph.

Answer 1 (Expected Length: 6 lines)
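
One possible approach, sketched under a stated assumption: if the edge weight is $2^{-(d(i,j))^2}$ (decreasing in the distance), then the likelihood of a tree $T$ is $\prod_{(i,j) \in T} 2^{-(d(i,j))^2} = 2^{-\sum_{(i,j) \in T} (d(i,j))^2}$, so maximizing it amounts to minimizing $\sum_{(i,j) \in T} (d(i,j))^2$. That is a minimum spanning tree problem with edge cost $(d(i,j))^2$, which Kruskal's or Prim's algorithm solves. The Python below is a minimal illustration using Kruskal's algorithm with a union-find structure; the function names `hamming` and `most_likely_tree` and the sample sequences are hypothetical and do not come from the question.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def most_likely_tree(seqs):
    """Return edges (i, j, d) of a spanning tree minimizing the sum of d^2.

    Assumes edge weight 2^(-d^2), so the most likely tree is a minimum
    spanning tree under cost d^2. Because d^2 is increasing in d for
    d >= 0, sorting edges by d gives the same order as sorting by d^2.
    """
    n = len(seqs)
    parent = list(range(n))  # union-find forest, one component per node

    def find(x):
        # Find the component root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All pairwise edges of the complete graph, cheapest first.
    edges = sorted(
        (hamming(seqs[i], seqs[j]), i, j) for i, j in combinations(range(n), 2)
    )

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two components, so it cannot close a cycle
            parent[ri] = rj
            tree.append((i, j, d))
            if len(tree) == n - 1:
                break
    return tree


if __name__ == "__main__":
    samples = ["AAAA", "AAAT", "AATT", "CCCC"]  # made-up example data
    for i, j, d in most_likely_tree(samples):
        print(f"{samples[i]} -- {samples[j]} (d = {d})")
```

With $n$ sequences of length $l$, building the distances takes $O(n^2 l)$ time and sorting the $O(n^2)$ edges takes $O(n^2 \log n)$. If the weight were instead increasing in $d(i,j)$, the same code with the sort order reversed would compute a maximum spanning tree.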
