Question: Write a matlab code named HW7.m which takes a collection of k-mer reads and outputs the assembled genome. Codes referenced provided below PatternToNumber code: function
Write a matlab code named "HW7.m" which takes a collection of k-mer reads and outputs the assembled genome.
Codes referenced provided below

PatternToNumber code:
function number = PatternToNumber(pattern)
number = 0;
CharacterMap = containers.Map({'A', 'C', 'G', 'T'}, [0, 1, 2, 3]);
for i = 1: length(pattern)
number = number * 4 + CharacterMap(pattern(i));
end
number = 1 + number;
end
FindEulerianPath code:
function [path] = FindEulerianPath(adj,n) indegree = zeros(n);%for storing indegree of all vertices outdegree = zeros(n);%for storing outdegree of all vertices %calculating outdegree for i=1:n for j=1:n outdegree(i) = outdegree(i) + adj(i,j); end end %calculating indegree for j=1:n for i=1:n indegree(j) = indegree(j) + adj(i,j); end end
v1 = -1; v2 = -1; %step-1 ,search vertex whose outdegree-indegree = 1 for i=1:n if outdegree(i)-indegree(i) == 1 v1 = i; end end %search vertex whose outdegree-indegree = -1 for i=1:n if outdegree(i)-indegree(i) == -1 v2 = i; end end j = 1; %check if all others have balanced degree for i=1:n if i == v1 || i==v2 continue end if outdegree(i)~=indegree(i) j = 0; break; end end %if not balanced, no eulerian path if j == 0 fprintf('no eulerian path... '); return; end %if no such vertices no eulerian path if(v1 == -1 || v2 == -1) fprintf('no eulerian path... '); return; end %stack for path finding stack = java.util.Stack(); path = []; stack.push(v1);%inserting v1 while 1 if stack.isEmpty() break; end i = stack.pop(); path = [i,path]; for j=1:n if adj(i,j) == 1 stack.push(j); adj(i,j) = 0; end end end path = fliplr(path);%reversing the list to get the real path end
NumberToPattern code:
function pattern= NumberToPattern(number, k)
number = number - 1;
CharacterMap = ['A', 'C', 'G', 'T'];
pattern = ' ';
for i = 1: k
n = mod(number, 4);
number = (number - n) / 4;
pattern = strcat(CharacterMap(n + 1), pattern);
end
end
reads.mat
reads.mat
'TGTTTA'
'TGGTTT'
'TTTAAT'
'TTTTGG'
'TTTTTT'
'CATGCC'
'CAATTG'
'GCCCAA'
'AAAACG'
'TAATGG'
'GCCCCC'
'CCCCCT'
'GCCATG'
'GGTTTT'
'TTTTTG'
'CCATGC'
'TTTTGT'
'GGCCCC'
'TGCCCA'
'TGGGGA'
'CCCCTT'
'CCATGC'
'AATTGG'
'CATGCC'
'CCTTTT'
'CCCTTT'
'CCCAAT'
'TGGCCC'
'GAAAAC'
'CTTTTG'
'CCCCCC'
'GGCCCA'
'ATGCCC'
'TTGTTT'
'ATTGGT'
'TTTAAT'
'TGCCAT'
'TTGGGG'
'TGTTTA'
'GGGGAA'
'AATGGC'
'AATGGC'
'TTGGTT'
'ATGCCA'
'TTTGTT'
'GCCCAT'
'ATGGCC'
'GTTTAA'
'TTAATG'
'TAATGG'
'CCCATG'
'GTTTAA'
'CCAATT'
'TTTGGG'
'TTAATG'
'GTTTTT'
'GGGAAA'
'ATGGCC'
'GGAAAA'
'TGGCCC'
Section 1 of the code: Make an adjacency matrix (Al of the graph from the reads Hint: The uploaded "reads.mat file consists of sixty 6-mers. Each 6-mer represents an edge where the beginning node is the prefix and the ending node is the suffix of the 6-mer. Prefix and suffix have length of 5. So the graph can have at most 4 nodes, and therefore you can define A as a matrix with 4 rows and 4 columns where all its elements are zero at first. Then all you have to do is Start reading each k-mer, find its prefix and suffix, convert each to a number between 0 and 4-1 use your PatternToNumber code from HW2). This way you can update the value of matrix A accordingly. For example, the first 6-mer is TGTTTA So preix TGTTT-959 and suffix='GTTA'=764. This means you need to add 1 to the element of the matrix A that sits on the 960" row and 765h column. Section 1 of the code: Make an adjacency matrix (Al of the graph from the reads Hint: The uploaded "reads.mat file consists of sixty 6-mers. Each 6-mer represents an edge where the beginning node is the prefix and the ending node is the suffix of the 6-mer. Prefix and suffix have length of 5. So the graph can have at most 4 nodes, and therefore you can define A as a matrix with 4 rows and 4 columns where all its elements are zero at first. Then all you have to do is Start reading each k-mer, find its prefix and suffix, convert each to a number between 0 and 4-1 use your PatternToNumber code from HW2). This way you can update the value of matrix A accordingly. For example, the first 6-mer is TGTTTA So preix TGTTT-959 and suffix='GTTA'=764. This means you need to add 1 to the element of the matrix A that sits on the 960" row and 765h column
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
