Question: Write a MATLAB code named HW7.m which takes a collection of k-mer reads and outputs the assembled genome.

The referenced codes are provided below.

PatternToNumber code:

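function number = PatternToNumber(pattern)
% Convert a DNA string to its 1-based index in lexicographic order.
    number = 0;
    CharacterMap = containers.Map({'A', 'C', 'G', 'T'}, [0, 1, 2, 3]);
    for i = 1:length(pattern)
        % Shift the running value by one base-4 digit and add the new symbol.
        number = number * 4 + CharacterMap(pattern(i));
    end
    number = 1 + number;   % MATLAB indices start at 1
end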
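As a quick cross-check of the encoding, the same base-4 value can be computed without the loop. This is only a minimal sketch with illustrative variable names, not part of the assignment; it uses the prefix 'TGTTT' from the hint in the question.

```matlab
pattern = 'TGTTT';
[~, digits] = ismember(pattern, 'ACGT');   % A->1, C->2, G->3, T->4
digits = digits - 1;                       % shift to the 0..3 codes
k = numel(pattern);
weights = 4 .^ (k-1:-1:0);                 % [256 64 16 4 1] for k = 5
number = sum(digits .* weights) + 1;       % 959 + 1 = 960
```

The result should match what PatternToNumber('TGTTT') returns, since both add 1 to the base-4 value.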

FindEulerianPath code:

function [path] = FindEulerianPath(adj, n)
    path = [];                % stays empty if no Eulerian path is found
    indegree = zeros(n, 1);   % for storing indegree of all vertices
    outdegree = zeros(n, 1);  % for storing outdegree of all vertices

    % calculating outdegree
    for i = 1:n
        for j = 1:n
            outdegree(i) = outdegree(i) + adj(i,j);
        end
    end

    % calculating indegree
    for j = 1:n
        for i = 1:n
            indegree(j) = indegree(j) + adj(i,j);
        end
    end

    v1 = -1;
    v2 = -1;

    % step 1: search vertex whose outdegree - indegree = 1
    for i = 1:n
        if outdegree(i) - indegree(i) == 1
            v1 = i;
        end
    end

    % search vertex whose outdegree - indegree = -1
    for i = 1:n
        if outdegree(i) - indegree(i) == -1
            v2 = i;
        end
    end

    % check if all others have balanced degree
    j = 1;
    for i = 1:n
        if i == v1 || i == v2
            continue
        end
        if outdegree(i) ~= indegree(i)
            j = 0;
            break;
        end
    end

    % if not balanced, no eulerian path
    if j == 0
        fprintf('no eulerian path... ');
        return;
    end

    % if no such vertices, no eulerian path
    if (v1 == -1 || v2 == -1)
        fprintf('no eulerian path... ');
        return;
    end

    % stack for path finding
    stack = java.util.Stack();
    stack.push(v1);  % inserting v1
    while 1
        if stack.isEmpty()
            break;
        end
        i = stack.pop();
        path = [i, path];
        % only entries equal to exactly 1 are followed here
        for j = 1:n
            if adj(i,j) == 1
                stack.push(j);
                adj(i,j) = 0;
            end
        end
    end
    path = fliplr(path);  % reversing the list to get the real path
end
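The traversal above only follows entries of adj that equal exactly 1, while reads.mat contains repeated 6-mers, so some entries of the adjacency matrix can exceed 1. For comparison, here is a minimal sketch of Hierholzer's algorithm that treats adj(i,j) as an edge count. The function name HierholzerPathSketch is hypothetical and is not one of the referenced codes, and the sketch does not check that an Eulerian path exists.

```matlab
function path = HierholzerPathSketch(adj)
% Hypothetical helper (not part of the assignment): Eulerian path via
% Hierholzer's algorithm. adj(i,j) holds the NUMBER of edges i -> j.
    outdeg = sum(adj, 2);     % column vector of out-degrees
    indeg  = sum(adj, 1)';    % column vector of in-degrees

    % Start at the vertex with one extra outgoing edge; if every vertex
    % is balanced, start at any vertex that has an outgoing edge.
    start = find(outdeg - indeg == 1, 1);
    if isempty(start)
        start = find(outdeg > 0, 1);
    end

    path = [];
    if isempty(start)
        return;               % the graph has no edges
    end

    stack = start;            % plain row vector used as a stack
    while ~isempty(stack)
        v = stack(end);
        w = find(adj(v, :) > 0, 1);   % first unused edge leaving v
        if isempty(w)
            % Dead end: v is finished, so it goes to the front of the path.
            path = [v, path];
            stack(end) = [];
        else
            adj(v, w) = adj(v, w) - 1;   % consume one copy of the edge
            stack(end + 1) = w;
        end
    end
end
```

A caller can confirm that every edge was used by checking numel(path) == sum(adj(:)) + 1 against the original matrix.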

NumberToPattern code:

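function pattern = NumberToPattern(number, k)
% Convert a 1-based index back to its DNA string of length k.
    number = number - 1;    % back to the 0-based base-4 value
    CharacterMap = ['A', 'C', 'G', 'T'];
    pattern = '';           % start from an empty string
    for i = 1:k
        n = mod(number, 4);               % lowest base-4 digit
        number = (number - n) / 4;
        pattern = strcat(CharacterMap(n + 1), pattern);   % prepend its symbol
    end
end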
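Once a path of node indices is available, NumberToPattern can turn it back into a string. The sketch below shows one way this might be done, assuming NumberToPattern.m is on the MATLAB path and that consecutive nodes on the path overlap by k-1 characters. The name SpellGenomeSketch is hypothetical and does not come from the assignment.

```matlab
function genome = SpellGenomeSketch(path, k)
% Hypothetical helper: spell the string traced by a path of node indices.
% path : row vector of 1-based node indices (as given by PatternToNumber)
% k    : length of each node pattern (5 for the 6-mer reads)
    genome = NumberToPattern(path(1), k);    % first node contributes k letters
    for t = 2:numel(path)
        node = NumberToPattern(path(t), k);
        genome = [genome, node(end)];        % each later node adds one letter
    end
end
```

For the single edge in the hint, SpellGenomeSketch([960, 765], 5) spells the nodes 'TGTTT' and 'GTTTA' back into the read 'TGTTTA'.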

reads.mat

'TGTTTA'

'TGGTTT'

'TTTAAT'

'TTTTGG'

'TTTTTT'

'CATGCC'

'CAATTG'

'GCCCAA'

'AAAACG'

'TAATGG'

'GCCCCC'

'CCCCCT'

'GCCATG'

'GGTTTT'

'TTTTTG'

'CCATGC'

'TTTTGT'

'GGCCCC'

'TGCCCA'

'TGGGGA'

'CCCCTT'

'CCATGC'

'AATTGG'

'CATGCC'

'CCTTTT'

'CCCTTT'

'CCCAAT'

'TGGCCC'

'GAAAAC'

'CTTTTG'

'CCCCCC'

'GGCCCA'

'ATGCCC'

'TTGTTT'

'ATTGGT'

'TTTAAT'

'TGCCAT'

'TTGGGG'

'TGTTTA'

'GGGGAA'

'AATGGC'

'AATGGC'

'TTGGTT'

'ATGCCA'

'TTTGTT'

'GCCCAT'

'ATGGCC'

'GTTTAA'

'TTAATG'

'TAATGG'

'CCCATG'

'GTTTAA'

'CCAATT'

'TTTGGG'

'TTAATG'

'GTTTTT'

'GGGAAA'

'ATGGCC'

'GGAAAA'

'TGGCCC'

Section 1 of the code: Make an adjacency matrix (A) of the graph from the reads.

Hint: The uploaded "reads.mat" file consists of sixty 6-mers. Each 6-mer represents an edge where the beginning node is the prefix and the ending node is the suffix of the 6-mer. Prefix and suffix have length of 5. So the graph can have at most 4^5 nodes, and therefore you can define A as a matrix with 4^5 rows and 4^5 columns where all its elements are zero at first. Then all you have to do is:

Start reading each k-mer, find its prefix and suffix, and convert each to a number between 0 and 4^5 - 1 (use your PatternToNumber code from HW2). This way you can update the value of matrix A accordingly. For example, the first 6-mer is 'TGTTTA', so prefix = 'TGTTT' = 959 and suffix = 'GTTTA' = 764. This means you need to add 1 to the element of the matrix A that sits on the 960th row and 765th column.
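The hint can be sketched as a short function. This is one possible reading of Section 1, not a verified solution: the name BuildAdjacencySketch is hypothetical, PatternToNumber.m is assumed to be on the MATLAB path, and the reads are assumed to arrive as a cell array of character vectors (or a character matrix with one read per row). Note that the PatternToNumber listed in the question already adds 1, so its result can be used directly as a row or column index.

```matlab
function A = BuildAdjacencySketch(reads)
% Hypothetical helper: build the adjacency matrix from k-mer reads.
% reads : cell array of character vectors, or a char matrix (one read per row)
    reads = cellstr(reads);        % normalise to a cell array
    k = length(reads{1});          % read length (6 for this data set)
    n = 4^(k - 1);                 % number of possible (k-1)-mer nodes
    A = zeros(n);                  % n-by-n, all zeros at first

    for r = 1:numel(reads)
        kmer = reads{r};
        prefixIdx = PatternToNumber(kmer(1:end-1));   % first k-1 letters
        suffixIdx = PatternToNumber(kmer(2:end));     % last k-1 letters
        A(prefixIdx, suffixIdx) = A(prefixIdx, suffixIdx) + 1;
    end
end
```

In HW7.m the reads would first be loaded from reads.mat; the variable name stored in that file is not given here, so it has to be checked (for example with whos('-file', 'reads.mat')) before calling the function. Repeated reads increase the same entry of A more than once, which follows the "add 1" wording of the hint.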
