Question: Write a matlab code named HW7.m which takes a collection of k-mer reads and outputs the assembled genome. Codes referenced provided below reads.mat 'TGTTTA' 'TGGTTT'

Write a matlab code named "HW7.m" which takes a collection of k-mer reads and outputs the assembled genome.

Codes referenced provided below

Write a matlab code named "HW7.m" which takes a collection of k-mer

reads.mat

'TGTTTA'

'TGGTTT'

'TTTAAT'

'TTTTGG'

'TTTTTT'

'CATGCC'

'CAATTG'

'GCCCAA'

'AAAACG'

'TAATGG'

'GCCCCC'

'CCCCCT'

'GCCATG'

'GGTTTT'

'TTTTTG'

'CCATGC'

'TTTTGT'

'GGCCCC'

'TGCCCA'

'TGGGGA'

'CCCCTT'

'CCATGC'

'AATTGG'

'CATGCC'

'CCTTTT'

'CCCTTT'

'CCCAAT'

'TGGCCC'

'GAAAAC'

'CTTTTG'

'CCCCCC'

'GGCCCA'

'ATGCCC'

'TTGTTT'

'ATTGGT'

'TTTAAT'

'TGCCAT'

'TTGGGG'

'TGTTTA'

'GGGGAA'

'AATGGC'

'TTGGTT'

'ATGCCA'

'TTTGTT'

'GCCCAT'

'ATGGCC'

'GTTTAA'

'TTAATG'

'TAATGG'

'CCCATG'

'GTTTAA'

'CCAATT'

'TTTGGG'

'TTAATG'

'GTTTTT'

'GGGAAA'

'ATGGCC'

'GGAAAA'

'TGGCCC'

PatternToNumber code:

function number = PatternToNumber(pattern)

number = 0;

CharacterMap = containers.Map({'A', 'C', 'G', 'T'}, [0, 1, 2, 3]);

for i = 1: length(pattern)

number = number * 4 + CharacterMap(pattern(i));

end

number = 1 + number;

end

FindEulerianPath code:

function [path] = FindEulerianPath(adj,n) indegree = zeros(n);%for storing indegree of all vertices outdegree = zeros(n);%for storing outdegree of all vertices %calculating outdegree for i=1:n for j=1:n outdegree(i) = outdegree(i) + adj(i,j); end end %calculating indegree for j=1:n for i=1:n indegree(j) = indegree(j) + adj(i,j); end end

v1 = -1; v2 = -1; %step-1 ,search vertex whose outdegree-indegree = 1 for i=1:n if outdegree(i)-indegree(i) == 1 v1 = i; end end %search vertex whose outdegree-indegree = -1 for i=1:n if outdegree(i)-indegree(i) == -1 v2 = i; end end j = 1; %check if all others have balanced degree for i=1:n if i == v1 || i==v2 continue end if outdegree(i)~=indegree(i) j = 0; break; end end %if not balanced, no eulerian path if j == 0 fprintf('no eulerian path... '); return; end %if no such vertices no eulerian path if(v1 == -1 || v2 == -1) fprintf('no eulerian path... '); return; end %stack for path finding stack = java.util.Stack(); path = []; stack.push(v1);%inserting v1 while 1 if stack.isEmpty() break; end i = stack.pop(); path = [i,path]; for j=1:n if adj(i,j) == 1 stack.push(j); adj(i,j) = 0; end end end path = fliplr(path);%reversing the list to get the real path end

NumberToPattern code:

function pattern= NumberToPattern(number, k)

number = number - 1;

CharacterMap = ['A', 'C', 'G', 'T'];

pattern = ' ';

for i = 1: k

n = mod(number, 4);

number = (number - n) / 4;

pattern = strcat(CharacterMap(n + 1), pattern);

end

Section 1 of the code: Make an adjacency matrix (Al of the graph from the reads Hint: The uploaded "reads.mat file consists of sixty 6-mers. Each 6-mer represents an edge where the beginning node is the prefix and the ending node is the suffix of the 6-mer. Prefix and suffix have length of 5. So the graph can have at most 4 nodes, and therefore you can define A as a matrix with 4 rows and 4 columns where all its elements are zero at first. Then all you have to do is Start reading each k-mer, find its prefix and suffix, convert each to mber between 0 and 4-1 (use your PatternToNumber code from HW2). This way you can update the value of matrix A accordingly. For example, the first 6-mer is TGTTTA So preix TGTTT-959 and suffix='GTTA'=764. This means you need to add 1 to the element of the matrix A that sits on the 960" row and 765h column. Section 2 of the code: Find an Eulerian path for your adjacency matrix. Hint: Use you FindEulerianpath code from HW#6 Subtract 1 from each node number on the path to have a number between 0 and 41 again. For example, if Path- 1960 765... The corrected numbers should be 1959 764.. Section 3 of the code: Assemble the genome from the corrected path numbers Hint: Convert each corrected path number to a string of 5 nucleotides (use your NumberTo Pattern code from HW2). Take the whole 5 nucleotides from the first node, but for the rest of the nodes, just add the last nucleotide from their converted 5-mer strings Section 1 of the code: Make an adjacency matrix (Al of the graph from the reads Hint: The uploaded "reads.mat file consists of sixty 6-mers. Each 6-mer represents an edge where the beginning node is the prefix and the ending node is the suffix of the 6-mer. Prefix and suffix have length of 5. So the graph can have at most 4 nodes, and therefore you can define A as a matrix with 4 rows and 4 columns where all its elements are zero at first. Then all you have to do is Start reading each k-mer, find its prefix and suffix, convert each to mber between 0 and 4-1 (use your PatternToNumber code from HW2). This way you can update the value of matrix A accordingly. For example, the first 6-mer is TGTTTA So preix TGTTT-959 and suffix='GTTA'=764. This means you need to add 1 to the element of the matrix A that sits on the 960" row and 765h column. Section 2 of the code: Find an Eulerian path for your adjacency matrix. Hint: Use you FindEulerianpath code from HW#6 Subtract 1 from each node number on the path to have a number between 0 and 41 again. For example, if Path- 1960 765... The corrected numbers should be 1959 764.. Section 3 of the code: Assemble the genome from the corrected path numbers Hint: Convert each corrected path number to a string of 5 nucleotides (use your NumberTo Pattern code from HW2). Take the whole 5 nucleotides from the first node, but for the rest of the nodes, just add the last nucleotide from their converted 5-mer strings

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Write a matlab code named "HW7.m" which takes a collection of k-mer reads and outputs the assembled genome. Codes referenced provided below PatternToNumber code: function number =...

Write a matlab code named "HW7.m" which takes a collection of k-mer reads and outputs the assembled genome. Referenced codes are provided below. Help with any section is greatly appreciated...

Using Matlab write a code for the following The referenced code provided below PatternToNumber code: function number = PatternToNumber(pattern) number = 0; CharacterMap = containers.Map({'A', 'C',...

I've created Matlab code for this question, everything works fine for both rk4 and rk3 but separately. How can I include RK3 and RK4 in one program, and if you find any requirements not present,...

Write a Matlab code named CalculateDistancePatternDna.m to obtain the distance between a Pattern and a set of strings Dna A set of string for example could be Dna = ['ATCCT';'GTCTA';'ATTCG';'TTCGA'];...

Problem statement While typical engineering systems have positive damping, it is possible to design systems with negative damping by adding an active feedback loop to the system. This type of systems...

The primary form of data in MATLAB is vectors. They can be one-dimensional (i.e., vectors), or two-dimensional (i.e., matrices). a) Write the MATLAB code to create a 1 times 3 vector containing the...

Please Write a MATLAB function named ReadBatchData and use MATLAB. Problem #2 Sutures are strands or fibers used to sew living tissue together after an injury or an operation. Packages of sutures...

Develop the matrix problems Ax = b for the following systems. Please develop a MatLab code for this problem and for all the parts. Develop your code logically and please make sure it is correct. 2....

A sophomore with nothing better to do adds heat to 0350 kg of ice at 0.0oC until it is all melted. (a) What is the change in entropy of the water? (b) The source of heat is a very massive body at a...

solve the function Suppose f(x) = Evaluate f(2) Evaluate f(10) Evaluate f(6) Evaluate f(4) (0.5x 1 if x 8 A A A/

The YTMs of three $ 1 , OOO face value bonds that mature in 1 0 years and have the same level of risk are equal. Bond A has an 8 % annual coupon, Bond B has a 1 0 % annual coupon, and Bond C has a 1...

Complete the following using present value. (Use the Table 12.3 provided.) Note: Do not round Intermedlate calculatlons. Round the "Rate used" to the nearest tenth percent. Round the "PV factor" to 4...

Experience with SharePoint and/or Microsoft Project desirable

3. Have you taken course work in project management? What do you know about process documentation?

Knowledge of process documentation (process flow charting)