Question: MATLAB ASSGNMT: I NEED HELP FOR PART 2 AND 3 HERE'S WHAT I DID FOR PART 1 : Part 1: Create a Matlab script hw7_1.m
MATLAB ASSGNMT: I NEED HELP FOR PART 2 AND 3 HERE'S WHAT I DID FOR PART 1: Part 1: Create a Matlab script hw7_1.m that prompts the user for two DNA sequence strings and scores an alignment. Each A or T match contributes a value of +2 Each C or G match contributes a value of +3 Each mismatch or Gap contributes a score of -2 You can assume the input will be valid (only DNA bases) and the sequences will be at most 99 bases in length. For example, two sequences and their corresponding score would be: Sequence 1: ATGCTGACTGCA Sequence 2: CTTGAGACG--- A/T score = 2* 2= +4 G/C score = 2* 3= +6 Mismatches + GAPs = 8*-2= -16 Score = -6 Part 1: clc; clear; % Read the DNA sequences from user seq1 = input('Please enter the first DNA sequence ', 's'); seq2 = input('Please enter the second DNA sequence ', 's'); score = 0.0; score = score - (2*abs(numel(seq1) - numel(seq2))); if numel(seq1) < numel(seq2) shorterLength = numel(seq1); else shorterLength = numel(seq2); end for index=1:1:shorterLength if (seq1(index)==seq2(index)) if(seq1(index) == 'A' || seq1(index) == 'T') score = score + 2; else score = score + 3; end else score = score - 2; end end score Part 2: Create a Matlab script hw7_2.m (Copy and modify part 1). The second program will additionally access if each DNA sequence could be a coding strand for a protein (assume no introns). Scan each sequence (in both forward and reverse directions in all 3 reading frames) to determine if there is at least one start codon (i.e. the coding strand DNA must contain ATG, which is transcribed to the RNA start codon AUG) with an in frame stop codon (i.e. the coding strand DNA must contain TAA, TAG or TGA, which are transcribed to the RNA stop codons UAA, UAG or UGA, respectively). The output for each sequence should simply be a message that the sequence does or does not code for a protein. If the sequence does, report the number of amino acids in the translated protein.
% CODE TO START PART 3
function HW7_3
% Ask the user for DNA sequence 1.
prompt = 'Enter the first DNA sequence of length 1 to 99: ';
seq1 = input(prompt, 's');
len1 = length(seq1);
sprintf(' Seq 1 %s has %d bases', seq1, len1)
% Ask the user for DNA sequence 2.
prompt = 'Enter the second DNA sequence of length 1 to 99: ';
seq2 = input(prompt, 's');
len2 = length(seq2);
sprintf(' Seq 2 %s has %d bases', seq2, len2)
% Guess sequence 1 is longer than sequence 2.
longSeq = seq1;
longLen = len1;
shortSeq = seq2;
shortLen = len2;
% Correct an incorrect guess.
if (len2 > len1)
longSeq = seq2;
longLen = len2;
shortSeq = seq1;
shortLen = len1;
end
% Create "buffered" strings to facilitate the scan.
bufLength = longLen + 2 * shortLen;
scan1 = blanks(bufLength);
scan2 = blanks(bufLength);
% Fill the "scan" strings with dashes
for i = 1:bufLength
scan1(i) = '-';
scan2(i) = '-';
end
% Initialize the first scan string with the long sequence in the middle.
for i = shortLen + 1:shortLen + longLen
scan1(i) = longSeq(i-shortLen);
end
% Initialize the second scan string with the short sequence at the
% beginning.
for i = 1:shortLen
scan2(i) = shortSeq(i);
end
% Score & print the initial alignment.
score = scoreSequences(scan1, scan2);
printSeq(scan1, scan2, score);
% Rotate the short sequence to the right, score & print.
scan2 = rotateSeqRight(scan2);
score = scoreSequences(scan1, scan2);
printSeq(scan1, scan2, score);
% Repeat in a loop!!
end
% Use your scoring logic from part 1.
function score = scoreSequences(seq1, seq2)
% To Do!
score = 0;
end
% Print out the sequences and score.
function printSeq(seq1, seq2, score)
fprintf(1, '%s ', seq1);
fprintf(1, '%s ', seq2);
fprintf(1, 'Score: %d ', score);
end
% Rotate the sequence to the right.
function seq = rotateSeqRight(seq)
seqLen = length(seq);
last = seq(seqLen);
for i=seqLen:-1:2
seq(i) = seq(i-1);
end
seq(1) = last;
end
Part 3: Create a Matlab script hw7_3.m. Please see the code below as a starting point. This script will try all overlapping combinations to find the maximum alignment score. You can add gaps on the front/back of the sequences to facilitate the analysis. You do not need to insert internal gaps. Each A or T match contributes a score of +2 Each C or G match contributes a score of +3 Each mismatch or gap contributes a score of-2 To simplify the assignment, any character paired with a dash (- should be called a gap. For example: Enter the fir t DNA equence of length 1 to 991 Enter the eoond DHA equence of length 1 to 99 seql seg2 0 C/G matches 21 mismatches + gape scorei-42 8eq2 0 c/G matches 20 mismatohes gaps soore -40 seg2 0 C/G matches 19 mismatches gaps 8eq2 0 c/G matches 21 mismatches + gape scorei-42 NanSoore-22? Hints: Although there ae many ways to do part 3; it is surprisingly difficult. You have to be mindful of the end of the array/string (depending on how you implement your solution) and careful with loop indices. For my solution, I made sure to always know which string was the longest. also made a function called "otateSeqRight(seq)" that simply took an array as input, and rotated all the characters right one positionStep by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
