Question: Please solve in the R programming language.
(5) A DNA sequence consists of four types of bases (nucleotides): Adenine (A), Guanine (G), Cytosine (C), Thymine (T). Write a function that accepts a DNA sequence (as a single string consisting only of the characters A, G, C, T) and a number n >= 2, and returns a vector of all DNA subsequences (as strings) that start with the triplet ("codon") "AAA" or "GAA", end with the triplet ("codon") "AGT", and have at least 2 and at most n other triplets ("codons") between the start and the end. Note that n is not fixed; it must be a parameter supplied by the user. E.g., for "GAACCCACTAGTATAAAATTTGGGAGTCCCAAACCCTTTGGGAGT" and n = 2, the answer is "GAACCCACTAGT", "AAATTTGGGAGT". For n = 3 the answer is "GAACCCACTAGT", "AAATTTGGGAGT", "AAACCCTTTGGGAGT". For n = 7 the answer is "GAACCCACTAGTATAAAATTTGGGAGT", "AAACCCTTTGGGAGT". (Note also that in regular expressions longer matches take precedence.)
Step by Step Solution
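One possible approach, sketched here as an illustration rather than as the original answer, is to build a regular expression from n and collect every non-overlapping match. The function name `find_dna_subsequences` and the input checks are assumptions added for this example. The sketch relies on the hint in the question that longer matches take precedence: the quantifier `{2,n}` is greedy, so each match takes as many codons as the bound allows. Scanning runs left to right and is not restricted to a fixed reading frame, which appears to be what the expected answers imply.

```r
# Hypothetical helper name; not taken from the original question.
find_dna_subsequences <- function(dna, n) {
  # Basic input checks (an assumption; the question only promises valid input).
  stopifnot(is.character(dna), length(dna) == 1L,
            is.numeric(n), length(n) == 1L, n >= 2,
            grepl("^[ACGT]*$", dna))

  # Start codon AAA or GAA, then 2..n arbitrary codons, then end codon AGT.
  pattern <- sprintf("(AAA|GAA)([ACGT]{3}){2,%d}AGT", as.integer(n))

  # gregexpr() finds all non-overlapping matches from left to right;
  # regmatches() extracts them as a character vector (empty if none).
  regmatches(dna, gregexpr(pattern, dna, perl = TRUE))[[1]]
}

dna <- "GAACCCACTAGTATAAAATTTGGGAGTCCCAAACCCTTTGGGAGT"

find_dna_subsequences(dna, 2)
# "GAACCCACTAGT" "AAATTTGGGAGT"

find_dna_subsequences(dna, 3)
# "GAACCCACTAGT" "AAATTTGGGAGT" "AAACCCTTTGGGAGT"

find_dna_subsequences(dna, 7)
# "GAACCCACTAGTATAAAATTTGGGAGT" "AAACCCTTTGGGAGT"
```

These three calls reproduce the answers given in the question. Because matches are non-overlapping, a long match for large n (such as the 27-character one for n = 7) absorbs shorter candidates that would be reported separately for smaller n.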
