Question: I need this exercise coded in Java.
Exercise 4.27. A DNA sequence is made up of a sequence of four nucleotide bases, A, C, G, T (adenine, cytosine, guanine, thymine). One particularly interesting statistic of a DNA sequence is its CG islands: the subsequences that contain the highest frequency of guanine and cytosine. For simplicity, we will only be interested in subsequences of a particular length, n, which will be provided as part of the input. Write a program that takes, as command line arguments, an integer n and a DNA sequence. The program should then find all subsequences of length n in the given DNA string that have the maximal frequency of C and G. For example, if the DNA sequence is
ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGC
and the window size that we are interested in is n = 5, then you would scan the sequence and find every subsequence with the maximum number of C or G bases. Your output should include all CG islands (by indices) in the sequence, similar to the following.
n = 5 highest frequency: 5 / 5 = 100.00%
CG Islands:
15 thru 20: CCCCC
16 thru 21: CCCCG
17 thru 22: CCCGG
18 thru 23: CCGGC
19 thru 24: CGGCC
42 thru 47: CCGGG
43 thru 48: CGGGG
44 thru 49: GGGGC
45 thru 50: GGGCC
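One possible way to approach this in Java is sketched below. It is not an official solution to the exercise. It assumes that "subsequence" means a contiguous window of n bases, and that the indices in the sample output are a zero-based start followed by an exclusive end (start + n), which is what the sample lines such as "15 thru 20: CCCCC" appear to use. The class name CgIslands and the helper methods isCg, windowCounts, maxCgCount, islandStarts and report are names chosen here for illustration, and treating lowercase bases as valid is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CgIslands {

    // True for a C or G base. Accepting lowercase letters is an assumption.
    static boolean isCg(char base) {
        char b = Character.toUpperCase(base);
        return b == 'C' || b == 'G';
    }

    // counts[i] holds the number of C/G bases in dna.substring(i, i + n).
    // Uses a sliding window: drop the base leaving the window, add the one entering.
    static int[] windowCounts(String dna, int n) {
        if (n <= 0 || n > dna.length()) {
            throw new IllegalArgumentException("n must be between 1 and the sequence length");
        }
        int[] counts = new int[dna.length() - n + 1];
        int current = 0;
        for (int i = 0; i < n; i++) {
            if (isCg(dna.charAt(i))) current++;
        }
        counts[0] = current;
        for (int start = 1; start < counts.length; start++) {
            if (isCg(dna.charAt(start - 1))) current--;
            if (isCg(dna.charAt(start + n - 1))) current++;
            counts[start] = current;
        }
        return counts;
    }

    // Largest C/G count found in any window of length n.
    static int maxCgCount(String dna, int n) {
        int max = 0;
        for (int c : windowCounts(dna, n)) max = Math.max(max, c);
        return max;
    }

    // Zero-based start index of every window whose C/G count equals the maximum.
    static List<Integer> islandStarts(String dna, int n) {
        int[] counts = windowCounts(dna, n);
        int max = 0;
        for (int c : counts) max = Math.max(max, c);
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == max) starts.add(i);
        }
        return starts;
    }

    // Builds the report in the layout of the sample output.
    static String report(String dna, int n) {
        int max = maxCgCount(dna, n);
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.ROOT, "n = %d highest frequency: %d / %d = %.2f%%%n",
                n, max, n, 100.0 * max / n));
        sb.append(String.format("CG Islands:%n"));
        for (int start : islandStarts(dna, n)) {
            // The end index printed is exclusive (start + n), matching the sample.
            sb.append(String.format("%d thru %d: %s%n",
                    start, start + n, dna.substring(start, start + n)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java CgIslands <n> <dnaSequence>");
            System.exit(1);
        }
        int n = Integer.parseInt(args[0]);
        System.out.print(report(args[1], n));
    }
}
```

Saved as CgIslands.java and compiled, it can be run as: java CgIslands 5 ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGC. The sliding window keeps the scan linear in the length of the sequence, since each step only looks at the base that leaves the window and the base that enters it. If no window contains a C or G, every window ties at 0% and all of them are listed.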
