The regular expression corresponding to a standard coding sequence (CDS) is given by:

ATG(?:[ACGT]{3})*(?:TAA|TGA|TAG)

Note: it is not necessary to understand the regular expression, which is more advanced than the previous examples we have seen. For completeness, it can be interpreted as follows:

ATG - the start codon, ATG
(?:[ACGT]{3})* - zero or more codons (any three nucleotides)
(?:TAA|TGA|TAG) - any one of the stop codons

Suppose that a file called sequences.fasta contains a large number of sequences in FASTA format. Write a Python script that generates a list of the sequences that contain at least one possible CDS.
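One possible approach is sketched below, using only the standard library. It makes several assumptions that the question does not state: records start with a ">" header line, a sequence may wrap over several lines, lowercase nucleotides should count (sequences are uppercased before matching), and the "list of sequences" is returned as (header, sequence) pairs. The names read_fasta, sequences_with_cds and main are illustrative, not part of the question.

```python
import re

# Pattern from the question: start codon, zero or more codons, stop codon.
CDS_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*(?:TAA|TGA|TAG)")


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file.

    Sequence lines belonging to one record are joined into a single string.
    """
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts; emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    # Emit the last record in the file.
    if header is not None:
        yield header, "".join(chunks)


def sequences_with_cds(path):
    """Return (header, sequence) pairs whose sequence contains a possible CDS."""
    matches = []
    for header, sequence in read_fasta(path):
        # search() looks for the pattern anywhere in the sequence.
        # Uppercasing is an assumption so that lowercase input also matches.
        if CDS_PATTERN.search(sequence.upper()):
            matches.append((header, sequence))
    return matches


def main(path="sequences.fasta"):
    """Print the header of every sequence with at least one possible CDS."""
    for header, _ in sequences_with_cds(path):
        print(header)
```

Calling main() reads sequences.fasta from the current directory and prints the matching headers, while sequences_with_cds("sequences.fasta") returns the list itself. The sketch scans only the strand as written in the file; checking the reverse complement would be a separate step.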
