Question:

The regular expression corresponding to a standard coding sequence (CDS) is given by:

ATG(?:[ACGT]{3})*(?:TAA|TGA|TAG)

Note: it is not necessary to understand the regular expression, which is more advanced than the previous examples we have seen. For completeness, it can be interpreted as follows:

ATG - the start codon
(?:[ACGT]{3})* - zero or more codons (any three nucleotides)
(?:TAA|TGA|TAG) - any one of the stop codons

Suppose that a file called sequences.fasta contains a large number of sequences in FASTA format. Write a Python script that generates a list of the sequences that contain at least one possible CDS.
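One possible approach, sketched below using only the standard library, is to read the FASTA file record by record and keep each record whose sequence matches the pattern somewhere. The helper names read_fasta and sequences_with_cds are hypothetical, and the sketch assumes that sequences may span several lines, that lowercase letters should be treated like uppercase ones, and that "a list of the sequences" means (header, sequence) pairs. None of these choices is stated in the question.

```python
import re
from pathlib import Path

# Pattern taken verbatim from the question: a start codon, zero or more
# in-frame codons, then one of the three stop codons.
CDS_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*(?:TAA|TGA|TAG)")


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file.

    Assumption: each header line starts with '>' and the sequence may be
    wrapped over several lines, which are joined together here.
    """
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                # Assumption: lowercase bases count as their uppercase form.
                chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def sequences_with_cds(path):
    """Return the records whose sequence contains at least one possible CDS."""
    return [
        (header, seq)
        for header, seq in read_fasta(path)
        if CDS_PATTERN.search(seq)  # search() looks for a match anywhere
    ]


if __name__ == "__main__":
    fasta_path = Path("sequences.fasta")
    if fasta_path.exists():
        for header, _seq in sequences_with_cds(fasta_path):
            print(header)
    else:
        print(f"{fasta_path} not found")
```

Because re.search scans the whole sequence, a record is kept as soon as any ATG is followed, in the same reading frame, by a stop codon. Characters outside A, C, G and T (for example N) break a candidate match, since the pattern only accepts those four letters. If Biopython is available, its SeqIO parser could replace read_fasta, but that is not required here.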
