Question: Why does this Python program for finding open reading frames (ORFs) not work?

This Python program is supposed to look for open reading frames (ORFs): strings that start with ATG and end with TAA, TGA, or TAG.

It is also supposed to count the length of each ORF (how many characters are in the string). However, it doesn't work. Why?

import re
import string

with open('dna.txt', 'rb') as f:
    data=f.read (GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGAATGATGAAAAA)

data = [x.split(' ', 1) for x in data.split('>')]
data = [(x[0], ''.join(x[1].split())) for x in data if len(x) == 2]

start, end = [re.compile(x) for x in 'ATG TAG|TGA|TAA'.split()]

revtrans = string.maketrans("ATGC","TACG")

def get_longest(starts, ends):
    '''
    Simple brute-force for now. Optimize later...
    Given a list of start locations and a list of end locations,
    return the longest valid string. Returns tuple (length, start position)

    Assume starts and ends are sorted correctly from beginning to end of string.
    '''
    results = {}
    # Use smallest end that is bigger than each start
    ends.reverse()
    for start in starts:
        for end in ends:
            if end > start and (end - start) % 3 == 0:
                results[start] = end + 3
    results = [(end - start, start) for start, end in results.iteritems()]
    return max(results) if results else (0, 0)

def get_orfs(dna):
    '''
    Returns length, header, forward/reverse indication,
    and longest match (corrected if reversed)
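
For comparison, the ORF definition given in the question (start at ATG, read codon by codon, stop at the first in-frame TAA, TGA, or TAG) can be sketched in a few lines. This is only a minimal sketch, not the asker's program and not a diagnosis of it. It assumes Python 3 (the posted code uses Python 2 idioms such as `string.maketrans` and `dict.iteritems`), an uppercase sequence already held in a string, and a length that includes the stop codon. The names `find_orfs`, `longest_orf`, and `reverse_complement` are hypothetical and chosen for illustration.

```python
import re

# Stop codons named in the question.
STOP_CODONS = {"TAA", "TGA", "TAG"}

# Base-complement table; str.maketrans is the Python 3 spelling.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna):
    """Yield (start, length, orf) for every ATG that has an in-frame stop.

    Each ORF runs from the ATG through the first in-frame stop codon;
    length is the number of characters, stop codon included (an assumption).
    """
    for match in re.finditer("ATG", dna):
        start = match.start()
        # Step through whole codons after the ATG until a stop codon appears.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                orf = dna[start:pos + 3]
                yield start, len(orf), orf
                break


def longest_orf(dna):
    """Return (length, strand, orf) for the longest ORF on either strand."""
    best = (0, "+", "")
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for _start, length, orf in find_orfs(seq):
            if length > best[0]:
                best = (length, strand, orf)
    return best
```

On the forward strand of the sequence quoted in the question, `list(find_orfs(...))` yields a single ORF: it starts at index 16, is 30 characters long, and ends at the in-frame TGA. Walking codon by codon from each ATG avoids the separate start and end lists, so the in-frame check `(end - start) % 3 == 0` is not needed.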
