Question:

For this assignment you will write a single Python script that will have several steps &
specific requirements. Be sure your file name includes your last name and module number.
For full credit the script should include comments, run without error and produce correct,
properly formatted output. Partial credit will be given on the following scale:
Runs without error, correct output data, output not formatted properly: 90%
Runs without error, incorrect output data: 75%
Does not run without errors, code is mostly accurate: 60%
Does not run without errors, some code is accurate: 50%
Any of the above, without comments: -10% more
Besides this document, there are two attached files: one is an input file (cftr_cds.fa), the
other (gen_code.py) contains a full genetic code dictionary that you can copy & paste into
your script.
Your script will open the input file, which is a DNA sequence for the CFTR CDS region. You
will calculate the percent CG content, transcribe the DNA into RNA, translate the RNA into
protein and write the protein sequence to a new text file.
Some additional details:
Write a function to calculate the percent CG content. The function will take a single
argument, a nucleotide sequence, and return the percent CG. In the function use this
logic: Percent CG = ((# of Cs + # of Gs) / length of sequence) * 100. This will
generate a float with Python's standard precision, with a lot of decimal places. Use
the round() function to round the result to 2 decimal places before you return the
value. In Spyder, you can get help on the round function syntax by typing
help(round) in the console (also known as the interactive shell).
After the percent CG content is calculated and returned, use a conditional to check
the value. If it's > 40, print a message to the screen that states the percent CG
content.
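The two points above can be sketched as follows. This is a minimal illustration, not the required solution: the function name percent_cg, the example sequence, the handling of an empty sequence, and the wording of the message are all assumptions.

```python
def percent_cg(sequence):
    """Return the percent of C and G bases in a nucleotide sequence, rounded to 2 places."""
    seq = sequence.upper()
    if not seq:
        # Assumption: report 0.0 for an empty sequence instead of dividing by zero.
        return 0.0
    cg_count = seq.count("C") + seq.count("G")
    # Percent CG = ((# of Cs + # of Gs) / length of sequence) * 100
    return round((cg_count / len(seq)) * 100, 2)


example_seq = "ATGCGCGATTAA"  # hypothetical short sequence, not the CFTR input
cg = percent_cg(example_seq)

# Only report the value when it exceeds the 40% threshold.
if cg > 40:
    print(f"The percent CG content is {cg}%")
```

With the 12-base example, 5 of the bases are C or G, so the function returns 41.67 and the message is printed.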
Transcription to RNA is performed by substituting the base T with the base U.
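A minimal sketch of this substitution, assuming the sequence is held in a plain string; the function name transcribe and the example sequence are illustrative only.

```python
def transcribe(dna):
    """Return the RNA version of a DNA coding sequence (every T replaced with U)."""
    # upper() is an assumption so that lowercase input is handled the same way.
    return dna.upper().replace("T", "U")


rna = transcribe("ATGCAGAGGTCG")  # hypothetical short sequence
print(rna)  # AUGCAGAGGUCG
```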
CMBI-6621/ BIOL-4521
Be sure to use the with statement when opening your files.
When you write the protein sequence to your file, include a new header that has the
gene name and some additional text that indicates it's protein. Format the sequence
so it's 60 characters per line. By this time you have seen several examples of doing
this, so you can adapt that example code.
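One possible way to handle the file steps is sketched below. It assumes the input is a single-record FASTA file whose header line starts with ">"; the function names, the header text, and the temporary demo file are hypothetical and stand in for the real input and output files.

```python
import os
import tempfile


def read_fasta_sequence(path):
    """Read a single-record FASTA file and return the sequence as one string."""
    with open(path) as infile:
        lines = [line.strip() for line in infile]
    # Skip header lines (starting with '>') and blank lines, then join the rest.
    return "".join(line for line in lines if line and not line.startswith(">"))


def write_protein(path, header, protein, width=60):
    """Write a FASTA-style header, then the sequence at `width` characters per line."""
    with open(path, "w") as outfile:
        outfile.write(f">{header}\n")
        for start in range(0, len(protein), width):
            outfile.write(protein[start:start + width] + "\n")


# Demo with a made-up 130-residue sequence written to a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_protein.txt")
write_protein(demo_path, "CFTR protein sequence", "M" * 130)

with open(demo_path) as check:
    for line in check:
        print(len(line.rstrip("\n")), line[:20].rstrip("\n"))
```

Using with for both files means each file is closed automatically, even if an error occurs partway through.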
The full genetic code dictionary is provided to save you some typing. You can open
the attached file in Spyder and simply copy & paste the dictionary into your script.
Alternatively, you can save the file in the same directory as your script and import it
into your script with the following command:
from gen_code import gen_code
If you do this, put it at the beginning of your script.
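The translation step could then look something like the sketch below. The contents of gen_code.py are not shown here, so this example uses a small stand-in dictionary, demo_code, and assumes the keys are RNA codons, the values are one-letter amino acid codes, and stop codons map to "*". The real dictionary may differ on any of these points.

```python
# Hypothetical subset of a codon table. The real gen_code dictionary covers
# all 64 codons and may use different keys or a different stop symbol.
demo_code = {
    "AUG": "M", "CAG": "Q", "AGG": "R", "UCG": "S",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def translate(rna, code):
    """Translate an RNA sequence codon by codon, stopping at the first stop codon."""
    protein = []
    # Step through three bases at a time, ignoring any trailing partial codon.
    for start in range(0, len(rna) - len(rna) % 3, 3):
        amino_acid = code[rna[start:start + 3]]
        if amino_acid == "*":  # assumption: '*' marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGCAGAGGUCGUAG", demo_code))  # MQRS
```

In the actual script, the imported gen_code dictionary would be passed in place of demo_code.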
