Question: Include all the code with the functions (the def) from the book chapter in a Google Colaboratory. Now write some code where a DNA sequence


Include all the code with the functions (the "def" blocks) from the book chapter in a Google Colaboratory notebook. Now write some code where a DNA sequence of length at least 21 is given (your choice of sequence, composed of A, C, G, T), and for this sequence give its transcription to RNA, its translation to protein, and, for the protein obtained after translation, the molecular weight and a scan for hydrophobicity. This code has to be structured so that each of the different operations is in its own function (def), and each operation (for example, translation) is performed by passing the DNA sequence to the function as an argument, with the protein sequence returned (similarly when you pass a protein sequence to get back its hydrophobicity).

3. Coding exercise on Python dictionaries (30 points). Now, using the material from the last part of the chapter, "accessing data from public databases", we will fetch some data from NCBI Entrez and ExPASy. You can do a web search to learn more about these databases (NCBI stands for National Center for Biotechnology Information), which we will keep using in subsequent lectures and homeworks. Describe briefly what each database provides.

One thing you need to do in your Google Colaboratory notebook is install BioPython for the code to work, by executing the command !pip install biopython in a code cell before you write any other code. It will take a minute or so, and will conclude with the message "Successfully installed biopython-1.78". Then run the code shown in the book chapter, for both Entrez and ExPASy. Very important: you need to add the line from Bio import SeqIO, because the way the code is written in the book does not make it clear that this line is needed for the code to work. You will notice that the code uses dnaObj.description and dnaObj.seq (and similarly for the protein). Explain what each is for, based on the output you see after you run the code.
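For the DNA exercise in the first part of the question, one possible layout is sketched below: one function per operation, each taking a sequence as its argument and returning the result. This is not the book chapter's code. The function names, the example sequence, the average residue masses, and the choice of the Kyte-Doolittle scale for the hydrophobicity scan are all assumptions made for illustration; the book's own tables may use different values.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering
# and stored with RNA codons (T replaced by U). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon).replace("T", "U"): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: average residue masses in daltons (not the book's table).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water is added back for the free chain ends

# Assumption: Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def transcribe(dna):
    """Return the RNA version of a coding-strand DNA sequence."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """Translate DNA to a one-letter protein string, stopping at a stop codon."""
    rna = transcribe(dna)
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def molecular_weight(protein):
    """Sum the residue masses and add one water molecule."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER_MASS


def hydrophobicity_scan(protein, window=3):
    """Return the mean hydropathy of every window along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical 21-nucleotide example sequence (7 codons, no stop codon).
dna_seq = "ATGGCCCTGTGGATGCGCCTC"
rna_seq = transcribe(dna_seq)
protein_seq = translate(dna_seq)
weight = molecular_weight(protein_seq)
scan = hydrophobicity_scan(protein_seq, window=3)

print("RNA:    ", rna_seq)
print("Protein:", protein_seq)
print("Weight: ", round(weight, 2))
print("Scan:   ", [round(value, 2) for value in scan])
```

Keeping each step in its own function means molecular_weight can be reused unchanged on any other protein string, including one fetched from a remote database.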
For the protein data fetch from ExPASy, expand the code so that it saves the data from proteinObj.description to a variable, then processes that variable and creates a dictionary with key-value pairs such as, for example, {"RecName": "...Hemoglobin...", "Altname": "...Beta...", ...}. The example data shown for this dictionary will make more sense when you run the code and see what is returned from proteinObj.description. To complete this, you will need to do some string splitting and looping that we have seen in previous lectures, combined with dictionary creation, in addition to the data retrieval from remote databases that we are seeing now. Finally, get the data from proteinObj.seq, find its molecular weight (using a function from question no. 2 above), and add that information as an additional key-value pair to the dictionary (you can name the key "molweight" or whatever you like).
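The string splitting and dictionary building described here could look roughly like the sketch below. To stay runnable without a network connection it does not call ExPASy: the description string and the short sequence are hypothetical stand-ins, written in the style of a Swiss-Prot record, for what proteinObj.description and str(proteinObj.seq) would hold in the notebook. The function names, the " | " separator for repeated keys, and the average residue masses are likewise assumptions.

```python
# Hypothetical stand-in for proteinObj.description (not verbatim database output).
description = (
    "RecName: Full=Hemoglobin subunit beta; "
    "AltName: Full=Beta-globin; "
    "AltName: Full=Hemoglobin beta chain;"
)

# Hypothetical stand-in for str(proteinObj.seq): only a short fragment.
protein_seq = "MVHLTPEEK"

# Assumption: average residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153


def molecular_weight(protein):
    """Sum the residue masses and add one water molecule."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER_MASS


def description_to_dict(text):
    """Split 'Key: value; Key: value;' text into a dictionary."""
    info = {}
    for field in text.split(";"):
        field = field.strip()
        if ":" not in field:
            continue  # skips the empty piece after the final semicolon
        key, value = field.split(":", 1)  # split on the first colon only
        key, value = key.strip(), value.strip()
        if key in info:
            # A key such as AltName can repeat; keep every value.
            info[key] += " | " + value
        else:
            info[key] = value
    return info


info = description_to_dict(description)
info["molweight"] = molecular_weight(protein_seq)

for key, value in info.items():
    print(key, "->", value)
```

Splitting on the first colon only keeps values that themselves contain a colon intact. Joining repeated keys is one way to avoid a later AltName silently overwriting an earlier one; storing a list per key would work as well.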
