Question: Include all the code with the functions (the "def") from the book chapter in a Google Colaboratory. Now write some code where a DNA sequence

Include all the code with the functions (the "def" blocks) from the book chapter in a Google Colaboratory notebook. Now write some code where a DNA sequence of length at least 21 is given (your choice of sequence, composed of A, C, G, T), and for this sequence give its transcription to RNA, its translation to protein, and, for the protein obtained after translation, the molecular weight and a scan for hydrophobicity. This code has to be structured so that each operation is in its own function (def), and each operation, for example translation, is performed by passing the DNA sequence to the function as an argument, with the protein sequence returned (similarly, you pass the protein sequence to get back the hydrophobicity).

3. Coding exercise on Python dictionaries (30 points). Now, using the material from the last part of the chapter, "accessing data from public databases", we will fetch some data from NCBI Entrez and ExPASy. You can do a web search to learn more about these databases (NCBI stands for National Center for Biotechnology Information), which we will keep using in subsequent lectures and homeworks. Describe briefly what each database provides.

One thing you need to do in your Google Colaboratory notebook is install BioPython for the code to work, by executing the command !pip install biopython in a code cell before you write any other code. It will take a minute or so, and then it will conclude with the message "Successfully installed biopython-1.78". Then run the code shown in the book chapter, for both Entrez and ExPASy. Very important: you need to add the line from Bio import SeqIO, as the way it is written in the book does not make clear that this line is needed for the code to work. You will notice that the code uses dnaObj.description and dnaObj.seq (and similarly for the protein). Explain what each is for, based on the output you see after you run the code.
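For the sequence exercise (DNA to RNA, DNA to protein, then molecular weight and hydrophobicity of the protein), the one-function-per-operation structure might look like the minimal sketch below. This is not the book chapter's code. The function names, the example sequence, the window size of 3, the use of average residue masses, and the choice of the Kyte-Doolittle scale are all assumptions made for illustration. Transcription here treats the input as the coding strand and simply replaces T with U.

```python
# Minimal sketch (not the book chapter's code): one function per operation.
from itertools import product

# Standard genetic code in NCBI table 1 order (bases T, C, A, G; '*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.02

# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def transcribe(dna):
    """Return the RNA matching the given coding strand (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """Translate a DNA coding strand in frame 0, up to the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def molecular_weight(protein):
    """Estimate the mass in daltons: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER_MASS


def hydrophobicity_scan(protein, window=3):
    """Return the mean hydropathy of each sliding window along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


dna = "ATGGCCATTGTAATGGGCCGC"  # arbitrary 21-base example sequence
rna = transcribe(dna)
protein = translate(dna)
print(rna)
print(protein)  # MAIVMGR
print(round(molecular_weight(protein), 2))
print([round(score, 2) for score in hydrophobicity_scan(protein)])
```

Each function takes a sequence as its argument and returns the result, so the protein returned by translate can be passed straight to molecular_weight and hydrophobicity_scan.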
For the protein data fetch from ExPASy, expand the code so that it saves the data from proteinObj.description to a variable, then processes that variable and creates a dictionary with key-value pairs such as, for example, {"RecName": "...Hemoglobin...", "AltName": "...Beta...", ...}. The example data I show for this dictionary will make more sense when you run the code and see what is returned from proteinObj.description. To complete this, you will need to do some string splitting and looping that we have seen in previous lectures, combined with dictionary creation, in addition to the data retrieval from remote databases that we are seeing now. Finally, get the data from proteinObj.seq, find its molecular weight (using a function from question no. 2 above), and add that information as an additional key-value pair in the dictionary (you can name the key "molweight" or whatever you like).
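The string splitting and dictionary building for the description could be sketched as follows. The sample string is illustrative only and stands in for whatever proteinObj.description actually returns. The function name description_to_dict, the mol_weight parameter, and the placeholder weight are hypothetical. Because a key such as AltName can appear more than once, this sketch stores each key's values in a list, which is one possible design choice rather than the required one.

```python
def description_to_dict(description, mol_weight=None):
    """Split a 'Key: value; Key: value; ...' string into a dictionary.

    Repeated keys are kept by collecting their values in a list.
    """
    info = {}
    for field in description.split(";"):
        field = field.strip()
        if not field:
            continue  # skip the empty piece after a trailing ';'
        key, separator, value = field.partition(":")
        if not separator:
            continue  # no colon, so this is not a key-value field
        info.setdefault(key.strip(), []).append(value.strip())
    if mol_weight is not None:
        info["molweight"] = mol_weight
    return info


# Illustrative text only; the real text comes from proteinObj.description.
sample = ("RecName: Full=Hemoglobin subunit beta; "
          "AltName: Full=Beta-globin; "
          "AltName: Full=Hemoglobin beta chain;")

# 1234.5 is a placeholder; in the exercise it would be the weight
# computed from proteinObj.seq with the function from question 2.
info = description_to_dict(sample, mol_weight=1234.5)
print(info)
```

In the notebook, the description argument would be the variable holding proteinObj.description, and mol_weight would be the value returned by the molecular weight function applied to str(proteinObj.seq).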

