Question: CSE 231 Spring 2018 Programming Project 06 (Python)

Project page: http://www.cse.msu.edu/~cse231/Online/Projects/Project06/

C.elegans.gff file link: http://www.cse.msu.edu/~cse231/Online/Projects/Project06/C.elegans.gff

C.elegans_small.gff file link: http://www.cse.msu.edu/~cse231/Online/Projects/Project06/C.elegans_small.gff

CSE 231 Spring 2018
Programming Project 06

Edit 2/21: removed one line from Function Test 2 to improve clarity.

This assignment is worth 45 points and must be completed and turned in before 11:59 on Monday, February 26, 2018.

Assignment Overview

This assignment will give you more experience on the use of:
1. Lists and tuples
2. Functions
3. File manipulation

The goal of this project is to extract gene lengths from a gene annotation file. With a gene annotation GFF file, you will need to extract the gene coordinates on each chromosome and calculate the average and standard deviation of gene lengths.

Assignment Background

The eukaryotic genome is composed of multiple chromosomes. On each chromosome, there are multiple genes. In bioinformatics, genome annotations can be saved in a file format called GFF. The NCBI genome database (https://www.ncbi.nlm.nih.gov/genome/) contains many publicly available annotated organisms. These annotated genomes can be downloaded in multiple file formats, including GFF. For this project, we will focus on a relatively simple model species: Caenorhabditis elegans. This worm has a genome of six chromosomes named chrI, chrII, chrIII, chrIV, chrV, and chrX.

We provide two input files:
C.elegans_small.gff   # a small file for development
C.elegans.gff         # a real BIG data file

Project Description

a) open_file() prompts the user to enter a filename. The program will try to open a tab-separated GFF file (a text file). An error message should be shown if the file cannot be opened. This function will loop until it receives proper input and successfully opens the file. It returns a file pointer.

b) read_file(fp) receives a file pointer of the data file and reads all the gene information. For this project, we are only interested in the following columns: the chromosome name (string) is in column 0, the gene_start is in column 3, and the gene_end is in column 4. Convert number strings to int. No other values are needed for this project.
If a value is missing, use 0 as the value. For each gene, save it in a tuple, (chromosome, gene_start, gene_end), and append each tuple to a list of genes. Sort the list and then return the sorted list of genes (sorting makes a canonical list for comparison testing on Mimir).

c) extract_chromosome(genes_list, chromosome) receives a list of genes (such as what was returned by the read_file() function) and a chromosome name, extracts the gene information for this chromosome, and saves it in the list chrom_gene_list. Sort and return the list (sorting makes a canonical list for comparison testing on Mimir).

d) extract_genome(genes_list) receives a list of genes and extracts the gene information for each chromosome. In this function, use extract_chromosome(genes_list, chromosome) to extract
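
The function descriptions above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the course's reference solution. The prompt and error strings are placeholders, lines starting with "#" are assumed to be GFF comments and are skipped, and any coordinate that cannot be converted to int is treated as missing. The helpers to_int and gene_length_stats are hypothetical names that do not appear in the assignment text, and the length formula (gene_end - gene_start + 1) and the use of the population standard deviation are assumptions, since the excerpt does not define either.

```python
import statistics


def open_file():
    """Prompt until a file opens. Prompt and error text are placeholders."""
    while True:
        filename = input("Input a file name: ")
        try:
            return open(filename, "r")
        except OSError:  # includes FileNotFoundError
            print("Unable to open file.")


def to_int(text):
    """Hypothetical helper: convert a number string to int, or 0 if missing."""
    try:
        return int(text)
    except ValueError:
        return 0


def read_file(fp):
    """Return a sorted list of (chromosome, gene_start, gene_end) tuples."""
    genes_list = []
    for line in fp:
        line = line.rstrip("\n")
        # Assumption: blank lines and "#" comment lines carry no gene data.
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        # Pad short rows so absent columns count as missing values.
        columns += [""] * (5 - len(columns))
        genes_list.append((columns[0], to_int(columns[3]), to_int(columns[4])))
    return sorted(genes_list)


def extract_chromosome(genes_list, chromosome):
    """Return the sorted genes whose chromosome name matches exactly."""
    chrom_gene_list = [gene for gene in genes_list if gene[0] == chromosome]
    return sorted(chrom_gene_list)


def gene_length_stats(chrom_gene_list):
    """Hypothetical helper: (mean, std) of gene lengths.

    Assumes length = gene_end - gene_start + 1 and a population
    standard deviation; neither is specified in the excerpt.
    """
    lengths = [end - start + 1 for _, start, end in chrom_gene_list]
    return statistics.mean(lengths), statistics.pstdev(lengths)
```

Because read_file only iterates over fp, it can be tried on a plain list of strings before pointing it at C.elegans_small.gff. Note that this sketch keeps every non-comment row; if only rows of a particular feature type should count as genes, an extra check on the relevant column would be needed.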

