Question: Assignment 3 - Summer 2018 CS 4329 Introduction to bioinformatics. (100 points) Due Date: 11:59 PM of 06/27/2018 All code must be written in C++

Assignment 3 - Summer 2018

CS 4329 Introduction to bioinformatics. (100 points)

Due Date: 11:59 PM of 06/27/2018

All code must be written in C++ (C++ was also used in CS1, CS2, and CS3). You should submit:

The source code.

A document showing the output.

Put all the individual files in one single folder and compress the folder. Upload the compressed folder. Files should include your first and last names.

1. In this question, you will investigate the nucleotides at the splice sites (the boundaries between exons and introns) within protein-coding genes in the human genome. You are given a FASTA file called gene_fasta_chr12.fa, which contains the sequences of 2,412 randomly selected protein-coding genes from human chromosome 12. Each sequence includes both the exon and intron portions of the gene. The nucleotides in exons are uppercase and the ones in introns are lowercase. Implement programs to compute the following [100 points]:

Average number of exons in a gene

Average number of introns in a gene

Length of the longest and shortest intron

Length of the longest and shortest exon

Look at the positions immediately after each exon (the donor site, i.e. the first two bases of each intron) in all the genes and count the frequency of every possible 2-mer at those locations. (GT is expected to have the highest frequency.)

Look at the positions immediately before internal exons (the splice acceptor site, i.e. the last two bases of each intron) in all the genes and count the frequency of every possible 2-mer at those locations. (AG is expected to have the highest frequency.)
