Question: Assignment 1 - BIOTECH 4813 Bioinformatics

A kmer is a stretch of DNA sequence data of length k. Any sequencing read of length n can be broken up into a set of n-k+1 overlapping kmers. Kmer analysis is used in bioinformatics as a tool to understand the composition of a sequencing library and to identify reads containing errors.

For this assignment you will create a kmer counting program in Python. The program will take the name of the sequencing file to parse, the kmer size, and the sequencing file format as command-line arguments:

    count_kmers.py sequences.fastq 17 fastq

The program will output a tab-delimited file in which each line contains a kmer sequence and the number of times that kmer appears in the file of sequencing reads. A sample file will be provided to you.

This assignment will be marked on:
1. Correctness of function
2. Clearly written, formatted and documented code

What to hand in: a single Python program named count_kmers.py that meets the requirements of the assignment.
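One way the program described above might be structured is sketched below. It is a minimal sketch under several assumptions the assignment does not state: FASTQ records are taken to be exactly four lines each, FASTA is the only other supported format, sequences are upper-cased before counting, and the tab-delimited result is written to standard output so it can be redirected to a file. The function names `read_sequences`, `count_kmers` and `main` are illustrative choices, not part of the assignment.

```python
import sys
from collections import Counter


def read_sequences(lines, file_format):
    """Yield one sequence string per record from an iterable of text lines."""
    if file_format == "fastq":
        # Assumption: strict 4-line records (header, sequence, '+', qualities).
        for line_number, line in enumerate(lines):
            if line_number % 4 == 1:
                yield line.strip().upper()
    elif file_format == "fasta":
        # FASTA sequences may span several lines, so collect until the next '>'.
        chunks = []
        for line in lines:
            line = line.strip()
            if line.startswith(">"):
                if chunks:
                    yield "".join(chunks).upper()
                chunks = []
            elif line:
                chunks.append(line)
        if chunks:
            yield "".join(chunks).upper()
    else:
        raise ValueError(f"unsupported format: {file_format}")


def count_kmers(sequences, k):
    """Count every overlapping kmer of length k across all sequences."""
    counts = Counter()
    for seq in sequences:
        # A read of length n contributes n - k + 1 kmers (none if n < k).
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts


def main(argv):
    """Parse command-line arguments and print 'kmer<TAB>count' lines."""
    if len(argv) != 4:
        print("usage: count_kmers.py <sequence_file> <k> <fasta|fastq>",
              file=sys.stderr)
        return
    path, k, file_format = argv[1], int(argv[2]), argv[3].lower()
    with open(path) as handle:
        counts = count_kmers(read_sequences(handle, file_format), k)
    for kmer, count in counts.items():
        print(f"{kmer}\t{count}")


if __name__ == "__main__":
    main(sys.argv)
```

With this layout, a single read `ACGTAC` and k = 3 contributes the four kmers `ACG`, `CGT`, `GTA` and `TAC`, which matches n-k+1 = 6-3+1. The sketch does not treat reverse complements as equivalent and does not skip kmers containing ambiguous bases such as `N`. Whether either is expected would need to be checked against the sample file.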
