Question: This question is based on a real life problem and a common situation in data analysis. You have a folder (directory) of fasta (DNA sequence)

This question is based on a real life problem and a common situation in data analysis. You have a folder (directory) of fasta (DNA sequence) data files here. (Note that the folder is a compressed zip file than you need to uncompress). You also have a master spreadsheet (header_changes.csv) here. Each fasta file has a header line, but these need to be changed. Each row in header_changes.csv pertains to a file and the first item in that row is the file name. Then a comma, then what should be the current header (you might want to check to see if that is correct, and even deliberately insert one with the wrong header to make sure it can detect that!). The third row is the new header name. So your job is to replace the headers in all files correctly. You might want to make sure that instead of writing over the old files, you save the new files in a new directory. You can solve this in python, bash, or any combination of the 2. You must include some documentation so that anyone else can run your code easily and understand what is going on..

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!