Question: Python: 6. Sanity-check the data
Python:
6. Sanity-check the data

For each of the 11 .fastq files, compare the following three quantities:

- the sum of the A count, the C count, the G count, and the T count
- the total_count variable
- the length of the seq variable (you can compute this with len(seq))

In other words, compute the three numbers for test-small.fastq and determine whether they are equal or different. Then do the same for test-high-gc-1.fastq, etc.

For at least one file, at least two of these metrics will differ. In your answers.txt file, state which file(s) and which metrics. (If all the metrics are equal for each file, then your code contains a mistake.) In your answers.txt file, write a short paragraph that explains why.

Explaining why (or debugging your code if all the metrics were the same) might require you to do some detective work. For instance, to understand the issue, you may need to load a file into a text editor and examine it. We strongly suggest that you start with the smallest file for which the numbers are not all the same. Perusal of the file may help you. Failing that, you can manually compute each of the counts, and then compare your manual results to what your program computes to determine where the error lies. A final approach would be to modify your program, or create a new program, to compute the three metrics for each line of a file separately: if the metrics differ for an entire file, then they must differ for some specific line, and then examining that line will help you understand the problem.

If all of the three quantities that you measured are the same, then it would not matter which one you used in the denominator when computing the GC content. In fact, you saw that the numbers are not the same. In file answers.txt, state which of these quantities can be used in the denominator and which cannot, and why.

If your program incorrectly computed the GC content (which should be equal to (G+C)/(A+C+G+T)), then state that fact in your answers.txt file. Then, go back and correct it, and also update any incorrect answers elsewhere in your answers.txt file.
I don't know how to solve this one. Please help.
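One possible way to approach the comparison is sketched below. It is not the assignment's starter code: the function names (read_sequence, three_metrics, unexpected_characters, mismatched_lines, metrics_for_file) are hypothetical, and it assumes that seq is built from every fourth line of the file starting at the second, and that total_count is incremented once for every character of seq. Under those assumptions, total_count and len(seq) always agree, and the A+C+G+T sum is smaller whenever seq contains a character other than A, C, G, or T. In sequencing data a common example is N, the code for an undetermined base, but whether that is what appears in your files is something to confirm by looking at them.

```python
from collections import Counter


def read_sequence(lines):
    """Join the sequence lines of FASTQ text (assumed: lines 2, 6, 10, ...)."""
    return "".join(line.strip() for i, line in enumerate(lines) if i % 4 == 1)


def three_metrics(seq):
    """Return (A+C+G+T sum, total_count, len(seq)) for one sequence string."""
    counts = Counter(seq)  # a missing key counts as 0
    acgt_sum = counts["A"] + counts["C"] + counts["G"] + counts["T"]
    total_count = 0
    for _ in seq:  # assumption: total_count goes up for every character
        total_count += 1
    return acgt_sum, total_count, len(seq)


def unexpected_characters(seq):
    """Count every character in seq that is not A, C, G, or T."""
    return {ch: n for ch, n in Counter(seq).items() if ch not in "ACGT"}


def mismatched_lines(lines):
    """List (line number, text) for sequence lines whose A+C+G+T sum != length."""
    bad = []
    for i, line in enumerate(lines):
        if i % 4 != 1:
            continue  # skip header, separator, and quality lines
        read = line.strip()
        acgt = sum(read.count(base) for base in "ACGT")
        if acgt != len(read):
            bad.append((i + 1, read))  # 1-based, as shown in a text editor
    return bad


def metrics_for_file(filename):
    """Compute the three metrics for one .fastq file on disk."""
    with open(filename) as f:
        return three_metrics(read_sequence(f.readlines()))


# Made-up two-read sample (not taken from the assignment's files).
SAMPLE = ["@read1", "ACGTN", "+", "IIIII", "@read2", "GGCC", "+", "IIII"]

seq = read_sequence(SAMPLE)
print(three_metrics(seq))
print(unexpected_characters(seq))
print(mismatched_lines(SAMPLE))
```

To use it on the real data, call metrics_for_file("test-small.fastq") and so on for each of the 11 files, then run mismatched_lines on the smallest file where the numbers differ. If the extra characters do turn out to be something like N, that suggests why the GC content formula (G+C)/(A+C+G+T) uses only the four-base sum in the denominator: characters that are not known bases should not count toward the total.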
