DNA Profiling

Learner Objectives

At the conclusion of this programming assignment, participants should be able to:

Write a C++ program that accepts a CSV file representing a DNA database and a text file representing a DNA sequence.
Use a combination of loops, string manipulation, and file I/O to identify to whom the DNA sequence belongs.

The data in the above file would suggest that Harry has the sequence AGAT repeated 2 times consecutively somewhere in his DNA, the sequence AATG repeated 8 times, and TCTAG repeated 3 times. Ron, meanwhile, has those same three STRs repeated 4 times, 1 time, and 5 times, respectively. And Hermione has those same three STRs repeated 3, 2, and 5 times, respectively.

So given a sequence of DNA, how might you identify to whom it belongs? Well, imagine that you looked through the DNA sequence for the longest consecutive sequence of repeated AGAT and found that the longest sequence was 4 repeats long. If you then found that the longest sequence of AATG is 1 repeat long, and the longest sequence of TCTAG is 5 repeats long, that would provide pretty good evidence that the DNA was Ron's. Of course, it's also possible that once you take the counts for each of the STRs, they don't match anyone in your DNA database, in which case you have no match.

In practice, since analysts know on which chromosome and at which location in the DNA an STR will be found, they can localize their search to just a narrow section of DNA. But we'll ignore that detail for this problem.
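
The "longest consecutive sequence" idea above can be sketched as a small C++ function. This is only one possible approach, not the assignment's official solution: the name longestRun and its signature are hypothetical, and it uses a simple scan that restarts at every position rather than anything optimized.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the assignment text): returns the length,
// in repeats, of the longest run of back-to-back copies of `str` in `dna`.
int longestRun(const std::string& dna, const std::string& str) {
    if (str.empty() || dna.size() < str.size()) {
        return 0;  // nothing can match
    }
    int best = 0;
    // Try every starting position in the DNA sequence.
    for (std::size_t start = 0; start + str.size() <= dna.size(); ++start) {
        int run = 0;
        std::size_t pos = start;
        // Count copies of str that follow one another without a gap.
        while (pos + str.size() <= dna.size() &&
               dna.compare(pos, str.size(), str) == 0) {
            ++run;
            pos += str.size();
        }
        if (run > best) {
            best = run;
        }
    }
    return best;
}
```

Calling longestRun once per STR gives the list of counts to compare against each person in the database. The restart-at-every-position scan is easy to reason about; for very long sequences a single pass that skips ahead after each run would do less work.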

Implementation Requirements

[ ] The program should require two command-line arguments:
- [ ] first, the name of a CSV file containing the STR counts for a list of individuals
- [ ] second, the name of a text file containing the DNA sequence to identify
[ ] Your program should open the CSV file and read its contents. For example, below are the contents of database/small.csv:
name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
[ ] The first row of the CSV file will be the column names. The first column will be the word name and the remaining columns will be the STR sequences themselves. You will read these STR sequences and store them in a vector.
vector<string> strSequence;
[ ] Then read the rest of the contents into a vector of struct Data
struct Data {
    string name;              // person's name
    vector<int> strCounters;  // count for each STR
};
[ ] Your program should open the DNA sequence and read its contents into a string.
[ ] For each of the STRs (from the first line of the CSV file), your program should compute the longest run of consecutive repeats of the STR in the DNA sequence to identify.
[ ] If the STR counts match exactly with any of the individuals in the CSV file, your program should print out the name of the matching individual.
- [ ] You may assume that the STR counts will not match more than one individual.
- [ ] If the STR counts do not match exactly with any of the individuals in the CSV file, your program should print "No match".
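
The database-reading and matching requirements above might be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: the helper names splitCsvLine, readDatabase, and findMatch are hypothetical, the CSV is assumed to have no quoted fields, and every count is assumed to be a valid integer (std::stoi throws otherwise). The functions take a std::istream so they work with either an opened file or an in-memory string.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Data {
    std::string name;              // person's name
    std::vector<int> strCounters;  // count for each STR
};

// Hypothetical helper: splits one CSV line on commas (no quoting support).
std::vector<std::string> splitCsvLine(std::string line) {
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();  // tolerate Windows line endings
    }
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Hypothetical helper: reads the header row into strSequence and each
// following row into people. Returns false if the header is missing.
bool readDatabase(std::istream& in, std::vector<std::string>& strSequence,
                  std::vector<Data>& people) {
    std::string line;
    if (!std::getline(in, line)) {
        return false;
    }
    std::vector<std::string> header = splitCsvLine(line);
    if (header.size() < 2) {
        return false;  // need "name" plus at least one STR column
    }
    strSequence.assign(header.begin() + 1, header.end());
    while (std::getline(in, line)) {
        std::vector<std::string> fields = splitCsvLine(line);
        if (fields.size() != header.size()) {
            continue;  // skip blank or malformed rows
        }
        Data person;
        person.name = fields[0];
        for (std::size_t i = 1; i < fields.size(); ++i) {
            person.strCounters.push_back(std::stoi(fields[i]));
        }
        people.push_back(person);
    }
    return true;
}

// Hypothetical helper: returns the name whose counts equal `counts`
// exactly, or "No match" if nobody does.
std::string findMatch(const std::vector<Data>& people,
                      const std::vector<int>& counts) {
    for (const Data& person : people) {
        if (person.strCounters == counts) {
            return person.name;
        }
    }
    return "No match";
}
```

Comparing the two vectors with == checks every STR count in column order, which is what "match exactly" requires, provided the counts passed in follow the same order as strSequence.
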

Additional Requirements

[ ] The executable program should be called profile
[ ] Practice modular programming by breaking down your program into functions. If you want to use classes, you may do so, but try to analyze how this should be represented as a class.
[ ] Use the three-file structure
[ ] Add a file header comment for each file
[ ] Add a function header comment for each function
[ ] Add in-line comments in your code
[ ] Commit your code frequently

Usage

Your program should behave per the example below:

$ ./profile database/small.csv sequences/01.txt
Bob
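
Since the program takes exactly two file names, argument checking and file opening could be sketched like this. The function name openInputs and the wording of the error messages are assumptions made for illustration; the assignment does not specify them.

```cpp
#include <fstream>
#include <iostream>

// Hypothetical helper: validates the argument count and opens both inputs.
// Returns true only if both files were opened successfully.
bool openInputs(int argc, char* argv[], std::ifstream& database,
                std::ifstream& sequence) {
    if (argc != 3) {
        // Error text is an assumption; the assignment gives no exact wording.
        std::cerr << "Usage: " << (argc > 0 ? argv[0] : "profile")
                  << " <database.csv> <sequence.txt>\n";
        return false;
    }
    database.open(argv[1]);
    if (!database.is_open()) {
        std::cerr << "Could not open database file: " << argv[1] << "\n";
        return false;
    }
    sequence.open(argv[2]);
    if (!sequence.is_open()) {
        std::cerr << "Could not open sequence file: " << argv[2] << "\n";
        return false;
    }
    return true;
}
```

In main, a false return would typically lead to exiting with a nonzero status before any parsing is attempted.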
