Question (C++)
Your assignment is to build an application that can read a file of nucleotides into a linked list.
1. The file contains a sequence of nucleotides (G, C, A, T).
2. The number of nucleotides in the file will always be a multiple of 3 (because they are read as trinucleotides).
3. The file needs to be imported into a user made linked list.
4. The file is loaded via command line argument (included in the provided makefile and driver.cpp).
5. No static arrays or vectors are allowed in this project although dynamically allocated vectors are allowed.
6. All user inputs will be assumed to be the correct data type. For example, if you ask the user for an integer, they will provide an integer.
7. Regardless of the sample output below, all user input must be validated. If you ask for a number between 1 and 5 with the user entering an 8, the user should be re-prompted.
8. Have a main menu that asks the user what they want to do:
a. What would you like to do?:
i. Display DNA (Leading Strand)
ii. Display DNA (Base Pairs)
iii. Inventory Basic Amino Acids
iv. Sequence Entire DNA Strand
v. Exit
1. Upon exit, nothing is saved
DNA.h
#ifndef DNA_H
#define DNA_H
#include <iostream>
#include <string>
using namespace std;
//Constant number of nucleotides in trinucleotide
const int TRINUCLEOTIDE_SIZE = 3;
struct Nucleotide {
char m_payload;
Nucleotide *m_next;
};
class DNA {
public:
//name: DNA (default constructor)
//pre: None
//post: A linked list (DNA) where m_head and m_tail point to NULL
DNA();
//name: ~DNA (destructor)
//pre: There is an existing linked list
//post: A linked list (DNA) is deallocated (including all dynamically
// allocated nucleotides)
// to have no memory leaks!
~DNA();
//name: InsertEnd
//pre: Takes in a char. Creates new node (nucleotide).
// Requires a linked list (strand of DNA)
//post: Adds a new node (nucleotide) to the end of the linked list (strand of DNA).
void InsertEnd(char payload);
//name: Display
//pre: Outputs the DNA strand(s); pass it 1 for just the nucleotides,
// 2 for the nucleotides and their complements (base pairs)
//post: None
void Display(int numStrand);
//name: NumAmino
//pre: Takes in an amino acid name and its trinucleotide codon
// Trinucleotides are just three nucleotides in a row.
//post: Searches the linked list for specific sequence; outputs results
void NumAmino(string name, string trinucleotide);
//name: Sequence
//pre: Takes in full genetic code of one polynucleotide and looks at
// one trinucleotide at a time.
// Known amino acids are displayed, others are unknown. Stored in dynamic array.
//post: Displays either name of amino acid or unknown for each trinucleotide
void Sequence();
//name: Translate (Provided)
//pre: Takes in three nucleotides (must be G,C,T, or A)
//post: Translates a trinucleotide to its amino acid
string Translate(string);
//name: IsEmpty
//pre: Takes in a linked list (DNA)
//post: Checks to see if the linked list (strand of DNA) is empty or not
bool IsEmpty();
//name: SizeOf
//pre: Takes in a linked list (DNA)
//post: Populates m_size with the total number of nucleotides loaded
void SizeOf();
//name: Clear
//pre: Takes in a linked list (DNA)
//post: Clears out the linked list (all nodes too)
void Clear();
private:
Nucleotide *m_head;
Nucleotide *m_tail;
int m_size;
};
#endif
Sequencer.h
#ifndef SEQUENCER_H
#define SEQUENCER_H
#include "DNA.h"
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
class Sequencer {
public:
//name: Sequencer (constructor)
//pre: A linked list (DNA) is available
//post: A linked list (DNA) where m_head and m_tail point to NULL
// m_size is also populated with SizeOf
Sequencer(string fileName);
//name: ~Sequencer (destructor)
//pre: There is an existing linked list (DNA)
//post: A linked list (DNA) is deallocated (including nucleotides) to have no memory leaks!
~Sequencer();
//name: readFile
//pre: Valid file name of characters (Either A, T, G, or C)
//post: Populates the LinkedList (DNA)
void readFile();
//name: mainMenu
//pre: Populated LinkedList (DNA)
//post: None
void mainMenu();
private:
DNA m_dna;
string m_fileName;
};
#endif
driver.cpp
#include "Sequencer.h"
#include <iostream>
using namespace std;
//Uses command line arguments to pass the name of the data file to the sequencer
int main (int argc, char* argv[]) {
if (argc < 2)
{
cout << "You are missing a data file." << endl;
cout << "File 1 should be a file of half of the dna base pairs. " << endl;
}
else
{
Sequencer D = Sequencer(argv[1]);
}
return 0;
}
Translate_to_DNA.cpp
//This code belongs in DNA.cpp
string DNA::Translate(const string trinucleotide){
if((trinucleotide=="ATT")||(trinucleotide=="ATC")||(trinucleotide=="ATA"))
return ("Isoleucine");
else if((trinucleotide=="CTT")||(trinucleotide=="CTC")||(trinucleotide=="CTA")||
(trinucleotide=="CTG")|| (trinucleotide=="TTA")||(trinucleotide=="TTG"))
return ("Leucine");
else if((trinucleotide=="GTT")||(trinucleotide=="GTC")||
(trinucleotide=="GTA")||(trinucleotide=="GTG"))
return ("Valine");
else if((trinucleotide=="TTT")||(trinucleotide=="TTC"))
return ("Phenylalanine");
else if((trinucleotide=="ATG"))
return ("Methionine");
else if((trinucleotide=="TGT")||(trinucleotide=="TGC"))
return ("Cysteine");
else if((trinucleotide=="GCT")||(trinucleotide=="GCC")||
(trinucleotide=="GCA")||(trinucleotide=="GCG"))
return ("Alanine");
else if((trinucleotide=="GGT")||(trinucleotide=="GGC")||
(trinucleotide=="GGA")||(trinucleotide=="GGG"))
return ("Glycine");
else if((trinucleotide=="CCT")||(trinucleotide=="CCC")||
(trinucleotide=="CCA")||(trinucleotide=="CCG"))
return ("Proline");
else if((trinucleotide=="ACT")||(trinucleotide=="ACC")||
(trinucleotide=="ACA")||(trinucleotide=="ACG"))
return ("Threonine");
else if((trinucleotide=="TCT")||(trinucleotide=="TCC")||
(trinucleotide=="TCA")||(trinucleotide=="TCG")||
(trinucleotide=="AGT")||(trinucleotide=="AGC"))
return ("Serine");
else if((trinucleotide=="TAT")||(trinucleotide=="TAC"))
return ("Tyrosine");
else if((trinucleotide=="TGG"))
return ("Tryptophan");
else if((trinucleotide=="CAA")||(trinucleotide=="CAG"))
return ("Glutamine");
else if((trinucleotide=="AAT")||(trinucleotide=="AAC"))
return ("Asparagine");
else if((trinucleotide=="CAT")||(trinucleotide=="CAC"))
return ("Histidine");
else if((trinucleotide=="GAA")||(trinucleotide=="GAG"))
return ("Glutamic acid");
else if((trinucleotide=="GAT")||(trinucleotide=="GAC"))
return ("Aspartic acid");
else if((trinucleotide=="AAA")||(trinucleotide=="AAG"))
return ("Lysine");
else if((trinucleotide=="CGT")||(trinucleotide=="CGC")||(trinucleotide=="CGA")||
(trinucleotide=="CGG")||(trinucleotide=="AGA")||(trinucleotide=="AGG"))
return ("Arginine");
else if((trinucleotide=="TAA")||(trinucleotide=="TAG")||(trinucleotide=="TGA"))
return ("Stop");
else {
cout << "returning unknown" << endl;
return ("Unknown");
}
}
The Translate_to_DNA.cpp is a single function that belongs in DNA.cpp.
The project must be completed in C++. You may not use any libraries or data structures that we have not learned in class. Libraries we have learned include
You must use the function prototypes as outlined in the DNA.h and Sequencer.h header file. Do not edit the header files.
You need to write the functions for the class (DNA.cpp) based on the header file (DNA.h). The nucleotides (i.e. nodes) for the linked list that you are implementing are structs that hold two pieces of information: a char and a pointer to the next node. Do not use the STL for this project.
DNA() The constructor creates a new empty linked list. m_head and m_tail are set to NULL and m_size is set to zero.
~DNA() The destructor de-allocates any dynamically allocated memory. (May call clear)
Clear() Clears the linked list.
InsertEnd() Always inserts new nucleotides at the end of the linked list.
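The constructor, destructor, Clear and InsertEnd descriptions above amount to a singly linked list with a tail pointer. The following is a minimal sketch under that reading, not the required DNA.cpp. The class name `StrandSketch` and the method `CountNodes` are hypothetical stand-ins; the sketch uses `nullptr` and `= delete` (C++11) where the header comments say NULL.

```cpp
#include <cstddef>

// Same layout as the Nucleotide struct in DNA.h.
struct Nucleotide {
    char m_payload;
    Nucleotide* m_next;
};

// Hypothetical stand-in for the DNA class, showing only list management.
class StrandSketch {
public:
    StrandSketch() : m_head(nullptr), m_tail(nullptr) {}
    ~StrandSketch() { Clear(); }  // the destructor may simply reuse Clear

    // Copying is disabled so two objects never try to delete the same nodes.
    StrandSketch(const StrandSketch&) = delete;
    StrandSketch& operator=(const StrandSketch&) = delete;

    // Appends a node at the end; the tail pointer avoids walking the list.
    void InsertEnd(char payload) {
        Nucleotide* node = new Nucleotide;
        node->m_payload = payload;
        node->m_next = nullptr;
        if (m_head == nullptr) {
            m_head = node;           // first node: head and tail are the same
        } else {
            m_tail->m_next = node;   // link after the current last node
        }
        m_tail = node;
    }

    // Deletes every node and returns the list to its empty state.
    void Clear() {
        while (m_head != nullptr) {
            Nucleotide* next = m_head->m_next;
            delete m_head;
            m_head = next;
        }
        m_tail = nullptr;
    }

    bool IsEmpty() const { return m_head == nullptr; }

    // Counts nodes by traversal, in the spirit of SizeOf.
    int CountNodes() const {
        int count = 0;
        for (const Nucleotide* curr = m_head; curr != nullptr; curr = curr->m_next) {
            ++count;
        }
        return count;
    }

private:
    Nucleotide* m_head;
    Nucleotide* m_tail;
};
```

Saving the `m_next` pointer before each `delete` in Clear is the detail that prevents reading freed memory while walking the list.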
Display() Takes in a variable to know how many strands you want to display. 1 shows just the nucleotides that were loaded. 2 shows the nucleotides and their complements: (G-C), (C-G), (T-A), or (A-T).
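The pairing rule behind the two-strand display (A pairs with T, G pairs with C) can be sketched as a small lookup. The helper names `complementOf` and `basePair` are hypothetical and are not declared in DNA.h; returning `'?'` for an unexpected character is an assumption of this sketch, since the assignment only promises G, C, A and T.

```cpp
#include <string>

// Hypothetical helper: returns the complementary base for A, T, G or C.
// Any other character yields '?' (an assumption of this sketch).
char complementOf(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'G': return 'C';
        case 'C': return 'G';
        default:  return '?';
    }
}

// Hypothetical helper: formats one base pair the way the sample output
// shows it, for example "A-T".
std::string basePair(char base) {
    std::string pair;
    pair += base;
    pair += '-';
    pair += complementOf(base);
    return pair;
}
```

A Display implementation could walk the list once and print either `m_payload` alone or `basePair(m_payload)`, depending on the argument it was given.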
IsEmpty() Returns if the linked list is empty.
SizeOf() Populates m_size with how many nucleotides were loaded.
NumAmino() Takes in the amino acid name and its trinucleotide codon, then iterates over the structure and counts the instances of that codon in just the provided strand. For example, it could take Tryptophan and TGG, or Phenylalanine and TTT. Codons are read in consecutive, non-overlapping groups of three: the sequence T-T-T-T-G-G contains exactly 2 codons, (TTT) and (TGG), and a sequence 15,000 nucleotides long contains exactly 5,000 trinucleotide codons. We never count overlapping codons. Run NumAmino on at least Tryptophan (TGG) and Phenylalanine (TTT).
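The non-overlapping rule can be sketched as a walk that always consumes three nodes at a time. This is an illustration only: `countCodon`, `buildList` and `freeList` are hypothetical free functions written so the example stands alone, whereas the assignment expects the counting to live inside DNA::NumAmino.

```cpp
#include <string>

struct Nucleotide {
    char m_payload;
    Nucleotide* m_next;
};

const int TRINUCLEOTIDE_SIZE = 3;

// Hypothetical helper: builds a list from a string so the example is runnable.
Nucleotide* buildList(const std::string& bases) {
    Nucleotide* head = nullptr;
    Nucleotide* tail = nullptr;
    for (char base : bases) {
        Nucleotide* node = new Nucleotide{base, nullptr};
        if (head == nullptr) {
            head = node;
        } else {
            tail->m_next = node;
        }
        tail = node;
    }
    return head;
}

// Hypothetical helper: frees every node of a list made by buildList.
void freeList(Nucleotide* head) {
    while (head != nullptr) {
        Nucleotide* next = head->m_next;
        delete head;
        head = next;
    }
}

// Counts codons equal to target, reading the list in consecutive groups of
// three starting at the head, so no nucleotide belongs to two codons.
int countCodon(const Nucleotide* head, const std::string& target) {
    int count = 0;
    const Nucleotide* curr = head;
    while (curr != nullptr) {
        std::string codon;
        for (int i = 0; i < TRINUCLEOTIDE_SIZE && curr != nullptr; ++i) {
            codon += curr->m_payload;
            curr = curr->m_next;
        }
        if (codon == target) {
            ++count;
        }
    }
    return count;
}
```

For the strand T-T-T-T-G-G this reads (TTT) then (TGG), so `countCodon(list, "TTT")` is 1. A sliding window would instead find TTT at two positions, which is exactly the overlap the assignment rules out.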
Sequence() Iterates over entire structure and converts trinucleotides to amino acids for all nucleotides in the file. Stores the amino acid name in a dynamic array. Displays amino acid list.
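One way to read "stores the amino acid name in a dynamic array" is to size the array from the nucleotide count and fill one slot per codon. The sketch below assumes that reading. `sequenceStrand` is a hypothetical free function, and `translateStub` is a deliberately tiny stand-in for the provided DNA::Translate that knows only three codons.

```cpp
#include <string>

struct Nucleotide {
    char m_payload;
    Nucleotide* m_next;
};

const int TRINUCLEOTIDE_SIZE = 3;

// Stand-in for the provided DNA::Translate; it knows only three codons.
std::string translateStub(const std::string& codon) {
    if (codon == "AAG") return "Lysine";
    if (codon == "TGG") return "Tryptophan";
    if (codon == "CTA") return "Leucine";
    return "Unknown";
}

// Hypothetical helper: returns a new[]-allocated array holding one amino
// acid name per complete codon and reports its length through numCodons.
// The caller owns the array and must release it with delete[].
std::string* sequenceStrand(const Nucleotide* head, int numNucleotides,
                            int& numCodons) {
    numCodons = numNucleotides / TRINUCLEOTIDE_SIZE;
    std::string* names = new std::string[numCodons];
    const Nucleotide* curr = head;
    for (int i = 0; i < numCodons; ++i) {
        std::string codon;
        for (int j = 0; j < TRINUCLEOTIDE_SIZE && curr != nullptr; ++j) {
            codon += curr->m_payload;
            curr = curr->m_next;
        }
        names[i] = translateStub(codon);
    }
    return names;
}
```

After printing the names, the array has to be released with `delete[] names;` (array form), otherwise the no-memory-leaks requirement is not met.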
Translate() Converts a trinucleotide string to an amino acid name. It is available for download in my folder above and is named: Translate_to_DNA.cpp.
You need to code up the various functions that are called in the Sequencer.cpp file that are prototyped in Sequencer.h.
Sequencer() The constructor builds the DNA (linked list), reads the file, and calls mainMenu.
~Sequencer() The destructor de-allocates any dynamically allocated memory.
readFile() The readFile function loads a file of nucleotides into the DNA (linked list). The file name is passed to the Sequencer from the command line (in driver.cpp, which is provided) and stored in m_fileName. readFile also calls SizeOf to populate m_size.
mainMenu() Calls the various functions in the DNA (linked list).
Choices 1 and 2 - call the DNA function Display.
Choice 3 - calls the DNA function NumAmino.
Choice 4 - calls the DNA function Sequence.
Choice 5 - exits.
Sample Input and Output
-bash-4.1$ make run1
./proj3 proj3_9.csv
New Sequencer loaded
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
1
Base Pairs:
A A G T G G C T A END
9 nucleotides listed.
3 trinucleotides listed.
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
2
Base Pairs:
A-T A-T G-C T-A G-C G-C C-G T-A A-T END
9 base pairs listed.
3 trinucleotides listed.
What would you like to do?:
Here are the runs looking at Inventory Basic (3) and Sequence Entire DNA (4):
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
3
Tryptophan: 1 identified
Phenylalanine: 0 identified
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
4
Amino Acid List:
Lysine Tryptophan Leucine
Total Amino Acids Identified: 3
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
5
DNA removed from memory
-bash-4.1$
Finally, this is what validating a menu entry looks like:
-bash-4.1$ make run1
./proj3 proj3_9.csv
New Sequencer loaded
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
0
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
6
What would you like to do?:
1. Display Sequencer (Leading Strand)
2. Display Sequencer (Base Pairs)
3. Inventory Basic Amino Acids
4. Sequence Entire DNA Strand
5. Exit
5
DNA removed from memory
-bash-4.1$
