Question:

c++

For this assignment you will receive as input two text files, rebase210.txt and sequences.txt. After the header, each line of the database file rebase210.txt contains the name of a restriction enzyme and the possible DNA sites the enzyme may cut (the cut location is indicated by a ') in the following format:

enzyme_acronym/recognition_sequence/.../recognition_sequence//

For instance, the first few lines of rebase210.txt are:

AanI/TTA'TAA//
AarI/CACCTGCNNNN'NNNN/'NNNNNNNNGCAGGTG//
AasI/GACNNNN'NNGTC//
AatII/GACGT'C//
AbsI/CC'TCGAGG//
AccI/GT'MKAC//
AccII/CG'CG//
AccIII/T'CCGGA//
Acc16I/TGC'GCA//
Acc36I/ACCTGCNNNN'NNNN/'NNNNNNNNGCAGGT//
Acc65I/G'GTACC//

PsiI/TTA'TAA//

That means that each line contains one enzyme acronym associated with one or more recognition sequences. For example, on line 2, the enzyme acronym AarI corresponds to the two recognition sequences CACCTGCNNNN'NNNN and 'NNNNNNNNGCAGGTG.
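The line format described above can be split on the '/' character. The following is only a minimal sketch of one way to do that: ParsedLine and ParseDatabaseLine are hypothetical names that do not appear in the assignment's starter code, and the sketch assumes the header lines of rebase210.txt have already been skipped.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type (not part of the starter code): the pieces of
// one database line.
struct ParsedLine {
  std::string acronym;
  std::vector<std::string> sequences;
};

// Splits a line such as "AarI/CACCTGCNNNN'NNNN/'NNNNNNNNGCAGGTG//" into the
// enzyme acronym and its recognition sequences. A line without any '/'
// yields an empty result.
ParsedLine ParseDatabaseLine(const std::string &line) {
  ParsedLine result;
  const std::size_t first_slash = line.find('/');
  if (first_slash == std::string::npos) return result;
  result.acronym = line.substr(0, first_slash);

  std::size_t start = first_slash + 1;
  while (start < line.size()) {
    const std::size_t end = line.find('/', start);
    // An empty field means we reached the terminating "//".
    if (end == std::string::npos || end == start) break;
    result.sequences.push_back(line.substr(start, end - start));
    start = end + 1;
  }
  return result;
}
```

Each element of sequences would then be paired with acronym to build one SequenceMap per recognition sequence.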

Question:

You will create a parser to read in this database and construct an AVL tree. For each line of the database and for each recognition sequence in that line, you will create a new SequenceMap object that contains the recognition sequence as its recognition_sequence_ and the enzyme acronym as the only string of its enzyme_acronyms_, and you will insert this object into the tree. This is explained with the following pseudo code:

Tree<SequenceMap> a_tree;
string db_line;
// Read the file line-by-line:
while (GetNextLineFromDatabaseFile(db_line)) {
  // Get the first part of the line:
  string an_enz_acro = GetEnzymeAcronym(db_line);
  string a_reco_seq;
  while (GetNextRecognitionSequence(db_line, a_reco_seq)) {
    SequenceMap new_sequence_map(a_reco_seq, an_enz_acro);
    a_tree.insert(new_sequence_map);
  } // End second while.
} // End first while.

If new_sequence_map.recognition_sequence_ equals the recognition_sequence_ of a node X already in the tree, then the search tree's insert() function will call the X.Merge(new_sequence_map) function of the existing element. This has the effect of updating the enzyme_acronyms_ of X. Note that this is part of the functionality of the insert() function. Merge() is called only for duplicates as described above. Otherwise, no Merge() is required and new_sequence_map is inserted into the tree as a new node.
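The merge-on-duplicate behavior can be illustrated with a plain, unbalanced binary search tree. This is only a sketch under stated assumptions: MiniSequenceMap, Node, Insert and Find are hypothetical stand-ins, not the contents of avl_tree.h, and the AVL rebalancing that a real insert() performs is deliberately left out.

```cpp
#include <memory>
#include <string>
#include <vector>

// Reduced stand-in for SequenceMap, just enough for this sketch.
struct MiniSequenceMap {
  std::string recognition_sequence;
  std::vector<std::string> enzyme_acronyms;

  bool operator<(const MiniSequenceMap &rhs) const {
    return recognition_sequence < rhs.recognition_sequence;
  }
  // Appends the other object's acronyms to this one.
  void Merge(const MiniSequenceMap &other) {
    enzyme_acronyms.insert(enzyme_acronyms.end(),
                           other.enzyme_acronyms.begin(),
                           other.enzyme_acronyms.end());
  }
};

struct Node {
  MiniSequenceMap element;
  std::unique_ptr<Node> left, right;
  explicit Node(const MiniSequenceMap &x) : element(x) {}
};

// Unbalanced insert. A real AVL insert would rebalance the subtree here.
void Insert(const MiniSequenceMap &x, std::unique_ptr<Node> &t) {
  if (!t) {
    t.reset(new Node(x));      // key not present: create a new node
  } else if (x < t->element) {
    Insert(x, t->left);
  } else if (t->element < x) {
    Insert(x, t->right);
  } else {
    t->element.Merge(x);       // duplicate key: merge instead of inserting
  }
}

// Returns the stored element for a recognition sequence, or nullptr.
const MiniSequenceMap *Find(const std::string &seq,
                            const std::unique_ptr<Node> &root) {
  const Node *cur = root.get();
  while (cur != nullptr) {
    if (seq < cur->element.recognition_sequence) {
      cur = cur->left.get();
    } else if (cur->element.recognition_sequence < seq) {
      cur = cur->right.get();
    } else {
      return &cur->element;
    }
  }
  return nullptr;
}
```

Inserting TTA'TAA once with AanI and once with PsiI leaves a single node whose acronym list holds both names, which is what the query example for TTA'TAA relies on.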

To implement the above, write a test program named query_tree which will use your parser to create a search tree and then allow the user to query it using a recognition sequence. If that sequence exists in the tree, then this routine should print all the enzyme acronyms that correspond to that recognition sequence.

Your programs should run from the terminal as follows:

query_tree <database file name>

For example, you can write on the terminal:

./query_tree rebase210.txt

The user should enter THREE strings (supposed to be recognition sequences) for instance:

CC'TCGAGG

TTA'TAA

TC'C

Your program should print in the standard output their associated enzyme acronyms. In the above example the output will be

AbsI

AanI PsiI

Not Found

I will test it with a file containing three strings and run your code like this:

./query_tree rebase210.txt < input_part2a.txt

Please make sure the program produces the expected output.

Here is sequence_map.h:

#ifndef SEQUENCEMAP_H
#define SEQUENCEMAP_H

#include <iostream>
#include <string>
#include <vector>
using namespace std;

class SequenceMap {
 public:
  /*
  // Zero-parameter constructor.
  SequenceMap() = default;
  */
  // Copy-constructor.
  SequenceMap(const SequenceMap &rhs) = default;
  // Copy-assignment.
  SequenceMap& operator=(const SequenceMap &rhs) = default;
  // Move-constructor.
  SequenceMap(SequenceMap &&rhs) = default;
  // Move-assignment.
  SequenceMap& operator=(SequenceMap &&rhs) = default;
  // Destructor.
  ~SequenceMap() = default;

  // Start of Part 1

  // Constructor for recognition sequence and enzyme acronym
  SequenceMap(const string &a_rec_seq, const string &an_enz_acro) {
    recognition_sequence_ = a_rec_seq;
    enzyme_acronyms_.push_back(an_enz_acro);
  }

  /*
  // Constructor for recognition sequence only
  SequenceMap(const string &a_rec_seq) {
    recognition_sequence_ = a_rec_seq;
    enzyme_acronyms_.push_back("");
  }
  */

  // Overload the < operator
  bool operator<(const SequenceMap &rhs) const {
    return (recognition_sequence_ < rhs.recognition_sequence_);
  }

  // Overload the << operator to print the recognition sequence with enzyme acronyms
  friend std::ostream &operator<<(std::ostream &out, const SequenceMap &a_SequenceMap) {
    out << a_SequenceMap.recognition_sequence_ << " ";
    for (size_t i = 0; i < a_SequenceMap.enzyme_acronyms_.size(); ++i) {
      out << a_SequenceMap.enzyme_acronyms_[i] << " ";
    }
    return out;
  }

  // Merge two SequenceMap objects
  void Merge(const SequenceMap &other_sequence) {
    for (size_t i = 0; i < other_sequence.enzyme_acronyms_.size(); ++i) {
      enzyme_acronyms_.push_back(other_sequence.enzyme_acronyms_[i]);
    }
  }

  /*
  // Print the recognition sequence
  string getRecognitionSequence() const { return recognition_sequence_; }

  // Print enzyme acronym
  void printAllEnzAcroOfRecSeq() const {
    for (int i = 0; i < enzyme_acronyms_.size(); ++i) {
      cout << enzyme_acronyms_[i] << " ";
    }
    cout << endl;
  }
  */

 private:
  string recognition_sequence_;
  vector<string> enzyme_acronyms_;
};

#endif // end of SequenceMap

//test program code started

// Main file for Part2(a) of Homework 2.

#include "avl_tree.h" // just need to assume this. info is below
#include "sequence_map.h"

#include <iostream>
#include <string>
using namespace std;

namespace {

// @db_filename: an input filename.
// @a_tree: an input tree of the type TreeType. It is assumed to be
// empty.
template <typename TreeType>
void QueryTree(const string &db_filename, TreeType &a_tree) {
  // Code for running Part2(a)
  // You can use public functions of TreeType. For example:
  // already provided in avl_tree.h
  a_tree.insert(10);
  a_tree.printTree();
}

} // namespace

int main(int argc, char **argv) {
  if (argc != 2) {
    cout << "Usage: " << argv[0] << " <databasefilename>" << endl;
    return 0;
  }
  const string db_filename(argv[1]);
  cout << "Input filename is " << db_filename << endl;
  // Note that you will replace AvlTree<int> with AvlTree<SequenceMap>
  AvlTree<int> a_tree;
  QueryTree(db_filename, a_tree);
  return 0;
}

Please fill out the query_tree.cc program to parse the file and insert it into the tree.

Thank you.
