In C++ please.

Upon completion of this assignment, the student will be able to use string functions.

Background

(Figure: a strand of DNA)

DNA, or deoxyribonucleic acid, is the primary carrier of genetic information in most organisms. The information in DNA is represented using a string of nucleotides. There are four kinds of nucleotides: Adenine, Thymine, Cytosine, and Guanine. They are typically abbreviated to their initials, so a strand of DNA might be expressed as "ACTGGCATA...".

These sequences are recipes for building proteins - the sequence encodes a list of amino acids that must be built to form a particular protein. A sequence of three nucleotides forms a codon that specifies either an amino acid to be built or a special "stop" signal. For instance, the sequence ATT indicates Isoleucine while the sequence AGA indicates Arginine. The sequence TAA indicates stop.

The only kind of codon we will interpret in this assignment is the stop codon. There are three different sequences that all correspond to stop: TAA, TAG, or TGA.

Assignment Instructions

Use the BasicProject template folder.
Submit file: assign4.zip

I should be able to compile and run your program with:

    g++ -std=c++17 main.cpp -o program.exe
    .\program.exe        (./program.exe on a mac)

Write a program that takes a string representing the DNA for one or more proteins, breaks it up into the proteins, and prints them out. You will be starting with something like:

    CATCCACCAGAAGGCTAATCTCCTTAA

If the string's length is not divisible by 3, we know that it will not map evenly to codons. In that case, print out an error message and do not try to process the string. You can completely exit your program by calling exit like this:

    exit(1); //Exit with code of 1. Any non-zero exit code indicates there was a problem

That can be a useful way to make sure your program stops without executing any more code.
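The length check described above could be sketched roughly as follows. The helper names `hasValidLength` and `requireValidLength` are hypothetical, not part of the assignment; the error text is taken from sample run 1.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical helper: true when the sequence splits evenly into 3-letter codons.
bool hasValidLength(const std::string& dna) {
    return dna.size() % 3 == 0;
}

// Hypothetical helper: print the error message and stop the program
// if the sequence cannot be divided into codons.
void requireValidLength(const std::string& dna) {
    if (!hasValidLength(dna)) {
        std::cout << "Error: length not divisible by 3" << std::endl;
        std::exit(1);  // any non-zero exit code indicates there was a problem
    }
}
```

Calling `requireValidLength(input)` right after reading the input means the rest of the program can assume the length is a multiple of 3.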
If the input is a valid length, you may assume it will end with a stop codon, but it might encode multiple proteins separated by stops. Your task is to break the string up into its proteins.

Initially, just focus on the stop sequence TAA. In the string CATCCACCAGAAGGCTAATCTCCTTAA, there are two TAAs and thus two proteins. The first is CATCCACCAGAAGGC and the second is TCTCCT (note that the stop codon just marks the end and is not considered part of the protein).

In a real string of DNA, we would not want to identify ATAAGG as having a stop codon, as the TAA is broken up between two different codons: ATA and AGG. You do not have to worry about this case. You can assume the sequence you are working with will NOT have any sequences that look like "stop" but are actually broken up between two codons.

Proteins can be really long, so we don't want to always print them out completely. When you print out a protein, if it is more than 3 codons (9 characters) long, print its first codon (3 characters), then three dots, then its last codon, and finally its length in codons (how many groups of three characters there are). So CATCCACCAGAAGGC should be printed as:

    CAT...GGC (5)

Finally, for full credit, your program should handle the other possible stop codons. Look for TGA and TAG as possible stop markers in addition to TAA.

Tips:

Tackle one part at a time! As you start "chopping up" the string, don't worry about things like the other stop codons (TAG and TGA) or printing the condensed format for long proteins. Just focus on using TAA to chop up the string and print out the results. Once that is working you can worry about the extra features.

To make testing faster while you develop, hard code in a test string:

    string input = "CATCCACCAGAAGGCTAATCTCCTTAA";
    // cin >> input;  // no real input for now

Before turning in the program, remove the hard-coded string and read in the input. Then make sure to test a few different sample inputs.
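One possible way to chop the string into proteins is sketched below. It is not necessarily the approach the assignment intends: instead of searching with `find`, it steps through the string one codon at a time using `substr`, which also sidesteps the ATAAGG case because only codon-aligned positions are examined. The names `isStopCodon` and `splitProteins` are hypothetical, and the sketch assumes the length has already been checked to be a multiple of 3.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if the codon is one of the three stop codons.
bool isStopCodon(const std::string& codon) {
    return codon == "TAA" || codon == "TAG" || codon == "TGA";
}

// Hypothetical helper: walk the sequence codon by codon and cut at each stop.
// Assumes dna.size() is a multiple of 3. The stop codon itself is not
// included in the protein. Two stops in a row would yield an empty protein.
std::vector<std::string> splitProteins(const std::string& dna) {
    std::vector<std::string> proteins;
    std::size_t start = 0;  // index where the current protein begins
    for (std::size_t i = 0; i + 3 <= dna.size(); i += 3) {
        if (isStopCodon(dna.substr(i, 3))) {
            proteins.push_back(dna.substr(start, i - start));
            start = i + 3;  // next protein begins after the stop codon
        }
    }
    return proteins;
}
```

For the example string CATCCACCAGAAGGCTAATCTCCTTAA, this returns the two proteins named in the text, CATCCACCAGAAGGC and TCTCCT.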
Final version of the input code, with the hard-coded string commented out:

    // string input = "CATCCACCAGAAGGCTAATCTCCTTAA";
    string input;
    cin >> input;

Sample Runs

Sample run 1: (user input in red)
Example of invalid input.

    Enter DNA sequence: ACAGTTAA
    Error: length not divisible by 3

Sample run 2: (user input in red)
Example of one valid protein.

    Enter DNA sequence: CATCCATAA
    CATCCA

Sample run 3: (user input in red)
Example of two valid proteins.

    Enter DNA sequence: CATCCATAATCTCCTGCGTAA
    CATCCA
    TCTCCTGCG

Sample run 4: (user input in red)
Example of four valid proteins.

    Enter DNA sequence: CATCCATAATCTCCTGCGTAAGAATAAGCGCATTAA
    CATCCA
    TCTCCTGCG
    GAA
    GCGCAT

Sample run 5: (user input in red)
This example has one valid protein. It is longer than 3 codons (9 letters), so it is printed in the shortened format.

    Enter DNA sequence: CATCCACCAGAAGGCTAA
    CAT...GGC (5)

Sample run 6: (user input in red)
The first protein is long, so it is printed in the shortened format. The second is only 6 characters (2 codons), so it is printed normally.

    Enter DNA sequence: CATCCACCAGAAGGCTAATCTCCTTAA
    CAT...GGC (5)
    TCTCCT
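The shortened output format shown in sample runs 5 and 6 could be produced by a small helper like the one below. This is only a sketch; `formatProtein` is a hypothetical name, and it assumes the protein's length is a multiple of 3.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: return the text to print for one protein.
// Proteins of 3 codons (9 characters) or fewer are returned unchanged;
// longer ones become "<first codon>...<last codon> (<codon count>)".
std::string formatProtein(const std::string& protein) {
    const std::size_t codons = protein.size() / 3;
    if (codons <= 3) {
        return protein;
    }
    return protein.substr(0, 3) + "..." +
           protein.substr(protein.size() - 3) +
           " (" + std::to_string(codons) + ")";
}
```

Note that a protein of exactly 9 characters, such as TCTCCTGCG in sample run 3, is printed in full because the shortened form applies only to proteins longer than 3 codons.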
