Question: CS 368 Introduction to Bioinformatics Homework 6 (Submit one single zip file including the program and the outputs of the program to D2L Assignments by 10:00PM 4/28/2017)
This is our second programming assignment.
Total: 30 Points
Write a Perl program that prompts for a sequence file name and a motif file name. It will read data from the files and then report the part of the sequence that matches the motif pattern and the position of the match. Recall that sequence positions start at 1, whereas string positions start at 0. The sequence files for ALP are ALP0.txt and ALP.txt. The sequence files for ZF are ZF0.txt and ZF.txt. A motif pattern file contains the regular expression of the motif pattern. The test run and result should look like the following. (Note that pattern matching is not substring searching.)
For ALP0.txt:
Please enter the input sequence file name: ALP0.txt
Please enter the motif pattern file name: pattern_alp.txt
Motif IRDSASTGT found at position 23
For ALP.txt:
Please enter the input sequence file name: ALP.txt
Please enter the motif pattern file name: pattern_alp.txt
Motif VPDSAGTAT found at position 107
For ZF0.txt:
Please enter the input sequence file name: ZF0.txt
Please enter the motif pattern file name: pattern_zf.txt
Motif CSDOCCHICHENGLINPHDTEACH found at position 105
For ZF.txt:
Please enter the input sequence file name: ZF.txt
Please enter the motif pattern file name: pattern_zf.txt
Motif CPYCHRLFSQATHLEVHVRSH found at position 595
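The core of the task (regular-expression matching plus the shift from a 0-based string offset to a 1-based sequence position) might be sketched as below. This is only a minimal sketch under stated assumptions: the sub name find_motif is hypothetical, the sequence is the ALP0.txt data listed in this question, and the regular expression is a stand-in for pattern_alp.txt, whose contents are not shown here. Whitespace is removed before matching so that spaces or line breaks in the file do not shift the reported position.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (matched_text, 1-based position) for the
# first match, or an empty list when the pattern does not match.
sub find_motif {
    my ($sequence, $pattern) = @_;
    $sequence =~ s/\s+//g;    # drop spaces and line breaks from the raw file text
    if ($sequence =~ /($pattern)/) {
        # $-[0] is the 0-based string offset where the whole match starts,
        # so adding 1 gives the 1-based sequence position.
        return ($1, $-[0] + 1);
    }
    return;
}

# Contents of ALP0.txt as listed in the question (embedded space kept on purpose).
my $sequence = "KANEGTVGVSAAKTYNTNCH AQIRDSASTGTAYLCGV";

# ASSUMPTION: stand-in for the contents of pattern_alp.txt, which is not shown.
my $pattern = '[IV].DS[GAS][GASC][GAST][GA]T';

my ($motif, $position) = find_motif($sequence, $pattern);
if (defined $motif) {
    print "Motif $motif found at position $position\n";
} else {
    print "No motif found\n";
}
```

With the assumed pattern, this prints "Motif IRDSASTGT found at position 23", which agrees with the ALP0.txt sample run. The note that pattern matching is not substring searching shows up here: the regular expression uses character classes and a wildcard, so a plain index() lookup for one fixed string would not work for both ALP files.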
Grading
The grade is based on the correctness, robustness, and documentation of the programs.
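What counts as robustness is not spelled out, so the following is only one possible reading: check that each file opens, that it is not empty, and that the pattern compiles as a regular expression before it is used. The sub names read_sequence_file and read_pattern_file are hypothetical, the pattern is again an assumed stand-in for pattern_alp.txt, and the demo writes temporary files (with the ALP0.txt data split over two lines, also an assumption) so the sketch can run on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a sequence file and return the residues as one string without whitespace.
sub read_sequence_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open sequence file '$path': $!\n";
    my $sequence = do { local $/; <$fh> };    # slurp the whole file
    close($fh);
    $sequence = '' unless defined $sequence;
    $sequence =~ s/\s+//g;                    # remove line breaks and spaces
    die "Sequence file '$path' is empty\n" if $sequence eq '';
    return $sequence;
}

# Read the first non-blank line of the pattern file and compile it as a regex.
sub read_pattern_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open pattern file '$path': $!\n";
    my $pattern;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;              # trim; this also removes the newline
        next if $line eq '';
        $pattern = $line;
        last;
    }
    close($fh);
    die "Pattern file '$path' contains no pattern\n" unless defined $pattern;
    my $regex = eval { qr/$pattern/ };        # catch an invalid regular expression
    die "Invalid regular expression in '$path': $@" unless defined $regex;
    return $regex;
}

# Demo with temporary files standing in for ALP0.txt and pattern_alp.txt.
my ($seq_fh, $seq_path) = tempfile(UNLINK => 1);
print {$seq_fh} "KANEGTVGVSAAKTYNTNCH\nAQIRDSASTGTAYLCGV\n";
close($seq_fh);

my ($pat_fh, $pat_path) = tempfile(UNLINK => 1);
print {$pat_fh} "[IV].DS[GAS][GASC][GAST][GA]T\n";    # ASSUMED pattern
close($pat_fh);

my $sequence = read_sequence_file($seq_path);
my $regex    = read_pattern_file($pat_path);
print "Read ", length($sequence), " residues\n";
print "Pattern matches\n" if $sequence =~ $regex;
```

In the actual program the two paths would come from the user rather than from temporary files: print each prompt from the sample runs, read a line from <STDIN>, and chomp it before passing it to the subs above.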
I also have four .txt files, so I am including them here:
ZF.txt
MPPPTAQFMGPTQAGQNESQNQSSGEAGEQNQEHGQGPTPILNQSQPASSQPQHQQQRNESISYYTNFNQ
PRYSTDASINSFLNISDNVPVTSTGGPSSGGAYSNLPRLSTSSTHQPPDLSQIGRGFSIVNNLFPQQQQL QNQHRQQQQQQQQQSHQQPPFKTPSFSTGLTGSSSQYQFLPRNDNTSQPPSKRNSVYLGPNDGPDFEFFS MQQSQQPQFQPSSRRESNSMRPPLLIPAATTKSQSNGTNNSGNMNTNADYESFFNTGTNNSNSNQNPYFL SSRNNSLKFNPEDFDFQFKRRNSFVRGTLDHSSQNAFIPESRLNSLSVNNKANGDPVADNVTNNMKGKSN EVDNDDGNDSSNNNNNNNNNNNNENNNDNNNDNNDNSINSATSTNIPNQEDHSLASTDTTSNSRKDLKEI EQRLRKHLNDEDNYSSAISRPLDKNDVIEGSEGLNKHIDESGMQPNIIKKRKKDDSTVYVKNEMPRTDPP MSKDNSTSAEGAAMANFSGKEPPIPDISSVSDDATNLIGATKVDQLMLIIQARKKGFTEKVNTTQDGDLL FNQTMDILPPKSELVGGVEKPKGTQNTRAVKKHECPYCHRLFSQATHLEVHVRSHIGYKPFVCDYCGKRF TQGGNLRTHERLHTGEKPYSCDICDKKFSRKGNLAAHLVTHQKLKPFVCKLENCNKTFTQLGNMKAHQNR FHKETLNALTAKLAEMNPSENIPLEERQLLEYFASIYKNSNRGIKGRGKGVGTKKSTISSPENHPASTIL NPNTNANNAIANDSENNGNPEGNIDSSSNSNPGSHSMISPTQKDMGTLQSQFIQNNFNNSVNSSNPSNQP IINYNYTTLPHSRLGSSSSSNTNNNNSNFSVGAAPGVLMAPTTNNDFSFNLDQSNDNERSQQEQVRFKNI NYKS
ZF0.txt
MSKDNSTSAEGAAMANFSGKEPPIPDISSVSDDATNLIGATKVDQLMLIIQARKKGFTEKVNTTQDGDLL FNQTMDILPPKSELVGGVEKPKGTQNTRAVKKHECSDOCCHICHENGLINPHDTEACHIGYKPFVCDYCG TQGGNLRTHERLHTGEKPYSCDICDKKFSRKGNLAAHLVTHQKLKPFVCKLENCNKTFTQLGNMKAH
ALP.txt
MILPFLVLAIGPCLTNSFVPEKEKDPSYWRQQAQETLKNALKLQKLNTNVAKNIIMFLGDGMGVSTVTAA RILKGQLHHNTGEETRLEMDKFPFVALSKTYNTNAQVPDSAGTATAYLCGVKANEGTVGVSAATERTRCN TTQGNEVTSILRWAKDAGKSVGIVTTTRVNHATPSAAYAHSADRDWYSDNEMRPEALSQGCKDIAYQLMH NIKDIDVIMGGGRKYMYPKNRTDVEYELDEKARGTRLDGLDLISIWKSFKPRHKHSHYVWNRTELLALDP SRVDYLLGLFEPGDMQYELNRNNLTDPSLSEMVEVALRILTKNPKGFFLLVEGGRIDHGHHEGKAKQALH EAVEMDEAIGKAGTMTSQKDTLTVVTADHSHVFTFGGYTPRGNSIFGLAPMVSDTDKKPFTAILYGNGPG YKVVDGERENVSMVDYAHNNYQAQSAVPLRHETHGGEDVAVFAKGPMAHLLHGVHEQNYIPHVMAYASCI GANLDHCAWASSASSPSPGALLLPLALFPLRTLF
ALP0.txt
KANEGTVGVSAAKTYNTNCH AQIRDSASTGTAYLCGV
