Question:

A sequence contains characters selected from {A,C,G,T}. A short sequence might be something like "AAATGCGCGT", for example. Our task is to locate a sequence within another (much larger) sequence. If I search for the sequence "CGCG" in the above, it is a 100% match starting at location five.

I will have data files that are 1 Mbyte (1,048,576 bytes) long, which is the large string. Let's call this the "sequence". I will have other files that are 10,240 bytes long, which will contain the DNA we wish to locate, which we will call the "subsequence". I will tell you in advance that there is no guarantee of a 100% match on the subsequence. What is the starting byte position, from 0 to the 1,038,336th byte, that gives the highest count of matched bytes?

The slow method to determine this is a nested loop:

    for( each starting position 0..1,038,336 )
        for( each byte 0..10,239 )
            if there is a match, increment a counter

and the largest match wins. Did I mention that this is the slow method? It is an O(N^2) algorithm, which is terrible. For our purposes it will work, but real sequence searches use faster algorithms. But that's not our point. We want to execute this in parallel.

OK, so here's what I want.

• You must write this program in either "C" or "C++".
• Divide the work into N processes. Each process will calculate 1/Nth of the work.
• The value for N will come from the command line, along with the name of the sequence file and the subsequence file, in that order. For example, suppose my program is named findDNA:

    $ ./findDNA 12 seqfile subseqfile
    Looking for string using 12 processes
    Best match is at position 123456 with 9876/10240 correct.

• The last starting position, 1,038,336, is the 1 Mbyte minus the 10,240. You can "stop early" for this assignment when you get to this position.
• As good programmers you should be checking the command line parameters for validity and not just blowing up if things are wrong.
• Instead of using "pthread_create()" use the "fork()" system call. Instead of "pthread_join()" use "waitpid()".
  o Be very careful about this. For example, how many processes do you think this code starts? for( i =0; i
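
The brute-force count and the fork()/waitpid() split described above could be sketched roughly as follows. This is only one possible shape, not the expected solution: the names Best, count_matches, scan_range, better and parallel_search are made up for illustration, and reporting each child's result back through a single pipe is an assumption (the question does not say how the processes should return their answers; a pipe is used here because forked processes do not share variables with the parent). Command-line checking and reading the two files are left out. Note the _exit() in the child branch: a child that kept running the loop would go on to call fork() itself, and the process count would grow far beyond N.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Best alignment found in a range of starting positions (hypothetical type). */
typedef struct {
    long position;  /* starting offset in the sequence */
    long matches;   /* matched bytes at that offset, -1 if nothing scanned */
} Best;

/* Count indices j in 0..sub_len-1 where seq[start + j] == sub[j]. */
static long count_matches(const char *seq, long start,
                          const char *sub, long sub_len)
{
    long count = 0;
    for (long j = 0; j < sub_len; j++)
        if (seq[start + j] == sub[j])
            count++;
    return count;
}

/* Brute-force scan of starting positions first..last, inclusive.
 * On a tie the lowest position is kept. */
static Best scan_range(const char *seq, const char *sub, long sub_len,
                       long first, long last)
{
    assert(sub_len > 0 && first <= last);
    Best best = { first, -1 };
    for (long pos = first; pos <= last; pos++) {
        long m = count_matches(seq, pos, sub, sub_len);
        if (m > best.matches) {
            best.position = pos;
            best.matches = m;
        }
    }
    return best;
}

/* Keep the result with more matches; on a tie, the lower position. */
static Best better(Best a, Best b)
{
    if (b.matches > a.matches ||
        (b.matches == a.matches && b.position < a.position))
        return b;
    return a;
}

/* Split the starting positions among nprocs children created with fork().
 * Returns matches == -1 on bad arguments or if setup fails. */
static Best parallel_search(const char *seq, long seq_len,
                            const char *sub, long sub_len, int nprocs)
{
    Best overall = { 0, -1 };
    long total = seq_len - sub_len + 1;   /* number of starting positions */
    int fd[2];
    if (nprocs < 1 || sub_len < 1 || total < 1)
        return overall;
    if (nprocs > total)
        nprocs = (int)total;              /* at least one position per child */
    pid_t *pids = malloc((size_t)nprocs * sizeof *pids);
    if (pids == NULL)
        return overall;
    if (pipe(fd) < 0) {
        free(pids);
        return overall;
    }

    int started = 0;
    long next = 0;                        /* first position not yet assigned */
    for (int i = 0; i < nprocs; i++) {
        long last = total * (i + 1) / nprocs - 1;
        pid_t pid = fork();
        if (pid < 0)
            break;                        /* parent scans the rest itself */
        if (pid == 0) {                   /* child: scan its slice, report */
            close(fd[0]);
            Best b = scan_range(seq, sub, sub_len, next, last);
            ssize_t w = write(fd[1], &b, sizeof b);
            _exit(w == (ssize_t)sizeof b ? 0 : 1);  /* never re-enter the loop */
        }
        pids[started++] = pid;
        next = last + 1;
    }
    close(fd[1]);

    if (next < total)                     /* only reached if a fork() failed */
        overall = scan_range(seq, sub, sub_len, next, total - 1);
    for (int i = 0; i < started; i++) {   /* one record per child */
        Best b;
        if (read(fd[0], &b, sizeof b) == (ssize_t)sizeof b)
            overall = better(overall, b);
    }
    close(fd[0]);
    for (int i = 0; i < started; i++)     /* reap every child */
        waitpid(pids[i], NULL, 0);
    free(pids);
    return overall;
}
```

With the short example from the question, parallel_search("AAATGCGCGT", 10, "CGCG", 4, 3) gives position 5 with 4 matched bytes. For the real files a main() would validate argv, load both files into memory, call the function with seq_len = 1,048,576 and sub_len = 10,240, and print the result in the format shown in the question.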
