Question: 1. (45 pts) Recall that the string alignment problem takes as input two strings x and y, composed of symbols x_i, y_j ∈ Σ, for a fixed...

1. (45 pts) Recall that the string alignment problem takes as input two strings x and y, composed of symbols x_i, y_j ∈ Σ, for a fixed symbol set Σ, and returns a minimal-cost set of edit operations for transforming the string x into string y. Let x contain n_x symbols, let y contain n_y symbols, and let the set of edit operations be those defined in the lecture notes (substitution, insertion, deletion, and transposition). Let the cost of indel be 1, the cost of swap be 10 (plus the cost of the two sub ops), and the cost of sub be 10, except when x_i = y_j, which is a "no-op" and has cost 0.

In this problem, we will implement and apply three functions.

(i) alignStrings(x,y) takes as input two ASCII strings x and y, and runs a dynamic programming algorithm to return the cost matrix S, which contains the optimal costs for all the subproblems for aligning these two strings.

    alignStrings(x,y)                  // x,y are ASCII strings
        S = table of length nx by ny   // for memoizing the subproblem costs
        initialize S                   // fill in the base cases
        for i = 1 to nx
            for j = 1 to ny
                S[i,j] = cost(i,j)     // optimal cost for x[0..i] and y[0..j]
        return S

(ii) extractAlignment(S,x,y) takes as input an optimal cost matrix S and strings x, y, and returns a vector a that represents an optimal sequence of edit operations to convert x into y. This optimal sequence is recovered by finding a path on the implicit DAG of decisions made by alignStrings to obtain the value S[nx,ny], starting from S[0,0].

    extractAlignment(S,x,y)            // S is an optimal cost matrix from alignStrings
        initialize a                   // empty vector of edit operations
        [i,j] = [nx,ny]                // initialize the search for a path to S[0,0]
        while i > 0 or j > 0
            a[i] = determineOptimalOp(S,i,j,x,y)   // what was an optimal choice?
            [i,j] = updateIndices(S,i,j,a)         // move to next position
        return a

When storing the sequence of edit operations in a, use a special symbol to denote no-ops.

(iii) commonSubstrings(x,L,a) takes as input the ASCII string x, an integer 1 ≤ L ≤ n_x, and an optimal sequence a of edits to x, which would transform x into y. This function returns each of the substrings of length at least L in x that aligns exactly, via a run of no-ops, to a substring in y.

(a) From scratch, implement the functions alignStrings, extractAlignment, and commonSubstrings. You may not use any library functions that make their implementation trivial. Within your implementation of extractAlignment, ties must be broken uniformly at random.

Submit (i) a paragraph for each function that explains how you implemented it (describe how it works and how it uses its data structures), and (ii) your code implementation, with code comments.

Hint: test your code by reproducing the APE / STEP and the EXPONENTIAL / POLYNOMIAL examples in the lecture notes (to do this exactly, you'll need to use unit costs instead of the ones given above).
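The three functions described above could be sketched in Python roughly as follows. This is one possible reading of the problem, not a reference solution. Several things here are assumptions that do not come from the problem text: the no-op symbol `"|"`, the operation labels `"delete"`, `"insert"`, `"sub"`, and `"swap"`, the helpers `sub_cost` and `candidates`, and the optional `rng` parameter. The swap rule (swap the last two symbols of x, then pay for two substitutions) is a guess at the lecture notes' definition of transposition, which is not reproduced here. The table is sized (nx+1) by (ny+1) so that row 0 and column 0 can hold the empty-prefix base cases.

```python
import random

INDEL, SUB, SWAP = 1, 10, 10   # costs from the problem statement
NOOP = "|"                     # assumed special symbol for a no-op

def sub_cost(a, b):
    """Substitution cost: 0 for a no-op (equal symbols), SUB otherwise."""
    return 0 if a == b else SUB

def candidates(S, i, j, x, y):
    """Every (cost, op, di, dj) move that can reach cell (i, j)."""
    out = []
    if i > 0:
        out.append((S[i - 1][j] + INDEL, "delete", 1, 0))
    if j > 0:
        out.append((S[i][j - 1] + INDEL, "insert", 0, 1))
    if i > 0 and j > 0:
        op = NOOP if x[i - 1] == y[j - 1] else "sub"
        out.append((S[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]), op, 1, 1))
    if i > 1 and j > 1:
        # Assumed transposition rule: swap x[i-2], x[i-1], then substitute both.
        c = SWAP + sub_cost(x[i - 1], y[j - 2]) + sub_cost(x[i - 2], y[j - 1])
        out.append((S[i - 2][j - 2] + c, "swap", 2, 2))
    return out

def alignStrings(x, y):
    """Fill the cost table; S[i][j] is the optimal cost for x[:i] -> y[:j]."""
    nx, ny = len(x), len(y)
    S = [[0] * (ny + 1) for _ in range(nx + 1)]
    for i in range(nx + 1):
        for j in range(ny + 1):
            if i or j:  # S[0][0] stays 0; edges reduce to pure indels
                S[i][j] = min(c for c, _, _, _ in candidates(S, i, j, x, y))
    return S

def extractAlignment(S, x, y, rng=random):
    """Walk back from S[nx][ny] to S[0][0], breaking ties uniformly at random."""
    a = []
    i, j = len(x), len(y)
    while i > 0 or j > 0:
        best = [m for m in candidates(S, i, j, x, y) if m[0] == S[i][j]]
        _, op, di, dj = rng.choice(best)  # uniform over the optimal moves
        a.append(op)
        i, j = i - di, j - dj
    a.reverse()  # ops were collected back to front
    return a

def commonSubstrings(x, L, a):
    """Return the substrings of x, of length >= L, covered by runs of no-ops."""
    found, run, i = [], "", 0      # i is the current position in x
    for op in a + [None]:          # the None sentinel flushes the final run
        if op == NOOP:
            run += x[i]
        else:
            if len(run) >= L:
                found.append(run)
            run = ""
        # insert consumes no symbol of x, swap consumes two, all others one
        i += {"insert": 0, "swap": 2}.get(op, 1)
    return found
```

With the stated costs a substitution (10) is always more expensive than a delete plus an insert (2), so optimal alignments under this sketch consist only of indels and no-ops. For example, aligning the made-up pair `"abcde"` and `"xbcdy"` costs 4, and `commonSubstrings("abcde", 3, a)` recovers `"bcd"` whichever way the random ties fall. Reproducing the lecture-note examples would require setting `INDEL`, `SUB`, and `SWAP` to unit costs, as the hint says.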

