1. (20%) When performing pattern matching in DNA, a useful type of imprecision is the ability...
Question:
1. (20%) When performing pattern matching in DNA, a useful type of imprecision is the ability to specify residue classes. As one example, the temperature at which a DNA molecule "melts" (i.e. the two strands of the double helix separate) is determined not only by its length (longer strands melt at higher temperatures) but also by the proportion of "weak" (A or T) vs. "strong" (C or G) bases that it contains. Strands of a given length with a higher proportion of strong residues melt at higher temperatures.

In this problem, we want to search a DNA database for matches to a pattern P in which each character is either a single DNA base or one of the symbols W or S, denoting the weak and strong classes respectively. A symbol W or S in the pattern matches either element of its class. For example, the pattern AWG would match either of the text strings AAG or ATG, while the pattern SSSS would match any of the 16 possible 4-mers composed entirely of C's and G's.

(a) We can easily extend the definition of sp-values in the KMP algorithm to include prefix-suffix matches in which one or both substrings contain residue classes. Show, however, that if the pattern contains a residue class at even a single position, the pattern-shifting rule used by the KMP algorithm can yield incorrect answers. Explain why the rule fails.

(b) Describe how to extend the basic KMP algorithm to work correctly using a pattern with a residue class at exactly one position. Hint: make your sp-values conditional on which character matched the S or W. Argue that your revised method is still correct. Don't worry about explaining how to compute sp-values in this revised algorithm, but do show that, once you have these values, your revised algorithm preserves KMP's property of performing at most 2T comparisons to the text.

(c) How much space (asymptotically) do you need to store sp-values for a pattern with residue classes in k positions? Justify your answer.
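The matching rule described in the problem can be sketched as a brute-force search that tries every alignment. This is only an illustration of the semantics of W and S, not the KMP-based method the problem asks for; the names `CLASSES`, `matches`, and `naive_search` are hypothetical and do not come from the original question.

```python
# Residue classes from the problem statement:
# W = weak (A or T), S = strong (C or G); a plain base matches only itself.
CLASSES = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "W": {"A", "T"}, "S": {"C", "G"},
}


def matches(p, t):
    """Return True if pattern symbol p matches the text base t."""
    return t in CLASSES[p]


def naive_search(pattern, text):
    """Return every 0-based start position where pattern matches text.

    Checks each alignment directly, so it serves as a reference answer
    rather than an efficient algorithm.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(matches(p, t) for p, t in zip(pattern, text[i:i + m]))
    ]


# AWG matches AAG (position 0) and ATG (position 3), but not ACG.
print(naive_search("AWG", "AAGATGACG"))  # [0, 3]
# SSSS matches only 4-mers made entirely of C's and G's.
print(naive_search("SSSS", "ACGCGT"))    # [1]
```

A reference search like this is useful mainly for checking the output of any faster method against small inputs.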
Expert Answer:
(a) When using the Knuth-Morris-Pratt (KMP) algorithm, the pattern-shifting rule relies on the concept of the longest proper suffix of the substring that is a...
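One possible way to see the kind of failure part (a) asks about is to run unmodified KMP with extended sp-values on a small case. The sketch below assumes that two pattern symbols count as a prefix-suffix match whenever their classes overlap; that definition, the pattern TWG, the text TAAG, and all function names are illustrative assumptions, not taken from the original answer.

```python
# Assumption: W = {A, T}, S = {C, G}, as in the problem statement.
CLASSES = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "W": {"A", "T"}, "S": {"C", "G"},
}


def matches(p, t):
    """Pattern symbol p matches text base t."""
    return t in CLASSES[p]


def compatible(p, q):
    """Assumed extension: two pattern symbols 'match' if their classes overlap."""
    return bool(CLASSES[p] & CLASSES[q])


def extended_sp(pattern):
    """sp[i] = length of the longest proper suffix of pattern[:i+1] that is
    compatible, symbol by symbol, with a prefix of the pattern (brute force)."""
    sp = []
    for i in range(len(pattern)):
        best = 0
        for f in range(i, 0, -1):  # try the longest candidate first
            if all(compatible(pattern[k], pattern[i - f + 1 + k]) for k in range(f)):
                best = f
                break
        sp.append(best)
    return sp


def kmp_unmodified(pattern, text):
    """Standard KMP scan, using the extended sp-values without any other change."""
    sp = extended_sp(pattern)
    hits, j = [], 0
    for i, t in enumerate(text):
        while j > 0 and not matches(pattern[j], t):
            j = sp[j - 1]  # shift, trusting that the prefix still matches the text
        if matches(pattern[j], t):
            j += 1
        if j == len(pattern):
            hits.append(i - j + 1)
            j = sp[j - 1]
    return hits


def naive_search(pattern, text):
    """Reference answer: test every alignment directly."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if all(matches(p, t) for p, t in zip(pattern, text[i:i + m]))]


print(extended_sp("TWG"))             # [0, 1, 0]
print(kmp_unmodified("TWG", "TAAG"))  # [1]  (a false match: text[1:4] is AAG)
print(naive_search("TWG", "TAAG"))    # []   (TWG does not occur in TAAG)
```

In this run the W matched an A in the text, but the shift aligns the pattern's leading T with that same A and skips re-checking it, because T and W are compatible with each other. The shift rule only knows that the prefix agrees with the pattern's suffix, not with the text character that actually matched the class.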