Question: Exercise 1: Explain how to solve the Overlap Alignment Problem.

Exercise 1: Explain how to solve the Overlap Alignment Problem. A hint: how can we change the alignment graph for global alignment by adding zero-weight "free ride" edges? What should these edges be, and how do they compare to the free ride edges for local alignment? What is the resulting recurrence relation?

The reads produced by genome sequencers fall into two main categories. We have worked with algorithms for short reads (of a few hundred nucleotides) like the ones produced by Illumina's Sequencing by Synthesis technique. The other type of read is much longer, containing tens of thousands of nucleotides (or even more than 100,000 nucleotides). However, long reads are very error-prone; the reads produced by a company like Pacific Biosciences have a ~15% error rate.

The benefit of longer reads is that we need fewer of them to obtain the same coverage of the genome. However, with an error every few nucleotides, the current approach based on exact overlap of k-mers will completely fall apart. Instead, we might imagine using an alignment-based heuristic, since sequence alignment will easily find 85% similarity between two strings. In particular, as a first step we could align every pair of reads; we then form an overlap graph of sorts in which nodes correspond to reads and an edge connects x to y if the resulting alignment scores above some threshold.

The question is what type of alignment to use. We don't want global alignment, since only the ends of the reads will be similar. We don't want local alignment, since not every pair of similar substrings represents a valid overlap of reads. We want alignments of the form below that are "global-ish" but only of the ends of the reads (where we don't know in advance how long the overlap will be).

    ATGCATGCCGG
         T-CC-GAAAC

An overlap alignment of strings $v = v_1 \ldots v_n$ and $w = w_1 \ldots w_m$ is a global alignment of a suffix of v with a prefix of w. An optimal overlap alignment of strings v and w maximizes the global alignment score between an i-suffix of v and a j-prefix of w (i.e., between $v_i \ldots v_n$ and $w_1 \ldots w_j$) among all i and j.

Overlap Alignment Problem: Construct a highest-scoring overlap alignment between two strings.

Input: Two strings and a matrix score.

Output: A highest-scoring overlap alignment between the two strings as defined by the scoring matrix score.
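
One possible way to approach the exercise, offered as a minimal sketch and not as the expert answer from this page: start from the global alignment graph, add a zero-weight edge from the source (0, 0) to every node (i, 0) in the first column (the alignment may start at any suffix of v), and add a zero-weight edge from every node (n, j) in the last row to the sink (it may end at any prefix of w). Local alignment, by contrast, adds free rides from the source to every node and from every node to the sink. Under this construction the first column is initialized to 0, the first row keeps its indel penalties, the interior uses the usual three-way global recurrence, and the answer is the maximum over the last row. The function name `overlap_alignment` and the `match`, `mismatch`, and `indel` parameters are assumptions made for illustration; they stand in for the scoring matrix named in the problem.

```python
def overlap_alignment(v, w, match=1, mismatch=-1, indel=-1):
    """Sketch: best global alignment of a suffix of v with a prefix of w.

    Scoring parameters are illustrative stand-ins for a scoring matrix.
    Returns (score, aligned suffix of v, aligned prefix of w).
    """
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]

    # First row: no free rides here, so prefixes of w pay indel penalties.
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] + indel
        back[0][j] = "left"

    # First column stays 0: free ride from the source to every node (i, 0).
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if v[i - 1] == w[j - 1] else mismatch
            candidates = [
                (s[i - 1][j - 1] + sub, "diag"),
                (s[i - 1][j] + indel, "up"),
                (s[i][j - 1] + indel, "left"),
            ]
            s[i][j], back[i][j] = max(candidates, key=lambda c: c[0])

    # Free ride from every node (n, j) in the last row to the sink.
    best_j = max(range(m + 1), key=lambda j: s[n][j])

    # Trace back until column 0, where the free ride from the source lands.
    i, j = n, best_j
    aligned_v, aligned_w = [], []
    while j > 0:
        move = back[i][j]
        if move == "diag":
            aligned_v.append(v[i - 1])
            aligned_w.append(w[j - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            aligned_v.append(v[i - 1])
            aligned_w.append("-")
            i -= 1
        else:  # "left"
            aligned_v.append("-")
            aligned_w.append(w[j - 1])
            j -= 1

    return s[n][best_j], "".join(reversed(aligned_v)), "".join(reversed(aligned_w))


print(overlap_alignment("AAACGT", "CGTTTT"))  # (3, 'CGT', 'CGT')
```

This sketch allows an empty overlap (score 0 at column 0 of the last row); a variant that requires a non-empty overlap would take the maximum over j >= 1 instead. The table takes O(nm) time and space for strings of lengths n and m.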
