Question: A highly simplified view of copy number detection with microarray techniques

A highly simplified view of copy number detection with microarray techniques. For this exercise, we assume that the DNA of an individual has been measured by a DNA microarray, such that different DNA segments correspond to probes that respond with a light intensity. Higher intensity indicates the presence of a DNA segment, lower intensity the absence of a DNA segment.

We assume that the probes of a segment with "normal copy number" (state 2) emit an intensity with a Gaussian distribution with mean $\mu_2 = 7.0$ and standard deviation $\sigma_2 = 1.0$. A segment with a "copy number deletion" corresponds to a probe with intensity parameters $\mu_1 = 5.5$, $\sigma_1 = 2.0$, and a segment with a "copy number amplification" to a probe with $\mu_3 = 8.0$ and $\sigma_3 = 1.5$. (HINT: note that these distributions are necessary to calculate $p_E(x_t \mid u_t)$ from the Lecture Notes.)

We will consider five probes that are located on consecutive DNA segments, such that the copy number in one segment depends on the copy number of the preceding segment. This means we will have observation vectors such as $x = (8.70, 6.64, 10.27, 9.83, 6.61)$, where each entry corresponds to the measurement of one probe and the probes are located in this order on the DNA.

We will model the data arising from these DNA segments with a hidden Markov model with three hidden states ($S = \{1, 2, 3\}$) corresponding to the copy number status "deletion", "normal", and "gain", where each state emits signals according to the above-mentioned Gaussian (normal) distributions. The transition probabilities are given by

$$A = \begin{pmatrix} 0.38 & 0.60 & 0.02 \\ 0.05 & 0.90 & 0.05 \\ 0.01 & 0.39 & 0.60 \end{pmatrix},$$

where $A_{ij}$ is the probability that state $i$ transitions to state $j$; thus the row sums are 1. The initial state probabilities are $p_{\text{init}} = (0.02, 0.95, 0.03)$.

You are given the sequences of hidden states $u = (2, 2, 3, 3, 2)$ and $v = (2, 2, 2, 2, 2)$. These five hidden states correspond to the (unknown) copy numbers in each of the five consecutive DNA segments. Calculate the probabilities to observe these hidden state vectors! Which of those is more likely to be observed?

16.2 Sub-task 2: Joint probabilities of observed and hidden states (3 points)
- a) You are given the observations $x = (8.708048, 6.641348, 10.278741, 9.839337, 6.609083)$ and hidden states $u = (2, 2, 3, 3, 2)$. Calculate the likelihood $p(x, u)$!
- b) You are given the observations $x = (8.708048, 6.641348, 10.278741, 9.839337, 6.609083)$ and hidden states $u = (2, 2, 2, 2, 2)$. Calculate the likelihood $p(x, u)$!

16.3 Sub-task 3: Likelihood of observed data (7 points)
Calculate the likelihood $p(x)$ for the given observations $x = (8.708048, 6.641348, 10.278741, 9.839337, 6.609083)$ in the following two ways!
- a) Enumerate all $3^5 = 243$ possible hidden state sequences and use the procedures from sub-task 2 to calculate their probabilities $p(x, u)$. Then sum up over all these hidden state sequences. HINT: In Python, the function itertools.product (https://docs.python.org/3/library/itertools.html#itertools.product) might be useful to generate all those combinations.
- b) Implement the forward algorithm as discussed and introduced in the lecture to calculate the likelihood! HINT: Note that the results of these two calculations must be the same!

16.4 Sub-task 4: Most likely hidden sequence (7 points)
Calculate the most likely hidden sequence $u$ that has generated the observations $x = (8.708048, 6.641348, 10.278741, 9.839337, 6.609083)$ by implementing the Viterbi algorithm (introduced and discussed in the lecture)! HINT: In sub-task 3a) you might have done some calculations already that help you with debugging your Viterbi algorithm.
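For the hidden-state and joint probabilities, the hidden sequence is a Markov chain, so p(u) = p_init(u_1) · ∏ A[u_{t-1}][u_t], and the joint likelihood multiplies in one Gaussian emission density per probe: p(x, u) = p(u) · ∏ p_E(x_t | u_t). The following is a minimal Python sketch under one stated assumption: the nine flattened transition entries are read column by column, which is the reading under which every row sums to 1 as the exercise requires. The function names `gaussian_pdf`, `state_sequence_prob`, and `joint_prob` are illustrative and do not come from the lecture notes.

```python
import math

# Model parameters from the exercise; states 1, 2, 3 are stored at index 0, 1, 2.
P_INIT = [0.02, 0.95, 0.03]
# Assumption: entries arranged so that each row (state i -> state j) sums to 1.
A = [[0.38, 0.60, 0.02],
     [0.05, 0.90, 0.05],
     [0.01, 0.39, 0.60]]
MU = [5.5, 7.0, 8.0]      # emission means for deletion, normal, gain
SIGMA = [2.0, 1.0, 1.5]   # emission standard deviations


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def state_sequence_prob(states):
    """p(u) for a hidden sequence given with state labels 1..3."""
    idx = [s - 1 for s in states]
    p = P_INIT[idx[0]]
    for prev, cur in zip(idx, idx[1:]):
        p *= A[prev][cur]
    return p


def joint_prob(x, states):
    """p(x, u) = p(u) times the emission density of every observation."""
    p = state_sequence_prob(states)
    for obs, s in zip(x, states):
        p *= gaussian_pdf(obs, MU[s - 1], SIGMA[s - 1])
    return p


x = [8.708048, 6.641348, 10.278741, 9.839337, 6.609083]
u = [2, 2, 3, 3, 2]
v = [2, 2, 2, 2, 2]
print("p(u) =", state_sequence_prob(u), " p(v) =", state_sequence_prob(v))
print("p(x,u) =", joint_prob(x, u), " p(x,v) =", joint_prob(x, v))
```

Under this reading of the matrix, the hand calculation gives p(u) = 0.95 · 0.90 · 0.05 · 0.60 · 0.39 ≈ 0.0100 and p(v) = 0.95 · 0.90^4 ≈ 0.6233, so v is the more likely hidden sequence before any observations are taken into account. The ranking of the joint probabilities can differ, because the emission densities enter as additional factors.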
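The likelihood p(x) and the most likely hidden sequence can be sketched together, since brute-force enumeration doubles as a check for both dynamic-programming routines. The code below is a hedged illustration, not the lecture's reference implementation: it assumes the transition matrix is arranged so that each row sums to 1, works with plain probabilities (adequate for five observations; longer sequences would normally use log-space or scaling), and uses hypothetical function names.

```python
import itertools
import math

# States 1, 2, 3 are handled internally as indices 0, 1, 2.
P_INIT = [0.02, 0.95, 0.03]
# Assumption: entries arranged so that each row (state i -> state j) sums to 1.
A = [[0.38, 0.60, 0.02],
     [0.05, 0.90, 0.05],
     [0.01, 0.39, 0.60]]
MU = [5.5, 7.0, 8.0]
SIGMA = [2.0, 1.0, 1.5]
N_STATES = 3


def emission(x_t, s):
    """Gaussian emission density p_E(x_t | state s)."""
    z = (x_t - MU[s]) / SIGMA[s]
    return math.exp(-0.5 * z * z) / (SIGMA[s] * math.sqrt(2.0 * math.pi))


def joint_prob(x, path):
    """p(x, path) for a 0-based state path."""
    p = P_INIT[path[0]] * emission(x[0], path[0])
    for t in range(1, len(x)):
        p *= A[path[t - 1]][path[t]] * emission(x[t], path[t])
    return p


def likelihood_brute_force(x):
    """Sum p(x, path) over all N_STATES ** len(x) hidden paths."""
    paths = itertools.product(range(N_STATES), repeat=len(x))
    return sum(joint_prob(x, path) for path in paths)


def likelihood_forward(x):
    """Forward algorithm: alpha[j] = p(x_1..x_t, state_t = j)."""
    alpha = [P_INIT[s] * emission(x[0], s) for s in range(N_STATES)]
    for x_t in x[1:]:
        alpha = [emission(x_t, j) * sum(alpha[i] * A[i][j] for i in range(N_STATES))
                 for j in range(N_STATES)]
    return sum(alpha)


def viterbi(x):
    """Return (most likely path with labels 1..3, its joint probability)."""
    delta = [P_INIT[s] * emission(x[0], s) for s in range(N_STATES)]
    backpointers = []
    for x_t in x[1:]:
        # Best predecessor of each state j at this step.
        best_prev = [max(range(N_STATES), key=lambda i, j=j: delta[i] * A[i][j])
                     for j in range(N_STATES)]
        delta = [delta[best_prev[j]] * A[best_prev[j]][j] * emission(x_t, j)
                 for j in range(N_STATES)]
        backpointers.append(best_prev)
    last = max(range(N_STATES), key=lambda s: delta[s])
    path = [last]
    for best_prev in reversed(backpointers):   # trace back to the first probe
        path.append(best_prev[path[-1]])
    path.reverse()
    return [s + 1 for s in path], delta[last]


x = [8.708048, 6.641348, 10.278741, 9.839337, 6.609083]
print("p(x), enumeration:", likelihood_brute_force(x))
print("p(x), forward:    ", likelihood_forward(x))
print("Viterbi path and p(x, u*):", viterbi(x))
```

The forward recursion replaces the sum over 243 paths with one pass over the five probes, and Viterbi has the same shape with the sum over predecessors replaced by a maximum plus a backpointer. Comparing the Viterbi result against the largest p(x, u) found during enumeration is a simple way to debug it.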
