You are right outside the entrance to a cave, as seen in the figure below. The...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
You are right outside the entrance to a cave, as seen in the figure below. The cave has n> 1 chambers numbered 1, 2,..., n. You (denoted "Y" in the figure) are in chamber "0", a convenient name for "just outside". The cave is linear, with each chamber connected to one on the left and one on the right (except that the last chamber, n, has no right neighbour). Y 0 1 2 3 m n RIGHT q r← → P LEFT q Pr You have arrived at the cave to go in and retrieve a diamond, which known to be in chamber me {1,2,...,n}. Since caves are dangerous places, you would like to be done with your mission as quickly as possible. Unfortunately the cave has a slippery interior, and so your two available actions RIGHT and LEFT can have unintended consequences. When you take the RIGHT action from chamber i {1,2,...,n-1}, you move to chamber i + 1 with probability p, you remain in chamber i with probability q, and you go to chamber i-1 with probability r, where p, q, r = [0, 1], p+q + r = 1, and p > r (so displacement in the intended direction is more probable than displacement in the opposite direction). Similarly, when you take the LEFT action from chamber ie {1,2,..., n-1}, you move to chamber i - 1 with probability p, you remain in chamber i with probability q, and you go to chamber i +1 with probability r. From chamber 0, taking LEFT keeps you (deterministically) in chamber 0; action RIGHT moves you to chamber 1 with probability p and keeps you in chamber 0 with probability 1- p. From chamber n, taking LEFT and RIGHT take you to chamber n - 1 with probabilities p and r, respectively, but keep you in chamber n with the remaining probabilities. You can assume that as soon as you enter chamber m for the first time, the diamond (instantly) drops into your pocket and stays there (no additional action or time needed to collect it). Hence, your task can be viewed as that of visiting chamber m (at least once) and thereafter returning to chamber 0. 3a. You have enough time to formulate your plan before entering the cave. However, due to the slippery surface inside, the actual number of steps taken by your plan is bound to be random. How would you plan so that the expected number of steps to retrieve the diamond and return to chamber 0 is minimised? Prove that your plan is optimal, in the sense that you cannot finish in fewer (expected) steps. We are using the informal word "plan" here in place of the more usual "policy" because we have not formalised the task in the language of MDPs. It is up to you to do so (and then, as usual, you can think in terms of policies). [4 marks] 3b. For the special case of r = 0 (that is, no backward slides), calculate the expected number of steps to complete the task using your optimal plan from 3a. Your answer must be a function n, m, p, and q (although you might be able to eliminate dependencies on some of these parameters). [3 marks] 3c. Your friends are worried about your expedition, and have decided that if you are not back to chamber 0 with the diamond even after T steps (for some T≥ 2m), they will send a search party to find you. Write down pseudocode to compute the probability that they will send a search party. Inputs to your code will be n, m, p, q, r, and T; the output must be the required probability. Only code is required; no need for a closed-form expression. The running time of your code must be polynomial in n and T. [5 marks] You are right outside the entrance to a cave, as seen in the figure below. The cave has n> 1 chambers numbered 1, 2,..., n. You (denoted "Y" in the figure) are in chamber "0", a convenient name for "just outside". The cave is linear, with each chamber connected to one on the left and one on the right (except that the last chamber, n, has no right neighbour). Y 0 1 2 3 m n RIGHT q r← → P LEFT q Pr You have arrived at the cave to go in and retrieve a diamond, which known to be in chamber me {1,2,...,n}. Since caves are dangerous places, you would like to be done with your mission as quickly as possible. Unfortunately the cave has a slippery interior, and so your two available actions RIGHT and LEFT can have unintended consequences. When you take the RIGHT action from chamber i {1,2,...,n-1}, you move to chamber i + 1 with probability p, you remain in chamber i with probability q, and you go to chamber i-1 with probability r, where p, q, r = [0, 1], p+q + r = 1, and p > r (so displacement in the intended direction is more probable than displacement in the opposite direction). Similarly, when you take the LEFT action from chamber ie {1,2,..., n-1}, you move to chamber i - 1 with probability p, you remain in chamber i with probability q, and you go to chamber i +1 with probability r. From chamber 0, taking LEFT keeps you (deterministically) in chamber 0; action RIGHT moves you to chamber 1 with probability p and keeps you in chamber 0 with probability 1- p. From chamber n, taking LEFT and RIGHT take you to chamber n - 1 with probabilities p and r, respectively, but keep you in chamber n with the remaining probabilities. You can assume that as soon as you enter chamber m for the first time, the diamond (instantly) drops into your pocket and stays there (no additional action or time needed to collect it). Hence, your task can be viewed as that of visiting chamber m (at least once) and thereafter returning to chamber 0. 3a. You have enough time to formulate your plan before entering the cave. However, due to the slippery surface inside, the actual number of steps taken by your plan is bound to be random. How would you plan so that the expected number of steps to retrieve the diamond and return to chamber 0 is minimised? Prove that your plan is optimal, in the sense that you cannot finish in fewer (expected) steps. We are using the informal word "plan" here in place of the more usual "policy" because we have not formalised the task in the language of MDPs. It is up to you to do so (and then, as usual, you can think in terms of policies). [4 marks] 3b. For the special case of r = 0 (that is, no backward slides), calculate the expected number of steps to complete the task using your optimal plan from 3a. Your answer must be a function n, m, p, and q (although you might be able to eliminate dependencies on some of these parameters). [3 marks] 3c. Your friends are worried about your expedition, and have decided that if you are not back to chamber 0 with the diamond even after T steps (for some T≥ 2m), they will send a search party to find you. Write down pseudocode to compute the probability that they will send a search party. Inputs to your code will be n, m, p, q, r, and T; the output must be the required probability. Only code is required; no need for a closed-form expression. The running time of your code must be polynomial in n and T. [5 marks]
Expert Answer:
Related Book For
Project Management The Managerial Process
ISBN: 9781260570434
8th Edition
Authors: Eric W Larson, Clifford F. Gray
Posted Date:
Students also viewed these computer engineering questions
-
Two students take a college entrance exam known to have a normal distribution of scores. The students receive raw scores of 63 and 93, which correspond to z scores (often called the standardized...
-
Name two places where you can go to find information about a company?
-
Two identical, uniform beams weighing 260 N each are connected at one end by a frictionless hinge. A light horizontal crossbar attached at the mid points of the beams maintains an angle of 53.0°...
-
How to respond to this response The effective utilization of data is integral to enhancing efficiency and making informed decisions?
-
Eisler Corporation issued 2,000 $1,000 bonds at 101. Each bond was issued with one detachable share warrant. At issuance, the net present value of the bonds without the warrants was $1,970,000....
-
The Morenos invest $10,000 in an account that grows to $12,000 in 5 years. What is the annual interest rate r if interest is compounded a. Quarterly b. Continuously
-
Journal entries for both buyer and seller periodic inventory system Non- GST version (a) Prepare general journal entries to record the following transactions (i) for Esther Ltd and (ii) for Barton...
-
ERP software programs allow tighter linkages within a supply chain than were possible with earlier generations of software. Consider the possibility of a tighter link between the marketing and...
-
Government spending as a fiscal policy tool is used to: A) ?Decrease the national debt B) ?Directly stimulate economic activity by increasing demand C) ?Reduce inflation D) ?Lower interest rates
-
J.A. Coghill owned a used Rolls Royce Corniche automobile, which he sold to a man claiming to be Daniel Bellman. Bellman gave Coghill a cashiers check for $ 94,500. When Coghill tried to cash the...
-
Graph = [[], [[5, 6], [3, 9]], [], [[4, 7]], [[6, 1]], [[6, 8]], [[3, 5], [1, 6]]] Suppose we have a graph given above. I would like to perform dijkstra algorithm on it where source = 2 and...
-
Write out the sample space for the given experiment. Use the following letters to indicate each choice: Y for yellow, B for blue, P for purple, U for unfinished, Efor ebony, and C for cherry While...
-
Sarah and Kate are meeting their bank manager Lynn Harvey next month. She has asked them to bring a Cash Budget showing planned receipts and payments for the next six months. Using the following...
-
A particular stock-keeping unit (SKU) has demand that averages 14 units per year and is Poisson distributed. That is, the time between demands is exponentially distributed with a mean of 1/14 years....
-
17-25. Double integrals Evaluate each double integral over the region R by converting it to an iterated integral. 17. [ (x (x + 2y) dA; R = {(x,y): 0x 3,1 y 4} 18. [] (x + xy) dA; R = {(x,y): 1 x...
-
Create a Crow's Foot ER diagram for each business rule. The diagrams should clearly describe and identify each entity and show relationships. Show all cardinality, primary keys, and foreign keys. Use...
-
What is the financial break-even point? Price = $115 per unit; variable cost = $47 per unit, fixed cost = $36,500.00 per year; depreciation = $16,500.00 per year. Assume a discount rate of 10%,...
-
What is master production scheduling and how is it done?
-
Do you agree that true innovation can only come from a small group of dedicated professionals?
-
1 . Can you think of a local public project that had significant cost overruns like the Portland Tram project? 2 . Do you agree with the statement that nothing great would ever be built if people...
-
What kinds of estimates are best suited for this method?
-
Let \(Y\) be distributed according to the beta \((1510)\) distribution. (a) Find \(\mathrm{E}[Y]\). (b) Find \(\operatorname{Var}[Y]\). (c) Find \(P(Y <5)\) using the normal approximation.
-
In order to determine how e ective a magazine is at reaching its target audience, a market research company selects a random sample of people from the target audience and interviews them. Out of the...
-
A city is considering building a new museum. The local paper wishes to determine the level of support for this project, and is going to conduct a poll of city residents. Out of the sample of 120...
Study smarter with the SolutionInn App