Question: Problem 2 (16 marks) Consider a Markov Decision Process (MDP) with states S = {4, 3, 2, 1, 0}, where 4 is the starting state. In states k ≥ 1 you can walk (W), and T(k, W, k − 1) = 1. In states k ≥ 2 you can also jump (J), and T(k, J, k − 2) = 3/4 and T(k, J, k) = 1/4. State 0 is a terminal state. The reward R(s, a, s') = (s − s')² for all (s, a, s'). Use a discount of γ = 1/2. Compute both V*(2) and Q*(3, J). Clearly show how you computed these values.
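Answer sketch. The Bellman optimality equations give both values by hand. From state 2, jumping reaches 0 with probability 3/4 (reward (2 − 0)² = 4) or stays at 2 with probability 1/4 (reward 0), so if J is optimal, V*(2) = (3/4)·4 + (1/4)·γ·V*(2), i.e. V*(2) = 3 + V*(2)/8, giving V*(2) = 24/7 ≈ 3.43 (this beats walking, which yields 1 + γ·V*(1) = 3/2, since V*(1) = 1). Similarly V*(3) = 27/7, and since J is optimal at 3, Q*(3, J) = (3/4)·(4 + γ·V*(1)) + (1/4)·γ·V*(3) = 27/8 + 27/56 = 27/7 ≈ 3.86. A small value-iteration script confirms these numbers; note it assumes the "k ≥ 1 can walk / k ≥ 2 can also jump" reading (the strict "k > 1" in the scraped text would leave state 1 with no action):

```python
# Numerical check of V*(2) and Q*(3, J) by value iteration.
# Assumption: states k >= 1 can walk (W), states k >= 2 can also jump (J).

GAMMA = 0.5

def q_value(s, a, V):
    """Expected discounted return of taking action a in state s given values V."""
    if a == "W":
        # walk: deterministic step s -> s-1, reward (s - (s-1))^2 = 1
        return 1.0 + GAMMA * V[s - 1]
    # jump: s -> s-2 with prob 3/4 (reward (s-(s-2))^2 = 4),
    #       stay at s with prob 1/4 (reward 0)
    return 0.75 * (4.0 + GAMMA * V[s - 2]) + 0.25 * (0.0 + GAMMA * V[s])

def actions(s):
    return ["W", "J"] if s >= 2 else (["W"] if s == 1 else [])

V = [0.0] * 5          # V[0] stays 0: state 0 is terminal
for _ in range(100):   # with gamma = 1/2, 100 sweeps is far past convergence
    V = [max((q_value(s, a, V) for a in actions(s)), default=0.0)
         for s in range(5)]

print(V[2])                 # -> 24/7 ≈ 3.42857...
print(q_value(3, "J", V))   # -> 27/7 ≈ 3.85714...
```

Because J is the optimal action at state 3, Q*(3, J) coincides with V*(3) = 27/7.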
