Question: (40) 1. Consider the following Markov decision process problem in which $S = \{s_1, s_2, s_3\}$ and the action sets, rewards, and transition probabilities are as follows.

State $s_1$: $A_{s_1} = \{a_{11}, a_{12}\}$, $r(s_1, a_{11}) = 0$, $r(s_1, a_{12}) = 2$, $p(s_2 \mid s_1, a_{11}) = 1$, and $p(s_1 \mid s_1, a_{12}) = 1$.

State $s_2$: $A_{s_2} = \{a_{21}, a_{22}, a_{23}\}$, $r(s_2, a_{21}) = 1$, $r(s_2, a_{22}) = 1$, $r(s_2, a_{23}) = 3$, $p(s_3 \mid s_2, a_{21}) = 1$, $p(s_1 \mid s_2, a_{22}) = 1$, and $p(s_2 \mid s_2, a_{23}) = 1$.

State $s_3$: $A_{s_3} = \{a_{31}, a_{32}\}$, $r(s_3, a_{31}) = 2$, $r(s_3, a_{32}) = 4$, $p(s_j \mid s_3, a_{31}) = 1$ for a single destination state $s_j$ whose index is illegible in the source, and $p(s_3 \mid s_3, a_{32}) = 1$.

(20) (a) Classify this Markov decision process problem. Please mention everything that applies.

(20) (b) Perform the appropriate policy iteration to compute the long-run average reward optimal policy.
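Because every transition in this problem is deterministic, an answer to part (b) can be cross-checked by brute force. The sketch below is not policy iteration: it enumerates all 12 deterministic stationary policies and computes each state's long-run average reward as the mean reward on the cycle that state eventually enters. The function names `build_mdp`, `gain`, and `best_policy` are hypothetical. The destination of $a_{31}$ is unreadable in the problem text, so it is exposed as a parameter with `"s2"` as a placeholder assumption.

```python
from fractions import Fraction
from itertools import product


def build_mdp(a31_dest="s2"):
    """Return {state: {action: (next_state, reward)}} for the deterministic MDP.

    ASSUMPTION: the destination of a31 is unreadable in the problem text;
    "s2" is a placeholder guess, not a value taken from the source.
    """
    return {
        "s1": {"a11": ("s2", 0), "a12": ("s1", 2)},
        "s2": {"a21": ("s3", 1), "a22": ("s1", 1), "a23": ("s2", 3)},
        "s3": {"a31": (a31_dest, 2), "a32": ("s3", 4)},
    }


def gain(mdp, policy, start):
    """Average reward per step on the cycle eventually reached from `start`."""
    visited, rewards, state = [], [], start
    while state not in visited:
        next_state, reward = mdp[state][policy[state]]
        visited.append(state)
        rewards.append(reward)
        state = next_state
    # `state` is the first repeated state, so the cycle starts at its index.
    cycle = rewards[visited.index(state):]
    return Fraction(sum(cycle), len(cycle))


def best_policy(mdp):
    """Enumerate all deterministic stationary policies; keep the largest total gain."""
    states = list(mdp)
    best = None
    for actions in product(*(mdp[s] for s in states)):
        policy = dict(zip(states, actions))
        gains = [gain(mdp, policy, s) for s in states]
        # A policy that is optimal in every state also maximizes the sum.
        if best is None or sum(gains) > sum(best[1]):
            best = (policy, gains)
    return best


policy, gains = best_policy(build_mdp())
print(policy, [str(g) for g in gains])
# prints {'s1': 'a11', 's2': 'a21', 's3': 'a32'} ['4', '4', '4']
```

Under this reading of the data, the enumeration returns the same policy and a gain of 4 in every state for any choice of `a31_dest`, since $a_{32}$ already earns the largest single-step reward on a self-loop at $s_3$. The classification asked for in part (a), in contrast, may depend on where $a_{31}$ leads.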
