Question:

(30 points) Figure 1 on the last page shows an MDP $M = \langle S, A, P, R \rangle$.
(a) (10 points) Write down the state space ($S$), action space ($A$), state-transition matrix ($P$), and reward vector ($R$) of $M$. Label the rows and columns of $P$ and the rows of $R$.
(b) (10 points) How many policies does $M$ have? List all the policies of $M$ in the format shown below. The first component is the action for state $s_1$, the second is for state $s_2$, and the third is for state $s_3$.
$$\pi_1 = \begin{bmatrix} a_1 \\ a_1 \\ a_1 \end{bmatrix}$$
(c) (10 points) Let $\pi$ be the policy that takes action $a_2$ in all the states, i.e.,
$$\pi = \begin{bmatrix} a_2 \\ a_2 \\ a_2 \end{bmatrix}$$
Define $\pi$ using symbols. Draw $\pi$ as an MDP. Write down the state-transition matrix and the reward vector of $\pi$.

Figure 1: An MDP $M = \langle S, A, P, R \rangle$. The immediate rewards of all state-action pairs are shown within square brackets. The state-transition probability of each state-action pair is shown close to the arrows representing actions. Action $a_2$ is deterministic, so the probability 1 is not shown in the figure.
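For part (b), a deterministic policy assigns one action to each state, so the number of policies is $|A|^{|S|}$. The following is a minimal sketch of that enumeration, not a definitive answer: it assumes the figure shows exactly three states (s1, s2, s3) and that exactly two actions (a1, a2) are available in every state. The question text names only these, and the figure itself is not reproduced here, so both the state list and the action list are assumptions.

```python
from itertools import product

# Assumptions (the figure is not available): three states, and both
# actions a1 and a2 can be taken in every state.
states = ["s1", "s2", "s3"]
actions = ["a1", "a2"]

# A deterministic policy picks one action per state, so there are
# len(actions) ** len(states) policies in total.
policies = [
    dict(zip(states, choice))
    for choice in product(actions, repeat=len(states))
]

# Print each policy in the order (s1, s2, s3), as the question asks.
for i, policy in enumerate(policies, start=1):
    listed = ", ".join(policy[s] for s in states)
    print(f"pi_{i} = [{listed}]")
```

Under these assumptions the list has $2^3 = 8$ entries. The first is the all-a1 policy given as $\pi_1$ in the question, and the last is the all-a2 policy from part (c).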
