Question: (30 points) Figure 1 on the last page shows an MDP M = (S, A, P, R).
(a) Write down the state space S, action space A, state-transition matrix P, and reward vector R of M. Label the rows and columns of P and the rows of R.
(b) How many policies does M have? List all the policies of M in the format shown below. The first component is the action for the first state, the second is for the second state, and the third is for the third state.
pi = (a, a, a)
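A deterministic policy chooses one action per state, so when every action is available in every state there are |A|^|S| policies. A minimal sketch of the enumeration, assuming three states and a hypothetical two-action set; the labels `s1`, `s2`, `s3`, `a1`, `a2` are placeholders, not taken from Figure 1:

```python
from itertools import product

# Hypothetical labels: the actual state and action names in Figure 1 may differ.
states = ["s1", "s2", "s3"]
actions = ["a1", "a2"]

# A deterministic policy assigns one action to each state, in state order.
policies = list(product(actions, repeat=len(states)))

for pi in policies:
    print(pi)
print(len(policies))
```

If some actions are unavailable in some states, the count is instead the product of the per-state action counts.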
(c) Let pi be the policy that takes action a in all the states, i.e.
pi = (a, a, a)
Define pi using symbols. Draw pi as an MDP. Write down the state-transition matrix and the reward vector of pi.

Figure 1: An MDP. The immediate rewards of all state-action pairs are shown within square brackets. The state-transition probability of each state-action pair is shown close to the arrows representing actions. One action is deterministic, so its probability is not shown in the figure.
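For part (c), fixing a deterministic policy picks, for each state, the row of P and the entry of R that belong to the chosen action. The sketch below shows that selection on a made-up MDP; every number, the dictionaries `P` and `R`, and the helper `induce` are hypothetical and are not read from Figure 1:

```python
# Hypothetical MDP with 3 states and 2 actions; all numbers are invented
# for illustration and are NOT taken from Figure 1.
states = ["s1", "s2", "s3"]

# P[a][i][j]: probability of moving from state i to state j under action a.
P = {
    "a1": [[0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0],
           [1.0, 0.0, 0.0]],
    "a2": [[0.5, 0.5, 0.0],
           [0.0, 0.2, 0.8],
           [0.3, 0.0, 0.7]],
}

# R[a][i]: immediate reward for taking action a in state i.
R = {
    "a1": [1.0, 0.0, 2.0],
    "a2": [0.0, 3.0, -1.0],
}

def induce(policy):
    """Return (P_pi, R_pi) for a deterministic policy given as one action per state."""
    # Row i of P_pi is row i of the matrix for the action chosen in state i.
    P_pi = [P[a][i] for i, a in enumerate(policy)]
    # Entry i of R_pi is the reward of the action chosen in state i.
    R_pi = [R[a][i] for i, a in enumerate(policy)]
    return P_pi, R_pi

# Policy that takes the same (hypothetical) action a2 in every state.
P_pi, R_pi = induce(["a2", "a2", "a2"])
for name, row in zip(states, P_pi):
    print(name, row)
print(R_pi)
```

Each row of the resulting matrix should still sum to 1, which is a quick check on a hand-written answer.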
