Question:
Consider the deterministic grid world shown below with the absorbing goal state G. The agent can start in any state and move in any of the four compass directions (i.e., up, down, left, right) from any non-absorbing state. The immediate rewards are 10 for the labeled transitions and 0 for all unlabeled transitions. Assume that γ = 0.8.

[Figure: grid world with states S1 to S9 and absorbing goal state G; three transitions are labeled with reward 10.]

(a) Calculate the v*(s) value for every state in this grid world and fill in the table below.

    s     v*(s)
    S1
    S2
    S3
    S4
    S5
    S6
    S7
    S8
    S9

(b) Calculate the q*(s, a) value for every transition and fill in the table below.

    s     q*(s, up)   q*(s, down)   q*(s, left)   q*(s, right)
    S1
    S2
    S3
    S4
    S5
    S6
    S7
    S8
    S9

(c) Show an optimal policy.

[Figure: blank copy of the grid world (states S1 to S9 and G) on which to draw the policy.]
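Parts (a) to (c) can be worked with value iteration on the Bellman optimality equations, v*(s) = max_a [r(s, a) + γ v*(s')] and q*(s, a) = r(s, a) + γ v*(s'), where s' is the deterministic successor of s under a. The following is a minimal sketch under stated assumptions, not the solution to this grid: the figure is not available in the text, so the example uses a hypothetical two-state chain (A, B, goal G) instead of S1 to S9. The names `value_iteration`, `greedy_policy`, and `toy` are illustrative only.

```python
from typing import Dict, Tuple

State = str
Action = str
# (state, action) -> (next_state, reward); deterministic transitions only.
Transitions = Dict[Tuple[State, Action], Tuple[State, float]]


def value_iteration(transitions: Transitions, gamma: float = 0.8, tol: float = 1e-9):
    """Return (v, q) for a deterministic MDP given as a transition table."""
    states = {s for s, _ in transitions} | {s2 for s2, _ in transitions.values()}
    v = {s: 0.0 for s in states}  # absorbing states keep value 0
    while True:
        delta = 0.0
        for s in states:
            backups = [r + gamma * v[s2]
                       for (s0, _), (s2, r) in transitions.items() if s0 == s]
            if not backups:  # no outgoing actions: absorbing state
                continue
            new_value = max(backups)
            delta = max(delta, abs(new_value - v[s]))
            v[s] = new_value
        if delta < tol:
            break
    q = {(s, a): r + gamma * v[s2] for (s, a), (s2, r) in transitions.items()}
    return v, q


def greedy_policy(q: Dict[Tuple[State, Action], float]) -> Dict[State, Action]:
    """Pick, for each state, an action with the largest q value."""
    policy: Dict[State, Action] = {}
    for (s, a), value in q.items():
        if s not in policy or value > q[(s, policy[s])]:
            policy[s] = a
    return policy


# Hypothetical toy chain (NOT the grid from the question): A <-> B -> G,
# with reward 10 only on the transition from B into the absorbing goal G.
toy: Transitions = {
    ("A", "left"): ("A", 0.0),
    ("A", "right"): ("B", 0.0),
    ("B", "left"): ("A", 0.0),
    ("B", "right"): ("G", 10.0),
}

v, q = value_iteration(toy, gamma=0.8)
policy = greedy_policy(q)
print(v)
print(q)
print(policy)
```

To apply the same sketch to the question, one would replace `toy` with the actual transitions read off the figure: one entry per (state, direction) pair, with reward 10 on the labeled transitions and 0 elsewhere. How moves off the edge of the grid behave is not stated in the text and would need to be decided from the original figure.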
