Question: Solve Example 3.8: Solving the Gridworld
Example 3.8: Solving the Gridworld

Suppose we solve the Bellman equation for $v_*$ for the simple grid task introduced in Example 3.5 and shown again in Figure 3.5 (left). Recall that state A is followed by a reward of +10 and transition to state A', while state B is followed by a reward of +5 and transition to state B'. Figure 3.5 (middle) shows the optimal value function, and Figure 3.5 (right) shows the corresponding optimal policies. Where there are multiple arrows in a cell, all of the corresponding actions are optimal.

Figure 3.5: Optimal solutions to the gridworld example.

Figure 3.5 (middle) gives the optimal value of the best state of the gridworld as 24.4, to one decimal place. Use your knowledge of the optimal policy and the following equation (3.8 in the book) to compute it to three decimal places (take $\gamma = 0.9$ in this case).

$$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$$
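One way to approach the computation, sketched under stated assumptions: the best state is A, and the optimal policy from A repeats a fixed cycle. The agent takes any action in A (reward +10, jump to A'), then walks straight back up to A with reward 0 on every step. The question text does not give the grid layout; the sketch below assumes, following the book's figure, that A' lies four cells directly below A, so the cycle is 5 steps long. With a discount rate of 0.9, the return in equation 3.8 then collapses to a geometric series, 10 + 10(0.9^5) + 10(0.9^10) + ..., which sums to 10 / (1 - 0.9^5). The function and constant names are illustrative and not from the book.

```python
GAMMA = 0.9         # discount rate given in the question
REWARD_AT_A = 10.0  # reward for any action taken in state A
CYCLE_LENGTH = 5    # assumption: 1 step for the A -> A' jump + 4 steps back up to A


def optimal_value_of_a(gamma=GAMMA, reward=REWARD_AT_A, cycle=CYCLE_LENGTH):
    """Closed form of the geometric series: reward / (1 - gamma**cycle)."""
    return reward / (1.0 - gamma ** cycle)


def truncated_return(gamma=GAMMA, reward=REWARD_AT_A, cycle=CYCLE_LENGTH, n_steps=1000):
    """Direct evaluation of equation 3.8, summing the first n_steps discounted rewards."""
    total = 0.0
    for k in range(n_steps):
        # The +10 reward arrives once per cycle; every other step pays 0.
        r = reward if k % cycle == 0 else 0.0
        total += (gamma ** k) * r
    return total


v_a = optimal_value_of_a()
print(f"{v_a:.3f}")  # 24.419
```

The truncated sum serves as a sanity check on the closed form: after 1000 steps the remaining discounted terms are negligible, so the two values should agree to many decimal places, and both round to 24.4 as shown in Figure 3.5 (middle).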
