Question: Solve Example 3.8: Solving the Gridworld


Example 3.8: Solving the Gridworld. Suppose we solve the Bellman equation for $v_*$ for the simple grid task introduced in Example 3.5 and shown again in Figure 3.5 (left). Recall that state A is followed by a reward of +10 and transition to state A', while state B is followed by a reward of +5 and transition to state B'. Figure 3.5 (middle) shows the optimal value function, and Figure 3.5 (right) shows the corresponding optimal policies. Where there are multiple arrows in a cell, all of the corresponding actions are optimal.

Figure 3.5: Optimal solutions to the gridworld example.

Figure 3.5 (middle) gives the optimal value of the best state of the gridworld as 24.4, to one decimal place. Use your knowledge of the optimal policy and the following equation (3.8 in the book) to compute it to three decimal places (take $\gamma = 0.9$ in this case).

$$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$$
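One way to approach the computation is sketched below. It assumes, based on the book's figure (not reproduced here), that the best state is A and that A' lies four cells directly below A. Under that assumption the optimal policy collects +10 on leaving A, then takes four zero-reward steps back up to A, so the reward sequence repeats every 5 steps and the return in equation 3.8 becomes a geometric series: $v_*(A) = 10 + \gamma^5 \cdot 10 + \gamma^{10} \cdot 10 + \cdots = 10 / (1 - \gamma^5)$. The names `GAMMA`, `CYCLE_LENGTH`, `REWARD_AT_A`, and `discounted_return` are illustrative only.

```python
GAMMA = 0.9          # discount factor given in the question
CYCLE_LENGTH = 5     # assumed: 1 jump from A to A' plus 4 steps back up to A
REWARD_AT_A = 10.0   # reward for any action taken in state A


def discounted_return(rewards, gamma):
    """Equation 3.8, truncated to a finite reward sequence."""
    return sum(gamma**k * r for k, r in enumerate(rewards))


# Closed form of the geometric series 10 * (1 + gamma^5 + gamma^10 + ...).
closed_form = REWARD_AT_A / (1 - GAMMA**CYCLE_LENGTH)

# Numerical check: +10 every CYCLE_LENGTH steps and 0 otherwise,
# truncated after 1000 steps (the remaining tail is negligible).
rewards = [REWARD_AT_A if k % CYCLE_LENGTH == 0 else 0.0 for k in range(1000)]
truncated = discounted_return(rewards, GAMMA)

print(f"{closed_form:.3f}")  # 24.419
```

With $\gamma^5 = 0.59049$, this gives $10 / 0.40951 \approx 24.419$, which agrees with the 24.4 shown in Figure 3.5 (middle).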
