Question:

Consider an (N + 1) × (N + 1) × (N + 1) cubic gridworld. Luckily, all the cells are empty: there are no walls within the cube. Each cell has one action for each face-adjacent cell (no diagonal or corner movement), as well as a stay action. Each movement action moves into the corresponding cell with probability p and stays in place with probability 1 − p. The stay action always stays. The reward is always zero except when you enter the goal cell at (N, N, N), in which case it is 1 and the game then ends. The discount is 0 < γ < 1.

a. How many iterations k of value iteration will there be before V_k(0, 0, 0) becomes nonzero? If this will never happen, write never.

b. If and when V_k(0, 0, 0) first becomes nonzero, what will its value be? If this will never happen, write never.

c. What is V^*(0, 0, 0)? If it is undefined, write undefined.
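One way to sanity-check hand-derived answers to parts a to c is to run value iteration numerically on a small cube. The sketch below is a minimal illustration, not part of the original question. The function names `value_iteration` and `first_nonzero_iteration` are hypothetical. It assumes the usual update V_{k+1}(s) = max_a Σ_{s'} T(s, a, s') [R(s, a, s') + γ V_k(s')] with V_0 = 0, that the reward of 1 is received undiscounted on entering the goal, and that the goal is terminal with value 0.

```python
import itertools


def value_iteration(N, p, gamma, num_iters):
    """Run num_iters sweeps of value iteration; return [V_0, V_1, ..., V_num_iters]."""
    goal = (N, N, N)
    states = list(itertools.product(range(N + 1), repeat=3))
    V = {s: 0.0 for s in states}  # V_0 is zero everywhere
    history = [V]
    for _ in range(num_iters):
        new_V = {}
        for s in states:
            if s == goal:
                new_V[s] = 0.0  # assumption: terminal state, no future reward
                continue
            best = gamma * V[s]  # the stay action: zero reward, remain in s
            for axis in range(3):
                for step in (-1, 1):
                    t = list(s)
                    t[axis] += step
                    if not 0 <= t[axis] <= N:
                        continue  # no cell in that direction, so no action
                    t = tuple(t)
                    reward = 1.0 if t == goal else 0.0
                    # Move succeeds with probability p, otherwise stay in s.
                    q = p * (reward + gamma * V[t]) + (1 - p) * gamma * V[s]
                    best = max(best, q)
            new_V[s] = best
        V = new_V
        history.append(V)
    return history


def first_nonzero_iteration(history, state=(0, 0, 0)):
    """Return (k, V_k(state)) for the first k with V_k(state) > 0, else None."""
    for k, V in enumerate(history):
        if V[state] > 0:
            return k, V[state]
    return None
```

For example, `first_nonzero_iteration(value_iteration(2, 0.5, 0.9, 50))` reports the first iteration at which the corner value becomes positive, and the last entry of the history approximates V^* once enough sweeps have been run. Very large N or very small p could underflow to zero in floating point, so this check is only meant for small cases.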
