Consider the 4 x 3 world shown in Figure.
a. Implement an environment simulator for this environment, such that the specific geography of the environment is easily altered. Some code for doing this is already in the online code repository.
b. Create an agent that uses policy iteration, and measure its performance in the environment simulator from various starting states. Perform several experiments from each starting state, and compare the average total reward received per run with the utility of the state, as determined by your algorithm.
c. Experiment with increasing the size of the environment. How does the runtime for policy iteration vary with the size of the environment?
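
As a starting point for parts (a) and (b), here is a minimal Python sketch, not the repository code referred to above. The figure is not reproduced here, so everything specific is an assumption: the layout (one wall, a +1 and a -1 terminal), the step reward of -0.04, the 0.8/0.1/0.1 motion model, and the discount factor 0.99. The names GridWorld, policy_iteration and run_episode are hypothetical. Policy evaluation is done by repeated sweeps rather than by solving the linear system exactly.

```python
import random

# Assumed layout encoding: '.' free cell, '#' wall, '+' / '-' terminal states.
# Editing these strings changes the geography.
GRID = [
    "...+",
    ".#.-",
    "....",
]
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
PERP = {"U": "LR", "D": "LR", "L": "UD", "R": "UD"}  # sideways slips
TERMINAL_REWARD = {"+": 1.0, "-": -1.0}


class GridWorld:
    def __init__(self, grid, step_reward=-0.04, p_intended=0.8):
        self.grid = grid
        self.step_reward = step_reward    # assumed reward in non-terminal states
        self.p_intended = p_intended      # assumed chance the move goes as intended
        self.states = [(r, c) for r, row in enumerate(grid)
                       for c, ch in enumerate(row) if ch != "#"]

    def is_terminal(self, s):
        return self.grid[s[0]][s[1]] in TERMINAL_REWARD

    def reward(self, s):
        return TERMINAL_REWARD.get(self.grid[s[0]][s[1]], self.step_reward)

    def _move(self, s, a):
        # Bumping into a wall or the boundary leaves the agent where it is.
        r, c = s[0] + MOVES[a][0], s[1] + MOVES[a][1]
        inside = 0 <= r < len(self.grid) and 0 <= c < len(self.grid[0])
        return (r, c) if inside and self.grid[r][c] != "#" else s

    def transitions(self, s, a):
        """Return [(probability, next_state)] for action a in state s."""
        slip = (1.0 - self.p_intended) / 2
        return ([(self.p_intended, self._move(s, a))]
                + [(slip, self._move(s, b)) for b in PERP[a]])

    def step(self, s, a, rng):
        probs, nexts = zip(*self.transitions(s, a))
        return rng.choices(nexts, weights=probs)[0]


def expected(env, U, s, a):
    """Expected utility of the successor state when taking a in s."""
    return sum(p * U[s2] for p, s2 in env.transitions(s, a))


def policy_iteration(env, gamma=0.99, tol=1e-9):
    policy = {s: "U" for s in env.states if not env.is_terminal(s)}
    U = {s: 0.0 for s in env.states}
    while True:
        # Policy evaluation: sweep until the utilities stop changing.
        while True:
            delta = 0.0
            for s in env.states:
                new = env.reward(s)
                if not env.is_terminal(s):
                    new += gamma * expected(env, U, s, policy[s])
                delta = max(delta, abs(new - U[s]))
                U[s] = new
            if delta < tol:
                break
        # Policy improvement: switch only on a clear gain, to avoid cycling on ties.
        stable = True
        for s in policy:
            best = max(MOVES, key=lambda a: expected(env, U, s, a))
            if expected(env, U, s, best) > expected(env, U, s, policy[s]) + 1e-6:
                policy[s] = best
                stable = False
        if stable:
            return policy, U


def run_episode(env, policy, start, gamma, rng, max_steps=10_000):
    """Simulate one run and return its discounted total reward."""
    s, total, discount = start, 0.0, 1.0
    for _ in range(max_steps):
        total += discount * env.reward(s)
        if env.is_terminal(s):
            break
        s = env.step(s, policy[s], rng)
        discount *= gamma
    return total
```

To compare with part (b), call policy_iteration(GridWorld(GRID)), then average run_episode over many runs from a chosen start state (with the same gamma) and set the average beside U[start]. For part (c), one option is to generate larger grids of '.' characters and time policy_iteration with time.perf_counter.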

  • Created February 14, 2011