(5p) Consider the DiscountGrid layout, shown below. This grid has two terminal states with positive
payoff (in the middle row), a close exit with payoff +1 and a distant exit with payoff +10. The bottom
row of the grid consists of terminal states with negative payoff; each state in this "cliff" region has
payoff -10. The starting state is the white square (1,2). We distinguish between two types of paths:
(1) paths that "risk the cliff" and travel near the bottom row of the grid; these paths are shorter but
risk earning a large negative payoff, and are represented by the red arrow in the figure below. (2) paths
that "avoid the cliff" and travel along the top edge of the grid. These paths are longer but are less
likely to incur huge negative payoffs.
In this question, you will choose settings of the discount, noise, and living reward parameters for this
MDP to produce optimal policies of several different types. Your setting of the parameter values for
each part should have the property that, if your agent followed its optimal policy without being subject
to any noise, it would exhibit the given behavior. If a particular behavior is not achieved for any setting
of the parameters, assert that the policy is impossible by returning the string 'NOT POSSIBLE'.
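For reference, the three parameters enter the optimal policy through the Bellman optimality equation. The sketch below assumes the usual gridworld noise model, in which the intended move succeeds with probability 1 - n and each perpendicular move happens with probability n/2; the problem text does not spell this out. The symbols \gamma (discount), n (noise), and r (living reward) are notation chosen here, not taken from the question.

```latex
% Bellman optimality equation for the grid MDP
V^*(s) = \max_{a} \sum_{s'} T(s,a,s')\,\bigl[\, R(s,a,s') + \gamma\, V^*(s') \,\bigr]

% Assumed noise model (n = noise):
T(s,a,s') =
\begin{cases}
1 - n        & \text{if } s' \text{ is the cell in the intended direction of } a,\\
n/2          & \text{if } s' \text{ is a cell perpendicular to } a,\\
0            & \text{otherwise.}
\end{cases}

% Rewards: r on every non-terminal move; the payoff on exit
R(s,a,s') =
\begin{cases}
r                  & \text{for a move between non-terminal states},\\
+1,\ +10,\ -10     & \text{when exiting the close exit, distant exit, or cliff.}
\end{cases}
```

Read this way, \gamma controls how strongly a later payoff is shrunk, n controls how dangerous it is to walk beside the cliff, and r controls whether the agent wants to end the episode quickly or prolong it.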
Here are the optimal policy types you should attempt to produce:
(a) Prefer the close exit (+1), risking the cliff (-10)
(b) Prefer the close exit (+1), but avoiding the cliff (-10)
(c) Prefer the distant exit (+10), risking the cliff (-10)
(d) Prefer the distant exit (+10), avoiding the cliff (-10)
(e) Avoid both exits and the cliff (so an episode should never terminate)
HINT: Consider which of the three parameters influence the desired behavior.
Solve the problem using formulas instead of code, and give a detailed explanation.
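One way to reason with formulas is to compare the discounted return of the candidate paths. The sketch below is symbolic, because the figure is not reproduced here: d is the number of moves needed to reach an exit with payoff R, d_1 and d_2 are the path lengths to the close and distant exits, and m is the number of moves made beside the cliff. All of these are assumed placeholders to be read off the grid. \gamma, n, and r denote discount, noise, and living reward, with 0 < \gamma < 1, and the noise model assumed is the usual one in which each perpendicular slip has probability n/2.

```latex
% Return of a noise-free path: d moves earning r each, then the exit payoff R
G(d,R) = r \sum_{t=0}^{d-1} \gamma^{t} + \gamma^{d} R
       = r\,\frac{1-\gamma^{d}}{1-\gamma} + \gamma^{d} R

% Close (+1, d_1 moves) versus distant (+10, d_2 moves) exit, with r = 0:
\gamma^{d_1}\cdot 1 > \gamma^{d_2}\cdot 10
\iff \gamma^{\,d_2-d_1} < \tfrac{1}{10}

% Never terminating (collecting r forever) versus exiting with payoff R:
\frac{r}{1-\gamma} > G(d,R)
\iff \frac{r\,\gamma^{d}}{1-\gamma} > \gamma^{d} R
\iff r > (1-\gamma)\,R

% Probability of not slipping into the cliff over m moves made beside it:
P(\text{no fall}) = \left(1-\tfrac{n}{2}\right)^{m}
```

Under these assumptions, a small \gamma favors the close exit and a \gamma near 1 favors the distant one; n = 0 makes the cliff path risk-free, so the shorter path wins, while a larger n pushes the policy toward the top edge; and a living reward above (1 - \gamma) times the largest payoff, here 10, makes never exiting optimal in the noise-free comparison.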