Question 3 (2 points)
Suppose that a simple environment has 3 states, which are encoded as the integers 0, 1, and 2. The environment has two actions, 0 and 1.
The dictionary below provides the transition probabilities when taking action 1 from state 0. For example, this dictionary tells us that p(s'=2 | s=0, a=1) = 0.3.
{0:0.2,1:0.5,2:0.3}
The dictionary below explains the rewards earned by the agent when transitioning to a particular state from state 0.
{0:0,1:4,2:2}
Suppose that the state-value function V(s) for a certain policy is represented by the dictionary below.
{0:18,1:13,2:15}
Use the Bellman equation and the information above to calculate Q(0,1) for the given policy. Assume a discount rate of γ = 0.9.
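One way to set up the calculation, as a minimal sketch: the Bellman equation for the action-value function can be written as Q(s,a) = sum over s' of p(s'|s,a) * [r(s') + γ * V(s')]. This form assumes the reward depends only on the state the agent lands in, which is how the reward dictionary in the question is described. The variable names and the helper q_value below are illustrative and do not come from the question.

```python
# Transition probabilities p(s' | s=0, a=1), copied from the question.
p = {0: 0.2, 1: 0.5, 2: 0.3}
# Reward earned for landing in each next state from state 0, copied from the question.
r = {0: 0, 1: 4, 2: 2}
# State values V(s) under the given policy, copied from the question.
V = {0: 18, 1: 13, 2: 15}
# Discount rate.
gamma = 0.9


def q_value(p, r, V, gamma):
    """Expected one-step return: sum of p(s') * (r(s') + gamma * V(s')).

    Assumes the reward depends only on the next state s'.
    """
    return sum(p[s_next] * (r[s_next] + gamma * V[s_next]) for s_next in p)


q_01 = q_value(p, r, V, gamma)
print(round(q_01, 2))  # 15.74
```

Term by term, this is 0.2 * (0 + 0.9 * 18) + 0.5 * (4 + 0.9 * 13) + 0.3 * (2 + 0.9 * 15) = 3.24 + 7.85 + 4.65 = 15.74.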