Figure 1.a shows a 4x4 robot navigation field. The shaded squares are obstacles, the two cells [4,2] and [4,3] are terminal states, and the values shown are the rewards of the terminal states (each cell is also a state). The reward for each of the remaining states (except the obstacles) is -0.05. In order to train a robot to navigate the field, the stochastic transition model shown in Figure 1.b is used. At any particular location, say [1,1], if the robot cannot move in a certain direction (e.g., there is a wall or an obstacle), it remains in the same position. For example, when the robot is at [1,1], it cannot move to the left because of the wall. The discount factor is 1, and the initial utility value of each state is 0.
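The setup above maps onto a standard value-iteration update, which can be sketched roughly as follows. Only the grid size, the step reward of -0.05, the discount of 1, the zero initial utilities, and the "stay in place when blocked" rule come from the question. Everything else is a placeholder assumption, because Figures 1.a and 1.b are not reproduced here: the obstacle position in `OBSTACLES`, the terminal reward values in `TERMINALS`, the reading of [a,b] as (x, y) with y increasing upward, and the 0.8/0.1/0.1 slip model. Substitute the values from the actual figures before trusting any numbers.

```python
# Minimal value-iteration sketch for a 4x4 grid world like the one described.
# ASSUMPTIONS (not given in the text, they live in the missing figures):
#   - OBSTACLES and the reward values in TERMINALS are hypothetical placeholders.
#   - The transition model is assumed to be 0.8 intended / 0.1 each perpendicular.
#   - Cells are read as (x, y), 1-indexed, with "up" increasing y.
#   - A terminal state's utility is taken to be its reward.

SIZE = 4
STEP_REWARD = -0.05   # from the question
GAMMA = 1.0           # from the question

OBSTACLES = {(2, 2)}                      # hypothetical placeholder
TERMINALS = {(4, 2): -1.0, (4, 3): 1.0}   # hypothetical reward values

ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
# Directions the robot may slip into instead of the intended one (assumed).
SLIPS = {
    "up": ("left", "right"),
    "down": ("left", "right"),
    "left": ("up", "down"),
    "right": ("up", "down"),
}
P_INTENDED, P_SLIP = 0.8, 0.1             # assumed probabilities


def move(state, action):
    """Return the cell reached by a deterministic move, or stay put if blocked."""
    dx, dy = ACTIONS[action]
    nxt = (state[0] + dx, state[1] + dy)
    in_bounds = 1 <= nxt[0] <= SIZE and 1 <= nxt[1] <= SIZE
    return nxt if in_bounds and nxt not in OBSTACLES else state


def transitions(state, action):
    """List (probability, next_state) pairs under the assumed slip model."""
    slip_a, slip_b = SLIPS[action]
    return [
        (P_INTENDED, move(state, action)),
        (P_SLIP, move(state, slip_a)),
        (P_SLIP, move(state, slip_b)),
    ]


def states():
    """All non-obstacle cells."""
    return [
        (x, y)
        for x in range(1, SIZE + 1)
        for y in range(1, SIZE + 1)
        if (x, y) not in OBSTACLES
    ]


def value_iteration_step(U):
    """One synchronous Bellman sweep: U'(s) = R(s) + gamma * max_a sum_s' P * U(s')."""
    new_U = {}
    for s in states():
        if s in TERMINALS:
            new_U[s] = TERMINALS[s]
        else:
            best = max(
                sum(p * U[s2] for p, s2 in transitions(s, a)) for a in ACTIONS
            )
            new_U[s] = STEP_REWARD + GAMMA * best
    return new_U


U0 = {s: 0.0 for s in states()}   # initial utilities are 0, per the question
U1 = value_iteration_step(U0)
U2 = value_iteration_step(U1)
```

Under these placeholder values, the first sweep simply sets every non-terminal state to -0.05, and differences between states only start to appear from the second sweep onward, beginning with the cells adjacent to the terminal states.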
