Question: Figure 1a shows a 4 x 4 robot navigation field. The shaded squares are obstacles, and the two cells (4, 2) and [...] are terminal states; the values shown are the rewards of the terminal states. Each cell is also a state. The reward for each of the remaining states, except the obstacles, is [...]. To train a robot to navigate the field, the stochastic transition model shown in Figure 1b is used. At any particular location, say [...], if the robot cannot move in a certain direction (e.g., because there is a wall or an obstacle), it remains in the same position. For example, when the robot is at [...], it cannot move to the left because of the wall. The discount factor and the initial utility values of each state are [...]
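The setup described in the question is a grid-world Markov decision process, which is typically solved with repeated Bellman updates (value iteration). The following is only a minimal sketch of one such update, not the expert solution. The question text does not preserve the obstacle positions, the second terminal cell, the reward values, the discount factor, or the transition probabilities of Figure 1b, so every number below is a hypothetical placeholder. The sketch also assumes (x, y) coordinates starting at 1 and a transition model in which the robot moves in the intended direction with probability `P_INTENDED` and slips to each perpendicular direction with the remaining probability split equally.

```python
# Sketch of one Bellman update on a grid world.
# ASSUMPTION: all numeric values and obstacle/terminal layouts below are
# placeholders; the question text does not preserve the real ones.

MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
# ASSUMPTION: the robot may slip to either perpendicular direction.
SLIPS = {"up": ("left", "right"), "down": ("left", "right"),
         "left": ("up", "down"), "right": ("up", "down")}


def step(cell, direction, size, obstacles):
    """Return the cell reached by one move; stay put if a wall or obstacle blocks it."""
    dx, dy = MOVES[direction]
    nxt = (cell[0] + dx, cell[1] + dy)
    inside = 1 <= nxt[0] <= size and 1 <= nxt[1] <= size
    return nxt if inside and nxt not in obstacles else cell


def transitions(cell, action, size, obstacles, p_intended):
    """Yield (probability, next_cell) pairs for taking `action` in `cell`."""
    p_slip = (1.0 - p_intended) / 2.0
    yield p_intended, step(cell, action, size, obstacles)
    for side in SLIPS[action]:
        yield p_slip, step(cell, side, size, obstacles)


def bellman_update(U, size, obstacles, terminals, step_reward, gamma, p_intended):
    """One synchronous sweep: U'(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) U(s')."""
    new_U = {}
    for x in range(1, size + 1):
        for y in range(1, size + 1):
            cell = (x, y)
            if cell in obstacles:
                continue  # obstacles are not states
            if cell in terminals:
                new_U[cell] = terminals[cell]  # terminal utility is its reward
                continue
            best = max(
                sum(p * U[nxt]
                    for p, nxt in transitions(cell, a, size, obstacles, p_intended))
                for a in MOVES
            )
            new_U[cell] = step_reward + gamma * best
    return new_U


# Hypothetical problem instance (only the 4 x 4 size and the terminal cell
# position (4, 2) come from the question; everything else is made up).
SIZE = 4
OBSTACLES = {(2, 2), (3, 3)}
TERMINALS = {(4, 2): 1.0, (4, 3): -1.0}
STEP_REWARD = -0.04
GAMMA = 0.9
P_INTENDED = 0.8

# Initial utilities: zero for every non-obstacle cell (also an assumption).
U0 = {(x, y): 0.0
      for x in range(1, SIZE + 1) for y in range(1, SIZE + 1)
      if (x, y) not in OBSTACLES}
U1 = bellman_update(U0, SIZE, OBSTACLES, TERMINALS, STEP_REWARD, GAMMA, P_INTENDED)
```

Calling `bellman_update` repeatedly, feeding each result back in as `U`, gives the successive utility estimates. The `step` helper encodes the rule from the question that a blocked move leaves the robot in the same position, for example `step((1, 1), "left", SIZE, OBSTACLES)` returns `(1, 1)`.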
