Question 4 [4 pts]: Figure 3.a shows a 3 x 4 robot navigation field. The shaded squares are obstacles, and the three cells [...] are terminal states; the values shown are the rewards of the terminal states (each cell is also a state). The reward for each of the remaining states (excluding the obstacles and terminal states) is [...]. To train a robot to navigate the field, the stochastic transition model shown in Figure 3.b is used. At any location, if the robot cannot move in a certain direction (e.g., there is a wall or an obstacle), it remains in the same position. For example, when the robot is at [...], it cannot move to the left because of the wall. The discount factor and the initial utility value of each state are [...].
Figure 3.b (stochastic transition model)
Use the value iteration algorithm to find the utility values for cells [...] and [...], respectively, after the FIRST iteration (exclude terminal states and obstacles). Solutions must show calculations; there is no need to calculate values for other cells. [... pts]
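Since the figure values did not survive in the question text, the calculation itself cannot be reproduced here. As a rough sketch of what one value-iteration update looks like on such a grid, the Python below applies the Bellman update U'(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) U(s') to a single cell. Every number in it is an assumption made for illustration, not the question's data: the obstacle position, the terminal cells and their rewards, the step reward, the discount factor, and the 0.8/0.1/0.1 transition model (intended direction versus the two perpendicular directions). The names `move`, `expected_utility`, and `bellman_update` are hypothetical helpers.

```python
# All values below are ASSUMED for illustration; substitute the figure's actual numbers.
ROWS, COLS = 3, 4                       # cells are (row, col), 1-indexed, row 1 at the bottom
OBSTACLES = {(2, 2)}                    # assumed obstacle cell
TERMINALS = {(3, 4): 1.0, (2, 4): -1.0, (1, 4): -1.0}  # assumed terminal rewards
STEP_REWARD = -0.04                     # assumed reward of non-terminal states
GAMMA = 0.9                             # assumed discount factor
P_INTENDED, P_SIDE = 0.8, 0.1           # assumed stochastic transition model

MOVES = {"up": (1, 0), "down": (-1, 0), "left": (0, -1), "right": (0, 1)}
PERPENDICULAR = {"up": ("left", "right"), "down": ("left", "right"),
                 "left": ("up", "down"), "right": ("up", "down")}


def move(state, direction):
    """Return the cell reached, or the same cell if a wall or obstacle blocks the move."""
    row, col = state
    d_row, d_col = MOVES[direction]
    nxt = (row + d_row, col + d_col)
    in_bounds = 1 <= nxt[0] <= ROWS and 1 <= nxt[1] <= COLS
    if not in_bounds or nxt in OBSTACLES:
        return state
    return nxt


def expected_utility(state, action, utilities):
    """Expected utility of taking `action` in `state` under the assumed transition model."""
    side_a, side_b = PERPENDICULAR[action]
    return (P_INTENDED * utilities[move(state, action)]
            + P_SIDE * utilities[move(state, side_a)]
            + P_SIDE * utilities[move(state, side_b)])


def bellman_update(state, utilities):
    """One value-iteration update for a non-terminal state."""
    best = max(expected_utility(state, action, utilities) for action in MOVES)
    return STEP_REWARD + GAMMA * best


# Initial utilities: 0 for ordinary cells; terminals held at their rewards (an assumption).
U0 = {(r, c): 0.0
      for r in range(1, ROWS + 1) for c in range(1, COLS + 1)
      if (r, c) not in OBSTACLES}
U0.update(TERMINALS)

print(round(bellman_update((3, 3), U0), 2))  # 0.68 under these assumed numbers
```

With these assumed values, the best action at (3, 3) is "right": 0.8 * 1 + 0.1 * 0 + 0.1 * 0 = 0.8, so the updated utility is -0.04 + 0.9 * 0.8 = 0.68. The same hand calculation, with the figure's real rewards, discount, and transition probabilities, is what the question asks to be shown for each requested cell.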
