Question: In this code, the robot explores the whole maze with epsilon-greedy. Then it finds the shortest path according to the Q values. Change the shortest-path search to use flood fill:

clear all;
clc;
% Maze layout: 0 = empty, 1 = obstacle, 2 = goal
% (the original cell values were lost in extraction; this is an example layout)
maze = [0 0 0 0 0;
        0 1 1 1 0;
        0 0 0 1 0;
        1 1 0 1 0;
        0 0 0 0 0;
        0 1 1 1 0;
        0 0 0 0 2];
% Parameters (example values; the originals were lost in extraction)
alpha = 0.1;      % Learning rate
gamma = 0.9;      % Discount factor
epsilon = 0.1;    % Exploration rate
episodes = 1000;  % Number of training episodes
% Learn the Q values
Q = learn_q_values(maze, alpha, gamma, epsilon, episodes);
% Simulate the shortest path after training
simulate_shortest_path(Q, maze);
% Function to learn Q values
function Q = learn_q_values(maze, alpha, gamma, epsilon, episodes)
    [rows, cols] = size(maze);       % Size of the Q-table to create
    Q = rand(rows, cols, 4) * 0.01;  % Initialize Q values for each cell and each of the four actions with low random values
    % Learning process
    for episode = 1:episodes         % Run the specified number of episodes
        % Set the initial position
        y = 1;
        x = 1;
        done = false;
        while ~done                  % Continue until the goal is reached
            % Epsilon-greedy action selection
            if rand < epsilon
                action = randi(4);   % Select a random action
            else
                [~, action] = max(Q(y, x, :));
            end
            % Apply the action to get the new state
            [newY, newX, reward, done] = my_step(y, x, action, maze);
            % Update the Q-table
            old_value = Q(y, x, action);
            next_max = max(Q(newY, newX, :)); % Maximum Q value over all actions in the new state
            Q(y, x, action) = old_value + alpha * (reward + gamma * next_max - old_value);
            % Update the current position along with the Q-table
            y = newY;
            x = newX;
        end
        % Display the Q values obtained at the end of each episode
        disp(['Episode ' num2str(episode) ' Q values:']);
        disp(Q);
    end
end
% Function to simulate the shortest path
function simulate_shortest_path(Q, maze)
    % Start at the initial position
    y = 1;
    x = 1;
    % Move through the maze using the learned Q values
    done = false;
    disp('Starting simulation of the shortest path');
    % Create a figure for visualization
    figure;
    while ~done
        disp_maze(y, x, maze);                               % Show the current state of the maze
        [~, action] = max(Q(y, x, :));                       % Choose the action with the maximum Q value
        [newY, newX, ~, done] = my_step(y, x, action, maze); % Apply the action and get the new state
        % Update the position
        y = newY;
        x = newX;
        pause(0.5);                                          % Wait for visualization purposes
    end
    % Show the final state of the maze
    disp_maze(y, x, maze);
    if done
        disp('Goal reached!');
    end
end
% Function to display the current state of the maze
function disp_maze(y, x, maze)
    vis_maze = maze;     % Create a copy of the maze for visualization
    vis_maze(y, x) = 3;  % Use a distinct value to represent the robot
    % Display the maze using imagesc and a color map
    imagesc(vis_maze);
    colormap([1 1 1; 0 0 0; 0 1 0; 1 0 0]); % White = empty, black = obstacle, green = goal, red = robot
    axis off;
    pause(0.1);          % Wait for visualization purposes
end
% Function to simulate a step
function [newY, newX, reward, done] = my_step(y, x, action, maze)
    % Move according to the chosen action
    switch action
        case 1 % Up
            newY = y - 1;
            newX = x;
        case 2 % Down
            newY = y + 1;
            newX = x;
        case 3 % Left
            newY = y;
            newX = x - 1;
        case 4 % Right
            newY = y;
            newX = x + 1;
    end
    % Keep the position within the bounds of the maze
    newY = max(1, min(size(maze, 1), newY));
    newX = max(1, min(size(maze, 2), newX));
    % Set reward and done (reward values are examples; the originals were lost)
    if maze(newY, newX) == 1
        reward = -10;  % Punish the agent for hitting an obstacle
        done = false;
    elseif maze(newY, newX) == 2
        reward = 100;  % Goal reached
        done = true;
    else
        reward = -1;   % Small step cost on empty cells
        done = false;
    end
end
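The question asks to replace the Q-value walk with flood fill. A minimal sketch of the flood-fill idea, shown in Python for brevity (the function names `flood_fill_distances` and `shortest_path` are illustrative, not from the original code): a breadth-first search from the goal assigns every reachable free cell its step distance to the goal, and the shortest path is then read off from the start by repeatedly moving to the neighbor with the smallest distance.

```python
from collections import deque

def flood_fill_distances(maze, goal):
    """BFS from the goal: label each free cell with its distance to the goal."""
    rows, cols = len(maze), len(maze[0])
    dist = [[None] * cols for _ in range(rows)]
    gy, gx = goal
    dist[gy][gx] = 0
    queue = deque([goal])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            # Expand into in-bounds, non-obstacle, not-yet-labeled cells
            if (0 <= ny < rows and 0 <= nx < cols
                    and maze[ny][nx] != 1 and dist[ny][nx] is None):
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist

def shortest_path(maze, start, goal):
    """Walk downhill on the distance field from start to goal."""
    dist = flood_fill_distances(maze, goal)
    if dist[start[0]][start[1]] is None:
        return None  # goal unreachable from start
    path = [start]
    y, x = start
    while (y, x) != goal:
        # Step to the labeled neighbor with the smallest distance;
        # on a BFS field this always decreases the distance by 1
        y, x = min(
            ((y + dy, x + dx)
             for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
             if 0 <= y + dy < len(maze) and 0 <= x + dx < len(maze[0])
             and dist[y + dy][x + dx] is not None),
            key=lambda c: dist[c[0]][c[1]],
        )
        path.append((y, x))
    return path
```

In the MATLAB program above, this routine would take the place of `simulate_shortest_path`: instead of following the argmax of the learned Q values, the robot follows the decreasing distance labels, which guarantees a shortest path regardless of how well the Q values converged.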