Question: In this code, the robot explores the whole maze with "epsilon-greedy", then finds the shortest path according to the Q-values. Change it so that the shortest path is found with "Flood fill" instead.

clear all;
clc;
% 0 = empty, 1 = obstacle, 2 = goal
maze = [0 0 0 1 2 0;
        0 1 0 1 0 0;
        0 0 1 1 0 1;
        0 0 0 0 0 0;
        0 1 0 0 0 0;
        0 0 0 1 0 0;
        0 0 1 1 0 0;
        0 0 1 1 0 0];
% Parameters
alpha = 0.01;    % Learning rate parameter
gamma = 0.9;     % Discount factor
epsilon = 0.5;   % Exploration rate
episodes = 5000; % Iteration number
% Function to learn Q values
Q = learn_q_values(maze, alpha, gamma, epsilon, episodes);
% Function to simulate shortest path after training
simulate_shortest_path(Q, maze);
% Function to learn Q values
function Q = learn_q_values(maze, alpha, gamma, epsilon, episodes)
    [rows, cols] = size(maze); % Determine the size of the Q table to be created
    Q = rand(rows, cols, 4) * 0.01; % Initialize Q values for each cell of the maze for four possible actions with random low values
    % Learning process
    for episode = 1:episodes % Perform iterations for the specified number of episodes
        % Set initial position
        y = 1;
        x = 1;
        done = false;
        while ~done % Continue until reaching the goal
            % Epsilon-greedy approach
            if rand < epsilon
                action = randi([1 4]); % Select a random action
            else
                [~, action] = max(Q(y, x, :));
            end
            % Apply action to get the new state
            [newY, newX, reward, done] = my_step(y, x, action, maze);
            % Update Q-table
            old_value = Q(y, x, action);
            next_max = max(Q(newY, newX, :)); % Find the maximum Q value in the Q-table for all actions in the new state
            Q(y, x, action) = old_value + alpha * (reward + gamma * next_max - old_value); % Update Q-table based on old value, alpha, reward, gamma, and the maximum Q value in the future state
            % Update current position along with updating Q-table
            y = newY;
            x = newX;
        end
        % Display Q-values obtained at the end of each episode
        disp(['At episode ' num2str(episode) ' Q-values:']);
        disp(Q);
    end
end
% Function to simulate the shortest path
function simulate_shortest_path(Q, maze)
    % Start at the initial position
    y = 1;
    x = 1;
    % Move inside the maze using learned Q-values
    done = false;
    disp('Starting simulation of the shortest path');
    % Create a figure for visualization
    figure;
    while ~done
        disp_maze(y, x, maze); % Show the current state of the maze
        [~, action] = max(Q(y, x, :)); % Choose action with maximum Q-value
        [newY, newX, ~, done] = my_step(y, x, action, maze); % Apply action and get the new state
        % Update position
        y = newY;
        x = newX;
        % Wait for visualization purposes
        pause(0.5);
    end
    % Show the final state of the maze
    disp_maze(y, x, maze);
    if done
        disp('Goal reached!');
    end
end
% Function to display the current state of the maze
function disp_maze(y, x, maze)
    vis_maze = maze; % Create a copy of the maze for visualization
    % Mark the current position of the robot
    vis_maze(y, x) = 3; % Use a different value to represent the robot
    % Display the maze using imagesc and color map
    imagesc(vis_maze, [0 3]); % Fix the color limits so each cell value maps to its own color
    colormap([1 1 1; 0 0 0; 0 1 0; 1 0 0]); % White = empty space, Black = obstacles, Green = goal, Red = robot
    axis off;
    % Wait for visualization purposes
    pause(0.1);
end
% Function to simulate a step
function [newY, newX, reward, done] = my_step(y, x, action, maze)
    % Move one cell in the direction selected by action
    switch action
        case 1 % Up
            newY = y - 1;
            newX = x;
        case 2 % Down
            newY = y + 1;
            newX = x;
        case 3 % Left
            newY = y;
            newX = x - 1;
        case 4 % Right
            newY = y;
            newX = x + 1;
    end
    % Keep the position within the bounds of the maze
    newY = max(1, min(size(maze, 1), newY)); % New Y position lies within the row bounds of the maze
    newX = max(1, min(size(maze, 2), newX)); % New X position lies within the column bounds of the maze
    % Update reward and done according to the logic
    if maze(newY, newX) == 1
        reward = -1; % Punish the agent if it encounters an obstacle (the move is still made)
        done = false;
    elseif maze(newY, newX) == 2
        reward = 1; % Goal reached
        done = true;
    else
        reward = 0; % Empty space
        done = false;
    end
end
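
The following is a minimal sketch of how flood fill could take the place of the Q-value lookup when searching for the shortest path. It is not part of the original code: the function names flood_fill and flood_fill_path are illustrative, and it assumes that obstacle cells (value 1) are impassable, which differs from my_step, where the robot may step onto an obstacle and is only penalized. The fill starts at the goal and labels every reachable cell with its step distance; the path is then read off by repeatedly moving to a neighbor whose distance is one lower.

```matlab
% Flood fill (breadth-first wavefront) from the goal cell.
% ASSUMPTION: cells with value 1 are walls and cannot be entered.
function dist = flood_fill(maze)
    [rows, cols] = size(maze);
    dist = inf(rows, cols);            % Inf marks cells not reached (yet)
    [gy, gx] = find(maze == 2, 1);     % Location of the first goal cell
    dist(gy, gx) = 0;
    queue = [gy, gx];                  % FIFO queue of cells to expand
    moves = [-1 0; 1 0; 0 -1; 0 1];    % Up, Down, Left, Right (same order as my_step)
    head = 1;                          % Index of the next queue entry to expand
    while head <= size(queue, 1)
        y = queue(head, 1);
        x = queue(head, 2);
        head = head + 1;
        for k = 1:4
            ny = y + moves(k, 1);
            nx = x + moves(k, 2);
            inside = ny >= 1 && ny <= rows && nx >= 1 && nx <= cols;
            % Label each free, unvisited neighbor with distance + 1
            if inside && maze(ny, nx) ~= 1 && isinf(dist(ny, nx))
                dist(ny, nx) = dist(y, x) + 1;
                queue(end + 1, :) = [ny, nx]; %#ok<AGROW>
            end
        end
    end
end

% Follow strictly decreasing distances from (y, x) down to the goal.
% Returns one [row, col] pair per visited cell, start and goal included.
function path = flood_fill_path(dist, y, x)
    [rows, cols] = size(dist);
    path = [y, x];
    if isinf(dist(y, x))
        return; % The goal cannot be reached from this start cell
    end
    moves = [-1 0; 1 0; 0 -1; 0 1];
    while dist(y, x) > 0
        for k = 1:4
            ny = y + moves(k, 1);
            nx = x + moves(k, 2);
            inside = ny >= 1 && ny <= rows && nx >= 1 && nx <= cols;
            if inside && dist(ny, nx) == dist(y, x) - 1
                y = ny;
                x = nx;
                path(end + 1, :) = [y, x]; %#ok<AGROW>
                break;
            end
        end
    end
end
```

One way to use the sketch is to call dist = flood_fill(maze) and path = flood_fill_path(dist, 1, 1) in place of simulate_shortest_path(Q, maze), then call disp_maze(path(k, 1), path(k, 2), maze) for each row k of path to animate the route. Because every move costs one step, the breadth-first fill yields a shortest path under the impassable-obstacle assumption, with no training episodes required.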
