Maze Navigation Reinforcement Learning Java, C++, C, Python
For the maze, assume the bottom-left cell is at (1, 1) and the format for the coordinates is (, )
b. 10 X 10 world with no obstacles and reward +1 at (5, 5)
Write a program that prompts the user via a start menu to first select the RL algorithm: (1) Direct utility estimation, (2) Adaptive Dynamic Programming, (3) Temporal Difference. Your program should select an appropriate number of trials or epochs to learn the utilities and/or model. When the algorithm finishes, your program should again prompt the user to input a start state (two integer coordinates separated by a space), checking that the input is a valid state inside the environment and not an obstacle. From this start state, your agent should navigate until it reaches a terminal state (correct operation should reach the +1 terminal state). Your program should then print out the coordinates of the states the agent navigated through until it reached the terminal state.
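As a rough illustration of option (3), here is a minimal Python sketch of passive Temporal Difference learning on the 10 x 10 world, followed by a greedy walk over the learned utilities. Several things here are assumptions and not part of the problem statement: the first coordinate is treated as the horizontal one, moves are deterministic, non-terminal states carry a reward of -0.04, the discount factor is 1, the agent explores with a uniformly random policy, and the names td_learn and navigate are made up for this example.

```python
import random

SIZE = 10                 # 10 x 10 world, cells (1, 1) .. (10, 10)
GOAL = (5, 5)             # terminal state with reward +1
STEP_REWARD = -0.04       # ASSUMPTION: living cost, not given in the problem
GAMMA = 1.0               # ASSUMPTION: discount factor
ACTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]  # up, down, left, right


def is_valid(state):
    """True if the state lies inside the grid (this world has no obstacles)."""
    x, y = state
    return 1 <= x <= SIZE and 1 <= y <= SIZE


def step(state, action):
    """ASSUMPTION: deterministic move; bumping into a wall leaves the agent in place."""
    nxt = (state[0] + action[0], state[1] + action[1])
    return nxt if is_valid(nxt) else state


def reward(state):
    return 1.0 if state == GOAL else STEP_REWARD


def td_learn(trials=2000, max_steps=1000, seed=0):
    """Passive TD(0): estimate utilities while following a uniformly random policy."""
    rng = random.Random(seed)
    utilities = {(x, y): 0.0 for x in range(1, SIZE + 1) for y in range(1, SIZE + 1)}
    visits = dict.fromkeys(utilities, 0)
    utilities[GOAL] = 1.0  # the utility of the terminal state is its reward
    for _trial in range(trials):
        state = (rng.randint(1, SIZE), rng.randint(1, SIZE))
        for _step in range(max_steps):
            if state == GOAL:
                break
            nxt = step(state, rng.choice(ACTIONS))
            visits[state] += 1
            alpha = 60.0 / (59.0 + visits[state])  # learning rate decays with visits
            target = reward(state) + GAMMA * utilities[nxt]
            utilities[state] += alpha * (target - utilities[state])
            state = nxt
    return utilities


def navigate(utilities, start, max_steps=200):
    """Greedy walk: move to the neighbouring cell with the highest utility.

    Stops at the terminal state, or after max_steps moves if the learned
    utilities are too noisy to lead there.
    """
    path = [start]
    state = start
    while state != GOAL and len(path) <= max_steps:
        neighbours = [n for a in ACTIONS if (n := step(state, a)) != state]
        state = max(neighbours, key=utilities.get)
        path.append(state)
    return path
```

Calling navigate(td_learn(), (1, 1)) returns the list of visited states. With enough trials the walk should end at (5, 5), but because the utilities are only estimates, the max_steps cap is there as a guard against the agent oscillating between cells.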
Remember to return the program to the start menu after each run, and add an exit option to the start menu, so that the program can be tested multiple times.
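The menu flow described above (choose an algorithm, learn, ask for a start state, print the path, return to the menu) could be structured along the following lines. This is only a sketch: run_menu, parse_state, and the learn and navigate callables are hypothetical names standing in for whatever the actual solution provides, the exit option is assumed to be number 4, and the coordinates are assumed to be entered as "x y".

```python
SIZE = 10  # grid cells run from (1, 1) to (10, 10)

ALGORITHMS = {
    "1": "Direct utility estimation",
    "2": "Adaptive Dynamic Programming",
    "3": "Temporal Difference",
}


def parse_state(text):
    """Return (x, y) if text holds two in-range integers, otherwise None."""
    parts = text.split()
    if len(parts) != 2:
        return None
    try:
        x, y = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    # This world has no obstacles, so the bounds check is the only validity test.
    return (x, y) if 1 <= x <= SIZE and 1 <= y <= SIZE else None


def run_menu(learn, navigate, read=input, write=print):
    """Show the start menu repeatedly until the user picks the exit option.

    learn(choice) and navigate(model, start) are supplied by the caller;
    they are placeholders (hypothetical) for the real learning and
    navigation code.
    """
    while True:
        write("Select an algorithm:")
        for key, name in ALGORITHMS.items():
            write(f"  ({key}) {name}")
        write("  (4) Exit")
        choice = read("> ").strip()
        if choice == "4":
            return
        if choice not in ALGORITHMS:
            write("Invalid choice, try again.")
            continue
        model = learn(choice)  # learning runs before the start state is requested
        start = None
        while start is None:
            start = parse_state(read("Start state (x y): "))
            if start is None:
                write("Invalid state, try again.")
        write(" -> ".join(str(s) for s in navigate(model, start)))
```

Passing read and write as parameters keeps the loop testable without a terminal; in normal use they default to input and print.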
