Question:

We have learned several learning algorithms (e.g., Q-learning, Monte Carlo, dynamic
programming, double Q-learning, TD, SARSA, and others).
You are free to pick any one algorithm and implement it on a grid-world goal-searching
problem.
Choose the algorithm you are going to implement and provide your complete pseudocode.
Design your own grid-world example (it should be larger than 3×3) with obstacles.
Show your goal-searching process with a steps-to-go curve, the sum of squared errors, and/or
a theoretical value table.
Please submit the report and code.
Please include the following five sections.
Introduction and Background (aims/motivation, review/research)
Project Specification (goals/objective, problem design, and expected solution)
Implementation (evaluation, such as case studies)
Summary (conclusions)
Please include your pseudocode, problem statement, input sequence, and output in the report.
Please give your derived (theoretical) solution of V table or Q table for your problem.
Visualizing the results as graphs or tables in the report is suggested.
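
As a starting point, here is a minimal sketch of one way to approach the task, assuming tabular Q-learning is the chosen algorithm. Everything specific in it is an assumption made for illustration and not part of the question: the 5×5 layout, the reward of -1 per step, the discount factor 0.9, the learning rate, the exploration rate, and all function names. Value iteration (dynamic programming) supplies the theoretical V table, and the sum of squared errors compares the learned greedy values against it.

```python
import random

# Hypothetical 5x5 layout (an assumption): S = start, G = goal, # = obstacle.
GRID = [
    "S..#.",
    ".#...",
    ".#.#.",
    "...#.",
    ".#..G",
]
ROWS, COLS = len(GRID), len(GRID[0])
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, GOAL = (0, 0), (4, 4)
GAMMA = 0.9  # assumed discount factor
STATES = [(r, c) for r in range(ROWS) for c in range(COLS) if GRID[r][c] != "#"]


def step(state, action):
    """Deterministic move; bumping into a wall or obstacle leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < ROWS and 0 <= c < COLS) or GRID[r][c] == "#":
        r, c = state
    nxt = (r, c)
    return nxt, -1.0, nxt == GOAL  # assumed reward: -1 for every step taken


def value_iteration(tol=1e-10):
    """Theoretical V table from repeated Bellman optimality backups."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == GOAL:
                continue  # terminal state keeps value 0
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + (0.0 if done else GAMMA * V[n]) for n, r, done in outcomes)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def q_learning(episodes=2000, alpha=0.5, epsilon=0.1, max_steps=500, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.

    Returns the Q table and the number of steps taken in each episode.
    """
    rng = random.Random(seed)
    Q = {s: [0.0] * len(ACTIONS) for s in STATES}
    steps_per_episode = []
    for _ in range(episodes):
        s, steps, done = START, 0, False
        while not done and steps < max_steps:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = Q[s].index(max(Q[s]))        # exploit (first maximum on ties)
            n, r, done = step(s, ACTIONS[a])
            target = r if done else r + GAMMA * max(Q[n])
            Q[s][a] += alpha * (target - Q[s][a])
            s, steps = n, steps + 1
        steps_per_episode.append(steps)
    return Q, steps_per_episode


def greedy_path_length(Q, max_steps=100):
    """Steps needed to reach the goal when always taking the greedy action."""
    s, steps, done = START, 0, False
    while not done and steps < max_steps:
        s, _, done = step(s, ACTIONS[Q[s].index(max(Q[s]))])
        steps += 1
    return steps


def sum_squared_error(Q, V):
    """Sum over states of (max_a Q(s, a) - V(s))^2."""
    return sum((max(Q[s]) - V[s]) ** 2 for s in STATES)


if __name__ == "__main__":
    V = value_iteration()
    Q, steps = q_learning()
    print("theoretical V(start):", V[START])
    print("greedy path length:", greedy_path_length(Q))
    print("sum of squared errors:", sum_squared_error(Q, V))
```

The `steps` list returned by `q_learning` holds one value per episode, so plotting it against the episode index gives the steps-to-go curve. Calling `sum_squared_error` at intervals during training (rather than only at the end, as here) would give an error curve as well.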