Question: Task 1: Complete the get_next_state(current_state_pos, action, grid_size) function to return the next state's grid position (row, column) based on the given current_state_pos and action.
Complete the Tasks to update the row and/or column value as needed.
Task: Complete the q_learning function by implementing the Q-learning algorithm following the epsilon-greedy policy. It should return the final Q-table as qtable.
To help you, partial code has been given. Complete the code for the Tasks. Note: do not change the function headers; your solution must not need any additional inputs.
Note: Do not change any variable names, function names, or function input/output variable names in the pre-written code.
# Global Parameters. Do not change these parameter names
STUDENT_ID = ...  # used to set the start and goal positions
GRID_SIZE = ...   #
ACTIONS = ...     # DO NOT CHANGE
EPISODES = ...    # CHANGE to an appropriate number to ensure the agent learns to find the optimal path and that the Q-table converges
# Do not change the number-of-episodes parameter/variable anywhere else in the code
ALPHA = ...       # DO NOT CHANGE
EPSILON = ...     # DO NOT CHANGE
GAMMA = ...       # DO NOT CHANGE
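The skeleton below refers to an initialize_q_r_tables helper that is not shown in this excerpt. As a hedged sketch only (the function name comes from the comments below, but its body, the reward scheme, and the goal_reward parameter are assumptions, not the course's code), one common layout is a flattened grid_size² x 4 Q-table of zeros and a reward table that pays out only on transitions entering the goal cell:

```python
import numpy as np

def initialize_q_r_tables(grid_size, goal_pos, goal_reward=100):
    # One row per grid cell (flattened row-major), one column per action.
    # The reward scheme and goal_reward value here are assumptions.
    num_states = grid_size * grid_size
    num_actions = 4
    qtable = np.zeros((num_states, num_actions))
    rtable = np.zeros((num_states, num_actions))
    goal_index = goal_pos[0] * grid_size + goal_pos[1]
    # Reward every (state, action) pair whose move lands on the goal cell.
    for state in range(num_states):
        row, col = divmod(state, grid_size)
        for action, (dr, dc) in enumerate([(-1, 0), (1, 0), (0, -1), (0, 1)]):
            nr, nc = row + dr, col + dc
            if 0 <= nr < grid_size and 0 <= nc < grid_size:
                if nr * grid_size + nc == goal_index:
                    rtable[state, action] = goal_reward
    return qtable, rtable
```

The row-major flattening (index = row * grid_size + column) matches the state-index arithmetic used later in the q_learning skeleton.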
# TASK 1: Complete the function to get the next state based on the given action
def get_next_state(current_state_pos, action, grid_size):  # DO NOT CHANGE THIS LINE
    row, column = current_state_pos  # DO NOT CHANGE THIS LINE
    if action == ... and row > 0:  # Move up
        # Task: update row and/or column as needed
        # YOUR CODE HERE
    elif action == ... and row < grid_size - 1:  # Move down
        # Task: update row and/or column as needed
        # YOUR CODE HERE
    elif action == ... and column > 0:  # Move left
        # Task: update row and/or column as needed
        # YOUR CODE HERE
    elif action == ... and column < grid_size - 1:  # Move right
        # Task: update row and/or column as needed
        # YOUR CODE HERE
    return row, column  # DO NOT CHANGE THIS LINE
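One possible completion of the function above, shown as a sketch. The numeric action codes did not survive extraction from the handout, so the encoding 0=up, 1=down, 2=left, 3=right is an assumption; the boundary checks follow from the grid geometry:

```python
def get_next_state(current_state_pos, action, grid_size):
    # Action encoding 0=up, 1=down, 2=left, 3=right is assumed;
    # the handout's actual codes were stripped during extraction.
    row, column = current_state_pos
    if action == 0 and row > 0:                   # Move up
        row -= 1
    elif action == 1 and row < grid_size - 1:     # Move down
        row += 1
    elif action == 2 and column > 0:              # Move left
        column -= 1
    elif action == 3 and column < grid_size - 1:  # Move right
        column += 1
    return row, column  # position is unchanged if the move would leave the grid
```

Leaving the position unchanged on a blocked move keeps the agent on the grid without needing a separate "invalid action" branch.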
# TASK
# Complete the get_action function
# This function will be called from the q_learning function (see below)
# Inputs:
#   qtable, epsilon, current_state_index
# Outputs:
#   action: chosen using the epsilon-greedy decision-making policy; one of the valid action values
#
def get_action(qtable, epsilon, current_state_index):
    # Task: Choose an action using the epsilon-greedy policy
    # YOUR CODE HERE
    return action
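A minimal sketch of the epsilon-greedy policy for this function, assuming qtable is a NumPy array with one column per action. Breaking ties by taking the first maximum (argmax) is a common convention, not something the handout mandates:

```python
import random
import numpy as np

def get_action(qtable, epsilon, current_state_index):
    # Epsilon-greedy: with probability epsilon explore (uniform random action),
    # otherwise exploit the current Q estimates for this state.
    if random.random() < epsilon:
        action = random.randrange(qtable.shape[1])  # explore
    else:
        action = int(np.argmax(qtable[current_state_index]))  # exploit
    return action
```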
# TASK
# Complete the update_qtable function
# This function will be called from the q_learning function
# Inputs:
#   qtable, rtable, current_state_index, action, next_state_index, alpha, gamma
# Outputs:
#   qtable: with updated Q-values
def update_qtable(qtable, rtable, current_state_index, action, next_state_index, alpha, gamma):
    # Task: Update the qtable using the Q-learning equations taught in class
    # YOUR CODE HERE
    return qtable
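A sketch of the standard tabular Q-learning update for this function, assuming qtable and rtable are NumPy arrays indexed as [state, action]. This is the textbook off-policy rule; whether the course's reward lookup uses rtable[state, action] exactly this way is an assumption:

```python
import numpy as np

def update_qtable(qtable, rtable, current_state_index, action,
                  next_state_index, alpha, gamma):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    reward = rtable[current_state_index, action]
    best_next = np.max(qtable[next_state_index])  # max over next-state actions
    qtable[current_state_index, action] += alpha * (
        reward + gamma * best_next - qtable[current_state_index, action]
    )
    return qtable
```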
# TASKS: Q-learning algorithm following the epsilon-greedy policy
# Inputs:
#   qtable, rtable: initialized by calling the initialize_q_r_tables function inside the main function
#   start_pos, goal_pos: given by the get_random_start_goal function based on student_id and grid_size
#   num_episodes: taken from the global constant EPISODES; you need to determine the episodes needed to train the agent to find the optimal path
#   grid_size: to try different grid sizes, change the GRID_SIZE global constant
#   alpha, gamma, epsilon: DO NOT CHANGE
# Outputs:
#   qtable: the final qtable after training
def q_learning(start_pos, goal_pos, qtable=qtable_g, rtable=rtable_g, num_episodes=EPISODES, alpha=ALPHA, gamma=GAMMA, epsilon=EPSILON, grid_size=GRID_SIZE):
    for episode in range(num_episodes):
        # Initialize the state index corresponding to the starting position
        current_state_index = start_pos[0] * grid_size + start_pos[1]
        current_state_pos = start_pos  # current_state_pos holds the agent's current (row, column) position
        done = False
        while not done:
            # Task: COMPLETE THE CODE IN THE get_action FUNCTION ABOVE
            action = get_action(qtable, epsilon, current_state_index)
            # Task: Get the next state based on the chosen action
            # YOUR CODE HERE
            next_state_pos = ...    # Complete this line of code, DO NOT CHANGE VARIABLE NAMES
            next_state_index = ...  # Complete this line of code, DO NOT CHANGE VARIABLE NAMES
            # Task: COMPLETE THE CODE IN THE updateqtable FUNCTION ABOVE
            qtable = update_qtable(qtable, rtable, current_state_index, action, next_state_index, alpha, gamma)
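The listing above cuts off mid-loop: the goal check, state advance, and final return of qtable are missing from the handout. As a self-contained, hedged sketch of how the whole loop might fit together (helper logic inlined, goal-only reward and goal-terminated episodes assumed; this mirrors the skeleton's names but is illustrative only, not the course's reference solution):

```python
import random
import numpy as np

def q_learning_sketch(start_pos, goal_pos, grid_size=4,
                      num_episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    # Illustrative end-to-end loop; episode-termination and reward logic
    # are assumptions, since the handout's version is truncated.
    qtable = np.zeros((grid_size * grid_size, 4))
    goal_index = goal_pos[0] * grid_size + goal_pos[1]

    def next_state(pos, action):
        # Inlined equivalent of get_next_state; encoding up/down/left/right.
        moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        nr, nc = pos[0] + moves[action][0], pos[1] + moves[action][1]
        if 0 <= nr < grid_size and 0 <= nc < grid_size:
            return nr, nc
        return pos

    for _ in range(num_episodes):
        pos = start_pos
        state = pos[0] * grid_size + pos[1]
        while state != goal_index:  # episode ends at the goal
            if random.random() < epsilon:
                action = random.randrange(4)          # explore
            else:
                action = int(np.argmax(qtable[state]))  # exploit
            npos = next_state(pos, action)
            nstate = npos[0] * grid_size + npos[1]
            reward = 100.0 if nstate == goal_index else 0.0  # assumed reward
            qtable[state, action] += alpha * (
                reward + gamma * np.max(qtable[nstate]) - qtable[state, action]
            )
            pos, state = npos, nstate
    return qtable
```

In the assignment's own skeleton the same steps would be split across get_action, get_next_state, and update_qtable, with next_state_pos / next_state_index computed from the chosen action and done set when the goal state is reached.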