Question: Epsilon-Greedy Approach
The ε-greedy approach tries to balance exploration and exploitation by randomly sampling an action with probability ε and by choosing the best currently available option with probability 1 − ε. Which of the following options is correct about the ε-greedy approach?
- ε should be slowly increased with time until
- ε should decay with time after a certain point during training
- ε must always be held constant for the ε-greedy approach to converge to the optimal policy
- Increasing ε decreases the exploration aspect of the RL algorithm
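The selection rule described in the question can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names `epsilon_greedy` and `decayed_epsilon`, the list of estimated action values `q_values`, and the linear schedule with its default parameters are all assumptions made for the example. It shows that a larger ε means more random (exploratory) choices, and it illustrates one common way to reduce ε as training proceeds.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, greedy otherwise.

    q_values is a hypothetical list of estimated action values.
    """
    if rng.random() < epsilon:
        # Explore: sample any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest current estimate.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(step, start=1.0, end=0.05, decay_steps=1000):
    """Linearly decay epsilon from start to end, then hold it at end.

    The schedule and its defaults are assumptions chosen for illustration.
    """
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]
    for step in (0, 500, 1000, 5000):
        eps = decayed_epsilon(step)
        print(step, round(eps, 3), epsilon_greedy(q, eps))
```

With ε = 0 the rule is purely greedy, and with ε = 1 it is purely random, so increasing ε increases exploration rather than decreasing it.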
