Question: The Dyna agent with exploration bonus, i.e. Dyna-Q+, performs better in the first phase as well as in the second phase of the blocking and shortcut experiments (shown in the textbook). The superior performance in the first phase is because the exploration bonus makes the agent actively seek out the areas at the "edge" of its experience, which causes it to execute unexplored actions sooner, and thus find the goal more quickly. Is this true?
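To make the mechanism in the question concrete, here is a minimal sketch of how an exploration bonus can enter a planning update. It assumes the usual textbook form of the bonus, in which a simulated transition with modeled reward r that has not been tried for tau time steps is treated as if it yielded r + kappa * sqrt(tau). The function names, the dictionary-based Q-table and model, the parameter defaults, and the convention that a never-tried pair counts as last tried at step 0 are all illustrative assumptions, not part of the question.

```python
import math


def bonus_reward(reward, steps_since_tried, kappa=1e-3):
    """Reward used in planning: r + kappa * sqrt(tau) (assumed textbook form)."""
    return reward + kappa * math.sqrt(steps_since_tried)


def planning_update(q, model, last_tried, t, s, a, actions,
                    alpha=0.1, gamma=0.95, kappa=1e-3):
    """One simulated Q-learning update with an exploration bonus.

    q          : dict mapping (state, action) -> value (hypothetical layout)
    model      : dict mapping (state, action) -> (reward, next_state)
    last_tried : dict mapping (state, action) -> time step of last real try
    t          : current real time step
    """
    r, s_next = model[(s, a)]
    # Assumption: a pair that was never tried counts as last tried at step 0.
    tau = t - last_tried.get((s, a), 0)
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    target = bonus_reward(r, tau, kappa) + gamma * best_next
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (target - old)
    return q[(s, a)]
```

Under these assumptions, a state-action pair that has gone untried for a long time gets a larger planning target than a recently tried pair with the same modeled reward. That is the sense in which the bonus pulls the agent toward the "edge" of its experience.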
