Compare the different parameter settings for Q-learning for the game of Example 13.2 (page 585), the “monster game” in AIPython (aipython.org).

In particular, compare the following situations:

(i) step size(c) = 1/c and the Q-values are initialized to 0.0.

(ii) step size(c) = 10/(9 + c) varies, and the Q-values are initialized to 0.0.

(iii) α varies (using whichever of (i) and (ii) is better) and the Q-values are initialized to 5.0.

(iv) α is fixed to 0.1 and the Q-values are initialized to 0.0.

(v) α is fixed to 0.1 and the Q-values are initialized to 5.0.

(vi) Some other parameter settings.
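
As a rough illustration of how settings (i) to (v) differ, the sketch below writes each step size as a function of a count c and applies one tabular Q-learning update. It assumes c is the number of times the state-action pair has been updated so far; the function names, the discount of 0.9, and the dictionary-based Q-table are hypothetical and are not taken from AIPython's own classes.

```python
from collections import defaultdict


def step_size_inverse(c):
    """Setting (i): step size(c) = 1/c."""
    return 1 / c


def step_size_slow_decay(c):
    """Setting (ii): step size(c) = 10/(9 + c)."""
    return 10 / (9 + c)


def step_size_fixed(c):
    """Settings (iv) and (v): alpha fixed to 0.1, independent of c."""
    return 0.1


def make_q_table(q_init):
    """Q-table whose unseen entries default to q_init (0.0 or 5.0 in the question)."""
    return defaultdict(lambda: q_init)


def q_update(q, visits, state, action, reward, next_state, actions,
             step_size, discount=0.9):
    """One tabular Q-learning update; returns the new Q-value.

    Assumption: c counts updates of this (state, action) pair.
    """
    visits[(state, action)] += 1
    alpha = step_size(visits[(state, action)])
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + discount * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]
```

Both decaying schedules start at 1 for c = 1, but 10/(9 + c) shrinks far more slowly than 1/c, so later experience keeps more weight. Initializing the Q-values to 5.0 rather than 0.0 makes untried actions look attractive, which tends to encourage early exploration.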

For each of these, carry out multiple runs and compare

(a) the distributions of minimum values

(b) the zero crossing

(c) the asymptotic slope for the policy that includes exploration

(d) the asymptotic slope for the policy that does not include exploration (to test this, after the algorithm has explored, set the exploitation parameter to 100% and run additional steps).
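
One possible way to extract measures (a) to (c) from a single run is sketched below. It assumes each run is recorded as a list of accumulated reward, one entry per step; the function name, the `tail_fraction` parameter, and the choice to estimate the slope from the last part of the run are illustrative assumptions, not part of the question or of AIPython.

```python
def summarize_run(cumulative_rewards, tail_fraction=0.2):
    """Summarize one run given its accumulated reward at each step.

    Returns (minimum, zero_crossing, asymptotic_slope), where
    zero_crossing is the first step at or after the minimum at which
    the accumulated reward is non-negative (None if that never happens)
    and asymptotic_slope is the average reward per step over the tail.
    """
    n = len(cumulative_rewards)
    if n < 2:
        raise ValueError("need at least two steps to estimate a slope")

    minimum = min(cumulative_rewards)
    min_step = cumulative_rewards.index(minimum)

    # First step after the low point where the learning cost is recouped.
    zero_crossing = next(
        (s for s in range(min_step, n) if cumulative_rewards[s] >= 0),
        None,
    )

    # Slope over the final tail_fraction of the run (at least two points).
    tail_len = max(2, int(n * tail_fraction))
    tail_start = n - tail_len
    slope = (cumulative_rewards[-1] - cumulative_rewards[tail_start]) / (n - 1 - tail_start)

    return minimum, zero_crossing, slope
```

Applying this to every run of a setting gives a distribution for each measure. For (d), the same slope computation could be applied to the additional steps run with exploitation set to 100%.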

Which of these settings would you recommend? Why?
