Question:

The direct utility estimation method in Section 21.2 uses distinguished terminal states to indicate the end of a trial. How could it be modified for environments with discounted rewards and no terminal states?

Step by Step Answer:
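
One possible modification, given as a sketch under stated assumptions and not as the book's own solution: without terminal states a trial never ends, so the observed reward-to-go of a state can be truncated after H steps, where H is chosen so that the discounted tail beyond it is negligible. If every reward is bounded in magnitude by r_max, the tail after H steps is at most gamma^H * r_max / (1 - gamma). The function names `horizon_for` and `direct_utility_estimate`, the parameter `eps`, and the single-long-trajectory input format are assumptions introduced for illustration.

```python
import math
from collections import defaultdict


def horizon_for(gamma, r_max, eps):
    """Smallest H such that gamma**H * r_max / (1 - gamma) <= eps.

    Assumes 0 < gamma < 1, eps > 0, and |reward| <= r_max at every step.
    """
    if r_max == 0:
        return 1
    h = math.log(eps * (1 - gamma) / r_max) / math.log(gamma)
    return max(1, math.ceil(h))


def direct_utility_estimate(states, rewards, gamma, horizon):
    """Average the truncated discounted reward-to-go observed for each state.

    states[t] is the state visited at step t of one long trajectory and
    rewards[t] is the reward received there. Only steps with a full window
    of `horizon` rewards ahead of them contribute a sample.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for t in range(len(states) - horizon + 1):
        # Truncated return: r_t + gamma * r_{t+1} + ... + gamma^(H-1) * r_{t+H-1}
        g = sum(gamma ** k * rewards[t + k] for k in range(horizon))
        totals[states[t]] += g
        counts[states[t]] += 1
    return {s: totals[s] / counts[s] for s in totals}


# Toy trajectory (hypothetical data): two alternating states, reward 1 each step.
states = ["a", "b"] * 50
rewards = [1.0] * 100
U = direct_utility_estimate(states, rewards, gamma=0.5, horizon=10)
```

Each sample underestimates or overestimates the true discounted return by at most the tail bound, so shrinking `eps` (and thus lengthening the window) trades more computation and fewer usable samples for less truncation bias.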

Related Book:

Artificial Intelligence: A Modern Approach

ISBN: 978-0137903955

2nd Edition

Authors: Stuart J. Russell and Peter Norvig