Question: Both value iteration and policy iteration require knowledge of the transition model P(s' | s, a). Can we learn a good policy if we don't know the transition model P(s' | s, a)?
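One common answer is yes: model-free methods such as Q-learning estimate action values directly from sampled transitions (s, a, r, s') and never need P(s' | s, a) in explicit form. The following is a minimal sketch in Python, assuming a hypothetical two-state, two-action MDP. The `step` function, its probabilities and rewards, and the hyperparameters are illustrative assumptions, not part of the question.

```python
import random


def step(state, action, rng):
    """Hypothetical environment. The learner may sample from it,
    but it never reads the transition probabilities inside."""
    if action == 0:
        # Assumed dynamics: action 0 always leads to state 0, no reward.
        return 0, 0.0
    # Assumed dynamics: action 1 reaches state 1 with probability 0.9,
    # and pays reward 1 only when taken from state 1.
    reward = 1.0 if state == 1 else 0.0
    next_state = 1 if rng.random() < 0.9 else 0
    return next_state, reward


def q_learning(num_steps=50000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    state = 0
    for _ in range(num_steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: q[state][a])
        next_state, reward = step(state, action, rng)
        # The sampled next state stands in for the expectation over P.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q = q_learning()
# Greedy policy read off the learned table.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(2)]
print(policy)
```

The update uses only the observed reward and next state, so the sampled transition replaces the sum over P(s' | s, a) that value iteration would compute. In this toy MDP, action 1 is the better choice in both states, so the greedy policy is expected to select it once the table has settled.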
