Question: (4 screenshots of Python code are attached; please use them.)
(a) (30 points) Implement a general policy iteration algorithm in Python to determine
the optimal policy for an MDP problem. For this, write three functions: policy
evaluation, which takes the MDP and a policy as input and returns the state values;
policy improvement, which takes the MDP, a policy and the state values as input
and returns an improved policy; and general policy iteration, which calls the first two
functions iteratively until the convergence criterion is met. In the Python template
Scheduling MDP DP HW.py you will find the core structure of these three
functions, with missing code sections marked as #CODE HERE.
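
For orientation, the three functions described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the template's actual structure: the MDP is assumed to be given as a transition array P of shape (actions, states, states), a reward array R of shape (states, actions) and a discount factor gamma, and the policy is assumed to be deterministic. All names, signatures and the two-state toy problem are hypothetical.

```python
import numpy as np

def policy_evaluation(P, R, gamma, policy, tol=1e-10, max_iter=100_000):
    """Return state values V of a deterministic policy (iterative evaluation)."""
    states = np.arange(R.shape[0])
    P_pi = P[policy, states, :]   # row s is the transition row of action policy[s]
    R_pi = R[states, policy]      # reward of the chosen action in each state
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        V_new = R_pi + gamma * P_pi @ V
        if np.max(np.abs(V_new - V)) < tol:   # convergence criterion
            return V_new
        V = V_new
    return V

def policy_improvement(P, R, gamma, policy, V):
    """Return a policy that is greedy with respect to V."""
    Q = R + gamma * np.einsum("ast,t->sa", P, V)   # action values Q[s, a]
    greedy = np.argmax(Q, axis=1)
    states = np.arange(len(V))
    # Keep the current action on ties so the outer loop cannot oscillate.
    keep = Q[states, policy] >= Q[states, greedy] - 1e-12
    return np.where(keep, policy, greedy)

def policy_iteration(P, R, gamma, max_iter=1_000):
    """Alternate evaluation and improvement until the policy is stable."""
    policy = np.zeros(R.shape[0], dtype=int)
    V = policy_evaluation(P, R, gamma, policy)
    for _ in range(max_iter):
        new_policy = policy_improvement(P, R, gamma, policy, V)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
        V = policy_evaluation(P, R, gamma, policy)
    return policy, V

# Hypothetical toy MDP: action 0 stays in the current state, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 1.0],
              [2.0, 0.0]])
policy, V = policy_iteration(P, R, gamma=0.9)
print(policy, V)
```

In this toy problem the stable policy switches out of state 0 and stays in state 1. The tie-breaking rule in policy_improvement is a design choice: without it, two equally good actions could make the policy flip back and forth and the stopping test would never fire.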
(b) Solve the biopharmaceutical batch fermentation problem from Question
in Python using policy or value iteration. The Python template Scheduling MDP Biopharma Case provides you with the parameters and some prefilled code sections for this case. You have
to define the state space, action space and reward function.
What is the optimal harvest policy for this problem, and how can it be implemented
in practice? How does the policy change if the batch harvest costs CH are doubled?
Why does the harvest policy change like this?
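
Since part (b) allows value iteration as an alternative, a minimal sketch of that algorithm may be useful. It assumes the same hypothetical array layout as a generic tabular MDP: P of shape (actions, states, states), R of shape (states, actions) and a discount factor gamma. The two-state toy problem is invented for illustration only; the state space, action space and reward function of the fermentation case still have to be defined from the template's parameters.

```python
import numpy as np

def value_iteration(P, R, gamma, tol=1e-10, max_iter=100_000):
    """Return a greedy policy and state values from Bellman optimality backups."""
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        Q = R + gamma * np.einsum("ast,t->sa", P, V)   # action values Q[s, a]
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Extract the policy that is greedy with respect to the final values.
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return Q.argmax(axis=1), V

# Hypothetical toy MDP: action 0 stays in the current state, action 1 switches state.
P_toy = np.array([[[1.0, 0.0], [0.0, 1.0]],
                  [[0.0, 1.0], [1.0, 0.0]]])
R_toy = np.array([[0.0, 1.0],
                  [2.0, 0.0]])
vi_policy, vi_values = value_iteration(P_toy, R_toy, gamma=0.9)
print(vi_policy, vi_values)
```

To study the effect of doubled harvest costs, one would rebuild the reward array with the changed cost parameter, call the function again, and compare the two returned policies state by state.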
