Question: Actor-Critic Problem: Design the Actor-Critic algorithm using TensorFlow. Design Reward Function. Environment Solution. Train the model over 500 episodes
Actor-Critic Problem:
Design the Actor-Critic algorithm using TensorFlow.
Design Reward Function.
Environment Solution
Train the model over 500 episodes to minimize energy consumption while
maintaining the indoor temperature at the target setpoint (°C).
Evaluate the trained model on a test set to measure its performance.
Provide graphs showing the convergence of the Actor and Critic losses.
Plot the learned policy by showing the action probabilities across different state
values (e.g., temperature settings).
Provide an analysis comparing the energy consumption before and
after applying the reinforcement learning algorithm.
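
As a starting point for the reward function and environment parts of the problem, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration and does not come from the problem statement: the 22 °C setpoint, the comfort band, the energy weight, the three discrete actions, the toy heat-leak dynamics, and the names `ThermostatEnv`, `reward`, `run_episode`, and `bang_bang`. The reward is a negative cost that combines energy use with any temperature deviation outside the comfort band, and the rule-based `bang_bang` controller is one possible "before" baseline for the energy comparison.

```python
import random

# All constants below are illustrative assumptions, not values from the problem.
TARGET_TEMP_C = 22.0                 # hypothetical setpoint
COMFORT_BAND_C = 1.0                 # deviation tolerated without penalty
ENERGY_WEIGHT = 0.1                  # trade-off between energy and comfort
ACTIONS = (-1.0, 0.0, 1.0)           # cool, off, heat: temperature change per step
ENERGY_PER_ACTION = (1.0, 0.0, 1.0)  # energy units consumed by each action


def reward(indoor_temp, energy_used):
    """Negative cost: penalise energy use and deviation beyond the comfort band."""
    deviation = abs(indoor_temp - TARGET_TEMP_C)
    comfort_penalty = max(0.0, deviation - COMFORT_BAND_C)
    return -(ENERGY_WEIGHT * energy_used + comfort_penalty)


class ThermostatEnv:
    """Toy single-zone room whose temperature drifts toward the outdoor one."""

    def __init__(self, outdoor_temp=10.0, leak_rate=0.1, episode_length=24, seed=0):
        self.outdoor_temp = outdoor_temp
        self.leak_rate = leak_rate
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Start each episode from a random indoor temperature.
        self.indoor_temp = self.rng.uniform(15.0, 30.0)
        self.steps = 0
        return self.indoor_temp

    def step(self, action_index):
        # Heat exchange with the outside, then the effect of the chosen action.
        self.indoor_temp += self.leak_rate * (self.outdoor_temp - self.indoor_temp)
        self.indoor_temp += ACTIONS[action_index]
        energy_used = ENERGY_PER_ACTION[action_index]
        self.steps += 1
        done = self.steps >= self.episode_length
        return self.indoor_temp, reward(self.indoor_temp, energy_used), energy_used, done


def run_episode(env, policy):
    """Roll out one episode and return (total reward, total energy)."""
    state, done = env.reset(), False
    total_reward = total_energy = 0.0
    while not done:
        state, r, energy, done = env.step(policy(state))
        total_reward += r
        total_energy += energy
    return total_reward, total_energy


def bang_bang(temp):
    """Baseline rule: heat below the setpoint, cool otherwise (never idles)."""
    return 2 if temp < TARGET_TEMP_C else 0
```

Calling `run_episode(ThermostatEnv(), bang_bang)` gives a baseline reward and energy total. A trained TensorFlow actor could be passed in place of `bang_bang`, as any callable that maps the current temperature to an action index, so that both controllers are scored by the same environment and reward.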
