Question: Deep Reinforcement Learning
Assignment 01, Problem Statement 1
3 Marks
Title: Propose a suitable title.
Problem Statement: Define a problem statement of your own with a well-defined
objective, gaming environment, and game controls. (Mark)
Concept Sketch: A pen-and-paper game concept sketch to illustrate the
proposed gaming problem statement. (Mark)
Additional Information: Provide any necessary information assumed/considered for
the game implementation.
Requirements and Deliverables:
Elaborate on how the described problem could be solved using a deep neural
network and explain the action plan to create a gaming environment. (Mark)
Prepare a Colab notebook, with outputs saved, satisfying the following requirements.
Implementation should be in OpenAI Gym with Python. Develop a deep neural
network architecture and training procedure that effectively learns the optimal
policy for the spaceship to avoid collisions with asteroids and maximize its
survival time in the game environment.
i. Environment Setup: Define the game environment, including the state
space, action space, rewards, and terminal conditions. (Mark)
ii. Replay Buffer: Implement a replay buffer to store experiences (state,
action, reward, next state, terminal flag). (Mark)
iii. Deep Q-Network Architecture: Design the neural network architecture
for the DQN using Convolutional Neural Networks. The input to the
network is the game state, and the output is the Q-values for each
possible action. (Marks)
iv. Epsilon-Greedy Exploration: Implement an exploration strategy such
as epsilon-greedy to balance exploration (trying new actions) and
exploitation (using learned knowledge). (Mark)
v. Training Loop: Initialize the DQN and the target network (a separate
network used to stabilize training). In each episode, reset the
environment and observe the initial state. (Marks)
vi. Testing and Evaluation: After training, evaluate the DQN by running it
in the environment without exploration (set epsilon to 0). Monitor metrics
such as average reward per episode, survival time, etc., to assess the
performance. (Mark)
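
For the environment setup in item i, one possible starting point is a small grid world written in plain Python. This is only a sketch under stated assumptions: the class name AsteroidDodgeEnv, the grid size, the spawn probability, and the reward values (+1 per survived step, -10 on collision) are all hypothetical choices, not part of the assignment. It mirrors the classic Gym convention of reset() returning an observation and step() returning (observation, reward, done, info) without importing gym; newer Gym and Gymnasium releases instead return a five-element tuple from step() and an (observation, info) pair from reset(), so the signatures would need adapting there.

```python
import random


class AsteroidDodgeEnv:
    """Hypothetical grid world: a ship on the bottom row dodges falling asteroids."""

    ACTIONS = (-1, 0, 1)  # action 0: move left, 1: stay, 2: move right

    def __init__(self, width=8, height=10, spawn_prob=0.3, max_steps=500, seed=None):
        self.width, self.height = width, height
        self.spawn_prob, self.max_steps = spawn_prob, max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.ship_x = self.width // 2
        self.asteroids = set()  # (row, col) pairs, row 0 is the top
        self.steps = 0
        return self._observe()

    def step(self, action):
        # Move the ship, clipped to the grid edges.
        self.ship_x = min(self.width - 1, max(0, self.ship_x + self.ACTIONS[action]))
        # Asteroids fall one row; those leaving the grid disappear.
        self.asteroids = {(r + 1, c) for r, c in self.asteroids if r + 1 < self.height}
        # Occasionally spawn a new asteroid in the top row.
        if self.rng.random() < self.spawn_prob:
            self.asteroids.add((0, self.rng.randrange(self.width)))
        self.steps += 1
        # Terminal conditions: collision, or the step limit is reached.
        crashed = (self.height - 1, self.ship_x) in self.asteroids
        done = crashed or self.steps >= self.max_steps
        reward = -10.0 if crashed else 1.0  # assumed reward scheme
        return self._observe(), reward, done, {"crashed": crashed}

    def _observe(self):
        """State as a height x width grid: 0 empty, 1 asteroid, 2 ship."""
        grid = [[0.0] * self.width for _ in range(self.height)]
        for r, c in self.asteroids:
            grid[r][c] = 1.0
        grid[self.height - 1][self.ship_x] = 2.0
        return grid
```

A grid observation like this can be fed to a convolutional network as a single-channel image, which is one reason to prefer it over a flat feature vector here.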
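
For the replay buffer in item ii, a minimal sketch could use a fixed-size deque that stores the five fields named in the assignment. The names Transition and ReplayBuffer and the capacity value are illustrative assumptions.

```python
import random
from collections import deque, namedtuple

# One stored experience: the five fields listed in item ii.
Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    """Fixed-capacity experience store with uniform random sampling."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return one Transition whose fields are tuples of length batch_size."""
        batch = random.sample(self.memory, batch_size)
        return Transition(*zip(*batch))

    def __len__(self):
        return len(self.memory)
```

Sampling uniformly from past experience breaks the correlation between consecutive frames, which is the usual motivation for the buffer. Training would typically wait until len(buffer) is at least the batch size before calling sample().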
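
For the convolutional architecture in item iii, the size of the first fully connected layer depends on how much each convolution shrinks the input. The helper below is a sketch of that arithmetic only, not a network definition. It assumes unpadded convolutions, and the 84 x 84 input with three layers (8x8 stride 4, 4x4 stride 2, 3x3 stride 1) is borrowed from the well-known Atari DQN layout purely as an illustration; the assignment does not prescribe these sizes.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of one convolution along a single spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(height, width, layers):
    """Number of features after a stack of (out_channels, kernel, stride) layers."""
    channels = None
    for channels, kernel, stride in layers:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        if height <= 0 or width <= 0:
            raise ValueError("input is too small for this convolution stack")
    return channels * height * width


# Assumed example layout: (out_channels, kernel, stride) per layer.
LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]
print(flattened_features(84, 84, LAYERS))  # 84 -> 20 -> 9 -> 7, so 64 * 7 * 7 = 3136
```

The returned value is the input width of the dense layer that precedes the final output layer, which has one unit per action. For a small grid such as 10 x 8, these kernel sizes would be too large and the helper raises an error, which is a hint to use smaller kernels or padding.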
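
For the epsilon-greedy strategy in item iv, a minimal sketch might pair a linear decay schedule with an action selector. The function names and the schedule parameters (start 1.0, end 0.05, 10,000 decay steps) are assumptions chosen for illustration; q_values stands for the list of Q-values the network outputs for the current state.

```python
import random


def epsilon_by_step(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end, then hold it."""
    frac = min(1.0, step / decay_steps)
    return eps_end + (1.0 - frac) * (eps_start - eps_end)


def select_action(q_values, epsilon, rng=random):
    """With probability epsilon explore; otherwise pick the highest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # exploration: uniform random action
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploitation
```

Setting epsilon to 0 makes select_action purely greedy, which is the behaviour item vi asks for during evaluation.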
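
For the training loop in item v, the core of each update is the target that the online network is regressed towards, computed with the separate target network. The sketch below shows only that calculation and a periodic sync rule in plain Python; the function names, the discount factor, and the sync interval are assumptions, and next_q_values stands for the target network's outputs on the sampled next states.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute y = r + gamma * max_a Q_target(s', a) for each sampled transition.

    For terminal transitions the bootstrap term is dropped, so y = r.
    """
    return [
        r if done else r + gamma * max(q_next)
        for r, q_next, done in zip(rewards, next_q_values, dones)
    ]


def should_sync_target(step, sync_every=1_000):
    """True on the steps where the target network copies the online weights."""
    return step % sync_every == 0
```

Within an episode, the loop would then select an action, step the environment, store the transition, sample a batch, compute these targets, take one gradient step on the squared (or Huber) error between the targets and Q(s, a), and copy the weights whenever should_sync_target returns True.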
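
For testing and evaluation in item vi, a sketch of a greedy rollout (epsilon fixed at 0) that reports average reward and survival time could look like the following. The evaluate function, the q_fn callable standing in for the trained network, and the CountdownEnv stub are all hypothetical; the stub exists only so the example runs on its own and assumes the classic four-value step() return.

```python
def evaluate(env, q_fn, episodes=10, max_steps=1_000):
    """Run greedy episodes and return average reward and survival length."""
    returns, lengths = [], []
    for _ in range(episodes):
        state, done, total, steps = env.reset(), False, 0.0, 0
        while not done and steps < max_steps:
            q_values = q_fn(state)
            # No exploration: always take the action with the highest Q-value.
            action = max(range(len(q_values)), key=q_values.__getitem__)
            state, reward, done, _ = env.step(action)
            total += reward
            steps += 1
        returns.append(total)
        lengths.append(steps)
    return {
        "avg_reward": sum(returns) / episodes,
        "avg_survival_steps": sum(lengths) / episodes,
    }


class CountdownEnv:
    """Stand-in environment: every episode lasts `horizon` steps, reward 1 each."""

    def __init__(self, horizon):
        self.horizon = horizon

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        return [float(self.t)], 1.0, self.t >= self.horizon, {}


metrics = evaluate(CountdownEnv(5), lambda state: [0.0, 1.0], episodes=3)
print(metrics)  # {'avg_reward': 5.0, 'avg_survival_steps': 5.0}
```

Plotting these two metrics across training checkpoints is a simple way to show in the notebook whether the learned policy is actually improving.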
