Question:

Deep Reinforcement Learning
Assignment 01 Problem Statement
13 Marks
Title: Propose a suitable title
Problem Statement: Define a problem statement of your own with a well-defined
objective, gaming environment, and game controls. [1 Mark]
Concept Sketch: A pen-and-paper concept sketch that illustrates the
proposed gaming problem statement. [1 Mark]
Additional Information: State any assumptions or other information needed for
the game implementation.
Requirements and Deliverables:
Elaborate on how the described problem could be solved using a deep neural
network, and explain the action plan for creating the gaming environment. [1 Mark]
Prepare a Colab notebook, with outputs saved, that satisfies the following
requirements. The implementation should be in OpenAI Gym with Python. Develop a
deep neural network architecture and training procedure that effectively learns
the optimal policy for the spaceship to avoid collisions with asteroids and
maximize its survival time in the game environment.
i. Environment Setup: Define the game environment, including the state
space, action space, rewards, and terminal conditions. [1.5 Marks]
ii. Replay Buffer: Implement a replay buffer to store experiences (state,
action, reward, next state, terminal flag). [1.5 Marks]
iii. Deep Q-Network Architecture: Design the neural network architecture
for the DQN using Convolutional Neural Networks. The input to the
network is the game state, and the output is the Q-values for each
possible action. [2 Marks]
iv. Epsilon-Greedy Exploration: Implement an exploration strategy such
as epsilon-greedy to balance exploration (trying new actions) and
exploitation (using learned knowledge). [1 Mark]
v. Training Loop: Initialize the DQN and the target network (a separate
network used to stabilize training). In each episode, reset the
environment and observe the initial state. [2 Marks]
vi. Testing and Evaluation: After training, evaluate the DQN by running it
in the environment without exploration (set epsilon to 0). Monitor metrics
such as average reward per episode, survival time, etc., to assess the
performance. [2 Marks]
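
As a starting point for requirement i, the environment could be sketched roughly as below. This is only an illustration under stated assumptions: the class name AsteroidDodgeEnv, the grid size, the spawn probability, the reward values (+1 per step survived, -10 on collision), and the step cap are all hypothetical choices, not part of the assignment. The class mirrors the classic Gym reset()/step() interface (step returns observation, reward, done, info) without importing gym so that it runs standalone; in the Colab notebook it would presumably subclass gym.Env and declare its spaces with gym.spaces.Discrete and gym.spaces.Box.

```python
import random


class AsteroidDodgeEnv:
    """Hypothetical grid world: a ship on the bottom row dodges falling asteroids."""

    # Action space (assumed): 0 = move left, 1 = stay, 2 = move right.
    ACTIONS = (-1, 0, 1)

    def __init__(self, width=8, height=10, spawn_prob=0.3, max_steps=500, seed=None):
        self.width, self.height = width, height
        self.spawn_prob, self.max_steps = spawn_prob, max_steps
        self.rng = random.Random(seed)
        self.reset()

    def _observe(self):
        # State space: a height x width grid; 0 = empty, 1 = asteroid, 2 = ship.
        grid = [[0] * self.width for _ in range(self.height)]
        for row, col in self.asteroids:
            grid[row][col] = 1
        grid[self.height - 1][self.ship] = 2
        return grid

    def reset(self):
        # Start each episode with the ship centred and no asteroids on the grid.
        self.ship = self.width // 2
        self.asteroids = set()
        self.steps = 0
        return self._observe()

    def step(self, action):
        # Move the ship, clamped to the grid edges.
        self.ship = min(self.width - 1, max(0, self.ship + self.ACTIONS[action]))
        # Asteroids fall one row; those leaving the grid disappear.
        self.asteroids = {(r + 1, c) for r, c in self.asteroids if r + 1 < self.height}
        # Possibly spawn a new asteroid in the top row.
        if self.rng.random() < self.spawn_prob:
            self.asteroids.add((0, self.rng.randrange(self.width)))
        self.steps += 1
        # Terminal conditions: collision, or the step cap is reached.
        crashed = (self.height - 1, self.ship) in self.asteroids
        done = crashed or self.steps >= self.max_steps
        reward = -10.0 if crashed else 1.0
        return self._observe(), reward, done, {"survival_time": self.steps}
```

The grid observation is image-like, so it could be fed to a convolutional network after conversion to an array; the survival_time entry in info is one way to expose the evaluation metric named in requirement vi.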

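Requirements ii, iv, and part of v can be illustrated with a few small helpers. The following is a minimal sketch using only the Python standard library; the names ReplayBuffer, epsilon_greedy, linear_epsilon, and td_target, as well as the default values for the decay schedule and discount factor, are assumptions made for illustration. The td_target helper shows the standard DQN bootstrap target, where next_q_values would come from the target network.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest experience when full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        batch = random.sample(list(self.memory), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.memory)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def linear_epsilon(step, start=1.0, end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from start to end over decay_steps steps."""
    fraction = min(1.0, step / decay_steps)
    return start + fraction * (end - start)


def td_target(reward, next_q_values, done, gamma=0.99):
    """DQN target: r if terminal, else r + gamma * max_a' Q_target(s', a')."""
    return reward if done else reward + gamma * max(next_q_values)
```

During evaluation (requirement vi), calling epsilon_greedy with epsilon set to 0 always returns the greedy action, which matches the instruction to run without exploration.
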