Question:

Consider a cube state space defined by 0 <= x, y, z <= L. Suppose you are piloting/programming a drone to learn how to land on a platform at the center of the z = 0 surface (the bottom). Some assumptions:
In this discrete world, if I say the drone is at (x, y, z), I mean that it is in the box centered at (x, y, z). There are boxes (states) centered at (x, y, z) for all 0 <= x, y, z <= L. Each state is a 1 unit cube. So when L = 2 (for example), there are cubes centered at each x = 0, 1, 2, y = 0, 1, 2, and so on, for a total state space size of 3^3 = 27 states.
All of the states with z = 0 are terminal states.
The state at the center of the bottom of the cubic state space is the landing pad. So, for example, when L = 4, the landing pad is at (x, y, z) = (2, 2, 0).
All terminal states except the landing pad have a reward of -1. The landing pad has a reward of +1.
All non-terminal states have a reward of -0.01.
The drone takes up exactly 1 cubic unit, and begins in a random non-terminal state.
The available actions in non-terminal states include moving exactly 1 unit Up (+z), Down (-z), North (+y), South (-y), East (+x) or West (-x). In a terminal state, the training episode should end.
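
The assumptions above can be sketched as a small Python environment. This is only one possible reading, not a reference solution. All function names are hypothetical, and L stands for the upper coordinate bound of the cube. Two points are not stated in the question and are assumed here: a move that would leave the cube keeps the drone in place, and the reward is the one attached to the state the drone enters. The landing pad formula also assumes L is even, as in the L = 4 example.

```python
import random

# Hypothetical action table: each action moves the drone exactly 1 unit.
ACTIONS = {
    "Up": (0, 0, 1),
    "Down": (0, 0, -1),
    "North": (0, 1, 0),
    "South": (0, -1, 0),
    "East": (1, 0, 0),
    "West": (-1, 0, 0),
}


def all_states(L):
    """Every box center (x, y, z) with 0 <= x, y, z <= L."""
    coords = range(L + 1)
    return [(x, y, z) for x in coords for y in coords for z in coords]


def landing_pad(L):
    """Center of the z = 0 surface. Assumes L is even, as in the L = 4 example."""
    return (L // 2, L // 2, 0)


def is_terminal(state):
    """All states with z = 0 are terminal."""
    return state[2] == 0


def reward(state, L):
    """-0.01 for non-terminal states, +1 for the landing pad, -1 for other terminal states."""
    if not is_terminal(state):
        return -0.01
    return 1.0 if state == landing_pad(L) else -1.0


def step(state, action, L):
    """Apply one action and return (next_state, reward, done).

    Assumption: a move that would leave the cube keeps the drone in place.
    Assumption: the reward returned is that of the state being entered.
    """
    if is_terminal(state):
        raise ValueError("episode has already ended")
    dx, dy, dz = ACTIONS[action]
    nxt = (state[0] + dx, state[1] + dy, state[2] + dz)
    if not all(0 <= c <= L for c in nxt):
        nxt = state
    return nxt, reward(nxt, L), is_terminal(nxt)


def random_start(L, rng=random):
    """Pick a random non-terminal start state (z >= 1)."""
    return rng.choice([s for s in all_states(L) if not is_terminal(s)])
```

For example, len(all_states(2)) gives 27, landing_pad(4) gives (2, 2, 0), and step((2, 2, 1), "Down", 4) returns ((2, 2, 0), 1.0, True).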
