Question:

We would like to use a Q-learning agent for Pacman, but the size of the state
space for a large grid is too massive to hold in memory. To solve this, we will
switch to a feature-based representation of Pacman's state.
1. We will have two features, F_g and F_p, defined as follows:
F_g(s, a) = A(s) + B(s, a) + C(s, a)
F_p(s, a) = D(s) + 2 E(s, a)
where
A(s) = number of ghosts within 1 step of state s
B(s, a) = number of ghosts Pacman touches after taking action a from state s
C(s, a) = number of ghosts within 1 step of the state Pacman ends up in after taking action a
D(s) = number of food pellets within 1 step of state s
E(s, a) = number of food pellets eaten after taking action a from state s
For this Pacman board, the ghosts will always be stationary, and the action
space is {left, right, up, down, stay}. Calculate the features for the actions in {left, right, up, stay} from the current state.
2. After a few episodes of Q-learning, the weights are w_g = -10 and w_p = 100. Calculate the Q value for each action in {left, right, up, stay} from the current state.
3. We observe a transition that starts from the state above, s, takes action up, ends in state s' (the state with the food pellet above), and receives a reward R(s, a, s') = 250. The available actions from state s' are down and stay. Assuming a discount of \gamma = 0.5, calculate the new estimate of the Q value Q(s, up) based on this episode.
4. With this new estimate and a learning rate \alpha = 0.5, update the weights for each feature (see the sketch after this list for the general form of the update).
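
The general machinery behind parts 2-4 is linear (feature-based) Q-learning: Q(s, a) = w_g F_g(s, a) + w_p F_p(s, a), a sample target of R(s, a, s') + \gamma max_{a'} Q(s', a'), and the per-feature update w_i <- w_i + \alpha (sample - Q(s, a)) f_i(s, a). Below is a minimal sketch of that machinery. The feature values used in the usage example are placeholders for illustration only; the real values of F_g and F_p depend on the Pacman board, which is not reproduced here.

```python
# Sketch of the feature-based (approximate) Q-learning mechanics used in
# parts 2-4. Feature values in the usage example are placeholders; the real
# values depend on the board, which is not shown here.

GAMMA = 0.5  # discount from part 3
ALPHA = 0.5  # learning rate from part 4


def q_value(weights, features):
    """Linear Q-function: Q(s, a) = sum_i w_i * f_i(s, a)."""
    return sum(w * f for w, f in zip(weights, features))


def q_update(weights, features_sa, reward, next_state_features):
    """One approximate Q-learning update for an observed transition (s, a, r, s').

    difference = [r + gamma * max_{a'} Q(s', a')] - Q(s, a)
    w_i       <- w_i + alpha * difference * f_i(s, a)
    """
    best_next_q = max(q_value(weights, f) for f in next_state_features)
    difference = (reward + GAMMA * best_next_q) - q_value(weights, features_sa)
    return [w + ALPHA * difference * f for w, f in zip(weights, features_sa)]


# Hypothetical usage with the weights from part 2 and made-up feature values.
weights = [-10, 100]                 # [w_g, w_p]
features_s_up = [1, 2]               # (F_g, F_p) for (s, up) -- placeholder values
features_s_prime = [[0, 0], [0, 1]]  # features of (s', down) and (s', stay) -- placeholders
new_weights = q_update(weights, features_s_up, 250, features_s_prime)
print(new_weights)
```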
