Reinforcement Learning: The Q-learning Algorithm
Please write Python code that produces the same outputs as in the pictures, but on a bigger grid such as 6x6 or 10x10. Please use Python and DO NOT use OpenAI's gym package!
The taxi driving problem:
There are four designated locations in the grid world indicated by R(ed), G(reen), Y(ellow), and B(lue).
When the episode starts, the taxi starts off at a random square and the passenger is at a random location (R, G, Y or B).
The taxi drives to the passenger's location, picks up the passenger, drives to the passenger's destination (another one of the four specified locations), and then drops off the passenger. While doing so, the taxi driver needs to drive carefully to avoid hitting any wall, marked as |. Once the passenger is dropped off, the episode ends.
What are the actions the agent can choose from at each step?
0 drive down
1 drive up
2 drive right
3 drive left
4 pick up a passenger
5 drop off a passenger
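The six action codes above can be written down as constants. The sketch below is only an illustration: the names `MOVES` and `move`, and the convention that row 0 is the top row, are assumptions and not part of the original problem statement. Interior walls (the | markers) are not modelled here.

```python
# Action codes exactly as listed in the question.
DOWN, UP, RIGHT, LEFT, PICKUP, DROPOFF = range(6)

# Assumed convention: row 0 is the top row, so "down" increases the row index.
MOVES = {
    DOWN: (1, 0),
    UP: (-1, 0),
    RIGHT: (0, 1),
    LEFT: (0, -1),
}


def move(row, col, action, n):
    """Return the taxi position after a movement action on an n x n grid.

    The taxi stays in place if the move would leave the grid.
    Interior walls are not handled in this sketch.
    """
    d_row, d_col = MOVES[action]
    new_row = min(max(row + d_row, 0), n - 1)
    new_col = min(max(col + d_col, 0), n - 1)
    return new_row, new_col
```

Because the grid size `n` is a parameter, the same helper works for 5x5, 6x6, or 10x10 grids.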
And the states?
25 possible taxi positions, because the world is a 5x5 grid.
5 possible locations of the passenger, which are R, G, Y, B, plus the case when the passenger is in the taxi.
4 destination locations
This gives 25 x 5 x 4 = 500 states.
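The same counting carries over to a larger grid: an n x n grid has n x n x 5 x 4 states, so 720 for 6x6 and 2000 for 10x10. A Q-table needs each state as a single integer index, and one possible way to do that is sketched below. The function names `num_states`, `encode`, and `decode` are hypothetical, and the passenger index 4 standing for "in the taxi" is an assumed convention.

```python
def num_states(n):
    """Number of states on an n x n grid: positions x passenger locations x destinations."""
    return n * n * 5 * 4


def encode(row, col, passenger, destination, n):
    """Pack (taxi row, taxi col, passenger location, destination) into one integer.

    passenger is 0..3 for R, G, Y, B and 4 for "in the taxi" (assumed convention);
    destination is 0..3.
    """
    return ((row * n + col) * 5 + passenger) * 4 + destination


def decode(state, n):
    """Inverse of encode: recover (row, col, passenger, destination)."""
    state, destination = divmod(state, 4)
    state, passenger = divmod(state, 5)
    row, col = divmod(state, n)
    return row, col, passenger, destination
```

With this layout, a Q-table for a 10x10 grid would have `num_states(10)` rows and 6 columns, one per action.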
What about rewards?
-1 default per-step reward. Why -1, and not simply 0? Because we want to encourage the agent to finish in the shortest time, by penalizing each extra step. That is what you would expect from a taxi driver, isn't it?
+20 reward for delivering the passenger to the correct destination.
-10 reward for executing a pickup or dropoff at the wrong location.
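The three reward rules above can be expressed as one small function. This is a sketch under assumptions: the name `reward`, its argument layout, and the `IN_TAXI` index are hypothetical, and it reads "wrong location" as any drop-off that is not at the destination with the passenger on board.

```python
PICKUP, DROPOFF = 4, 5  # action codes from the question
IN_TAXI = 4             # assumed passenger index meaning "inside the taxi"


def reward(action, taxi_pos, passenger, destination, locations):
    """Return the reward for taking `action` in the given situation.

    locations is a list of four (row, col) tuples for R, G, Y, B.
    """
    if action == PICKUP:
        # A pickup is legal only where the passenger is currently waiting.
        legal = passenger != IN_TAXI and taxi_pos == locations[passenger]
        return -1 if legal else -10
    if action == DROPOFF:
        # Successful delivery: passenger on board and taxi at the destination.
        if passenger == IN_TAXI and taxi_pos == locations[destination]:
            return 20
        return -10
    # Any movement action costs the default per-step penalty.
    return -1
```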
Random agent baseline
Before you start implementing any complex algorithm, you should always build a baseline model.
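A random-agent baseline could look roughly like the sketch below, which avoids the gym package and works for any grid size. It rests on several assumptions that are not in the original problem: the class `SimpleTaxi` and the function `run_random_episode` are hypothetical names, the four locations are placed in the grid corners, and interior walls are left out to keep the example short.

```python
import random

DOWN, UP, RIGHT, LEFT, PICKUP, DROPOFF = range(6)
MOVES = {DOWN: (1, 0), UP: (-1, 0), RIGHT: (0, 1), LEFT: (0, -1)}
IN_TAXI = 4  # assumed passenger index meaning "inside the taxi"


class SimpleTaxi:
    """Wall-free taxi world on an n x n grid (a simplification of the problem)."""

    def __init__(self, n=6, seed=None):
        self.n = n
        self.rng = random.Random(seed)
        # Assumed placement of R, G, Y, B: the four corners of the grid.
        self.locations = [(0, 0), (0, n - 1), (n - 1, 0), (n - 1, n - 1)]

    def reset(self):
        """Random taxi square, random passenger location, different destination."""
        self.taxi = (self.rng.randrange(self.n), self.rng.randrange(self.n))
        self.passenger, self.destination = self.rng.sample(range(4), 2)
        return self.taxi, self.passenger, self.destination

    def step(self, action):
        """Apply an action and return (reward, done)."""
        if action in MOVES:
            d_row, d_col = MOVES[action]
            row = min(max(self.taxi[0] + d_row, 0), self.n - 1)
            col = min(max(self.taxi[1] + d_col, 0), self.n - 1)
            self.taxi = (row, col)
            return -1, False
        if action == PICKUP:
            if self.passenger != IN_TAXI and self.taxi == self.locations[self.passenger]:
                self.passenger = IN_TAXI
                return -1, False
            return -10, False
        # Remaining case: DROPOFF.
        if self.passenger == IN_TAXI and self.taxi == self.locations[self.destination]:
            return 20, True
        return -10, False


def run_random_episode(env, max_steps=10_000):
    """Play one episode with uniformly random actions; return (steps, total reward)."""
    env.reset()
    total_reward, steps = 0, 0
    for _ in range(max_steps):
        reward, done = env.step(env.rng.randrange(6))
        total_reward += reward
        steps += 1
        if done:
            break
    return steps, total_reward


if __name__ == "__main__":
    env = SimpleTaxi(n=6, seed=0)
    results = [run_random_episode(env) for _ in range(100)]
    avg_steps = sum(s for s, _ in results) / len(results)
    avg_reward = sum(r for _, r in results) / len(results)
    print(f"random agent: {avg_steps:.1f} steps, {avg_reward:.1f} reward per episode")
```

The average step count and reward of this random agent give a reference point: a trained Q-learning agent should finish episodes in far fewer steps and with a much higher total reward.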