Question 1
(a) Consider a simple game where your character is a sailor carrying passengers across a
river that separates two towns, A and B. Each day you can decide to stay in the town
where you are or cross the river once, carrying a number of passengers of your choice,
between one and three. Each passenger pays a 50 point fare before boarding. Every
time you attempt to cross the river with n passengers, there is a probability n/10 of the
boat sinking, which ends the game. Each day 10 points are deducted to cover your
living costs, whether you cross the river or not. Describe how the game can be
modelled as a Markov Decision Process (MDP) and, in particular, determine the
values of the elements of the tuple used to formally define an MDP.
(b) Explain the following equation in the context of reinforcement learning:
$$V(s) = \max_{a \in A(s)} \left[ r(s, a) + \gamma V(s') \right]$$
(c) Consider a reinforcement learning problem modelled as an MDP with deterministic
transitions and actions. The states are S = {A, B, C, D, E, G1, G2, G3} while the
actions are trivially A = {toA, toB, toC, toD, toE, toG1, toG2, toG3}. The possible
transitions and the corresponding rewards (if any) are indicated in the state transition
diagram shown in Figure 1.1 below. Assuming a discount factor γ = 0.6, calculate the
discounted cumulative value of each state, also providing a brief explanation of the
procedure followed.
Figure 1.1
(d) Discuss the role of the discount factor in the context of reinforcement learning. In
particular, consider your answer to part (c) of this question and discuss how the
optimal policy from a given state, for example D, changes depending on the choice of
the discount factor γ.
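
As an illustration of parts (a), (b) and (d), here is a minimal Python sketch of one possible way to encode the river-crossing game as an MDP and evaluate it by value iteration. It is not a model answer. The absorbing "Sunk" state, the action names, the choice to count fares and the living cost on the day the boat sinks, and all function names are assumptions made for this example. The backup inside `q_value` is the stochastic generalisation of the max-over-actions form that part (b) appears to show.

```python
# One possible MDP encoding of the sailor game (illustrative assumptions throughout).
STATES = ["A", "B", "Sunk"]            # "Sunk" is an assumed absorbing terminal state
ACTIONS = ["stay", "cross1", "cross2", "cross3"]
FARE, LIVING_COST = 50, 10             # values taken from the question text


def transitions(state, action):
    """Return a list of (probability, next_state, reward) triples."""
    if state == "Sunk":
        return [(1.0, "Sunk", 0.0)]    # game over: no further reward
    if action == "stay":
        return [(1.0, state, -LIVING_COST)]
    n = int(action[-1])                # number of passengers, 1 to 3
    # Assumption: fares are paid before boarding, so they count even if the
    # boat sinks, and the daily living cost is deducted on that day as well.
    reward = FARE * n - LIVING_COST
    other = "B" if state == "A" else "A"
    p_sink = n / 10
    return [(p_sink, "Sunk", reward), (1 - p_sink, other, reward)]


def q_value(values, state, action, gamma):
    """Expected immediate reward plus discounted value of the next state."""
    return sum(p * (r + gamma * values[s2])
               for p, s2, r in transitions(state, action))


def value_iteration(gamma, tol=1e-9):
    """Repeat the max-over-actions backup until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {s: max(q_value(values, s, a, gamma) for a in ACTIONS)
                      for s in STATES}
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            return values


def greedy_action(values, state, gamma):
    """Action that maximises the one-step lookahead from `state`."""
    return max(ACTIONS, key=lambda a: q_value(values, state, a, gamma))
```

Calling `value_iteration(0.6)` and then `greedy_action(values, "A", 0.6)` shows how the discount factor enters the calculation. Under the assumptions above, both towns get the same value (about 241.4 for a discount factor of 0.6) and the greedy action is `cross3`. A different treatment of the final day's fare or living cost would change these numbers.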
