Question: (30 points) Figure 1 on the last page shows an MDP M = (S, A, P, R).


Figure 1: An MDP M = (S, A, P, R). The immediate rewards of all state-action pairs are shown within square brackets. The state-transition probability of each state-action pair is shown close to the arrows representing actions. Action a_{2} is deterministic, so the probability 1 is not shown in the figure.


(10 points) Write down the state space (S), action space (A), state-transition matrix (P) and reward vector (R) of M. Label the rows and columns of P and the rows of R.

(10 points) How many policies does M have? List all the policies of M in the format shown below. The first component is the action for state s_{1}, the second is for state s_{2}, and the third is for state s_{3}.

\pi_{1} = \begin{bmatrix} a_{1} \\ a_{1} \\ a_{1} \end{bmatrix}
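One way to check the policy count is to enumerate the deterministic policies directly. The sketch below is not taken from the original solution. It assumes the action space is A = {a_1, a_2} and that both actions are available in every state. Figure 1 is not reproduced here, so that assumption should be checked against the figure before relying on the count.

```python
from itertools import product

# Assumption: three states, as named in the question.
states = ["s1", "s2", "s3"]
# Assumption: two actions, both available in every state (Figure 1 is not reproduced here).
actions = ["a1", "a2"]


def enumerate_policies(states, actions):
    """Return every deterministic policy as a tuple holding one action per state."""
    return list(product(actions, repeat=len(states)))


policies = enumerate_policies(states, actions)

# Print each policy in the order (action for s1, action for s2, action for s3).
for i, policy in enumerate(policies, start=1):
    print(f"pi_{i} = {policy}")

print("number of deterministic policies:", len(policies))
```

Under these assumptions the count is |A|^{|S|} = 2^3 = 8, and the first policy listed matches the \pi_{1} format shown in the question.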

(10 points) Let \pi be the policy that takes action a_{2} in all the states, i.e.,

\pi = \begin{bmatrix} a_{2} \\ a_{2} \\ a_{2} \end{bmatrix}

Define \pi using symbols. Draw \pi as an MDP. Write down the state-transition matrix and the reward vector of \pi.
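Fixing a policy selects, for each state, the row of P and the entry of R that belong to the chosen action. The sketch below shows that row-selection step only. Every number in it is a hypothetical placeholder, because Figure 1 is not reproduced here. The only thing it takes from the figure caption is that a_{2} is deterministic, and even the successor states used for a_{2} are made up. The helper name `induced_by_policy` is also hypothetical.

```python
import numpy as np

# HYPOTHETICAL placeholder values: these are NOT the values in Figure 1.
# P[a][i, j] = probability of moving from state i to state j under action a.
P = {
    "a1": np.array([[0.5, 0.5, 0.0],
                    [0.0, 0.3, 0.7],
                    [0.4, 0.0, 0.6]]),
    # a2 is deterministic (per the caption); the successor states are invented.
    "a2": np.array([[0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [1.0, 0.0, 0.0]]),
}
# R[a][i] = immediate reward for taking action a in state i (placeholders).
R = {
    "a1": np.array([1.0, 0.0, 2.0]),
    "a2": np.array([0.0, 5.0, -1.0]),
}


def induced_by_policy(P, R, policy):
    """Pick, for each state index i, the row of P and entry of R for policy[i]."""
    P_pi = np.array([P[a][i] for i, a in enumerate(policy)])
    R_pi = np.array([R[a][i] for i, a in enumerate(policy)])
    return P_pi, R_pi


# The policy from the question: take a2 in s1, s2 and s3.
P_pi, R_pi = induced_by_policy(P, R, ["a2", "a2", "a2"])
print(P_pi)
print(R_pi)
```

Because this policy uses a_{2} everywhere, the resulting matrix is simply the a_{2} transition matrix, and each of its rows contains a single 1. With the real figure, only the placeholder arrays need to be replaced.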
