Question:

(a) Consider the sequence of iterates $V_0, V_1, \ldots, V_k, \ldots$ in value iteration, where $V_k = T^* V_{k-1}$. Suppose $\gamma < 1$. We have

$$\|V^* - V_k\|_\infty = \|V^* - V_{k+n} + V_{k+n} - V_{k+n-1} + \cdots + V_{k+1} - V_k\|_\infty \le \|V^* - V_{k+n}\|_\infty + \sum_{i=1}^{n} \|V_{k+i} - V_{k+i-1}\|_\infty \quad \text{(triangle inequality)}$$

(b) Suppose $\|V - V^*\|_\infty \le \varepsilon$ for some $V \in \mathbb{R}^{|S|}$. Let $\pi$ be the greedy policy with respect to $V$, $\pi(s) = \arg\max_a R(s,a) + \gamma \sum_{s'} P(s'|s,a) V(s')$. Show that $\|V^* - V^\pi\|_\infty \le \frac{2\gamma\varepsilon}{1-\gamma}$.

Hint: Using the fact that $T^\pi V = T^* V$ since $\pi$ is greedy with respect to $V$, write $\|V^* - V^\pi\|_\infty = \|V^* - T^* V + T^\pi V - V^\pi\|_\infty$ and apply the triangle inequality.
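One way the hint in (b) can be carried through is sketched below. It assumes the bound to be shown is $2\gamma\varepsilon/(1-\gamma)$, and it relies on two standard facts that the question does not state explicitly: $T^*$ and $T^\pi$ are $\gamma$-contractions in the sup norm, and $V^*$ and $V^\pi$ are their respective fixed points.

```latex
\begin{align*}
\|V^* - V^\pi\|_\infty
  &= \|V^* - T^*V + T^\pi V - V^\pi\|_\infty
     && (T^\pi V = T^* V) \\
  &\le \|T^*V^* - T^*V\|_\infty + \|T^\pi V - T^\pi V^\pi\|_\infty
     && (V^* = T^*V^*,\ V^\pi = T^\pi V^\pi) \\
  &\le \gamma \|V^* - V\|_\infty + \gamma \|V - V^\pi\|_\infty
     && \text{(contraction)} \\
  &\le \gamma\varepsilon
       + \gamma\left(\|V - V^*\|_\infty + \|V^* - V^\pi\|_\infty\right)
     && \text{(triangle inequality)} \\
  &\le 2\gamma\varepsilon + \gamma \|V^* - V^\pi\|_\infty .
\end{align*}
% Rearranging, which is valid because \gamma < 1:
\[
\|V^* - V^\pi\|_\infty \le \frac{2\gamma\varepsilon}{1-\gamma}.
\]
```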

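As a numerical sanity check, the two inequalities can be tested on a small randomly generated MDP. This is only a sketch under stated assumptions: the MDP is synthetic, the helper names (`make_mdp`, `bellman_optimal`, `bellman_policy`, `greedy_policy`, `iterate`) are hypothetical and not part of the question, and $V^*$ and $V^\pi$ are approximated by iterating the corresponding operator to a tight tolerance. The bound checked for (b) is assumed to be $2\gamma\varepsilon/(1-\gamma)$.

```python
import random


def make_mdp(n_states, n_actions, seed=0):
    """Hypothetical helper: random rewards in [0, 1) and random transition rows."""
    rng = random.Random(seed)
    R = [[rng.random() for _ in range(n_actions)] for _ in range(n_states)]
    P = []
    for _ in range(n_states):
        per_action = []
        for _ in range(n_actions):
            w = [rng.random() + 1e-3 for _ in range(n_states)]
            total = sum(w)
            per_action.append([x / total for x in w])
        P.append(per_action)
    return R, P


def q_value(R, P, V, gamma, s, a):
    """One-step lookahead: R(s,a) + gamma * sum_s' P(s'|s,a) V(s')."""
    return R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))


def bellman_optimal(R, P, V, gamma):
    """Apply T* to V."""
    return [max(q_value(R, P, V, gamma, s, a) for a in range(len(R[s])))
            for s in range(len(R))]


def bellman_policy(R, P, V, gamma, pi):
    """Apply T^pi to V for a deterministic policy pi (list of actions)."""
    return [q_value(R, P, V, gamma, s, pi[s]) for s in range(len(R))]


def greedy_policy(R, P, V, gamma):
    """Greedy policy with respect to V."""
    return [max(range(len(R[s])), key=lambda a: q_value(R, P, V, gamma, s, a))
            for s in range(len(R))]


def sup_norm(U, W):
    return max(abs(u - w) for u, w in zip(U, W))


def iterate(op, V, tol=1e-12, max_iter=100000):
    """Apply op repeatedly until successive iterates differ by less than tol."""
    for _ in range(max_iter):
        V_next = op(V)
        if sup_norm(V_next, V) < tol:
            return V_next
        V = V_next
    return V


gamma = 0.9
n_states = 5
R, P = make_mdp(n_states, 3, seed=0)
V_star = iterate(lambda V: bellman_optimal(R, P, V, gamma), [0.0] * n_states)

# Part (a): the triangle-inequality decomposition for chosen k and n.
k, n = 3, 10
iterates = [[0.0] * n_states]
for _ in range(k + n):
    iterates.append(bellman_optimal(R, P, iterates[-1], gamma))
lhs = sup_norm(V_star, iterates[k])
rhs = sup_norm(V_star, iterates[k + n]) + sum(
    sup_norm(iterates[k + i], iterates[k + i - 1]) for i in range(1, n + 1))

# Part (b): perturb V* by at most eps, act greedily, and compare with the bound.
eps = 0.5
rng = random.Random(1)
V = [v + rng.uniform(-eps, eps) for v in V_star]
pi = greedy_policy(R, P, V, gamma)
V_pi = iterate(lambda U: bellman_policy(R, P, U, gamma, pi), [0.0] * n_states)
loss = sup_norm(V_star, V_pi)
bound = 2 * gamma * eps / (1 - gamma)

print("part (a):", lhs, "<=", rhs)
print("part (b):", loss, "<=", bound)
```

A passing check on one random instance does not prove either statement; it only confirms that the inequalities behave as expected on that instance.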