Question: Consider the error function

$$E(w_1, w_2) = 0.05 + \frac{(w_1 - 3)^2}{4} + \frac{(w_2 - 4)^2}{9} - \frac{(w_1 - 3)(w_2 - 4)}{6}.$$

Different variants of the gradient descent algorithm can be used to minimize this error function w.r.t. $(w_1, w_2)$. Assume $(w_1, w_2) = (1, 1)$ at time $(t-1)$ and, after the update, $(w_1, w_2) = (1.5, 2.0)$ at time $(t)$. Assume $\alpha = 1.5$, $\beta = 0.6$, $\eta = 0.3$.

1. Compute the value of $(w_1, w_2)$ that minimizes the error. Compute the minimum possible value of the error.
2. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if standard gradient descent is used?
3. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if momentum is used?
4. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if RMSProp is used?
5. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if Adam is used?

Step by Step Solution

Step: 1

To find the minimum, set the partial derivatives of $E(w_1, w_2)$ with respect to $w_1$ and $w_2$ to zero and s...
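The step above can be checked numerically with a minimal sketch. The source formula is garbled, so the code assumes E(w1, w2) = 0.05 + (w1 - 3)^2/4 + (w2 - 4)^2/9 - (w1 - 3)(w2 - 4)/6. The minus sign on the cross term in particular is an assumption, although the stationary point comes out the same with either sign. The function names `error`, `gradient` and `stationary_point` are hypothetical helpers introduced only for this illustration.

```python
def error(w1, w2):
    """Error function. ASSUMPTION: the cross term is subtracted."""
    u, v = w1 - 3.0, w2 - 4.0
    return 0.05 + u * u / 4.0 + v * v / 9.0 - u * v / 6.0


def gradient(w1, w2):
    """Partial derivatives (dE/dw1, dE/dw2) of the assumed error function."""
    u, v = w1 - 3.0, w2 - 4.0
    return (u / 2.0 - v / 6.0, 2.0 * v / 9.0 - u / 6.0)


def stationary_point():
    """Solve gradient(w) = 0. E is quadratic, so this is a 2x2 linear system."""
    # Constant Hessian of the assumed error function.
    h11, h12 = 1.0 / 2.0, -1.0 / 6.0
    h21, h22 = -1.0 / 6.0, 2.0 / 9.0
    # gradient(w) = H w + gradient(0, 0), so solve H w = -gradient(0, 0).
    g0 = gradient(0.0, 0.0)
    b1, b2 = -g0[0], -g0[1]
    det = h11 * h22 - h12 * h21
    # h11 > 0 and det > 0 means H is positive definite: the point is a minimum.
    assert h11 > 0 and det > 0
    # Cramer's rule for the 2x2 system.
    w1 = (b1 * h22 - h12 * b2) / det
    w2 = (h11 * b2 - h21 * b1) / det
    return (w1, w2)


w_star = stationary_point()
e_min = error(*w_star)
print("minimizer:", w_star)
print("minimum error:", e_min)
```

Under this assumed formula, both squared terms and the cross term vanish at (w1, w2) = (3, 4), so the error there reduces to the constant 0.05.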
