Question: Consider an error function

$$E(w_1, w_2) = 0.05 + \frac{(w_1 - 3)^2}{4} + \frac{(w_2 - 4)^2}{9} - \frac{(w_1 - 3)(w_2 - 4)}{6}.$$

Different variants of the gradient descent algorithm can be used to minimize this error function w.r.t. $(w_1, w_2)$. Assume $(w_1, w_2) = (1, 1)$ at time $(t-1)$ and, after an update, $(w_1, w_2) = (1.5, 2.0)$ at time $t$. Assume $\alpha = 1.5$, $\beta = 0.6$, $\eta = 0.3$.
1. Compute the value of $(w_1, w_2)$ that minimizes the error. Compute the minimum possible value of the error.
2. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if standard gradient descent is used?
3. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if momentum is used?
4. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if RMSProp is used?
5. What will be the value of $(w_1, w_2)$ at time $(t+1)$ if Adam is used?
Step by Step Solution
The solution involves 3 steps.
To find the minimum, set the partial derivatives of $E(w_1, w_2)$ with respect to $w_1$ and $w_2$ to zero and s...
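The stationarity condition can be worked through as follows. This is a sketch that assumes the error function is $E(w_1, w_2) = 0.05 + \frac{(w_1-3)^2}{4} + \frac{(w_2-4)^2}{9} - \frac{(w_1-3)(w_2-4)}{6}$; the sign of the cross term is an assumption, since the source is garbled at that point. The substitution $u = w_1 - 3$, $v = w_2 - 4$ is introduced here only for brevity.

```latex
\begin{aligned}
\frac{\partial E}{\partial w_1} &= \frac{u}{2} - \frac{v}{6} = 0
  &&\Rightarrow\; v = 3u, \\
\frac{\partial E}{\partial w_2} &= \frac{2v}{9} - \frac{u}{6} = 0
  &&\Rightarrow\; \frac{2(3u)}{9} - \frac{u}{6} = \frac{u}{2} = 0
  \;\Rightarrow\; u = 0,\; v = 0, \\
(w_1^*, w_2^*) &= (3, 4), \qquad E(3, 4) = 0.05 .
\end{aligned}
```

The Hessian $\begin{pmatrix} 1/2 & -1/6 \\ -1/6 & 2/9 \end{pmatrix}$ has positive trace and determinant $1/12 > 0$, so the stationary point is a minimum. The same point and value result if the cross term carries a plus sign, because only the off-diagonal sign of the Hessian changes.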
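For parts 2 and 3, the update at time $(t+1)$ can be checked numerically. The sketch below is not the original answer; it assumes the error function has a negative cross term, that $\eta = 0.3$ is the learning rate, and that $\alpha = 1.5$ acts as the momentum coefficient applied to the previous step $w(t) - w(t-1)$. The source does not state these roles explicitly, and the function names are hypothetical.

```python
def grad_E(w1, w2):
    """Gradient of the assumed E(w1, w2); the cross-term sign is an assumption."""
    u, v = w1 - 3.0, w2 - 4.0
    return (u / 2.0 - v / 6.0, 2.0 * v / 9.0 - u / 6.0)


def gd_step(w, eta):
    """Standard gradient descent: w(t+1) = w(t) - eta * grad E(w(t))."""
    g = grad_E(*w)
    return (w[0] - eta * g[0], w[1] - eta * g[1])


def momentum_step(w, w_prev, eta, mu):
    """Classical momentum: the previous step w(t) - w(t-1) is reused,
    scaled by mu (assumed here to be the alpha given in the question)."""
    g = grad_E(*w)
    return tuple(wi - eta * gi + mu * (wi - pi)
                 for wi, gi, pi in zip(w, g, w_prev))


w_prev = (1.0, 1.0)   # (w1, w2) at time t-1
w_t = (1.5, 2.0)      # (w1, w2) at time t
eta = 0.3             # assumed learning rate
alpha = 1.5           # assumed momentum coefficient

w_gd = gd_step(w_t, eta)
w_mom = momentum_step(w_t, w_prev, eta, alpha)
print("gradient descent:", w_gd)
print("momentum:        ", w_mom)
```

Under these assumptions the gradient at $(1.5, 2.0)$ is approximately $(-0.4167, -0.1944)$, so plain gradient descent moves to about $(1.625, 2.058)$ and the momentum update to about $(2.375, 3.558)$. RMSProp and Adam are left out because they also need the running averages of past gradients, which the question text does not supply.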
