Question: Problem 3 (5pts) You run gradient descent for 15 iterations with =0.3 and compute J() after each iteration. You find that the value of J()

Problem 3 (5pts) You run gradient descent for 15 iterations with =0.3 and compute J() after each iteration. You find that the value of J() decreases quickly then levels off. Based on this, which of the following conclusions seems most plausible? A: Rather than use the current value of , it would be more promising to try a larger value of (say =1.0). B: =0.3 is an effective choice of learning rate. C: Rather than use the current value of , it would be more promising to try a smaller value of (say =0.1)
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
