Question: A stochastic gradient descent algorithm requires tuning of the learning rate parameter over time.

A. Should we reduce it over time or increase it over time? Explain either way.

B (10 points). Explain intuitively how the learning rate should be adjusted as a function of the mini-batch size.
Step by Step Solution
A. The learning rate should be reduced over time. Early in training, a relatively large rate lets the parameters move quickly toward a good region of the loss surface. But each stochastic (mini-batch) gradient is only a noisy estimate of the true gradient, so with a constant rate the iterates keep bouncing around the minimum, with fluctuations proportional to the step size, and never settle. Shrinking the rate damps this noise-driven oscillation and allows convergence; increasing it over time would instead amplify the noise and cause divergence. Classical stochastic-approximation theory (the Robbins-Monro conditions) makes this precise: the step sizes η_t should satisfy Σ η_t = ∞ and Σ η_t² < ∞, which holds, for example, for η_t = η₀/t. Two common decay schedules are sketched below.
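As a rough illustration (not part of the original answer), here is a minimal Python sketch of two popular decay schedules; the function names and constants are made up for the example:

```python
# Minimal sketch of two common SGD learning-rate decay schedules.
# All names and constants (eta0, decay_rate, drop, every) are illustrative.

def inverse_time_decay(eta0, t, decay_rate=0.01):
    # eta_t = eta0 / (1 + decay_rate * t); this family satisfies the
    # Robbins-Monro conditions: sum(eta_t) diverges, sum(eta_t**2) converges.
    return eta0 / (1.0 + decay_rate * t)

def step_decay(eta0, t, drop=0.5, every=10):
    # Multiply the rate by `drop` every `every` epochs; simple and popular
    # in practice even though it falls outside the classical theory.
    return eta0 * (drop ** (t // every))

if __name__ == "__main__":
    for t in (0, 10, 100, 1000):
        print(f"t={t:5d}  inverse-time={inverse_time_decay(0.1, t):.5f}  "
              f"step={step_decay(0.1, t):.5f}")
```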
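B. Intuitively, the learning rate should grow with the mini-batch size. A mini-batch gradient is an average over the batch, so a larger batch lowers the variance of the gradient estimate: the update direction is more trustworthy, and a larger step can safely be taken. Conversely, a small batch gives noisy gradients and calls for smaller, more cautious steps. A widely used heuristic is the linear scaling rule: if the batch size is multiplied by k, multiply the learning rate by k as well. Below is a minimal sketch of that rule, assuming a hypothetical reference batch size and base rate:

```python
# Hypothetical sketch of the linear scaling heuristic. REFERENCE_BATCH and
# BASE_LR are assumptions, standing in for a batch-size/learning-rate pair
# that is already known to train well.

REFERENCE_BATCH = 32
BASE_LR = 0.01

def scaled_lr(batch_size, base_lr=BASE_LR, reference_batch=REFERENCE_BATCH):
    # Larger batches -> lower-variance gradient estimates -> larger steps.
    return base_lr * batch_size / reference_batch

if __name__ == "__main__":
    for b in (8, 32, 128, 512):
        print(f"batch={b:4d} -> lr={scaled_lr(b):.4f}")
```

Note that the linear rule eventually breaks down: for very large batches, practitioners often add a warmup phase or fall back to sublinear (e.g., square-root) scaling to keep training stable.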
