Suppose you have a dataset x_1, ..., x_n, and you double the size of the dataset by including each data
point twice. Answer the questions below about what happens to the behavior of SGD, minibatch SGD (for a
fixed batch size m), and gradient descent.
1. What happens to the cost per iteration of each method?
2. What happens to the expected value of the gradient estimate for SGD and minibatch SGD?
3. What happens to the variance of the gradient estimator for SGD and minibatch SGD?
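
One way to sanity-check answers to the three questions above is a small numerical sketch. The Python below is not part of the original question. It assumes a one-dimensional squared loss f_i(w) = 0.5 * (w - x_i)^2, assumes minibatch indices are drawn uniformly with replacement, and measures cost as the number of per-example gradient evaluations per iteration. The helper names and the toy data are hypothetical, chosen only for illustration.

```python
from statistics import fmean, pvariance


def per_example_grads(data, w):
    # Gradient of the assumed loss f_i(w) = 0.5 * (w - x_i)^2 with respect to w.
    return [w - x for x in data]


def estimator_stats(data, w, m=1):
    """Exact mean and variance of a size-m minibatch gradient estimate.

    Assumes indices are sampled uniformly WITH replacement, so the
    minibatch average of m i.i.d. draws has variance (population variance) / m.
    m = 1 corresponds to plain SGD.
    """
    grads = per_example_grads(data, w)
    return fmean(grads), pvariance(grads) / m


def grad_evals_per_iteration(data, m):
    # Cost proxy: how many per-example gradients each method computes per step.
    return {"gd": len(data), "sgd": 1, "minibatch": m}


# Toy dataset (hypothetical values) and its doubled version,
# in which every point appears exactly twice.
data = [1.0, 2.0, 4.0, 7.0]
doubled = data + data
w, m = 3.0, 2

for name, d in (("original", data), ("doubled", doubled)):
    sgd_mean, sgd_var = estimator_stats(d, w, m=1)
    mb_mean, mb_var = estimator_stats(d, w, m=m)
    print(name, grad_evals_per_iteration(d, m))
    print("  SGD       mean, variance:", sgd_mean, sgd_var)
    print("  minibatch mean, variance:", mb_mean, mb_var)
```

Under these assumptions, duplicating every point leaves the empirical distribution of per-example gradients unchanged, so the sketch reports the same mean and variance before and after doubling, while only the gradient descent cost grows with the dataset size. If minibatches were instead drawn without replacement, the variance would also depend on the dataset size through a finite-population correction, which this sketch does not model.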
