Question: Q 2 . Gradient Descent ( 2 pt ) Given N training data points { ( x k , y k ) } , k
Q Gradient Descent pt
Given N training data points dots, in and labels as in either or
we seek a linear discriminant function where is the feature value
of attribute of a data point optimizing a special loss function where
Let be the learning rate, please derive the gradient update for a randomly selected data point
in the stochastic gradient descent SGD method.
Hint: Note that SGD randomly pick one data sample for gradient update per iteration. We can write
where is the feature value of attribute of a data point
You need to first write partial derivative with respect to attribute and then get the vector
consisting of partial derivatives with respect to the attributes
Step by Step Solution
There are 3 Steps involved in it
1 Expert Approved Answer
Step: 1 Unlock
Question Has Been Solved by an Expert!
Get step-by-step solutions from verified subject matter experts
Step: 2 Unlock
Step: 3 Unlock
