11. (10 points) Consider a supervised learning problem in which the error on a single data point (x_n, y_n) is defined as e_n(w) = max(0, -y_n w^T x_n), where x_n is the feature vector, y_n is the label, and w is the weight vector we want to learn in the hypothesis. Argue that the Perceptron Learning Algorithm (PLA) can be viewed as Stochastic Gradient Descent (SGD) on e_n with learning rate η = 1. Hint: Recall the vector form of PLA as h(x) = sign(w^T x) and the update rule of PLA, w(t+1) = w(t) + y(t) x(t), where (x(t), y(t)) is a misclassified example at iteration t. The weight update in SGD is w ← w - η ∇e_n(w).
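One way to sketch the argument (this is an illustrative outline, not the site's paywalled expert solution): the subgradient of e_n(w) = max(0, -y_n w^T x_n) is -y_n x_n when the point is misclassified (i.e., -y_n w^T x_n ≥ 0) and 0 otherwise, so the SGD step w ← w - η ∇e_n(w) with η = 1 becomes w ← w + y_n x_n on misclassified points and leaves w unchanged otherwise, which is exactly the PLA update. A minimal NumPy check of this equivalence (function names are mine, not from the question):

```python
import numpy as np

def perceptron_update(w, x, y):
    """One PLA step: h(x) = sign(w.x); update only on a misclassified point.
    (A point on the boundary, w.x = 0, is treated as misclassified here.)"""
    if y * (w @ x) <= 0:
        return w + y * x
    return w.copy()

def sgd_update(w, x, y, eta=1.0):
    """One SGD step on e_n(w) = max(0, -y * w.x).
    Subgradient: -y*x when -y*w.x >= 0 (taking -y*x at the kink), else 0."""
    grad = -y * x if y * (w @ x) <= 0 else np.zeros_like(w)
    return w - eta * grad

# With eta = 1 the two updates should coincide on random data.
rng = np.random.default_rng(0)
w = rng.normal(size=3)
for _ in range(100):
    x = rng.normal(size=3)
    y = rng.choice([-1.0, 1.0])
    assert np.allclose(perceptron_update(w, x, y), sgd_update(w, x, y))
    w = perceptron_update(w, x, y)
print("PLA step matches SGD step with eta = 1 on all 100 samples")
```

The only subtlety is the kink of max(0, z) at z = 0, where the subgradient is not unique; choosing -y x there makes the SGD step agree with PLA's convention that a boundary point counts as misclassified.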
