Question:

[6 pts] Suppose $S = \{(x_i, y_i)\}_{i=1}^{n} \subseteq \mathbb{R}^d \times \{-1, +1\}$ is a linearly separable training dataset. We saw in class that there exists a $w^* \in \mathbb{R}^d$ such that $y_i \langle w^*, x_i \rangle > 1$ for all $i \in [n]$. Recall that the Perceptron algorithm outputs a hyperplane that separates the positive and negative examples (i.e., $y_i \langle w, x_i \rangle > 0$ for all $i \in [n]$).

(a) Devise a new algorithm called MARGIN-PERCEPTRON that outputs a $\widehat{w}$ that separates the positive and negative examples by a margin, that is, $y_i \langle \widehat{w}, x_i \rangle \geq 1$ for all $i \in [n]$.

(b) Suppose, as in class, that $R = \max_i \|x_i\|_2$ and $B = \min\{\|w\|_2 : y_i \langle w, x_i \rangle \geq 1 \text{ for all } i \in [n]\}$. Show, using the technique we used in class, that MARGIN-PERCEPTRON terminates in at most $B^2(R^2 + 2)$ steps.

Suppose we modify the Perceptron algorithm as follows: in the update step, instead of performing $w^{(t+1)} = w^{(t)} + y_i x_i$ whenever we make a mistake, we perform $w^{(t+1)} = w^{(t)} + \eta y_i x_i$ for some $\eta > 0$; $\eta$ is sometimes referred to as the learning rate or the step size. Show that this modified Perceptron performs the same number of iterations as the original Perceptron we studied in class, and that it converges to a vector that points in the same direction as the output of the vanilla Perceptron. Hint: what can you say about the relationship between the signs of $\langle w, x \rangle$ and $\langle \eta w, x \rangle$?
