Question: For binary classification, we assume training data set matrix as D=211421511151 where xpR21 represented by the first 2 rows and ypR11 in the last row.

For binary classification, we assume training data set matrix as D=211421511151 where xpR21 represented by the first 2 rows and ypR11 in the last row. The model parameter vector is defined as: w^=w1w2b=[wb] We apply hinge loss with ridge regularization on the feature touching weights w as the training objective function to train the binary classifier: minimizeL(w^)=P1p=1Pmax(0,1ypw^Tx^p)+2w22 a) Please write down all the sample-label pair {(xp,yp)} in the data matrix D in the format: (x1,y1)(xP,yP) numerically. b) Apply PRP +Conjugate Gradient Algorithm to update w^0 to w^2 by minimizing the regularized hinge loss L(w^). Please use a constant step size parameter =0.1 in each iteration. c) Compute the average hinge loss defined by L(w^)=P1p=1Pmax(0,1ypw^Tx^p) with the trained model w^2 over the training data set. d) Please write down the decision boundary with respect to w^2 and x^. What is the position of the decision boundary, i.e., the perpendicular distance of the decision boundary from the origin? Please calculate the numerical result. e) What is the predicted results for all the samples in the training data set by w^2 ? Which samples are misclassified
Step by Step Solution
There are 3 Steps involved in it
Sure Lets go through each part of the problem stepbystep a SampleLabel Pairs Given the matrix D D be... View full answer
Get step-by-step solutions from verified subject matter experts
