Question: Please answer Exercise 4.13: by making use of the result (4.88) for the derivative of the logistic sigmoid, show that the gradient of the error function for the logistic regression model is given by (4.91).


Answer: Taking the gradient of the error function with respect to $\mathbf{w}$, we obtain

$$\nabla E(\mathbf{w}) = \sum_{n=1}^{N} (y_n - t_n)\,\boldsymbol{\phi}_n$$

where we have made use of (4.88). We see that the factor involving the derivative of the logistic sigmoid has cancelled, leading to a simplified form for the gradient of the log likelihood. In particular, the contribution to the gradient from data point $n$ is given by the 'error' $y_n - t_n$ between the target value and the prediction of the model, times the basis function vector $\boldsymbol{\phi}_n$. Furthermore, comparison with (3.13) shows that this takes precisely the same form as the gradient of the sum-of-squares error function for the linear regression model.

If desired, we could make use of the result (4.91) to give a sequential algorithm in which patterns are presented one at a time, and each weight vector is updated using (3.22), in which $\nabla E_n$ is the $n$th term in (4.91).

It is worth noting that maximum likelihood can exhibit severe over-fitting for data sets that are linearly separable. This arises because the maximum likelihood solution occurs when the hyperplane corresponding to $\sigma = 0.5$, equivalent to $\mathbf{w}^{\mathrm{T}}\boldsymbol{\phi} = 0$, separates the two classes and the magnitude of $\mathbf{w}$ goes to infinity. In this case, the logistic sigmoid function becomes infinitely steep in feature space, corresponding to a Heaviside step function, so that every training point from each class $k$ is assigned a posterior probability $p(\mathcal{C}_k \mid \mathbf{x}) = 1$.
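For reference, the cancellation mentioned above can be sketched as follows, using the cross-entropy error (4.90), $E(\mathbf{w}) = -\sum_{n=1}^{N}\{t_n \ln y_n + (1-t_n)\ln(1-y_n)\}$, with $y_n = \sigma(a_n)$, $a_n = \mathbf{w}^{\mathrm{T}}\boldsymbol{\phi}_n$, and the sigmoid derivative (4.88), $\sigma' = \sigma(1-\sigma)$:

$$\frac{\partial E}{\partial y_n} = \frac{y_n - t_n}{y_n(1-y_n)}, \qquad \frac{\partial y_n}{\partial a_n} = y_n(1-y_n), \qquad \frac{\partial a_n}{\partial \mathbf{w}} = \boldsymbol{\phi}_n,$$

so by the chain rule the factors $y_n(1-y_n)$ cancel, giving $\nabla E(\mathbf{w}) = \sum_{n}(y_n - t_n)\boldsymbol{\phi}_n$, which is (4.91).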
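As a concrete illustration of the sequential algorithm mentioned above, here is a minimal NumPy sketch of the update $\mathbf{w} \leftarrow \mathbf{w} - \eta\,\nabla E_n$, i.e. (3.22) with $\nabla E_n$ taken as the $n$th term of (4.91). The learning rate, epoch count, and toy data are illustrative assumptions, not part of the text.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def sgd_logistic_regression(Phi, t, eta=0.1, n_epochs=100, seed=None):
    """Sequential gradient descent for logistic regression.

    Phi : (N, M) design matrix whose rows are the basis vectors phi_n
    t   : (N,) binary targets in {0, 1}
    Each update is w <- w - eta * (y_n - t_n) * phi_n.
    """
    rng = np.random.default_rng(seed)
    N, M = Phi.shape
    w = np.zeros(M)
    for _ in range(n_epochs):
        for n in rng.permutation(N):        # present patterns one at a time
            y_n = sigmoid(w @ Phi[n])       # model prediction for pattern n
            w -= eta * (y_n - t[n]) * Phi[n]  # "error" times basis vector
    return w

# Toy usage on two Gaussian blobs, with phi(x) = (1, x1, x2):
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(50, 2)),
               rng.normal(+1.0, 1.0, size=(50, 2))])
Phi = np.hstack([np.ones((100, 1)), X])
t = np.concatenate([np.zeros(50), np.ones(50)])
w = sgd_logistic_regression(Phi, t, eta=0.05, n_epochs=200, seed=1)
```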
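To see why $\|\mathbf{w}\|$ diverges for separable data, here is a short check under the assumption that some direction $\hat{\mathbf{w}}$ separates the classes, i.e. $\hat{\mathbf{w}}^{\mathrm{T}}\boldsymbol{\phi}_n > 0$ exactly when $t_n = 1$. Scaling $\mathbf{w} = \lambda\hat{\mathbf{w}}$ gives the likelihood

$$p(\mathbf{t} \mid \lambda\hat{\mathbf{w}}) = \prod_{n=1}^{N} \sigma\!\left(\lambda\hat{\mathbf{w}}^{\mathrm{T}}\boldsymbol{\phi}_n\right)^{t_n} \left(1 - \sigma\!\left(\lambda\hat{\mathbf{w}}^{\mathrm{T}}\boldsymbol{\phi}_n\right)\right)^{1-t_n} \to 1 \quad \text{as } \lambda \to \infty,$$

so the likelihood can always be increased by growing $\lambda$ and no finite maximizer exists.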
