Question:

(40 points) Softmax classifier gradient. For the softmax classifier, derive the gradient of the log-likelihood. Concretely, assume a classification problem with $c$ classes.

Samples are $(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})$, where $x^{(j)} \in \mathbb{R}^n$, $y^{(j)} \in \{1, \ldots, c\}$, $j = 1, \ldots, m$.

Parameters are $\theta = \{w_i, b_i\}_{i=1,\ldots,c}$.

Probabilistic model is $\Pr(y^{(j)} = i \mid x^{(j)}, \theta) = \mathrm{softmax}_i(x^{(j)})$, where $\mathrm{softmax}_i(x) = \dfrac{e^{w_i^T x + b_i}}{\sum_{k=1}^{c} e^{w_k^T x + b_k}}$.

Derive the log-likelihood $L$, and its gradient w.r.t. the parameters, $\nabla_{w_i} L$ and $\nabla_{b_i} L$, for $i = 1, \ldots, c$.

Note: We can group $w_i$ and $b_i$ into a single vector by augmenting the data vectors with an additional dimension of constant 1. Let $\tilde{x} = \begin{bmatrix} x \\ 1 \end{bmatrix}$ and $\tilde{w}_i = \begin{bmatrix} w_i \\ b_i \end{bmatrix}$, then $a_i(x) = w_i^T x + b_i = \tilde{w}_i^T \tilde{x}$. This unifies $\nabla_{w_i} L$ and $\nabla_{b_i} L$ into $\nabla_{\tilde{w}_i} L$.
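
The derivation the question asks for can be sketched as follows. This is one standard route, not necessarily the intended solution. It assumes the $m$ samples are independent, so the likelihood factorizes. It writes $a_i(x) = w_i^T x + b_i$ for the class scores, and it introduces the indicator $\mathbb{1}[\cdot]$ and the tilde notation for the augmented vectors ($\tilde{x}$ is $x$ with a constant 1 appended, $\tilde{w}_i$ is $w_i$ with $b_i$ appended), which are notational assumptions.

```latex
\begin{align*}
% Log-likelihood: sum of log-probabilities of the observed labels
L(\theta)
  &= \sum_{j=1}^{m} \log \Pr\big(y^{(j)} \mid x^{(j)}, \theta\big)
   = \sum_{j=1}^{m} \Big[ a_{y^{(j)}}\big(x^{(j)}\big)
      - \log \sum_{k=1}^{c} e^{a_k(x^{(j)})} \Big] \\[4pt]
% Derivative of a single sample's term with respect to the score a_i
\frac{\partial}{\partial a_i}
  \Big[ a_{y} - \log \sum_{k=1}^{c} e^{a_k} \Big]
  &= \mathbb{1}[y = i] - \frac{e^{a_i}}{\sum_{k=1}^{c} e^{a_k}}
   = \mathbb{1}[y = i] - \mathrm{softmax}_i(x) \\[4pt]
% Chain rule, using \nabla_{w_i} a_i = x and \partial a_i / \partial b_i = 1
\nabla_{w_i} L
  &= \sum_{j=1}^{m} \Big( \mathbb{1}\big[y^{(j)} = i\big]
      - \mathrm{softmax}_i\big(x^{(j)}\big) \Big)\, x^{(j)} \\
\nabla_{b_i} L
  &= \sum_{j=1}^{m} \Big( \mathbb{1}\big[y^{(j)} = i\big]
      - \mathrm{softmax}_i\big(x^{(j)}\big) \Big) \\[4pt]
% Unified form with augmented vectors
\nabla_{\tilde{w}_i} L
  &= \sum_{j=1}^{m} \Big( \mathbb{1}\big[y^{(j)} = i\big]
      - \mathrm{softmax}_i\big(x^{(j)}\big) \Big)\, \tilde{x}^{(j)}
\end{align*}
```

Each gradient is a sum of per-sample residuals (observed label indicator minus predicted probability) weighted by the input, so it vanishes when the predicted class probabilities match the empirical label frequencies in every input direction.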
