Question: 1.3. (Gradient Descent, 10 points) Suppose we have training data $\{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}^k$, $i = 1, 2, \ldots, N$.

(1) Find the closed-form solution of the following problem:

$$\min_{W,\, b} \sum_{i=1}^{N} \alpha_i \| y_i - W x_i - b \|_2^2,$$

where the diagonal entries of the diagonal matrix $A$, $\operatorname{diag}(A) = (\alpha_1, \alpha_2, \ldots, \alpha_N)$, are the weights for the different samples.

(2) Show how to use gradient descent to solve the problem.

Hint: You can use either the definition or the differentiation method to derive the derivatives. If you use the differentiation method, please note that

$$\sum_{i=1}^{N} \alpha_i \| y_i - W x_i - b \|_2^2 = \operatorname{tr}\left[ (Y - X\mathbf{W})^T A (Y - X\mathbf{W}) \right],$$

where $Y = (y_1, y_2, \ldots, y_N)^T \in \mathbb{R}^{N \times k}$, $X = [(x_1^T, 1)^T, (x_2^T, 1)^T, \ldots, (x_N^T, 1)^T]^T \in \mathbb{R}^{N \times (d+1)}$, $\mathbf{W} = (W, b)^T \in \mathbb{R}^{(d+1) \times k}$, and $A = \operatorname{diag}(\alpha_1, \alpha_2, \ldots, \alpha_N)$.
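
One way to approach both parts, sketched in the hint's matrix notation with $\mathbf{W} = (W, b)^T$ denoting the augmented parameter matrix: differentiating the trace objective gives the gradient $2 X^T A (X\mathbf{W} - Y)$. Setting it to zero yields $\mathbf{W}^* = (X^T A X)^{-1} X^T A Y$, assuming $X^T A X$ is invertible, and gradient descent repeats the update $\mathbf{W} \leftarrow \mathbf{W} - \eta \cdot 2 X^T A (X\mathbf{W} - Y)$ with a step size $\eta > 0$. The NumPy sketch below illustrates both. The function names, the default step size, and the iteration count are illustrative assumptions and not part of the original problem.

```python
import numpy as np

def augment(x):
    """Append a column of ones so the bias b is absorbed into the last row of W."""
    return np.hstack([x, np.ones((x.shape[0], 1))])

def closed_form(x, y, alpha):
    """Weighted least squares: solve (X^T A X) W = X^T A Y.

    x: (N, d) inputs, y: (N, k) targets, alpha: (N,) sample weights.
    Assumes X^T A X is invertible (hypothetical helper, not from the source).
    """
    X = augment(x)
    XtA = X.T * alpha  # broadcasting scales column i of X^T by alpha_i, i.e. X^T A
    return np.linalg.solve(XtA @ X, XtA @ y)

def gradient_descent(x, y, alpha, lr=None, steps=5000):
    """Minimize tr[(Y - XW)^T A (Y - XW)] with plain gradient descent."""
    X = augment(x)
    XtA = X.T * alpha
    H = XtA @ X  # X^T A X, symmetric positive semidefinite for alpha >= 0
    if lr is None:
        # Assumed step-size choice: 1 / L, where L = 2 * lambda_max(X^T A X)
        # is the Lipschitz constant of the gradient.
        lr = 1.0 / (2.0 * np.linalg.eigvalsh(H)[-1])
    W = np.zeros((X.shape[1], y.shape[1]))
    for _ in range(steps):
        grad = 2.0 * (H @ W - XtA @ y)  # equals 2 X^T A (X W - Y)
        W -= lr * grad
    return W
```

Both functions return a $(d+1) \times k$ array whose first $d$ rows correspond to $W^T$ and whose last row corresponds to $b^T$. On well-conditioned data the two results should agree to numerical precision; on ill-conditioned data gradient descent may need more iterations than the default assumed here.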
