Question:

1. Let the training examples be $x^{(1)} = [1, 0.5, 3]^\top$, $x^{(2)} = [1, 0.2, 1]^\top$, $y^{(1)} = 5$, $y^{(2)} = 2$. Write the MSE loss function $J(\theta)$ with the parameters $\theta$ and the above training data.

2. Derive the gradient (a 3-dimensional column vector) of $J(\theta)$ with respect to $\theta$. You need to derive the partial derivatives of $J(\theta)$ with respect to each element of $\theta$ explicitly. Then write the gradient vector in the form of a matrix-vector product plus a constant vector. We call this the "vectorized" gradient, which will be useful for gradient descent.

3. Use matrix calculus to derive the gradient of $J(\theta)$ with respect to $\theta$. The gradient must use the design matrix $X$, whose $i$-th row is $(x^{(i)})^\top$, $i = 1, 2$, and the target value vector $y = [y^{(1)}, y^{(2)}]^\top$. Please do not derive it from partial derivatives as in the previous part.

4. Use matrix calculus to find the gradient of the gradient $\nabla_\theta J(\theta)$. This is the so-called "second-order derivative" or the Hessian matrix $H$. Show that this matrix is positive semidefinite (PSD), defined as: for any vector $u \in \mathbb{R}^3$, $u^\top H u \ge 0$. Note: the result is a 3-by-3 matrix.
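As a sketch of where parts 1 through 4 lead, assuming the common $\frac{1}{2m}$ normalization of the MSE (conventions differ; some texts use $\frac{1}{m}$ or no factor, which rescales the gradient and Hessian but changes nothing else):

```latex
% Sketch assuming J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}(\theta^\top x^{(i)} - y^{(i)})^2 with m = 2.

% Part 1: the loss with the given data plugged in.
J(\theta) = \frac{1}{4}\Big[\big(\theta_0 + 0.5\,\theta_1 + 3\,\theta_2 - 5\big)^2
                          + \big(\theta_0 + 0.2\,\theta_1 + \theta_2 - 2\big)^2\Big]

% Parts 2 and 3: the vectorized gradient, a matrix-vector product plus a constant vector.
\nabla_\theta J(\theta)
  = \frac{1}{m}\sum_{i=1}^{m}\big(\theta^\top x^{(i)} - y^{(i)}\big)\, x^{(i)}
  = \frac{1}{m} X^\top X\,\theta - \frac{1}{m} X^\top y,
\qquad
X = \begin{bmatrix} 1 & 0.5 & 3 \\ 1 & 0.2 & 1 \end{bmatrix},\quad
y = \begin{bmatrix} 5 \\ 2 \end{bmatrix}

% Part 4: the Hessian is constant in \theta, and PSD because u^\top X^\top X u is a squared norm.
H = \nabla_\theta^2 J(\theta) = \frac{1}{m} X^\top X,
\qquad
u^\top H u = \frac{1}{m}\,\lVert X u \rVert_2^2 \ge 0
\quad \text{for all } u \in \mathbb{R}^3 .
```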

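A minimal numpy sketch, under the same assumed $\frac{1}{2m}$ convention, that checks the vectorized gradient against finite differences and verifies the Hessian is PSD by inspecting its eigenvalues:

```python
import numpy as np

# Design matrix: i-th row is (x^(i))^T, paired with the target vector y.
X = np.array([[1.0, 0.5, 3.0],
              [1.0, 0.2, 1.0]])
y = np.array([5.0, 2.0])
m = X.shape[0]

def loss(theta):
    # MSE loss J(theta) = (1/(2m)) * ||X theta - y||^2  (1/(2m) convention assumed).
    r = X @ theta - y
    return r @ r / (2 * m)

def grad(theta):
    # Vectorized gradient: (1/m) X^T X theta - (1/m) X^T y.
    return (X.T @ X @ theta - X.T @ y) / m

# Hessian H = (1/m) X^T X is a constant 3-by-3 matrix (independent of theta).
H = X.T @ X / m

# Check the analytic gradient with central finite differences at an arbitrary theta.
theta0 = np.array([0.1, -0.2, 0.3])
eps = 1e-6
num_grad = np.array([
    (loss(theta0 + eps * e) - loss(theta0 - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(grad(theta0), num_grad, atol=1e-6)

# PSD check: a symmetric matrix is PSD iff all its eigenvalues are >= 0.
print(np.linalg.eigvalsh(H))
```

The eigenvalue check mirrors the algebraic argument above: $u^\top H u = \frac{1}{m}\lVert X u \rVert^2$ is a scaled squared norm, so it can never be negative.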