Question: Let the training examples be $x^{(1)} = [1, 0.5, 3]^\top$, $x^{(2)} = [1, 0.2, 1]^\top$, with targets $y^{(1)} = 5$, $y^{(2)} = 2$.

1. Write the MSE loss function $J(\theta)$ in terms of the parameter vector $\theta$ and the training data above.
2. Derive the gradient (a 3-dimensional column vector) of $J(\theta)$ with respect to $\theta$. You need to derive the partial derivative of $J(\theta)$ with respect to each element of $\theta$ explicitly. Then write the gradient vector in the form of a matrix-vector product plus a constant vector. We call this the "vectorized" gradient, which is useful for gradient descent.
3. Use matrix calculus to derive the gradient of $J(\theta)$ with respect to $\theta$. The gradient must use the design matrix $X$, whose $i$-th row is $(x^{(i)})^\top$, $i = 1, 2$, and the target-value vector $y = [y^{(1)}, y^{(2)}]^\top$. Do not derive it from partial derivatives as in the previous part.
4. Use matrix calculus to find the gradient of the gradient $\nabla J(\theta)$. This is the so-called "second-order derivative," i.e., the Hessian matrix $H$. Show that this matrix is positive semidefinite (PSD), defined as $u^\top H u \ge 0$ for any vector $u \in \mathbb{R}^3$. Note: the result is a 3-by-3 matrix.
Step-by-Step Solution
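For part 1, a sketch under the common $\tfrac{1}{2m}$ convention with hypothesis $h_\theta(x) = \theta^\top x$ (some courses drop the $\tfrac{1}{2}$ or the $\tfrac{1}{m}$; adjust the constant to match your definition of MSE). With $m = 2$ examples:

\[
J(\theta) = \frac{1}{2m} \sum_{i=1}^{2} \left(\theta^\top x^{(i)} - y^{(i)}\right)^2
= \frac{1}{4}\left[(\theta_0 + 0.5\,\theta_1 + 3\,\theta_2 - 5)^2 + (\theta_0 + 0.2\,\theta_1 + \theta_2 - 2)^2\right].
\]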
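For part 2, a sketch of the explicit partial derivatives under the same convention. Writing $r^{(i)} = \theta^\top x^{(i)} - y^{(i)}$ for the $i$-th residual, the chain rule gives $\partial J / \partial \theta_j = \frac{1}{m}\sum_i r^{(i)} x_j^{(i)}$, so

\[
\frac{\partial J}{\partial \theta_0} = \frac{1}{2}\left(r^{(1)} + r^{(2)}\right), \quad
\frac{\partial J}{\partial \theta_1} = \frac{1}{2}\left(0.5\,r^{(1)} + 0.2\,r^{(2)}\right), \quad
\frac{\partial J}{\partial \theta_2} = \frac{1}{2}\left(3\,r^{(1)} + r^{(2)}\right).
\]

Stacking these and separating the $\theta$-dependent term from the constant term gives the requested matrix-vector-product-plus-constant form:

\[
\nabla J(\theta) = \frac{1}{m}\,X^\top X\,\theta - \frac{1}{m}\,X^\top y,
\qquad
X = \begin{bmatrix} 1 & 0.5 & 3 \\ 1 & 0.2 & 1 \end{bmatrix}, \quad
y = \begin{bmatrix} 5 \\ 2 \end{bmatrix}.
\]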
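For part 3, a matrix-calculus sketch using the standard identities $\nabla_\theta\,(\theta^\top A \theta) = (A + A^\top)\theta$ and $\nabla_\theta\,(b^\top \theta) = b$. Under the same convention, $J(\theta) = \frac{1}{2m}\lVert X\theta - y\rVert^2$, and expanding the quadratic:

\[
J(\theta) = \frac{1}{2m}\left(\theta^\top X^\top X\,\theta - 2\,y^\top X\,\theta + y^\top y\right)
\;\Longrightarrow\;
\nabla J(\theta) = \frac{1}{2m}\left(2\,X^\top X\,\theta - 2\,X^\top y\right) = \frac{1}{m}\,X^\top (X\theta - y),
\]

which matches the vectorized gradient from part 2.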

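For part 4, the gradient $\nabla J(\theta) = \frac{1}{m}X^\top X\,\theta - \frac{1}{m}X^\top y$ is affine in $\theta$, so differentiating once more leaves only the constant 3-by-3 matrix:

\[
H = \frac{1}{m}\,X^\top X,
\qquad
u^\top H u = \frac{1}{m}\,u^\top X^\top X\,u = \frac{1}{m}\,\lVert X u \rVert^2 \ge 0 \quad \text{for all } u \in \mathbb{R}^3,
\]

so $H$ is PSD. (With only $m = 2$ examples, $X$ has rank at most 2, so $H$ is singular: PSD but not positive definite.)

A minimal NumPy check of the formulas above; the $\tfrac{1}{2m}$ convention and the test point theta are illustrative assumptions, not part of the question:

import numpy as np

# Rows of the design matrix X are (x^(i))^T; y holds the targets from the question.
X = np.array([[1.0, 0.5, 3.0],
              [1.0, 0.2, 1.0]])
y = np.array([5.0, 2.0])
m = X.shape[0]

def loss(theta):
    # J(theta) = (1/(2m)) * ||X theta - y||^2  (1/(2m) convention assumed)
    r = X @ theta - y
    return (r @ r) / (2 * m)

def grad(theta):
    # Vectorized gradient: (1/m) X^T (X theta - y)
    return X.T @ (X @ theta - y) / m

H = X.T @ X / m  # Hessian is constant in theta

theta = np.array([0.1, -0.2, 0.3])  # arbitrary test point, chosen for illustration

# Central finite differences agree with the analytic gradient.
eps = 1e-6
fd = np.array([(loss(theta + eps * e) - loss(theta - eps * e)) / (2 * eps)
               for e in np.eye(3)])
assert np.allclose(grad(theta), fd, atol=1e-6)

# PSD check: eigenvalues of the symmetric matrix H are all >= 0
# (one is ~0 here, up to floating-point rounding, because X has rank 2 < 3).
print("eigenvalues of H:", np.linalg.eigvalsh(H))

The near-zero eigenvalue illustrates why $J$ is convex but its minimizer is not unique with only two examples: $H$ has a nontrivial null space, so an entire affine set of parameter vectors attains the minimum loss.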