Question:


5. Recall that a linear system of equations (solved when fitting linear regression models) is ill-posed when there are more variables/predictors (columns) than equations/instances (rows). For example, the equation $a + b = 5$ does not have a unique solution for $(a, b)$, since there are two variables and just one equation.

This question deals with ridge and linear regression. Consider a centered data matrix $X$ with $n$ rows and $p$ predictors and outcome vector $y$. Let $x_i$ refer to row/instance $i$ of $X$ and $x_{ij}$ to the $j$th predictor in instance $i$. The models trained in this question will not have an intercept.

We now create a new data matrix $X'$ by taking $X$ and adding $p$ new rows to it. The new rows are $x_{n+1}, \ldots, x_{n+p}$. In each of the new rows, set all elements to zero except for $x_{n+j,j} = \sqrt{\lambda}$. So the $j$th new row is all zeros, except for the $j$th column. We similarly create a new outcome variable $y'$ by appending $p$ zeros to $y$ (so $y'_i = y_i$ for $1 \le i \le n$ and $y'_i = 0$ for $n+1 \le i \le n+p$). Figure 1 gives a visual representation of the augmented matrix.

Note: questions (c)-(e) are for extra credit.

Figure 1: The original $(X, y)$ (left), and the augmented data $(X', y')$ (right).

(a) (2 points) Suppose we do ridge regression (without an intercept) on $(X, y)$ to get coefficients $\beta = (\beta_1, \ldots, \beta_p)$. What is the ridge regression objective function in terms of $X$, $y$, $\beta$, and $\lambda$?
(b) (3 points) Suppose we train a linear regression model on $(X', y')$ and get coefficients $\beta'$. What is the residual of instance $x_{n+1}$ (the first row that we added to $X'$) in terms of $\lambda$, $X'$, $y'$, $\beta'$?
(c) (2 points) Using your solution to the previous question, what is the linear regression objective function for the model trained on $(X', y')$?
(d) (3 points) Compare the ridge objective for $(X, y)$ and the linear regression objective on $(X', y')$. What do you notice?
(e) (1 point) We have focused on using regularization to shrink coefficients and get simpler models. Give one other reason to use ridge regression.
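
The augmentation described in the question can be tried out numerically. The following is only a rough sketch, not part of the original problem: the helper name `augment`, the random data, the sizes `n = 5`, `p = 8`, and the value `lam = 2.0` are all illustrative assumptions. It also assumes the unscaled ridge objective (squared error plus `lam` times the squared norm of the coefficients), which is one common convention. It builds the augmented data, fits ordinary least squares on it, and compares the result with the closed-form ridge coefficients on the original data.

```python
import numpy as np

def augment(X, y, lam):
    """Hypothetical helper: stack sqrt(lam) * I_p under X and p zeros under y."""
    n, p = X.shape
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    return X_aug, y_aug

# Illustrative data (assumed): more predictors than rows, so X alone is ill-posed.
rng = np.random.default_rng(0)
n, p, lam = 5, 8, 2.0
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)   # center each predictor
y = rng.normal(size=n)
y = y - y.mean()         # center the outcome

X_aug, y_aug = augment(X, y, lam)

# Ordinary least squares (no intercept) on the augmented data.
beta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

# Closed-form ridge coefficients on the original data, assuming the
# objective ||y - X b||^2 + lam * ||b||^2.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

print("augmented shape:", X_aug.shape)
print("rank of X:", np.linalg.matrix_rank(X))
print("rank of X_aug:", np.linalg.matrix_rank(X_aug))
print("coefficients agree:", np.allclose(beta_aug, beta_ridge))
```

Comparing the two ranks shows how the added rows change whether the system has a unique least-squares solution, which is a useful thing to keep in mind for parts (d) and (e).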
