

Question 4. Consider k-nearest-neighbor (k-NN) or kernel regression with a Gaussian kernel. Show how to control the bias-variance trade-off for either of the methods. That is, say you pick k-NN. Clearly, k is a parameter you control. Show how the bias and variance change as a result of changing k (the analogous parameter in kernel regression is the bandwidth h; see Q1). Even though I am not looking for a formal proof, you need to be able to use the definition of bias and variance either on an example, by simulation, or mathematically. You also need to be able to give an intuitive explanation (something like, "as we see here, as k increases the bias ... because ..."). Note that I will accept any method's bias-variance behavior as long as you can show concretely that the behavior exists as a result of changing something in the method/model.

2. Bias-variance trade-off in the presence of multicollinearity. Let $Y = X\beta + \varepsilon$ be a linear regression model in vector form. Here, $Y, \varepsilon \in \mathbb{R}^n$, $X$ is a known $n \times p$ predictor matrix, and $\beta \in \mathbb{R}^p$ is the parameter vector to be estimated. We assume $E(\varepsilon \mid X) = 0$ and $\operatorname{Var}(\varepsilon \mid X) = I$. We showed in class that the least squares estimator is given by

$$\hat\beta = \arg\min_{\beta \in \mathbb{R}^p} (Y - X\beta)'(Y - X\beta) = \arg\min_{\beta \in \mathbb{R}^p} \|Y - X\beta\|^2 = (X'X)^{-1}X'Y,$$

which is the minimum variance unbiased estimator. Unbiasedness is a desirable property, but it may come at a cost of variance. Note that the estimator $\hat\beta$ and its variance both depend on $(X'X)^{-1}$. Inverting the matrix $X'X$ is problematic if it is ill-conditioned, or, in the regression setting, if the predictor variables exhibit multicollinearity (that is, if one predictor variable can be linearly predicted by the others with a high degree of accuracy). One approach to alleviate this problem is to add an $\ell_2$-penalty to the sum of squares criterion. This is called ridge regression:

$$\hat\beta_\lambda = \arg\min_{\beta \in \mathbb{R}^p} (Y - X\beta)'(Y - X\beta) + \lambda\,\beta'\beta = \arg\min_{\beta \in \mathbb{R}^p} \|Y - X\beta\|^2 + \lambda\|\beta\|^2,$$

where $\lambda$ is a fixed nonnegative constant called the ridge parameter.

(a) Show that $\hat\beta_\lambda = (X'X + \lambda I)^{-1}X'Y$. Although $X'X$ may be ill-conditioned, the matrix $X'X + \lambda I$ becomes increasingly well-conditioned as $\lambda$ increases.

(b) Calculate $E(\hat\beta_\lambda \mid X)$ and $\operatorname{Var}(\hat\beta_\lambda \mid X)$. What can you say about the bias-variance trade-off for the ridge estimator $\hat\beta_\lambda$ compared to the least squares estimator $\hat\beta$?
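For Question 4, one concrete way to exhibit the trade-off is exactly what the prompt suggests: simulation. The sketch below is an illustration rather than the official posted answer; the true function f(x) = sin(2*pi*x), the noise level, the test point x0 = 0.25, and the grid of k values are all arbitrary choices made for the demonstration. It repeatedly regenerates noisy training sets, computes the k-NN prediction at x0 for several k, and estimates the squared bias and variance of that prediction over the Monte Carlo replications.

```python
import numpy as np

# Monte Carlo estimate of bias^2 and variance of k-NN regression at one test point.
# Illustrative assumptions (not from the problem statement): f(x) = sin(2*pi*x),
# noise sd 0.3, n = 200 training points on [0, 1], test point x0 = 0.25.
rng = np.random.default_rng(0)
f = lambda x: np.sin(2 * np.pi * x)
n, sigma, x0, n_sims = 200, 0.3, 0.25, 500
ks = [1, 5, 25, 100]

preds = {k: [] for k in ks}
for _ in range(n_sims):
    x = rng.uniform(0.0, 1.0, n)                  # fresh training inputs
    y = f(x) + rng.normal(0.0, sigma, n)          # fresh noisy responses
    order = np.argsort(np.abs(x - x0))            # neighbours of x0, nearest first
    for k in ks:
        preds[k].append(y[order[:k]].mean())      # k-NN prediction at x0

for k in ks:
    p = np.array(preds[k])
    bias2 = (p.mean() - f(x0)) ** 2               # squared bias of the prediction
    var = p.var()                                 # variance of the prediction
    print(f"k={k:>3}  bias^2={bias2:.4f}  variance={var:.4f}")
```

Running this shows the expected pattern: as k increases the variance falls (the prediction averages roughly k independent noise terms, so it behaves like sigma^2/k), while the squared bias grows (the average is taken over an ever wider neighbourhood whose points no longer track f(x0)); k = 1 gives the opposite extreme of low bias and high variance. Kernel regression with a Gaussian kernel behaves the same way when the bandwidth h plays the role of k.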
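For part (a) of the ridge question, a standard sketch of the derivation (differentiating the penalized criterion, in the notation of the problem) is:

$$
\begin{aligned}
Q(\beta) &= (Y - X\beta)'(Y - X\beta) + \lambda\,\beta'\beta,\\
\nabla_\beta Q(\beta) &= -2X'(Y - X\beta) + 2\lambda\beta = 0
\;\;\Longrightarrow\;\; (X'X + \lambda I)\beta = X'Y
\;\;\Longrightarrow\;\; \hat\beta_\lambda = (X'X + \lambda I)^{-1}X'Y.
\end{aligned}
$$

The conditioning claim follows from the eigenvalues: if $X'X$ has eigenvalues $d_1 \ge \dots \ge d_p \ge 0$, then $X'X + \lambda I$ has eigenvalues $d_j + \lambda > 0$, so it is invertible for every $\lambda > 0$, and its condition number $(d_1 + \lambda)/(d_p + \lambda)$ decreases toward 1 as $\lambda$ grows.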
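For part (b), using $E(\varepsilon \mid X) = 0$ and $\operatorname{Var}(\varepsilon \mid X) = I$ as stated in the problem, a sketch of the computation is:

$$
\begin{aligned}
E(\hat\beta_\lambda \mid X) &= (X'X + \lambda I)^{-1}X'\,E(Y \mid X) = (X'X + \lambda I)^{-1}X'X\,\beta \;\neq\; \beta \quad \text{for } \lambda > 0,\\
\operatorname{Var}(\hat\beta_\lambda \mid X) &= (X'X + \lambda I)^{-1}X'\,\operatorname{Var}(Y \mid X)\,X\,(X'X + \lambda I)^{-1} = (X'X + \lambda I)^{-1}X'X\,(X'X + \lambda I)^{-1}.
\end{aligned}
$$

Compared with least squares, for which $E(\hat\beta \mid X) = \beta$ and $\operatorname{Var}(\hat\beta \mid X) = (X'X)^{-1}$, the ridge estimator is biased (it shrinks $\beta$ toward zero, and the bias vanishes only as $\lambda \to 0$) but has uniformly smaller variance: in the eigenbasis of $X'X$ the variances are $d_j/(d_j + \lambda)^2$ versus $1/d_j$. Increasing $\lambda$ therefore trades more bias for less variance, and for a suitably chosen $\lambda$ the drop in variance can outweigh the added squared bias, lowering mean squared error; this is most valuable precisely when $X'X$ is nearly singular due to multicollinearity.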
