Question: Assume we have a 10-variable regression problem with a training data set and a testing data set. We run the following three regression methods on the training data: (1) best subsets, (2) forward selection (forward stepwise), and (3) backward elimination (backward stepwise). For each method we keep the chosen models with 5 and 7 variables (a total of 6 models).
(a) Which 5-variable model will have the lowest Residual Sum of Squares (RSS) for the training data? Briefly explain your answer.
(b) Which 5-variable model will have the lowest prediction RSS, i.e., the lowest RSS when the model fit on the training data is used to predict the testing data? Briefly explain your answer.
(c) For which of the methods are we guaranteed that the 5 variables in the 5-variable model are a subset of the 7 variables in the 7-variable model?
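The setup in the question can be explored empirically. The following is a minimal sketch, not part of the original problem: it assumes synthetic Gaussian data, an arbitrary coefficient vector and random seed, and ordinary least squares with an intercept. All function names (`fit`, `rss`, `best_subset`, `forward_path`, `backward_path`) are hypothetical helpers written for this illustration.

```python
import itertools
import numpy as np

def fit(X, y, cols):
    """Least-squares coefficients (intercept first) using only the given columns."""
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def rss(X_fit, y_fit, X_eval, y_eval, cols):
    """Fit on (X_fit, y_fit), then return the RSS on (X_eval, y_eval)."""
    coef = fit(X_fit, y_fit, cols)
    A = np.column_stack([np.ones(len(y_eval)), X_eval[:, list(cols)]])
    resid = y_eval - A @ coef
    return float(resid @ resid)

def best_subset(X, y, k):
    """Exhaustive search: the size-k subset with the lowest training RSS."""
    subsets = itertools.combinations(range(X.shape[1]), k)
    return min(subsets, key=lambda c: rss(X, y, X, y, c))

def forward_path(X, y):
    """Forward stepwise: add the variable that lowers training RSS the most."""
    p, chosen, path = X.shape[1], [], {}
    while len(chosen) < p:
        remaining = [j for j in range(p) if j not in chosen]
        best = min(remaining, key=lambda j: rss(X, y, X, y, chosen + [j]))
        chosen = chosen + [best]
        path[len(chosen)] = tuple(sorted(chosen))
    return path

def backward_path(X, y):
    """Backward stepwise: drop the variable whose removal raises training RSS the least."""
    chosen = list(range(X.shape[1]))
    path = {len(chosen): tuple(chosen)}
    while len(chosen) > 1:
        drop = min(chosen, key=lambda j: rss(X, y, X, y, [c for c in chosen if c != j]))
        chosen = [c for c in chosen if c != drop]
        path[len(chosen)] = tuple(chosen)
    return path

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[:, 1] += 0.8 * X[:, 0]  # introduce some correlation between predictors
beta = np.array([3.0, 0.0, -2.0, 0.0, 1.5, 0.0, 0.0, 1.0, 0.0, 0.5])
y = X @ beta + rng.normal(size=200)
X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]

fwd, bwd = forward_path(X_train, y_train), backward_path(X_train, y_train)
models = {}
for k in (5, 7):
    models[("best", k)] = best_subset(X_train, y_train, k)
    models[("forward", k)] = fwd[k]
    models[("backward", k)] = bwd[k]

train_rss = {m: rss(X_train, y_train, X_train, y_train, c) for m, c in models.items()}
test_rss = {m: rss(X_train, y_train, X_test, y_test, c) for m, c in models.items()}

for m in models:
    print(m, models[m], round(train_rss[m], 2), round(test_rss[m], 2))
```

Because best subsets searches every 5-variable combination, its training RSS can never exceed that of the stepwise 5-variable models, and each stepwise method builds its models by adding or removing one variable at a time, so its 5-variable and 7-variable models are nested. The test RSS ordering, by contrast, depends on the particular data and seed, so this sketch should not be read as settling part (b).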
