MLR-FS.2022.Descript

Multiple Linear Regression and Feature Selection

Shown below is the output of two linear regressions run on the same dataset. Model 1 contains all available independent variables. Model 2 is the result of removing from Model 1 the variable with the largest p-value.

Model 1
Variable    Coefficient   p-value
Intercept      -336.790     0.012
X1                1.650     0.343
X2               -5.630     0.680
X3                0.260     0.878
X4              185.500     0.010

Model 2
Variable    Coefficient   p-value
Intercept      -342.919     0.078
X1                1.834     0.174
X2               -5.749     0.667
X4              181.220     0.005
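
The elimination rule described above (remove the variable with the largest p-value) can be sketched in a few lines of Python. This is only an illustration: the p-values are copied from the Model 1 output, while the names `model1_pvalues` and `variable_to_drop` are hypothetical and introduced here. The intercept is left out because it is not an independent variable.

```python
# p-values of the independent variables in Model 1 (intercept excluded),
# copied from the regression output in the question.
model1_pvalues = {"X1": 0.343, "X2": 0.680, "X3": 0.878, "X4": 0.010}


def variable_to_drop(pvalues):
    """Return the variable with the largest p-value (hypothetical helper)."""
    return max(pvalues, key=pvalues.get)


dropped = variable_to_drop(model1_pvalues)
remaining = [name for name in model1_pvalues if name != dropped]
print(dropped, remaining)
```

The rule selects X3, which agrees with the Model 2 output listing only X1, X2, and X4.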

1. Which model is better in terms of goodness-of-fit?

Model 1

Model 2

2. Which independent variable in Model 1 is the most statistically significant?

X1

X4

X3

X2

3. Which independent variable in Model 2 is the most statistically significant?

X1

X4

X2

4. Suppose we continue with this process of feature elimination where we remove the independent variable with the highest p-value, producing Model 3. Which variable is likely to become newly significant as a result of this iteration? Is goodness-of-fit guaranteed to be better in Model 3 than in Model 2?

X1; no

X1; yes

X4; yes

X4; no

X2; no

X2; yes
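
When weighing goodness-of-fit across nested models like these, it may help to recall that, for ordinary least squares with an intercept, dropping a regressor cannot increase plain R², whereas adjusted R² can move in either direction. The sketch below checks this on synthetic data. The data, the seed, and the function name `r_squared` are assumptions made for illustration and have nothing to do with the dataset in the question.

```python
import numpy as np


def r_squared(X, y):
    """Fit OLS with an intercept; return (R^2, adjusted R^2)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    ss_res = float(resid @ resid)                 # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2, adj


# Synthetic data (assumption): only the last column truly drives y.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = 3.0 * X[:, 3] + rng.normal(size=50)

r2_full, adj_full = r_squared(X, y)
r2_reduced, adj_reduced = r_squared(X[:, [0, 1, 3]], y)  # drop the third column

print(f"full:    R2={r2_full:.4f}  adjusted={adj_full:.4f}")
print(f"reduced: R2={r2_reduced:.4f}  adjusted={adj_reduced:.4f}")
```

Whichever measure a course uses for goodness-of-fit, comparing the two printed rows shows why an improvement after removing a variable is not something to take for granted.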

5. In linear regression, which of the following is not an advantage enjoyed by a model with fewer independent variables?

lower computational requirements

less susceptible to underfitting

easier to understand and explain

lower cost of data collection / management

Explain your answers to the preceding 5 questions. Please explicitly state what part of your response to this question goes with which multiple-choice question.
