Question: Compare the two outputs from a Regression Analysis. They are both from the same dataset. The dataset looks to determine median housing values in various

Compare the two outputs from a regression analysis. Both come from the same dataset, which is used to predict median housing values in various Boston suburbs. Each row describes one suburb, with summary (aggregate) data for all single-family residential (as opposed to rental) properties in that suburb.
The top picture shows the result of running a regression analysis on median housing values (MEDV) as predicted by all the variables in the dataset. The second summary uses feature selection to find the top 3 features (or variables) for predicting the median housing values. These three features are RM (average number of rooms per dwelling), LSTAT (percent of the population that is lower class), and PTRATIO (pupil-teacher ratio by town). Using standard analytics model selection criteria, which would be the preferred model, and why?
A) MEDV as Predicted by all variables in the dataset
F2022 Final - Reg Anal Case - Regression Eval - All data.png
B) MEDV as Predicted by RM, LSTAT, and PTRATIO
F2022 Final - Reg Anal Case - Regression Eval - SelectKBest data.png
Group of answer choices
Model A would be the preferred model because it explains a higher percentage of the variance in MEDV.
Model B would be better because it uses fewer predictors and is, therefore, more easily implemented.
It does not matter which model you use since they have nearly identical accuracy ratings.
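
One common way to weigh explained variance against the number of predictors is adjusted R-squared, which penalizes each extra predictor. The sketch below is only an illustration: the two regression outputs are images that are not reproduced here, so the R-squared values are hypothetical placeholders, and the row count of 506 is an assumption based on the classic Boston housing data. The function name `adjusted_r2` is also made up for this example. Substitute the figures from the actual summaries before drawing a conclusion.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a linear model with an intercept and p predictors fit on n rows."""
    if n - p - 1 <= 0:
        raise ValueError("need more rows than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# ASSUMPTION: row count of the classic Boston housing data; check the real output.
n_obs = 506

# HYPOTHETICAL placeholder values, not read from the question's images.
model_a = {"r2": 0.74, "p": 13}  # Model A: all predictors
model_b = {"r2": 0.68, "p": 3}   # Model B: RM, LSTAT, PTRATIO

adj_a = adjusted_r2(model_a["r2"], n_obs, model_a["p"])
adj_b = adjusted_r2(model_b["r2"], n_obs, model_b["p"])

print(f"Model A adjusted R^2: {adj_a:.4f}")
print(f"Model B adjusted R^2: {adj_b:.4f}")
```

If the regression summaries also report AIC or BIC, those can be compared the same way, with lower values preferred. A model with more predictors is only favored by these criteria when its gain in fit outweighs the penalty for the added terms.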
