Question:

b. Fit a multiple linear regression model to the median house price (MEDV) as a function of CRIM, CHAS, and RM. Write the equation for predicting the median house price from the predictors in the model.

c. Using the estimated regression model, what median house price is predicted for a tract in the Boston area that does not bound the Charles River, has a crime rate of 0.1, and where the average number of rooms per house is 6? What is the prediction error?

d. Reduce the number of predictors:
i. Which of the 13 predictors are likely to be measuring the same thing? Discuss the relationships among INDUS, NOX, and TAX.
ii. Compute the correlation table for the 12 numerical predictors and search for highly correlated pairs. These pairs are potentially redundant and can cause multicollinearity. Choose which ones to remove based on this table.
iii. Use stepwise regression with the three options (backward, forward, both) to reduce the remaining predictors as follows: Run stepwise on the training set. Choose the top model from each stepwise run. Then use each of these models separately to predict the validation set. Compare RMSE, MAPE, and mean error, as well as lift charts. Finally, describe the best model.

Problem 2: (20 points) Please refer to the exhaustive search example from the lecture slides (copied below) and answer the following questions:
(a) Generate a C plot based on its output, i.e., "sumScp".
(b) Recommend the optimal model based on the C plot.
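For parts b and c, the mechanics can be sketched as follows. This is a minimal illustration, not the solution to the assignment: the data are synthetic stand-ins generated inside the script (the Boston Housing file is not part of this page), the coefficients in TRUE_BETA are made up, and the helper names solve, fit_ols, and predict are hypothetical. The fitted equation has the form MEDV = b0 + b1*CRIM + b2*CHAS + b3*RM, and the prediction error for a record is its actual MEDV minus the predicted MEDV.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, b2, ...]."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, crim, chas, rm):
    """Evaluate MEDV = b0 + b1*CRIM + b2*CHAS + b3*RM."""
    return beta[0] + beta[1] * crim + beta[2] * chas + beta[3] * rm


# Synthetic stand-in data (NOT the Boston Housing data); values are made up.
random.seed(0)
rows = [
    [random.uniform(0, 10), float(random.randint(0, 1)), random.uniform(4, 9)]
    for _ in range(200)
]
TRUE_BETA = [-30.0, -0.25, 4.0, 8.5]  # hypothetical: intercept, CRIM, CHAS, RM
medv = [predict(TRUE_BETA, *r) + random.gauss(0, 1) for r in rows]

beta_hat = fit_ols(rows, medv)

# Part c: tract not bounding the river (CHAS = 0), CRIM = 0.1, RM = 6.
pred = predict(beta_hat, crim=0.1, chas=0, rm=6)

# Error measures on the fitting data: error = actual - predicted.
errors = [yi - predict(beta_hat, *r) for r, yi in zip(rows, medv)]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
mean_error = sum(errors) / len(errors)

print("coefficients (intercept, CRIM, CHAS, RM):", [round(b, 3) for b in beta_hat])
print("predicted MEDV:", round(pred, 3))
print("RMSE:", round(rmse, 3), "mean error:", round(mean_error, 6))
```

With the real data, the same error measures would be computed on the validation set rather than on the rows used for fitting, and a statistics package would normally replace the hand-written solver.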
