Question: This is the full assignment, but I only really need the answers for parts F and G. Please help.
A. Identify the most important variables using the best subset algorithm to predict the logarithm of housing prices, using
three model selection criteria (rss = residual sum of squares; adjr^2 = adjusted R^2; and bic = Bayesian
information criterion).
B. Provide descriptive statistics for the important variables identified in Part 1 A.
C. Run a multiple linear regression model with the variables identified in Part 1 A. Write the regression equation
and interpret any four regression coefficients.
D. Run a multiple regression analysis with the variables identified in Part 1, adding a quadratic effect of
house size (SQFT), and provide average marginal effects (ame) with comments.
E. Plot and interpret the predictions and average marginal effects of house size.
F. Run MARS (multivariate adaptive regression splines) using all variables in the housing data to investigate non-linear response patterns. Comment on any findings that show non-linear patterns (if they exist).
G. Are the variables identified by MARS the same as or different from those identified in Part 1 A? Please comment on any differences.
PRICEK House prices in thousand dollars
DSALE Distress Sale (1 = Yes, 0 = No)
BEDROOM Number of Bedrooms
SQFT House size
Ln_LotSize Log of Lot Size
CENTRALAIR 1 = Yes, 0 = No
BRICK 1 = Full or Partial Brick, otherwise 0
GARAGE Number of car garages
FIREPLACE Number of fireplaces
BASE_FIN 1 = Yes, 0 = No
MASONRY 1 = Yes, 0 = No
PUBOPEN 1 = Yes, 0 = No (public open space)
MICHLAKE 1 = Located within 1 mile of Lake Michigan, 0 = No
LAKE_RIVER 1 = Located within 660 feet of a river or lake, 0 = otherwise
METRA_DIST Distance from Metro station
RAIL_DIST Distance from Rail station
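Since Part G compares the MARS variables against the best subset result from Part 1 A, it may help to see how the three criteria are computed. What follows is only a minimal sketch in plain Python, not the assignment's solution: the data are synthetic (made up for illustration, reusing three variable names from the list above), the helper names `solve`, `ols_rss` and `best_subset` are hypothetical, and BIC is written as n*ln(RSS/n) + (k+1)*ln(n), which holds only up to an additive constant. In practice a package such as R's `leaps` (`regsubsets`) would normally do this on the real housing data.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_rss(columns, y):
    """Residual sum of squares from regressing y on an intercept plus columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(r[j] * r[k] for r in design) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def best_subset(data, y):
    """Fit every non-empty subset of predictors and score it on all three criteria."""
    n = len(y)
    ybar = sum(y) / n
    tss = sum((v - ybar) ** 2 for v in y)
    rows = []
    for k in range(1, len(data) + 1):
        for subset in itertools.combinations(data, k):
            rss = ols_rss([data[name] for name in subset], y)
            adjr2 = 1 - (rss / (n - k - 1)) / (tss / (n - 1))
            bic = n * math.log(rss / n) + (k + 1) * math.log(n)
            rows.append({"vars": subset, "rss": rss, "adjr2": adjr2, "bic": bic})
    return rows


# Synthetic data (NOT the housing data): GARAGE has no true effect here.
rng = random.Random(0)
n = 40
data = {
    "SQFT": [rng.uniform(0.8, 4.0) for _ in range(n)],      # thousands of sq ft
    "BEDROOM": [float(rng.randint(1, 5)) for _ in range(n)],
    "GARAGE": [float(rng.randint(0, 3)) for _ in range(n)],
}
log_price = [4.0 + 0.5 * s + 0.1 * b + rng.gauss(0, 0.05)
             for s, b in zip(data["SQFT"], data["BEDROOM"])]

rows = best_subset(data, log_price)
best_rss = min(rows, key=lambda r: r["rss"])        # smaller is better
best_adjr2 = max(rows, key=lambda r: r["adjr2"])    # larger is better
best_bic = min(rows, key=lambda r: r["bic"])        # smaller is better

print("RSS picks:   ", best_rss["vars"])
print("adjR^2 picks:", best_adjr2["vars"])
print("BIC picks:   ", best_bic["vars"])
```

Note that RSS can never increase when a variable is added, so across model sizes it favours the largest model; it is mainly useful for ranking subsets of the same size, while adjusted R^2 and BIC penalise extra variables.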
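For Part F, MARS builds its fit from mirrored hinge functions max(0, x - c) and max(0, c - x), choosing the knot c that lowers the residual sum of squares the most. A retained knot on a variable such as SQFT is what signals a non-linear response: the slope differs on either side of it. The sketch below illustrates only that single-variable knot search. It is not full MARS (no forward pass over all variables, no interactions, no backward pruning), the data are synthetic and noise-free with a kink deliberately placed at 2.0, and the helper names `solve`, `ols`, `hinge_pair` and `best_knot` are hypothetical. On the real housing data a package such as R's `earth` would typically be used instead.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least squares of y on an intercept plus columns; returns (coefs, rss)."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(r[j] * r[k] for r in design) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, rss


def hinge_pair(x, knot):
    """Mirrored hinge basis: max(0, x - knot) and max(0, knot - x)."""
    right = [max(0.0, v - knot) for v in x]
    left = [max(0.0, knot - v) for v in x]
    return right, left


def best_knot(x, y, candidates):
    """Try each candidate knot and keep the one with the smallest RSS."""
    fits = []
    for c in candidates:
        right, left = hinge_pair(x, c)
        beta, rss = ols([right, left], y)
        fits.append((rss, c, beta))
    rss, c, beta = min(fits)
    return c, beta, rss


# Synthetic example (NOT the housing data): log price rises with house size
# at slope 0.6 up to 2.0 (thousand sq ft), then at slope 0.2 beyond it.
x = [i / 10 for i in range(8, 41)]
y = [4.5 + 0.6 * v - 0.4 * max(0.0, v - 2.0) for v in x]

# Use interior observed values as candidate knots so both hinges are non-zero.
knot, beta, rss = best_knot(x, y, x[2:-2])
right_slope = beta[1]    # slope above the knot
left_slope = -beta[2]    # slope below the knot (the left hinge decreases in x)

print("knot:", knot)
print("slope below knot:", round(left_slope, 3))
print("slope above knot:", round(right_slope, 3))
```

For Part G, the comparison then comes down to listing which variables appear in the basis functions MARS retains and checking that list against the variables chosen by best subset in Part 1 A.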
