Question:
State | Location of the home: CA, NJ, NY, PA |
Price | Asking price (in $1,000's) |
Size | Area of all rooms (in 1,000's sq. ft.) |
Beds | Number of bedrooms |
Baths | Number of bathrooms |
Prepare a scatterplot of Price and Size and justify an appropriate model using a log transformation for these two variables (add the graphs to your PDF submission). Which transformation works better for a model with these two variables? Answer with exactly one of these options: log-lin, lin-log, or log-log (be careful with typos).
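For reference, the three candidate specifications named above can be written as follows. The symbols \beta_0, \beta_1 and \varepsilon are notation assumed here, not taken from the assignment, and the labels follow the common convention that the first word refers to the response (Price) and the second to the predictor (Size).

```latex
\begin{align*}
\text{log-lin:} \quad & \ln(\text{Price}) = \beta_0 + \beta_1\,\text{Size} + \varepsilon \\
\text{lin-log:} \quad & \text{Price} = \beta_0 + \beta_1 \ln(\text{Size}) + \varepsilon \\
\text{log-log:} \quad & \ln(\text{Price}) = \beta_0 + \beta_1 \ln(\text{Size}) + \varepsilon
\end{align*}
```

In the log-log form, \beta_1 is read as an elasticity: a 1% increase in Size is associated with approximately a \beta_1% change in Price.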
Run a regression model predicting Price as a function of Size, Bedrooms, and Baths (Model1). Consider only main effects, and use the transformation you selected for the variables Price and Size. Report the coefficients of the variables you used (or their transformations) to 4 decimal places.
Produce the residual diagnostic plots for Model1 and attach them to your file. Run a Shapiro-Wilk test of normality on the residuals of this model and report the p-value to 4 decimal places.
Find any observations that are unusual and evaluate the top 3 extreme observations for their impact on your conclusion. List the observation numbers of the points you considered extreme (or outliers) in ascending order. Base your answer on the Residuals vs Fitted plot or the Q-Q plot.
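One way to cross-check the points read off the plots is to rank residuals by absolute standardized size. The sketch below is only an illustration: the function name `top_extreme` and the toy residuals are made up, and it uses a plain z-score rather than the studentized residuals that regression software usually plots.

```python
from statistics import mean, stdev


def top_extreme(residuals, k=3):
    """Return the 1-based observation numbers of the k residuals with the
    largest absolute z-score, listed in ascending order."""
    center = mean(residuals)
    spread = stdev(residuals)
    # Pair each absolute z-score with its 1-based observation number.
    scored = [(abs((r - center) / spread), i + 1) for i, r in enumerate(residuals)]
    largest = sorted(scored, reverse=True)[:k]
    return sorted(obs for _, obs in largest)


# Toy residuals (hypothetical values, not from the housing data).
toy_residuals = [0.1, -0.2, 2.5, 0.05, -1.9, 0.3, -0.1, 1.4]
print(top_extreme(toy_residuals))  # [3, 5, 8]
```

With real output, the residual vector of Model1 would replace `toy_residuals`.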
Run the regression again without these 3 outliers/extreme values (Model2) and compare it with Model1. Do you recommend removing them? Show your work in your file and use this question to explain your reasoning (be brief).
Calculate the estimated effect of a 10% increase in home size on the price. Use Model1 and report your answer as a percentage without the percent sign; for example, report 12.45% as 12.45.
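As a sketch of the arithmetic, assuming Model1 turned out to be log-log in Price and Size: multiplying Size by 1.10 multiplies the predicted Price by 1.10 raised to the Size coefficient. The function name and the coefficient value 0.5 below are hypothetical and only show the calculation; under a different transformation the formula would change.

```python
def pct_change_in_price(beta_size, pct_increase=10.0):
    """Percent change in Price for a pct_increase% rise in Size,
    assuming a log-log model with slope beta_size on ln(Size)."""
    ratio = 1.0 + pct_increase / 100.0
    return (ratio ** beta_size - 1.0) * 100.0


# Hypothetical coefficient, not an estimate from the housing data.
example_beta = 0.5
print(f"{pct_change_in_price(example_beta):.2f}")  # 4.88
```

The common shortcut of multiplying the coefficient by 10 (which would give 5.00 here) is an approximation, so the exact form can differ in the second decimal place.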
Calculate the difference between California and Pennsylvania in terms of average house price. To answer this question, you need to run another model (Model3) with the explanatory variables Size, Bedrooms, and Baths, plus the variable(s) needed to account for the differences between California and Pennsylvania.
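If Model3 keeps ln(Price) as the response and includes an indicator for California with Pennsylvania as the reference level (both are assumptions about how the model is set up), the indicator's coefficient converts to a percentage difference as sketched below. The function name and the example coefficient are hypothetical.

```python
import math


def dummy_effect_pct(beta_dummy):
    """Percent difference in Price between the indicated state and the
    reference state, assuming the response is ln(Price)."""
    return (math.exp(beta_dummy) - 1.0) * 100.0


# Hypothetical coefficient chosen so the result is easy to verify by hand.
example_beta = math.log(1.25)
print(f"{dummy_effect_pct(example_beta):.2f}")  # 25.00
```

If Price is instead left untransformed, the indicator's coefficient is already the difference in asking price, in $1,000's, holding the other variables fixed.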
Calculate the difference in Multiple R-squared between the original Model1 and Model3. Report it to 4 decimal places.