Question: Module 5, Topic 1 Practice Problems

Problem 1: House Prices

This data set includes prices and characteristics of 128 houses in a major metropolitan area. The variables include Price (sales price in dollars), SqFt (size in square feet), Bed (number of bedrooms), Bath (number of bathrooms), Offers (number of offers the house has received while on the market), Brick (whether it is brick construction: Yes/No) and Nbrhood (East/North/West). The objective is to explain the sale price of a house as a function of its characteristics.

In order to include the categorical variable Nbrhood in a regression model, the following 2 indicator variables are defined:

East_i = 1 if the house is in the East neighborhood, and 0 otherwise.
North_i = 1 if the house is in the North neighborhood, and 0 otherwise.

a) The following are plots of Price versus the other variables.

[Figures: Scatterplot of Price vs SqFt; Scatterplot of Price vs Bed; Scatterplot of Price vs Bath; Scatterplot of Price vs Offers; Scatterplot of Price vs Brick (Brick shown on a 0 to 1 axis). Price is on the vertical axis in each plot.]

Comment on these plots. Which variables seem to have an impact on the price?

b) The following is the output from the multiple regression model with all the independent variables included in the model.
The regression equation is

Price = 22841 + 53.0 SqFt + 4247 Bed + 7883 Bath - 8267 Offers + 17297 Brick - 22242 East - 20681 North

Predictor    Coef      SE Coef   T       P
Constant     22841     10236     2.23    0.028
SqFt         52.994    5.734     9.24    0.000
Bed          4247      1598      2.66    0.009
Bath         7883      2117      3.72    0.000
Offers       -8267     1085      -7.62   0.000
Brick        17297     1982      8.73    0.000
East         -22242    2532      -8.79   0.000
North        -20681    3149      -6.57   0.000

S = 10018.9   R-Sq = 86.9%   R-Sq(adj) = 86.1%

Which of the variables are related to the Price of the house? Justify your answer.

c) A plot of the residuals versus the fitted values follows.

[Figure: Versus Fits (response is Price); Residual on the vertical axis, Fitted Value on the horizontal axis.]

Comment on this plot. Does it suggest any problems with the model?

d) A histogram of the residuals follows.

[Figure: Histogram (response is Price); Frequency on the vertical axis, Residual on the horizontal axis.]

Comment on this plot. Does it suggest any problems with the model?

e) Write down the equation that describes the relationship between Price and the independent variables for a house in the West neighborhood. What is the equation for a house in the East neighborhood? What is the equation for a house in the North neighborhood?

Problem 2

This data appeared in the Wall Street Journal. The advertisements were selected by an annual survey conducted by Video Board Tests, Inc., a New York ad-testing company, based on interviews with 20,000 adults who were asked to name the most outstanding TV commercial they had seen, noticed, and liked. The retained impressions were based on a survey of 4,000 adults, in which regular product users were asked to cite a commercial they had seen for that product category in the past week. The variables are:

Spend: TV advertising budget, 1983 ($ millions)
Mil: millions of retained impressions per week
A question of interest is whether TV advertising spending affects retained impressions.

a) A plot of the variable Mil versus Spend follows.

[Figure: Scatterplot of Mil vs Spend; Mil on the vertical axis, Spend on the horizontal axis.]

Comment on this plot. Is there a relationship between the spending level and Mil? Is this relationship linear?

b) The output below is for the simple linear regression model with dependent variable Mil and independent variable Spend. The plot of the residuals versus the fitted values for this model is also shown.

The regression equation is

Mil = 22.2 + 0.363 Spend

Predictor    Coef      SE Coef   T      P
Constant     22.163    7.089     3.13   0.006
Spend        0.36317   0.09712   3.74   0.001

S = 23.5015   R-Sq = 42.4%   R-Sq(adj) = 39.4%

[Figure: Versus Fits (response is Mil); Residual on the vertical axis, Fitted Value on the horizontal axis.]

Comment on the results. Does the plot of the residuals versus the fitted values suggest any problems with the model?

c) Consider the model

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + e_i$

where Y denotes the variable Mil and X denotes the variable Spend. This fits a second-order polynomial to the data. The output from this model and the plot of the residuals versus the fitted values are given below.

The regression equation is

Mil = 7.06 + 1.08 Spend - 0.00399 Spend*Spend

Predictor     Coef        SE Coef    T       P
Constant      7.059       9.986      0.71    0.489
Spend         1.0847      0.3699     2.93    0.009
Spend*Spend   -0.003990   0.001984   -2.01   0.060

S = 21.8185   R-Sq = 53.0%   R-Sq(adj) = 47.7%

[Figure: Versus Fits (response is Mil); Residual on the vertical axis, Fitted Value on the horizontal axis.]

Comment on the results. Does the plot of the residuals versus the fitted values suggest any problems with the model?

d) Compare the model that was fit to the data in part b with the model that was fit to the data in part c. Which model fits the data better? Justify your
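
For Problem 2, the two fitted equations printed in parts b and c can be evaluated side by side. The following is a minimal Python sketch, not part of the original problem set: it hard-codes the coefficients and standard errors from the printed output, recomputes each T value as Coef divided by SE Coef, and evaluates both fitted curves at a few spending levels. The function names and the chosen spending levels are hypothetical, picked only for illustration.

```python
# Coefficients copied from the printed regression output for Problem 2.
# All names below are illustrative assumptions, not from the source.
LINEAR = {"const": 22.163, "spend": 0.36317}
QUADRATIC = {"const": 7.059, "spend": 1.0847, "spend_sq": -0.003990}

# (Coef, SE Coef) pairs from the second-order model's table in part c.
QUADRATIC_TABLE = {
    "Constant": (7.059, 9.986),
    "Spend": (1.0847, 0.3699),
    "Spend*Spend": (-0.003990, 0.001984),
}


def t_ratio(coef, se):
    """T value as reported in the output: coefficient over its standard error."""
    return coef / se


def predict_linear(spend):
    """Fitted Mil from the straight-line model in part b."""
    return LINEAR["const"] + LINEAR["spend"] * spend


def predict_quadratic(spend):
    """Fitted Mil from the second-order polynomial model in part c."""
    return (QUADRATIC["const"]
            + QUADRATIC["spend"] * spend
            + QUADRATIC["spend_sq"] * spend ** 2)


def quadratic_turning_point():
    """Spend value where the fitted parabola stops rising: -b1 / (2 * b2)."""
    return -QUADRATIC["spend"] / (2 * QUADRATIC["spend_sq"])


if __name__ == "__main__":
    for name, (coef, se) in QUADRATIC_TABLE.items():
        print(f"{name:12s} T = {t_ratio(coef, se):6.2f}")

    # Spending levels in $ millions, chosen arbitrarily for illustration.
    for spend in (25, 100, 175):
        print(f"Spend = {spend:3d}: linear = {predict_linear(spend):6.2f}, "
              f"quadratic = {predict_quadratic(spend):6.2f}")

    print(f"Quadratic fit peaks near Spend = {quadratic_turning_point():.1f}")
```

The sketch only re-derives numbers implied by the printed output. Judging which model fits better still rests on the reported S, R-Sq(adj), P values and the residual plots.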
