Question:

Instructions: Answer each question as briefly and completely as possible on a Microsoft Word document. Be sure to include the R commands along with your answers.

Scenario: The dataset CSDATA.xlsx (attached to the case study in Blackboard) contains survey responses of 500 heads of households in the Philadelphia area. The data contains information on their annual consumption expenditures and other information regarding the characteristics of their household. Your job is to construct the best regression model to explain how household characteristics impact consumption expenditures.

The variables in the dataset are as follows:

Y_i: Annual consumption expenditures of household i in thousands of US dollars.
X_1i: Annual gross income of household i in thousands of US dollars.
X_2i: Hourly wage rate of household i in US dollars.
X_3i: Number of dependents in household i (i.e. number of children).
X_4i: Average age of dependents in household i in years.
X_5i: College education (1 = respondent has a college education, 0 = respondent does not have a college education).
________________________________________
1. (10 pts) Test your independent variables (X_1i through X_5i) for potential multicollinearity. Report the variance inflation factor (VIF) for each independent variable, and state your conclusion. If your test suggests that you should remove an independent variable from your analysis, then remove it and test the remaining independent variables for multicollinearity again.
________________________________________
2. (10 pts) Run a regression with consumption expenditure (Y_i) as your dependent variable and the remaining X variables as your independent variables. Check if any of the regression coefficients are insignificantly different from zero. If so, remove the corresponding independent variables (one at a time) from your analysis and arrive at a model where all regression coefficients are statistically significant. Only state your final regression results.
________________________________________
3. (10 pts) Using your results from question 2, perform a partial residual analysis by testing the linearity assumption of your regression. Does there appear to be a pattern in any of the relationships between the residuals and your independent variables? If so, then extend your dataset by adding a nonlinear term of the independent variable in question (i.e. add a squared term). State your final regression results which successfully pass a residual analysis.
________________________________________
4. (15 pts) Using your results from question 3, answer the following questions.
   a. What is the R^2 value? What does this say in the context of the scenario?
   b. Interpret each of the relationships between consumption expenditure and the independent variables. In the occurrence of a linear and quadratic term, be sure to interpret both terms jointly and not separately.
   c. Test the following hypothesis: For every $1 increase in income on average, a household will increase their consumption expenditure by MORE THAN $0.25. Be sure to state the null and alternative hypotheses, the p-value of the test, and your conclusion.
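The mechanics behind questions 1 and 4c can be sketched in base R. This is only an illustration, not a solution to the case study: CSDATA.xlsx is not available here, so the data frame below is synthetic, and the column names (y, x1 to x5), the helper vif_manual, and every simulated value are assumptions made for the example. The VIF for a predictor is computed from its definition, 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on the other predictors. The one-sided test shifts the usual t-statistic so that the null value is 0.25 rather than 0, which is valid here because both Y_i and X_1i are measured in thousands of dollars.

```r
# Synthetic stand-in for CSDATA.xlsx. Column names and all values are
# hypothetical; x2 is deliberately built to be collinear with x1.
set.seed(1)
n  <- 500
x1 <- rnorm(n, mean = 60, sd = 15)       # income, thousands of USD
x2 <- x1 / 2 + rnorm(n, sd = 1)          # hourly wage, tied closely to income
x3 <- rpois(n, lambda = 2)               # number of dependents
x4 <- runif(n, min = 1, max = 18)        # average age of dependents
x5 <- rbinom(n, size = 1, prob = 0.4)    # college education dummy
y  <- 5 + 0.4 * x1 + 2 * x3 + rnorm(n, sd = 3)
dat <- data.frame(y, x1, x2, x3, x4, x5)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing X_j on the other X's.
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

# VIFs with all five predictors, then again after dropping x2.
vifs_all     <- vif_manual(dat, c("x1", "x2", "x3", "x4", "x5"))
vifs_reduced <- vif_manual(dat, c("x1", "x3", "x4", "x5"))
print(vifs_all)
print(vifs_reduced)

# One-sided test of H0: beta_1 <= 0.25 against H1: beta_1 > 0.25.
fit    <- lm(y ~ x1 + x3, data = dat)
est    <- coef(summary(fit))["x1", "Estimate"]
se     <- coef(summary(fit))["x1", "Std. Error"]
t_stat <- (est - 0.25) / se
p_value <- pt(t_stat, df = df.residual(fit), lower.tail = FALSE)
print(p_value)
```

With the real data, the data frame would instead be read from the spreadsheet, and the choice of which variable to drop (and which terms remain in the final model) would depend on the VIFs and p-values actually observed. A common rule of thumb treats a VIF above 5 or 10 as a sign of problematic multicollinearity, but the cutoff used in the course may differ.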
