Team Regression Assignment

This is a team assignment. Team assignments should be completed by each team as a group effort. All students are expected to participate fully in the team assignments. Members of different teams are not to discuss these assignments with each other. If your team has questions or needs clarification, please contact me.

uscrime data (from Exercises 4.3 and 9.6)

Develop a regression model using automated stepwise regression to predict the reported crime rate per million population. The dataset from Vandaele (1978), also in Hand et al. (1994), contains data on the reported 1960 crime rate per million population and 13 potential explanatory variables from 47 states. The data appear in data(uscrime). The variables are:

R     reported crime rate per million population (response variable)
Age   the number of males aged 14 to 24
S     1 if Southern state, 0 otherwise
Ed    10 times mean years of schooling of population age 25 or older
Ex0   1960 per capita expenditure by state and local government on police protection
Ex1   same as Ex0 but for 1959
LF    number of employed urban males aged 14-24 per 1000 such individuals
M     number of males per 1000 females
N     state population size in hundred thousands
NW    number of nonwhites per 1000 population
U1    unemployment rate per 1000 among urban males aged 14-24
U2    unemployment rate per 1000 among urban males aged 25-39
W     a measure of wealth, units = 10 dollars
X     number of families per 1000 earning below one half of the median income (a measure of income inequality)

PART 1: Use automated stepwise regression to develop a model to explain R, the reported crime rate per million population. Cut and paste your stepwise regression output from R (in courier new font). Evaluate and compare only p and Cp, and highlight the models that seem reasonable.

PART 2: Of the models highlighted, select the best model. For this model, list the model summary from R (in courier new font).
This summary should include the linear model, the coefficient estimates with their p-values, adjusted R-squared, and the VIF values. Defend your choice and justify your final model. Your selected model should not have a multicollinearity problem, and all predictor regression coefficients should be significantly different from zero. It may be helpful to use partial F-tests here. If so, please include your R output (in courier new font). Complete answers will also discuss adjusted R-squared, errors, and residuals.

PART 3: For the model you selected in part 2, interpret the coefficient of the last variable in your model. Clearly explain what this number means in one complete sentence.

PART 4: Create a splom (scatterplot matrix) that includes only the variables in the model you selected in part 2. Do the positive and negative values of the coefficients in part 3 agree with the general direction seen in this splom for each variable when compared to the response variable R? Yes or No; if no, highlight where they disagree.

PART 5: Using your model from part 2, run a complete set of residual plots, that is, residual plots and partial residual plots. Are there any concerns in the residual or partial residual plots? Explain by briefly discussing each row/plot.

PART 6: Consider the diagnostic plots. Discuss any possible outliers (from the case statistics) and explain why they are flagged. Consider observation 29, NY. It should be outside the limits on some (but not all) case statistics on the diagnostic plot. Can your team begin to explain why? (Hint: To completely answer this question you will need to reference variable amounts from NY.)

Some helpful R code is included below and also in your .R file.
    library(HH)
    data(uscrime)
    head(uscrime)
    summary(uscrime)
    uscrime$S <- as.factor(uscrime$S)
    summary(uscrime$S)

    ## the State variable should not be included in the regression analysis
    ## since this is a label (not a predictor)
    uscrime.var <- uscrime[, 1:14]  ## removes column 15, the State variable
    head(uscrime.var)

    ## full regression model
    uscrime.var.lm <- lm(R ~ . , data=uscrime.var)
    summary(uscrime.var.lm)

    ## for your stepwise regression remember to have data=uscrime.var

Instructions for submitting Team Assignment:

Please send one attachment to the dropbox in BlackBoard before the due date and time. Include your team name and all team members on the first page of this document. Each part can be answered in 1 or 2 paragraphs or less. The following subtitles are suggested:

I.   Part 1 - R output (in courier new font) with correct highlighting
II.  Part 2 - R output (in courier new font) and written solution
III. Part 3 - written solution
IV.  Part 4 - graphic and written solution
V.   Part 5 - graphic and written solution
VI.  Part 6 - graphic and written solution
VII. Appendix of all R code (in courier new font)

All Figures and Tables should be clearly labeled. If a figure or table is included, it should be discussed in your document. Your final document should be professionally written.
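
The helper code ends with a reminder to run the stepwise regression with data=uscrime.var. As a minimal sketch of that step only, the block below uses base R's step() function, which is an assumption here: the course may expect a different routine. Passing the full model's error variance as scale should make step() report Cp in place of AIC. The names full.lm, null.lm and step.lm are illustrative and do not come from the assignment.

```r
library(HH)                          # supplies data(uscrime), as in the helper code
data(uscrime)
uscrime$S <- as.factor(uscrime$S)
uscrime.var <- uscrime[, 1:14]       # drop the State label column

# The full and intercept-only models bound the stepwise search.
full.lm <- lm(R ~ ., data = uscrime.var)
null.lm <- lm(R ~ 1, data = uscrime.var)

# Stepwise selection in both directions, starting from the intercept-only model.
# scale = estimated error variance of the full model, so the criterion is Cp.
step.lm <- step(null.lm,
                scope = list(lower = ~ 1, upper = formula(full.lm)),
                scale = summary(full.lm)$sigma^2,
                direction = "both",
                trace = 1)

summary(step.lm)                     # coefficients, p-values, adjusted R-squared
```

The model returned by step() is only a starting point for Part 2: the requirements on VIF and on coefficient significance still have to be checked separately.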
